Next:4.5
Splay TreesUp:4.4
Binary Search TreePrevious:4.4
Binary Search Tree
4.4.1 Average Case Analysis of
BST Operations
RESULT
If we insert n random elements into an initially empty BST, then
the average path length from the root to a node is O(log n)
-
Note that the BST is formed by insertions only. Obviously the tree so formed
need not be complete.
-
We shall assume that all orders of n inserted elements are equally likely.
This means that any of the n! permutations is equally likely to be the
sequence of keys inserted.
Let |
|
|
P(n) |
= |
average path length in a BST with n nodes |
|
|
(average number of nodes on the path from the root |
|
|
to a node, not necessarily a leaf) |
|
|
|
Let |
|
|
a |
= |
first element inserted. This will be the root of |
|
|
the BST. Also this is equally likely to be the first, second |
|
|
. . . , ith, . . . , or nth in the sorted order of
the |
|
|
n elements. |
Note that P(0) = 0 and P(1) = 1. Consider a fixed i,
0 in
- 1. If i elements are less than a, the BST will look like in Figure
4.16.
Figure 4.16: A typical binary search tree with n
elements
|
-
-
Since all orders for the i small elements are equally likely and
likewise for the (n - i - 1) larger elements,
Average path length in the
-
left subtree = P(i)
-
right subtree = P(n - i - 1)
-
-
For a fixed i, let us compute the average path length of the above
tree.
-
Number of probes if the element a is being sought = 1
-
Average number of probes if an element from the LST is sought = 1+P(i)
-
Average number of probes if an element from the RST is sought = 1 + P(n
- i - 1)
-
Probability of seeking any of the n elements =
-
-
|
= |
1
+ i(1 + P(i)) + (n - i - 1)(1 + P(n
- i - 1)) |
|
|
= |
1 + P(i)
+ P(n
- i - 1) |
|
|
= |
P(n, i), say. |
|
-
-
Observe that P(n) is given by
-
P(n) =
Prob{LST has $i$ nodes }P(n, i)
Since the probability that the LST has i elements which is the
same as the probability that a is the (i + 1th element (where i
= 0, 1,..., n - 1) = ,
we have
P(n) |
= |
P(n, i) |
|
|
|
|
|
|
= |
1
+ P(i)
+ P(n
- i - 1 |
|
|
|
|
|
|
= |
1 + iP(i)
+ (n - i - 1)P(n - i - 1 |
|
|
|
|
|
|
= |
1 +
iP(i) |
|
since
iP(i) =
(n - i - 1)P(n - i - 1)
-
-
Thus the average path length in a BST satisfies the recurrence:
P(n) = 1 +
iP(i)
-
-
We shall show that P(n)
1 + 4log n, by Induction.
Basis:
-
P(1) is known to be 1. Also the RHS = 1 for n = 1
Induction:
-
Let the result be true
i < n. We shall show that the above is true for i
= n.
Consider
P(n) |
|
1 +
i(1 + 4log i) |
|
|
|
|
|
|
= |
1 +
4ilogi +
i |
|
|
|
|
|
|
|
1 +
4ilogi + tex2html_image_mark>#tex2html_wrap_indisplay25297#
since
i |
|
Thus
P(n)
2 +
ilog i
Now
ilog i |
= |
ilogi +
ilog i |
|
|
|
|
|
|
|
ilog
+
ilog n |
|
|
|
|
|
|
|
log
+ log
n |
|
|
|
|
|
|
= |
logn
- |
|
Therefore
P(n)
2 + log
-
= 1 + 4log n
A more careful analysis can be done and it can be shown that
P(n)
1 + 1.4log n
Next:4.5
Splay TreesUp:4.4
Binary Search TreePrevious:4.4
Binary Search Tree
eEL,CSA_Dept,IISc,Bangalore