![next](http://lcm.csa.iisc.ernet.in/dsa/next_motif.gif)
![up](http://lcm.csa.iisc.ernet.in/dsa/up_motif.gif)
Next:4.5
Splay TreesUp:4.4
Binary Search TreePrevious:4.4
Binary Search Tree
4.4.1 Average Case Analysis of
BST Operations
RESULT
If we insert n random elements into an initially empty BST, then
the average path length from the root to a node is O(log n)
-
Note that the BST is formed by insertions only. Obviously the tree so formed
need not be complete.
-
We shall assume that all orders of n inserted elements are equally likely.
This means that any of the n! permutations is equally likely to be the
sequence of keys inserted.
Let |
|
|
P(n) |
= |
average path length in a BST with n nodes |
|
|
(average number of nodes on the path from the root |
|
|
to a node, not necessarily a leaf) |
|
|
|
Let |
|
|
a |
= |
first element inserted. This will be the root of |
|
|
the BST. Also this is equally likely to be the first, second |
|
|
. . . , ith, . . . , or nth in the sorted order of
the |
|
|
n elements. |
Note that P(0) = 0 and P(1) = 1. Consider a fixed i,
0
i
n
- 1. If i elements are less than a, the BST will look like in Figure
4.16.
Figure 4.16: A typical binary search tree with n
elements
![\begin{figure}\centerline{\psfig{figure=figures/Fbst1.ps}}\end{figure}](img169.gif) |
-
![$ \mbox{$\bullet$}$](img5.gif)
-
Since all orders for the i small elements are equally likely and
likewise for the (n - i - 1) larger elements,
Average path length in the
-
left subtree = P(i)
-
right subtree = P(n - i - 1)
-
![$ \mbox{$\bullet$}$](img5.gif)
-
For a fixed i, let us compute the average path length of the above
tree.
-
Number of probes if the element a is being sought = 1
-
Average number of probes if an element from the LST is sought = 1+P(i)
-
Average number of probes if an element from the RST is sought = 1 + P(n
- i - 1)
-
Probability of seeking any of the n elements =
![$ {\frac{1}{n}}$](img170.gif)
-
-
|
= |
![$\displaystyle {\frac{1}{n}}$](img26.gif) 1
+ i(1 + P(i)) + (n - i - 1)(1 + P(n
- i - 1))![$\displaystyle \left.\vphantom{1 + i (1 + P(i)) + (n-i-1) (1 + P(n-i-1))}\right\}$](img172.gif) |
|
|
= |
1 + P(i)
+ P(n
- i - 1) |
|
|
= |
P(n, i), say. |
|
-
![$ \mbox{$\bullet$}$](img5.gif)
-
Observe that P(n) is given by
-
P(n) =
Prob{LST has $i$ nodes }P(n, i)
Since the probability that the LST has i elements which is the
same as the probability that a is the (i + 1th element (where i
= 0, 1,..., n - 1) =
,
we have
P(n) |
= |
![$\displaystyle {\frac{1}{n}}$](img26.gif)
P(n, i) |
|
|
|
|
|
|
= |
![$\displaystyle {\frac{1}{n}}$](img26.gif) ![$\displaystyle \left\{\vphantom{\sum_{i=0}^{n-1} \space \left[1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1\right]}\right.$](img175.gif) ![$\displaystyle \sum_{i=0}^{n-1}$](img107.gif) 1
+ P(i)
+ P(n
- i - 1![$\displaystyle \left.\vphantom{1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1}\right]$](img177.gif) ![$\displaystyle \left.\vphantom{\sum_{i=0}^{n-1} \space \left[1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1\right]}\right\}$](img178.gif) |
|
|
|
|
|
|
= |
1 + ![$\displaystyle {\frac{1}{n^2}}$](img179.gif) ![$\displaystyle \sum_{i=0}^{n-1}$](img107.gif) iP(i)
+ (n - i - 1)P(n - i - 1![$\displaystyle \left.\vphantom{iP(i) +(n-i-1)P(n-i-1}\right]$](img181.gif) |
|
|
|
|
|
|
= |
1 + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
iP(i) |
|
since
iP(i) =
(n - i - 1)P(n - i - 1)
-
![$ \mbox{$\bullet$}$](img5.gif)
-
Thus the average path length in a BST satisfies the recurrence:
P(n) = 1 + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
iP(i)
-
![$ \mbox{$\bullet$}$](img5.gif)
-
We shall show that P(n)
1 + 4log n, by Induction.
Basis:
-
P(1) is known to be 1. Also the RHS = 1 for n = 1
Induction:
-
Let the result be true
i < n. We shall show that the above is true for i
= n.
Consider
P(n) |
![$\displaystyle \leq$](img8.gif) |
1 + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
i(1 + 4log i) |
|
|
|
|
|
|
= |
1 + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
4ilogi + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
i |
|
|
|
|
|
|
![$\displaystyle \leq$](img8.gif) |
1 + ![$\displaystyle {\frac{2}{n^2}}$](img182.gif)
4ilogi + tex2html_image_mark>#tex2html_wrap_indisplay25297#![$\displaystyle {\frac{n^2}{2}}$](img184.gif)
since
i![$\displaystyle \leq$](img8.gif) ![$\displaystyle {\frac{n^2}{2}}$](img184.gif) |
|
Thus
P(n)
2 + ![$\displaystyle {\frac{8}{n^2}}$](img186.gif)
ilog i
Now
ilog i |
= |
ilogi +
ilog i |
|
|
|
|
|
|
![$\displaystyle \leq$](img8.gif) |
ilog
+
ilog n |
|
|
|
|
|
|
![$\displaystyle \leq$](img8.gif) |
log
+ log
n |
|
|
|
|
|
|
= |
logn
- ![$\displaystyle {\frac{n^2}{8}}$](img189.gif) |
|
Therefore
P(n)
2 + ![$\displaystyle {\frac{8}{n^2}}$](img186.gif)
![$\displaystyle \left(\vphantom{\frac{n^2}{2} \log - \frac{n^2}{8} }\right.$](img191.gif)
log
- ![$\displaystyle {\frac{n^2}{8}}$](img189.gif)
= 1 + 4log n
A more careful analysis can be done and it can be shown that
P(n)
1 + 1.4log n
![next](http://lcm.csa.iisc.ernet.in/dsa/next_motif.gif)
![up](http://lcm.csa.iisc.ernet.in/dsa/up_motif.gif)
Next:4.5
Splay TreesUp:4.4
Binary Search TreePrevious:4.4
Binary Search Tree
eEL,CSA_Dept,IISc,Bangalore