

 
If we insert n random elements into an initially empty BST, then the average path length from the root to a node is O(log n)
| Let | ||
| P(n) | = | average path length in a BST with n nodes | 
| (average number of nodes on the path from the root | ||
| to a node, not necessarily a leaf) | ||
| Let | ||
| a | = | first element inserted. This will be the root of | 
| the BST. Also this is equally likely to be the first, second | ||
| . . . , ith, . . . , or nth in the sorted order of the | ||
| n elements. | 
Note that P(0) = 0 and P(1) = 1. Consider a fixed i,
0  i
i n
- 1. If i elements are less than a, the BST will look like in Figure
4.16.
n
- 1. If i elements are less than a, the BST will look like in Figure
4.16.
 
 
|  | 

Average path length in the


| = |   1
+ i(1 + P(i)) + (n - i - 1)(1 + P(n
- i - 1))  | ||
| = | 1 +  P(i)
+  P(n
- i - 1) | ||
| = | P(n, i), say. | 

 Prob{LST has $i$ nodes }P(n, i)
 
Prob{LST has $i$ nodes }P(n, i) ,
we have
,
we have
| P(n) | = |   P(n, i) | |
| = |  ![$\displaystyle \left\{\vphantom{\sum_{i=0}^{n-1} \space \left[1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1\right]}\right.$](img175.gif)   1
+  P(i)
+  P(n
- i - 1 ![$\displaystyle \left.\vphantom{1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1}\right]$](img177.gif) ![$\displaystyle \left.\vphantom{\sum_{i=0}^{n-1} \space \left[1 + \frac{i}{n} P(i) +\frac{n-i-1}{n} P(n-i-1\right]}\right\}$](img178.gif) | ||
| = | 1 +    iP(i)
+ (n - i - 1)P(n - i - 1 ![$\displaystyle \left.\vphantom{iP(i) +(n-i-1)P(n-i-1}\right]$](img181.gif) | ||
| = | 1 +   iP(i) | 
since 
 iP(i) =
iP(i) =  (n - i - 1)P(n - i - 1)
(n - i - 1)P(n - i - 1)

 iP(i)
iP(i)
 1 + 4log n, by Induction.
1 + 4log n, by Induction. i < n. We shall show that the above is true for i
= n.
i < n. We shall show that the above is true for i
= n.
Consider
| P(n) |  | 1 +   i(1 + 4log i) | |
| = | 1 +   4ilogi +   i | ||
|  | 1 +   4ilogi +  tex2html_image_mark>#tex2html_wrap_indisplay25297#   since  i   | 
Thus 
                                                                                                   
P(n)  2 +
2 + 
 ilog i
ilog i
Now
|  ilog i | = |  ilogi +  ilog i | |
|  |  ilog  +  ilog n | ||
|  |  log  +  log
n | ||
| = |  logn
-  | 
Therefore 
               
P(n)  2 +
2 + 

 log
-
log
- 
 = 1 + 4log n
= 1 + 4log n
A more careful analysis can be done and it can be shown that 
               
P(n)  1 + 1.4log n
1 + 1.4log n
 


