Way up high in the binary tree

Two binary bits smiled at me

I shook that tree as hard as I could

Down came the binary bits,

Mmmm–were they good! -based on a children’s song

Now it turns out that this binary tree is an AVL tree. I wonder how high this tree is! Why does it concern us ? The reason is that the height will give us the worst case running time of the binary search tree algorithm. An AVL tree maintains its height by making sure the difference in height between the left and right subtree is at most 1. If is the number of nodes contained in the tree of height h, then is equal to the number of nodes in the left subtree plus the number of nodes in the right subtree PLUS 1 (the root node). In the worst case, the two subtrees differ in height by at most 1, therefore

with initial conditions and .

By adding 1 to both sides of this recurrence relation we get

If we let , we can rewrite the above recurrence relation as

The initial conditions are now

and

Now this is interesting! This recurrence relation is very similar to the Fibonacci recurrence relation except that the initial conditions are different. From the previous post, we found that the general solution of this recurrence relation is

where and are constants yet to be determined. Let us solve this recurrence relation in the general case where and . Computing the first 2 terms of the recurrence relation, we get two simultaneous equations in and .

Solving for from the first equation, we get

Plugging this into the second equation and solving for

From which we solve for

Substituting this to the recurrence relation of , we have

Expanding this expression you’ll end up with a rather complex expression involving cross terms of and

But that is where the complexity ends. Observe that

Using this result, we can write

Furthermore, we also observe that

Using all these results we have

Notice that the expressions in square brackets are just fibonacci expressions for n-1 and n, respectively. If we use them in the expression for , we can express it in terms of the fibonacci numbers

Returning to our computation for the height of the AVL tree, since and , we finally have

We can further simplify this by

Since ,

The formula above expresses the number of nodes of the AVL tree in terms of the height h. In terms of the golden ratio, we can write the last expression as

**Approximating the nth Fibonacci Number**

We can further simplify the above expression by observing that the nth fibonacci number is approximately equal to . To see this, we compute the distance of to :

Using this approximation, we can write

So if n is the number of nodes in an AVL tree, we can estimate the height using the above formula, that is

Using a change of base

Again using the fibonacci number approximation, we have . This simplifies our estimate to

where .

If we have an array of a million names, then the worst case running time of the binary search is

.

That's incredibly efficient!