Archive for March, 2011

March 15, 2011

A Mutually Beneficial Arrangement

by Bobby Corpus

It’s been said that the programmers of today are the pizza delivery boys of tomorrow. This has happened the last time the world witnessed a severe financial crisis. Usually, those who are not able to cope up with the current trends in IT are the first ones to go. One trend that is here to stay is the increasing number of cores of our computers. In order to maximize the power of multicore architecture, the programmers of today should be able to do concurrent programming correctly. Concurrent programming is the art of using the extra CPU capacity of your computer by exploiting parallelism to speed up your applications. This is a non-trivial task and requires a different mindset. In these series of articles, we will explore what concurrent programming is and its mathematical basis.

Imagine a concurrent program composed of two threads trying to read or write a value into an array. Let’s suppose that we have an array of 10 numbers and 2 threads. Thread 1 writes a number to the next empty array location. If the array is full, thread 1 will not write anything but will wait until at least one location is empty. Thread 2 will read and delete the last number in the array. If the array is empty, thread 2 will not read anything. How do we ensure that thread 1 will not write past the array length? How do we ensure that thread 2 will not read an empty element?

The section of code that does the updating or reading of the array is called the critical section. This is the portion of the code where we need to ensure that only one thread is executing at a time. To do this, we employ a mechanism for the two threads to know whose turn it is to enter the critical section. Below is a code that tries to solve this problem. First we define a global variable that is shared by both threads. This variable called turn will specify which thread can enter the critical section. The value of turn is either 1 or 2. A value of 1 means that thread 1 can enter the critical section while a value of 2 means that thread 2 can enter the critical section.


global turn = 1

This code is to be executed by thread 1:

#loop forever
1 execute non-critical section
2 await turn = 1
3 execute critical section
4 turn = 2

The code to be executed by thread 2 is similar:

#loop forever
1 execute non-critical section
2 await turn = 2
3 execute critical section
4 turn = 1

The critical section has be coded in such a way that it takes a short time as possible. We shall assume that any thread that enters the critical section should exit from it eventually. This is called the progress assumption.

We can imagine the execution of this program by the two threads to be described by a state specified by three numbers:

1. the pointer to the line that thread 1 is to execute.
2. the pointer to the line that thread 2 is to execute.
3. the value of the turn variable.

For example, the initial state is described by the triple (1,1,1). This means that thread 1 pointer is at line 1, thread 2 pointer is at line 1 and the value of the turn variable is 1. Let us first make it clear that when the pointer is at line 1, the thread has not yet executed that line. It only means that if the CPU scheduler will allow thread 1 to execute, it will execute line 1. For example, if the pointer of thread 1 is at line 4, it does not mean that the value of turn = 2. It only means that when thread 1 goes from state 4 to state 1, then the value of turn should have been set to 2.

We can draw the entire execution of the two threads using a state diagram as shown below:


Let’s try to understand this diagram. The execution begins with the state (1,1,1) which means that thread 1 pointer is at 1, thread 2 pointer is at 1, and the value of turn = 1. This state can either go to (2,1,1) or (1,2,1) depending on which thread the CPU scheduler wants to run first. The state transitions are indicated by arrows. From the diagram, you will see that when the value of turn = 1, thread 2 cannot go past pointer 2. In the same way, when the value of turn = 2, thread 1 cannot go past pointer 2.

Correctness Properties of Concurrent Programs

Writing concurrent programs is not easy. You have to demonstrate that your program is correct. What does correct mean? A concurrent program is correct if:

1. Mutual exclusion holds. There is at most one thread that enters the critical section.
2. No deadlock occurs. This means that no thread holds a resource that is needed by another thread, and vice versa. If this occurs, then each thread will be waiting for the other thread to release the resource and the application will appear to hang.
3. No starvation occurs. If a thread wants to enter the critical section, then it should eventually be able to do so.

Is our algorithm correct? First, the algorithm satisfies mutual exclusion. There is no state in which both thread pointers are pointing to line 3, which is the critical section. This means you cannot see states (3,3,1) or (3,32). There is no deadlock. The two threads do not get stuck on it’s preprotocol. Suppose that the value of turn = 1 and both threads execute line 2. Since the turn = 1, then thread 1 will enter its critical section. By the assumption of progress, it will exit its critical section and set turn = 2. This will then allow thread 2 to enter its critical section. Therefore, no thread gets stuck trying to enter its critical section.

There is however starvation. There is no assumption of progress on the non-critical section. This means that if turn = 1 and thread 1 will get stuck in its non-critical section, this will prevent thread 2 from entering its critical section. Therefore, the algorithm is not correct as it does not satisfy all conditions.

In the next post, we will examine a mechanism, called a semaphore, that is able to solve this problem.

March 7, 2011

Way Up High A Binary Tree

by Bobby Corpus

Way up high in the binary tree
Two binary bits smiled at me
I shook that tree as hard as I could
Down came the binary bits,
Mmmm–were they good! -based on a children’s song

Now it turns out that this binary tree is an AVL tree. I wonder how high this tree is! Why does it concern us ? The reason is that the height will give us the worst case running time of the binary search tree algorithm. An AVL tree maintains its height by making sure the difference in height between the left and right subtree is at most 1. If N_h is the number of nodes contained in the tree of height h, then N_h is equal to the number of nodes in the left subtree plus the number of nodes in the right subtree PLUS 1 (the root node). In the worst case, the two subtrees differ in height by at most 1, therefore

\displaystyle N_h = N_{h-1} + N_{h-2} + 1

with initial conditions N_0 = 0 and N_1 = 1.

By adding 1 to both sides of this recurrence relation we get

\displaystyle N_h + 1 = N_{h-1} + 1 + N_{h-2} + 1

If we let b_h = N_h + 1, we can rewrite the above recurrence relation as

\displaystyle b_h = b_{h-1} + b_{h-2}

The initial conditions are now

b_0 = N_0 + 1 = 0 + 1 = 1 and b_1 = N_1 + 1 = 1 + 1 = 2

Now this is interesting! This recurrence relation is very similar to the Fibonacci recurrence relation except that the initial conditions are different. From the previous post, we found that the general solution of this recurrence relation is

\displaystyle b_n = \alpha_1 \phi^n + \alpha_2 (1-\phi)^n

where \alpha_1 and \alpha_2 are constants yet to be determined. Let us solve this recurrence relation in the general case where b_0 = s and b_1=t. Computing the first 2 terms of the recurrence relation, we get two simultaneous equations in \alpha_1 and \alpha_2.

\displaystyle b_0 = s = \alpha_1 + \alpha_2
\displaystyle b_1 = t = \alpha_1 \phi + \alpha_2 (1-\phi)

Solving for \alpha_1 from the first equation, we get

\displaystyle \alpha_1 = s- \alpha_2

Plugging this into the second equation and solving for \alpha_2

\displaystyle t = (s-\alpha_2) \phi + \alpha_2 - \alpha_2\phi
\displaystyle t = s\phi + \alpha_2(1- 2\phi)
\displaystyle \alpha_2 = \frac{t-s\phi}{1-2\phi}

From which we solve for \alpha_1

\displaystyle \alpha_1 = s - \frac{t-s\phi}{1-2\phi}
\displaystyle = \frac{s(1-\phi) - t }{ 1-2\phi}

Substituting this to the recurrence relation of b_n, we have

\displaystyle b_n = \frac{s(1-\phi) -t }{1-2\phi} \phi^n + \frac{t - s\phi}{1-2\phi} (1-\phi)^n

Expanding this expression you’ll end up with a rather complex expression involving cross terms of \phi^n and (1-\phi)^n

\displaystyle b_n = \frac{s\phi^n ( 1- \phi) - t\phi^n + t(1-\phi)^n - s\phi(1-\phi)^n}{1-2\phi}

But that is where the complexity ends. Observe that

\displaystyle \phi(1-\phi) = \frac{1+\sqrt{5}}{2}\cdot \frac{1-\sqrt{5}}{2}
\displaystyle = \frac{1-5}{4} = -1

Using this result, we can write

\phi^{n}(1-\phi) = \phi^{n-1}\phi (1-\phi) = -\phi^{n-1}
\phi (1-\phi)^n = \phi (1-\phi) (1-\phi)^{n-1} = - (1-\phi)^{n-1}

Furthermore, we also observe that

\displaystyle 1-2\phi = 1 - \frac{1+ \sqrt{5}}{2} = -\sqrt{5}

Using all these results we have

\displaystyle b_n = - s\phi^{n-1} - t\phi^n + t(1-\phi)^n + s(1-\phi)^{n-1}
\displaystyle = s \Big[ \frac{(\phi^{n-1} - (1-\phi)^{n-1}}{\sqrt{5}} +\Big]  t \Big[ \frac{(\phi^n - (1-\phi)^n}{\sqrt{5}}\Big]

Notice that the expressions in square brackets are just fibonacci expressions for n-1 and n, respectively. If we use them in the expression for b_n, we can express it in terms of the fibonacci numbers

\displaystyle  b_n = sf_{n-1} + tf_n

Returning to our computation for the height of the AVL tree, since b_0 = s = 1 and b_1 = t = 2, we finally have

\displaystyle b_n = f_{n-1} + 2f_n

We can further simplify this by

\displaystyle b_n = f_{n-1} + f_n + f_n
\displaystyle = \Big(f_{n-1} + f_n\Big) + f_n
\displaystyle = f_{n+1} + f_n
\displaystyle = f_{n+2}

Since b_n = N_h + 1,

\displaystyle N_h + 1 = f_{n+2}
\displaystyle N_h = f_{n+2} -1

The formula above expresses the number of nodes of the AVL tree in terms of the height h. In terms of the golden ratio, we can write the last expression as

\displaystyle N_h = \frac{\phi^{n+2} - (1-\phi)^{n+2}}{\sqrt{5}} - 1

Approximating the nth Fibonacci Number

We can further simplify the above expression by observing that the nth fibonacci number is approximately equal to \phi^n/\sqrt{5}. To see this, we compute the distance of f_n to \phi^n/\sqrt{5}:

\displaystyle |f_n - \frac{\phi^n}{\sqrt{5}}| = |\frac{\phi^n - (1-\phi)^n}{\sqrt{5}} - \frac{\phi^n}{\sqrt{5}} | = |\frac{(1-\phi)^n}{\sqrt{5}}| < \frac{1}{\sqrt{5}} < \frac{1}{2}

Using this approximation, we can write

\displaystyle N_h = \frac{\phi^{h+2}}{\sqrt{5}} -1

So if n is the number of nodes in an AVL tree, we can estimate the height using the above formula, that is

\displaystyle n \approx \frac{\phi^{h+2}}{\sqrt{5}} -1
\displaystyle n + 1 \approx \frac{\phi^{h+2}}{\sqrt{5}}
\displaystyle \sqrt{5}(n+1) \approx \phi^{h+2}
\displaystyle h \approx \log_{\phi} \Big( \sqrt{5} (n+1)\Big) -2

Using a change of base

\displaystyle n \approx \frac{\log_2 \Big( \sqrt{5}(n+1)\Big)}{\log_2 \phi} -2
\displaystyle \approx \frac{\log_2\Big( \sqrt{5}(n+1)\Big) - 2\log_2\phi}{\log_2 \phi}
\displaystyle \approx \frac{1}{\log_2 \phi} \cdot \log_2\Big(\frac{\sqrt{5}(n+1)}{\phi^2}\Big)

Again using the fibonacci number approximation, we have \sqrt{5}/\phi^2 \approx f_2 = 1. This simplifies our estimate to

\displaystyle h \approx 1.44\log_2(n+1)

where 1/\log_2 \phi = 1.44.

If we have an array of a million names, then the worst case running time of the binary search is

\displaystyle 1.44\log_2(1000000 + 1) = 28.70.

That's incredibly efficient!

March 6, 2011

Eating The Fruit Of The AVL Tree

by Bobby Corpus

We have seen that searching a name in a sorted array of a million names is very efficient when using binary search. In fact, the running time is logarithmic, which means we only need 19 comparisons, in the worst case, to figure out if our search is successful or not. We have also seen that by using a binary search tree, we can save more efficiencies when we are searching on a dynamic array of names, that is, when the list of names grows. However, we also showed that if the BST is fed with a sorted array of names, then the search becomes very inefficient. In fact, in the worst case, we need to compare 1 million times before we realize that we don’t have a name in the array.

In the last post, we built a BST from an array of a sorted list of animals. Below is the result of that construction, repeated here for convenience.

This is a sad state of affairs. We have found a very good data structure for searching but it fails when fed with sorted data. Is there anything we can do to minimize the height of this tree? Fortunately, there is! The data structure is called a self-balancing binary search tree and there are many of them. We will only take a look at one and it is called AVL tree*.

A self-balancing binary search tree is able to maintain it’s height by keeping track of a variable in each node called the balance factor. The balance factor for a node x is defined as

Height of left subtree of node minus the height of the right subtree.

Below is an example of a binary tree with balance factors indicated inside the circles. The triangles that you see in the diagram are subtrees. They may or may not be present in the node but we show them in the diagram for some reason which will become clear later.

An AVL tree employs mechanisms to limit the difference in height between subtrees to 1, 0 or -1. These mechanisms are called rotations. There are only 4 rotations that are needed to balance an AVL tree. The figure below shows you how rotations work.

The first thing to look for after inserting a node is find a balance factor that is -2 or +2. in the first tree, the balance factor is +2. this means that the left subtree is imbalanced. Next, examine the balance factor of the left child. If the balance factor is -1, then the right subtree is higher than the left. To balance this, the child in yellow replaces its parent (green node) which now becomes its left child. What used to be the right child of the node in yellow is now the right child of the green node.

However, this rotation does not change the balance factor of the root node. Another rotation will establish the balance. It involves replacing the root node ( in red) with the yellow node and making the root node the right child of the yellow node. What used to be the right child of the yellow node now becomes the left child of the red node.

When the balance factor of the root node is -2 it means that the right subtree is imbalanced. If you observe carefully, this is just symmetric to the case above and the rotation can easily be figured out from the figure.

Balancing the Animal Tree

Let us show how we can use these transformations to fix our tree. From the figure below, you can see that the imbalance occurs after we insert the cobra. At this point, the balance factor of the “alligator” node is 2. Rotating this tree by making bat the root node fixes the tree.

We then insert dog and fox before we encounter another imbalance. Replacing cobra with dog and making cobra the left child of dog fixes the imbalance.


Inserting horse to this new configuration creates an imbalance at the root of the tree. Replacing bat with dog and making cobra the right child of bat fixes the tree.

You can follow the rest of the process until we have all ten words inserted into the BST. The result is a tree with a maximum height of 4.


What does this mean? It means that it only takes a maximum of 4 comparisons to find any animal in this structure. It also means it only takes a maximum of 4 comparisons to determine if an animal is not in the structure.

In the next post, we are going to compute the the height of an AVL tree to get an estimate of the complexity of the search.

* AVL is named after Adelson-Velskii and EM Landis.

Follow

Get every new post delivered to your Inbox.