Archive for ‘Mathematics’

November 20, 2011

Divided We Compute, United We Reduce

by Bobby Corpus

Once upon a time, in a far away village lay a dying old man. He called his sons to his deathbed and spoke to them one last time. He said “Sons, see that bundle of sticks? Each one of you try to break it. The one who can break it will inherit all my riches”. Each son, being greedy, wanted all the riches for himself. So each one of them tried to break the bundle of sticks but none of them succeeded. The old man asked his servant to untie the bundle and said to his sons, “Each one of you now get one stick and break it”. Without any effort, each son was able to break the stick. The old man said “You see, when you unite, no task will be difficult. The riches that I told you was a lie. We are broke. When i’m dead, make sure  you unite so that you can survive.”

Fast forward to modern times. You can think of the bundle of sticks to be a complex problem that is itself composed of smaller problems. The sons are the processors of your computer. When each processor was given the task to solve the complex problem, it fails to solve it in a reasonable amount of time. When the complex problem is decomposed into smaller problems and given to each processor, each processor is now able to solve the smaller problems quickly thereby solving the big problem quickly as well.

The process of decomposing a problem into smaller problems and solving them in separate processors is called Parallel Computing. In this article, we will compute how fast a certain algorithm will run when parallelized. The problem we want to investigate is sorting an array of a million (2^{20}) integers.

Efficient Sorting

Suppose you have an array \{ a_1, a_2, a_3, ..., a_n \} that you want to sort based on pairwise comparison.  The sorted array is just one of the many permutations of the array \{a_1, a_2, a_3,\ldots, a_n\}. In fact, if you have n different objects to sort, then there are exactly n! ways to arrange these objects, and one of them is the sorted state. You can imagine the sorting process as a decision tree. Say, for example we have array A={ a,b,c }. To sort this, we first compare a with b and there are 2 ways this can go. Either a \le b or a > b. If a \le b, we then compare b and c. This also give either b \le c or b > c. As you can see from the diagram below, this is nothing but a decision tree.



Since the height of this binary tree is lg(n!), then we have

\lg(n!) = \lg\Big[ n \cdot (n - 1) \cdot (n-2) \cdots 1\Big] \le \lg n + \lg (n-1) \cdots \lg1 \le \underbrace{\lg n \cdots \lg n}_\text{ n times}
\lg(n!) \le n\cdot \lg n

There are efficient algorithms that are able to sort of this complexity. For example, the merge sort has this complexity. Therefore, if you have an array of 2^{20} elements, then the complexity is

2^{20} \cdot \lg(2^{20}) = 2^{20} \cdot (20) = 20971520

that is, it takes about 20 million comparisons to sort an array of 1 million. Could we do any better than this? We can either upgrade the cpu of the machine doing the sorting or use two or more machines to divide the work among those machines. In this article, we are going to investigate the impact of dividing the work into smaller chunks and farming it to other processors.

Divide and Conquer

Assume we have an array n=2^{20} elements that we need to sort and suppose we have two identical processors we can use. Divide the array into 2 equal sized arrays. Give the first array to the first processor and the other half to the second processor. Apply an efficient sorting algorithm to the subarrays to produce a sorted array for each processor. We then combine the result of processor 1 and processor 2 to one big array by merging the two sorted arrays. The diagram below illustrates the process of computation:



This is also known as the MapReduce algorithm. Mapping is the process of assigning subsets of the input data to processors where each processor computes the partial result. Reducing is the process of aggregating the results of each processor to the final solution of the problem.

The process of merging is straightforward. Given two sorted arrays, begin by comparing the first element of each array. The smaller of the two will then occupy the first slot in the big array. The second element of the array from which we took the smallest element will now become the first element of that array. Repeat the process until all elements of both arrays have already occupied slots in the big array. The diagram below illustrates the algorithm of merging.




If you count the total number of comparisons that you need to merge two sorted arrays, you will find that it takes n-1 comparisons. Therefore, the complexity of the merging process is O(n).

Since each processor has n/2 sized subarrays, the sorting complexity is therefore n/p \lg (n/p). Furthermore, since the merging process takes O(n) comparisons, the total complexity of the parallel sorting process is therefore

\displaystyle n/p \lg(n/p) + n

In our example, C=2^{20}/2 \lg(2^{20}/2) + 2^{20}=  11010048 comparisons compared to 2^{20} \lg(2^{20}) = 20971520 when run sequentially. For large values of n, n/p \lg(n/p) dominates n, therefore the complexity of the parallel algorithm is O(n/p \lg(n/p)).

Can we do any better?

For a given value of n, what do you think is the value of p that reduces the running time to O(n)? If we take n=2^{20} and plot complexity against p = \{ 2, 4, 8, 16, 32\} we get the diagram below.



In this diagram, we also plotted the horizontal line y=2^{20}. The intersection of this line with the plot of \displaystyle f(p) = \frac{n}{p} \lg(\frac{n}{p}) gives us the value of p such that the total comparisons is already linear, that is,

\displaystyle f( p ) = n
\displaystyle \frac{n}{p} \lg(\frac{n}{p})  = n

To get the value of p numerically, we have to solve the root of the equation

\displaystyle g( p ) = \frac{n}{p} \lg(\frac{n}{p}) - n = 0

Simplifying,

\displaystyle \frac{1}{p} \lg(\frac{n}{p}) - 1 = 0
\displaystyle \lg(\frac{n}{p}) = p
\displaystyle \frac{n}{p} = 2^p
\displaystyle p2^p - n = 0

Since this is a non-linear equation, we can solve this using the Newton's method. It is a method to compute the roots by approximation given an initial value of the solution. Starting from a guess solution p_1, the root can be approximated using the recursive formula

\displaystyle p_{n+1} = p_n - \frac{g( p_n)}{g\prime ( p_n)}

where g\prime ( p ) is the first derivative of g( p ) . Applying the rules of derivatives, we get

\displaystyle g\prime ( p ) = p\cdot 2^p \ln 2 + 2^p

Substituting this to the formula for Newton's method, we get

\displaystyle p_{n+1} = p_n - \frac{p2^p - n}{p2^p \ln 2 - 2^p}

Below is an R code using newton’s method to compute the root of the equation g(p).

g=function(n,p){
	p* 2^p - n
}

gprime=function(n,p){
	p*2^p *log(2) - 2^p
}

newton=function(p,n,iter){
	tmp = p
	for(i in 1:iter){
		p=p-g(n,p)/gprime(n,p)
		
		if(abs(p-tmp)< 0.0001){
			break		
		}

		print(p)
		tmp=p
	}
	print(p)
}

Running this code, we get the value of p = 16:

> newton(15,n,100)
[1] 16.80905
[1] 16.08829
[1] 15.98603
[1] 16.00286
[1] 15.99944
[1] 16.00011
[1] 15.99998
[1] 16

Ignoring network latency, by distributing the input evenly into 16 processors, we get a running time of O(n) time complexity for n=2^{20} array of items. Therefore, instead of doing 20 million comparisons, you only need 1 million comparisons to sort 1 million objects.

In this age of multicore processors, parallel computing is fast becoming the norm than the exception. Learning to harness the power of multicores is becoming an extremely handy skill to have.

February 6, 2011

Gilligan’s Island

by Bobby Corpus

Suppose you and a couple of people were shipwrecked in an island (which, for dramatic effect, we call Gilligan’s Island). Assume that none of you were friends with each other in the beginning, but over time you all managed to be friends with at least one other person. What is the probability that two of you have the same number of friends? Now suppose that the initial number of people shipwrecked, including you is 60, what is the least number of people whose surnames start with the same letter?

This kind of problem is an instance of what is known as the Pigeonhole Principle.

If n pigeons fly into m pigeonholes, where n > m, then one pigeonhole has at least 2 or more pigeons in it.

This principle is so intuitive that even a child will agree with you on this. Now the question is why are the 2 problems in the previous paragraph an instance of the Pigeonhole Principle?

In the first problem, let n be the number of people shipwrecked in Gilligan’s Island. Then the maximum number of friends anyone can have is therefore n-1 and the minimum number of friends anyone can have is 1 ( each person has at least one friend). Imagine the pigeonholes to be the possible number of friends anyone can have. The number of such pigeonholes is n-1. By assigning each person to a pigeonhole, you are in effect assigning the number of friends a person has. Since n > n -1, by the Pigeonhole Principle, one pigeonhole has at least 2 or more pigeons ( or at least 2 people have the same number of friends). Therefore, the probability of at least 2 persons having the same number of friends is equal to 1.

In general, the Pigeonhole Principle can be stated as

If n pigeons fly into m pigeonholes such that n > k\cdot m, for some integer k, then there is a pigeonhole with at least k+1 pigeons in it.

For example, suppose n = 11 pigeons flying into m=5 pigeonholes. By letting k = 2, we have 11 > 2\cdot 5 and by the generalized Pigeonhole Principle, there is one pigeonhole that has at least k+1 = 2+1 = 3 pigeons in it.

Going back to the second problem, we know that there are only 26 letters in the English alphabet. If we imagine the m = 26 letters to be pigeonholes and the n = 60 people to be pigeons, then by the generalized Pigeonhole Principle, we have

\displaystyle 60 > k\cdot 26

Solving for the minimum integer k that satisfies this, we get k = 2. This means that there are at least k +1 = 2+1 = 3 persons whose surnames begin with the same letter.

Now try this problem for size. Suppose you have 20 black socks and 20 white socks in a dark room. Since you cannot see clearly in the dark, you decided to pick any sock at random. What is the minimum number of socks you need to pick in order to get a pair with the same color?

Follow

Get every new post delivered to your Inbox.