Application: Efficiency of Algorithms II Lecture 45 Section 9.5 Tue, Apr 24, 2007
Algorithm Analysis We will analyze the merge sort algorithm.
The Merge Sort Algorithm The Merge Sort Algorithm sorts a list of numbers by repeatedly merging longer and longer sublists. The initial sublists are of size 1. The final sublist is the entire list.
Example 88 16 2 43 57 67 79 34 5 71 62 69 49 90 29 65
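Example
Initial list (sublists of size 1): 88 16 2 43 57 67 79 34 5 71 62 69 49 90 29 65
After merging into sublists of size 2: 16 88 | 2 43 | 57 67 | 34 79 | 5 71 | 62 69 | 49 90 | 29 65
After merging into sublists of size 4: 2 16 43 88 | 34 57 67 79 | 5 62 69 71 | 29 49 65 90
After merging into sublists of size 8: 2 16 34 43 57 67 79 88 | 5 29 49 62 65 69 71 90
After the final merge (size 16): 2 5 16 29 34 43 49 57 62 65 67 69 71 79 88 90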
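The passes in the example can be sketched as a bottom-up loop that merges adjacent runs of width 1, 2, 4, and so on. This is only an illustration of the idea, not the recursive version analyzed in the lecture; the name bottomUpMergeSort is an assumption, and std::inplace_merge from the C++ standard library stands in for a hand-written merge.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up merge sort (illustrative sketch): merge adjacent sorted runs of
// width 1, 2, 4, ... until a single run covers the whole list.
void bottomUpMergeSort(std::vector<int>& a)
{
    const std::size_t n = a.size();
    for (std::size_t width = 1; width < n; width *= 2)
    {
        // Each iteration merges the runs [low, mid) and [mid, high).
        for (std::size_t low = 0; low + width < n; low += 2 * width)
        {
            const std::size_t mid = low + width;
            const std::size_t high = std::min(low + 2 * width, n);
            std::inplace_merge(a.begin() + low, a.begin() + mid, a.begin() + high);
        }
    }
}
```

Calling bottomUpMergeSort on the 16 numbers of the example performs four passes (widths 1, 2, 4, 8), matching the four merged rows shown.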
Analyzing the Merge Sort We will use two functions: MergeSort() and Merge(). To analyze the algorithm, we will count the number of comparisons required to sort a list of length n.
The MergeSort() Function The MergeSort() function is recursive.

void MergeSort(int a[], int low, int high)
{
    if (low < high)
    {
        int mid = (low + high)/2;       // split point
        MergeSort(a, low, mid);         // sort the left half
        MergeSort(a, mid + 1, high);    // sort the right half
        Merge(a, low, mid, high);       // merge the two sorted halves
    }
    return;
}
The MergeSort() Function The initial call would be MergeSort(a, size); where this two-argument MergeSort() is a non-recursive “starter” function.

void MergeSort(int a[], int size)
{
    MergeSort(a, 0, size - 1);
    return;
}
The Merge() Function The real action takes place in the Merge() function. This function makes a single pass down each of the two lists to be merged, comparing elements of the first list to elements of the second list. The smaller element is copied into the new list.
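The Merge() Function

void Merge(int a[], int low, int mid, int high)
{
    int b[high - low + 1];    // temporary array (needs variable-length array support)
    int i = low;              // index into the first list, a[low..mid]
    int j = mid + 1;          // index into the second list, a[mid+1..high]
    int k = 0;                // index into b
    while (i <= mid && j <= high)
    {
        if (a[i] < a[j])
            b[k++] = a[i++];
        else
            b[k++] = a[j++];
    }
    :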
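The Merge() Function

    :
    while (i <= mid)              // copy what is left of the first list
        b[k++] = a[i++];
    while (j <= high)             // copy what is left of the second list
        b[k++] = a[j++];
    j = low;
    for (i = 0; i < k; i++)       // copy all k merged elements back into a
        a[j++] = b[i];
    return;
}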
Analysis of Merge() What is the growth rate of Merge()? Inspect the loops. The length of each loop is proportional to the length of the lists. Therefore, the run-time of Merge() is $\Theta(n)$, where $n$ is the combined length of the two lists. In fact, the number of comparisons is at most $n - 1$.
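The comparison count can be checked with an instrumented variant of the two functions. The sketch below is an assumption-laden illustration: the names mergeCount and mergeSortCount are hypothetical, and a std::vector replaces the temporary array. Every pass through the first loop performs exactly one comparison and copies one element, and that loop stops while at least one element is still uncopied, so a merge of n elements makes at most n - 1 comparisons.

```cpp
#include <vector>

// Merges the sorted ranges a[low..mid] and a[mid+1..high] and returns the
// number of element comparisons performed (hypothetical helper).
long mergeCount(std::vector<int>& a, int low, int mid, int high)
{
    std::vector<int> b;
    b.reserve(high - low + 1);
    int i = low;
    int j = mid + 1;
    long comparisons = 0;
    while (i <= mid && j <= high)
    {
        ++comparisons;                       // one comparison per element copied here
        if (a[i] < a[j])
            b.push_back(a[i++]);
        else
            b.push_back(a[j++]);
    }
    while (i <= mid)                         // leftovers need no comparisons
        b.push_back(a[i++]);
    while (j <= high)
        b.push_back(a[j++]);
    for (int k = 0; k < static_cast<int>(b.size()); ++k)
        a[low + k] = b[k];
    return comparisons;
}

// Recursive merge sort that returns the total number of comparisons.
long mergeSortCount(std::vector<int>& a, int low, int high)
{
    if (low >= high)
        return 0;
    int mid = (low + high) / 2;
    long c = mergeSortCount(a, low, mid);
    c += mergeSortCount(a, mid + 1, high);
    c += mergeCount(a, low, mid, high);
    return c;
}
```

Merging two perfectly interleaved lists such as 1 3 5 7 and 2 4 6 8 reaches the maximum of n - 1 comparisons, while merging 1 2 3 4 with 5 6 7 8 needs only 4.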
Analysis of MergeSort() Now we can analyze MergeSort(). Let $c_n$ be the number of comparisons needed by MergeSort() for a list of length $n$. Then $c_n \le c_{\lfloor n/2 \rfloor} + c_{\lceil n/2 \rceil} + (n - 1)$. Furthermore, $c_1 = 0$.
Analysis of MergeSort() Assume the worst case: $c_n = c_{\lfloor n/2 \rfloor} + c_{\lceil n/2 \rceil} + (n - 1)$. Compute:
$c_1 = 0$.
$c_2 = 2c_1 + 1 = 1$.
$c_3 = c_1 + c_2 + 2 = 3$.
$c_4 = 2c_2 + 3 = 5$.
Analysis of MergeSort()
$c_5 = c_2 + c_3 + 4 = 8$.
$c_6 = 2c_3 + 5 = 11$.
$c_7 = c_3 + c_4 + 6 = 14$.
$c_8 = 2c_4 + 7 = 17$.
$c_9 = c_4 + c_5 + 8 = 21$.
and so on.
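The table of values can be extended mechanically. The following sketch fills in the worst-case recurrence from the bottom up; the function name worstCaseComparisons is an assumption, and n is assumed to be at least 1.

```cpp
#include <vector>

// Worst-case comparison count c_n from the recurrence
//   c_1 = 0,  c_n = c_floor(n/2) + c_ceil(n/2) + (n - 1).
// Assumes n >= 1 (hypothetical helper, not part of the lecture code).
long worstCaseComparisons(int n)
{
    std::vector<long> c(n + 1, 0);
    for (int m = 2; m <= n; ++m)
    {
        // m / 2 is floor(m/2); m - m / 2 is ceil(m/2).
        c[m] = c[m / 2] + c[m - m / 2] + (m - 1);
    }
    return c[n];
}
```

For example, worstCaseComparisons(9) reproduces the value 21 computed by hand.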
Analysis of MergeSort() Note that
$c_1 = 0 = (-1) \cdot 1 + 1$.
$c_2 = 1 = 0 \cdot 2 + 1$.
$c_4 = 5 = 1 \cdot 4 + 1$.
$c_8 = 17 = 2 \cdot 8 + 1$.
Does $c_{16} = 3 \cdot 16 + 1 = 49$? Does $c_{32} = 4 \cdot 32 + 1 = 129$?
Analysis of MergeSort() When $n$ is a power of 2, this generalizes as $c_n = (\log_2 n - 1)\,n + 1 = n \log_2 n - n + 1$.
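One way to see where this formula comes from is to unroll the worst-case recurrence for $n = 2^m$. This is a sketch, with $m$ introduced here for illustration:

```latex
\begin{align*}
c_{2^m} &= 2\,c_{2^{m-1}} + (2^m - 1) \\
        &= 2^m c_1 + \sum_{j=0}^{m-1} 2^j \left(2^{m-j} - 1\right) \\
        &= m\,2^m - (2^m - 1) \\
        &= (m - 1)\,2^m + 1 .
\end{align*}
```

Substituting $m = \log_2 n$ gives $(\log_2 n - 1)\,n + 1$.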
Analysis of MergeSort() Assume the worst case: $c_n = c_{\lfloor n/2 \rfloor} + c_{\lceil n/2 \rceil} + (n - 1)$. Show by induction that $c_n \ge n \log_2 n - n + 1$. The base case is trivial. So suppose the inequality is true for all $n$ from 1 to $k - 1$ for some $k \ge 2$. Show that the inequality is true when $n = k$.
Analysis of MergeSort() Suppose k is even. Then
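One way to carry out the even case, assuming the inductive hypothesis $c_n \ge n \log_2 n - n + 1$ for $n = k/2$:

```latex
\begin{align*}
c_k &= 2\,c_{k/2} + (k - 1) \\
    &\ge 2\left(\tfrac{k}{2}\log_2\tfrac{k}{2} - \tfrac{k}{2} + 1\right) + (k - 1) \\
    &= k(\log_2 k - 1) - k + 2 + (k - 1) \\
    &= k\log_2 k - k + 1 .
\end{align*}
```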
Analysis of MergeSort() Suppose k is odd. Then we need the following lemma. Lemma: if $k > 1$, then $(k - 1)\log(k - 1) + (k + 1)\log(k + 1) > 2k \log k$. Proof: Use calculus; the function $x \log x$ is strictly convex for $x > 0$.
Analysis of MergeSort() Then
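A sketch of the odd case, assuming the inductive hypothesis $c_n \ge n \log_2 n - n + 1$ for $n = (k-1)/2$ and $n = (k+1)/2$, with the lemma applied in the last step:

```latex
\begin{align*}
c_k &= c_{(k-1)/2} + c_{(k+1)/2} + (k - 1) \\
    &\ge \left(\tfrac{k-1}{2}\log_2\tfrac{k-1}{2} - \tfrac{k-1}{2} + 1\right)
       + \left(\tfrac{k+1}{2}\log_2\tfrac{k+1}{2} - \tfrac{k+1}{2} + 1\right) + (k - 1) \\
    &= \tfrac{1}{2}\bigl[(k-1)\log_2(k-1) + (k+1)\log_2(k+1)\bigr] - k - k + 2 + (k - 1) \\
    &= \tfrac{1}{2}\bigl[(k-1)\log_2(k-1) + (k+1)\log_2(k+1)\bigr] - k + 1 \\
    &> k\log_2 k - k + 1 .
\end{align*}
```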
Analysis of MergeSort() Thus, $c_n$ is $\Omega(n \log_2 n)$. Now we will show that $c_n$ is $O(n \log_2 n)$. It will follow that $c_n$ is $\Theta(n \log_2 n)$.
Analysis of MergeSort() Show by induction that $c_n \le 2n \log_2 n$. The base case is trivial. Suppose that the inequality is true for all $n$ from 1 to $k - 1$ for some $k \ge 2$. Show that it is true when $n = k$.
Analysis of MergeSort() Suppose k is even. Then
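One way to carry out the even case, assuming the inductive hypothesis $c_n \le 2n \log_2 n$ for $n = k/2$:

```latex
\begin{align*}
c_k &= 2\,c_{k/2} + (k - 1) \\
    &\le 2 \cdot 2 \cdot \tfrac{k}{2}\log_2\tfrac{k}{2} + (k - 1) \\
    &= 2k(\log_2 k - 1) + k - 1 \\
    &= 2k\log_2 k - k - 1 \\
    &\le 2k\log_2 k .
\end{align*}
```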
Analysis of MergeSort() Suppose k is odd. Then
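A sketch of the odd case, assuming the inductive hypothesis $c_n \le 2n \log_2 n$ for $n = (k-1)/2$ and $n = (k+1)/2$, and using the bounds $(k-1)/2 < k/2$ and $(k+1)/2 \le k$:

```latex
\begin{align*}
c_k &= c_{(k-1)/2} + c_{(k+1)/2} + (k - 1) \\
    &\le (k-1)\log_2\tfrac{k-1}{2} + (k+1)\log_2\tfrac{k+1}{2} + (k - 1) \\
    &\le (k-1)(\log_2 k - 1) + (k+1)\log_2 k + (k - 1) \\
    &= 2k\log_2 k .
\end{align*}
```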
Analysis of MergeSort() Therefore, the worst case of the merge sort is $\Theta(n \log_2 n)$.
Analysis of MergeSort() Suppose that MergeSort() sorts a list of 100 numbers in 1 s. How long will it take to sort a list of one thousand numbers? How long will it take to sort a list of one million numbers? How long will it take to sort a list of one billion numbers?
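These questions can be approached by scaling the known time by the ratio of the $n \log_2 n$ values. The sketch below assumes the run time is exactly proportional to $n \log_2 n$, which is an idealization that ignores constant overhead and lower-order terms. The function name estimateTime and its parameters are hypothetical.

```cpp
#include <cmath>

// Estimated run time for n items, assuming the time is proportional to
// n * log2(n) and that baseN items take baseTime (in any time unit).
double estimateTime(double baseN, double baseTime, double n)
{
    return baseTime * (n * std::log2(n)) / (baseN * std::log2(baseN));
}
```

Under that assumption, sorting 1000 numbers takes about 15 times as long as sorting 100, since $(1000 \cdot \log_2 1000)/(100 \cdot \log_2 100) = 10 \cdot 1.5$.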