Sorting - Merge Sort Cmput Lecture 12 Department of Computing Science University of Alberta ©Duane Szafron 2000 Some code in this lecture is based on code from the book: Java Structures by Duane A. Bailey or the companion structure package Revised 1/26/00
©Duane Szafron About This Lecture In this lecture we will learn about a sorting algorithm called the Merge Sort. We will study its implementation and its time and space complexity.
©Duane Szafron Outline The Merge Sort Algorithm Merge Sort - Arrays Time and Space Complexity of Merge Sort
©Duane Szafron The Sort Problem Given a collection, with elements that can be compared, put the elements in increasing or decreasing order
©Duane Szafron Merge Sort Algorithm - Splitting Split the list into two halves: the bottom part and the top part. Make a recursive call to Merge Sort for each half
©Duane Szafron Merge Sort Algorithm - Merging Merge the bottom sorted half and the top sorted half into a sorted whole using a Merge algorithm
©Duane Szafron Merge Algorithm 0 The Merge algorithm requires temporary storage, the same size as the collection. Before a merge, we place each sub- collection in its own collection
©Duane Szafron Merge Algorithm 1 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 2 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 3 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 4 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 5 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 6 Copy the smallest from the two collections
©Duane Szafron Merge Algorithm 7 When one of the two collections is empty: –If the bottom collection is emptied first, then elements remain in the top collection, but they are in the correct place, so stop
©Duane Szafron Merge Algorithm 8 –If the top collection is emptied first, then elements remain in the bottom collection. –For example, if the original list had the 90 and 60 exchanged and the 80 and 30 exchanged, the top collection would be emptied first –All remaining elements must be copied from the bottom collection to the top.
©Duane Szafron Merge Algorithm - Arrays 1 private static void merge(Comparable top[ ], Comparable bottom[ ] int low, int middle, int high) { // pre: top[middle..high] are ascending // bottom[low..middle-1] are ascending // post: top[low..high] are ascending inttopIndex; //index of top array int botIndex; //index of bottom array int resultIndex; //index of result array topIndex = middle; botIndex = low; resultIndex = low; code based on Bailey pg. 85
©Duane Szafron Merge Algorithm - Arrays 2 // as long as both collections are not empty while (bottIndex < middle) && (topIndex <= high) if (top[topIndex].compareTo(bottom[botIndex]) < 0) // copy from the top collection top[resultIndex++] = top[topIndex++]; else // or copy from the bottom collection top[resultIndex++] = bottom[botIndex++]; // copy any remaining elements from the bottom for (botIndex = botIndex; botIndex < middle; botIndex++) top[resultIndex++] = bottom[botIndex++]; } code based on Bailey pg. 85 What about remaining elements in top?
©Duane Szafron MergeSort Algorithm The MergeSort is recursive, so we create a wrapper method and the recursive method that it calls.
©Duane Szafron MergeSort Algorithm - Arrays public static void mergeSort(Comparable anArray[], int size) { // pre: 0 <= size <= anArray.length // post: values in anArray[0..size - 1] are in // ascending order mergeSortR(anArray, new int[size], 0, size - 1); } code based on Bailey pg. 88
©Duane Szafron MergeSortR Algorithm - Arrays public static void mergeSortR(Comparable anArray[ ], Comparable temp[ ], int low, int high) { // pre: 0 <= low <= high <= anArray.length // post: values in anArray[low..high] are in ascending order int size; // size of the entire collection int index; // index for copying bottom into temp size = high - low + 1; if (size < 2) return; middle = (low + high) / 2; for (index = low; index < middle; index++) temp[index] = anArray[index]; mergeSortR(temp, anArray, low, middle - 1); mergeSortR(anArray, temp, middle, high); merge(anArray, temp, low, middle, high); } code based on Bailey pg. 86
©Duane Szafron MergeSort Calling Sequence mSR(a, t, 0, 8) –mSR(t, a, 0, 3) mSR(t, a, 0, 1) –msR(t, a, 0, 0) * –msR(a, t, 1, 1) * –merge(a, t, 0, 1, 1) mSR(a, t, 2, 3) –msR(t, a, 2, 2) * –msR(a, t, 3, 3) * –merge(a, t, 2, 3, 3) merge(a, t, 0, 2, 3) * will not be traced –mSR(a, t, 4, 8) mSR(t, a, 4, 5) –mSR(t, a, 4, 4) * –mSR(a, t, 5, 5) * –merge(a, t, 4, 5, 5) mSR(a, t, 6, 8) –mSR(t, a, 6, 6) * –mSR(a, t, 7, 8) mSR(t, a, 7, 7) * mSR(a, t, 8, 8) * merge(a, t, 7, 8, 8) –merge(a, t, 6, 7, 8) merge(a, t, 4, 6, 8) –merge(a, t, 0, 4, 8)
©Duane Szafron Merge Sort Trace mSR(0,8) mSR(0,3) mSR(0,1) mergeSort()
©Duane Szafron Merge Sort Trace merge(0,1,1) mSR(2,3) merge(2,3,3)
©Duane Szafron Merge Sort Trace 3 merge(0,2,3) mSR(4,8) mSR(4,5)
©Duane Szafron Merge Sort Trace 4 merge(4,5,5) mSR(6,8) mSR(7,8)
©Duane Szafron Merge Sort Trace 5 merge(7,8,8) merge(6,7,8) merge(4,6,8)
©Duane Szafron Merge Sort Trace 6 merge(0,4,8)
©Duane Szafron Counting Comparisons 1 How many comparison operations are required for a merge sort of an n-element collection? The recursive sort method calls merge() once, each time it divides the collection in half. Then there are n - 1 calls to merge: middle = (low + high) / 2; for (index = low; index < middle; index++) temp[index] = anArray[index]; mergeSortR(temp, anArray, low, middle - 1); mergeSortR(anArray, temp, middle, high); merge(anArray, temp, low, middle, high); i-1 where i = log n 1 log n 2 i-1 = n-1
©Duane Szafron Counting Comparisons 2 // as long as both collections are not empty while (bottIndex < middle) && (topIndex <= high) if (top[topIndex].compareTo(bottom[botIndex]) < 0) // copy from the top collection top[resultIndex++] = top[topIndex++]; else // or copy from the bottom collection top[resultIndex++] = bottom[botIndex++]; Assume that the collection is always divided exactly in half. Each time merge is executed for two lists of size k/2, it does comparisons in a loop and each time, either the bottom index or top index is incremented.
©Duane Szafron Comparisons - Best Case 1 In the best case, the same index (top or bottom) is incremented every time through the loop, so there are k/2 comparisons on each call to merge, for a call with two lists of size k/2. Notice that each level has the same total number of comparisons: n/2 k=4 k=2 k=1 k=2 k=1 k=4 k=2 k=1 k=2 k= best case comparisons per merge n=k=8
©Duane Szafron Comparisons - Best Case 2 Since the depth of the tree is log(n), the total number of comparisons is: n/2 * log(n) = O(n log(n))
©Duane Szafron Comparisons - Worst Case 1 In the worst case, the index that is incremented alternates each time through the loop, so there are k comparisons on each call to merge, for a call with two lists of size k/2. k=4 k=2 k=1 k=2 k=1 k=4 k=2 k=1 k=2 k= worst case comparisons per merge n=k=8
©Duane Szafron Comparisons - Worst Case 2 Since the depth of the tree is log(n), the total number of comparisons is: n * log(n) = O(n log(n))
©Duane Szafron Comparisons - Average Case The average case must be between the best case and the worst case so the average case is: O(n log(n))
©Duane Szafron Counting Assignments 1 // as long as both collections are not empty while (bottIndex < middle) && (topIndex <= high) if (top[topIndex].compareTo(bottom[botIndex]) < 0) // copy from the top collection top[resultIndex++] = top[topIndex++]; else // or copy from the bottom collection top[resultIndex++] = bottom[botIndex++]; How many assignment operations are required for a merge sort of an n-element collection? In the merge method, there is one assignment for every comparison:
©Duane Szafron Counting Assignments 2 However, the recursive sort method “copies” half of a collection of size k to the other collection which takes an extra k/2 assignments for each merge: middle = (low + high) / 2; for (index = low; index < middle; index++) temp[index] = anArray[index]; mergeSortR(temp, anArray, low, middle - 1); mergeSortR(anArray, temp, middle, high); merge(anArray, temp, low, middle, high);
©Duane Szafron Counting Assignments 3 In addition, during a merge, any remaining elements on the bottom (but not the top), must be copied: // copy any remaining elements from the bottom for (botIndex = botIndex; botIndex < middle; botIndex++) top[resultIndex++] = bottom[botIndex++];
©Duane Szafron Assignments - Best Case In the best case there are no remaining bottom elements so each merge has k/2 extra assignments. Since the depth of the tree is log(n), the total number of assignments is: n* log(n) = O(n log(n)) k=8 k=4 k=2 k=1 k=2 k=1 k=4 k=2 k=1 k=2 k= best case comparisons per merge best case extra assignments per merge
©Duane Szafron Assignments - Worst Case In the worst case there are k/2 remaining bottom elements so each merge has k extra assignments. Since the depth of the tree is log(n), the total number of assignments is: 2*n* log(n) = O(n log(n)) k=8 k=4 k=2 k=1 k=2 k=1 k=4 k=2 k=1 k=2 k= worst case comparisons per merge worst case Extra assignments per merge
©Duane Szafron Assignments - Average Case The average case is between the best and worst case so the average number of assignments must also be: O(n log(n)).
©Duane Szafron Time Complexity of Merge Sort Best case: O(n log(n)) for comparisons and assignments. Worst case O(n log(n)) for comparisons and assignments. Average case O(n log(n)) for comparisons and assignments. Note that the insertion sort is actually a better sort than the merge sort if the original collection is almost sorted.
©Duane Szafron Space Complexity of Merge Sort Besides the collection itself, an extra collection of references of the same size is needed. Therefore, the space complexity of Merge Sort is still O(n), but doubling the collection storage may sometimes be a problem. Does the recursion require extra space and if so, how much?