
ALG0183 Algorithms & Data Structures, Lecture 16: Merge sort. Dr Andy Brooks, 8/25/2009.


1 Lecture 16: Merge sort
- comparison sort
- worst-case, average-case, and best-case O(n log n)
- usually not an in-place algorithm (an additional list is needed when merging)
- usually stable implementations
The main disadvantage of merge sort is that it uses extra memory proportional to n. If you are sorting a very big list you will need a lot of memory.
Merge sort is stable if the merge function is stable.
Vocabulary (English/Icelandic): recursion/sjálfkvaðning, recursive/endurkvæmur
Chapter 8.5

2 Merge Sort definition http://www.itl.nist.gov/div897/sqg/dads/HTML/mergesort.html
Definition: A sort algorithm that splits the items to be sorted into two groups, recursively sorts each group, and merges them into a final, sorted sequence. Run time is Θ(n log n).
Θ, formal definition: f(n) = Θ(g(n)) means there are positive constants c_1, c_2, and k, such that 0 ≤ c_1 g(n) ≤ f(n) ≤ c_2 g(n) for all n ≥ k. The values of c_1, c_2, and k must be fixed for the function f and must not depend on n.
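A small worked instance of the Θ definition may help. The function f below is a made-up example, not one from the lecture; it only shows how fixed constants c_1, c_2, and k can be exhibited.

```latex
% Hypothetical example: f(n) = 3 n \log_2 n + 5n, with g(n) = n \log_2 n.
% For n \ge 2 we have \log_2 n \ge 1, hence 5n \le 5 n \log_2 n.
\[
  0 \;\le\; 3\, n \log_2 n \;\le\; 3\, n \log_2 n + 5n \;\le\; 8\, n \log_2 n
  \qquad \text{for all } n \ge 2 .
\]
% So c_1 = 3, c_2 = 8, k = 2 satisfy the definition, and f(n) = \Theta(n \log n).
```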

3 Recursion definition http://www.itl.nist.gov/div897/sqg/dads/HTML/recursion.html
An algorithmic technique where a function, in order to accomplish a task, calls itself with some part of the task.
Note: Every recursive solution involves two major parts or cases, the second part having three components:
- base case(s), in which the problem is simple enough to be solved directly, and
- recursive case(s). A recursive case has three components:
  - divide the problem into one or more simpler or smaller parts of the problem,
  - call the function (recursively) on each part, and
  - combine the solutions of the parts into a solution for the problem.
Depending on the problem, any of these may be trivial or complex.
Note: stack memory problems are possible if the depth of the recursion is large. For merge sort, however, the depth of the recursion is O(log_2 n) (see later).
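The base case and the three components of the recursive case can be seen in a short Java sketch that measures how deep the recursion goes when an index range is halved the way top-down merge sort halves it. The class and method names are hypothetical and are not taken from the lecture or the textbooks.

```java
// Illustrative sketch only; RecursionDepthDemo and depth are hypothetical names.
final class RecursionDepthDemo {

    // Returns the deepest level of recursive calls made when the index
    // range [min, max] is repeatedly split in half, as top-down merge sort does.
    static int depth(int min, int max) {
        if (min >= max) {
            return 0;                        // base case: 0 or 1 item, solved directly
        }
        int mid = min + (max - min) / 2;     // divide the range into two parts
        int left = depth(min, mid);          // call the function on each part
        int right = depth(mid + 1, max);
        return 1 + Math.max(left, right);    // combine the two partial answers
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 16, 1000, 1_000_000}) {
            System.out.println("n = " + n + ", depth = " + depth(0, n - 1));
        }
    }
}
```

For 16 items the depth is 4, and for 1,000,000 items it is only 20, which is why stack memory is rarely a concern for merge sort.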

4 History & Ongoing Development
“A Meticulous Analysis of Mergesort Programs”, Jyrki Katajainen and Jesper Larsson Träff, Lecture Notes in Computer Science, Vol. 1203, Proceedings of the Third Italian Conference on Algorithms and Complexity, pp. 217-228, 1997
“Mergesort is as important in the history of sorting as sorting in the history of computing. A detailed description of bottom-up mergesort, together with a timing analysis, appeared in a report by Goldstine and von Neumann [6] as early as 1948. Today numerous variants of the basic method are known, for instance, top-down mergesort (see, e.g., [17, pp. 165-166]), queue mergesort [7], in-place mergesort (see, e.g., [8]), natural mergesort (see, e.g., [11, pp. 159-163]), as well as other adaptive versions of mergesort (see [5, 14] and the references in these surveys). The development in this paper is based on bottom-up mergesort, or straight mergesort as it was called by Knuth [11, pp. 163-165].”
“New implementations for two-way and four-way bottom-up mergesort are given, the worst-case complexities of which are shown to be bounded by 5.5 n log_2 n + O(n) and 3.25 n log_2 n + O(n), respectively.”

5 Step-by-step example http://en.wikipedia.org/wiki/Merge_sort

6 Step-by-step example from a student paper on Merge sort by Luis Quiles

7 Pseudocode implementation http://www.codecodex.com/wiki/Merge_sort#Pseudocode

8 Pseudocode implementation (continued) http://www.codecodex.com/wiki/Merge_sort#Pseudocode

9 Lewis code © Addison Wesley
public static void mergeSort (Comparable[] data, int min, int max)
{
   if (min < max)
   {
      int mid = (min + max) / 2;
      mergeSort (data, min, mid);      // sort the first half
      mergeSort (data, mid+1, max);    // sort the second half
      merge (data, min, mid, max);     // merge the two sorted halves
   }
}

10 Lewis code © Addison Wesley
public static void merge (Comparable[] data, int first, int mid, int last)
{
   Comparable[] temp = new Comparable[data.length];

   int first1 = first, last1 = mid;     // endpoints of first subarray
   int first2 = mid+1, last2 = last;    // endpoints of second subarray
   int index = first1;                  // next index open in temp array

   // Copy smaller item from each subarray into temp until one
   // of the subarrays is exhausted
   while (first1 <= last1 && first2 <= last2)
   {
      if (data[first1].compareTo(data[first2]) < 0)
      {
         temp[index] = data[first1];
         first1++;
      }
      else
      {
         temp[index] = data[first2];
         first2++;
      }
      index++;
   }

11 Lewis code © Addison Wesley (continued)
   // Copy remaining elements from first subarray, if any
   while (first1 <= last1)
   {
      temp[index] = data[first1];
      first1++;
      index++;
   }

   // Copy remaining elements from second subarray, if any
   while (first2 <= last2)
   {
      temp[index] = data[first2];
      first2++;
      index++;
   }

   // Copy merged data into original array
   for (index = first; index <= last; index++)
      data[index] = temp[index];
}

12 Stable or unstable?
Record keys are a, b, and c. There are 4 records. Two records have the same key b. Let subscripts x and y distinguish the two records with key b.
[Figure (idealised): test-case orderings of the four records, including b_x b_y c a and b_x c b_y a]
if (data[first1].compareTo(data[first2]) < 0) { temp[index] = data[first1]; first1++; } else { temp[index] = data[first2]; first2++; }
Record b_x is not less than b_y, so b_y is placed into the temp array first. (Applies to all test cases.) The test should be <= 0, not < 0. Lewis code is unstable. Check for yourself.
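The "check for yourself" exercise can be run with a small self-contained Java sketch. This is not the Lewis code itself: the class, the Rec record type, and the stable flag are hypothetical names introduced here so that both comparisons (< 0 and <= 0) can be tried on the records b_x, b_y, c, a.

```java
// Illustrative sketch; all names here are hypothetical, not from Lewis or Weiss.
final class StableMergeDemo {

    // A record with a sort key and a tag identifying the individual record.
    static final class Rec implements Comparable<Rec> {
        final char key;
        final String tag;
        Rec(char key, String tag) { this.key = key; this.tag = tag; }
        @Override public int compareTo(Rec other) { return Character.compare(key, other.key); }
        @Override public String toString() { return tag; }
    }

    static void mergeSort(Rec[] data, int min, int max, boolean stable) {
        if (min < max) {
            int mid = min + (max - min) / 2;
            mergeSort(data, min, mid, stable);
            mergeSort(data, mid + 1, max, stable);
            merge(data, min, mid, max, stable);
        }
    }

    static void merge(Rec[] data, int first, int mid, int last, boolean stable) {
        Rec[] temp = new Rec[last - first + 1];
        int i = first, j = mid + 1, k = 0;
        while (i <= mid && j <= last) {
            int c = data[i].compareTo(data[j]);
            // stable: ties are taken from the left subarray (<= 0);
            // unstable: ties are taken from the right subarray (< 0)
            boolean takeLeft = stable ? c <= 0 : c < 0;
            temp[k++] = takeLeft ? data[i++] : data[j++];
        }
        while (i <= mid) temp[k++] = data[i++];     // leftovers from the left subarray
        while (j <= last) temp[k++] = data[j++];    // leftovers from the right subarray
        System.arraycopy(temp, 0, data, first, temp.length);
    }

    // Sorts the records b_x, b_y, c, a and returns the resulting order of tags.
    static String sortedTags(boolean stable) {
        Rec[] data = {
            new Rec('b', "bx"), new Rec('b', "by"), new Rec('c', "c"), new Rec('a', "a")
        };
        mergeSort(data, 0, data.length - 1, stable);
        return java.util.Arrays.toString(data);
    }

    public static void main(String[] args) {
        System.out.println("<  0: " + sortedTags(false));   // prints "<  0: [a, by, bx, c]"
        System.out.println("<= 0: " + sortedTags(true));    // prints "<= 0: [a, bx, by, c]"
    }
}
```

With < 0 the two b records swap their relative order; with <= 0 the original order b_x, b_y is preserved.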

13 Weiss code, Figure 8.8 © Addison Wesley

14 Weiss code, Figure 8.9 © Addison Wesley
Weiss code is stable.

15 “Selecting the Right Algorithm”. Talk given by Michail G. Lagoudakis at the 2001 AAAI Fall Symposium Series: Using Uncertainty within Computation, Cape Cod, MA, November 2001. http://www2.isye.gatech.edu/~mlagouda/presentations.html

16 “Selecting the Right Algorithm” (Lagoudakis, 2001, continued)
Earlier cross-over point.

17 “Selecting the Right Algorithm” (Lagoudakis, 2001, continued)
The dashed line represents the algorithm which starts with Insertion Sort and then switches over to Quicksort at the cross-over point. The hybrid algorithm makes use of knowledge of the interaction between algorithms and performs even better: Insertion Sort, then Merge Sort, then Quicksort.
Optimal policies must be determined empirically, i.e. do an experiment.

18 Many possible improvements can be made to merge sort.
- Sorting can be sped up by choosing a more efficient algorithm for small n.
  - Advice is to use insertion sort for small n.
- The merge need not be performed if the highest element of the first subarray is less than the lowest element in the second subarray. (less than or equal to?)
  - Java's merge sort has this improvement.
  - The improvement can certainly help for nearly ordered lists.
  - (There is a small cost finding the highest and lowest elements.)
- Four-way merging has been reported as being better than two-way merging.
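The first two improvements can be sketched in Java as follows. This is a minimal illustration on int arrays, not Java's library implementation; the class name and the CUTOFF value of 8 are assumptions, and a good cutoff would have to be found by experiment. Because both halves are already sorted when the check is made, the highest element of the first half is a[mid] and the lowest of the second is a[mid + 1], so the check costs one comparison, and "less than or equal to" is sufficient.

```java
// Illustrative sketch; TunedMergeSort and CUTOFF are hypothetical, not from the lecture.
final class TunedMergeSort {

    // Assumed cutoff below which insertion sort is used; tune by experiment.
    static final int CUTOFF = 8;

    static void sort(int[] a) {
        int[] temp = new int[a.length];    // one scratch array reused by every merge
        sort(a, temp, 0, a.length - 1);
    }

    private static void sort(int[] a, int[] temp, int lo, int hi) {
        if (hi - lo < CUTOFF) {            // small range: use insertion sort instead
            insertionSort(a, lo, hi);
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, temp, lo, mid);
        sort(a, temp, mid + 1, hi);
        if (a[mid] <= a[mid + 1]) {        // halves already in order: skip the merge
            return;
        }
        merge(a, temp, lo, mid, hi);
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int x = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > x) {  // shift larger items one place right
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = x;
        }
    }

    private static void merge(int[] a, int[] temp, int lo, int mid, int hi) {
        int i = lo, j = mid + 1, k = lo;
        while (i <= mid && j <= hi) {
            temp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];   // ties come from the left half
        }
        while (i <= mid) temp[k++] = a[i++];
        while (j <= hi) temp[k++] = a[j++];
        System.arraycopy(temp, lo, a, lo, hi - lo + 1);
    }
}
```

Calling TunedMergeSort.sort(values) sorts the array in place; on an already ordered array every merge is skipped.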

19 Some Big-Oh: merge sort (recursive)
- Assume the number of items N to be sorted is a power of 2.
- Assume the cost of merging two sublists of size N/2 is N comparisons. (Actually at most N-1 comparisons.)
- Let T(N) equal the number of comparisons needed to sort N items.
  - A proxy measure for the time needed (ignoring moves/swaps).
- The time to sort N items is the time to sort two sublists of size N/2 plus the time to merge the two sublists together.
The recurrence relation is:
   T(N) = 2 T(N/2) + N   for N > 1
   T(1) = 0
This recurrence relation is typical of many “divide-and-conquer” algorithms. There are several ways of proving that T(N) is N log_2 N.
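Before proving the closed form, the recurrence can be checked numerically. The Java sketch below evaluates T(N) directly from the recurrence for powers of 2 and prints it next to N log_2 N; the class and method names are hypothetical.

```java
// Illustrative sketch; MergeSortRecurrence, t and log2 are hypothetical names.
final class MergeSortRecurrence {

    // T(N) = 2 T(N/2) + N for N > 1, T(1) = 0, assuming N is a power of 2.
    static long t(long n) {
        return (n == 1) ? 0 : 2 * t(n / 2) + n;
    }

    // log_2 of a power of 2, found by counting how often n can be halved.
    static int log2(long n) {
        int levels = 0;
        while (n > 1) {
            n /= 2;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println("N = " + n + "  T(N) = " + t(n)
                    + "  N*log2(N) = " + n * log2(n));
        }
    }
}
```

For example, T(2) = 2, T(4) = 8, T(8) = 24 and T(16) = 64 = 16 × 4, matching N log_2 N in each case.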

20 O(n log n) proof by recursion tree
[Figure: recursion tree with sublist sizes 16, 8, 4, 2 at successive levels]
If the number of items N = 16, there are log_2 16 = 4 levels.
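The recursion-tree argument can be written out as a sum. This is a sketch under the lecture's assumptions (N a power of 2, a merge producing m items costs m comparisons); the level index i is notation introduced here.

```latex
% Level i of the tree (i = 0, 1, ..., \log_2 N - 1) has 2^i merges,
% each producing N / 2^i items at a cost of N / 2^i comparisons.
\[
  T(N) \;=\; \sum_{i=0}^{\log_2 N - 1} 2^{i} \cdot \frac{N}{2^{i}}
       \;=\; \sum_{i=0}^{\log_2 N - 1} N
       \;=\; N \log_2 N .
\]
% For N = 16: 4 levels, each costing 16 comparisons, so T(16) = 64.
```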

