EFFICIENCY & SORTING II CITS1001
Scope of this lecture
- Quicksort and mergesort
- Performance comparison
Recursive sorting
All of the algorithms so far build up the "sorted part" of the array one element at a time. What if we take a completely different approach?
Faster algorithms split the elements to be sorted into groups, sort the groups separately, then combine the results.
There are two principal approaches:
- "Intelligent" splitting and "simple" combining
- Simple splitting and "intelligent" combining
These are divide-and-conquer algorithms.
Quicksort
When sorting n items, quicksort works as follows:
- Choose one of the items p to be the pivot
- Partition the items into L (items smaller than p) and U (items larger than p)
- L' = sort(L)
- U' = sort(U)
- The sorted array is then L' + p + U', in that order
Intelligent splitting, and simple combining.
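The scheme above can be sketched directly. This is an illustrative version only, not the in-place code used later in the lecture: it builds new lists for L and U, and the class name QuickSketch is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative (not in-place) quicksort mirroring the description above:
// choose a pivot, partition into smaller/larger, sort each part, concatenate.
public class QuickSketch {
    public static List<Integer> sort(List<Integer> items) {
        if (items.size() <= 1) return new ArrayList<>(items);
        int p = items.get(items.size() - 1);          // choose the last item as the pivot
        List<Integer> smaller = new ArrayList<>();
        List<Integer> larger  = new ArrayList<>();
        for (int i = 0; i < items.size() - 1; i++) {  // partition the remaining items
            if (items.get(i) <= p) smaller.add(items.get(i));
            else                   larger.add(items.get(i));
        }
        List<Integer> result = sort(smaller);         // L'
        result.add(p);                                // + p
        result.addAll(sort(larger));                  // + U'
        return result;
    }
}
```

The in-place version shown later avoids building these intermediate lists, which is why it is the one used in practice.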
Behaviour of quicksort
[Diagram: choose a pivot (7); split into items smaller than the pivot and items larger than the pivot; sort each part; append the results.]
Second level
[Diagram: within a part, choose a pivot (9); again split into items smaller than the pivot and items larger than the pivot; sort; append.]
Code for quickSort

public static void quickSort(int[] a) {
    qsort(a, 0, a.length - 1);
}

// sort a[l..u] inclusive
private static void qsort(int[] a, int l, int u) {
    if (l < u) {
        int p = partition(a, l, u);
        qsort(a, l, p - 1);
        qsort(a, p + 1, u);
    }
}

What if l == u? Then the range holds only one element, so no sorting is needed and qsort returns immediately.
Code for partition

// put the pivot into a[si],
// with smaller items on its left and larger items on its right
private static int partition(int[] a, int l, int u) {
    // this code always uses a[u] as the pivot
    int si = l;
    for (int i = l; i < u; i++)
        if (a[i] <= a[u])
            swap(a, i, si++);  // swap small elements to the front
    swap(a, si, u);            // swap the pivot to be between the smalls and larges
    return si;
}
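Assembled into one runnable class, the pieces look like this. The swap helper is assumed by the slides but not shown, so a minimal version is included; the class name QuickDemo is invented for this example.

```java
// Runnable assembly of the quicksort code from the slides.
public class QuickDemo {
    public static void quickSort(int[] a) {
        qsort(a, 0, a.length - 1);
    }

    // sort a[l..u] inclusive
    private static void qsort(int[] a, int l, int u) {
        if (l < u) {
            int p = partition(a, l, u);
            qsort(a, l, p - 1);
            qsort(a, p + 1, u);
        }
    }

    // put the pivot (a[u]) into its final position and return that index
    private static int partition(int[] a, int l, int u) {
        int si = l;
        for (int i = l; i < u; i++)
            if (a[i] <= a[u])
                swap(a, i, si++);  // swap small elements to the front
        swap(a, si, u);            // swap the pivot between the smalls and larges
        return si;
    }

    // assumed helper: exchange a[i] and a[j]
    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```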
Behaviour of partition
[Trace: the pivot is a[u]; si starts at l. Scanning left to right: a[0] < a[u], so a[0] ↔ a[si] and si++; a[2] < a[u], so a[2] ↔ a[si] and si++; a[5] < a[u], so a[5] ↔ a[si] and si++; finally a[7] ↔ a[si] puts the pivot between the smalls and larges, and si is returned.]
Mergesort
When sorting n items, merge sort works as follows:
- Let F be the front half of the array, and B be the back half
- F' = sort(F)
- B' = sort(B)
- Merge F' and B' to get the sorted list: repeatedly compare their first elements and take the smaller one
Simple splitting, and intelligent combining.
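The merge step on its own can be sketched as follows. This standalone version (the class name MergeSketch is invented for illustration) merges two already-sorted arrays by repeatedly taking the smaller front element, exactly as described above; the in-place merge used later in the lecture is more intricate.

```java
// Merge two already-sorted arrays by repeatedly taking the smaller front element.
public class MergeSketch {
    public static int[] merge(int[] f, int[] b) {
        int[] out = new int[f.length + b.length];
        int i = 0, j = 0, z = 0;
        while (i < f.length && j < b.length)          // both lists still alive
            out[z++] = (f[i] <= b[j]) ? f[i++] : b[j++];
        while (i < f.length) out[z++] = f[i++];       // copy any leftovers of F'
        while (j < b.length) out[z++] = b[j++];       // copy any leftovers of B'
        return out;
    }
}
```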
Behaviour of mergesort
[Diagram: split the array into a front half and a back half; sort each half; merge the results.]
Second level
[Diagram: each half is again split into a front half and a back half, sorted, and merged.]
Code for mergeSort

public static void mergeSort(int[] a) {
    msort(a, 0, a.length - 1);
}

// sort a[l..u] inclusive
private static void msort(int[] a, int l, int u) {
    if (l < u) {
        int m = (l + u) / 2;
        msort(a, l, m);
        msort(a, m + 1, u);
        merge(a, l, m, u);
    }
}

Again, if l == u, there is only one element: no sorting is needed.
Code for merge

// merge a[l..m] with a[m+1..u]
private static void merge(int[] a, int l, int m, int u) {
    while (l <= m && a[l] <= a[m + 1]) l++;      // small elements on the 1st list needn't be moved
    if (l <= m) {                                // if the 1st list is exhausted, we're done
        while (u >= m + 1 && a[u] >= a[m]) u--;  // large elements on the 2nd list needn't be moved
        int start = l;                           // record the start and finish points of the 1st list
        int finish = m++;
        int[] b = new int[u - l + 1];            // this is where we will put the sorted list
        int z = 0;
        while (m <= u)                           // while the 2nd list is alive, copy the smallest element to b
            if (a[l] <= a[m]) b[z++] = a[l++];
            else              b[z++] = a[m++];
        while (z < b.length) b[z++] = a[l++];    // copy the rest of the 1st list
        for (int i = 0; i < b.length; i++)
            a[start + i] = b[i];                 // copy the sorted list back from b
    }
}
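For experimenting, the two slides above can be assembled into one runnable class (the class name MergeDemo is invented; the method bodies follow the slides):

```java
// Runnable assembly of the mergesort code from the slides.
public class MergeDemo {
    public static void mergeSort(int[] a) {
        msort(a, 0, a.length - 1);
    }

    // sort a[l..u] inclusive
    private static void msort(int[] a, int l, int u) {
        if (l < u) {
            int m = (l + u) / 2;
            msort(a, l, m);
            msort(a, m + 1, u);
            merge(a, l, m, u);
        }
    }

    // merge a[l..m] with a[m+1..u]
    private static void merge(int[] a, int l, int m, int u) {
        while (l <= m && a[l] <= a[m + 1]) l++;      // skip small elements on the 1st list
        if (l <= m) {
            while (u >= m + 1 && a[u] >= a[m]) u--;  // skip large elements on the 2nd list
            int start = l;                           // start of the region to rewrite
            m++;                                     // m now marks the head of the 2nd list
            int[] b = new int[u - l + 1];            // workspace for the merged run
            int z = 0;
            while (m <= u)                           // while the 2nd list is alive
                if (a[l] <= a[m]) b[z++] = a[l++];
                else              b[z++] = a[m++];
            while (z < b.length) b[z++] = a[l++];    // copy the rest of the 1st list
            for (int i = 0; i < b.length; i++)
                a[start + i] = b[i];                 // copy the merged run back
        }
    }
}
```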
Efficiency experiment
Is there any difference between the performance of all these sorting algorithms? After all, they all achieve the same result… Which one(s) are more efficient? Why?
Experiment: use the provided Sorter class to estimate the execution time of each algorithm for sorting a large, disordered array. Graph your results.
Performance comparison

Time to sort (ms):

Algorithm   1,000 items   10,000 items   100,000 items   1,000,000 items
Bubble      2             200            20,000          2,052,560
Selection   1             51             5,925           605,594
Insertion   1             23             2,575           281,493
Quick
Merge

(The Quick and Merge rows are left blank, to be filled in from the experiment.)
Analysis
Why are quicksort and mergesort so much faster?
The first three algorithms all reduce the number of items to be sorted by one in each pass, and each pass takes linear time. Therefore their overall run-time is n², i.e. quadratic: multiplying the number of items by 10 multiplies the run-time by 10² = 100.
Quicksort and mergesort reduce the number of items by half at each level, and each level takes linear time. Therefore their overall run-time is n log₂ n: multiplying the number of items by 10 multiplies the run-time by 10 and a bit.
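These growth rates can be checked with a few lines of arithmetic (a sketch; the class and helper names are invented for this example):

```java
// Compare how n^2 and n log2 n grow when n is multiplied by 10.
public class GrowthDemo {
    static double quadratic(int n) { return (double) n * n; }
    static double nLogN(int n)     { return n * (Math.log(n) / Math.log(2)); }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 1_000_000; n *= 10)
            System.out.printf("n=%,9d   n^2=%,15.0f   n log2 n=%,12.0f%n",
                              n, quadratic(n), nLogN(n));
    }
}
```

Each tenfold increase in n multiplies the n² column by exactly 100, while the n log₂ n column grows by a factor of roughly 10 to 13: "10 and a bit".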
A note on the accuracy of such tests
Assessing the execution time of Java code this way is not completely accurate, and you will not always get the same results. Activities such as garbage collection may affect the times, as can other applications running concurrently on your computer. We "average out" unrepresentative examples by using:
- Random data
- Multiple runs
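A minimal timing harness in the spirit of this experiment might look like the following. This is a sketch, not the provided Sorter class: the class name TimingSketch is invented, and Arrays.sort stands in for whichever algorithm is under test.

```java
import java.util.Arrays;
import java.util.Random;

// Minimal timing harness: random data, several runs, average the elapsed time.
public class TimingSketch {
    public static long averageMillis(int size, int runs) {
        Random rng = new Random();
        long totalNanos = 0;
        for (int r = 0; r < runs; r++) {
            int[] a = rng.ints(size).toArray();  // fresh random data each run
            long start = System.nanoTime();
            Arrays.sort(a);                      // stand-in for the algorithm under test
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / runs / 1_000_000;    // average run time, in milliseconds
    }
}
```

Note that nanoTime measures elapsed wall-clock time, so the caveats above (garbage collection, other applications) still apply; averaging over runs only reduces, not removes, the noise.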
Summary
We study sorting algorithms because they provide good examples of many of the features that affect the run-time of program code. When checking the efficiency of your own code, consider:
- Number of loops, and depth of nesting
- Number of comparison operations
- Number of swap (or similar) operations