Published by Francis Daniels. Modified over 9 years ago.
1
ALG0183 Algorithms & Data Structures, Lecture 17: Quicksort
8/25/2009, ALG0183 Algorithms & Data Structures by Dr Andy Brooks
– comparison sort
– worst case O(n²)
– average case and best case O(n log n)
– in-place algorithm
– unstable sort
Unstable sorts can be made stable by doing extra work. This can involve a pass through the sorted data to put back in order any “out-of-order” duplicate keys. But how do you detect “out-of-order”?
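One possible answer to the closing question, sketched under assumptions: record each item's position before sorting, so that "out-of-order" means "equal keys whose original positions are descending". The class names `Tagged` and `StableByIndex` are hypothetical and not part of the lecture. If the original position is also used as a tiebreaker in the comparison, no two records compare as equal and any comparison sort gives a stable result, at the cost of O(n) extra space.

```java
import java.util.Arrays;

// Hypothetical helper: pairs a key with the position it held before sorting.
final class Tagged implements Comparable<Tagged> {
    final String key;
    final int originalIndex;

    Tagged(String key, int originalIndex) {
        this.key = key;
        this.originalIndex = originalIndex;
    }

    // Compare by key first; equal keys are ordered by original position,
    // so two distinct records never compare as equal.
    @Override
    public int compareTo(Tagged other) {
        int byKey = key.compareTo(other.key);
        return byKey != 0 ? byKey : Integer.compare(originalIndex, other.originalIndex);
    }
}

final class StableByIndex {
    // Tags every key with the index it currently occupies.
    static Tagged[] tag(String[] keys) {
        Tagged[] tagged = new Tagged[keys.length];
        for (int i = 0; i < keys.length; i++) {
            tagged[i] = new Tagged(keys[i], i);
        }
        return tagged;
    }

    // In key-sorted data, duplicates are adjacent, so it is enough to check
    // neighbouring records: equal keys with descending original positions
    // are "out-of-order".
    static boolean hasOutOfOrderDuplicates(Tagged[] sortedByKey) {
        for (int i = 1; i < sortedByKey.length; i++) {
            Tagged prev = sortedByKey[i - 1];
            Tagged cur = sortedByKey[i];
            if (prev.key.equals(cur.key) && prev.originalIndex > cur.originalIndex) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Tagged[] records = tag(new String[] {"b", "b", "c", "a"});
        Arrays.sort(records); // any comparison sort works with this compareTo
        System.out.println(hasOutOfOrderDuplicates(records));
    }
}
```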
2
Quicksort definition
http://www.itl.nist.gov/div897/sqg/dads/HTML/quicksort.html
Pick an element from the array (the pivot), partition the remaining elements into those greater than and less than this pivot, and recursively sort the partitions. There are many variants of the basic scheme above: to select the pivot, to partition the array, to stop the recursion on small partitions, etc.
Note: Quicksort has running time Θ(n²) in the worst case, but it is typically O(n log n). In practical situations, a finely tuned implementation of quicksort beats most sort algorithms, including sort algorithms whose theoretical complexity is O(n log n) in the worst case. Select can be used to always pick good pivots, thus giving a variant with O(n log n) worst-case running time.
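The definition above can be sketched almost word for word if the in-place requirement is dropped. The following is only an illustration of the pick/partition/recurse scheme. It allocates new lists at every level, so it is not the in-place algorithm the lecture goes on to study. The class name `SimpleQuicksort` and the choice of the middle element as pivot are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleQuicksort {
    // Returns a new sorted list; the input list is not modified.
    static List<Integer> sort(List<Integer> items) {
        if (items.size() <= 1) {
            return new ArrayList<>(items); // nothing to partition
        }
        Integer pivot = items.get(items.size() / 2); // pick an element

        // Partition into smaller than, equal to, and greater than the pivot.
        List<Integer> smaller = new ArrayList<>();
        List<Integer> equal = new ArrayList<>();
        List<Integer> larger = new ArrayList<>();
        for (Integer x : items) {
            int c = x.compareTo(pivot);
            if (c < 0) {
                smaller.add(x);
            } else if (c > 0) {
                larger.add(x);
            } else {
                equal.add(x);
            }
        }

        // Recursively sort the partitions and concatenate the results.
        List<Integer> result = sort(smaller);
        result.addAll(equal);
        result.addAll(sort(larger));
        return result;
    }
}
```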
3
Step-by-step example
Figure 8.10 © Addison Wesley
How do you select a pivot? How do you create partitions?
4
Pseudocode implementation
“Sorting efficiency”, MN Skaredoff, Journal of Clinical Monitoring and Computing, Volume 3, pp 201-209, 1987 ©Springer
Slide annotation: quicksort(P +1, higher);?
5
Pseudocode implementation
“Sorting efficiency”, MN Skaredoff, Journal of Clinical Monitoring and Computing, Volume 3, pp 201-209, 1987 ©Springer
6
Step-by-step example
Figure 4 in “Sorting efficiency”, MN Skaredoff, Journal of Clinical Monitoring and Computing, Volume 3, pp 201-209, 1987 ©Springer
T is bigger than the pivot. C is smaller than the pivot. Swap them.
scan up and down
7
picking the pivot (Weiss Chapter 8.6.3)
Degenerate inputs are the cases when the list is already sorted or when the list contains records that all have the same key.
Choosing the first element as the pivot is acceptable if the data is in random order, but not if the data is already sorted. With sorted data, the partitioning is extremely imbalanced and results in a worst-case performance of O(N²).
“Never use the first element as the pivot.” “Stay away from any strategy that looks only at some key near the front or end of the input group.” Weiss
8
picking the pivot (Weiss Chapter 8.6.3)
A safe choice is to use the middle element, at index (low+high)/2, as the pivot.
– With data that is already sorted, this is the perfect pivot in each recursive call.
– Note that it is still possible to construct a data sequence that forces O(N²) behaviour for this pivot. These worst-case inputs are produced by “adversarial algorithms”.
The best pivot is the median value, but to reduce costs of calculation, the median-of-three pivot works by choosing the median of the first, middle, and last elements.
– With data that is already sorted, the pivot is the median.
A pivot might also be chosen at random.
Glossary: median (Icelandic: miðtala)
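The three pivot strategies on this slide can be sketched as index-selection helpers. This is an illustration only: the class `PivotChoice` and its method names are hypothetical, and unlike the Weiss code later in the lecture, `medianOfThreeIndex` only reports where the median is and does not rearrange the array.

```java
import java.util.Random;

final class PivotChoice {
    // Middle index, written as low + (high - low) / 2 so that the sum
    // low + high cannot overflow an int for very large arrays.
    static int middleIndex(int low, int high) {
        return low + (high - low) / 2;
    }

    // Index (low, middle, or high) holding the median of those three elements.
    static <T extends Comparable<? super T>> int medianOfThreeIndex(T[] a, int low, int high) {
        int mid = middleIndex(low, high);
        T x = a[low], y = a[mid], z = a[high];
        if (x.compareTo(y) <= 0) {
            if (y.compareTo(z) <= 0) {
                return mid;                          // x <= y <= z
            }
            return x.compareTo(z) <= 0 ? high : low; // y is largest: median is max(x, z)
        } else {
            if (x.compareTo(z) <= 0) {
                return low;                          // y < x <= z
            }
            return y.compareTo(z) <= 0 ? high : mid; // x is largest: median is max(y, z)
        }
    }

    // Uniformly random index in [low, high].
    static int randomIndex(int low, int high, Random rng) {
        return low + rng.nextInt(high - low + 1);
    }
}
```

On already sorted data, `medianOfThreeIndex` returns the middle index, which matches the slide's remark that the pivot is then the true median.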
9
median-of-three (Weiss Chapter 8.6.6)
The simplest way of finding the median of the first, middle, and last elements is to sort them in the array, so that the smallest of the three is at position low, the median is in the middle, and the largest is at position high. This means:
– The pivot should be swapped with the element in the next-to-last position.
– Iterations can start at low+1 and high-2.
– Whenever i is searching for a large element, the iteration is guaranteed to stop at the pivot (“and we stop on equality”).
– Whenever j is searching for a small element, the iteration is guaranteed to stop at the first element (“and we stop on equality”).
10
median-of-three (Weiss Chapter 8.6.6)
Figures 8.18, 8.19, and 8.20 © Addison Wesley
11
Small arrays (Weiss Chapter 8.6.7)
Insertion sort is better than quicksort for small n, so one way of optimising quicksort is to switch to insertion sort when sorting small arrays.
“A good cut off is 10 elements... The actual best cutoff is machine dependent.” Weiss
Optimal policies must be determined empirically, i.e. do an experiment.
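The Weiss code later in this lecture calls insertionSort( a, low, high ) on small subarrays, but the routine itself is not shown on the slides. A minimal sketch of an insertion sort restricted to a subrange is given below. The class name `SubarrayInsertionSort` is an assumption, and the body is a standard textbook formulation rather than a transcription of Weiss's own routine.

```java
final class SubarrayInsertionSort {
    // Sorts a[low..high] (both bounds inclusive) and leaves the rest untouched.
    static <T extends Comparable<? super T>> void insertionSort(T[] a, int low, int high) {
        for (int p = low + 1; p <= high; p++) {
            T tmp = a[p]; // element to insert into the sorted prefix a[low..p-1]
            int j = p;
            // Shift larger elements one place right; stop at the subrange start.
            while (j > low && tmp.compareTo(a[j - 1]) < 0) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = tmp;
        }
    }
}
```

Because elements are shifted only when strictly greater than the inserted one, this formulation keeps equal keys in their original order, which is relevant to the stability discussion later in the lecture.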
12
Switching to insertion: when?
http://www.angelfire.com/pq/jamesbarbetti/articles/sorting/001_QuicksortIsBroken.htm
Timing results on 5,000 distinct long integers. The data shows single measures only, not averages, but the trends are reasonably clear. For N=5,000, the greatest time savings occur around m=9 or 10.
Plot legend: ascending order (green), random order (blue), descending order (black), sawtooth order (red)
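An experiment of this kind can be sketched as follows. This is not the code behind the cited timing results: the class `CutoffExperiment`, the middle-element pivot, and the partitioning loop are all assumptions chosen to keep the sketch short. Single timings on the JVM are noisy (JIT warm-up, garbage collection), so a real experiment would repeat each measurement and average.

```java
import java.util.Random;

final class CutoffExperiment {
    // Sorts a[low..high]; ranges of at most `cutoff` elements use insertion sort.
    static void hybridSort(long[] a, int low, int high, int cutoff) {
        if (high - low + 1 <= Math.max(cutoff, 1)) {
            insertionSort(a, low, high);
            return;
        }
        long pivot = a[low + (high - low) / 2]; // middle element as pivot
        int i = low, j = high;
        while (i <= j) {
            while (a[i] < pivot) i++; // scan up
            while (a[j] > pivot) j--; // scan down
            if (i <= j) {
                long tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++;
                j--;
            }
        }
        hybridSort(a, low, j, cutoff);
        hybridSort(a, i, high, cutoff);
    }

    static void insertionSort(long[] a, int low, int high) {
        for (int p = low + 1; p <= high; p++) {
            long tmp = a[p];
            int j = p;
            while (j > low && tmp < a[j - 1]) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = tmp;
        }
    }

    // Times one sort of a copy of the data, so every cutoff sees the same input.
    static long timeNanos(long[] data, int cutoff) {
        long[] copy = data.clone();
        long start = System.nanoTime();
        hybridSort(copy, 0, copy.length - 1, cutoff);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Random rng = new Random(2009); // fixed seed so runs are repeatable
        long[] data = new long[5000];
        for (int k = 0; k < data.length; k++) {
            data[k] = rng.nextLong();
        }
        for (int m = 1; m <= 20; m++) {
            System.out.println("m=" + m + " time(ns)=" + timeNanos(data, m));
        }
    }
}
```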
13
Lewis code © Addison Wesley
In the first call made to quickSort, the values for min and max would encompass all the elements to be sorted. The call to partition returns the index position of the pivot point.
– The pivot point is in its final correct position.

    public static void quickSort (Comparable[] data, int min, int max)
    {
        int pivot;
        if (min < max)
        {
            pivot = partition (data, min, max);  // make partitions
            quickSort(data, min, pivot-1);       // sort left partition
            quickSort(data, pivot+1, max);       // sort right partition
        }
    }
14
Lewis code © Addison Wesley
Slide annotations: using the first element as pivot! The first inner loop scans up; the second scans down.

    private static int partition (Comparable[] data, int min, int max)
    {
        // Use first element as the partition value
        Comparable partitionValue = data[min];
        int left = min;
        int right = max;
        while (left < right)
        {
            // Search for an element that is > the partition element (scan up)
            while (data[left].compareTo(partitionValue) <= 0 && left < right)
                left++;
            // Search for an element that is < the partition element (scan down)
            while (data[right].compareTo(partitionValue) > 0)
                right--;
            if (left < right)
                swap(data, left, right);
        }
        // Move the partition element to its final position
        swap (data, min, right);
        return right;
    }
15
Stable or unstable?
Record keys are a, b, and c. There are 4 records. Two records have the same key b. Let subscripts x and y be used to distinguish the records with key b.
Lewis code is unstable.
    b_x  b_y  c    a      start: pivot is b_x; left at b_x, right at a
    b_x  b_y  c    a      after scanning: left stops at c, right stops at a
    b_x  b_y  a    c      after if (left < right) swap(data, left, right);
    a    b_y  b_x  c      after swap (data, min, right); return right;
The pivot is in the right place, but b_x and b_y have had their order swapped.
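The trace above can be reproduced by running the Lewis partition logic on four labelled records. In this sketch the record class `Rec`, the wrapper class `LewisStabilityDemo`, and the helper `labels` are hypothetical additions. The partition body follows the Lewis code, typed for `Rec` instead of raw `Comparable`, with a `swap` helper written out because the slides do not show one.

```java
// Hypothetical record: compared by key only, so b_x and b_y are "equal".
final class Rec implements Comparable<Rec> {
    final char key;
    final String label;

    Rec(char key, String label) {
        this.key = key;
        this.label = label;
    }

    @Override
    public int compareTo(Rec other) {
        return Character.compare(key, other.key);
    }
}

final class LewisStabilityDemo {
    static void swap(Rec[] data, int i, int j) {
        Rec tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    // Same logic as the Lewis partition: first element is the partition value.
    static int partition(Rec[] data, int min, int max) {
        Rec partitionValue = data[min];
        int left = min;
        int right = max;
        while (left < right) {
            while (data[left].compareTo(partitionValue) <= 0 && left < right) {
                left++;  // scan up
            }
            while (data[right].compareTo(partitionValue) > 0) {
                right--; // scan down
            }
            if (left < right) {
                swap(data, left, right);
            }
        }
        swap(data, min, right); // move the pivot to its final position
        return right;
    }

    // Joins the record labels with single spaces, for display.
    static String labels(Rec[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(data[i].label);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Rec[] data = {
            new Rec('b', "b_x"), new Rec('b', "b_y"), new Rec('c', "c"), new Rec('a', "a")
        };
        int pivotIndex = partition(data, 0, data.length - 1);
        System.out.println(pivotIndex + ": " + labels(data)); // prints "2: a b_y b_x c"
    }
}
```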
16
Weiss code © Addison Wesley

    // Convenience method: no need to specify array limits in the call.
    public static <AnyType extends Comparable<? super AnyType>> void quicksort( AnyType [ ] a )
    {
        quicksort( a, 0, a.length - 1 );
    }

    // Switch to insertion sort for small lists.
    private static final int CUTOFF = 10;

    // Standard swap routine with a tmp variable. A compiler optimisation is
    // to inline this swapping code and save a method call.
    public static final <AnyType> void swapReferences( AnyType [ ] a, int index1, int index2 )
    {
        AnyType tmp = a[ index1 ];
        a[ index1 ] = a[ index2 ];
        a[ index2 ] = tmp;
    }

Our laboratory tests found quicksort to be stable, but the tests were on small lists! We were testing the stability of insertion sort!
17
Weiss code © Addison Wesley (the method continues on the next slide)

    private static <AnyType extends Comparable<? super AnyType>> void quicksort( AnyType [ ] a, int low, int high )
    {
        // For 10 or fewer elements use insertion sort.
        if( low + CUTOFF > high )
            insertionSort( a, low, high );
        else
        {
            // Sort low, middle, high (median-of-three)
            int middle = ( low + high ) / 2;
            if( a[ middle ].compareTo( a[ low ] ) < 0 )
                swapReferences( a, low, middle );
            if( a[ high ].compareTo( a[ low ] ) < 0 )
                swapReferences( a, low, high );
            if( a[ high ].compareTo( a[ middle ] ) < 0 )
                swapReferences( a, middle, high );
18
Weiss code © Addison Wesley (continued from the previous slide)

            // Place pivot at position high - 1
            swapReferences( a, middle, high - 1 );
            AnyType pivot = a[ high - 1 ];

            // Begin partitioning
            int i, j;
            for( i = low, j = high - 1; ; )
            {
                while( a[ ++i ].compareTo( pivot ) < 0 )   // scan up, stop on equality
                    ;
                while( pivot.compareTo( a[ --j ] ) < 0 )   // scan down, stop on equality
                    ;
                if( i >= j )
                    break;
                swapReferences( a, i, j );                 // can swap order of duplicate keys
            }

            // Restore pivot
            swapReferences( a, i, high - 1 );

            quicksort( a, low, i - 1 );    // Sort small elements
            quicksort( a, i + 1, high );   // Sort large elements
        }
    }

Weiss code is unstable.
19
Some Big-Oh: quicksort (recursive), best case
Best case occurs when the sublists are balanced throughout.
– e.g. ordered data with the median as pivot
Assume the number of items N to be sorted is a power of 2.
Assume the cost of partitioning into two sublists of size N/2 is N comparisons. (actually N-1 comparisons)
Let T(N) equal the number of comparisons needed to sort N items.
– A proxy measure for the time needed. (ignoring moves/swaps)
The time to sort N items is the time to sort two sublists of size N/2 plus the time to partition into the two sublists. The recurrence relation is:
    T(N) = 2 T(N/2) + N   for N > 1
    T(1) = 0
This recurrence relation is typical of many “divide-and-conquer” algorithms. Its solution is T(N) = N log₂ N, so Big-Oh is O(N log₂ N).
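The slide states the Big-Oh result without the intermediate steps. One standard way to solve the recurrence, shown here as a sketch under the slide's assumptions (N a power of 2, T(1) = 0), is to divide both sides by N and telescope:

```latex
\begin{align*}
\frac{T(N)}{N}     &= \frac{T(N/2)}{N/2} + 1 \\
\frac{T(N/2)}{N/2} &= \frac{T(N/4)}{N/4} + 1 \\
                   &\;\;\vdots \\
\frac{T(2)}{2}     &= \frac{T(1)}{1} + 1
\end{align*}
% There are log_2 N equations; adding them cancels the intermediate terms.
\[
\frac{T(N)}{N} = \frac{T(1)}{1} + \log_2 N = \log_2 N
\quad\Longrightarrow\quad
T(N) = N \log_2 N = O(N \log N).
\]
```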
20
Some Big-Oh: quicksort (recursive), worst case
Worst case occurs when the sublists are fully unbalanced throughout. The set of small elements is empty and the set of large elements has all the elements except the pivot.
– e.g. ordered data using the first element as pivot
Assume the cost of partitioning N elements is N comparisons. (actually N-1 comparisons)
Let T(N) equal the number of comparisons needed to sort N items.
– A proxy measure for the time needed. (ignoring moves/swaps)
The time to sort N items is the time to sort N-1 items plus the time to partition into the two sublists: empty & (N-1). The recurrence relation is:
    T(N) = T(N-1) + N   for N > 1
    T(1) = 0
Unrolling: T(N-1) = T(N-2) + (N-1), T(N-2) = T(N-3) + (N-2), ..., T(2) = T(1) + 2.
So T(N) = T(1) + 2 + 3 + ... + N = N(N+1)/2 − 1 = O(N²).
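As a quick numerical check of the unrolled sum, the recurrence can be evaluated directly and compared with a closed form. Since T(1) = 0, the sum 2 + 3 + ... + N is the familiar 1 + 2 + ... + N = N(N+1)/2 with the leading 1 missing, i.e. N(N+1)/2 − 1, which is still O(N²). The class and method names below are hypothetical.

```java
final class WorstCaseRecurrence {
    // Evaluates T(N) = T(N-1) + N with T(1) = 0 by unrolling from the bottom up.
    static long comparisons(int n) {
        long t = 0; // T(1) = 0
        for (int k = 2; k <= n; k++) {
            t += k; // T(k) = T(k-1) + k
        }
        return t;
    }

    // Closed form of 2 + 3 + ... + N.
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2 - 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 10, 5000}) {
            System.out.println(n + ": " + comparisons(n) + " vs " + closedForm(n));
        }
    }
}
```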