Chapter 9 continued: Quicksort
ADS2 Lecture 19

Quicksort
- Similar to mergesort: a divide-and-conquer, recursive algorithm
- Average running time is O(n log n)
- Worst-case performance is O(n^2) (but this can be made exponentially unlikely)
- Simple to understand and prove correct
- But for many years it was hard to implement
- In practice, faster than mergesort

Basic algorithm to sort array S:
1. If the number of elements in S is 0 or 1, then return.
2. Pick any element v in S. This is called the pivot.
3. Partition S-{v} (the remaining elements in S) into sets
   S1 = set of elements in S-{v} that are ≤ v, and
   S2 = set of elements in S-{v} that are ≥ v.
4. Return {quicksort(S1) followed by v followed by quicksort(S2)}.
Note that:
- Choice of pivot is crucial
- When we reassemble, there is no need to merge
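
The four steps of the basic algorithm can be transcribed almost directly. The following is a minimal sketch rather than an efficient implementation: it builds new lists instead of sorting in place, it takes the first element as the pivot (an arbitrary choice made only for illustration), and it sends elements equal to the pivot to S2.

```python
def quicksort(s):
    """Return a new sorted list, following the basic algorithm step by step."""
    if len(s) <= 1:                       # step 1: nothing to sort
        return list(s)
    v = s[0]                              # step 2: pick a pivot (assumption: first element)
    rest = s[1:]                          # S - {v}
    s1 = [x for x in rest if x < v]       # step 3: smaller elements
    s2 = [x for x in rest if x >= v]      # ties go to S2 (arbitrary choice)
    return quicksort(s1) + [v] + quicksort(s2)   # step 4: no merge needed


print(quicksort([13, 81, 92, 43, 31, 65, 57, 26, 75, 0]))
# [0, 13, 26, 31, 43, 57, 65, 75, 81, 92]
```

Reassembly is plain concatenation, which is the "no need to merge" point made in the note.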

Example: illustration of steps
Input:            13 81 92 43 31 65 57 26 75 0
Select pivot:     65
Partition:        13 43 31 57 26 0 | 65 | 81 92 75
Quicksort small:  0 13 26 31 43 57
Quicksort large:  75 81 92
Result:           0 13 26 31 43 57 65 75 81 92

Example for you
Use quicksort to sort: 23 3 -7 6 -18 5 1
- Include all stages
- How did you choose your pivot in each case? Describe the strategy used (e.g. “let first element be the pivot”)

Picking the pivot – some inadvisable options
- Use the first element
  - Acceptable if input is random
  - Very poor if input is presorted or in reverse order
    - Virtually all the elements go into S1 or into S2
    - This happens consistently throughout the recursive calls
    - If input is already sorted, it will take quadratic time to do nothing at all!
- Choose the larger of the first two elements
  - Same problems as above
- Pick the pivot randomly
  - Theoretically a good idea: unlikely that a random number would consistently provide a bad partition (unless the random number generator is flawed – possible!)
  - But random number generation is expensive
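
To see why the first element is a poor pivot on presorted input, it helps to count comparisons. The sketch below is an illustration under stated assumptions: `count_comparisons` is a hypothetical helper, it charges one comparison per remaining element per partition, and the input size and random seed are arbitrary.

```python
import random


def count_comparisons(s):
    """Comparisons made by basic quicksort with the first element as pivot."""
    comparisons = 0
    stack = [list(s)]                 # explicit stack avoids deep recursion on sorted input
    while stack:
        part = stack.pop()
        if len(part) <= 1:
            continue
        v, rest = part[0], part[1:]
        comparisons += len(rest)      # each remaining element is compared with the pivot once
        stack.append([x for x in rest if x < v])
        stack.append([x for x in rest if x >= v])
    return comparisons


n = 500
presorted = list(range(n))
shuffled = presorted[:]
random.Random(0).shuffle(shuffled)    # fixed seed, chosen arbitrarily

print(count_comparisons(presorted))   # n(n-1)/2 = 124750
print(count_comparisons(shuffled))    # far fewer; the exact count depends on the shuffle
```

On the presorted list every partition puts all remaining elements into S2, so the counts are n-1, n-2, ..., 1, which sum to n(n-1)/2.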

Median of three partitioning
- The median of n numbers is the (⌊n/2⌋+1)th element (index ⌊n/2⌋) when they are placed in order
  - E.g. the median of -1, 0, 1, 4, 5, 11 is the 4th, i.e. 4
  - The median of -2, 0, 8, 33, 34, 35, 57 is the 4th, i.e. 33
- The median would be the ideal pivot
  - Partition would be balanced (approx. half the elements would go into S1 and half into S2)
  - But hard to calculate and slow (need to sort first!)
- Compromise: median of the left, centre and right elements (without ordering the whole array)
  - Centre position is (left+right)/2, using integer division
  - Take the middle of the three values
  - For -2, 0, 8, 33, 24, 35, -7 take the median of -2, 33 and -7 (i.e. -2)
  - For -2, 0, 8, 33, 24, 35, 26, -7 take the median of -2, 33 and -7 (i.e. -2 again)
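
The selection of the median of the left, centre and right elements can be checked against the two examples above. This is a small sketch that only reads the three values and leaves the array untouched; the function name and signature are illustrative.

```python
def median_of_three(a, left, right):
    """Return the median of a[left], a[centre], a[right] without reordering a."""
    centre = (left + right) // 2              # integer division
    return sorted([a[left], a[centre], a[right]])[1]


print(median_of_three([-2, 0, 8, 33, 24, 35, -7], 0, 6))       # -2
print(median_of_three([-2, 0, 8, 33, 24, 35, 26, -7], 0, 7))   # -2
```

Sorting three values takes constant time, so this costs far less than finding the true median of the whole array.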

Partitioning strategy
- One of several. Known to give good results
- First step: move the pivot to the far right by swapping it with the right element
- Second step: set i to the left element, and j to right-1

Identify pivot (6):           8 1 4 9 6 3 5 2 7 0
Swap pivot and last element:  8 1 4 9 0 3 5 2 7 6   (i points to 8, j points to 7)

- We will assume all elements are distinct
- And consider the case when they are not, later

Partitioning strategy contd.
- Aim: move all small elements to the left part of the array, and all large elements to the right
- While i is to the left of j:
  - Move i right, skipping over elements smaller than the pivot
  - Move j left, skipping over elements larger than the pivot
  - When i and j have stopped, i is pointing to a large element, and j to a small one
  - If i is to the left of j, swap those elements
- Finally, swap the element pointed to by i with the pivot

Partitioning example
(i → v means that i points to the element with value v)
Before first swap:      8 1 4 9 0 3 5 2 7 6   (i → 8, j → 7)
- i stays the same as 8 is already > pivot
- j skips along to element 2
- Swap those elements and repeat the process
After first swap:       2 1 4 9 0 3 5 8 7 6   (i → 2, j → 8)
Before second swap:     2 1 4 9 0 3 5 8 7 6   (i → 9, j → 5)
After second swap:      2 1 4 5 0 3 9 8 7 6   (i → 5, j → 9)
Before third swap:      2 1 4 5 0 3 9 8 7 6   (i → 9, j → 3)
STOP: i and j have crossed
After swap with pivot:  2 1 4 5 0 3 6 8 7 9   (i → 6)
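
The partitioning steps traced above can be sketched as follows. This is one possible rendering, not the lecture's own code: the function name and the explicit `pivot_index` parameter are assumptions, and i and j are advanced after each swap, which gives the same result as the trace.

```python
def partition(a, left, right, pivot_index):
    """Partition a[left..right] in place and return the pivot's final index."""
    a[pivot_index], a[right] = a[right], a[pivot_index]   # move the pivot to the far right
    pivot = a[right]
    i, j = left, right - 1
    while True:
        while a[i] < pivot:                # i skips small elements; the pivot itself stops it
            i += 1
        while j > i and a[j] > pivot:      # j skips large elements
            j -= 1
        if i >= j:                         # i and j have met or crossed
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    a[i], a[right] = a[right], a[i]        # put the pivot between the two parts
    return i


data = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
print(partition(data, 0, len(data) - 1, 4))   # 6
print(data)                                   # [2, 1, 4, 5, 0, 3, 6, 8, 7, 9]
```

The pivot 6 starts at index 4 and ends at index 6, with the small elements to its left and the large ones to its right.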

Partitioning example continued
- When finished, every element in position p<i is small, and every element in position p>i is large
- Would then apply quicksort to the lists (arrays) 2,1,4,5,0,3 and 8,7,9
- Example for you: complete the next stage of the process, i.e. partition the list 2,1,4,5,0,3

When some elements are identical
- Most importantly, when an element that is not the pivot has the same value as the pivot, should we stop or skip?
- Should do the same for i and j, so that partitioning is not biased
- In each case, suppose that all elements are identical:
  - Stopping: many swaps between identical elements, but i and j cross in the middle, so when the pivot is replaced two nearly equal partitions are created (so, like mergesort, O(n log n))
  - Skipping: no swaps between identical elements, but i and j do not cross in the middle, so the pivot will be (re)placed at the last, or second-last, position. Unequal parts. If all values are identical, O(n^2)
- So we choose to stop
(see board)
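
The difference between stopping and skipping can be observed on an array of identical values. The sketch below is illustrative only: `pivot_position` is a hypothetical helper that partitions a copy of the array around its last element and reports where the pivot would end up.

```python
def pivot_position(a, stop_on_equal):
    """Partition a copy of a around its last element; return the pivot's final index."""
    a = list(a)
    right = len(a) - 1
    pivot = a[right]
    i, j = 0, right - 1
    while True:
        if stop_on_equal:                          # stop on elements equal to the pivot
            while a[i] < pivot:
                i += 1
            while j > i and a[j] > pivot:
                j -= 1
        else:                                      # skip over elements equal to the pivot
            while i < right and a[i] <= pivot:
                i += 1
            while j > i and a[j] >= pivot:
                j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    return i                                       # the pivot would be swapped into position i


same = [7] * 9
print(pivot_position(same, stop_on_equal=True))    # 4: two nearly equal parts
print(pivot_position(same, stop_on_equal=False))   # 8: everything lands on one side
```

With stopping, the pivot lands in the middle at the cost of swaps that change nothing; with skipping, one part is empty at every level of recursion.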

Small arrays: using a cutoff
- For small arrays, quicksort is not as fast as insertion sort
- As quicksort is recursive, these cases will occur frequently
- Solution: use quicksort until the subarrays are small, then use insertion sort on the whole array
  - Works very well, as insertion sort is very efficient for nearly sorted arrays
- Can save about 15% in running time
- A good cutoff is n=10, but any cutoff between 5 and 20 is likely to produce similar results
- Also avoids problems like finding the median of three when only two elements are left

Quicksort in practice
MedianOfThree algorithm (see board for discussion)

Algorithm MedianOfThree(A,x,y):
  Input: An array A and integers x,y ≥ 0, such that A has indices x .. y
  Output: value of pivot
  Note: A will have pivot at position y
  c ← (x+y)/2
  if A[x]>A[c] then swap(A,x,c)
  if A[x]>A[y] then swap(A,x,y)
  if A[c]>A[y] then swap(A,c,y)
  swap(A,c,y)    {the median of the three is now at position y}
  pivot ← A[y]
  return pivot

Quicksort in practice contd.

Algorithm QuickSort(A,x,y,cutoff):
  Input: An array A and integers x,y ≥ 0, such that A has indices x .. y, and integer cutoff
  Output: A (almost) sorted
  if (y-x)>cutoff then
    pivot ← MedianOfThree(A,x,y)
    i ← x
    j ← y-1
    while i<=j do
      while A[i]<pivot do i ← i+1
      while (j>=i) and (A[j]>pivot) do j ← j-1
      if (i<=j) then
        swap(A,i,j)
        i ← i+1
        j ← j-1
    swap(A,i,y)
    QuickSort(A,x,i-1,cutoff)
    QuickSort(A,i+1,y,cutoff)
  return A
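
The pieces can be put together in a runnable form. The following Python sketch is one possible rendering, not the lab solution: the names `insertion_sort`, `quick_sort` and `sort` are illustrative, the median is assumed to be moved to position y before partitioning, i and j are advanced after every swap so that duplicate keys cannot stall the loop, and a single insertion sort pass finishes the almost-sorted array.

```python
def insertion_sort(a):
    """Finish off an (almost) sorted list in place."""
    for k in range(1, len(a)):
        item, p = a[k], k
        while p > 0 and a[p - 1] > item:
            a[p] = a[p - 1]              # shift larger elements one place right
            p -= 1
        a[p] = item


def median_of_three(a, x, y):
    """Order a[x], a[c], a[y], then move the median to position y and return it."""
    c = (x + y) // 2
    if a[x] > a[c]:
        a[x], a[c] = a[c], a[x]
    if a[x] > a[y]:
        a[x], a[y] = a[y], a[x]
    if a[c] > a[y]:
        a[c], a[y] = a[y], a[c]
    a[c], a[y] = a[y], a[c]              # median now at y, largest of the three at c
    return a[y]


def quick_sort(a, x, y, cutoff):
    """Partition a[x..y] recursively, leaving short runs for insertion sort."""
    if y - x <= cutoff:
        return
    pivot = median_of_three(a, x, y)
    i, j = x, y - 1
    while i <= j:
        while a[i] < pivot:              # the pivot at position y stops i
            i += 1
        while j >= i and a[j] > pivot:   # bounds check comes first
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    a[i], a[y] = a[y], a[i]              # pivot into its final position
    quick_sort(a, x, i - 1, cutoff)
    quick_sort(a, i + 1, y, cutoff)


def sort(a, cutoff=10):
    quick_sort(a, 0, len(a) - 1, cutoff)
    insertion_sort(a)                    # one pass over the whole, nearly sorted array


values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
sort(values, cutoff=3)
print(values)                            # [1, 1, 2, 3, 3, 4, 5, 5, 5, 6, 9]
```

Because the final insertion sort pass is correct on any input, the cutoff affects only running time, never the result.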

QuickSort – Putting it all together
- This is an exercise for you (part of Lab 5)
- Need to include the following methods:
  - swap
  - medianOfThree
  - insertionSort
  - qSort
- You will be sorting an array of country names
- quickSort will need to be generic
- Will also need to convert mergeSort to be generic, and compare timings for quickSort, mergeSort, insertionSort, and bubbleSort

Analysis of Quicksort
- As for mergesort, T(0)=T(1)=1
- Running time for an array of size n is equal to the running time of the two recursive calls plus the linear time spent in the partition (pivot selection takes only constant time)
- Basic quicksort relation is
    T(n) = T(i) + T(n-i-1) + cn
  where i=|S1| = number of elements in S1
- Worst case: O(n^2)
- Best case: O(n log n)
- Average case: O(n log n)
- Will prove the first two of these on the board
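
The worst-case and best-case bounds follow from the basic relation by telescoping. The derivation below is a sketch of the standard argument, not necessarily the one given on the board. It makes two simplifying assumptions: the T(0) = 1 term is dropped in the worst case, and in the best case n is a power of 2 with both subproblems of size exactly n/2.

```latex
% Worst case: the pivot is always the smallest element, so i = 0.
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(1) + c\sum_{k=2}^{n} k = O(n^2)
\end{align*}

% Best case: the pivot is always the median, so i is about n/2.
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
\frac{T(n)}{n} &= \frac{T(n/2)}{n/2} + c \\
               &= \frac{T(n/4)}{n/4} + 2c \\
               &\;\;\vdots \\
               &= \frac{T(1)}{1} + c\log_2 n \\
T(n) &= n + cn\log_2 n = O(n \log n)
\end{align*}
```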

Finally: Comparison of quicksort and mergesort
1. Both routines recursively solve two subproblems and require linear additional work, but
2. Unlike mergesort, in quicksort the subproblems are not guaranteed to be of equal size
3. However, quicksort is faster because the partitioning step can be performed “in place”, very efficiently. This more than makes up for (2).

Some examples for you
(See extra quicksort examples contained in Lectures folder)
1. Sort 3,1,4,1,5,9,2,6,5,3,5 using quicksort with median-of-three partitioning and a cutoff of 3
2. Perform the first partitioning stage on 11,10,9,8,7,6,5,4,3,2,1. What do you notice?
3. Construct a permutation of 20 elements that is as bad as possible for quicksort, using median-of-three partitioning and a cutoff of 3