CSC317
Quicksort: average run time
We'll prove that the average (expected) run time with random pivots is O(n log n) for any input array.
The randomness is in the choice of pivot, not in the input.
The average case is asymptotically as good as the best case!
Most of Quicksort's work is in comparisons.
Each call to Partition takes constant time plus the number of comparisons in its for loop.
Let X = total number of comparisons in all calls to Partition.

Quicksort: average run time
Rename the elements of A, from smallest to largest, as $z_1, z_2, \ldots, z_n$.
Example: for A = [ ], $z_1 = 1$; $z_2 = 2$; $z_3 = 3$; $z_4 = 4$.
Consider $z_i$, $z_j$ with indices $i < j$, and define the indicator variable
$X_{ij} = 1$ if $z_i$ and $z_j$ are compared, and $X_{ij} = 0$ otherwise.

Quicksort: average run time
How many times will $z_i$ and $z_j$ be compared?
Example: A = [ ], $z_i = 3$, $z_j = 9$.
When will two elements be compared? Only if one of them (3 or 9) is chosen as a pivot, since in Partition each element is compared only with the pivot.
They will then never be compared again, since the pivot is excluded from the subsequent recursive calls. So each pair is compared at most once.

Quicksort: average run time
When will two elements NOT be compared?
In the same example ($z_i = 3$, $z_j = 9$): if pivot = 5, then none of [8 6 9] will be compared to [ ], so $z_i$ and $z_j$ are not compared.
Not in this call of Partition, nor in any later call, since the two sets are then separated into different subarrays.
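
The claim that each pair is compared at most once, and only against a pivot, can be checked empirically. The sketch below is an illustration, not the slides' in-place Partition: it uses a simpler out-of-place quicksort, and the function name `quicksort_pairs` and the input array are hypothetical choices made for this example.

```python
import random
from collections import Counter

def quicksort_pairs(values, rng):
    """Sort distinct values with random pivots and count comparisons per pair."""
    compared = Counter()

    def sort(items):
        if len(items) <= 1:
            return items
        pivot = rng.choice(items)
        smaller, larger = [], []
        for x in items:
            if x == pivot:
                continue
            # Every comparison made in this call involves the pivot.
            compared[(min(x, pivot), max(x, pivot))] += 1
            (smaller if x < pivot else larger).append(x)
        # The pivot is excluded from both recursive calls.
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(values)), compared

rng = random.Random(0)
result, compared = quicksort_pairs([7, 2, 9, 4, 1, 8, 3], rng)
print(result)
print(max(compared.values()))  # largest number of times any pair was compared
```

Whatever pivots are drawn, no pair should appear more than once in `compared`, and pairs that were split by an earlier pivot should not appear at all.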

Quicksort: average run time
Probability that $z_i$ and $z_j$ are compared?
Consider $i < j$ and the set $Z_{ij} = \{z_i, z_{i+1}, \ldots, z_j\}$ (not necessarily in order in array A).
This set stays together in calls to Partition until one of its elements is chosen as a pivot.
Any other pivot is smaller than $z_i$ or larger than $z_j$, so all of $Z_{ij}$ lands on the same side of it.

Quicksort: average run time
If $z_i$ or $z_j$ is the first element of $Z_{ij}$ chosen as a pivot, they will be compared.
Otherwise, another element of $Z_{ij}$ is chosen as pivot first, and they will be separated.
Since there are $(j - i + 1)$ elements in this set and pivots are chosen uniformly at random, the probability that any given element is the first pivot chosen from it is $\frac{1}{j-i+1}$.
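
Quicksort: average run time
$\Pr\{z_i \text{ compared to } z_j\} = \Pr\{z_i \text{ or } z_j \text{ chosen first as pivot in } Z_{ij}\}$
$= \Pr\{z_i \text{ first element chosen from } Z_{ij}\} + \Pr\{z_j \text{ first element chosen from } Z_{ij}\}$ (since the two events are mutually exclusive)
$= \frac{1}{j-i+1} + \frac{1}{j-i+1} = \frac{2}{j-i+1}$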
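
As a quick sanity check of the probability $\frac{2}{j-i+1}$, here is a worked example. The choice $n = 10$ and the particular index pairs are assumptions made only for illustration.

```latex
% Far-apart elements are rarely compared: Z_{1,10} has 10 elements.
\Pr\{z_1 \text{ compared to } z_{10}\} = \frac{2}{10 - 1 + 1} = \frac{1}{5}

% Adjacent elements are always compared: Z_{4,5} has only 2 elements,
% so whichever is chosen first as pivot is compared with the other.
\Pr\{z_4 \text{ compared to } z_5\} = \frac{2}{5 - 4 + 1} = 1
```

The closer two elements are in sorted order, the more likely they are to be compared.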
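
Quicksort: average run time
Let X = total number of comparisons in all calls to Partition, and let $X_{ij} = 1$ if $z_i$ and $z_j$ are compared, $0$ otherwise.
Summing over all possible pairs $i < j$:
$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$
By linearity of expectation, using the probability that $z_i$ is compared to $z_j$:
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ compared to } z_j\} = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$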
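
Quicksort: average run time
$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$, where $\frac{2}{j-i+1}$ is the probability that $z_i$ is compared to $z_j$.
Substituting $k = j - i$:
$E[X] = \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1} < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\log n) = O(n \log n)$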
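
The double sum for $E[X]$ can also be evaluated exactly for small n and compared with the harmonic-sum upper bound used in the last step. This is a minimal sketch; the function names `expected_comparisons` and `harmonic_bound` are hypothetical and introduced only for this example.

```python
from fractions import Fraction

def expected_comparisons(n):
    """E[X] = sum over pairs i < j of 2 / (j - i + 1), as an exact fraction."""
    total = Fraction(0)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            total += Fraction(2, j - i + 1)
    return total

def harmonic_bound(n):
    """Upper bound from the derivation: sum_{i=1}^{n-1} sum_{k=1}^{n} 2/k."""
    h_n = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n - 1) * h_n

for n in (2, 3, 10, 100):
    print(n, float(expected_comparisons(n)), float(harmonic_bound(n)))
```

For n = 2 the exact value is 1 (the single pair is always compared), and for n = 3 it is 8/3. In every row the exact value stays below the bound.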

Quicksort: average run time
We showed that the average run time with random pivots is O(n log n) for any input array.
The randomness is in the choice of pivot.
The average case is asymptotically as good as the best case!

Order statistics: another use of Partition
Given an unsorted array of n elements, find the k-th smallest entry.
$k = 1$: Minimum
$k = \lfloor (n+1)/2 \rfloor$: Median (the lower median when n is even)

Order statistics: another use of Partition
Given an unsorted array of n elements: the 1st, 2nd, 3rd smallest, up to the largest; the median …

Order statistics: let's start simple (1st order statistic)
Minimum(A)
    min = A[1]
    for i = 2 to A.length
        if min > A[i]
            min = A[i]
    return min
Worst case? Θ(n)
Best case? Θ(n)
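
The Minimum pseudocode translates almost line for line into Python. This is a minimal sketch assuming a non-empty list; the sample input is a hypothetical array chosen for illustration.

```python
def minimum(a):
    """Return the smallest element of a non-empty list using n - 1 comparisons."""
    smallest = a[0]                # Python lists are 0-indexed, so A[1] becomes a[0]
    for i in range(1, len(a)):
        if smallest > a[i]:
            smallest = a[i]
    return smallest

print(minimum([7, 2, 9, 4, 1, 8, 3]))  # → 1
```

The loop always inspects every remaining element, which is why the best and worst cases are both Θ(n).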

Order statistics
Input: a set A of n distinct numbers and an index i
Output: the i-th smallest value in A
Selection problem: a more general problem than finding the minimum.
Upper bound? We can solve it in O(n log n), since we can always sort A and return the element at index i of the sorted array.
But we want to do BETTER!!!

Order statistics
The selection problem can be solved in O(n) time on average with a randomized algorithm.
Amazing, given that (at least on average) this is asymptotically the same as finding just the minimum!

Selection problem
Randomized-Select(A, p, r, i)
    if p == r                                       // base case
        return A[p]
    q = Randomized-Partition(A, p, r)
    k = q - p + 1                                   // number of elements from the left up to and including the pivot
    if i == k                                       // the pivot is the i-th smallest
        return A[q]
    elseif i < k
        return Randomized-Select(A, p, q-1, i)      // i-th smallest is on the left
    else
        return Randomized-Select(A, q+1, r, i-k)    // on the right
[Figure: subarray A[p..r] after partitioning, with the pivot at index q; elements left of q are < pivot, elements right of q are > pivot]
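
Randomized-Partition(A, p, r)
    i = Random(p, r)               // pick a random pivot index in p..r
    swap A[r] with A[i]            // move the pivot to the end
    x = A[r]                       // pivot value
    i = p - 1                      // i is reused: now the end of the "<= pivot" region
    for j = p to r-1
        if A[j] <= x
            i = i + 1
            swap A[i] with A[j]
    swap A[i+1] with A[r]          // put the pivot between the two regions
    return i + 1                   // final index of the pivot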

Selection problem
Example: A = [ ], find the 3rd smallest value using Randomized-Select.
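
Randomized-Select and Randomized-Partition can be sketched in Python as follows. The assumptions are: indices are 0-based (so the first call uses p = 0 and r = len(a) - 1), the rank i is still 1-based as in the pseudocode, the random pivot index is named `s` to avoid reusing `i`, and the optional `rng` parameter and the sample array are hypothetical additions for illustration.

```python
import random

def randomized_partition(a, p, r, rng=random):
    """Partition a[p..r] (inclusive) around a random pivot; return its final index."""
    s = rng.randint(p, r)          # randint includes both endpoints
    a[r], a[s] = a[s], a[r]        # move the pivot to the end
    x = a[r]
    i = p - 1                      # end of the "<= pivot" region
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def randomized_select(a, p, r, i, rng=random):
    """Return the i-th smallest (i = 1 is the minimum) element of a[p..r]; reorders a."""
    if p == r:                     # base case: one element left
        return a[p]
    q = randomized_partition(a, p, r, rng)
    k = q - p + 1                  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    elif i < k:
        return randomized_select(a, p, q - 1, i, rng)
    else:
        return randomized_select(a, q + 1, r, i - k, rng)

data = [7, 2, 9, 4, 1, 8, 3]
print(randomized_select(list(data), 0, len(data) - 1, 3))  # → 3
```

The call works on a copy of `data` because the partitioning reorders the list in place. The result does not depend on which pivots are drawn; only the running time does.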

Selection problem
How is this different from the randomized version of Quicksort?
Answer: Randomized-Select makes only one recursive call (left or right), not two.

Selection problem
Analysis, worst case: we always partition around the largest remaining element, so each call recurses on a subarray of size n-1 after a Partition that takes Θ(n).
This gives $T(n) = T(n-1) + \Theta(n) = \Theta(n^2)$, which is worse than a good sorting scheme.
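
Selection problem
Analysis: however, a 1/10 and 9/10 split isn't bad …
Remember the master theorem: if we always recurse on the larger side, $T(n) = T(9n/10) + \Theta(n)$, with $a = 1$, $b = 10/9$, $f(n) = \Theta(n)$. Since $n^{\log_b a} = n^0 = 1$ and $f(n)$ is polynomially larger, case 3 applies.
Therefore: $T(n) = \Theta(n)$.
The average case isn't bad either: Θ(n), with a proof similar to the one for Quicksort.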
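
The linear bound for the 1/10 and 9/10 split can also be checked without the master theorem by unrolling the recurrence. This is a sketch in which the constant $c$ (the cost per element of one Partition call) is an assumption introduced for the example.

```latex
% Always recursing on the larger (9/10) side, with Partition costing at most c n:
T(n) \le T\!\left(\tfrac{9}{10} n\right) + c\,n
     \le c\,n \left( 1 + \tfrac{9}{10} + \left(\tfrac{9}{10}\right)^{2} + \cdots \right)
     \le c\,n \sum_{k=0}^{\infty} \left(\tfrac{9}{10}\right)^{k}
     = c\,n \cdot \frac{1}{1 - \tfrac{9}{10}}
     = 10\,c\,n = O(n)
```

Because the subproblem sizes shrink geometrically, the total work is a constant multiple of the first Partition call. Since that first call already costs Θ(n), the bound is tight.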