Quicksort
Algorithm Design and Analysis (ADA)
Objective
  o describe the quicksort algorithm, its partition function, and analyse its running time under different data conditions
Overview
  1. Quicksort
  2. Partitioning Function
  3. Analysis of Quicksort
  4. Quicksort in Practice
1. Quicksort
  o Proposed by Tony Hoare in the early 1960s.
  o Voted one of the top 10 algorithms of the 20th century in science and engineering.
  o A divide-and-conquer algorithm.
  o Sorts "in place" -- rearranges elements using only the array, as in insertion sort, but unlike merge sort, which uses extra storage.
  o Very practical (after some code tuning).
Divide and Conquer
Quicksort an n-element array:
  1. Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
  2. Conquer: Recursively sort the two subarrays.
  3. Combine: Nothing to do.
Key: implementing a linear-time partitioning function.
Pseudocode
  quicksort(int[] A, int left, int right)
      if (left < right)                      // if the array has 2 or more items
          pivot = partition(A, left, right)
          // recursively sort elements smaller than the pivot
          quicksort(A, left, pivot-1)
          // recursively sort elements bigger than the pivot
          quicksort(A, pivot+1, right)
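As a concrete illustration, here is a minimal runnable Java sketch of the same recursion; it assumes a partition method like the one given in Section 2, and the method names are illustrative rather than taken from the slides.

  // Minimal Java sketch of the recursion above (assumes partition from Section 2).
  static void quicksort(int[] A, int left, int right) {
      if (left < right) {                        // 2 or more items
          int pivot = partition(A, left, right); // pivot's final index
          quicksort(A, left, pivot - 1);         // sort the smaller elements
          quicksort(A, pivot + 1, right);        // sort the bigger elements
      }
  }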
Quicksort Diagram
  (figure: the array partitioned around the pivot; each subarray is then sorted recursively)
Fine Tuning the Code
  o quicksort stops when the subarray is 0 or 1 elements big.
  o When the subarray gets to a small size, switch over to dedicated sorting code rather than relying on recursion (a sketch follows below).
  o quicksort is tail-recursive, a recursive behaviour which can be optimized.
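One possible shape for the small-subarray cutoff, as a hedged sketch: the CUTOFF value and the insertionSort helper are illustrative assumptions, not part of the original slides (insertion sort is a common choice for the dedicated code).

  // Sketch of the small-subarray cutoff (CUTOFF is an illustrative value;
  // something in the 10-20 element range is typical).
  static final int CUTOFF = 16;

  static void quicksort(int[] A, int left, int right) {
      if (right - left + 1 <= CUTOFF) {
          insertionSort(A, left, right);   // dedicated code for small subarrays
          return;
      }
      int pivot = partition(A, left, right);
      quicksort(A, left, pivot - 1);
      quicksort(A, pivot + 1, right);
  }

  // Straightforward insertion sort on A[left..right].
  static void insertionSort(int[] A, int left, int right) {
      for (int i = left + 1; i <= right; i++) {
          int key = A[i];
          int j = i - 1;
          while (j >= left && A[j] > key) {
              A[j + 1] = A[j];
              j--;
          }
          A[j + 1] = key;
      }
  }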
Tail-Call Optimization
  o Tail-call optimization avoids allocating a new stack frame for a called function.
      o The new frame isn't necessary because the calling function only returns the value that it gets from the called function.
  o The most common use of this technique is for optimizing tail-recursion:
      o the recursive function can be rewritten to use a constant amount of stack space (instead of linear).
Tail-Call Graphically
  (figure: the call stack before applying tail-call optimization vs. after applying it)
Pseudocode
Before:
  int foo(int n) {
      if (n == 0)
          return A();
      else {
          int x = B(n);
          return foo(x);
      }
  }

After:
  int foo(int n) {
      if (n == 0)
          return A();
      else {
          int x = B(n);
          goto start of foo() code, with x as the argument value
      }
  }
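Applied to quicksort, the second recursive call is a tail call and can be replaced by a loop. A minimal sketch under that idea (assumes the partition method from Section 2; names are illustrative):

  // Sketch: quicksort with the tail call replaced by a loop, so only one
  // recursive call remains per iteration.
  static void quicksort(int[] A, int left, int right) {
      while (left < right) {
          int pivot = partition(A, left, right);
          quicksort(A, left, pivot - 1);   // recurse on the left subarray
          left = pivot + 1;                // loop instead of the tail call
      }
  }

A common refinement (not shown above) is to recurse on the smaller subarray and loop on the larger one, which keeps the stack depth logarithmic.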
2. Partitioning Function
  PARTITION(A, p, q)              // partitions A[p..q]
      x ← A[p]                    // pivot = A[p]
      i ← p                       // index of the pivot boundary
      for j ← p + 1 to q
          if A[j] ≤ x then
              i ← i + 1           // move the i boundary
              exchange A[i] ↔ A[j]    // switch big and small
      exchange A[p] ↔ A[i]
      return i                    // return index of pivot

  Running time: O(n) for n elements.
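A direct Java translation of this partition function, as a sketch (0-based array, p and q inclusive bounds; the method name is illustrative):

  // Java translation of PARTITION (sketch): partitions A[p..q] around A[p]
  // and returns the pivot's final index.
  static int partition(int[] A, int p, int q) {
      int x = A[p];                 // pivot = A[p]
      int i = p;                    // boundary of the "≤ pivot" region
      for (int j = p + 1; j <= q; j++) {
          if (A[j] <= x) {
              i++;                  // grow the "≤ pivot" region
              int tmp = A[i]; A[i] = A[j]; A[j] = tmp;   // switch big and small
          }
      }
      int tmp = A[p]; A[p] = A[i]; A[i] = tmp;   // put the pivot in the middle
      return i;                     // index of the pivot
  }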
Example of Partitioning
  (walkthrough figures; the steps shown are:)
  o scan j right until we find something less than the pivot
  o swap 10 and 5
  o resume the scan right until we find something less than the pivot
  o swap 13 and 3
  o swap 10 and 2
  o j runs to the end
  o swap the pivot and 2, so the pivot ends up in the middle
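The arrays drawn in the original figures are not reproduced here; the following small Java demo uses a hypothetical input chosen to produce the same sequence of swaps as the steps listed above, so it can serve as a stand-in trace.

  // Demo harness (hypothetical sample data, not the original slide figures).
  public static void main(String[] args) {
      int[] A = {6, 10, 13, 5, 8, 3, 2, 11};
      int pivot = partition(A, 0, A.length - 1);
      System.out.println("pivot index: " + pivot);        // prints 3
      System.out.println(java.util.Arrays.toString(A));   // [2, 5, 3, 6, 8, 13, 10, 11]
  }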
3. Analysis of Quicksort
  o The analysis is quite tricky.
  o Assume all the input elements are distinct.
      o having no duplicate values makes this code faster!
      o there are better partitioning algorithms for when duplicate input elements exist (e.g. Hoare's original code, sketched below)
  o Let T(n) = worst-case running time on an array of n elements.
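For reference, here is a sketch of a Hoare-style partition in Java. This is an assumption about the general scheme, not the exact original code; unlike PARTITION above it does not place the pivot at its final index, it returns a split point j such that every element of A[left..j] is ≤ every element of A[j+1..right], so the recursion becomes quicksort(A, left, j) and quicksort(A, j+1, right).

  // Hoare-style partition (sketch). Pivot value is A[left]; two indices scan
  // towards each other and swap out-of-place pairs.
  static int hoarePartition(int[] A, int left, int right) {
      int x = A[left];
      int i = left - 1;
      int j = right + 1;
      while (true) {
          do { j--; } while (A[j] > x);   // scan left past elements > pivot
          do { i++; } while (A[i] < x);   // scan right past elements < pivot
          if (i < j) {
              int tmp = A[i]; A[i] = A[j]; A[j] = tmp;
          } else {
              return j;                   // split point, not the pivot's index
          }
      }
  }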
3.1. Worst-case of Quicksort
  o quicksort runs very slowly when its input array is already sorted (or is reverse sorted).
      o almost-sorted data is quite common in the real world
  o This is caused by the partition using the min (or max) element, which means that one side of the partition has no elements and the other side has n-1 elements.
  o Therefore:
      T(n) = T(0) + T(n-1) + Θ(n)
           = Θ(1) + T(n-1) + Θ(n)
           = T(n-1) + Θ(n)
           = Θ(n²)    (arithmetic series)
Worst-case Recursion Tree
  T(n) = T(0) + T(n-1) + cn
  (figures: the tree unfolds into a path -- the levels cost cn, c(n-1), c(n-2), ..., with a Θ(1) leaf for each T(0) node -- so the total cost is an arithmetic series, Θ(n²).)
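Summing the level costs of that path explicitly (a small worked step in LaTeX, not on the original slides):

  T(n) = \sum_{k=1}^{n} ck + n\cdot\Theta(1)
       = c\,\frac{n(n+1)}{2} + \Theta(n)
       = \Theta(n^2)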
Quicksort isn't Quick?
  o In the worst case, quicksort isn't any quicker than insertion sort.
  o So why bother with quicksort?
  o Its average-case running time is very good, as we'll see.
3.2. Best-case Analysis
  o If we're lucky, PARTITION splits the array evenly:
      T(n) = 2T(n/2) + Θ(n)
           = Θ(n log n)    (same as merge sort; Case 2 of the Master Method)
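Checking the Master Method case explicitly (a worked aside in LaTeX, not on the original slides):

  a = 2,\; b = 2,\; f(n) = \Theta(n), \qquad n^{\log_b a} = n^{\log_2 2} = n
  f(n) = \Theta(n^{\log_b a}) \;\Rightarrow\; T(n) = \Theta(n^{\log_b a} \log n) = \Theta(n \log n)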
3.3. Almost Best-case
  o What if the split is always 1/10 : 9/10?
      T(n) = T(n/10) + T(9n/10) + Θ(n)
Analysis of the "Almost-best" Case
  (recursion-tree figures: the root costs cn and splits into T(n/10) and T(9n/10); the next level splits into T(n/100), T(9n/100), T(9n/100), T(81n/100), and so on, with each level costing at most cn. The shortest root-to-leaf path repeatedly takes the 1/10 branch and the longest path repeatedly takes the 9/10 branch, so the total cost is at least cn × (short path height) and at most cn × (long path height) plus the cost of all the leaves.)
Short and Long Path Heights
  o Short path node values: n, (1/10)n, (1/10)²n, ..., 1. After sp steps:
      n(1/10)^sp = 1  →  n = 10^sp  →  sp = log_{10} n     (taking logs)
  o Long path node values: n, (9/10)n, (9/10)²n, ..., 1. After lp steps:
      n(9/10)^lp = 1  →  n = (10/9)^lp  →  lp = log_{10/9} n    (taking logs)
  o Both heights are Θ(log n), since logarithms to different bases differ only by a constant factor, so the total cost is Θ(n log n) even for a 1/10 : 9/10 split.
3.4. Good and Bad
  o Suppose we alternate good, bad, good, bad, good partitions ....
      G(n) = 2B(n/2) + Θ(n)     good
      B(n) = G(n – 1) + Θ(n)    bad
  o Solving:
      G(n) = 2( G(n/2 – 1) + Θ(n/2) ) + Θ(n)
           = 2G(n/2 – 1) + Θ(n)
           = Θ(n log n)    Good!
  o How can we make sure we choose good partitions?
Randomized Quicksort
  o IDEA: Partition around a random element (a sketch follows below).
  o Running time is then independent of the input order.
  o No assumptions need to be made about the input distribution.
  o No specific input leads to the worst-case behaviour.
  o The worst case is determined only by the output of a random-number generator.
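A minimal sketch of the random-pivot idea in Java (the method name and the use of java.util.Random are illustrative assumptions): swap a uniformly random element into position p, then run the ordinary PARTITION.

  // Randomized partition (sketch): choose a random index in [p, q], move that
  // element to the front, then partition as before.
  static final java.util.Random RNG = new java.util.Random();

  static int randomizedPartition(int[] A, int p, int q) {
      int r = p + RNG.nextInt(q - p + 1);        // uniform in [p, q]
      int tmp = A[p]; A[p] = A[r]; A[r] = tmp;   // random pivot now at A[p]
      return partition(A, p, q);                 // ordinary partition
  }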
4. Quicksort in Practice
  o Quicksort is a great general-purpose sorting algorithm.
      o especially with a randomized pivot
      o Quicksort can benefit substantially from code tuning
      o Quicksort can be over twice as fast as merge sort
  o Quicksort behaves well even with caching and virtual memory.
Timing Comparisons
  o Running time estimates:
      o Home PC executes 10⁸ compares/second.
      o Supercomputer executes compares/second.
  o Lesson 1. Good algorithms are better than supercomputers.
  o Lesson 2. Great algorithms are better than good ones.