David Luebke 1 5/20/2015 CS 332: Algorithms Quicksort

Administrivia
- Head count: how many people can’t make it to the scheduled office hours?
- How many people would be interested in organizing tutors for this class?
- Reminder: homework due tonight at midnight (drop box)

Review: Heaps
- A heap is a “complete” binary tree, usually represented as an array A

Review: Heaps
To represent a heap as an array (1-based, root at A[1]):
    Parent(i) { return floor(i/2); }
    Left(i)   { return 2*i; }
    Right(i)  { return 2*i + 1; }
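
The index arithmetic above can be written as a minimal Python sketch. It assumes the 1-based layout used on these slides (root at index 1); the lowercase function names are illustrative.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (1-based); integer division gives floor(i/2)."""
    return i // 2


def left(i: int) -> int:
    """Index of the left child of node i (1-based)."""
    return 2 * i


def right(i: int) -> int:
    """Index of the right child of node i (1-based)."""
    return 2 * i + 1
```

With a 0-based array the formulas would shift to (i - 1) // 2, 2*i + 1, and 2*i + 2.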

Review: The Heap Property
- Heaps also satisfy the heap property: A[Parent(i)] ≥ A[i] for all nodes i > 1
  - In other words, the value of a node is at most the value of its parent
  - The largest value is thus stored at the root (A[1])
- Because the heap is a complete binary tree, the height of any node is O(lg n)

Review: Heapify()
- Heapify(): maintain the heap property
  - Given: a node i in the heap with children l and r
  - Given: two subtrees rooted at l and r, assumed to be heaps
  - Action: let the value of the parent node “float down” so the subtree at i satisfies the heap property
    - If A[i] < A[l] or A[i] < A[r], swap A[i] with the larger of A[l] and A[r]
    - Recurse on that subtree
  - Running time: O(h), h = height of heap = O(lg n)
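
The Heapify() steps above might look like this in Python. This is a sketch, not the course's reference code: it assumes a 1-based list whose slot 0 is an unused placeholder, and it passes `heap_size` as an explicit argument instead of storing it on the array.

```python
def heapify(A, i, heap_size):
    """Float A[i] down until the subtree rooted at i is a max-heap.

    Assumes A is 1-based (A[0] is an unused placeholder) and that the
    subtrees rooted at the children of i are already max-heaps.
    """
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[largest]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        # Swap with the larger child, then recurse on that subtree.
        A[i], A[largest] = A[largest], A[i]
        heapify(A, largest, heap_size)


# Example: only the root violates the heap property.
A = [None, 4, 16, 10, 8, 7, 9, 3]
heapify(A, 1, 7)
```

Each call does constant work and descends one level, which is where the O(h) bound comes from.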

Review: BuildHeap()
- BuildHeap(): build the heap bottom-up by running Heapify() on successive subarrays
  - Walk backwards through the array from floor(n/2) to 1, calling Heapify() on each node
  - Order of processing guarantees that the children of node i are heaps when i is processed
- Easy to show that running time is O(n lg n)
- Can be shown to be O(n)
  - Key observation: most subheaps are small
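
The O(n) bound can be sketched as follows. It uses the standard fact that an n-element heap has at most ⌈n/2^(h+1)⌉ nodes of height h, together with the O(h) cost of Heapify() on a node of height h.

```latex
\begin{align*}
T(n) &= \sum_{h=0}^{\lfloor \lg n \rfloor}
        \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      = O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right) \\
     &= O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
      = O(2n) = O(n),
\end{align*}
```

since the infinite sum of h/2^h equals 2. This makes the key observation precise: about half the nodes are leaves with no work to do, a quarter have height 1, and so on.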

Review: Heapsort()
- Heapsort(): an in-place sorting algorithm:
  - Maximum element is at A[1]
  - Discard by swapping with element at A[n]
    - Decrement heap_size[A]
    - A[n] now contains correct value
  - Restore heap property at A[1] by calling Heapify()
  - Repeat, always swapping A[1] for A[heap_size[A]]
- Running time: O(n lg n)
  - BuildHeap: O(n), Heapify: n * O(lg n)
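
Putting BuildHeap() and the swap-and-shrink loop together gives the following sketch. To keep the slides' 1-based indexing it copies the input into a list with a placeholder at index 0, so unlike the algorithm described above this wrapper is not strictly in place. The helper `sift_down` is an iterative variant of Heapify(), chosen here only to avoid recursion.

```python
def sift_down(A, i, heap_size):
    """Iterative Heapify(): float A[i] down within A[1..heap_size]."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= heap_size and A[l] > A[largest]:
            largest = l
        if r <= heap_size and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest


def build_heap(A, n):
    """Walk backwards from floor(n/2) to 1, heapifying each node."""
    for i in range(n // 2, 0, -1):
        sift_down(A, i, n)


def heapsort(items):
    """Return a sorted copy of items (ascending)."""
    A = [None] + list(items)  # slot 0 unused so the root sits at A[1]
    n = len(items)
    build_heap(A, n)
    for heap_size in range(n, 1, -1):
        # The maximum is at A[1]; move it to its final position.
        A[1], A[heap_size] = A[heap_size], A[1]
        sift_down(A, 1, heap_size - 1)
    return A[1:]
```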

Tying It Into The “Real World”
- And now, a real-world example… combat billiards
  - Sort of like pool...
  - Except you’re trying to kill the other players…
  - And the table is the size of a polo field…
  - And the balls are the size of Suburbans...
  - And instead of a cue you drive a vehicle with a ram on it
- Problem: how do you simulate the physics?
Figure 1: boring traditional pool

Combat Billiards: Simulating The Physics
- Simplifying assumptions:
  - G-rated version: no players
    - Just n balls bouncing around
  - No spin, no friction
    - Easy to calculate the positions of the balls at time T_n from time T_{n-1} if there are no collisions in between
  - Simple elastic collisions

Simulating The Physics
- Assume we know how to compute when two moving spheres will intersect
  - Given the state of the system, we can calculate when the next collision will occur for each ball
  - At each collision C_i:
    - Advance the system to the time T_i of the collision
    - Recompute the next collision for the ball(s) involved
    - Find the next overall collision C_{i+1} and repeat
  - How should we keep track of all these collisions and when they occur?

Review: Priority Queues
- The heap data structure is often used for implementing priority queues
  - A data structure for maintaining a set S of elements, each with an associated value or key
  - Supports the operations Insert(), Maximum(), and ExtractMax()
  - Commonly used for scheduling, event simulation

Priority Queue Operations
- Insert(S, x) inserts the element x into set S
- Maximum(S) returns the element of S with the maximum key
- ExtractMax(S) removes and returns the element of S with the maximum key

Implementing Priority Queues
    HeapInsert(A, key)    // what’s running time?
    {
        heap_size[A]++;
        i = heap_size[A];
        while (i > 1 AND A[Parent(i)] < key)
        {
            A[i] = A[Parent(i)];
            i = Parent(i);
        }
        A[i] = key;
    }

Implementing Priority Queues
    HeapMaximum(A)
    {
        // This one is really tricky:
        return A[1];
    }

Implementing Priority Queues
    HeapExtractMax(A)
    {
        if (heap_size[A] < 1) { error; }
        max = A[1];
        A[1] = A[heap_size[A]];
        heap_size[A]--;
        Heapify(A, 1);
        return max;
    }
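
The three operations can be collected into one small Python class. This is a sketch under stated assumptions: the class name `MaxPriorityQueue` and its method names are illustrative, the list is 1-based with an unused slot 0, and the list length stands in for heap_size[A].

```python
class MaxPriorityQueue:
    """Max-priority queue on a 1-based binary heap (slot 0 unused)."""

    def __init__(self):
        self.A = [None]

    def __len__(self):
        return len(self.A) - 1

    def insert(self, key):
        """HeapInsert: open a hole at the end and slide parents down into it."""
        self.A.append(key)
        i = len(self)
        while i > 1 and self.A[i // 2] < key:
            self.A[i] = self.A[i // 2]
            i //= 2
        self.A[i] = key

    def maximum(self):
        """HeapMaximum: the largest key is always at the root."""
        if len(self) < 1:
            raise IndexError("heap is empty")
        return self.A[1]

    def extract_max(self):
        """HeapExtractMax: move the last leaf to the root and float it down."""
        if len(self) < 1:
            raise IndexError("heap underflow")
        top = self.A[1]
        last = self.A.pop()
        if len(self) >= 1:
            self.A[1] = last
            self._heapify(1)
        return top

    def _heapify(self, i):
        n = len(self)
        while True:
            l, r, largest = 2 * i, 2 * i + 1, i
            if l <= n and self.A[l] > self.A[largest]:
                largest = l
            if r <= n and self.A[r] > self.A[largest]:
                largest = r
            if largest == i:
                return
            self.A[i], self.A[largest] = self.A[largest], self.A[i]
            i = largest
```

Both insert and extract_max follow a single root-to-leaf path, so each takes O(lg n) time; maximum is O(1).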

Back To Combat Billiards
- Extract the next collision C_i from the queue
- Advance the system to the time T_i of the collision
- Recompute the next collision(s) for the ball(s) involved
- Insert collision(s) into the queue, using the time of occurrence as the key
- Find the next overall collision C_{i+1} and repeat
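
The loop above can be sketched with Python's standard `heapq` module, which provides a min-heap, so the earliest event comes out first. The physics is deliberately stubbed out: `simulate` and its `on_event` callback are hypothetical names, and the callback stands in for "advance the system and recompute the next collision(s)".

```python
import heapq


def simulate(initial_events, on_event, t_end):
    """Process (time, label) events in time order up to and including t_end.

    on_event(time, label) is a caller-supplied stand-in for the physics:
    it returns the new (time, label) events caused by this one.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= t_end:
        t, label = heapq.heappop(queue)      # extract the next event
        log.append((t, label))               # "advance the system" to time t
        for event in on_event(t, label):     # recompute follow-up events
            heapq.heappush(queue, event)     # insert, keyed on time
    return log


# Toy stand-in for collisions: ball "a" hits a wall every 2.0 time units,
# ball "b" every 3.0.
periods = {"a": 2.0, "b": 3.0}
log = simulate([(2.0, "a"), (3.0, "b")],
               lambda t, who: [(t + periods[who], who)],
               6.0)
```

Because the queue is keyed on time, "find the next overall collision" costs O(lg n) per event instead of a scan over all balls.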

Using A Priority Queue For Event Simulation
- More natural to use Minimum() and ExtractMin()
- What if a player hits a ball?
  - Need to code up a Delete() operation
  - How? What will the running time be?
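
One possible answer to the Delete() question, offered as a sketch and not as the course's solution: overwrite the deleted slot with the last leaf, shrink the heap, then restore the heap property by floating the moved key up or down as needed. The example uses a max-heap to match the earlier slides (a min-heap version flips the comparisons), and `heap_delete` is an illustrative name.

```python
def heap_delete(A, i):
    """Remove and return A[i] from a 1-based max-heap A (A[0] unused)."""
    removed = A[i]
    last = A.pop()
    n = len(A) - 1              # heap size after the removal
    if i <= n:                  # nothing to fix if we deleted the last leaf
        A[i] = last
        # The moved key may be larger than its parent: float it up.
        while i > 1 and A[i // 2] < A[i]:
            A[i], A[i // 2] = A[i // 2], A[i]
            i //= 2
        # Or smaller than a child: float it down.
        while True:
            l, r, largest = 2 * i, 2 * i + 1, i
            if l <= n and A[l] > A[largest]:
                largest = l
            if r <= n and A[r] > A[largest]:
                largest = r
            if largest == i:
                break
            A[i], A[largest] = A[largest], A[i]
            i = largest
    return removed


heap = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
removed = heap_delete(heap, 2)
```

Only one of the two loops moves the key, and each follows a single path in the tree, so the running time is O(lg n), given the index of the element to delete.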

Quicksort
- Sorts in place
- Runs in O(n lg n) time in the average case
- Runs in O(n^2) time in the worst case
- So why would people use it instead of merge sort?

Quicksort
- Another divide-and-conquer algorithm
  - The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]
    - Invariant: all elements in A[p..q] are less than or equal to all elements in A[q+1..r]
  - The subarrays are recursively sorted by calls to quicksort
  - Unlike merge sort, no combining step: the two sorted subarrays already form a sorted array

Quicksort Code
    Quicksort(A, p, r)
    {
        if (p < r)
        {
            q = Partition(A, p, r);
            Quicksort(A, p, q);
            Quicksort(A, q+1, r);
        }
    }

Partition
- Clearly, all the action takes place in the Partition() function
  - Rearranges the subarray in place
  - End result:
    - Two subarrays
    - All values in first subarray ≤ all values in second
  - Returns the index of the “pivot” element separating the two subarrays
- How do you suppose we implement this function?

Partition In Words
- Partition(A, p, r):
  - Select an element to act as the “pivot” (which?)
  - Grow two regions, A[p..i] and A[j..r]
    - All elements in A[p..i] <= pivot
    - All elements in A[j..r] >= pivot
  - Increment i until A[i] >= pivot
  - Decrement j until A[j] <= pivot
  - Swap A[i] and A[j]
  - Repeat until i >= j
  - Return j

Partition Code
    Partition(A, p, r)
        x = A[p];
        i = p - 1;
        j = r + 1;
        while (TRUE)
            repeat
                j--;
            until A[j] <= x;
            repeat
                i++;
            until A[i] >= x;
            if (i < j)
                Swap(A, i, j);
            else
                return j;
Illustrate on A = {5, 3, 2, 6, 4, 1, 3, 7}
What is the running time of Partition()?
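
A direct Python transcription of the pseudocode, run on the slide's example array, may help with the illustration. It is a sketch that assumes 0-based Python lists with inclusive bounds p and r; each repeat/until loop becomes "step once, then keep stepping while the condition fails".

```python
def partition(A, p, r):
    """Partition A[p..r] around x = A[p]; return the split index j."""
    x = A[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while A[j] > x:          # repeat j-- until A[j] <= x
            j -= 1
        i += 1
        while A[i] < x:          # repeat i++ until A[i] >= x
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j


def quicksort(A, p, r):
    """Sort A[p..r] in place."""
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q)
        quicksort(A, q + 1, r)


A = [5, 3, 2, 6, 4, 1, 3, 7]
q = partition(A, 0, len(A) - 1)
# q == 4 and A == [3, 3, 2, 1, 4, 6, 5, 7]:
# everything in A[0..4] is <= everything in A[5..7].
quicksort(A, 0, len(A) - 1)
```

The indices i and j only move toward each other and stop once they cross, so Partition() makes a single pass over the subarray and runs in linear time.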

Analyzing Quicksort
- What will be the worst case for the algorithm?
  - Partition is always unbalanced
    - One subarray is size n - 1, the other is size 1
    - This happens when q = p
- What will be the best case for the algorithm?
  - Partition is perfectly balanced
    - Both subarrays of size n/2
- Which is more likely?
  - The latter, by far, except...
- Will any particular input elicit the worst case?
  - Yes: already-sorted input

Analyzing Quicksort
- In the worst case:
    T(1) = Θ(1)
    T(n) = T(n - 1) + Θ(n)
- What does this work out to?
    T(n) = Θ(n^2)
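
One way to see the worst-case result is to unroll the recurrence. In the sketch below the Θ(n) partitioning cost is written as cn for some constant c > 0, which is an assumption made only to keep the algebra readable.

```latex
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(1) + c \sum_{k=2}^{n} k \\
     &= \Theta(1) + c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2).
\end{align*}
```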

Analyzing Quicksort
- In the best case:
    T(n) = 2T(n/2) + Θ(n)
- What does this work out to?
    T(n) = Θ(n lg n) (by the Master Theorem)
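
For reference, the Master Theorem applies to the best-case recurrence as follows, with a the number of subproblems, b the factor by which the input shrinks, and f(n) the partitioning cost.

```latex
\begin{align*}
a &= 2, \quad b = 2, \quad f(n) = \Theta(n), \\
n^{\log_b a} &= n^{\log_2 2} = n, \\
f(n) &= \Theta\!\left(n^{\log_b a}\right)
  \;\Longrightarrow\;
  T(n) = \Theta\!\left(n^{\log_b a} \lg n\right) = \Theta(n \lg n).
\end{align*}
```

This is the same recurrence, and the same case of the theorem, as merge sort.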

Improving Quicksort
- The real liability of quicksort is that it runs in O(n^2) time on already-sorted input
  - What can we do about this?
- Book discusses two solutions:
  - Randomize the input array
  - Pick a random pivot element
- How will these solve the problem?
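
The second solution, picking a random pivot, could be sketched as below. The approach assumed here is to swap a uniformly random element of A[p..r] into position p and then run the same partition scheme as on the earlier slide; the function names and the 0-based inclusive bounds are illustrative.

```python
import random


def randomized_partition(A, p, r):
    """Partition A[p..r] around a randomly chosen pivot; return the split index."""
    k = random.randint(p, r)         # inclusive on both ends
    A[p], A[k] = A[k], A[p]          # move the random pivot to the front
    x = A[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while A[j] > x:
            j -= 1
        i += 1
        while A[i] < x:
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j


def randomized_quicksort(A, p=0, r=None):
    """Sort A[p..r] in place; defaults to the whole list."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q)
        randomized_quicksort(A, q + 1, r)


data = list(range(2000))             # already sorted: the old worst case
randomized_quicksort(data)
```

With a random pivot, no fixed input (sorted or otherwise) reliably triggers the unbalanced split; the bad case now depends on the random choices and not on the data.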

Coming Up
- Up next: Analyzing randomized quicksort