Priority Queues and Heaps Fundamental Data Structures and Algorithms Klaus Sutner January 29, 2004
Plan Today: Demo: intelligent math handbook Priority Queues Binary heaps Implementation questions Reading: Chapter 20 in MAW. See also 6.7 and 6.8.
A Simulation
We want to simulate a chain of events $E_i$. Each event $E_i$ is associated with a time $t_i$. No problem: sort the times, and run through them in order: $t_1 < t_2 < t_3 < \dots < t_n$. But what if an event can cause the creation of some new events in the future?

A Simulation
We can still sort the original events by time, but not the additional ones: we don't yet know what/when they are. We need to be able to dynamically
- insert an event into a data structure, and
- extract the next event.
And, of course, the operations should be cheap.
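The two operations above can be sketched with the standard library's java.util.PriorityQueue. This is only an illustration: the SimEvent type, the labels, and the rule that handling event "a" schedules a new event at time 2.5 are all made up for the example.

```java
import java.util.PriorityQueue;

// Hypothetical event type for illustration: a time stamp plus a label.
class SimEvent implements Comparable<SimEvent> {
    final double time;
    final String label;

    SimEvent(double time, String label) {
        this.time = time;
        this.label = label;
    }

    @Override
    public int compareTo(SimEvent other) {
        return Double.compare(time, other.time);
    }
}

class SimulationSketch {
    // Processes events in time order; returns the labels in the order handled.
    static String run() {
        PriorityQueue<SimEvent> queue = new PriorityQueue<>();
        queue.add(new SimEvent(3.0, "c"));
        queue.add(new SimEvent(1.0, "a"));
        queue.add(new SimEvent(2.0, "b"));

        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty()) {
            SimEvent next = queue.poll();          // extract the next event
            order.append(next.label);
            // Made-up rule: handling "a" schedules a new event at time 2.5.
            if (next.label.equals("a")) {
                queue.add(new SimEvent(2.5, "x")); // insert a future event
            }
        }
        return order.toString();
    }
}
```

The event created during the run is handled between "b" and "c", even though it did not exist when the initial events were queued.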
An Example: Street Crossing
Suppose a robot wants to cross a street. No jaywalking: it can only move straight forward or backward. The maximum speed of the robot is c. Cars are coming from either side, but the robot has complete information about when and where they cross its line. Problem: find a way for the robot to get across without getting hit.

More Street Crossing
How do we model the given information? In space-time a car is just a rectangular box. Movement of the robot is constrained to a light-cone. [Figure: space-time diagram with axes time and space]
More Street Crossing
Hugging an obstacle (not that unrealistic: just inflate the obstacles to account for a safety buffer).

A Death Trap
Note how obstacles may be “fused” together.

Street Crossing
The times when obstacles appear/disappear are clearly events. But there are more: the times when the robot touches an obstacle, or when the light cones from both ends of a disappearing obstacle meet. Note that these events have to be computed as we go along; there is no way to compute them all at the outset.

Time Travel
Note how the first event happens before the first obstacle arrives! We need to go backwards in time, starting at the end of the last obstacle.

Towards An Algorithm
How does this translate into a data structure? What are the crucial operations? What would be a simple (not necessarily efficient) implementation? Where is room for improvement?
Priority Queue Interface
Crucial operations:
- insert
- find-min
- delete-min
Think about just inserting times (real numbers), but in reality there is associated information.

Some Applications
- Event simulations
- Shortest path computation
- Huffman coding
- Sorting (sort of)
Not a universal data structure (like hash tables) but irreplaceable in some cases.
PQ Interface Requires total order on the elements of the universe. Could be done with any traversable container: just run through and find the min. We would like much better performance. Binary search trees come to mind: min element is in the leftmost position.
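The "just run through and find the min" idea can be sketched as follows. This is a deliberately naive illustration, not an implementation from the lecture: the class name NaivePriorityQueue and its method names are assumptions chosen to mirror the interface above, and the backing store is an unsorted java.util.ArrayList.

```java
import java.util.ArrayList;
import java.util.List;

// Naive priority queue over an unsorted list (illustrative name).
// insert is amortized O(1); findMin and deleteMin scan the list, so O(N).
class NaivePriorityQueue<T extends Comparable<T>> {
    private final List<T> items = new ArrayList<>();

    void insert(T x) {
        items.add(x);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }

    // Returns the smallest element, or null if the queue is empty.
    T findMin() {
        T min = null;
        for (T x : items) {
            if (min == null || x.compareTo(min) < 0) {
                min = x;
            }
        }
        return min;
    }

    // Removes and returns the smallest element, or null if empty.
    T deleteMin() {
        T min = findMin();
        if (min != null) {
            items.remove(min);   // removes the first element equal to min
        }
        return min;
    }
}
```

The only requirement on the element type is compareTo, which is the total order mentioned above.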
Possible priority queue implementations

Linked list          sorted      unsorted
  deleteMin          O(1)        O(N)
  insert             O(N)        O(1)

Search trees
  all operations     O(log N)

Heaps                avg (assume random)                    worst
  deleteMin          O(log N)                               O(log N)
  insert             about 2.6 comparisons, i.e. O(1)       O(log N)
  buildHeap          O(N)                                   O(N)
  (buildHeap is a special case: it replaces N separate inserts)
A Digression: Perfect and Complete Binary Trees
Perfect binary trees
Perfect binary trees
How many nodes? In general: $N = \sum_{i=0}^{h} 2^i = 2^{h+1} - 1$. Most of the nodes are leaves. Example: $h = 3$, $N = 15$.
A Serious Proof
Define PBT by structural induction:
- nil is a PBT and H(nil) = 0
- (a,L,R) is a PBT whenever L and R are PBTs and H(L) = H(R). Moreover, H( (a,L,R) ) = H(L) + 1.
size(nil) = 0
size( (a,L,R) ) = size(L) + size(R) + 1.
Then $\mathrm{size}(T) = 2^{H(T)} - 1$.
Note that H counts levels rather than edges, so a tree drawn with h = 3 has H(T) = 4 and size 15.
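The induction itself is short. Written out under the definitions above (with $k$ introduced here as a name for the common value of $H(L)$ and $H(R)$):

```latex
\begin{align*}
\text{Base: } & \mathrm{size}(\mathrm{nil}) = 0 = 2^{0} - 1 = 2^{H(\mathrm{nil})} - 1. \\
\text{Step: } & \text{let } T = (a, L, R) \text{ with } H(L) = H(R) = k,
               \text{ so } H(T) = k + 1. \\
& \mathrm{size}(T) = \mathrm{size}(L) + \mathrm{size}(R) + 1
  = (2^{k} - 1) + (2^{k} - 1) + 1 = 2^{k+1} - 1 = 2^{H(T)} - 1.
\end{align*}
```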
Quiz Break
Red-green quiz In a perfect binary tree, what is the sum of the heights of the nodes? Give a mathematical characterization, the prettier the better. Give a tight upper bound (in big-O terms)
Perfect binary trees
What is the sum of the heights? $S = \sum_{i=0}^{h} 2^i (h - i) = 2^{h+1} - 1 - (h + 1) = N - (h + 1) < N$, so $S = O(N)$. Example: $h = 3$, $N = 15$.
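A quick numeric check of the bound, as a sketch: level i of a perfect tree has 2^i nodes, each of height h - i, with leaves at height 0. The class and method names are illustrative.

```java
class HeightSum {
    // Sum of node heights in a perfect binary tree whose root has height h
    // (leaves have height 0). Level i has 2^i nodes, each of height h - i.
    static long sumOfHeights(int h) {
        long sum = 0;
        for (int i = 0; i <= h; i++) {
            sum += (1L << i) * (h - i);
        }
        return sum;
    }

    // Number of nodes in the same tree: 2^(h+1) - 1.
    static long nodeCount(int h) {
        return (1L << (h + 1)) - 1;
    }
}
```

For h = 3 the levels contribute 3 + 4 + 4 + 0 = 11, which is below N = 15.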
Complete binary trees
Representing complete binary trees Linked structures? No! Instead, use arrays!
Representing complete binary trees
Arrays (1-based): the parent is at position i, its children at positions 2i and 2i+1.

    public class BinaryHeap {
        private Comparable[] heap;   // heap[0] is unused; elements live in heap[1..size]
        private int size;

        public BinaryHeap(int capacity) {
            size = 0;
            heap = new Comparable[capacity + 1];
        }
        ...
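Representing complete binary trees
Arrays: parent at position i, children at 2i and 2i+1.

Example: find the leftmost child (the last node on the leftmost path)

    int left = 1;
    for (; left <= size; left *= 2);   // stops at the first power of 2 beyond size
    return heap[left / 2];

Example: find the rightmost child (the last node on the rightmost path)

    int right = 1;
    for (; right <= size; right = right * 2 + 1);
    return heap[(right - 1) / 2];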
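The index arithmetic for the 1-based layout can be collected into a few helpers. The class HeapIndex and its method names are illustrative, not part of the BinaryHeap class from the slides.

```java
class HeapIndex {
    // Parent of position i; relies on integer division, so 2i and 2i+1
    // both map back to i.
    static int parent(int i) {
        return i / 2;
    }

    static int leftChild(int i) {
        return 2 * i;
    }

    static int rightChild(int i) {
        return 2 * i + 1;
    }

    // True if position i holds a node in a heap with `size` elements.
    static boolean exists(int i, int size) {
        return i >= 1 && i <= size;
    }
}
```

No pointers are stored: a node has a left child exactly when 2i <= size.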
Implementing Priority Queues with Binary Heaps
Binary heaps: the invariant Representation invariants 1. Structure property Complete binary tree (i.e. the elements of the heap are stored in positions 1…size of the array) 2. Heap order property Parent keys less than children keys
Heaps Representation invariant 1. Structure property Complete binary tree Hence: efficient compact representation 2. Heap order property Parent keys less than children keys Hence: rapid insert, findMin, and deleteMin O(log(N)) for insert and deleteMin O(1) for findMin
The heap order property Each parent is less than each of its children. Hence: Root is less than every other node. (obvious proof by induction)
Operating with heaps Representation invariant: All methods must: 1. Produce complete binary trees 2. Guarantee the heap order property All methods may assume 1. The tree is initially complete binary 2. The heap order property holds
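Both parts of the representation invariant can be checked mechanically, which is handy in tests. The sketch below is an assumption-laden illustration: the class name HeapInvariant and the method holds are made up, it works on a bare 1-based array rather than on the BinaryHeap class, and it allows equal keys (a child may equal its parent).

```java
class HeapInvariant {
    // Checks both parts of the representation invariant for a 1-based
    // array heap: cells 1..size are filled, and no child is smaller than
    // its parent.
    static <T extends Comparable<T>> boolean holds(T[] heap, int size) {
        if (size < 0 || size >= heap.length) {
            return false;
        }
        for (int i = 1; i <= size; i++) {
            if (heap[i] == null) {
                return false;                    // structure property violated
            }
            if (i > 1 && heap[i].compareTo(heap[i / 2]) < 0) {
                return false;                    // heap order property violated
            }
        }
        return true;
    }
}
```

Calling such a check at the end of every mutating method during testing is one way to confirm that each method delivers what the invariant demands.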
Constructor method All methods must: 1. Produce complete binary trees Trivially true 2. Guarantee the heap order property Also trivially true This is the base case
findMin ()
The code:

    public boolean isEmpty() {
        return size == 0;
    }

    public Comparable findMin() {
        if (isEmpty())
            return null;
        return heap[1];   // the root holds the minimum
    }

Does not change the tree, so it trivially preserves the invariant.
insert (Comparable x)
Process:
1. Create a “hole” at the next tree cell for x: heap[size+1]. This preserves the completeness of the tree, assuming it was complete to begin with.
2. Percolate the hole up the tree until the heap order property is satisfied. This assures the heap order property is satisfied, assuming it held at the outset.
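Percolation up

    public void insert(Comparable x) throws Overflow {
        if (isFull())
            throw new Overflow();
        int hole = ++size;                       // new cell at the end of the array
        // Move parents down while x is smaller than the parent of the hole.
        for (; hole > 1 && x.compareTo(heap[hole / 2]) < 0; hole /= 2)
            heap[hole] = heap[hole / 2];
        heap[hole] = x;
    }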
Percolation up
Bubble the hole up the tree until the heap order property (HOP) is satisfied. [Figure: the hole moves from position 11 to 5 to 2. HOP is false at 11 and at 5, and true at 2: done.]
Percolation up
Note that hole/2 and hole /= 2 in insert rely on integer division.
deleteMin()

    /**
     * Remove the smallest item from the priority queue.
     * @return the smallest item, or null, if empty.
     */
    public Comparable deleteMin() {
        if (isEmpty())
            return null;
        Comparable min = heap[1];     // grab min element
        heap[1] = heap[size--];       // temporarily place last element at top
        percolateDown(1);
        return min;
    }
Percolation down Bubble the transplanted leaf value down the tree until the heap order property is satisfied
percolateDown(int hole)

    private void percolateDown(int hole) {       // hole is 1 when called from deleteMin
        int child = hole;
        Comparable tmp = heap[hole];
        for (; hole * 2 <= size; hole = child) {
            child = hole * 2;                    // start at left child
            // Is there a right child? If so, select the smaller child.
            if (child != size && heap[child + 1].compareTo(heap[child]) < 0)
                child++;
            if (heap[child].compareTo(tmp) < 0)
                heap[hole] = heap[child];        // bubble up if smaller than tmp
            else
                break;                           // exit loop if not bubbling
        }
        heap[hole] = tmp;                        // finally, place the orphan
    }
deleteMin ()
Observe that both components of the representation invariant are preserved by deleteMin.
1. Completeness: the last cell (heap[size]) is vacated, providing the value to percolate down. This assures that the tree remains complete.
2. Heap order property: the percolation algorithm assures that the orphaned value is relocated to a suitable position.
buildHeap()
Start with a complete (unordered) tree. Starting from the bottom, repeatedly call percolateDown():

    void buildHeap() {
        for (int i = size / 2; i > 0; i--)
            percolateDown(i);
    }
buildHeap() performance At each iteration, have to do work proportional to the height of the current node Therefore, total running time is bounded by the sum of the heights of all of the nodes
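To try buildHeap in isolation, here is a self-contained sketch on a 1-based int array. It follows the same percolate-down logic as the slides but is not the BinaryHeap class itself: the class name BuildHeapSketch and the explicit heap and size parameters are assumptions made so the example runs on its own.

```java
class BuildHeapSketch {
    // Sifts the value at position hole down until neither child is smaller.
    // heap is 1-based: cells 1..size are in use, heap[0] is ignored.
    static void percolateDown(int[] heap, int size, int hole) {
        int tmp = heap[hole];
        while (hole * 2 <= size) {
            int child = hole * 2;                        // left child
            if (child != size && heap[child + 1] < heap[child]) {
                child++;                                 // right child is smaller
            }
            if (heap[child] < tmp) {
                heap[hole] = heap[child];                // move child up
                hole = child;
            } else {
                break;
            }
        }
        heap[hole] = tmp;
    }

    // Turns cells 1..size into a heap, working from the last parent upward.
    static void buildHeap(int[] heap, int size) {
        for (int i = size / 2; i > 0; i--) {
            percolateDown(heap, size, i);
        }
    }
}
```

Each call does work proportional to the height of node i, which is where the sum-of-heights bound comes from.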
Another Digression: Invariants
Representation Invariants
Necessary for a clear understanding of an algorithm or piece of code. The most useful kind of comment you can make in a piece of code!
(Invariants) Plus ça change, plus c’est la même chose. The role of an induction hypothesis
(Invariants and induction)
Induction hypothesis
What you are allowed to assume:
- At the start of a loop iteration
- About the result values of a recursive call
- About the object state when a method is called
What you must deliver:
- At the end of the loop iteration
- Of the result values when the recursive call returns
- About the object state when the method returns
(Invariants) Plus ça change, plus c’est la même chose The role of an induction hypothesis Invariants in programs Loop invariants Recursion invariants Representation invariants
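As a small illustration of a loop invariant written as a comment, consider finding the position of a minimum by linear scan. The example is not from the lecture; the class and method names are made up.

```java
class LoopInvariantSketch {
    // Returns the index of a smallest element of a non-empty array.
    static int indexOfMin(int[] a) {
        int best = 0;
        // Invariant at the start of each iteration:
        //   a[best] <= a[k] for all k in 0..i-1.
        for (int i = 1; i < a.length; i++) {
            if (a[i] < a[best]) {
                best = i;
            }
            // Invariant restored: a[best] <= a[k] for all k in 0..i.
        }
        // On exit i == a.length, so a[best] is a minimum of the whole array.
        return best;
    }
}
```

The comment at the top of the loop is what may be assumed; the comment at the bottom is what the iteration must deliver.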
(Representation invariants)
What must always be true of the data structure when an operation completes.
Theorem: Suppose each constructor assures the representation invariant is initially correct. Suppose each method preserves the representation invariant, assuming it is true initially. Then the representation invariant will always be true at the completion of each method.
Assuming the code is not concurrent!
Heapsort Obviously we can use a priority queue to sort... just insert all the keys then do repeated deleteMins However we can take advantage of the fact that the heap always uses the first part of the array to do this with no extra space. This is called heapsort.
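The first approach (insert everything, then repeated deleteMins) can be sketched with the standard java.util.PriorityQueue standing in for the binary heap. The class name PriorityQueueSort is illustrative, and this version uses extra space for the queue, which is exactly what heapsort avoids.

```java
import java.util.PriorityQueue;

class PriorityQueueSort {
    // Sorts by inserting every key, then repeatedly removing the minimum.
    // Uses O(N) extra space for the queue, unlike in-place heapsort.
    static int[] sorted(int[] keys) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int k : keys) {
            queue.add(k);                // insert
        }
        int[] out = new int[keys.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = queue.poll();       // deleteMin
        }
        return out;
    }
}
```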
Heapsort
- Reverse the sense of the heap order (largest at the root).
- Start from position 0 in the array (children are 2i+1 and 2i+2).
- Call the percolate-down routine percDown(a, i, len).

    public static void heapsort(Comparable[] a) {
        for (int i = a.length / 2; i >= 0; i--)   // build the max-heap
            percDown(a, i, a.length);
        for (int j = a.length - 1; j > 0; j--) {
            swapReferences(a, 0, j);              // move the current max behind the heap
            percDown(a, 0, j);                    // restore heap order on a[0..j-1]
        }
    }
Heapsort invariant At the start of the loop over j: a[j]…a[a.length-1] are sorted a[0]…a[j-1] are a heap
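The slides use percDown without showing it. A minimal self-contained sketch of the whole sort follows, under stated assumptions: percDown is written here from the description (0-based max-heap, children at 2i+1 and 2i+2), the swap is inlined instead of calling swapReferences, and the build loop starts at the last parent, a.length/2 - 1, so that an empty array is handled without touching a[0].

```java
class HeapsortSketch {
    // Sifts a[i] down within a[0..len-1], treated as a 0-based max-heap.
    static <T extends Comparable<T>> void percDown(T[] a, int i, int len) {
        T tmp = a[i];
        while (2 * i + 1 < len) {
            int child = 2 * i + 1;                       // left child
            if (child + 1 < len && a[child + 1].compareTo(a[child]) > 0) {
                child++;                                 // right child is larger
            }
            if (a[child].compareTo(tmp) > 0) {
                a[i] = a[child];                         // move larger child up
                i = child;
            } else {
                break;
            }
        }
        a[i] = tmp;
    }

    static <T extends Comparable<T>> void heapsort(T[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {    // build the max-heap
            percDown(a, i, a.length);
        }
        for (int j = a.length - 1; j > 0; j--) {
            T max = a[0];                                // swap root with a[j]
            a[0] = a[j];
            a[j] = max;
            percDown(a, 0, j);                           // restore heap on a[0..j-1]
        }
    }
}
```

No second array is needed: the sorted part grows from the right end while the heap shrinks on the left.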
Homework 3 On Blackboard. You’re all working on it already. Right?