Heaps Binomial Heaps Lazy Binomial Heaps 1
Heaps / Priority queues Binary Heaps Binomial Lazy Binomial Heaps Fibonacci Insert O(logn) O(1) Find-min Delete-min Decrease-key Meld – Worst case Amortized Delete can be implemented using Decrease-key + Delete-min Decrease-key in O(1) time important for Dijkstra and Prim
Binomial Heaps
Binomial Trees B0 B1 B2 B3 B4
Binomial Trees B0 B1 B2 B3 B4
Binomial Trees B0 B1 B2 B3 Bk Bk … Bk−1 Bk−2 Bk−1 Bk−1
Binomial Trees Bk Bk B0 … Bk−1 Bk−1 B k−2 Bk−1
Min-heap Ordered Binomial Trees 2 15 5 17 25 40 20 35 18 11 38 45 58 31 45 67 key of child key of parent
Tournaments Binomial Trees D D D F F A C A D F G G E B A B C D E F G H H The children of x are the items that lost matches with x, in the order in which the matches took place.
Binomial Heap A list of binomial trees, at most one of each rank Pointer to root with minimal key 23 9 15 33 40 20 35 Each number n can be written in a unique way as a sum of powers of 2 11 = (1011)2 = 8+2+1 45 58 31 At most 67 log2(n+1)trees
Ordered forest Binary tree 23 9 15 33 40 20 35 45 58 31 x info key child next 67 child – leftmost child next – next “sibling” 2 pointers per node 11
Forest Binary tree 23 15 40 20 35 23 9 9 33 33 15 45 58 31 40 67 45 20 Heap order “half ordered” 67 58 31 35
Binomial heap representation Q n 2 pointers per node No explicit rank information How do we determine ranks? first min x 23 9 15 33 40 20 35 info key child next 45 58 31 “Structure V” 67
Alternative representation Q Reverse sibling pointers Make lists circular Avoids reversals during meld n first min 23 9 15 33 40 20 35 info key child next 45 58 31 “Structure R” [Brown (1978)] 67
Linking binomial trees a ≤ b x Bk a y b Bk−1 O(1) time
Linking binomial trees Linking in first representation Linking in second
Melding binomial heaps Link trees of same degree Q1: Q2:
Melding binomial heaps Link trees of same degree Q1: B0 Q2: B3 B4 Like adding binary numbers Maintain a pointer to the minimum O(logn) time
Insert 11 23 15 9 33 40 20 35 45 58 31 67 New item is a one tree binomial heap Meld it to the original heap O(logn) time
When we delete the minimum, we get a binomial heap Delete-min 23 15 9 33 40 20 35 When we delete the minimum, we get a binomial heap 45 58 31 67
Delete-min 23 15 33 40 20 35 When we delete the minimum, we get a binomial heap Meld it to the original heap O(logn) time 45 58 31 67 (Need to reverse list of roots in first representation)
Decrease-key using “sift-up” 2 17 25 18 11 38 I 40 20 35 45 58 31 67 15 5 45 Decrease-key(Q,I,7)
Decrease-key using “sift-up” 2 I 15 5 17 25 40 20 35 18 11 38 45 58 7 45 Need parent pointers (not needed before) 67
Decrease-key using “sift-up” 2 I 15 5 17 25 40 7 35 18 11 38 45 58 20 45 67 Need to update the node pointed by I
Decrease-key using “sift-up” 2 I 15 5 17 25 40 7 35 18 11 38 45 58 20 45 67 Need to update the node pointed by I
Decrease-key using “sift-up” 2 17 25 18 11 38 I 40 15 35 45 58 20 67 7 5 45 How can we do it?
Adding parent pointers Q Adding parent pointers n first min 23 9 15 33 40 20 35 45 58 31 67
Adding a level of indirection Q Adding a level of indirection n “Endogenous vs. exogenous” first min 23 9 15 x I 33 40 20 35 item child next parent 45 58 31 Nodes and items are distinct entities info key node 67
Heaps / Priority queues Binary Heaps Binomial Lazy Binomial Heaps Fibonacci Insert O(logn) O(1) Find-min Delete-min Decrease-key Meld – Worst case Amortized
Lazy Binomial Heaps 30
Binomial Heaps Lazy Binomial Heaps A list of binomial trees, at most one of each rank, sorted by rank (at most O(logn) trees) Pointer to root with minimal key Lazy Binomial Heaps An arbitrary list of binomial trees (possibly n trees of size 1)
Lazy Binomial Heaps An arbitrary list of binomial trees Pointer to root with minimal key 10 29 9 33 59 87 15 19 40 20 35 20 45 58 31 67
Lazy Meld Concatenate the two lists of trees Update the pointer to root with minimal key O(1) worst case time Lazy Insert Add the new item to the list of roots Update the pointer to root with minimal key O(1) worst case time
Lazy Delete-min ? Remove the minimum root and meld ? 10 29 15 9 59 19 33 40 20 35 20 45 58 31 67 May need (n) time to find the new minimum
Consolidating / Successive Linking 10 29 15 59 40 20 35 19 33 45 58 31 20 67 … 1 2 3
Consolidating / Successive Linking 29 15 59 40 20 35 19 33 45 58 31 20 67 … 10 1 2 3
Consolidating / Successive Linking 15 59 40 20 35 19 33 45 58 31 67 … 10 29 1 2 3
Consolidating / Successive Linking 59 40 20 35 19 45 58 31 67 10 … 15 29 33 1 2 3
Consolidating / Successive Linking 40 20 35 19 45 58 31 67 10 … 15 29 59 33 1 2 3
Consolidating / Successive Linking 20 35 19 31 20 10 40 15 29 … 45 58 33 59 67 1 2 3
Consolidating / Successive Linking 35 19 20 10 40 15 29 … 20 45 58 33 59 67 31 1 2 3
Consolidating / Successive Linking 19 20 35 10 59 40 15 29 … 20 45 58 33 67 31 1 2 3
Consolidating / Successive Linking 19 20 10 20 40 15 29 … 35 31 45 58 33 67 59 1 2 3
Consolidating / Successive Linking At the end of the process, we obtain a non-lazy binomial heap containing at most log(n+1) trees, at most one of each rank 10 20 40 15 29 … 35 31 19 45 58 33 67 20 59 1 2 3
Outline Lazy Union Extract-Min Analysis
Amortized Thinking Union violates binomial heaps structural property Let the Extract-Min convert the lazy version back to a clean binomial heap Extract-Min can cost ϴ( n ) in worst-case However, it is O( log n ) amortized time ! Union : increase potential Extract-Min : decrease potential
Extract-Min : Amortized Analysis Let Ti be the number of trees after the ith operation Let r be the rank of the tree containing the min. Extract-Min actual cost ϴ( # trees ) = ϴ( r + Ti-1 ) = O( log n + Ti-1) r = O( log n ) since all the trees are binomial trees
Extract-Min : Amortized Analysis Let the potential function Фi = Ti ai = ci + Фi - Фi-1 = O( log n + Ti-1) + (Ti - Ti-1) = O( log n + Ti-1) + O( log n ) - Ti-1 = O( log n ) + O( Ti-1) - Ti-1 = O( log n )
Potential Energy H Ф
Potential Energy H Ф Lazy Binomial Heap - Insert
Potential Energy H Ф Lazy Binomial Heap – Extract-Min
Potential Energy H Ф Lazy Binomial Heap – Consolidate
Potential Energy H Ф Lazy Binomial Heap – Consolidate
Potential Energy H Ф Lazy Binomial Heap – Consolidate
Accounting Method Invarient : one baht credited at each root node for future consolidating task. H
Accounting Method Insert : amortized cost = 2 H one for list insertion + one stored at the root
Accounting Method Extract-Min : amortized cost = O( log n ) H
Accounting Method Extract-Min : amortized cost = O( log n ) H
Accounting Method Extract-Min : amortized cost = O( log n ) O( log n ) children consolidate
Total cost = O( (T0+L) + L + log2n ) Cost of Consolidating T0 – Number of trees before T1 – Number of trees after L – Number of links T1 = T0−L (Each link reduces the number of tree by 1) Total number of trees processed – T0+L (Each link creates a new tree) Putting trees into buckets or finding trees to link with Handling the buckets Linking Total cost = O( (T0+L) + L + log2n ) = O( T0+ log2n ) As L ≤ T0
Amortized Cost of Consolidating (Scaled) actual cost = T0 + log2n Change in potential = = T1−T0 Amortized cost = (T0 + log2n ) + (T1−T0) = T1 + log2n ≤ 2 log2n As T1 ≤ log2n Another view: A link decreases the potential by 1. This can pay for handling all the trees involved in the link. The only “unaccounted” trees are those that were not the input nor the output of a link operation.
Consolidating / Successive Linking At the end of the process, we obtain a non-lazy binomial heap containing at most logn trees, at most one of each degree Worst case cost – O(n) Amortized cost – O(logn) Potential = Number of Trees
Lazy Binomial Heaps Actual cost Change in potential Amortized cost Insert O(1) 1 Find-min Delete-min O(k+T0+logn) k1+T1−T0 O(logn) Decrease-key Meld Rank of deleted root
Heaps / Priority queues Binary Heaps Binomial Lazy Binomia l Heaps Fibonacci Insert O(logn) O(1) Find-min Delete-min Decrease-key Meld – Worst case Amortized
Heaps / Priority queues Lazy binomial heap - amortized cost Union : increase potential : ϴ (1) Insert : increase potential : ϴ (1) Extract-Min : decrease potential : O( log n ) Decrease-Key, Delete : ϴ ( log n )
Properties of binomial trees 1) | Bk | = 2k 2) degree(root(Bk)) = k 3) depth(Bk) = k ==> The degree and depth of a binomial tree with at most n nodes is at most log(n). Define the rank of Bk to be k
Binomial heaps (def) A collection of binomial trees at most one of every rank. Items at the nodes, heap ordered. 5 6 1 8 2 9 10 Possible rep: Doubly link roots and children of every nodes. Parent pointers needed for delete.
Binomial heaps (operations) Operations are defined via a basic operation, called linking, of binomial trees: Produce a Bk from two Bk-1, keep heap order. 1 6 4 11 6 9 5 10 2 9 5 8 5 10
Binomial heaps (ops cont.) Basic operation is meld(h1,h2): Like addition of binary numbers. B5 B4 B2 B1 h1: B4 B3 B1 B0 + h2: B4 B3 B0 B5 B4 B2
Binomial heaps (ops cont.) Findmin(h): obvious Insert(x,h) : meld a new heap with a single B0 containing x, with h deletemin(h) : Chop off the minimal root. Meld the subtrees with h. Update minimum pointer if needed. delete(x,h) : Bubble up and continue like delete-min decrease-key(x,h,) : Bubble up, update min ptr if needed All operations take O(log n) time on the worst case, except find-min(h) that takes O(1) time.
Amortized analysis We are interested in the worst case running time of a sequence of operations. Example: binary counter single operation -- increment 00000 00001 00010 00011 00100 00101
Amortized analysis (Cont.) On the worst case increment takes O(k). k = #digits What is the complexity of a sequence of increments (on the worst case) ? Define a potential of the counter: (c) = ? Amortized(increment) = actual(increment) +
Amortized analysis (Cont.) Amortized(increment1) = actual(increment1) + 1- 0 Amortized(increment2) = actual(increment2) + 2- 1 + … Amortized(incrementn) = actual(incrementn) + n- (n-1) iAmortized(incrementi) = iactual(incrementi) + n- 0 iAmortized(incrementi) iactual(incrementi) if n- 0 0
Amortized analysis (Cont.) Define a potential of the counter: (c) = #(ones) Amortized(increment) = actual(increment) + Amortized(increment) = 1+ #(1 => 0) + 1 - #(1 => 0) = O(1) ==> Sequence of n increments takes O(n) time
Binomial heaps - amortized ana. (collection of heaps) = #(trees) Amortized cost of insert O(1) Amortized cost of other operations still O(log n)
Binomial heaps + lazy meld Allow more than one tree of each rank. Meld (h1,h2) : Concatenate the lists of binomial trees. Update the minimum pointer to be the smaller of the minimums O(1) worst case and amortized.
Binomial heaps + lazy meld As long as we do not do a delete-min our heaps are just doubly linked lists: 9 11 4 6 9 5 Delete-min : Chop off the minimum root, add its children to the list of trees. Successive linking: Traverse the forest keep linking trees of the same rank, maintain a pointer to the minimum root.
Binomial heaps + lazy meld Possible implementation of delete-min is using an array indexed by rank to keep at most one binomial tree of each rank that we already traversed. Once we encounter a second tree of some rank we link them and keep linking until we do not have two trees of the same rank. We record the resulting tree in the array Amortized(delete-min) = = (#(trees at the end) + #links + max-rank) - #links (2log(n) + #links) - #links = O(log(n))
Binomial heaps + lazy delete Allow more than one tree of each rank. Meld (h1,h2), Insert(x,h) -- as before Delete(x,h) : simply mark x as deleted. Deletemin(h) : y = findmin(h) ; delete(y,h) How do we do findmin ?
Binomial heaps + lazy delete Traverse the trees top down purging deleted nodes and stopping at each non-deleted node Do successive linking on the forest you obtain.
Binomial heaps + lazy delete
Binomial heaps + lazy delete
Binomial heaps + lazy delete (ana.) Modify the potential a little: (collection of heaps) = #(trees) + #(deleted nodes) Insert, meld, delete : O(1) delete-min : like find-min What is the amortized cost of find-min ?
Binomial heaps + lazy delete (ana.) What is the amortized cost of find-min ? amortized(find-min) = amortized(purging) + amortized(successive linking + scan of undeleted nodes) We saw that: amortized(successive linking) = O(log(n)) Amortized(purge) = actual(purge) + (purge) Actual(purge) = #(nodes purged) + #(new trees) (purge) = #(new trees) - #(nodes purged) So, amortized(find-min) = O(log(n) + #(new trees) )
Binomial heaps + lazy delete (ana.) How many new trees are created by the purging step ? Let p = #(nodes purged), n = total #(nodes) Then #(new trees) = O( p*(log(n/p)+ 1) ) So, amortized(find-min) = O( p*(log(n/p)+ 1) ) Proof. Suppose the i-th purged node, 1 i p, had ki undeleted children. One of them has degree at least ki-1. Therefore in its subtree there are at least 2(ki-1) nodes.
Binomial heaps + lazy delete (ana.) Proof (cont). How large can k1+k2+ . . . +kp be such that i=1 2(ki-1) n ? Make all ki equal log(n/p) + 1, then i ki = p*(log(n/p)+ 1) p
Application: The round robin algorithm of Cheriton and Tarjan (76) for MST We shall use a Union-Find data structure. The union find problem is where we want to maintain a collection of disjoint sets under the operations 1) S=Union(S1,S2) 2) S=find(x) Can do it in O(1) amortized time for union and O((k,n)) amortized time for find, where k is the # of finds, and n is the number of items (assuming k ≥ n).
A greedy algorithm for MST Start with a forest where each tree is a singleton. Repeat the following step until there is only one tree in the forest: Pick T F, pick a minimum cost edge e connecting a vertex of T to a vertex in T’, add e to the forest (merge T and T’ to one tree) Prim’s algorithm: picks always the same T Kruskal’s algorithm: picks the lightest edge out of F
Cheriton & Tarjan’s ideas Keep the trees in a queue, pick the first tree, T, in the queue, pick the lightest edge e connecting it to another tree T’. Remove T’ from the queue, connect T and T’ with e. Add the resulting tree to the end of the queue.
Cheriton & Tarjan (cont.)
Cheriton & Tarjan (implementation) The vertices of each tree T will be a set in a Union-Find data structure. Denote it also by T Edges with one endpoint in T are stored in a heap data structure. Denoted by h(T). We use binomial queues with lazy meld and deletion. Find e by doing find-min on h(T). Let e=(v,w). Find T’ by doing find(w). Then create the new tree by T’’= union(T,T’) and h(T’’) = meld(h(T),h(T’))
Cheriton & Tarjan (implementation) Note: The meld implicitly delete edges. Every edge in h(T) with both endpoints in T is considered as “marked deleted”. We never explicitly delete edges! We can determine whether an edge is deleted or not by two find operations.
Cheriton & Tarjan (analysis) Assume for the moment that find costs O(1) time. Then we can determine whether a node is marked deleted in O(1) time, and our analysis is still valid. So, we have at most 2m implicit delete operations that cost O(m). at most n find operations that cost O(n). at most n meld and union operations that cost O(n). at most n find-min operations. The complexity of these find-min operations dominates the complexity of the algorithm.
Cheriton & Tarjan (analysis) Let mi be the number of edges in the heap at the i-th iteration. Let pi be the number of deleted edges purged from the heap at the find-min performed by the i-th iteration. So, we proved that the i-th find-min costs O(pi *(log(mi / pi)+ 1) ). We want to bound the sum of these expressions. We will bound i mi first.
Cheriton & Tarjan (analysis) Divide the iterations into passes as follows. Pass 1 is when we remove the original singleton trees from the queue. Pass i is when we remove trees added to the queue at pass i-1. What is the size of a tree removed from the queue at pass j ? At least 2j . (Prove by induction) So how many passes are there ? At most log(n)
Cheriton & Tarjan (analysis) An edge can occur in at most two heaps of trees in one pass. So i mi 2m log(n) Recall we want to bound O(i pi *(log(mi / pi)+ 1) ). 1) Consider all find-mins such that pi mi / log2(n): O(i pi *(log(mi / pi)+ 1) ) = O(i pi log log(n)) = O(m loglog(n)) 2) Consider all find-mins such that pi mi / log2(n): O(i pi *(log(mi / pi)+ 1) ) = O(i (mi / log2(n)) log(mi)) = O(m)
Cheriton & Tarjan (analysis) We obtained a time bound of O(m loglog(n)) under the assumption that find takes O(1) time. But if you amortize the cost of the finds on O(m loglog(n)) operations then the cost per find is really (m loglog(n),n) = O(1)
One-pass successive linking A tree produced by a link is immediately put in the output list and not linked again Worst case cost – O(n) Amortized cost – O(logn) Potential = Number of Trees Exercise: Prove it!
One-pass successive Linking 10 29 15 59 40 20 35 19 33 45 58 31 67 … 1 2 3
One-pass successive Linking 29 15 59 40 20 35 19 33 45 58 31 67 … 10 1 2 3
One-pass successive Linking 15 59 40 20 35 19 33 45 58 31 20 67 … 1 2 3 10 Output list: 29