1 Fibonacci heaps, and applications
2 Yet a better MST algorithm (Fredman and Tarjan)
Iteration $i$: we grow a forest, tree by tree, as follows. Start with a singleton vertex and continue as in Prim’s algorithm until one of the following happens:
1) the size of the heap is larger than $k_i$;
2) the next edge picked is connected to an already grown tree;
3) the heap is empty (if the graph is connected, this will happen only at the very end).
Then start the next tree from a vertex that is not yet in any tree.
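The three stopping rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `grow_forest`, the adjacency format `adj[v] = [(weight, neighbor), ...]`, and the use of a binary heap (`heapq`) with lazy deletion in place of an F-heap are all choices made here, not part of the slides. The dictionary `best` plays the role of the heap contents, so `len(best)` is the heap size that is compared with `k`.

```python
import heapq

def grow_forest(adj, k):
    """One pass of tree growing; returns the forest edges picked as (v, u, weight)."""
    n = len(adj)
    grown_by = [None] * n      # start vertex of the growth step that absorbed each vertex
    chosen = []
    for start in range(n):
        if grown_by[start] is not None:
            continue           # already inside some tree of this pass
        grown_by[start] = start
        best = {}              # frontier vertex -> lightest known edge weight into the current tree
        heap = []              # (weight, frontier vertex, tree vertex); stale entries are skipped

        def scan(v):
            for w, u in adj[v]:
                if grown_by[u] == start:
                    continue   # edge stays inside the current tree
                if u not in best or w < best[u]:
                    best[u] = w
                    heapq.heappush(heap, (w, u, v))

        scan(start)
        # Rule 3: stop when the heap is empty. Rule 1: stop when it holds more than k vertices.
        while best and len(best) <= k:
            w, u, v = heapq.heappop(heap)
            if best.get(u) != w:
                continue       # stale entry left behind by a later, lighter edge
            del best[u]
            chosen.append((v, u, w))
            if grown_by[u] is not None:
                break          # Rule 2: the edge reached an already grown tree
            grown_by[u] = start
            scan(u)
    return chosen
```

On a star whose center has degree 3 and with `k = 2`, the center stays a singleton (rule 1) and every leaf then attaches to it and stops at once (rule 2).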
3 Contract each tree into a single vertex and start iteration $i+1$. How do we contract?
Do a DFS on each tree, marking every vertex with the number of the tree that contains it. Each edge $e$ gets two numbers $l(e) \le h(e)$, the numbers of the trees at its endpoints. If $h(e) = l(e)$, remove $e$ (self loop). Bucket sort the edges (stably) by $h(e)$ and then by $l(e)$; parallel edges then become consecutive, so we can easily remove them, keeping only the lightest edge of each group. $O(m)$ time overall.
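A minimal sketch of this contraction step, assuming the tree numbers have already been computed (for example by the DFS) and are given as a list `tree_of`. The function name `contract` and the `(u, v, weight)` edge format are illustrative assumptions.

```python
def contract(num_trees, tree_of, edges):
    """Relabel edges by tree number, drop self loops, keep the lightest parallel edge."""
    # Give each edge the pair (l, h) with l <= h; an edge with l == h is a self loop.
    relabeled = []
    for u, v, w in edges:
        l, h = sorted((tree_of[u], tree_of[v]))
        if l != h:
            relabeled.append((l, h, w))

    def bucket_sort(items, key):
        # Stable bucket sort on an integer key in range(num_trees).
        buckets = [[] for _ in range(num_trees)]
        for item in items:
            buckets[key(item)].append(item)
        return [item for bucket in buckets for item in bucket]

    # Two stable passes, by h and then by l, leave equal (l, h) pairs adjacent.
    ordered = bucket_sort(bucket_sort(relabeled, lambda e: e[1]), lambda e: e[0])

    contracted = []
    for l, h, w in ordered:
        if contracted and contracted[-1][:2] == (l, h):
            if w < contracted[-1][2]:
                contracted[-1] = (l, h, w)   # parallel edge: keep the lighter one
        else:
            contracted.append((l, h, w))
    return contracted
```

Both passes touch every edge once and every bucket once, so the sketch runs in time linear in the number of edges plus the number of trees.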
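4 Analysis: each iteration takes linear time
Let $n_i$ be the number of vertices in the $i$-th iteration. An iteration performs $O(m)$ inserts, $O(m)$ decrease-keys and $O(n_i)$ delete-mins, each delete-min on a heap of size $O(k_i)$. Total: $O(n_i \log k_i + m)$. Set $k_i = 2^{2m/n_i}$, so that $n_i \log k_i = 2m$ and the work per phase is $O(m)$.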
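5 How many iterations do we have?
Every tree in iteration $i$ is incident with at least $k_i$ edges. So, with $m_i$ the number of edges in iteration $i$,
$n_{i+1} k_i \le 2m_i \le 2m$
$\Rightarrow n_{i+1} \le 2m_i / k_i \le 2m / k_i$
$\Rightarrow k_{i+1} = 2^{2m/n_{i+1}} \ge 2^{k_i}$.
Since $k_1 = 2^{2m/n}$, $k_i$ is at least a tower of $i$ twos topped by $2m/n$.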
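6 This runs in $O(m\,\beta(m,n))$
Once $k_i \ge n$ we stop. So the number of iterations is bounded by the minimum $i$ such that a tower of $i$ twos topped by $2m/n$ is at least $n$:
$j = \min\{ i \mid 2m/n \ge \log^{(i)} n \} = \beta(m,n)$, where $\log^{(i)}$ denotes the logarithm iterated $i$ times.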
7 Summary
The overall complexity of the algorithm is $O(m\,\beta(m,n))$, where $\beta(m,n) = \min\{ i \mid \log^{(i)} n \le 2m/n \}$. For every $m \ge n$, $\beta(m,n) \le \log^* n$. For $m > n \log n$ the algorithm degenerates to Prim’s. One can prove that $m\,\beta(m,n) = O(n \log n + m)$.
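To get a feel for how slowly $\beta(m,n)$ grows, the definition can be evaluated directly. The helper name `beta` is an assumption of this sketch, logarithms are taken base 2, and the input is assumed to satisfy $n \ge 2$ and $m \ge n - 1$ (a connected graph).

```python
import math

def beta(m, n):
    """Smallest i >= 1 such that the i-times iterated log2 of n is at most 2m/n."""
    i, value = 1, math.log2(n)
    while value > 2 * m / n:
        value = math.log2(value)   # apply one more logarithm
        i += 1
    return i
```

For a sparse graph with `m = n = 65536` this gives 3, while any graph with `m >= n * log2(n)` gives 1, which is the case where a single Prim-like iteration already suffices.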
8 So our record is $O(m\,\beta(m,n))$; can we do better?
Where is the bottleneck now? We may scan an edge $\beta(m,n)$ times. When we abandon a heap, we will rescan an edge per vertex in the heap. Idea: delay the scanning of edges (heavy edges).
9 Packets (Gabow, Galil, Spencer, Tarjan)
Group the edges incident to each vertex into packets of size $p$ each. Sort each packet. Treat each packet as a single edge (the first, that is the lightest, edge in the packet).
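As a small illustration, the packets of one vertex could be built as follows. The name `make_packets` and the `(weight, v, u)` edge format are assumptions of this sketch. The last packet may hold fewer than `p` edges.

```python
def make_packets(incident_edges, p):
    """Split the edges of one vertex into chunks of size p and sort each chunk by weight."""
    return [sorted(incident_edges[i:i + p])
            for i in range(0, len(incident_edges), p)]
```

Each packet is then represented in the heap by its first element only, so a vertex of degree `d` contributes about `d / p` heap candidates instead of `d`.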
10 Working with packets
When you extract the min from the heap, it is associated with a packet whose top edge is $(u,v)$. You add $(u,v)$ to the tree, delete $(u,v)$ from its packet, and relax this packet of $u$. Then traverse the packets of $v$ and relax each of them. How do you relax a packet?
11 Relaxing packet $p$ of vertex $v$
Check the smallest edge $(v,u)$ in $p$.
If $u$ is already in the tree: discard $(v,u)$ and recur.
If $u$ is not in the heap: insert it into the heap with weight $w(v,u)$.
If $u$ is in the heap: if the weight of $u$ is larger than the weight of $(v,u)$, decrease its key. Let $q$ be the packet with the larger top weight among the current packet and the packet previously associated with $u$; discard its first edge and recur on $q$.
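The relaxation rules can be written as a short loop. This is a sketch under several assumptions: packets are `deque`s of `(weight, v, u)` sorted by weight, the dictionary `key` stands in for the heap (so insert and decrease-key are plain assignments), `owner[u]` records the packet currently associated with `u`, and the guard for a packet that already represents `u` is a defensive addition that is not on the slide.

```python
from collections import deque

def relax(packet, in_tree, key, owner):
    """Relax a packet; recursion from the slide is unrolled into a loop."""
    while packet:
        w, v, u = packet[0]            # smallest edge of the packet
        if u in in_tree:
            packet.popleft()           # useless edge: discard and look at the next one
            continue
        if u not in key:
            key[u] = w                 # insert u into the heap
            owner[u] = packet
            return
        other = owner[u]
        if other is packet:
            return                     # defensive: this packet already represents u
        if w < key[u]:
            key[u] = w                 # decrease-key; the old packet loses
            owner[u] = packet
            loser = other
        else:
            loser = packet             # the current packet has the heavier top edge
        loser.popleft()                # discard the heavier edge into u
        packet = loser                 # and relax the losing packet again
```

Every trip around the loop either returns or discards one edge, which is what lets the analysis charge the extra work to discarded edges.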
12 Analysis
Initialization: $O(m)$ to partition into packets, $O(m \log p)$ to sort the packets.
An iteration: $O(n_i \log k_i)$ for extract-mins, and $O(m/p)$ work for the packets (constant work for each). Additional work per packet we can charge to an edge which we discard: total $O(m)$.
Summing up: $O\bigl( m + m \log p + \sum_{\text{iterations}} ( n_i \log k_i + m/p ) \bigr)$
13 Analysis (Cont)
$O\bigl( m \log p + \sum_{\text{iterations}} ( n_i \log k_i + m/p ) \bigr)$
Set $k_i = 2^{2m/(p n_i)}$, so the work per phase is $O(m/p)$. Every tree in iteration $i$ is incident with at least $k_i$ packets. So
$n_{i+1} k_i \le 2m/p$
$\Rightarrow n_{i+1} \le 2m / (p k_i)$
$\Rightarrow k_{i+1} = 2^{2m/(p n_{i+1})} \ge 2^{k_i}$.
Set $k_1 = 2^{2m/n}$; the work in the first phase is $O(m)$. Then $k_i$ is again at least a tower of $i$ twos topped by $2m/n$, so we have $\le \beta(m,n)$ iterations.
14 Analysis (Cont)
The running time is $O\bigl( m \log p + \beta(m,n)\, m/p \bigr)$. Choose $p = \beta(m,n)$, so we get $O\bigl( m \log \beta(m,n) \bigr)$.
15 But we cheated…
If we want the running time per iteration to be $m/p$, how do we do contractions? Use union/find (contract by uniting vertices and concatenating their adjacency lists). You get an $\alpha(m,n)$ overhead when you relax a packet, where $\alpha$ is the inverse Ackermann function, and overall
$O\bigl( m\,\alpha(m,n) + m \log \beta(m,n) + \sum_i n_i \bigr) = O\bigl( m \log \beta(m,n) \bigr)$
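A minimal sketch of contraction by union/find follows. The class name `ContractedVertices` and the `(neighbor, weight)` adjacency format are assumptions made here. Python's `list.extend` is used for brevity and is not constant time; a real implementation would concatenate linked lists in $O(1)$.

```python
class ContractedVertices:
    """Union/find over vertices; each root owns the adjacency list of its contracted tree."""

    def __init__(self, adjacency):
        self.parent = list(range(len(adjacency)))
        self.size = [1] * len(adjacency)
        self.adj = [list(edges) for edges in adjacency]

    def find(self, v):
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:          # path compression
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.size[a] < self.size[b]:        # union by size
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        self.adj[a].extend(self.adj[b])        # concatenate the adjacency lists
        self.adj[b] = []
        return a
```

Edges are not relabeled at contraction time. Instead, whenever an edge is examined, its endpoint is resolved with `find`, which is where the $\alpha(m,n)$ overhead per relaxation comes from.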
16 Furthermore, it wasn’t our only cheat…
We cannot partition the edges incident to a vertex into packets each of size exactly $p$. So there can be at most one undersized packet per vertex. When you contract, merge undersized packets. How do you merge undersized packets?
17 Use a little F-heap to represent each packet
This gives a $\log p$ overhead to relax a packet. But still the overall time is $O\bigl( m \log \beta(m,n) \bigr)$.