External-Memory MST (Arge, Brodal, Toma)
Minimum-Spanning Tree
Given a weighted, undirected graph G = (V, E), the minimum-spanning tree (MST) problem is the problem of finding a spanning tree of G of minimum total weight.
Assumptions:
1. G is connected.
2. No two edges in G have the same weight.
External-Memory Graph Algorithms
Standard two-level I/O model with a single disk:
N = V + E
M = number of vertices/edges that fit into internal memory
B = number of vertices/edges per disk block
The graph is given as a list of edges sorted by vertex.
External-Memory Graph Algorithms (2)
For MST and connected components (CC), randomized algorithms using O(sort(E)) I/Os are known.
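The bounds on these slides are stated in terms of scanning and sorting costs. The sketch below assumes the usual I/O-model conventions, scan(N) = ⌈N/B⌉ and sort(N) = Θ((N/B) · log_{M/B}(N/B)); the slides use sort(·) without spelling it out, so the definition, the function names, and the dropped constants here are assumptions made for illustration.

```python
import math

def scan_ios(n, B):
    """I/Os needed to read n contiguous items with B items per block."""
    return math.ceil(n / B)

def sort_ios(n, M, B):
    """Order-of-magnitude sorting bound (n/B) * log_{M/B}(n/B).

    Constant factors are dropped; assumes M > B so the log base exceeds 1.
    """
    blocks = n / B
    if blocks <= 1:
        return 1.0
    return blocks * max(1.0, math.log(blocks, M / B))

# Example with hypothetical parameters: 1000 items, M = 100, B = 10.
print(scan_ios(1000, 10), sort_ios(1000, 100, 10))
```

With these hypothetical parameters a scan costs 100 I/Os and the sorting bound is about twice that, which is why O(sort(E)) is treated as "almost linear" throughout.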
Prim’s Algorithm
[Figure: example graph on vertices a, b, c, d, e, f with edge weights 1 to 9, and a priority queue holding vertices. Starting from b, the algorithm selects the edges {b,a}, {a,c}, {c,d}, {d,e}, {a,f}.]
Prim’s Algorithm (2)
Prim’s algorithm cannot be implemented efficiently in external memory:
It is not guaranteed that even the priority queue alone fits in memory.
Thus, we cannot in general get the current priority of a vertex without using an I/O.
A direct implementation leads to an Ω(E) I/O algorithm.
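Prim’s Algorithm (3)
Modification: store edges in the priority queue instead of vertices.
Any two edges have distinct weights. An edge whose endpoints are both already in the tree therefore appears in the queue twice, and its two copies are extracted consecutively, so they can be recognized and discarded.
[Figure: the example graph with edges {a,b} (1), {c,d} (2), {a,c} (3), {d,e} (4), {b,c} (5), {b,d} (6), {a,f} (7), {c,e} (8), {e,f} (9). Selected edges: {b,a}, {a,c}, {c,d}, {d,e}, {a,f}.]
Priority queue contents during the run:
visit b: {b,a} (1) {b,c} (5) {b,d} (6)
visit a: {a,c} (3) {b,c} (5) {b,d} (6) {a,f} (7)
visit c: {c,d} (2) {b,d} (6) {c,b} (5) {a,f} (7) {b,c} (5) {c,e} (8)
visit d: {d,e} (4) {b,d} (6) {c,b} (5) {a,f} (7) {b,c} (5) {c,e} (8) {d,b} (6)
visit e: {c,b} (5) {a,f} (7) {b,c} (5) {e,c} (8) {b,d} (6) {c,e} (8) {d,b} (6) {e,f} (9)
discard both copies of {b,c}: {b,d} (6) {e,c} (8) {d,b} (6) {c,e} (8) {a,f} (7) {e,f} (9)
discard both copies of {b,d}: {a,f} (7) {e,c} (8) {c,e} (8) {e,f} (9)
visit f: {e,c} (8) {c,e} (8) {e,f} (9) {f,e} (9)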
Modified Prim Algorithm
The correctness follows directly from the correctness of the original algorithm (the “blue rule” still applies).
Efficiency:
1. At least one I/O per vertex is needed in order to read its adjacency list => O(V + E/B) I/Os.
2. The O(E) operations on an external priority queue can be performed in O(sort(E)) I/Os.
3. Thus in total we have O(V + sort(E)) I/Os.
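The edge-based variant can be sketched in memory as follows. This is only an illustration under stated assumptions: Python's `heapq` stands in for the external priority queue, the adjacency dictionary stands in for adjacency lists on disk, and the names `adjacency`, `modified_prim`, and `EDGES` are made up here. The edge weights are the ones that appear in the priority-queue trace on the slides.

```python
import heapq

# Example graph; weights read off the priority-queue trace on the slides.
EDGES = [("a", "b", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4),
         ("b", "c", 5), ("b", "d", 6), ("a", "f", 7), ("c", "e", 8),
         ("e", "f", 9)]

def adjacency(edges):
    """Build {vertex: [(weight, neighbor), ...]} from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((w, v))
        adj.setdefault(v, []).append((w, u))
    return adj

def modified_prim(adj, start):
    """Prim with edges in the queue; assumes a connected graph, distinct weights."""
    # Each entry is (weight, visited endpoint, other endpoint).
    pq = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(pq)
    mst = []
    while pq:
        w, u, v = heapq.heappop(pq)
        if pq and pq[0][0] == w:
            # The same edge is queued twice, so both endpoints are in the tree.
            heapq.heappop(pq)
            continue
        mst.append((u, v, w))          # v is a new vertex
        for w2, x in adj[v]:
            if w2 != w:                # skip the edge we arrived on
                heapq.heappush(pq, (w2, v, x))
    return mst

print(modified_prim(adjacency(EDGES), "b"))
# [('b', 'a', 1), ('a', 'c', 3), ('c', 'd', 2), ('d', 'e', 4), ('a', 'f', 7)]
```

Note that the loop never looks up whether a vertex has been visited; the duplicate test on the queue replaces that lookup, which is the point of the modification.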
Boruvka’s Algorithm
(1) Select for each vertex the minimum-weight edge incident to it.
(2) Contract the graph and return to (1).
[Figure: first round on the example graph on vertices a to f with edge weights 1 to 9. Selected edges: {b,a}, {c,d}, {d,e}, {a,f}.]
Boruvka’s Algorithm
[Figure: second round. The contracted graph has the supervertices abf and cde, joined by edges of weights 3, 5, 6, 9. Edges selected so far: {b,a}, {a,c}, {c,d}, {d,e}, {a,f}.]
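The two steps can be sketched in memory as below. This is a plain internal-memory illustration, not the external-memory procedure developed on the following slides; the function name `boruvka`, the component-relabeling loop, and the `(weight, u, v)` edge layout are assumptions made for the example.

```python
def boruvka(vertices, edges):
    """edges: (weight, u, v) triples with distinct weights; graph connected."""
    comp = {v: v for v in vertices}        # current supervertex of each vertex
    mst = []
    while len(set(comp.values())) > 1:
        # Step (1): lightest edge leaving each supervertex.
        best = {}
        for w, u, v in edges:
            cu, cv = comp[u], comp[v]
            if cu == cv:
                continue                   # self-loop after contraction
            for c in (cu, cv):
                if c not in best or w < best[c][0]:
                    best[c] = (w, u, v)
        if not best:
            break                          # disconnected input; nothing to add
        # Step (2): contract along the selected edges.
        for w, u, v in sorted(set(best.values())):
            mst.append((w, u, v))
            old, new = comp[u], comp[v]
            for x in comp:
                if comp[x] == old:
                    comp[x] = new
    return mst

EXAMPLE = [(1, "a", "b"), (2, "c", "d"), (3, "a", "c"), (4, "d", "e"),
           (5, "b", "c"), (6, "b", "d"), (7, "a", "f"), (8, "c", "e"),
           (9, "e", "f")]
print(boruvka("abcdef", EXAMPLE))
# [(1, 'a', 'b'), (2, 'c', 'd'), (4, 'd', 'e'), (7, 'a', 'f'), (3, 'a', 'c')]
```

The first round picks the four edges shown on the first slide, and the second round adds {a,c}, the lightest of the edges joining abf and cde.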
External-Memory Boruvka’s Step
For each vertex v, let C(v) be the neighbor of v at the other end of the lightest edge incident to v.
Let G’ be the graph obtained by taking only the edges of the form (v, C(v)) for each v.
Let G’_d be the graph obtained by directing each edge (v, C(v)) in G’ from C(v) to v.
The goal is to contract each connected component of G’ into a single vertex.
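Unique Representatives
In each connected component of G’_d:
Each vertex has indegree 1.
The weights of the edges along any root-to-leaf path are increasing.
There is exactly one cycle: a cycle of length two formed by the minimum-weight edge of the component, which is selected by both of its endpoints. Removing one direction of that edge leaves a tree rooted at the endpoint that lost its incoming edge.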
External-Memory Boruvka’s Step (2)
The roots can be easily identified, and we can choose them to be the unique representatives of the components of G’.
We would like to replace each edge (u, v) with an edge (u_r, v_r), where u_r and v_r are the unique representatives of the components containing u and v, respectively.
Then we can remove parallel edges and self-loops, and obtain the contracted graph.
External-Memory Boruvka’s Step (3)
[Figure: the example graph G on vertices a to f with edge weights 1 to 9, together with G’ and G’_d.]
L:
(b,a) (1); (a,f) (7)
(c,d) (2); (d,e) (4)
(d,e) (4)
(a,f) (7)
Priority queue, initialized with each vertex that is an immediate successor of a root vertex (entries shown as vertex (weight) [representative]):
a (1) [b], d (2) [c]
d (2) [c], f (7) [b]
e (4) [c], f (7) [b]
Output (vertex → representative): b → b, c → c, a → b, d → c, f → b, e → c
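The propagation of representatives in this example can be simulated as follows. The sketch is an in-memory stand-in under stated assumptions: `heapq` replaces the external priority queue, the list `L` here holds one record per (edge, successor edge) pair sorted by the first edge's weight, and the function name and record layout are hypothetical. It relies on the property that edge weights increase along every root-to-leaf path, so `L` is consumed in a single forward pass.

```python
import heapq

def propagate_representatives(roots, forest_edges):
    """forest_edges: (parent, child, weight) edges of G'_d after cycle breaking."""
    succ = {}
    for parent, child, w in forest_edges:
        succ.setdefault(parent, []).append((w, child))
    # One record (weight of edge into v, weight of successor edge, successor).
    L = sorted((w, w2, x)
               for _, child, w in forest_edges
               for w2, x in succ.get(child, []))
    rep = {r: r for r in roots}
    # Queue entries are (weight of incoming edge, vertex, representative).
    pq = [(w, child, parent) for parent, child, w in forest_edges if parent in rep]
    heapq.heapify(pq)
    i = 0
    while pq:
        w, v, r = heapq.heappop(pq)
        rep[v] = r
        # Weights are extracted in increasing order, so L is scanned once.
        while i < len(L) and L[i][0] == w:
            _, w2, x = L[i]
            heapq.heappush(pq, (w2, x, r))
            i += 1
    return rep

forest = [("b", "a", 1), ("a", "f", 7), ("c", "d", 2), ("d", "e", 4)]
print(propagate_representatives(["b", "c"], forest))
# {'b': 'b', 'c': 'c', 'a': 'b', 'd': 'c', 'e': 'c', 'f': 'b'}
```

The queue passes through the same three states listed on the slide, and the result matches the slide's output mapping.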
External-Memory Boruvka’s Step (4)
To finish the contraction:
1. Sort the output of the previous phase and E by the first component. Then scan the two lists simultaneously, replacing each edge (v, u) in E with (v_r, u).
2. Sort E by the second component, keeping the output sorted by vertex, and then scan the two lists simultaneously, replacing each edge (v_r, u) in E with (v_r, u_r).
3. Sort E by both components and by weight, and with a single scan remove duplicate edges and self-loops.
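The three sort-and-scan passes can be mimicked in memory as below. This is a sketch under assumptions: Python's `sorted` stands in for external sorting, each undirected edge is stored once as `(u, v, weight)`, endpoints are put in a canonical order before the final pass so that parallel edges become adjacent, and the lightest edge of each parallel group is the one kept. The names `relabel` and `contract` are illustrative.

```python
def relabel(edges, reps, pos):
    """Sort edges by endpoint `pos`, then merge-scan against reps sorted by vertex."""
    edges = sorted(edges, key=lambda e: e[pos])
    reps = sorted(reps)                     # (vertex, representative) pairs
    out, j = [], 0
    for e in edges:
        while reps[j][0] < e[pos]:          # every endpoint occurs in reps
            j += 1
        e = list(e)
        e[pos] = reps[j][1]
        out.append(tuple(e))
    return out

def contract(edges, reps):
    """Return the contracted edge list without self-loops or parallel edges."""
    edges = relabel(edges, reps, 0)         # step 1: first component
    edges = relabel(edges, reps, 1)         # step 2: second component
    # Step 3: canonical endpoint order, sort, and one scan.
    edges = sorted((min(u, v), max(u, v), w) for u, v, w in edges)
    out = []
    for u, v, w in edges:
        if u == v:
            continue                        # self-loop
        if out and out[-1][:2] == (u, v):
            continue                        # heavier parallel edge
        out.append((u, v, w))
    return out

edges = [("a", "b", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4),
         ("b", "c", 5), ("b", "d", 6), ("a", "f", 7), ("c", "e", 8),
         ("e", "f", 9)]
reps = [("a", "b"), ("b", "b"), ("c", "c"), ("d", "c"), ("e", "c"), ("f", "b")]
print(contract(edges, reps))   # [('b', 'c', 3)]
```

A fuller version would carry the original endpoints along with each edge so that MST edges can be reported in terms of the input graph; that bookkeeping is omitted here.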
Boruvka’s Step - I/O efficiency
1. Lightest incident edges can be collected in O(E/B) I/Os in a simple scan of the edge-list representation of G (we assume it is sorted by vertex).
2. Detection of cycles in G’_d can be done in O(sort(V)) I/Os: sort the collected edges by weight and find duplicates in a single scan. Remove edges to break the cycles and identify the unique representatives.
Boruvka’s Step - I/O efficiency (2)
3. The list L contains each edge of G’_d at most twice, and can be constructed in O(sort(V)) I/Os:
sort one instance of the list of edges by the second component;
sort another instance by the first component;
create the structure of L in a single scan and sort it by weight.
4. The PQ can be initialized in a similar way in O(sort(V)) I/Os.
Boruvka’s Step - I/O efficiency (3)
5. We perform a total of V insertions into the PQ and V extract-min operations. These can be performed in O(sort(V)) I/Os.
6. Replacing the edges of G with the unique representatives is done using a few sorting and scanning operations as described before. Here the entire edge list is sorted, and thus O(sort(E)) I/Os are needed.
Total: O(E/B + sort(V) + sort(E)) = O(sort(E)) I/Os.
Results So Far
Modified Prim: O(V + sort(E)) I/Os
Modified Boruvka: O(sort(E) · lg V) I/Os
1. Contract G until V ≤ E/B using Boruvka’s steps. 2. Run Prim on the result: O(sort(E) · lg(V·B/E)) I/Os
It is possible to perform lg(V·B/E) Boruvka’s steps using lglg(V·B/E) superphases requiring O(sort(E)) I/Os each.
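The lg(V·B/E) factor comes from counting how many Boruvka steps are needed before V ≤ E/B. A small sketch, assuming only that each step at least halves the number of vertices (every contracted component contains at least two vertices); the function name and the sample sizes are hypothetical.

```python
import math

def boruvka_steps_until_small(V, E, B):
    """Number of halvings of V until V <= E / B."""
    steps = 0
    while V > E / B:
        V //= 2          # a Boruvka step leaves at most half the vertices
        steps += 1
    return steps

# Hypothetical sizes: V = 2^20 vertices, E = 2^22 edges, B = 2^10 items per block.
V, E, B = 2**20, 2**22, 2**10
print(boruvka_steps_until_small(V, E, B), math.log2(V * B / E))   # 8 8.0
```

Once V ≤ E/B, the V term in the modified Prim bound is at most E/B, so running Prim on the contracted graph costs O(sort(E)) I/Os.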
Yet a better MST algorithm: Superphase Algorithm
At superphase i:
Let N_i = 2^{(3/2)^i}, so that N_{i+1} = N_i · √N_i.
Let G_i = (V_i, E_i) be the graph prior to superphase i.
Let E_i’ ⊆ E_i be the set that contains, for each vertex, the √N_i lightest edges incident to it.
Let the blocking value of a vertex be the weight of the (√N_i + 1)-th lightest edge incident to it (or infinity if no such edge exists).
E_i’ and the blocking values can be found with O(sort(E_i)) I/Os as described earlier.
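The selection of light edges and blocking values can be sketched as below. This is an in-memory illustration under assumptions: the parameter `k` plays the role of √N_i (how it is rounded is left to the caller), a dictionary of adjacency lists replaces the sorted edge list on disk, and the function name is made up for the example.

```python
import math

def light_edges_and_blocking(adj, k):
    """adj: {v: [(weight, neighbor), ...]}.

    Returns the set of the k lightest edges at every vertex and, per vertex,
    the weight of its (k+1)-th lightest edge (infinity if there is none).
    """
    selected, blocking = set(), {}
    for v, incident in adj.items():
        incident = sorted(incident)
        for w, u in incident[:k]:
            selected.add((w, min(u, v), max(u, v)))   # store each edge once
        blocking[v] = incident[k][0] if len(incident) > k else math.inf
    return selected, blocking

adj = {
    "a": [(1, "b"), (3, "c"), (7, "f")],
    "b": [(1, "a"), (5, "c"), (6, "d")],
    "c": [(2, "d"), (3, "a"), (5, "b"), (8, "e")],
    "d": [(2, "c"), (4, "e"), (6, "b")],
    "e": [(4, "d"), (8, "c"), (9, "f")],
    "f": [(7, "a"), (9, "e")],
}
selected, blocking = light_edges_and_blocking(adj, 1)
print(sorted(selected))   # [(1, 'a', 'b'), (2, 'c', 'd'), (4, 'd', 'e'), (7, 'a', 'f')]
print(blocking)           # {'a': 3, 'b': 5, 'c': 3, 'd': 4, 'e': 8, 'f': 9}
```

In external memory the same result would come from sorting the edge list by (vertex, weight) and scanning it once, which is where the O(sort(E_i)) bound comes from.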
Superphase Algorithm
At superphase i, perform log √N_i contraction phases on G_i’ (the graph with edge set E_i’) as described before, but now select the lightest edge incident to a vertex only if its weight is smaller than the vertex’s blocking value.
After a single contraction, the blocking value of a supervertex is set to the minimum of the blocking values of the contracted vertices.
After that, the remaining edges of E_i’ contain all edges of E_i incident to a supervertex v with weight smaller than the blocking value of v.
Thus only edges that actually belong to the MST are contracted.
Superphase Algorithm (2)
But how many vertices remain after each superphase?
The blocking value might prevent us from selecting an edge for a supervertex v. If so, then the blocking value of v is the blocking value of some vertex u in V_i, and v must contain the √N_i edges incident to u in E_i’. Thus v must be the contraction of at least √N_i vertices of V_i.
If no blocking value prevents us from selecting an edge for v, then after log √N_i phases v must be the contraction of at least 2^{log √N_i} = √N_i vertices.
Superphase Algorithm (3)
It can be proved by induction on i that V_i ≤ 2V / N_i:
For i = 0, N_0 = 2 and V_0 = V.
V_{i+1} ≤ V_i / √N_i ≤ (2V / N_i) / √N_i = 2V / N_{i+1}.
Conclusion: E_i’ ≤ V_i · √N_i ≤ 2V / √N_i.
Thus, in order to reduce the number of vertices by a factor of √N_i, we have used so far
O(sort(E_i) + sort(E_i’) · log √N_i) = O(sort(E) + sort(V / √N_i) · log √N_i) = O(sort(E)) I/Os.
Superphase Algorithm (4)
In order to finish a superphase, we need to reincorporate the edges of E_i that were not selected into E_i’:
During the contraction phases, maintain a list C of pairs (v, v_s) for v ∈ V_i, where v_s is the supervertex that currently contains v.
Use the output of the Boruvka’s step, as described earlier, in order to update C: sort C by the second component and the output by the first component, and scan them simultaneously. This is done using O(sort(V_i)) I/Os.
In total, in order to maintain C, we use O(sort(V_i) · log √N_i) = O(sort(V / N_i) · log √N_i) = O(sort(V)) I/Os.
Superphase Algorithm - I/O Efficiency
1. E_i’ and the blocking values are computed in O(sort(E_i)) I/Os.
2. The contraction phases of each superphase take O(sort(E)) I/Os.
3. Maintaining the list C during the superphase is done with O(sort(V)) I/Os.
4. Given C, the edges in E_i \ E_i’ can be reincorporated in O(sort(E)) I/Os, as we did in the single contraction algorithm.
5. Finally, in order to reduce V to E/B, log_{3/2} lg(V·B/E) superphases are needed.
6. Total: O(sort(E) · lglg(V·B/E)) I/Os.
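The superphase count in item 5 can be checked numerically. The sketch below assumes the bound V_i ≤ 2V / N_i with N_i = 2^{(3/2)^i} and finds the first i for which that bound drops to E/B; since it works from an upper bound, the result is a sufficient number of superphases, not necessarily the exact number needed. The function name and the sample sizes are hypothetical.

```python
import math

def superphases_needed(V, E, B):
    """Smallest i with 2V / N_i <= E / B, where N_i = 2 ** ((3/2) ** i)."""
    # 2V / N_i <= E/B  is equivalent to  (3/2)^i >= lg(2 * V * B / E).
    target = math.log2(2 * V * B / E)
    exponent, i = 1.0, 0                  # exponent = (3/2)^i
    while exponent < target:
        exponent *= 1.5
        i += 1
    return i

# Hypothetical sizes: V = 2^20, E = 2^22, B = 2^10, so V*B/E = 2^8.
print(superphases_needed(2**20, 2**22, 2**10))   # 6
```

For these sizes plain Boruvka contraction would need lg(V·B/E) = 8 steps of O(sort(E)) I/Os each, while 6 superphases of O(sort(E)) I/Os suffice; the gap widens as V·B/E grows, because the superphase count grows only doubly logarithmically.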