A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees


A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees. David R. Karger (Stanford), Philip N. Klein (Brown), Robert E. Tarjan (Princeton). Published 1995. Presented by Claire Le Goues, Theory Lunch, August 14, 2008.

Minimum Spanning Tree (MST) Definition: a subset of the edges of a connected, undirected, weighted graph that connects all the vertices, contains no cycles, and has minimum total edge weight. (The weight of the tree shown is 65.)

MST: Why? Cable/phone line layouts, airplane route scheduling, electricity grids, and so on. Computing MSTs quickly and efficiently is important, because real datasets are huge. What is the baseline? The best known deterministic techniques take on the order of n log n time, while a candidate MST can be verified in linear time.

Previous (Deterministic) Approaches. Greedy algorithms, polynomial time: Boruvka (1926); Prim's and Kruskal's algorithms. Faster algorithms, near O(n log n): Chazelle; Gabow et al. (Boruvka's goal was efficient electrical coverage of Moravia.) The details are beyond the scope of this presentation; knowledge of Prim and Kruskal is helpful but not necessary, since I'm going to be walking you through a new algorithm to solve this problem.
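Since Kruskal's algorithm is only mentioned in passing, here is a minimal sketch for reference; the (weight, u, v) edge format and 0..n-1 vertex labels are assumptions for illustration, not anything from the original slides:

```python
def kruskal(n, edges):
    """Minimum spanning tree of a connected graph on vertices 0..n-1.

    edges: list of (weight, u, v) tuples. Returns the list of MST edges.
    """
    parent = list(range(n))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):  # greedily try edges lightest-first
        ru, rv = find(u), find(v)
        if ru != rv:               # edge joins two components: keep it
            parent[ru] = rv
            mst.append((w, u, v))
    return mst
```

Sorting dominates, giving O(m log m) time; the union-find operations are nearly constant amortized, which is why the deterministic bounds above all carry a logarithmic factor.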

Can we do better? SURE! (probabilistically) Introducing: RANDOMNESS. We want the expected running time to be linear, and in practice the algorithm usually does very well.

A Sorting Digression. Consider quicksort (vs. mergesort, for example): pick a random pivot and recurse on the two parts. If we pick the pivot well, it's much faster, but we need to know how to pick and how to combine the subparts. KKT's algorithm is analogous. Note what quicksort is doing when building things back up: combining two sorted lists takes O(n) time. Combining minimum spanning forests is similarly fast: we're basically writing down a list of edges.
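To make the analogy concrete, here is a minimal sketch of the randomized quicksort being referenced:

```python
import random

def quicksort(a):
    """Sort a list by recursing on the parts around a random pivot."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    # Partition around the pivot, then combine - combining is the cheap part.
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

A bad pivot gives quadratic behavior, but a random pivot gives O(n log n) expected time; KKT's random edge sampling plays the same role for spanning forests.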

Preliminaries: Cut and Cycle Properties. Cut property: for any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning tree. Cycle property: for any cycle C in the graph, the heaviest edge in C does not appear in the minimum spanning tree.

More preliminaries. w(x,y) is the weight of the edge {x, y}. Where F is a forest in G, F(x,y) is the path connecting x and y in F, and w_F(x,y) is the maximum weight of an edge on F(x,y). An edge (x,y) is F-heavy if w(x,y) > w_F(x,y), and F-light otherwise. F-heavy edges don't belong in an MST, by the cycle property. A small example graph follows, showing how an F-heavy edge is excluded from the MST.
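The F-heavy test can be sketched directly from these definitions. This brute-force version walks the forest path explicitly; the adjacency-dict forest representation is an assumption for illustration (the paper's point is that these tests can be batched in linear time, which this sketch does not attempt):

```python
def max_weight_on_path(forest, x, y):
    """w_F(x, y): the max edge weight on the x-y path in the forest,
    or None if x and y lie in different trees. forest maps each vertex
    to a list of (neighbor, weight) pairs."""
    stack = [(x, None, 0)]  # (vertex, parent, max weight seen so far)
    while stack:
        v, parent, best = stack.pop()
        if v == y:
            return best
        for u, w in forest.get(v, []):
            if u != parent:
                stack.append((u, v, max(best, w)))
    return None

def is_f_heavy(forest, x, y, w):
    """An edge (x, y) of weight w is F-heavy iff w > w_F(x, y)."""
    wf = max_weight_on_path(forest, x, y)
    return wf is not None and w > wf
```

On a forest like F = {A-C (1), C-D (1), B-C (8)} (my reconstruction of the example that follows), the edge (B, D) of weight 12 is F-heavy since 12 > 8, while (A, B) of weight 3 is F-light since 3 ≤ 8.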

F-Heavy Edges: a worked example. [Figure: a four-vertex graph on A, B, C, D with edges of weights 1, 1, 3, 7, 8, and 12, and a forest F drawn within it.] First consider (B, D): w(B,D) = 12, while the heaviest edge on F's path from B to D weighs 8, so w_F(B,D) = 8. Since 12 > 8, (B,D) is F-heavy! Next consider (A, D): w(A,D) = 7 and w_F(A,D) = 1, so 7 > 1 and (A,D) is F-heavy too.

F-Heavy Edges - The Final MST. Notice I could keep sampling forests in this graph and throwing away F-heavy edges, which don't belong in the MST, and I'd wind up with this MST for the graph. That'll take too long, though; is there a better way we could sample the forests in the graph? That is the gist of this algorithm.

One more thing: the Boruvka Step. For each vertex v, select the minimum-weight edge incident to v. The selected edges define connected components; coalesce each component into a single vertex. Delete all resulting self-loops, isolated vertices, and all but the lowest-weight edge among each set of parallel edges.
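The step above can be sketched in Python; the (weight, u, v) edge format, distinct weights, and integer vertex labels 0..n-1 are assumptions for illustration:

```python
def boruvka_step(n, edges):
    """One Boruvka step on a graph with vertices 0..n-1 and distinct
    (weight, u, v) edge weights. Returns (selected edges, component
    label per vertex)."""
    # Pick each vertex's minimum-weight incident edge.
    cheapest = {}
    for w, u, v in edges:
        for x in (u, v):
            if x not in cheapest or w < cheapest[x][0]:
                cheapest[x] = (w, u, v)
    selected = set(cheapest.values())

    # Coalesce the components those edges define, via union-find.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, u, v in selected:
        parent[find(u)] = find(v)
    return selected, {x: find(x) for x in range(n)}
```

Deleting self-loops and parallel edges then amounts to dropping edges whose endpoints share a label and keeping only the lightest edge per pair of labels.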

The Algorithm. STEP 1: Perform two Boruvka steps. STEP 2: Choose a subgraph H by selecting each edge independently with probability 1/2. Apply the algorithm recursively to H, producing a minimum spanning forest F of H. Throw away all F-heavy edges in the entire graph. STEP 3: Recurse on the remaining graph to compute a spanning forest F'. Return the edges contracted in STEP 1 together with the edges of F'. For the purposes of illustration, the example that follows performs only one Boruvka step. Each Boruvka step reduces the number of vertices by at least a factor of 2.
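The three steps can be sketched end to end. This is a runnable illustration, not the paper's linear-time implementation: the F-heavy filter here is a brute-force path search standing in for the linear-time verification step, and the (weight, u, v) edge format with distinct weights is an assumption. Each internal edge carries its original edge as a fourth field, so contracted copies can be traced back to the answer:

```python
import random

def one_step(edges):
    """One Boruvka step on (w, u, v, orig) edges.
    Returns (set of chosen original edges, contracted edge list)."""
    cheapest = {}
    for e in edges:
        w, u, v, _ = e
        for x in (u, v):
            if x not in cheapest or w < cheapest[x][0]:
                cheapest[x] = e
    chosen = {e[3] for e in cheapest.values()}
    # Find the connected components defined by the chosen edges.
    adj = {}
    for _, u, v, _ in set(cheapest.values()):
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    label = {}
    for start in adj:
        if start not in label:
            label[start] = start
            stack = [start]
            while stack:
                x = stack.pop()
                for y in adj[x]:
                    if y not in label:
                        label[y] = start
                        stack.append(y)
    # Relabel endpoints; drop self-loops; keep the lightest parallel edge.
    best = {}
    for w, u, v, orig in edges:
        a, b = label.get(u, u), label.get(v, v)
        if a != b:
            key = frozenset((a, b))
            if key not in best or w < best[key][0]:
                best[key] = (w, a, b, orig)
    return chosen, list(best.values())

def max_on_path(forest, x, y):
    """Max edge weight on the x-y path in the forest, or None if no path."""
    adj = {}
    for w, u, v, _ in forest:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    stack = [(x, None, 0)]
    while stack:
        v, parent, best = stack.pop()
        if v == y:
            return best
        for u, w in adj.get(v, []):
            if u != parent:
                stack.append((u, v, max(best, w)))
    return None

def kkt(edges):
    if not edges:
        return set()
    c1, e1 = one_step(edges)        # STEP 1: two Boruvka steps
    c2, e2 = one_step(e1)
    if not e2:
        return c1 | c2
    sample = [e for e in e2 if random.random() < 0.5]   # STEP 2
    in_f = kkt(sample)              # min spanning forest F of the sample
    forest = [e for e in sample if e[3] in in_f]
    light = []
    for e in e2:                    # discard F-heavy edges
        cap = max_on_path(forest, e[1], e[2])
        if cap is None or e[0] <= cap:
            light.append(e)
    return c1 | c2 | kkt(light)     # STEP 3

def mst(edge_list):
    """edge_list: (w, u, v) tuples with distinct weights."""
    return kkt([(w, u, v, (w, u, v)) for w, u, v in edge_list])
```

Every F-heavy edge discarded is justified by the cycle property, so with distinct weights the result is the unique MST regardless of the coin flips - the randomness affects only the running time (a Las Vegas algorithm).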

[Worked example: a sequence of figures traces the algorithm on a 21-vertex weighted graph (vertices 0-20); only the structure of the figures is summarized here.]

A Boruvka step selects each vertex's minimum-weight incident edge; we know all of these edges must be in a minimum spanning tree by the cut property. The selected edges define components, which coalesce into six supervertices: A = {11, 12}, B = {0, 2, 3, 8, 10, 18}, C = {14, 15}, D = {1, 9, 16}, E = {4, 6, 7, 13, 17}, F = {5, 19, 20}. Self-loops are deleted, and only the lightest edge survives between each pair of components.

Now we need to connect these components. Each edge of the contracted graph is sampled with probability 1/2; the algorithm recursively computes a minimum spanning forest of the sampled subgraph, and the F-heavy edges are discarded from the contracted graph. A second recursive call on the remaining F-light edges finishes connecting the components; at the bottom of the recursion, all the supervertices have coalesced into one. Moving back up from the bottommost recursive step, each level combines the forests returned by its two subproblems.

Unwinding the recursion and re-expanding the supervertices yields the MST of the original graph. (Note how hard it would be for a human to verify an MST of this size by inspection - which edges matter?)

Why Does It Terminate? Each Boruvka step reduces the number of vertices by at least a factor of 2. The recursion terminates when all the nodes in each subgraph have coalesced; we then build the tree back up.

Why Does It Work? Proof by induction on the steps of the algorithm (sketch a base case and an induction step; the problem has optimal substructure). Cut property: the edges contracted in STEP 1 are in the MST. Cycle property: the edges deleted in STEP 2 are not in the MST. The remaining edges of the original graph's MST form an MST of the contracted graph. By the inductive hypothesis, the MST of the remaining graph is correctly determined in the recursive call of STEP 3. For the timing analysis that follows, a recurrence relation determines how many subproblems there are and how long each one takes.

How Long Does It Take? WORST CASE: the total time spent in steps 1-3, ignoring recursion, is linear in the number of edges. Step 1 is two Boruvka steps: linear. Step 2 is linear using the modified Dixon-Rauch-Tarjan verification algorithm. The total running time is therefore bounded by the total number of edges in the original problem and in all recursive subproblems. So how many edges are there in all of the subproblems? In the worst case the coin flips never discard anything: the left subproblem is the whole graph, and the right subproblem is nothing.

Worst Case, Continued. Assume a graph with n vertices and m edges, m >= n/2. Each invocation generates at most two recursive subproblems. Consider the binary tree of recursive subproblems, where the root is the initial problem: the first subproblem (STEP 2) is the left child, the second (STEP 3) is the right child.

Worst Case, Continued At depth d, the tree of subproblems has at most 2d nodes, each a problem on a graph of at most n/4d vertices. The the depth of the tree is at most logv n, and there are at most 2n vertices total in the original problem and all subproblems. A subproblem at depth d contains at most (n/4d)2/2 edges. The total number of edges in all subproblems at an recursive depth d is therefore, at most, m.

Worst Case, Finally! Since there are O(log n) depths in the tree, the total number of edges in all recursive subproblems is O(m log n). This matches the bound for Boruvka's original algorithm.

Best Case. The expected running time is O(m). The expected number of edges in each left subproblem is at most half the expected number of edges in its parent, so the sum of the expected number of edges over every subproblem on the left side of the tree is at most m + (m/2) + (m/4) + ... = 2m. (This takes the worst-case structure above but assumes we can flip coins reasonably, giving the expected case.)
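The sum quoted for the left side of the tree is a geometric series:

```latex
m + \frac{m}{2} + \frac{m}{4} + \cdots
  \;=\; m \sum_{k \ge 0} 2^{-k}
  \;=\; 2m .
```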

Best Case, continued. The total number of edges is bounded by 2m plus the expected total number of edges in all right subproblems. The number of edges in a right subproblem is at most twice the number of vertices in that subproblem; I will elide that reasoning, since it mirrors the counting we did earlier.

Best Case, continued. The total number of vertices in all right subproblems is at most n/2, so the expected number of edges in the original problem and all subproblems is at most 2m + n. The algorithm runs in O(m) time with probability 1 - exp(-Omega(m)).

THE END

…questions?