MST, Topological Sort and Disjoint Sets


MST, Topological Sort and Disjoint Sets 15-211 Fundamental Data Structures and Algorithms Ananda Guna April 6, 2006

In this lecture Prim’s revisited Kruskal’s algorithm Topological Sorting Union Find algorithms Disjoint Sets Analysis

Prim’s Algorithm Algorithm is based on the idea of two sets S = vertices in the current MST V-S = vertices not in the current MST Find the minimum edge (u,v) such that u is in S and v is in V-S Add the edge to MST and node v to S At the end algorithm guarantees that we have constructed a MST Note: MST is not unique

Prim’s Algorithm Invariant At each step, we add the edge (u,v) s.t. the weight of (u,v) is minimum among all edges where u is in the tree and v is not in the tree Each step maintains a minimum spanning tree of the vertices that have been included thus far When all vertices have been included, we have a MST for the graph!

Running time of Prim’s algorithm
Initialization of priority queue (array): O(|V|)
Update loop: |V| iterations
  Choosing the vertex with the minimum-cost edge: O(|V|) with an array, O(log |V|) with a heap
  Updating distance values of unconnected vertices: each edge is considered only once during the entire execution, for a total of O(|E|) updates
Overall cost without heaps: O(|E| + |V|^2)
What is the run time complexity if heaps are used?
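As one way to explore the heap question, here is a minimal Java sketch of Prim's algorithm using java.util.PriorityQueue with lazy deletion (stale heap entries are skipped rather than updated in place). The class name PrimSketch, the helper graph, and the {vertex, weight} adjacency encoding are assumptions made for illustration; they are not from the lecture. With this variant the heap holds at most O(|E|) entries, so the total cost is O(|E| log |V|).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class PrimSketch {
    // Builds an undirected adjacency list from {u, v, weight} triples.
    static List<List<int[]>> graph(int n, int[][] edges) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return adj;
    }

    // Returns the total MST weight of a connected graph, growing the set S from vertex 0.
    static int mstWeight(List<List<int[]>> adj) {
        boolean[] inS = new boolean[adj.size()];
        // Heap entries are {weight, vertex}: the cheapest known edge into that vertex.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        heap.add(new int[] {0, 0});
        int total = 0;
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            int v = top[1];
            if (inS[v]) continue;          // stale entry: v was already added to S
            inS[v] = true;
            total += top[0];
            for (int[] e : adj.get(v)) {   // offer every edge (v, w) with w in V-S
                if (!inS[e[0]]) heap.add(new int[] {e[1], e[0]});
            }
        }
        return total;
    }
}
```

For example, PrimSketch.mstWeight(PrimSketch.graph(4, new int[][] {{0,1,1},{1,2,2},{0,2,3},{2,3,4},{1,3,5}})) selects the edges of weight 1, 2 and 4.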

Correctness Lemma: Let G be a connected weighted graph and let G’ be a subgraph of G that is contained in a MST T. Let C be a component of G’. Let S be the set of all edges with one vertex in C and the other not in C. If we add a minimum-weight edge in S to G’, then the resulting graph is contained in a minimum spanning tree of G

Correctness Theorem: Prim’s algorithm correctly finds a minimum spanning tree
Proof: by induction, show that the tree constructed at each iteration is contained in a MST. Then at termination, the tree constructed is a MST
Base case: the tree has no edges, and is therefore contained in every spanning tree
Inductive case: Let T be the current tree constructed using Prim’s algorithm. By the inductive hypothesis, T is contained in some MST. Let (u,v) be the next edge selected by Prim’s, such that u is in T and v is not in T. Let G’ be T together with all vertices not in T. Then T is a component of G’ and (u,v) is a minimum-weight edge with one vertex in T and one not in T. Then by the lemma, when (u,v) is added to G’, the resulting graph is also contained in a MST.

Kruskal’s Algorithm

Another Approach
Create a forest of trees from the vertices
Repeatedly merge trees by adding “safe edges” until only one tree remains
A “safe edge” is an edge of minimum weight which does not create a cycle
[Figure: weighted graph on vertices a, b, c, d, e; initial forest: {a}, {b}, {c}, {d}, {e}]

Kruskal’s algorithm
Initialization
  a. Create a set for each vertex v ∈ V
  b. Initialize the set of “safe edges” A comprising the MST to the empty set
  c. Sort edges by increasing weight
F = {a}, {b}, {c}, {d}, {e}
A = ∅
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}

Kruskal’s algorithm
For each edge (u,v) ∈ E in increasing order of weight, while more than one set remains:
  If u and v belong to different sets U and V
    a. add edge (u,v) to the safe edge set: A = A ∪ {(u,v)}
    b. merge the sets U and V: F = F - U - V + (U ∪ V)
Return A
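The loop above can be sketched in Java as follows. This is a minimal illustration, not the lecture's code: the class name KruskalSketch, the {u, v, weight} edge encoding, and the simple parent-array forest with path halving are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KruskalSketch {
    // Finds the representative of x's set, shortening the path as it goes.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Returns the safe edge set A for a graph with vertices 0..n-1.
    static List<int[]> mst(int n, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2])); // by increasing weight
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;                  // one set per vertex
        List<int[]> safe = new ArrayList<>();
        for (int[] e : sorted) {
            if (safe.size() == n - 1) break;                        // only one set remains
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru != rv) {                                         // u and v in different sets
                safe.add(e);                                        // A = A ∪ {(u,v)}
                parent[ru] = rv;                                    // merge U and V
            }
        }
        return safe;
    }
}
```

Calling it on the slide's edge order with a=0, b=1, c=2, d=3, e=4 and placeholder weights 1 through 8 (the actual weights are not recoverable from the transcript, only the sorted order) selects (a,d), (c,d), (d,e), (b,e).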

Kruskal’s algorithm
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}
Forest                            A
{a}, {b}, {c}, {d}, {e}           ∅
{a,d}, {b}, {c}, {e}              {(a,d)}
{a,d,c}, {b}, {e}                 {(a,d), (c,d)}
{a,d,c,e}, {b}                    {(a,d), (c,d), (d,e)}
{a,d,c,e,b}                       {(a,d), (c,d), (d,e), (b,e)}

Kruskal’s Algorithm Summary
After each iteration, every tree in the forest is a MST of the vertices it connects
Algorithm terminates when all vertices are connected into one tree
Both Prim’s and Kruskal’s algorithms are greedy algorithms
Complexity of Kruskal’s algorithm
  O(|E| log |E|) to sort the edges
  O(|V|) initial sets
  O(|E| log |V|) find and union operations (up to 2|E| finds and |V| - 1 unions)
What if the edges are maintained in a PQ?

Topological Sort

Topological sort Definition: A topological sort of G=(V,E) is an ordering of all of G’s vertices v1, v2, …, vn such that for every edge (vi,vj) in E, i<j.

In a topological ordering no arrow can point backward
[Figure: task graph with nodes Pour foundation, Building permit, Framing, Plumbing, Electrical wiring, Paint interior, Paint exterior]
[Figure: the same tasks laid out left to right: Building permit, Pour foundation, Framing, Electrical wiring, Plumbing, Paint exterior, Paint interior]

Finding a topological sort
Place the vertices in order from left to right
No edge arrow can point backward
If an order can be found, we can do tasks from left to right
Questions
  Does every graph have a topological sort?
  If so, can there be more than one topological sort for a given graph?
  What if the graph has a cycle? Is it possible to have a topological sort?

How to find a topological sort
If the graph has no vertex with in-degree 0, can we find a topological sort?
If the graph has a vertex with in-degree 0, then start with that vertex
Delete the vertex and place it in the next free position of the sorted list, starting from the front
Repeat

Implementing topological sort We need to maintain indegrees Maintain an array of indegrees Indeg[i] is the indegree of the vertex i As you delete vertices, reduce the indegree of all the vertices it is pointing to. When a vertex gets indegree 0, put that into a list of nodes to be deleted

Example
[Figure: directed graph on vertices 1 to 7, with an in-degree table indexed by vertex]

Topological sort
Nodes in a dag can be ordered linearly.
Topological orders:
  1,2,5,4,3,6,7
  2,1,5,4,7,3,6
  2,5,1,4,7,3,6
  Etc.
For our building example, any topological order is a feasible schedule.

Homework Build all topological orders of the following graph Building permit Pour foundation Framing Plumbing Electrical wiring Paint interior Paint exterior

Topological sort algorithm
Suppose in-degree is stored with each node.
Q: What is the cost of storing in-degree (assume adjacency list implementation)
  After the graph is built? (cost?)
  While building the graph? (cost?)
Scan all nodes, pushing roots onto a stack. (cost?)
Repeat until stack is empty: (cost?)
  Pop a root r from the stack and output it. (cost?)
  For all nodes n (non-roots) such that (r,n) is an edge, decrement n’s in-degree. If 0 then push onto the stack. (cost?)
O(|V|+|E|), but better in practice.
Q: How can we tell if a graph has a cycle?
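The stack-based procedure above can be sketched in Java as follows. The class name TopoSortSketch and the {from, to} edge encoding are assumptions for illustration. The sketch also suggests one answer to the cycle question: if some vertices never reach in-degree 0, fewer than |V| vertices are output, so the graph has a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TopoSortSketch {
    // Returns a topological order of vertices 0..n-1, or null if the graph has a cycle.
    static List<Integer> topoSort(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        int[] indeg = new int[n];
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {            // in-degrees computed while building: O(|E|)
            adj.get(e[0]).add(e[1]);
            indeg[e[1]]++;
        }
        Deque<Integer> stack = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {      // scan all nodes for roots: O(|V|)
            if (indeg[v] == 0) stack.push(v);
        }
        List<Integer> order = new ArrayList<>();
        while (!stack.isEmpty()) {
            int r = stack.pop();           // pop a root and output it
            order.add(r);
            for (int m : adj.get(r)) {     // each edge is visited exactly once
                if (--indeg[m] == 0) stack.push(m);
            }
        }
        return order.size() == n ? order : null;
    }
}
```

Using a queue instead of the stack would also work; it only changes which of the valid topological orders is produced.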

Union Find

Equivalence relations
The relation “~” is an equivalence relation if (for all a, b, and c)
  a ~ a                      reflexive
  a ~ b iff b ~ a            symmetric
  a ~ b & b ~ c ⇒ a ~ c      transitive
Examples
  “<”            transitive, not reflexive, not symmetric
  “<=”           transitive, reflexive, not symmetric
  “e1 = O(e2)”   transitive, reflexive, not symmetric
  “==”           transitive, reflexive, symmetric
  “connected”    transitive, reflexive, symmetric

Equivalence relations
Let “~” be an equivalence relation over a set U. Each member a of U has an equivalence class with respect to “~”: [a] = {b | a ~ b}
Let U = {1,2,3,4,5,6,7,8,9} and 1~5, 6~8, 7~2, 9~8, 3~7, 4~2, 9~3
U contains two equivalence classes w.r.t. “~”: {2,3,4,6,7,8,9} and {1,5}
3~5 iff 3 and 5 belong to the same equivalence class (here they do not).
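The two classes in this example can be checked mechanically by merging the listed pairs. The following Java sketch is only an illustration; the class name EquivalenceSketch and its helper methods are hypothetical, and index 0 of the array is left unused so that elements keep their names 1 to 9.

```java
class EquivalenceSketch {
    // Follows parent links to the representative of x's class, compressing the path.
    static int find(int[] p, int x) {
        if (p[x] != x) p[x] = find(p, p[x]);
        return p[x];
    }

    // Merges the classes of every listed pair; elements are 1..maxElem.
    static int[] classes(int maxElem, int[][] pairs) {
        int[] p = new int[maxElem + 1];
        for (int i = 0; i <= maxElem; i++) p[i] = i;   // every element starts alone
        for (int[] pair : pairs) {
            int ra = find(p, pair[0]);
            int rb = find(p, pair[1]);
            p[ra] = rb;                                // no-op when already equivalent
        }
        return p;
    }

    // Counts representatives among elements 1..maxElem.
    static int countClasses(int[] p) {
        int count = 0;
        for (int i = 1; i < p.length; i++) {
            if (find(p, i) == i) count++;
        }
        return count;
    }
}
```

With the pairs 1~5, 6~8, 7~2, 9~8, 3~7, 4~2, 9~3, countClasses reports two classes, and find gives different representatives for 3 and 5.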

Equivalence relations
The set of equivalence classes is a partition of U: { {2,3,4,6,7,8,9}, {1,5} }
In general, i ≠ j implies Pi ∩ Pj = {}.
For each a ∈ U, there is exactly one i such that a ∈ Pi.
Why study Equivalence Relations? What problems can be solved by understanding equivalence relations?
  Common ancestor problem
  Maze problem

Applications

Maze Generator
How can we generate a maze like this?
[Figure 24.1: a 50 x 88 maze]

An Application - The Maze problem
A maze is a grid of rooms separated by walls. Each room has a name.
Think of the maze as a graph:
  Nodes x, y, z represent rooms
  Edges (x,y) indicate that rooms x and y are adjacent, and there is no wall between them.
Randomly knock out walls until we get a good maze.
[Figure: 4 x 4 grid of rooms labeled a to p]

Mathematical formulation A set of rooms: {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p} Identify pairs of adjacent rooms that have an open wall between them. E.g., (a,b) and (g,k) are pairs.

Mazes as graphs
[Figure: 4 x 4 grid of rooms labeled a to p]
{(a,b), (b,c), (a,e), (e,i), (i,j), (f,j), (f,g), (g,h), (d,h), (g,k), (m,n), (n,o), (k,o), (o,p), (l,p)}

Unique solutions
What property must the graph have for the maze to have a solution?
  A path from (a) to (p).
What property must it have for the maze to have a unique solution?
  The graph must be a tree.

Mazes as trees
Informally, a tree is a graph where:
  Each node has a unique parent.
  Except a unique root node that has no parent
A spanning tree is a tree that includes all of the nodes.
Why is it good to have a spanning tree? Trees have no cycles!
[Figure: the maze graph on rooms a to p drawn as a tree]
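One common way to turn "randomly knock out walls" into a spanning tree is to remove a wall only when the two rooms it separates are not yet connected. The Java sketch below assumes that approach; the class name MazeSketch, the row-major room numbering, and the seed parameter are illustrative choices, not part of the lecture.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class MazeSketch {
    static int find(int[] p, int x) {
        while (p[x] != x) {
            p[x] = p[p[x]];   // shorten the path while walking up
            x = p[x];
        }
        return x;
    }

    // Returns the removed walls as pairs of adjacent room numbers (row-major numbering).
    static List<int[]> generate(int rows, int cols, long seed) {
        List<int[]> walls = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int room = r * cols + c;
                if (c + 1 < cols) walls.add(new int[] {room, room + 1});     // wall to the right
                if (r + 1 < rows) walls.add(new int[] {room, room + cols});  // wall below
            }
        }
        Collections.shuffle(walls, new Random(seed));   // consider walls in random order
        int[] p = new int[rows * cols];
        for (int i = 0; i < p.length; i++) p[i] = i;
        List<int[]> opened = new ArrayList<>();
        for (int[] w : walls) {
            int a = find(p, w[0]);
            int b = find(p, w[1]);
            if (a != b) {          // rooms not yet connected: removing the wall is safe
                p[a] = b;
                opened.add(w);
            }
        }
        return opened;
    }
}
```

For a 4 x 4 grid this always opens exactly 15 walls, the same count as the edge list on the "Mazes as graphs" slide, because a spanning tree on 16 rooms has 15 edges.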

Dynamic equivalence
How can we check for dynamic equivalence?
  Do two elements belong to the same equivalence class?
  Is there a path from one node to another?
Union-Find Abstraction
  find(i) returns the name of the set containing i.
  union(i,j) joins the sets containing i and j.
Effects
  Calls to union can change future find results
  Calls to find do not change future find results.

The Union-Find Interface
Represent elements as ints
  Let 0 ≤ i < N stand for e_i ∈ {e_0, …, e_(N-1)}
Find identifies the set containing i.
  int find(int i)
Equivalence testing
  find(i) == find(j)
Union called for an effect
  void union(int i, int j)
  Affects results of future calls to find

Understanding Union-Find

Forest and trees
Each set is a tree
  {1}{2}{0,3}{4}{5}
union(1,2) adds a new subtree to a root
  {1,2}{0,3}{4}{5}
union(0,1) adds a new subtree to a root
  {1,2,0,3}{4}{5}
[Figure: the forest before and after each union]

Forest and trees - Array Representation
{1,2,0,3}{4}{5}
find(2) = 1
find(4) = 4
Array representation
  index:  0   1   2   3   4   5
  s:      3  -1   1   1  -1  -1

Find Operation
{1,2,0,3}{4}{5}
find(0) = 1
  index:  0   1   2   3   4   5
  s:      3  -1   1   1  -1  -1

    public int find(int x) {
        if (s[x] < 0) return x;   // a negative entry marks a root
        return find(s[x]);        // otherwise follow the parent link
    }

Union Operation
{1,2}{0,3}{4}{5} becomes {1,2,0,3}{4}{5} after union(0,2)
  index:   0   1   2   3   4   5
  before:  3  -1   1  -1  -1  -1
  after:   3  -1   1   1  -1  -1

    public void union(int x, int y) {
        int rx = find(x);
        int ry = find(y);
        // guard: linking a root to itself would create a cycle and make find loop forever
        if (rx != ry) s[rx] = ry;
    }
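The before and after arrays on this slide can be reproduced with a few lines of Java. This is a sketch for checking the example by hand; the class name NaiveForestSketch is hypothetical, and the array is passed as a parameter rather than stored in a field so that the example stays self-contained.

```java
class NaiveForestSketch {
    // Walks parent links until it reaches a root (negative entry).
    static int find(int[] s, int x) {
        return s[x] < 0 ? x : find(s, s[x]);
    }

    // Hangs the root of x's tree under the root of y's tree.
    static void union(int[] s, int x, int y) {
        int rx = find(s, x);
        int ry = find(s, y);
        if (rx != ry) s[rx] = ry;   // skip when x and y are already in the same set
    }
}
```

Starting from s = {3, -1, 1, -1, -1, -1}, the call union(s, 0, 2) finds roots 3 and 1 and sets s[3] = 1, which gives the "after" row {3, -1, 1, 1, -1, -1}; find(s, 0) then returns 1.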

The problem
Find must walk the path to the root
Unlucky combinations of unions can result in long paths
[Figure: nodes 1 to 6 linked in a single chain]
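To see how an unlucky sequence produces a long path, the sketch below repeatedly unions element 0 with each new element, always hanging the old root under the new one. The class name LongPathSketch and the helper depth are assumptions for illustration only.

```java
import java.util.Arrays;

class LongPathSketch {
    static int find(int[] s, int x) {
        while (s[x] >= 0) x = s[x];
        return x;
    }

    // Number of parent links between x and its root.
    static int depth(int[] s, int x) {
        int d = 0;
        while (s[x] >= 0) {
            x = s[x];
            d++;
        }
        return d;
    }

    // Performs union(0,1), union(0,2), ..., union(0,n-1) with the naive link rule.
    static int[] chain(int n) {
        int[] s = new int[n];
        Arrays.fill(s, -1);
        for (int i = 1; i < n; i++) {
            s[find(s, 0)] = find(s, i);   // old root becomes a child of the singleton i
        }
        return s;
    }
}
```

With n = 6 the result is the chain 0 -> 1 -> 2 -> 3 -> 4 -> 5, so element 0 sits five links below the root and a find on it costs time linear in the number of elements.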

Path compression for find
find flattens trees
  Redirect nodes to point directly to the root
  Do this while traversing path from node to root.
[Figure: a tree before and after path compression]

    public int find(int x) {
        if (s[x] < 0) return x;         // x is a root
        return s[x] = find(s[x]);       // point x directly at the root on the way back
    }

Union by size
Union-by-size
  Join lesser size to greater
  Label with sum of sizes
Find (with/without path comp.): No effect
Representational trick
  Non-negative numbers: index of parent
  Negative numbers: root, with size -s[x]
Performance
  When the depth of a node increases on union, its tree is always at least twice its previous size.
  Hence maximum of log(N) steps that increase depth.
  Hence maximum time for find is O(log(N)).

Union by height
union shallow trees into deep trees
Tree depth increases only when depths equal
Track path length to root
  index:  0   1   2   3   4   5
  s:      3  -3   1   1  -1  -1
Tree depth at most O(log N)

Union by height, details
Different heights
  Join lesser height to greater
  Do not change height values
Equal heights
  Join either tree to the other
  Add one to height of result
Find
  Without path compression: No effect
  With path compression: Must recalculate height, which can involve looking at many subtrees

Union by rank
Path compression is easy to implement when we use union-by-size. However, union-by-height is problematic with path compression
Definition
  Rank of a node is initialized to 0
  Updated only during union operation
Union-by-rank
  Union: Different ranks
    Join lesser rank to greater
    Do not change rank value
  Union: Equal ranks
    Join either to the other
    Add one to rank of result
Find, with path compression
  Yields good performance
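The rules above can be sketched in Java as follows. The lecture's own code further on uses union by size, so this class is an illustrative variant rather than the lecture's implementation; the name RankedUnionFind, the separate rank array, and the boolean return value of union are assumptions.

```java
class RankedUnionFind {
    private final int[] parent;
    private final int[] rank;

    RankedUnionFind(int n) {
        parent = new int[n];
        rank = new int[n];                 // every rank starts at 0
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // Find with path compression; ranks are deliberately left unchanged.
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // Returns false if a and b were already in the same set.
    boolean union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return false;
        if (rank[ra] < rank[rb]) {
            parent[ra] = rb;               // different ranks: join lesser to greater
        } else if (rank[ra] > rank[rb]) {
            parent[rb] = ra;
        } else {
            parent[rb] = ra;               // equal ranks: join either to the other
            rank[ra]++;                    // and add one to the rank of the result
        }
        return true;
    }

    int rankOf(int x) {
        return rank[x];
    }
}
```

Merging 8 singletons pairwise, then the pairs, then the two groups of four raises the final root's rank to 3, the largest rank possible for 8 elements.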

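All the code

    class UnionFind {
        // u[i] >= 0: parent of i; u[i] < 0: i is a root and -u[i] is the size of its set
        int[] u;

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;
        }

        int find(int i) {
            int j, root;
            // first pass: walk up to the root
            for (j = i; u[j] >= 0; j = u[j]) ;
            root = j;
            // second pass: point every node on the path directly at the root
            while (u[i] >= 0) { j = u[i]; u[i] = root; i = j; }
            return root;
        }

        void union(int i, int j) {
            i = find(i); j = find(j);
            if (i != j) {
                // the more negative entry is the larger set; attach the smaller under it
                if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }
                else { u[j] += u[i]; u[i] = j; }
            }
        }
    }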

The UnionFind class

    class UnionFind {
        int[] u;

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;   // every element starts as a root of size 1
        }

        int find(int i) { ... }
        void union(int i, int j) { ... }
    }

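Iterative find

    int find(int i) {
        int j, root;
        // first pass: walk up to the root
        for (j = i; u[j] >= 0; j = u[j]) ;
        root = j;
        // second pass: point every node on the path directly at the root
        while (u[i] >= 0) { j = u[i]; u[i] = root; i = j; }
        return root;
    }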

Union by size

    void union(int i, int j) {
        i = find(i); j = find(j);
        if (i != j) {
            // the more negative entry is the larger set; attach the smaller under it
            if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }
            else { u[j] += u[i]; u[i] = j; }
        }
    }

Analysis of UnionFind

Analysis of Union-Find
The algorithm
  Union: by rank
  Find: with path compression
[Figure: example forest with node ranks]

Analysis - Rank tree size
Lemma. After a sequence of union instructions, a node of rank r will have at least 2^r descendants, including itself.
Proof.
  r = 0. 2^0 = 1.
  r > 0. Let T be the smallest rank-r tree and X be its root. Suppose T was the result of union(T1, T2) and X was the root of T1. The ranks of T1 and T2 must both be r-1. If the rank of Ti were r then T could not be the smallest rank-r tree. Also, since the union increased rank, the Ti ranks must be equal. By induction hypothesis, each Ti has at least 2^(r-1) descendants. Total must therefore be at least 2^r.
Note on path compression
  Path compression doesn’t affect rank
  Though it does affect height!

Analysis - Nodes of rank r
Lemma. The number of nodes of rank r is at most N/2^r.
Proof. Each node of rank r roots a subtree of at least 2^r nodes. No other node within the subtree can be of rank r. So all subtrees of rank r are disjoint. At most N/2^r subtrees.
Examples:
  rank 0: at most N subtrees (i.e., every node is a root).
  rank log(N): at most 1 subtree (of size N).
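A short worked consequence of this lemma, which the slides do not state explicitly, is a bound on the largest possible rank. If some node has rank r, the count of rank-r nodes is at least 1, so:

```latex
\[
1 \;\le\; \#\{\text{nodes of rank } r\} \;\le\; \frac{N}{2^{r}}
\quad\Longrightarrow\quad 2^{r} \le N
\quad\Longrightarrow\quad r \le \log_2 N .
\]
```

This matches the second example above: rank log(N) is the highest rank that can occur, and at most one node can have it.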

Analysis - Ranks on a path Lemma. Node rank always increases from leaf to root. Proof. Obvious if no path compression. With path compression, nodes are promoted from lower levels and hence were of lesser rank.

Time bounds
Variables
  M operations. N elements.
Algorithms
  Simple forest representation
    Worst: find O(N). mixed operations O(MN).
    Average: tricky
  Union by height; Union by size
    Worst: find O(log N). mixed operations O(M log N).
    Average: mixed operations O(M)
  Path compression in find
    Worst: mixed operations: “nearly linear” [analysis in 15-451]