Minimum Spanning Trees

Slides:



Advertisements
Similar presentations
Lecture 15. Graph Algorithms
Advertisements

Chapter 23 Minimum Spanning Tree
More Graph Algorithms Minimum Spanning Trees, Shortest Path Algorithms.
1 Greedy 2 Jose Rolim University of Geneva. Algorithmique Greedy 2Jose Rolim2 Examples Greedy  Minimum Spanning Trees  Shortest Paths Dijkstra.
Chapter 23 Minimum Spanning Trees
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 13 Minumum spanning trees Motivation Properties of minimum spanning trees Kruskal’s.
Minimum Spanning Tree Algorithms
CSE 780 Algorithms Advanced Algorithms Minimum spanning tree Generic algorithm Kruskal’s algorithm Prim’s algorithm.
1 7-MST Minimal Spanning Trees Fonts: MTExtra:  (comment) Symbol:  Wingdings: Fonts: MTExtra:  (comment) Symbol:  Wingdings:
Greedy Algorithms Reading Material: Chapter 8 (Except Section 8.5)
Analysis of Algorithms CS 477/677
Greedy Algorithms Like dynamic programming algorithms, greedy algorithms are usually designed to solve optimization problems Unlike dynamic programming.
Chapter 9: Graphs Spanning Trees Mark Allen Weiss: Data Structures and Algorithm Analysis in Java Lydia Sinapova, Simpson College.
Graphs CS 400/600 – Data Structures. Graphs2 Graphs  Used to represent all kinds of problems Networks and routing State diagrams Flow and capacity.
Chapter 9 – Graphs A graph G=(V,E) – vertices and edges
MST Many of the slides are from Prof. Plaisted’s resources at University of North Carolina at Chapel Hill.
2IL05 Data Structures Fall 2007 Lecture 13: Minimum Spanning Trees.
Spring 2015 Lecture 11: Minimum Spanning Trees
UNC Chapel Hill Lin/Foskey/Manocha Minimum Spanning Trees Problem: Connect a set of nodes by a network of minimal total length Some applications: –Communication.
Minimum Spanning Trees CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
Minimum Spanning Trees and Kruskal’s Algorithm CLRS 23.
Graphs. Definitions A graph is two sets. A graph is two sets. –A set of nodes or vertices V –A set of edges E Edges connect nodes. Edges connect nodes.
Shortest Paths CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
November 13, Algorithms and Data Structures Lecture XII Simonas Šaltenis Aalborg University
Chapter 23: Minimum Spanning Trees: A graph optimization problem Given undirected graph G(V,E) and a weight function w(u,v) defined on all edges (u,v)
1 22c:31 Algorithms Minimum-cost Spanning Tree (MST)
MST Lemma Let G = (V, E) be a connected, undirected graph with real-value weights on the edges. Let A be a viable subset of E (i.e. a subset of some MST),
Lecture 12 Algorithm Analysis Arne Kutzner Hanyang University / Seoul Korea.
Minimum Spanning Trees CSE 2320 – Algorithms and Data Structures Vassilis Athitsos and Alexandra Stefan University of Texas at Arlington These slides are.
November 22, Algorithms and Data Structures Lecture XII Simonas Šaltenis Nykredit Center for Database Research Aalborg University
Lecture ? The Algorithms of Kruskal and Prim
Minimum Spanning Trees
Introduction to Algorithms
Shortest Paths and Minimum Spanning Trees
Single-Source Shortest Paths
Minimum Spanning Trees
Minimum Spanning Trees
Lecture 12 Algorithm Analysis
Unit 3 Graphs.
Data Structures & Algorithms Graphs
Minimum-Cost Spanning Tree
Minimum Spanning Tree.
Minimum Spanning Tree.
Graphs & Graph Algorithms 2
Algorithms and Data Structures Lecture XII
Minimum-Cost Spanning Tree
Data Structures – LECTURE 13 Minumum spanning trees
Chapter 13 Graph Algorithms
Outline This topic covers Prim’s algorithm:
Minimum-Cost Spanning Tree
CS 583 Analysis of Algorithms
Minimum Spanning Trees
Chapter 11 Graphs.
Lecture 12 Algorithm Analysis
Minimum Spanning Tree Algorithms
Shortest Paths and Minimum Spanning Trees
Minimum Spanning Trees
Minimum Spanning Tree.
Minimum Spanning Trees
Fundamental Structures of Computer Science II
Single-Source Shortest Paths
Lecture 12 Algorithm Analysis
Advanced Algorithms Analysis and Design
Lecture 14 Minimum Spanning Tree (cont’d)
Chapter 23: Minimum Spanning Trees: A graph optimization problem
Chapter 9: Graphs Spanning Trees
Minimum-Cost Spanning Tree
INTRODUCTION A graph G=(V,E) consists of a finite non empty set of vertices V , and a finite set of edges E which connect pairs of vertices .
Minimum Spanning Trees
More Graphs Lecture 19 CS2110 – Fall 2009.
Presentation transcript:

Minimum Spanning Trees CSE 2320 – Algorithms and Data Structures Vassilis Athitsos and Alexandra Stefan University of Texas at Arlington These slides are based on “Algorithms in C” by R. Sedgewick

Weighted Graphs Each edge has a weight. 1 7 2 5 3 4 6 10 20 30 15 25 Each edge has a weight. Example: a transportation network (roads, railroads, subway). The weight of each road can be: The length. The expected time to traverse. The expected cost to build. Example: in a computer network, the weight of each edge (direct link) can be: Latency. Expected cost to build.

Spanning Tree A spanning tree is a tree that connects all vertices of the graph. The weight of a spanning tree is the sum of weight of its edges. 1 7 2 5 3 4 6 10 20 30 15 25 1 7 2 5 3 4 6 10 20 30 15 25 Weight:20+15+30+20+30+25+20 = 160 Weight:10+20+15+30+20+10+15 = 120

Minimum-Cost Spanning Tree (MST) 1 7 2 5 3 4 6 10 20 30 15 25 Minimum-Cost Spanning Tree (MST) Minimum-cost spanning tree (MST): Connects all vertices of the graph (spanning tree). Has the smallest possible total weight of edges. Finding a minimum-cost spanning tree is an important problem in weighted graphs. 1 7 2 5 3 4 6 10 20 30 15

Minimum-Cost Spanning Tree (MST) 1 7 2 5 3 4 6 10 20 30 15 25 Minimum-Cost Spanning Tree (MST) Assume that the graph is: connected undirected edges can have negative weights. Warning: later in the course (when we discuss Dijkstra's algorithm) we will make the opposite assumptions: Allow directed graphs. Not allow negative weights. 1 7 2 5 3 4 6 10 20 30 15

Prim's Algorithm - Overview Start from an tree that contains a single vertex. Keep growing that tree, by adding at each step the shortest edge connecting a vertex in the tree to a vertex outside the tree. As you see, it is a very simple algorithm, when stated abstractly. However, we have several choices regarding how to implement this algorithm. We will see three implementations, with significantly different properties from each other.

Prim's Algorithm - Simple Version Assume an adjacency matrix representation. Each vertex is a number from 0 to |V|-1. We have a |V|*|V| adjacency matrix ADJ, where: ADJ[v][w] is the weight of the edge connecting v and w. If v and w are not connected, ADJ[v][w] = infinity.

Prim's Algorithm - Simple Version Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree, the non-tree vertex of that edge.

Example 1 7 2 5 3 4 6 10 20 30 15 25 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge.

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 20 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. 20 30 10 15 30 15 25 10 20

Example 1 7 2 5 3 4 6 10 20 30 15 Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge.

Prim's Algorithm - Simple Version Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. Running time?

Prim's Algorithm - Simple Version Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. Most naive implementation: time ??? Every time we add a new vertex and edge, go through all edges again, to identify the next edge (and vertex) to add.

Prim's Algorithm - Simple Version Start by adding vertex 0 to the MST (minimum-cost spanning tree). Repeat until all vertices have been added to the tree: From all edges connecting vertices from the current tree to vertices outside the current tree, select the smallest edge. Add that edge to the tree, and also add to the tree the non-tree vertex of that edge. Most naive implementation: time O(VE). Every time we add a new vertex and edge, go through all edges again, to identify the next edge (and vertex) to add.

Prim's Algorithm - Dense Graphs A dense graph is nearly full, and thus has O(V2) edges. For example, think of a graph where each vertex has at least V/2 neighbors. Just reading the input (i.e., looking at each edge of the graph once) takes O(V2) time. Thus, we cannot possibly compute a minimum-cost spanning tree for a dense graph in less than O(V2) time. Prim's algorithm can be implemented so as to take O(V2) time, which is optimal for dense graphs.

Prim's Algorithm - Dense Graphs Again, assume an adjacency matrix representation. Every time we add a vertex to the MST, we need to update, for each vertex, x, not in the tree: The smallest edge connecting it to the tree. Let wt[x] be the weight of that edge If no edge connects x to the tree, wt[x] = infinity. The tree vertex fr[x] associated with the edge whose weight is wt[x]. These quantities can be updated in O(V) time when adding a new vertex to the tree. Then, the next vertex to add is the one with the smallest wt[x].

Example 1 7 2 5 3 4 6 20 Every time we add a vertex to the MST, we need to update, for each vertex, x, not in the tree: The smallest edge wt[x] connecting it to the tree. If no edge connects x to the tree, wt[x] = infinity. The tree vertex fr[x] associated with the edge whose weight is wt[x]. When we add a vertex, x, to the MST, we mark it as "processed" by setting: st[x] = fr[x]. 20 30 10 15 30 15 25 10 20

Prim's Algorithm: Dense Graphs void GRAPHmstV(Graph G, int st[], double wt[]) { int v, x, min; for (v = 0; v < G->V; v++) // initialize datastructures: st, wt, fr { st[v] = -1; fr[v] = v; wt[v] = maxWT; } st[0] = 0; // start with node 0 wt[G->V] = maxWT; // symbolic node G->V used as a sentinel to stop select nodes for (min = 0; min != G->V; ) { // min is the next node to add to the MST; start with 0 v = min; st[min] = fr[min]; // add min to the MST //loop will: update nodes outside of MST+v, find closest next node, min. for (x = 0, min = G->V; x < G->V; x++) //for every node, x, in G; closest=sentinel if (st[x] == -1) // x is not in MST { if (G->adj[v][x] < wt[x]) // the added v node, brings x closer to the tree { wt[x] = G->adj[v][x]; fr[x] = v; } // update wt and fr for x if (wt[x] < wt[min]) min = x; // track the minimum weight edge from cut }}}

Prim's Algorithm: Dense Graphs // same code as above. Only comments were removed. void GRAPHmstV(Graph G, int st[], double wt[]) { int v, x, min; for (v = 0; v < G->V; v++) { st[v] = -1; fr[v] = v; wt[v] = maxWT; } st[0] = 0; wt[G->V] = maxWT; for (min = 0; min != G->V; ) { v = min; st[min] = fr[min]; for (x = 0, min = G->V; w < G->V; w++) if (st[x] == -1) { if (G->adj[v][x] < wt[x]) { wt[x] = G->adj[v][x]; fr[x] = v; } if (wt[x] < wt[min]) min = x; }}}

Prim's Algorithm - Dense Graphs Running time: ???

Prim's Algorithm - Dense Graphs Running time: O(V2) Optimal for dense graphs.

Prim's Algorithm for Sparse Graphs A sparse graph is a graph that has only a few edges adjacent to each vertex. This is somewhat vague. If you want a specific example, think of a case where the number of edges is linear to the number of vertices. For example, if each vertex can only have between 1 and 10 neighbors, than the number of edges can be at most (10*V)/2 .

Prim's Algorithm for Sparse Graphs If we use an adjacency matrix representation, then we can never do better than O(V2) time. Why?

Prim's Algorithm for Sparse Graphs If we use an adjacency matrix representation, then we can never do better than O(V2) time. Why? Because just scanning the adjacency matrix to figure out where the edges are takes O(V2) time. The adjacency matrix itself has size V*V.

Prim's Algorithm for Sparse Graphs If we use an adjacency matrix representation, then we can never do better than O(V2) time. Why? Because just scanning the adjacency matrix to figure out where the edges are takes O(V2) time. The adjacency matrix itself has size V*V. We have already seen an implementation of Prim's algorithm, using adjacency matrices, which achieves O(V2) running time. For sparse graphs, if we want to achieve better running time than O(V2), we have to switch to an adjacency lists representation.

Prim's Algorithm for Sparse Graphs Quick review: what exactly is an adjacency lists representation?

Prim's Algorithm for Sparse Graphs Quick review: what exactly is an adjacency lists representation? Each vertex is a number between 0 and V (same as for adjacency matrices). The adjacency information is stored in an array ADJ of lists. ADJ[x] is a list containing all neighbors of vertex x. What is the sum of length of all lists in the ADJ array?

Prim's Algorithm for Sparse Graphs Quick review: what exactly is an adjacency lists representation? Each vertex is a number between 0 and V (same as for adjacency matrices). The adjacency information is stored in an array ADJ of lists. ADJ[w] is a list containing all neighbors of vertex w. What is the sum of length of all lists in the ADJ array? 2*E (each edge is included in two lists).

Prim's Algorithm for Sparse Graphs For sparse graphs, we will use an implementation of Prim's algorithm based on: A graph representation using adjacency lists. A priority queue (heap) containing the set of vertices on the fringe with the distance as the key. A vertex, x, will be included in this priority queue if there is a vertex v in the tree and (v,x) is the shortest edge connecting x to a vertex in the tree. The priority queue will hold pairs [weight,vertex] where the weight will be used as the key (the smaller the weight, the higher the priority) We will continue to use arrays fr and st.

Prim's Algorithm - PQ Version Initialize a priority queue, P, and arrays st and fr v = vertex 0 Add [0,v] to P While (P not empty) [v_weight, v] = remove_minimum(P) Add v to the spanning tree: st[v] = fr[v] For each edge (v, x) adjacent to v: If x in the spanning tree => continue Else, if x is not in P (i.e. fr[x] == -1), insert [weight of edge(v,x) , x] to P. Else, if [x_weight, x] is in P, we have cases: x_weight ≤ weight of edge (v,x) => do nothing x_weight > weight of edge (v,x) => mark v as the closest vertex in the tree to x: set fr[x] = v and update [x_weight, x] to [weight of (v,w), x] in P

Prim's Algorithm - PQ Version runtime Initialize a priority queue, P, and arrays st and fr O(V) v = vertex 0 O(1) Add [0,v] to P O(1) While (P not empty) O(V) [v_weight, v] = remove_minimum(P) O(lgV) O(VlgV) Add v to the spanning tree: st[v] = fr[v] O(1) For each edge (v, x) adjacent to v: O(E) If x in the spanning tree => continue Else, if x is not in P (i.e. fr[x] == -1), insert [weight of edge(v,x) , x] to P. Else, if [x_weight, x] is in P, we have cases: x_weight ≤ weight of edge (v,x) => do nothing x_weight > weight of edge (v,x) => mark v as closest vertex in the tree to x: set fr[x] = v and update [x_weight, x] to [weight of (v,x), x] in P O(lgV) O(E lgV)

Kruskal's Algorithm

Kruskal's Algorithm: Overview Kruskal's algorithm works with a forest (a set of trees). Initially, each vertex in the graph is its own tree. We keep merging trees together, until we end up with a single tree.

Kruskal's Algorithm: Overview Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. The abstract description is simple, but we need to think carefully about how exactly to implement these steps as it affects the runtime.

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. (work-out example) 1 7 2 5 3 4 6 10 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 1 7 2 5 3 4 6 10 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 10 1 7 2 5 3 4 6 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 10 1 7 2 5 3 4 6 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 10 1 7 2 5 3 4 6 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 10 1 7 2 5 3 4 6 20 30 15 25

Kruskal's Algorithm: An Example Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. 10 1 7 2 5 3 4 6 20 30 15

Kruskal's Algorithm: Simple Implementation Assume graphs are represented using adjacency lists. Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. How? We will use the same representation for forests that we used for union-find. We will have an id array, where each vertex will point to its parent. The root of each tree will be the ID for that tree. Time it takes for this step ???

Kruskal's Algorithm: Simple Implementation Assume graphs are represented using adjacency lists. Step 1: Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. How? We will use the same representation for forests that we used for union-find. We will have an id array, where each vertex will point to its parent. The root of each tree will be the ID for that tree. Time it takes for this step? O(V)

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Initialize F to some edge with infinite weight. For each edge F' connecting any two vertices (v, w): Determine if v and w belong to two different trees in the forest. If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F.

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Initialize F to some edge with infinite weight. For each edge F' connecting (v, w): Determine if v and w belong to two different trees in the forest. HOW? If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F. HOW?

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Initialize F to some edge with infinite weight. For each edge F' connecting (v, w): Determine if v and w belong to two different trees in the forest. HOW? By comparing find(v) with find(w). Time: If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F. HOW? By calling union(v, w). Time:

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Initialize F to some edge with infinite weight. For each edge F' connecting (v, w): Determine if v and w belong to two different trees in the forest. HOW? By comparing find(v) with find(w). Time: O(lg V) If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F. HOW? By calling union(v, w). Time: O(1)

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Total time for all iterations: Find the shortest edge F connecting two trees in the forest. Time: Initialize F to some edge with infinite weight. For each edge F' connecting (v, w): Determine if v and w belong to two different trees in the forest. HOW? By comparing find(v) with find(w). Time: O(lg V) If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F. HOW? By calling union(v, w). Time: O(1)

Kruskal's Algorithm: Simple Implementation Repeat until the forest contains a single tree: Total time for all iterations: O(V*E*lg(V)) Find the shortest edge F connecting two trees in the forest. Time: O(E*lg(V)) Initialize F to some edge with infinite weight. For each edge F' connecting (v, w): Determine if v and w belong to two different trees in the forest. HOW? By comparing find(v) with find(w). Time: O(lg V) If so, update F to be the shortest of F and F'. Connect those two trees into a single tree using edge F. HOW? By calling union(v, w). Time: O(1)

Running Time for Simple Implementation Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Repeat until the forest contains a single tree: Find the shortest edge F connecting two trees in the forest. Connect those two trees into a single tree using edge F. Running time for simple implementation: O(V*E*lg(V)).

Kruskal's Algorithm: Faster Version Sort all edges, save result in array K. Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. For each edge F in K (in ascending order). If F is connecting two trees in the forest: Connect the two trees with F. If the forest is left with a single tree, break (we are done).

Kruskal's Algorithm: Faster Version Sort all edges, save result in array K. Time? Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Time? For each edge in K (in ascending order). Time? If F is connecting two trees in the forest: Time? Connect the two trees with F. Time? If the forest is left with a single tree, break (we are done).

Kruskal's Algorithm: Faster Version Sort all edges, save result in array K. Time: O(E lg E) Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Time: O(V) For each edge in K (in ascending order). Time: O(E lg V) If F is connecting two trees in the forest: Time? O(lg V), two find operations Connect the two trees with F. Time: O(1), union operation If the forest is left with a single tree, break (we are done). Overall running time???

Kruskal's Algorithm: Faster Version Sort all edges, save result in array K. Time: O(E lg E) Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Time: O(V) For each edge in K (in ascending order). Time: O(E lg V) If F is connecting two trees in the forest: Time? O(lg V), two find operations Connect the two trees with F. Time: O(1), union operation If the forest is left with a single tree, break (we are done). Overall running time: O(E lg E) which is also O(E lgV) E < V2, so lg E < 2 lg V, so O(lg E) = O(lg V). Note that the O(E lg E) is the dominant cost. It comes from sorting and cannot be improved.

Kruskal's Algorithm - PQ Version In the previous implementation, we sort edges at the beginning. This takes O(E lg E) time, which dominates the running time of the algorithm. Thus, the entire algorithm takes O(E lg E) time. We can do better if, instead of sorting all edges at the beginning, we instead insert all edges into a priority queue. How long does that take, if we use a heap?

Kruskal's Algorithm - PQ Version In the previous implementation, we sort edges at the beginning. This takes O(E lg E) time, which dominates the running time of the algorithm. Thus, the entire algorithm takes O(E lg E) time. We can do better if, instead of sorting all edges at the beginning, we instead insert all edges into a priority queue. How long does that take, if we use a heap? O(E) time. We can also do better if, for the Union-Find operations, we use the most efficient version discussed in the textbook. This version flattens paths that it traverses. Running time: O(lg* V). lg* - iterated logarithm. lg*(V) is the number of times we need to apply lg to V to obtain 1.

Kruskal's Algorithm - PQ Version Initialize a heap with the edges (using weight as key). Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. While (true). F = remove_mininum(heap). If F is connecting two trees in the forest: Connect the two trees with F. If the forest is left with a single tree, break (we are done).

Kruskal's Algorithm - PQ Version Initialize a heap with the edges (using weight as key). Time? Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Time? While (true). Time? F = remove_mininum(heap). Time? If F is connecting two trees in the forest: Time? Connect the two trees with F. Time? If the forest is left with a single tree, break (we are done). Overall running time?

Kruskal's Algorithm - PQ Version Initialize a heap with the edges (using weight as key). Time? O(E) Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. Time? O(V) While (true). Time? X lg V. X: number of iterations. F = remove_mininum(heap). Time? O(lg E) If F is connecting two trees in the forest: Time? O(lg V), find operation Connect the two trees with F. Time? O(1) If the forest is left with a single tree, break (we are done). Overall running time? E + X lg V. E < V2, so lg E < 2 lg V, so O(lg E) = O(lg V).

Kruskal's Algorithm - PQ Version Initialize a heap with the edges (using weight as key). Initialize a forest (a collection of trees), by defining each vertex to be its own separate tree. While (true). Time? X lg V. X: number of iterations. F = remove_mininum(heap). Time? O(lg E) If F is connecting two trees in the forest: Time? O(lg V) Connect the two trees with F. Time? O(1) If the forest is left with a single tree, break (we are done). Overall running time? E + X lg V. X is the number of edges in the graph with weight <= the maximum weight of an edge in the final MST. Here we focus X as opposed to upper-bounding this cost by E, because this is the dominant term now. E < V2, so lg E < 2 lg V, so O(lg E) = O(lg V).

Note on the time analysis The X cost did also occur in the second version of Kruskal (that was sorting the edges), but since the cost was dominated by the sorting time, E*lgE, we did not need to analyze X. The costs for Kruskal are based on the adjacency matrix implementation. If an adjacency matrix was used, we would need time V2 just to identify the edges. For Prim’s algorithm we estimated part of the runtime by looking at: each vertex is processed once to be added in the MST tree and at that step all of its adjacent edges are explored => 2*E, as opposed to the standard approach: outer loop iterations * inner loop iterations.

Definitions (CLRS, pg 625) A cut (S, V-S) of an graph is a partition of its vertices, V. An edge (u,v) crosses the cut (S, V-S) if one of its endpoints is in S and the other in V-S. Let A be a subset of a minimum spanning tree over G. An edge (u,v) is safe for A if adding A {(u,v)} is still a subset of a minimum spanning tree. A cut respects a set of edges, A if no edge in A crosses the cut. An edge is a light edge crossing a cut if its weight is the minimum weight of any edge crossing the cut.

Correctness of Prim and Kruskall (CLRS, pg 625) Let G = (V,E) be a connected, undirected, weighted graph. Let A be a subset of some minimum spanning tree, T, for G, let (S, V-S) be some cut of G that respects A, and let (u,v) be a light edge crossing (S, V-S). Then, edge (u,v) is safe for A. Proof: If (u,v) is part of T, done Else, in T, u and v must be connected through another path, p. One of the edges of p, must connect a vertex x from A and a vertex, y, from V-A. Adding edge(u,v) to T will create a cycle with the path p.