Hannah Tang and Brian Tjaden Summer Quarter 2002 CSE 326: Data Structures Minimum Spanning Trees and The Dynamic Duo (Prim and Kruskal) Hannah Tang and Brian Tjaden Summer Quarter 2002
Some Applications: Moving Around Washington Okay, in the last lecture, we saw that this was a single-source shortest path problem, which was solvable by Dijkstra’s algorithm. Now, we’re going to look at another one of the graph applications we talked about earlier. What’s the fastest way from Seattle to Spokane? Use Dijkstra’s Algorithm!
Some Applications: Communication in Washington This slide should be review What’s the cheapest inter-city network?
Spanning Tree Spanning tree: a subset of the edges from a connected graph that… touches all vertices in the graph (spans the graph) forms a tree (is connected and contains no cycles) Minimum spanning tree: the spanning tree with the least total edge cost. 4 7 1 5 9 2 The previous problem is a minimum spanning tree problem. We have two primary ways of solving this problem deterministically. BTW, Which of these three is the minimum spanning tree? (The one in the middle)
Two Different Algorithms This is like a two-for-one deal! Prim’s is similar to Dijkstra’s in that it starts from one particular node, and then greedily grows a MST from that node Kruskal’s takes an entirely different technique by starting with a forest of minimum spanning trees and unioning those trees together … hmm … unioning … I wonder what data structure we’ll use? Prim’s Algorithm Almost identical to Dijkstra’s Kruskals’s Algorithm Completely different!
Prim’s Algorithm for Minimum Spanning Trees A node-oriented greedy algorithm (builds an MST by greedily adding nodes) Select a node to be the “root” and mark it as known While there are unknown nodes left in the graph Select the unknown node n with the smallest cost from some known node m Mark n as known Add (m, n) to our MST The proof of correctness and runtime is the same as Dijkstra’s, ie O(E log V) Runtime:
Prim’s Algorithm In Action B D F H G E 2 3 1 4 10 8 9 7 Start with ‘a’
Kruskal’s Algorithm for Minimum Spanning Trees An edge-oriented greedy algorithm (builds an MST by greedily adding edges) Initialize all vertices to unconnected While there are still unmarked edges Pick the lowest cost edge e = (u, v) and mark it If u and v are not already connected, add e to the minimum spanning tree and connect u and v Here’s how Kruskal’s works. This should look very familiar. Remember our algorithm for maze creation? Except that the edge order there was random, this is the same as that algorithm! Sound familiar? (Think maze generation.)
Kruskal’s Algorithm In Action B D F H G E 2 3 1 4 10 8 9 7 Therefore, this example should look a lot like maze creation. We test the cheapest edge first (and break ties by choosing the leftmost edge first) Since B and C are unconnected, we add the edge to the MST and union B and C.
Proof of Correctness We already showed this finds a spanning tree: That was part of our definition of a good maze. Proof by contradiction that Kruskal’s finds the minimum: Assume another spanning tree has lower cost than Kruskal’s Pick an edge e1 = (u, v) in that tree that’s not in Kruskal’s Kruskal’s tree connects u’s and v’s sets with another edge e2 But, e2 must have at most the same cost as e1! So, swap e2 for e1 (at worst keeping the cost the same) Repeat until the tree is identical to Kruskal’s: contradiction! QED: Kruskal’s algorithm finds a minimum spanning tree. We already know this makes a spanning tree because our maze generation algorithm made a spanning tree. But, does this find the minimum spanning tree? Let’s assume it doesn’t. Then, there’s some other better spanning tree. Let’s try and make that tree more like Kruskal’s tree. (otherwise, Kruskal’s would have considered and chosen e1 before ever reaching e2) BTW, this is another proof technique for showing that a greedy algorithm finds the global optimal. - For Dijkstra’s/Prim’s, we saw a proof of the type “greedy stays ahead” -- that is, we assume that there is some other algorithm which is optimal. Then we show that the greedy algorithm does the same thing (or better) as the optimal algorithm. - This time around, we use an “exchange argument” proof. We show that, through exchanges which do not affect the value of the greedy solution, that our greedy algorithm finds the optimal result.
Data Structures for Kruskal’s Algorithm |E| times: Pick the lowest cost edge… findMin/deleteMin |E| times: If u and v are not already connected… …connect u and v. What data structures do we need for this? Priority Queue and Disjoint Set Union/Find What about initialization? We need to put all the edges in a priority queue; we can do that fast with a buildHeap call: O(|E|) So, overall, the runtime is O(|E|ack(|E|,|V|) + |E|log|E|) Who knows what’s slower? Inverse ackermann’s or log? LOG! How much slower? TONS! One last little point: |E| is at most |V|2, but log|V|2 = 2 log |V|. So: O(|E|log|V|) find representative union Runtime: