Greedy Algorithms (I)
"Greed, for lack of a better word, is good! Greed is right! Greed works!" - Michael Douglas, U.S. actor, in the role of Gordon Gekko in the film Wall Street, 1987
Main topics
- Idea of the greedy approach
- Change-making problem
- Minimum spanning tree problem
  - Prim's algorithm
  - Kruskal's algorithm
- Properties of minimum spanning tree (additive part)
- Bottleneck spanning tree (additive part)
Expected Outcomes
Students should be able to:
- summarize the idea of the greedy technique
- list several applications of the greedy technique
- apply the greedy technique to the change-making problem
- define the minimum spanning tree problem
- describe the ideas of Prim's and Kruskal's algorithms for computing a minimum spanning tree
- prove the correctness of the two algorithms
- analyze the time complexities of the two algorithms under different data structures
- prove various properties of minimum spanning trees
Anticipatory Set: Change-Making Problem
How can we make 48 cents of change using coins of denominations 25 (quarter), 10 (dime), 5 (nickel), and 1 (penny) so that the total number of coins is smallest?
The idea: at each step, take the coin with the largest denomination that does not exceed the remaining amount of cents, i.e., make the locally best choice.
Is the solution optimal? Yes; the proof is left as an exercise.
General Change-Making Problem
Given unlimited amounts of coins of denominations d1 > … > dm, give change for amount n with the least number of coins.
Does the greedy algorithm always give an optimal solution for the general change-making problem? No. Example: d1 = 7c, d2 = 5c, d3 = 1c, and n = 10c. Greedy produces 7 + 1 + 1 + 1 (four coins), but 5 + 5 (two coins) is optimal.
In fact, for some instances the greedy algorithm may fail to produce any solution at all! Consider the instance d1 = 7c, d2 = 5c, d3 = 3c, and n = 11c: greedy takes 7, then 3, and is stuck with 1c left, even though 5 + 3 + 3 = 11.
This problem can be solved by dynamic programming; try to design an algorithm for the general change-making problem after class.
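The two approaches can be sketched as follows. This is a minimal illustration, not part of the slides; the function names are our own, and the inputs are the slides' examples.

```python
def greedy_change(denoms, n):
    """Repeatedly take the largest coin that fits; may be suboptimal
    (or fail outright) for arbitrary denomination sets."""
    coins = []
    for d in sorted(denoms, reverse=True):
        while n >= d:
            n -= d
            coins.append(d)
    return coins if n == 0 else None  # None: greedy got stuck

def dp_change(denoms, n):
    """Minimum number of coins via dynamic programming;
    returns None if no combination makes exact change."""
    INF = float("inf")
    best = [0] + [INF] * n          # best[a] = fewest coins for amount a
    for a in range(1, n + 1):
        for d in denoms:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[n] if best[n] < INF else None

print(greedy_change([25, 10, 5, 1], 48))  # [25, 10, 10, 1, 1, 1] -- optimal here
print(greedy_change([7, 5, 1], 10))       # [7, 1, 1, 1] -- 4 coins; 5+5 uses only 2
print(greedy_change([7, 5, 3], 11))       # None -- greedy gets stuck
print(dp_change([7, 5, 3], 11))           # 3 -- DP finds 5+3+3
```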
Greedy Algorithms
A greedy algorithm always makes the choice that looks best at the moment: it makes a locally optimal choice step by step in the hope that this choice will lead to a globally optimal solution. The choice made at each step must be:
- Feasible: it satisfies the problem's constraints
- Locally optimal: it is the best local choice among all feasible choices
- Irrevocable: once made, the choice cannot be changed on subsequent steps
Do greedy algorithms always yield optimal solutions? No. Example: the change-making problem with denominations 7, 5, and 1, and n = 10.
Still, for many problems, using dynamic programming to determine the best choice is overkill; a simpler, more efficient greedy algorithm will do. The greedy technique is quite powerful and works well for a wide range of problems.
Applications of the Greedy Strategy
For some problems, the greedy strategy yields an optimal solution for every instance. For most, it does not, but it can still be useful for fast approximations.
Optimal solutions:
- some instances of change making
- Minimum Spanning Tree (MST)
- Single-source shortest paths
- Huffman codes
Approximations:
- Traveling Salesman Problem (TSP)
- Knapsack problem
- other optimization problems
Minimum Spanning Tree (MST)
A spanning tree of a connected graph G is a connected acyclic subgraph (tree) of G that includes all of G's vertices. Note: a spanning tree of a graph with n vertices has exactly n - 1 edges.
A minimum spanning tree of a weighted, connected graph G is a spanning tree of G of minimum total weight.
(Figure: an example graph and one of its minimum spanning trees.)
MST Problem
Given a connected, undirected, weighted graph G = (V, E), find a minimum spanning tree for it. (Kruskal: 1956, Prim: 1957.)
Can we compute an MST by brute force?
- Generate all possible spanning trees of the given graph.
- Find the one with minimum total weight.
Feasibility of brute force: there may be far too many spanning trees — exponentially many for dense graphs (by Cayley's formula, the complete graph on n vertices has n^(n-2) spanning trees).
Prim's Algorithm
- Idea of Prim's algorithm
- Pseudocode of the algorithm
- Correctness of the algorithm (important)
- Time complexity of the algorithm
Idea of Prim
Grow a single tree by repeatedly adding the least-cost edge (the greedy step) that connects a vertex in the existing tree to a vertex not yet in the tree. The intermediate solution is always a subtree of some minimum spanning tree.
Prim's MST Algorithm
- Start with a tree T0 consisting of one vertex.
- "Grow" the tree one vertex/edge at a time: construct a series of expanding subtrees T1, T2, …, Tn-1.
- At each stage, construct Ti from Ti-1 by adding the minimum-weight edge connecting a vertex in the tree (Ti-1) to a vertex not yet in the tree — this is the "greedy" step!
- The algorithm stops when all vertices are included.
Pseudocode of Prim
ALGORITHM Prim(G)
//Prim's algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = (V, E)
//Output: ET, the set of edges composing a minimum spanning tree of G
VT ← {v0}   //v0 can be arbitrarily selected
ET ← ∅
for i ← 1 to |V| - 1 do
    find a minimum-weight edge e* = (v*, u*) among all the edges (v, u)
        such that v is in VT and u is in V - VT
    VT ← VT ∪ {u*}
    ET ← ET ∪ {e*}
return ET
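The pseudocode above can be rendered directly in Python. This is a minimal sketch under an assumed graph representation (a dict of weighted adjacency lists); at each step it scans every edge crossing the cut (VT, V - VT), matching the pseudocode rather than aiming for efficiency.

```python
def prim_basic(graph, start):
    """graph: {vertex: [(neighbor, weight), ...]} for an undirected,
    connected graph. Returns the list of MST edges as (u, v) pairs."""
    in_tree = {start}                 # V_T
    mst_edges = []                    # E_T
    while len(in_tree) < len(graph):
        best = None                   # (weight, u, v) with u in tree, v outside
        for u in in_tree:
            for v, w in graph[u]:
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, u, v)
        w, u, v = best                # connectivity guarantees a crossing edge
        in_tree.add(v)
        mst_edges.append((u, v))
    return mst_edges
```

For example, on a 5-vertex graph with edges ab:3, ae:5, bc:1, bd:4, cd:6, de:2 (a hypothetical instance), the edges selected from a are (a,b), (b,c), (b,d), (d,e), with total weight 10.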
An example
(Figure: a 5-vertex weighted graph on vertices a, b, c, d, e with edge weights 1-7, traced through the steps of Prim's algorithm.)
Prim's algorithm is greedy!
The choice of edges added to the current subtree satisfies the three properties of greedy algorithms:
- Feasible: each edge added to the tree does not create a cycle, guaranteeing that the final ET is a spanning tree.
- Locally optimal: each edge selected is always the one with minimum weight among all edges crossing between VT and V - VT.
- Irrevocable: once an edge is added to the tree, it is not removed in subsequent steps.
Correctness of Prim
We prove by induction that this construction process actually yields an MST.
Base case: T0 (a single vertex) is a subgraph of every MST.
Inductive step: suppose Ti-1 is a subgraph of some MST T; we prove that Ti, generated from Ti-1, is also a subgraph of some MST.
By contradiction, assume that Ti is not a subgraph of any MST. Let ei = (u, v) be the minimum-weight edge from a vertex in Ti-1 to a vertex not in Ti-1, used by Prim's algorithm to expand Ti-1 to Ti. By our assumption, ei cannot belong to the MST T. Adding ei to T creates a cycle, and this cycle must contain another edge e' = (u', v') connecting a vertex u' in Ti-1 to a vertex v' not in it, with w(e') ≥ w(ei), since the greedy step chose ei as the minimum-weight crossing edge. Removing e' from T and adding ei yields another spanning tree T' with weight w(T') ≤ w(T), so T' is a minimum spanning tree that includes Ti — contradicting the assumption that Ti is not a subgraph of any MST.
Implementation of Prim
How can we implement the steps of Prim's algorithm?
First idea: label each vertex with either 0 or 1, where 1 means the vertex is in VT and 0 otherwise. In each iteration, traverse the edge set to find a minimum-weight edge whose endpoints have different labels.
Time complexity: O(VE) with an adjacency list and O(V^3) with an adjacency matrix. For sparse graphs, use the adjacency list.
Any improvement?
Notations
T: the expanding subtree. Q: the remaining vertices.
At each stage, the key point of expanding the current subtree T is to determine which vertex in Q is nearest to T. Q can be thought of as a priority queue:
- The key (priority) of each vertex, key[v], is the minimum weight of an edge from v to a vertex in T; key[v] is ∞ if v is not adjacent to any vertex in T.
- The major operation is to find and delete the nearest vertex: the v for which key[v] is smallest among all vertices in Q.
- Removing the nearest vertex v from Q adds it and the corresponding edge to T. When that happens, the keys of v's neighbors may change.
To remember the edges of the MST, an array π[] records the parent of each vertex: for each vertex v not yet in T, π[v] is the vertex in the expanding subtree T that is closest to v.
Advanced Prim's Algorithm
ALGORITHM MST-PRIM(G, w, r)   //w: weight; r: root, the starting vertex
for each u ∈ V[G]
    do key[u] ← ∞
       π[u] ← NIL   //π[u]: the parent of u
key[r] ← 0
Q ← V[G]   //Now the priority queue, Q, has been built.
while Q ≠ ∅
    do u ← Extract-Min(Q)   //remove the nearest vertex from Q
       for each v ∈ Adj[u]   //update the key of each of u's adjacent vertices
           do if v ∈ Q and w(u, v) < key[v]
                  then π[v] ← u
                       key[v] ← w(u, v)
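A Python sketch of MST-PRIM using a binary heap (`heapq`) is shown below. One assumption to note: Python's `heapq` has no Decrease-Key operation, so instead of updating an entry in place we push a fresh (key, vertex) pair and skip stale entries on extraction ("lazy deletion"); the asymptotic O(m log n) bound is unchanged. The graph representation is the same hypothetical dict-of-adjacency-lists used earlier.

```python
import heapq

def prim_heap(graph, root):
    """graph: {u: [(v, w), ...]} undirected and connected.
    Returns (total_weight, parent), where parent[v] plays the
    role of pi[v] in the pseudocode."""
    key = {u: float("inf") for u in graph}   # key[u] <- infinity
    parent = {u: None for u in graph}        # pi[u] <- NIL
    key[root] = 0
    in_tree = set()
    pq = [(0, root)]
    while pq:
        k, u = heapq.heappop(pq)             # Extract-Min(Q)
        if u in in_tree:
            continue                         # stale entry: lazy deletion
        in_tree.add(u)
        for v, w in graph[u]:                # for each v in Adj[u]
            if v not in in_tree and w < key[v]:
                parent[v] = u                # pi[v] <- u
                key[v] = w                   # key[v] <- w(u, v)
                heapq.heappush(pq, (w, v))   # stand-in for Decrease-Key
    return sum(key.values()), parent
```

The duplicate-entry trick is a standard workaround for heaps without Decrease-Key; the alternative is an indexed heap that tracks each vertex's position.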
Time Complexity of Prim's Algorithm
A priority queue is needed for locating the nearest vertex.
- Unordered array as the priority queue: Θ(n^2)
- Binary min-heap as the priority queue: O(m log n) for a graph with n vertices and m edges
- Fibonacci heap as the priority queue: O(n log n + m)
Kruskal's MST Algorithm
- Edges are initially sorted by increasing weight, which takes O(m log m) time.
- Start with an empty forest F0 and "grow" the MST one edge at a time, through a series of expanding forests F1, F2, …, Fn-1.
- Intermediate stages usually have a forest of trees (not connected).
- At each stage, add the minimum-weight edge among those not yet used that does not create a cycle — an efficient way of detecting/avoiding cycles is needed.
- The algorithm stops when all vertices are included.
Correctness of Kruskal
Similar to the proof for Prim: prove by induction that the construction process actually generates an MST, considering the forests F0, F1, …, Fn-1.
Basic Kruskal's Algorithm
ALGORITHM Kruskal(G)
//Input: A weighted connected graph G = (V, E)
//Output: ET, the set of edges composing a minimum spanning tree of G
sort E in nondecreasing order of the edge weights: w(e_i1) ≤ … ≤ w(e_i|E|)
ET ← ∅; ecounter ← 0   //initialize the set of tree edges and its size
k ← 0
while ecounter < |V| - 1 do
    k ← k + 1
    if ET ∪ {e_ik} is acyclic
        ET ← ET ∪ {e_ik}; ecounter ← ecounter + 1
return ET
Always choosing the minimum-weight remaining edge that does not create a cycle — greedy!
Kruskal’s Algorithm (Advanced Part)
Kruskal: Time Complexity
O(EV) when a disjoint-set data structure is not used.
When using a disjoint-set data structure with the union-by-rank and path-compression heuristics:
- Initializing the set A in line 1 takes O(1) time.
- O(V) MAKE-SET operations in lines 2-3.
- Sorting the edges in line 4 takes O(E log E) time.
- The for loop of lines 5-8 performs O(E) FIND-SET and UNION operations on the disjoint-set forest; together with the MAKE-SET operations these take O((V + E) α(V)) in total.
- Line 9: O(V + E).
Note that E ≥ V - 1 (G is connected), α(V) = O(log V) = O(log E), and E ≤ V^2.
So the total time is O(E log V).
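Kruskal's algorithm with the disjoint-set forest discussed above can be sketched as follows. The `find`/`union` helpers implement path compression (here in the path-halving form) and union by rank; names and the edge-list representation are our own choices, not from the slides.

```python
def kruskal(vertices, edges):
    """vertices: iterable of vertex labels; edges: list of (weight, u, v).
    Returns the MST edges as a list of (weight, u, v) triples."""
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find(x):                        # FIND-SET with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):                    # UNION by rank
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                # same tree: edge would close a cycle
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst = []
    for w, u, v in sorted(edges):       # the O(E log E) sort dominates
        if union(u, v):
            mst.append((w, u, v))
            if len(mst) == len(parent) - 1:
                break                   # |V| - 1 edges: tree complete
    return mst
```

On the same hypothetical 5-vertex example used for Prim, this selects the edges of weight 1, 2, 3, 4 (total 10), skipping the weight-5 and weight-6 edges as cycle-closing.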
Properties of MST
Property 1: Let (u, v) be a minimum-weight edge in a graph G = (V, E). Then (u, v) belongs to some minimum spanning tree of G.
Proof hint: suppose (u, v) is not in an MST T; construct another MST T' that includes (u, v) by adding (u, v) to T and removing another edge on the resulting cycle.
Properties of MST
Property 2: A graph has a unique minimum spanning tree if all the edge weights are pairwise distinct. The converse does not hold: a graph may have a unique MST even with repeated weights (for example, a graph that is itself a tree).
Properties of MST
Property 3: Let T be a minimum spanning tree of a graph G, and let T' be an arbitrary spanning tree of G. Suppose the edges of each tree are sorted in nondecreasing order, i.e., w(e1) ≤ w(e2) ≤ … ≤ w(en-1) and w(e1') ≤ w(e2') ≤ … ≤ w(en-1'). Then w(ei) ≤ w(ei') for all 1 ≤ i ≤ n-1.
Property 4: Let T be a minimum spanning tree of a graph G, and let L be the sorted list of the edge weights of T. Then for any other minimum spanning tree T' of G, L is also the sorted list of the edge weights of T'.
Properties of MST
Property 5: Let e = (u, v) be a maximum-weight edge on some cycle of G. Then there is a minimum spanning tree that does not include e.
Proof. Arbitrarily choose an MST T. If T does not contain e, we are done. Otherwise, T \ e is disconnected; let X and Y be the two connected components of T \ e. Edge e lies on a cycle C in G; let P = C \ e. Then there is an edge (x, y) on P such that x ∈ X and y ∈ Y, and w(x, y) ≤ w(e) since e has maximum weight on C. T' = T \ e + (x, y) is a spanning tree with w(T') ≤ w(T). Since T is an MST, we also have w(T) ≤ w(T'), so w(T') = w(T), and T' is an MST not including e.
Bottleneck Spanning Tree
A bottleneck spanning tree T of a connected, weighted, undirected graph G is a spanning tree of G whose largest edge weight is minimum over all spanning trees of G. That is, let T1, T2, …, Tm be all the spanning trees of G, with largest edges et1, et2, …, etm respectively; if w(eti) ≤ w(etj) for all 1 ≤ j ≤ m with j ≠ i, then Ti is a bottleneck spanning tree.
The value of a BST T is the weight of the maximum-weight edge in T. The bottleneck spanning tree may not be unique.
BST vs MST
Every minimum spanning tree is a bottleneck spanning tree. Property 4 implies this; an easier direct proof:
Let T be an MST and T' a BST, and let the maximum-weight edges in T and T' be e and e', respectively. Suppose for contradiction that the MST T is not a BST; then w(e) > w(e'), which means the weight of e is greater than that of every edge in T'. Removing e from T disconnects T into two subtrees T1 and T2. There must exist an edge f in T' connecting T1 and T2 (otherwise T' would not be connected). Then T1 ∪ T2 ∪ {f} forms a new spanning tree T'' with w(T'') = w(T) - w(e) + w(f) < w(T), contradicting the fact that T is an MST. Thus an MST is also a BST.
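The claim can be checked exhaustively on a small instance. The sketch below (our own illustration, on a hypothetical 4-vertex graph) enumerates every spanning tree by brute force, then verifies that a tree of minimum total weight also achieves the minimum bottleneck value.

```python
from itertools import combinations

def spanning_trees(vertices, edges):
    """Yield every (n-1)-edge acyclic subset of edges, i.e. every
    spanning tree of a connected graph; edges are (weight, u, v)."""
    n = len(vertices)
    for subset in combinations(edges, n - 1):
        parent = {v: v for v in vertices}
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        acyclic = True
        for _, u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False       # edge closes a cycle
                break
            parent[ru] = rv
        if acyclic:
            yield subset              # n-1 acyclic edges on n vertices span

# Hypothetical example: a 4-cycle a-b-c-d-a plus chord a-c.
vertices = "abcd"
edges = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d"),
         (4, "d", "a"), (5, "a", "c")]
trees = list(spanning_trees(vertices, edges))
mst = min(trees, key=lambda t: sum(w for w, _, _ in t))
best_bottleneck = min(max(w for w, _, _ in t) for t in trees)
assert max(w for w, _, _ in mst) == best_bottleneck  # the MST is also a BST
```

Of course this only confirms the theorem on one instance; the proof above is what establishes it in general.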