Approximation Algorithms: Greedy Strategies
I hear, I forget. I learn, I remember. I do, I understand!

Max and Min

min f is equivalent to max −f. However, a good approximation for min f may not be a good approximation for max −f.

For example, consider a graph G=(V,E). C is a minimum vertex cover of G iff V \ C is a maximum independent set of G. The minimum vertex cover problem has a polynomial-time 2-approximation, but the maximum independent set problem has no constant-factor approximation unless P=NP.

Another example: Minimum Connected Dominating Set vs. Minimum Spanning Tree with Maximum Number of Leaves.
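The vertex-cover side of this example is easy to make concrete. A minimal sketch of the classic maximal-matching 2-approximation (edge-list graph representation is an assumption), whose complement is an independent set:

```python
def vertex_cover_2approx(edges):
    """Take both endpoints of every edge of a greedily built maximal
    matching; this is the classic 2-approximation for vertex cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still unmatched
            cover.update((u, v))
    return cover

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
vertices = {u for e in edges for u in e}
C = vertex_cover_2approx(edges)
independent = vertices - C   # the complement V \ C is an independent set
```

The complement of any vertex cover is independent, so `independent` spans no edge; the hardness gap on the slide shows this complementation does not transfer approximation guarantees.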
Greedy for Max and Min

Max --- independent system
Min --- submodular potential function
Independent System
Independent System

Consider a set E and a collection C of subsets of E. (E,C) is called an independent system if C is hereditary: A ⊆ B and B ∈ C imply A ∈ C. The elements of C are called independent sets.
Maximization Problem

Given an independent system (E,C) and a cost function c: E → R+, find an independent set I ∈ C maximizing the total cost c(I).
Greedy Approximation MAX

Sort the elements so that c(e1) ≥ c(e2) ≥ … ≥ c(em).
I ← ∅.
For i = 1 to m: if I ∪ {ei} ∈ C, then I ← I ∪ {ei}.
Output I.
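The greedy MAX procedure can be sketched as follows; the "size at most 2" system used in the example is just a toy independence oracle for illustration:

```python
def greedy_max(elements, cost, is_independent):
    """Greedy MAX for an independent system: scan elements in
    non-increasing cost order, keep any that preserves independence."""
    I = set()
    for e in sorted(elements, key=cost, reverse=True):
        if is_independent(I | {e}):
            I.add(e)
    return I

# toy independent system: subsets of size <= 2 (a uniform matroid)
elements = ["a", "b", "c", "d"]
cost = {"a": 4, "b": 3, "c": 2, "d": 1}.get
I = greedy_max(elements, cost, lambda S: len(S) <= 2)
```

Only the independence test changes between applications; the scan itself is the same for every independent system.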
Theorem

For any independent system (E,C) and cost function c, the greedy algorithm MAX produces an independent set G with c(OPT) ≤ ρ · c(G), where ρ = max over F ⊆ E of v(F)/u(F), and u(F), v(F) denote the minimum and maximum size of a maximal independent subset of F.
Proof
Maximum Weight Hamiltonian Cycle

Given an edge-weighted complete graph, find a Hamiltonian cycle with maximum total weight.
Independent Sets

E = {all edges}. A subset of edges is independent if it is a Hamiltonian cycle or a vertex-disjoint union of paths. C = the collection of all such subsets.
Maximal Independent Sets

Consider a subset F of edges. For any two maximal independent sets I and J of F, |J| < 2|I|.
Theorem

For the maximum weight Hamiltonian cycle problem, the greedy algorithm MAX is a polynomial-time approximation with performance ratio at most 2.
Maximum Weight Directed Hamiltonian Cycle

Given an edge-weighted complete digraph, find a Hamiltonian cycle with maximum total weight.
Independent Sets

E = {all edges}. A subset of edges is independent if it is a directed Hamiltonian cycle or a vertex-disjoint union of directed paths.
Tightness

(Figure omitted: a tightness example in which all remaining edges have weight ε.)
A Special Case

If c satisfies the quadrilateral condition (a condition on c over any 4 vertices u, v, u′, v′ in V), then the greedy approximation MAX for maximum weight Hamiltonian cycle has performance ratio 2.
Superstring

Given n strings s1, s2, …, sn, find a shortest string s containing every si as a substring. Assume no si is a substring of another sj.
An Example

Given S = {abcc, efaab, bccef}. Some possible solutions:
Concatenation of all strings: abccefaabbccef (14 characters)
A shortest superstring: abccefaab (9 characters)
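Since n is tiny here, the slide's claim can be checked by brute force over all left-to-right orders (valid because no string is a substring of another). A minimal sketch:

```python
from itertools import permutations

def overlap_len(u, v):
    """Length of the longest suffix of u that is a prefix of v."""
    for k in range(min(len(u), len(v)), 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def merge_in_order(order):
    """Superstring obtained by overlapping consecutive strings maximally."""
    s = order[0]
    for t in order[1:]:
        s += t[overlap_len(s, t):]
    return s

S = ["abcc", "efaab", "bccef"]
best = min((merge_in_order(p) for p in permutations(S)), key=len)
# best recovers the slide's 9-character superstring
```

This n!-time search is only for verifying small examples; the rest of the section is about polynomial-time approximations.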
Relationship to Set Cover?

How do we "transform" the shortest superstring (SS) problem into a set cover (SC) problem? We need to identify the universe U, the collection of subsets, and the cost function, so that the constructed SC instance encodes the SS instance. Let U = S (the set of n strings). How do we define the collection of subsets?
Relationship to SC (cont.)

For each pair si, sj and each k > 0 such that the last k characters of si equal the first k characters of sj, let σijk denote the string obtained by overlapping si and sj by k characters. Let M be the set of all such strings σijk.
Relationship to SC (cont.)

Now define the collection: for each π ∈ S ∪ M, let set(π) = {s ∈ S : s is a substring of π}, and define cost(set(π)) = |π|. Let C be a set cover of this constructed SC instance; then the concatenation of the strings π with set(π) ∈ C is a solution of SS. Note that C is a collection of sets of the form set(π).
Algorithm 1 for SS

1. Construct the SC instance above: universe S, subsets set(π) for π ∈ S ∪ M, cost(set(π)) = |π|.
2. Run the greedy set cover algorithm to obtain a cover {set(π1), …, set(πt)}.
3. Output the concatenation π1π2⋯πt.
Approximation Ratio

Lemma 1: Let opt be the length of an optimal solution of SS, and opt′ the cost of an optimal solution of the constructed SC instance. Then opt ≤ opt′ ≤ 2·opt.
Proof:
Proof of Lemma 1 (cont.)
Approximation Ratio

Theorem 1: Algorithm 1 has an approximation ratio of 2Hn.
Proof: The greedy algorithm for set cover has approximation ratio Hn. Combined with Lemma 1, it follows directly that Algorithm 1 is a 2Hn-factor algorithm for SS.
Prefix and Overlap

For two strings s1 and s2, define:
overlap(s1, s2) = the longest string that is both a suffix of s1 and a prefix of s2.
pref(s1, s2) = the prefix of s1 that remains after chopping off overlap(s1, s2).
Example: s1 = abcbcaa and s2 = bcaaca; then overlap(s1, s2) = bcaa and pref(s1, s2) = abc.
Note: overlap(s1, s2) ≠ overlap(s2, s1) in general.
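These two definitions translate directly into code; a sketch checking the slide's example:

```python
def overlap(s1, s2):
    """The longest string that is both a suffix of s1 and a prefix of s2."""
    for k in range(min(len(s1), len(s2)), 0, -1):
        if s1[-k:] == s2[:k]:
            return s1[-k:]
    return ""

def pref(s1, s2):
    """The prefix of s1 left after chopping off overlap(s1, s2)."""
    return s1[:len(s1) - len(overlap(s1, s2))]

# slide example
assert overlap("abcbcaa", "bcaaca") == "bcaa"
assert pref("abcbcaa", "bcaaca") == "abc"
# overlap is not symmetric in general
assert overlap("bcaaca", "abcbcaa") != overlap("abcbcaa", "bcaaca")
```

Note that |pref(s1, s2)| + |overlap(s1, s2)| = |s1| always, which is used in the opt identity on the next slide.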
Is there a better approach?

Suppose that in the optimal solution the strings appear from left to right in the order s1, s2, …, sn. Define:
opt = |pref(s1,s2)| + … + |pref(sn−1,sn)| + |pref(sn,s1)| + |overlap(sn,s1)|
Why the overlap(sn,s1) term? Consider S = {agagag, gagaga}. Using the prefixes alone, the result would be ag, whereas the correct result is agagaga.
Prefix Graph

Define the prefix graph as a complete weighted directed graph G=(V,E):
V = {1, 2, …, n}, where vertex i represents string si.
Each edge i→j, i ≠ j, has weight |pref(si, sj)|.
Example: S = {abc, bcd, dab}.
Cycle Cover

A cycle cover is a collection of vertex-disjoint cycles covering all vertices (each vertex is in exactly one cycle). Note that the tour 1 → 2 → … → n → 1 is a cycle cover. A minimum-weight cycle cover minimizes the sum of edge weights over all covers. Thus, we want to find a minimum-weight cycle cover.
How to find a minimum-weight cycle cover

From the prefix graph, construct a bipartite graph H=(X,Y;E) as follows:
X = {x1, x2, …, xn} and Y = {y1, y2, …, yn}.
For each i, j in 1…n, add edge (xi, yj) of weight |pref(si, sj)|.
Each cycle cover of the prefix graph corresponds to a perfect matching of the same weight in H (a perfect matching is a matching covering all vertices). So finding a minimum-weight cycle cover = finding a minimum-weight perfect matching, which can be done in polynomial time.
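A cycle cover assigns each vertex exactly one successor, i.e. it is a permutation with no fixed points, which is exactly what the matching in H selects. The sketch below brute-forces that assignment for clarity (a real implementation would use a polynomial-time min-weight perfect matching, e.g. the Hungarian algorithm); the instance is the slide's example S = {abc, bcd, dab}:

```python
from itertools import permutations

def overlap_len(u, v):
    """Length of the longest suffix of u that is a prefix of v."""
    for k in range(min(len(u), len(v)), 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def min_cycle_cover(S):
    """Minimum-weight cycle cover of the prefix graph, found by
    brute force over fixed-point-free permutations (successor maps)."""
    n = len(S)
    w = lambda i, j: len(S[i]) - overlap_len(S[i], S[j])  # |pref(si, sj)|
    covers = (p for p in permutations(range(n))
              if all(p[i] != i for i in range(n)))
    best = min(covers, key=lambda p: sum(w(i, p[i]) for i in range(n)))
    return best, sum(w(i, best[i]) for i in range(n))

S = ["abc", "bcd", "dab"]          # slide example
succ, weight = min_cycle_cover(S)   # succ[i] = successor of vertex i
```

On this instance the single cycle abc → bcd → dab → abc with weight 1 + 2 + 1 = 4 is optimal.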
How to break the cycle

(Figure omitted: a cycle through strings s11, s12, s13 is opened at a chosen representative string.)
A constant-factor algorithm

Algorithm 2:
1. Construct the prefix graph of S and find a minimum-weight cycle cover C = {c1, …, ck}.
2. Open each cycle ci into a string τi by walking the cycle from a representative string ri.
3. Output the concatenation τ1τ2⋯τk.
Approximation Ratio

Lemma 2: Let C be a minimum-weight cycle cover of S, let c and c′ be two cycles in C, and let r, r′ be representative strings from these cycles. Then |overlap(r, r′)| < w(c) + w(c′).
Proof: exercise.
Approximation Ratio (cont.)

Theorem 2: Algorithm 2 has an approximation ratio of 4.
Proof: see next slide.
Proof
Modification to 3-Approximation
3-Approximation Algorithm

Algorithm 3:
Superstring via Hamiltonian Path

|ov(u,v)| = max{ |w| : there exist x and y such that u = xw and v = wy }.
The overlap graph G is a complete digraph with V = {s1, s2, …, sn} and edge weight |ov(u,v)|.
Suppose s* is the shortest superstring, and let s1, …, sn be the strings in order of appearance from left to right. Then consecutive strings si, si+1 attain their maximum overlap inside s*. Hence s1, …, sn form a directed Hamiltonian path in G.
The Algorithm (via Hamiltonian)
A Special Property

(Figure omitted: four vertices u, v, u′, v′ illustrating a quadrilateral-type property of overlaps.)
Theorem

The greedy approximation MAX for the maximum Hamiltonian path problem in the overlap graph has performance ratio 2.
Conjecture: this greedy approximation also gives a superstring within a factor of 2 of optimal.
Example: S = {ab^k, b^(k+1), b^k a}. The shortest superstring is s* = ab^(k+1)a, while the obtained solution can be ab^k ab^(k+1).
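The greedy heuristic the conjecture refers to can be sketched as follows: repeatedly merge the two strings with the largest overlap until one remains. How ties are broken decides whether the slide's example S = {ab^k, b^(k+1), b^k a} actually degrades to ab^k ab^(k+1); with the tie-breaking below, greedy happens to find a shortest superstring of the earlier example:

```python
def overlap_len(u, v):
    """Length of the longest suffix of u that is a prefix of v."""
    for k in range(min(len(u), len(v)), 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def greedy_superstring(strings):
    """Repeatedly merge the two distinct strings with the largest overlap."""
    S = list(strings)
    while len(S) > 1:
        k, i, j = max((overlap_len(u, v), i, j)
                      for i, u in enumerate(S)
                      for j, v in enumerate(S) if i != j)
        S = [s for t, s in enumerate(S) if t not in (i, j)] + [S[i] + S[j][k:]]
    return S[0]

result = greedy_superstring(["abcc", "efaab", "bccef"])  # 9 characters
```

The famous open question (the greedy superstring conjecture) is whether this heuristic is ever worse than twice optimal.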
Submodular Function
What is a submodular function?

Consider a finite set E (called the ground set) and a function f: 2^E → Z. The function f is said to be submodular if for any two subsets A and B in 2^E:
f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B)
Example: f(A) = |A| is submodular.
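The defining inequality can be checked exhaustively on a small ground set. A sketch (`is_submodular` is a hypothetical helper name, not from the slides):

```python
from itertools import combinations

def is_submodular(ground, f):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all subsets A, B."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    return all(f(A) + f(B) >= f(A | B) + f(A & B)
               for A in subsets for B in subsets)

# f(A) = |A| is modular (equality holds), hence submodular
assert is_submodular({1, 2, 3}, len)
# f(A) = |A|^2 is not submodular: A={1}, B={2} already violates it
assert not is_submodular({1, 2, 3}, lambda A: len(A) ** 2)
```

The exhaustive check is exponential in |E|, so it is only a sanity test for toy examples, not something to run on real instances.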
Set Cover

Given a collection C of subsets of a set E, find a minimum-size subcollection C′ of C such that every element of E appears in some subset in C′.
Greedy Algorithm

At each step, pick the set that covers the most uncovered elements; return C′ once every element is covered. Here f(C′) = the number of elements covered by C′.
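A minimal sketch of this greedy, assuming E is a set and C is a list of subsets:

```python
def greedy_set_cover(E, C):
    """Each round, pick the set covering the most uncovered elements."""
    uncovered = set(E)
    chosen = []
    while uncovered:
        best = max(C, key=lambda S: len(uncovered & S))
        if not uncovered & best:
            raise ValueError("C does not cover E")
        chosen.append(best)
        uncovered -= best
    return chosen

E = set(range(6))
C = [{0, 1, 2}, {3, 4}, {5}, {0, 3}, {1, 4, 5}]
cover = greedy_set_cover(E, C)   # 3 sets, which is optimal here
```

On this instance no two sets cover all six elements, so the greedy's three sets match the optimum; in general the gap can grow like H_n, as the analysis below shows.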
Analyze the Approximation Ratio
Alternative Analysis
What do we need?
What do we need? (cont.)
Actually, this inequality holds if and only if f is submodular and monotone increasing.
Proof
Proof of (1)
Proof of (2)
Theorem

The greedy algorithm produces an approximation within ln n + 1 of optimal for the set cover problem. The same result holds for weighted set cover.
Weighted Set Cover

Given a collection C of subsets of a set E and a weight function w on C, find a minimum total-weight subcollection C′ of C such that every element of E appears in some subset in C′.
Greedy Algorithm
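For the weighted variant the natural greedy rule picks, each round, the set minimizing weight per newly covered element. A sketch, under the assumption that sets are given as (weight, subset) pairs:

```python
def greedy_weighted_set_cover(E, sets):
    """sets: list of (weight, subset) pairs.  Each round, pick the set
    minimizing cost per newly covered element."""
    uncovered = set(E)
    picked = []
    while uncovered:
        i = min((i for i, (w, S) in enumerate(sets) if uncovered & S),
                key=lambda i: sets[i][0] / len(uncovered & sets[i][1]))
        picked.append(i)
        uncovered -= sets[i][1]
    return picked

E = {1, 2, 3, 4}
sets = [(3.0, {1, 2, 3}), (1.0, {1, 2}), (1.0, {3, 4})]
picked = greedy_weighted_set_cover(E, sets)   # picks the two cheap sets
```

With unit weights this rule reduces to the unweighted greedy above, and the same H_n-type analysis applies.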
A General Problem
Greedy Algorithm
A General Theorem

Remark (normalized):
Proof
Proof (cont.)

We will prove the following claims:
Show the First Claim
Show the Second Claim

For any integers p > q > 0, we have:
(p − q)/p = Σ_{j=q+1}^{p} 1/p ≤ Σ_{j=q+1}^{p} 1/j
Connected Vertex Cover

Given a connected graph, find a minimum vertex cover that induces a connected subgraph.
For any vertex subset A, let p(A) be the number of edges not covered by A, and q(A) the number of connected components of the subgraph induced by A.
−p is submodular; −q is not submodular.
Note that when A is a connected vertex cover, q(A) = 1 and p(A) = 0.
−p − q

Define f(A) = −p(A) − q(A). Then f is submodular and monotone increasing.
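A quick check of these definitions on a small graph (the function names p, q, f mirror the slides; the edge-list representation is an assumption):

```python
def p(edges, A):
    """Number of edges not covered by vertex set A."""
    return sum(1 for u, v in edges if u not in A and v not in A)

def q(edges, A):
    """Number of connected components of the subgraph induced by A."""
    A = set(A)
    adj = {v: set() for v in A}
    for u, v in edges:
        if u in A and v in A:
            adj[u].add(v)
            adj[v].add(u)
    seen, components = set(), 0
    for s in A:                      # DFS from each unvisited vertex of A
        if s not in seen:
            components += 1
            stack = [s]
            while stack:
                x = stack.pop()
                if x not in seen:
                    seen.add(x)
                    stack.extend(adj[x] - seen)
    return components

def f(edges, A):
    """The potential function f = -p - q from the slide."""
    return -p(edges, A) - q(edges, A)

# path graph 0-1-2-3: A = {1, 2} is a connected vertex cover
edges = [(0, 1), (1, 2), (2, 3)]
```

Here f(∅) = −3 − 0 = −3 and f({1, 2}) = −0 − 1 = −1, consistent with f being monotone increasing and maximized (at −1) exactly on connected vertex covers.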
Theorem

Connected Vertex Cover has a (1 + ln Δ)-approximation, where Δ is the maximum degree.
Note: −p(∅) = −|E| and −q(∅) = 0, while for a single vertex x, |E| − p({x}) − q({x}) ≤ Δ − 1.
Weighted Connected Vertex Cover

Given a vertex-weighted connected graph, find a connected vertex cover with minimum total weight.
Theorem: Weighted Connected Vertex Cover has a (1 + ln Δ)-approximation.