Approximation Algorithms Chapter 14: Rounding Applied to Set Cover
Overview
• Set Cover
  – Approximation by simple rounding: an f-approximation algorithm (f: the frequency of the most frequent element).
  – Approximation by randomized rounding: an O(log n)-approximation algorithm (n: the number of elements to be covered).
• Weighted Vertex Cover
  – A 2-approximation algorithm based on half-integral solutions of the linear program: each variable takes only the value 0, 1/2, or 1.
Set Cover
• Input
  – Elements $U = \{a_1, \dots, a_n\}$.
  – A collection of subsets of U: $\mathcal{S} = \{S_1, \dots, S_m\}$.
  – Cost function $c: \mathcal{S} \to \mathbb{Q}^+$.
• Output
  – A subcollection of $\mathcal{S}$ that covers all elements of U such that the sum of the costs of the chosen subsets is minimized.
[Figure: elements a_1, ..., a_6 and subsets S_1, ..., S_5, each labeled with its cost. Three covers are shown with costs 2+2+2=6, 1+4=5, and 2+1+4=7; a fourth selection is not a solution since an element is not covered.]
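
The definition above can be sketched in a few lines of Python. This is only an illustration: the membership of each subset is hypothetical, and only the costs 2, 2, 2, 1, 4 are chosen so that the sums 6, 5, and 7 quoted on the slide can be reproduced. The helper names (`is_cover`, `total_cost`, `brute_force_optimum`) are assumptions of this sketch.

```python
from itertools import combinations

# Hypothetical instance: which elements belong to which subset is assumed
# for illustration; only the costs mirror the sums quoted on the slide.
UNIVERSE = {1, 2, 3, 4, 5, 6}
SETS = {
    "S1": {1, 2},
    "S2": {3, 4},
    "S3": {5, 6},
    "S4": {1, 3, 5},
    "S5": {2, 4, 6},
}
COSTS = {"S1": 2, "S2": 2, "S3": 2, "S4": 1, "S5": 4}


def is_cover(universe, sets, chosen):
    """Return True if the chosen subsets together contain every element."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return universe <= covered


def total_cost(costs, chosen):
    """Sum of the costs of the chosen subsets."""
    return sum(costs[name] for name in chosen)


def brute_force_optimum(universe, sets, costs):
    """Try every subcollection; exponential time, for tiny instances only."""
    best = None
    names = sorted(sets)
    for r in range(1, len(names) + 1):
        for chosen in combinations(names, r):
            if is_cover(universe, sets, chosen):
                cost = total_cost(costs, chosen)
                if best is None or cost < best[0]:
                    best = (cost, set(chosen))
    return best
```

On this instance `["S1", "S2", "S3"]` is a cover of cost 6, `["S4", "S5"]` is a cover of cost 5, and `["S1", "S2"]` is rejected because elements 5 and 6 stay uncovered.
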
Set Cover by linear inequalities
• Variables
  – $x_S \in \{0, 1\}$ for each $S \in \mathcal{S}$; $x_S = 1$ means that S is chosen.
• Objective function
  – Minimize the sum of the costs of the chosen subsets: minimize $\sum_{S \in \mathcal{S}} c(S)\, x_S$.
• Constraints
  – For covers: each element must appear in at least one chosen subset, $\sum_{S:\, a \in S} x_S \ge 1$ for every $a \in U$.
  – For choosing subsets: each subset is either chosen or not chosen, $x_S \in \{0, 1\}$ for every $S \in \mathcal{S}$.
LP-relaxation
• Constraints
  – Integer program: each subset is either chosen or not chosen, $x_S \in \{0, 1\}$.
  – Relaxation: $x_S$ takes a value between 0 and 1, $0 \le x_S \le 1$.
  – From the nature of Set Cover, the upper bound can be eliminated: a value above 1 only adds cost without helping any covering constraint.
  – Hence it suffices to require a non-negative value, $x_S \ge 0$.
Rounding
• To change a fractional (real) value into an integer.
[Figure, three panels over the variables x_1, ..., x_4 with values between 0 and 1: "Solution found by LP", "Rounded solution with a threshold", and "Probabilistically rounded solution".]
Algorithm 14.1
• A simple rounding algorithm $A_1$
  – f: the frequency of the most frequent element.
  – 1. Find an optimal solution to the LP-relaxation.
  – 2. Pick all sets S for which $x_S \ge 1/f$; that is, $x_S$ becomes 1 if $x_S \ge 1/f$ and 0 otherwise.
[Figure: elements a_1, ..., a_5 and sets S_1, ..., S_5 with f = 2, showing the solution of the LP-relaxation and the rounded solution.]
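
Step 2 of Algorithm 14.1 can be sketched as follows. The instance and the fractional values in `X` are hypothetical: `X` is a hand-picked feasible solution of the LP-relaxation, not the output of an LP solver and not necessarily optimal. The rounding step itself only needs feasibility; the function names are assumptions of this sketch.

```python
from fractions import Fraction

# Hypothetical instance (membership and fractional values are assumed).
UNIVERSE = {1, 2, 3, 4, 5, 6}
SETS = {
    "S1": {1, 2},
    "S2": {3, 4},
    "S3": {5, 6},
    "S4": {1, 3, 5},
    "S5": {2, 4, 6},
}
COSTS = {"S1": 2, "S2": 2, "S3": 2, "S4": 1, "S5": 4}
# A feasible fractional solution: for every element the x-values of the
# sets containing it sum to exactly 1.
X = {
    "S1": Fraction(3, 4),
    "S2": Fraction(3, 4),
    "S3": Fraction(3, 4),
    "S4": Fraction(1, 4),
    "S5": Fraction(1, 4),
}


def max_frequency(universe, sets):
    """f: the largest number of sets that any single element belongs to."""
    return max(sum(1 for s in sets.values() if a in s) for a in universe)


def round_by_threshold(universe, sets, x):
    """Pick every set whose fractional value is at least 1/f."""
    f = max_frequency(universe, sets)
    return {name for name, value in x.items() if value >= Fraction(1, f)}


picked = round_by_threshold(UNIVERSE, SETS, X)
```

Here f = 2, so the threshold is 1/2 and the sketch picks S1, S2, and S3. Exact fractions are used so that the comparison with 1/f is not affected by floating-point error.
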
Theorem 14.2
• $A_1$ (Algorithm 14.1) is an f-approximation algorithm for Set Cover.
  – We need to consider the following two properties:
    – $A_1$ outputs a feasible solution, which covers all elements.
    – How large is the cost of the solution output by $A_1$?
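Proof of Theorem 14.2 (1/2)
• $A_1$ outputs a feasible solution, which covers all elements.
  – For any element a, there exists a set S such that a is in S and $x_S \ge 1/f$.
    From the constraints for covers, $\sum_{S:\, a \in S} x_S \ge 1$, and a appears in at most f sets. If every such set had $x_S < 1/f$, the sum would be less than $f \cdot (1/f) = 1$, a contradiction.
  – Therefore, every element is covered.
[Figure: bars showing the value of $x_S$ for the at most f sets S such that a is in S; the total area of the bars is at least 1, so at least one bar reaches height 1/f.]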
Proof of Theorem 14.2 (2/2)
• How large is the cost of $A_1$ (COST)?
  – Let $OPT_{LP}$ ($OPT_f$ in the text) be the cost of an optimal solution of the LP-relaxation.
  – Let $x_S$ be the solution of the LP-relaxation and $y_S$ the rounded one. Then $y_S \le f\, x_S$ holds, since
    – if $y_S = 1$, then $x_S \ge 1/f$, so $f\, x_S \ge 1 = y_S$;
    – if $y_S = 0$, then $x_S \ge 0$, so $f\, x_S \ge 0 = y_S$.
  – Therefore, $COST = \sum_S c(S)\, y_S \le f \sum_S c(S)\, x_S = f \cdot OPT_{LP} \le f \cdot OPT$.
Example 14.3
• A set consists of three connected elements in $V_i$.
• The cost of each set is 1.
• f = 4.
• The optimal cost is 2.
• In the bottom figure, the cost is 8.
[Figure: vertex groups V_1, V_2, V_3; one panel is labeled $x_S = 1/4$ and the other $x_S = 1$.]
Randomized rounding
• C = ∅   % C is a collection of picked sets.
• while (C does not satisfy condition A)
  – Find C in the manner explained later. This C satisfies condition A with probability more than 1/2.
• end-while
• [Condition A]
  – C is a solution of Set Cover.
  – The cost of C is at most $OPT_{LP} \cdot 4c \log n$, where c is some constant.
• The expected number T of iterations of the while loop is at most 2.
How to find C (1/2)
• Compute a solution $x_S$ of the LP-relaxation.
• for i = 1 to $c \log n$
  – Construct a family $C_i$ of picked sets by choosing each set S independently with probability $x_S$.
• end-for
• $C = \bigcup_i C_i$.
[Figure: a solution of the LP-relaxation (values up to 1.0) and four sampled families, C_1 = {S_1, S_2}, C_2 = {S_1, S_2, S_4}, C_3 = {S_1, S_2, S_5}, C_4 = {S_1, S_5}, whose union is C = {S_1, S_2, S_4, S_5}.]
How to find C (2/2)
• Compute a solution $x_S$ of the LP-relaxation.
• for i = 1 to $c \log n$
  – Construct a family $C_i$ of picked sets by choosing S with probability $x_S$.
• end-for
• $C = \bigcup_i C_i$.
  – C is not a set cover with probability less than 1/4.
  – The cost of C is more than $OPT_{LP} \cdot 4c \log n$ with probability less than 1/4.
  – Hence C satisfies condition A with probability more than 1/2.
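
The sampling procedure and the surrounding while loop can be sketched together as below. Several things are assumptions of this sketch rather than part of the slides: the instance and the fractional solution `X` are hypothetical (hand-picked and feasible, not produced by an LP solver), `log` is taken to be the natural logarithm, the LP cost of `X` stands in for $OPT_{LP}$, and a `max_tries` guard is added so that the loop cannot run forever.

```python
import math
import random

# Hypothetical instance and feasible fractional solution (assumed values).
UNIVERSE = {1, 2, 3, 4, 5, 6}
SETS = {
    "S1": {1, 2},
    "S2": {3, 4},
    "S3": {5, 6},
    "S4": {1, 3, 5},
    "S5": {2, 4, 6},
}
COSTS = {"S1": 2, "S2": 2, "S3": 2, "S4": 1, "S5": 4}
X = {"S1": 0.75, "S2": 0.75, "S3": 0.75, "S4": 0.25, "S5": 0.25}


def is_cover(universe, sets, chosen):
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return universe <= covered


def sample_union(sets, x, rounds, rng):
    """Union of `rounds` families C_i; each set S joins C_i with prob. x[S]."""
    picked = set()
    for _ in range(rounds):
        for name in sets:
            if rng.random() < x[name]:
                picked.add(name)
    return picked


def randomized_rounding(universe, sets, costs, x, c=5, seed=0, max_tries=1000):
    """Repeat the sampling until condition A holds (cover and bounded cost)."""
    n = len(universe)
    rounds = math.ceil(c * math.log(n))            # c log n rounds
    lp_cost = sum(costs[name] * x[name] for name in sets)
    budget = 4 * c * math.log(n) * lp_cost         # 4c log n times the LP cost
    rng = random.Random(seed)
    for _ in range(max_tries):
        chosen = sample_union(sets, x, rounds, rng)
        cost = sum(costs[name] for name in chosen)
        if is_cover(universe, sets, chosen) and cost <= budget:
            return chosen
    raise RuntimeError("condition A was not reached within max_tries")


chosen = randomized_rounding(UNIVERSE, SETS, COSTS, X)
```

Because the function only returns once condition A has been checked, its result is always a cover within the cost budget; the randomness affects only how many iterations are needed.
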
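Prob. that element a is not covered by $C_i$
• Consider an element $a_1$ contained in the sets $S_1$, $S_2$, $S_3$.
• The probability P that no set $S_i$ containing element $a_1$ is picked is $P = (1 - x_{S_1})(1 - x_{S_2})(1 - x_{S_3})$.
  – $x_{S_1} + x_{S_2} + x_{S_3} \ge 1$ from the constraints for covers.
• P is maximized where $x_{S_1} = x_{S_2} = x_{S_3} = 1/3$.
[Figure: element a_1 and the sets S_1, S_2, S_3, S_5.]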
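Maximum prob. a is not chosen
• Suppose an element is in each of k sets, which are picked with probabilities $P_1, \dots, P_k$. Let $d = \sum_{i=1}^{k} P_i$ and $g = \prod_{i=1}^{k} (1 - P_i)$, the probability that the element is not covered.
• Fix d, and replace $P_k$ by $P_k = d - \sum_{i=1}^{k-1} P_i$.
• To simplify the problem, maximize $\log g = \sum_{i=1}^{k} \log(1 - P_i)$ instead of g.
• Then, the partial derivative of $\log g$ becomes $\frac{\partial \log g}{\partial P_i} = -\frac{1}{1 - P_i} + \frac{1}{1 - P_k}$ for $i < k$.
  – It is positive for $P_i < P_k$, zero at $P_i = P_k$, and negative for $P_i > P_k$. This shows that $P_i = P_k$ maximizes $\log g$.
  – This property holds for any i, so $P_i = d/k$ for all i and $g = (1 - d/k)^k$.
• Under the constraint that $d \ge 1$, g takes its maximum $(1 - 1/k)^k$ where d = 1.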
Prob. C is not a set cover
• The probability that a is not covered by $C_i$ is at most $(1 - 1/k)^k \le 1/e$.
• The probability that a is not covered by C is at most $(1/e)^{c \log n}$.
  – Choose the constant c such that $(1/e)^{c \log n} \le 1/(4n)$.
  – It suffices that $c \ge 5 \ge (4/\log n) + 1$ (for $n \ge 3$).
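
The bounds above can be checked numerically. The following sketch assumes the natural logarithm and uses hypothetical helper names; it compares the equal split 1/3, 1/3, 1/3 with an unequal split of the same total, and then tests the two inequalities for a range of k and n.

```python
import math


def miss_probability(probabilities):
    """Probability that none of the sets containing an element is picked
    in one round, given the pick probability of each such set."""
    result = 1.0
    for p in probabilities:
        result *= 1.0 - p
    return result


def uncovered_bound(n, c):
    """The bound (1/e)^(c ln n) on the probability that a fixed element
    is still uncovered after c ln n rounds."""
    return math.exp(-c * math.log(n))


# Equal split versus an unequal split with the same sum d = 1.
equal_split = miss_probability([1 / 3, 1 / 3, 1 / 3])    # (2/3)^3
unequal_split = miss_probability([0.5, 0.5, 0.0])        # 1/4

# (1 - 1/k)^k stays below 1/e for every k that is tried.
below_one_over_e = all((1 - 1 / k) ** k < 1 / math.e for k in range(1, 200))

# With c = 5 the bound is at most 1/(4n) for every n >= 3 that is tried.
c_is_large_enough = all(
    uncovered_bound(n, 5) <= 1 / (4 * n) for n in range(3, 1000)
)
```

With the natural logarithm the bound equals $n^{-c}$, so c = 5 leaves a wide margin; the slide's condition is sufficient rather than tight.
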
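Prob. C is not a set cover
• Each element is not covered by C with probability less than $1/(4n)$.
• By the union bound over the n elements, the probability that at least one element is not covered by C is at most $n \cdot \frac{1}{4n} = 1/4$.
[Figure, n = 3: the events "a_1 is not chosen", "a_2 is not chosen", and "a_3 is not chosen" each have probability less than 1/(4n), so at least one of a_1, a_2, a_3 is not chosen with probability less than 1/4.]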
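The cost of C
• In each round, set S is picked with probability $x_S$, so $E[cost(C_i)] = \sum_S c(S)\, x_S = OPT_{LP}$ when x is an optimal solution of the LP-relaxation.
• Since C is the union of the $C_i$ over $c \log n$ rounds, $E[cost(C)] \le \sum_i E[cost(C_i)] = OPT_{LP} \cdot c \log n$.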
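Markov’s inequality
• Let the random variable X take non-negative values, and let μ be the average of X. Then, for any ε > 0, $P(X \ge \varepsilon) \le \mu / \varepsilon$.
• Proof: $\mu = \sum_{x \ge 0} x\, P(X = x) \ge \sum_{x \ge \varepsilon} x\, P(X = x) \ge \varepsilon \sum_{x \ge \varepsilon} P(X = x) = \varepsilon\, P(X \ge \varepsilon)$.
[Figure: the distribution P(X = x) plotted against x, with the point x = ε and the ranges x ≥ 0 and x ≥ ε marked.]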
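
Markov's inequality can be checked exactly on a small discrete distribution. The distribution below is hypothetical and chosen only for illustration; exact fractions are used so that the comparison involves no rounding error.

```python
from fractions import Fraction

# Hypothetical distribution of a non-negative random variable X:
# value -> probability (the probabilities sum to 1).
DIST = {
    0: Fraction(1, 2),
    1: Fraction(1, 4),
    4: Fraction(1, 4),
}


def mean(dist):
    """The average of X."""
    return sum(value * prob for value, prob in dist.items())


def tail(dist, eps):
    """P(X >= eps)."""
    return sum(prob for value, prob in dist.items() if value >= eps)


def markov_holds(dist, eps):
    """Check P(X >= eps) <= mean / eps for one threshold eps > 0."""
    return tail(dist, eps) <= mean(dist) / eps


checks = {eps: markov_holds(DIST, eps) for eps in (1, 2, 4, 5)}
```

For this distribution the average is 5/4, and for ε = 4 the tail probability 1/4 is below the bound 5/16.
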
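The value of cost(C)
• Apply Markov’s inequality to cost(C), whose expectation is at most $OPT_{LP} \cdot c \log n$:
  $P(cost(C) \ge OPT_{LP} \cdot 4c \log n) \le \frac{OPT_{LP} \cdot c \log n}{OPT_{LP} \cdot 4c \log n} = \frac{1}{4}$.
• Hence the probability that the cost of C becomes more than $OPT_{LP} \cdot 4c \log n$ is at most 1/4.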
• Each of the following two events happens with probability less than 1/4:
  – C is not a set cover;
  – the cost of C is more than $OPT_{LP} \cdot 4c \log n$.
• Therefore, the probability that C is a set cover and its cost is at most $OPT_{LP} \cdot 4c \log n$ is at least 1/2.
Weighted vertex cover
• Definition
  – Input: a graph G = (V, E) with weights on the vertices.
  – Output: a vertex set $A \subseteq V$ such that
    – for any edge $(u, v) \in E$, $u \in A$ or $v \in A$;
    – the sum of the weights of the vertices $v \in A$ is minimized.
[Figure: two copies of a weighted graph on the vertices v_1, ..., v_6.]
Formulation by linear inequalities
• Variables
  – $x_v \in \{0, 1\}$ for each $v \in V$; $x_v = 1$ means that v is chosen. The weight of v is written $w(v)$.
• Objective function
  – Minimize: $\sum_{v \in V} w(v)\, x_v$.
• Constraints
  – For covers: $x_u + x_v \ge 1$ for every edge $(u, v) \in E$.
  – For choosing vertices: $x_v \in \{0, 1\}$ for every $v \in V$.
LP-relaxation
• Objective function
  – Minimize: $\sum_{v \in V} w(v)\, x_v$, where $w(v)$ is the weight of vertex v.
• Constraints
  – For covers: $x_u + x_v \ge 1$ for every edge $(u, v) \in E$.
  – For choosing vertices: $x_v \ge 0$ for every $v \in V$.
Extreme point solution
• A feasible solution that cannot be expressed as a convex combination of two other feasible solutions.
• The optimal solution of the linear program can be taken to be an extreme point solution.
  – Convex combination: a linear combination whose coefficients are non-negative and sum to 1. For example, z is a convex combination of x and y where z = 0.8x + 0.2y.
[Figure: the region of feasible solutions and a point obtained as a convex combination of feasible solutions.]
Half-integral solution
• A solution of the linear program in which each variable takes the value 0, 1/2, or 1.
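2-approximation algorithm
• Compute an optimal extreme point solution x of the LP-relaxation.
• Choose every vertex whose corresponding value is 1/2 or 1.
  – If x is an extreme point solution, each variable takes the value 0, 1/2, or 1 (Lemma 14.4).
  – The chosen vertices form a cover: every edge has $x_u + x_v \ge 1$, so at least one endpoint has value 1/2 or 1.
  – Rounding at most doubles each value, so the weight of the chosen vertices is at most $2 \cdot OPT_{LP} \le 2 \cdot OPT$.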
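
The rounding step of this algorithm can be sketched as follows. The graph, the unit weights, and the half-integral solution `X` are hypothetical; `X` is written down by hand rather than computed by an LP solver, and it is assumed to play the role of the optimal extreme point solution.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical instance: a triangle v1, v2, v3 with a pendant vertex v4.
EDGES = [("v1", "v2"), ("v2", "v3"), ("v1", "v3"), ("v3", "v4")]
WEIGHTS = {"v1": 1, "v2": 1, "v3": 1, "v4": 1}
# Assumed half-integral LP solution (every edge constraint is tight).
X = {"v1": HALF, "v2": HALF, "v3": HALF, "v4": HALF}


def is_feasible(edges, x):
    """Check x_v >= 0 for all v and x_u + x_v >= 1 for every edge."""
    non_negative = all(value >= 0 for value in x.values())
    edges_covered = all(x[u] + x[v] >= 1 for u, v in edges)
    return non_negative and edges_covered


def round_half_integral(x):
    """Choose every vertex whose value is 1/2 or 1."""
    return {v for v, value in x.items() if value >= HALF}


def is_vertex_cover(edges, chosen):
    return all(u in chosen or v in chosen for u, v in edges)


cover = round_half_integral(X)
lp_cost = sum(WEIGHTS[v] * X[v] for v in X)
cover_weight = sum(WEIGHTS[v] for v in cover)
```

On this instance the LP cost is 2 and the rounded cover has weight 4, which meets the factor-2 bound with equality.
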
Lemma 14.4
• x: a feasible solution of the LP-relaxation of weighted vertex cover.
• If x is not half-integral, x can be expressed as a convex combination of two other feasible solutions.
  – That is, x is not an extreme point solution.
  – Outline of the proof
    – Given x that is not half-integral, construct y and z such that $x = \frac{1}{2}(y + z)$, so x is a convex combination of y and z.
    – Show that y and z are feasible solutions.
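Proof of Lemma 14.4 (1/3)
• Construct other solutions y and z from x. Variables whose value is 0, 1/2, or 1 are left unchanged.
  – $V_+ = \{v_i : 1/2 < x_{v_i} < 1\}$: +ε in y, -ε in z, i.e. $y_v = x_v + \varepsilon$ and $z_v = x_v - \varepsilon$.
  – $V_- = \{v_i : 0 < x_{v_i} < 1/2\}$: -ε in y, +ε in z, i.e. $y_v = x_v - \varepsilon$ and $z_v = x_v + \varepsilon$.
  – Since x is not half-integral, $V_+ \cup V_-$ is not empty, so y and z differ from x, and $x = \frac{1}{2}(y + z)$ holds.
[Figure: an example with $V_+ = \{v_3\}$ and $V_- = \{v_4\}$, showing x, y, and z relative to the value 0.5.]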
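Proof of Lemma 14.4 (2/3)
• Are y and z feasible solutions?
  – In any feasible solution, $x_u + x_v \ge 1$ holds for every edge (u, v).
  – If $x_u + x_v > 1$, the sums $y_u + y_v$ and $z_u + z_v$ differ from $x_u + x_v$ by at most 2ε, so they remain at least 1 for a sufficiently small ε.
  – Hence, where ε is set to a small value, these constraints hold for y and z, and all values stay non-negative.
[Figure, three panels in the $(x_u, x_v)$ plane with axis marks at 1/2 and 1: "Feasible solution", "Change from x to y", and "Change from x to z".]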
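Proof of Lemma 14.4 (3/3)
• When $x_u + x_v = 1$, there are three cases:
  – $x_u = x_v = 1/2$: $y_u = y_v = z_u = z_v = 1/2$ (no change).
  – $x_u = 0$, $x_v = 1$: $y_u = 0$, $y_v = 1$, $z_u = 0$, $z_v = 1$ (no change).
  – $x_u > 1/2 > x_v$: $y_u + y_v = (x_u + \varepsilon) + (x_v - \varepsilon) = 1$ and $z_u + z_v = (x_u - \varepsilon) + (x_v + \varepsilon) = 1$.
• Then, y and z are feasible solutions, so any solution that is not half-integral is a convex combination of two other feasible solutions. Hence every extreme point solution is half-integral.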
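
The construction of y and z in the proof can be sketched as below. The path graph, the solution `X`, and the value of ε are hypothetical; ε is passed in by the caller and is assumed to be small enough, which this sketch verifies only for the given example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Hypothetical instance: a path v1 - v2 - v3 - v4 with a feasible solution
# that is not half-integral (v3 lies in V+, v4 lies in V-).
EDGES = [("v1", "v2"), ("v2", "v3"), ("v3", "v4")]
X = {"v1": HALF, "v2": HALF, "v3": Fraction(3, 4), "v4": Fraction(1, 4)}


def is_feasible(edges, x):
    """Check x_v >= 0 for all v and x_u + x_v >= 1 for every edge."""
    non_negative = all(value >= 0 for value in x.values())
    edges_covered = all(x[u] + x[v] >= 1 for u, v in edges)
    return non_negative and edges_covered


def perturb(x, eps):
    """Build y and z: shift V+ by +eps / -eps and V- by -eps / +eps.
    Values equal to 0, 1/2, or 1 are left unchanged."""
    y, z = dict(x), dict(x)
    for v, value in x.items():
        if HALF < value < 1:        # v in V+
            y[v], z[v] = value + eps, value - eps
        elif 0 < value < HALF:      # v in V-
            y[v], z[v] = value - eps, value + eps
    return y, z


Y, Z = perturb(X, Fraction(1, 8))
```

Here the tight edge (v3, v4) stays tight in both y and z, because one endpoint moves up by ε while the other moves down by ε, exactly as in the third case of the proof.
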