Published by Jordan Norland. Modified over 10 years ago.
1
A threshold of ln(n) for approximating set cover By Uriel Feige Lecturer: Ariel Procaccia
2
Set Cover
We are given an (unweighted!) collection F of subsets of U={1,…,n}. We must find as few subsets from F as possible such that their union covers U.
Hypergraph vertex cover (lecture of 10/3) is a special case of set cover.
3
Approximation algorithms
A greedy algorithm (recall algorithms 1…): at each stage, add to the cover the subset that contains the largest number of still-uncovered elements. This algorithm gives an approximation ratio of ln(n) - ln ln(n) + O(1).
An approximation ratio of ln(n) can also be achieved using a linear programming relaxation.
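The greedy rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `greedy_set_cover`, the tie-breaking rule (lowest index wins) and the small instance `U`, `F` are assumptions made for the example.

```python
def greedy_set_cover(universe, subsets):
    """Return the indices of the chosen subsets, in the order greedy picks them."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset that covers the most still-uncovered elements
        # (assumption: ties go to the lowest index).
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the collection F does not cover U")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Hypothetical toy instance.
U = range(1, 7)
F = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
cover = greedy_set_cover(U, F)
print(cover)
```

On this instance greedy first takes the four-element set and then `{5, 6}`, which happens to be optimal; the next slide shows that it is not always so lucky.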
4
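A bad example for the greedy algorithm
U = {(x,y): 1 ≤ x,y ≤ n}
S_1 = {(x,y): x ≤ n/2}, S_2 = {(x,y): x ≥ n/2}
T_1 = {(x,y): 1 ≤ y ≤ n/2}, T_2 = {(x,y): n/2 ≤ y ≤ 3n/4}, …
[Figure: the grid U split into the strips T_1, T_2, …, labeled 32, 16, 8, 8.]
Ratio: log(n)/2 (greedy may pick the log(n) sets T_1, T_2, …, while S_1 and S_2 alone cover U).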
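The bad example can be reproduced with a short experiment. The sketch below makes some assumptions that are not on the slide: n is a power of two, the strips T_1, T_2, … are made disjoint (heights n/2, n/4, …, 1, 1), and ties are broken in favor of the strips by listing them first. The names `bad_instance` and `greedy_cover_size` are hypothetical.

```python
def greedy_cover_size(universe, subsets):
    """Number of subsets greedy picks; ties go to the earliest subset in the list."""
    uncovered = set(universe)
    picks = 0
    while uncovered:
        best = max(subsets, key=lambda s: len(s & uncovered))
        uncovered -= best
        picks += 1
    return picks


def bad_instance(n):
    """Grid instance for n a power of two: halving row strips plus two column halves."""
    points = {(x, y) for x in range(1, n + 1) for y in range(1, n + 1)}
    s1 = {(x, y) for (x, y) in points if x <= n // 2}
    s2 = points - s1
    strips = []
    low, height = 0, n // 2
    while low < n:
        strips.append({(x, y) for (x, y) in points if low < y <= low + height})
        low += height
        height = max(1, height // 2)
    # Strips come first so that greedy prefers them on ties.
    return points, strips + [s1, s2], [s1, s2]


points, F, optimum = bad_instance(8)
print([len(s) for s in F[:-2]])      # sizes of the strips
print(greedy_cover_size(points, F))  # number of sets greedy uses
print(len(optimum))                  # the two column halves suffice
```

For n=8 the strips contain 32, 16, 8 and 8 points, greedy uses all four of them, and the optimum is 2. In general greedy uses log2(n)+1 strips under these assumptions.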
5
Overview (1)
We wish to prove Theorem 8: If a polynomial time algorithm can approximate set cover within (1-ε)ln(n), then NP is a subset of DTIME(n^{O(log log n)}).
The strategy: a reduction from a k-prover proof system for an NP-complete problem.
Familiar with IP?
6
Overview (2) Parts of the lecture: MAX 3SAT-5. The k-prover proof system. Partition systems. The reduction to set cover. Thundering applause. May the force be with us. Similar to the lecture of 17/3.
7
MAX 3SAT-5
Input: a CNF formula with n variables and 5n/3 clauses, in which every clause contains exactly three literals, every variable appears in exactly 5 clauses, and a variable does not appear in a clause more than once.
Theorem 1: For some ε>0, it is NP-hard to distinguish between 3CNF-5 formulas where OPT=1 and those where OPT≤(1-ε).
Reduction from MAX-3SAT-B.
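The syntactic conditions on a 3SAT-5 input are easy to check mechanically. The following sketch assumes a DIMACS-like encoding (literal +v for variable v, -v for its negation); the function names and the tiny three-variable formula are illustrative assumptions, and `opt` is a brute-force search that only makes sense for very small n.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def is_3sat5_instance(n, clauses):
    """Check: 5n/3 clauses, 3 distinct variables per clause, 5 occurrences per variable."""
    if 3 * len(clauses) != 5 * n:
        return False
    occurrences = Counter()
    for clause in clauses:
        variables = [abs(lit) for lit in clause]
        if len(clause) != 3 or len(set(variables)) != 3:
            return False
        if any(v < 1 or v > n for v in variables):
            return False
        occurrences.update(variables)
    return all(occurrences[v] == 5 for v in range(1, n + 1))


def opt(n, clauses):
    """Largest fraction of clauses satisfied by any assignment (brute force)."""
    best = 0
    for bits in product([False, True], repeat=n):
        satisfied = sum(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses
        )
        best = max(best, satisfied)
    return Fraction(best, len(clauses))


# Hypothetical instance: n = 3 variables, 5n/3 = 5 clauses.
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3), (-1, -2, -3)]
print(is_3sat5_instance(3, clauses), opt(3, clauses))
```

The example formula is satisfiable (for instance by x1 = x2 = true, x3 = false), so it is a YES instance with OPT=1.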
8
Two-prover proof system for 3SAT-5
Presented to provide intuition and to help in the analysis of the k-prover system.
A one-round, two-prover system for 3SAT-5.
Protocol: V selects the index of a clause at random and sends it to the first prover, then selects a random variable in that clause and sends it to the second prover. The first prover returns 3 bits (an assignment to the clause's variables); the second returns 1 bit (an assignment to the variable).
V accepts if the following conditions hold:
Clause check: the assignment sent by the first prover satisfies the clause.
Consistency check: the assignment sent by the second prover is identical to the assignment for the same variable sent by the first prover.
9
Proposition 2: If OPT=1-ε, then under the optimal strategy of the provers, V accepts with probability (1-ε/3).
Proof: The strategy of the second prover defines an assignment A to the variables. If V selects a clause that is not satisfied by A (prob. ≥ ε), the first prover must set at least one of its variables differently from A in order to pass the clause check, so the consistency check fails with prob. ≥ 1/3. Hence V rejects with prob. ≥ ε/3. Conversely, there is a strategy with acceptance prob. 1-ε/3. ■
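The acceptance probability of one natural strategy can be computed exactly by enumerating the verifier's choices. In the sketch below, which is an illustration under stated assumptions rather than part of the lecture, the second prover answers according to a fixed assignment, and the first prover does the same except that on an unsatisfied clause it flips the first variable of the clause. Literals are encoded as +v / -v, and the formula is a hypothetical five-clause example.

```python
from fractions import Fraction


def acceptance_probability(clauses, assignment):
    """Exact probability that V accepts, over a uniform clause and a uniform variable in it."""
    accepted = 0
    total = 0
    for clause in clauses:
        # First prover: values for the three variables of the clause.
        answer = {abs(lit): assignment[abs(lit)] for lit in clause}
        if not any(answer[abs(lit)] == (lit > 0) for lit in clause):
            # Unsatisfied clause: flip one variable so that the clause check passes.
            v = abs(clause[0])
            answer[v] = not answer[v]
        for lit in clause:  # V picks one of the three variables uniformly
            total += 1
            clause_ok = any(answer[abs(l)] == (l > 0) for l in clause)
            consistent = answer[abs(lit)] == assignment[abs(lit)]
            accepted += clause_ok and consistent
    return Fraction(accepted, total)


clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3), (-1, -2, -3)]
all_false = {1: False, 2: False, 3: False}   # falsifies only the first clause
satisfying = {1: True, 2: True, 3: False}    # satisfies every clause
print(acceptance_probability(clauses, all_false))
print(acceptance_probability(clauses, satisfying))
```

With the all-false assignment one clause out of five is unsatisfied, and the strategy is accepted with probability 1 - (1/5)/3 = 14/15, matching the 1-ε/3 pattern for that assignment.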
10
Parallel repetition
We would like to lower the error.
Modified proof system: V sends l clauses to the first prover, chooses a variable from each, and sends these l variables to the second prover.
Theorem 3 (Ran Raz): If a one-round two-prover system is repeated l times independently in parallel, then the error is at most 2^{-cl}, where c>0 is a constant that depends only on the original proof system.
The error of the modified two-prover system is ≤ 2^{-cl}, for some universal constant c.
C for subtle
11
The k-prover proof system
Binary code with k codewords, each of length l and weight l/2, with pairwise Hamming distance at least l/3. Each prover is associated with a codeword.
The protocol: The verifier selects l clauses C_1,…,C_l, then selects a variable from each clause: the distinguished variables x_1,…,x_l. Prover P_i receives C_j for those coordinates j in which its codeword is 1, and x_j for the coordinates in which it is 0, and replies with 2l bits (3 bits for each of its l/2 clauses and 1 bit for each of its l/2 variables). The answer of the prover induces an assignment to the distinguished variables.
Acceptance predicate:
Weak: at least one pair of provers is consistent.
Strong: every pair of provers is consistent.
[Figure: example with l=4 and codewords P_1: 0011, P_2: 0101, P_3: 1100; V sends the question c_1, c_2, v_3, v_4 (matching codeword 1100) and receives the answer 100, 010, 1, 0.]
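How a codeword determines a prover's question can be sketched as follows, using the three codewords from the slide's example. The helper names (`is_valid_code`, `questions_for`, `answer_bits`) and the string labels for clauses and variables are assumptions made for illustration.

```python
from itertools import combinations


def is_valid_code(codewords):
    """Check length l, weight l/2, and pairwise Hamming distance at least l/3."""
    l = len(codewords[0])
    balanced = all(len(w) == l and 2 * w.count("1") == l for w in codewords)
    far_apart = all(
        3 * sum(a != b for a, b in zip(u, v)) >= l
        for u, v in combinations(codewords, 2)
    )
    return balanced and far_apart


def questions_for(codeword, clauses, variables):
    """Coordinate j carries clause C_j if the codeword bit is 1, else variable x_j."""
    return [clauses[j] if bit == "1" else variables[j] for j, bit in enumerate(codeword)]


def answer_bits(codeword):
    """3 bits per clause received plus 1 bit per variable received."""
    return 3 * codeword.count("1") + codeword.count("0")


code = ["0011", "0101", "1100"]          # P_1, P_2, P_3 from the slide, l = 4
clauses = ["C1", "C2", "C3", "C4"]
variables = ["x1", "x2", "x3", "x4"]
print(is_valid_code(code))
for word in code:
    print(word, questions_for(word, clauses, variables), answer_bits(word))
```

For the codeword 1100 this yields the question C1, C2, x3, x4, and every prover answers with 2l = 8 bits.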
12
Lemma 4: If OPT=1, then the provers have a strategy that causes V to always strongly accept. If OPT≤(1-ε), then the verifier weakly accepts with probability at most k^2 · 2^{-cl}, where c>0 is a constant that depends only on ε.
Proof: If OPT=1, the provers can base their answers on a satisfying assignment.
Assume OPT≤(1-ε), and that V weakly accepts with prob. δ. Then there is a pair P_i, P_j that V finds consistent with probability ≥ δ/k^2 (there are fewer than k^2 pairs). Since the codewords all have weight l/2 and are at Hamming distance ≥ l/3, there are ≥ l/6 coordinates on which P_i receives a clause and P_j a variable in this clause. The other coordinates are given to the provers for free. Hence the provers have a strategy that succeeds with prob. ≥ δ/k^2 on l/6 parallel repetitions of the original proof system, so δ/k^2 < 2^{-cl}. ■
13
Partition systems
A partition system B(m,L,k,d) has the following properties.
1. There is a ground set B of m distinct points.
2. There is a collection of L distinct partitions p_1,…,p_L.
3. For 1≤i≤L, partition p_i is a collection of k disjoint subsets of B whose union is B.
4. Any cover of the m points by subsets that appear in pairwise different partitions requires at least d subsets.
Lemma 6: For every c≥0 and m sufficiently large, there is a partition system B(m,L,k,d) whose parameters satisfy the following:
1. L ≈ (log(m))^c.
2. k can be chosen arbitrarily as long as k < ln(m)/(3 ln ln(m)).
3. d = (1-f(k))k∙ln(m), where f(k)→0 as k→∞.
Proof: a randomized construction usually works.
Looks familiar?
[Figure: a partition system with L=4 and k=4.]
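A toy version of the randomized construction, together with a brute-force computation of d, might look like the sketch below. It assumes the simplest random model (each point joins a uniformly random subset in every partition, independently) and tiny parameters; the function names are hypothetical and nothing here reproduces the quantitative bounds of Lemma 6.

```python
import random
from itertools import product


def random_partition_system(m, L, k, seed=0):
    """L random partitions of the points 0..m-1 into k subsets each."""
    rng = random.Random(seed)
    system = []
    for _ in range(L):
        parts = [set() for _ in range(k)]
        for point in range(m):
            parts[rng.randrange(k)].add(point)
        system.append(parts)
    return system


def min_cross_cover(system, m):
    """Fewest subsets, at most one per partition, that cover all m points (None if impossible)."""
    k = len(system[0])
    best = None
    # choice[j] == k means "take nothing from partition j".
    for choice in product(range(k + 1), repeat=len(system)):
        picked = [system[j][i] for j, i in enumerate(choice) if i < k]
        if len(set().union(*picked)) == m and (best is None or len(picked) < best):
            best = len(picked)
    return best


system = random_partition_system(m=12, L=5, k=3, seed=1)
d = min_cross_cover(system, 12)
print(d)
```

Taking all k subsets of a single partition always covers B; property 4 says that mixing subsets from different partitions is much more expensive, and `min_cross_cover` measures exactly that cost on a small example.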
14
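The reduction
R=(5n)^l: the number of random strings r of the verifier.
Each r ↔ a partition system B_r(m,L,k,d) with m=n^{Θ(l)}, L=2^l, d=(1-f(k))k∙ln(m).
Partition ↔ assignment to the distinguished variables; subset ↔ prover.
B(r,j,i) = the i'th subset of partition j in partition system r.
Notation: (q,i)@r means that on random string r, prover P_i receives question q.
The subsets of the set cover instance are S(q,a,i): for every r s.t. (q,i)@r, extract from a an assignment a_r to the distinguished variables. S(q,a,i) is the union of the subsets B(r,a_r,i) over all r s.t. (q,i)@r.
Q=n^{l/2}·(5n/3)^{l/2}: the number of possible questions to P_i.
[Figure: partition systems for r=1, r=2, …, r=R, each with m points and L=2^l partitions (labeled 00, 01, 10, 11) of k subsets; B(2,4,3) and S(q,a,1) are marked.]
|U|?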
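The counting parameters of the reduction can be tabulated for small inputs. The sketch below simply evaluates the formulas for R, Q and L; the function name and the requirement that n be divisible by 3 and l be even (so that all quantities are integers) are assumptions of the example. The ratio R/Q = 15^{l/2} follows from the two formulas and is the number of random strings on which a fixed prover receives a fixed question.

```python
def reduction_parameters(n, l):
    """Evaluate R = (5n)^l, Q = n^(l/2) * (5n/3)^(l/2) and L = 2^l."""
    if n % 3 or l % 2:
        raise ValueError("this sketch assumes n divisible by 3 and l even")
    R = (5 * n) ** l
    Q = n ** (l // 2) * (5 * n // 3) ** (l // 2)
    return {"R": R, "Q": Q, "L": 2 ** l, "strings_per_question": R // Q}


print(reduction_parameters(3, 2))
print(reduction_parameters(6, 4))
```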
15
Lemma 5:
1. Completeness: OPT=1 ⇒ there is a set cover of size kQ.
2. Soundness: OPT≤(1-ε) ⇒ more than (1-2f(k))kQ∙ln(m) subsets are required.
Proof of completeness: If OPT=1, the provers answer consistently with the satisfying assignment. For any r, consider S(q_1,a_1,1),…,S(q_k,a_k,k) s.t. (q_i,i)@r and a_i is the appropriate answer. All of these answers induce the same a_r, so the sets contain the k subsets B(r,a_r,1),…,B(r,a_r,k) of a single partition, and B_r(m,L,k,d) is covered by these k sets. Similarly for every r. The number of subsets is kQ (one per question to each prover). ■
[Figure: example with a_r=01.]
16
Proof of soundness: the saga begins
Assume OPT≤1-ε, and that there exists a collection C that covers U such that |C|=(1-δ)kQ∙ln(m), δ=2f(k).
Question q to prover P_i ↔ weight w(q,i) = number of answers a s.t. S(q,a,i) is in C. Σ_{q,i} w(q,i) = |C|.
Random string r ↔ weight w(r) = Σ_{(q,i)@r} w(q,i). This weight is the number of subsets that participate in covering B_r(m,L,k,d).
Call r good if w(r)<(1-δ/2)k∙ln(m).
17
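The good, the bad, and the random
Proposition 6: The fraction of good r is at least δ/2.
Proof: Assume otherwise. Then more than (1-δ/2)R of the r are not good, so
Σ_r w(r) > (1-δ/2)R·(1-δ/2)k∙ln(m) > (1-δ)kR∙ln(m).
On the other hand, every pair (q,i) satisfies (q,i)@r for exactly R/Q values of r, so
Σ_r w(r) = (R/Q)·Σ_{q,i} w(q,i) = (R/Q)·|C|.
Hence |C|>(1-δ)kQ∙ln(m), a contradiction. ■
Reminder: r is good if w(r) = Σ_{(q,i)@r} w(q,i) < (1-δ/2)k∙ln(m).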
18
Proposition 7: C covers U and |C|=(1-δ)kQ∙ln(m) ⇒ for some strategy of the k provers, V weakly accepts with prob. ≥ 2δ/(k∙ln(m))^2.
This proves soundness, since 2δ/(k∙ln(m))^2 > k^2 · 2^{-cl} for l=Θ(log log n).
Proof of Proposition 7:
Randomized strategy: on question q to P_i, select an answer a uniformly at random from the set of a s.t. S(q,a,i) is in C.
For a fixed r: sets B(r,p,i) in the cover of B_r(m,L,k,d) ↔ sets S(q,a,i) in C.
For a good r: fewer than d subsets cover B_r(m,L,k,d), so C uses two subsets from the same partition p. Denote them B(r,p,i) and B(r,p,j) ↔ S(q_i,a_i,i) and S(q_j,a_j,j).
Denote: a is in A_{r,i} iff S(q_i,a,i) is in C.
r is good ⇒ |A_{r,i}|+|A_{r,j}| < k∙ln(m).
The prob. that the provers answer a_i and a_j is 1/(|A_{r,i}|·|A_{r,j}|) ≥ 4/(k∙ln(m))^2.
The answers are consistent with p ⇒ V weakly accepts.
The prob. that V chooses a good r is ≥ δ/2. ■
(The strategy can be made deterministic by fixing the coin tosses of the provers.)
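The step from |A_{r,i}|+|A_{r,j}| < k∙ln(m) to the bound 4/(k∙ln(m))^2 is the arithmetic-geometric mean inequality: a·b ≤ ((a+b)/2)^2. The following sanity check is only an illustration; it uses an integer threshold T as a stand-in for k∙ln(m), and the function name is hypothetical.

```python
from fractions import Fraction


def worst_pair_probability(T):
    """Minimum of 1/(a*b) over positive integers a, b with a + b < T (needs T >= 3)."""
    return min(
        Fraction(1, a * b)
        for a in range(1, T)
        for b in range(1, T)
        if a + b < T
    )


for T in (3, 10, 25):
    print(T, worst_pair_probability(T), Fraction(4, T * T))
```

In every case the worst-case probability of hitting the pair (a_i, a_j) stays above 4/T^2, and multiplying by the probability δ/2 of a good r gives the bound 2δ/(k∙ln(m))^2 of the proposition.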
19
Theorem 8: If there is some ε>0 such that a polynomial time algorithm can approximate set cover within (1-ε)ln(n), then NP is a subset of DTIME(n^{O(log log n)}).
Proof: Assume there is a polynomial time algorithm A that approximates set cover within (1-ε)ln(N), where N=mR is the number of points to be covered. Reduce GAP-3SAT-5 to set cover as described, with k s.t. f(k)<ε/4, and m=(5n)^{2l/ε}.
m,R,Q=n^{O(log log n)} ⇒ the time to perform the reduction is n^{O(log log n)}.
ln(m)>(1-ε/2)ln(N).
By Lemma 5, for a YES instance all points can be covered by kQ subsets, and for a NO instance all points cannot be covered by (1-2f(k))kQ∙ln(m) subsets.
The ratio is (1-2f(k))ln(m)>(1-ε)ln(N), so A would distinguish the two cases. ■
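The two inequalities used in the proof follow from the choice of parameters by direct arithmetic. The derivation below is a worked check that uses only m=(5n)^{2l/ε}, R=(5n)^l, N=mR and f(k)<ε/4; it is a sketch of the calculation, not a quotation from the lecture.

```latex
\begin{align*}
N &= mR = (5n)^{2l/\varepsilon}\,(5n)^{l} = (5n)^{l(2+\varepsilon)/\varepsilon},\\
\frac{\ln m}{\ln N} &= \frac{2/\varepsilon}{(2+\varepsilon)/\varepsilon}
  = \frac{2}{2+\varepsilon} > 1-\frac{\varepsilon}{2}
  \quad\text{since } (2+\varepsilon)\Bigl(1-\frac{\varepsilon}{2}\Bigr) = 2-\frac{\varepsilon^{2}}{2} < 2,\\
\bigl(1-2f(k)\bigr)\ln m &> \Bigl(1-\frac{\varepsilon}{2}\Bigr)\Bigl(1-\frac{\varepsilon}{2}\Bigr)\ln N
  = \Bigl(1-\varepsilon+\frac{\varepsilon^{2}}{4}\Bigr)\ln N > (1-\varepsilon)\ln N .
\end{align*}
```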
20
Closing remarks (1)
Max k-cover: Given U and F as before, select k subsets such that their union has maximum cardinality.
The obvious greedy algorithm approximates max k-cover within a ratio of at least 1-1/e ≈ 0.632.
Theorem 8 can be used directly to prove: if max k-cover can be approximated in polynomial time within a ratio of (1-1/e+ε) for some ε>0, then NP is a subset of DTIME(n^{O(log log n)}).
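The "obvious greedy algorithm" for max k-cover repeats the set cover rule for exactly k rounds. A minimal sketch, with a hypothetical function name, lowest-index tie-breaking and a made-up instance:

```python
def greedy_max_k_cover(subsets, k):
    """Pick k subsets greedily; return their indices and the elements they cover."""
    covered = set()
    chosen = []
    for _ in range(min(k, len(subsets))):
        # Choose the subset that adds the most new elements (ties: lowest index).
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] - covered))
        chosen.append(best)
        covered |= subsets[best]
    return chosen, covered


F = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6, 7}]
chosen, covered = greedy_max_k_cover(F, 2)
print(chosen, len(covered))
```

Here two greedy picks cover all seven elements; in general the guarantee is only the 1-1/e fraction of the best possible coverage stated above.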
21
Closing remarks (2)
Refinements: In our hardness of approximation result for set cover, ε is a constant. We may strengthen our assumption so as to improve our result.
ZTIME(t) = the class of languages that have a probabilistic algorithm that runs in expected time t (with zero error).
If for some η>0, NP is not a subset of ZTIME(2^{n^η}), then for some constant c'>0 there is no polynomial time algorithm that approximates set cover within ln(n) - c'(ln ln(n))^2.
22
Questions and cool animations ?