Tight Hardness Results for Some Approximation Problems [Håstad] Adi Akavia Danna Moshkovits S. Safra 33
“Road-Map” of the Presentation
Gap-3-SAT → (expander) → Gap-3-SAT-7 → (parallel repetition lemma) → par[Ψ, k] → (long code) → L
Distributional assignments: Υ(Ψ) = Υ_dist(Ψ)
LLC-Lemma: Υ(L) = ½+δ/2 ⇒ Υ(par[Ψ, k]) > 4εδ²
©Safra,Akavia,Moshkovitz

Maximum Satisfaction
Def: Max-SAT
Instance: A set of variables Y = { Y1, …, Ym } and a set of Boolean functions (local-tests) over Y, Ψ = { ψ1, …, ψl }.
Maximization: We define Υ(Ψ) = the maximum, over all assignments to Y, of the fraction of the ψi that are satisfied.
Structure: Various versions of SAT impose structural properties on Y, on Y's range, and on Ψ.

Max-E3-Lin-2
Def: Max-E3-Lin-2
Instance: a system of linear equations L = { E1, …, En } over Z2, each equation over exactly 3 variables (whose sum is required to equal either 0 or 1).
Problem: Compute Υ(L).

Example
x1+x2+x3=1 (mod 2)
x4+x5+x6=1
x7+x8+x9=1
x1+x4+x7=0
x2+x5+x8=0
x3+x6+x9=0
Assigning x1 = … = x6 = 1 and x7 = x8 = x9 = 0 satisfies all but the third equation.
No assignment can satisfy all the equations: the sum of all left-hand sides equals zero (every variable appears exactly twice), while the right-hand sides sum to 1 (mod 2).
Therefore, Υ(L) = 5/6.

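The value 5/6 in this example can be confirmed by exhaustive search over the 2^9 assignments. The following is a minimal sketch; the names EQUATIONS, satisfied_fraction and max_satisfied_fraction are illustrative and not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Each equation: (indices of its three variables, right-hand side), over Z2.
EQUATIONS = [
    ((1, 2, 3), 1), ((4, 5, 6), 1), ((7, 8, 9), 1),
    ((1, 4, 7), 0), ((2, 5, 8), 0), ((3, 6, 9), 0),
]

def satisfied_fraction(assignment, equations):
    """Fraction of equations satisfied; assignment maps variable index -> bit."""
    good = sum(1 for variables, rhs in equations
               if sum(assignment[v] for v in variables) % 2 == rhs)
    return Fraction(good, len(equations))

def max_satisfied_fraction(equations, num_vars=9):
    """Exhaustive maximum over all 2^num_vars assignments."""
    best = Fraction(0)
    for bits in product((0, 1), repeat=num_vars):
        assignment = dict(enumerate(bits, start=1))
        best = max(best, satisfied_fraction(assignment, equations))
    return best

# The assignment from the slide: x1..x6 = 1, x7..x9 = 0.
slide_assignment = {i: (1 if i <= 6 else 0) for i in range(1, 10)}
print(satisfied_fraction(slide_assignment, EQUATIONS))  # 5/6
print(max_satisfied_fraction(EQUATIONS))                # 5/6
```

Brute force is only feasible because the instance is tiny; the point of the talk is that no efficient method does substantially better than ½ in general.
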
Main Theorem
Thm: gap-Max-E3-Lin-2(1-ε, ½+ε) is NP-hard.
That is, for every constant 0<ε<¼ it is NP-hard to distinguish between the case where a 1-ε fraction of the equations is satisfiable and the case where at most a ½+ε fraction is.
[It is therefore NP-hard to approximate Max-E3-Lin-2 to within factor 2-ε for any constant 0<ε<¼]

This bound is tight
A random assignment satisfies half of the equations in expectation, so some assignment always satisfies at least half of them.
Deciding whether a set of linear equations has a common solution is in P (Gaussian elimination), so the completeness side cannot be raised from 1-ε to 1.

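Both halves of the tightness remark can be checked on the six-equation example from the earlier slide. The sketch below is illustrative only: solvable_over_z2 is a plain Gaussian elimination over Z2 on bitmask rows, and average_satisfied_fraction averages over all assignments, which is the expected performance of a uniformly random one. The function names and the bitmask encoding are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import product

EQUATIONS = [
    ((1, 2, 3), 1), ((4, 5, 6), 1), ((7, 8, 9), 1),
    ((1, 4, 7), 0), ((2, 5, 8), 0), ((3, 6, 9), 0),
]

def solvable_over_z2(equations, num_vars):
    """Gaussian elimination over Z2.
    A row is a bitmask: bit v-1 is the coefficient of x_v, bit num_vars is the rhs."""
    pivots = {}  # leading coefficient bit -> stored row
    for variables, rhs in equations:
        row = rhs << num_vars
        for v in variables:
            row ^= 1 << (v - 1)
        for bit in reversed(range(num_vars)):
            if not (row >> bit) & 1:
                continue
            if bit in pivots:
                row ^= pivots[bit]   # cancel this coefficient
            else:
                pivots[bit] = row    # new pivot row
                break
        else:
            # All coefficients cancelled: the row reads 0 = rhs.
            if (row >> num_vars) & 1:
                return False
    return True

def average_satisfied_fraction(equations, num_vars):
    """Average, over all assignments, of the fraction of satisfied equations."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=num_vars):
        good = sum(1 for variables, rhs in equations
                   if sum(bits[v - 1] for v in variables) % 2 == rhs)
        total += Fraction(good, len(equations))
    return total / 2 ** num_vars

print(solvable_over_z2(EQUATIONS, 9))                       # False
print(solvable_over_z2(EQUATIONS[:2] + EQUATIONS[3:], 9))   # True
print(average_satisfied_fraction(EQUATIONS, 9))             # 1/2
```
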
Proof Outline
Def: 3SAT is SAT where every ψi is a disjunction of 3 literals.
Def: gap-3SAT-7 is gap-3SAT with the additional restriction that every variable appears in exactly 7 local-tests.
Theorem: gap-3SAT-7 is NP-hard.
The proof proceeds with a reduction from gap-3SAT-7(1, 1-ε) for some constant ε>0, known to be NP-hard.
Given such an instance Ψ, the proof shows a poly-time construction of an instance L of Max-E3-Lin-2 with the claimed (1-ε, ½+ε) gap.

Distributional Assignments
Consider a SAT instance Ψ over variables X of range R.
Let D(R) be the set of all distributions over R.
Def: A distributional-assignment to Ψ is A: X → D(R).
Denote by Υ_dist(Ψ) the maximum, over distributional-assignments A, of the average probability for ψ∈Ψ to be satisfied when the variables' values are chosen according to A.
Clearly Υ(Ψ) ≤ Υ_dist(Ψ). Moreover,
Prop: Υ(Ψ) ≥ Υ_dist(Ψ)

Notations
Def: For a 3SAT formula Ψ over Boolean variables Y,
  let Y^k be the set of all k-sequences of Ψ's variables;
  let Ψ^k be the set of all k-sequences of Ψ's clauses.
Def: For any V∈Y^k and C∈Ψ^k, let
  S_V be the set of all assignments to V;
  S_C be the set of all satisfying assignments to the clauses of C.
Def: For any set of k variables V∈Y^k and any set of k clauses C∈Ψ^k, write V ◁ C if V is a choice of one variable from each clause in C.

Restriction and Extension
Def: For any V∈Y^k and C∈Ψ^k s.t. V ◁ C:
  The natural restriction of an a∈S_C to S_V is denoted a|V.
  The elevation of a subset F∈P[S_V] to S_C is the subset F*∈P[S_C] of all assignments to C whose restriction to V is in F:
  F* = { a∈S_C | a|V ∈ F }

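Restriction and elevation can be made concrete on a toy pair (C, V). The sketch below is illustrative and rests on assumptions not in the slides: clauses are tuples of (variable, is_positive) literals, assignments are tuples of (variable, bit) pairs sorted by variable, V is treated as a set of variables, and the two example clauses are made up.

```python
from itertools import product

def assignments(variables):
    """All assignments to the given variables, as sorted tuples of (variable, bit)."""
    vs = sorted(set(variables))
    return [tuple(zip(vs, bits)) for bits in product((0, 1), repeat=len(vs))]

def satisfies(assignment, clause):
    """A clause is a tuple of literals (variable, is_positive)."""
    values = dict(assignment)
    return any(values[v] == int(positive) for v, positive in clause)

def S_C(clauses):
    """Satisfying assignments to all clauses in C."""
    variables = [v for clause in clauses for v, _ in clause]
    return [a for a in assignments(variables)
            if all(satisfies(a, c) for c in clauses)]

def S_V(variables):
    """All assignments to V."""
    return assignments(variables)

def restrict(a, V):
    """The natural restriction a|V."""
    keep = set(V)
    return tuple((v, bit) for v, bit in a if v in keep)

def elevate(F, clauses, V):
    """F* = { a in S_C : a|V in F }."""
    return {a for a in S_C(clauses) if restrict(a, V) in F}

# Hypothetical example: k = 2 clauses over disjoint variables, one variable chosen from each.
C = (((1, True), (2, True), (3, False)), ((4, True), (5, False), (6, True)))
V = (1, 4)
F = {((1, 1), (4, 0))}
print(len(S_V(V)), len(S_C(C)))   # 4 49
print(len(elevate(F, C, V)))      # 12
```

With disjoint clauses the counts match |S_V| = 2^k and |S_C| = 7^k for k = 2.
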
Parallel SAT
Def: For a 3SAT formula Ψ over Boolean variables Y, denote by par[Ψ, k] the following SAT instance:
par[Ψ, k] has two types of variables:
  x[V] for every set V∈Y^k, where x[V]'s range is the set S_V of all assignments to V (|S_V| = 2^k)
  x[C] for every set C∈Ψ^k, where x[C]'s range is the set S_C of all satisfying assignments to all clauses in C (|S_C| = 7^k)
par[Ψ, k] has one local-test, ψ[C,V], for every V ◁ C:
  ψ[C,V] accepts if x[C]|V = x[V] (namely, if the assignments to x[C] and x[V] are consistent)

Gap Increases with k
Note that if Υ(Ψ) = 1 then Υ(par[Ψ, k]) = 1.
On the other hand, if Ψ is not satisfiable:
Lemma: Υ(par[Ψ, k]) ≤ Υ(Ψ)^(c·k) for some c>0
Proof: First note that 1-Υ(par[Ψ, 1]) ≥ (1-Υ(Ψ))/3: in any assignment to Ψ's variables, any unsatisfied clause in Ψ "induces" at least 1 (out of the corresponding 3) unsatisfied ψ∈par[Ψ, 1].
Now, to prove the lemma, apply the Parallel-Repetition lemma [Raz] to par[Ψ, 1].

Long-Code over Range R
Any binary code C of elements of a domain R can be represented by a set C of Boolean functions (B.f.) f: R→{0,1}.
(Usually a code is regarded as a sequence, but here we ignore the order between the functions.)
Def: A C-encoding is a binary assignment A: C→{0,1}.
Def: A C-encoding A: C→{0,1} is a C-legal-encoding of an element a∈R if ∀f∈C, A(f) = f(a).
The most extensive binary code of a∈R lists f(a) for all B.f. f over R.

Long-Code over Range R
BP[R]: the set of all subsets of R of size ≤ ½|R|; |BP[R]| = 2^(|R|-1) - 1
Def: an R-long-code has 1 bit for each F∈P[R], namely, it is any Boolean function P[R]→{0, 1}.
Def: a legal-long-code-word of an element a∈R is a long-code E^R_a: P[R]→{0, 1} that assigns the truth value of a∈F to every subset F∈P[R].
Our long-code: in our context there are two types of domains "R":
  S_C for a set C of k clauses of Ψ,
  S_V for a set V of k variables of Ψ.

Encoding an Element In R By Its Long-Code
Given e∈R, list f(e) for all Boolean functions f over R.
[Table on the slide: one row per function f_0, f_1, f_2, …, f_(2^n - 1) over the columns r_1, r_2, …, r_n, with the long-code entries for the example e = r_1. Correction noted on the slide: 2^(n-1) - 1.]
Note:
1) Each entry uniquely determines the value of the entry corresponding to the complement function.
2) The value of the entry corresponding to the constant function 0 (1) is always 0 (1).

Linearity of a Legal-Encoding
An assignment A: BP[R]→{0,1}, if legal, is a linear function, i.e., ∀F, G∈BP[R]:
  A(F) + A(G) ≡ A('FΔG') (mod 2)
where 'FΔG'∈BP[R] is the symmetric difference of F and G (or perhaps 'FΔGΔR', when FΔG itself is not in BP[R]).
Unfortunately, the bit-wise xor of two legal encodings of a, b∈R is linear as well.

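The linearity property, and the reason it does not characterize legal encodings, can be checked on a three-element domain. This is a sketch under a simplifying assumption: it uses the unfolded long code with one bit for every subset of R, rather than the restricted family BP[R]. The helper names are illustrative.

```python
from itertools import combinations

def all_subsets(R):
    items = sorted(R)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def legal_encoding(a, R):
    """Long-code word of a: every subset F of R gets the bit [a in F]."""
    return {F: int(a in F) for F in all_subsets(R)}

def is_linear(A, R):
    """Check A(F) + A(G) = A(F symmetric-difference G) (mod 2) for all F, G."""
    subsets = all_subsets(R)
    return all((A[F] + A[G]) % 2 == A[F ^ G] for F in subsets for G in subsets)

R = {"r1", "r2", "r3"}
A1 = legal_encoding("r1", R)
A2 = legal_encoding("r2", R)
# Bit-wise xor of two legal encodings.
xor12 = {F: A1[F] ^ A2[F] for F in all_subsets(R)}

print(is_linear(A1, R))      # True
print(is_linear(xor12, R))   # True
print(any(xor12 == legal_encoding(a, R) for a in R))  # False
```

The xor table passes the linearity check but assigns 0 to the full set R, which no legal encoding does, so linearity alone cannot certify legality.
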
The Variables of L
The constructed linear-equation system:
Consider par[Ψ,k] for a large constant k (to be fixed later).
L has 2 types of variables:
  a variable z[V,F] for every variable x[V] of par[Ψ,k] and every subset F∈BP[S_V]
  a variable z[C,F] for every variable x[C] of par[Ψ,k] and every subset F∈BP[S_C]
Thus every L-variable corresponds to a subset of either S_V (the set of all assignments to k variables of Ψ) or S_C (the set of all satisfying assignments to k clauses of Ψ).

The Distribution μ_ε
Def: denote by μ_ε the distribution over all subsets of S_C which assigns probability to a subset H as follows:
Independently, for each a∈S_C, let
  a∉H with probability 1-ε
  a∈H with probability ε
One should think of μ_ε as a multiset of subsets in which every subset H appears with the appropriate probability.

Linear equations
L's linear equations are the union, over all ψ[C,V]∈par[Ψ,k], of the following:
for every F∈BP[S_V], G∈BP[S_C] and H∈μ_ε there is a test
  z[V,F] + z[C,G] = z[C,'F*ΔGΔH'] (mod 2)
where 'F*ΔGΔH' is the symmetric difference of the extension F* of F to S_C, G and H.

Prop: if Υ(Ψ) = 1 then Υ(L) ≥ 1-ε
Proof: Let A be a satisfying assignment to par[Ψ,k]. Assign all variables of L according to the legal encoding of A's values.
A linear equation of L, corresponding to C, V, F, G, H, is unsatisfied exactly if the value of x[C] lies in H (i.e. h(x[C]) = 1 for the indicator function h of H), which occurs with probability ε over the choice of H.
Note: the bound is independent of k! (Later we use that fact to choose k large enough for our needs.)
In the LLC-Lemma, δ = 2Υ(L) - 1.

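The completeness argument can be verified exactly on a toy instance: for a legal encoding, the test fails precisely when the encoded element falls into the noise set H. The sketch below makes several assumptions that are not in the slides: S_C and S_V are small abstract sets, the restriction a ↦ a|V is given as a dictionary, all subsets are used instead of BP[·], and ε = 1/10 is arbitrary.

```python
from fractions import Fraction
from itertools import combinations

def all_subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def pass_probabilities(a, S_C, S_V, restriction, eps):
    """For the legal encodings of a and a|V, return the set of exact
    probabilities (over H ~ mu_eps) of passing the test, over all F and G."""
    b = restriction[a]
    z_C = lambda G: int(a in G)   # legal long-code word of a
    z_V = lambda F: int(b in F)   # legal long-code word of a|V
    probabilities = set()
    for F in all_subsets(S_V):
        # Elevation of F to S_C.
        F_star = frozenset(x for x in S_C if restriction[x] in F)
        for G in all_subsets(S_C):
            p = Fraction(0)
            for H in all_subsets(S_C):
                weight = eps ** len(H) * (1 - eps) ** (len(S_C) - len(H))
                if (z_V(F) + z_C(G)) % 2 == z_C(F_star ^ G ^ H):
                    p += weight
            probabilities.add(p)
    return probabilities

# Hypothetical toy domains and restriction map.
S_C = ["a0", "a1", "a2", "a3"]
S_V = ["b0", "b1"]
restriction = {"a0": "b0", "a1": "b0", "a2": "b1", "a3": "b1"}
eps = Fraction(1, 10)

print(pass_probabilities("a2", S_C, S_V, restriction, eps))  # {Fraction(9, 10)}
```

Every choice of F and G gives the same passing probability 1-ε, which is why the completeness bound does not depend on k.
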
Hardness of approximation of Max-E3-Lin-2
Main Theorem: For any constant ε>0: gap-Max-E3-Lin-2(1-ε, ½+ε) is NP-hard.
Proof: Let Ψ be a gap-3SAT-7(1, 1-ε) instance.
By the proposition, Υ(Ψ) = 1 ⇒ Υ(L) ≥ 1-ε

Main Theorem
Prop: Let ε be a constant >0 s.t.: (1-ε)/(½+ε/2) ≥ 2-γ, where 2-γ is the target approximation factor.
Let k be large enough s.t.: 4ε³ > Υ(Ψ)^(c·k)
Then Υ(Ψ) < 1 ⇒ Υ(L) ≤ ½+ε/2 < ½+ε
Proof: Assume, by way of contradiction, that Υ(L) = ½+δ/2 > ½+ε/2, i.e. δ > ε. Then:
  4ε³ > Υ(Ψ)^(c·k) ≥ Υ(par[Ψ, k]) > 4εδ²
(the middle inequality by the parallel repetition lemma, the last one by the LLC-Lemma), which implies that ε > δ. Contradiction!

Road-Map for the Proof of LLC-Lemma
Tools: general Fourier analysis facts; multiplicative representation; representation by Fourier basis.
Claim 1: The success probability of the distributional assignment Λ on ψ[C,V]∈par[Ψ,k] is at least Σ_{β⊆S_C} (F^V_{β|V})² • (F^C_β)² • |β|^(-1)
Claim 2: E_{C,V}[ Σ_{β⊆S_C} F^V_{β|V} • (F^C_β)² • (1-2ε)^|β| ] = δ
Claim 3: The expected success of the distributional assignment Λ on ψ[C,V]∈par[Ψ,k] is at least 4εδ²
Conclusion: Υ(par[Ψ,k]) > 4εδ²

Inner Product Spaces
Def: W is an Inner Product Space if W is a vector space (over ℝ) with an inner product operator <.,.> : W×W → ℝ such that
  <a+b, c> = <a, c> + <b, c>
  <λa, b> = λ<a, b>
  a≠0 ⇒ <a, a> > 0

Orthonormal Basis
Def: O is an Orthonormal System if ∀a,b∈O, a≠b ⇒ a⊥b, and ∀a∈O, ||a|| = 1, i.e.:
  <a,b> = 1 if a=b, and 0 otherwise
Def: an orthonormal system {u_i}, i=1...n, is an Orthonormal Basis of an inner-product space W if ∀a∈W, a = Σ_{i=1...n} <a,u_i>·u_i

Parseval’s Formula
Thm: Let W be an inner-product space, and let {u_i}, i=1...n, be an orthonormal basis for W. Then ∀a∈W:
  ||a||² = Σ_{i=1...n} <a,u_i>²

Revised Representation
Multiplicative Representation:
  1 (True) → -1
  0 (False) → 1
  f + g → f•g
L: z[C,*], z[V,*] ∈ {-1, 1}
  z[V,f] • z[C,g] • z[C,'f•g•h'] = 1

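The switch to the multiplicative representation only renames the bits: under the map 0 → 1, 1 → -1, addition mod 2 becomes multiplication, so a three-variable equation holds exactly when the product of the three signs equals 1. A minimal sketch (the function names are illustrative):

```python
from itertools import product

def to_sign(bit):
    """0 (False) -> 1, 1 (True) -> -1."""
    return 1 - 2 * bit

def additive_test(x, y, z):
    """The equation x + y = z (mod 2)."""
    return (x + y) % 2 == z

def multiplicative_test(x, y, z):
    """The same equation in the +1/-1 representation."""
    return to_sign(x) * to_sign(y) * to_sign(z) == 1

# Addition mod 2 turns into multiplication of signs.
sum_is_product = all(to_sign((x + y) % 2) == to_sign(x) * to_sign(y)
                     for x, y in product((0, 1), repeat=2))
# The two forms of the test agree on every input.
tests_agree = all(additive_test(*bits) == multiplicative_test(*bits)
                  for bits in product((0, 1), repeat=3))
print(sum_is_product, tests_agree)  # True True
```
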
Long Code - Revised Representation
Note that in the revised notation, Boolean functions evaluate to either 1 or -1.
Def: The long-code of R has one bit for every B.f. in BP[R], namely, it is a mapping BP[R]→{1, -1}.
The legal-encoding of an element a∈R is the mapping E^R_a: BP[R]→{-1,1}, where E^R_a(f) = f(a), i.e. E^R_a assigns f(a) to every B.f. f∈BP[R].

Long-Code as an inner product space
Def: Let A_R = { A : BP[R]→{-1,1} }
A_R (viewed inside the real-valued functions on BP[R]) is an inner-product space: ∀A, B∈A_R,
  <A, B> = 1/|BP[R]| · Σ_{f∈BP[R]} A(f)·B(f)

Basis for Long-Code
Def: For every α⊆R, let χ_α(f) = Π_{a∈α} f(a).
Claim: For a domain R, the set { χ_α : α⊆R } is an orthonormal basis of A_R.
Corollary: ∀A∈A_R:
  A = Σ_{α⊆R} <A, χ_α>·χ_α (see the definition of a basis)
  Σ_{α⊆R} <A, χ_α>² = 1/|BP[R]| · Σ_{f∈BP[R]} (A(f))² = 1 (by Parseval's formula and the fact that A(f)∈{-1,1})
Note: now we can use <A, χ_α>² as a probability measure over α!

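Orthonormality of the characters and the Parseval identity can be checked by direct enumeration on a three-element domain. This is a sketch under stated assumptions: it ranges over all functions f: R → {1, -1} rather than the restricted family BP[R], the inner product is the normalized sum over those functions, and the table A (a majority vote) is an arbitrary ±1-valued example.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

R = (0, 1, 2)
FUNCTIONS = list(product((1, -1), repeat=len(R)))  # f is a tuple, f[x] = f(x)
SUBSETS = [frozenset(c) for r in range(len(R) + 1) for c in combinations(R, r)]

def chi(alpha):
    """The character chi_alpha, as a table f -> product of f(x) over x in alpha."""
    return {f: prod(f[x] for x in alpha) for f in FUNCTIONS}

def inner(A, B):
    """Normalized inner product of two tables indexed by FUNCTIONS."""
    return Fraction(sum(A[f] * B[f] for f in FUNCTIONS), len(FUNCTIONS))

# Orthonormality: <chi_alpha, chi_beta> is 1 if alpha == beta and 0 otherwise.
orthonormal = all(inner(chi(al), chi(be)) == (1 if al == be else 0)
                  for al in SUBSETS for be in SUBSETS)

# An arbitrary +1/-1 valued table: the majority value of f.
A = {f: (1 if sum(f) > 0 else -1) for f in FUNCTIONS}
coefficients = {al: inner(A, chi(al)) for al in SUBSETS}

parseval_sum = sum(c ** 2 for c in coefficients.values())
reconstructs = all(sum(coefficients[al] * chi(al)[f] for al in SUBSETS) == A[f]
                   for f in FUNCTIONS)
print(orthonormal, parseval_sum, reconstructs)  # True 1 True
```

Because the squared coefficients sum to 1, they can be read as a probability distribution over the subsets α, which is what the distributional assignment later relies on.
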
Representation with Fourier Basis
The representation of A∈A_R as A = Σ_{α⊆R} <A, χ_α>·χ_α is called "the Fourier representation of A".
The coefficients <A, χ_α> are called the Fourier-coefficients of A.

An Assignment A to L
For any set C of k clauses of Ψ:
  The set z[C,*] of variables of L represents the long-code of x[C].
  Let F^C_β be the Fourier-coefficient <A|z[C,*], χ_β>.
For any set V of k variables of Ψ:
  The set z[V,*] of variables of L represents the long-code of x[V].
  Let F^V_α be the Fourier-coefficient <A|z[V,*], χ_α>.

The Distributional Assignment Λ
Def: Let Λ be a distributional-assignment to par[Ψ,k], defined as follows:
For any variable x[C]:
  Choose a set β⊆S_C with probability (F^C_β)²,
  Uniformly choose a random assignment a∈β.
For any variable x[V]:
  Choose a set α⊆S_V with probability (F^V_α)²,
  Uniformly choose a random assignment b∈α.

Longcode and Fourier Coefficients
Auxiliary Lemmas (with χ_α(f) = Π_{x∈α} f(x)):
1. For any f, g∈BP[R] and α⊆R, χ_α(f·g) = χ_α(f)·χ_α(g).
   (Apply the commutative and associative properties of multiplication.)
2. For any f∈BP[R] and α, β⊆R, χ_α(f)·χ_β(f) = χ_{αΔβ}(f), where αΔβ is the symmetric difference between α and β.
   (χ_α(f)·χ_β(f) = Π_{x∈α} f(x) · Π_{x∈β} f(x) = Π_{x∈α∩β} f(x)² · Π_{x∈αΔβ} f(x) = 1·χ_{αΔβ}(f).)
3. For a random f (uniformly chosen) and α≠∅, E[χ_α(f)] = 0, while E[χ_∅(f)] = 1.
   (For every x, f(x) is 1 or -1 with probability ½.)

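The three auxiliary lemmas can be confirmed by enumeration on a small domain. The sketch assumes, as a simplification not made in the slides, that f ranges over all functions R → {1, -1} on a three-element R; names such as chi, times and expectation are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

R = (0, 1, 2)
FUNCTIONS = list(product((1, -1), repeat=len(R)))  # f is a tuple, f[x] = f(x)
SUBSETS = [frozenset(c) for r in range(len(R) + 1) for c in combinations(R, r)]

def chi(alpha, f):
    """chi_alpha(f) = product of f(x) over x in alpha (empty product is 1)."""
    return prod(f[x] for x in alpha)

def times(f, g):
    """Pointwise product f*g."""
    return tuple(a * b for a, b in zip(f, g))

def expectation(alpha):
    """E[chi_alpha(f)] for a uniformly random f."""
    return Fraction(sum(chi(alpha, f) for f in FUNCTIONS), len(FUNCTIONS))

# Lemma 1: chi_alpha(f*g) = chi_alpha(f) * chi_alpha(g)
lemma1 = all(chi(al, times(f, g)) == chi(al, f) * chi(al, g)
             for al in SUBSETS for f in FUNCTIONS for g in FUNCTIONS)
# Lemma 2: chi_alpha(f) * chi_beta(f) = chi_{alpha symmetric-difference beta}(f)
lemma2 = all(chi(al, f) * chi(be, f) == chi(al ^ be, f)
             for al in SUBSETS for be in SUBSETS for f in FUNCTIONS)
# Lemma 3: E[chi_alpha(f)] is 0 for non-empty alpha and 1 for the empty set
lemma3 = all(expectation(al) == (0 if al else 1) for al in SUBSETS)

print(lemma1, lemma2, lemma3)  # True True True
```
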
Home Assignment
Given an assignment to a long-code, A: BP[R]→{0, 1}, show that for any (constant) ε > 0 there is a constant h(ε), which depends on ε but does not depend on R, such that:
  | {e∈R | Δ(E_e, A) > ½ + ε} | ≤ h(ε)
where Δ(A1, A2) is the fraction of bits A1 and A2 differ on.

What’s Ahead:
We will show that Λ's expected success on ψ[C,V]∈par[Ψ,k] is > 4εδ² in two steps:
First we show (claim 1) that Λ's success probability, for any ψ[C,V]∈par[Ψ,k], is at least
  Σ_{β⊆S_C} (F^V_{β|V})² • (F^C_β)² • |β|^(-1)
Then we show (claim 3) that the expectation of that value is at least 4εδ².

Claim 1
Claim 1: The success probability of Λ on ψ[C,V]∈par[Ψ,k] is at least
  Σ_{β⊆S_C} (F^V_{β|V})² • (F^C_β)² • |β|^(-1)
Proof: That success probability is at least
  Σ_{α⊆S_V, β⊆S_C} (F^V_α)² • (F^C_β)² • Pr_{b∈β}[ b|V ∈ α ]
and if α = β|V there is at least one b∈β s.t. b|V∈α.
So Λ's success probability is at least |β|^(-1) times the probability of the case in which the chosen α and β satisfy β|V = α, i.e. at least
  Σ_{β⊆S_C} (F^V_{β|V})² • (F^C_β)² • |β|^(-1)

Auxiliary Lemmas
...Some More Auxiliary Lemmas:
For any positive constants a and s: a^(-s) ≥ e^(-sa)
e^(-x) ≥ 1-x
E[x²] ≥ E[x]²
  (proof: 0 ≤ E[ (x-E[x])² ] = E[ x² - 2E[x]·x + E[x]² ] = E[x²] - 2E[x]E[x] + E[x]² = E[x²] - E[x]² )

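The first two inequalities are used in the proof of Claim 3 with a = 4ε|β| and s = ½, giving |β|^(-1/2) ≥ (4ε)^(1/2)·(1-2ε)^|β|. The following numeric spot-check is only a sanity test on a grid, not a proof; the grid, the tolerance tol and the name holds_for are assumptions of this sketch, with n standing for |β|.

```python
import math

def holds_for(n, eps, tol=1e-12):
    """Check the inequality chain for a set size n >= 1 and noise rate 0 < eps < 1/2."""
    a, s = 4 * eps * n, 0.5
    # a^(-s) >= e^(-s*a), with a = 4*eps*n and s = 1/2
    step1 = a ** (-s) >= math.exp(-s * a) - tol
    # e^(-x) >= 1 - x, applied n times with x = 2*eps
    step2 = math.exp(-2 * eps * n) >= (1 - 2 * eps) ** n - tol
    # Combined: n^(-1/2) >= (4*eps)^(1/2) * (1 - 2*eps)^n
    combined = n ** -0.5 >= math.sqrt(4 * eps) * (1 - 2 * eps) ** n - tol
    return step1 and step2 and combined

all_hold = all(holds_for(n, k / 100)
               for n in range(1, 201) for k in range(1, 50))
print(all_hold)  # True
```
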
Lemma’s Proof - Claim 2 (1)
Claim 2: E_{C,V}[ Σ_{β⊆S_C} F^V_{β|V} • (F^C_β)² • (1-2ε)^|β| ] = δ
Proof: The test accepts iff z[V,f] • z[C,g] • z[C,'f•g•h'] = 1.
By our assumption, this happens with probability δ/2+½.
Now, according to the definition of the expectation:
  E_{ψ[C,V], f, g, h}[ z[V,f] • z[C,g] • z[C,'f•g•h'] ] = 1•(½+δ/2) + (-1)•(1-(½+δ/2)) = δ

Lemma’s Proof - Claim 2 (2)
We next show that
  E_{f,g,h}[ z[V,f] • z[C,g] • z[C,'f•g•h'] ] = Σ_{β⊆S_C} F^V_{β|V} • (F^C_β)² • (1-2ε)^|β|
Hence,
  E_{ψ[C,V], f, g, h}[ z[V,f] • z[C,g] • z[C,'f•g•h'] ]
  = E_{C,V}[ E_{f,g,h}[ z[V,f] • z[C,g] • z[C,'f•g•h'] ] ]
  = E_{C,V}[ Σ_{β⊆S_C} F^V_{β|V} • (F^C_β)² • (1-2ε)^|β| ]
and therefore
  E_{C,V}[ Σ_{β⊆S_C} F^V_{β|V} • (F^C_β)² • (1-2ε)^|β| ] = δ

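Lemma’s Proof - Proposition
[The derivation on this slide was an image and is not recoverable. Its steps carry the annotations: by Fourier transform; auxiliary lemma 2; auxiliary lemma 1; auxiliary lemma 3; linearity of expectation.]
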
Lemma’s Proof - Claim 3
Claim 3: The expected success of the distributional assignment Λ on ψ[C,V]∈par[Ψ,k] is at least 4εδ²
Proof: Claim 1 gives us the initial lower bound for the expected success:
  E_{C,V}[ Σ_{β⊆S_C} (F^V_{β|V})² • (F^C_β)² • |β|^(-1) ]

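Lemma’s Proof - Claim 3
As we've already seen, Σ_{β⊆S_C} (F^C_β)² = 1, so (F^C_β)² can be read as a probability distribution over β. Hence, our lower bound takes the form of
  E_{C,V,β}[ (F^V_{β|V})² • |β|^(-1) ]
or alternatively,
  E_{C,V,β}[ ( |F^V_{β|V}| • |β|^(-1/2) )² ]
which allows us to use the known inequality E[x²] ≥ E[x]² and get the lower bound
  ( E_{C,V,β}[ |F^V_{β|V}| • |β|^(-1/2) ] )²
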
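Lemma’s Proof - Claim 3
By the auxiliary lemmas, (4ε|β|)^(-1/2) ≥ e^(-2ε|β|) ≥ (1-2ε)^|β|, i.e. |β|^(-1/2) ≥ (4ε)^(1/2)·(1-2ε)^|β|, which yields the following bound:
  ( E_{C,V,β}[ (4ε)^(1/2) • |F^V_{β|V}| • (1-2ε)^|β| ] )²
That is,
  4ε • ( E_{C,V}[ Σ_{β⊆S_C} |F^V_{β|V}| • (F^C_β)² • (1-2ε)^|β| ] )²
Now applying claim 2 (and |F^V_{β|V}| ≥ F^V_{β|V}) results in the desired lower bound 4εδ².
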
Lemma’s Proof - Conclusion
We showed that there is an assignment scheme with expected success of at least 4εδ².
⇒ There exists an assignment that satisfies at least 4εδ² of the tests in par[Ψ,k].
⇒ Υ(par[Ψ,k]) > 4εδ²
Q.E.D.

Home Assignment
Show it is NP-hard, for any ε > 0, given a 3SAT instance Ψ, to distinguish between the case where Υ(Ψ) = 1 and the case in which Υ(Ψ) < 7/8+ε.
Hint: Let Ψ's variables be as in L, and let Ψ's clauses take the form
  f OR g OR 'f* + g + h'
for f and g chosen in the same way as in L, while h is chosen as follows:
  h(b) = 1 for every b such that f(b|V) and g(b) are both FALSE.
  For all other b's, independently for each b, h(b) = 1 with probability ε, and 0 with probability 1-ε.

Appendix
Expanders
Def: a graph G(V,E) is a c-expander if for every S⊆V with |S| ≤ ½|V|: |N(S)\S| ≥ c·|S|
[where N(S) denotes the set of neighbors of S]
Lemma: For every m, one can construct in poly-time a 3-regular, m-vertex c-expander, for some constant c>0.
Corollary: a cut between S and V\S, for |S| ≤ ½|V|, must contain at least c·|S| edges.

Reduction Using Expanders
Assume Ψ' for which Υ(Ψ') is either 1 or below 1-20ε/c.
Ψ is Ψ' with the following changes:
  An occurrence of y in ψi is replaced by a variable x_{y,i}.
  Let G_y, for every y, be a 3-regular c-expander over all occurrences x_{y,i} of y (constructible by the Lemma).
  For every edge connecting x_{y,i} to x_{y,j} in G_y, add to Ψ the clauses (x_{y,i} ∨ ¬x_{y,j}) and (¬x_{y,i} ∨ x_{y,j}), asserting equality.
It is easy to see that:
  |Ψ| ≤ 10|Ψ'|
  Each variable x_{y,i} of Ψ appears in exactly 7 of the ψi.

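The two "easy to see" facts follow from a short count. The worked arithmetic below assumes that every clause of Ψ' has exactly 3 literals and that each variable has a number of occurrences for which a 3-regular graph G_y exists; both are assumptions of this sketch rather than statements from the slides.

```latex
\begin{align*}
\#\{\text{occurrences } x_{y,i}\} &= 3\,|\Psi'| \\
\#\{\text{edges in all } G_y\} &= \tfrac{3}{2}\cdot 3\,|\Psi'| = \tfrac{9}{2}\,|\Psi'|
  && \text{(3-regular: each vertex has 3 edges, each edge is counted twice)} \\
\#\{\text{equality clauses}\} &= 2\cdot\tfrac{9}{2}\,|\Psi'| = 9\,|\Psi'| \\
|\Psi| &= |\Psi'| + 9\,|\Psi'| = 10\,|\Psi'| \\
\#\{\text{clauses containing } x_{y,i}\} &= 1 + 3\cdot 2 = 7
  && \text{(one original clause, two clauses per incident edge)}
\end{align*}
```
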
Correctness of the Reduction
Ψ is completely satisfiable iff Ψ' is.
In case Ψ' is unsatisfiable: Υ(Ψ') < 1-20ε/c
Let A be an optimal assignment to Ψ.
Let Amaj assign x_{y,i} the value assigned by A to the majority, over j, of the variables x_{y,j}.
Let F_A and F_Amaj be the sets of ψ∈Ψ unsatisfied by A and Amaj respectively:
  |Ψ|·(1-Υ(Ψ)) = |F_A| = |F_A∩F_Amaj| + |F_A\F_Amaj| ≥ |F_A∩F_Amaj| + ½c|F_Amaj\F_A| ≥ ½c|F_Amaj|
and since Amaj is in fact an assignment to Ψ':
  Υ(Ψ) ≤ 1 - ½c(1-Υ(Ψ'))/10 < 1 - ½c(20ε/c)/10 = 1-ε