1 Tight Hardness Results for Some Approximation Problems [mostly Håstad]
Adi Akavia, Dana Moshkovitz, S. Safra
2 “Road-Map” for Chapter I
Gap-3-SAT → (expanders) → Gap-3-SAT-7 → par[Φ, k] → (parallel repetition lemma) → Φ[X→Y] is NP-hard
3 Maximum Satisfaction
Def: Max-SAT
Instance:
  A set of variables Y = { Y_1, …, Y_m }
  A set of Boolean functions (local-tests) over Y: Φ = { φ_1, …, φ_l }
Maximization: we define Γ(Φ) = the maximum, over all assignments to Y, of the fraction of the φ_i satisfied.
Structure: various versions of SAT impose structural properties on Y, on Y’s range, and on Φ.
4 Max-E3-Lin-2
Def: Max-E3-Lin-2
Instance: a system of linear equations L = { E_1, …, E_n } over Z_2, each equation over exactly 3 variables (whose sum is required to equal either 0 or 1).
Problem: compute Γ(L), the maximum fraction of equations of L that a single assignment can satisfy.
5 Example
x_1 + x_2 + x_3 = 1 (mod 2)
x_4 + x_5 + x_6 = 1
x_7 + x_8 + x_9 = 1
x_1 + x_4 + x_7 = 0
x_2 + x_5 + x_8 = 0
x_3 + x_6 + x_9 = 0
Assigning x_1, …, x_6 = 1 and x_7, x_8, x_9 = 0 satisfies all but the third equation.
No assignment can satisfy all equations: the left-hand sides sum to zero (every variable appears twice), while the right-hand sides sum to 1.
Therefore, Γ(L) = 5/6.
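The example can be checked by brute force. The following is a minimal sketch, assuming the six equations are exactly the ones listed on this slide; the names `EQUATIONS`, `satisfied_count` and `max_satisfied` are illustrative and not part of the original slides.

```python
from itertools import product

# Each equation: (indices of its three variables, right-hand side), arithmetic mod 2.
EQUATIONS = [
    ((1, 2, 3), 1),
    ((4, 5, 6), 1),
    ((7, 8, 9), 1),
    ((1, 4, 7), 0),
    ((2, 5, 8), 0),
    ((3, 6, 9), 0),
]

def satisfied_count(assignment):
    """Number of satisfied equations; assignment[i - 1] is the value of x_i."""
    return sum(
        (sum(assignment[i - 1] for i in idx) % 2) == rhs
        for idx, rhs in EQUATIONS
    )

def max_satisfied():
    """Exhaustive search over all 2^9 assignments."""
    return max(satisfied_count(a) for a in product((0, 1), repeat=9))

print(f"{max_satisfied()}/{len(EQUATIONS)}")  # prints 5/6
```

Exhaustive search is only feasible because the system has 9 variables; it serves as a sanity check of the parity argument, not as an algorithm for the general problem.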
6 2-Variables Functional SAT
Def [Φ[X→Y]]: a SAT instance Φ over variables X, Y of ranges R_x, R_y respectively, where
  each φ ∈ Φ is of the form φ_{x→y} : R_x → R_y
  an assignment A (with A(x) ∈ R_x, A(y) ∈ R_y) satisfies φ_{x→y} iff φ_{x→y}(A(x)) = A(y)
[Namely, every value to X determines exactly 1 satisfying value for Y]
Thm: distinguishing between
  ∃A that satisfies all of Φ
  every A satisfies < ε fraction of Φ
is NP-hard as long as |R_x|, |R_y| > ε^{−O(1)}.
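To make the definition concrete, here is a small sketch that evaluates an assignment against functional tests and finds the best assignment by exhaustive search. The data layout (each test as a triple `(x, y, phi)` with `phi` a dictionary from x's range to y's range) and the toy instance are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def satisfied_fraction(tests, assignment):
    """Fraction of tests (x, y, phi) with phi[assignment[x]] == assignment[y]."""
    hits = sum(1 for x, y, phi in tests if phi[assignment[x]] == assignment[y])
    return Fraction(hits, len(tests))

def max_fraction(tests, ranges):
    """Best satisfied fraction over all assignments; ranges maps variable -> its range."""
    names = list(ranges)
    return max(
        satisfied_fraction(tests, dict(zip(names, values)))
        for values in product(*(ranges[n] for n in names))
    )

# Toy instance: X = {x1, x2} with range {0, 1, 2}, Y = {y1} with range {0, 1}.
ranges = {"x1": (0, 1, 2), "x2": (0, 1, 2), "y1": (0, 1)}
tests = [
    ("x1", "y1", {0: 0, 1: 1, 2: 0}),
    ("x2", "y1", {0: 1, 1: 1, 2: 1}),
    ("x1", "y1", {0: 1, 1: 0, 2: 1}),  # complementary to the first test
]
print(max_fraction(tests, ranges))  # prints 2/3
```

The first and third tests disagree on every value of x1, so no assignment satisfies both; this is why the toy instance tops out at 2/3.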
7 Proof Outline
Def: 3SAT is SAT where every φ_i is a disjunction of 3 literals.
Def: gap-3SAT-7 is gap-3SAT with the additional restriction that every variable appears in exactly 7 local-tests.
Theorem: gap-3SAT-7 is NP-hard.
8 Expanders
Def: a graph G(V,E) is a c-expander if for every S ⊆ V with |S| ≤ ½|V|: |N(S)\S| ≥ c·|S| [where N(S) denotes the set of neighbors of S]
Lemma: for every m, one can construct in poly-time a 3-regular, m-vertex c-expander, for some constant c > 0.
Corollary: a cut between S and V\S, for |S| ≤ ½|V|, must contain at least c·|S| edges.
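For small graphs the expansion constant in the definition can be computed directly. The sketch below is illustrative only: it enumerates every admissible S, so it takes exponential time, and the complete graph K4 is used merely as a convenient 3-regular example, not as the expander family of the Lemma.

```python
from itertools import combinations

def vertex_expansion(adj):
    """Minimum of |N(S) \\ S| / |S| over nonempty S with |S| <= |V| / 2."""
    vertices = list(adj)
    best = float("inf")
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            boundary = set().union(*(adj[v] for v in s)) - s
            best = min(best, len(boundary) / size)
    return best

def cut_size(adj, s):
    """Number of edges with exactly one endpoint in s."""
    return sum(1 for v in s for u in adj[v] if u not in s)

# K4: every vertex is adjacent to the other three, so the graph is 3-regular.
K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(vertex_expansion(K4))  # prints 1.0
```

Since every boundary vertex is reached by at least one cut edge, `cut_size(adj, s)` is never smaller than `|N(S) \ S|`, which is the content of the corollary.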
9 Reduction Using Expanders
Assume Φ' for which Γ(Φ') is either 1 or below 1 − 20ε/c. Φ is Φ' with the following changes:
  An occurrence of y in φ_i is replaced by a variable x_{y,i}.
  Let G_y, for every y, be a 3-regular c-expander over all occurrences x_{y,i} of y (constructible by the Lemma).
  For every edge connecting x_{y,i} to x_{y,j} in G_y, add to Φ the clauses (¬x_{y,i} ∨ x_{y,j}) and (x_{y,i} ∨ ¬x_{y,j}), ensuring equality.
It is easy to see that:
1. |Φ| ≤ 10·|Φ'|
2. Each variable x_{y,i} of Φ appears in exactly 7 φ_i
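The construction can be sketched in a few lines. This is a minimal illustration under stated assumptions: clauses are tuples of literals `(variable, is_positive)`, no variable repeats inside a clause, every variable occurs an even number of times (at least 4), and the hypothetical helper `three_regular_edges` builds a cycle with antipodal chords as a stand-in 3-regular graph. It is not the poly-time expander promised by the Lemma, so the sketch demonstrates only the counting claims, not the gap.

```python
from collections import Counter

def three_regular_edges(n):
    """Stand-in 3-regular graph on n vertices (n even, n >= 4): cycle plus antipodal chords."""
    assert n % 2 == 0 and n >= 4
    edges = {frozenset((i, (i + 1) % n)) for i in range(n)}
    edges |= {frozenset((i, (i + n // 2) % n)) for i in range(n // 2)}
    return sorted(tuple(sorted(e)) for e in edges)

def reduce_occurrences(clauses):
    """Replace each occurrence of y in clause i by (y, i) and add equality clauses."""
    occurrences = {}  # y -> indices of the clauses in which y occurs
    new_clauses = []
    for i, clause in enumerate(clauses):
        new_clauses.append(tuple(((y, i), pos) for y, pos in clause))
        for y, _ in clause:
            occurrences.setdefault(y, []).append(i)
    for y, occ in occurrences.items():
        for a, b in three_regular_edges(len(occ)):
            u, v = (y, occ[a]), (y, occ[b])
            new_clauses.append(((u, False), (v, True)))  # (not u) or v
            new_clauses.append(((u, True), (v, False)))  # u or (not v)
    return new_clauses

# Toy formula: a, b, c each occur in all four clauses.
phi_prime = [
    (("a", True), ("b", True), ("c", True)),
    (("a", False), ("b", True), ("c", True)),
    (("a", True), ("b", False), ("c", True)),
    (("a", True), ("b", True), ("c", False)),
]
phi = reduce_occurrences(phi_prime)
counts = Counter(var for clause in phi for var, _ in clause)
print(len(phi), set(counts.values()))  # prints 40 {7}
```

Each new variable sits in its one original clause plus two equality clauses for each of its three graph edges, which gives the 7 occurrences; each original clause contributes itself plus 3 · 3 equality clauses, which gives the factor 10.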
10 Correctness of the Reduction
1. Φ is completely satisfiable iff Φ' is.
2. In case Φ' is unsatisfiable, Γ(Φ') < 1 − 20ε/c:
Let A be an optimal assignment to Φ.
Let A_maj assign x_{y,i} the value assigned by A to the majority, over j, of the variables x_{y,j}.
Let F_A and F_{A_maj} be the sets of φ ∈ Φ unsatisfied by A and A_maj respectively:
|Φ|·(1 − Γ(Φ)) = |F_A| = |F_A ∩ F_{A_maj}| + |F_A \ F_{A_maj}| ≥ |F_A ∩ F_{A_maj}| + ½c·|F_{A_maj} \ F_A| ≥ ½c·|F_{A_maj}|
and since A_maj is in fact an assignment to Φ':
Γ(Φ) ≤ 1 − ½c·(1 − Γ(Φ'))/10 < 1 − ½c·(20ε/c)/10 = 1 − ε
11 Notations
Def: For a 3SAT formula Φ over Boolean variables Z,
  let Z^k be the set of all k-sequences of Φ’s variables
  let Φ^k be the set of all k-sequences of Φ’s clauses
Def: For any V ∈ Z^k and C ∈ Φ^k, let
  R_Y be the set of all assignments to V
  R_X be the set of all satisfying assignments to C
Def: For any set of k variables V ∈ Z^k and set of k clauses C ∈ Φ^k, denote V ⊲ C if V is a choice of one variable of each clause in C.
12 Parallel SAT
Def: for a 3SAT formula Φ over Boolean variables Z, define par[Φ, k]:
par[Φ, k] has two types of variables:
  y_V for every set V ∈ Z^k, where y_V’s range is the set R_Y of all assignments to V; |R_Y| = 2^k
  x_C for every set C ∈ Φ^k, where x_C’s range is the set R_X of all satisfying assignments to all clauses in C; |R_X| = 7^k
par[Φ, k] has a local-test φ[C,V] for each V ⊲ C (V chooses one variable from each clause of C), which accepts if x_C’s value restricted to V is y_V’s value (namely, if the assignments to T[C] and T[V] are consistent).
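The ranges and the local test can be illustrated on a toy instance. The sketch below assumes a clause is a tuple of three literals `(variable, is_positive)` over distinct variables, and it treats the k clauses of C independently (one satisfying assignment per clause), which is what yields the count 7^k; all function names are illustrative.

```python
from itertools import product

def satisfying_assignments(clause):
    """The 7 assignments (as dicts) to the clause's three variables that satisfy it."""
    variables = [v for v, _ in clause]
    result = []
    for values in product((False, True), repeat=3):
        a = dict(zip(variables, values))
        if any(a[v] == pos for v, pos in clause):
            result.append(a)
    return result

def range_x(clause_seq):
    """R_X for a k-sequence C: one satisfying assignment per clause."""
    return list(product(*(satisfying_assignments(c) for c in clause_seq)))

def range_y(k):
    """R_Y for a k-sequence V: all Boolean assignments to k variables."""
    return list(product((False, True), repeat=k))

def restrict(x_value, var_seq):
    """Restriction of a value of x_C to V, where var_seq[j] is a variable of the j-th clause."""
    return tuple(a[v] for a, v in zip(x_value, var_seq))

def local_test(x_value, y_value, var_seq):
    """Accept iff the value of x_C restricted to V equals the value of y_V."""
    return restrict(x_value, var_seq) == y_value

c1 = (("u", True), ("v", True), ("w", False))
c2 = (("u", False), ("s", True), ("t", True))
C, V = (c1, c2), ("v", "s")
print(len(range_x(C)), len(range_y(2)))  # prints 49 4
```

Every value of x_C passes the test with exactly one value of y_V, which is the functional property required of the tests φ_{x→y}.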
13 Gap Increases with k
Note that if Γ(Φ) = 1 then Γ(par[Φ, k]) = 1.
On the other hand, if Φ is not satisfiable:
Lemma: Γ(par[Φ, k]) ≤ Γ(Φ)^{c·k} for some c > 0.
Proof: first note that 1 − Γ(par[Φ, 1]) ≥ (1 − Γ(Φ))/3: in any assignment to Φ’s variables, any unsatisfied clause in Φ “induces” at least 1 (out of the corresponding 3) unsatisfied tests of par[Φ, 1].
Now, to prove the lemma, apply the Parallel-Repetition lemma [Raz] to par[Φ, 1].
14 Conclusion: Φ[X→Y] is NP-hard
Denote:
  Φ[X→Y] = par[Φ, k]
  X = { x_C }
  Y = { y_V }
Then, distinguishing between
  ∃A that satisfies all of Φ[X→Y]
  every A satisfies < ε fraction of Φ[X→Y]
is NP-hard as long as |R_x|, |R_y| > ε^{−O(1)}.
15 “Road-Map” for Chapter II
Φ[X→Y] is NP-hard → (long code) → L_Φ
Γ̃(Φ) = Γ(Φ)
LLC-Lemma: Γ(L_Φ) = ½ + δ/2 ⇒ Γ(par[Φ, k]) > 4εδ²
16 Main Theorem
Thm: gap-Max-E3-Lin-2(1 − ε, ½ + ε) is NP-hard.
That is, for every constant 0 < ε < ¼ it is NP-hard to distinguish between the case where a 1 − ε fraction of the equations can be satisfied and the case where at most a ½ + ε fraction can.
[It is therefore NP-hard to approximate Max-E3-Lin-2 to within a factor of 2 − γ for any constant 0 < γ < ¼]
17 This bound is tight
A random assignment satisfies half of the equations in expectation.
Deciding whether a set of linear equations has a common solution is in P (Gaussian elimination).
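Both facts can be checked on a small system. The sketch below is illustrative: `solvable_mod2` is a plain Gaussian elimination over GF(2) using bitmasks, `average_satisfied_fraction` averages over all assignments exactly, and the system used is the six-equation example from the earlier slide with variables renumbered from 0.

```python
from fractions import Fraction
from itertools import product

def solvable_mod2(equations):
    """Gaussian elimination over GF(2); equations are (variable indices, rhs) pairs."""
    basis = {}  # pivot bit -> (row mask, rhs)
    for idx, rhs in equations:
        mask = 0
        for i in idx:
            mask ^= 1 << i
        # Eliminate existing pivots, highest first.
        for pivot in sorted(basis, reverse=True):
            if (mask >> pivot) & 1:
                pivot_mask, pivot_rhs = basis[pivot]
                mask ^= pivot_mask
                rhs ^= pivot_rhs
        if mask == 0:
            if rhs == 1:
                return False  # the row reduced to 0 = 1
        else:
            basis[mask.bit_length() - 1] = (mask, rhs)
    return True

def average_satisfied_fraction(equations, num_vars):
    """Exact expected fraction of satisfied equations under a uniform random assignment."""
    total = 0
    for bits in product((0, 1), repeat=num_vars):
        total += sum((sum(bits[i] for i in idx) % 2) == rhs for idx, rhs in equations)
    return Fraction(total, len(equations) * 2 ** num_vars)

SYSTEM = [
    ((0, 1, 2), 1), ((3, 4, 5), 1), ((6, 7, 8), 1),
    ((0, 3, 6), 0), ((1, 4, 7), 0), ((2, 5, 8), 0),
]
print(solvable_mod2(SYSTEM), average_satisfied_fraction(SYSTEM, 9))  # prints False 1/2
```

Dropping any single equation from this system makes it solvable, in line with the value 5/6 computed for the example.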
18 Distributional Assignments
Let Φ be a SAT instance over variables Z of range R. Let Δ(R) be the set of all distributions over R.
Def: a distributional-assignment to Φ is a map A : Z → Δ(R).
Denote by Γ̃(Φ) the maximum, over distributional-assignments A, of the average probability for φ ∈ Φ to be satisfied, if the variables’ values are chosen according to A.
Clearly Γ̃(Φ) ≥ Γ(Φ). Moreover,
Prop: Γ̃(Φ) ≤ Γ(Φ)
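The proposition is an averaging argument: the expected success of a distributional assignment is an average over deterministic assignments drawn from its support, so some deterministic assignment does at least as well. A minimal sketch on a toy functional instance follows; the test layout `(x, y, phi)` and all names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def expected_fraction(tests, dist):
    """Expected satisfied fraction when each variable is drawn independently from dist[var]."""
    total = Fraction(0)
    for x, y, phi in tests:
        total += sum(px * dist[y].get(phi[a], Fraction(0)) for a, px in dist[x].items())
    return total / len(tests)

def best_in_support(tests, dist):
    """Best satisfied fraction among deterministic assignments taken from the supports."""
    names = list(dist)
    best = Fraction(0)
    for values in product(*(list(dist[n]) for n in names)):
        a = dict(zip(names, values))
        hits = sum(phi[a[x]] == a[y] for x, y, phi in tests)
        best = max(best, Fraction(hits, len(tests)))
    return best

tests = [("x", "y", {0: 0, 1: 1}), ("x", "y", {0: 1, 1: 1})]
dist = {
    "x": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "y": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
print(expected_fraction(tests, dist), best_in_support(tests, dist))  # prints 5/8 1
```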
19 Distributional-assignment to Φ
[Figure: two alternative pictures of an assignment to the variables x_1, x_2, x_3, …, x_n]
20 Restriction and Extension
Def: For any y ∈ Y over R_Y and x ∈ X over R_X s.t. φ_{x→y} ∈ Φ:
  The natural restriction of an a ∈ R_X to R_y is denoted a|_y.
  The elevation of a subset F ∈ P[R_Y] to R_X is the subset F* ∈ P[R_X] of all members a of R_X for which φ_{x→y}(a) ∈ F:
  F* = { a | a|_y ∈ F }
21 Long-Code
In the long-code the set of legal-words consists of all monotone dictatorships.
This is the most extensive binary code, as its bits represent all possible binary values over n elements.
22 Long-Code
Encoding an element e ∈ [n]:
E_e legally-encodes an element e if E_e = f_e
[Figure: table of T/F values illustrating the encoding]
23 Long-Code over Range R
BP[R]: the set of all subsets of R of size ≤ ½|R|; |BP[R]| = 2^{|R|−1} − 1
Our long-code: in our context there are two types of domains “R”: R_x and R_y.
Def: an R-long-code has 1 bit for each F ∈ P[R]; namely, it is any Boolean function P[R] → {−1, 1}.
Def: a legal-long-code-word of an element e ∈ R is a long-code E^R_e : P[R] → {−1, 1} that assigns the truth value of e ∈ F to every subset F ∈ P[R].
24 Linearity of a Legal-Encoding
An assignment A : BP[R] → {−1, 1}, if legal, is a linear function, i.e., ∀F, G ∈ BP[R]: A(F)·A(G) = A(F·G)
Unfortunately, any character is linear as well!
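Linearity is easy to verify exhaustively for a tiny range. In the sketch below a subset F stands for its ±1 indicator (True ↦ −1, False ↦ 1), so the pointwise product F·G corresponds to the symmetric difference of the sets. As a simplifying assumption the code ranges over all of P[R] instead of the restricted family BP[R]; the names are illustrative.

```python
from itertools import combinations

def subsets(r):
    """All subsets of r as frozensets."""
    items = sorted(r)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def legal_codeword(e, r):
    """E_e(F) = -1 (True) if e is in F, else 1 (False), for every subset F of r."""
    return {f: (-1 if e in f else 1) for f in subsets(r)}

def is_linear(word, r):
    """Check word(F) * word(G) == word(F symmetric-difference G) for all F, G."""
    all_sets = subsets(r)
    return all(word[f] * word[g] == word[f ^ g] for f in all_sets for g in all_sets)

R = (0, 1, 2)
E0, E1 = legal_codeword(0, R), legal_codeword(1, R)
# The product of two legal code-words is a character that encodes no single element.
character = {f: E0[f] * E1[f] for f in subsets(R)}
print(is_linear(E0, R), is_linear(character, R))  # prints True True
```

The second `True` is the slide's warning in action: passing a linearity check does not certify that a word is a legal encoding.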
25 The Variables of L_Φ
Consider Φ[X→Y] for a large constant k (to be fixed later).
L_Φ has 2 types of variables:
1. a variable z[y,F] for every variable y ∈ Y and subset F ∈ BP[R_y]
2. a variable z[x,F] for every variable x ∈ X and subset F ∈ BP[R_x]
26 The Distribution
Def: denote by D_ε the distribution over all subsets of R_x which assigns probability to a subset H as follows: independently, for each a ∈ R_x, let
  a ∉ H with probability 1 − ε
  a ∈ H with probability ε
One should think of D_ε as a multiset of subsets in which every subset H appears with the appropriate probability.
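The distribution can be written down exactly for a small range. This sketch assumes, as in the completeness proof later in the deck, that an element lands in H with the small probability (called `eps` here); the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def subsets(r):
    """All subsets of r as frozensets."""
    items = sorted(r)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def subset_probability(h, r, eps):
    """Probability of H = h when each element of r joins H independently with probability eps."""
    return eps ** len(h) * (1 - eps) ** (len(r) - len(h))

R_X = (0, 1, 2, 3)
eps = Fraction(1, 10)
total = sum(subset_probability(h, R_X, eps) for h in subsets(R_X))
contains_0 = sum(subset_probability(h, R_X, eps) for h in subsets(R_X) if 0 in h)
print(total, contains_0)  # prints 1 1/10
```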
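27 Linear equation
L_Φ’s multiplicative equations are the union, over all φ_{x→y} ∈ Φ, of the following:
∀F ∈ P[R_Y], G ∈ P[R_X], and H drawn from the distribution over subsets of R_X in which each a ∈ R_X is included independently with probability ε:
z[y,F] · z[x,G] = z[x, F*·G·H]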
28 Revised Representation
Multiplicative representation:
  True ↦ −1
  False ↦ 1
L_Φ:
  z[X,*], z[Y,*] ∈ {−1, 1}
  z[X,f] · z[Y,g] · z[X,‘fgh’] = 1
[Diagram of the proof: multiplicative representation; representation by Fourier basis; general Fourier analysis facts; Claim 1; Claim 2; Claim 3: the expected success of the distributional assignment on φ[C,V] ∈ par[Φ,k] is at least 4εδ²; conclusion Γ(par[Φ,k]) > 4εδ²]
29 Prop: if Γ(Φ) = 1 then Γ(L_Φ) ≥ 1 − ε
Proof: Let A be a satisfying assignment to Φ. Assign all variables of L_Φ according to the legal encoding of A’s values. A linear equation of L_Φ, corresponding to x, y, F, G, H, is unsatisfied exactly if A(x) ∈ H, which occurs with probability ε over the choice of H.
Note: this is independent of k! (Later we use that fact to choose k large enough for our needs.)
LLC-Lemma: Γ(L_Φ) = ½ + δ/2 ⇒ Γ(Φ) > 4εδ², where δ = 2Γ(L_Φ) − 1.
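The completeness argument can be verified exhaustively on a toy range. The sketch below is illustrative and makes simplifying assumptions: sets range over all of P[R] rather than BP[R], F and G are uniform, each element joins H independently with probability `eps`, and `projection` plays the role of the restriction from R_x to R_y. Sets stand for their ±1 indicators, so the product F*·G·H is computed as a symmetric difference.

```python
from fractions import Fraction
from itertools import combinations

def subsets(r):
    """All subsets of r as frozensets."""
    items = sorted(r)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def bit(e, f):
    """Long-code bit of element e at subset f: -1 (True) if e is in f, else 1 (False)."""
    return -1 if e in f else 1

def acceptance_probability(a, projection, r_x, r_y, eps):
    """Probability that z[y,F] * z[x,G] == z[x, F* G H] under the legal encoding of a."""
    b = projection[a]
    total, pairs = Fraction(0), 0
    for f in subsets(r_y):
        f_star = frozenset(u for u in r_x if projection[u] in f)  # elevation of F
        for g in subsets(r_x):
            pairs += 1
            for h in subsets(r_x):
                p_h = eps ** len(h) * (1 - eps) ** (len(r_x) - len(h))
                if bit(b, f) * bit(a, g) == bit(a, f_star ^ g ^ h):
                    total += p_h
    return total / pairs

R_X, R_Y = (0, 1, 2), (0, 1)
projection = {0: 0, 1: 1, 2: 0}
eps = Fraction(1, 10)
print(acceptance_probability(0, projection, R_X, R_Y, eps))  # prints 9/10
```

The equation fails exactly when the encoded element lies in H, so the result equals 1 − eps regardless of which element is encoded or how large the ranges are.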
30 Hardness of approximating Max-E3-Lin-2
Main Theorem: For any constant ε > 0, gap-Max-E3-Lin-2(1 − ε, ½ + ε) is NP-hard.
Proof: Let Φ' be a gap-3SAT-7(1, 1 − ε) instance. By the proposition, Γ(Φ') = 1 ⇒ Γ(L_Φ') ≥ 1 − ε.
31 Lemma ⇒ Main Theorem
Prop: Let ε > 0 be a constant s.t. (1 − ε)/(½ + ε/2) ≥ 2 − γ, and let k be large enough s.t. 4ε³ > Γ(Φ')^{c·k}. Then Γ(Φ') < 1 ⇒ Γ(L_Φ') ≤ ½ + ε/2 ≤ ½ + ε.
Proof: Assume, by way of contradiction, that Γ(L_Φ') > ½ + ε/2, i.e. Γ(L_Φ') = ½ + δ/2 for some δ > ε. Then, by the parallel repetition lemma and the LLC-Lemma (with Φ = par[Φ', k]):
4ε³ > Γ(Φ')^{c·k} ≥ Γ(Φ) > 4εδ²,
which implies that ε > δ. Contradiction!
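The phrase "let k be large enough" can be made concrete once numbers are fixed. The sketch below finds the smallest k for which gamma^(c·k) drops below a given threshold. The constant `c` of the parallel repetition lemma is not specified in the slides, so the value used in the example is purely hypothetical, as are the sample values of `gamma` and `threshold`.

```python
import math

def smallest_k(gamma, c, threshold):
    """Smallest integer k >= 1 with gamma ** (c * k) < threshold, for 0 < gamma < 1 and c > 0."""
    assert 0 < gamma < 1 and c > 0 and threshold > 0
    # Closed-form estimate, then adjust to guard against floating-point rounding.
    k = max(1, math.floor(math.log(threshold) / (c * math.log(gamma))) + 1)
    while gamma ** (c * k) >= threshold:
        k += 1
    while k > 1 and gamma ** (c * (k - 1)) < threshold:
        k -= 1
    return k

# Hypothetical numbers: soundness 0.9 for the base instance, c = 0.1, threshold 4 * 0.1 ** 3.
k = smallest_k(0.9, 0.1, 4 * 0.1 ** 3)
print(k)
```

Because the completeness of the long-code test does not depend on k, k can be raised freely until the threshold is met; only the range sizes 2^k and 7^k grow.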
32 Long-Code as an inner product space
Def: { A : BP[R] → {−1, 1} } is an inner-product space: for A, B ∈ { A : BP[R] → {−1, 1} },
⟨A, B⟩ = E_{F ∈ BP[R]} [ A(F)·B(F) ]
33 An Assignment to L_Φ
For any variable x ∈ X: the set z[x,*] of variables of L_Φ represents the long-code of x. Let Â_x(S) be its Fourier coefficient at S ⊆ R_x.
For any variable y ∈ Y: the set z[y,*] of variables of L_Φ represents the long-code of y. Let Â_y(S) be its Fourier coefficient at S ⊆ R_y.
34 The Distributional Assignment
Def: Let Ã be a distributional-assignment to Φ as follows:
For any variable x:
  Choose a set S ⊆ R_x with probability Â_x(S)², the squared Fourier coefficient of the long-code z[x,*] at S.
  Uniformly choose a random assignment a ∈ S.
For any variable y:
  Choose a set S ⊆ R_y with probability Â_y(S)², the squared Fourier coefficient of the long-code z[y,*] at S.
  Uniformly choose a random assignment b ∈ S.
35 Long-code and Fourier Coefficients
Auxiliary Lemmas (where χ_S(f) = ∏_{x∈S} f(x)):
1. For any F, G ∈ BP[R] and S ⊆ R: χ_S(F·G) = χ_S(F)·χ_S(G).
   [apply multiplication’s commutative and associative properties]
2. For any F ∈ BP[R] and S, S' ⊆ R: χ_S(F)·χ_S'(F) = χ_{SΔS'}(F).
   [χ_S(f)·χ_S'(f) = ∏_{x∈S} f(x) · ∏_{x∈S'} f(x) = ∏_{x∈S∩S'} f(x)² · χ_{SΔS'}(f) = 1·χ_{SΔS'}(f)]
3. For any random F (uniformly chosen) and S ≠ ∅: E[χ_S(F)] = 0, and E[χ_∅(F)] = 1.
   [for x ∈ S, f(x) is 1 or −1 with probability ½ each]
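The three auxiliary lemmas can be checked exhaustively over a tiny range. The sketch below takes a character to be the product of f(x) over x ∈ S for a function f from R to {−1, 1}; for simplicity it ranges over all such functions and ignores the restriction to BP[R]. All names are illustrative.

```python
from itertools import combinations, product
from math import prod

R = (0, 1, 2)
SUBSETS = [frozenset(c) for n in range(len(R) + 1) for c in combinations(R, n)]
FUNCTIONS = [dict(zip(R, values)) for values in product((-1, 1), repeat=len(R))]

def chi(s, f):
    """Character chi_S(f): product of f(x) over x in S (equal to 1 for the empty set)."""
    return prod(f[x] for x in s)

def pointwise(f, g):
    """Pointwise product of two +-1 valued functions on R."""
    return {x: f[x] * g[x] for x in R}

lemma1 = all(chi(s, pointwise(f, g)) == chi(s, f) * chi(s, g)
             for s in SUBSETS for f in FUNCTIONS for g in FUNCTIONS)
lemma2 = all(chi(s, f) * chi(t, f) == chi(s ^ t, f)
             for s in SUBSETS for t in SUBSETS for f in FUNCTIONS)
lemma3 = all(sum(chi(s, f) for f in FUNCTIONS) == (len(FUNCTIONS) if not s else 0)
             for s in SUBSETS)
print(lemma1, lemma2, lemma3)  # prints True True True
```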
36 Home Assignment
Given an assignment to a long-code A : BP[R] → {−1, 1}, show that for any constant δ > 0 there is a constant h(δ), which depends on δ but not on R, such that
|{ e ∈ R | dist(E_e, A) > ½ + δ }| ≤ h(δ),
where dist(A_1, A_2) is the fraction of bits on which A_1 and A_2 differ.
37 What’s Ahead
We show that the expected success of the distributional assignment Ã on φ_{x→y} is > 4εδ², in two steps:
First we show (Claim 1) a lower bound on Ã’s success probability for any φ_{x→y}: […]
Then we show (Claim 3) that this value is at least 4εδ².
38 Claim 1
Claim 1: The success probability of the distributional assignment Ã on φ_{x→y} is at least […]
Proof: That success probability is at least […], and if S' = S|_y there is at least one b ∈ S s.t. b|_y ∈ S'.
So Ã’s success probability is at least |S|^{−1} times the probability of the case in which the chosen S' and S satisfy S|_y = S', i.e. at least
Σ_{S ⊆ R_x} Pr[S is chosen for x] · Pr[S|_y is chosen for y] · |S|^{−1}
39 Lemma’s Proof - Claim 2 (1)
Claim 2: […]
Proof: The test accepts iff z[y,F]·z[x,G]·z[x,F*GH] = 1. By our assumption, this happens with probability ½ + δ/2. Now, according to the definition of the expectation:
E_{φ_{x→y}, F, G, H} [ z[y,F]·z[x,G]·z[x,F*GH] ] = 1·(½ + δ/2) + (−1)·(1 − (½ + δ/2)) = δ
40 Lemma’s Proof - Claim 2 (2)
We next show that […]. Hence, […].
41 Lemma’s Proof - Proposition
42 Lemma’s Proof - Claim 3
Claim 3: The expected success of the distributional assignment Ã on φ_{x→y} ∈ Φ is at least 4εδ².
Proof: Claim 1 gives us the initial lower bound for the expected success: […]
43 Lemma’s Proof - Claim 3
As we’ve already seen, […]. Hence, our lower bound takes the form of […].
Or alternatively, […], which allows us to use the known inequality E[x²] ≥ E[x]² and get […].
44 Lemma’s Proof - Claim 3
By auxiliary lemmas, (4ε|S|)^{−1/2} ≥ e^{−2ε|S|} ≥ (1 − 2ε)^{|S|}, i.e. |S|^{−1/2} ≥ (4ε)^{1/2}·(1 − 2ε)^{|S|}, which yields the following bound: […]
That is, […]
Now applying Claim 2 gives the desired lower bound.
45 Lemma’s Proof - Conclusion
We showed that there is an assignment scheme with expected success of at least 4εδ²
⇒ there exists an assignment that satisfies at least a 4εδ² fraction of the tests in Φ
⇒ Γ(Φ) > 4εδ²
Q.E.D.
46 Home Assignment
Show it is NP-hard, for any ε > 0, given a 3SAT instance Ψ, to distinguish between the case where Γ(Ψ) = 1 and the case in which Γ(Ψ) < 7/8 + ε.
Hint: Let Ψ’s variables be as in L_Φ, and let Ψ’s clauses take the form F OR G OR F*·G·H, for F and G chosen in the same way as in L_Φ, while H is chosen as follows:
  H(b) = 1 for b such that F(b|_V) and G(b) are both FALSE
  For all other b’s, independently for each b, H(b) = 1 with probability ε, and −1 with probability 1 − ε