1 Noise-Insensitive Boolean-Functions are Juntas Guy Kindler & Muli Safra Slides prepared with the help of Adi Akavia

2 Influential People The theory of the influence of variables on Boolean functions [KKL, BL, R, M] and related issues was introduced to tackle social-choice problems. This area has motivated a magnificent sequence of works related to economics [K], percolation [BKS], and hardness of approximation [DS], revolving around the Fourier/Walsh analysis of Boolean functions… And the real important question:

3 Where to go for Dinner? The alternatives: diners cast their votes in (electronic) envelopes, and the system decides, not necessarily by majority… It turns out someone (in the Florida wing) has the ability to flip some votes. Power ⇒ influence.

4 Voting Systems n agents each vote either "for" (T) or "against" (F); the outcome is a Boolean function f over n variables. The value of each agent (variable) may, independently, flip with some small probability. Bottom line: one cannot design an f that is robust to such noise, that is, one that changes value, on average, with only small probability, unless f takes into account only very few of the votes.

5 Dictatorship Def: a Boolean function f: P([n]) → {-1,1} is a monotone e-dictatorship (denoted f_e) if f_e(x) = 1 iff e ∈ x.

6 Juntas Def: a Boolean function f: P([n]) → {-1,1} is a j-junta if ∃ J ⊆ [n] with |J| ≤ j s.t. for every x ∈ P([n]), f(x) = f(x ∩ J). Def: f is an [ε, j]-junta if ∃ a j-junta f′ s.t. Pr_x[f(x) ≠ f′(x)] ≤ ε. Def: f is an [ε, j, p]-junta if ∃ a j-junta f′ s.t. Pr_{x~μ_p}[f(x) ≠ f′(x)] ≤ ε, where μ_p is the p-biased product distribution. We will tend to omit p.
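The j-junta definition above can be checked by brute force over small domains; a minimal sketch (the function names and the bitmask encoding of subsets of [n] are ours, not from the slides):

```python
from itertools import combinations

# Subsets of [n] are encoded as n-bit masks; f maps a mask to +-1.
# f is a j-junta iff some J of at most j coordinates satisfies
# f(x) = f(x & J) for every input x.

def is_junta(f, n, j):
    """Return True iff f (a function on bitmasks over n bits) is a j-junta."""
    for size in range(j + 1):
        for J in combinations(range(n), size):
            mask = sum(1 << i for i in J)
            if all(f(x) == f(x & mask) for x in range(1 << n)):
                return True
    return False

# Example: majority of the first 3 of 4 variables depends on exactly 3 bits,
# so it is a 3-junta but not a 2-junta.
maj3 = lambda x: 1 if bin(x & 0b0111).count("1") >= 2 else -1
```

The search is exponential in n and j, so this is only a sanity check for toy instances, not an efficient junta test.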

7 Long-Code In the long-code L: [n] → {0,1}^(2^n), each element is encoded by 2^n bits. This is the most extensive binary code, having one bit for every subset in P([n]).

8 Long-Code Encoding an element e ∈ [n]: the bit of E_e indexed by x ∈ P([n]) is T iff e ∈ x. E_e legally encodes an element e if E_e = f_e.
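The encoding on this slide can be written out directly; a sketch (names ours) using ±1 values, matching the {-1,1} range of the dictatorships f_e:

```python
# The long-code codeword of e lists, for every subset x of [n] encoded as a
# bitmask, whether e belongs to x; i.e. E_e is the truth-table of the
# monotone dictatorship f_e on all 2^n inputs.

def long_code(e, n):
    """Return the 2^n-entry codeword of e in [n] as a list of +-1 values."""
    return [1 if (x >> e) & 1 else -1 for x in range(1 << n)]

# For n = 2 the codeword of e = 0 reads the subsets {}, {0}, {1}, {0,1}.
```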

9 Long-Code ≡ Monotone-Dictatorship The truth-table of a Boolean function over n elements can be considered as a 2^n-bit-long string (each bit corresponding to one input setting, i.e. a subset of [n]). For the long-code, the legal code-words are exactly the monotone dictatorships. How about the Hadamard code?

10 Long-code Tests Def (a long-code test): given a code-word w, probe it in a constant number of entries, and: accept w.h.p. if w is a monotone dictatorship; reject w.h.p. if w is not close to any monotone dictatorship.

11 Efficient Long-code Tests For some applications it suffices if the test may accept illegal code-words, as long as they have a short list-decoding: Def (a long-code list-test): given a code-word w, probe it in 2 or 3 places, and: accept w.h.p. if w is a monotone dictatorship; reject w.h.p. if w is not even approximately determined by a short list of domain elements, that is, if there is no small junta J ⊆ [n] and f′ with f′(x) = f′(x ∩ J) for all x s.t. f is close to f′. Note: a long-code list-test distinguishes between the case that w is a dictatorship and the case that w is far from every junta.

12 General Direction These tests may vary. The long-code list-test, in particular its biased-case version, seems essential in proving improved hardness results for approximation problems. There are other interesting applications as well. Therefore: finding simple, as-weak-as-possible sufficient conditions for a function to be a junta is important.

13 Background Parameters: average-sensitivity [M, R, BL, KKL, F]; high-frequency weight [KKL, B]; noise-sensitivity [BKS]. Thm (Friedgut): a Boolean function f with small average-sensitivity is an [ε, j]-junta. Thm (Bourgain): a Boolean function f with small high-frequency weight is an [ε, j]-junta. Thm: a Boolean function f with small high-frequency weight in a p-biased measure is an [ε, j]-junta. Corollary: a Boolean function f with small noise-sensitivity is an [ε, j]-junta.

14 Noise-Sensitivity How often does the value of f change when the input is perturbed?

15 Def( ,p,x [n] ): Let 0< <1, and x  P([n]). Then y~ ,p,x, if y = (x\I)  z where Def( ,p,x [n] ): Let 0< <1, and x  P([n]). Then y~ ,p,x, if y = (x\I)  z where I~  [n] is a noise subset, and I~  [n] is a noise subset, and z~  p I is a replacement. z~  p I is a replacement. Def( -noise-sensitivity): let 0< <1, then [ When p=½ equivalent to flipping each coordinate in x w.p. /2.] [n] x I I z Noise-Sensitivity

16 Fourier/Walsh Transform Write f: {-1,1}^n → {-1,1} as a polynomial. What would be the monomials? For every set S ⊆ [n] we have a monomial which is the product of all variables in S (the only relevant powers are 0 or 1). It now makes sense to consider the degree of f, or to break it up according to the degrees of its monomials.
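The expansion described above can be computed by enumeration; a small self-contained sketch (function names and the bitmask encoding of S are ours): f(x) = Σ_S f̂(S)·χ_S(x) with χ_S(x) = Π_{i∈S} x_i and f̂(S) = E_x[f(x)·χ_S(x)].

```python
from itertools import product
from math import prod

def chi(S, x):
    """The character chi_S(x): the monomial over the coordinates in bitmask S."""
    return prod(x[i] for i in range(len(x)) if (S >> i) & 1)

def fourier_coeffs(f, n):
    """All 2^n Walsh-Fourier coefficients of f under the uniform measure."""
    pts = list(product([-1, 1], repeat=n))
    return {S: sum(f(x) * chi(S, x) for x in pts) / len(pts)
            for S in range(1 << n)}

# Example: majority of 3 bits expands as (x1 + x2 + x3)/2 - x1*x2*x3/2,
# so each singleton gets coefficient 1/2 and the triple gets -1/2.
maj3 = lambda x: 1 if sum(x) > 0 else -1
```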

17 High/Low Frequencies Def: the high-frequency portion of f: f^{>k} = Σ_{|S|>k} f̂(S)·χ_S. Def: the low-frequency portion of f: f^{≤k} = Σ_{|S|≤k} f̂(S)·χ_S. Def: the high-frequency weight: ||f^{>k}||² = Σ_{|S|>k} f̂(S)². Def: the low-frequency weight: ||f^{≤k}||² = Σ_{|S|≤k} f̂(S)².

18 Low High-Frequency Weight Prop: the δ-noise-sensitivity can be expressed in Fourier transform terms as ns_δ(f) = ½(1 − Σ_S (1−δ)^{|S|}·f̂(S)²). Prop: low ns ⇒ low high-frequency weight. Proof: by the above proposition, low noise-sensitivity implies Σ_S (1−δ)^{|S|}·f̂(S)² is close to 1; nevertheless, f being a {-1,1} function, the Parseval formula (the 2-norm of the function and of its Fourier transform are equal) implies Σ_S f̂(S)² = 1, hence the weight Σ_{|S|>k} f̂(S)² above level k must be small.
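The proposition can be verified numerically in the p = ½ case; a self-contained check (ours, not from the slides), where δ-noise flips each coordinate independently w.p. δ/2, so E[x_i·y_i] = 1−δ and Pr[f(x) ≠ f(y)] = (1 − Σ_S (1−δ)^|S|·f̂(S)²)/2. Both sides are computed exactly by enumeration, so they must agree.

```python
from itertools import product
from math import prod

def ns_exact(f, n, delta):
    """Exact Pr[f(x) != f(y)], flipping each coordinate of x w.p. delta/2."""
    eta = delta / 2
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for x in pts:
        for mask in range(1 << n):           # mask = set of flipped coordinates
            k = bin(mask).count("1")
            weight = eta ** k * (1 - eta) ** (n - k)
            y = tuple(-x[i] if (mask >> i) & 1 else x[i] for i in range(n))
            total += weight * (f(x) != f(y))
    return total / len(pts)

def ns_fourier(f, n, delta):
    """The same quantity via the Fourier formula of the proposition."""
    pts = list(product([-1, 1], repeat=n))
    acc = 0.0
    for S in range(1 << n):
        chat = sum(f(x) * prod(x[i] for i in range(n) if (S >> i) & 1)
                   for x in pts) / len(pts)
        acc += (1 - delta) ** bin(S).count("1") * chat * chat
    return (1 - acc) / 2

maj3 = lambda x: 1 if sum(x) > 0 else -1
```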

19 Average and Restriction Def: let I ⊆ [n] and x ∈ P([n]\I); the restriction function is f_I[x]: P(I) → {-1,1}, f_I[x](y) = f(x ∪ y). Def: the average function is A_I[f](x) = E_y[f_I[x](y)]. Note: A_I[f] does not depend on the coordinates in I.

20 Fourier Expansion Prop: the averaging operator keeps only the characters that avoid I, i.e. A_I[f] = Σ_{S∩I=∅} f̂(S)·χ_S.
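The proposition as we read it (the slide's formula did not survive) says averaging out the coordinates in I zeroes exactly the coefficients of sets meeting I and leaves the others unchanged. A self-contained numeric check (encoding ours):

```python
from itertools import product
from math import prod

def average_out(f, n, I):
    """A_I[f]: the coordinates in bitmask I are averaged out uniformly."""
    idx = [i for i in range(n) if (I >> i) & 1]
    def g(x):
        vals = []
        for bits in product([-1, 1], repeat=len(idx)):
            y = list(x)
            for i, b in zip(idx, bits):
                y[i] = b
            vals.append(f(tuple(y)))
        return sum(vals) / len(vals)
    return g

def coeff(f, n, S):
    """One Walsh-Fourier coefficient fhat(S) under the uniform measure."""
    pts = list(product([-1, 1], repeat=n))
    return sum(f(x) * prod(x[i] for i in range(n) if (S >> i) & 1)
               for x in pts) / len(pts)

maj3 = lambda x: 1 if sum(x) > 0 else -1
```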

21 Influence/Variation Def: the variation of I on f: variation_I(f) = …. Prop: the following are equivalent definitions of the variation of I on f: the Fourier weight Σ_{S∩I≠∅} f̂(S)², and the expected variance of the restriction f_I[x] over a random x. Def: influence_i(f) = variation_{{i}}(f).
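For the singleton case the equivalence can be checked directly; a sketch (ours): for Boolean f, influence_i(f) equals both Pr_x[f(x) ≠ f(x with coordinate i flipped)] and the Fourier weight Σ_{S: i∈S} f̂(S)².

```python
from itertools import product
from math import prod

def influence_flip(f, n, i):
    """Probability that flipping coordinate i changes f."""
    pts = list(product([-1, 1], repeat=n))
    flips = sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in pts)
    return flips / len(pts)

def influence_fourier(f, n, i):
    """Sum of fhat(S)^2 over sets S containing i."""
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for S in range(1 << n):
        if (S >> i) & 1:
            c = sum(f(x) * prod(x[j] for j in range(n) if (S >> j) & 1)
                    for x in pts) / len(pts)
            total += c * c
    return total

# Majority of 3: flipping one vote matters iff the other two disagree (prob 1/2).
maj3 = lambda x: 1 if sum(x) > 0 else -1
```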

22 Proof Recall A_I[f] = Σ_{S∩I=∅} f̂(S)·χ_S. Therefore ||A_I[f]||² = Σ_{S∩I=∅} f̂(S)².

23 Proof – Cont. Recall …. Therefore (by Parseval): variation_I(f) = Σ_{S∩I≠∅} f̂(S)².

24 Proof First, let’s show:

25 Low-frequency Variation and a.s. Def: the low-frequency variation is: variation_I^{≤k}(f) = Σ_{S∩I≠∅, |S|≤k} f̂(S)². Def: the average-sensitivity is as(f) = Σ_i influence_i(f), and in Fourier representation: as(f) = Σ_S |S|·f̂(S)². Def: the low-frequency average-sensitivity is: as^{≤k}(f) = Σ_{|S|≤k} |S|·f̂(S)².
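The two expressions for average-sensitivity (sum of influences, and Σ_S |S|·f̂(S)²) can be checked against each other; a self-contained sketch (ours, not from the slides):

```python
from itertools import product
from math import prod

def as_combinatorial(f, n):
    """Sum over coordinates of the probability that flipping it changes f."""
    pts = list(product([-1, 1], repeat=n))
    total = 0
    for i in range(n):
        total += sum(f(x) != f(x[:i] + (-x[i],) + x[i + 1:]) for x in pts)
    return total / len(pts)

def as_fourier(f, n):
    """sum_S |S| * fhat(S)^2 under the uniform measure."""
    pts = list(product([-1, 1], repeat=n))
    total = 0.0
    for S in range(1 << n):
        c = sum(f(x) * prod(x[i] for i in range(n) if (S >> i) & 1)
                for x in pts) / len(pts)
        total += bin(S).count("1") * c * c
    return total

# Majority of 3: each coordinate has influence 1/2, so as(maj3) = 3/2.
maj3 = lambda x: 1 if sum(x) > 0 else -1
```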

26 Biased Walsh Product [Talagrand] Def: in the p-biased product distribution μ_p, the probability of a subset x is μ_p(x) = p^{|x|}·(1−p)^{n−|x|}. The usual Fourier basis {χ_S} is not orthogonal with respect to the biased inner product; hence, we use the biased Walsh product.

27 Main Result Theorem: ∃ a constant γ > 0 s.t. any Boolean function f: P([n]) → {-1,1} with small high-frequency weight above level k is an [ε, j]-junta for j = O(ε^{-2}·k^3·2^{2k}). Corollary: fix a p-biased distribution μ_p over P([n]), let δ > 0 be any parameter, and set k = log_{1−δ}(1/2). Then ∃ a constant γ > 0 s.t. any Boolean function f: P([n]) → {-1,1} with small δ-noise-sensitivity is an [ε, j]-junta for j = O(ε^{-2}·k^3·2^{2k}).

28 The KKL/Friedgut Framework Thm: any Boolean function f is an [ε, j]-junta for j = 2^{O(as(f)/ε)}. Proof: 1. Specify the junta J = {i : influence_i(f) ≥ δ}, where k = O(as(f)/ε) and δ = 2^{-O(k)}. 2. Show that the complement of J has small variation.

29 Proving [n]\J has small variation Prop: let f be a Boolean function s.t. variation_{[n]\J}(f) ≤ ε/2; then f is an [ε, |J|]-junta. Proof: define a junta f′ as follows: f′(x) = sign(A_{[n]\J}[f](x ∩ J)); then f′ is a |J|-junta, and hence Pr[f ≠ f′] ≤ ε.

30 KKL/Friedgut Lemma: variation_{[n]\J}(f) ≤ ε/2. Proof: bound the low-frequency and the high-frequency parts separately. Prop [KKL]: characters of size ≥ k contribute at least k·f̂(S)² each to the average-sensitivity (since as(f) = Σ_S |S|·f̂(S)²), hence ||f^{>k}||² ≤ as(f)/k.


32 Beckner/Nelson/Bonami Inequality Def: let T_δ be the following operator on f: T_δ f = Σ_S δ^{|S|}·f̂(S)·χ_S. Thm: for any p ≥ r and δ ≤ ((r−1)/(p−1))^{½}, ||T_δ f||_p ≤ ||f||_r. Corollary: for g of degree k, ||g||_p ≤ (p−1)^{k/2}·||g||_2.
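The theorem can be sanity-checked numerically for one instance of the parameters; a sketch (ours) with p = 4, r = 2, δ = (1/3)^½, computing exact norms by enumeration:

```python
from itertools import product
from math import prod

def t_delta(f, n, delta):
    """The noise operator: T_delta f = sum_S delta^|S| * fhat(S) * chi_S."""
    pts = list(product([-1, 1], repeat=n))
    coeffs = {S: sum(f(x) * prod(x[i] for i in range(n) if (S >> i) & 1)
                     for x in pts) / len(pts)
              for S in range(1 << n)}
    def g(x):
        return sum(delta ** bin(S).count("1") * c *
                   prod(x[i] for i in range(n) if (S >> i) & 1)
                   for S, c in coeffs.items())
    return g

def norm(f, n, q):
    """The L_q norm E[|f|^q]^(1/q) under the uniform measure."""
    pts = list(product([-1, 1], repeat=n))
    return (sum(abs(f(x)) ** q for x in pts) / len(pts)) ** (1 / q)

maj3 = lambda x: 1 if sum(x) > 0 else -1
```

This checks only one function and one parameter choice, of course; the inequality itself holds for all f.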

33 Beckner/Nelson/Bonami Corollary Proof:

34 Friedgut’s Proof Prop: …. Proof: we do not know whether as(f) is small, so we must make do with only as^{≤k}(f); the corresponding step is valid only because the function in question is {-1,0,1}-valued.

35 If k were 1 Easy case (!?!): if we had a bound on the non-linear weight, we would be done. The linear part is a set of independent characters (the singletons). Concentration of measure: in order for these to hit close to 1 or −1 most of the time, they must avoid the law of large numbers, namely be almost entirely placed on one singleton [by a Chernoff-like bound]. (!) [FKN, ext.]: if f is close to linear then f is close to shallow (a constant function or a dictatorship).

36 Almost Linear ⇒ Almost Shallow Thm ([FKN]): ∃ a global constant M s.t. for every Boolean function f there is a shallow Boolean function g s.t. ||f − g||² ≤ M·||f^{>1}||². Hence, if ||f_I[x]^{>1}||² is small, then f_I[x] is close to shallow!

37 How to Deal with Dependency between Characters? Recall the theorem’s premise: the high-frequency weight ||f^{>k}||² is small. Idea: partition [n]\J into I_1, …, I_r, for r ≫ k. W.h.p. f_I[x] is close to linear (low-frequency characters intersect I in expectation in at most one element, while the high-frequency weight is low).

38 So what? f_I[x] is close to linear, so by [FKN], f_I[x] is close to shallow for any x. Still, f_I[x] could be a different dictatorship for different x’s, hence the variation of each i ∈ I might be low!

39 Dictatorship and its Singleton Prop: for a dictatorship h, ∃ a coordinate i whose singleton {i} carries almost all the Fourier weight (where p is the bias); the total weight off that singleton is no more than 1−p. Corollary (from [FKN]): ∃ a global constant M s.t. for every Boolean function h, either … or ….

40 Main Lemma Lemma: ∃ γ > 0 s.t. for any ε and any function g: P([m]) → ℝ, the following holds: … (the two terms bound the low-frequency and the high-frequency parts).

41 Probability Concentration Simple bound: …. Proof: …. Low-freq bound: let g: P([m]) → ℝ be of degree k and ε > 0; then ∃ γ > 0 s.t. …. Proof: recall the Beckner/Nelson/Bonami corollary.

42 Lemma’s Proof Now, let’s prove the lemma, bounding the low- and high-frequency parts separately: for every ε, the high-frequency part is handled by the simple bound and the low-frequency part by the low-freq bound.

43 f_I[x] Mostly Constant Lemma: ∃ γ > 0 s.t. for any ε and any function g: P([m]) → ℝ: …. Def: let D_I be the set of x ∈ P([n]\I) s.t. f_I[x] is a dictatorship. Next we show that |D_I| must be small, hence for most x, f_I[x] is constant.

44 |D_I| must be small Lemma: …. Proof: denote …; then, by the previous lemma and Parseval, …. Each S is counted for only one index i ∈ I (otherwise, if S were counted for both i and j in I, then |S ∩ I| > 1).

45 Simple Prop Prop: let {a_i}_{i∈I} be a sub-distribution, that is, Σ_{i∈I} a_i ≤ 1 and a_i ≥ 0; then Σ_{i∈I} a_i² ≤ max_{i∈I}{a_i}. Proof: Σ_i a_i² ≤ (max_i a_i)·Σ_i a_i ≤ max_i a_i.

46 |D_I| must be small – Cont. Therefore (since …), …; hence |D_I| is small.

47 Obtaining the Lemma It remains to show that indeed: Prop 1: …. Prop 2: {χ_S}_S are orthonormal, and …. Recall …; however ….

48 Obtaining the Lemma – Cont. Prop 3: …. Proof: separate by frequency. Small freq: …. Large freq: …. Corollary (from Props 2, 3): ….

49 Obtaining the Lemma – Cont. Recall: by the corollary from [FKN], either … or …. Hence …; by the corollary, combined with Prop 1, we obtain that |D_I| is small.

50 Recap: Prop 1, the corollary from [FKN] (either … or …), and Prop 2 together give that |D_I| must be small.

51 Where to go for Dinner? The alternatives: diners cast their votes in (electronic) envelopes, and the system decides, not necessarily by majority… It turns out someone (in the Florida wing) has the ability to flip some votes. Power ⇒ influence. Of course, they’ll have to discuss it over dinner….

52 Discussion Tests that look at only 2 or 3 places cannot produce a large gap between the probability of acceptance of a dictatorship and that of a function not so close to a junta. Nevertheless, if one requires the function to have additional properties, such as local maximality, one may be able to design a test with a large gap.

53 Shallow Functions Def: a function f is linear if only singletons have non-zero weight. Def: a function f is shallow if f is either a constant or a dictatorship. Claim: Boolean linear functions are shallow.

54 Boolean Linear ⇒ Shallow Claim: Boolean linear functions are shallow. Proof: let f be a Boolean linear function; we next show: 1. ∃ {i_0} s.t. its singleton carries almost all the weight (i.e. …); 2. and conclude that either … or …, i.e. f is shallow.

55 Claim 1 Claim 1: let f be a Boolean linear function; then ∃ {i_0} s.t. …. Proof: w.l.o.g. assume …. For any z ⊆ {3,…,n}, consider x_00 = z, x_10 = z∪{1}, x_01 = z∪{2}, x_11 = z∪{1,2}; then …. The next value must be far from {−1,1}: a contradiction (f is a Boolean function)! Therefore ….

56 Claim 1 – Cont. Proof: w.l.o.g. assume …. For any z ⊆ {3,…,n}, consider x_00 = z, x_10 = z∪{1}, x_01 = z∪{2}, x_11 = z∪{1,2}; then …. But this is impossible, as f(x_00), f(x_10), f(x_01), f(x_11) ∈ {−1,1}, hence their distances cannot all be > 0! Therefore ….

57 Claim 2 Claim 2: let f be a Boolean function s.t. …; then either … or …. Proof: consider f̂(∅) and f̂({i_0}); then …; but f is Boolean, hence …; therefore ….

58 Linearity and Dictatorship Prop: let f be a balanced linear Boolean function; then f is a dictatorship. Proof: f̂(∅), f̂({i_0}) ∈ {−1,1}, hence …. Prop: let f be a balanced Boolean function s.t. as(f) = 1; then f is a dictatorship. Proof: as(f) = Σ_S |S|·f̂(S)² ≥ Σ_{S≠∅} f̂(S)², with equality only when all the weight sits on singletons; but f is balanced (i.e. f̂(∅) = 0), therefore f is also linear.

59 Proving FKN: almost-linear ⇒ close to shallow Theorem: let f: P([n]) → ℝ be linear, let …, and let i_0 be the index s.t. f̂({i_0}) is maximal; then …. Note: f is linear, hence w.l.o.g. assume i_0 = 1; then all we need to show is: …. We show that in the following claim and lemma.

60 Corollary Corollary: let f be linear with …; then ∃ a shallow Boolean function g s.t. …. Proof: let …, and let g be the Boolean function closest to l. Then …; this is true, as … is small (by the theorem), and additionally … is small, since ….

61 Claim 1 Claim 1: let f be linear; w.l.o.g. assume …; then ∃ a global constant c = min{p, 1−p} s.t. … (each character carrying weight no more than c).

62 Proof of Claim 1 Proof: assume …. For any z ⊆ {3,…,n}, consider x_00 = z, x_10 = z∪{1}, x_01 = z∪{2}, x_11 = z∪{1,2}; then …. The next value must be far from {−1,1}: a contradiction (to …)!

63 Proof of Claim 1 – Cont. Proof: assume …. For any z ⊆ {3,…,n}, consider x_00 = z, x_10 = z∪{1}, x_01 = z∪{2}, x_11 = z∪{1,2}; then …. Hence … (they cannot all be near {−1,1}!). Therefore, for a random x this holds w.p. at least c, and therefore …: a contradiction.

64 Lemma Lemma: let g be linear, let …, and assume …; then …. Corollary: the theorem follows from the combination of Claim 1 and the lemma: let m be the minimal index s.t. …; consider …. If m = 2, the theorem is obtained (by the lemma); otherwise …: a contradiction to the minimality of m. Note: ….

65 Lemma’s Proof Lemma’s proof: note …. Hence, all we need to show is that …. Intuition: note that |g| and |b| are far from 0 (since |g| is δ-close to 1 and cδ-close to b). Assume b > 0; then for almost all inputs x, g(x) = |g(x)| (as …). Hence E[g] ≈ E[|g|], and therefore var(g) ≈ var(|g|).

66 Lemma’s Proof – Cont. We next show var(g) ≈ var(|g|): E²[g] − E²[|g|] = 2E²[|g|·1_{g<0}] ≤ o(δ) (by Azuma’s inequality). By the premise …; however …; therefore …. Proof map: |g|, |b| are far from 0 ⇒ g(x) = |g(x)| for almost all x ⇒ E[g] ≈ E[|g|] ⇒ var(g) ≈ var(|g|).

67 Variation Lemma Lemma (variation): ∃ γ > 0 and r ≫ k s.t. …. Corollary: for most I and x, f_I[x] is almost constant.

68 Using Idea 2 By a union bound on I_1, …, I_r: … (set …). Let f′(x) = sign(A_J[f](x ∩ J)); f′ is the Boolean function closest to A_J[f], therefore …. Hence f is an [ε, j]-junta.

69 Variation Lemma – Proof Plan Lemma (variation): ∃ γ > 0 and r ≫ k s.t. …. Sketch for proving the variation lemma: 1. w.h.p. f_I[x] is almost linear; 2. w.h.p. f_I[x] is close to shallow; 3. f_I[x] cannot be close to a dictatorship too often.

70 The End

71 XOR Test Let Δ be a random procedure for choosing two disjoint subsets x, y s.t. for every i ∈ [n]: i ∈ x\y w.p. 1/3, i ∈ y\x w.p. 1/3, and i ∉ x∪y w.p. 1/3. Def (XOR-Test): pick ⟨x,y⟩ ~ Δ; accept if f(x) ≠ f(y), reject otherwise.

72 Example Claim: let f be a dictatorship; then f passes the XOR-test w.p. 2/3. Proof: let i be the dictator; then Pr_{⟨x,y⟩~Δ}[f(x) ≠ f(y)] = Pr_{⟨x,y⟩~Δ}[i belongs to exactly one of x, y] = 2/3. Claim: let f′ be δ-close to a dictatorship f; then f′ passes the XOR-test w.p. 2/3 − 2/3·(δ − δ²). Proof: see next slide….
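The first claim can be evaluated exactly for small n; a sketch of the test (encoding ours; since x and y are disjoint, we read the third case of the procedure as i joining neither set):

```python
from itertools import product

def xor_test_accept_prob(f, n):
    """Exact acceptance probability of the XOR test: accept iff f(x) != f(y)."""
    accept = 0
    # Each coordinate independently joins x only (0), y only (1), or neither (2).
    for choice in product([0, 1, 2], repeat=n):
        x = sum(1 << i for i in range(n) if choice[i] == 0)
        y = sum(1 << i for i in range(n) if choice[i] == 1)
        accept += (f(x) != f(y))
    return accept / 3 ** n

# Monotone dictatorship of coordinate 0, on subsets encoded as bitmasks.
dictator0 = lambda s: 1 if s & 1 else -1
```

For a dictatorship, f(x) ≠ f(y) exactly when the dictator coordinate falls in x or in y but not neither, which happens with probability 2/3, matching the claim.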

73

74 Local Maximality Def: f is locally maximal with respect to a test if for every f′ obtained from f by a change on one input x_0, Pr_{⟨x,y⟩~Δ}[f(x) ≠ f(y)] ≥ Pr_{⟨x,y⟩~Δ}[f′(x) ≠ f′(y)]. Def: let Δ_x be the distribution of y such that ⟨x,y⟩ ~ Δ. Claim: if f is locally maximal then f(x) = −sign(E_{y~Δ_x}[f(y)]).

75 Claim: …. Proof: immediate from the Fourier expansion, and the fact that y ∩ x = ∅.

76 Conjecture Conjecture: let f be locally maximal (with respect to the XOR-test), and assume f passes the XOR-test w.p. ≥ 1/2 + ε for some constant ε > 0; then f is close to a junta.