Analysis of Boolean Functions and its Applications to Complexity Theory, Economics, Combinatorics, etc. Slides prepared with the help of Ricky Rosen.
Introduction Objectives: to introduce the Analysis of Boolean Functions and some of its applications. Overview: basic definitions, first-passage percolation, mechanism design, graph properties… and more…
Influential People The theory of the influence of variables on Boolean functions [KKL, BL, R, M] was introduced to tackle problems in social choice and distributed computing. It has motivated a magnificent body of work, related to: sharp thresholds [F, FK], percolation [BKS], economics: Arrow’s theorem [K], hardness of approximation [DS] – all utilizing harmonic analysis of Boolean functions… And the really important question:
Where to go for Dinner? The alternatives. Diners would cast their vote in an (electronic) envelope. The system would decide – not necessarily according to majority… And what if someone (in Florida?) can flip some votes? Influence = power.
Boolean Functions Def: a Boolean function is f: {-1,1}ⁿ → {-1,1}. Equivalently, f: P([n]) → {-1,1}, identifying an input with a subset of [n] (choose the locations of the -1s, or choose a sequence of -1s and 1s).
Noise Sensitivity The values of the variables may each, independently, flip with some probability ε. It turns out: one cannot design an f that would be robust to such noise – that is, one that would, on average, change value with only small probability – unless the outcome is determined according to very few of the voters.
Voting and Influence Def: the influence of i on f is the probability, over a random input x, that f changes its value when coordinate i is flipped: influence_i(f) = Pr_x[f(x) ≠ f(x⊕i)].
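This combinatorial definition can be checked directly by brute-force enumeration; a minimal sketch (the function and helper names are mine, not from the slides):

```python
from itertools import product

def influence(f, i, n):
    """Influence of coordinate i on f: the fraction of inputs x in {-1,1}^n
    for which flipping x_i changes f's value."""
    count = 0
    for x in product((-1, 1), repeat=n):
        y = list(x)
        y[i] = -y[i]           # flip coordinate i
        if f(x) != f(tuple(y)):
            count += 1
    return count / 2**n

majority3 = lambda x: 1 if sum(x) > 0 else -1
# coordinate 0 is pivotal exactly when the other two voters disagree:
print(influence(majority3, 0, 3))  # 0.5
```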
Majority: {1,-1}ⁿ → {1,-1}. The influence of i on Majority is the probability, over a random input x, that Majority changes when i is flipped. This happens when half of the other n-1 coordinates (people) vote -1 and half vote 1, i.e. influence_i(Majority) = C(n-1, (n-1)/2)·2^-(n-1) = Θ(n^-½).
Parity: {1,-1}ⁿ → {1,-1}. Flipping any single coordinate always changes the value of Parity, so every coordinate has influence 1.
Dictatorship_i: {1,-1}ⁿ → {1,-1}, Dictatorship_i(x) = x_i. The influence of i on Dictatorship_i is 1; the influence of any j ≠ i on Dictatorship_i is 0.
Average Sensitivity Def: the average sensitivity of f, as(f), is the sum of the influences of all coordinates i ∈ [n]: as(f) = Σ_i influence_i(f). as(Majority) = Θ(n^½), as(Parity) = n, as(Dictatorship) = 1.
Example: Majority over 3 voters. What is its average sensitivity? Each coordinate has influence ½ (it is pivotal exactly when the other two voters disagree), so as = ½ + ½ + ½ = 1.5.
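The 1.5 above, and the as values of the previous slide, can be verified by enumeration; a small sketch (helper names are mine):

```python
import math
from itertools import product

def avg_sensitivity(f, n):
    """as(f) = sum over all coordinates of the influence of each coordinate."""
    total = 0
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] = -y[i]
            if f(x) != f(tuple(y)):
                total += 1
    return total / 2**n

majority3 = lambda x: 1 if sum(x) > 0 else -1
parity3 = lambda x: math.prod(x)
dictator0 = lambda x: x[0]
print(avg_sensitivity(majority3, 3))  # 1.5
print(avg_sensitivity(parity3, 3))    # 3.0 (= n)
print(avg_sensitivity(dictator0, 3))  # 1.0
```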
When as(f)=1 Def: f is a balanced function if it equals -1 exactly half of the time: E_x[f(x)] = 0. Can a balanced f have as(f) < 1? What about as(f) = 1 – besides dictatorships? Prop: if f is balanced and as(f) = 1, then f is a dictatorship.
Representing f as a Polynomial What would the monomials over x ∈ P([n]) be? All powers except 0 and 1 cancel out (x_i² = 1)! Hence, one monomial for each character S ⊆ [n]: χ_S(x) = Π_{i∈S} x_i. These are all the multiplicative functions.
Fourier-Walsh Transform Consider all characters χ_S(x) = Π_{i∈S} x_i. Given any function f: {-1,1}ⁿ → ℝ, let the Fourier-Walsh coefficients of f be f̂(S) = E_x[f(x)·χ_S(x)]; thus f can be described as f = Σ_S f̂(S)·χ_S.
Norms Def: expectation norm on the function: ‖f‖₂ = (E_x[f(x)²])^½. Def: summation norm on the transform: ‖f̂‖₂ = (Σ_S f̂(S)²)^½. Thm [Parseval]: ‖f‖₂ = ‖f̂‖₂. Hence, for a Boolean f: Σ_S f̂(S)² = 1.
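Both the transform and Parseval’s identity can be verified by brute force on a small example; a sketch (helper names are mine):

```python
import math
from itertools import combinations, product

def fourier_coeffs(f, n):
    """Walsh-Fourier coefficients f_hat(S) = E_x[f(x) * chi_S(x)],
    where chi_S(x) is the product of x_i over i in S."""
    xs = list(product((-1, 1), repeat=n))
    return {S: sum(f(x) * math.prod(x[i] for i in S) for x in xs) / 2**n
            for k in range(n + 1) for S in combinations(range(n), k)}

maj3 = lambda x: 1 if sum(x) > 0 else -1
fhat = fourier_coeffs(maj3, 3)
nonzero = {S: c for S, c in fhat.items() if c != 0}
print(nonzero)  # Majority of 3 = (x0 + x1 + x2 - x0*x1*x2) / 2
print(sum(c**2 for c in fhat.values()))  # Parseval: E[f^2] = 1 for Boolean f
```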
Distribution over Characters We may think of the transform as defining a distribution over the characters: since Σ_S f̂(S)² = 1, pick each S with probability f̂(S)².
Simple Observations Claim: for any function f whose range is {-1,0,1}: ‖f‖₂² = Σ_S f̂(S)² = Pr_x[f(x) ≠ 0].
Variables’ Influence Recall: the influence of an index i ∈ [n] on a Boolean function f: {1,-1}ⁿ → {1,-1} is influence_i(f) = Pr_x[f(x) ≠ f(x⊕i)], which can be expressed in terms of the Fourier coefficients of f. Claim: influence_i(f) = Σ_{S∋i} f̂(S)². And the as: as(f) = Σ_S |S|·f̂(S)².
Fourier Representation of Influence Proof: consider the influence function f_i(x) = (f(x) − f(x⊕i))/2, whose range is {-1,0,1} and which is nonzero exactly when i is pivotal. In Fourier representation f_i = Σ_{S∋i} f̂(S)·χ_S, and hence influence_i(f) = ‖f_i‖₂² = Σ_{S∋i} f̂(S)².
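The claim influence_i(f) = Σ_{S∋i} f̂(S)² can be cross-checked numerically, comparing the combinatorial and spectral sides (helper names are mine):

```python
import math
from itertools import combinations, product

def fourier_coeffs(f, n):
    xs = list(product((-1, 1), repeat=n))
    return {S: sum(f(x) * math.prod(x[i] for i in S) for x in xs) / 2**n
            for k in range(n + 1) for S in combinations(range(n), k)}

def influence(f, i, n):
    """Combinatorial influence: Pr_x[f changes when bit i is flipped]."""
    cnt = 0
    for x in product((-1, 1), repeat=n):
        y = list(x)
        y[i] = -y[i]
        cnt += f(x) != f(tuple(y))
    return cnt / 2**n

maj3 = lambda x: 1 if sum(x) > 0 else -1
fhat = fourier_coeffs(maj3, 3)
for i in range(3):
    spectral = sum(c**2 for S, c in fhat.items() if i in S)
    print(i, influence(maj3, i, 3), spectral)  # the two sides agree: 0.5 each
```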
Balanced f s.t. as(f)=1 is a Dictatorship Since f is balanced, f̂(∅) = 0, and by Parseval Σ_S f̂(S)² = 1. If there were an S with |S| > 1 and f̂(S) ≠ 0, then as(f) = Σ_S |S|·f̂(S)² > 1. So all the Fourier weight lies on singletons and f is linear: f = Σ_i f̂({i})·x_i; being Boolean forces a single coefficient ±1, i.e. f = ±x_i for some i.
Expectation and Variance Claim: E[f] = f̂(∅) and Var[f] = Σ_{S≠∅} f̂(S)². Hence, for any Boolean f: Var[f] = 1 − f̂(∅)².
First Passage Percolation [BKS] Each edge costs a with probability ½ and b with probability ½.
First Passage Percolation Consider the grid ℤ². For each edge e of the grid choose independently w_e = 1 or w_e = 2, each with probability ½. This induces a shortest-path metric on ℤ². Thm [BKS]: the variance of the shortest path from the origin to a vertex v is bounded from above by O(|v| / log |v|). Proof idea: the average sensitivity of the shortest-path function is bounded by that term.
Proof Outline Let G denote the weighted grid, and SP_G the length of the shortest path in G from the origin to v. Let Gᵉ denote the grid which differs from G only on w_e, i.e., the value of e is flipped in G.
Observation If e participates in a shortest path, then flipping its value will increase or decrease the SP by exactly 1; if e is not on any shortest path, increasing its value does not change the SP.
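The observation can be checked empirically on a small grid; the sketch below (helper names and instance size are mine) flips each edge of a random 5×5 instance and confirms the shortest path moves by at most 1:

```python
import heapq
import random

def grid_sp(n, weights):
    """Dijkstra on an n x n grid from (0,0) to (n-1,n-1);
    weights maps frozenset({u, v}) -> edge cost."""
    src, dst = (0, 0), (n - 1, n - 1)
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float('inf')):
            continue
        x, y = u
        for v in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= v[0] < n and 0 <= v[1] < n:
                nd = d + weights[frozenset((u, v))]
                if nd < dist.get(v, float('inf')):
                    dist[v] = nd
                    heapq.heappush(pq, (nd, v))

rng = random.Random(0)
n = 5
edges = [frozenset(((x, y), (x + dx, y + dy)))
         for x in range(n) for y in range(n)
         for dx, dy in ((1, 0), (0, 1)) if x + dx < n and y + dy < n]
weights = {e: rng.choice((1, 2)) for e in edges}
base = grid_sp(n, weights)
deltas = []
for e in edges:
    flipped = dict(weights)
    flipped[e] = 3 - weights[e]        # flip 1 <-> 2
    deltas.append(grid_sp(n, flipped) - base)
print(base, max(abs(d) for d in deltas))  # each single flip moves SP by at most 1
```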
Proof cont. And by [KKL] there is at least one variable whose influence is at least Ω(log n / n).
Graph Properties Def: a graph property is a subset of graphs invariant under isomorphism. Def: a monotone graph property is a graph property P s.t. if P(G), then for every super-graph H of G (namely, a graph on the same set of vertices which contains all edges of G), P(H) as well. P is in fact a Boolean function, one coordinate per potential edge: P: {-1,1}^C(V,2) → {-1,1}.
Examples of Graph Properties G is connected; G is Hamiltonian; G contains a clique of size t; G is not planar; the clique number of G is larger than that of its complement; the diameter of G is at most s; … etc. What is the influence of different edges e on P?
Erdős–Rényi G(n,p) Graph The Erdős–Rényi distribution of random graphs: put an edge between any two vertices independently with probability p.
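A quick simulation of G(n,p) against a monotone property (connectivity) already shows the swift transition discussed next; the parameter choices below are mine, for illustration only:

```python
import random
from itertools import combinations

def gnp(n, p, rng):
    """Sample an Erdos-Renyi G(n, p) graph as a list of edges."""
    return [e for e in combinations(range(n), 2) if rng.random() < p]

def connected(n, edges):
    """Monotone property: is the graph connected? (DFS from vertex 0)"""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n

rng = random.Random(1)
n, trials = 30, 200
probs = {p: sum(connected(n, gnp(n, p, rng)) for _ in range(trials)) / trials
         for p in (0.05, 0.2)}
print(probs)  # below vs. above the connectivity threshold ~ ln(n)/n
```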
Definitions P – a graph property. μ_p(P) – the probability that a random graph on n vertices with edge probability p satisfies P. G ~ G(n,p) – G is a random graph on n vertices with edge probability p.
Example – Max Clique Consider G ~ G(n,p). The size of the interval of probabilities p for which the clique number of G is almost surely k (where k ≈ log n) is of order log⁻¹n. The threshold interval: the transition between clique numbers k-1 and k. [Figure: probability of choosing an edge vs. number of vertices.]
The probability of having a clique of size k is 1−o(1), while the probability of having a (k+1)-clique is still small (≈ log⁻¹n). The value of p must increase by c·log⁻¹n before the probability of having a (k+1)-clique becomes substantial and another transition interval begins.
Def: Sharp Threshold A sharp threshold in a monotone graph property: the transition from the property being very unlikely to it being very likely is very swift.
Thm [FK]: every monotone graph property has a sharp threshold. Let P be any monotone property of graphs on n vertices. If μ_p(P) > ε then μ_q(P) > 1−ε for q = p + c₁·log(1/2ε)/log n. Proof idea: show that as_{p′}(P), for p′ > p, is high.
Thm [Margulis–Russo]: for monotone f, dμ_p(f)/dp = Σ_i influence_i^p(f), where influence_i^p(f) is the probability, under the p-biased distribution, that coordinate i is pivotal.
Proof [Margulis–Russo]: differentiate μ_p(f) = Σ_x f(x)·p^|x|·(1−p)^(n−|x|) with respect to p; grouping inputs by whether each coordinate is pivotal, the non-pivotal terms cancel and each coordinate contributes exactly its pivotal probability.
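The identity can be sanity-checked exactly on Majority of 3, comparing a central difference of μ_p against the sum of pivotal probabilities (the helper names are mine):

```python
from fractions import Fraction
from itertools import product

def weight(x, p):
    """Probability of the point x under the p-biased measure (x_i = 1 w.p. p)."""
    w = Fraction(1)
    for xi in x:
        w *= p if xi == 1 else 1 - p
    return w

def mu(f, n, p):
    """mu_p(f) = Pr_p[f(x) = 1]."""
    return sum(weight(x, p) for x in product((-1, 1), repeat=n) if f(x) == 1)

def pivotal_sum(f, n, p):
    """Sum over coordinates i of Pr_p[i is pivotal for f]."""
    total = Fraction(0)
    for x in product((-1, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] = -y[i]
            if f(x) != f(tuple(y)):
                total += weight(x, p)
    return total

maj3 = lambda x: 1 if sum(x) > 0 else -1
p, h = Fraction(1, 3), Fraction(1, 10**6)
deriv = (mu(maj3, 3, p + h) - mu(maj3, 3, p - h)) / (2 * h)
piv = pivotal_sum(maj3, 3, p)
print(piv, deriv)  # agree up to the O(h^2) error of the central difference
```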
Mechanism Design Problem N agents; each agent i has a private input t_i ∈ T. All other information is public knowledge. Each agent i has a valuation for all items. Each agent wishes to optimize her own utility. Objective: minimize the objective function, the total payment. Means: a protocol between the agents and the auctioneer.
Vickrey–Clarke–Groves (VCG) A sealed-bid auction; a truth-revealing protocol, namely one in which each agent might as well reveal her true valuation to the auctioneer, whereby each agent gets the best (for her) price she could have bid and still won the auction.
Shortest Path using VCG Problem definition: a communication network modeled by a directed graph G and two vertices, source s and target t. Agents = edges of G. Each agent has a cost for sending a single message on her edge, denoted t_e. Objective: find the shortest (cheapest) path from s to t. Means: a protocol between the agents and the auctioneer.
VCG for Shortest-Path [Figure: an edge that is always on the shortest path – e.g. edges of cost 10$ whose alternatives cost 50$.]
How much will we pay? Every agent on the shortest path will get $1 more. Thm [Mahedia, Saberi, S]: the expected extra pay is ≈ as_SP(G). [Figure: a network with edges of cost 1$ and 2$.]
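A minimal sketch of VCG payments for shortest path on a toy network (the graph and all names are hypothetical): each edge on the winning path is paid the highest cost it could have declared while still being selected, so the auctioneer overpays relative to the true cost:

```python
import heapq

def shortest(edges, s, t, skip=None):
    """Dijkstra on an undirected graph; edges maps (u, v) -> declared cost.
    skip removes one edge, for computing the best path avoiding it."""
    adj = {}
    for (u, v), c in edges.items():
        if (u, v) == skip:
            continue
        adj.setdefault(u, []).append((v, c))
        adj.setdefault(v, []).append((u, c))
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > dist.get(u, float('inf')):
            continue
        for v, c in adj.get(u, []):
            if d + c < dist.get(v, float('inf')):
                dist[v] = d + c
                heapq.heappush(pq, (d + c, v))
    return float('inf')

# toy network: route s-a-t costs 2, the alternative s-b-t costs 4
edges = {('s', 'a'): 1, ('a', 't'): 1, ('s', 'b'): 2, ('b', 't'): 2}
d = shortest(edges, 's', 't')
payments = {}
for e, cost in edges.items():
    alt = shortest(edges, 's', 't', skip=e)
    if alt > d:  # e lies on every shortest path
        # VCG pays e the highest bid at which it would still be chosen
        payments[e] = alt - (d - cost)
print(d, payments)  # path costs 2, but each 1$ edge is paid 3$
```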
Juntas A function is a J-junta if its value depends on only J variables. A dictatorship is a 1-junta.
Noise-Sensitivity How often does the value of f change when the input is perturbed?
Noise-Sensitivity Def(ε,p,x): let 0<ε<1 and x ∈ P([n]); then y ~ N(ε,p,x) if y = (x∖I) ∪ z, where I ⊆ [n] is a noise subset (each coordinate enters I independently with probability ε) and z ~ μ_p restricted to I is a replacement. Def (ε-noise-sensitivity): ns_ε(f) = Pr_{x,y}[f(x) ≠ f(y)]. [When p = ½ this is equivalent to flipping each coordinate of x independently w.p. ε/2.]
Noise-Sensitivity – Cont. Advantage: very efficiently testable (using only two queries) by a perturbation-test. Def (perturbation-test): choose x ~ μ_p and y ~ N(ε,p,x); check whether f(x) = f(y). The failure probability equals the noise-sensitivity of f. Prop: for p = ½, the ε-noise-sensitivity is given by ns_ε(f) = ½·(1 − Σ_S (1−ε)^|S|·f̂(S)²).
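In the p = ½, flip-each-bit form, noise sensitivity can be computed exactly for small n; a sketch comparing the three running examples (helper names are mine):

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """Exact Pr[f(x) != f(y)]: x uniform on {-1,1}^n, y flips each
    coordinate of x independently with probability eps."""
    total = 0.0
    for x in product((-1, 1), repeat=n):
        for flips in product((0, 1), repeat=n):
            p = 1.0
            for b in flips:
                p *= eps if b else 1 - eps
            y = tuple(-xi if b else xi for xi, b in zip(x, flips))
            if f(x) != f(y):
                total += p
    return total / 2**n

funcs = {'dictator': lambda x: x[0],
         'majority': lambda x: 1 if sum(x) > 0 else -1,
         'parity': lambda x: x[0] * x[1] * x[2]}
ns = {name: noise_sensitivity(f, 3, 0.1) for name, f in funcs.items()}
print(ns)  # dictator 0.1 < majority 0.136 < parity 0.244
```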
Relation between Parameters Prop: small ns ⇒ small high-frequency weight. Proof: ns_ε(f) = ½·(1 − Σ_S (1−ε)^|S|·f̂(S)²); therefore, if ns is small then Σ_{|S|>k} f̂(S)² must be small, since (1−ε)^|S| → 0 as |S| grows. Prop: small as ⇒ small high-frequency weight. Proof: as(f) = Σ_S |S|·f̂(S)² ≥ k·Σ_{|S|>k} f̂(S)².
High vs. Low Frequencies Def: the section of a function f above k is f^{>k} = Σ_{|S|>k} f̂(S)·χ_S, and the low-frequency portion is f^{≤k} = Σ_{|S|≤k} f̂(S)·χ_S.
Low-degree Boolean Functions are Juntas [KS] Theorem: ∃ a constant γ > 0 s.t. any Boolean function f: P([n]) → {-1,1} satisfying ‖f^{>k}‖₂² ≤ γε is an [ε,j]-junta for j = O(ε⁻²k³2^k). Corollary: fix a p-biased distribution μ_p over P([n]); let ε > 0 be any parameter; set k = log_{1−ε}(½). Then ∃ a constant γ > 0 s.t. any Boolean function f: P([n]) → {-1,1} satisfying ‖f^{>k}‖₂² ≤ γε is an [ε,j]-junta for j = O(ε⁻²k³2^k).
Friedgut Theorem Thm: any Boolean f is an [ε, j]-junta for j = 2^{O(as(f)/ε)}. Proof: (1) specify the junta J; (2) show that the complement of J has little influence.
Long-Code In the long-code, the set of legal words consists of all monotone dictatorships. This is the most extensive binary code, as its bits represent all possible binary values over n elements.
Long-Code Encoding an element e ∈ [n]: E_e legally encodes e if E_e = f_e, the dictatorship of e.
Codes and Boolean Functions Def: an m-bit code is a subset of the set of all m-bit binary strings, C ⊆ {-1,1}^m. The distance of a code C is the minimum, over all pairs of legal words (in C), of the Hamming distance between the two words. Note: a Boolean function over n binary variables is a 2ⁿ-bit string; hence, a set of Boolean functions can be considered a 2ⁿ-bit code.
Long-Code = Monotone Dictatorships In the long code, the legal code-words are all monotone dictatorships, C = { χ_{i} | i ∈ [n] }, namely, all the singleton characters.
Of course they’ll have to discuss it over dinner… Where to go for Dinner? The alternatives. Diners would cast their vote in an (electronic) envelope. The system would decide – not necessarily according to majority… And what if someone (in Florida?) can flip some votes? Form a committee. Influence = power.
Open Questions Hardness of approximation: MAX-CUT; coloring a 3-colorable graph with fewest colors. Graph properties: find sharp thresholds for specific properties. Circuit complexity: switching lemmas. Mechanism design: show a non-truth-revealing protocol in which the pay is smaller (Nash equilibrium when all agents tell the truth?). Analysis: show the weakest condition for a function to be a junta; apply concentration-of-measure techniques to other problems in complexity theory.
Specify the Junta Set k = Θ(as(f)/ε) and δ = 2^{−Θ(k)}. Let J = { i : influence_i(f) ≥ δ }; hence |J| ≤ as(f)/δ = 2^{O(k)}. We’ll prove that f is close to a function of the coordinates in J; hence J is a [ε,j]-junta with j = |J| = 2^{O(k)}.
Functions’ Vector-Space A function f is a vector. Addition: (f+g)(x) = f(x) + g(x). Multiplication by a scalar: (c·f)(x) = c·f(x).
Hadamard Code In the Hadamard code the set of legal words consists of all multiplicative (linear, if over {0,1}) functions: C = { χ_S | S ⊆ [n] }, namely all characters.
Hadamard Test Given a Boolean f, choose random x and y; check that f(x)·f(y) = f(x·y), where x·y is the coordinate-wise product. Prop (completeness): a legal Hadamard word (a character) always passes this test.
Hadamard Test – Soundness Prop (soundness): Pr_{x,y}[f(x)·f(y) = f(x·y)] = ½ + ½·Σ_S f̂(S)³. Proof: E_{x,y}[f(x)·f(y)·f(x·y)] = Σ_S f̂(S)³; hence, if the test passes with probability ½+δ, some character S has f̂(S) ≥ 2δ, i.e. f is correlated with a legal code-word.
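Both completeness and the soundness formula can be checked by exact enumeration on a small example; a sketch (helper names are mine; the pass rate of a non-character equals ½ + ½·Σ_S f̂(S)³):

```python
from itertools import product

def blr_pass_rate(f, n):
    """Exact probability that the test f(x)*f(y) == f(x*y) passes over
    uniform x, y, where x*y is the coordinate-wise product."""
    ok = total = 0
    for x in product((-1, 1), repeat=n):
        for y in product((-1, 1), repeat=n):
            xy = tuple(a * b for a, b in zip(x, y))
            ok += f(x) * f(y) == f(xy)
            total += 1
    return ok / total

chi = lambda x: x[0] * x[2]               # a character: a legal Hadamard word
maj3 = lambda x: 1 if sum(x) > 0 else -1  # not a character
print(blr_pass_rate(chi, 3))   # 1.0: characters always pass
print(blr_pass_rate(maj3, 3))  # 0.625 = 1/2 + (1/2)*(3*(1/2)^3 + (-1/2)^3)
```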
Testing the Long-code Def (a long-code list-test): given a code-word f, probe it in a constant number of entries, and: accept almost always if f is a monotone dictatorship; reject w.h.p. if f does not have a sizeable fraction of its Fourier weight concentrated on a small set of variables – that is, if there is no semi-junta J ⊆ [n] capturing most of that weight. Note: a long-code list-test distinguishes between the case where f is a dictatorship and the case where f is far from a junta.
Motivation – Testing the Long-code Long-code list-tests are essential tools in proving hardness results. Hence, finding simple sufficient conditions for a function to be a junta is important.
High Frequencies Contribute Little Prop: k ≫ r·log r implies the high frequencies contribute little. Proof: a character S of size larger than k spreads w.h.p. over all parts I_h, hence contributes to the influence of all parts. If such characters were heavy (> ε/4), then surely there would be more than j parts I_h that fail the t independence-tests.
Altogether Lemma: Proof:
Beckner/Nelson/Bonami Inequality Def: let T_δ be the following operator on any f: T_δ(f) = Σ_S δ^|S|·f̂(S)·χ_S, i.e., T_δ attenuates the high frequencies of f.
Beckner/Nelson/Bonami Inequality Thm: for any p ≥ r and δ ≤ ((r−1)/(p−1))^½: ‖T_δ(f)‖_p ≤ ‖f‖_r.
Beckner/Nelson/Bonami Corollary Corollary 1: for any real f and 2 ≥ r ≥ 1: ‖T_{(r−1)^½}(f)‖₂ ≤ ‖f‖_r. Corollary 2: for real f of degree at most k and r > 2: ‖f‖_r ≤ (r−1)^{k/2}·‖f‖₂.
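Corollary 1 (the p = 2 case, with δ = (r−1)^½) can be sanity-checked numerically on random real-valued functions, evaluating T_δ via the Fourier expansion; the helper names below are mine, not from the slides:

```python
import math
import random
from itertools import combinations, product

n, r = 3, 4 / 3
xs = list(product((-1, 1), repeat=n))

def fourier(tab):
    """Walsh-Fourier coefficients of the function given by the table tab."""
    return {S: sum(tab[x] * math.prod(x[i] for i in S) for x in xs) / 2**n
            for k in range(n + 1) for S in combinations(range(n), k)}

def pnorm(vals, p):
    """||f||_p = E[|f|^p]^(1/p) under the uniform measure."""
    return (sum(abs(v)**p for v in vals) / len(vals))**(1 / p)

rng = random.Random(2)
rho = math.sqrt(r - 1)  # the critical attenuation for the p = 2 corollary
gaps = []
for _ in range(25):
    tab = {x: rng.uniform(-1, 1) for x in xs}
    co = fourier(tab)
    # T_rho f = sum over S of rho^|S| * f_hat(S) * chi_S
    Tf = [sum(rho**len(S) * c * math.prod(x[i] for i in S)
              for S, c in co.items()) for x in xs]
    gaps.append(pnorm(Tf, 2) - pnorm(list(tab.values()), r))
print(max(gaps))  # never positive: ||T_rho f||_2 <= ||f||_{4/3}
```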
Perturbation Def: denote by N_ε the distribution over all subsets of [n] which assigns probability to a subset x as follows: independently, for each i ∈ [n], let i ∈ x with probability 1−ε and i ∉ x with probability ε.
Long-Code Test Given a Boolean f, choose random x and y, and choose z ~ N_ε; check that f(x)·f(y) = f(x·y·z). Prop (completeness): a legal long-code word (a dictatorship) passes this test w.p. 1−ε.
Long-code Tests Def (a long-code test): given a code-word w, probe it in a constant number of entries, and: accept w.h.p. if w is a monotone dictatorship; reject w.h.p. if w is not close to any monotone dictatorship.
Efficient Long-code Tests For some applications it suffices if the test may accept illegal code-words, nevertheless ones which have a short list-decoding: Def (a long-code list-test): given a code-word w, probe it in 2 or 3 places, and: accept w.h.p. if w is a monotone dictatorship; reject w.h.p. if w is not even approximately determined by a short list of domain elements – that is, if there is no junta J ⊆ [n] s.t. f is close to some f′ with f′(x) = f′(x∩J) for all x. Note: a long-code list-test distinguishes between the case where w is a dictatorship and the case where w is far from a junta.
General Direction These tests may vary. The long-code list-test, in particular its biased-case version, seems essential in proving improved hardness results for approximation problems, and has other interesting applications. Hence, finding simple, as-weak-as-possible sufficient conditions for a function to be a junta is important.