1
A Property Testing Double-Feature of Short Talks Oded Goldreich Weizmann Institute of Science Talk at Technion, June 2013
2
Oded Goldreich Weizmann Institute of Science On the communication complexity methodology for proving lower bounds on the query complexity of property testing
3
Before Blais, Brody, and Matulef (2011): in order to derive a lower bound on testing a property Π, one would reduce a two-party communication problem Ψ to Π. [Figure: a Communication Complexity protocol, with party A holding x and party B holding y, versus a Property Testing algorithm T querying an input z.] The models seem incompatible: (1) no natural partition of the input in PT; (2) no notion of distance in CC.
4
The Methodology of Blais, Brody, and Matulef. In order to derive a lower bound on testing a property Π, reduce a two-party communication problem Ψ to Π. That is, present a mapping F of pairs of inputs (x,y) ∈ {0,1}^{n+n} for the CC-problem Ψ to l(n)-bit long inputs for testing Π such that (x,y) ∈ Ψ implies F(x,y) ∈ Π, whereas (x,y) ∉ Ψ implies that F(x,y) is far from Π. Let f_i(x,y) be the i-th bit of F(x,y), and suppose that B is an upper bound on the (deterministic) communication complexity of each f_i and that C is a lower bound on the randomized communication complexity of Ψ. Then, testing Π requires at least C/B queries. In [BBM], l(n)=n and each f_i is a function of x_i and y_i only; this restriction complicates the use of the methodology. [Figure: (x,y) ∈ Ψ maps to F(x,y) ∈ Π; (x,y) ∉ Ψ maps to F(x,y) far from Π.]
5
Soundness of the Methodology. Notation: RCC = randomized CC (with error, say, 1/3; with shared randomness); DCC = deterministic CC (or randomized with error 1/6n); PT = query complexity of testing (w.r.t. the distance underlying "far"). THM: Let F: {0,1}^{n+n} → {0,1}^{l(n)} be such that (x,y) ∈ Ψ implies F(x,y) ∈ Π and (x,y) ∉ Ψ implies that F(x,y) is far from Π. Let f_i(x,y) be the i-th bit of F(x,y). Then, RCC(Ψ) ≤ max_i{DCC(f_i)} ∙ PT(Π). (The theorem extends to CC promise problems.) Proof: Each of the two parties invokes a local copy of the tester, using the shared randomness. Each query (i.e., each index i) made by the tester is answered by invoking the corresponding CC protocol (for f_i); note that the two local executions are kept identical. The error probability of this protocol equals that of the tester. ■
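The protocol in the proof can be sketched in code. The following is a toy, self-contained illustration (all function names are mine, not from the talk): the parties jointly emulate a tester on F(x,y) = x XOR y, answering each query i by exchanging x_i and y_i (so DCC(f_i) = 2), and the total communication is at most 2 times the number of queries, as in RCC(Ψ) ≤ max_i{DCC(f_i)} ∙ PT(Π).

```python
import random

def run_protocol(x, y, tester, shared_seed):
    """Emulate the tester on F(x,y) = x XOR y as a two-party protocol.
    Each query i is answered by the trivial CC protocol for
    f_i(x,y) = x_i XOR y_i: Alice sends x_i, Bob sends y_i (2 bits).
    Returns (verdict, total bits communicated)."""
    bits = 0
    def oracle(i):
        nonlocal bits
        bits += 2                        # DCC(f_i) = 2
        return x[i] ^ y[i]
    rng = random.Random(shared_seed)     # shared randomness: both local
    verdict = tester(oracle, len(x), rng)    # executions are identical
    return verdict, bits

def all_zeros_tester(oracle, n, rng, queries=6):
    """Toy one-sided tester for the property 'the string is all-zeros'
    (here: x == y): reject iff some sampled position is 1."""
    return all(oracle(rng.randrange(n)) == 0 for _ in range(queries))
```

In the sketch a single function plays both roles and only counts the bits that would be exchanged; the shared seed is what keeps the two local executions identical in the actual two-party setting.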
6
Applying the Methodology. THM: Let F: {0,1}^{n+n} → {0,1}^{l(n)} be such that (x,y) ∈ Ψ implies F(x,y) ∈ Π and (x,y) ∉ Ψ implies that F(x,y) is far from Π. Let f_i(x,y) be the i-th bit of F(x,y). Then, RCC(Ψ) ≤ max_i{DCC(f_i)} ∙ PT(Π); i.e., PT(Π) ≥ RCC(Ψ)/max_i{DCC(f_i)}. THM: Let C: {0,1}^n → {0,1}^{l(n)} be a linear code of constant relative distance, and k: N → N. Then, the query complexity of testing the set {C(x) : x ∈ {0,1}^n & wt(x)=k} is Ω(k). PF: Reduce from k-DISJ_n (disjointness for k/2-subsets), using F(x,y) = C(x+y) = C(x)+C(y). Note that each bit in F(x,y) has DCC = 2 (obtained by exchanging the corresponding bits of C(x) and C(y)). COR: Testing k-linearity has query complexity Ω(k). [Take C = Hadamard.] Note: Typically, the i-th bit of F(x,y) depends on a linear number of bits of x and of y; an alternative proof that uses the original BBM formulation needs to maneuver around this difficulty.
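A minimal sketch of the key identity used in the proof (assuming nothing beyond the definition of the Hadamard code; the function names are mine): since C is linear, F(x,y) = C(x+y) equals C(x)+C(y) bitwise, so each bit of F(x,y) is computable from one bit of C(x) and one bit of C(y).

```python
from itertools import product

def hadamard(x):
    """Hadamard encoding of x in {0,1}^n: one bit <x,z> mod 2
    for every z in {0,1}^n, so l(n) = 2^n."""
    return [sum(xi & zi for xi, zi in zip(x, z)) % 2
            for z in product((0, 1), repeat=len(x))]

def bitwise_xor(a, b):
    return [ai ^ bi for ai, bi in zip(a, b)]

# Linearity of C gives F(x,y) = C(x+y) = C(x)+C(y) (bitwise, mod 2),
# so the i-th bit of F(x,y) is C(x)_i XOR C(y)_i: two bits of
# communication suffice to evaluate it.
x, y = [1, 0, 1, 1], [0, 1, 1, 0]
assert hadamard(bitwise_xor(x, y)) == bitwise_xor(hadamard(x), hadamard(y))
```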
7
Applying the Restricted Methodology. THM: Let F: {0,1}^{n+n} → {0,1}^{l(n)} be such that (x,y) ∈ Ψ implies F(x,y) ∈ Π and (x,y) ∉ Ψ implies that F(x,y) is far from Π. Let f_i(x,y) be the i-th bit of F(x,y). Then, PT(Π) ≥ RCC(Ψ)/max_i{DCC(f_i)}. Restriction: f_i(x,y) = fnc(i, x_i, y_i). THM: Let C: {0,1}^n → {0,1}^{l(n)} be a linear code of constant relative distance, and k: N → N. Then, the query complexity of testing the set {C(x) : x ∈ {0,1}^n & wt(x)=k} is Ω(k). An alternative proof via the restricted methodology introduces an auxiliary CC problem Ψ' ("C-encoded k-DISJ") that consists of the pairs (C(x),C(y)) s.t. (x,y) ∈ k-DISJ_n, reduces (in the CC world) k-DISJ to Ψ', and then applies the restricted method to Ψ'. The general methodology frees the prover/user from this type of acrobatics. Interestingly, this is only a matter of convenience; it does not add power (i.e., anything provable via the general methodology is essentially provable via the restricted one).
8
Emulating the Restricted Methodology. THM: Let F: {0,1}^{n+n} → {0,1}^{l(n)} be such that (x,y) ∈ Ψ implies F(x,y) ∈ Π and (x,y) ∉ Ψ implies that F(x,y) is far from Π. Let f_i(x,y) be the i-th bit of F(x,y). Then, PT(Π) ≥ RCC(Ψ)/max_i{DCC(f_i)}. Restriction: f_i(x,y) = fnc(i, x_i, y_i). THM (imprecise sketch): Suppose that Π, Ψ, and F satisfy the conditions of the general methodology with B = max_i{DCC(f_i)}. Then, there exist Π', Ψ', and F' that satisfy the conditions of the restricted methodology such that RCC(Ψ') ≥ RCC(Ψ) and PT(Π) = Ω(PT(Π')/B). Still, the general methodology frees the prover/user from this type of acrobatics.
9
Oded Goldreich Weizmann Institute of Science On Multiple Input Problems in Property Testing
10
Three types of multiple-input problems. For any fixed property Π and proximity parameter ε. Direct m-Sum Problem (DS): Given a sequence of m inputs, output a sequence of m outputs that each satisfy the testing requirements; that is, for every i, if the i-th input is in Π then the i-th output is 1 w.p. ≥ 2/3, whereas if the i-th input is ε-far from Π then the i-th output is 0 w.p. ≥ 2/3. Direct m-Product Problem (DP): Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if some input is ε-far from Π. m-Concatenation Problem (CP): Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if the average distance of the inputs from Π is at least ε. The results at a glance: For DS and DP the query complexity is m times the query complexity of Π; for CP it is about the same as for Π.
11
The main results. m-DS: Given a sequence of m inputs, output a sequence of m outputs such that, for every i, if the i-th input is in Π then the i-th output is 1 w.p. ≥ 2/3, whereas if the i-th input is ε-far from Π then the i-th output is 0 w.p. ≥ 2/3. m-DP: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if some input is ε-far from Π. m-CP: Given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if the average distance of the inputs from Π is at least ε. For any Π and ε, w.r.t. error probability at most 1/3: THM 1: m-DS_ε(Π) = Θ(m ∙ PT_ε(Π)). THM 2: m-DP_ε(Π) = Θ(m ∙ PT_ε(Π)). THM 3: Typically (*), m-CP_ε(Π) = Õ(PT_ε(Π)). (*) "Typically" = if PT_ε(Π) increases at least linearly with 1/ε.
12
Comments re the proof of THM 1. THM 1: m-DS_ε(Π) = Θ(m ∙ PT_ε(Π)). (m-DS = given a sequence of m inputs, output a sequence of m outputs such that, for every i, if the i-th input is in Π then the i-th output is 1 w.p. ≥ 2/3, whereas if the i-th input is ε-far from Π then the i-th output is 0 w.p. ≥ 2/3.) Re the lower bound: In the model of query complexity, it is easy to decouple the execution of the multiple-instance procedure into a sequence of single-instance executions; the only issue at hand is the possibly uneven and adaptive allocation of resources among the executions. We need to consider the allocation of resources w.r.t. some distribution on instances. Which one? The one provided by the MiniMax Principle! The real content of the MMP is not that the worst-case performance of each randomized algorithm is bounded by the average-case performance (of all deterministic algorithms) w.r.t. some fixed input distribution, but rather that this bound is tight!
13
Comments re the proof of THM 2. THM 2: m-DP_ε(Π) = Θ(m ∙ PT_ε(Π)). (m-DP = given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if some input is ε-far from Π.) Re the upper bound: A straightforward reduction of DP to DS would require error reduction (and so we would lose a Θ(log m) factor). LEM: m-DP can be reduced to O(j) instances of 2^{-(j-1)}m-DS, for j = 1, …, log m. Idea: Proceed in iterations, initializing I (the set of "far" suspects) to [m]. In iteration j, run DS on the instances with index in I, with error parameter exp(-j), and reset I to be the set of indices with output 0. If |I| > m/2^j, then halt with output 0. If I is empty, halt with output 1. Re the lower bound: Via an adaptation of the proof of THM 1.
14
Illustration for the proof of the LEM. LEM: m-DP can be reduced to O(j) instances of 2^{-(j-1)}m-DS, for j = 1, …, log m. Idea: Proceed in iterations, initializing I (the set of "far" suspects) to [m]. In iteration j, run DS on the instances with index in I, with error parameter exp(-j), and reset I to be the set of indices with output 0. If |I| > m/2^j, then halt with output 0. If I is empty, halt with output 1. [Figure: example verdict sequences for the case that all inputs are in Π versus the case that some input is far from Π.]
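The iterative procedure of the LEM can be sketched as follows (a simplified model, with a hypothetical `ds_tester` interface standing in for the DS solver):

```python
import math

def dp_via_ds(instances, ds_tester):
    """Sketch of the LEM's reduction of m-DP to DS instances.
    `ds_tester(sub, error)` is a hypothetical DS solver returning a
    0/1 verdict per instance, each wrong w.p. <= error.  I is the
    shrinking set of 'far' suspects (the degenerate case m = 1 is
    ignored in this sketch)."""
    m = len(instances)
    I = list(range(m))                        # initialize I to [m]
    for j in range(1, math.ceil(math.log2(m)) + 1):
        verdicts = ds_tester([instances[i] for i in I], error=2.0 ** -j)
        I = [i for i, v in zip(I, verdicts) if v == 0]
        if len(I) > m / 2 ** j:               # too many suspects survive:
            return 0                          # some instance is likely far
        if not I:                             # no suspects left:
            return 1                          # all instances look fine
    return 0 if I else 1
```

With an error-free DS solver, an all-good sequence is accepted in the first iteration, while a single far instance survives every shrinking step and forces output 0.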
15
Comments re the proof of THM 3. THM 3: Typically (*), m-CP_ε(Π) = Õ(PT_ε(Π)). (m-CP = given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if the average distance of the inputs from Π is at least ε.) (*) "Typically" = if PT_ε(Π) increases at least linearly with 1/ε. Re the upper bound: A straightforward algorithm would sample O(1/ε) instances and run the ε-tester for Π on each of them; complexity O(PT_ε/ε). One can do better using Levin's economical work-investment strategy. Suppose E_s[q(s)] > ε, for q: [N] → [0,1], where the work to be invested at s is proportional to 1/q(s), unknown a priori. Then there exists j ∈ [l] such that Prob_s[q(s) > 2^{-j}] > 2^j ε/4l. Accordingly, let l = log(2/ε), and, for j = 1, …, l, take a sample of O(l/(2^j ε)) instances and invoke a 2^{-j}-tester on each.
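The work-investment schedule can be sketched as follows (a toy model with names of my own: instances are represented directly by their distances from Π, and `tester(s, prox)` stands for an assumed prox-tester):

```python
import math
import random

def cp_tester(instances, tester, eps):
    """Sketch of the m-CP upper bound via Levin's work-investment
    strategy.  `tester(s, prox)` is a hypothetical prox-tester for the
    property: it rejects (returns 0) when s is prox-far.  For each
    j = 1..l, sample O(l / (2^j * eps)) instances and invoke a
    2^-j-tester on each; reject iff some invocation rejects."""
    l = math.ceil(math.log2(2 / eps))
    for j in range(1, l + 1):
        for _ in range(math.ceil(4 * l / (2 ** j * eps))):
            s = random.choice(instances)
            if tester(s, 2.0 ** -j) == 0:
                return 0
    return 1
```

The point of the schedule is that cheap coarse tests are run on many samples and expensive fine tests on few, so (when PT_ε grows at least linearly with 1/ε) every level costs about the same and the total stays Õ(PT_ε).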
16
Additional results and comments: non-adaptive and/or one-sided error testers. The only deviation from the general case is for the one-sided error (OSE) version of DP: its complexity is Θ(m∙PT(Π) + PT^{ose}(Π)). (m-DP = given a sequence of m inputs, output 1 w.p. ≥ 2/3 if all inputs are in Π, and 0 w.p. ≥ 2/3 if some input is ε-far from Π; OSE here refers to the adaptive version.) Re the upper bound: We adapt the procedure presented in the proof of the efficient reduction of DP to DS (cf. the Lemma for THM 2). Recall that this procedure proceeds in iterations, halting with output 1 if I (the set of "far" suspects) becomes empty, and outputting 0 if I ever becomes too big. We modify the procedure such that in the latter case it selects a random i in I, invokes the one-sided error tester on the i-th instance, and decides accordingly. In contrast, in the invocations of the reduction procedure itself, we use the two-sided error tester.
17
End The slides of this talk are available at http://www.wisdom.weizmann.ac.il/~oded/T/2pt13.ppt The “CC Methodology” paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_ccpt.html The “Multiple Input” paper is available at http://www.wisdom.weizmann.ac.il/~oded/p_mi-pt.html
18
Property Testing: an illustration. [Figure: a building; is it a Gothic cathedral?]
19
Property Testing: informal definition. A relaxation of a decision problem: for a fixed property P and any object O, determine whether O has property P or is far from having property P (i.e., O is far from any object having P). Focus: sub-linear time algorithms, which perform the task by inspecting the object at few locations. Objects are viewed as functions, so inspecting means querying the function/oracle.
20
Property Testing: the standard (one-sided error) def'n. A property P = ∪_n P_n, where P_n is a set of functions with domain D_n. The tester gets explicit inputs n and ε, and oracle access to a function f with domain D_n. If f ∈ P_n then Prob[T^f(n,ε) accepts] = 1 (or > 2/3, in the two-sided error version). If f is ε-far from P_n then Prob[T^f(n,ε) rejects] > 2/3. (Distance is defined as the fraction of disagreements.) Focus: query complexity q(n,ε) « |D_n|. Special focus: q(n,ε) = q(ε), independent of n. Terminology: ε is called the proximity parameter.
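As a concrete instance of this definition, here is a minimal sketch (my own toy example, not from the talk) of a one-sided error tester for the property "f is identically 0" over domain [n]; its query complexity q(ε) = ⌈2/ε⌉ is independent of n.

```python
import math
import random

def zero_tester(f, n, eps):
    """One-sided error tester for the property 'f is identically 0'
    on domain {0, ..., n-1}.  If f == 0 it always accepts.  If f is
    eps-far (i.e., f(x) = 1 on at least eps*n points), the ceil(2/eps)
    random samples all miss a 1 w.p. <= (1-eps)^(2/eps) < e^-2 < 1/3,
    so it rejects w.p. > 2/3."""
    q = math.ceil(2 / eps)       # query complexity depends only on eps
    return all(f(random.randrange(n)) == 0 for _ in range(q))
```

Note the one-sided error structure: acceptance of every f in the property holds with probability 1, and all the randomness is spent on catching far inputs.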