On Sample Based Testers

Oded Goldreich, Weizmann Institute
Dana Ron, Tel Aviv University

(Standard) Testing – A quick reminder
Let OBJ be an object (a function f) of size N. A testing algorithm for a (prespecified) property P is given a proximity parameter ε ∈ (0,1]:
- If OBJ has P, it should accept with probability ≥ 2/3;
- If OBJ is ε-far from having P, it should reject with probability ≥ 2/3.
Distance is normalized Hamming distance. To this end, the algorithm is given query access to OBJ: on query x it receives f(x). The query complexity q(N,ε) should be sublinear in N.
This model is as defined by Rubinfeld and Sudan, and most results in property testing are in this model.

Sample-Based Testing
What if we don't have query access to the object, but can only obtain (uniform) random samples? Namely, the tested object is a function f : [N] → R, and we can only obtain pairs (x, f(x)), where x is uniformly distributed in [N].
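The access model on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `draw_labeled_samples` is hypothetical, and the domain is taken to be {0,…,N-1} rather than [N] for convenience.

```python
import random

def draw_labeled_samples(f, N, s, rng=None):
    """Return s labeled pairs (x, f(x)), with each x uniform in {0, ..., N-1}.

    The tester never chooses x itself; it only sees what this oracle returns.
    (Hypothetical helper, not taken from the talk.)
    """
    rng = rng or random.Random()
    return [(x, f(x)) for x in (rng.randrange(N) for _ in range(s))]

# Example: 5 labeled samples of the parity-of-index function on a domain of size 100.
print(draw_labeled_samples(lambda x: x % 2, 100, 5, random.Random(0)))
```

A sample-based tester is then any procedure whose decision depends only on such a list of pairs.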

Sample-Based Testing: Background
The setting of sample-based testing is similar to the setting in Learning Theory (learning under the uniform distribution), and was defined in "Property testing and its connection to learning and approximation" [Goldreich, Goldwasser, R]. However, in [GGR] and most works since, results on sample-based testing were mainly negative, essentially establishing the necessity of queries.
Two exceptions:
- Decision Trees over [0,1]^d [Kearns, R];
- Interval Functions (d=1) and Linear Threshold Functions under the Gaussian distribution [Balcan, Blais, Blum, Yang] (as part of their study of Active Testing).

Sample-Based Testing: Background
Note: There are many works on testing and estimating properties of distributions (starting with [Batu, Fortnow, Rubinfeld, Smith, White]), where the object is a distribution D, the algorithm gets samples distributed according to D, and should test whether D has property P.
This differs from sample-based testing, where:
(1) The object is a function;
(2) The underlying distribution is fixed (uniform);
(3) The algorithm gets function labels;
(4) The tested property is a property of the function.

This Work
We were interested in understanding the relation between sample-based testing and other models of testing, as well as variants of sample-based testing (as just defined).

Our Results: 1. Relation to POTs
Proximity Oblivious Testers (POTs) that are "fair" imply sublinear sample-based testers.
(q,ρ)-POT: performs a constant number q of queries;
- If f ∈ P, accepts with probability ≥ c;
- If f ∉ P, accepts with probability ≤ c − ρ(δ_P(f)), where δ_P(f) is the distance of f to P.
Fair: each query is (almost) uniformly distributed.
Example, Linearity: query on x1, x2 and x1+x2.
Result: a (q,ρ)-POT that is fair ⇒ a sample-based tester with sample complexity O(N^{1-1/q} / ρ(ε)^{2+3/q}).
Comments:
(1) A stronger notion of fairness (e.g., pairs of queries uniformly distributed) gives better sample complexity;
(2) Some notion of fairness is necessary in general;
(3) Fairness is not necessary for Boolean functions, q=2 and a POT with 1-sided error (c=1) [Fischer, Goldhirsh, Lachish].
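To make the linearity example concrete, here is a rough Python sketch of how a labeled sample can stand in for the three queries x1, x2, x1+x2. It is an illustration under stated assumptions, not the construction from the talk: inputs are n-bit integers, addition is bitwise XOR (GF(2)^n), labels are 0/1, and the name `sample_based_linearity_check` is made up for this example.

```python
from itertools import combinations

def sample_based_linearity_check(samples):
    """Accept (True) unless the sample contains a violated triple (a, b, a XOR b).

    samples: iterable of (x, f(x)) with x an int encoding a bit vector, f(x) in {0, 1}.
    A triple a, b, a^b that lies entirely inside the sample plays the role of
    the POT's query set; no queries are made.
    """
    label = dict(samples)                 # x -> f(x); repeated x collapse
    for a, b in combinations(list(label), 2):
        c = a ^ b                         # x1 + x2 over GF(2)^n
        if c in label and (label[a] ^ label[b]) != label[c]:
            return False                  # found a witness of non-linearity
    return True
```

Any linear f passes every triple, so this check has 1-sided error; how large the sample must be before it contains enough complete query sets is exactly what the sample-complexity bound on the slide quantifies.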

Our Results: 2. Dense-Graphs Model
Quasi-Canonical testers in the dense-graphs model imply sublinear sample-based testers.
Dense-Graphs model: G=(V,E) is represented by its adjacency matrix; one can query whether (u,v) ∈ E, and N = n^2 (n=|V|).
Quasi-Canonical: select v(N,ε) vertices uniformly at random; query all pairs; decide based on the induced subgraph, possibly randomly.
Result: a Quasi-Canonical tester (using v(N,ε) vertices) ⇒ a sample-based tester with sample complexity O(N^{1-1/(v(N,ε)-1)}).
Comment: Better results hold for graph partitioning problems, e.g., O(N^{1/2} log k / ε) for k-colorability (Ω(N^{1/2}) is necessary even for k=2). This is not time-efficient, but (for k-colorability) one can get time N^{1-1/(2k)} · g(k,ε) (with N^{1-1/(2k)} · (k/ε^2) samples).

Our Results: 3. Distribution-Free Testing
Distribution-free sample-based testing is related to 1-sided error sample-based testing (under the uniform distribution) non-trivially.
Distribution-free (sample-based)* testing: the distribution D is unknown, the algorithm gets samples (x, f(x)) where x ∼ D, and distance (to the property) is defined w.r.t. D.
For a property P, let DF(P) be its distribution-free sample complexity and OSE(P) its 1-sided error sample complexity; consider constant ε.
- Every P: OSE(P) = Õ(DF(P)^2).
- Exist P s.t. OSE(P) = (DF(P)).
- Exist P s.t. OSE(P) = (DF(P)).
- Exist P s.t. OSE(P) = o(DF(P)). P is natural. (E.g., OSE(P) = O(log(DF(P))), or even OSE(P) = O(1) when DF(P) = Ω(N).)
(*) Distribution-free testing was previously studied when queries are also allowed (e.g., by Halevi and Kushilevitz).

Our Results: 4. Testing Distributions
Distribution testing reduces to sample-based testing of symmetric properties (articulates [Sudan]).
Distribution testing: for an unknown distribution D, the algorithm gets samples x ∼ D and should decide whether D has property P or is ε-far from having P, where distance is L1 (statistical/variation).
Symmetric properties of functions f : X → R: the property is invariant under permutations over X (in other words, the property is defined by {N_y}_{y∈R}, where N_y = |{x ∈ X : f(x)=y}|).
Let P = ∪_m P_m be a property of distributions s.t. each distribution in P_m has support that is a subset of S_m, |S_m| = m. Denote the sample complexity of testing P by s(m,ε).
There exists a symmetric property P' = ∪ P_{N,m} of functions with domain [N] and range S_m, whose sample complexity is denoted s'(N,m,ε), such that for every N ≥ c·n^2/ε^4:
s(m,ε) = O(s'(N,m,ε/2)) and s(m,ε) = Ω(s'(N,m,2ε)).
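One informal way to see why symmetric properties connect to distributions: a function f with value counts N_y induces the distribution that gives y probability N_y/N, because f(x) for uniform x is distributed exactly that way. The Python sketch below illustrates only this correspondence, not the reduction from the talk; the helper names `function_from_counts` and `value_histogram` are hypothetical.

```python
import random
from collections import Counter

def function_from_counts(counts):
    """Build a table f[0..N-1] in which value y appears counts[y] times."""
    return [y for y, k in counts.items() for _ in range(k)]

def value_histogram(f_table):
    """The multiset {N_y}: all that a symmetric property may depend on."""
    return Counter(f_table)

counts = {"a": 2, "b": 3}            # N = 5, induced distribution: a -> 2/5, b -> 3/5
f = function_from_counts(counts)

g = f[:]                             # permuting the domain leaves {N_y} unchanged
random.Random(1).shuffle(g)
print(value_histogram(f) == value_histogram(g))
```

Dropping the x-coordinate from a uniform labeled sample (x, f(x)) therefore leaves a sample from the induced distribution.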

Proof Sketch of OSE(P) = Õ(DF(P)^2)
Recall: DF(P) is the distribution-free sample complexity, OSE(P) is the 1-sided error (uniform distribution) sample complexity.
Let T' be a distribution-free sample-based algorithm with complexity s = o(N^{1/2}) that errs with probability < 1/6. (The case s = Ω(N^{1/2}) is trivial.)
Define T to be the algorithm that takes a uniform sample of size r = O(s^2), labeled by f: ((x1,f(x1)),…,(xr,f(xr))), xi ∼ U, and accepts iff ∃ g ∈ P s.t. g(xi) = f(xi) for every 1 ≤ i ≤ r.
- By definition of T, if f ∈ P, then T always accepts (as required from 1-sided error testers).
- It remains to show that if f is ε-far from P, then T rejects with probability at least 2/3.
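The algorithm T is a pure consistency check, which the following Python sketch instantiates for one concrete property. The choice of property is an assumption made only for illustration (P = non-decreasing functions from {0,…,N-1} to {0,1}, for which consistency is easy to decide); the function names are hypothetical.

```python
import random

def exists_monotone_extension(samples):
    """True iff some non-decreasing g: {0..N-1} -> {0,1} agrees with all samples."""
    seen_one = False
    for _, y in sorted(set(samples)):     # scan sampled points in increasing x
        if y == 1:
            seen_one = True
        elif seen_one:                    # a 0 to the right of a 1: no such g exists
            return False
    return True

def one_sided_tester_T(f, N, r, rng):
    """Draw r uniform labeled samples; accept iff they are consistent with P."""
    samples = [(x, f(x)) for x in (rng.randrange(N) for _ in range(r))]
    return exists_monotone_extension(samples)

rng = random.Random(0)
print(one_sided_tester_T(lambda x: int(x >= 30), 100, 40, rng))  # f in P: always accepted
```

Because acceptance requires only that some g ∈ P matches the sample, every f ∈ P is accepted with probability 1, which is the 1-sided error guarantee stated above.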

Proof Sketch of OSE(P) = Õ(DF(P)^2), cont'd
Assume for contradiction: there exists f that is ε-far from P, which T accepts with probability > 1/3 (recall: T uses a uniform sample).
That is, for more than 1/3 of the sample sets x = {x1,…,xr}, ∃ g_x ∈ P s.t. g_x(xi) = f(xi) for every i.
For each such "bad" sample set x, consider the distribution D_x that is uniform over x = {x1,…,xr}.
Key: given a sample of size s distributed according to D_x, labeled by f, T' must accept w.p. ≥ 5/6. (Recall: T' is distribution-free, and f agrees with g_x ∈ P on the support of D_x, so f has distance 0 from P w.r.t. D_x.)
Implies: if we select x = {x1,…,xr} uniformly and give T' a subsample of size s, it must accept w.p. ≥ (1/3)(5/6) > 1/6 + 1/10.
But this distribution over samples of size s is 0.1-close to uniform, on which T' must accept w.p. ≤ 1/6. Contradiction.
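The two numerical facts used in this step can be checked directly. The constant in r ≥ 5s^2 below is an assumption chosen to make the bound come out to 0.1; the slide only states r = O(s^2).

```latex
% The acceptance probability forced by the "bad" sample sets exceeds the allowed slack:
\frac{1}{3}\cdot\frac{5}{6} \;=\; \frac{5}{18} \;=\; \frac{25}{90}
\;>\; \frac{24}{90} \;=\; \frac{4}{15} \;=\; \frac{1}{6}+\frac{1}{10}.

% Drawing s indices with replacement from r i.i.d. uniform points gives s i.i.d.
% uniform points unless two indices coincide, so the distance from uniform is at most
\Pr[\text{some index repeats}] \;\le\; \binom{s}{2}\cdot\frac{1}{r}
\;\le\; \frac{s^{2}}{2r} \;\le\; \frac{1}{10}
\qquad\text{whenever } r \ge 5s^{2}.
```

So on an ε-far f, the subsampled run of T' can accept with probability at most 1/6 + 1/10, which is strictly below 5/18.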

Wrapping up
We study relations between sample-based testing and other models of testing, as well as variants of sample-based testing. In particular, we showed:
- Proximity Oblivious Testers (POTs) that are "fair" imply sublinear sample-based testers.
- Quasi-Canonical testers in the dense-graphs model imply sublinear sample-based testers.
- Distribution-free sample-based testing is related to 1-sided error sample-based testing (under the uniform distribution) non-trivially (there is always a quadratic upper bound, but other than that, the relation varies).
- Distribution testing reduces to sample-based testing of symmetric properties (articulates [Sudan]).

Thanks