Testing Collections of Properties. Reut Levi, Dana Ron, Ronitt Rubinfeld. ICS 2011.

Presentation transcript:

Testing Collections of Properties. Reut Levi, Dana Ron, Ronitt Rubinfeld. ICS 2011.

Shopping distribution: what properties do your distributions have?

Transactions in California / Transactions in New York
Testing closeness of two distributions: trend change?

Testing independence: are shopping patterns independent of zip code?

This work: Many distributions

One distribution:
- D is an arbitrary black-box distribution over [n] that generates i.i.d. samples.
- Sample complexity in terms of n? (Can it be sublinear?)
[Figure: D -> samples -> Test -> Pass/Fail?]

Some answers…
- Uniformity: $\Theta(n^{1/2})$ [Goldreich, Ron 00] [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01] [Paninski 08]
- Identity: $\Theta(n^{1/2})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]
- Closeness: $\Theta(n^{2/3})$ [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]
- Independence: $O(n_1^{2/3} n_2^{1/3})$, $\Omega(n_1^{2/3} n_2^{1/3})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work
- Entropy: $n^{1/\beta^2 + o(1)}$ [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]
- Support size: $\Theta(n/\log n)$ [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]
- Monotonicity on total order: $\Theta(n^{1/2})$ [Batu, Kumar, Rubinfeld 04]
- Monotonicity on poset: $n^{1-o(1)}$ [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
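As an illustration of how a square-root-type sample bound can arise for uniformity, here is a minimal sketch of a collision-based test. It rests on a standard fact: the probability that two independent samples collide is exactly 1/n when D is uniform over [n], and at least (1 + eps^2)/n when D is eps-far from uniform in L1 distance. The function names and the midpoint threshold are assumptions made for this sketch; it is not taken from the slides or from any of the cited papers.

```python
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that landed on the same element."""
    s = len(samples)
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (s * (s - 1) / 2)


def looks_uniform(samples, n, eps):
    """Accept if the collision rate is close to 1/n, its value under uniform D.

    The threshold sits halfway between 1/n (uniform) and (1 + eps**2)/n
    (the smallest collision probability of an eps-far distribution).
    This midpoint is an illustrative choice, not a tuned constant.
    """
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / n
```

The test looks only at collisions among the samples, never at the full histogram, which is why it can get by with far fewer than n samples.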

Collection of distributions:
- Two models:
  - Sampling model: get $(i, x)$ for random $i$, $x \sim D_i$
  - Query model: get $(i, x)$ for query $i$ and $x \sim D_i$
- Sample complexity in terms of n, m?
- Further refinement: known or unknown distribution on the i's?
[Figure: D_1, D_2, …, D_m -> samples -> Test -> Pass/Fail?]
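The difference between the two access models can be sketched as two methods on one object. This is a toy simulation for illustration only: the class name `Collection`, its constructor arguments, and the default of uniform weights on the indices are all assumptions, and each D_i is given explicitly as a list of probabilities, which a real tester would not have.

```python
import random


class Collection:
    """m distributions over range(n), each given as a list of n probabilities."""

    def __init__(self, dists, weights=None, seed=None):
        self.dists = dists
        m = len(dists)
        # Distribution on the indices i; uniform unless stated otherwise.
        self.weights = weights if weights is not None else [1 / m] * m
        self.rng = random.Random(seed)

    def _draw(self, i):
        """Draw one x ~ D_i."""
        n = len(self.dists[i])
        return self.rng.choices(range(n), weights=self.dists[i])[0]

    def sample(self):
        """Sampling model: i is drawn at random, then x ~ D_i."""
        i = self.rng.choices(range(len(self.dists)), weights=self.weights)[0]
        return i, self._draw(i)

    def query(self, i):
        """Query model: the tester chooses i, then x ~ D_i."""
        return i, self._draw(i)
```

In the sampling model the tester has no control over which D_i it sees, so a distribution with small weight may rarely appear; in the query model it can direct samples wherever it likes.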

Properties considered:
- Equivalence: all distributions are equal.
- "Clusterability": the distributions can be clustered into k clusters such that, within a cluster, all distributions are close.

Equivalence vs. independence
- Process of drawing pairs: draw $i \in [m]$ and $x \sim D_i$, then output $(i, x)$.
- Easy fact: $i$ and $x$ are independent iff the $D_i$'s are all equal.
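The easy fact can be checked exactly on small explicit examples. In the sketch below, the pair (i, x) has joint probability w_i * D_i(x), where w_i is the probability of drawing index i; the function names and the explicit weight vector are assumptions for illustration. The distance from independence is zero exactly when every D_i (with positive weight) equals the marginal of x.

```python
from fractions import Fraction


def joint(weights, dists):
    """Joint law of (i, x): Pr[(i, x)] = w_i * D_i(x)."""
    return [[w * p for p in d] for w, d in zip(weights, dists)]


def distance_from_independence(weights, dists):
    """L1 distance between the joint law and the product of its marginals."""
    pi = joint(weights, dists)
    n = len(dists[0])
    # Marginal of x; the marginal of i is simply weights[i].
    marg_x = [sum(row[x] for row in pi) for x in range(n)]
    return sum(
        abs(pi[i][x] - weights[i] * marg_x[x])
        for i in range(len(dists))
        for x in range(n)
    )


half = Fraction(1, 2)
equal = [[half, half], [half, half]]
different = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
print(distance_from_independence([half, half], equal))
print(distance_from_independence([half, half], different))
```

Using `Fraction` keeps the arithmetic exact, so equal distributions give a distance of exactly zero rather than a small floating-point residue.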

Results
Def: $(D_1, \ldots, D_m)$ has the Equivalence property if $D_i = D_{i'}$ for all $1 \le i, i' \le m$.

          Lower bound                  Upper bound
n > m     $\Omega(n^{2/3} m^{1/3})$    $\tilde{O}(n^{2/3} m^{1/3})$ (unknown weights)
m > n     $\Omega(n^{1/2} m^{1/2})$    $\tilde{O}(n^{1/2} m^{1/2})$ (known weights)

Also yields a "tight" lower bound for independence testing.
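For a sense of scale, the bounds above can be evaluated at concrete sizes. The values of n and m below are hypothetical, and the calculation ignores constants, polylogarithmic factors, and any dependence on the distance parameter; nm is the total number of possible (i, x) pairs.

```latex
\begin{align*}
n = 10^{6},\ m = 10^{3}\ (n > m): \quad
  & n^{2/3} m^{1/3} = 10^{4} \cdot 10^{1} = 10^{5},
  & nm = 10^{9} \\
n = 10^{4},\ m = 10^{6}\ (m > n): \quad
  & n^{1/2} m^{1/2} = 10^{2} \cdot 10^{3} = 10^{5},
  & nm = 10^{10}
\end{align*}
```

In both regimes the number of samples is several orders of magnitude below the number of (i, x) pairs.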

Clusterability
- Can we cluster the distributions so that, in each cluster, the distributions are (very) close?
- Sample complexity of the test is $O(k n^{2/3})$, for n = domain size and k = number of clusters.
- No dependence on the number of distributions.
- Closeness requirement is very stringent.
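One way to picture the clustering question is a greedy pass that opens a new cluster whenever a distribution is far from every representative chosen so far. The sketch below is not the authors' algorithm: it assumes the distributions are known explicitly, measures closeness by exact L1 distance, and takes "close" to mean distance at most a hypothetical threshold `beta`. A sample-based tester would have to replace `l1` with a closeness test run on samples.

```python
def l1(p, q):
    """Exact L1 distance between two explicitly given distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


def greedy_representatives(dists, beta):
    """Scan the distributions; open a new cluster whenever one is more than
    beta (in L1) away from every representative chosen so far."""
    reps = []
    for d in dists:
        if all(l1(d, r) > beta for r in reps):
            reps.append(d)
    return reps


def may_be_clusterable(dists, k, beta):
    """One-sided check under the assumptions above.

    False: more than k representatives are pairwise more than beta apart,
    so no partition into k clusters of diameter at most beta exists.
    True: the greedy clusters number at most k, each of diameter <= 2 * beta.
    """
    return len(greedy_representatives(dists, beta)) <= k
```

The representatives are pairwise far apart, so each must sit in its own cluster; that is what makes the `False` answer conclusive, while `True` only certifies a clustering with the looser diameter 2 * beta.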

Open Questions
- Clusterability in the sampling model, with a less stringent notion of closeness.
- Other properties of collections? E.g., all distributions are shifts of each other?

Thank you