Testing Collections of Properties. Reut Levi, Dana Ron, Ronitt Rubinfeld. ICS 2011.


1 Testing Collections of Properties. Reut Levi, Dana Ron, Ronitt Rubinfeld. ICS 2011

2 Shopping distribution: What properties do your distributions have?

3 Transactions in California / Transactions in New York. Testing closeness of two distributions: trend change?

4 Testing independence. Shopping patterns: independent of zip code?

5 This work: Many distributions

6 One distribution:
- D is an arbitrary black-box distribution over [n] that generates i.i.d. samples.
- Sample complexity in terms of n? (Can it be sublinear?)
[Diagram: samples from D feed a test that outputs Pass/Fail.]

7 Some answers...
- Uniformity: Θ(n^{1/2}) [Goldreich, Ron 00], [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], [Paninski 08]
- Identity: Θ(n^{1/2}) [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]
- Closeness: Θ(n^{2/3}) [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]
- Independence: O(n_1^{2/3} n_2^{1/3}), Ω(n_1^{2/3} n_2^{1/3}) [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work
- Entropy: n^{1/β^2 + o(1)} [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]
- Support size: Θ(n / log n) [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]
- Monotonicity on total order: Θ(n^{1/2}) [Batu, Kumar, Rubinfeld 04]
- Monotonicity on poset: n^{1-o(1)} [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
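To make the uniformity entry concrete: one classical approach to testing uniformity is to count collisions among the samples. The uniform distribution over [n] has collision probability exactly 1/n, while any distribution that is eps-far from uniform in L1 distance has collision probability at least (1 + eps^2)/n. The sketch below is illustrative only; the function names `collision_rate` and `looks_uniform` and the midpoint threshold (1 + eps^2/2)/n are assumptions made here, not taken from the slides, and choosing a sample size that gives a provable guarantee is left to the caller.

```python
import random
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that hold the same value."""
    counts = Counter(samples)
    s = len(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (s * (s - 1) // 2)


def looks_uniform(samples, n, eps):
    """Accept if the empirical collision rate is below an illustrative
    threshold halfway between 1/n (uniform) and (1 + eps^2)/n (eps-far)."""
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / n


if __name__ == "__main__":
    rng = random.Random(1)
    uniform = [rng.randrange(100) for _ in range(2000)]
    half_support = [rng.randrange(50) for _ in range(2000)]
    print(looks_uniform(uniform, 100, 0.5))
    print(looks_uniform(half_support, 100, 0.5))
```

The demo uses far more samples than the square-root bound suggests, simply to keep the toy run stable.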

8 Collection of distributions:
- Two models:
  - Sampling model: get (i, x) for random i, x ~ D_i
  - Query model: get (i, x) for query i and x ~ D_i
- Sample complexity in terms of n, m?
- Further refinement: known or unknown distribution on the i's?
[Diagram: samples from D_1, D_2, ..., D_m feed a test that outputs Pass/Fail.]
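The difference between the two access models can be sketched in code. In the sampling model the oracle chooses the index i; in the query model the tester chooses it. The class name `Collection`, its methods, and the explicit probability-vector representation are hypothetical, introduced here only for illustration; a real tester would see the distributions only through `sample` or `query`.

```python
import random


class Collection:
    """Hypothetical oracle over m explicit distributions on range(n)."""

    def __init__(self, dists, weights=None, seed=0):
        self.dists = dists  # dists[i][x] = probability of x under D_i
        m = len(dists)
        # Weights on the indices; uniform if not given (an assumption here).
        self.weights = weights if weights is not None else [1.0 / m] * m
        self.rng = random.Random(seed)

    def _draw(self, i):
        probs = self.dists[i]
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def sample(self):
        """Sampling model: the oracle picks i by weight, then x ~ D_i."""
        i = self.rng.choices(range(len(self.dists)), weights=self.weights, k=1)[0]
        return i, self._draw(i)

    def query(self, i):
        """Query model: the tester picks i, then x ~ D_i."""
        return i, self._draw(i)


if __name__ == "__main__":
    c = Collection([[1.0, 0.0], [0.0, 1.0]])
    print(c.query(1))
    print(c.sample())
```

With unknown weights, a tester in the sampling model cannot read `weights` and only observes the pairs that `sample` returns.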

9 Properties considered:
- Equivalence: all distributions are equal
- "Clusterability": distributions can be clustered into k clusters such that within a cluster, all distributions are close

10 Equivalence vs. independence
- Process of drawing pairs: draw i ∈ [m], x ~ D_i, output (i, x)
- Easy fact: i and x are independent iff the D_i's are all equal
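The easy fact can be verified in a few lines. The derivation below assumes that index i is drawn with probability w_i > 0; the weights w_i and the mixture \bar{D} are notation introduced here rather than on the slide, with uniform w_i = 1/m as the special case.

```latex
\Pr[(i,x)] = w_i\, D_i(x), \qquad
\Pr[i] = w_i, \qquad
\Pr[x] = \sum_{j=1}^{m} w_j\, D_j(x) =: \bar{D}(x).

i \text{ and } x \text{ independent}
\iff w_i\, D_i(x) = w_i\, \bar{D}(x) \quad \forall\, i, x
\iff D_i = \bar{D} \quad \forall\, i
\iff D_1 = D_2 = \dots = D_m .
```

The last step holds because every D_i equal to the mixture makes them equal to one another, and conversely a mixture of identical distributions is that same distribution.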

11 Results
Def: (D_1, ..., D_m) has the Equivalence property if D_i = D_{i'} for all 1 ≤ i, i' ≤ m.
         Lower bound            Upper bound
n > m    Ω(n^{2/3} m^{1/3})     Õ(n^{2/3} m^{1/3})  (unknown weights)
m > n    Ω(n^{1/2} m^{1/2})     Õ(n^{1/2} m^{1/2})  (known weights)
Also yields a "tight" lower bound for independence testing.

12 Clusterability
- Can we cluster the distributions s.t. within each cluster, the distributions are (very) close?
- Sample complexity of the test is O(k n^{2/3}), for n = domain size, k = number of clusters
- No dependence on the number of distributions
- Closeness requirement is very stringent

13 Open Questions
- Clusterability in the sampling model, less stringent notion of "close"
- Other properties of collections? E.g., all distributions are shifts of each other?

14 Thank you

