Published by Logan Rogers. Modified over 9 years ago.
1
Testing Collections of Properties. Reut Levi, Dana Ron, Ronitt Rubinfeld. ICS 2011.
2
Shopping distribution What properties do your distributions have?
3
Transactions in California vs. transactions in New York. Testing closeness of two distributions: trend change?
4
Testing Independence: Shopping patterns: Independent of zip code?
5
This work: Many distributions
6
One distribution: D is an arbitrary black-box distribution over [n] that generates i.i.d. samples. What is the sample complexity in terms of n? (Can it be sublinear?)
[Figure: D → samples → Test → Pass/Fail?]
7
Some answers…
Uniformity: $\Theta(n^{1/2})$ [Goldreich, Ron 00], [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], [Paninski 08]
Identity: $\Theta(n^{1/2})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]
Closeness: $\Theta(n^{2/3})$ [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]
Independence: $O(n_1^{2/3} n_2^{1/3})$, $\Omega(n_1^{2/3} n_2^{1/3})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work
Entropy: $n^{1/\beta^2 + o(1)}$ [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]
Support size: $\Theta(n/\log n)$ [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]
Monotonicity on total order: $\Theta(n^{1/2})$ [Batu, Kumar, Rubinfeld 04]
Monotonicity on poset: $n^{1-o(1)}$ [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
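To give a feel for why roughly $n^{1/2}$ samples can suffice for uniformity, here is a minimal sketch of a collision-based statistic, in the spirit of collision-counting testers. It is an illustration, not the algorithm from any of the cited papers: the function names, the midpoint threshold, and the choice of sample size are assumptions. The uniform distribution over [n] has collision probability exactly 1/n, while any distribution that is eps-far from uniform in L1 distance has collision probability at least (1 + eps^2)/n.

```python
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that hold the same value."""
    s = len(samples)
    if s < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    # A value seen c times contributes c choose 2 colliding pairs.
    colliding = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding / (s * (s - 1) // 2)


def looks_uniform(samples, n, eps):
    """Accept if the empirical collision rate is near 1/n.

    The threshold (1 + eps**2 / 2) / n is a hypothetical midpoint between
    the uniform value 1/n and the eps-far value (1 + eps**2) / n.
    """
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / n
```

For the estimate to be reliable, the number of samples has to be large enough that collisions actually occur, which for constant eps is on the order of the square root of n; the exact constants are not specified here.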
8
Collection of distributions: D_1, D_2, …, D_m. Two models:
Sampling model: get (i, x) for random i, with $x \sim D_i$
Query model: get (i, x) for query i, with $x \sim D_i$
Sample complexity in terms of n, m?
Further refinement: known or unknown distribution on the i's?
[Figure: D_1, D_2, …, D_m → samples → Test → Pass/Fail?]
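The difference between the two access models can be sketched as two methods on one oracle. This is a toy illustration under stated assumptions: the class name `Collection`, its fields, and the 0-based indexing are hypothetical, each distribution is given as an explicit probability list, and uniform weights on the indices are used when none are supplied.

```python
import random


class Collection:
    """Toy oracle for m distributions over {0, ..., n-1} (0-based here)."""

    def __init__(self, dists, weights=None, seed=None):
        self.dists = dists  # dists[i][x] = probability that D_i outputs x
        m = len(dists)
        # Distribution on the indices i; uniform if not given (an assumption).
        self.weights = weights if weights is not None else [1.0 / m] * m
        self.rng = random.Random(seed)

    def _draw(self, i):
        d = self.dists[i]
        return self.rng.choices(range(len(d)), weights=d)[0]

    def sample(self):
        """Sampling model: the oracle picks i at random, then x from D_i."""
        i = self.rng.choices(range(len(self.dists)), weights=self.weights)[0]
        return i, self._draw(i)

    def query(self, i):
        """Query model: the tester picks i, the oracle draws x from D_i."""
        return i, self._draw(i)
```

In the sampling model the tester has no control over which distribution a sample comes from, which is why the weights on the indices (known or unknown) matter there.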
9
Properties considered:
Equivalence: all distributions are equal.
"Clusterability": the distributions can be clustered into k clusters such that, within a cluster, all distributions are close.
10
Equivalence vs. independence
Process of drawing pairs: draw $i \in [m]$, draw $x \sim D_i$, output (i, x).
Easy fact: (i, x) are independent iff the D_i's are all equal.
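A short derivation of the easy fact, assuming index i is drawn with probability $w_i > 0$ (the symbols $w_i$ and $\bar{D}$ are notation introduced here for illustration). The joint distribution of the pair and the marginal of x are

\[
\Pr[(i, x)] = w_i \, D_i(x), \qquad \bar{D}(x) = \sum_{j=1}^{m} w_j \, D_j(x).
\]

The pair is independent exactly when the joint factors into the product of its marginals:

\[
w_i \, D_i(x) = w_i \, \bar{D}(x) \quad \text{for all } i, x
\;\Longleftrightarrow\;
D_i = \bar{D} \quad \text{for all } i .
\]

If every $D_i$ equals $\bar{D}$, the distributions are all equal; conversely, if they all equal some common D, then $\bar{D} = D$ because the weights sum to 1, so the factorization holds.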
11
Results
Def: (D_1, …, D_m) has the Equivalence property if D_i = D_{i'} for all 1 ≤ i, i' ≤ m.
n > m: lower bound $\Omega(n^{2/3} m^{1/3})$; upper bound $\tilde{O}(n^{2/3} m^{1/3})$ (unknown weights)
m > n: lower bound $\Omega(n^{1/2} m^{1/2})$; upper bound $\tilde{O}(n^{1/2} m^{1/2})$ (known weights)
Also yields a "tight" lower bound for independence testing.
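As a quick sanity check on how the two regimes fit together (an observation about the expressions only, ignoring constants and logarithmic factors), compare the two bounds directly:

\[
\frac{n^{2/3} m^{1/3}}{n^{1/2} m^{1/2}} = \left(\frac{n}{m}\right)^{1/6}.
\]

So $n^{2/3} m^{1/3}$ is the larger expression when n > m, $n^{1/2} m^{1/2}$ is the larger one when m > n, and the two coincide at n = m, where both equal n.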
12
Clusterability
Can we cluster the distributions such that, in each cluster, the distributions are (very) close?
Sample complexity of the test is $O(k n^{2/3})$, for n = domain size and k = number of clusters.
No dependence on the number of distributions.
The closeness requirement is very stringent.
13
Open Questions
Clusterability in the sampling model, with a less stringent notion of closeness.
Other properties of collections? E.g., all distributions are shifts of each other?
14
Thank you