On the Complexity of Approximating the VC Dimension Chris Umans, Microsoft Research joint work with Elchanan Mossel, Microsoft Research June 2001.

1 On the Complexity of Approximating the VC Dimension
Chris Umans, Microsoft Research
joint work with Elchanan Mossel, Microsoft Research
June 2001

2 The VC Dimension
C: a collection of subsets of a universe U
VC(C) = VC dimension of C: the size of the largest subset T ⊆ U shattered by C
T is shattered if every subset T' ⊆ T is expressible as T ∩ S for some S ∈ C
Example: C = {{a}, {a, c}, {a, b, c}, {b, c}, {b}} has VC(C) = 2; {b, c} is shattered by C
Plays an important role in learning theory, finite automata, comparability theory, computational geometry
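The definition can be checked by brute force on the slide's example. The sketch below is only an illustration; the helper names `is_shattered` and `vc_dimension` are not from the talk.

```python
from itertools import combinations

def is_shattered(T, C):
    """True if every subset of T equals T ∩ S for some S in C."""
    T = frozenset(T)
    traces = {T & frozenset(S) for S in C}  # distinct intersections with T
    return len(traces) == 2 ** len(T)

def vc_dimension(C, universe):
    """Size of the largest shattered subset of the universe (brute force)."""
    best = 0
    for k in range(1, len(universe) + 1):
        if any(is_shattered(T, C) for T in combinations(universe, k)):
            best = k
        else:
            break  # subsets of shattered sets are shattered, so stop here
    return best

C = [{"a"}, {"a", "c"}, {"a", "b", "c"}, {"b", "c"}, {"b"}]
print(is_shattered({"b", "c"}, C))       # True
print(vc_dimension(C, ["a", "b", "c"]))  # 2
```

Since all three elements would need 8 distinct traces and C has only 5 sets, the full universe cannot be shattered.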

3 Complexity Questions
Given C, compute VC(C)
since VC(C) ≤ log |C|, can compute in O(n^{log n}) time (Linial-Mansour-Rivest 88)
probably can't do better: problem is LOGNP-complete (Papadimitriou-Yannakakis 96)
Often C has a small implicit representation: C(i, x) is a polynomial-size circuit such that C(i, x) = 1 iff x belongs to set i
implicit version is Σ_3-complete (Schaefer 99) (as hard as ∃a ∀b ∃c φ(a, b, c) for a CNF formula φ)
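The quasi-polynomial bound comes from the fact that shattering a set of size k requires at least 2^k distinct sets, so only candidate subsets of size at most log |C| need to be tried. A minimal sketch of that bounded search for an explicitly listed collection (the function name is hypothetical, and this is not claimed to be the cited authors' algorithm):

```python
from itertools import combinations
from math import floor, log2

def vc_dimension_bounded(C, universe):
    """Brute-force VC dimension, trying only subsets of size <= log2 |C|."""
    sets = {frozenset(S) for S in C}
    if not sets:
        return 0
    max_k = floor(log2(len(sets)))  # VC(C) <= log2 |C|
    best = 0
    for k in range(1, max_k + 1):
        shattered = any(
            len({frozenset(T) & S for S in sets}) == 2 ** k
            for T in combinations(universe, k)
        )
        if not shattered:
            break
        best = k
    return best

C = [{"a"}, {"a", "c"}, {"a", "b", "c"}, {"b", "c"}, {"b"}]
print(vc_dimension_bounded(C, ["a", "b", "c"]))  # 2
```

With n the input size, there are at most n^{log n} candidate subsets of size up to log n, which is where the running time on the slide comes from.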

4 Approximation
Given C (circuit with N inputs), approximate VC(C)
approximation within N^{1-ε} is NP-hard (Schaefer 99)
this paper:
  Σ_3-hard to approximate to within 2-ε (for any ε > 0)
  approximable to within 2 in AM
  AM-hard to approximate to within N^ε (for some ε > 0) (any ε < 1 if optimal explicit dispersers exist)
[Diagram: complexity classes PSPACE, Σ_3, Π_2, AM, NP, P]

5 Why Interesting
first constant approximability threshold for an optimization problem in the Polynomial Hierarchy
we locate the threshold with unusual accuracy:
  Σ_3-hard to within 2 - Ω(N^{-(1/4 - ε)}) (for any ε > 0)
  approximable in AM to within 2 - O(N^{-1/2})
main idea in Σ_3-hardness result: desired reduction is essentially a randomness extraction problem
AM-hardness result requires strong amplification of AM using dispersers
constant in disperser seed length matters

6 Outline for Rest of Talk
Arthur-Merlin protocol for 2-approximation
Σ_3-hardness
  Schaefer's Σ_3-completeness proof
  why we need "randomness extraction"
  dispersers for simple distributions using list-decodable codes
AM-hardness
conclusions

7 2-approximation in AM
Sauer-Shelah(-Perles) Lemma: Let C be a collection of subsets of [n] such that |C| > Σ_{j=0}^{m} (n choose j). Then VC(C) ≥ m+1.
Mutual input: circuit C(i, x) (= 1 iff x in set i)
Merlin sends a set of k elements X = {x_0, x_1, ..., x_{k-1}}
Arthur replies with a random k-bit string s
Merlin sends an index i
Accept iff C(i, x_j) = s_j for j = 0, 1, 2, ..., k-1
VC(C) < k/2 ⇒ VC(C ∩ X) < k/2 ⇒ Pr[accept] ≤ 1/2 (by the lemma, C restricted to X realizes at most 2^{k-1} of the 2^k possible strings s)
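The soundness step can be illustrated numerically: for a fixed X, the best Merlin can do is answer whenever Arthur's string s is one of the patterns that C realizes on X, so the acceptance probability is the number of realized patterns divided by 2^k. The toy collection and helper names below are assumptions made for illustration only.

```python
from math import comb

def sauer_bound(n, m):
    """Sum of C(n, j) for j = 0..m: the maximum size of a collection of
    subsets of an n-element set whose VC dimension is at most m."""
    return sum(comb(n, j) for j in range(m + 1))

def acceptance_probability(circuit, indices, X):
    """Probability over a uniform |X|-bit string s that some index i
    satisfies circuit(i, x_j) == s_j for every j."""
    patterns = {tuple(circuit(i, x) for x in X) for i in indices}
    return len(patterns) / 2 ** len(X)

# Toy collection with VC dimension 1 over the universe {1, 2, 3, 4}.
sets = [set(), {1}, {2}, {3}, {4}]
circuit = lambda i, x: int(x in sets[i])

X = [1, 2, 3, 4]  # Merlin's k = 4 elements; VC = 1 < k/2
p = acceptance_probability(circuit, range(len(sets)), X)
print(sauer_bound(4, 1), p)  # 5 0.3125
```

Here the collection meets the Sauer-Shelah bound with equality (5 sets), and Arthur accepts with probability 5/16, below the 1/2 threshold.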

8 Schaefer's Reduction
φ(a, b, c) an instance of QSAT_3 with |a| = |b| = |c| = n
Circuit C encodes these sets over universe {0,1}^n × [n]:
S_(α, v, w) = {α} × v if φ(α, v, w) = 1 (v is an n-bit string, read as a subset of [n])
if ∀b ∃c φ(a, b, c) = 1 then C includes the sets:
  {a} × 0...00000
  {a} × 0...00001
  ⋮
  {a} × 1...11111
and so the set {a} × 1...11111 is shattered by C ⇒ VC(C) ≥ n

9 Schaefer's Reduction
Circuit C encodes these sets over universe {0,1}^n × [n]:
S_(α, v, w) = {α} × v if φ(α, v, w) = 1
In general, a set of the form {a} × (VC(C) ones) is shattered
if VC(C) ≥ n, then {a} × 1...11111 is shattered, which implies ∀b ∃c φ(a, b, c) = 1
For inapproximability, we want a relaxed version of this statement to hold for some β close to ½:
if VC(C) ≥ βn, then {a} × (βn ones) is shattered, which implies ∀b ∃c φ(a, b, c) = 1 ???
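The exact (non-relaxed) statement can be checked on tiny instances by building the set system explicitly. This is a brute-force sketch with n = 2; the two predicates `phi_yes` and `phi_no` are made-up examples, positions are indexed from 0 rather than 1, and the Greek index of the sets is written `alpha` here.

```python
from itertools import combinations, product

def schaefer_sets(phi, n):
    """All sets {alpha} x v with phi(alpha, v, w) true for some w;
    v is read as the subset {j : v[j] == 1} of range(n)."""
    bits = list(product((0, 1), repeat=n))
    return {
        frozenset((alpha, j) for j in range(n) if v[j])
        for alpha in bits for v in bits for w in bits
        if phi(alpha, v, w)
    }

def vc_dimension(sets, universe):
    """Brute-force VC dimension of a collection of frozensets."""
    best = 0
    for k in range(1, len(universe) + 1):
        if any(len({frozenset(T) & S for S in sets}) == 2 ** k
               for T in combinations(universe, k)):
            best = k
        else:
            break
    return best

def qsat3(phi, n):
    """Evaluate: exists a, for all b, exists c, phi(a, b, c)."""
    bits = list(product((0, 1), repeat=n))
    return any(all(any(phi(a, b, c) for c in bits) for b in bits) for a in bits)

n = 2
universe = [(alpha, j) for alpha in product((0, 1), repeat=n) for j in range(n)]
phi_yes = lambda a, b, c: a == (1, 1) and b == c  # true instance
phi_no = lambda a, b, c: b != (1, 1)              # false instance

for phi in (phi_yes, phi_no):
    print(qsat3(phi, n), vc_dimension(schaefer_sets(phi, n), universe))
```

On these two instances the VC dimension equals n exactly when the quantified formula is true, matching the equivalence on the slide.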

10 A Randomness Extraction Problem
we have a distribution X ⊂ {0,1}^n with βn entropy
we need: a function EXT taking x ∈ X (n bits) and a uniform seed y (O(log n) bits) to m = (βn)^{Ω(1)} bits
this gives us: (∀x ∈ X) ∃c ∧_y φ(a, EXT(x, y), c_y) ⇔ ∀b ∃c φ(a, b, c)
Note:
  we only need a disperser
  we need zero-error! (need to hit all of {0,1}^m)
  BUT, X is in a special class of distributions...

11 Generalized Bit-Fixing Sources
From what class of distributions is X?
recall a set of the form {a} × 001001010001 (βn ones, marked ↑ on the slide) is shattered
implies C includes the following sets, which together form {a} × X:
  {a} × ??0??0?0???0
  {a} × ??0??0?0???1
  ⋮
  {a} × ??1??1?1???1
projecting onto the βn red positions (the positions of the ones), we get the uniform distribution
"generalized bit-fixing source of dimension βn"
(Kahn-Kalai-Linial 88: no deterministic extraction if (1 - β)n ≥ Ω(n/log n))

12 Dispersers from Codes
[Diagram: x ∈ X (n bits) and a uniform seed (t = O(log n) bits) enter EXT, which outputs m = (βn)^{Ω(1)} bits]
we need: EXT(X, U_t) = {0,1}^m, for all generalized bit-fixing sources X of dimension βn
binary list-decodable code ECC: {0,1}^m → {0,1}^n
Decode(R, i) gives the i-th codeword within distance at most (1-β)n from R
EXT(x, i) ≝ Decode(x, i)
Proof: ∀z ∈ {0,1}^m ∃x ∈ X s.t. dist(ECC(z), x) ≤ (1-β)n (take x agreeing with ECC(z) on the βn free positions); therefore EXT(X, U_t) hits every z.
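The covering argument in the proof can be replayed on a toy instance. In the sketch below everything concrete is an assumption for illustration: the code is a 3-fold repetition of a 2-bit message (n = 6), list decoding is done by exhaustive search, and the source has 4 free positions, so the decoding radius is n - 4 = 2. A real instantiation would use an efficiently list-decodable code.

```python
from itertools import product

def hamming(u, v):
    """Number of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def make_ext(ecc, m, radius):
    """EXT(x, i) = i-th message whose codeword is within `radius` of x
    (None when the list has fewer than i + 1 entries)."""
    messages = list(product((0, 1), repeat=m))
    def ext(x, i):
        nearby = [z for z in messages if hamming(ecc(z), x) <= radius]
        return nearby[i] if i < len(nearby) else None
    return ext

def toy_ecc(z):
    """Hypothetical toy code: repeat the 2-bit message three times."""
    return z * 3

def bit_fixing_source(n, free):
    """One string per pattern on the free positions; the other positions
    are filled in arbitrarily and may depend on the pattern."""
    X = []
    for p in product((0, 1), repeat=len(free)):
        x = [1] * n
        for pos, bit in zip(free, p):
            x[pos] = bit
        x[1] = p[0] ^ p[1]  # position 1 is not free; its value depends on p
        X.append(tuple(x))
    return X

n, m, free = 6, 2, (0, 2, 3, 5)
ext = make_ext(toy_ecc, m, radius=n - len(free))
X = bit_fixing_source(n, free)
outputs = {ext(x, i) for x in X for i in range(2 ** m)} - {None}
print(outputs == set(product((0, 1), repeat=m)))  # True
```

Every message z is reached because some x in the source agrees with ECC(z) on all free positions, which puts ECC(z) inside the decoding radius of that x.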

13 Dispersers from Codes
[Diagram: x ∈ X (n bits) and a uniform seed (t = O(log n) bits) enter EXT, which outputs m = (βn)^{Ω(1)} bits]
using Guruswami-Sudan 00:
Theorem: for all 1 > γ > ¾, there exists an explicit zero-error disperser EXT: {0,1}^n × {0,1}^{2(1-γ) log n + O(1)} → {0,1}^m for generalized bit-fixing sources of dimension βn = n/2 + n^γ, with m = n^{Ω(1)}.
(notice degree is ≈ n^{1/2} instead of ≈ n)
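The degree remark follows from the seed length by a one-line calculation. The derivation below assumes that "degree" means the number of seeds 2^t, that logarithms are base 2, and it writes γ for the exponent in the source dimension n/2 + n^γ.

```latex
\[
  D \;=\; 2^{t}
    \;=\; 2^{\,2(1-\gamma)\log n + O(1)}
    \;=\; O\!\left(n^{2(1-\gamma)}\right),
  \qquad
  \gamma > \tfrac{3}{4} \;\Longrightarrow\; 2(1-\gamma) < \tfrac{1}{2}.
\]
```

So the degree stays below roughly n^{1/2}, approaching it as γ tends to 3/4.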

14 (2-ε) Approximation is Σ_3-Hard
φ(a, b, c) an instance of QSAT_3 with |a| = |b| = |c| = m
Circuit C encodes these sets over universe {0,1}^m × [n]:
S_(α, v, w) = {α} × v if (∧_y φ(α, EXT(v, y), w_y)) = 1
if ∀b ∃c φ(a, b, c) = 1, then {a} × 1111111111 is shattered ⇒ VC(C) ≥ n
if VC(C) ≥ (1/2 + ε)n = βn, then {a} × (βn ones) is shattered, so C contains the sets {a} × x, x ∈ X, for some generalized bit-fixing source X of dimension βn
⇒ (∀x ∈ X) ∃c ∧_y φ(a, EXT(x, y), c_y) ⇔ ∀b ∃c φ(a, b, c)
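To see how close the resulting gap is to 2, one can compare the two VC values separated by the reduction, n in the true case versus less than βn in the false case, using the dimension n/2 + n^γ from the disperser theorem. This is a worked calculation in terms of n only; translating it to the instance size N is not attempted here.

```latex
\[
  \frac{n}{\beta n}
  \;=\; \frac{n}{\,n/2 + n^{\gamma}\,}
  \;=\; \frac{2}{1 + 2n^{\gamma-1}}
  \;=\; 2 - \frac{4\,n^{\gamma-1}}{1 + 2n^{\gamma-1}}
  \;=\; 2 - \Theta\!\left(n^{\gamma-1}\right).
\]
```

Since γ < 1, the correction term vanishes as n grows, so the ratio exceeds 2 - ε for any fixed ε > 0 once n is large enough.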

15 N^ε Approximation is AM-hard
language L in AM ⇔ there exists a poly-time computable R_L such that:
  x ∈ L ⇒ Pr_y[∃z R_L(x, y, z) = 1] = 1
  x ∉ L ⇒ Pr_y[∃z R_L(x, y, z) = 1] ≤ ½
strong amplification using dispersers yields (∀δ > 0):
  x ∉ L ⇒ Pr_y[∃z R_L(x, y, z) = 1] ≤ exp(|y|^δ - |y|)
Circuit C encodes sets S_(y, z) = y if R_L(x, y, z) = 1
if x ∈ L, then 11111111 is shattered ⇒ VC(C) = |y|
if x ∉ L, then #sets ≤ exp(|y|^δ) ⇒ VC(C) ≤ |y|^δ
size of instance (N) depends on degree of disperser
to get a gap of N^{1-ε}, need a near-linear degree disperser
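The counting step in the x ∉ L case can be spelled out as follows, assuming exp denotes base-2 exponentiation and writing δ for the amplification exponent. Each accepted y contributes one set, and shattering d elements needs at least 2^d distinct sets.

```latex
\[
  \#\{\,y : \exists z\; R_L(x,y,z)=1\,\}
  \;\le\; 2^{|y|}\cdot 2^{\,|y|^{\delta}-|y|}
  \;=\; 2^{\,|y|^{\delta}},
  \qquad
  2^{\mathrm{VC}(C)} \;\le\; |C| \;\le\; 2^{\,|y|^{\delta}}
  \;\Longrightarrow\;
  \mathrm{VC}(C) \;\le\; |y|^{\delta}.
\]
```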

16 Conclusions
fairly complete picture of approximability of VC dimension
list-decodable binary codes yield zero-error dispersers for a non-trivial class of distributions
Some improvements and generalizations:
  Ta-Shma-Zuckerman-Safra 01 construct near-linear degree dispersers (available on ECCC)
  Using q-ary list-decodable codes, and a generalization of Sauer's Lemma, we obtain an approximability threshold of q for a generalization of VC dimension (in final version)

