Slide 1: Reductions to the Noisy Parity Problem
Vitaly Feldman, Parikshit Gopalan, Subhash Khot, Ashok K. Ponnuswami (Harvard, UW, Georgia Tech)
a.k.a. "New Results on Learning Parities, Halfspaces, Monomials, Mahjongg etc."
Slide 2: Uniform Distribution Learning
The learner receives examples (x, f(x)), where x is uniform over {0,1}^n and f: {0,1}^n → {-1,+1}.
Goal: learn the function f in poly(n) time.
Slide 3: Uniform Distribution Learning
Examples (x, f(x)). Goal: learn f in poly(n) time.
This is information-theoretically impossible for arbitrary f, so we assume f has nice structure, such as:
1. Parity: f(x) = (-1)^(a · x) for some a ∈ {0,1}^n
2. Halfspace: f(x) = sgn(w · x)
3. k-junta: f depends on only k coordinates, f(x) = g(x_{i_1}, …, x_{i_k})
4. Decision tree
5. DNF
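To make the first three classes concrete, here is a minimal Python sketch; the parameter names a, w, S and the inner function g are illustrative, not from the talk:

```python
import numpy as np

def parity(a, x):
    # Parity: f(x) = (-1)^(a . x) for a fixed a in {0,1}^n.
    return (-1) ** (int(np.dot(a, x)) % 2)

def halfspace(w, x):
    # Halfspace: f(x) = sgn(w . x), with sgn(0) mapped to +1 here.
    return 1 if np.dot(w, x) >= 0 else -1

def junta(g, S, x):
    # k-junta: f depends only on the k coordinates indexed by S.
    return g(tuple(x[i] for i in S))
```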
Slide 4: Uniform Distribution Learning
Examples (x, f(x)). Goal: learn f in poly(n) time.
  Class          Time         Technique
  Parity         n^O(1)       Gaussian elimination
  Halfspace      n^O(1)       linear programming
  k-junta        n^0.7k       [MOS]
  Decision tree  n^log n      Fourier
  DNF            n^log n      Fourier
Slide 5: Uniform Distribution Learning with Random Noise
The learner receives examples (x, (-1)^e · f(x)), where x is uniform over {0,1}^n, f: {0,1}^n → {-1,+1}, and the noise bit e = 1 with probability η and e = 0 with probability 1 − η.
Goal: learn f in poly(n) time.
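A sketch of this example oracle in Python (the helper name noisy_example is illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_example(f, n, eta):
    # Draw x uniformly from {0,1}^n and return (x, (-1)^e * f(x)),
    # where the noise bit e = 1 with probability eta, else e = 0.
    x = rng.integers(0, 2, size=n)
    label = f(x)
    if rng.random() < eta:
        label = -label
    return x, label
```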
Slide 6: Uniform Distribution Learning with Random Noise
Examples (x, (-1)^e · f(x)). Goal: learn f in poly(n) time.
  Class          Time         Technique
  Parity         ?            the Noisy Parity problem (next slide)
  Halfspace      n^O(1)       [BFKV]
  k-junta        n^k          Fourier
  Decision tree  n^log n      Fourier
  DNF            n^log n      Fourier
Slide 7: The Noisy Parity Problem
Examples (x, (-1)^e · f(x)), where f is a parity.
Coding theory view: decoding a random linear code from random noise.
Best known algorithm: 2^(n/log n), Blum-Kalai-Wasserman [BKW]. Believed to be hard.
Variant: noisy parity of size k, i.e., the parity is over k variables. Brute force runs in time O(n^k).
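A hedged sketch of that O(n^k) brute-force baseline: enumerate all size-k subsets and keep the parity that agrees with the labels most often (the function name learn_sparse_parity is illustrative):

```python
from itertools import combinations

def learn_sparse_parity(examples, n, k):
    # examples: list of (x, y) with x a 0/1 sequence and y in {-1, +1}.
    # Try all C(n, k) candidate parities chi_S(x) = (-1)^(sum_{i in S} x_i)
    # and return the S with the highest empirical agreement; with noise
    # rate eta < 1/2 and enough samples, the true parity wins.
    best_S, best_agree = None, -1
    for S in combinations(range(n), k):
        agree = sum(1 for x, y in examples
                    if (-1) ** (sum(x[i] for i in S) % 2) == y)
        if agree > best_agree:
            best_S, best_agree = S, agree
    return best_S
```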
Slide 8: Agnostic Learning under the Uniform Distribution
The learner receives examples (x, g(x)), where g(x) is a {-1,+1}-valued random variable with Pr_x[g(x) ≠ f(x)] ≤ η for some f in the class.
Goal: get an approximation to g that predicts as well as f does.
Slide 9: Agnostic Learning under the Uniform Distribution
Examples (x, g(x)). Goal: get an approximation to g that is as good as f. If the function f is a:
  Class          Time           Reference
  Parity         2^(n/log n)    [FGKP]
  Halfspace      n^O(1)         [KKMS]
  k-junta        n^k            [KKMS]
  Decision tree  n^log n        [KKMS]
  DNF            n^log n        [KKMS]
Slide 10: Agnostic Learning of Parities
Examples (x, g(x)). Given a g that has a large Fourier coefficient, find it.
Coding theory view: decoding a random linear code with adversarial noise.
If queries were allowed: Hadamard list decoding [GL, KM], the basis of the algorithms for decision trees [KM] and DNF [Jackson].
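For a fixed set S, the Fourier coefficient ĝ(S) = E_x[g(x) · χ_S(x)] is easy to estimate from random examples by an empirical average; a sketch follows (the hard part the slide refers to is searching for a large coefficient without query access):

```python
import numpy as np

def estimate_fourier_coeff(examples, S):
    # Empirical estimate of g_hat(S) = E_x[ g(x) * chi_S(x) ],
    # where chi_S(x) = (-1)^(sum of x_i for i in S).
    vals = [y * (-1) ** (sum(x[i] for i in S) % 2) for x, y in examples]
    return float(np.mean(vals))
```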
Slide 11: Reductions Between Problems and Models
Three example oracles:
  Noise-free:   (x, f(x))
  Random noise: (x, (-1)^e · f(x))
  Agnostic:     (x, g(x))
Slide 12: Reductions to Noisy Parity
Theorem [FGKP]: Learning juntas, decision trees, and DNFs reduces to learning noisy parities of size k.
  Class               Size of parity   Error rate
  k-junta             k                1/2 − 2^(-k)
  Decision tree, DNF  log n            1/2 − n^(-2)
Slide 13: Uniform Distribution Learning (recap of Slide 4)
Examples (x, f(x)). Goal: learn f in poly(n) time.
  Class          Time         Technique
  Parity         n^O(1)       Gaussian elimination
  Halfspace      n^O(1)       linear programming
  k-junta        n^0.7k       [MOS]
  Decision tree  n^log n      Fourier
  DNF            n^log n      Fourier
Slide 14: Reductions to Noisy Parity
Theorem [FGKP]: Learning juntas, decision trees, and DNFs reduces to learning noisy parities of size k.
  Class               Size of parity   Error rate
  k-junta             k                1/2 − 2^(-k)
  Decision tree, DNF  log n            1/2 − n^(-2)
Evidence in favor of noisy parity being hard? The reduction holds even with random classification noise.
Slide 15: Uniform Distribution Learning with Random Noise (recap of Slide 6)
Examples (x, (-1)^e · f(x)). Goal: learn f in poly(n) time.
  Class          Time         Technique
  Parity         ?            the Noisy Parity problem
  Halfspace      n^O(1)       [BFKV]
  k-junta        n^k          Fourier
  Decision tree  n^log n      Fourier
  DNF            n^log n      Fourier
Slide 16: Reductions to Noisy Parity
Theorem [FGKP]: Agnostically learning parity with error rate η reduces to learning noisy parity with (random) error rate η.
Combined with [BKW], this gives a 2^(n/log n)-time agnostic learning algorithm for parities.
Main idea: a noisy-parity algorithm can help find large Fourier coefficients from random examples.
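The link rests on a standard Fourier identity: a parity's agreement with g is determined by its Fourier coefficient,

```latex
\Pr_x[\,g(x) = \chi_S(x)\,] \;=\; \tfrac{1}{2} + \tfrac{1}{2}\,\hat{g}(S),
\qquad \hat{g}(S) = \mathbb{E}_x[\,g(x)\,\chi_S(x)\,].
```

So if ĝ(S) is large, the parity χ_S has error rate noticeably below 1/2 on the examples (x, g(x)), which is exactly an instance of noisy parity.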
Slide 17: Reductions Between Problems and Models
  Noise-free:   (x, f(x))
  Random noise: (x, (-1)^e · f(x))
  Agnostic:     (x, g(x))
The tool connecting them: the probabilistic oracle.
Slide 18: Probabilistic Oracles
Given h: {0,1}^n → [-1,1], the probabilistic oracle for h returns pairs (x, b) with x uniform over {0,1}^n, b ∈ {-1,+1}, and E[b | x] = h(x).
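A minimal sketch of such an oracle in Python (prob_oracle is an illustrative name): sample x uniformly and output b = +1 with probability (1 + h(x))/2, which gives E[b | x] = h(x).

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_oracle(h, n):
    # Given h: {0,1}^n -> [-1, 1], draw x uniformly from {0,1}^n
    # and return b in {-1, +1} with E[b | x] = h(x).
    x = rng.integers(0, 2, size=n)
    b = 1 if rng.random() < (1 + h(x)) / 2 else -1
    return x, b
```

As the next three slides show, the three example oracles are instances of this one sampler: h = f gives noise-free examples, h = (1 − 2η)·f gives random classification noise, and h(x) = E[g(x) | x] gives the agnostic model.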
Slide 19: Simulating Noise-Free Oracles
Let f: {0,1}^n → {-1,+1} and take h = f. The oracle returns (x, b) with E[b | x] = f(x) ∈ {-1,+1}; since b itself lies in {-1,+1}, this forces b = f(x), i.e., noise-free examples (x, f(x)).
Slide 20: Simulating Random Noise
Given f: {0,1}^n → {-1,+1} and noise rate η = 0.1, let h(x) = 0.8·f(x) = (1 − 2η)·f(x).
Then E[b | x] = 0.8·f(x), so b = f(x) with probability 0.9 and b = −f(x) with probability 0.1: exactly random classification noise at rate η.
Slide 21: Simulating Adversarial Noise
Given a {-1,+1}-valued random variable g(x) with Pr_x[g(x) ≠ f(x)] = η, let h(x) = E[g(x) | x].
The bound on the error rate implies E_x[|h(x) − f(x)|] ≤ 2η.
Slide 22: Reductions Between Problems and Models (recap of Slide 17)
  Noise-free:   (x, f(x))
  Random noise: (x, (-1)^e · f(x))
  Agnostic:     (x, g(x))
The probabilistic oracle ties the three models together.