Learning and Testing Junta Distributions
Maryam Aliakbarpour (MIT)
Joint work with: Eric Blais (U Waterloo) and Ronitt Rubinfeld (MIT and TAU)
The Problem
Relevant features in distributions
Distribution over heart attack patients.
Binary features list: Smokes; Does not regularly exercise; Gender: male.
Feature labels: Correlates with heart attack (junta coordinates); Irrelevant to heart attack (distributed the same).
Relevant features in distributions
Heart attack correlates: Non-smoker / smoker; Exercises / Does not exercise.
Irrelevant to heart attack: all other features.
Assumption: Irrelevant features are uniformly distributed.
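As a rough illustration of the uniformity assumption, the sketch below draws samples in which a few relevant coordinates follow an arbitrary joint distribution and every other coordinate is an independent uniform bit. The names `sample_junta`, `junta`, and `q`, and the toy probabilities, are hypothetical and chosen only for this example; they are not taken from the talk.

```python
import random

def sample_junta(n, junta, q, rng):
    """Draw one sample x in {0,1}^n (illustrative sketch only).

    junta: list of the relevant coordinate indices.
    q:     dict mapping each bit pattern on the junta to its probability.
    Coordinates outside `junta` are independent uniform bits.
    """
    patterns = list(q.keys())
    weights = [q[p] for p in patterns]
    pattern = rng.choices(patterns, weights=weights, k=1)[0]
    x = [rng.randint(0, 1) for _ in range(n)]  # start with uniform bits
    for idx, bit in zip(junta, pattern):
        x[idx] = bit  # overwrite the relevant coordinates
    return tuple(x)

rng = random.Random(0)
# Toy numbers (made up): coordinate 0 = "smokes", coordinate 1 = "does not exercise".
q = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.5}
samples = [sample_junta(5, [0, 1], q, rng) for _ in range(20000)]

# Empirical frequency of a 1 in each coordinate.
freqs = [sum(x[i] for x in samples) / len(samples) for i in range(5)]
print(freqs)
```

With these toy numbers, coordinates 0 and 1 each equal 1 with probability 0.7, while the remaining coordinates should sit near 0.5.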
Problem Definition
Relevant features in distributions: heart attack correlates versus features that are irrelevant to heart attack and uniformly distributed.
Testing problem: Is there a small such set?
Learning problem: Which set is it?
Lots of related work
Feature selection: Guyon-Elisseeff ’03, Liu-Motoda ’12, and Chandrashekar-Sahin ’14.
Junta functions: A. Blum ’94, A. Blum-Langley ’97, ..., Blais ’09, and G. Valiant ’12.
Property testing of distributions: GR00, BFR+00, BFF+01, Bat01, BDKR02, BKR04, Val08, Pan08, Val11, DDS+13, ADJ+11, LRR11, ILR12, CDVV14, VV14, DKN15b, DKN15a, ADK15, and CDGR16.
Testing properties of collections of distributions: Levi-Ron-Rubinfeld ’13 and Diakonikolas-Kane ’16.
Learning Algorithm
PAC learning
Our results on Learning
[Table; entries not preserved. Columns: Sample complexity (Lower bound, Upper bound), Running time. Rows: Cover method, Our work.]
Theorem
Overview of the Fourier analysis on Boolean cube
We can estimate!
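The slide's formulas did not survive extraction, so the following is only a sketch of one standard quantity that can be estimated from samples: the expectation of a parity character, E[χ_S(x)] with χ_S(x) = (-1)^(Σ_{i∈S} x_i). Up to the normalization convention, this is the Fourier coefficient of the distribution at S. If S contains a coordinate that is uniform and independent of the rest, the expectation is 0. The helper names and the toy distribution below are assumptions made for illustration.

```python
import random

def character(x, S):
    """chi_S(x) = (-1)^(sum of x_i for i in S)."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def estimate_bias(samples, S):
    """Empirical mean of chi_S over the samples."""
    return sum(character(x, S) for x in samples) / len(samples)

rng = random.Random(1)
n = 4

def draw():
    # Toy distribution (made up): coordinate 0 is 1 with probability 0.8;
    # coordinates 1..3 are independent uniform bits.
    x0 = 1 if rng.random() < 0.8 else 0
    return (x0,) + tuple(rng.randint(0, 1) for _ in range(n - 1))

samples = [draw() for _ in range(20000)]

bias_relevant = estimate_bias(samples, [0])     # true value: 0.2 - 0.8 = -0.6
bias_irrelevant = estimate_bias(samples, [2])   # true value: 0
bias_mixed = estimate_bias(samples, [0, 2])     # true value: 0
print(bias_relevant, bias_irrelevant, bias_mixed)
```

In this toy setting only sets S contained in the relevant coordinates can have a non-negligible estimate, which is one way sample-based estimates can point to the relevant coordinates.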
Lemma 1
Corollary:
Lemma 2
Learning Algorithm
Testing Algorithm
What does it mean to test?
Outcomes: accept / reject
Our results on Testing
[Table; entries not preserved. Columns: Sample complexity (Lower bound, Upper bound), Time complexity. Row: Our work.]
Reduction
Testing Algorithm
Conclusion
Summary: introduced junta distributions; how to learn junta distributions; how to test junta distributions.
Future directions: tighter results; removing the uniformity assumption.
References
Isabelle Guyon and André Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157–1182, 2003.
Huan Liu and Hiroshi Motoda. Feature selection for knowledge discovery and data mining, volume 454. Springer Science & Business Media, 2012.
Girish Chandrashekar and Ferat Sahin. A survey on feature selection methods. Computers & Electrical Engineering, 40(1):16–28, 2014.
Avrim Blum. Relevant examples and relevant features: Thoughts from computational learning theory. In AAAI Fall Symposium on ‘Relevance’, volume 5, 1994.
Avrim Blum and Pat Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2):245–271, December 1997.
Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013.
Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. CoRR, abs/ , URL
Gregory Valiant. Finding correlations in subquadratic time, with applications to learning parities and juntas. In FOCS, pages 11–20, 2012.
Eric Blais. Testing juntas nearly optimally. In Proc. 41st Symposium on Theory of Computing, pages 151–158, 2009.