Data-Powered Algorithms Bernard Chazelle Princeton University
Linear Programming
N constraints and d variables
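For concreteness, a linear program with N constraints and d variables can be written in the standard form below. The symbols $A$, $b$, $c$ and $x$ are notation assumed here; they do not appear on the slide.

\[
\max_{x \in \mathbb{R}^d} \; c^{\top} x
\quad \text{subject to} \quad
A x \le b,
\qquad A \in \mathbb{R}^{N \times d},\; b \in \mathbb{R}^{N},\; c \in \mathbb{R}^{d}.
\]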
Dimension Reduction: Images (face recognition), Signals (voice recognition), Text (NLP), Nearest neighbor searching, Clustering, ...
Dimension reduction All pairwise distances nearly preserved
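"Nearly preserved" is usually made precise through the Johnson-Lindenstrauss lemma. In the notation assumed here ($P$ for the point set, $f$ for the map, $k$ for the target dimension, $\varepsilon$ for the distortion), it reads: for any set $P$ of $n$ points in $\mathbb{R}^d$ and any $0 < \varepsilon < 1$, there is a linear map $f : \mathbb{R}^d \to \mathbb{R}^k$ with $k = O(\log n / \varepsilon^2)$ such that

\[
(1 - \varepsilon)\, \lVert u - v \rVert_2^2
\;\le\; \lVert f(u) - f(v) \rVert_2^2
\;\le\; (1 + \varepsilon)\, \lVert u - v \rVert_2^2
\qquad \text{for all } u, v \in P.
\]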
Johnson-Lindenstrauss Transform (JLT): a random orthogonal matrix of size $(c \log n / \varepsilon^2) \times d$ applied to a vector $v \in \mathbb{R}^d$
Friendly JLT: a $(c \log n / \varepsilon^2) \times d$ matrix whose entries are independent N(0,1)
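A minimal sketch of the Gaussian version, using only the Python standard library. The sizes `n`, `d`, `k`, the seed, and the helper names are illustrative assumptions; `k` stands in for $c \log n / \varepsilon^2$ without fixing the constant $c$.

```python
import math
import random


def gaussian_jlt(points, k, rng):
    """Project d-dimensional points to k dimensions with a k x d N(0,1) matrix."""
    d = len(points[0])
    # k x d matrix of independent standard normal entries.
    matrix = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)  # makes squared lengths correct in expectation
    return [
        [scale * sum(m * x for m, x in zip(row, p)) for row in matrix]
        for p in points
    ]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def distortion_ratios(points, projected):
    """Projected / original squared distance, for every pair of points."""
    n = len(points)
    return [
        sq_dist(projected[i], projected[j]) / sq_dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]


rng = random.Random(0)      # fixed seed, for repeatability only
n, d, k = 10, 400, 200      # hypothetical sizes chosen for a quick run
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
projected = gaussian_jlt(points, k, rng)
ratios = distortion_ratios(points, projected)
print(min(ratios), max(ratios))
```

All ratios should cluster around 1, and the spread shrinks as `k` grows.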
Friendlier JLT: a $(c \log n / \varepsilon^2) \times d$ matrix, i.e. on the order of $d \log n / \varepsilon^2$ entries
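Sparse JLT? A $c \log n \times d$ matrix with only an o(1) fraction of non-zeros, applied to a vector in $\mathbb{R}^d$ (the figure shows a single entry 1)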
Main Tool: Uncertainty Principle Time Frequency Heisenberg
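The discrete form of the uncertainty principle can be checked numerically: a signal concentrated on one time coordinate has a completely flat spectrum. The sketch below uses a naive $O(d^2)$ unitary DFT written for clarity rather than speed; the dimension `d = 16` and the spike position are arbitrary choices.

```python
import cmath
import math


def dft(x):
    """Naive O(d^2) unitary discrete Fourier transform."""
    d = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / d) for t in range(d))
        / math.sqrt(d)
        for f in range(d)
    ]


d = 16
spike = [0.0] * d
spike[3] = 1.0              # all energy sits in one time coordinate

spectrum = dft(spike)
magnitudes = [abs(c) for c in spectrum]
energy = sum(m * m for m in magnitudes)
print(max(magnitudes), min(magnitudes), energy)
```

Every magnitude equals $1/\sqrt{d}$ up to rounding, so the energy that was concentrated in time is spread evenly over all frequencies.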
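Fast Johnson-Lindenstrauss Transform (FJLT): a sparse $(c \log n / \varepsilon^2) \times d$ matrix with N(0,1) non-zeros, composed with the $d \times d$ Discrete Fourier Transform. Running time $O(d \log d + \varepsilon^{-2} \log^3 n)$. Optimal??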
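A rough sketch of the pipeline $x \mapsto P H D x$, under several stated assumptions. A real-valued Walsh-Hadamard transform stands in for the slide's Discrete Fourier Transform. The random sign diagonal `D` and the $N(0, 1/q)$ scaling of the sparse entries follow the published FJLT construction rather than the slide text. The density `q`, the sizes, and all function names are hypothetical choices for illustration.

```python
import math
import random


def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of 2."""
    y = list(x)
    d = len(y)
    h = 1
    while h < d:
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(d)
    return [scale * v for v in y]


def make_fjlt(d, k, q, rng):
    """Build a function computing x -> P H D x / sqrt(k)."""
    # D: random +/-1 diagonal.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    # P: each entry is non-zero with probability q, drawn from N(0, 1/q).
    sigma = math.sqrt(1.0 / q)
    rows = [
        [(j, rng.gauss(0.0, sigma)) for j in range(d) if rng.random() < q]
        for _ in range(k)
    ]

    def apply(x):
        z = fwht([s * v for s, v in zip(signs, x)])  # H D x, in O(d log d)
        return [sum(val * z[j] for j, val in row) / math.sqrt(k) for row in rows]

    return apply


rng = random.Random(1)          # fixed seed, for repeatability only
d, k, q = 1024, 200, 0.1        # hypothetical sizes and density
fjlt = make_fjlt(d, k, q, rng)

x = [0.0] * d
x[0] = 1.0                      # a maximally sparse input vector
y = fjlt(x)
ratio = sum(v * v for v in y) / sum(v * v for v in x)
print(ratio)
```

The input here is the hard case for a sparse matrix alone: a single non-zero coordinate. The transform $H D$ first spreads it over all $d$ coordinates, after which the sparse matrix $P$ sees a dense vector and the squared length is preserved up to a modest factor.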
theory experimentation
computation theory experimentation
input output Most interesting problems are too hard!!
input output randomization approximation So, we change the model…
input output randomization approximation PTAS for ETSP
input output randomization approximation Impossible to approximate chromatic number within a factor of…
input output randomization approximation Property Testing [RS’96, GGR’96] Berkeley “school” (program checking & probabilistic proofs)
Distance is 3
Distance is 4
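The notion of distance in property testing is the number of edits needed to make the input satisfy the property. As an illustration only, the sketch below takes bipartiteness as the property (an assumption; the slides do not say which property the figures depict) and computes the distance by brute force on tiny graphs with at least one vertex. The function name and the vertex numbering 0..n-1 are hypothetical.

```python
from itertools import combinations


def distance_to_bipartite(n, edges):
    """Minimum number of edges to delete so the graph becomes bipartite.

    Brute force over all 2-colorings of vertices 0..n-1 (requires n >= 1);
    exponential in n, so only meant for tiny graphs.
    """
    best = len(edges)
    # Vertex n-1 is fixed on side 0 by symmetry, so 2^(n-1) colorings suffice.
    for mask in range(1 << (n - 1)):
        violated = sum(
            1 for u, v in edges if ((mask >> u) & 1) == ((mask >> v) & 1)
        )
        best = min(best, violated)
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
k5 = list(combinations(range(5), 2))

print(distance_to_bipartite(3, triangle))
print(distance_to_bipartite(4, square))
print(distance_to_bipartite(5, k5))
```

A tester does not compute this distance. It only has to tell apart inputs at distance 0 from inputs whose distance is a constant fraction of the input size.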
no yes bipartite
no yes bipartite anything [GR’97]
Birthday paradox polylog cycles 17 Mixing case
[M’89] Non-mixing implies small cuts Non-mixing case
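The birthday paradox itself is a short calculation: among m uniform samples from n values, a repeat becomes likely once m is on the order of $\sqrt{n}$. The sketch below computes the exact collision probability. How the tester exploits such collisions is not spelled out on the slide, so the code covers only the probabilistic fact; the function name is hypothetical.

```python
import math


def collision_probability(m, n):
    """Probability that m independent uniform samples from n values repeat."""
    if m > n:
        return 1.0  # pigeonhole: a repeat is certain
    p_distinct = 1.0
    for i in range(m):
        p_distinct *= (n - i) / n
    return 1.0 - p_distinct


n = 10 ** 6
for m in (int(0.5 * math.sqrt(n)), int(math.sqrt(n)), int(2 * math.sqrt(n))):
    print(m, collision_probability(m, n))
```

With n equal to one million, the probability climbs from roughly 0.12 at 500 samples to above 0.85 at 2000 samples, far fewer than n.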
Dense graphs [GGR98, AK99]: is the graph k-colorable? (Figure: Hofstadter, Gödel, Escher, Bach.)
Main tool: Szemerédi’s Regularity Lemma. Far from k-colorable implies lots of witnesses
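"Lots of witnesses" suggests a sample-and-check tester: pick a small random set of vertices and accept exactly when the subgraph they induce is k-colorable. The following is a minimal sketch of that pattern, not the tuned algorithm from the cited papers. The sample size, the adjacency-dictionary representation, and the function names are assumptions made for illustration.

```python
import random


def is_k_colorable(vertices, adj, k):
    """Backtracking check that the subgraph induced on `vertices` is k-colorable."""
    order = list(vertices)
    color = {}

    def assign(idx):
        if idx == len(order):
            return True
        v = order[idx]
        # Colors already taken by sampled neighbors of v.
        used = {color[u] for u in adj[v] if u in color}
        for c in range(k):
            if c not in used:
                color[v] = c
                if assign(idx + 1):
                    return True
                del color[v]
        return False

    return assign(0)


def sample_tester(adj, k, sample_size, rng):
    """Accept iff a random induced subgraph is k-colorable (one-sided error)."""
    vertices = list(adj)
    sample = rng.sample(vertices, min(sample_size, len(vertices)))
    return is_k_colorable(sample, adj, k)


rng = random.Random(2)
# Complete graph on 20 vertices: far from 2-colorable.
complete = {v: {u for u in range(20) if u != v} for v in range(20)}
# Complete bipartite graph with sides 0..9 and 10..19: 2-colorable.
bipartite = {
    v: set(range(10, 20)) if v < 10 else set(range(10)) for v in range(20)
}

print(sample_tester(complete, 2, 5, rng))
print(sample_tester(bipartite, 2, 5, rng))
```

A k-colorable graph is always accepted, because every induced subgraph of it is k-colorable. Rejection therefore always comes with a witness: a small subgraph that cannot be k-colored.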
Property Testing
Graph algorithms: connectivity, acyclicity, k-way cuts, clique
Distributions: independence, entropy, monotonicity, distances
Geometry: convexity, disjointness, Delaunay, plane EMST