Data-Powered Algorithms. Bernard Chazelle, Princeton University.

Presentation transcript:

Data-Powered Algorithms. Bernard Chazelle, Princeton University

Linear Programming

N constraints and d variables

Dimension Reduction: images (face recognition), signals (voice recognition), text (NLP); nearest neighbor searching, clustering, ...

Dimension reduction: all pairwise distances nearly preserved
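One standard way to make "nearly preserved" precise is the Johnson-Lindenstrauss lemma. The statement below is a sketch in assumed notation: the distortion parameter $\varepsilon$, the map $f$, and the target dimension $k$ are names introduced here for illustration, not taken from the slide.

```latex
% Johnson-Lindenstrauss lemma (standard form; \varepsilon, f, k are assumed notation).
% For any n points in R^d and any 0 < \varepsilon < 1 there is a linear map
% f : R^d -> R^k with k = O(\log n / \varepsilon^2) such that, for every pair u, v:
\[
  (1-\varepsilon)\,\lVert u - v \rVert_2^2
  \;\le\; \lVert f(u) - f(v) \rVert_2^2
  \;\le\; (1+\varepsilon)\,\lVert u - v \rVert_2^2 .
\]
```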

Johnson-Lindenstrauss Transform (JLT): a random orthogonal matrix with $c \log n / \varepsilon^2$ rows and $d$ columns, applied to a vector $v$ of dimension $d$

Friendly JLT: a $(c \log n / \varepsilon^2) \times d$ matrix in which every entry is an N(0,1) random variable
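A minimal sketch of the Gaussian ("friendly") variant, assuming independent N(0,1) entries scaled by $1/\sqrt{k}$ so that squared lengths are preserved in expectation. The function names, the sizes `d`, `k`, `n`, and the seed are illustrative choices, not values from the talk.

```python
import math
import random


def gaussian_jl_matrix(k, d, rng):
    """Return a k x d matrix of independent N(0,1) entries scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, v):
    """Multiply the k x d matrix by the length-d vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


rng = random.Random(0)          # fixed seed, chosen only for reproducibility
d, k, n = 1000, 300, 5          # illustrative sizes (assumptions)
points = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]

A = gaussian_jl_matrix(k, d, rng)
images = [project(A, p) for p in points]

# Ratio of projected to original squared distance for every pair of points.
ratios = [
    sq_dist(images[i], images[j]) / sq_dist(points[i], points[j])
    for i in range(n)
    for j in range(i + 1, n)
]
print(min(ratios), max(ratios))
```

Each ratio concentrates around 1, with fluctuations that shrink as `k` grows; the cost of one projection is proportional to `k * d`.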

Friendlier JLT: a $(c \log n / \varepsilon^2) \times d$ matrix; $d \log n / \varepsilon^2$

Sparse JLT? A matrix with $c \log n / \varepsilon^2$ rows and $d$ columns, with an o(1) fraction of non-zeros

Main Tool: Uncertainty Principle (Heisenberg): time vs. frequency

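Fast Johnson-Lindenstrauss Transform (FJLT): a $d \times d$ Discrete Fourier Transform combined with a sparse $(c \log n / \varepsilon^2) \times d$ matrix of N(0,1) entries. The running time includes the terms $d \log d$ and $\log^3 n / \varepsilon^2$. Optimal?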
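The role of the uncertainty principle can be illustrated with a small sketch: a vector concentrated on one coordinate becomes completely spread out after a Fourier-type transform, which is what makes a sparse projection safe to apply afterwards. The code below swaps in a normalized Walsh-Hadamard transform, a real-valued relative of the discrete Fourier transform, purely for simplicity; the function name and the length-8 example are assumptions made for illustration.

```python
import math


def walsh_hadamard(v):
    """Normalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(v)
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h apart within blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = math.sqrt(n)
    return [x / norm for x in out]


# A "spike": all of the mass sits on a single coordinate.
spike = [0.0] * 8
spike[3] = 1.0

# After the transform every coordinate has the same magnitude, 1/sqrt(8).
spread = walsh_hadamard(spike)
print(spread)
```

The transform takes on the order of $n \log n$ additions for a vector of length $n$ and preserves the Euclidean norm, so it redistributes mass without changing distances.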

theory experimentation

computation theory experimentation

input output Most interesting problems are too hard!

input output randomization approximation So, we change the model…

input output randomization approximation PTAS for ETSP

input output randomization approximation Impossible to approximate chromatic number within a factor of…

input output randomization approximation Property Testing [RS’96, GGR’96] Berkeley “school” (program checking & probabilistic proofs)

Distance is 3

Distance is 4

no yes bipartite

no yes bipartite anything [GR’97]

Mixing case: birthday paradox, polylog cycles
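A minimal sketch of a random-walk search for odd cycles, in the spirit of the mixing case: if walks from one start vertex reach the same vertex after both an even and an odd number of steps, the graph contains an odd cycle and so is not bipartite. This is an illustration under stated assumptions, not the [GR’97] algorithm itself; the function names, the adjacency-list format, and the parameters `trials`, `num_walks`, and `walk_len` are all hypothetical.

```python
import random


def finds_odd_cycle(adj, start, num_walks, walk_len, rng):
    """Return True if walks from `start` hit some vertex at both step parities."""
    parities = {start: {0}}              # the start is reached after 0 (even) steps
    for _ in range(num_walks):
        v = start
        for step in range(1, walk_len + 1):
            if not adj[v]:
                break                    # isolated vertex: the walk cannot continue
            v = rng.choice(adj[v])
            seen = parities.setdefault(v, set())
            seen.add(step % 2)
            if len(seen) == 2:
                return True              # even and odd walks meet: odd closed walk
    return False


def looks_bipartite(adj, trials, num_walks, walk_len, rng):
    """One-sided check: a bipartite graph is always accepted."""
    vertices = list(adj)
    for _ in range(trials):
        start = rng.choice(vertices)
        if finds_odd_cycle(adj, start, num_walks, walk_len, rng):
            return False
    return True


def cycle_graph(n):
    """Adjacency lists of the cycle on vertices 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


rng = random.Random(1)                   # fixed seed, for reproducibility only
print(looks_bipartite(cycle_graph(6), 5, 30, 30, rng))   # even cycle: bipartite
print(looks_bipartite(cycle_graph(5), 5, 30, 30, rng))   # odd cycle: not bipartite
```

The check can only err by accepting a non-bipartite graph: in a bipartite graph the parity of every walk from the start to a given vertex is fixed, so no collision of parities can occur.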

Non-mixing case: non-mixing implies small cuts [M’89]

Dense graphs [GGR98, AK99]. Is the graph k-colorable? Hofstadter, Gödel, Escher, Bach.

Main tool: Szemerédi’s Regularity Lemma. Far from k-colorable implies lots of witnesses.

Property Testing
Graph algorithms: connectivity, acyclicity, k-way cuts, clique
Distributions: independence, entropy, monotonicity, distances
Geometry: convexity, disjointness, Delaunay, plane EMST