Sketching and Embedding are Equivalent for Norms
Alexandr Andoni (Columbia), Robert Krauthgamer (Weizmann Institute), Ilya Razenshteyn (MIT)

Sketching
Compress a massive object to a small sketch.
Objects: high-dimensional vectors, matrices, graphs.
Applications: similarity search, compressed sensing, numerical linear algebra.
Dimension reduction (Johnson, Lindenstrauss 1984): a random projection onto a low-dimensional subspace preserves distances.
When is sketching possible?
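As a concrete illustration of the dimension-reduction statement above, here is a minimal random-projection sketch (a toy example, not from the talk; assumes numpy, and the parameters n, d, k are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 10_000, 500     # number of points, original dimension, sketch dimension

X = rng.standard_normal((n, d))                # n points in R^d
G = rng.standard_normal((d, k)) / np.sqrt(k)   # random projection onto a k-dimensional subspace
Y = X @ G                                      # the k-dimensional sketches

# Pairwise Euclidean distances are preserved up to a small multiplicative error.
i, j = 3, 7
print(np.linalg.norm(X[i] - X[j]), np.linalg.norm(Y[i] - Y[j]))
```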

Similarity search

Sketching metrics
[figure: Alice, Bob, and Charlie]

The main question
Which metrics can we sketch with constant sketch size and approximation?

Reductions for geometric problems
A map f: X → Y is an embedding with distortion C if, for all a, b in X:
d_X(a, b) / C ≤ d_Y(f(a), f(b)) ≤ d_X(a, b).
If Y admits sketches of size s and approximation D, then X admits sketches of size s and approximation C·D: simply sketch f(a) instead of a.
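To make the distortion definition concrete, here is a small self-contained check (a toy example, not from the talk; assumes numpy). The map f(x1, x2) = (x1 + x2, x1 − x2) embeds (R², ℓ_1) into (R², ℓ_∞) with distortion C = 1, since max(|x1 + x2|, |x1 − x2|) = |x1| + |x2|; composing such an f with a sketch of approximation D for the target space then gives a sketch of approximation C·D for the source space, exactly as in the reduction above.

```python
import numpy as np

def f(x):
    # Embedding of (R^2, l_1) into (R^2, l_inf) with distortion 1.
    return np.array([x[0] + x[1], x[0] - x[1]])

rng = np.random.default_rng(0)
for _ in range(5):
    a, b = rng.standard_normal(2), rng.standard_normal(2)
    d_X = np.abs(a - b).sum()          # l_1 distance between a and b
    d_Y = np.abs(f(a) - f(b)).max()    # l_inf distance between f(a) and f(b)
    print(d_X, d_Y)                    # equal up to floating point: distortion 1
```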

Metrics with good sketches: summary
A metric X admits sketches with s, D = O(1) if:
X = ℓ_p for p ≤ 2, or
X embeds into ℓ_p for p ≤ 2 with distortion O(1).
Are there any other metrics with efficient sketches? We don't know!
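As a rough sketch of why ℓ_1 (and more generally ℓ_p for p ≤ 2) appears on this list, here is Indyk-style 1-stable sketching in a few lines (an illustration, not from the talk; assumes numpy, and k = 400 is an arbitrary sketch size). Each coordinate of the projected difference is distributed as ‖x − y‖_1 times a standard Cauchy variable, and the median of |Cauchy| is 1, so a median of absolute values approximately recovers the distance.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 2_000, 400                 # original dimension, sketch size

S = rng.standard_cauchy((k, d))   # 1-stable (Cauchy) projection matrix, shared randomness
x, y = rng.random(d), rng.random(d)

sx, sy = S @ x, S @ y             # Alice and Bob sketch their points separately
estimate = np.median(np.abs(sx - sy))
print(estimate, np.linalg.norm(x - y, 1))   # estimate vs. true l_1 distance
```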

The main result
Embedding into ℓ_p for p ≤ 2 with distortion O(1) implies efficient sketches (Kushilevitz, Ostrovsky, Rabani 1998; Indyk 2000). For norms, the converse holds as well: efficient sketches imply such an embedding.

Application: lower bounds for sketches
Convert non-embeddability into lower bounds for sketches in a black-box way: if there is no embedding with distortion O(1) into ℓ_{1−ε}, then there are no sketches* of size and approximation O(1).
* in fact, the lower bound applies to any communication protocol

Example 1: the Earth Mover's Distance
No sketches with D = O(1) and s = O(1).

Example 2: the Trace Norm
For an n × n matrix A, define the Trace Norm (also known as the Nuclear Norm) ‖A‖ to be the sum of its singular values.
Previously, lower bounds were known only for certain restricted classes of sketches [Li-Nguyen-Woodruff'14].
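For reference, the trace norm is just the sum of the singular values and can be checked directly (plain numpy, not part of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

trace_norm = np.linalg.svd(A, compute_uv=False).sum()  # sum of singular values
print(trace_norm, np.linalg.norm(A, 'nuc'))            # agrees with numpy's built-in nuclear norm
```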

The sketch of the proof
Good sketches for X
⇒ Good sketches for ℓ_∞(X)  (uses that X is a norm)
⇒ Absence of certain Poincaré-type inequalities on X  ([Andoni-Jayram-Pătraşcu 2010], direct sum for Information Complexity)
⇒ Linear embedding of X into ℓ_{1−ε}  (convex duality + compactness; [Johnson-Randrianarivony 2006], Lipschitz extension; [Aharoni-Maurey-Mityagin 1985], Fourier analysis)

Open problems