Sparse Recovery Using Sparse (Random) Matrices Piotr Indyk MIT Joint work with: Radu Berinde, Anna Gilbert, Howard Karloff, Martin Strauss and Milan Ruzic.


Linear Compression (a.k.a. linear sketching, compressed sensing)
Setup:
- Data/signal in n-dimensional space: x. E.g., x is a 1000x1000 image, so n = 1,000,000
- Goal: compress x into a "sketch" Ax, where A is a carefully designed m x n matrix, m << n
Requirements:
- Plan A: recover x from Ax. Impossible: underdetermined system of equations
- Plan B: recover an "approximation" x* of x
  - Sparsity parameter k
  - Want x* such that ||x*-x||_p <= C(k) ||x'-x||_q over all x' that are k-sparse (at most k non-zero entries): the l_p/l_q guarantee
  - The best x* consists of the k coordinates of x with the largest absolute values
Want:
- Good compression (small m)
- Efficient algorithms for encoding and recovery
Why linear compression?
[Figure: sketch y = Ax; example signal x and its best approximation x* with k = 2]
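To make the setup concrete, here is a minimal numerical sketch of linear compression. It assumes (for illustration only) a dense Gaussian measurement matrix; the dimensions, the noise level, and the way the best k-term approximation is extracted are placeholder choices, not part of the talk.

```python
# Minimal sketch of linear compression, assuming a Gaussian A for illustration;
# the talk's sparse constructions appear later in the slides.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 10_000, 400, 10                      # ambient dimension, sketch length, sparsity

# Signal x = x_k + u: k large coefficients plus small noise
x = 0.01 * rng.standard_normal(n)              # noise u
support = rng.choice(n, size=k, replace=False)
x[support] += rng.choice([-1.0, 1.0], size=k)  # k large entries

A = rng.standard_normal((m, n)) / np.sqrt(m)   # m x n measurement matrix, m << n
y = A @ x                                      # the sketch Ax (all we get to keep)

# Best k-term approximation: the k coordinates of x with largest absolute value
idx = np.argsort(np.abs(x))[-k:]
x_k = np.zeros(n)
x_k[idx] = x[idx]
```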

Applications

Application I: Monitoring Network Traffic
A router routes packets (many packets)
- Where do they come from?
- Where do they go to?
Ideally, we would like to maintain a traffic matrix x[.,.]
- Easy to update: given a (src, dst) packet, increment x[src, dst]
- Requires way too much space! (2^32 x 2^32 entries)
- Need to compress x and still support easy increments
Using linear compression we can:
- Maintain the sketch Ax under increments to x, since A(x + delta) = Ax + A delta
- Recover x* from Ax
[Figure: traffic matrix x indexed by (source, destination)]
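As a toy illustration of this use case, the snippet below maintains the sketch Ax under per-packet increments using linearity. The hashing of (src, dst) pairs into a range of size n and all constants are assumptions made for this sketch only.

```python
# Toy stream of (src, dst) packets; only the sketch Ax is stored, never x itself.
import numpy as np

rng = np.random.default_rng(1)
n, m = 2**16, 512                 # hashed traffic-matrix size and sketch length (assumed)
A = rng.standard_normal((m, n))
sketch = np.zeros(m)              # running value of Ax

def update(src: str, dst: str, count: int = 1) -> None:
    """Increment x[(src, dst)] by count; by linearity, A(x + delta) = Ax + A delta."""
    i = hash((src, dst)) % n      # hypothetical hashing of the (src, dst) pair
    sketch[:] += count * A[:, i]  # add count times column i of A

update("10.0.0.1", "10.0.0.2")
update("10.0.0.1", "10.0.0.2")
update("10.0.0.3", "10.0.0.9")
```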

Other applications
- Single-pixel camera [Wakin, Laska, Duarte, Baron, Sarvotham, Takhar, Kelly, Baraniuk'06]
- Microarray experiments / pooling [Kainkaryam, Gilbert, Shearer, Woolf], [Hassibi et al.], [Dai-Sheikh, Milenkovic, Baraniuk]

Known constructions+algorithms

Constructing matrix A
Choose the encoding matrix A at random (the algorithms for recovering x* are more complex)
- Sparse matrices: data stream algorithms, coding theory (LDPCs)
- Dense matrices: compressed sensing, complexity theory (Fourier)
"Traditional" tradeoffs:
- Sparse: computationally more efficient, explicit
- Dense: shorter sketches
Goals: unify, find the "best of all worlds"

Result table

| Paper | Rand./Det. | Sketch length | Encode time | Col. sparsity / update time | Recovery time | Approx. |
|---|---|---|---|---|---|---|
| [CCF'02], [CM'06] | R | k log n | n log n | log n | n log n | l2/l2 |
| [CCF'02], [CM'06] | R | k log^c n | n log^c n | log^c n | k log^c n | l2/l2 |
| [CM'04] | R | k log n | n log n | log n | n log n | l1/l1 |
| [CM'04] | R | k log^c n | n log^c n | log^c n | k log^c n | l1/l1 |
| [CRT'04], [RV'05] | D | k log(n/k) | nk log(n/k) | k log(n/k) | n^c | l2/l1 |
| [CRT'04], [RV'05] | D | k log^c n | n log n | k log^c n | n^c | l2/l1 |
| [GSTV'06] | D | k log^c n | n log^c n | log^c n | k log^c n | l1/l1 |
| [GSTV'07] | D | k log^c n | n log^c n | k log^c n | k^2 log^c n | l2/l1 |
| [BGIKS'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n^c | l1/l1 |
| [GLR'08] | D | k log n logloglog n | kn^(1-a) | n^(1-a) | n^c | l2/l1 |
| [NV'07], [DM'08], [NT'08, BD'08] | D | k log(n/k) | nk log(n/k) | k log(n/k) | nk log(n/k) * T | l2/l1 |
| [NV'07], [DM'08], [NT'08, BD'08] | D | k log^c n | n log n | k log^c n | n log n * T | l2/l1 |
| [IR'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n log(n/k) | l1/l1 |
| [BIR'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n log(n/k) * T | l1/l1 |
| [DIP'09] (lower bound) | D | Omega(k log(n/k)) | | | | l1/l1 |
| [CDD'07] (lower bound) | D | Omega(n) | | | | l2/l2 |

Legend: n = dimension of x, m = dimension of Ax, k = sparsity of x*, T = number of iterations.
Approximation guarantees:
- l2/l2: ||x-x*||_2 <= C ||x-x'||_2
- l2/l1: ||x-x*||_2 <= C ||x-x'||_1 / k^(1/2)
- l1/l1: ||x-x*||_1 <= C ||x-x'||_1
Caveats: (1) only results for general vectors x are shown; (2) all bounds are up to O() factors; (3) the specific matrix type often matters (Fourier, sparse, etc.); (4) universality, explicitness, etc. are ignored; (5) most "dominated" algorithms are not shown.
Scale (cell color-coding on the original slide): Excellent, Very Good, Good, Fair.

Techniques

Restricted Isometry Property (RIP): key property of a dense matrix A:
  x is k-sparse  =>  ||x||_2 <= ||Ax||_2 <= C ||x||_2
Holds w.h.p. for:
- Random Gaussian/Bernoulli: m = O(k log(n/k))
- Random Fourier: m = O(k log^O(1) n)
Consider random m x n 0-1 matrices with d ones per column. Do they satisfy RIP?
- No, unless m = Omega(k^2) [Chandar'07]
- However, they can satisfy the following RIP-1 property [Berinde-Gilbert-Indyk-Karloff-Strauss'08]:
  x is k-sparse  =>  d(1-eps) ||x||_1 <= ||Ax||_1 <= d ||x||_1
Sufficient (and necessary) condition: the underlying graph is a (k, d(1-eps/2))-expander
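A minimal way to generate the random m x n 0-1 matrices discussed here (exactly d ones per column, i.e. the adjacency matrix of a random left-d-regular bipartite graph). The parameter values in the example call are placeholders.

```python
import numpy as np

def sparse_binary_matrix(m: int, n: int, d: int, seed: int = 0) -> np.ndarray:
    """Random m x n 0-1 matrix with exactly d ones per column."""
    rng = np.random.default_rng(seed)
    A = np.zeros((m, n), dtype=np.int8)
    for j in range(n):
        rows = rng.choice(m, size=d, replace=False)  # d distinct neighbors of column j
        A[rows, j] = 1
    return A

A = sparse_binary_matrix(m=200, n=2000, d=8)
assert (A.sum(axis=0) == 8).all()  # every column touches exactly d rows
```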

Expanders
A bipartite graph is a (k, d(1-eps))-expander if for any left set S with |S| <= k, we have |N(S)| >= (1-eps) d |S|
Constructions:
- Randomized: m = O(k log(n/k))
- Explicit: m = k quasipolylog n
Plenty of applications in computer science, coding theory, etc.
In particular, LDPC-like techniques yield good algorithms for exactly k-sparse vectors x [Xu-Hassibi'07, Indyk'08, Jafarpour-Xu-Hassibi-Calderbank'08]
[Figure: bipartite graph with n left nodes, m right nodes, left degree d; a left set S and its neighborhood N(S)]

Dense vs. sparse: instead of RIP in the L2 norm, we have RIP in the L1 norm. This suffices for these results:

| Paper | Rand./Det. | Sketch length | Encode time | Sparsity / update time | Recovery time | Approx. |
|---|---|---|---|---|---|---|
| [BGIKS'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n^c | l1/l1 |
| [IR'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n log(n/k) | l1/l1 |
| [BIR'08] | D | k log(n/k) | n log(n/k) | log(n/k) | n log(n/k) * T | l1/l1 |

Main drawback: only an l1/l1 guarantee. Open question: better approximation guarantee with the same time and sketch length?
Other sparse-matrix schemes, for (almost) k-sparse vectors:
- LDPC-like: [Xu-Hassibi'07, Indyk'08, Jafarpour-Xu-Hassibi-Calderbank'08]
- L1 minimization: [Wang-Wainwright-Ramchandran'08]
- Message passing: [Sarvotham-Baron-Baraniuk'06,'08, Lu-Montanari-Prabhakar'08]

Algorithms/Proofs

Proof: d(1-eps/2)-expansion => RIP-1
Want to show that for any k-sparse x we have d(1-eps) ||x||_1 <= ||Ax||_1 <= d ||x||_1
- The RHS inequality holds for any x
- LHS inequality:
  - W.l.o.g. assume |x_1| >= ... >= |x_k| >= |x_{k+1}| = ... = |x_n| = 0
  - Consider the edges e = (i,j) in lexicographic order
  - For each edge e = (i,j), define r(e) such that:
    - r(e) = -1 if there exists an edge (i',j) < (i,j)
    - r(e) = 1 if there is no such edge
  - Claim: ||Ax||_1 >= sum_{e=(i,j)} |x_i| r(e)

Proof: d(1-eps/2)-expansion => RIP-1 (ctd.)
- Need to lower-bound sum_e z_e r_e, where z_{(i,j)} = |x_i|
- Let R_b = the sequence of the first bd values r_e
- From graph expansion, R_b contains at most (eps/2) bd values equal to -1 (for b = 1, it contains no -1's)
- The sequence of r_e's minimizing sum_e z_e r_e is 1,1,...,1, -1,...,-1, 1,...,1, ... with run lengths d, (eps/2)d, (1-eps/2)d, ...
- Thus sum_e z_e r_e >= (1-eps) sum_e z_e = (1-eps) d ||x||_1
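The statement proved above can also be spot-checked numerically: for a random sparse 0-1 matrix and random k-sparse vectors, the ratio ||Ax||_1 / (d ||x||_1) should stay close to 1. This is only an empirical sanity check of RIP-1, not part of the proof, and all parameter values are assumptions.

```python
# Empirical sanity check of RIP-1: d(1-eps)||x||_1 <= ||Ax||_1 <= d||x||_1.
import numpy as np

rng = np.random.default_rng(2)
m, n, d, k = 400, 4000, 10, 20    # assumed parameters

# Random 0-1 matrix with d ones per column (as in the earlier construction sketch)
A = np.zeros((m, n))
for j in range(n):
    A[rng.choice(m, size=d, replace=False), j] = 1

worst = 1.0
for _ in range(1000):
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    worst = min(worst, np.abs(A @ x).sum() / (d * np.abs(x).sum()))

print(f"min ||Ax||_1 / (d ||x||_1) over 1000 k-sparse trials: {worst:.3f}")
```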

A satisfies RIP-1 => LP decoding works [Berinde-Gilbert-Indyk-Karloff-Strauss'08]
- Compute a vector x* such that Ax* = Ax and ||x*||_1 is minimal
- Then, over all k-sparse x':  ||x-x*||_1 <= C min_{x'} ||x-x'||_1
  - C -> 2 as the expansion parameter eps -> 0
- Can be extended to the case when Ax is noisy
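The L1-minimization decoder can be written as a linear program over variables (x, t) with |x_i| <= t_i. The sketch below uses scipy.optimize.linprog with a small, dense formulation for illustration; it is not the solver used in the papers.

```python
# L1-minimization decoding: find x* with A x* = y and ||x*||_1 minimal,
# via the LP  min sum(t)  s.t.  -t <= x <= t,  A x = y.
import numpy as np
from scipy.optimize import linprog

def l1_decode(A: np.ndarray, y: np.ndarray) -> np.ndarray:
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])   # objective: sum of t
    I = np.eye(n)
    A_ub = np.block([[I, -I], [-I, -I]])            # x - t <= 0  and  -x - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([A, np.zeros((m, n))])         # A x = y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]
```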

A satisfies RIP-1 => Sparse Matching Pursuit works [Berinde-Indyk-Ruzic'08]
Algorithm:
- x* = 0
- Repeat T times:
  - Compute c = Ax - Ax* = A(x - x*)
  - Compute delta such that delta_i is the median of its neighbors in c
  - Sparsify delta (set all but the 2k largest entries of delta to 0)
  - x* = x* + delta
  - Sparsify x* (set all but the k largest entries of x* to 0)
After T = log(.) steps we have ||x-x*||_1 <= C min_{k-sparse x'} ||x-x'||_1
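A direct transcription of the SMP loop above, for a 0-1 matrix A with d ones per column. The fixed iteration count T and the dense representation of A are simplifications made for this sketch.

```python
# Sparse Matching Pursuit (SMP) as described on the slide.
import numpy as np

def keep_largest(v: np.ndarray, s: int) -> np.ndarray:
    """Zero all but the s largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-s:]
    out[idx] = v[idx]
    return out

def smp(A: np.ndarray, y: np.ndarray, k: int, T: int = 20) -> np.ndarray:
    m, n = A.shape
    neighbors = [np.flatnonzero(A[:, i]) for i in range(n)]  # rows adjacent to column i
    x_star = np.zeros(n)
    for _ in range(T):
        c = y - A @ x_star                               # c = Ax - Ax* = A(x - x*)
        delta = np.array([np.median(c[nb]) for nb in neighbors])
        delta = keep_largest(delta, 2 * k)               # keep 2k largest entries of delta
        x_star = keep_largest(x_star + delta, k)         # keep k largest entries of x*
    return x_star
```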

Experiments
Probability of correct recovery of random k-sparse +1/-1 signals from m measurements:
- Sparse matrices with d = 10 ones per column
- Signal length n = 20,000
[Plots: empirical recovery probability as a function of k and m, for SMP and for LP decoding]

Conclusions
- Sparse approximation using sparse matrices
- State of the art (this talk): can do 2 out of 3:
  - Near-linear encoding/decoding
  - O(k log(n/k)) measurements
  - Approximation guarantee with respect to the L2/L1 norm
- Open problems:
  - 3 out of 3?
  - Explicit constructions?

Resources
References:
- R. Berinde, A. Gilbert, P. Indyk, H. Karloff, M. Strauss, "Combining geometry and combinatorics: a unified approach to sparse signal recovery", Allerton, 2008.
- R. Berinde, P. Indyk, M. Ruzic, "Practical Near-Optimal Sparse Recovery in the L1 Norm", Allerton, 2008.
- R. Berinde, P. Indyk, "Sparse Recovery Using Sparse Random Matrices", 2008.
- P. Indyk, M. Ruzic, "Near-Optimal Sparse Recovery in the L1 Norm", FOCS, 2008.

Running times