Sparse Representation and Compressed Sensing: Theory and Algorithms

Yi Ma (1,2), Allen Yang (3), John Wright (1)
(1) Microsoft Research Asia, (2) University of Illinois at Urbana-Champaign, (3) University of California, Berkeley
CVPR Tutorial, June 20, 2009


MOTIVATION – Applications to a variety of vision problems
Face Recognition: Wright et al PAMI ’09, Huang CVPR ’08, Wagner CVPR ’09 …
Image Enhancement and Superresolution: Elad TIP ’06, Huang CVPR ’08, …
Image Classification: Mairal CVPR ’08, Rodriguez ’07, many others …
Multiple Motion Segmentation: Rao CVPR ’08, Elhamifar CVPR ’09 …
… and many others, including this conference.
When and why can we expect such good performance? A closer look at the theory …
Today I will focus on one particular observation in automatic face recognition.

SPARSE REPRESENTATION – Model problem
Underdetermined system of linear equations: $y = Ax$, with $A \in \mathbb{R}^{m \times n}$ and $m \ll n$.
$y$: observation. $x$: unknown.
Two interpretations:
Compressed sensing: A as sensing matrix
Sparse representation: A as overcomplete dictionary

SPARSE REPRESENTATION – Model problem
Underdetermined system of linear equations: $y = Ax$, with $A \in \mathbb{R}^{m \times n}$ and $m \ll n$.
Many more unknowns than observations → no unique solution.
Classical answer: minimum $\ell^2$-norm solution.
Emerging applications: instead desire sparse solutions.
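A minimal sketch of the classical answer, assuming NumPy and a hypothetical random Gaussian A (the sizes, seed, and 3-sparse ground truth are illustrative choices, not taken from the tutorial). It suggests why the minimum $\ell^2$-norm solution is unsatisfying here: it fits the observations exactly, yet it is typically fully dense.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 50                        # many more unknowns than observations
A = rng.standard_normal((m, n))      # hypothetical sensing matrix / dictionary
x0 = np.zeros(n)
x0[[3, 17, 41]] = [1.5, -2.0, 0.7]   # illustrative 3-sparse ground truth
y = A @ x0

# Minimum l2-norm solution of y = A x, computed with the pseudoinverse.
x_l2 = np.linalg.pinv(A) @ y

residual = np.linalg.norm(A @ x_l2 - y)        # consistency with the observations
nnz_l2 = int(np.sum(np.abs(x_l2) > 1e-8))      # number of nonzero entries
print("residual:", residual)
print("nonzeros in min-l2 solution:", nnz_l2, "vs. 3 in x0")
```

The minimum-norm solution never has a larger $\ell^2$ norm than x0, since x0 is itself feasible, but nothing in the objective rewards zeros.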


SPARSE SOLUTIONS – Uniqueness
Look for the sparsest solution: (P0) $\min \|x\|_0$ subject to $Ax = y$.
$\|x\|_0$ - number of nonzero elements
Is the sparsest solution unique?
$\mathrm{spark}(A)$ - size of smallest set of linearly dependent columns of A.
Proposition [Gorodnitsky & Rao ’97]: If $y = Ax_0$ with $\|x_0\|_0 < \mathrm{spark}(A)/2$, then $x_0$ is the unique solution to (P0).
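To make the quantity in the uniqueness condition concrete, here is a brute-force sketch, assuming the usual definition of the spark as the size of the smallest linearly dependent set of columns. The function name `spark` and the toy matrix are illustrative. The search is exponential in the number of columns, so this is only usable for tiny examples.

```python
import itertools
import numpy as np

def spark(A, tol=1e-10):
    """Size of the smallest linearly dependent set of columns (brute force)."""
    n = A.shape[1]
    for k in range(1, n + 1):
        for cols in itertools.combinations(range(n), k):
            # k columns are dependent iff the submatrix has rank below k.
            if np.linalg.matrix_rank(A[:, list(cols)], tol=tol) < k:
                return k
    return n + 1  # convention used here when all columns are independent

# Columns e1, e2, e1 + e2: any two are independent, all three are dependent.
A_demo = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
s = spark(A_demo)
print("spark:", s, "-> solutions with fewer than", s / 2, "nonzeros are unique")
```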


SPARSE SOLUTIONS – So How Do We Compute It?
Looking for the sparsest solution: (P0) $\min \|x\|_0$ subject to $Ax = y$.
Bad News: NP-hard in the worst case, hard to approximate within certain constants [Amaldi & Kann ’95].
Maybe we can still solve important cases?
Greedy algorithms: Matching Pursuit, Orthogonal Matching Pursuit [Mallat & Zhang ’93], CoSaMP [Needell & Tropp ’08]
Convex programming [Chen, Donoho & Saunders ’94]
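As an illustration of the greedy family, the following is a textbook-style sketch of Orthogonal Matching Pursuit in NumPy, not code from the tutorial. The test dictionary (identity plus a normalized Hadamard basis) and the 2-sparse signal are assumptions chosen so that the columns have coherence 1/4, a regime in which OMP is known to succeed for two nonzeros.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select k columns of A (k >= 1)."""
    support = []
    residual = y.copy()
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Illustrative dictionary: [identity, Hadamard / sqrt(m)], 16 x 32, unit-norm columns.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
m = H.shape[0]
A = np.hstack([np.eye(m), H / np.sqrt(m)])

x0 = np.zeros(2 * m)
x0[2], x0[20] = 1.0, -0.5            # 2-sparse ground truth
x_hat = omp(A, A @ x0, k=2)
print("max recovery error:", np.max(np.abs(x_hat - x0)))
```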

SPARSE SOLUTIONS – The $\ell^1$ Heuristic
Looking for the sparsest solution: (P0) $\min \|x\|_0$ subject to $Ax = y$. Intractable.
↓ convex relaxation
(P1) $\min \|x\|_1$ subject to $Ax = y$. Linear program, solvable in polynomial time.
Why $\ell^1$? Convex envelope of $\|x\|_0$ over the unit cube $\{x : \|x\|_\infty \le 1\}$.
Rich applied history – geosciences, sparse coding in vision, statistics
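The linear-program claim can be made concrete with the standard variable split x = u - v, u, v ≥ 0, under which minimizing the $\ell^1$ norm becomes minimizing the sum of u and v. The sketch below assumes SciPy's `linprog` with the HiGHS backend. The helper name `basis_pursuit` and the identity-plus-Hadamard test dictionary are illustrative choices, not part of the tutorial.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 s.t. A x = y as an LP with x = u - v, u, v >= 0."""
    n = A.shape[1]
    c = np.ones(2 * n)                       # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])                # A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Illustrative dictionary: [identity, Hadamard / sqrt(m)], coherence 1/4.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
m = H.shape[0]
A = np.hstack([np.eye(m), H / np.sqrt(m)])

x0 = np.zeros(2 * m)
x0[5], x0[27] = -1.0, 2.0                    # 2-sparse ground truth
x_hat = basis_pursuit(A, A @ x0)
print("max recovery error:", np.max(np.abs(x_hat - x0)))
```

The split doubles the number of variables but keeps the problem a plain LP, so any general-purpose LP solver applies.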

EQUIVALENCE – A stronger motivation
In many cases, the solutions to (P0) and (P1) are exactly the same:
Theorem [Candes & Tao ’04, Donoho ’04]: For Gaussian $A \in \mathbb{R}^{m \times n}$, with overwhelming probability, $\ell^1$-minimization recovers $x_0$ from $y = Ax_0$ whenever $\|x_0\|_0 < \rho^\star m$, for a constant $\rho^\star > 0$ depending only on the aspect ratio $m/n$.
“$\ell^1$-minimization recovers any sufficiently sparse solution”

GUARANTEES – “Well-Spread” A
Mutual coherence: $\mu(A) = \max_{i \neq j} |\langle a_i, a_j \rangle|$, the largest inner product between distinct (unit-norm) columns of A.
Low mutual coherence: vectors are well-spread in the space.

GUARANTEES – “Well-Spread” A
Mutual coherence: $\mu(A) = \max_{i \neq j} |\langle a_i, a_j \rangle|$
Theorem [Elad & Donoho ’03, Gribonval & Nielsen ’03]: $\ell^1$ minimization uniquely recovers any $x_0$ with $\|x_0\|_0 < \frac{1}{2}\left(1 + 1/\mu(A)\right)$.
Strong point: checkable condition.
Weakness: low coherence can only guarantee recovery up to $O(\sqrt{m})$ nonzeros.
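Because the coherence condition is checkable, it is easy to evaluate numerically. The sketch below assumes the standard definition $\mu(A) = \max_{i \neq j} |\langle a_i, a_j \rangle|$ over unit-normalized columns and the standard sparsity bound $\frac{1}{2}(1 + 1/\mu)$. The function names and the identity-plus-Hadamard dictionary are illustrative.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct unit-normalized columns."""
    An = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(An.T @ An)                # Gram matrix of normalized columns
    np.fill_diagonal(G, 0.0)             # ignore each column paired with itself
    return float(G.max())

def coherence_sparsity_bound(mu):
    """Sparsity level below which the coherence theorem guarantees recovery."""
    return 0.5 * (1.0 + 1.0 / mu)

# Illustrative dictionary: [identity, Hadamard / sqrt(m)] with m = 16.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
m = H.shape[0]
A = np.hstack([np.eye(m), H / np.sqrt(m)])

mu = mutual_coherence(A)
bound = coherence_sparsity_bound(mu)
print("coherence:", mu, "guaranteed recovery for ||x||_0 <", bound)
```

For this dictionary the coherence is $1/\sqrt{m}$, which illustrates the weakness noted on the slide: the guaranteed sparsity grows only like $\sqrt{m}$.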

GUARANTEES – Beyond Coherence
Low coherence: “any submatrix consisting of two columns of A is well-conditioned”
Stronger bounds by looking at larger submatrices?
Restricted Isometry Constants: $\delta_k(A)$ is the smallest $\delta$ s.t. for all $k$-sparse $x$, $(1-\delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta)\|x\|_2^2$.
Low RIC: “Column submatrices of A are uniformly well-conditioned”

GUARANTEES – Beyond Coherence
Restricted Isometry Constants: $\delta_k(A)$ is the smallest $\delta$ s.t. for all $k$-sparse $x$, $(1-\delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta)\|x\|_2^2$.
Theorem [Candes & Tao ’04, Candes ’07]: If $\delta_{2k}(A) < \sqrt{2} - 1$, then $\ell^1$-minimization recovers any k-sparse $x_0$.
For random A, this guarantees recovery up to linear sparsity: $\|x_0\|_0 \le \rho\, m$ for some constant $\rho > 0$.
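Unlike coherence, the restricted isometry constant is expensive to check: a direct computation enumerates every set of k columns. The brute-force sketch below assumes the standard definition, in which $\delta_k$ is the largest deviation from 1 of the squared singular values of any k-column submatrix, and assumes k is at most the number of rows. The function name and toy matrix are illustrative.

```python
import itertools
import numpy as np

def restricted_isometry_constant(A, k):
    """Brute-force delta_k: worst deviation of squared singular values from 1
    over all k-column submatrices (assumes k <= number of rows)."""
    n = A.shape[1]
    delta = 0.0
    for cols in itertools.combinations(range(n), k):
        s = np.linalg.svd(A[:, list(cols)], compute_uv=False)  # descending order
        delta = max(delta, abs(s[0] ** 2 - 1.0), abs(s[-1] ** 2 - 1.0))
    return delta

# Toy example: e1, e2 and the unit vector (1, 1) / sqrt(2).
c = 1.0 / np.sqrt(2.0)
A_toy = np.array([[1.0, 0.0, c],
                  [0.0, 1.0, c]])
delta_2 = restricted_isometry_constant(A_toy, 2)
print("delta_2:", delta_2)
```

For unit-norm columns and k = 2, this quantity coincides with the mutual coherence, which matches the slide's remark that low coherence controls two-column submatrices.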

GUARANTEES – Sharp Conditions?
Necessary and sufficient condition: $x_0$ solves (P1) iff $A x_0 / \|x_0\|_1 \in \partial P$, where $P = \mathrm{conv}(\pm a_1, \dots, \pm a_n)$ is the polytope spanned by columns of A and their negatives.

GUARANTEES – Geometric Interpretation
Necessary and sufficient condition [Donoho ’06], [Donoho + Tanner ’08]: (P1) uniquely recovers $x_0$ with support $I$ and signs $\sigma$ iff $\mathrm{conv}(\{\sigma_i a_i : i \in I\})$ is a simplicial face of $P$, the polytope spanned by the columns of A and their negatives.
Uniform guarantees for $k$-sparse $x_0$ ⇔ P centrally $k$-neighborly.

GUARANTEES – Geometric Interpretation
Geometric understanding gives sharp thresholds for sparse recovery with Gaussian A [Donoho & Tanner ’08]:
[Figure: phase diagram with the aspect ratio of A on the horizontal axis and sparsity on the vertical axis. Below the strong threshold: success always. Between the strong and weak thresholds: success almost always. Above the weak threshold: failure almost always.]

GUARANTEES – Geometric Interpretation
Explicit formulas in the wide-matrix limit $m/n \to 0$ [Donoho & Tanner ’08], to leading order:
Weak threshold: $\|x_0\|_0 \approx \dfrac{m}{2 \log(n/m)}$
Strong threshold: $\|x_0\|_0 \approx \dfrac{m}{2e \log(n/m)}$

GUARANTEES – Noisy Measurements
What if there is noise in the observation? $y = Ax_0 + z$, with $z$ Gaussian or of bounded 2-norm.
Natural approach: relax the constraint: $\min \|x\|_1$ subject to $\|y - Ax\|_2 \le \varepsilon$.
Studied in several literatures:
Statistics – LASSO
Signal processing – BPDN
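One common way to compute such noise-aware estimates is to work with the penalized (Lagrangian) form $\min \frac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1$ rather than the constrained form on the slide. The sketch below uses iterative soft-thresholding (ISTA), a swapped-in technique that is not named in the tutorial. The penalty `lam`, the iteration count, and the toy data are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Entrywise shrinkage: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 by proximal gradient steps."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)             # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Sanity check on a case with a closed form: for A = I the minimizer is
# the soft-thresholded observation.
y_demo = np.array([3.0, -0.5, 1.2])
x_demo = ista(np.eye(3), y_demo, lam=1.0)
print(x_demo)
```

Larger values of `lam` correspond to larger noise budgets $\varepsilon$ in the constrained form, although the exact correspondence depends on the data.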

GUARANTEES – Noisy Measurements
What if there is noise in the observation?
Natural approach: $\min \|x\|_1$ subject to $\|y - Ax\|_2 \le \varepsilon$.
Theorem [Donoho, Elad & Temlyakov ’06]: Recovery is stable: for $x_0$ sufficiently sparse relative to the mutual coherence of A, the error $\|\hat{x} - x_0\|_2$ is bounded by a constant multiple of the noise level $\varepsilon$.
See also [Candes-Romberg-Tao ’06], [Wainwright ’06], [Meinshausen & Yu ’06], [Zhao & Yu ’06], …

GUARANTEES – Noisy Measurements
What if there is noise in the observation?
Natural approach: $\min \|x\|_1$ subject to $\|y - Ax\|_2 \le \varepsilon$.
Theorem [Candes-Romberg-Tao ’06]: Recovery is stable – for A satisfying an appropriate condition, $\|\hat{x} - x_0\|_2 \le C_1 \varepsilon + C_2 \|x_0 - x_{0,S}\|_1 / \sqrt{S}$, where $x_{0,S}$ is the best S-term approximation of $x_0$.
See also [Donoho ’06], [Wainwright ’06], [Meinshausen & Yu ’06], [Zhao & Yu ’06], …

CONNECTIONS – Sketching and Expanders
Similar sparse recovery problems explored in data streaming community:
Combinatorial algorithms → fast encoding/decoding at expense of suboptimal # of measurements
Based on ideas from group testing, expander graphs
[Figure: a data stream summarized by a sketch]
[Gilbert et al ’06], [Indyk ’08], [Xu & Hassibi ’08]

CONNECTIONS – High dimensional geometry
Sparse recovery guarantees can also be derived via probabilistic constructions from high-dimensional geometry:
The Johnson-Lindenstrauss lemma: Given n points $x_1, \dots, x_n$, a random projection $\Phi$ into $O(\log n / \varepsilon^2)$ dimensions preserves pairwise distances: $(1-\varepsilon)\|x_i - x_j\|_2 \le \|\Phi x_i - \Phi x_j\|_2 \le (1+\varepsilon)\|x_i - x_j\|_2$.
Dvoretzky’s almost-spherical section theorem: There exist subspaces $\Gamma \subset \mathbb{R}^m$ of dimension as high as $c \cdot m$ on which the $\ell^1$ and $\ell^2$ norms are comparable: for all $x \in \Gamma$, $C \sqrt{m}\, \|x\|_2 \le \|x\|_1 \le \sqrt{m}\, \|x\|_2$.
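The Johnson-Lindenstrauss statement is easy to probe empirically. The sketch below assumes a scaled Gaussian random projection, one standard construction. The point count, dimensions, and seed are arbitrary illustrative values, and the printed ratios are an empirical observation rather than a proof of any particular distortion level.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, D, d = 50, 1000, 300               # illustrative sizes
X = rng.standard_normal((n_points, D))       # points in the original space
Phi = rng.standard_normal((d, D)) / np.sqrt(d)   # scaled Gaussian projection
Y = X @ Phi.T                                # projected points

def pairwise_distances(Z):
    """Euclidean distances between all rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.sqrt(np.maximum(d2, 0.0))      # clip tiny negatives from rounding

iu = np.triu_indices(n_points, k=1)          # each unordered pair once
ratios = pairwise_distances(Y)[iu] / pairwise_distances(X)[iu]
print("distance ratios range from", ratios.min(), "to", ratios.max())
```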

THE STORY SO FAR – Sparse recovery guarantees
Sparse solutions can often be recovered by linear programming.
Performance guarantees for arbitrary matrices with “uniformly well-spread columns”:
(in)-coherence
Restricted Isometry
Sharp conditions via polytope geometry
Very well-understood performance for random matrices
What about matrices arising in vision… ?

PRIOR WORK - Face Recognition as Sparse Representation
Linear subspace model for images of same face under varying illumination:
Training images of subject $i$: $A_i = [a_{i,1}, \dots, a_{i,n_i}]$.
If test image $y$ is also of subject $i$, then $y = A_i x_i$ for some $x_i$.
Can represent any test image wrt the entire training set as $y = A x_0 + e$, where $A = [A_1, \dots, A_K]$ is the combined training dictionary, $x_0$ the coefficients, and $e$ the corruption or occlusion.
Key point: e can be large, unbounded, *not* noise.

PRIOR WORK - Face Recognition as Sparse Representation
Underdetermined system of linear equations in unknowns $(x, e)$: $y = Ax + e = [A \;\; I] \begin{bmatrix} x \\ e \end{bmatrix}$.
Solution is not unique … but
$x$ should be sparse: ideally, only supported on images of the same subject
$e$ expected to be sparse: occlusion only affects a subset of the pixels
Seek the sparsest solution: $\min \|x\|_0 + \|e\|_0$ subject to $y = Ax + e$
↓ convex relaxation
$\min \|x\|_1 + \|e\|_1$ subject to $y = Ax + e$
Wright, Yang, Ganesh, Sastry, and Ma. Robust Face Recognition via Sparse Representation, PAMI 2009

GUARANTEES – What About Vision Problems?
Behavior under varying levels of random pixel corruption:
Recognition rate: 99.3%, 90.7%, 37.5% (one value per corruption level shown on the slide)
Can existing theory explain this phenomenon?


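PRIOR WORK - Error Correction by $\ell^1$ minimization
Candes and Tao [IT ’05]:
Observe $y = Ax + e$. Apply parity check matrix $F$ s.t. $FA = 0$, yielding $Fy = Fe$.
Set $\tilde{y} = Fy$. Recover $e$ from clean system $\tilde{y} = Fe$: $\min \|e\|_1$ subject to $Fe = \tilde{y}$.
Underdetermined system in sparse e only.
Succeeds whenever $\ell^0$-$\ell^1$ equivalence holds in the reduced system $Fe = \tilde{y}$.
This work: Instead solve $\min \|x\|_1 + \|e\|_1$ subject to $y = Ax + e$.
Can be applied when A is wide (no parity check).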
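The parity-check route can be sketched in a few lines, assuming a tall Gaussian A, a parity check built from an orthonormal basis of the orthogonal complement of the range of A (via SciPy's `null_space`), and an LP solver for the $\ell^1$ step. All sizes, the seed, and the error values are illustrative. This is a sketch of the idea rather than the authors' implementation.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import linprog

def l1_min(F, b):
    """min ||e||_1 s.t. F e = b, as an LP with e = p - q, p, q >= 0."""
    n = F.shape[1]
    res = linprog(np.ones(2 * n), A_eq=np.hstack([F, -F]), b_eq=b,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(0)
m, n = 60, 10                          # tall A, so a parity check exists
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
e0 = np.zeros(m)
e0[[5, 22, 47]] = [4.0, -3.0, 5.0]     # a few gross errors
y = A @ x0 + e0

F = null_space(A.T).T                  # rows are orthogonal to range(A): F A = 0
e_hat = l1_min(F, F @ y)               # F y = F e, a system in e only
x_hat, *_ = np.linalg.lstsq(A, y - e_hat, rcond=None)  # decode x from cleaned data
print("error in e:", np.max(np.abs(e_hat - e0)),
      "error in x:", np.max(np.abs(x_hat - x0)))
```

The construction of F needs A to have a nontrivial left null space, which is why this route does not apply when A is wide.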

PRIOR WORK - Error Correction by $\ell^1$ minimization
Candes and Tao [IT ’05]: apply a parity check matrix $F$ s.t. $FA = 0$ and recover the sparse $e$ from the reduced system $Fy = Fe$. Succeeds whenever $\ell^0$-$\ell^1$ equivalence holds in the reduced system.
This work: Instead solve $\min \|x\|_1 + \|e\|_1$ subject to $y = Ax + e$.
Succeeds whenever $\ell^0$-$\ell^1$ equivalence holds in the expanded system $[A \;\; I]$.
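A minimal sketch of the expanded-system formulation, assuming it takes the form $\min \|x\|_1 + \|e\|_1$ subject to $y = Ax + e$ and is solved as an LP over the stacked unknown $[x; e]$. The helper name, the Hadamard-based A, and the sparse test signal and error are illustrative. They are chosen so that $[A \;\; I]$ has coherence 1/8 and the four nonzeros fall within the coherence guarantee, which is a far easier regime than the dense errors discussed in the tutorial.

```python
import numpy as np
from scipy.optimize import linprog

def l1_error_correction(A, y):
    """Solve min ||x||_1 + ||e||_1 s.t. y = A x + e via an LP on B = [A, I]."""
    m, n = A.shape
    B = np.hstack([A, np.eye(m)])            # expanded system [A I]
    N = n + m
    res = linprog(np.ones(2 * N), A_eq=np.hstack([B, -B]), b_eq=y,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    w = res.x[:N] - res.x[N:]                # stacked solution [x; e]
    return w[:n], w[n:]

# Illustrative A: 20 normalized columns of a 64 x 64 Hadamard matrix.
H = np.array([[1.0]])
for _ in range(6):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
m = H.shape[0]
A = H[:, 1:21] / np.sqrt(m)

x0 = np.zeros(A.shape[1])
x0[3], x0[11] = 1.0, -2.0                    # sparse coefficients
e0 = np.zeros(m)
e0[7], e0[40] = 3.0, -1.5                    # sparse gross errors
x_hat, e_hat = l1_error_correction(A, A @ x0 + e0)
print("error in x:", np.max(np.abs(x_hat - x0)),
      "error in e:", np.max(np.abs(e_hat - e0)))
```

No parity check is constructed, so the same code runs unchanged when A has more columns than rows.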

GUARANTEES – What About Vision Problems?
$A$: highly coherent (volume …).
$x_0$: very sparse: # images per subject, often nonnegative (illumination cone models).
$e$: as dense as possible: robust to highest possible corruption.
Results so far: $\ell^1$-minimization should not succeed.


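SIMULATION - Dense Error Correction?
As dimension $m \to \infty$, an even more striking phenomenon emerges:
Conjecture: If the matrices $A$ are sufficiently coherent, then for any error fraction $\rho < 1$, as $m \to \infty$, solving $\min \|x\|_1 + \|e\|_1$ subject to $y = Ax + e$ corrects almost any error $e$ with $\|e\|_0 \le \rho m$.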

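DATA MODEL - Cross-and-Bouquet
Our model for $A$ should capture the fact that the columns are tightly clustered around a common mean $\mu$:
$A = [a_1, \dots, a_n]$, $a_i \sim \mathcal{N}(\mu, \tfrac{\nu^2}{m} I_m)$ i.i.d., with $\|\mu\|_2 = 1$ and $\|\mu\|_\infty \le C_\mu m^{-1/2}$.
$\ell^2$-norm of deviations well-controlled ($\to \nu$).
Mean is mostly incoherent with standard (error) basis.
We call this the “Cross-and-Bouquet” (CAB) model.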
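To get a feel for how coherent such a bouquet is, the sketch below draws a matrix under one concrete reading of the model: columns $a_i = \mu + (\nu/\sqrt{m})\, g_i$ with standard Gaussian $g_i$, and a flat unit-norm mean $\mu$. The parameter values, the choice of $\mu$, and the seed are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, nu = 400, 100, 0.1                     # illustrative sizes and spread
mu = np.ones(m) / np.sqrt(m)                 # unit norm, max entry m^{-1/2}
A = mu[:, None] + (nu / np.sqrt(m)) * rng.standard_normal((m, n))

# Deviations from the mean have l2 norm concentrating near nu.
dev_norms = np.linalg.norm(A - mu[:, None], axis=0)

# Pairwise coherence of the normalized columns (the "bouquet").
An = A / np.linalg.norm(A, axis=0)
G = np.abs(An.T @ An)
np.fill_diagonal(G, 0.0)
off_diag_min = (G + np.eye(n)).min()         # smallest off-diagonal coherence

print("mean deviation norm:", dev_norms.mean())
print("coherence between columns: min", off_diag_min, "max", G.max())
```

Every pair of columns is nearly parallel, so coherence-based guarantees are vacuous for this A, while the mean remains spread out relative to the standard basis that carries the error.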


ASYMPTOTIC SETTING - Weak Proportional Growth
Observation dimension $m \to \infty$.
Problem size grows proportionally: $n/m \to \delta$.
Error support grows proportionally: $\|e_0\|_0 / m \to \rho$.
Support size sublinear in $m$: $\|x_0\|_0 = o(m)$.
Sublinear growth of $\|x_0\|_0$ is necessary to correct arbitrary fractions of errors: Need at least $\|x_0\|_0$ “clean” equations.
Empirical Observation: If $\|x_0\|_0$ grows linearly in $m$, sharp phase transition at an error fraction below 1.

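NOTATION - Correct Recovery of Solutions
Whether $(x_0, e_0)$ is recovered depends only on its signs and support.
Call a sign-and-support pattern $\ell^1$-recoverable if every $(x_0, e_0)$ with these signs and support satisfies $(x_0, e_0) = \arg\min \|x\|_1 + \|e\|_1$ subject to $Ax + e = Ax_0 + e_0$, and the minimizer is unique.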

MAIN RESULT - Correction of Arbitrary Error Fractions
Recall the notation above (weak proportional growth, $\ell^1$-recoverability). The main result, in words:
“$\ell^1$ recovers any sparse signal from almost any error with density less than 1”

SIMULATION - Arbitrary Errors in WPG
Fraction of correct recoveries for increasing m.

SIMULATION - Phase Transition in Proportional Growth
What if $\|x_0\|_0$ grows linearly with m?
Asymptotically sharp phase transition, similar to that observed by Donoho and Tanner for homogeneous Gaussian matrices.

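SIMULATION - Comparison to Alternative Approaches
“L1 - [A I]”: $\ell^1$-minimization in the expanded system $[A \;\; I]$ (this work)
“L1 - ⊥ comp”: $\ell^1$-minimization after projecting onto the orthogonal complement of the range of A [Candes + Tao ’05]
“ROMP”: Regularized orthogonal matching pursuit [Needell + Vershynin ’08]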

SIMULATION - Error Correction with Real Faces
For real face images, weak proportional growth corresponds to the setting where the total image resolution grows proportionally to the size of the database.
[Figure: fraction of correct recoveries. Above: corrupted images (50% probability of correct recovery). Below: reconstruction.]

SUMMARY – Sparse Representation in Theory and Practice
So far:
Face recognition as a motivating example
Sparse recovery guarantees for generic systems
New theory and new phenomena from face data
After the break:
Algorithms for sparse recovery
Many more applications in vision and sensor networks
Matrix extensions: missing data imputation and robust PCA