Matrix Extensions to Sparse Recovery
Yi Ma (1,2), Allen Yang (3), John Wright (1)
CVPR Tutorial, June 20, 2009
(1) Microsoft Research Asia  (2) University of Illinois at Urbana-Champaign  (3) University of California, Berkeley

FINAL TOPIC – Generalizations: sparsity to degeneracy
The tools and phenomena underlying sparse recovery generalize very nicely to low-rank matrix recovery.
Matrix completion: given an incomplete subset of the entries of a low-rank matrix, fill in the missing values.
Robust PCA: given a low-rank matrix which has been grossly corrupted, recover the original matrix.

THIS TALK – From sparse recovery to low-rank recovery
Examples of degenerate data:
- Face images. Degeneracy: illumination models. Errors: occlusion, corruption.
- Relevancy data. Degeneracy: user preferences co-predict. Errors: missing rankings, manipulation.
- Video. Degeneracy: temporal, dynamic structures. Errors: anomalous events, mismatches…

KEY ANALOGY – Connections between rank and sparsity

                           Sparse recovery       Rank minimization
  Unknown                  vector x              matrix A
  Observations             y = Ax                y = L[A] (linear map)
  Combinatorial objective  ||x||_0               rank(A)
  Convex relaxation        ||x||_1               ||A||_* (nuclear norm)
  Algorithmic tools        linear programming    semidefinite programming

This talk: exploiting this connection for matrix completion and Robust PCA.

CLASSICAL PCA – Fitting degenerate data
If degenerate observations are stacked as the columns of a matrix D, then D is approximately low-rank: D = A + Z, with rank(A) << min(m, n) and Z a small perturbation.
Principal Component Analysis via the singular value decomposition: minimize ||D − A|| subject to rank(A) ≤ r, solved by truncating the SVD of D.
- Stable, efficient computation
- Optimal estimate of A under iid Gaussian noise
- Fundamental statistical tool, huge impact in vision, search, bioinformatics
But… PCA breaks down under even a single grossly corrupted observation.
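
As a concrete reference point, here is a minimal numpy sketch of this classical estimator, assuming the target rank r is given (pca_low_rank is an illustrative helper name, not from the talk):

```python
# Classical PCA as a truncated SVD. By Eckart-Young, the truncated
# SVD is the best rank-r approximation of D in the Frobenius norm.
import numpy as np

def pca_low_rank(D, r):
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]   # rank-r estimate of A
```

A single grossly corrupted entry of D can perturb every principal component, which is exactly the failure mode the robust formulation below addresses.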

ROBUST PCA – Problem formulation
D = A + E, where D is the observation, A is low-rank, and E is a sparse error.
Properties of the errors: each multivariate data sample (column) may be corrupted in some entries, and the corruption can be arbitrarily large in magnitude (not Gaussian!).
Problem: given D = A + E, recover A (low-rank structure) and E (sparse errors).
Numerous heuristic methods in the literature:
- Random sampling [Fischler and Bolles '81]
- Multivariate trimming [Gnanadesikan and Kettenring '72]
- Alternating minimization [Ke and Kanade '03]
- Influence functions [de la Torre and Black '03]
No polynomial-time algorithm with strong performance guarantees!

ROBUST PCA – Semidefinite programming formulation
Seek the lowest-rank A that agrees with the data up to some sparse error:
  minimize  rank(A) + γ ||E||_0   subject to  A + E = D.
Not directly tractable; relax to the convex program
  minimize  ||A||_* + λ ||E||_1   subject to  A + E = D,
where ||A||_* is the nuclear norm (the sum of the singular values of A) and ||E||_1 is the entrywise l1 norm. This is a semidefinite program, solvable in polynomial time; the relaxed objective is the convex envelope of the combinatorial one over an appropriate bounded set.
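
A minimal sketch of this relaxation written directly in CVXPY, assuming CVXPY and the SCS solver are installed; the weight lam = 1/sqrt(max(m, n)) is a common heuristic choice, not necessarily the one used in the talk. This form is practical only at small scale; the first-order method later in the talk is what scales.

```python
# Convex relaxation of Robust PCA:
#   minimize ||A||_* + lam * ||E||_1   subject to  A + E = D.
import cvxpy as cp
import numpy as np

def rpca_convex(D, lam=None):
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    A, E = cp.Variable((m, n)), cp.Variable((m, n))
    objective = cp.Minimize(cp.norm(A, "nuc") + lam * cp.sum(cp.abs(E)))
    cp.Problem(objective, [A + E == D]).solve(solver=cp.SCS)
    return A.value, E.value
```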

MATRIX COMPLETION – Motivation for the nuclear norm
Related problem: we observe only a small known subset Ω of the entries of a rank-r matrix A. Can we exactly recover A?
Convex optimization heuristic [Candès and Recht '08]: minimize ||A||_* subject to A agreeing with the observed entries on Ω. Spectral trimming also succeeds with O(m r) observed entries for bounded rank [Keshavan, Montanari and Oh]. For incoherent A, exact recovery with high probability from O(m r polylog m) entries [Candès and Tao].
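
The completion heuristic admits an equally short CVXPY sketch, under the same assumptions as the snippet above; mask is a hypothetical 0/1 array marking the observed set Ω:

```python
# Nuclear-norm matrix completion:
#   minimize ||A||_*   subject to  A agreeing with D on Omega.
import cvxpy as cp

def complete_matrix(D, mask):
    A = cp.Variable(D.shape)
    observed = [cp.multiply(mask, A) == cp.multiply(mask, D)]
    cp.Problem(cp.Minimize(cp.norm(A, "nuc")), observed).solve(solver=cp.SCS)
    return A.value
```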

ROBUST PCA – Exact recovery?
CONJECTURE: if D = A0 + E0 with A0 sufficiently low-rank and E0 sufficiently sparse, then solving the convex relaxation exactly recovers (A0, E0).
Empirical evidence: [figure: probability of correct recovery as rank and error sparsity vary, showing a large region of perfect recovery]

ROBUST PCA – Which matrices and which errors?
Should we decompose D as a low-rank matrix plus a sparse one, or the other way around? There is a fundamental ambiguity: very sparse matrices are also low-rank. A matrix with a single nonzero entry, for instance, can be written as itself plus a 0-sparse error (rank 1), or as the zero matrix plus a 1-sparse error (rank 0). So we can only hope to uniquely recover low-rank matrices that are incoherent with the standard basis. Can we recover almost all low-rank matrices from almost all sparse errors?

ROBUST PCA – Which matrices and which errors?
Random orthogonal model (of rank r) [Candès and Recht '08]: A0 = U Σ V^T, where U and V are independent samples from the invariant measure on the Stiefel manifold of rank-r orthobases; the singular values are arbitrary.
Bernoulli error signs-and-support model (with parameter ρ_s): each entry of E0 is nonzero independently with probability ρ_s, with independent random signs; the magnitudes of the nonzero entries are arbitrary.
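
Both models are straightforward to sample; a numpy sketch follows. The QR factorization of a Gaussian matrix yields an orthobasis distributed by the invariant measure, while the singular values and the error magnitude used here are arbitrary illustrative choices:

```python
# Sampling the random orthogonal model and the Bernoulli
# signs-and-support error model described above.
import numpy as np

def sample_random_orthogonal(m, n, r, rng):
    # QR of a Gaussian matrix: Haar-distributed orthobases U, V.
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    s = rng.uniform(1.0, 2.0, r)                  # arbitrary singular values
    return (U * s) @ V.T

def sample_bernoulli_errors(m, n, rho_s, rng):
    support = rng.random((m, n)) < rho_s          # nonzero w.p. rho_s
    signs = rng.choice([-1.0, 1.0], size=(m, n))  # independent random signs
    return 10.0 * signs * support                 # arbitrary magnitude
```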

MAIN RESULT – Exact Solution of Robust PCA
“Convex optimization recovers almost any matrix whose rank is a sufficiently small fraction of its dimension, from errors affecting a constant fraction of the observations!”

BONUS RESULT – Matrix completion in proportional growth
“Convex optimization exactly recovers matrices whose rank grows in proportion to the dimension, even with a constant fraction of the entries missing!”

MATRIX COMPLETION – Contrast with literature
[Candès and Tao 2009]: correct completion with high probability from O(m r polylog m) observed entries; does not apply to the large-rank case.
This work: correct completion with high probability for rank proportional to the dimension, even with a constant fraction of entries missing. The proof exploits rich regularity and independence in the random orthogonal model.
Caveats: [C-T '09] is tighter for small r, and generalizes better to other matrix ensembles.


ROBUST PCA – Solving the convex program
A semidefinite program in millions of unknowns: generic interior-point solvers do not scale. Scalable solution: apply a first-order method with O(1/k²) convergence, minimizing a sequence of separable quadratic approximations [Nesterov; Beck and Teboulle]. Each step is computed in closed form via soft thresholding (for E) and singular value thresholding (for A).
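
A minimal numpy sketch of this scheme, applied to the Lagrangian form min μ(||A||_* + λ||E||_1) + ½||D − A − E||_F² with a simple continuation schedule; all parameter choices here (lam, the μ schedule, the tolerance) are illustrative defaults, not the tuned values behind the reported experiments:

```python
import numpy as np

def soft_threshold(X, tau):
    # Entrywise shrinkage: the proximal operator of tau * ||.||_1.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def sv_threshold(X, tau):
    # Shrink the singular values: the proximal operator of tau * ||.||_*.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_apg(D, lam=None, max_iter=500, tol=1e-7):
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = 0.99 * np.linalg.norm(D, 2)      # start with heavy regularization...
    mu_min = 1e-9 * mu                    # ...and anneal it (continuation)
    A = E = A_prev = E_prev = np.zeros_like(D)
    t = t_prev = 1.0
    for _ in range(max_iter):
        beta = (t_prev - 1.0) / t         # Nesterov extrapolation weight
        YA = A + beta * (A - A_prev)
        YE = E + beta * (E - E_prev)
        G = YA + YE - D                   # gradient of 0.5*||D - A - E||_F^2
        A_prev, E_prev = A, E
        # Proximal steps (the smooth term has Lipschitz constant 2).
        A = sv_threshold(YA - 0.5 * G, mu / 2.0)
        E = soft_threshold(YE - 0.5 * G, lam * mu / 2.0)
        t_prev, t = t, (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        mu = max(0.9 * mu, mu_min)        # continuation: decrease mu
        if np.linalg.norm(D - A - E, "fro") <= tol * np.linalg.norm(D, "fro"):
            break
    return A, E
```

Each iteration costs one SVD of an m × n matrix, which is why this method scales where interior-point solvers do not.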

ROBUST PCA – Solving the convex program
Iteration complexity: O(1/√ε) iterations for an ε-suboptimal solution. Dramatic practical gains from continuation (solving a sequence of problems with decreasing μ, each warm-started from the last).

SIMULATION – Recovery in various growth scenarios
Correct recovery with the rank fraction and the error fraction held fixed while the dimension m increases. Empirically, an almost constant number of iterations: provably robust PCA at only a constant factor more computation than conventional PCA.

SIMULATION – Phase Transition in Rank and Sparsity
[figure: fraction of successful recoveries as the rank fraction and error fraction vary, shown over the regions [0, .5] × [0, .5] (10 trials) and [0, 1] × [0, 1] / [0, .4] × [0, .4] (65 trials)]
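
A hypothetical harness for one cell of such a plot, combining the samplers and the solver sketched above; the success threshold is an illustrative choice:

```python
# Estimate one cell of the phase-transition plot: sample D = A0 + E0
# from the models above, solve the relaxation, and count a success if
# the relative error in A is small.
import numpy as np

def recovery_success(m=40, rho_r=0.1, rho_s=0.1, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    r = max(1, int(rho_r * m))
    A0 = sample_random_orthogonal(m, m, r, rng)
    E0 = sample_bernoulli_errors(m, m, rho_s, rng)
    A, _ = rpca_apg(A0 + E0)
    return np.linalg.norm(A - A0, "fro") <= tol * np.linalg.norm(A0, "fro")
```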

EXAMPLE – Background modeling from video
Static camera surveillance video: 200 frames, 72 × 88 pixels, significant foreground motion. [figure: video, low-rank approximation, sparse error]
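
A hypothetical usage sketch for this experiment with the rpca_apg function above: vectorize each frame into one column of D, so that the static background is low-rank and moving objects land in the sparse term.

```python
# Background modeling: one column of D per vectorized frame.
# `frames` stands in for the real surveillance video.
import numpy as np

frames = np.random.rand(200, 72, 88)   # placeholder for real frames
D = frames.reshape(200, -1).T          # 6336 x 200, one column per frame
A, E = rpca_apg(D)
background = A[:, 0].reshape(72, 88)   # low-rank part of frame 0
foreground = E[:, 0].reshape(72, 88)   # sparse error: moving objects
```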

EXAMPLE – Background modeling from video
Static camera surveillance video: 550 frames, 64 × 80 pixels, significant illumination variation. The low-rank term captures background variation; the sparse term captures anomalous activity. [figure: video, low-rank approximation, sparse error]

EXAMPLE – Faces under varying illumination
29 images of one person under varying lighting: RPCA separates a low-rank illumination model from sparse errors such as self-shadowing and specularities. [figure]

EXAMPLE – Face tracking and alignment
Initial alignment, inappropriate for recognition: [figure]

EXAMPLE – Face tracking and alignment
Final result: per-pixel alignment. [figure]


CONJECTURES – Phase Transition in Rank and Sparsity
Hypothesized breakdown behavior as m → ∞. [figure]

CONJECTURES – Phase Transition in Rank and Sparsity
What we know so far: [figure: the region of the (rank, error-fraction) plane covered by this work, versus classical PCA]

CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE I: convex programming succeeds in proportional growth.

CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE II: for small ranks, any fraction of errors can eventually be corrected (cf. Dense Error Correction via L1 Minimization, Wright and Ma '08).

CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE III: for any rank fraction less than one, there exists a nonzero fraction of errors that can eventually be corrected with high probability.

CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE IV: there is an asymptotically sharp phase transition between correct recovery with overwhelming probability and failure with overwhelming probability.

CONJECTURES – Connections to Matrix Completion
Our results also suggest the possibility of a proportional-growth phase transition for matrix completion (cf. Recht, Xu and Hassibi '08). How do the two breakdown points compare? How much is gained by knowing the locations of the corruptions? [figure: hypothesized phase transitions for Robust PCA vs. Matrix Completion]

FUTURE WORK – Stronger results on RPCA?
RPCA with noise and errors: D = A + E + Z, where Z is bounded noise (e.g., iid Gaussian). Conjecture: stable recovery, with estimation error proportional to the noise level. Tradeoff between estimation error and robustness to corruption? Deterministic conditions on the matrix A?
Simultaneous error correction and matrix completion: we observe only a subset of the entries of D = A + E.

FUTURE WORK – Algorithms and Applications
Faster algorithms: smarter continuation strategies; parallel implementations (GPU, multi-machine).
Further applications: computer vision (photometric stereo, tracking, video repair); relevancy data (search, ranking, and collaborative filtering); bioinformatics; system identification.

REFERENCES + ACKNOWLEDGEMENT
Reference: Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices by Convex Optimization, submitted to the Journal of the ACM.
Collaborators: Prof. Yi Ma (UIUC, MSRA), Dr. Zhouchen Lin (MSRA), Dr. Shankar Rao (UIUC), Arvind Ganesh (UIUC), Yigang Peng (MSRA).
Funding: Microsoft Research Fellowship (sponsored by Live Labs); NSF CRS-EHS, NSF CCF-TF, ONR YIP, and NSF IIS grants.

THANK YOU! Questions, please?
John Wright, Robust PCA: Exact Recovery of Corrupted Low-Rank Matrices