Direct Robust Matrix Factorization Liang Xiong, Xi Chen, Jeff Schneider Presented by xxx School of Computer Science Carnegie Mellon University.

Matrix Factorization
Extremely useful:
– Assumes the data matrix is low-rank.
– PCA/SVD, NMF, collaborative filtering…
– Simple, effective, and scalable.
For anomaly detection:
– Assumption: the normal data is low-rank, and anomalies are poorly approximated by the factorization.
DRMF: Liang Xiong, Xi Chen, Jeff Schneider

Robustness Issue
Standard factorizations are usually not robust (sensitive to outliers):
– because of the L2 (Frobenius) error measure they use.
For anomaly detection, of course we have outliers.
[Figure: the standard objective min_L ||X - L||_F^2 s.t. rank(L) <= K — minimize the approximation error subject to a low-rank constraint.]

Why Outliers Matter
Simulation:
– We use SVD to find the first basis of 10 sine signals.
– To make it more fun, we turn one point of one signal into a spike (the outlier).
[Figure: input signals vs. recovered basis with no outlier (cool), a moderate outlier (disturbed), and a wild outlier (totally lost).]
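The simulation above is easy to reproduce in numpy (a sketch; the signal length, phase spread, and spike amplitude are illustrative assumptions, not the values from the slides):

```python
import numpy as np

# 10 sine signals with slightly shifted phases, as rows of a 10 x 200 matrix
t = np.linspace(0, 2 * np.pi, 200)
X = np.vstack([np.sin(t + phase) for phase in np.linspace(0, 1, 10)])

# The first right-singular vector is the basis SVD recovers
clean_basis = np.linalg.svd(X, full_matrices=False)[2][0]

# Turn one point of one signal into a spike (the outlier)
X_spiked = X.copy()
X_spiked[0, 100] += 50.0
spiked_basis = np.linalg.svd(X_spiked, full_matrices=False)[2][0]

# The spike drags the recovered basis away from the sinusoid shape
print(abs(clean_basis @ spiked_basis))  # well below 1: the basis is disturbed
```

Because the L2 objective squares the residuals, a single large spike dominates the fit, which is exactly the failure the slide illustrates.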

Direct Robust Matrix Factorization (DRMF)
Throw the outliers out of the factorization, and the problem is solved!
Mathematically, DRMF is

  min_{L,S} ||X - L - S||_F  s.t.  rank(L) <= K,  ||S||_0 <= e

– ||S||_0: the number of non-zeros in S.
– S is a "trash can" for outliers; e is small because there should be only a small number of outliers.

DRMF Algorithm
Input: data X. Output: low-rank L; outliers S.
Iterate (block coordinate descent):
– Let C = X - S. Do rank-K SVD: L = SVD(C, K).
– Let E = X - L. Do hard thresholding: S_ij = E_ij if |E_ij| >= t, else 0, where t is the e-th largest element of {|E_ij|}.
That's it! Everyone can try it at home.
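The two-step loop can be sketched in a few lines of numpy (an illustrative implementation, not the authors' code; the function name, fixed iteration count, and dense-SVD choice are assumptions):

```python
import numpy as np

def drmf(X, K, e, n_iter=50):
    """Block coordinate descent for DRMF: rank-K SVD + hard thresholding.

    X: data matrix; K: target rank; e: max number of outlier entries.
    Returns the low-rank part L and the sparse outlier part S.
    """
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # L-step: rank-K SVD of the cleaned matrix C = X - S
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :K] * s[:K]) @ Vt[:K]
        # S-step: keep only the e largest-magnitude residuals
        E = X - L
        t = np.sort(np.abs(E), axis=None)[-e]  # e-th largest |E_ij|
        S = np.where(np.abs(E) >= t, E, 0.0)
    return L, S
```

For large problems the dense `np.linalg.svd` call would be swapped for an approximate truncated-SVD solver, as the extensibility slide suggests.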

Related Work
Nuclear norm minimization (NNM):
– Effective methods with nice theoretical properties from compressive sensing.
– NNM is the convex relaxation of DRMF: the rank constraint relaxes to the nuclear norm ||L||_*, and the L0 constraint on S relaxes to the L1 norm.
A parallel work, GoDec by Zhou et al., appeared at ICML'11.

Pros & Cons
Pros:
– No compromise/relaxation => high quality.
– Efficient.
– Easy to implement and use.
Cons:
– Difficult theory, because of the rank constraint and the L0 norm…
– Non-convex: local minima exist. But this can be greatly mitigated by initializing with its convex version, NNM.

Highly Extensible
Structured outliers:
– Outlier rows instead of entries? Just use structured measurements.
Sparse input / missing data:
– Useful for recommendation and matrix completion.
Non-negativity, as in NMF:
– Still readily solvable with the constraints.
For large-scale problems:
– Use approximate SVD solvers.

Simulation Study
Factorize noisy low-rank matrices to find entry outliers.
– SVD: plain SVD. RPCA, SPCP: two representative NNM methods.
[Figure: error of recovering normal entries; detection rate of outlier entries; running time (log scale).]

Simulation Study
Sensitivity to outliers:
– We examine the recovery error as the outlier amplitude grows.
– Noiseless case; all assumptions made by RPCA hold.

Find Stranger Digits
The USPS dataset is used. We mix a few '7's into many '1's, and then ask DRMF to find those '7's. Unsupervised.
– Treat each digit as a row of the matrix.
– Rank the digits by reconstruction error.
– Use the structured extension of DRMF: row outliers.
[Figure: resulting ranked list of digits.]
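The ranking step can be sketched as follows. This is a simplified stand-in: it ranks rows by their residual under a plain rank-K reconstruction rather than running the full structured-DRMF iteration, and the function name and rank choice are assumptions:

```python
import numpy as np

def rank_row_outliers(X, K):
    """Rank rows by reconstruction error under a rank-K approximation.

    Rows the low-rank model reconstructs poorly (e.g. the '7's among
    the '1's) come first in the returned index list.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    L = (U[:, :K] * s[:K]) @ Vt[:K]
    errors = np.linalg.norm(X - L, axis=1)  # per-row reconstruction error
    return np.argsort(errors)[::-1]         # most anomalous rows first
```

In the full method, the structured (row-wise) S absorbs the anomalous rows during the factorization, so they cannot distort the low-rank model of the normal digits.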

Conclusion
DRMF is a direct and intuitive solution to the robust factorization problem.
– Easy to implement and use.
– Highly extensible.
– Good empirical performance.
Please direct questions to Liang Xiong.