Direct Robust Matrix Factorization Liang Xiong, Xi Chen, Jeff Schneider Presented by xxx School of Computer Science Carnegie Mellon University.


1 Direct Robust Matrix Factorization Liang Xiong, Xi Chen, Jeff Schneider Presented by xxx School of Computer Science Carnegie Mellon University

2 Matrix Factorization
Extremely useful…
– Assumes the data matrix is low-rank.
– PCA/SVD, NMF, collaborative filtering…
– Simple, effective, and scalable.
For anomaly detection
– Assumption: the normal data is low-rank, and anomalies are poorly approximated by the factorization.

3 Robustness Issue
Usually not robust (sensitive to outliers)
– Because of the L2 (Frobenius) measure they use:
  $\min_{L} \|X - L\|_F$ (minimize the approximation error) subject to $\operatorname{rank}(L) \le K$ (low rank).
For anomaly detection, of course we have outliers.
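
The sensitivity described on this slide can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes the Frobenius-norm problem is solved by a truncated SVD, and the helper name `low_rank_approx`, the matrix sizes, and the spike size of 100 are all illustrative choices.

```python
import numpy as np

def low_rank_approx(X, K):
    """Best rank-K approximation of X in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :K] * s[:K]) @ Vt[:K]

# Hypothetical test data: an exactly rank-3 matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))

# Without outliers the rank-3 fit reproduces X up to rounding error.
err_clean = np.linalg.norm(X - low_rank_approx(X, 3))

# Corrupt a single entry, refit, and measure the error against the clean matrix.
X_out = X.copy()
X_out[0, 0] += 100.0
err_outlier = np.linalg.norm(X - low_rank_approx(X_out, 3))

print(err_clean, err_outlier)
```

A single corrupted entry is enough to pull the fitted subspace away from the clean data, which is the behavior the slide attributes to the L2 measure.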

4 Why outliers matter
Simulation
– We use SVD to find the first basis of 10 sine signals.
– To make it more fun, let us turn one point of one signal into a spike (the outlier).
[Figure: input signals and output basis for three cases]
– No outlier: cool.
– Moderate outlier: disturbed.
– Wild outlier: totally lost.
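
A rough re-creation of this simulation might look as follows. The slides do not give the signal amplitudes, the spike position, or the spike sizes, so those values are assumptions, as is the use of the absolute inner product with the clean basis as a similarity score (1.0 means the same direction up to sign).

```python
import numpy as np

def first_basis(X):
    """First right singular vector of X (each row of X is one signal)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[0]

# Ten sine signals that differ only in amplitude (amplitudes are assumed).
t = np.linspace(0, 2 * np.pi, 100)
amps = np.linspace(0.5, 1.5, 10)
clean = amps[:, None] * np.sin(t)[None, :]

def basis_similarity(spike):
    """Similarity between the clean basis and the basis found after
    adding a spike of the given size to one point of one signal."""
    X = clean.copy()
    X[0, 25] += spike  # spike position is an arbitrary choice
    return abs(first_basis(X) @ first_basis(clean))

# No outlier, moderate outlier, wild outlier (sizes are assumed).
for spike in (0.0, 5.0, 100.0):
    print(spike, basis_similarity(spike))
```

The similarity should fall as the spike grows; how quickly it falls depends on the assumed amplitudes and spike position.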

5 Direct Robust Matrix Factorization (DRMF)
Throw outliers out of the factorization, and problem solved!
Mathematically, this is DRMF:
  $\min_{L,S} \|(X - S) - L\|_F$ subject to $\operatorname{rank}(L) \le K$, $\|S\|_0 \le e$
– $\|S\|_0$: number of non-zeros in S.
– S is the “trash can” for outliers.
– There should be only a small number of outliers, so e is small.

6 DRMF Algorithm
Input: data X. Output: low-rank L; outliers S.
Iterate (block coordinate descent):
– Let C = X - S. Do rank-K SVD: L = SVD(C, K).
– Let E = X - L. Do thresholding: $S_{ij} = E_{ij}$ if $|E_{ij}| \ge t$, and $S_{ij} = 0$ otherwise, where t is the e-th largest element of $\{|E_{ij}|\}$.
That’s it! Everyone could try at home.
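
The two alternating steps on this slide can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: the function name `drmf`, the fixed iteration count `n_iter`, and the zero initialization of S are assumptions, and ties at the threshold are broken arbitrarily so that exactly e entries are kept.

```python
import numpy as np

def drmf(X, K, e, n_iter=50):
    """Sketch of DRMF by block coordinate descent.

    X: data matrix, K: target rank, e: maximum number of outlier entries.
    Returns the low-rank part L and the sparse outlier part S.
    """
    X = np.asarray(X, dtype=float)
    S = np.zeros_like(X)  # assumed initialization: no outliers
    for _ in range(n_iter):
        # L-step: rank-K SVD of C = X - S.
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :K] * s[:K]) @ Vt[:K]
        # S-step: keep the e largest-magnitude entries of E = X - L.
        E = X - L
        S = np.zeros_like(E)
        if e > 0:
            idx = np.argpartition(np.abs(E), -e, axis=None)[-e:]
            S.flat[idx] = E.flat[idx]
    return L, S

# Hypothetical usage: a rank-2 matrix with 5 large corrupted entries.
rng = np.random.default_rng(0)
L_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
X = L_true.copy()
rows, cols = rng.integers(0, 30, size=5), rng.integers(0, 30, size=5)
X[rows, cols] += 20.0

L, S = drmf(X, K=2, e=5)
print(np.count_nonzero(S), np.linalg.norm(X - L - S))
```

Each step minimizes the objective with the other block held fixed, so the objective never increases from one iteration to the next; starting from S = 0 means the first L-step is a plain rank-K SVD.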

7 Related Work
Nuclear norm minimization (NNM)
– Effective methods with nice theoretical properties from compressive sensing.
– NNM is the convex relaxation of DRMF: the rank constraint is relaxed to the nuclear norm, and the L0 norm to the L1 norm.
A parallel work, GoDec by Zhou et al., appeared at ICML’11.

8 Pros & Cons
Pros:
– No compromise/relaxation => high quality
– Efficient
– Easy to implement and use
Cons:
– Difficult theory, because of the rank and the L0 norm…
– Non-convex: local minima exist. This can be greatly mitigated by initializing with its convex version, NNM.

9 Highly Extensible
Structured outliers
– Outlier rows instead of entries? Just use structured measurements.
Sparse input / missing data
– Useful for recommendation and matrix completion.
Non-negativity, as in NMF
– Still readily solvable with the constraints.
Large-scale problems
– Use approximate SVD solvers.

10 Simulation Study
Factorize noisy low-rank matrices to find entry outliers.
– SVD: plain SVD. RPCA, SPCP: two representative NNM methods.
[Figure panels: error of recovering normal entries; detection rate of outlier entries; running time (log scale)]

11 Simulation Study
Sensitivity to outliers
– We examine the recovery error as the outlier amplitude grows.
– Noiseless case. All assumptions made by RPCA hold.

12 Find Stranger Digits
The USPS dataset is used. We mix a few ‘7’s into many ‘1’s, and then ask DRMF to find those ‘7’s. Unsupervised.
– Treat each digit as a row in the matrix.
– Rank the digits by reconstruction error.
– Use the structured extension of DRMF: row outliers.
Resulting ranked list: [figure not included in the transcript]
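
The procedure on this slide (rows as items, row outliers, ranking by reconstruction error) can be illustrated on synthetic data. The sketch below does not use the USPS digits: the "normal" rows are noisy multiples of one made-up vector and the "stranger" rows are random vectors. The function name `row_drmf`, the use of Euclidean row norms for the row-wise thresholding, and all sizes are assumptions.

```python
import numpy as np

def row_drmf(X, K, e, n_iter=20):
    """Sketch of DRMF with row outliers: at most e rows of S are non-zero."""
    X = np.asarray(X, dtype=float)
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # L-step: rank-K SVD of X - S.
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :K] * s[:K]) @ Vt[:K]
        # S-step: move the e rows with the largest residual norm into S.
        E = X - L
        S = np.zeros_like(E)
        rows = np.argsort(np.linalg.norm(E, axis=1))[::-1][:e]
        S[rows] = E[rows]
    return L, S

# Synthetic stand-in for the digit data (not USPS).
rng = np.random.default_rng(0)
n_normal, n_strange, dim = 100, 3, 20
base = np.ones(dim) / np.sqrt(dim)
normal = (rng.uniform(1.0, 2.0, size=(n_normal, 1)) * base
          + 0.01 * rng.standard_normal((n_normal, dim)))
strange = rng.standard_normal((n_strange, dim))
X = np.vstack([normal, strange])  # strangers are the last 3 rows

L, S = row_drmf(X, K=1, e=n_strange)

# Rank all rows by reconstruction error, largest first.
errors = np.linalg.norm(X - L, axis=1)
ranking = np.argsort(errors)[::-1]
print(ranking[:n_strange])
```

Rows that the low-rank model explains poorly rise to the top of the ranking, which is how the slide proposes to surface the mixed-in digits.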

13 Conclusion
DRMF is a direct and intuitive solution to the robust factorization problem.
Easy to implement and use.
Highly extensible.
Good empirical performance.
Please direct questions to Liang Xiong (lxiong@cs.cmu.edu)

