Scalable Maximum Margin Factorization by Active Riemannian Subspace Search. Yan Yan, Mingkui Tan, Ivor W. Tsang, Yi Yang, Chengqi Zhang and Qinfeng Shi. QCIS, University of Technology, Sydney; ACVT, The University
Outline: Introduction; The Proposed Model; Experiments; Conclusion
Collaborative filtering for recommendation systems. Goal: recover missing ratings by low-rank matrix completion. Real-world applications: recommend TV shows/movies on Netflix; recommend artists/music tracks on Xiami; recommend products on Taobao… Data that can be used: partially observed rating data from users on items. A specific output of recommendation systems: the predicted ranking scores of users on unseen items.
A user/item rating matrix on movies Figure: An example of a user/item rating matrix on movies
Problem setup of matrix completion. Reconstruct the rating matrix X with a low-rank constraint; Y is the observed matrix. The problem is NP-hard. Approach: matrix factorization.
Matrix factorization approach Figure: Matrix factorization
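A minimal sketch of the factorization idea, not the authors' code: a rank-r estimate X = U V^T is fit to the observed entries only. The function name `factorize_observed`, the squared loss, plain gradient descent, the step size, and the regularization weight are all assumptions chosen for illustration; the talk itself uses a hinge loss and Riemannian optimization.

```python
import numpy as np

def factorize_observed(Y, mask, r, steps=2000, lr=0.01, reg=0.01, seed=0):
    """Fit X = U @ V.T to the entries of Y where mask is True.

    Illustrative only: regularized squared loss minimized by plain
    gradient descent (hypothetical choices, not the method of the talk).
    """
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(steps):
        R = mask * (U @ V.T - Y)   # residual on observed entries only
        grad_U = R @ V + reg * U   # gradient with respect to U
        grad_V = R.T @ U + reg * V # gradient with respect to V
        U -= lr * grad_U
        V -= lr * grad_V
    return U, V
```

The unobserved entries never enter the residual, so the product U @ V.T is free to fill them in; those filled-in values are the predicted ratings.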
Challenges. Real-world rating data take discrete values: maximum margin matrix factorization. Existing methods usually require repeated SVDs: our optimization avoids repeated SVDs and applies cheaper QR decompositions. The latent variable r is usually unknown and can differ across datasets: an automatic method to detect the rank.
Maximum margin matrix factorization (M3F). Hinge loss: appropriate for discrete real-world rating data. M3F for binary values (-1/+1). From binary values to ordinal values: Suppose … Introduce L+1 thresholds.
Maximum margin matrix factorization (M3F) M3F for ordinal values
Maximum margin matrix factorization (M3F) Figure: M3F loss for discrete ordinal values
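A hedged sketch of the hinge losses discussed on the M3F slides. It assumes the all-threshold construction common in the M3F literature, with ratings in {1, ..., L} and L-1 finite interior thresholds; under that assumption the two outer thresholds of the L+1 on the slide would be -inf and +inf and contribute no loss. The exact loss on the original slide is not recoverable from the extracted text, so the function names and this construction are illustrative only.

```python
def hinge(z):
    """Standard hinge: zero once the margin z reaches 1."""
    return max(0.0, 1.0 - z)

def binary_hinge_loss(x, y):
    """Loss for a predicted score x and a binary label y in {-1, +1}."""
    return hinge(y * x)

def ordinal_hinge_loss(x, y, thresholds):
    """Loss for a predicted score x and an ordinal rating y in {1, ..., L}.

    thresholds: the L-1 finite interior thresholds in increasing order
    (assumed layout). Thresholds below the rating should lie under x,
    thresholds at or above the rating should lie over x, each with margin 1.
    """
    loss = 0.0
    for l, theta in enumerate(thresholds, start=1):
        sign = 1.0 if l >= y else -1.0
        loss += hinge(sign * (theta - x))
    return loss
```

With thresholds [2, 4, 6, 8] and rating 2, a score of 3 sits at least one unit away from every threshold on the correct side, so the loss is zero; moving the score toward any threshold makes that threshold's hinge term positive.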
Formulation
Differential Geometry of Fixed-rank Matrices
Differential Geometry of Fixed-rank Matrices Figure: Gradient descent on Riemannian manifold
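A sketch of one ingredient of gradient descent on the manifold of fixed-rank matrices: the Riemannian gradient can be obtained by projecting the Euclidean gradient Z onto the tangent space at X. The code assumes the standard embedded geometry with X = U diag(s) V^T and orthonormal U and V; the function name `project_to_tangent` is hypothetical, and the dense m-by-n arrays are used only to keep the example short.

```python
import numpy as np

def project_to_tangent(U, V, Z):
    """Project Z onto the tangent space at X = U diag(s) V^T.

    U (m x r) and V (n x r) must have orthonormal columns.
    Computes P_U Z + Z P_V - P_U Z P_V with P_U = U U^T, P_V = V V^T.
    """
    UtZ = U.T @ Z   # r x n
    ZV = Z @ V      # m x r
    return U @ UtZ + ZV @ V.T - U @ (UtZ @ V) @ V.T
```

The projection is idempotent, and the discarded part Z - P(Z) is orthogonal to the tangent space, which is what makes the projected direction usable as a descent direction on the manifold.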
Differential Geometry of Fixed-rank Matrices Retraction Retraction can be cheaply calculated without SVD in
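One way a retraction can avoid an SVD of the full m-by-n matrix, sketched under assumptions: if the point to retract is available in factored form A B^T with k columns (for example k = 2r after a tangent step), two thin QR factorizations reduce the problem to a k-by-k matrix. This sketch still takes an SVD of that small matrix, so it may differ from the procedure in the talk; the function name `retract_factored` is hypothetical.

```python
import numpy as np

def retract_factored(A, B, r):
    """Best rank-r approximation of A @ B.T without forming the product.

    A is m x k, B is n x k, with k <= min(m, n). Only thin QR
    factorizations of A and B and an SVD of a k x k matrix are used.
    Returns U (m x r), s (r,), V (n x r) with A @ B.T ~ U diag(s) V^T.
    """
    Qa, Ra = np.linalg.qr(A)                # A = Qa Ra, Ra is k x k
    Qb, Rb = np.linalg.qr(B)                # B = Qb Rb, Rb is k x k
    Us, s, Vst = np.linalg.svd(Ra @ Rb.T)   # small k x k SVD
    U = Qa @ Us[:, :r]
    V = Qb @ Vst[:r, :].T
    return U, s[:r], V
```

Because Qa and Qb have orthonormal columns, the singular values of the small matrix Ra Rb^T equal those of A B^T, so truncating the small SVD gives the same result as truncating the large one.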
Line Search on Riemannian Manifold Figure: Gradient descent on Riemannian manifold
BNRCG: Block-wise nonlinear Riemannian conjugate gradient descent for M3F
Active Riemannian subspace search for M3F: ARSS-M3F
Active Riemannian subspace search for M3F: ARSS-M3F Step 1: Increase the rank.
Active Riemannian subspace search for M3F: ARSS-M3F Step 2: Update X and thresholds.
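The two steps can be illustrated with a simplified rank-increasing loop. Everything specific here is an assumption: the new directions are taken from the top singular vectors of the residual on observed entries, a squared loss replaces the hinge loss, a least-squares fit of a small core matrix stands in for the BNRCG update, thresholds are not modeled, and the stopping rule and all names are hypothetical. The dense SVD is for brevity only; at scale a sparse truncated SVD would be the natural replacement.

```python
import numpy as np

def fit_core(U, V, Y, mask):
    """Least-squares fit of S in X = U @ S @ V.T on observed entries
    (a stand-in for Step 2, not the update used in the talk)."""
    rows, cols = np.nonzero(mask)
    # Each observed (i, j) gives one linear equation in the entries of S.
    design = np.einsum("na,nb->nab", U[rows], V[cols]).reshape(len(rows), -1)
    s_vec, *_ = np.linalg.lstsq(design, Y[rows, cols], rcond=None)
    return s_vec.reshape(U.shape[1], V.shape[1])

def rank_increasing_fit(Y, mask, rho=1, max_rank=10, tol=1e-3):
    """Grow the rank by rho per round until the objective stops improving."""
    m, n = Y.shape
    U, V = np.zeros((m, 0)), np.zeros((n, 0))
    X = np.zeros((m, n))
    obj0 = obj = 0.5 * np.sum((mask * (Y - X)) ** 2)
    while U.shape[1] + rho <= max_rank:
        # Step 1: increase the rank with top-rho residual singular vectors.
        P, _, Qt = np.linalg.svd(mask * (Y - X), full_matrices=False)
        U_new, _ = np.linalg.qr(np.hstack([U, P[:, :rho]]))
        V_new, _ = np.linalg.qr(np.hstack([V, Qt[:rho].T]))
        # Step 2: update X on the enlarged subspaces.
        S = fit_core(U_new, V_new, Y, mask)
        X_new = U_new @ S @ V_new.T
        new_obj = 0.5 * np.sum((mask * (Y - X_new)) ** 2)
        if obj - new_obj <= tol * obj0:
            break  # the extra directions no longer help: keep current rank
        U, V, X, obj = U_new, V_new, X_new, new_obj
    return X, U.shape[1]
```

The returned rank is the number of directions accepted before the improvement fell below the tolerance, which is the sense in which the loop detects the rank rather than taking it as an input.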
Experiments
Data sets (columns: # users; # items; # ratings)
Binary-syn 1,000 All
Ordinal-syn-small
Ordinal-syn-large 20,000
Movielens 1M: 6,040; 3,952; 1,000,209
Movielens 10M: 71,567; 10,681; 10,000,054
Netflix: 480,189; 17,770; 100,480,507
Yahoo! Music Track 1: 1,000,990; 624,961; 262,810,175
The sensitivity of the regularization parameter experiment
The convergence behavior experiment
RMSE and consumed time on the synthetic datasets
RMSE and consumed time on Movielens 1M and Movielens 10M
RMSE and consumed time on Netflix and Yahoo! Music
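For reference, the RMSE reported on these slides is the root mean squared error between predicted and held-out ratings. A small sketch, with a hypothetical function name and list-based inputs chosen for illustration:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

Only held-out observed ratings should be passed in, since the error is undefined for entries whose true rating is unknown.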
Conclusion. Two challenges in M3F: scalability and latent factor detection. BNRCG addresses the scalability problem by exploiting Riemannian geometry. ARSS-M3F applies an efficient and simple method to detect the latent factor. Extensive experiments demonstrate that the proposed method provides competitive performance.
Thank you!