An Enhanced Semi-supervised Recommendation Model Based on Green’s Function
Dingyan Wang and Irwin King
Dept. of Computer Science & Engineering
The Chinese University of Hong Kong
ICONIP 2010, Sydney, Australia

Outline
Background
Motivation
An Enhanced Model
Experimental Analysis
Conclusion

Background
Recommendation in Collaborative Filtering
[Figure: recommendation]

Background
Significance
–Consumer satisfaction
–Profit
Mathematical Form
–User-item matrix completion task
–Rating prediction
[Figure: user-item matrix; labels: User, Item, Rating for Prediction]

Background
Traditional Recommendation Methods
–Memory-based method
  Item-based method, WWW ’01 & SIGIR ’06
  User-based method, SIGIR ’06
–Model-based method
  Probabilistic matrix factorization, SIGIR ’07 & ’04

Background
A Novel View of Recommendation [Green’s function recommendation, KDD ’07 & WWW ’10]
–Label propagation on a graph
–Label prediction with semi-supervised learning

Motivation
Higher accuracy in label propagation recommendation
Importance of graph construction
Accuracy Reduction
–Data sparsity
  Some items have no similarity information
–Information loss
  Similarity in a local view

An Enhanced Model
An Enhanced Model Based on Green’s Function
[Pipeline figure: User-Item Rating Matrix → Enhanced Item-Graph Construction → Green’s Function Calculation → Label Propagation → Predicted User-Item Matrix]

An Enhanced Model
Enhanced Item-Graph Construction
–Global similarity between items
  Latent-feature vector similarity
–Local similarity between items
  Similarity derived from ratings
–Global and local consistent similarity
  Linear combination of global and local similarity

An Enhanced Model
Global Similarity Calculation
–Latent feature extraction
  Probabilistic matrix factorization (PMF), NIPS ’08
  PMF factorizes the M × N rating matrix into an M × K user-latent matrix and a K × N item-latent matrix.
  The objective uses the rating of user i for item j and an indicator of whether user i rated item j.

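As a reference point, a common way to write the PMF objective is sketched below. The symbols are assumptions chosen to match the dimensions on this slide, not necessarily the slide's own notation: R is the M × N rating matrix, U_i is row i of the M × K user-latent matrix (as a column vector), V_j is column j of the K × N item-latent matrix, I_{ij} is the rated/unrated indicator, and λ_U, λ_V are regularization weights.

```latex
\min_{U,V}\;
\frac{1}{2}\sum_{i=1}^{M}\sum_{j=1}^{N} I_{ij}\,\bigl(R_{ij} - U_i^{\top} V_j\bigr)^{2}
\;+\; \frac{\lambda_U}{2}\,\lVert U \rVert_F^{2}
\;+\; \frac{\lambda_V}{2}\,\lVert V \rVert_F^{2}
```

Only observed ratings (I_{ij} = 1) contribute to the squared-error term; the columns of V then serve as the item latent-feature vectors used for global similarity.
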
An Enhanced Model
Local Similarity Calculation
–Cosine similarity
–Pearson Correlation Coefficient (PCC)

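The two local similarity measures can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a dense rating matrix `R` (rows are users, columns are items) in which 0 marks an unrated entry, and it computes both measures over users who rated both items. The function names are hypothetical.

```python
import math

def co_rated(R, a, b):
    """Rating pairs from users who rated both item a and item b (0 = unrated, an assumption)."""
    return [(row[a], row[b]) for row in R if row[a] != 0 and row[b] != 0]

def cosine_sim(R, a, b):
    """Cosine similarity between items a and b over co-rating users."""
    pairs = co_rated(R, a, b)
    num = sum(x * y for x, y in pairs)
    den = (math.sqrt(sum(x * x for x, _ in pairs))
           * math.sqrt(sum(y * y for _, y in pairs)))
    return num / den if den else 0.0

def pcc_sim(R, a, b):
    """Pearson correlation between items a and b over co-rating users."""
    pairs = co_rated(R, a, b)
    if len(pairs) < 2:
        return 0.0  # correlation is undefined with fewer than two shared raters
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    num = sum((x - mx) * (y - my) for x, y in pairs)
    den = (math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
           * math.sqrt(sum((y - my) ** 2 for _, y in pairs)))
    return num / den if den else 0.0

# Toy matrix: 3 users x 3 items.
R = [[5, 4, 0],
     [3, 2, 1],
     [1, 0, 5]]
print(cosine_sim(R, 0, 1), pcc_sim(R, 0, 1))
```

Item pairs with no co-rating users get similarity 0 here, which is the sparsity problem named in the Motivation slide.
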
An Enhanced Model
Global and Local Consistent Similarity (GLCS)
–Global similarity from item latent matrix
–Global and local similarity combination
–Weighted undirected item-graph

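A minimal sketch of how such a graph could be assembled is given below. Several details are assumptions rather than statements from the slides: the global similarity is taken to be the cosine between item latent vectors, the combination weight is called `alpha` and is applied to the global term, and `V[j]` holds the latent vector of item j. The function names are hypothetical.

```python
import math

def latent_cosine(V, a, b):
    """Cosine similarity between the latent vectors of items a and b (assumed global measure)."""
    dot = sum(x * y for x, y in zip(V[a], V[b]))
    na = math.sqrt(sum(x * x for x in V[a]))
    nb = math.sqrt(sum(y * y for y in V[b]))
    return dot / (na * nb) if na and nb else 0.0

def glcs_graph(V, local_sim, alpha):
    """Symmetric weight matrix W = alpha * global + (1 - alpha) * local, with a zero diagonal."""
    n = len(V)
    W = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            w = alpha * latent_cosine(V, a, b) + (1 - alpha) * local_sim[a][b]
            W[a][b] = W[b][a] = w  # undirected graph: same weight in both directions
    return W

# Toy inputs: 3 items with 2 latent features each, plus a precomputed local similarity matrix.
V = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
local_sim = [[1.0, 0.5, 0.0],
             [0.5, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
W = glcs_graph(V, local_sim, alpha=0.5)
```

Because the global term is defined for every item pair, an edge weight can be nonzero even when two items share no raters.
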
An Enhanced Model
Green’s Function Calculation (An Example)
–Given an item-graph
–Calculate the Laplacian matrix L = D - W, where W is the edge-weight matrix of the item-graph and D is the diagonal degree matrix

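The Laplacian step can be illustrated with a small sketch. The 3-item weight matrix below is a made-up example, not the one shown on the slide, and `laplacian` is a hypothetical helper name.

```python
def laplacian(W):
    """Graph Laplacian L = D - W, where D is diagonal with D[i][i] = sum of row i of W."""
    n = len(W)
    degrees = [sum(row) for row in W]
    return [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical weighted undirected item-graph (symmetric, zero diagonal).
W = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
L = laplacian(W)
for row in L:
    print(row)
```

Every row of L sums to zero, which is why L has a zero eigenvalue and cannot be inverted directly.
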
An Enhanced Model
Green’s Function Calculation
–Defined as the inverse of the Laplacian matrix L with the zero-mode discarded, that is, computed without the eigenvector whose eigenvalue is zero

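One way to compute this without an eigen-solver is sketched below. It assumes a connected graph, for which discarding the zero-mode gives the Moore-Penrose pseudo-inverse of L, and it uses the identity G = (L + J/n)^(-1) - J/n, where J is the all-ones matrix. This is a swapped-in numerical route, not necessarily how the authors computed G; `invert` and `greens_function` are hypothetical names.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def greens_function(L):
    """Green's function of a connected graph: inverse of L with the zero-mode removed."""
    n = len(L)
    # Adding J/n moves the zero eigenvalue to 1 and leaves the other modes unchanged.
    shifted = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    # Subtracting J/n removes the constant mode again.
    return [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]

# Two items joined by an edge of weight 1.
G = greens_function([[1, -1], [-1, 1]])
```

For a disconnected graph the identity above does not hold as written, since L then has more than one zero eigenvalue.
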
An Enhanced Model
Label Propagation Recommendation
–Rating as label
–Closed-form label propagation
[Figure: label propagation from labeled data to unlabeled data]

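A minimal sketch of the closed-form step follows. It assumes the form used in Green's function label propagation, y = G y0, applied once per user: y0 holds that user's known ratings with 0 for unrated items (an assumption), and G is the Green's function of the item-graph. The function name and the toy values are hypothetical.

```python
def propagate_labels(G, R):
    """For each user row y0 in R, return y = G @ y0 (scores for every item)."""
    n = len(G)
    return [[sum(G[i][j] * row[j] for j in range(n)) for i in range(n)]
            for row in R]

# Toy Green's function for a 2-item graph and one user who rated only item 0.
G = [[0.25, -0.25],
     [-0.25, 0.25]]
R = [[4, 0]]
scores = propagate_labels(G, R)
```

The propagated scores need not lie on the original rating scale; how they are mapped back to ratings is not specified on this slide.
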
Experimental Analysis
Dataset
–MovieLens dataset
  #Rating | #Item | #User | Rating Range | #Training Data | #Test Data | Sparsity Level
  100,[…] | […] | […] | […]~5 | 80,000 | 20,000 | 6.3%
Metrics
–Mean Absolute Error (MAE)
–Mean Zero-one Error (MZOE)
–Root Mean Squared Error (RMSE)

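The three metrics can be sketched as below. MAE and RMSE follow their usual definitions. For MZOE, the sketch assumes that a prediction is rounded to the nearest integer rating and counted as an error when it differs from the true rating; that rounding rule is an assumption, not something stated on the slide.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mzoe(y_true, y_pred):
    """Mean zero-one error: share of predictions whose rounded value misses the true rating."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if math.floor(p + 0.5) != t)
    return wrong / len(y_true)

y_true = [4, 3, 5, 1]
y_pred = [3.6, 3.0, 2.0, 1.4]
print(mae(y_true, y_pred), mzoe(y_true, y_pred), rmse(y_true, y_pred))
```

Lower is better for all three metrics.
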
Experimental Analysis
Impact of Weight Parameter
[Figure: impact of the weight parameter; panels: k=10, k=5]

Experimental Analysis
Performance Comparison
–Previous Green’s function model (GCOS, GPCC) [KDD ’07]
–Item-based recommendation (ICOS, IPCC)
–User-based recommendation (UCOS, UPCC)

Conclusion
Latent features provide global similarity.
Global and local consistent similarity can improve item-graph construction.
The enhanced model outperformed the other memory-based methods and the previous Green’s function model.

Q&A
Thank you!

PMF
Probabilistic Matrix Factorization
–Define a conditional distribution over the observed ratings as a Gaussian distribution

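In the standard PMF formulation this conditional distribution is usually written as below. The notation is assumed rather than taken from the slide: R_{ij} is the rating of user i for item j, U_i and V_j are the corresponding K-dimensional latent vectors, I_{ij} is 1 when user i rated item j and 0 otherwise, and σ² is the observation noise variance.

```latex
p(R \mid U, V, \sigma^{2})
= \prod_{i=1}^{M} \prod_{j=1}^{N}
\Bigl[\, \mathcal{N}\bigl(R_{ij} \mid U_i^{\top} V_j,\ \sigma^{2}\bigr) \Bigr]^{I_{ij}}
```

The exponent I_{ij} drops unobserved entries from the product.
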
PMF
–Assume zero-mean spherical Gaussian priors on the user and item feature vectors
–Obtain the posterior over the user and item features by Bayesian inference

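In the usual PMF notation, which is assumed here rather than taken from the slide, the priors and the resulting posterior can be written as follows. σ_U² and σ_V² are the prior variances and 𝐈 is the K × K identity matrix.

```latex
p(U \mid \sigma_U^{2}) = \prod_{i=1}^{M} \mathcal{N}\bigl(U_i \mid 0,\ \sigma_U^{2}\mathbf{I}\bigr),
\qquad
p(V \mid \sigma_V^{2}) = \prod_{j=1}^{N} \mathcal{N}\bigl(V_j \mid 0,\ \sigma_V^{2}\mathbf{I}\bigr)

p(U, V \mid R, \sigma^{2}, \sigma_U^{2}, \sigma_V^{2})
\;\propto\;
p(R \mid U, V, \sigma^{2})\; p(U \mid \sigma_U^{2})\; p(V \mid \sigma_V^{2})
```

Taking the logarithm of the right-hand side turns the Gaussian priors into squared-norm penalties on U and V.
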
PMF
–Optimization: maximize the log of the posterior distribution
–Use gradient descent in Y, U, V to reach a local optimum

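A minimal sketch of this optimization is shown below. It uses per-rating (stochastic) gradient updates on the regularized squared-error form of the negative log-posterior, a common variant rather than necessarily the batch procedure on the slide. The function names, the hyperparameters (`k`, `lr`, `reg`, `epochs`) and the toy data are hypothetical, and item factors are stored one row per item, i.e. the transpose of a K × N layout.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def pmf_loss(ratings, U, V, reg):
    """0.5 * squared error on observed ratings + 0.5 * reg * (||U||^2 + ||V||^2)."""
    sq = sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)
    norm = sum(x * x for row in U for x in row) + sum(x * x for row in V for x in row)
    return 0.5 * sq + 0.5 * reg * norm

def pmf_train(ratings, n_users, n_items, k=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """ratings: list of (user, item, rating) triples for observed entries only."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - dot(U[i], V[j])
            for f in range(k):
                u, v = U[i][f], V[j][f]
                # Step against the gradient of the regularized squared error.
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return U, V

data = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
U, V = pmf_train(data, n_users=3, n_items=3)
print(pmf_loss(data, U, V, reg=0.05))
```

The objective is non-convex in U and V jointly, so the result depends on the random initialization and is only a local optimum.
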
Algorithm