Modeling User Rating Profiles For Collaborative Filtering


1 Modeling User Rating Profiles For Collaborative Filtering
Benjamin M. Marlin, University of Toronto, Department of Computer Science, Toronto, Ontario, Canada

2 1. Abstract
• We present a new latent variable model for rating-based collaborative filtering called the User Rating Profile model (URP). URP has complete generative semantics at the user and rating profile levels.
• URP is related to several models including a multinomial mixture model, the aspect model, and latent Dirichlet allocation, but has advantages over each.
• A variational Expectation Maximization procedure is used to fit the URP model. Rating prediction makes use of a well-defined variational inference procedure.
• Empirical results on two rating prediction tasks using the EachMovie and MovieLens data sets show that URP attains lower error rates than the multinomial mixture model, the aspect model, and neighborhood-based techniques.

2. Introduction

3 Collaborative Filtering Formulations
Preference Indicators:
• Co-occurrence Pair (u, y): u is a user index and y is an item index.
• Count Vector (n_1^u, n_2^u, …, n_M^u): n_y^u is the number of times (u, y) is observed.
• Rating Triplet (u, y, r): u is a user index, y is an item index, r is a rating value.
• Rating Vector (r_1^u, r_2^u, …, r_M^u): r_y^u is the rating assigned to item y by user u.
Additional Features:
• In a pure formulation no additional features are used.
• A hybrid formulation incorporates additional content-based item and user features.
Preference Dynamics:
• In a sequential formulation the rating process is modeled as a time series.
• In a non-sequential formulation preferences are assumed to be static.

4 The Pure, Non-Sequential, Rating-Based Formulation
• Additional Features: None
• Preference Dynamics: Non-sequential
• Preference Indicators: Ordinal rating vectors
Formal Description:
• Items: y = 1, …, M
• Users: u = 1, …, N
• Ratings: r = 1, …, V
Tasks:
• The two main tasks under this formulation are recommendation and rating prediction.
• Rating prediction is the task of estimating all unknown ratings for the active user.
• The focus of research is developing highly accurate methods for rating prediction.
Figure 1: Given a rating prediction method, a recommendation method is easily obtained: predict, then sort. (Diagram: the rating database and the active user's ratings feed rating prediction; the predicted ratings are sorted to produce the recommended item list, e.g. 1. Item y2, 2. Item y3.)
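The "predict, then sort" idea in Figure 1 can be sketched in a few lines. This is a minimal illustration, not the author's code: `recommend`, `toy_predictions`, and `active_user_ratings` are hypothetical names, and the predictor is a lookup table standing in for any rating prediction method.

```python
from typing import Callable, Dict, List


def recommend(
    predict: Callable[[int], float],
    items: List[int],
    observed: Dict[int, int],
    top_n: int = 2,
) -> List[int]:
    """Rank the items the active user has not rated by predicted rating."""
    unrated = [y for y in items if y not in observed]
    # Highest predicted rating first; sorted() is stable, so ties keep item order.
    ranked = sorted(unrated, key=predict, reverse=True)
    return ranked[:top_n]


# Hypothetical predicted ratings for items 1..4 (stand-in for a real model).
toy_predictions = {1: 2.0, 2: 4.5, 3: 3.0, 4: 4.0}
active_user_ratings = {4: 5}  # item 4 is already rated, so it is never recommended

top = recommend(toy_predictions.get, [1, 2, 3, 4], active_user_ratings)
print(top)  # [2, 3]
```

The recommender never needs to know how the predictions were produced, which is why the research focus can stay on rating prediction alone.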

5 3. Related Work
Neighborhood Methods:
• Introduced by Resnick et al. (GroupLens) and Shardanand and Maes (Ringo).
• All variants can be seen as modifications of the K-Nearest Neighbor classifier.
Rating Prediction:
1. Compute a similarity measure between the active user and all users in the database.
2. Compute the predicted rating for each item.
Multinomial Mixture Model:
• A simple mixture model with fast, reliable learning by EM, and low prediction time.
• Simple but correct generative semantics. Each profile is generated by 1 of K types.
Learning (E-Step, M-Step) and Rating Prediction: [equations not preserved in the transcript]
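The two neighborhood prediction steps listed on this slide can be sketched as follows. The slide does not say which similarity measure or weighting is used, so this sketch assumes Pearson correlation over co-rated items and a mean-offset weighted average, a common GroupLens-style choice. All function names and the toy data are hypothetical.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[int, float]  # item index -> rating value


def mean_rating(profile: Ratings) -> float:
    return sum(profile.values()) / len(profile)


def pearson(a: Ratings, b: Ratings) -> float:
    """Step 1: Pearson correlation over the items both users rated."""
    common = [y for y in a if y in b]
    if len(common) < 2:
        return 0.0
    ma = sum(a[y] for y in common) / len(common)
    mb = sum(b[y] for y in common) / len(common)
    cov = sum((a[y] - ma) * (b[y] - mb) for y in common)
    va = sum((a[y] - ma) ** 2 for y in common)
    vb = sum((b[y] - mb) ** 2 for y in common)
    if va == 0 or vb == 0:
        return 0.0
    return cov / sqrt(va * vb)


def predict_rating(
    active: Ratings, database: Dict[int, Ratings], item: int
) -> Optional[float]:
    """Step 2: active user's mean plus similarity-weighted neighbor deviations."""
    num = den = 0.0
    for other in database.values():
        if item not in other:
            continue
        w = pearson(active, other)
        num += w * (other[item] - mean_rating(other))
        den += abs(w)
    if den == 0:
        return None  # no informative neighbor rated this item
    return mean_rating(active) + num / den


# Toy database: user 10 agrees with the active user, user 11 disagrees.
database = {10: {1: 5, 2: 1, 3: 5}, 11: {1: 1, 2: 5, 3: 1}}
active_user = {1: 4, 2: 2}
prediction = predict_rating(active_user, database, 3)
print(prediction)
```

Here the disagreeing neighbor still contributes: a negative correlation flips its below-average rating into evidence for an above-average prediction.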

6 The Aspect Model and Latent Dirichlet Allocation
The Aspect Model:
• Many versions proposed by Hofmann. Of main interest are the dyadic and triadic versions, and a new vector version proposed by Marlin.
• All have incomplete generative semantics.
Learning (Vector), E-Step and M-Step: [equations not preserved in the transcript]
Rating Prediction (Vector): [equation not preserved in the transcript]
Latent Dirichlet Allocation:
• Proposed by Blei et al. for text modeling.
• Can be used in a co-occurrence based CF formulation. Cannot model ratings.
• A correct generative version of the dyadic aspect model. The user's distribution over types is a random variable with a Dirichlet prior.
Learning:
• Model learned using variational EM or Minka's Expectation Propagation.
• Exact inference is not possible.
Prediction:
• Needs approximate inference. Variational methods result in an iterative algorithm.

7 Graphical Models
Figure 2: Dyadic Aspect Model
• Variable U: User index
• Variable Z: Attitude index
• Variable Y: Item index
• Parameter: P(Z|U=u)
• Parameter: P(Y|Z=z)
Figure 3: Triadic Aspect Model (co-occurrence to ratings)
• Variable U: User index
• Variable Z: Attitude index
• Variable Y: Item index
• Variable R: Rating value
• Parameter: P(Z|U=u)
• Parameter: P(R|Z=z,Y=y)
Figure 4: Vector Aspect Model (ratings to rating profiles)
• Variable U: User index
• Variable Zy: Attitude index
• Variable Ry: Rating value
• Variable Y: Item index
• Parameter: P(Z|U=u)
• Parameter: P(R|Z=z,Y=y)

8 Co-occurrence to Rating Profile
Figure 5: LDA Model
• Variable θ: P(Z|U=u)
• Variable Z: Attitude index
• Variable Y: Item index
• Parameter α: Dirichlet prior
• Parameter β: P(Y|Z=z)
Figure 6: URP Model (co-occurrence to rating profile)
• Variable θ: P(Z|U=u)
• Variable Zy: Attitude index
• Variable Ry: Rating value
• Variable Y: Item index
• Parameter α: Dirichlet prior
• Parameter β: P(Ry|Z=z)

9 4. The URP Model
Model Specification: [equation not preserved in the transcript]
Description:
• The latent space description of a user is a Dirichlet random variable θ that encodes a multinomial distribution over user types.
• Each setting of the multinomial variables Zy is an index into K user types or user attitudes.
• Each user attitude is represented by a multinomial distribution over ratings for each item, encoded by β.
• The multinomial variables Ry give the ratings for each item y. Possible values are from 1 to V.
Generative Process:
• Unlike a simple mixture model, each user has a unique distribution θ over user types.
• Unlike the aspect model family, there are proper generative semantics on θ.
• Unlike LDA, URP generates a set of complete user rating profiles.
1. For each user u = 1 to N:
2.   Sample θ ~ Dirichlet(α)
3.   For each item y = 1 to M:
4.     Sample z ~ Multinomial(θ)
5.     Sample r ~ Multinomial(β_yz)
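The generative process on this slide can be simulated directly. The following is a minimal sketch using only the Python standard library. The names `alpha`, `beta`, and `theta` are assumed labels for the Dirichlet prior, the per-item rating distributions of each attitude, and the user's distribution over attitudes. The toy parameters are invented for illustration.

```python
import random
from typing import List


def sample_dirichlet(alpha: List[float], rng: random.Random) -> List[float]:
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_profile(
    alpha: List[float], beta: List[List[List[float]]], rng: random.Random
) -> List[int]:
    """Generate one complete rating profile.

    beta[y][z] is a list of V probabilities for ratings 1..V
    given that attitude z was chosen for item y.
    """
    num_attitudes = len(alpha)
    theta = sample_dirichlet(alpha, rng)  # user's distribution over attitudes
    profile = []
    for beta_y in beta:  # one entry per item
        z = rng.choices(range(num_attitudes), weights=theta)[0]
        num_values = len(beta_y[z])
        r = rng.choices(range(1, num_values + 1), weights=beta_y[z])[0]
        profile.append(r)
    return profile


rng = random.Random(0)
alpha = [1.0, 1.0]  # K = 2 attitudes, symmetric prior
# M = 3 items, V = 5: attitude 0 always rates 5, attitude 1 always rates 1.
beta = [[[0, 0, 0, 0, 1], [1, 0, 0, 0, 0]] for _ in range(3)]
profile = sample_profile(alpha, beta, rng)
print(profile)
```

Because an attitude is drawn separately for every item, a single user can mix attitudes across items, which a one-type-per-user mixture model cannot express.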

10 Learning
Variational Approximation:
• Exact inference is intractable with URP. We define a fully factorized approximate q-distribution with variational multinomial parameters φ_u and variational Dirichlet parameters γ_u.
Variational Inference: [equations not preserved in the transcript]
Parameter Estimation: [equations not preserved in the transcript]
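For orientation, a fully factorized q-distribution of the kind described on this slide generally takes the form below. This is a sketch in notation borrowed from latent Dirichlet allocation, not a reconstruction of the slide's lost equations: the symbols \gamma_u (Dirichlet parameters), \phi_{uy} (multinomial parameters for item y), \theta, z, \alpha, and \beta are assumed labels, and r^u denotes the rating profile of user u.

```latex
% Fully factorized variational distribution for user u (assumed notation)
q(\theta, z \mid \gamma_u, \phi_u)
  = q(\theta \mid \gamma_u) \prod_{y=1}^{M} q(z_y \mid \phi_{uy})

% Standard variational lower bound on the log-likelihood of the profile r^u
\log P(r^u \mid \alpha, \beta)
  \ge \mathbb{E}_q\!\left[\log P(\theta, z, r^u \mid \alpha, \beta)\right]
    - \mathbb{E}_q\!\left[\log q(\theta, z \mid \gamma_u, \phi_u)\right]
```

Variational inference maximizes this bound over \gamma_u and \phi_u for each user, and parameter estimation maximizes it over \alpha and \beta, which is the usual alternation in variational EM.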

11 Rating Prediction
• Once rating distributions are estimated, any number of prediction techniques can be used. The prediction technique should match the error measure used.

5. Experimentation
Weak Generalization Experiment:
• Available ratings for each user split into observed and unobserved sets. Trained on the observed ratings, tested on the unobserved ratings.
• Repeated on 3 random splits of data.
Strong Generalization Experiment:
• Users split into training set and testing set. Ratings for test users split into observed and unobserved sets. Trained on training users, tested on test users.
• Repeated on 3 random splits of data.
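On the point that the prediction technique should match the error measure: one standard pairing, which the slide does not spell out, is to predict the median of the estimated rating distribution when the measure is absolute error, and the mean when it is squared error. The sketch below illustrates this with an invented rating distribution. All names are hypothetical.

```python
from typing import List


def predict_median(dist: List[float]) -> int:
    """Smallest rating whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for rating, p in enumerate(dist, start=1):
        cumulative += p
        if cumulative >= 0.5:
            return rating
    return len(dist)


def predict_mean(dist: List[float]) -> float:
    """Expected rating under the distribution."""
    return sum(r * p for r, p in enumerate(dist, start=1))


def expected_abs_error(dist: List[float], guess: float) -> float:
    """Expected absolute error of a fixed guess under the distribution."""
    return sum(p * abs(r - guess) for r, p in enumerate(dist, start=1))


# Invented predictive distribution over ratings 1..5 for one user-item pair.
dist = [0.1, 0.1, 0.1, 0.3, 0.4]
median_guess = predict_median(dist)
mean_guess = predict_mean(dist)

print(median_guess, expected_abs_error(dist, median_guess))
print(mean_guess, expected_abs_error(dist, mean_guess))
```

For this distribution the median guess has a lower expected absolute error than the mean guess, so a model evaluated by mean absolute error would be penalized for reporting the mean.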

12 Error Measure and Data Sets
Error Measure: Normalized Mean Absolute Error
• Average over all users of the absolute difference between predicted and actual ratings.
• Normalized by the expectation of the difference between predicted and actual ratings under the empirical rating distribution of the base data set.
Data Sets:
EachMovie (Compaq Systems Research Center):
• Users:
• Items:
• Rating Values: 6
• Ratings: 2,811,983
• Sparsity: 97.6%
• Filtering: 20 ratings
MovieLens (GroupLens Research Center):
• Users: 6040
• Items:
• Rating Values: 5
• Ratings: 1,000,209
• Sparsity: 95.7%
• Filtering: 20 ratings
Figure 7: Distribution of ratings in weak and strong filtered data sets compared to base data sets.
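The error measure described on this slide can be sketched as below. Two readings are assumed and are not confirmed by the slide: the absolute error is averaged per user and then across users, and the normalizer treats predicted and actual ratings as independent draws from the empirical rating distribution. The function names and toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def expected_abs_diff(base_ratings: List[int]) -> float:
    """E|a - b| for a, b drawn independently from the empirical distribution."""
    counts = Counter(base_ratings)
    n = len(base_ratings)
    return sum(
        (ca / n) * (cb / n) * abs(a - b)
        for a, ca in counts.items()
        for b, cb in counts.items()
    )


def nmae(
    test_pairs: Dict[int, List[Tuple[int, int]]], base_ratings: List[int]
) -> float:
    """Mean absolute error averaged over users, divided by the normalizer.

    test_pairs maps a user index to (predicted, actual) rating pairs.
    """
    per_user = [
        sum(abs(pred - actual) for pred, actual in pairs) / len(pairs)
        for pairs in test_pairs.values()
    ]
    mae = sum(per_user) / len(per_user)
    return mae / expected_abs_diff(base_ratings)


# Toy data: a base set with only ratings 1 and 5, and two test users.
base = [1, 5]
pairs = {0: [(4, 5), (2, 1)], 1: [(3, 5)]}
score = nmae(pairs, base)
print(score)
```

Under this normalization a score below 1 means the method beats guessing ratings at random from the empirical distribution, which makes results comparable across data sets with different rating scales.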

13 6. Results
Figure 8: MovieLens Weak Generalization Results
Figure 9: MovieLens Strong Generalization Results
• URP and the aspect model attain the same minimum weak generalization error rate, but URP does so using far fewer model parameters.

14 Results (continued)
Figure 10: EachMovie Weak Generalization Results
Figure 11: EachMovie Strong Generalization Results
• On the more difficult EachMovie data set, URP clearly performs better than the other rating prediction methods considered.

15 7. Conclusions and Future Work
• We have introduced URP, a new generative model designed specifically for pure, non-sequential, ratings-based collaborative filtering. URP has consistent generative semantics at both the user level and the rating profile level.
• Empirical results show that URP outperforms other popular rating prediction methods while using fewer model parameters.
Future Work:
• Models with more intuitive generative semantics. A promising family of product models is currently under study.
• Models that integrate additional features, sequential dynamics, or both.

16 8. References
1. D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3, January 2003.
2. John S. Breese, David Heckerman, and Carl Kadie. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, pages 43-52, July 1998.
3. Thomas Hofmann. Learning What People (Don't) Want. In Proceedings of the European Conference on Machine Learning (ECML), 2001.
5. Thomas Minka and John Lafferty. Expectation-Propagation for the Generative Aspect Model. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, 2002.
6. R. M. Neal and G. E. Hinton. A new view of the EM algorithm that justifies incremental, sparse and other variants. In M. I. Jordan, editor, Learning in Graphical Models. Kluwer Academic Publishers, 1998.
7. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work, pages 175-186, Chapel Hill, North Carolina. ACM.
8. Upendra Shardanand and Pattie Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of ACM CHI'95, volume 1, 1995.

