Modeling User Rating Profiles For Collaborative Filtering


Modeling User Rating Profiles For Collaborative Filtering
Benjamin M. Marlin (marlin@cs.toronto.edu)
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada

1. Abstract

• We present a new latent variable model for rating-based collaborative filtering called the User Rating Profile model (URP). URP has complete generative semantics at both the user and rating profile levels.
• URP is related to several models, including the multinomial mixture model, the aspect model, and latent Dirichlet allocation, but has advantages over each.
• A variational expectation maximization (EM) procedure is used to fit the URP model. Rating prediction makes use of a well-defined variational inference procedure.
• Empirical results on two rating prediction tasks using the EachMovie and MovieLens data sets show that URP attains lower error rates than the multinomial mixture model, the aspect model, and neighborhood-based techniques.

2. Collaborative Filtering Formulations

Preference Indicators:
• Co-occurrence pair (u, y): u is a user index and y is an item index.
• Count vector (n1u, n2u, …, nMu): nyu is the number of times (u, y) is observed.
• Rating triplet (u, y, r): u is a user index, y is an item index, r is a rating value.
• Rating vector (r1u, r2u, …, rMu): ryu is the rating assigned to item y by user u.

Additional Features:
• In a pure formulation no additional features are used.
• A hybrid formulation incorporates additional content-based item and user features.

Preference Dynamics:
• In a sequential formulation the rating process is modeled as a time series.
• In a non-sequential formulation preferences are assumed to be static.
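The preference-indicator representations above are easy to illustrate with a small sketch; the triplet data here is invented purely for illustration:

```python
import numpy as np

# Hypothetical toy data: rating triplets (u, y, r) for N=2 users, M=4 items.
triplets = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (1, 2, 2)]
N, M = 2, 4

# Rating vectors: entry (u, y) holds the rating user u gave item y (0 = unrated).
ratings = np.zeros((N, M), dtype=int)
for u, y, r in triplets:
    ratings[u, y] = r

# Count vectors: entry (u, y) holds how many times the pair (u, y) was observed.
counts = np.zeros((N, M), dtype=int)
for u, y, _ in triplets:
    counts[u, y] += 1

print(ratings)  # row u is user u's rating vector
print(counts)   # row u is user u's count vector
```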

The Pure, Non-Sequential, Rating-Based Formulation

Formal Description:
• Users: u = 1, …, N
• Items: y = 1, …, M
• Ratings: r = 1, …, V
• Preference indicators: ordinal rating vectors
• Additional features: none
• Preference dynamics: non-sequential

Tasks: The two main tasks under this formulation are recommendation and rating prediction. Rating prediction is the task of estimating all unknown ratings for the active user. The focus of research is developing highly accurate methods for rating prediction.

Figure 1: Given a rating prediction method, a recommendation method is easily obtained: predict, then sort. The active user's ratings and the rating database feed a rating predictor; sorting the predicted ratings yields a ranked item list.
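The predict-then-sort recipe of Figure 1 can be sketched as follows (the function name and data layout are hypothetical, not from the poster):

```python
def recommend(predicted, already_rated, top_n=10):
    """Turn a rating predictor's output into a ranked recommendation list.

    predicted: dict mapping item index -> predicted rating for the active user.
    already_rated: set of items the active user has already rated.
    """
    candidates = [(y, r) for y, r in predicted.items() if y not in already_rated]
    # Sort by predicted rating, highest first, and keep the top N items.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [y for y, _ in candidates[:top_n]]

# Item 0 is excluded as already rated; item 3 outranks item 1.
print(recommend({0: 4.5, 1: 3.2, 3: 4.9}, {0}, top_n=2))  # [3, 1]
```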

3. Related Work

Neighborhood Methods:
• Introduced by Resnick et al. (GroupLens) and Shardanand and Maes (Ringo).
• All variants can be seen as modifications of the K-nearest neighbor classifier.
• Rating prediction: (1) compute a similarity measure between the active user and all users in the database; (2) compute a predicted rating for each item as a similarity-weighted combination of the other users' ratings.

Multinomial Mixture Model:
• A simple mixture model with fast, reliable learning by EM and low prediction time.
• Simple but correct generative semantics: each rating profile is generated by one of K user types.
• Learning by EM alternates an E-step (computing the posterior over the K types for each user) with an M-step (re-estimating the type proportions and the per-type rating multinomials); rating prediction averages over the posterior on types.
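The neighborhood recipe can be sketched concretely. This is a minimal sketch using Pearson correlation over co-rated items and mean-offset weighting, one common choice in the GroupLens line of work; the exact variants considered in the poster may differ:

```python
import numpy as np

def predict_rating(R, active, item):
    """Neighborhood-style prediction: similarity-weighted deviation from user means.

    R: (N, M) rating matrix with 0 marking unrated entries.
    Returns the predicted rating of `item` for user `active`.
    """
    mask = R > 0
    means = np.array([R[u][mask[u]].mean() for u in range(R.shape[0])])
    num, den = 0.0, 0.0
    for u in range(R.shape[0]):
        if u == active or not mask[u, item]:
            continue
        shared = mask[active] & mask[u]          # items both users rated
        if shared.sum() < 2:
            continue
        a = R[active, shared] - means[active]
        b = R[u, shared] - means[u]
        norm = np.linalg.norm(a) * np.linalg.norm(b)
        if norm == 0:
            continue
        sim = float(a @ b) / norm                # Pearson correlation
        num += sim * (R[u, item] - means[u])
        den += abs(sim)
    return means[active] if den == 0 else means[active] + num / den
```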

The Aspect Model:
• Many versions proposed by Hofmann; of main interest are the dyadic, triadic, and new vector versions (the vector version proposed by Marlin).
• All have incomplete generative semantics: the user-specific mixing proportions are parameters, not generated by the model.
• Learning (vector version): EM, alternating an E-step over the latent attitudes with an M-step over the user-specific mixing proportions and per-attitude rating multinomials.
• Rating prediction (vector version): compute the expected rating under the posterior over attitudes for the active user.

Latent Dirichlet Allocation:
• Proposed by Blei et al. for text modeling.
• Can be used in a co-occurrence based CF formulation, but cannot model ratings.
• A correct generative version of the dyadic aspect model: the user's distribution over types is a random variable with a Dirichlet prior.
• Learning: the model is learned using variational EM or Minka's expectation propagation; exact inference is not possible.
• Prediction: needs approximate inference; variational methods result in an iterative algorithm.

Graphical Models (dyadic → triadic: co-occurrence to ratings; triadic → vector: ratings to rating profiles):

Figure 2: Dyadic Aspect Model
• Variable U: user index. Variable Z: attitude index. Variable Y: item index.
• Parameters: P(Z|U=u) and P(Y|Z=z).

Figure 3: Triadic Aspect Model
• Variable U: user index. Variable Z: attitude index. Variable Y: item index. Variable R: rating value.
• Parameters: P(Z|U=u) and P(R|Z=z,Y=y).

Figure 4: Vector Aspect Model
• Variable U: user index. Variable Zy: attitude index. Variable Ry: rating value. Variable Y: item index.
• Parameters: P(Z|U=u) and P(R|Z=z,Y=y).

Co-occurrence to Rating Profile (LDA → URP):

Figure 5: LDA Model
• Variable θ: P(Z|U=u). Variable Z: attitude index. Variable Y: item index.
• Parameter α: Dirichlet prior. Parameter β: P(Y|Z=z).

Figure 6: URP Model
• Variable θ: P(Z|U=u). Variable Zy: attitude index. Variable Ry: rating value. Variable Y: item index.
• Parameter α: Dirichlet prior. Parameter β: P(Ry|Z=z).

4. The URP Model

Model Specification:
• The latent space description of a user is a Dirichlet random variable θ that encodes a multinomial distribution over user types.
• Each setting of the multinomial variables Zy is an index into one of K user types or user attitudes.
• Each user attitude is represented by a multinomial distribution over ratings for each item, encoded by β.
• The multinomial variables Ry give the ratings for each item y; possible values range from 1 to V.

Generative Process:
1. For each user u = 1 to N:
2.   Sample θ ~ Dirichlet(α)
3.   For each item y = 1 to M:
4.     Sample z ~ Multinomial(θ)
5.     Sample r ~ Multinomial(βyz)

Description:
• Unlike a simple mixture model, each user has a unique distribution over user types θ.
• Unlike the aspect model family, there are proper generative semantics on θ.
• Unlike LDA, URP generates a set of complete user rating profiles.
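The generative process above can be sketched as ancestral sampling. The α and β values here are arbitrary illustrations, not fitted parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, K, V = 3, 4, 2, 5          # users, items, attitudes, rating values
alpha = np.ones(K)               # symmetric Dirichlet prior (illustrative)
# beta[y, k] is a multinomial over the V rating values for item y, attitude k.
beta = rng.dirichlet(np.ones(V), size=(M, K))

profiles = np.zeros((N, M), dtype=int)
for u in range(N):
    theta = rng.dirichlet(alpha)             # user's distribution over attitudes
    for y in range(M):
        z = rng.choice(K, p=theta)           # sample an attitude for this item
        r = rng.choice(V, p=beta[y, z]) + 1  # sample a rating in 1..V
        profiles[u, y] = r

print(profiles)  # complete rating profiles, one row per user
```

Note that, as the slide says, a fresh θ is drawn per user while β is shared, so URP emits complete per-user rating profiles rather than isolated co-occurrence events.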

Learning

Variational Approximation:
• Exact inference is intractable with URP. We define a fully factorized approximate q-distribution with variational multinomial parameters φu and variational Dirichlet parameters γu.

Parameter Estimation:
• Learning alternates variational inference (updating φu and γu for each user) with parameter estimation (solving for the model parameters α and β).
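The coordinate-ascent updates are analogous to those for LDA; the following is a sketch in standard mean-field notation (the exact URP updates, including the α estimate, are in the paper and are not reproduced by the transcript):

```latex
% Fully factorized q-distribution for one user u:
q(\theta, z \mid \gamma^u, \phi^u)
  = q(\theta \mid \gamma^u) \prod_{y} q(z_y \mid \phi^u_y)

% Variational inference (E-step-like) updates, with \Psi the digamma function:
\phi^u_{yk} \;\propto\; \beta_{r_y y k}\,
  \exp\!\Big( \Psi(\gamma^u_k) - \Psi\big(\textstyle\sum_{k'} \gamma^u_{k'}\big) \Big)

\gamma^u_k \;=\; \alpha_k + \sum_{y} \phi^u_{yk}
```

Here β_{r_y y k} denotes P(Ry = r_y | Z = k) for item y, matching the rating multinomials in the model specification.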

5. Experimentation

Rating Prediction:
• Once rating distributions are estimated, any number of prediction techniques can be used. The prediction technique should match the error measure used.

Weak Generalization Experiment:
• Available ratings for each user are split into observed and unobserved sets. The model is trained on the observed ratings and tested on the unobserved ratings.
• Repeated on 3 random splits of the data.

Strong Generalization Experiment:
• Users are split into a training set and a testing set. Ratings for test users are split into observed and unobserved sets. The model is trained on the training users and tested on the test users.
• Repeated on 3 random splits of the data.
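The two protocols can be sketched as follows; the split sizes and function names are illustrative (the poster does not restate its exact proportions):

```python
import random

def weak_split(user_ratings, held_out=1, seed=0):
    """Weak generalization: hold out ratings within each user's profile.

    user_ratings: dict user -> list of (item, rating) pairs.
    Returns (observed, unobserved) dicts covering the same users.
    """
    rng = random.Random(seed)
    observed, unobserved = {}, {}
    for u, pairs in user_ratings.items():
        pairs = pairs[:]
        rng.shuffle(pairs)
        unobserved[u] = pairs[:held_out]
        observed[u] = pairs[held_out:]
    return observed, unobserved

def strong_split(user_ratings, test_fraction=0.5, seed=0):
    """Strong generalization: hold out entire users, then split their ratings."""
    rng = random.Random(seed)
    users = sorted(user_ratings)
    rng.shuffle(users)
    n_test = int(len(users) * test_fraction)
    test_users = set(users[:n_test])
    train = {u: user_ratings[u] for u in users if u not in test_users}
    test_obs, test_unobs = weak_split(
        {u: user_ratings[u] for u in test_users}, seed=seed)
    return train, test_obs, test_unobs
```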

Error Measure and Data Sets

Normalized Mean Absolute Error (NMAE):
• The average over all users of the absolute difference between predicted and actual ratings.
• Normalized by the expectation of the absolute difference between predicted and actual ratings under the empirical rating distribution of the base data set.

EachMovie (Compaq Systems Research Center):
• Users: 72,916 • Items: 1,628 • Rating values: 6 • Ratings: 2,811,983 • Sparsity: 97.6% • Filtering: 20 ratings

MovieLens (GroupLens Research Center):
• Users: 6,040 • Items: 3,900 • Rating values: 5 • Ratings: 1,000,209 • Sparsity: 95.7% • Filtering: 20 ratings

Figure 7: Distribution of ratings in the weak and strong filtered data sets compared to the base data sets.
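A sketch of the NMAE computation, under one reading of the normalizer described above: predicted and actual ratings are each treated as independent draws from the empirical rating distribution. The toy numbers in the test are invented:

```python
import numpy as np

def nmae(predicted, actual, rating_probs):
    """Normalized mean absolute error.

    predicted, actual: rating arrays on the scale 1..V.
    rating_probs: empirical distribution over the V rating values, used to
    compute E|r - r'| for independent draws r, r' as the normalizer.
    """
    mae = np.mean(np.abs(np.asarray(predicted) - np.asarray(actual)))
    V = len(rating_probs)
    vals = np.arange(1, V + 1)
    # Expected absolute difference between two independent empirical draws.
    expected = sum(rating_probs[i] * rating_probs[j] * abs(vals[i] - vals[j])
                   for i in range(V) for j in range(V))
    return mae / expected
```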

6. Results

Figure 8: MovieLens Weak Generalization Results (normalized mean absolute error).
Figure 9: MovieLens Strong Generalization Results (normalized mean absolute error).

• URP and the aspect model attain the same minimum weak generalization error rate, but URP does so using far fewer model parameters.

Figure 10: EachMovie Weak Generalization Results (normalized mean absolute error).
Figure 11: EachMovie Strong Generalization Results (normalized mean absolute error).

• On the more difficult EachMovie data set, URP clearly performs better than the other rating prediction methods considered.

7. Conclusions and Future Work

• We have introduced URP, a new generative model specially designed for pure, non-sequential, ratings-based collaborative filtering. URP has consistent generative semantics at both the user level and the rating profile level.
• Empirical results show that URP outperforms other popular rating prediction methods while using fewer model parameters.

Future Work:
• Models with more intuitive generative semantics; currently under study is a promising family of product models.
• Models that integrate additional features, sequential dynamics, or both.

8. References

1. D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003.
2. J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, pages 43-52, July 1998.
3. T. Hofmann. Learning what people (don't) want. In Proceedings of the European Conference on Machine Learning (ECML), 2001.
5. T. Minka and J. Lafferty. Expectation-propagation for the generative aspect model. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, 2002.
6. R. M. Neal and G. E. Hinton. A new view of the EM algorithm that justifies incremental, sparse and other variants. In M. I. Jordan, editor, Learning in Graphical Models, pages 355-368. Kluwer Academic Publishers, 1998.
7. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: An open architecture for collaborative filtering of netnews. In Proceedings of the ACM 1994 Conference on Computer Supported Cooperative Work, pages 175-186, Chapel Hill, North Carolina, 1994. ACM.
8. U. Shardanand and P. Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of ACM CHI'95, volume 1, pages 210-217, 1995.