OrdRec: An Ordinal Model for Predicting Personalized Item Rating Distributions
Yehuda Koren, Joe Sill
RecSys'11 Best Paper Award
Outline
- Motivations
- The OrdRec Model
- A Multinomial Factor Model
- Experiments
Motivations
- Numerical vs. ordinal ratings
Motivations
- Ratings express a comparative ranking of products
- They have no direct interpretation as numerical values
- Numerical scores may not reflect user intent well
- User bias
Motivations
- The OrdRec model: motivated by the discussion above and inspired by the ordinal logistic regression model of McCullagh
- Able to output a full probability distribution over the scores
- Able to associate predictions with confidence levels
The OrdRec Model
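The core idea can be sketched as a cumulative-link ("ordered logit") model in the spirit of McCullagh: an internal user/item score is mapped to a full rating distribution through per-user thresholds. The score and threshold values below are hypothetical stand-ins, not learned parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rating_distribution(score, thresholds):
    """Turn an internal score into a full rating distribution.

    score      -- user/item affinity (a hypothetical value here)
    thresholds -- increasing per-user cut points t_1 < ... < t_{S-1}

    Cumulative model: P(r <= s) = sigmoid(t_s - score); the probability
    of each individual score is a difference of consecutive cumulatives.
    """
    cum = sigmoid(np.asarray(thresholds, dtype=float) - score)
    cum = np.concatenate(([0.0], cum, [1.0]))  # P(r <= 0) = 0, P(r <= S) = 1
    return np.diff(cum)                        # P(r = s) for s = 1..S

# hypothetical 5-star example: score sits between the 2nd and 3rd thresholds
dist = rating_distribution(score=2.0, thresholds=[0.5, 1.5, 2.5, 3.5])
```

Because the thresholds are monotonically increasing, the cumulative probabilities increase too, so the differences are a valid probability vector that sums to one.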
Ranking items for a user
- OrdRec predicts a full probability distribution over ratings: a much richer output
- Items must be ranked given their predicted rating distributions
- Ranking by a single statistic such as the mean is no longer adequate
- Instead, cast the problem as a learning-to-rank task
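With full distributions rather than scalar scores, two items can be compared by the probability that one draws a higher rating than the other. A small illustration, assuming the two predicted distributions are independent (the item names and distributions are hypothetical):

```python
import numpy as np

def prob_higher(p_a, p_b):
    """P(rating of A > rating of B), treating the two predicted
    distributions as independent: sum over score pairs (s, t) with s > t."""
    joint = np.outer(np.asarray(p_a), np.asarray(p_b))  # joint[s, t] = p_a[s] * p_b[t]
    return float(np.tril(joint, k=-1).sum())            # strictly lower triangle: s > t

# hypothetical 5-star distributions for two items
item_a = [0.05, 0.10, 0.20, 0.40, 0.25]
item_b = [0.30, 0.30, 0.20, 0.15, 0.05]
p = prob_higher(item_a, item_b)  # > 0.5 means A is preferred over B
```

Such pairwise preference probabilities are one natural input to a learning-to-rank formulation.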
A Multinomial Factor Model (MultiMF)
- A multinomial distribution over the categorical scores
- A baseline model constructed for comparison with OrdRec
- Score-dependent item factor vectors: each score r has its own item factor vector
- As in OrdRec, the log-likelihood of the training data is maximized
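A minimal sketch of such a multinomial model, assuming a softmax over per-score user/item affinities (the dimensions and random parameters below are illustrative, not trained):

```python
import numpy as np

rng = np.random.default_rng(0)
S, d = 5, 8                    # 5 score levels, factor dimension (illustrative)
p_u = rng.normal(size=d)       # user factor vector
Q_i = rng.normal(size=(S, d))  # one item factor vector per score level

logits = Q_i @ p_u             # affinity of the user to each score of the item
probs = np.exp(logits - logits.max())
probs /= probs.sum()           # softmax: a multinomial over the S scores
```

Subtracting the maximum logit before exponentiating is the standard numerically stable way to compute a softmax.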
Experiments
- Data sets: Netflix and two Yahoo! Music data sets
Evaluation Metrics
- RMSE
- FCP (Fraction of Concordant Pairs)
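FCP can be sketched as the share of concordant pairs among all item pairs the user rated differently; a minimal reference implementation:

```python
from itertools import combinations

def fcp(true_ratings, predictions):
    """Fraction of Concordant Pairs: among pairs of items a user rated
    differently, the share whose predicted order matches the true order."""
    concordant = discordant = 0
    for (t1, p1), (t2, p2) in combinations(zip(true_ratings, predictions), 2):
        if t1 == t2:
            continue  # pairs tied in the true ratings are skipped
        if (t1 - t2) * (p1 - p2) > 0:
            concordant += 1
        else:
            discordant += 1
    return concordant / (concordant + discordant)
```

A perfectly ordered prediction gives FCP = 1.0 and a fully reversed one 0.0, independent of the rating scale, which is why FCP is less sensitive to the number of rating levels than RMSE.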
Results
Result Analysis
- OrdRec leads on Netflix for both RMSE and FCP: it better models the ordinal semantics of user ratings
- SVD++ performs best in terms of RMSE: it is the only method trained to minimize RMSE
- RMSE values on Y!Music are much higher than on Netflix, while FCP values change little: RMSE is more sensitive to the rating scale than FCP (Y!Music uses a 10-level scale, Netflix a 5-level scale)
Result Analysis
- OrdRec consistently outperforms the rest in terms of FCP
- This indicates that it ranks items better for a user, reflecting the benefit of better modeling the semantics of user feedback
- Training time comparison
Recommendation Confidence Estimation
- Confidence estimation is formulated as a binary classification problem: predict whether the model's predicted rating is within one rating level of the true rating
- Predicted value: the expected value of the predicted rating distribution
- Logistic regression is used as the classifier
- A random 2/3 of the test data serves as the training set, the rest as the test set
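A sketch of this pipeline using plain gradient-descent logistic regression; the features and labels below are synthetic stand-ins (in practice the features would be derived from the predicted distribution, e.g. its expected value):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic stand-ins: two features per prediction
X = rng.normal(size=(300, 2))
# binary label: 1 if the prediction fell within one level of the true rating
# (generated here from a noisy linear rule purely for illustration)
y = (X @ np.array([1.0, -2.0]) + rng.normal(scale=0.5, size=300) > 0).astype(float)

# split: 2/3 for training the classifier, the rest held out for testing
X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]

# logistic regression fitted by gradient descent (no library dependency)
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w)))
    w -= 0.1 * X_tr.T @ (p - y_tr) / len(y_tr)

accuracy = ((1.0 / (1.0 + np.exp(-(X_te @ w))) > 0.5) == (y_te == 1.0)).mean()
```

The classifier's predicted probability can then serve directly as a per-recommendation confidence score.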
Result
Conclusions
- Treating user feedback as ordinal relaxes the numerical view
- Can handle all the usual feedback types, such as thumbs-up/down, like votes, stars, numerical scores, or A-F grades
- Makes no numerical assumptions about the categorical feedback
- Applies even when the feedback is genuinely numerical: users can express distinct internal scales through their qualitative ratings
Conclusions
- OrdRec takes a point-wise approach to ordinal modeling: training time grows linearly with data set size
- OrdRec outputs a full probability distribution over scores: richer expressive power, helpful for estimating confidence levels, and going beyond only the average or most likely rating
- Possible impact on system design: certain parts of the system can be made more conservative (e.g., avoiding items with a high probability of receiving the lowest rating)
Thank you Q&A