Performance of Recommender Algorithms on Top-N Recommendation Tasks (RecSys 2010) – Intelligent Database Systems Lab, School of Computer Science & Engineering

Similar presentations
Evaluation Rong Jin. Evaluation is key to building effective and efficient search engines; usually carried out in controlled experiments.

Collaborative QoS Prediction in Cloud Computing Department of Computer Science & Engineering The Chinese University of Hong Kong Hong Kong, China Rocky.
Evaluating Recommender Systems  A myriad of techniques has been proposed, but –Which one is the best in a given application domain? –What.
Efficient Retrieval of Recommendations in a Matrix Factorization Framework Noam KoenigsteinParikshit RamYuval Shavitt School of Electrical Engineering Tel.
Jeff Howbert Introduction to Machine Learning Winter Collaborative Filtering Nearest Neighbor Approach.
1 RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation Xi Chen, Xudong Liu, Zicheng Huang, and Hailong.
Yehuda Koren , Joe Sill Recsys’11 best paper award
Active Learning and Collaborative Filtering
The Wisdom of the Few A Collaborative Filtering Approach Based on Expert Opinions from the Web Xavier Amatriain Telefonica Research Nuria Oliver Telefonica.
Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems (Sarwar et al.) Presented by Sameer Saproo.
Memory-Based Recommender Systems : A Comparative Study Aaron John Mani Srinivasan Ramani CSCI 572 PROJECT RECOMPARATOR.
Evaluating Hypotheses
Retrieval Evaluation: Precision and Recall. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity.
Retrieval Evaluation. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
Analysis of Recommendation Algorithms for E-Commerce Badrul M. Sarwar, George Karypis*, Joseph A. Konstan, and John T. Riedl GroupLens Research/*Army HPCRC.
Intelligible Models for Classification and Regression
Item-based Collaborative Filtering Recommendation Algorithms
Performance of Recommender Algorithms on Top-N Recommendation Tasks
A NON-IID FRAMEWORK FOR COLLABORATIVE FILTERING WITH RESTRICTED BOLTZMANN MACHINES Kostadin Georgiev, VMware Bulgaria Preslav Nakov, Qatar Computing Research.
Wancai Zhang, Hailong Sun, Xudong Liu, Xiaohui Guo.
A Hybrid Recommender System: User Profiling from Keywords and Ratings Ana Stanescu, Swapnil Nagar, Doina Caragea 2013 IEEE/WIC/ACM International Conferences.
1 Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing Seung-Taek Park and David M. Pennock (ACM SIGKDD 2007)
Training and Testing of Recommender Systems on Data Missing Not at Random Harald Steck at KDD, July 2010 Bell Labs, Murray Hill.
Evaluation Methods and Challenges. 2 Deepak Agarwal & Bee-Chung ICML’11 Evaluation Methods Ideal method –Experimental Design: Run side-by-side.
Classical Music for Rock Fans?: Novel Recommendations for Expanding User Interests Makoto Nakatsuji, Yasuhiro Fujiwara, Akimichi Tanaka, Toshio Uchiyama,
A Regression Approach to Music Emotion Recognition Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen, Fellow, IEEE IEEE TRANSACTIONS ON AUDIO,
GAUSSIAN PROCESS FACTORIZATION MACHINES FOR CONTEXT-AWARE RECOMMENDATIONS Trung V. Nguyen, Alexandros Karatzoglou, Linas Baltrunas SIGIR 2014 Presentation:
Center for E-Business Technology Seoul National University Seoul, Korea BrowseRank: letting the web users vote for page importance Yuting Liu, Bin Gao,
Online Learning for Collaborative Filtering
Collaborative Filtering versus Personal Log based Filtering: Experimental Comparison for Hotel Room Selection Ryosuke Saga and Hiroshi Tsuji Osaka Prefecture.
EigenRank: A Ranking-Oriented Approach to Collaborative Filtering IDS Lab. Seminar Spring 2009 강민석 (Minseok Kang) May 21st, 2009 Nathan.
Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John T. Riedl
Diversifying Search Result WSDM 2009 Intelligent Database Systems Lab. School of Computer Science & Engineering Seoul National University Center for E-Business.
Temporal Diversity in Recommender Systems Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain SIGIR 2010 April 6, 2011 Hyunwoo Kim.
A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.
Evaluation of Recommender Systems Joonseok Lee Georgia Institute of Technology 2011/04/12 1.
EigenRank: A ranking oriented approach to collaborative filtering By Nathan N. Liu and Qiang Yang Presented by Zachary 1.
Improving Recommendation Lists Through Topic Diversification Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen WWW '05 Presenter: 謝順宏
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Collaborative Filtering with Temporal Dynamics Yehuda Koren Yahoo Research Israel KDD’09.
Recommender Systems. Recommender Systems (RSs) n RSs are software tools providing suggestions for items to be of use to users, such as what items to buy,
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
kNN CF: A Temporal Social Network Neal Lathia, Stephen Hailes, Licia Capra University College London RecSys '08 Advisor:
FISM: Factored Item Similarity Models for Top-N Recommender Systems
Page 1 A Random Walk Method for Alleviating the Sparsity Problem in Collaborative Filtering Hilmi Yıldırım and Mukkai S. Krishnamoorthy Rensselaer Polytechnic.
Exploiting Contextual Information from Event Logs for Personalized Recommendation ICIS2010 Intelligent Database Systems Lab. School of Computer Science.
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, pp , 2010.
ICONIP 2010, Sydney, Australia 1 An Enhanced Semi-supervised Recommendation Model Based on Green’s Function Dingyan Wang and Irwin King Dept. of Computer.
Feature Selection Poonam Buch. 2 The Problem  The success of machine learning algorithms is usually dependent on the quality of data they operate on.
Collaborative Competitive Filtering: Learning recommender using context of user choices Shuang Hong Yang Bo Long, Alex Smola, Hongyuan Zha Zhaohui Zheng.
Online Evolutionary Collaborative Filtering RECSYS 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering Seoul National University.
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent Presented by Jaime Teevan, Susan T. Dumais, Daniel J. Liebling Microsoft.
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Innovation Team of Recommender System(ITRS) Collaborative Competitive Filtering : Learning Recommender Using Context of User Choice Keynote: Zhi-qiang.
The Wisdom of the Few Xavier Amatriain, Neal Lathia, Josep M. Pujol SIGIR'09 Advisor: Jia-Ling Koh Speaker: Yu Cheng, Hsieh.
Collaborative Filtering - Pooja Hegde. The Problem : OVERLOAD Too much stuff!!!! Too many books! Too many journals! Too many movies! Too much content!
Federated text retrieval from uncooperative overlapped collections Milad Shokouhi, RMIT University, Melbourne, Australia Justin Zobel, RMIT University,
Social Tag Prediction Paul Heymann, Daniel Ramage, and Hector Garcia-Molina Department of Computer Science Stanford University SIGIR 2008 Presentation.
Matrix Factorization and Collaborative Filtering
Recommender Systems Session I
Methods and Metrics for Cold-Start Recommendations
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
Collaborative Filtering Nearest Neighbor Approach
Q4 : How does Netflix recommend movies?
Movie Recommendation System
ITEM BASED COLLABORATIVE FILTERING RECOMMENDATION ALGORITHMS
Probabilistic Latent Preference Analysis
Machine Learning: Lecture 5
Presentation transcript:

Performance of Recommender Algorithms on Top-N Recommendation Tasks (RecSys 2010)
Paolo Cremonesi (Politecnico di Milano), Yehuda Koren (Yahoo! Research, Haifa, Israel), Roberto Turrin (Neptuny, Milan, Italy)
Presented by Sangkeun Lee, 1/14/2011
Intelligent Database Systems Lab, School of Computer Science & Engineering / Center for E-Business Technology, Seoul National University, Seoul, Korea

Introduction
- Recommender systems have typically been compared by evaluating error metrics such as RMSE (root mean squared error): the average error between estimated and actual ratings.
- Why is the majority of the literature focused on error metrics? They are logical and convenient.
- However, many commercial systems perform top-N recommendation tasks: the system suggests a few specific items to the user that are likely to be very appealing to him.

Introduction: Top-N Performance
- Classical error measures (e.g., RMSE, MAE) do not really measure top-N performance.
- Measures for top-N performance: accuracy metrics – recall and precision.
- In this paper, the authors present an extensive evaluation of several state-of-the-art recommender algorithms and naïve non-personalized algorithms, and draw insights from experimental results on the Netflix and MovieLens datasets.

Testing Methodology: Datasets
- For each dataset, the known ratings are split into two subsets: a training set M and a test set T.
  – The test set T contains only 5-star ratings, so we can reasonably state that T contains items relevant to the respective users.
- Netflix: the training set is the 100M-rating training data from the Netflix Prize; the test set is the 5-star ratings from the Netflix Prize probe set (|T| = 384,573).
- MovieLens: 1.4% of the ratings were randomly sub-sampled from the dataset to create the test set.

Testing Methodology: Measuring Precision and Recall
1) Train the model over the ratings in M.
2) For each item i rated 5 stars by user u in T:
  – Randomly select 1,000 additional items unrated by user u.
  – Predict ratings for the test item i and for the additional 1,000 items.
  – Form a ranked list by ordering the 1,001 items according to the predicted ratings. Let p denote the rank of item i within this list (the best result is p = 1).
  – Form a top-N recommendation list by picking the N top-ranked items from the list. If p <= N we have a hit; otherwise we have a miss.
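A minimal sketch of this protocol in Python (illustrative only: `predict` stands for whatever scoring function the trained model exposes, and `unrated_pool` for the items user u has not rated; both names are hypothetical):

```python
import random

def rank_of_test_item(predict, user, test_item, unrated_pool, n_extra=1000):
    """Rank one 5-star test item against n_extra randomly drawn unrated items.

    Returns the 1-based rank p of the test item within the 1,001-item list.
    """
    candidates = random.sample(unrated_pool, n_extra) + [test_item]
    # Order candidates by predicted score, best first.
    ranked = sorted(candidates, key=lambda item: predict(user, item), reverse=True)
    return ranked.index(test_item) + 1

def recall_precision_at_n(ranks, n):
    """Overall recall/precision at N, averaged over all test cases."""
    hits = sum(1 for p in ranks if p <= n)  # p <= N is a hit, otherwise a miss
    recall = hits / len(ranks)
    return recall, recall / n  # precision(N) = recall(N) / N
```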

Testing Methodology: Measuring Precision and Recall (cont'd)
- For any single test case, recall can assume either the value 0 (miss) or 1 (hit), and precision can assume either the value 0 (miss) or 1/N (hit).
- The overall recall and precision are defined by averaging over all test cases.
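Written out (these follow directly from the definitions above, with #hits the number of test cases where p <= N):

\[
\mathrm{recall}(N) = \frac{\#\mathrm{hits}}{|T|}, \qquad
\mathrm{precision}(N) = \frac{\#\mathrm{hits}}{N \cdot |T|} = \frac{\mathrm{recall}(N)}{N}
\]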

Rating Distribution: Popular Items vs. Long Tail
- About 33% of the ratings collected by Netflix involve only the 1.7% most popular items.
- To evaluate the accuracy of recommender algorithms in suggesting non-trivial items, T has been partitioned into T_head and T_long.
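A sketch of that partition, assuming ratings come as (user, item, rating) tuples and defining the head as the top 1.7% most popular items (the cut-off parameter `head_frac` is hypothetical):

```python
from collections import Counter

def split_head_long(train_ratings, test_cases, head_frac=0.017):
    """Partition (user, item) test cases by item popularity in the training set."""
    counts = Counter(item for _, item, _ in train_ratings)
    n_head = max(1, int(len(counts) * head_frac))
    head_items = {item for item, _ in counts.most_common(n_head)}
    t_head = [(u, i) for u, i in test_cases if i in head_items]
    t_long = [(u, i) for u, i in test_cases if i not in head_items]
    return t_head, t_long
```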

Algorithms
- Non-personalized models (a quick sketch of both baselines follows this slide)
  – Movie Rating Average (MovieAvg): ranks items by their average rating.
  – Top Popular (TopPop): ranks items by their number of ratings; not applicable for measuring error metrics.
- Collaborative filtering models
  – Neighborhood models: the most common approaches, based on similarity among either users or items.
  – Latent factor models: find hidden factors, model users and items in the same latent factor space, and predict ratings using proximity (e.g., inner product).
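A quick sketch of the two non-personalized baselines named above, in the same illustrative (user, item, rating) tuple format:

```python
from collections import defaultdict

def movie_avg_and_top_pop(train_ratings):
    """Global item rankings: by average rating (MovieAvg) and by count (TopPop)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, item, rating in train_ratings:
        sums[item] += rating
        counts[item] += 1
    movie_avg = sorted(counts, key=lambda i: sums[i] / counts[i], reverse=True)
    top_pop = sorted(counts, key=lambda i: counts[i], reverse=True)
    return movie_avg, top_pop
```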

Neighborhood Models
- The score produced here is no longer an estimated rating, but we can still use it for top-N recommendation tasks.
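The paper's non-normalized item-based variant (NNCosNgbr) drops the usual division by the sum of similarities; roughly:

\[
\hat{r}_{ui} = b_{ui} + \sum_{j \in D^k(u;i)} d_{ij}\,(r_{uj} - b_{uj})
\]

where D^k(u;i) is the set of the k items rated by u that are most similar to i, d_{ij} is the shrunk cosine similarity, and b_{ui} a baseline estimate. Without the normalizing denominator the value is no longer a rating estimate, but its ordering still serves for top-N ranking.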

Latent Factor Models

Latent Factor Models: PureSVD
- The score produced here is no longer an estimated rating, but we can still use it for top-N recommendation tasks.
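A minimal PureSVD sketch using SciPy, assuming a sparse user-item matrix R with missing ratings treated as zeros (variable names are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

def pure_svd_scores(R, n_factors=50):
    """PureSVD: factorize the zero-filled rating matrix R ~ U * diag(s) * Q^T.

    Returns a dense (n_users x n_items) score matrix for top-N ranking.
    """
    _, _, Qt = svds(csr_matrix(R, dtype=np.float64), k=n_factors)
    Q = Qt.T  # item factor matrix, shape (n_items, n_factors)
    # Score user u by projecting the zero-filled rating row onto the item
    # space: r_u Q Q^T, the paper's formulation; higher score = better rank.
    return (R @ Q) @ Q.T
```

Raising n_factors from 50 toward 150 is the dimensionality change a later slide reports as helping on long-tail items.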

RMSE Ranking (best to worst)
1. SVD++
2. AsySVD
3. CorNgbr
4. MovieAvg
- Note that TopPop, NNCosNgbr, and PureSVD are not applicable for measuring error metrics.

Result: MovieLens dataset
[Figure: recall and precision-vs-recall curves on MovieLens; on-chart annotations read "Similar!?" and "Best!?"]

Result: Netflix dataset
- All items
  – TopPop outperforms CorNgbr.
  – AsySVD and SVD++ perform only slightly better than TopPop (note that these algorithms could possibly be better tuned for the Netflix data).
  – NNCosNgbr works well.
  – PureSVD is still the best.
- Long tail
  – CorNgbr significantly underperforms on the head, but it performs well on long-tail data (this probably explains why CorNgbr has been widely used).

PureSVD??
- A poor design in terms of rating estimation; the authors did not expect this result.
- PureSVD is easy to code and has good computational performance both offline and online.
- When moving to longer-tail items, accuracy improves as the dimensionality of the PureSVD model is raised (50 -> 150).
  – This could mean that the first latent factors capture properties of popular items, while additional factors capture properties of long-tail items.

Conclusions
- Error metrics have been more popular because of mathematical convenience and formal optimization; however, it is well recognized that accuracy measures may be more natural for top-N tasks.
- In summary:
  (1) There is no monotonic (trivial) relation between error metrics and accuracy metrics.
  (2) Test cases should be carefully selected, as the experimental results show (long tail vs. head) – watch out for the possible pitfalls!
  (3) New variants of existing algorithms improve top-N performance.

Q&A
Thank you