Evaluation of Recommender Systems Joonseok Lee Georgia Institute of Technology 2011/04/12 1.

Slides:



Advertisements
Similar presentations
Google News Personalization: Scalable Online Collaborative Filtering
Advertisements

Music Recommendation by Unified Hypergraph: Music Recommendation by Unified Hypergraph: Combining Social Media Information and Music Content Jiajun Bu,
Item Based Collaborative Filtering Recommendation Algorithms
Collaborative Filtering Sue Yeon Syn September 21, 2005.
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
A. Darwiche Learning in Bayesian Networks. A. Darwiche Known Structure Complete Data Known Structure Incomplete Data Unknown Structure Complete Data Unknown.
Active Learning and Collaborative Filtering
Evaluating Search Engine
Rubi’s Motivation for CF  Find a PhD problem  Find “real life” PhD problem  Find an interesting PhD problem  Make Money!
CS345 Data Mining Recommendation Systems Netflix Challenge Anand Rajaraman, Jeffrey D. Ullman.
1 Collaborative Filtering and Pagerank in a Network Qiang Yang HKUST Thanks: Sonny Chee.
Learning Bit by Bit Collaborative Filtering/Recommendation Systems.
1 Introduction to Recommendation System Presented by HongBo Deng Nov 14, 2006 Refer to the PPT from Stanford: Anand Rajaraman, Jeffrey D. Ullman.
CONTENT-BASED BOOK RECOMMENDING USING LEARNING FOR TEXT CATEGORIZATION TRIVIKRAM BHAT UNIVERSITY OF TEXAS AT ARLINGTON DATA MINING CSE6362 BASED ON PAPER.
Combining Content-based and Collaborative Filtering Department of Computer Science and Engineering, Slovak University of Technology
Item-based Collaborative Filtering Recommendation Algorithms
Performance of Recommender Algorithms on Top-N Recommendation Tasks
Performance of Recommender Algorithms on Top-N Recommendation Tasks RecSys 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering.
By Rachsuda Jiamthapthaksin 10/09/ Edited by Christoph F. Eick.
1 Recommender Systems and Collaborative Filtering Jon Herlocker Assistant Professor School of Electrical Engineering and Computer Science Oregon State.
1 Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing Seung-Taek Park and David M. Pennock (ACM SIGKDD 2007)
Training and Testing of Recommender Systems on Data Missing Not at Random Harald Steck at KDD, July 2010 Bell Labs, Murray Hill.
+ Recommending Branded Products from Social Media Jessica CHOW Yuet Tsz Yongzheng Zhang, Marco Pennacchiotti eBay Inc. eBay Inc.
Evaluation Methods and Challenges. 2 Deepak Agarwal & Bee-Chung ICML’11 Evaluation Methods Ideal method –Experimental Design: Run side-by-side.
EMIS 8381 – Spring Netflix and Your Next Movie Night Nonlinear Programming Ron Andrews EMIS 8381.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Experimental Evaluation of Learning Algorithms Part 1.
Chengjie Sun,Lei Lin, Yuan Chen, Bingquan Liu Harbin Institute of Technology School of Computer Science and Technology 1 19/11/ :09 PM.
Google News Personalization: Scalable Online Collaborative Filtering
Netflix Netflix is a subscription-based movie and television show rental service that offers media to subscribers: Physically by mail Over the internet.
EigenRank: A Ranking-Oriented Approach to Collaborative Filtering IDS Lab. Seminar Spring 2009 강 민 석강 민 석 May 21 st, 2009 Nathan.
Collaborative Filtering  Introduction  Search or Content based Method  User-Based Collaborative Filtering  Item-to-Item Collaborative Filtering  Using.
A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.
1 Collaborative Filtering & Content-Based Recommending CS 290N. T. Yang Slides based on R. Mooney at UT Austin.
EigenRank: A ranking oriented approach to collaborative filtering By Nathan N. Liu and Qiang Yang Presented by Zachary 1.
Improving Recommendation Lists Through Topic Diversification CaiNicolas Ziegler, Sean M. McNee,Joseph A. Konstan, Georg Lausen WWW '05 報告人 : 謝順宏 1.
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Similarity & Recommendation Arjen P. de Vries CWI Scientific Meeting September 27th 2013.
Recommender Systems. Recommender Systems (RSs) n RSs are software tools providing suggestions for items to be of use to users, such as what items to buy,
Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
Amanda Lambert Jimmy Bobowski Shi Hui Lim Mentors: Brent Castle, Huijun Wang.
Marketing and CS Philip Chan.
Ensemble Methods Construct a set of classifiers from the training data Predict class label of previously unseen records by aggregating predictions made.
Show Me the Money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews Nikolay Archak, Anindya Ghose, and Panagiotis G. Ipeirotis.
Page 1 A Random Walk Method for Alleviating the Sparsity Problem in Collaborative Filtering Hilmi Yıldırım and Mukkai S. Krishnamoorthy Rensselaer Polytechnic.
SCHOOL OF INFORMATION UNIVERSITY OF MICHIGAN si.umich.edu Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available.
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, pp , 2010.
ICONIP 2010, Sydney, Australia 1 An Enhanced Semi-supervised Recommendation Model Based on Green’s Function Dingyan Wang and Irwin King Dept. of Computer.
Online Evolutionary Collaborative Filtering RECSYS 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering Seoul National University.
Experimental Study on Item-based P-Tree Collaborative Filtering for Netflix Prize.
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Learning in Bayesian Networks. Known Structure Complete Data Known Structure Incomplete Data Unknown Structure Complete Data Unknown Structure Incomplete.
Reputation-aware QoS Value Prediction of Web Services Weiwei Qiu, Zhejiang University Zibin Zheng, The Chinese University of HongKong Xinyu Wang, Zhejiang.
Innovation Team of Recommender System(ITRS) Collaborative Competitive Filtering : Learning Recommender Using Context of User Choice Keynote: Zhi-qiang.
Analysis of massive data sets Prof. dr. sc. Siniša Srbljić Doc. dr. sc. Dejan Škvorc Doc. dr. sc. Ante Đerek Faculty of Electrical Engineering and Computing.
ItemBased Collaborative Filtering Recommendation Algorithms 1.
Item-Based Collaborative Filtering Recommendation Algorithms
COMP423 Intelligent Agents. Recommender systems Two approaches – Collaborative Filtering Based on feedback from other users who have rated a similar set.
Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial.
Matrix Factorization and Collaborative Filtering
Item-Based P-Tree Collaborative Filtering applied to the Netflix Data
Recommender Systems Session I
Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial.
North Dakota State University Fargo, ND USA
Adopted from Bin UIC Recommender Systems Adopted from Bin UIC.
Q4 : How does Netflix recommend movies?
North Dakota State University Fargo, ND USA
ITEM BASED COLLABORATIVE FILTERING RECOMMENDATION ALGORITHEMS
Probabilistic Latent Preference Analysis
North Dakota State University Fargo, ND USA
Presentation transcript:

Evaluation of Recommender Systems Joonseok Lee Georgia Institute of Technology 2011/04/12 1

Georgia Tech Joonseok Lee Agenda Recommendation Task Recommending good items Optimizing utility function Predicting ratings Evaluation Protocols and Tasks Online evaluation Offline experiment Evaluation Metrics Case Studies Other Issues - 2 -

Georgia Tech Joonseok Lee Task 1: Recommending Good Items Recommending some good items: more important not to present any disliked item. Media items: movie, music, book, etc. Recommending all good items: more important not to skip any liked item. Scientific papers which should be cited Legal databases - 3 -

Georgia Tech Joonseok Lee Task 2: Optimizing Utility Maximizing profit! Buy more than originally intended. Keeping users in the website longer. (Banner advertisement) Generalization of Task Type 1. Weighted sum of purchased items’ profit. When advertisement profit is considered, target function to be optimized can be very complex

Georgia Tech Joonseok Lee Task 3: Predicting Ratings Predict unseen ratings based on observed ratings. Common in research community Netflix competition Common practice Recommending items according to the predicted ratings. Is this a correct strategy? - 5 -

Georgia Tech Joonseok Lee Agenda Recommendation Task Recommending good items Optimizing utility function Predicting ratings Evaluation Protocols and Tasks Online evaluation Offline experiment Evaluation Metrics Case Studies Other Issues - 6 -

Georgia Tech Joonseok Lee Online Evaluation Test with real users, on a real situation! Set up several recommender engines on a target system. Redirect each group of subjects to different recommenders. Observe how much the user behaviors are influenced by the recommender system. Limitation Very costly. Need to open imperfect version to real users. May give negative experience, making them to avoid using the system in the future

Georgia Tech Joonseok Lee Offline Experiments Filtering promising ones before online evaluation! Train/Test data split Learn a model from train data, then evaluate it with test data. How to split: Simulating online behaviors Using timestamps, allow ratings only before it rated. Hide ratings after some specific timestamps. For each test user, hide some portion of recent ratings. Regardless of timestamp, randomly hide some portion of ratings

Georgia Tech Joonseok Lee Agenda Recommendation Task Recommending good items Optimizing utility function Predicting ratings Evaluation Protocols and Tasks Online evaluation Offline experiment Evaluation Metrics Case Studies Other Issues - 9 -

Georgia Tech Joonseok Lee Task 3: Predicting Ratings Goal: Evaluate the accuracy of predictions. Popular metrics: Root of the Mean Square Error (RMSE) Mean Average Error (MAE) Normalized Mean Average Error (NMAE) Do not differentiate between errors Ex: (5 stars – 4 stars) == (3 stars – 2 stars)

Georgia Tech Joonseok Lee Task 1: Recommending Items Goal: Suggesting good items (not discouraging bad items) Popular metrics:

Georgia Tech Joonseok Lee Task 1: Recommending Items Popular graphical models Precision-Recall Curve: Precision, Recall ROC Curve: Recall, False Positive Rate So, what to use? Depend on problem domain and task. Example Video rental service: False positive rate is not important.  Precision-Recall Curve would be desirable. Online dating site: False positive rate is very important.  ROC Curve would be desirable

Georgia Tech Joonseok Lee Task 2: Optimizing Utility Goal: modeling the way of users interacting with the recommendations. Popular metrics: Half-life Utility Score Generalized version with a utility function

Georgia Tech Joonseok Lee Agenda Recommendation Task Recommending good items Optimizing utility function Predicting ratings Evaluation Protocols and Tasks Online evaluation Offline experiment Evaluation Metrics Case Studies Other Issues

Georgia Tech Joonseok Lee Case Study Setting Goal: Demonstrate that incorrect choice of evaluation metric can lead different decision. Algorithms used: User-based CF Neighbor size =

Georgia Tech Joonseok Lee Case Study 1 Task: prediction vs. recommendation Algorithms to compare Pearson Correlation Cosine Similarity Dataset Netflix (users with more than 100 ratings only) BookCrossing (extremely sparse)

Georgia Tech Joonseok Lee Case Study 1 Experimental Results Prediction task, measured with RMSE Recommendation task, measured with Precision-Recall curve

Georgia Tech Joonseok Lee Case Study 2 Task: recommendation vs. utility maximization Algorithms to compare Item to Item: maximum likelihood estimate for the conditional probabilities of each target item given each observed item. Expected Utility: reflecting expected utility on the item-to-item method. Dataset Belgian Retailer News Click Stream

Georgia Tech Joonseok Lee Case Study 2 Experimental Results Recommendation task, measured with Precision-Recall curve Utility maximization task, measured with Half-life Utility score

Georgia Tech Joonseok Lee Agenda Recommendation Task Recommending good items Optimizing utility function Predicting ratings Evaluation Protocols and Tasks Online evaluation Offline experiment Evaluation Metrics Case Studies Other Issues

Georgia Tech Joonseok Lee Other Issues User interface (UI) plays an important role. Lots of design choices Image vs. Text? Horizontal vs. Vertical? User study in an HCI manner Eliciting Utility function is not straightforward. How much this recommended movie contributed the user to maintain subscription?

Georgia Tech Joonseok Lee Other Issues (Koren, KDD’08) Is lowering RMSE meaningful for users, indeed? Mix a favorite movie (rated as 5) with 1,000 random movies. Estimate rating and rank those 1,001 movies. Observe where the favorite movie is located. If the prediction is precise, the favorite movie should locate at the top!

Georgia Tech Joonseok Lee Other Issues (Koren, KDD’08) RMSE

Georgia Tech Joonseok Lee Any question?

THE END Thank you very much! 25