Recommender Systems Session I

Similar presentations
Collaborative QoS Prediction in Cloud Computing Department of Computer Science & Engineering The Chinese University of Hong Kong Hong Kong, China Rocky.

Jeff Howbert Introduction to Machine Learning Winter Collaborative Filtering Nearest Neighbor Approach.
1 RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation Xi Chen, Xudong Liu, Zicheng Huang, and Hailong.
Active Learning and Collaborative Filtering
The Wisdom of the Few A Collaborative Filtering Approach Based on Expert Opinions from the Web Xavier Amatriain Telefonica Research Nuria Oliver Telefonica.
Memory-Based Recommender Systems : A Comparative Study Aaron John Mani Srinivasan Ramani CSCI 572 PROJECT RECOMPARATOR.
Supervised classification performance (prediction) assessment Dr. Huiru Zheng Dr. Francisco Azuaje School of Computing and Mathematics Faculty of Engineering.
Algorithms for Efficient Collaborative Filtering Vreixo Formoso Fidel Cacheda Víctor Carneiro University of A Coruña (Spain)
CONTENT-BASED BOOK RECOMMENDING USING LEARNING FOR TEXT CATEGORIZATION TRIVIKRAM BHAT UNIVERSITY OF TEXAS AT ARLINGTON DATA MINING CSE6362 BASED ON PAPER.
On Comparing Classifiers: Pitfalls to Avoid and Recommended Approach Published by Steven L. Salzberg Presented by Prakash Tilwani MACS 598 April 25 th.
Item-based Collaborative Filtering Recommendation Algorithms
Performance of Recommender Algorithms on Top-N Recommendation Tasks
Performance of Recommender Algorithms on Top-N Recommendation Tasks RecSys 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering.
Budget-based Control for Interactive Services with Partial Execution 1 Yuxiong He, Zihao Ye, Qiang Fu, Sameh Elnikety Microsoft Research.
Objectives Objectives Recommendz: A Multi-feature Recommendation System Matthew Garden, Gregory Dudek, Center for Intelligent Machines, McGill University.
Evaluation of Recommender Algorithms for an Internet Information Broker based on Simple Association Rules and on the Repeat-Buying Theory WEBKDD 2002 Edmonton,
Temporal Diversity in Recommender Systems Neal Lathia, Stephen Hailes, Licia Capra, and Xavier Amatriain SIGIR 2010 April 6, 2011 Hyunwoo Kim.
Evaluation of Recommender Systems Joonseok Lee Georgia Institute of Technology 2011/04/12 1.
Improving Recommendation Lists Through Topic Diversification Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen WWW '05 Presenter: 謝順宏.
Research Methodology and Methods of Social Inquiry Nov 8, 2011 Assessing Measurement Reliability & Validity.
Collaborative Filtering Zaffar Ahmed
Pearson Correlation Coefficient 77B Recommender Systems.
Amanda Lambert Jimmy Bobowski Shi Hui Lim Mentors: Brent Castle, Huijun Wang.
Foxtrot seminar Capturing knowledge of user preferences with recommender systems Stuart E. Middleton David C. De Roure, Nigel R. Shadbolt Intelligence,
KNN CF: A Temporal Social Network kNN CF: A Temporal Social Network Neal Lathia, Stephen Hailes, Licia Capra University College London RecSys ’ 08 Advisor:
ApproxHadoop Bringing Approximations to MapReduce Frameworks
Information Design Trends Unit Five: Delivery Channels Lecture 2: Portals and Personalization Part 2.
Collaborative Filtering via Euclidean Embedding M. Khoshneshin and W. Street Proc. of ACM RecSys, pp , 2010.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Item-Based Collaborative Filtering Recommendation Algorithms Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl GroupLens Research Group/ Army.
Reputation-aware QoS Value Prediction of Web Services Weiwei Qiu, Zhejiang University Zibin Zheng, The Chinese University of HongKong Xinyu Wang, Zhejiang.
Collaborative Filtering - Pooja Hegde. The Problem : OVERLOAD Too much stuff!!!! Too many books! Too many journals! Too many movies! Too much content!
ItemBased Collaborative Filtering Recommendation Algorithms 1.
Trust-aware Recommender Systems
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 1-1 Statistics for Managers Using Microsoft ® Excel 4 th Edition Chapter.
Data Science Credibility: Evaluating What’s Been Learned
Statistics 202: Statistical Aspects of Data Mining
Recommendation in Scholarly Big Data
Recommender Systems & Collaborative Filtering
Item-to-Item Recommender Network Optimization
Sampath Jayarathna Cal Poly Pomona
Evaluating Classifiers
Hardware & Software Reliability
Qualitative vs. Quantitative
WSRec: A Collaborative Filtering Based Web Service Recommender System
Statistics: The Z score and the normal distribution
Evaluation of IR Systems
Preface to the special issue on context-aware recommender systems
Methods and Metrics for Cold-Start Recommendations
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
7 How to Decide Which Variables to Manipulate and Measure Marziyeh Rezaee.
Collaborative Filtering
Introduction to Measurement
INF 397C Fall, 2003 Days 13.
Adopted from Bin UIC Recommender Systems Adopted from Bin UIC.
Location Recommendation — for Out-of-Town Users in Location-Based Social Network Yina Meng.
Collaborative Filtering Nearest Neighbor Approach
M.Sc. Project Doron Harlev Supervisor: Dr. Dana Ron
Process Capability.
Ensembles.
Movie Recommendation System
ITEM BASED COLLABORATIVE FILTERING RECOMMENDATION ALGORITHMS
Retrieval Performance Evaluation - Measures
Evaluation and Its Methods
Journal of Web Semantics 55 (2019)
GhostLink: Latent Network Inference for Influence-aware Recommendation
Srinivas Neginhal Anantharaman Kalyanaraman CprE 585: Survey Project
Outlines Introduction & Objectives Methodology & Workflow
Evaluation David Kauchak CS 158 – Fall 2019.
Presentation transcript:

Recommender Systems Session I
Robin Burke, DePaul University, Chicago, IL

Roadmap
Session A: Basic Techniques I (Introduction, Knowledge Sources, Recommendation Types, Collaborative Recommendation)
Session B: Basic Techniques II (Content-based Recommendation, Knowledge-based Recommendation)
Session C: Domains and Implementation I (Recommendation domains, Example Implementation, Lab I)
Session D: Evaluation I (Evaluation)
Session E: Applications (User Interaction, Web Personalization)
Session F: Implementation II (Lab II)
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics (Dynamics, Beyond accuracy)

Current research
Question 1: do we lose something when we think of a ratings database as static? (my work)
Question 2: does a summary statistic like MAE hide valuable information? (Mike O'Mahony, UCD colleague)

Collaborative Dynamics
Remember our evaluation methodology: get all the ratings, divide them up into test / training data sets, and run prediction tests.

Problem
That isn't how real recommender systems operate. They get a stream of ratings over time, and they have to respond to user requests (predictions, recommendation lists) dynamically.

Questions
Are early ratings more predictive than later ratings? Is there a pattern to how users build their profiles? How long does it take to get past the cold-start?

Some ideas
Temporal leave-one-out
Profile MAE
Profile Hit Ratio

Temporal leave-one-out (TL1O)
For a rating r(u,i) at time t, predict that r(u,i) using the ratings database immediately prior to t, i.e. the information that would have been available right before we learned u's real rating. Average the error over time intervals, so we see how error evolves as data is added: cold-start in action.
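As a rough illustration, TL1O evaluation might be organised as in the sketch below; the rating-tuple format and the predict function are assumptions for illustration, not the talk's actual code.

```python
# Minimal TL1O sketch (illustrative only).
# Assumes `ratings` is a list of (user, item, rating, timestamp) tuples and
# `predict(train, user, item)` is any rating predictor, e.g. a kNN CF model.

def tl1o_errors(ratings, predict):
    ratings = sorted(ratings, key=lambda r: r[3])  # replay ratings in time order
    errors = []                                    # (timestamp, absolute error) pairs
    for idx, (user, item, rating, ts) in enumerate(ratings):
        train = ratings[:idx]                      # only what was known just before time ts
        if not train:
            continue                               # nothing to predict from yet
        pred = predict(train, user, item)
        errors.append((ts, abs(pred - rating)))
    return errors
```

Averaging the absolute errors within successive time windows then gives a temporal MAE curve.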

Profile MAE
For each profile, compute the TL1O errors for its ratings and average over all profiles of that length, to see the aggregate evolution of profiles.
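A possible sketch of that aggregation, assuming each TL1O error has been paired with the length of the target user's profile at prediction time:

```python
from collections import defaultdict

def profile_mae(errors_by_length):
    """errors_by_length: iterable of (profile_length, abs_error) pairs."""
    buckets = defaultdict(list)
    for length, err in errors_by_length:
        buckets[length].append(err)
    # mean absolute error for each profile length, in increasing order of length
    return {length: sum(errs) / len(errs) for length, errs in sorted(buckets.items())}
```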

Profile Hit Ratio
Do a similar thing for hit ratio. For each liked item (r(u,i) > 3) at time t: create a recommendation list at time t, measure the rank of item i on that list, and compute the hit ratio of such items on lists of length k.
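One way this measurement could look, with the recommend function and the event triples as assumed placeholders:

```python
def profile_hit_ratio(liked_events, recommend, k=50):
    # liked_events: (train_data, user, item) triples, one per rating > 3, where
    #   train_data holds only the ratings known just before that rating arrived
    # recommend(train_data, user, n): returns a ranked list of n item ids
    hits = 0
    for train, user, item in liked_events:
        if item in recommend(train, user, k):   # hit if the liked item made the top-k list
            hits += 1
    return hits / len(liked_events) if liked_events else 0.0
```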

Temporal MAE (ML1M)

Cold Start
It seems to take about 150 days, about 15% of the data, to get past the initial cold start. Temporal MAE keeps improving after that, but not as steeply.

Profile MAE
MAE decreases as profiles get longer, with the strongest decrease earlier in the curve. This seems to be a kNN property: the same thing happens even when the first 150 days are excluded.

Diminishing returns
There appear to be diminishing returns for longer profile sizes, which is paradoxical given what we know about sparsity: more data should be better.

A clue
ML100K data, about 10% of the data size: sparser data compresses the curve. Diminishing returns may be a function of the average profile length.

Average rating
Users seem to add positive ratings first and negative ratings later.

Application-dependence
This could be because ratings are added in response to recommendations. Easy (popular) recommendations are given first and are likely to be right; later recommendations contain more errors, which users rate lower.

Profile Hit Ratio
Cumulative hit ratio, n = 50. The dashed line is random performance.

Interestingly
Harder to see. There appear to be diminishing returns, as with MAE, but then a jump at the end. We need to examine this data more; these are ML100K data, and the experiments are very slow to run.

MAE for different ratings
An odd result: MAE for each rating value is correlated with the number of ratings of that value in the profile (subtracting out the contribution of the total number of ratings of that value). This may tell us the average value of adding a rating of a particular type. Look at R = 5: saturation? More about this later.
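A small sketch of this per-rating-value breakdown, assuming arrays of true ratings and the corresponding predictions:

```python
import numpy as np

def mae_by_rating_value(true_ratings, predictions):
    true_ratings = np.asarray(true_ratings, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    abs_err = np.abs(predictions - true_ratings)
    # MAE computed separately for each distinct true rating value (1..5 on MovieLens)
    return {float(r): float(abs_err[true_ratings == r].mean()) for r in np.unique(true_ratings)}
```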

Break

What Have The Neighbours Ever Done for Us? A Collaborative Filtering Perspective
Michael O'Mahony, 5th March, 2009

Presentation based on a paper submitted to UMAP ’09. Authors: R. Rafter, M. P. O’Mahony, N. J. Hurley and B. Smyth.

Collaborative Filtering
Collaborative filtering (CF) is a key technique used in recommender systems. It harnesses past ratings to make predictions and recommendations for new items: recommend items with high predicted ratings and suppress those with low predicted ratings. Assumption: CF techniques provide a considerable advantage over simpler average-rating approaches.

Valid Assumption?
We analyse the following: What do CF techniques actually contribute? How is accuracy performance measured? What datasets are used to evaluate CF techniques? We consider two standard CF techniques: user-based and item-based CF.

CF Algorithms
There are two components to user-based and item-based CF:
Initial estimate: based on the average rating of the target user or item
Neighbour estimate: based on the ratings of similar users or items
The neighbour estimate must perturb the initial estimate by the correct magnitude and in the correct direction. General formula:
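In the usual decomposition this is simply the baseline plus a neighbour-based adjustment; the notation below is assumed for illustration rather than taken from the slide:

```latex
\hat{r}_{u,i} = \underbrace{b_{u,i}}_{\text{initial estimate}} + \underbrace{\Delta_{u,i}}_{\text{neighbour estimate}}
```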

CF Algorithms
User-based CF and item-based CF: each prediction combines an initial estimate with a neighbour estimate.
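The standard mean-centred kNN forms of these two predictors, which match the description above, are shown below (similarities and neighbourhoods N(u), N(i) are assumed; the slide's exact notation may differ):

```latex
\text{User-based:}\quad
\hat{r}_{u,i} = \bar{r}_u +
\frac{\sum_{v \in N(u)} \operatorname{sim}(u,v)\,(r_{v,i} - \bar{r}_v)}
     {\sum_{v \in N(u)} \lvert \operatorname{sim}(u,v) \rvert}
\qquad
\text{Item-based:}\quad
\hat{r}_{u,i} = \bar{r}_i +
\frac{\sum_{j \in N(i)} \operatorname{sim}(i,j)\,(r_{u,j} - \bar{r}_j)}
     {\sum_{j \in N(i)} \lvert \operatorname{sim}(i,j) \rvert}
```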

Evaluating Accuracy
Predictive accuracy: Mean Absolute Error (MAE). MAE is calculated over all test set ratings (problem?). Other metrics (RMSE, ROC curves, …) give similar trends.
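In its standard form over a test set T of held-out ratings:

```latex
\mathrm{MAE} = \frac{1}{\lvert T \rvert} \sum_{(u,i) \in T} \bigl\lvert \hat{r}_{u,i} - r_{u,i} \bigr\rvert
```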

Evaluation
Datasets:

Dataset         # Users    # Items    # Ratings    Sparsity    Rating Scale
MovieLens       943        1,682      100,000      93.695%     1–5
Netflix         24,010     17,471     5,581,775    98.690%     1–5
Book-crossing   77,805     185,973    433,671      99.997%     1–10

Procedure: create a test set by randomly removing 10% of the ratings; make predictions for the test set ratings using the remaining data; repeat 10 times and compute the average MAE.
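A minimal sketch of that protocol; the predictor and the rating-tuple format are assumptions for illustration:

```python
import random

def repeated_holdout_mae(ratings, predict, repeats=10, test_fraction=0.1):
    # ratings: list of (user, item, rating) tuples
    # predict(train, user, item): any CF predictor
    maes = []
    for _ in range(repeats):
        shuffled = ratings[:]
        random.shuffle(shuffled)
        cut = int(len(shuffled) * test_fraction)
        test, train = shuffled[:cut], shuffled[cut:]
        errors = [abs(predict(train, u, i) - r) for (u, i, r) in test]
        maes.append(sum(errors) / len(errors))
    return sum(maes) / len(maes)  # average MAE over the random splits
```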

Results
Average performance, computed over all test set ratings:

                User-based                    Item-based
Dataset         Mag.    Cor. Dir.    MAE      Mag.    Cor. Dir.    MAE
MovieLens       0.43    66%          0.73     0.34    64%          –
Netflix         0.41    –            0.70     0.35    67%          0.69
Book-crossing   0.99    53%          1.53     0.94    63%          1.34

Neighbour estimate magnitudes are small, between 8.5% and 11% of the rating range.
Item-based CF is comparable to or outperforms user-based CF w.r.t. MAE (smaller magnitudes are observed for item-based CF).
Book-crossing dataset: user-based CF shifts the initial estimate in the correct direction in only 53% of cases (just slightly better than chance!).

Neighbour Magnitude

Datasets
Frequency of occurrence of ratings: there is a (natural?) bias toward ratings on the higher end of the scale. Consider MovieLens: most ratings are 3 and 4, and the mean user rating ≈ 3.6, so only a small neighbour estimate magnitude is required in most cases.
Consequences of such dataset characteristics for CF research: computing average MAE across all test set ratings hides performance issues in light of such characteristics [Shardanand and Maes 1995]. For example, can CF achieve large magnitudes when needed?

MAE vs Actual Ratings
Recall: average overall MAE = 0.73 for both UB and IB …

Error PDFs

Neighbour Contribution Effect of neighbour estimate versus initial (mean-based) estimate:

Neighbour Contribution

Conclusions
Examined the contribution of standard CF techniques: neighbours have a small influence (magnitude) which is not always reliable (direction).
Evaluating accuracy performance: there is a need for more fine-grained error analysis [Shardanand and Maes 1995], and for a focus on developing CF algorithms which offer improved accuracy for extreme ratings.
Test datasets: standard datasets have particular characteristics, e.g. a bias in ratings toward the higher end of the rating scale, so there is a need for new datasets. Such characteristics, combined with using overall MAE to evaluate accuracy, have "hidden" performance issues and hindered CF development (?).

That’s all folks! Questions?