1 Recommender Systems and Collaborative Filtering. Jon Herlocker, Assistant Professor, School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR (also President, MusicStrands, Inc.)
2 Personalized Recommender Systems and Collaborative Filtering (CF)
3 Outline
- The recommender system space
- Pure collaborative filtering (CF)
- CF algorithms for prediction
- Evaluation of CF algorithms
- CF in web search (if time)
4 Recommender Systems
- Help people make decisions
  - Examples: where to spend attention, where to spend money
- Help maintain awareness
  - Examples: new products, new information
- In both cases: many options, limited resources
5 Stereotypical Integrator of RS
Has:
- A large product (item) catalog, with product attributes
- A large user base, with user attributes (age, gender, city, country, ...)
- Evidence of customer preferences
  - Explicit ratings (powerful, but harder to elicit)
  - Observations of user activity (purchases, page views, emails, prints, ...)
6 The RS Space (diagram)
- Users and Items, connected by observed preferences (ratings, purchases, page views, laundry lists, play lists)
- Item-Item links: derived from similar attributes, similar content, explicit cross references
- User-User links: derived from similar attributes, explicit connections
7 Individual Personalization (same Users/Items diagram: observed preferences, Item-Item links, User-User links)
8 Classic CF (same Users/Items diagram). In the end, most models will be hybrid.
9 Collaborative Filtering Process (diagram): your opinions on items you have experienced are combined with community opinions to produce predictions for unseen items.
10-16 Find a Restaurant! (example walked through over a sequence of screenshot slides)
17 Advantages of Pure CF
- No expensive and error-prone user attributes or item attributes
- Incorporates quality and taste
- Works on any rate-able item
- One data model => many content domains
- Serendipity
- Users understand it!
18 Predictive Algorithms for Collaborative Filtering
19 Predictive Algorithms for Collaborative Filtering
A frequently proposed taxonomy for collaborative filtering systems:
- Model-based methods
  - Build a model offline
  - Use the model to generate recommendations
  - Original data not needed at predict time
- Instance-based methods
  - Use the ratings directly to generate recommendations
20 Model-Based Algorithms
Probabilistic Bayesian approaches, clustering, PCA, SVD, etc. (see the sketch below)
Key ideas:
- Reduced-dimension representations (aggregations) of the original data
- Ability to reconstruct an approximation of the original data
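To make the "reduced dimension representation" idea concrete, here is a minimal sketch (not from the talk) that predicts a missing rating by reconstructing a low-rank approximation of a toy ratings matrix with a truncated SVD; filling missing entries with each user's mean rating before factoring is a simplifying assumption.

```python
import numpy as np

# Toy user x item ratings matrix; 0 marks an unrated item (illustrative data).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

# Simple imputation: replace missing entries with each user's mean rating.
mask = R > 0
user_means = R.sum(axis=1) / np.maximum(mask.sum(axis=1), 1)
R_filled = np.where(mask, R, user_means[:, None])

# Truncated SVD: keep k latent dimensions (the reduced-dimension representation).
k = 2
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # approximate reconstruction

# The predicted rating for user 0 on item 2 is read off the reconstruction.
print(round(R_hat[0, 2], 2))
```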
21 Stereotypical Model-Based Approaches
- Lower dimensionality => faster performance
- Can explain recommendations
- Can over-generalize
- Not using the latest data
- Force a choice of aggregation dimensions ahead of time
22 Instance-Based Methods
Primarily nearest-neighbor approaches (see the sketch below).
Key ideas:
- Predict over the raw ratings data (sometimes called memory-based methods)
- Highly personalized recommendations
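A minimal sketch of the nearest-neighbor idea, assuming Pearson correlation over co-rated items and a correlation-weighted average of the neighbors' mean-offset ratings; this is one common memory-based formulation, not necessarily the exact variant discussed in the talk.

```python
import numpy as np

def predict(R, user, item, k=2):
    """Predict R[user, item] from the k most correlated neighbors (0 = unrated)."""
    rated_u = R[user] > 0
    sims = []
    for v in range(R.shape[0]):
        if v == user or R[v, item] == 0:
            continue
        common = rated_u & (R[v] > 0)              # items rated by both users
        if common.sum() < 2:
            continue
        sim = np.corrcoef(R[user, common], R[v, common])[0, 1]
        if not np.isnan(sim):
            sims.append((sim, v))
    sims.sort(reverse=True)
    top = sims[:k]
    mean_u = R[user, rated_u].mean()
    if not top:
        return mean_u                               # fall back to the user's mean
    num = sum(s * (R[v, item] - R[v, R[v] > 0].mean()) for s, v in top)
    den = sum(abs(s) for s, _ in top)
    return mean_u + num / den

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 4, 1],
    [1, 1, 0, 5],
    [1, 0, 5, 4],
], dtype=float)
print(round(predict(R, user=0, item=2), 2))
```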
23 Stereotypical Instance-Based Approaches
- Use the most up-to-date ratings
- Are simple and easy to explain to users
- Are unstable when there are few ratings
- Have linear (w.r.t. users and items) run-times
- Allow a different aggregation method for each user, possibly chosen at runtime
24 Evaluating CF Recommender Systems
25 Evaluation - User Tasks
Evaluation depends on the user task. Most common tasks:
- Annotation in context: predict ratings for individual items
- Find good items: produce top-N recommendations
Other possible tasks:
- Find all good items
- Recommend sequence
- Many others...
26 Novelty and Trust - Confidence Tradeoff
- High-confidence recommendations
  - Recommendations are obvious
  - Low utility for the user
  - However, they build trust
- Recommendations with high predictions yet lower confidence
  - Higher variability of error
  - Higher novelty => higher utility for the user
27 Evaluation setup (diagram): community ratings data is split into a training set and a test set, with a held-out group of test users.
28 Predictive Accuracy Metrics
Mean absolute error (MAE) is the most common metric (see the sketch below).
Characteristics:
- Assumes errors at all levels in the ranking have equal weight
- Sensitive to small changes
- Good for the "Annotation in Context" task
- May not be appropriate for the "Find Good Items" task
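For reference, MAE is simply the mean of |prediction - true rating| over the held-out ratings, i.e. MAE = (1/N) * sum_i |p_i - r_i|; a minimal sketch:

```python
import numpy as np

def mean_absolute_error(predicted, actual):
    """MAE over held-out (prediction, true rating) pairs."""
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    return np.abs(predicted - actual).mean()

print(mean_absolute_error([4.2, 3.1, 2.0], [5, 3, 1]))  # ~0.633
```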
29 Classification Accuracy Metrics
Precision/Recall (computed as sketched below)
- Precision: ratio of "good" items recommended to the number of items recommended
- Recall: ratio of "good" items recommended to the total number of "good" items
Characteristics:
- Simple, easy to understand
- Binary classification of "goodness"
- Appropriate for "Find Good Items"
- Can be dangerous due to the lack of ratings for recommended items
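A minimal sketch of precision and recall at a cutoff N, assuming a binary notion of "good" items as described above:

```python
def precision_recall_at_n(recommended, relevant, n):
    """Precision and recall over the top-n recommended items.

    `relevant` is the set of items the user actually considered "good"."""
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

recs = ["m3", "m7", "m1", "m9", "m4"]
good = {"m1", "m4", "m8"}
print(precision_recall_at_n(recs, good, n=5))   # (0.4, 0.666...)
```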
30 ROC Curves
"Relative Operating Characteristic" or "Receiver Operating Characteristic" (see the sketch below)
Characteristics:
- Binary classification
- Not a single-number metric
- Covers the performance of the system at all points in the recommendation list
- More complex
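One way to trace out the curve: walk down the ranked list and record the true-positive and false-positive rates at every cutoff. The sketch below assumes every ranked item has a binary relevance judgment and that at least one item is relevant and one is not:

```python
def roc_points(ranked_items, relevant):
    """(false positive rate, true positive rate) at every cutoff of a ranked list."""
    n_pos = len(relevant)
    n_neg = len(ranked_items) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for item in ranked_items:
        if item in relevant:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

print(roc_points(["a", "b", "c", "d"], relevant={"a", "c"}))
```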
33 Prediction-to-Rating Correlation Metrics
Pearson, Spearman, Kendall (see the example below)
Characteristics:
- Compare a non-binary ranking to a non-binary ranking
- Rank correlation metrics suffer from "weak orderings"
- Can only be computed on rated items
- Provide a single score
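A minimal example of the three correlation coefficients, computed with SciPy over a user's predicted scores and actual ratings on the same (rated) items:

```python
from scipy import stats

# Predicted scores vs. the user's actual ratings on the same co-rated items.
predicted = [4.5, 3.0, 2.5, 4.0, 1.0]
actual    = [5,   3,   2,   5,   1]

pearson_r, _ = stats.pearsonr(predicted, actual)      # linear correlation
spearman_r, _ = stats.spearmanr(predicted, actual)    # rank correlation
kendall_tau, _ = stats.kendalltau(predicted, actual)  # pairwise rank correlation
print(pearson_r, spearman_r, kendall_tau)
```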
34 Half-Life Utility Metric (sketched below)
Characteristics:
- Explicitly incorporates the idea of decreasing user utility
- Tuning parameters reduce comparability
- Weak orderings can result in different utilities for the same system ranking
- All items rated less than the max contribute equally
- The only metric to really consider non-uniform utility
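A sketch of the half-life utility idea in the style of Breese et al.: each item's contribution above a neutral vote d is discounted exponentially with its rank, so that the weight halves at rank alpha. The parameter values here are illustrative assumptions.

```python
def half_life_utility(ranked_ratings, d=3.0, alpha=5):
    """Breese-style half-life utility for one user's ranked list.

    ranked_ratings: the user's true ratings in the order the system ranked
    the items; d is the neutral vote, alpha the rank at which the weight halves."""
    return sum(max(r - d, 0.0) / 2 ** ((j - 1) / (alpha - 1))
               for j, r in enumerate(ranked_ratings, start=1))

# Utility is dominated by how highly the system ranks the user's liked items.
print(half_life_utility([5, 2, 4, 5, 1]))
```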
35 Does it Matter What Metric You Use? An empirical study to gain some insight...
36 Analysis of 432 variations of an algorithm on a 100,000-rating movie dataset
37 Comparison among results from all the per-user correlation metrics and the per-user mean average precision metric. These metrics have strong linear relationships with each other.
39 Comparison between metrics that are averaged overall rather than per-user. Note the linear relationship between the different metrics.
40 A comparison of representative metrics from the three subsets depicted in the previous slides. Within each subset the metrics strongly agree, but this figure shows that metrics from different subsets do not correlate well.
41 Does it Matter What Metric You Use? Yes.
42 Want to Try CF? CoFE, the "Collaborative Filtering Engine"
- Open-source Java
- Easy to add new algorithms
- Includes testing infrastructure (this month)
- Reference implementations of many popular CF algorithms
- One high-performance algorithm, production ready (see Furl.net)
http://eecs.oregonstate.edu/iis/CoFE
43 Improving Web Search Using CF. With Janet Webster, OSU Libraries
44 Controversial Claim
- Improvements in text analysis will substantially improve the search experience
- Focus on improving results on the Mean Average Precision (MAP) metric
45 Evidence of the Claim
Human subjects study by Turpin and Hersh (SIGIR 2001):
- Compared human search performance with a 1970s search model (basic TF/IDF) against a recent OKAPI search model with greatly improved MAP
- Task: locating medical information
- No statistical difference
46 Bypass the Hard Problem!
The hard problem: automatic analysis of text, i.e., software "understanding" language.
We propose: let humans assist with the analysis of text. Enter collaborative filtering.
50 The Human Element
- Capture and leverage the experience of every user
  - Recommendations are based on human evaluation: explicit votes and inferred (implicit) votes
- Recommend (question, document) pairs, not just documents (see the sketch below)
  - A human can determine whether questions are similar
- The system gets smarter with each use, not just with each new document
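A hypothetical sketch of the (question, document) idea: votes are attached to question-document pairs, and recommendations for a new question are built from votes on similar past questions. The yes/no vote weights echo the yes = 30, no = 0 scale mentioned in the results later; the word-overlap question similarity and the implicit-vote weight are illustrative assumptions, not SERF's actual implementation.

```python
from collections import defaultdict

votes = defaultdict(float)          # (question, document) -> accumulated vote

def record_vote(question, document, explicit_yes=None, clicked=False):
    if explicit_yes is True:
        votes[(question, document)] += 30     # explicit "useful" vote
    elif explicit_yes is False:
        votes[(question, document)] += 0      # explicit "not useful" vote
    elif clicked:
        votes[(question, document)] += 5      # weaker, inferred (implicit) vote

def similarity(q1, q2):
    """Toy word-overlap similarity between two questions (illustrative only)."""
    w1, w2 = set(q1.lower().split()), set(q2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def recommend(question, top_n=3):
    """Rank documents by votes on similar past questions."""
    scores = defaultdict(float)
    for (past_q, doc), vote in votes.items():
        scores[doc] += similarity(question, past_q) * vote
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [doc for doc, score in ranked[:top_n] if score > 0]

record_vote("how do I renew a library book", "renewals.html", explicit_yes=True)
record_vote("renew a book online", "renewals.html", clicked=True)
print(recommend("how to renew my book"))
```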
51 Research Issues
Basic issues:
- Is the concept sound?
- What are the roadblocks?
More mathematical issues:
- Algorithms for ranking recommendations (question, document, votes)
- Robustness with unreliable data
Text/content analysis:
- Improved NLP for matching questions
- Incorporating more information into the information context
More social issues:
- Training users for a new paradigm
- Privacy
- Integrating with existing reference library practices and systems
- Offensive material in questions
- Most effective user interface metaphors
52-53 Initial Results (figure slides)
54-57 Three months of SERF usage: 1,194 search transactions (results diagram, built up across four slides)
- Only Google results: 706 transactions (59.13%)
  - Clicked: 172 (24.4%); no clicks: 534 (75.6%)
  - Average visited documents: 1.598
  - Average rating: 14.727 (49% voted as useful)
- Google results + recommendations: 488 transactions (40.87%)
  - Clicked: 197 (40.4%); no click: 291 (59.6%)
  - Average visited documents: 2.196
  - First click on a recommendation: 141 (71.6%); first click on a Google result: 56 (28.4%)
  - Average rating: 20.715 (69% voted as useful)
- Rating scale: a "yes" vote = 30, a "no" vote = 0
58 SERF Project Summary
- No large leaps in language understanding expected; understanding the meaning of language is *very* hard
- Collaborative filtering (CF) bypasses this problem: humans do the analysis
- The technology is widely applicable
59 Talk Messages
- A model for learning options in recommender systems
- A survey of popular predictive algorithms for CF
- A survey of evaluation metrics
- Empirical data showing that the choice of metric can matter
- Evidence that CF could significantly improve web search
60 Links & Contacts
- CoFE: http://eecs.oregonstate.edu/iis/CoFE
- SERF: http://osulibrary.oregonstatate.edu/
- Jon Herlocker: herlock@cs.orst.edu, herlock@MusicStrands.com, +1 (541) 737-8894