1
Similarity & Recommendation
Arjen P. de Vries, arjen@cwi.nl
CWI Scientific Meeting, September 27th 2013
2
Recommendation
Informally:
– Search for information “without a query”
Three types:
– Content-based recommendation
– Collaborative filtering (CF)
  – Memory-based
  – Model-based
– Hybrid approaches
Today’s focus!
4
Collaborative Filtering
Collaborative filtering (originally introduced by Pattie Maes as “social information filtering”)
1. Compare user judgments
2. Recommend differences between similar users
Leading principle: People’s tastes are not randomly distributed
– A.k.a. “You are what you buy”
5
Collaborative Filtering
Benefits over content-based approach
– Overcomes problems with finding suitable features to represent e.g. art, music
– Serendipity
– Implicit mechanism for qualitative aspects like style
Problems: large groups, broad domains
6
Context
Recommender systems
– Users interact (rate, purchase, click) with items
10
Context
Nearest-neighbour recommendation methods
– The item prediction is based on “similar” users
12
Similarity
14
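s(u, i) = Σ_v sim(u, v) · s(v, i)
(u: target user, v: a neighbouring user, i: item; the symbols stand in for the pictograms on the original slide)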
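The nearest-neighbour idea from the preceding slides, where an item is scored for a user from the ratings of similar users, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy ratings, the choice of cosine as `sim`, the neighbourhood size `k` and the function names are all assumptions made for the example.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. All values are made up for illustration.
ratings = {
    "ann": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "d": 5},
    "cat": {"a": 1, "b": 5, "e": 4},
}

def cosine(p, q):
    """Cosine similarity between two sparse rating profiles (dicts)."""
    dot = sum(p[i] * q[i] for i in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

def score(user, item, k=2):
    """Similarity-weighted sum of the item's ratings over the k most similar users.

    Neighbours that have not rated the item contribute nothing.
    """
    neighbours = sorted(
        ((cosine(ratings[user], ratings[v]), v) for v in ratings if v != user),
        reverse=True,
    )[:k]
    return sum(sim * ratings[v][item] for sim, v in neighbours if item in ratings[v])

# Rank the items "ann" has not rated yet by their predicted score.
candidates = ["d", "e"]
ranked = sorted(candidates, key=lambda i: score("ann", i), reverse=True)
```

In this toy data "bob" is closer to "ann" than "cat" is, so bob's item "d" ends up ranked above cat's item "e".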
15
Research Question
How does the choice of similarity measure determine the quality of the recommendations?
16
Sparseness
– Too many items exist, so many ratings will be missing
– A user’s neighborhood is likely to extend to include “not-so-similar” users and/or items
17
“Best” similarity?
Consider cosine similarity vs. Pearson similarity
Most existing studies report that Pearson correlation leads to superior recommendation accuracy
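To make the difference between the two measures concrete, here is a small sketch that assumes both are computed on two users' ratings for the same co-rated items. The vectors `x` and `y` are invented for illustration. Pearson correlation is cosine similarity applied to mean-centred ratings, so two users who always rate high can look nearly identical under cosine while disagreeing completely under Pearson.

```python
from math import sqrt

def cosine(x, y):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def pearson(x, y):
    """Pearson correlation: cosine similarity after subtracting each user's mean."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])

# Hypothetical ratings of two users on the same four items.
x = [5, 4, 5, 4]
y = [4, 5, 4, 5]

cos_xy = cosine(x, y)    # close to 1: both users give high ratings overall
pear_xy = pearson(x, y)  # -1: their preferences are exactly opposed
```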
18
“Best” similarity?
Common variations to deal with sparse observations:
– Item selection: Compare full profiles, or only on overlap
– Imputation: Impute default value for unrated items
– Filtering: Threshold on minimal similarity value
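The three variations listed above can be sketched as parameters of a single similarity routine. This is only an illustrative sketch: the function names, the parameters `overlap_only`, `default` and `threshold`, and the toy profiles are assumptions, not settings taken from the talk.

```python
from math import sqrt

def profile_cosine(p, q, overlap_only=True, default=0.0):
    """Cosine similarity between two sparse profiles (dict: item -> rating).

    overlap_only=True compares only co-rated items (item selection).
    Otherwise the union of items is used and missing ratings are
    replaced by `default` (imputation).
    """
    items = sorted(p.keys() & q.keys() if overlap_only else p.keys() | q.keys())
    x = [p.get(i, default) for i in items]
    y = [q.get(i, default) for i in items]
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def filter_neighbours(sims, threshold):
    """Filtering: keep only neighbours whose similarity reaches the threshold."""
    return {v: s for v, s in sims.items() if s >= threshold}

# Hypothetical profiles that agree on the two items they share.
p = {"a": 5, "b": 4, "c": 1}
q = {"a": 5, "b": 4, "d": 5}

on_overlap = profile_cosine(p, q)                                      # overlap only
full_zero = profile_cosine(p, q, overlap_only=False)                   # missing = 0
full_imputed = profile_cosine(p, q, overlap_only=False, default=3.0)   # missing = 3
kept = filter_neighbours({"bob": 0.8, "cat": 0.1}, threshold=0.5)
```

The same pair of users gets three different similarity values depending on the variation, which is one reason results across studies are hard to compare.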
19
“Best” similarity?
Cosine superior (!), but not for all settings
– No consistent results
20
Analysis
21
Distance Distribution
In high dimensions, nearest neighbour is unstable. A nearest-neighbour query is unstable for a given ε if the distance from the query point to most data points is less than (1 + ε) times the distance from the query point to its nearest neighbour.
Beyer et al. When is “nearest neighbour” meaningful? ICDT 1999
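A rough sketch of this instability test, given the distances from one query point to all data points. The definition says “most” data points without fixing a number, so the `majority` parameter (here 0.5) is an assumption of this example, as are the function names and the sample distances.

```python
def fraction_within(distances, eps):
    """Fraction of data points closer than (1 + eps) times the nearest-neighbour distance."""
    d_nn = min(distances)
    return sum(d < (1 + eps) * d_nn for d in distances) / len(distances)

def is_unstable(distances, eps, majority=0.5):
    """True if more than `majority` of the points lie within the (1 + eps) shell.

    `majority` is an assumed reading of "most data points".
    """
    return fraction_within(distances, eps) > majority

# Hypothetical distance lists for two query points.
crowded = [1.0, 1.05, 1.08, 1.1, 3.0]   # nearly everything is about as close as the NN
spread = [1.0, 2.5, 3.0, 4.0, 5.0]      # the nearest neighbour clearly stands out

crowded_unstable = is_unstable(crowded, eps=0.2)
spread_unstable = is_unstable(spread, eps=0.2)
```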
22
Distance Distribution Beyer et al. When is “nearest neighbour” meaningful? ICDT 1999
23
Distance Distribution
Quality q(n, f): the fraction of users for which the similarity function ranks at least n percent of the user community within a factor f of the nearest neighbour’s similarity value (or rather, its corresponding distance)
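One possible reading of q(n, f) in code, assuming each user's similarities have already been converted to distances (the slides do not say how, so that step is left out). The function name, the use of “less than or equal” for “within a factor f”, and the sample distances are assumptions.

```python
def quality(distance_lists, n, f):
    """q(n, f): fraction of users with at least n percent of the community
    within f times the distance to their nearest neighbour."""
    hits = 0
    for dists in distance_lists:
        d_nn = min(dists)
        percent_close = 100.0 * sum(d <= f * d_nn for d in dists) / len(dists)
        if percent_close >= n:
            hits += 1
    return hits / len(distance_lists)

# Hypothetical distances from two users to the rest of the community.
distance_lists = [
    [1.0, 1.1, 1.2, 4.0],  # 75% of the community within 1.5 x the NN distance
    [1.0, 3.0, 3.5, 4.0],  # 25% of the community within 1.5 x the NN distance
]

q_50 = quality(distance_lists, n=50, f=1.5)
q_20 = quality(distance_lists, n=20, f=1.5)
```

For this toy data only the first user has at least 50 percent of the community within the factor, so q(50, 1.5) is 0.5, while both users pass the 20 percent bar.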
24
Distance Distribution
25
NN k Graph
Graph associated with the top k nearest neighbours
Analysis focusing on the binary relation of whether a user does or does not belong to a neighbourhood
– Ignore similarity values (already included in the distance distribution analysis)
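A minimal sketch of building such a graph from a similarity table: each user points to its top k neighbours, and the similarity values are then discarded. The in-degree count is only one example of a graph feature and is a hypothetical choice here; the slides do not list which features were analysed. The similarity values are invented.

```python
def knn_graph(sim, k):
    """Directed graph: each user maps to the set of its k most similar other users."""
    graph = {}
    for u, row in sim.items():
        ranked = sorted((v for v in row if v != u), key=lambda v: row[v], reverse=True)
        graph[u] = set(ranked[:k])
    return graph

def in_degree(graph):
    """Number of neighbourhoods each user belongs to."""
    counts = {u: 0 for u in graph}
    for neighbours in graph.values():
        for v in neighbours:
            counts[v] = counts.get(v, 0) + 1
    return counts

# Hypothetical user-user similarities.
sim = {
    "ann": {"bob": 0.9, "cat": 0.2, "dan": 0.4},
    "bob": {"ann": 0.9, "cat": 0.3, "dan": 0.1},
    "cat": {"ann": 0.5, "bob": 0.3, "dan": 0.2},
    "dan": {"ann": 0.4, "bob": 0.1, "cat": 0.2},
}

graph = knn_graph(sim, k=1)
degrees = in_degree(graph)
```

With k = 1, three of the four users pick "ann" as their nearest neighbour, so she acts as a hub in this toy graph.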
26
NN k Graph
27
MRR vs. Features
Quality:
– If most of the user population is far away, high similarity correlates with effectiveness
– If most of the user population is close, high similarity correlates with ineffectiveness
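Assuming MRR here stands for mean reciprocal rank, the effectiveness measure could be computed along these lines. The function name, the data layout and the sample rankings are assumptions for illustration only.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average over users of 1 / rank of the first relevant item (0 if none is found)."""
    total = 0.0
    for user, ranked_items in rankings.items():
        for position, item in enumerate(ranked_items, start=1):
            if item in relevant.get(user, set()):
                total += 1.0 / position
                break
    return total / len(rankings)

# Hypothetical recommendation lists and held-out relevant items.
rankings = {
    "ann": ["a", "b", "c"],  # first relevant item at rank 2
    "bob": ["d", "e", "f"],  # first relevant item at rank 1
    "cat": ["g", "h"],       # no relevant item in the list
}
relevant = {"ann": {"b"}, "bob": {"d"}, "cat": {"z"}}

mrr = mean_reciprocal_rank(rankings, relevant)
```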
28
MRR vs. Features
29
Conclusions (so far)
“Similarity features” correlate with recommendation effectiveness
– “Stability” of a metric (as defined in database literature on k-NN search in high dimensions) is related to its ability to discriminate between good and bad neighbours
30
Future Work
How can this knowledge be exploited to improve recommendation systems?
31
News Recommendation Challenge
32
Thanks
Alejandro Bellogín, ERCIM fellow in the Information Access group
Details: Bellogín and De Vries, ICTIR 2013.