1
The Wisdom of the Few Xavier Amatriain, Neal Lathia, Josep M. Pujol SIGIR’09 Advisor: Jia-Ling Koh Speaker: Yu-Cheng Hsieh
2
Outline Introduction Mining the Web For Expert Ratings Expert Nearest-Neighbors Result User Study Discussion Conclusion
3
Introduction Nearest-neighbor collaborative filtering suffers from the following shortcomings - Data sparsity - Noise - Cold-start problem - Scalability These shortcomings make it difficult to compute reliable user-user similarities
4
Mining the Web For Expert Ratings Collect 8,000 movies from Netflix and 1,750 experts from “Rotten Tomatoes” - Remove experts who rated fewer than a minimum number of the movies - Only 169 experts remain General users: members of Netflix
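Below is a minimal sketch (not the authors' code) of the expert-filtering step, assuming the Rotten Tomatoes ratings sit in a pandas DataFrame with expert_id, movie_id, and rating columns; MIN_RATINGS is a hypothetical cutoff, since the slide only states that a minimum is applied.

```python
import pandas as pd

MIN_RATINGS = 250  # hypothetical cutoff; the slide only says experts below a minimum are dropped

def filter_experts(expert_ratings: pd.DataFrame) -> pd.DataFrame:
    """Keep only experts who rated at least MIN_RATINGS distinct movies."""
    counts = expert_ratings.groupby("expert_id")["movie_id"].nunique()
    kept = counts[counts >= MIN_RATINGS].index
    return expert_ratings[expert_ratings["expert_id"].isin(kept)]
```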
5
Mining the Web For Expert Ratings
6
Dataset analysis: Data sparsity - Experts exhibit much lower data sparsity than general users
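As a rough illustration of how the sparsity compared in this analysis can be measured, here is a small sketch assuming the ratings are held in a dense NumPy matrix with NaN for unrated cells (an assumption; the slide does not show the data layout).

```python
import numpy as np

def sparsity(ratings: np.ndarray) -> float:
    """Fraction of user-item cells that are unrated (NaN)."""
    rated = np.count_nonzero(~np.isnan(ratings))
    return 1.0 - rated / ratings.size

# Example: a 3-user x 4-movie matrix with 4 ratings -> sparsity ~ 0.67
m = np.array([[5, np.nan, 3, np.nan],
              [np.nan, 4, np.nan, np.nan],
              [1, np.nan, np.nan, np.nan]])
print(sparsity(m))
```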
7
Mining the Web For Expert Ratings Dataset analysis: Average rating distribution
8
Expert Nearest-Neighbors User-based Collaborative Filtering (CF) is used Two stages 1. construct user-item matrix 2. construct user-expert matrix
9
Expert Nearest-Neighbors Construct user-expert matrix - Calculate the similarity between each user and each expert
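The slide does not name the similarity measure, so the sketch below uses Pearson correlation over co-rated movies as a stand-in; the dictionary-based data layout is also an assumption.

```python
import numpy as np

def pearson_sim(user: dict, expert: dict) -> float:
    """user, expert: {movie_id: rating}. Returns 0 when fewer than 2 co-rated movies."""
    common = set(user) & set(expert)
    if len(common) < 2:
        return 0.0
    u = np.array([user[m] for m in common], dtype=float)
    e = np.array([expert[m] for m in common], dtype=float)
    u -= u.mean()
    e -= e.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(e)
    return float(u @ e / denom) if denom else 0.0
```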
10
Expert Nearest-Neighbors Only select the experts whose similarity to the user exceeds a similarity threshold A second (confidence) threshold sets the minimum number of expert neighbors who must have rated the item
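A small sketch of the two filters described on this slide; `delta` (similarity threshold) and `tau` (minimum number of expert neighbors who rated the item) are illustrative names, not notation taken from the slides.

```python
def select_experts(similarities: dict, delta: float) -> dict:
    """similarities: {expert_id: sim(user, expert)} -> experts at or above the threshold."""
    return {e: s for e, s in similarities.items() if s >= delta}

def enough_neighbors(selected: dict, expert_ratings: dict, item, tau: int) -> bool:
    """expert_ratings: {expert_id: {movie_id: rating}}. True if >= tau selected experts rated the item."""
    raters = [e for e in selected if item in expert_ratings.get(e, {})]
    return len(raters) >= tau
```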
11
Expert Nearest-Neighbors Construct user-expert matrix - Rating prediction: the mean ratings of the user and of each expert are combined into the predicted rating of the item for the user (see the sketch below)
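The prediction formula itself is not reproduced on this slide, so the sketch below implements the standard mean-centered, similarity-weighted prediction that the legend suggests: the user's mean rating plus a similarity-weighted average of the experts' mean-centered ratings. Treat it as an interpretation, not the authors' exact formula.

```python
def predict_rating(user_mean: float, expert_sims: dict, expert_ratings: dict,
                   expert_means: dict, item) -> float:
    """expert_sims: {expert_id: sim(user, expert)}; expert_ratings: {expert_id: {movie: rating}}.
    Falls back to the user's mean when no selected expert rated the item."""
    num = den = 0.0
    for e, sim in expert_sims.items():
        if item in expert_ratings.get(e, {}):
            num += sim * (expert_ratings[e][item] - expert_means[e])
            den += abs(sim)
    return user_mean if den == 0.0 else user_mean + num / den
```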
12
Result Error in Predicted Recommendations - 5-fold cross-validation - Baseline: prediction without the similarity calculation
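A sketch of the evaluation protocol named on the slide: 5-fold cross-validation, with mean absolute error used here as the error metric (the slide only says "error"); `predict` stands for whichever method is being evaluated and is an assumption.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validated_mae(ratings, predict, n_splits=5, seed=0) -> float:
    """ratings: list of (user, item, rating); predict(train_rows, user, item) -> float."""
    rows = np.array(ratings, dtype=object)
    errors = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(rows):
        train_rows = [tuple(r) for r in rows[train_idx]]
        for user, item, r in rows[test_idx]:
            errors.append(abs(predict(train_rows, user, item) - float(r)))
    return float(np.mean(errors))
```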
13
Result Expert CF and Neighbor CF:
14
Result
17
Top-N Recommendation Precision - No recommendation list with a fixed number N of items - Items are only classified as recommendable or not recommendable given a rating threshold
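A sketch of this precision variant, assuming a 1-5 rating scale: an item counts as recommended when its predicted rating reaches a cutoff, and precision is the share of those items the user actually rated highly. The threshold values are assumptions.

```python
def threshold_precision(predicted: dict, actual: dict,
                        rec_threshold: float = 4.0, like_threshold: float = 4.0) -> float:
    """predicted, actual: {item: rating} for one user; only items with a known actual rating count."""
    recommended = [i for i, p in predicted.items() if p >= rec_threshold and i in actual]
    if not recommended:
        return 0.0
    hits = sum(1 for i in recommended if actual[i] >= like_threshold)
    return hits / len(recommended)
```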
18
Result
19
User Study Select 500,000 movies from Netflix (random sample) Separate the movies into 10 equal-density bins according to popularity Select 10 movies from each bin (100 movies in total) for 57 participants to rate
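A sketch of the stratified sampling described on this slide: movies are split into 10 equal-density popularity bins and 10 movies are drawn from each. The pandas layout and the rating-count popularity proxy are assumptions.

```python
import pandas as pd

def sample_movies(popularity: pd.Series, n_bins: int = 10, per_bin: int = 10, seed: int = 0) -> list:
    """popularity: Series indexed by movie_id with the number of ratings per movie."""
    # Rank first so qcut always produces equal-density bins, even with tied counts.
    bins = pd.qcut(popularity.rank(method="first"), q=n_bins, labels=False)
    sampled = []
    for b in range(n_bins):
        in_bin = popularity[bins == b].index.to_series()
        sampled += in_bin.sample(n=per_bin, random_state=seed).tolist()
    return sampled
```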
20
User Study 8,000 movies and their ratings are collected Generate 4 top-10 recommendation lists - Random List - Critics’ Choice - Neighbor-CF - Expert-CF
21
User Study K=50
22
User Study
23
User Study Result
24
Discussion Data sparsity - Experts are more likely to have rated a large percentage of the items Noise and malicious ratings - Noise: experts are more consistent in their ratings - Malicious ratings: experts are professional reviewers and unlikely to rate maliciously Cold-start problem - Experts tend to rate new items soon after they are released Scalability - Using a small set of experts reduces the time complexity of computing the similarity matrix
25
Conclusions Proposed an approach that recommends based on the opinions of an external source, using only a few experts to predict ratings Tackles the problems of traditional kNN-CF