1
The Wisdom of the Few Xavier Amatriain, Neal Lathia, Josep M. Pujol SIGIR’09 Advisor: Jia-Ling Koh Speaker: Yu-Cheng Hsieh
2
Outline Introduction Mining the Web For Expert Ratings Expert Nearest-Neighbors Result User Study Discussion Conclusion
3
Introduction Nearest-neighbor collaborative filtering suffers from the following shortcomings - Data sparsity - Noise - Cold-start problem - Scalability These shortcomings make it difficult to compute reliable user-user similarities
4
Mining the Web For Expert Ratings Collect 8,000 movies from Netflix and 1,750 experts from “Rotten Tomatoes” - Remove experts who rated fewer than a minimum number of the movies - Only 169 experts remain General users: members of Netflix
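Below is a minimal sketch (not the authors' code) of the expert-filtering step, assuming the Rotten Tomatoes ratings sit in a pandas DataFrame with expert_id, movie_id, and rating columns; MIN_RATINGS is a hypothetical cutoff, since the slide only states that a minimum is applied.

```python
import pandas as pd

MIN_RATINGS = 250  # hypothetical cutoff; the slide only says experts below a minimum are dropped

def filter_experts(expert_ratings: pd.DataFrame) -> pd.DataFrame:
    """Keep only experts who rated at least MIN_RATINGS distinct movies."""
    counts = expert_ratings.groupby("expert_id")["movie_id"].nunique()
    kept = counts[counts >= MIN_RATINGS].index
    return expert_ratings[expert_ratings["expert_id"].isin(kept)]
```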
5
Mining the Web For Expert Ratings
6
Dataset analysis: Data sparsity - Experts exhibit much lower data sparsity than general users
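As a rough illustration of how the sparsity compared in this analysis can be measured, here is a small sketch assuming the ratings are held in a dense NumPy matrix with NaN for unrated cells (an assumption; the slide does not show the data layout).

```python
import numpy as np

def sparsity(ratings: np.ndarray) -> float:
    """Fraction of user-item cells that are unrated (NaN)."""
    rated = np.count_nonzero(~np.isnan(ratings))
    return 1.0 - rated / ratings.size

# Example: a 3-user x 4-movie matrix with 4 ratings -> sparsity ~ 0.67
m = np.array([[5, np.nan, 3, np.nan],
              [np.nan, 4, np.nan, np.nan],
              [1, np.nan, np.nan, np.nan]])
print(sparsity(m))
```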
7
Mining the Web For Expert Ratings Dataset analysis: Average rating distribution
8
Expert Nearest-Neighbors User-based Collaborative Filtering (CF) is used Two stages 1. construct user-item matrix 2. construct user-expert matrix
9
Expert Nearest-Neighbors Construct user-expert matrix - Calculate the similarity between each user and each expert
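The slide does not name the similarity measure, so the sketch below uses Pearson correlation over co-rated movies as a stand-in; the dictionary-based data layout is also an assumption.

```python
import numpy as np

def pearson_sim(user: dict, expert: dict) -> float:
    """user, expert: {movie_id: rating}. Returns 0 when fewer than 2 co-rated movies."""
    common = set(user) & set(expert)
    if len(common) < 2:
        return 0.0
    u = np.array([user[m] for m in common], dtype=float)
    e = np.array([expert[m] for m in common], dtype=float)
    u -= u.mean()
    e -= e.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(e)
    return float(u @ e / denom) if denom else 0.0
```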
10
Expert Nearest-Neighbors Only select the experts whose similarity to the user exceeds a similarity threshold A second (confidence) threshold sets the minimum number of expert neighbors who must have rated the item
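A small sketch of the two filters described on this slide; `delta` (similarity threshold) and `tau` (minimum number of expert neighbors who rated the item) are illustrative names, not notation taken from the slides.

```python
def select_experts(similarities: dict, delta: float) -> dict:
    """similarities: {expert_id: sim(user, expert)} -> experts at or above the threshold."""
    return {e: s for e, s in similarities.items() if s >= delta}

def enough_neighbors(selected: dict, expert_ratings: dict, item, tau: int) -> bool:
    """expert_ratings: {expert_id: {movie_id: rating}}. True if >= tau selected experts rated the item."""
    raters = [e for e in selected if item in expert_ratings.get(e, {})]
    return len(raters) >= tau
```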
11
Expert Nearest-Neighbors Construct user-expert matrix - Rating prediction: the mean ratings of the user and of each expert are combined into the predicted rating of the item for the user (see the sketch below)
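The prediction formula itself is not reproduced on this slide, so the sketch below implements the standard mean-centered, similarity-weighted prediction that the legend suggests: the user's mean rating plus a similarity-weighted average of the experts' mean-centered ratings. Treat it as an interpretation, not the authors' exact formula.

```python
def predict_rating(user_mean: float, expert_sims: dict, expert_ratings: dict,
                   expert_means: dict, item) -> float:
    """expert_sims: {expert_id: sim(user, expert)}; expert_ratings: {expert_id: {movie: rating}}.
    Falls back to the user's mean when no selected expert rated the item."""
    num = den = 0.0
    for e, sim in expert_sims.items():
        if item in expert_ratings.get(e, {}):
            num += sim * (expert_ratings[e][item] - expert_means[e])
            den += abs(sim)
    return user_mean if den == 0.0 else user_mean + num / den
```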
12
Result Error in Predicted Recommendations - 5-fold cross-validation - Baseline: prediction without the similarity calculation
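A sketch of the evaluation protocol named on the slide: 5-fold cross-validation, with mean absolute error used here as the error metric (the slide only says "error"); `predict` stands for whichever method is being evaluated and is an assumption.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validated_mae(ratings, predict, n_splits=5, seed=0) -> float:
    """ratings: list of (user, item, rating); predict(train_rows, user, item) -> float."""
    rows = np.array(ratings, dtype=object)
    errors = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(rows):
        train_rows = [tuple(r) for r in rows[train_idx]]
        for user, item, r in rows[test_idx]:
            errors.append(abs(predict(train_rows, user, item) - float(r)))
    return float(np.mean(errors))
```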
13
Result Expert CF and Neighbor CF:
14
Result
17
Top-N Recommendation Precision - No recommendation list with a fixed number N of items - Items are only classified as recommendable or not recommendable given a rating threshold
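A sketch of this precision variant, assuming a 1-5 rating scale: an item counts as recommended when its predicted rating reaches a cutoff, and precision is the share of those items the user actually rated highly. The threshold values are assumptions.

```python
def threshold_precision(predicted: dict, actual: dict,
                        rec_threshold: float = 4.0, like_threshold: float = 4.0) -> float:
    """predicted, actual: {item: rating} for one user; only items with a known actual rating count."""
    recommended = [i for i, p in predicted.items() if p >= rec_threshold and i in actual]
    if not recommended:
        return 0.0
    hits = sum(1 for i in recommended if actual[i] >= like_threshold)
    return hits / len(recommended)
```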
18
Result
19
User Study Select 500,000 movies from Netflix (random sample) Separate the movies into 10 equal-density bins according to popularity Select 10 movies from each bin (100 movies in total) for 57 participants to rate
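A sketch of the stratified sampling described on this slide: movies are split into 10 equal-density popularity bins and 10 movies are drawn from each. The pandas layout and the rating-count popularity proxy are assumptions.

```python
import pandas as pd

def sample_movies(popularity: pd.Series, n_bins: int = 10, per_bin: int = 10, seed: int = 0) -> list:
    """popularity: Series indexed by movie_id with the number of ratings per movie."""
    # Rank first so qcut always produces equal-density bins, even with tied counts.
    bins = pd.qcut(popularity.rank(method="first"), q=n_bins, labels=False)
    sampled = []
    for b in range(n_bins):
        in_bin = popularity[bins == b].index.to_series()
        sampled += in_bin.sample(n=per_bin, random_state=seed).tolist()
    return sampled
```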
20
User Study 8,000 movies and their ratings are collected Generate 4 top-10 recommendation lists - Random List - Critics’ Choice - Neighbor-CF - Expert-CF
21
User Study K=50
22
User Study
23
User Study Result
24
Discussion Data sparsity - Experts are more likely to have rated a large percentage of the items Noise and malicious ratings - Noise: experts are more consistent in their ratings - Malicious ratings: experts are professional reviewers and unlikely to rate maliciously Cold-start problem - Experts tend to rate new items soon after they are released Scalability - Using a small set of experts reduces the time complexity of computing the similarity matrix
25
Conclusions Proposed an approach that recommends based on the opinions of an external source, using only a few experts to predict ratings Tackles the problems of traditional kNN-CF