1 Quest for $1,000,000: The Netflix Prize
Bob Bell, AT&T Labs-Research
July 15, 2009
Joint work with Chris Volinsky, AT&T Labs-Research, and Yehuda Koren, Yahoo! Research
2 Recommender Systems
Personalized recommendations of items (e.g., movies) to users
Increasingly common
–To deal with explosive number of choices on the internet
–Netflix
–Amazon
–Many others
3 Content Based Systems
A pre-specified list of attributes
Score each item on all attributes
User interest obtained for the same attributes
–Direct solicitation, or
–Estimated based on user purchases or ratings
4 Pandora
Music recommendation system
Songs rated on 400+ attributes
–Music Genome Project
–Roots, instrumentation, lyrics, vocals
Two types of user feedback
–Seed songs
–Thumbs up/down for recommended songs
5 Drawbacks of Content Based Systems
Effort to score all items on many attributes
–Best attributes may be unknown
–Some attributes may be unscorable
Need for direct solicitation of data from users in some systems
6 Collaborative Filtering (CF)
Does not require content information about items or solicitation of users
Infers user-item relationships from purchases or ratings
Used by Amazon and Netflix
7 “We’re quite curious, really. To the tune of one million dollars.” – Netflix Prize rules
Goal: improve on Netflix’s existing movie recommendation technology
Prize
–Based on reduction in root mean squared error (RMSE) on test data
–$1,000,000 grand prize for a 10% drop
–Or a $50,000 progress prize for the best result each year
Contest began October 2, 2006
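The prize criterion is a relative drop in RMSE. As a minimal sketch of that arithmetic (the function names and the sample numbers are illustrative assumptions, not part of the contest rules):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def relative_improvement(baseline_rmse, new_rmse):
    """Fractional drop in RMSE versus a baseline (0.10 means a 10% drop)."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical ratings and predictions, for illustration only.
actual = [4, 3, 5, 1]
baseline_predictions = [3.5, 3.5, 3.5, 3.5]
improved_predictions = [3.8, 3.2, 4.4, 2.0]

base = rmse(baseline_predictions, actual)
new = rmse(improved_predictions, actual)
print(f"baseline RMSE {base:.4f}, new RMSE {new:.4f}, "
      f"improvement {relative_improvement(base, new):.1%}")
```

Under this definition, a team qualifies for the grand prize when `relative_improvement` against the existing system's test RMSE reaches 0.10.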
8 Data Details
Training data
–100 million ratings (from 1 to 5 stars)
–6 years (2000-2005)
–480,000 users
–17,770 “movies”
Test data
–Last few ratings of each user
–User, movie, date given
–Ratings withheld (for most of test data)
–Teams are allowed daily feedback on their RMSE
9 Higher Mean Rating in Test Data
10 Something Happened in Early 2004
11 Movies Rated Most Often
Title                       # Ratings   Mean Rating
Miss Congeniality             227,715          3.36
Independence Day              216,233          3.72
The Patriot                   200,490          3.78
The Day After Tomorrow        194,695          3.44
Pretty Woman                  190,320          3.90
Pirates of the Caribbean      188,849          4.15
The Green Mile                180,883          4.31
Forrest Gump                  180,736          4.30
12 Most Active Users
User ID    # Ratings   Mean Rating
305344        17,651          1.90
387418        17,432          1.81
2439493       16,560          1.22
1664010       15,811          4.26
2118461       14,829          4.08
1461435        9,820          1.37
1639792        9,764          1.33
1314869        9,739          2.95
13 Ratings per Movie in Training Data
Avg #ratings/movie: 5627
14 Ratings per User in Training Data
Avg #ratings/user: 208
15 Progress after 2 Months
16 Progress after 8 Months
17 Nearest Neighbor (NN) Methods
Most common CF tool
Predict rating for a specific user-item pair based on ratings of
–Similar items
–By the same user
–Or vice versa
Requires no “content” about items or users
Easy to apply
Easy to explain to users
But not as powerful as other methods
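The item-based variant described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: cosine similarity over co-rating users and a similarity-weighted average of the k nearest items are choices made here, not details given in the talk, and the function names and data layout are hypothetical.

```python
import math

def item_similarity(ratings, item_a, item_b):
    """Cosine similarity over users who rated both items (an assumed choice)."""
    common = [u for u in ratings if item_a in ratings[u] and item_b in ratings[u]]
    if not common:
        return 0.0
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][item_a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][item_b] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item, k=2):
    """Similarity-weighted average of the user's ratings on the k most similar items."""
    neighbors = []
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = item_similarity(ratings, item, other)
        if sim > 0:
            neighbors.append((sim, rating))
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    if not top:
        return None  # no similar item rated by this user
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical ratings: user -> {item: stars}.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"B": 5, "C": 1},
}
print(predict_rating(ratings, "u3", "A", k=1))
```

The `None` return illustrates why the method can be easy to explain yet weak in places: it has nothing to say when the user has rated no similar items.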
18 Latent Factor Models
Explain ratings by a set of latent factors (attributes)
–Factors are learned from the data
–No need for pre-specification
Neural networks
SVD (Singular Value Decomposition)
–AKA matrix factorization
–Dominant method used by leaders of competition
19 Item Factors
Each item summarized by a d-dimensional vector $q_i$
Potential factors
–Comedy vs. drama
–Amount of action
–Depth of character development
–Totally uninterpretable
Choose d much smaller than number of items or users
–e.g., d = 50 << 18,000 or 480,000
20 User Factors
Similarly, each user summarized by $p_u$
Same number of factors
User factors measure interest in corresponding item factors
Predicted rating for Item i by User u
–Inner product of $q_i$ and $p_u$
–$\hat{r}_{ui} = q_i^\top p_u$
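The inner-product prediction on this slide can be written out directly. The sketch below uses d = 2 and made-up factor values purely for illustration; the function name is hypothetical.

```python
def predict_from_factors(item_factors, user_factors):
    """Predicted rating as the inner product of an item vector q_i and a user vector p_u."""
    if len(item_factors) != len(user_factors):
        raise ValueError("item and user vectors must have the same length d")
    return sum(q * p for q, p in zip(item_factors, user_factors))

# Hypothetical factor values with d = 2, for illustration only.
q_i = [1.5, -0.5]  # item's score on each factor
p_u = [2.0, 1.0]   # user's interest in each factor
print(predict_from_factors(q_i, p_u))  # 2.5
```

A large positive product on a factor means the user's interest and the item's score point the same way on that factor, which raises the predicted rating.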
21 [Figure: movies placed in a two-factor space. Axes: geared towards females vs. geared towards males; serious vs. escapist. Movies shown: The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean’s 11, Sense and Sensibility]
22 [Figure: the same two-factor movie map, with users Gus and Dave added]
23 Challenges in Using SVD
Need lots of factors (large d)
24 Challenges in Using SVD
Need lots of factors (large d)
Easy to overfit
25 The Fundamental Challenge
How can we estimate as much signal as possible where there are sufficient data, without overfitting where data are scarce?
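One common way to address this trade-off is to penalize large factor values, so that users and items with few ratings stay close to zero while well-observed ones can move further. The slide does not say which technique the authors used; the sketch below assumes stochastic gradient descent on squared error with an L2 penalty, and every name and parameter value in it is hypothetical.

```python
import random

def train_factors(ratings, n_users, n_items, d=2, lr=0.01, reg=0.05,
                  epochs=200, seed=0):
    """Fit user factors p and item factors q by SGD with an L2 penalty.

    ratings: list of (user_index, item_index, rating) triples.
    reg: penalty strength; larger values shrink factors harder toward zero.
    """
    rng = random.Random(seed)
    p = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Error of the current inner-product prediction for this rating.
            err = r - sum(pf * qf for pf, qf in zip(p[u], q[i]))
            for f in range(d):
                pu, qi = p[u][f], q[i][f]
                # Gradient step on squared error; the reg term pulls toward zero.
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q

# Hypothetical ratings for 3 users and 3 items.
data = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
        (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
p, q = train_factors(data, n_users=3, n_items=3)
```

Because each factor is only updated when one of its ratings is visited, a user with few ratings receives few error-driven updates, and the penalty keeps that user's vector small. That is one way to avoid overfitting where data are scarce.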
26-29 [Figure, repeated on four slides: the two-factor movie map (geared towards females vs. geared towards males; serious vs. escapist) with user Gus]
30 Challenges in Using SVD
Need lots of factors (large d)
Easy to overfit
User behavior may change over time
–Ratings go up or down
–Interests may change
–Composition of account may change, for example, with addition of a new rater
31-33 [Figure, repeated on three slides: the two-factor movie map (geared towards females vs. geared towards males; serious vs. escapist) with user Gus; slide 33 adds a “+” marker]
34 Challenges in Using SVD
Need lots of factors (large d)
Easy to overfit
User behavior may change over time
Misses some types of patterns
35 Neither SVD nor NN is Perfect
SVD is poorly suited to fully capture strong “local” relationships
–e.g., among sequels
NN ignores cumulative effect of many small signals
–May be ineffective for items with no close neighbors
Each method complements the other
36 The Wisdom of Crowds (of Models)
“All models are wrong; some are useful” – G. Box
Our best entry during Year 1 was a linear combination of 107 sets of predictions
–Nearest neighbors, SVD, neural nets, et al.
–Many variations of model structure and parameter settings
Years 2 and 3
–Individual models are more comprehensive and much more accurate
–Combining many models still helps
–Five models suffice to beat Year 1 score
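To show why a linear combination can beat its components, here is a minimal sketch that blends just two prediction sets with weights summing to one. This is a simplification chosen for illustration, not the 107-model blend described on the slide; the function names and numbers are hypothetical.

```python
def blend_weight(preds_a, preds_b, actual):
    """Least-squares weight w for the blend w * a + (1 - w) * b."""
    num = sum((a - b) * (r - b) for a, b, r in zip(preds_a, preds_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    if den == 0:
        return 0.5  # identical predictions: every weight gives the same blend
    return num / den

def blend(preds_a, preds_b, w):
    """Combine two prediction lists with weight w on the first."""
    return [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]

# Hypothetical predictions whose errors point in opposite directions.
actual = [4.0, 2.0, 5.0, 3.0]
model_a = [4.5, 2.5, 4.5, 2.5]
model_b = [3.5, 1.5, 5.5, 3.5]

w = blend_weight(model_a, model_b, actual)
print(w, blend(model_a, model_b, w))
```

In this contrived case the two models err in opposite directions, so the blend recovers the ratings exactly. In practice the weight would be fit on ratings held out from model training, since fitting it on the training data rewards overfit models.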
37 Progress after 1 Year
38 Is this Any Way to do Science?
Wide participation
–Submissions from 5,000 teams
–8,300 posts on the Netflix Prize forum
Generation and dissemination of new methods
–Presentations/workshops in academic conferences
–Journal publications
Reasons for success
–Well designed by Netflix
–Industrial strength data set
–Opportunity to build on work of others
–Collegial spirit of competitors
39 The Race is On
40 Thank You!
rbell@research.att.com
www.netflixprize.com
–…/leaderboard
–…/community
Click BellKor on Leaderboard for details