CS 277: The Netflix Prize Professor Padhraic Smyth Department of Computer Science University of California, Irvine.

CS 277: Data Mining Netflix Competition Overview Padhraic Smyth, UC Irvine

Netflix
- Movie rentals by DVD (mail) and online (streaming)
- 100k movies, 10 million customers
- Ships 1.9 million disks to customers each day
  - 50 warehouses in the US
  - Complex logistics problem
- Employees: 2000
  - But relatively few in engineering/software
  - And only a few people working on recommender systems
- Moving towards online delivery of content
- Significant interaction of customers with Web site

The $1 Million Question

Million Dollars Awarded Sept 21st 2009

Background

Ratings Data: 17,700 movies, 480,000 users

Training Data
- 100 million ratings (matrix is 99% sparse)
- Rating = [user, movie-id, time-stamp, rating value]
- Generated by users between Oct 1998 and Dec 2005
- Users randomly chosen among set with at least 20 ratings
  - Small perturbations to help with anonymity

Ratings Data (figure: ratings matrix of 480,000 users x 17,700 movies; the test data set consists of the most recent ratings, shown as unknown "?" entries)

Structure of Competition
- Register to enter at Netflix site
- Download training data of 100 million ratings
  - 480k users x 17.7k movies
  - Anonymized
- Submit predictions for 3 million ratings in “test set”
  - True ratings are known only to Netflix
- Can submit multiple times (limit of once/day)
- Prize
  - $1 million if error is 10% lower than Netflix current system
  - Annual progress prize of $50,000 to leading team each year

Scoring
- Minimize root mean square error (RMSE)
- Does not necessarily correlate well with user satisfaction
- But is a widely-used, well-understood quantitative measure
- Mean square error = $\frac{1}{|R|} \sum_{(u,i) \in R} (r_{ui} - \hat{r}_{ui})^2$, where $\hat{r}_{ui}$ is the predicted rating
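The scoring measure can be written out in a few lines. This is a minimal sketch: the function name `rmse` and the toy ratings are illustrative assumptions, not part of the competition's scoring code.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over the known (user, movie) ratings in R."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    mse = sum((r - r_hat) ** 2 for r, r_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)

# Toy example (made-up numbers): four known ratings and four predictions.
true_ratings = [4, 3, 5, 1]
predictions = [3.5, 3.0, 4.0, 2.0]
score = rmse(true_ratings, predictions)
print(score)
```

Because errors are squared before averaging, one prediction that is off by 2 stars costs as much as four predictions that are each off by 1 star.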

Data split (figure)
- Training Data: 100 million ratings, labels known publicly
- Held-Out Data: 3 million ratings, labels only known to Netflix, split into 1.5m-rating subsets
  - Quiz Set: scores posted on leaderboard
  - Test Set: scores known only to Netflix; scores used in determining final winner

RMSE Baseline Scores on Test Data (figure; numeric scores missing from transcript)
- Just predict the mean user rating for each movie
- Netflix’s own system (Cinematch) as of [date missing in transcript]
- Nearest-neighbor method using correlation
- Required 10% reduction to win $1 million

Other Aspects of Rules
- Rights
  - Software + non-exclusive license to Netflix
  - Algorithm description to be posted publicly
- Final prize details
  - If public score of any contestant is better than 10%, this triggers a 30-day final competition period
  - Anyone can submit scores in this 30-day period
  - Best score at the end of the 30-day period wins the $1 million prize
- Competition not open to entrants in North Korea, Iran, Libya, Cuba ... and Quebec

Why did Netflix do this?
- Customer satisfaction/retention is key to Netflix: they would really like to improve their recommender systems
- Progress with internal system (Cinematch) was slow
- Initial prize idea from CEO Reed Hastings
- $1 million would likely easily pay for itself
- Potential downsides
  - Negative publicity (e.g., privacy)
  - No-one wins the prize (conspiracy theory)
  - The prize is won within a day or 2
  - Person-hours at Netflix to run the competition
  - Algorithmic solutions are not useful operationally

Key Technical Ideas

Outline
- Focus primarily on techniques used by Koren, Volinsky, Bell team (winners of prize)
  - We will focus on some of the main ideas used in their algorithms
  - Many other details in their papers, and in other papers published on the Netflix data set
- Useful References
  - Y. Koren, Collaborative filtering with temporal dynamics, ACM SIGKDD Conference 2009
  - Koren, Bell, Volinsky, Matrix factorization techniques for recommender systems, IEEE Computer, 2009
  - Y. Koren, Factor in the neighbors: scalable and accurate collaborative filtering, ACM Transactions on Knowledge Discovery from Data, 2010

Singular Value Decomposition
$R = U \Sigma V^t$, with dimensions (m x n) = (m x n)(n x n)(n x n), where:
- columns of $V$ are eigenvectors of $R^t R$
- $\Sigma$ is diagonal (singular values, i.e., square roots of the eigenvalues of $R^t R$)
- rows of $U$ are coefficients in V-space of each row in $R$ (up to scaling by $\Sigma$)

Matrix Approximation with SVD
$R \approx U \Sigma V^t$, with dimensions (m x n) ≈ (m x f)(f x f)(f x n), where:
- columns of $V$ are the first f eigenvectors of $R^t R$
- $\Sigma$ is diagonal with the f largest singular values (square roots of the f largest eigenvalues of $R^t R$)
- rows of $U$ are coefficients in reduced dimension V-space
This approximation is the best rank-f approximation to matrix R in a least squares sense (principal components analysis)
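The rank-f approximation can be illustrated on a small dense matrix. This is a sketch using NumPy's `numpy.linalg.svd`; the helper name `rank_f_approximation` and the random 6 x 4 matrix are assumptions for illustration, and the real ratings matrix is far too large and too sparse for this direct approach.

```python
import numpy as np

def rank_f_approximation(R, f):
    """Keep only the f largest singular values and their singular vectors."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return U[:, :f] @ np.diag(s[:f]) @ Vt[:f, :]

rng = np.random.default_rng(0)
R = rng.normal(size=(6, 4))          # toy dense matrix standing in for ratings
f = 2
R_f = rank_f_approximation(R, f)

s = np.linalg.svd(R, compute_uv=False)
frobenius_error = np.linalg.norm(R - R_f)

# Least-squares optimality: the error is determined by the discarded singular values.
print(frobenius_error, np.sqrt(np.sum(s[f:] ** 2)))

# The squared singular values are the eigenvalues of R^t R.
eigenvalues = np.sort(np.linalg.eigvalsh(R.T @ R))[::-1]
print(np.allclose(eigenvalues, s ** 2))
```

The two printed error values agree, which is the "best rank-f approximation in a least squares sense" property stated on the slide.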

Matrix Factorization of Ratings Data (figure: the m users x n movies ratings matrix is approximated by the product of an m x f user-factor matrix and an f x n movie-factor matrix)
$r_{ui} \approx p_u^t q_i = q_i^t p_u$

Figure from Koren, Bell, Volinsky, IEEE Computer, 2009

Computation of Matrix Factors
- Problem 1: Finding the f factors is equivalent to performing a singular value decomposition of a matrix
  - Let R be an m x n matrix
  - SVD computation has complexity $O(mn^2 + n^3)$
- Problem 2: Most of the entries in R are missing, i.e., only $100 \times 10^6 / (480\text{k} \times 17\text{k}) \approx 1\%$ are present
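The arithmetic behind both problems can be checked directly. This sketch uses the dataset sizes quoted earlier in the slides (480,000 users, 17,700 movies, 100 million ratings); the operation count simply evaluates the $mn^2 + n^3$ expression and ignores constant factors.

```python
n_users = 480_000         # m
n_movies = 17_700         # n
n_ratings = 100_000_000   # known entries of R

cells = n_users * n_movies
density = n_ratings / cells
svd_ops = n_users * n_movies ** 2 + n_movies ** 3

print(f"cells in the full matrix:     {cells:,}")
print(f"fraction of ratings present:  {density:.2%}")
print(f"m*n^2 + n^3 (order of ops):   {svd_ops:.2e}")
```

With only about 1% of cells observed, treating the missing 99% as zeros (as a plain SVD would) is the more serious of the two problems, which motivates fitting the factors to known ratings only.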

Dealing with Missing Data
- Model: $r_{ui} \approx q_i^t p_u$
- Fit to known ratings only: $\min_{q,p} \sum_{(u,i) \in R} (r_{ui} - q_i^t p_u)^2$ (sum is only over known ratings)
- Add regularization: $\min_{q,p} \sum_{(u,i) \in R} (r_{ui} - q_i^t p_u)^2 + \lambda (|q_i|^2 + |p_u|^2)$
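A minimal sketch of the regularized objective, evaluated only over known ratings. The data layout (a list of `(u, i, r_ui)` triples), the function name `regularized_loss`, and the choice to apply the penalty once per known rating are assumptions made for illustration.

```python
import numpy as np

def regularized_loss(ratings, P, Q, lam):
    """Sum over known ratings of squared error plus lam * (|q_i|^2 + |p_u|^2).

    ratings: list of (u, i, r_ui) triples, one per known rating
    P: (m x f) array of user factors p_u; Q: (n x f) array of movie factors q_i
    """
    total = 0.0
    for u, i, r in ratings:
        err = r - Q[i] @ P[u]
        total += err ** 2 + lam * (Q[i] @ Q[i] + P[u] @ P[u])
    return total

# One user, one movie, one known rating; every other cell is simply never visited.
P = np.array([[1.0, 0.0]])
Q = np.array([[2.0, 0.0]])
loss = regularized_loss([(0, 0, 3.0)], P, Q, lam=0.1)
print(loss)
```

Here the prediction is 2.0, so the squared error is 1.0, and the penalty is 0.1 * (4 + 1) = 0.5.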

Stochastic Gradient Descent (SGD)
- Objective: $\min_{q,p} \sum_{(u,i) \in R} (r_{ui} - q_i^t p_u)^2 + \lambda (|q_i|^2 + |p_u|^2)$, where the first term is goodness of fit and the second is regularization
- Online (“stochastic”) gradient update equations:
  - $\epsilon_{ui} = r_{ui} - q_i^t p_u$
  - $q_i \leftarrow q_i + \gamma (\epsilon_{ui} p_u - \lambda q_i)$, etc.
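The update equations can be sketched as follows. The symmetric update for $p_u$ is assumed to be the "etc." on the slide. The step size `gamma`, the penalty `lam`, the initialization, and the toy ratings are all illustrative choices rather than values from the competition.

```python
import numpy as np

def sgd_epoch(ratings, P, Q, gamma=0.01, lam=0.05):
    """One pass over the known ratings, updating P and Q in place."""
    for u, i, r in ratings:
        eps = r - Q[i] @ P[u]                       # prediction error for this rating
        q_old = Q[i].copy()                         # use the pre-update q_i for p_u's step
        Q[i] += gamma * (eps * P[u] - lam * Q[i])
        P[u] += gamma * (eps * q_old - lam * P[u])

def sum_squared_error(ratings, P, Q):
    return sum((r - Q[i] @ P[u]) ** 2 for u, i, r in ratings)

# Toy data: 3 users, 3 movies, 6 known ratings as (user, movie, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 4.0), (2, 2, 2.0)]
rng = np.random.default_rng(0)
P = rng.uniform(0.5, 1.0, size=(3, 2))   # user factors, f = 2
Q = rng.uniform(0.5, 1.0, size=(3, 2))   # movie factors, f = 2

error_before = sum_squared_error(ratings, P, Q)
for _ in range(200):
    sgd_epoch(ratings, P, Q)
error_after = sum_squared_error(ratings, P, Q)
print(error_before, error_after)
```

Each step touches only one user vector and one movie vector, so the cost of an epoch scales with the number of known ratings rather than with the size of the full matrix.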

Modeling Systematic Biases
- $\mu$ = overall mean rating
- $b_u$ = mean rating offset for user u (relative to $\mu$)
- $b_i$ = mean rating offset for movie i (relative to $\mu$)

Components of a rating predictor: user bias + movie bias + user-movie interaction
- User-movie interaction
  - Characterizes the matching between users and movies
  - Attracts most research in the field
  - Benefits from algorithmic and mathematical innovations
- Baseline predictor (user bias + movie bias)
  - Separates users and movies
  - Often overlooked
  - Benefits from insights into users’ behavior
  - Among the main practical contributions of the competition
(slide from Yehuda Koren)

A baseline predictor
- We have expectations on the rating by user u of movie i, even without estimating u’s attitude towards movies like i
  - Rating scale of user u
  - Values of other ratings user gave recently (day-specific mood, anchoring, multi-user accounts)
  - (Recent) popularity of movie i
  - Selection bias; related to number of ratings user gave on the same day (“frequency”)
(slide from Yehuda Koren)

Modeling Systematic Biases
$r_{ui} \approx \mu + b_u + b_i + q_i^t p_u$
- $\mu$: overall mean rating
- $b_u$: mean rating offset for user u
- $b_i$: mean rating offset for movie i
- $q_i^t p_u$: user-movie interactions
Example:
- Mean rating $\mu = 3.7$
- You are a critical reviewer: your ratings are 1 lower than the mean, so $b_u = -1$
- Star Wars gets a mean rating 0.5 higher than the average movie, so $b_i = +0.5$
- Predicted rating for you on Star Wars = 3.7 - 1 + 0.5 = 3.2
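The worked example can be reproduced in code, together with one simple way to estimate the offsets from data. Estimating $b_i$ and $b_u$ by plain averaging of residuals is an assumption made here for illustration (the slides go on to learn them by minimizing a regularized objective), and the function names and toy ratings are hypothetical.

```python
from collections import defaultdict

def baseline_prediction(mu, b_u, b_i):
    """Baseline predictor: overall mean plus user offset plus movie offset."""
    return mu + b_u + b_i

def estimate_biases(ratings):
    """Estimate mu, b_u, b_i from (user, movie, rating) triples by simple averaging."""
    mu = sum(r for _, _, r in ratings) / len(ratings)
    movie_resid = defaultdict(list)
    for _, i, r in ratings:
        movie_resid[i].append(r - mu)
    b_i = {i: sum(v) / len(v) for i, v in movie_resid.items()}
    user_resid = defaultdict(list)
    for u, i, r in ratings:
        user_resid[u].append(r - mu - b_i[i])
    b_u = {u: sum(v) / len(v) for u, v in user_resid.items()}
    return mu, b_u, b_i

# The slide's example: critical reviewer (-1) rating Star Wars (+0.5).
star_wars = baseline_prediction(3.7, -1.0, 0.5)
print(round(star_wars, 2))

# Toy data (made up): two users, two movies.
toy = [("a", "x", 4), ("a", "y", 2), ("b", "x", 5), ("b", "y", 3)]
mu, b_u, b_i = estimate_biases(toy)
print(mu, b_u, b_i)
```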

Objective Function
$\min_{q,p} \sum_{(u,i) \in R} (r_{ui} - (\mu + b_u + b_i + q_i^t p_u))^2 + \lambda (|q_i|^2 + |p_u|^2 + |b_u|^2 + |b_i|^2)$
- The first term is goodness of fit; the second is regularization
- $\lambda$ is typically selected via grid-search on a validation set
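A hedged sketch of fitting this objective by stochastic gradient descent and choosing $\lambda$ by grid search on a validation set. The bias updates mirror the factor updates shown earlier; all names (`train_biased_mf`, `grid_search`), hyperparameter values, and the toy train/validation data are assumptions for illustration only.

```python
import numpy as np

def train_biased_mf(train, m, n, f=2, lam=0.05, gamma=0.01, epochs=100, seed=0):
    """Fit mu + b_u + b_i + q_i^t p_u to (u, i, r) triples with SGD."""
    rng = np.random.default_rng(seed)
    mu = float(np.mean([r for _, _, r in train]))
    b_u, b_i = np.zeros(m), np.zeros(n)
    P = rng.normal(scale=0.1, size=(m, f))
    Q = rng.normal(scale=0.1, size=(n, f))
    for _ in range(epochs):
        for u, i, r in train:
            e = r - (mu + b_u[u] + b_i[i] + Q[i] @ P[u])
            b_u[u] += gamma * (e - lam * b_u[u])
            b_i[i] += gamma * (e - lam * b_i[i])
            q_old = Q[i].copy()
            Q[i] += gamma * (e * P[u] - lam * Q[i])
            P[u] += gamma * (e * q_old - lam * P[u])
    return mu, b_u, b_i, P, Q

def model_rmse(data, model):
    mu, b_u, b_i, P, Q = model
    errs = [r - (mu + b_u[u] + b_i[i] + Q[i] @ P[u]) for u, i, r in data]
    return float(np.sqrt(np.mean(np.square(errs))))

def grid_search(train, valid, m, n, lambdas):
    """Train once per candidate lambda; keep the one with lowest validation RMSE."""
    scores = {lam: model_rmse(valid, train_biased_mf(train, m, n, lam=lam))
              for lam in lambdas}
    return min(scores, key=scores.get), scores

# Toy data (made up): 4 users, 3 movies; movie 0 is liked, movie 2 is not.
train = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
         (2, 1, 4), (2, 2, 2), (3, 0, 5), (3, 2, 2)]
valid = [(0, 2, 2), (1, 1, 2), (2, 0, 5), (3, 1, 4)]
lambdas = [0.01, 0.05, 0.1, 0.5]

best_lam, scores = grid_search(train, valid, 4, 3, lambdas)
print(best_lam, scores)
```

The same loop extends to a grid over several hyperparameters (for example the learning rate as well as $\lambda$), at the cost of one training run per grid point.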

Figure from Koren, Bell, Volinsky, IEEE Computer, 2009

Adding Implicit Information (figure: ratings matrix of users x movies, with the test data set entries shown as unknown "?" entries)

Figure from Koren, Bell, Volinsky, IEEE Computer, 2009

Explanation for increase?

Adding Time Effects
- Without time effects: $r_{ui} \approx \mu + b_u + b_i + \text{user-movie interactions}$
- Add time dependence to biases: $r_{ui} \approx \mu + b_u(t) + b_i(t) + \text{user-movie interactions}$
- Time-dependence parametrized by linear trends, binning, and other methods
- For details see Y. Koren, Collaborative filtering with temporal dynamics, ACM SIGKDD Conference 2009

Adding Time Effects
$r_{ui} \approx \mu + b_u(t) + b_i(t) + q_i^t p_u(t)$
- Add time dependence to user “factor weights”
- Models the fact that user’s interests over “genres” (the q’s) may change over time
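A rough sketch of what "linear trends" and "binning" could look like for the time-dependent biases. The forms are loosely modeled on the Koren SIGKDD 2009 paper cited above (a per-user drift around the user's mean rating date, and per-movie offsets per time bin), but the function names, the exponent default, and the bin layout here are illustrative assumptions, not the paper's exact specification.

```python
import math

def user_bias_at(b_u, alpha_u, t, t_mean_u, beta=0.4):
    """b_u(t) = b_u + alpha_u * dev(t), with dev(t) = sign(t - t_mean) * |t - t_mean|^beta."""
    delta = t - t_mean_u
    dev = math.copysign(abs(delta) ** beta, delta)
    return b_u + alpha_u * dev

def movie_bias_at(b_i, bin_offsets, t, t_start, bin_width_days):
    """b_i(t) = b_i + offset of the time bin containing day t (clamped to valid bins)."""
    k = int((t - t_start) // bin_width_days)
    k = max(0, min(k, len(bin_offsets) - 1))
    return b_i + bin_offsets[k]

def predict(mu, bu_t, bi_t, interaction=0.0):
    """r_ui ~ mu + b_u(t) + b_i(t) + user-movie interaction term."""
    return mu + bu_t + bi_t + interaction

# Made-up example: a user whose ratings drift upward, rating 32 days after
# their mean rating date, for a movie currently in its second 70-day bin.
bu_t = user_bias_at(b_u=-1.0, alpha_u=0.05, t=82, t_mean_u=50)
bi_t = movie_bias_at(b_i=0.5, bin_offsets=[0.1, -0.2], t=75, t_start=0, bin_width_days=70)
print(predict(3.7, bu_t, bi_t))
```

Setting `alpha_u` to 0 and all `bin_offsets` to 0 recovers the static biases $b_u$ and $b_i$ from the earlier slides.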

Figure from Koren, Bell, Volinsky, IEEE Computer, 2009

The Kitchen Sink Approach…
- Many options for modeling
  - Variants of the ideas we have seen so far
    - Different numbers of factors
    - Different ways to model time
    - Different ways to handle implicit information
    - …
  - Other models (not described here)
    - Nearest-neighbor models
    - Restricted Boltzmann machines
- Model averaging was useful…
  - Linear model combining
  - Neural network combining
  - Gradient boosted decision tree combining
  - Note: combining weights learned on validation set (“stacking”)
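The simplest of these combiners, a linear blend with weights learned on a validation set, can be sketched with ordinary least squares. The two "models" below are just the true ratings plus independent noise, and the function names are hypothetical; this illustrates stacking in general, not the winning teams' actual blending procedure.

```python
import numpy as np

def fit_blend_weights(val_preds, val_ratings):
    """Least-squares weights (plus an intercept) for combining model predictions.

    val_preds: (n_ratings x n_models) array of predictions on a validation set
    """
    X = np.column_stack([val_preds, np.ones(len(val_ratings))])
    w, *_ = np.linalg.lstsq(X, val_ratings, rcond=None)
    return w

def blend(preds, w):
    return np.column_stack([preds, np.ones(len(preds))]) @ w

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

rng = np.random.default_rng(0)
truth = rng.integers(1, 6, size=200).astype(float)     # ratings in 1..5
model_a = truth + rng.normal(scale=1.0, size=200)      # two noisy predictors
model_b = truth + rng.normal(scale=1.0, size=200)
val_preds = np.column_stack([model_a, model_b])

w = fit_blend_weights(val_preds, truth)
blended = blend(val_preds, w)
print(rmse(model_a, truth), rmse(model_b, truth), rmse(blended, truth))
```

On the set used to fit the weights, the blend can never be worse than either single model, since "use model A alone" is one of the weightings least squares can choose. A fair estimate of the gain needs a further held-out set.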

Other Aspects of Model Building
- Automated parameter tuning
  - Using a validation set, and grid search, various parameters such as learning rates, regularization parameters, etc., can be optimized
- Memory requirements
  - Can fit within roughly 1 Gbyte of RAM
- Training time
  - Order of days: but achievable on commodity hardware rather than a supercomputer
  - Some parallelization used

Matrix factorization vs Near Neighbor?
From Koren, ACM Transactions on Knowledge Discovery from Data, 2010:
“Latent factor models such as SVD face real difficulties when needed to explain predictions. … Thus, we believe that for practical applications neighborhood models are still expected to be a common choice.”

The Competition:

Setting up and Launching…
- Summer 2006
  - Email from Netflix about large monetary award
  - Is this real?
  - Apparently so: serious and well-organized
  - Spent summer carefully designing data set and rules
- Official Launch, Oct 2nd 2006
  - Email lists, conferences, press releases, etc
  - Significant initial interest in research community, blogs, etc
- 40,000 teams (eventually) from over 150 countries
  - Number of initial registrants significantly exceeded expectations

Progress in first 3 months
- Oct 2, 2006: Launch of competition
- Oct 8, 2006: WXY Consulting already better than Cinematch score
- Oct 15, 2006: teams above Cinematch, one with 1.06% improvement (qualifying for $50k progress prize)
- Dec, 2006: Jim Bennett from Netflix describes progress so far during an invited talk at NIPS

Prize Progress (figure)

Prize Submissions (figure; annotations: User avg, Movie avg, Bayes?, Multinomial?, Pearson?, Leaders)

First Progress Prize, October 2007
- Progress prize: $50k annually awarded to leading team provided there is at least 1% improvement over previous year
- Sept 2nd: First progress prize “30 day” last call
- Oct 2nd: Leaders were BellKor, 8.4% improvement (Yehuda Koren, Bob Bell, Chris Volinsky, AT&T Research)
- Oct/Nov: Code and documentation submitted for judging
  - Complicated methods: primarily relying on factor models
- Nov 13: Winners officially declared and BellKor documentation published on Netflix Web site

Progress in 2008…
- Progress slows down … improvements are incremental
- Many of the leading prize contenders publishing their methods and techniques at academic conferences (2nd KDD workshop in August)
- Much speculation on whether the prize would ever be won: is 10% even attainable?
- Many initial participants had dropped out: too much time and effort to seriously compete
- But leaderboard and forum still very active

Progress Prize 2008
- Sept 2nd: Only 3 teams qualify for 1% improvement over previous year
- Oct 2nd: Leading team has 9.4% overall improvement
- Oct/Nov: Code/documentation reviewed and judged
- Progress prize ($50,000) awarded to BellKor team of 3 AT&T researchers (same as before) plus 2 Austrian graduate students, Andreas Toscher and Michael Jahrer
- Key winning strategy: clever “blending” of predictions from models used by both teams
- Speculation that 10% would be attained by mid-2009

Example of Predictor Specifications…

The End-Game

June 26th 2009: after 1000 Days and nights…

The Leading Team
- BellKorPragmaticChaos
  - BellKor: Yehuda Koren (now Yahoo!), Bob Bell, Chris Volinsky, AT&T
  - BigChaos: Michael Jahrer, Andreas Toscher, 2 grad students from Austria
  - Pragmatic Theory: Martin Chabert, Martin Piotte, 2 engineers from Montreal (Quebec)
- June 26th submission triggers 30-day “last call”
- Submission timed purposely to coincide with vacation schedules

The Last 30 Days
- Ensemble team formed
  - Group of other teams on leaderboard forms a new team
  - Relies on combining their models
  - Quickly also get a qualifying score over 10%
- BellKor
  - Continue to eke out small improvements in their scores
  - Realize that they are in direct competition with Ensemble
- Strategy
  - Both teams carefully monitoring the leaderboard
  - Only sure way to check for improvement is to submit a set of predictions
    - This alerts the other team of your latest score

24 Hours from the Deadline
- Submissions limited to 1 a day
  - So only 1 final submission could be made by either team in the last 24 hours
- 24 hours before deadline…
  - BellKor team member in Austria notices (by chance) that Ensemble posts a score that is slightly better than BellKor’s
  - Leaderboard score disappears after a few minutes (rule loophole)
- Frantic last 24 hours for both teams
  - Much computer time on final optimization
  - Run times carefully calibrated to end about an hour before deadline
- Final submissions
  - BellKor submits a little early (on purpose), 40 mins before deadline
  - Ensemble submits their final entry 20 mins later
  - … and everyone waits …

Data split (figure)
- Training Data: 100 million ratings
- Held-Out Data: 3 million ratings, split into 1.5m-rating subsets
  - Quiz Set: scores posted on leaderboard
  - Test Set: scores known only to Netflix; scores used in determining final winner

Netflix Scoring and Judging
- Leaders on test set are contacted and submit their code and documentation (mid-August)
- Judges review documentation and inform winners that they have won $1 million prize (late August)
- Considerable speculation in press and blogs about which team has actually won
- News conference scheduled for Sept 21st in New York to announce winner and present $1 million check

Million Dollars Awarded Sept 21st 2009

Lessons Learned

Who were the Real Winners?
- Winning team
  - BellKor: 2 statisticians + computer scientist (AT&T, Yahoo!; US)
  - BigChaos: 2 computer science masters students (Austria)
  - Pragmatic Theory: 2 electrical engineers (Canada)
  - Division of prize money within team not revealed
- Netflix
  - Publicity
  - New algorithms
  - More research on recommender systems
- Machine learning/data mining research community
  - Increased interest in the field
  - Large new data set
  - Interest in more competitions

Lessons Learned
- Scale is important
  - e.g., stochastic gradient descent on sparse matrices
- Latent factor models work well on this problem
  - Previously had not been explored for recommender systems
- Understanding your data is important, e.g., time-effects
- Combining models works surprisingly well
  - But final 10% improvement can probably be achieved by judiciously combining about 10 models rather than 1000’s
  - This is likely what Netflix will do in practice
- Surprising amount of collaboration among participants

Why Collaboration? Openness of competition structure
- Rules stated that winning solutions would be published
  - Non-exclusive license of winning software to Netflix
  - “Description of algorithm to be posted on site”
- Research workshops sponsored by Netflix
- Leaderboard was publicly visible: “it was addictive…”

Why Collaboration? Development of Online Community
- Active Netflix prize forum + other blogs
- Quickly acquired “buzz”
- Forum was well-moderated by Netflix
- Attracted discussion from novices and experts alike
- Early posting of code and solutions
- Early self-identification (links via leaderboard)

Why Collaboration? Academic/Research Culture
- Nature of competition was technical/mathematical
- Attracted students, hobbyists, researchers
- Many motivated by fundamental interest in producing better algorithms; $1 million would be a nice bonus
- History in academic circles of being open, publishing, sharing

Why Collaboration? Technical Reasons
- Realization that combining many different models and techniques always produced small but systematic improvements (statistical theory supports this…)
- “Teaming” was strategically attractive
- Particularly for the “end-game” (summer 2009), teaming was quite critical in terms of who won the competition

Questions
- Does reduction in squared error metric correlate with real improvements in user satisfaction?
- Are these competitions good for scientific research?
  - Should researchers be solving other more important problems?
- Are competitions a good strategy for companies?

Where does a 5-star movie get ranked on average? (figure: probability of rank vs. rank of best recommendation; from Y. Koren, ACM SIGKDD 2008)

Conclusions
- Was the Netflix prize a success?
  - For Netflix?
    - Publicity
    - Algorithms
  - For Participants?
  - For Research Community?
  - For recommender systems in general?

Links to additional information
- Netflix prize page (FAQs, rules, forum, etc)
- Page with links to articles, blogs, etc