1) New Paths to New Machine Learning Science 2) How an Unruly Mob Almost Stole the Grand Prize at the Last Moment Jeff Howbert February 6, 2012.

Netflix Viewing Recommendations

Recommender Systems
DOMAIN: some field of activity where users buy, view, consume, or otherwise experience items
PROCESS:
1. Users provide ratings on items they have experienced.
2. Take all the data and build a predictive model.
3. For a user who hasn't experienced a particular item, use the model to predict how well they will like it (i.e., predict their rating).

Roles of Recommender Systems
Help users deal with the paradox of choice
Allow online sites to:
- increase the likelihood of sales
- retain customers by providing a positive search experience
Considered essential in the operation of:
- online retailing, e.g. Amazon, Netflix, etc.
- social networking sites

Amazon.com Product Recommendations

Social Network Recommendations
Recommendations on essentially every category of interest known to mankind:
- Friends
- Groups
- Activities
- Media (TV shows, movies, music, books)
- News stories
- Ad placements
All based on connections in the underlying social network graph and your expressed likes and dislikes.

Types of Recommender Systems
Base predictions on either:
- content-based approach: explicit characteristics of users and items
- collaborative filtering approach: implicit characteristics based on similarity of users' preferences to those of other users

The Netflix Prize Contest
GOAL: use training data to build a recommender system which, when applied to qualifying data, improves the error rate by 10% relative to Netflix's existing system
PRIZE: first team to 10% wins $1,000,000
- annual Progress Prizes of $50,000 also possible

The Netflix Prize Contest
CONDITIONS:
- open to the public
- compete as an individual or a group
- submit predictions no more than once a day
- prize winners must publish results and license code to Netflix (non-exclusive)
SCHEDULE:
- started Oct. 2, 2006
- to end after 5 years

The Netflix Prize Contest
PARTICIPATION:
- contestants on teams from 186 different countries
- valid submissions from 5169 different teams

The Netflix Prize Data
Netflix released three datasets:
- 480,189 users (anonymous)
- 17,770 movies
- ratings on an integer scale of 1 to 5
Training set: 99,072,112 pairs with ratings
Probe set: 1,408,395 pairs with ratings
Qualifying set: 2,817,131 pairs with no ratings

Model Building and Submission Process
- Training set (99,072,112 pairs, ratings known): used to build the MODEL
- Probe set (1,408,395 pairs, ratings known): used for tuning and validating the model
- Qualifying set (ratings unknown), on which predictions are made, split into:
  - quiz set (1,408,342 pairs): RMSE shown on public leaderboard
  - test set (1,408,789 pairs): RMSE kept secret for final scoring
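The contest scored submissions by root mean squared error (RMSE) against the held-back ratings. A minimal sketch of the metric on toy data (not the actual Netflix scoring code):

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error, the contest's scoring metric."""
    assert len(predictions) == len(actuals)
    se = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
    return math.sqrt(se / len(predictions))

# Toy check: predictions off by exactly 1 star everywhere give RMSE = 1.0
score = rmse([4, 2, 5], [3, 3, 4])  # 1.0
```

During the contest, only the quiz-set RMSE was visible; the test-set RMSE was computed the same way but kept private until the end.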

Why the Netflix Prize Was Hard
- Massive dataset
- Very sparse: the rating matrix is only 1.2% occupied
- Extreme variation in number of ratings per user
- Statistical properties of the qualifying and probe sets differ from those of the training set
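The 1.2% occupancy figure follows directly from the dataset sizes on the previous slide; a quick arithmetic check:

```python
# Sizes from the Netflix Prize data release
users, movies = 480_189, 17_770
training_ratings = 99_072_112

# Fraction of the full user-by-movie matrix that actually has a rating
occupancy = training_ratings / (users * movies)
print(f"{occupancy:.1%}")  # prints "1.2%"
```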

Dealing with Size of the Data MEMORY: 2 GB bare minimum for common algorithms 4+ GB required for some algorithms need 64-bit machine with 4+ GB RAM if serious SPEED: Program in languages that compile to fast machine code 64-bit processor Exploit low-level parallelism in code (SIMD on Intel x86/x64)

Common Types of Algorithms
- Global effects
- Nearest neighbors
- Matrix factorization
- Restricted Boltzmann machine
- Clustering
- Etc.

Nearest Neighbors in Action
[figure: users with identical preferences get strong weight; users with similar preferences get moderate weight]
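The weighting idea above can be sketched as a small user-based nearest-neighbor predictor: score each neighbor by similarity over co-rated items, then take a similarity-weighted average of their ratings. This is an illustrative sketch with made-up toy data, not the contest code; cosine similarity is one common choice among several.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity over the items two users have both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    na = math.sqrt(sum(a[i] ** 2 for i in common))
    nb = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (na * nb)

def predict(target, neighbors, movie):
    """Similarity-weighted average of neighbors' ratings for `movie`."""
    num = den = 0.0
    for ratings in neighbors:
        if movie in ratings:
            w = cosine_sim(target, ratings)  # identical tastes -> weight 1.0
            num += w * ratings[movie]
            den += w
    return num / den if den else None

# Hypothetical toy data: one near-identical neighbor, one similar neighbor
target = {"A": 5, "B": 3}
neighbors = [{"A": 5, "B": 3, "C": 4}, {"A": 4, "B": 2, "C": 2}]
pred = predict(target, neighbors, "C")  # close to 3, between the two ratings
```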

Matrix Factorization in Action
[figure: the rating matrix decomposed into user-feature and movie-feature matrices ("a bunch of numbers") via a reduced-rank singular value decomposition (sort of)]

Matrix Factorization in Action
[figure: multiply and add features (dot product) to obtain the desired prediction]
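The two slides above can be condensed into a minimal sketch: learn low-rank user and movie feature vectors by stochastic gradient descent on the known ratings, then predict a missing rating with a dot product. The data, hyperparameters, and dimensions here are toy assumptions for illustration, not the contest models (which had far more features and additional terms).

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500):
    """Factor the rating matrix as user_features . item_features^T via
    regularized stochastic gradient descent on (user, item, rating) triples."""
    random.seed(0)  # deterministic toy run
    u_f = [[random.uniform(0.01, 0.1) for _ in range(k)] for _ in range(n_users)]
    i_f = [[random.uniform(0.01, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(a * b for a, b in zip(u_f[u], i_f[i]))
            err = r - pred
            for f in range(k):
                uf, vf = u_f[u][f], i_f[i][f]
                u_f[u][f] += lr * (err * vf - reg * uf)  # gradient step + L2 shrinkage
                i_f[i][f] += lr * (err * uf - reg * vf)
    return u_f, i_f

# Hypothetical 2-user, 2-movie example: (user, movie, rating) triples
ratings = [(0, 0, 5), (0, 1, 1), (1, 0, 4)]
u_f, i_f = train_mf(ratings, n_users=2, n_items=2)

# Predict by multiplying and adding features (dot product)
pred00 = sum(a * b for a, b in zip(u_f[0], i_f[0]))  # near 5
pred01 = sum(a * b for a, b in zip(u_f[0], i_f[1]))  # near 1
```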

The Power of Blending
- The error function (RMSE) is convex, so linear combinations of models should have lower error
- Find blending coefficients with a simple least-squares fit of model predictions to the true values of the probe set
- Example from my experience:
  - blended 89 diverse models (RMSE range = – )
  - blended model had RMSE =
  - improvement of over best single model: 13% of the progress needed to win
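The least-squares blending step described above can be sketched for the two-model case by solving the 2x2 normal equations directly (the real blend covered 89 models, and would use a linear-algebra library; the data below is hypothetical):

```python
def blend_weights(preds_a, preds_b, truth):
    """Least-squares blending weights so that w_a*a + w_b*b ~ truth,
    via the 2x2 normal equations."""
    aa = sum(a * a for a in preds_a)
    bb = sum(b * b for b in preds_b)
    ab = sum(a * b for a, b in zip(preds_a, preds_b))
    at = sum(a * t for a, t in zip(preds_a, truth))
    bt = sum(b * t for b, t in zip(preds_b, truth))
    det = aa * bb - ab * ab
    return (at * bb - bt * ab) / det, (bt * aa - at * ab) / det

# Hypothetical probe-set example: model A overpredicts by 0.5,
# model B underpredicts by 0.5 -- their average is exact.
truth   = [3.0, 4.0, 2.0, 5.0]
preds_a = [3.5, 4.5, 2.5, 5.5]
preds_b = [2.5, 3.5, 1.5, 4.5]
wa, wb = blend_weights(preds_a, preds_b, truth)        # 0.5, 0.5
blended = [wa * a + wb * b for a, b in zip(preds_a, preds_b)]
```

Because the two models' errors cancel, the fitted blend recovers the truth exactly; with real, partially correlated models the blend merely beats every individual model's RMSE.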

Algorithms: Other Things That Mattered
Overfitting:
- models typically had millions or even billions of parameters
- control with aggressive regularization
Time-related effects:
- Netflix data included date of movie release and dates of ratings
- most of the progress in the final two years of the contest came from incorporating temporal information
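The regularization point can be made concrete with a deliberately tiny example: an L2 penalty added to a stochastic gradient descent update shrinks the learned parameter toward zero, trading a little training-set fit for less overfitting. This is a schematic one-parameter illustration, not any contest model.

```python
def sgd_fit(xs, ys, reg, lr=0.1, epochs=500):
    """Fit y ~ w*x by SGD with an L2 penalty of strength `reg`."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = y - w * x
            w += lr * (err * x - reg * w)  # the reg term shrinks w toward 0
    return w

# A single noisy observation is fit exactly without regularization;
# the penalty pulls the weight back toward zero (here to 2/(1+reg) = 4/3).
xs, ys = [1.0], [2.0]
w_free = sgd_fit(xs, ys, reg=0.0)  # -> 2.0
w_reg  = sgd_fit(xs, ys, reg=0.5)  # -> 4/3
```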

The Netflix Prize: Social Phenomena
- Competition was intense, but sharing and collaboration were equally so
- Lots of publications and presentations at meetings while the contest was still active
- Lots of sharing of ideas and implementation details on the contest forums
- The vast majority of teams:
  - were not machine learning professionals
  - were not competing to win (until the very end)
  - mostly used algorithms published by others

One Algorithm from Winning Team (time-dependent matrix factorization) Yehuda Koren, Comm. ACM, 53, 89 (2010)

Netflix Prize Progress: Major Milestones
DATE:    Oct. 2007    Oct. 2008              July 2009
WINNER:  BellKor      BellKor in BigChaos    ???
(me, starting June 2008)

June 25, :28 GMT

June 26, 18:42 GMT – BPC Team Breaks 10%

Genesis of The Ensemble
The Ensemble (33 individuals) formed from:
- Opera and Vandelay United (19 individuals): Opera Solutions (5 indiv.), Vandelay Industries (5 indiv.), 9 other individuals
- Grand Prize Team (14 individuals, including me): Gravity (4 indiv.), Dinosaur Planet (3 indiv.), 7 other individuals

June 30, 16:44 GMT

July 8, 14:22 GMT

July 17, 16:01 GMT

July 25, 18:32 GMT – The Ensemble First Appears!
- 24 hours, 10 min before the contest ends
- #1 and #2 teams each have one more submission!

July 26, 18:18 GMT – BPC Makes Their Final Submission
- 24 minutes before the contest ends
- The Ensemble can make one more submission – window opens 10 minutes before the contest ends

July 26, 18:43 GMT – Contest Over!

Final Test Scores

Netflix Prize: What Did We Learn?
- Significantly advanced the science of recommender systems:
  - properly tuned and regularized matrix factorization is a powerful approach to collaborative filtering
  - ensemble methods (blending) can markedly enhance the predictive power of recommender systems
- Crowdsourcing via a contest can unleash amazing amounts of sustained effort and creativity:
  - Netflix made out like a bandit
  - but this probably would not be successful for most problems

Netflix Prize: What Did I Learn?
- Several new machine learning algorithms
- A lot about optimizing predictive models:
  - stochastic gradient descent
  - regularization
- A lot about optimizing code for speed and memory usage
- Some linear algebra and a little PDQ
- Enough to come up with one original approach that actually worked
- Money makes people crazy, in both good ways and bad
COST: about 1000 hours of my free time over 13 months