The YouTube Video Recommendation System. James Davidson, Benjamin Liebald, Junning Liu, Palash Nandy, Taylor Van Vleet (Google Inc.). Presented by Thuat Nguyen.
Introduction: YouTube is the most popular video community; 1 billion users watch each month; 24 hours of video are uploaded every minute (2010); it is a very information-rich environment.
Goals: The recommendation system should find videos related to users' interests; help users discover videos; keep users engaged, beyond just watching or finding a specific video.
Challenges: Videos have poor or no metadata; user interactions are relatively short and noisy (compared to Netflix or Amazon); videos usually have a short life cycle.
System Design: 1. Input data 2. Related videos 3. Generating recommendation candidates 4. Ranking 5. System implementation -> recommendations that are recent, fresh, diverse, and relevant.
Input Data: Two main classes of data: 1. Content data (title, description…) 2. User activity data, both explicit (rating, liking, subscribing, etc.) and implicit (starting to watch a video, closing it before it finishes).
Related Videos: A relatedness score between pairs of videos, scaled by a normalization function; each video v_i maps to a set R_i of its top N candidate related videos (a minimum score is imposed).
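The slide does not spell out the score itself, so the following is only a minimal sketch of one way a relatedness score could be computed. It assumes the score is a co-visitation count (how often two videos appear in the same session) divided by a normalization function, taken here to be the product of the two videos' global watch counts. The function name `related_videos`, the session format, and the parameters `top_n` and `min_score` are hypothetical names chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def related_videos(sessions, top_n=2, min_score=0.0):
    """Return {video: [(related_video, score), ...]} built from watch sessions."""
    watch_count = Counter()    # c_i: number of sessions containing video i
    covisit_count = Counter()  # c_ij: number of sessions containing both i and j
    for session in sessions:
        videos = sorted(set(session))  # de-duplicate, fix pair order
        watch_count.update(videos)
        covisit_count.update(combinations(videos, 2))

    scores = {}
    for (vi, vj), c_ij in covisit_count.items():
        # Assumed normalization: product of the two global watch counts.
        score = c_ij / (watch_count[vi] * watch_count[vj])
        if score > min_score:  # impose the minimum score
            scores.setdefault(vi, []).append((vj, score))
            scores.setdefault(vj, []).append((vi, score))

    # Keep only the top N candidates per video (ties broken by video id).
    return {
        v: sorted(pairs, key=lambda p: (-p[1], p[0]))[:top_n]
        for v, pairs in scores.items()
    }


sessions = [["a", "b"], ["a", "b"], ["a", "c"], ["b", "c"], ["c", "d"]]
related = related_videos(sessions)
```

Dividing by the global counts keeps very popular videos from dominating every related list; other normalization functions could be swapped in without changing the rest of the sketch.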
Generating Recommendation Candidates: Start from the seed set S; the first candidate set C_1 is narrow; broaden the diversity of the candidate set.
Generating Recommendation Candidates (cont.)
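As a rough illustration of broadening a narrow candidate set, the sketch below assumes candidates are gathered by following related-video links outward from the seed set for a limited number of hops, then removing the seeds themselves. The function name `generate_candidates`, the `related` mapping, and the `max_distance` parameter are hypothetical.

```python
def generate_candidates(seed_set, related, max_distance=2):
    """Expand seed videos through the related-videos graph up to max_distance hops."""
    seeds = set(seed_set)
    frontier = set(seeds)   # videos reached at the previous distance
    candidates = set()      # union of all candidate sets so far
    for _ in range(max_distance):
        next_frontier = set()
        for video in frontier:
            next_frontier.update(related.get(video, ()))
        candidates |= next_frontier
        frontier = next_frontier
    # Videos the user already interacted with are not recommended again.
    return candidates - seeds


related = {"s1": ["a", "b"], "a": ["c", "s1"], "b": ["a"], "c": ["d"]}
narrow = generate_candidates({"s1"}, related, max_distance=1)
broad = generate_candidates({"s1"}, related, max_distance=2)
```

With one hop the result contains only videos directly related to the seed; each extra hop widens the set at the cost of weaker ties to the user's original activity.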
Ranking: Candidates are ranked using signals in three categories: video quality (view count, ratings…); user specificity (the user's taste and preferences); diversification (impose constraints for each seed).
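The ranking step can be sketched as follows, under stated assumptions: the quality and user-specificity signals are combined with a simple weighted sum, and diversification is enforced by capping how many recommendations may come from a single seed. The field names, weights, and the `max_per_seed` and `limit` parameters are hypothetical and only illustrate the idea.

```python
from collections import Counter


def rank_candidates(candidates, w_quality=0.5, w_user=0.5, max_per_seed=2, limit=10):
    """Rank candidates by a weighted score, capping recommendations per seed."""
    def score(c):
        # Assumed combination: weighted sum of the two signal categories.
        return w_quality * c["quality"] + w_user * c["user_specificity"]

    ranked = sorted(candidates, key=lambda c: (-score(c), c["video"]))
    per_seed = Counter()
    result = []
    for c in ranked:
        if per_seed[c["seed"]] >= max_per_seed:
            continue  # diversification: this seed already filled its quota
        per_seed[c["seed"]] += 1
        result.append(c["video"])
        if len(result) == limit:
            break
    return result


candidates = [
    {"video": "v1", "seed": "s1", "quality": 0.9, "user_specificity": 0.9},
    {"video": "v2", "seed": "s1", "quality": 0.8, "user_specificity": 0.8},
    {"video": "v3", "seed": "s1", "quality": 0.7, "user_specificity": 0.7},
    {"video": "v4", "seed": "s2", "quality": 0.2, "user_specificity": 0.6},
]
top = rank_candidates(candidates)
```

In this example the third video from seed s1 is skipped despite its higher score, which lets a video from a different seed appear in the list.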
System Implementation: Three main steps: data collection (log files), recommendation generation (MapReduce), and recommendation serving. A batch-oriented pre-computation approach: it takes advantage of CPU resources but causes a delay between generating and serving recommendations.
Evaluation and Results
Questions?