Recommender Systems
Jia-Bin Huang, Virginia Tech
ECE-5424G / CS-5824, Spring 2019

Administrative: HW 4 due April 10.

Unsupervised Learning
- Clustering: K-means, expectation maximization
- Dimensionality reduction
- Anomaly detection
- Recommendation systems

Motivating example: Monitoring machines in a data center. [Scatter plots of machines plotted by $x_1$ (CPU load) and $x_2$ (memory use).]

Multivariate Gaussian (normal) distribution
$x \in \mathbb{R}^n$. Don't model $p(x_1), p(x_2), \ldots$ separately; model $p(x)$ all in one go.
Parameters: $\mu \in \mathbb{R}^n$, $\Sigma \in \mathbb{R}^{n \times n}$ (covariance matrix)
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu) \right)$$
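To make the formula concrete, here is a minimal NumPy sketch of the density above (the function name and test values are illustrative, not part of the lecture):

```python
import numpy as np

def multivariate_gaussian(x, mu, sigma):
    """Evaluate p(x; mu, Sigma) for x in R^n."""
    n = mu.shape[0]
    diff = x - mu
    norm_const = 1.0 / (np.power(2 * np.pi, n / 2) * np.sqrt(np.linalg.det(sigma)))
    # solve(sigma, diff) computes Sigma^{-1} (x - mu) without forming the inverse
    return norm_const * np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff))

# Example: standard 2-D Gaussian evaluated at its mean
mu = np.zeros(2)
sigma = np.eye(2)
print(multivariate_gaussian(mu, mu, sigma))  # 1/(2*pi) ~ 0.159
```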

Multivariate Gaussian (normal) examples (contour plots over $x_1$, $x_2$):
$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 0.6 & 0 \\ 0 & 0.6 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$

Multivariate Gaussian (normal) examples (contour plots over $x_1$, $x_2$):
$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 0.6 & 0 \\ 0 & 1 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$

Multivariate Gaussian (normal) examples (contour plots over $x_1$, $x_2$):
$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$, $\quad \Sigma = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}$

Anomaly detection using the multivariate Gaussian distribution
1. Fit model $p(x)$ by setting
$$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}, \qquad \Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)(x^{(i)} - \mu)^\top$$
2. Given a new example $x$, compute
$$p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu) \right)$$
Flag an anomaly if $p(x) < \epsilon$.
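A hedged end-to-end sketch of this fit-then-flag procedure, assuming a small synthetic data set in place of real monitoring data (the threshold $\epsilon$ and all names are illustrative):

```python
import numpy as np

def fit_gaussian(X):
    """Fit mu and Sigma by maximum likelihood; X is (m, n), one example per row."""
    mu = X.mean(axis=0)
    diff = X - mu
    sigma = diff.T @ diff / X.shape[0]
    return mu, sigma

def is_anomaly(x, mu, sigma, epsilon):
    """Flag x as anomalous if p(x; mu, Sigma) < epsilon."""
    n = mu.shape[0]
    d = x - mu
    p = np.exp(-0.5 * d @ np.linalg.solve(sigma, d)) / (
        (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(sigma)))
    return p < epsilon

# Hypothetical data-center measurements: columns are CPU load and memory use
X = np.random.default_rng(0).normal([0.5, 0.6], [0.1, 0.1], size=(500, 2))
mu, sigma = fit_gaussian(X)
# A machine with high CPU load but unusually low memory use
print(is_anomaly(np.array([0.9, 0.1]), mu, sigma, epsilon=0.05))  # likely True
```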

Original model: $p(x_1; \mu_1, \sigma_1^2)\, p(x_2; \mu_2, \sigma_2^2) \cdots p(x_n; \mu_n, \sigma_n^2)$
- Must manually create features to capture anomalies where $x_1$, $x_2$ take unusual combinations of values
- Computationally cheaper (equivalently, scales better to large $n$)
- OK even if the training set size $m$ is small

Multivariate Gaussian model: $p(x; \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu) \right)$
- Automatically captures correlations between features
- Computationally more expensive
- Must have $m > n$, or else $\Sigma$ is non-invertible

Recommender Systems
- Motivation
- Problem formulation
- Content-based recommendations
- Collaborative filtering
- Mean normalization

You may also like...?

Recommender Systems: Motivation, Problem formulation, Content-based recommendations, Collaborative filtering, Mean normalization

Example: Predicting movie ratings
Users rate movies using zero to five stars.
- $n_u$ = number of users
- $n_m$ = number of movies
- $r(i,j) = 1$ if user $j$ has rated movie $i$
- $y^{(i,j)}$ = rating given by user $j$ to movie $i$ (defined only if $r(i,j) = 1$)

Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)
Love at last               5         5        0          0
Romance forever            5         ?        ?          0
Cute puppies of love       ?         4        0          ?
Nonstop car chases         0         0        5          4
Swords vs. karate          0         0        5          ?

Here $n_u = 4$ and $n_m = 5$.
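One plausible way to hold these quantities in code, assuming the table values above: NaN marks an unrated entry, and a boolean mask plays the role of $r(i,j)$.

```python
import numpy as np

# Ratings y(i,j): rows = movies, columns = users; NaN = not rated
Y = np.array([[5, 5, 0, 0],
              [5, np.nan, np.nan, 0],
              [np.nan, 4, 0, np.nan],
              [0, 0, 5, 4],
              [0, 0, 5, np.nan]])

R = ~np.isnan(Y)              # r(i,j) = 1 if user j rated movie i
n_m, n_u = Y.shape            # number of movies, number of users
print(n_m, n_u, int(R.sum()))  # movies, users, count of observed ratings
```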

Recommender Systems: Motivation, Problem formulation, Content-based recommendations, Collaborative filtering, Mean normalization

Content-based recommender systems

Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)  $x_1$ (romance)  $x_2$ (action)
Love at last               5         5        0          0          0.9              0
Romance forever            5         ?        ?          0          1.0              0.01
Cute puppies of love       ?         4        0          ?          0.99             0
Nonstop car chases         0         0        5          4          0.1              1.0
Swords vs. karate          0         0        5          ?          0                0.9

For each user $j$, learn a parameter vector $\theta^{(j)} \in \mathbb{R}^3$ (including an intercept feature $x_0 = 1$). Predict that user $j$ rates movie $i$ with $(\theta^{(j)})^\top x^{(i)}$ stars.

Content-based recommender systems (continued)
Worked example for Alice on "Cute puppies of love" (movie 3, same table as above):
$$x^{(3)} = \begin{bmatrix} 1 \\ 0.99 \\ 0 \end{bmatrix}, \quad \theta^{(1)} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}, \quad (\theta^{(1)})^\top x^{(3)} = 5 \times 0.99 = 4.95$$
For each user $j$, learn $\theta^{(j)} \in \mathbb{R}^3$; predict that user $j$ rates movie $i$ with $(\theta^{(j)})^\top x^{(i)}$ stars.
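In code, the prediction is a single dot product. A tiny sketch using the numbers from this example:

```python
import numpy as np

# Features for "Cute puppies of love": intercept, romance, action (from the table)
x3 = np.array([1.0, 0.99, 0.0])
# Alice's learned parameter vector (illustrative value from the slide)
theta1 = np.array([0.0, 5.0, 0.0])

print(theta1 @ x3)  # 5 * 0.99 = 4.95 predicted stars
```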

Problem formulation
- $r(i,j) = 1$ if user $j$ has rated movie $i$
- $y^{(i,j)}$ = rating given by user $j$ to movie $i$
- $\theta^{(j)}$ = parameter vector for user $j$
- $x^{(i)}$ = feature vector for movie $i$
- Predicted rating for user $j$ on movie $i$: $(\theta^{(j)})^\top x^{(i)}$
- $m^{(j)}$ = number of movies rated by user $j$

Goal: learn $\theta^{(j)}$:
$$\min_{\theta^{(j)}} \frac{1}{2 m^{(j)}} \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2 m^{(j)}} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$

Optimization objective
To learn $\theta^{(j)}$ (parameters for user $j$), drop the constant $m^{(j)}$:
$$\min_{\theta^{(j)}} \frac{1}{2} \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$
To learn all of $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n_u)}$:
$$\min_{\theta^{(1)}, \ldots, \theta^{(n_u)}} \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$

Optimization algorithm
$$\min_{\theta^{(1)}, \ldots, \theta^{(n_u)}} \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$
Gradient descent update:
$$\theta_k^{(j)} := \theta_k^{(j)} - \alpha \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right) x_k^{(i)} \quad (\text{for } k = 0)$$
$$\theta_k^{(j)} := \theta_k^{(j)} - \alpha \left[ \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right) x_k^{(i)} + \lambda \theta_k^{(j)} \right] \quad (\text{for } k \neq 0)$$
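A hedged NumPy sketch of this per-user update, vectorized over the rated movies; the helper name, toy features, and hyperparameters are illustrative. Note that the intercept term $\theta_0$ is left unregularized, matching the $k = 0$ case above.

```python
import numpy as np

def user_gradient_step(theta, X, y, rated, alpha, lam):
    """One gradient-descent step on user j's regularized objective.

    theta: (n+1,) parameters (theta[0] is the intercept term)
    X:     (n_m, n+1) movie features with x_0 = 1
    y:     (n_m,) this user's ratings (ignored where rated is False)
    rated: (n_m,) boolean mask, r(i,j) = 1
    """
    err = X[rated] @ theta - y[rated]   # prediction errors on rated movies
    grad = X[rated].T @ err             # sum_i (theta^T x - y) x
    grad[1:] += lam * theta[1:]         # regularize all but theta_0
    return theta - alpha * grad

# Toy usage with hypothetical values
X = np.array([[1, 0.9, 0.0], [1, 0.1, 1.0]])
y = np.array([5.0, 0.0])
theta = np.zeros(3)
for _ in range(200):
    theta = user_gradient_step(theta, X, y, np.array([True, True]), alpha=0.1, lam=0.0)
print(np.round(X @ theta, 2))  # approaches [5., 0.]
```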

Recommender Systems: Motivation, Problem formulation, Content-based recommendations, Collaborative filtering, Mean normalization

Problem motivation
(Same ratings and features table as above.) In practice, obtaining features such as $x_1$ (romance) and $x_2$ (action) for every movie is difficult and expensive. What if we don't know them?

Problem motivation

Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)  $x_1$ (romance)  $x_2$ (action)
Love at last               5         5        0          0           ?                ?
Romance forever            5         ?        ?          0           ?                ?
Cute puppies of love       ?         4        0          ?           ?                ?
Nonstop car chases         0         0        5          4           ?                ?
Swords vs. karate          0         0        5          ?           ?                ?

Suppose each user tells us their preferences:
$$\theta^{(1)} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}, \quad \theta^{(2)} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}, \quad \theta^{(3)} = \begin{bmatrix} 0 \\ 0 \\ 5 \end{bmatrix}, \quad \theta^{(4)} = \begin{bmatrix} 0 \\ 0 \\ 5 \end{bmatrix}$$
What feature vector $x^{(1)} = [?, ?, ?]^\top$ makes the predictions $(\theta^{(j)})^\top x^{(1)}$ match the observed ratings?

Optimization algorithm
Given $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n_u)}$, to learn $x^{(i)}$:
$$\min_{x^{(i)}} \frac{1}{2} \sum_{j: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$
Given $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n_u)}$, to learn $x^{(1)}, x^{(2)}, \ldots, x^{(n_m)}$:
$$\min_{x^{(1)}, \ldots, x^{(n_m)}} \frac{1}{2} \sum_{i=1}^{n_m} \sum_{j: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

Collaborative filtering
Given $x^{(1)}, x^{(2)}, \ldots, x^{(n_m)}$ (and the movie ratings), we can estimate $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n_u)}$.
Given $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(n_u)}$, we can estimate $x^{(1)}, x^{(2)}, \ldots, x^{(n_m)}$.

Collaborative filtering optimization objective
Given $x^{(1)}, \ldots, x^{(n_m)}$, estimate $\theta^{(1)}, \ldots, \theta^{(n_u)}$:
$$\min_{\theta^{(1)}, \ldots, \theta^{(n_u)}} \frac{1}{2} \sum_{j=1}^{n_u} \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2$$
Given $\theta^{(1)}, \ldots, \theta^{(n_u)}$, estimate $x^{(1)}, \ldots, x^{(n_m)}$:
$$\min_{x^{(1)}, \ldots, x^{(n_m)}} \frac{1}{2} \sum_{i=1}^{n_m} \sum_{j: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

Collaborative filtering optimization objective (continued)
Instead of alternating between the two problems above, minimize over $x^{(1)}, \ldots, x^{(n_m)}$ and $\theta^{(1)}, \ldots, \theta^{(n_u)}$ simultaneously:
$$J = \frac{1}{2} \sum_{(i,j): r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

Collaborative filtering optimization objective: minimize $J(x^{(1)}, \ldots, x^{(n_m)}, \theta^{(1)}, \ldots, \theta^{(n_u)})$ as defined above. (When learning the features this way, we use $x \in \mathbb{R}^n$ and $\theta \in \mathbb{R}^n$, with no $x_0 = 1$ intercept term.)

Collaborative filtering algorithm
1. Initialize $x^{(1)}, \ldots, x^{(n_m)}, \theta^{(1)}, \ldots, \theta^{(n_u)}$ to small random values.
2. Minimize $J(x^{(1)}, \ldots, x^{(n_m)}, \theta^{(1)}, \ldots, \theta^{(n_u)})$ using gradient descent (or an advanced optimization algorithm). For every $i = 1, \ldots, n_m$ and $j = 1, \ldots, n_u$:
$$x_k^{(i)} := x_k^{(i)} - \alpha \left[ \sum_{j: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right) \theta_k^{(j)} + \lambda x_k^{(i)} \right]$$
$$\theta_k^{(j)} := \theta_k^{(j)} - \alpha \left[ \sum_{i: r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right) x_k^{(i)} + \lambda \theta_k^{(j)} \right]$$
3. For a user with parameters $\theta$ and a movie with (learned) features $x$, predict a star rating of $\theta^\top x$.
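The whole procedure is compact when the updates are written as matrix operations. A hedged sketch under illustrative hyperparameters (the initialization scale, $\alpha$, $\lambda$, and iteration count are not tuned, and the toy ratings come from the table above):

```python
import numpy as np

def collaborative_filtering(Y, R, n_features, alpha=0.01, lam=0.01, iters=5000, seed=0):
    """Jointly learn movie features X and user parameters Theta by gradient
    descent on J. Y is (n_m, n_u); R marks the rated entries."""
    rng = np.random.default_rng(seed)
    n_m, n_u = Y.shape
    X = rng.normal(scale=0.1, size=(n_m, n_features))      # small random init
    Theta = rng.normal(scale=0.1, size=(n_u, n_features))
    Y0 = np.where(R, Y, 0.0)                               # zero out missing ratings
    for _ in range(iters):
        err = (X @ Theta.T - Y0) * R                       # errors on rated entries only
        X_grad = err @ Theta + lam * X                     # dJ/dX
        Theta_grad = err.T @ X + lam * Theta               # dJ/dTheta
        X -= alpha * X_grad
        Theta -= alpha * Theta_grad
    return X, Theta

Y = np.array([[5, 5, 0, 0],
              [5, np.nan, np.nan, 0],
              [np.nan, 4, 0, np.nan],
              [0, 0, 5, 4],
              [0, 0, 5, np.nan]])
R = ~np.isnan(Y)
X, Theta = collaborative_filtering(Y, R, n_features=2)
print(np.round(X @ Theta.T, 1))  # predictions; observed entries are roughly recovered
```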

Collaborative filtering
(Same ratings table as above; the learned $X$ and $\Theta$ let us predict the missing ? entries.)

Collaborative filtering
Predicted ratings:
$$X = \begin{bmatrix} - \, (x^{(1)})^\top \, - \\ - \, (x^{(2)})^\top \, - \\ \vdots \\ - \, (x^{(n_m)})^\top \, - \end{bmatrix}, \quad \Theta = \begin{bmatrix} - \, (\theta^{(1)})^\top \, - \\ - \, (\theta^{(2)})^\top \, - \\ \vdots \\ - \, (\theta^{(n_u)})^\top \, - \end{bmatrix}, \quad Y = X \Theta^\top$$
This is known as low-rank matrix factorization.

Finding related movies/products
For each product $i$, we learn a feature vector $x^{(i)} \in \mathbb{R}^n$ ($x_1$: romance, $x_2$: action, $x_3$: comedy, ...).
How do we find movies $j$ related to movie $i$? If $\| x^{(i)} - x^{(j)} \|$ is small, movies $i$ and $j$ are "similar".
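A small sketch of this nearest-neighbor lookup in the learned feature space (the feature vectors below are hypothetical):

```python
import numpy as np

def most_similar_movies(X, i, k=5):
    """Return indices of the k movies whose learned feature vectors are
    closest (in Euclidean distance) to movie i's."""
    dists = np.linalg.norm(X - X[i], axis=1)
    dists[i] = np.inf                 # exclude the movie itself
    return np.argsort(dists)[:k]

# Usage with hypothetical learned features (romance, action)
X = np.array([[0.9, 0.0], [1.0, 0.01], [0.99, 0.0], [0.1, 1.0], [0.0, 0.9]])
print(most_similar_movies(X, 0, k=2))  # the other two romance movies: [2 1]
```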

Recommender Systems: Motivation, Problem formulation, Content-based recommendations, Collaborative filtering, Mean normalization

Users who have not rated any movies

Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)  Eve (5)
Love at last               5         5        0          0         ?
Romance forever            5         ?        ?          0         ?
Cute puppies of love       ?         4        0          ?         ?
Nonstop car chases         0         0        5          4         ?
Swords vs. karate          0         0        5          ?         ?

$$J = \frac{1}{2} \sum_{(i,j): r(i,j)=1} \left( (\theta^{(j)})^\top x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

Because no term in the first sum involves Eve, only the regularization term $\frac{\lambda}{2} \sum_{k} (\theta_k^{(5)})^2$ touches $\theta^{(5)}$, so minimizing $J$ gives $\theta^{(5)} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ and every predicted rating $(\theta^{(5)})^\top x^{(i)} = 0$.

Mean normalization
Compute the mean rating $\mu_i$ of each movie $i$ (over the users who rated it) and subtract it, i.e., replace $y^{(i,j)}$ with $y^{(i,j)} - \mu_i$. Learn $\theta^{(j)}$, $x^{(i)}$ on the normalized ratings.
For user $j$, on movie $i$, predict: $(\theta^{(j)})^\top x^{(i)} + \mu_i$.
User 5 (Eve): $\theta^{(5)} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, so the prediction $(\theta^{(5)})^\top x^{(i)} + \mu_i = \mu_i$, the movie's average rating.
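A brief sketch of mean normalization on the (illustrative) ratings table above, showing that a user with $\theta = 0$ falls back to each movie's mean rating:

```python
import numpy as np

# Ratings with a new user (Eve) who has rated nothing (last column)
Y = np.array([[5, 5, 0, 0, np.nan],
              [5, np.nan, np.nan, 0, np.nan],
              [np.nan, 4, 0, np.nan, np.nan],
              [0, 0, 5, 4, np.nan],
              [0, 0, 5, np.nan, np.nan]])

mu = np.nanmean(Y, axis=1)    # per-movie mean over users who rated it
Y_norm = Y - mu[:, None]      # learn X, Theta on these normalized ratings

# For a user with theta = 0 (no ratings), theta^T x + mu_i falls back to mu_i
theta_eve = np.zeros(2)
x_i = np.array([0.9, 0.0])    # hypothetical learned feature vector for movie 0
print(theta_eve @ x_i + mu[0])  # 2.5, the mean rating of "Love at last"
```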

Recommender Systems: Motivation, Problem formulation, Content-based recommendations, Collaborative filtering, Mean normalization