Recommender Systems: Collaborative Filtering and Non-negative Matrix Factorization
Outline What is a recommender system? Several ways to build one: Collaborative Filtering, Non-negative Matrix Factorization, …
Recommender Systems A recommender system is a platform/engine that seeks to predict the "rating" or "preference" a user would give to an item.
Applications (in the real world) Amazon.com recommends products based on purchase history. Google News recommends news articles based on click and search history. Netflix's movie recommendation platform.
Netflix Prize Source: https://netflixprize.com/index.html Awarded in 2009. The challenge was to beat the Netflix recommender system using Netflix's user-movie rating data: 480,000 users, 18,000 movies, 100M observed ratings, meaning a sparsity index of 100M / (480K × 18K) ≈ 1.2%. The winning team, BellKor's Pragmatic Chaos, received the $1M prize [their algorithm: https://netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf]
What is Collaborative Filtering? Intuition: personal preferences are correlated. If Alice loves items P and Q, and Bob loves P, Q, and R, then Alice is likely to love R too. The Collaborative Filtering task: STEP 1: Discover patterns in observed preference behavior (e.g., purchase history, item ratings, etc.) across a community of users. STEP 2: Predict new preferences based on those patterns. Collaborative filtering does not rely on item or user attributes (e.g., demographic info, author, genre, producer, etc.)
An example dataset Let's consider a 4-user on 3-item rating dataset, stored as (ID, user, item, rating) records; for example, record 241 says user u1 rated item m1 as 2, and record 222 says u1 rated m3 as 3.
An example dataset Let's consider a 4-user (u1, u2, u3, u4) on 3-item (m1, m2, m3) rating dataset. Let's prepare the user-item rating matrix ("?" marks the rating we want to predict; blank cells are unrated):

      m1   m2   m3
u1     2    ?    3
u2     5
u3               1
u4
(Item-based) Collaborative Filtering Pairwise similarities between items in the dataset are computed using one of a number of similarity measures; these similarity values are then used to predict ratings for user-item pairs not present in the dataset. Common similarity measures: Cosine similarity, Pearson-correlation-based similarity, Adjusted cosine similarity
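The three measures can be sketched in numpy as follows (the function names and the `user_means` argument of the adjusted variant are my own, not from the slides):

```python
import numpy as np

def cosine_sim(a, b):
    """Plain cosine similarity between two item rating vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_sim(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    return cosine_sim(a - a.mean(), b - b.mean())

def adjusted_cosine_sim(a, b, user_means):
    """Adjusted cosine: subtract each co-rating user's mean rating
    from both item vectors before comparing the items."""
    return cosine_sim(a - user_means, b - user_means)
```

Here `a` and `b` hold the ratings of the users who rated both items, and `user_means` holds those users' average ratings across all items.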
Cosine Similarity Given two item vectors a and b, recall the definition of the vector dot product: a · b = |a| |b| cos θ, where θ is the angle between the two vectors. So, sim(a, b) = cos θ = (a · b) / (|a| |b|). Note that cos 0° = 1, cos 90° = 0, and cos 180° = −1. Figure courtesy: C.S. Perone
Cosine Similarity Question: using the user-item rating matrix above, find the cosine similarity between items m1 and m2: sim(m1, m2) = ?
Cosine Similarity Question: find the cosine similarity between items m1 and m2. Only consider ratings from users who have rated both items: m1 = (5, 3) and m2 = (2, 3). Therefore, cos(m1, m2) = (5·2 + 3·3) / (√(5² + 3²) · √(2² + 3²)) = 19 / (√34 · √13) ≈ 0.90
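We can check this dot-product arithmetic numerically (a quick numpy sketch, not from the slides):

```python
import numpy as np

# Ratings from the users who rated both m1 and m2 (values from the slide)
m1 = np.array([5.0, 3.0])
m2 = np.array([2.0, 3.0])

# cos(m1, m2) = (m1 . m2) / (|m1| |m2|)
sim = float(np.dot(m1, m2) / (np.linalg.norm(m1) * np.linalg.norm(m2)))
print(round(sim, 2))  # -> 0.9
```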
Complete the item-to-item similarity matrix:

        m1           m2           m3
m1   cos(m1,m1)  cos(m1,m2)  cos(m1,m3)
m2   cos(m2,m1)  cos(m2,m2)  cos(m2,m3)
m3   cos(m3,m1)  cos(m3,m2)  cos(m3,m3)

It's a symmetric matrix, and cos(x,x) = 1.
Complete the item-to-item similarity matrix:

        m1           m2        m3
m1       1          0.76      0.78
m2   cos(m2,m1)      1        0.86
m3   cos(m3,m1)  cos(m3,m2)    1

It's a symmetric matrix, and cos(x,x) = 1.
Complete the item-to-item similarity matrix (remaining entries filled in by symmetry):

       m1     m2     m3
m1      1    0.76   0.78
m2   0.76      1    0.86
m3   0.78   0.86      1

It's a symmetric matrix, and cos(x,x) = 1.
Now, what rating would user u1 give to item m2? u1 has already rated items m1 (rating 2) and m3 (rating 3), so predict with a similarity-weighted average:

rating(u1, m2) = [rating(u1, m1) · sim(m1, m2) + rating(u1, m3) · sim(m3, m2)] / [sim(m1, m2) + sim(m3, m2)]
               = (2 × 0.76 + 3 × 0.86) / (0.76 + 0.86) ≈ 2.53
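The weighted-average prediction above can be reproduced with a few lines of numpy (variable names are my own):

```python
import numpy as np

ratings = np.array([2.0, 3.0])   # u1's known ratings for m1 and m3
sims = np.array([0.76, 0.86])    # sim(m1, m2) and sim(m3, m2) from the matrix

# Similarity-weighted average of the known ratings
pred = float(np.dot(ratings, sims) / sims.sum())
print(round(pred, 2))  # -> 2.53
```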
Non-negative Matrix Factorization (NMF) Here a matrix V is factorized into two matrices W and H such that V ≈ WH, with the property that all three matrices have only non-negative elements. The non-negativity of the elements makes the resulting matrices easier to interpret. W and H can be found by solving the following optimization problem:

minimize ‖V − WH‖²_F  subject to  W ≥ 0, H ≥ 0

Once solved, NMF can also be used as a recommender system, because each entry is approximated as V[i, j] ≈ W[i, :] · H[:, j].
How to solve NMF? Let’s take a look at the whiteboard.
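Since the whiteboard derivation isn't reproduced here, the standard approach from the cited Lee and Seung paper is multiplicative updates; below is a minimal numpy sketch of those update rules (my own implementation, not the lecture's code):

```python
import numpy as np

def nmf(V, k, n_iter=1000, eps=1e-9, seed=0):
    """Factor a non-negative n x m matrix V into W (n x k) and H (k x m)
    by minimizing ||V - W H||_F^2 with Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative;
        # eps in the denominator guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# A rank-1 non-negative matrix is recovered almost exactly:
V = np.outer([1.0, 2.0, 3.0], [1.0, 2.0])
W, H = nmf(V, k=1)
print(np.abs(V - W @ H).max())
```

With k smaller than the matrix rank, WH is only a low-rank approximation of V, which is exactly what the recommender use case relies on: W[i, :] · H[:, j] fills in entries V[i, j] that were never observed.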
References Item-based collaborative filtering recommendation algorithms, by Sarwar et al. (WWW 2001) Algorithms for non-negative matrix factorization, by Lee and Seung (NIPS 2001) A tutorial: https://lazyprogrammer.me/tutorial-on-collaborative-filtering-and-matrix-factorization-in-python/