Intro to RecSys and CCF (Brian Ackerman)

Presentation transcript:

Intro to RecSys and CCF Brian Ackerman 1

Roadmap: Introduction to Recommender Systems & Collaborative Filtering; Collaborative Competitive Filtering 2

Introduction to Recommender Systems & Collaborative Filtering 3

Motivation: Netflix has over 20,000 movies, but you may only be interested in a small number of them. Recommender systems can provide personalized suggestions from a large set of items such as movies – This can be done in a variety of ways, the most popular being collaborative filtering. 4

Collaborative Filtering: If two users rate a subset of items similarly, then they might rate other items similarly as well. 5

          Item A  Item B  Item C  Item D  Item E
User 1    ?       3       4       5       3
User 2    1       3       4       5       ?

Roadmap (RS-CF): Motivation; Problem; Main CF Types – Memory-based (User-based) – Model-based (Regularized SVD) 6

Problem Setting: Set of users, $U$. Set of items, $I$. Users can rate items, where $r_{ui}$ is user $u$'s rating on item $i$. Ratings are often stored in a rating matrix – $R_{|U| \times |I|}$ 7

Sample Rating Matrix: a table of seven users (rows) over Item A to Item I (columns); the cell values are not recoverable from the transcript. A number is a user rating; - means a null entry (not rated). 8

Problem: Input – Rating matrix ($R_{|U| \times |I|}$) – Active user, $a$ (the user interacting with the system). Output – Prediction for all null entries of the active user. 9
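The input and output described above can be sketched in a few lines. This is only an illustration: the ratings, the user and item names, and the helper `null_entries` are hypothetical, and a nested dictionary stands in for the rating matrix.

```python
# Hypothetical (user, item, rating) triples; the values are made up for illustration.
ratings = [
    ("u1", "A", 3), ("u1", "C", 5),
    ("u2", "A", 4), ("u2", "B", 2),
]

users = sorted({u for u, _, _ in ratings})
items = sorted({i for _, i, _ in ratings})

# R[u][i] holds r_ui, or None for a null (unrated) entry.
R = {u: {i: None for i in items} for u in users}
for u, i, r in ratings:
    R[u][i] = r


def null_entries(R, active_user):
    """Items the active user has not rated, i.e. the entries to predict."""
    return [i for i, r in R[active_user].items() if r is None]
```

Calling `null_entries(R, "u1")` lists the items for which a prediction is needed for active user `u1`.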

Roadmap (RS-CF): Motivation; Problem; Main CF Types – Memory-based (User-based) – Model-based (Regularized SVD) 10

Main Types: Memory-based – User-based* [Resnick et al. 1994] – Item-based [Sarwar et al. 2001] – Similarity Fusion (User/Item-based) [Wang et al. 2006]. Model-based – SVD (Singular Value Decomposition) [Sarwar et al. 2000] – RSVD (Regularized SVD)* [Funk 2006]. 11

User-based: Find similar users – KNN or threshold. Make prediction. 12

          Item A  Item B  Item C  Item D  Item E  Item F  Item G  Item H  Item I
Active    ?       5       ?       3       ?       ?       2       ?       ?

(The rows for the six other users in this table are not recoverable from the transcript.)

User-based – Similar Users: Consider each user (row) to be a vector. Compare the vectors to find the similarity between two users – Let $a$ be the vector for the active user and $u_3$ be the vector for user 3 – Cosine similarity can be used to compare the vectors. 13
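A minimal sketch of the cosine comparison follows. The slide does not say how null entries are handled; restricting the computation to items both users rated is an assumption made here (other treatments exist). The `active` row is taken from the User-based slide, while the values for `u3` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two rating vectors over co-rated items only.

    None marks a null entry. Skipping nulls is an assumption of this
    sketch, not something the slides specify.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_a = math.sqrt(sum(x * x for x, _ in pairs))
    norm_b = math.sqrt(sum(y * y for _, y in pairs))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Active user's row from the User-based slide (items A to I).
active = [None, 5, None, 3, None, None, 2, None, None]
# Hypothetical ratings for user 3.
u3 = [4, 4, 1, 3, None, 2, 1, None, 5]

similarity = cosine_similarity(active, u3)
```

Here only items B, D and G are co-rated, so the similarity is computed from those three pairs.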

User-based – Similar Users: KNN (k-nearest neighbors or top-k) – Only find the k most similar users. Threshold – Find all users whose similarity is at least θ. 14 (The example table on this slide, seven users over Item A to Item I, is not recoverable from the transcript.)
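The two neighbor-selection rules can be sketched as follows, assuming similarities to the active user have already been computed. The function names and the similarity values are hypothetical.

```python
def top_k_neighbors(similarities, k):
    """Return the k users most similar to the active user."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


def threshold_neighbors(similarities, theta):
    """Return every user whose similarity reaches the threshold theta."""
    return [(user, sim) for user, sim in similarities.items() if sim >= theta]


# Hypothetical similarities between the active user and four other users.
similarities = {"user1": 0.95, "user2": 0.40, "user3": 0.88, "user4": 0.10}

knn = top_k_neighbors(similarities, k=2)
above = threshold_neighbors(similarities, theta=0.5)
```

KNN always returns a fixed number of neighbors, while the threshold rule may return many or none depending on θ.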

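User-based – Make Prediction: Weighted by similarity – Weight each similar user's rating based on similarity to the active user. (The formula is not preserved in the transcript; its labeled parts were the set of similar users and the prediction for the active user on item $i$.) 15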
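Since the slide's formula did not survive, the sketch below uses one common form of similarity weighting, a weighted average of the neighbors' ratings normalized by the sum of absolute similarities. This is an assumption; the original slide may have used a variant (for example, one that subtracts each user's mean rating).

```python
def predict_rating(neighbors, item_ratings):
    """Similarity-weighted average of the neighbors' ratings on one item.

    neighbors: list of (user, similarity) pairs.
    item_ratings: maps user -> rating on the item, or None if unrated.
    Returns None when no neighbor has rated the item.
    """
    numerator = 0.0
    denominator = 0.0
    for user, sim in neighbors:
        rating = item_ratings.get(user)
        if rating is None:
            continue  # neighbors without a rating contribute nothing
        numerator += sim * rating
        denominator += abs(sim)
    return numerator / denominator if denominator > 0 else None


# Hypothetical neighbors and their ratings on item i.
prediction = predict_rating(
    [("user1", 0.5), ("user3", 0.5)],
    {"user1": 4, "user3": 2},
)
```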

Main Types: Memory-based – User-based* [Resnick et al. 1994] – Item-based [Sarwar et al. 2001] – Similarity Fusion (User/Item-based) [Wang et al. 2006]. Model-based – SVD (Singular Value Decomposition) [Sarwar et al. 2000] – RSVD (Regularized SVD)* [Funk 2006]. 16

Regularized SVD: The Netflix data has 8.5 billion possible entries, based on 17 thousand movies and 0.5 million users. Only 100 million ratings are present – about 1.2% of all possible ratings. Why do we need to operate on such a large matrix? 17
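The sparsity figure follows directly from the rounded counts on the slide, as this short check shows (variable names are illustrative):

```python
num_movies = 17_000
num_users = 500_000
num_ratings = 100_000_000

total_entries = num_movies * num_users   # size of the full rating matrix
density = num_ratings / total_entries    # fraction of entries that are rated

print(f"{total_entries:,} entries, {density:.2%} filled")
# prints "8,500,000,000 entries, 1.18% filled"
```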

Regularized SVD – Setup: Let each user and item be represented by a feature vector of length $k$ – E.g. Item A may be the vector $A_k = [a_1, a_2, a_3, \ldots, a_k]$. Imagine the features for items were fixed – E.g. items are movies and each feature is a genre such as comedy, drama, etc. Each entry of the user vector is how much the user likes that feature. 18

Regularized SVD – Setup: Consider the movie Die Hard – Its feature vector may be $i = [1, 0, 0]$ if the features are action, comedy, and drama. Maybe the user has the feature vector u = [ ]. We can try to predict a user's rating using the dot product of these two vectors – $r'_{ui} = u \cdot i$ (the numeric values of $u$ and of the result are not preserved in the transcript). 19
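The dot-product prediction can be illustrated as below. The item vector for Die Hard comes from the slide; the user vector is hypothetical, since the slide's values were lost.

```python
def predict(user_vec, item_vec):
    """Predicted rating r'_ui as the dot product of user and item vectors."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


die_hard = [1, 0, 0]     # features: action, comedy, drama (from the slide)
user = [4.5, 2.0, 1.0]   # hypothetical: how much the user likes each feature

rating = predict(user, die_hard)   # 4.5: only the action feature contributes
```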

Regularized SVD – Goal: Try to find values for each item vector that work for all users. Try to find values for each user vector that produce the actual rating when taking the dot product with the item vector. That is, minimize the difference between the actual rating and the predicted (dot product) rating. 20

Regularized SVD – Setup: In reality, we cannot choose $k$ large enough to cover a fixed set of features – There are too many to consider (e.g. genre, actors, directors, etc.). Usually $k$ is only 25 to 50, which reduces the total size of the matrices to roughly 25 million to 50 million values (compared to 8.5 billion). Because $k$ is so small, the values in the vectors are NOT directly tied to any particular feature. 21
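A rough check of the storage saving, assuming one length-$k$ vector per user and per item, so the two factor matrices hold $k(|U| + |I|)$ values. With the rounded counts from the Netflix slide this gives about 13 to 26 million values for $k$ between 25 and 50; the exact figure depends on the true user and movie counts, but it is tens of millions in any case, far below 8.5 billion.

```python
num_users = 500_000   # rounded count from the Netflix slide
num_items = 17_000    # rounded count from the Netflix slide


def factor_values(k):
    """Number of stored values in a k-dimensional user and item factorization."""
    return k * (num_users + num_items)


for k in (25, 50):
    print(k, f"{factor_values(k):,}")
# prints "25 12,925,000" then "50 25,850,000"
```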

Regularized SVD – Goal: Let $u$ be a user, $i$ be an item, and $r_{ui}$ be the rating by user $u$ on item $i$, where $R$ is the set of all ratings and $\phi_u$, $\phi_i$ are the feature vectors. At first thought, it seems simple to have the following optimization goal (the formula is not preserved in the transcript). 22
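The slide's formula is missing from the transcript. A form consistent with the definitions above and with the earlier goal of minimizing the difference between actual and dot-product ratings would be the plain squared-error objective below; this is a reconstruction under that assumption, not the slide's exact notation.

```latex
\min_{\phi} \sum_{r_{ui} \in R} \left( r_{ui} - \phi_u \cdot \phi_i \right)^2
```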

Regularized SVD – Overfitting: The problem is overfitting of the features – Solved by regularization. 23

Regularized SVD – Regularization: Introduce a new optimization goal that includes a regularization term. The term minimizes the magnitude of the feature vectors – Controlled by fixed parameters $\lambda_u$ and $\lambda_i$. 24
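A minimal sketch of how such a regularized goal is commonly trained, assuming the objective is the squared error plus $\lambda_u \|\phi_u\|^2 + \lambda_i \|\phi_i\|^2$ and using plain stochastic gradient descent. The function names, learning rate, initialization, and toy ratings are all hypothetical; the slides do not give training details.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_error(ratings, phi_u, phi_i):
    """Sum of squared differences between actual and predicted ratings."""
    return sum((r - dot(phi_u[u], phi_i[i])) ** 2 for u, i, r in ratings)


def train_rsvd(ratings, k=2, lr=0.01, lam_u=0.02, lam_i=0.02, epochs=500, seed=0):
    """Fit user and item vectors by SGD on the regularized squared error."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small positive random initialization (an assumption of this sketch).
    phi_u = {u: [rng.uniform(0.05, 0.15) for _ in range(k)] for u in users}
    phi_i = {i: [rng.uniform(0.05, 0.15) for _ in range(k)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            pu, qi = phi_u[u], phi_i[i]
            err = r - dot(pu, qi)
            for f in range(k):
                pu_f, qi_f = pu[f], qi[f]
                # Gradient step: error term pulls toward the rating,
                # regularization term shrinks the vector magnitudes.
                pu[f] += lr * (err * qi_f - lam_u * pu_f)
                qi[f] += lr * (err * pu_f - lam_i * qi_f)
    return phi_u, phi_i


# Hypothetical toy ratings as (user, item, rating) triples.
toy = [("u1", "A", 5), ("u1", "B", 3), ("u2", "A", 4),
       ("u2", "C", 1), ("u3", "B", 2), ("u3", "C", 5)]

before = squared_error(toy, *train_rsvd(toy, epochs=0))
after = squared_error(toy, *train_rsvd(toy, epochs=500))
```

Larger values of `lam_u` and `lam_i` shrink the vectors more strongly, trading training error for less overfitting.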

Regularized SVD: Many refinements of the regularized optimization goal have been proposed – RSVD2/NSVD1/NSVD2 [Paterek 2007]: added a term for user bias and a term for item bias, and minimized the number of parameters – Integrated Neighborhood SVD++ [Koren 2008]: added a neighborhood-based approach to RSVD. 25

Roadmap: Introduction to Recommender Systems & Collaborative Filtering; Collaborative Competitive Filtering 26

Collaborative Competitive Filtering: Learning Recommender Using Context of User Choice. Georgia Tech and Yahoo! Labs. Best Student Paper at SIGIR'11. 27

Motivation: A user may be given 5 random movies and choose Die Hard – This tells us the user prefers action movies. A user may be given 5 action movies and choose Die Hard over Rocky and Terminator – This tells us the user prefers Bruce Willis. 28

Roadmap (CCF): Motivation; Problem Setting & Input; Techniques; Extensions 29

Problem Setting: Set of users, $U$. Set of items, $I$. Each user interaction has an offer set $O$ and a decision set $D$. Each user interaction is stored as a tuple $(u, O, D)$, where $D$ is a subset of $O$. 30
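One way to represent the $(u, O, D)$ tuple is sketched below. The class name `Interaction` and its field names are hypothetical; the subset check mirrors the requirement that $D$ be a subset of $O$. The example session reuses the movies from the Motivation slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One user interaction (u, O, D)."""
    user: str
    offer_set: frozenset      # O: items shown to the user
    decision_set: frozenset   # D: items the user chose

    def __post_init__(self):
        # D must be a subset of O: a user can only choose what was offered.
        if not self.decision_set <= self.offer_set:
            raise ValueError("decision set D must be a subset of offer set O")


session = Interaction(
    user="u1",
    offer_set=frozenset({"Die Hard", "Rocky", "Terminator"}),
    decision_set=frozenset({"Die Hard"}),
)
```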

CCF Input: a table with one row per user session (U1-S1, U1-S2, U1-S3, U2-S1, U2-S2, U3-S1, U3-S2) and one column per item (Item A to Item I); the column positions of the entries are not recoverable from the transcript. 1 means user interaction, - means it was in the offer set. 31

Roadmap (CCF): Motivation; Problem Setting & Input; Techniques; Extensions 32

Local Optimality of User Choice: Each item has a potential revenue to the user, $r_{ui}$. Users also consider the opportunity cost (OC) when weighing potential revenue – OC is what the user gives up by making a given decision. OC is $c_{ui} = \max \{ r_{ui'} \mid i' \in O \setminus \{i\} \}$. Profit is $\pi_{ui} = r_{ui} - c_{ui}$. 33
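The opportunity cost and profit definitions can be sketched as follows, reading the opportunity cost of an item as the best revenue among the other offered items. The revenue values are hypothetical, and the functions assume an offer set with at least two items.

```python
def opportunity_cost(revenue, offer_set, item):
    """c_ui: highest revenue among the offered items other than `item`.

    Assumes the offer set contains at least one other item.
    """
    return max(revenue[j] for j in offer_set if j != item)


def profit(revenue, offer_set, item):
    """pi_ui = r_ui - c_ui."""
    return revenue[item] - opportunity_cost(revenue, offer_set, item)


# Hypothetical revenues r_ui for one user and one offer set.
revenue = {"Die Hard": 0.9, "Rocky": 0.7, "Terminator": 0.6}
offer_set = set(revenue)

best = max(offer_set, key=lambda item: profit(revenue, offer_set, item))
```

Only the highest-revenue item in the offer set can have a positive profit, which is why profit captures the competition among the offered items.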

Local Optimality of User Choice: A user interaction is an opportunity give-and-take process – The user is given a set of opportunities – The user makes a decision to select one of the many opportunities – Each opportunity comes with some revenue (utility or relevance). 34

Collaborative Competitive Filtering: Local optimality constraint – Each item in the decision set has a higher revenue than those not in the decision set – The problem becomes intractable with only this constraint, since there is no unique solution. 35
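A small sketch of the local optimality check, with an illustration of why the constraint alone does not pin down a unique solution: rescaling all revenues by any positive constant leaves every inequality satisfied. The function name and the revenue values are hypothetical.

```python
def is_locally_optimal(revenue, offer_set, decision_set):
    """True if every chosen item has higher revenue than every rejected one."""
    rejected = offer_set - decision_set
    if not rejected or not decision_set:
        return True  # nothing to compare
    lowest_chosen = min(revenue[i] for i in decision_set)
    highest_rejected = max(revenue[j] for j in rejected)
    return lowest_chosen > highest_rejected


offer_set = {"Die Hard", "Rocky", "Terminator"}
decision_set = {"Die Hard"}

# Hypothetical revenues that satisfy the constraint.
revenue = {"Die Hard": 0.9, "Rocky": 0.7, "Terminator": 0.6}
# Any positive rescaling satisfies it equally well.
scaled = {item: 10 * r for item, r in revenue.items()}

ok_original = is_locally_optimal(revenue, offer_set, decision_set)
ok_scaled = is_locally_optimal(scaled, offer_set, decision_set)
```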

CCF – Hinge Model: Optimization goal – Minimize error ($\xi$, slack variable) & model complexity. 36

CCF – Hinge Model: Find average potential utility – Average utility of non-chosen items. Constraints – Chosen items have a higher utility – $e_{ui}$ is an error term. 37

CCF – Hinge Model: Optimization Goal – Assume $\xi$ is 0. (The formula is not preserved in the transcript; one of its terms was labeled as the average relevance of non-chosen items.) 38
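The hinge-model formulas are missing from the transcript, so the following is only a sketch of the idea the slides describe: the chosen item's utility should exceed the average utility of the non-chosen items, and the slack measures how far a session falls short. The margin of 1 and the function name are assumptions, and the relevance values are hypothetical.

```python
def hinge_slack(relevance, offer_set, chosen, margin=1.0):
    """Slack for one session: how far the chosen item falls short of beating
    the average relevance of the non-chosen items by `margin`.

    Assumes a single chosen item and at least one non-chosen item.
    """
    non_chosen = [relevance[j] for j in offer_set if j != chosen]
    average_non_chosen = sum(non_chosen) / len(non_chosen)
    return max(0.0, margin - (relevance[chosen] - average_non_chosen))


offer_set = {"A", "B", "C"}

# Chosen item A clears the margin: no slack needed.
satisfied = hinge_slack({"A": 2.0, "B": 0.5, "C": 0.5}, offer_set, "A")
# Chosen item A beats the average by only 0.5: slack of 0.5.
violated = hinge_slack({"A": 1.0, "B": 0.5, "C": 0.5}, offer_set, "A")
```

In a full model the relevance values would come from user and item vectors, and the total slack would be minimized together with a model-complexity term.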

CCF – How to use results: We can predict the relevance of all items based on user and item vectors – Can set a threshold if more than one item can be chosen (e.g. θ > .9 implies action). 39

Item  User Action  Predicted Relevance
A     1            .98
B     -            .93
C     -            .56
D     -            .25
E     (values not preserved in the transcript)
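Applying the threshold to the predicted relevances from this slide can be sketched as follows (item E is omitted because its values were lost; the function name is hypothetical):

```python
# Predicted relevances for items A to D, taken from the slide.
predicted = {"A": 0.98, "B": 0.93, "C": 0.56, "D": 0.25}


def predict_actions(predicted, theta):
    """Items whose predicted relevance exceeds theta, most relevant first."""
    ranked = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, relevance in ranked if relevance > theta]


actions = predict_actions(predicted, theta=0.9)   # ["A", "B"]
```

With θ = .9 the model predicts an action on both A and B, even though the user was only observed acting on A.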

Roadmap (CCF): Motivation; Problem Setting & Input; Techniques; Extensions 40

Extensions: Sessions without a response – The user does not take any opportunity. Adding content features – Fixed features for each item, rather than a limited number of parameters, to improve the accuracy of new-item prediction. 41