Recommender Systems Robin Burke DePaul University Chicago, IL.



About myself
PhD 1993 Northwestern University
  – Intelligent Multimedia Retrieval
  – Post-doc at University of Chicago
      Kristian Hammond
  – Helped found Recommender, Inc.
      became Verb, Inc.
  – Dir. of Software Development
  – Adjunct at University of California, Irvine
  – California State University, Fullerton
2002-present
  – DePaul University

My Interests
Memory
  – How do we remember the right thing at the right time?
  – Why is it that computers are so bad at this?
  – How does knowledge of different types shape the activity of memory?

Organization
3 days
21 hours
Not me talking all the time!
Partners
  – For in-class activities
  – For coding labs
For labs
  – Must be one laptop per pair
  – Using Eclipse / Java

Activity 1
With your partner
One person should recommend a movie or DVD to the other
  – asking questions as necessary
  – in the end, you should be confident that they are right
No right or wrong way to do this!
Take note
  – the questions you ask
  – the reasons for the recommendation

Discussion
Recommender
  – What did you have to ask?
  – How did you use this information?
Recommendee
  – What made you sure the recommendation was good?

Example: Amazon.com

Product similarity

Market-basket analysis

Profitability analysis

Sequential pattern mining

Application: Recommender.com

Similar movies

Applying a critique

New results

Knowledge employed
Similarity metric
  – what makes something "alike"?
  – # of features in common is not sufficient
Movies
  – genres of movies
  – types of actors
  – directorial styles
  – meaning of ratings
      NR could mean adult, but it could just be a foreign movie

This class
Tuesday
  A. 8:00 – 10:30
  B. 10:45 – 13:00
  C. 15:00 – 18:00
Wednesday
  D. 8:00 – 10:00
  E. 10:15 – 13:00
  F. 17:00 – 19:00
Thursday
  G. 8:00 – 11:00
  H. 14:30 – 16:00
  I. 18:00 – 20:00

Roadmap
Session A: Basic Techniques I
  – Introduction
  – Knowledge Sources
  – Recommendation Types
  – Collaborative Recommendation
Session B: Basic Techniques II
  – Content-based Recommendation
  – Knowledge-based Recommendation
Session C: Domains and Implementation I
  – Recommendation domains
  – Example Implementation
  – Lab I
Session D: Evaluation I
  – Evaluation
Session E: Applications
  – User Interaction
  – Web Personalization
Session F: Implementation II
  – Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
  – Dynamics
  – Beyond accuracy

Recommender Systems
Wikipedia:
  – Recommendation systems are programs which attempt to predict items (movies, music, books, news, web pages) that a user may be interested in, given some information about the user's profile.
My definition
  – Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.

Historical note
Used to be a more restrictive definition
  – “people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients” (Resnick & Varian 1997)

Aspects of the definition
basis for recommendation
  – personalization
process of recommendation
  – interactivity
results of recommendation
  – interest / useful objects

Personalization
  – Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Definitions agree that recommendations are personalized
  – Some might say that suggesting a best-seller to everyone is a form of recommendation
Meaning
  – the process is guided by some user-specific information
      could be a long-term model
      could be a query

Interactivity
  – Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Many possible interaction styles
  – query / retrieve
  – recommendation list
  – predicted rating
  – dialog

Results
  – Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Recommendation = Search?
Search
  – a query matching process
  – given a query
      return all items that match it
Recommendation
  – a need satisfaction process
  – given a need
      return items that are likely to satisfy it

Some definitions
Recommendation
Items
Domain
Users
Ratings
Profile

Recommendation
A prediction of a given user's likely preference regarding an item
Issues
  – Negative prediction
  – Presentation / Interface
Notation
  – Pred(u,i)

Items
The things being recommended
  – can be products
  – can be documents
Assumption
  – Discrete items are being recommended
  – Not, for example, contract terms
Issues
  – Cost
  – Frequency of purchase
  – Customizability
  – Configurations
Notation
  – I = set of all items
  – i = an individual item

Recommendation Domain
What is being recommended?
  – a $0.99 music track?
  – a $1.9 M luxury condo?
Much depends on the characteristics of the domain
  – cost
      how costly is a false positive?
      how costly is a false negative?
  – portfolio
      OK to recommend something that the user has already seen?
      compatibility with owned items?
  – individual vs group
      are we recommending something for individual or group consumption?
  – single item vs configuration
      are we recommending a single item or a configuration of items?
      what are the constraints that tie configurations together?
  – constraints
      what types of constraints are users likely to impose (hard vs soft)?

Example 1
Music track (à la iTunes)
  – low cost
  – individual
  – configuration
      fit into existing playlist?
  – portfolio
      should not be already owned
  – constraints
      likely to be soft

Example 2
Course advising
  – high cost
  – individual
  – configuration
      must fit with other courses
      prerequisites
  – portfolio
      should not have already been taken
  – constraints
      may be hard
        – graduation requirements
        – time and day

Example 3
DVD rental
  – low cost
  – group consumption
  – no configuration issues
  – portfolio
      possible to recommend a favorite title again
        – Christmas movies
  – constraints
      likely to be soft
      some could be hard like maximum allowed rating

Users
People who need / want items
Assumption
  – (Usually) repeat users
Issues
  – Portfolio effects
Notation
  – U = set of all users
  – u = a particular user

Ratings
A (numeric) score given by a user to a particular item representing the user's preference for that item.
Assumption
  – Preferences are static (or at least of long duration)
Issues
  – Multi-dimensional ratings
  – Context-dependencies
Notation
  – r_{u,i} = a rating of item i by user u
  – R_{U,i} = R_i = the ratings of item i by all users

Explicit vs Implicit Ratings
An explicit rating is one that has been provided by a user
  – via a user interface
An implicit rating is inferred from user behavior
  – for example, as recorded in web log data
Issues
  – effort threshold
  – noise

Collecting Explicit Ratings

Profile
A user profile is everything that the system knows about a particular user
Issues
  – profile dimensionality
Notation
  – P = all profiles
  – P_u = the profile of user u

Knowledge Sources
An AI system requires knowledge
Takes various forms
  – raw data
  – algorithm
  – heuristics
  – ontology
  – rule base

In Recommendation
Social knowledge
User knowledge
Content knowledge

Knowledge source: Collaborative
A collaborative knowledge source is one that holds information about peer users in a system
Examples
  – ratings of items
  – age, sex, income of other users

Knowledge source: User
A user knowledge source is one that holds information about the current user
  – the one who needs a recommendation
Example
  – a query the user has entered
  – a model of the user's preferences

Knowledge source: Content
A content knowledge source holds information about the items being recommended
Example
  – knowledge about how items satisfy user needs
  – knowledge about the attributes of items

Recommendation Knowledge Sources Taxonomy
[Diagram: Recommendation Knowledge branches into Collaborative, Content, and User sources. Node labels: Opinion Profiles, Demographic Profiles, Opinions, Demographics, Item Features, Means-ends, Domain Constraints, Contextual Knowledge, Requirements, Query, Constraints, Preferences, Context, Domain Knowledge, Feature Ontology]

Break


Recommendation Types
Default (non-personalized)
  – “Would you like fries with that?”
Collaborative
  – “Most people who bought hamburgers also bought fries.”
Demographic
  – “Most 45-year-old computer scientists buy fries.”
Content-based
  – “You usually buy fries with your burgers.”
Knowledge-based
  – “A large order of curly fries would really complement the flavor of a Western Bacon Cheeseburger.”

Collaborative
Key knowledge source
  – opinion database
Process
  – given a target user, find similar peer users
  – extrapolate from peer user ratings to the target user

Demographic
Key knowledge sources
  – Demographic profiles
  – Opinion profiles
Process
  – for target user, find users of similar demographic
  – extrapolate from similar users to target user

Content-based
Key knowledge sources
  – User’s opinion
  – Item features
Process
  – learn a function that maps from item features to user’s opinion
  – apply this function to new items
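The two-step content-based process (learn a mapping from item features to the user's opinion, then apply it to new items) can be sketched in a few lines. This is only an illustration: the additive feature-weight model, the function names, and the movie data are all assumptions made for the example, not something taken from the slides.

```python
def learn_feature_weights(rated_items, item_features):
    """Learn one weight per feature from a single user's ratings.

    Assumption: each feature of a rated item receives that item's
    deviation from the user's mean rating, summed over all rated items.
    """
    if not rated_items:
        return {}
    mean = sum(rated_items.values()) / len(rated_items)
    weights = {}
    for item, rating in rated_items.items():
        for feature in item_features[item]:
            weights[feature] = weights.get(feature, 0.0) + (rating - mean)
    return weights


def score(item, weights, item_features):
    """Apply the learned function to an item (rated or not)."""
    return sum(weights.get(f, 0.0) for f in item_features[item])


# Hypothetical item features and ratings, invented for this sketch.
item_features = {
    "m1": {"comedy"},
    "m2": {"horror"},
    "m3": {"comedy", "romance"},
    "m4": {"horror", "thriller"},
}
my_ratings = {"m1": 5, "m2": 1}

weights = learn_feature_weights(my_ratings, item_features)
ranked = sorted(["m3", "m4"],
                key=lambda m: score(m, weights, item_features),
                reverse=True)
```

Note that no other user's ratings are consulted, which is what separates this approach from the collaborative one.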

Knowledge-based
Key knowledge source
  – Domain knowledge
Process
  – determine user’s requirements
  – apply domain knowledge to determine best item

Collaborative Recommendation
Identify peers
Generate recommendation


Two Problems
Generate neighborhood
  – Peers should be users with similar needs / tastes
  – How to identify peer users?
Generate predictions
  – Basic assumption = consistency in preference
  – Prefer those items generally liked by peers

Opinion Profile
Consists of ratings of items
  – P_u = {r_{u,i} : i ∈ I}
  – usually discrete numerical values
We can think of such a profile as a vector
  – some (most) ratings will be missing
  – the vector is sparse
The collection of all ratings for all users
  – the rating matrix
  – usually very sparse
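One common way to hold a sparse rating matrix in memory is a dictionary of dictionaries that stores only the ratings actually given. The sketch below assumes that representation. The user names, item names, and helper functions are hypothetical and chosen to mirror the notation P_u and R_i.

```python
# Hypothetical ratings on a 1-5 scale; all names and values are invented.
ratings = {
    "u1": {"i1": 5, "i2": 3, "i4": 4},
    "u2": {"i1": 4, "i3": 2},
    "u3": {"i2": 1, "i3": 5, "i4": 2},
}


def profile(ratings, user):
    """P_u: the ratings r_{u,i} that user u has actually supplied."""
    return ratings.get(user, {})


def item_ratings(ratings, item):
    """R_i: the ratings of one item by every user who rated it."""
    return {u: p[item] for u, p in ratings.items() if item in p}


def sparsity(ratings):
    """Fraction of the user-by-item rating matrix that is empty."""
    items = {i for p in ratings.values() for i in p}
    filled = sum(len(p) for p in ratings.values())
    return 1.0 - filled / (len(ratings) * len(items))
```

With this toy data 8 of the 12 cells are filled, so about a third of the matrix is empty. The slide notes that real rating matrices are usually very sparse.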

Cosine
The angle θ between two profile vectors P_u and P_v is given by
  \cos\theta = \frac{P_u \cdot P_v}{\lVert P_u \rVert \, \lVert P_v \rVert}

Example
Cosine similarity with Alice

Cosine, cont'd
Useful as a metric
  – varies between -1 and 1
      approaches 1 if angle is small
      approaches -1 if angle is near 180º
Common in information retrieval
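A minimal sketch of cosine similarity between two opinion profiles, assuming each profile is a dictionary from item to rating (an assumed representation, not one prescribed by the slides). Items missing from a profile add nothing to the dot product, so they are implicitly treated as zeros.

```python
import math


def cosine(p_u, p_v):
    """Cosine of the angle between two sparse rating vectors.

    Missing ratings are implicitly zero: they add nothing to the dot
    product, while each norm is taken over the user's own ratings.
    """
    dot = sum(r * p_v[i] for i, r in p_u.items() if i in p_v)
    norm_u = math.sqrt(sum(r * r for r in p_u.values()))
    norm_v = math.sqrt(sum(r * r for r in p_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty or all-zero profiles
    return dot / (norm_u * norm_v)
```

With ordinary positive ratings such as 1 to 5 the result cannot be negative. The negative half of the range only appears once ratings are centered.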

Mean Adjustment
Cosine is sensitive to the actual values in the vector
  – but users often have different "baseline" preferences
  – one might never rate an item below 3 / 5
  – another might only rarely give a 5 / 5
These differences in scale
  – can mask real similarities between preferences
Missing entries
  – are effectively zero (very negative rating)
Solution
  – mean-adjustment
  – subtract the user's mean from each rating
      an item that gets an average score becomes a 0
      below average becomes negative

Mean Adjusted Cosine
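The formula for this slide did not survive extraction. As a hedged sketch of the idea, the code below subtracts each user's own mean from their ratings and then takes the ordinary cosine. The dictionary representation and function names are assumptions, and the original slide may have written the formula differently.

```python
import math


def mean_adjust(p_u):
    """Subtract the user's mean rating from every rating they gave."""
    if not p_u:
        return {}
    mean = sum(p_u.values()) / len(p_u)
    return {i: r - mean for i, r in p_u.items()}


def mean_adjusted_cosine(p_u, p_v):
    """Cosine similarity computed on mean-centered profiles.

    After centering, a missing rating (implicitly 0) stands for
    "average for this user" instead of "very negative".
    """
    a, b = mean_adjust(p_u), mean_adjust(p_v)
    dot = sum(x * b[i] for i, x in a.items() if i in b)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical users with the same taste but different baselines.
generous = {"a": 5, "b": 3, "c": 4}
strict = {"a": 3, "b": 1, "c": 2}
similarity = mean_adjusted_cosine(generous, strict)
```

The two hypothetical users rank the items identically but use different parts of the scale. Centering makes their profiles coincide, so the similarity is 1 up to rounding.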

Example
User6 now most similar
  – because missing items aren't a penalty

Problem
How to handle missing ratings?
  – sparsity
Cosine
  – assumes a value for the missing entries
  – regular cosine
      assumes zero (not a valid rating)
  – adjusted cosine
      assumes the user's mean
Neither really satisfactory

Correlation
Don't think of ratings as dimensions
Think of them as samples of a random variable
  – user opinion
  – taken at different points
Try to estimate whether two users' opinions move in the same way
  – if they are correlated

Correlation

Pearson's r
Measurement of the correlation tendency of paired measurements
  – covariance / product of std. dev.
Items not co-rated are not considered
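A minimal sketch of Pearson's r restricted to co-rated items, again assuming profiles are dictionaries from item to rating. One convention is assumed and should be checked against whatever implementation is used in the labs: the means are taken over the co-rated items only, not over each user's full profile.

```python
import math


def pearson(p_u, p_v):
    """Pearson correlation of two users over the items both have rated."""
    common = [i for i in p_u if i in p_v]
    n = len(common)
    if n < 2:
        return 0.0  # correlation is undefined with fewer than two pairs
    mean_u = sum(p_u[i] for i in common) / n
    mean_v = sum(p_v[i] for i in common) / n
    cov = sum((p_u[i] - mean_u) * (p_v[i] - mean_v) for i in common)
    var_u = sum((p_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((p_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # a user who gave identical ratings has no variation
    # The 1/n factors of covariance and std. dev. cancel in the ratio.
    return cov / math.sqrt(var_u * var_v)
```

With exactly two co-rated items and distinct ratings the result is always +1 or -1, which is why very small overlaps need special care.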

Cosine vs Correlation

Example

Neighborhood Size
Too few
  – prediction based on only a few neighbors
Too many
  – distant neighbors included
  – niche not specifically identified
  – taken to extreme
      overall average
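Neighborhood selection is often implemented as "keep the k most similar peers". The sketch below assumes the similarities to the target user have already been computed. The function name, the `min_sim` cutoff, and the sample values are hypothetical.

```python
import heapq


def top_k_neighbors(similarities, k, min_sim=0.0):
    """Return up to k (user, similarity) pairs, most similar first.

    similarities: dict mapping each peer user to their similarity with
    the target user. Peers at or below min_sim are discarded, so the
    neighborhood may contain fewer than k users.
    """
    candidates = [(s, u) for u, s in similarities.items() if s > min_sim]
    return [(u, s) for s, u in heapq.nlargest(k, candidates)]


# Hypothetical similarity scores for one target user.
sims = {"u2": 0.9, "u3": 0.1, "u4": -0.4, "u5": 0.5}
small = top_k_neighbors(sims, 2)
large = top_k_neighbors(sims, 10)
```

A small k keeps only close peers at the cost of basing predictions on very few opinions. As k grows, distant peers enter the neighborhood and predictions drift toward the overall average.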

Sparsity
What if the neighbor has only a few ratings in common with the target?
Possible to compute correlation with just two ratings in common

Example

Considerations in Prediction
Proximity
  – should nearer neighbors get more say?
Sparsity
  – should neighbors with less overlap get less (or no) say?
Baseline
  – different users have different average ratings
All of these factors can be included in making predictions
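One way to give neighbors with little overlap less say is to shrink their similarity in proportion to the number of co-rated items, a heuristic often called significance weighting. The sketch below is an illustration only. The function name and the threshold of 50 are assumptions, and the slides do not say which scheme, if any, is intended.

```python
def significance_weight(similarity, n_common, threshold=50):
    """Devalue a similarity score that rests on few co-rated items.

    Below the threshold the similarity is scaled by n_common / threshold.
    At or above it the similarity is left unchanged.
    """
    return similarity * min(n_common, threshold) / threshold
```

Under this scheme a perfect correlation computed from only two co-rated items is reduced to 0.04. A similarity backed by 50 or more co-rated items keeps its full weight.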

Typical prediction formula
Take the user’s average
  – add a weighted average of the neighbors
  – weight using the similarity scores
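The formula itself is missing from the transcript. A commonly used form that matches the description is Pred(u,i) = mean(u) + Σ_v sim(u,v) · (r_{v,i} − mean(v)) / Σ_v |sim(u,v)|, summed over the neighbors v who rated item i. The sketch below implements that form under the assumption that it is what the slide showed. The data layout, function names, and sample ratings are hypothetical.

```python
def mean(profile):
    """Average of the ratings in one user's profile."""
    return sum(profile.values()) / len(profile)


def predict(target, item, ratings, similarities):
    """Predict the target user's rating of an item.

    ratings: dict user -> {item: rating}
    similarities: dict neighbor -> similarity with the target user
    Neighbors who have not rated the item are skipped. With no usable
    neighbor the prediction falls back to the target's own average.
    """
    base = mean(ratings[target])
    num = den = 0.0
    for v, sim in similarities.items():
        profile = ratings.get(v, {})
        if item in profile:
            num += sim * (profile[item] - mean(profile))
            den += abs(sim)
    if den == 0:
        return base
    return base + num / den


# Hypothetical data: u1 has not rated i3.
ratings = {
    "u1": {"i1": 4, "i2": 2},
    "u2": {"i1": 5, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 4, "i3": 1},
}
prediction = predict("u1", "i3", ratings, {"u2": 0.5})
```

Using deviations from each neighbor's mean, not raw ratings, addresses the baseline consideration: a generous neighbor's 5 counts for less than a strict neighbor's 5.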

Collaborative Recommendation
Advantages
  – possible to make recommendations knowing nothing about the items
  – extends common social practice, exchange of opinions
  – possible to find niches of users with obscure combinations of interests
  – possible to make disparate connections (serendipity)
Disadvantages
  – vulnerability to manipulation (more later)
  – source of ratings needed
      explicit ratings preferred
  – cold start problems (next slide)

Cold Start Problem
New item
  – how can a new item be recommended?
      no users have rated it
  – must wait for the first person to rate it
  – possible solution: genre bot
New user
  – how can a new user get a recommendation?
      needs a profile that can be compared with others
  – possible solutions
      wait for user to rate items
      require users to rate items
      give some default recommendations while waiting for data
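The last option for new users, default recommendations while waiting for data, can be as simple as a non-personalized list of the best-rated items. The sketch below assumes ranking by mean rating with a minimum number of ratings per item. The function name, the `min_count` parameter, and the data are hypothetical.

```python
def default_recommendations(ratings, n, min_count=1):
    """Non-personalized fallback: the n items with the highest mean rating.

    ratings: dict user -> {item: rating}
    Items with fewer than min_count ratings are ignored. Ties are
    broken by item name so the output is deterministic.
    """
    totals = {}
    for profile in ratings.values():
        for item, r in profile.items():
            s, c = totals.get(item, (0.0, 0))
            totals[item] = (s + r, c + 1)
    means = {i: s / c for i, (s, c) in totals.items() if c >= min_count}
    return sorted(means, key=lambda i: (-means[i], i))[:n]


# Hypothetical ratings from existing users.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "c": 5},
    "u3": {"b": 2},
}
for_new_user = default_recommendations(ratings, 2)
```

Raising `min_count` keeps an item with a single enthusiastic rating from topping the list. This fallback does nothing for the new-item case, since an unrated item never enters the ranking.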
