
Recommender Systems Robin Burke DePaul University Chicago, IL.


1 Recommender Systems Robin Burke DePaul University Chicago, IL

2 About myself
PhD 1993, Northwestern University
–Intelligent Multimedia Retrieval
1993-1998
–Post-doc at University of Chicago (Kristian Hammond)
–Helped found Recommender, Inc., which became Verb, Inc.
1998-2000
–Dir. of Software Development
–Adjunct at University of California, Irvine
2000-2002
–California State University, Fullerton
2002-present
–DePaul University

3 My Interests
Memory
–How do we remember the right thing at the right time?
–Why is it that computers are so bad at this?
–How does knowledge of different types shape the activity of memory?

4 Organization
3 days, 21 hours
Not me talking all the time!
Partners
–For in-class activities
–For coding labs
For labs
–Must be one laptop per pair
–Using Eclipse / Java

5 Activity 1
With your partner, one person should recommend a movie or DVD to the other
–asking questions as necessary
–in the end, you should be confident that they are right
No right or wrong way to do this!
Take note of
–the questions you ask
–the reasons for the recommendation

6 Discussion
Recommender
–What did you have to ask?
–How did you use this information?
Recommendee
–What made you sure the recommendation was good?

7 Example: Amazon.com

8 Product similarity


10 Market-basket analysis

11 Profitability analysis

12 Sequential pattern mining

13 Application: Recommender.com

14 Similar movies

15 Applying a critique

16 New results

17 Knowledge employed
Similarity metric
–what makes something "alike"?
–# of features in common is not sufficient
Movies
–genres of movies
–types of actors
–directorial styles
–meaning of ratings: NR could mean adult, but it could just be a foreign movie

18 This class
Tuesday: A. 8:00–10:30, B. 10:45–13:00, C. 15:00–18:00
Wednesday: D. 8:00–10:00, E. 10:15–13:00, F. 17:00–19:00
Thursday: G. 8:00–11:00, H. 14:30–16:00, I. 18:00–20:00

19 Roadmap
Session A: Basic Techniques I
–Introduction
–Knowledge Sources
–Recommendation Types
–Collaborative Recommendation
Session B: Basic Techniques II
–Content-based Recommendation
–Knowledge-based Recommendation
Session C: Domains and Implementation I
–Recommendation domains
–Example Implementation
–Lab I
Session D: Evaluation I
–Evaluation
Session E: Applications
–User Interaction
–Web Personalization
Session F: Implementation II
–Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
–Dynamics
–Beyond accuracy

20 Recommender Systems
Wikipedia:
–Recommendation systems are programs which attempt to predict items (movies, music, books, news, web pages) that a user may be interested in, given some information about the user's profile.
My definition:
–Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.

21 Historical note
Used to be a more restrictive definition
–“people provide recommendations as inputs, which the system then aggregates and directs to appropriate recipients” (Resnick & Varian 1997)

22 Aspects of the definition
basis for recommendation
–personalization
process of recommendation
–interactivity
results of recommendation
–interesting / useful objects

23 Personalization
–Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Definitions agree that recommendations are personalized
–Some might say that suggesting a best-seller to everyone is a form of recommendation
Meaning
–the process is guided by some user-specific information
could be a long-term model
could be a query

24 Interactivity
–Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Many possible interaction styles
–query / retrieve
–recommendation list
–predicted rating
–dialog

25 Results
–Any system that guides the user in a personalized way to interesting or useful objects in a large space of possible options or that produces such objects as output.
Recommendation = Search?
Search
–a query matching process
–given a query, return all items that match it
Recommendation
–a need satisfaction process
–given a need, return items that are likely to satisfy it

26 Some definitions
Recommendation
Items
Domain
Users
Ratings
Profile

27 Recommendation
A prediction of a given user's likely preference regarding an item
Issues
–Negative prediction
–Presentation / Interface
Notation
–Pred(u,i)

28 Items
The things being recommended
–can be products
–can be documents
Assumption
–Discrete items are being recommended
–Not, for example, contract terms
Issues
–Cost
–Frequency of purchase
–Customizability
–Configurations
Notation
–I = set of all items
–i = an individual item

29 Recommendation Domain
What is being recommended?
–a $0.99 music track?
–a $1.9M luxury condo?
Much depends on the characteristics of the domain
–cost: how costly is a false positive? how costly is a false negative?
–portfolio: OK to recommend something that the user has already seen? compatibility with owned items?
–individual vs group: are we recommending something for individual or group consumption?
–single item vs configuration: are we recommending a single item or a configuration of items? what are the constraints that tie configurations together?
–constraints: what types of constraints are users likely to impose (hard vs soft)?

30 Example 1
Music track (a la iTunes)
–low cost
–individual
–configuration: fit into existing playlist?
–portfolio: should not be already owned
–constraints: likely to be soft

31 Example 2
Course advising
–high cost
–individual
–configuration: must fit with other courses; prerequisites
–portfolio: should not have already been taken
–constraints: may be hard (graduation requirements, time and day)

32 Example 3
DVD rental
–low cost
–group consumption
–no configuration issues
–portfolio: possible to recommend a favorite title again (Christmas movies)
–constraints: likely to be soft; some could be hard, like a maximum allowed rating

33 Users
People who need / want items
Assumption
–(Usually) repeat users
Issues
–Portfolio effects
Notation
–U = set of all users
–u = a particular user

34 Ratings
A (numeric) score given by a user to a particular item representing the user's preference for that item.
Assumption
–Preferences are static (or at least of long duration)
Issues
–Multi-dimensional ratings
–Context-dependencies
Notation
–r_u,i = a rating of item i by user u
–R_U,i = R_i = the ratings of item i by all users

35 Explicit vs Implicit Ratings
An explicit rating is one that has been provided by a user
–via a user interface
An implicit rating is inferred from user behavior
–for example, as recorded in web log data
Issues
–effort threshold
–noise

36 Collecting Explicit Ratings

37 Profile
A user profile is everything that the system knows about a particular user
Issues
–profile dimensionality
Notation
–P = all profiles
–P_u = the profile of user u

38 Knowledge Sources
An AI system requires knowledge
Takes various forms
–raw data
–algorithm
–heuristics
–ontology
–rule base

39 In Recommendation
Social knowledge
User knowledge
Content knowledge

40 Knowledge source: Collaborative
A collaborative knowledge source is one that holds information about peer users in a system
Examples
–ratings of items
–age, sex, income of other users

41 Knowledge source: User
A user knowledge source is one that holds information about the current user
–the one who needs a recommendation
Examples
–a query the user has entered
–a model of the user's preferences

42 Knowledge source: Content
A content knowledge source holds information about the items being recommended
Examples
–knowledge about how items satisfy user needs
–knowledge about the attributes of items

43 Recommendation Knowledge Sources Taxonomy
Recommendation Knowledge
–Collaborative: Opinion Profiles (Opinions), Demographic Profiles (Demographics)
–User: Requirements (Query, Constraints, Preferences), Context
–Content: Item Features, Means-ends Knowledge, Domain Constraints, Contextual Knowledge, Domain Knowledge (Feature Ontology)

44 Break

45 Roadmap
Session A: Basic Techniques I
–Introduction
–Knowledge Sources
–Recommendation Types
–Collaborative Recommendation
Session B: Basic Techniques II
–Content-based Recommendation
–Knowledge-based Recommendation
Session C: Domains and Implementation I
–Recommendation domains
–Example Implementation
–Lab I
Session D: Evaluation I
–Evaluation
Session E: Applications
–User Interaction
–Web Personalization
Session F: Implementation II
–Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
–Dynamics
–Beyond accuracy

46 Recommendation Types
Default (non-personalized)
–“Would you like fries with that?”
Collaborative
–“Most people who bought hamburgers also bought fries.”
Demographic
–“Most 45-year-old computer scientists buy fries.”
Content-based
–“You usually buy fries with your burgers.”
Knowledge-based
–“A large order of curly fries would really complement the flavor of a Western Bacon Cheeseburger.”

47 Collaborative
Key knowledge source
–opinion database
Process
–given a target user, find similar peer users
–extrapolate from peer user ratings to the target user

48 Demographic
Key knowledge sources
–Demographic profiles
–Opinion profiles
Process
–for the target user, find users of a similar demographic
–extrapolate from similar users to the target user

49 Content-based
Key knowledge sources
–User's opinions
–Item features
Process
–learn a function that maps from item features to the user's opinion
–apply this function to new items

50 Knowledge-based
Key knowledge source
–Domain knowledge
Process
–determine the user's requirements
–apply domain knowledge to determine the best item

51 Collaborative Recommendation
Two steps: identify peers, then generate a recommendation

52 Recommendation Knowledge Sources Taxonomy
Recommendation Knowledge
–Collaborative: Opinion Profiles (Opinions), Demographic Profiles (Demographics)
–User: Requirements (Query, Constraints, Preferences), Context
–Content: Item Features, Means-ends Knowledge, Domain Constraints, Contextual Knowledge, Domain Knowledge (Feature Ontology)

53 Two Problems
Generate neighborhood
–Peers should be users with similar needs / tastes
–How to identify peer users?
Generate predictions
–Basic assumption = consistency in preference
–Prefer those items generally liked by peers

54 Opinion Profile
Consists of ratings of items
–P_u = {r_u,i | i ∈ I}
–usually discrete numerical values
We can think of such a profile as a vector
–some (most) ratings will be missing
–the vector is sparse
The collection of all ratings for all users is the rating matrix
–usually very sparse
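Since most ratings are missing, profiles are naturally stored sparsely. A minimal Python sketch of the idea (all user and item names are illustrative, not from the lecture):

```python
# Sparse opinion profiles: each user maps only the items they rated.
ratings = {
    "Alice": {"item1": 5, "item2": 3, "item4": 4},
    "User1": {"item1": 3, "item3": 2, "item4": 3},
}

def as_vector(profile, items):
    """Expand a sparse profile into a dense vector; unrated items stay None."""
    return [profile.get(i) for i in items]

items = ["item1", "item2", "item3", "item4"]
print(as_vector(ratings["Alice"], items))  # [5, 3, None, 4]
```

The `None` entries make the sparsity explicit; how to treat them is exactly the question the similarity measures on the following slides answer differently.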

55 Cosine
The angle θ between two rating vectors u and v is given by cos θ = (u · v) / (‖u‖ ‖v‖)

56 Example
Cosine similarity with Alice

57 Cosine, cont'd
Useful as a metric
–varies between -1 and 1
approaches 1 if the angle is small
approaches -1 if the angle is near 180º
Common in information retrieval
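The cosine measure can be sketched directly over sparse profiles; items missing from a profile contribute nothing to the dot product, which is equivalent to treating them as zeros. A sketch with made-up profiles:

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse profiles (dicts of item -> rating).
    Missing ratings are implicitly zero, as in plain (unadjusted) cosine."""
    common = set(p) & set(q)
    dot = sum(p[i] * q[i] for i in common)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)

a = {"item1": 5, "item2": 3}              # illustrative data
b = {"item1": 4, "item2": 4, "item3": 1}
print(cosine(a, b))
```

Because ratings are non-negative, profiles built this way never actually reach the negative end of the [-1, 1] range; mean adjustment (next slides) changes that.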

58 Mean Adjustment
Cosine is sensitive to the actual values in the vector
–but users often have different "baseline" preferences
–one might never rate an item below 3 / 5
–another might only rarely give a 5 / 5
These differences in scale
–can mask real similarities between preferences
Missing entries
–are effectively zero (a very negative rating)
Solution
–mean-adjustment: subtract the user's mean from each rating
an item that gets an average score becomes a 0
below average becomes negative

59 Mean Adjusted Cosine
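The formula on this slide is not reproduced in the transcript; the standard mean-adjusted cosine subtracts each user's own mean rating before taking the cosine. A sketch under that assumption (data illustrative):

```python
import math

def mean_adjusted_cosine(p, q):
    """Cosine similarity after subtracting each user's mean rating.
    Only rated items contribute; missing entries become 0 after adjustment."""
    mp = sum(p.values()) / len(p)
    mq = sum(q.values()) / len(q)
    adj_p = {i: r - mp for i, r in p.items()}
    adj_q = {i: r - mq for i, r in q.items()}
    common = set(adj_p) & set(adj_q)
    dot = sum(adj_p[i] * adj_q[i] for i in common)
    norm_p = math.sqrt(sum(v * v for v in adj_p.values()))
    norm_q = math.sqrt(sum(v * v for v in adj_q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)

a = {"i1": 5, "i2": 3, "i3": 4}  # mean 4
b = {"i1": 4, "i2": 2, "i3": 3}  # mean 3: same shape, lower baseline
print(mean_adjusted_cosine(a, b))  # close to 1: baselines no longer mask similarity
```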

60 Example
User6 now most similar
–because missing items aren't a penalty

61 Problem
How to handle missing ratings?
–sparsity
Cosine assumes a value for these entries
–regular cosine assumes zero (not a valid rating)
–adjusted cosine assumes the user's mean
Neither is really satisfactory

62 Correlation
Don't think of ratings as dimensions
Think of them as samples of a random variable
–user opinion
–taken at different points
Try to estimate whether two users' opinions move in the same way
–i.e., whether they are correlated

63 Correlation

64 Pearson's r
Measurement of the correlation tendency of paired measurements
–covariance / product of std. dev.
Items not co-rated are not considered
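Following the slide's definition (covariance divided by the product of standard deviations, over co-rated items only), Pearson's r can be sketched like this; the profiles are illustrative:

```python
import math

def pearson(p, q):
    """Pearson correlation between two sparse profiles, restricted to the
    items both users rated; returns 0 when there is too little overlap."""
    common = sorted(set(p) & set(q))
    if len(common) < 2:
        return 0.0
    mp = sum(p[i] for i in common) / len(common)
    mq = sum(q[i] for i in common) / len(common)
    cov = sum((p[i] - mp) * (q[i] - mq) for i in common)
    sp = math.sqrt(sum((p[i] - mp) ** 2 for i in common))
    sq = math.sqrt(sum((q[i] - mq) ** 2 for i in common))
    if sp == 0 or sq == 0:
        return 0.0
    return cov / (sp * sq)

p = {"a": 1, "b": 2, "c": 3}
q = {"a": 2, "b": 4, "c": 6, "d": 1}  # item "d" is ignored: not co-rated
print(pearson(p, q))
```

Note that restricting to co-rated items sidesteps the missing-value problem of cosine, but invites the sparsity issue on the next slides: a correlation computed from only two shared ratings is not very trustworthy.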

65 Cosine vs Correlation

66 Example

67 Neighborhood Size
Too few
–prediction based on only a few neighbors
Too many
–distant neighbors included
–niche not specifically identified
–taken to the extreme: the overall average

68 Sparsity
What if the neighbor has only a few ratings in common with the target?
It is possible to compute a correlation with just two ratings in common

69 Example

70 Considerations in Prediction
Proximity
–should nearer neighbors get more say?
Sparsity
–should neighbors with less overlap get less (or no) say?
Baseline
–different users have different average ratings
All of these factors can be included in making predictions

71 Typical prediction formula
Take the user's average
–add a weighted average of the neighbors' ratings
–weight using the similarity scores
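The verbal recipe above corresponds to the usual user-based prediction: Pred(u,i) = r̄_u + Σ_v sim(u,v)·(r_v,i − r̄_v) / Σ_v |sim(u,v)|. A sketch, with made-up neighbor data:

```python
def predict(user_mean, neighbors, item):
    """Predict a rating as the user's mean plus a similarity-weighted average
    of the neighbors' mean-adjusted ratings for the item.
    `neighbors` is a list of (similarity, neighbor_mean, neighbor_ratings)."""
    num = 0.0
    den = 0.0
    for sim, n_mean, n_ratings in neighbors:
        if item in n_ratings:
            num += sim * (n_ratings[item] - n_mean)
            den += abs(sim)
    if den == 0:
        return user_mean  # no neighbor rated the item: fall back to the average
    return user_mean + num / den

neighbors = [
    (0.9, 3.0, {"movie": 5}),  # illustrative: close neighbor, rated well above their mean
    (0.5, 4.0, {"movie": 4}),  # weaker neighbor, rated at their mean
]
print(predict(3.5, neighbors, "movie"))
```

Adjusting by each neighbor's mean addresses the baseline consideration above, and weighting by similarity addresses proximity; handling sparsity (e.g., discounting low-overlap neighbors) would be a further refinement.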

72 Collaborative Recommendation
Advantages
–possible to make recommendations knowing nothing about the items
–extends a common social practice: the exchange of opinions
–possible to find niches of users with obscure combinations of interests
–possible to make disparate connections (serendipity)
Disadvantages
–vulnerability to manipulation (more later)
–source of ratings needed (explicit ratings preferred)
–cold start problems (next slide)

73 Cold Start Problem
New item
–how can a new item be recommended? no users have rated it
–must wait for the first person to rate it
–possible solution: genre bot
New user
–how can a new user get a recommendation? needs a profile that can be compared with others
–possible solutions
wait for the user to rate items
require users to rate items
give some default recommendations while waiting for data

74 Roadmap
Session A: Basic Techniques I
–Introduction
–Knowledge Sources
–Recommendation Types
–Collaborative Recommendation
Session B: Basic Techniques II
–Content-based Recommendation
–Knowledge-based Recommendation
Session C: Domains and Implementation I
–Recommendation domains
–Example Implementation
–Lab I
Session D: Evaluation I
–Evaluation
Session E: Applications
–User Interaction
–Web Personalization
Session F: Implementation II
–Lab II
Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
–Dynamics
–Beyond accuracy

