Download presentation
Presentation is loading. Please wait.
1
CS728 The Collaboration Graph
Lecture 10
2
Review and Plan Today’s Plan Web Graph Ranking measure algorithms
PageRank HITS Personalization Via teleporting bias in PageRank How about In HITS? Today’s Plan Collaboration Graph Using explicit and implicit connectivity links Voting Algorithm, Correlation factors Applications Recommendation Systems (RS) Collaborative Filtering (CF)
3
Collaborative Framework
Given a set of users and items Items could be documents, products, other users … Goal: Recommend items to a user based on Past behavior of this and other users Who has viewed/bought/liked what? Additional information on users and items Both users and items can have known attributes [age, genre, price, …] Our Framework: generate links based on past behavior and attributes
4
Example-behavior Users Docs viewed ? U1 viewed d1, d2, d3.
U2 views d1, d2. Recommend d3 to U2 ?
5
Linking Users Link U2 to U1 as someone of interest (to talk to, to recommend)? U1 d1 d2 U2 d3
6
NNN-Naïve Nearest Neighbors
d1 U d2 U viewed d1, d2, d5. Look at who else viewed similar sets of docs V d5 W Compute set distance between pairs Recommend to U the doc(s) most “popular” among nearest users.
7
Typical Collaboration Issues
Large item space Usually with item attributes Large user base Usually with user attributes (age, gender, city, …) Some evidence of customer preferences Explicit ratings (powerful, but harder to elicit) Observations of user activity (purchases, page views, s, what was printed, …) Typically extremely sparse, even when user has an opinion
8
The Collaboration Graph
Users Items Links derived from similar attributes, explicit and implicit connections User-User Links Item-Item Links Links derived from similar attributes, similar content, explicit cross references, implicit correlations You have users, items, and then links between those items. Ex: item links: two movies in the same genre or with the same director. Ex: user links: two users have the same demographics (age, gender, etc) Goal: Predict user i’s utility for item k Observed preferences (Ratings, purchases, page views, laundry lists, play lists)
9
Link types bias outcomes
Explicit attributes-based Male, 18-35: Recommend The Matrix Item content similarity-based You liked The Matrix: recommend The Matrix Reloaded Collaborative Filtering Use implicit similarity links People with interests like yours also liked the adventure-comedy Pirates
10
Matrix view of Collaboration Graph
Docs A = Users Aij = 1 if user i viewed doc j, = 0 otherwise. AAt : Entries give # of docs commonly viewed by pairs of users.
11
Voting Algorithm Row i of AAt : Vector whose jth entry is the # of docs viewed by both i and j. Call this row ri’ e.g., (0, 7, 1, 13, 0, 2, ….) Measures the similarity of user i with all other users. Question: what values are on the diagonal?
12
How does this differ from
Voting Algorithm Compute the utility of each doc ri ° A is a vector whose kth entry gives a weighted vote count to doc k emphasizes users who have high weights in ri . Recommend doc(s) with highest vote counts. ri A How does this differ from the NNN algorithm?
13
Voting Algorithm - implementation issues
Wouldn’t implement using matrix operations use weight-propagation on compressed adjacency lists Need to log and maintain “user views doc” relationship. typically, log into database update vote-propagating structures periodically. For efficiency, discard all but the heaviest weights in each ri only in fast structures, not in back-end database. Exercise: Write pseudo code
14
Link Analysis Revisited
There are many connections between collaboration graph analysis and web link analysis: The voting algorithm may be viewed as one iteration of the Hubs/Authorities algorithm
15
User Links via Correlation
Correlations measure similarity between vectors E.g., dot-product, cosine measure, Pearson measure Each user i rates some docs (products, … ) say a real-valued rating vik for doc k in practice, one of several ratings on a form Thus we have a ratings vector vi for each user (with lots of zeros) Compute a correlation coefficient between every pair of users i,j normalized dot product of their ratings vectors (symmetric, scalar) measure of how much user pair i,j agrees: corrij
16
Application: Predict user i’s utility for doc k
Sum (over users j such that vjk is non-zero) corrij vjk Output this as the predicted utility for user i on doc k. So how does this differ from the voting algorithm?
17
Math of Correlation In statistics correlation, indicates the strength and direction of a linear relationship between two random variables. Refers to the departure of two variables from independence. The best known example is the Pearson product-moment correlation coefficient, which is obtained by dividing the covariance of the two variables by the product of their standard deviations.
18
Geometric Interpretation
Correlation can be viewed as the cosine of the angle between the two vectors Caution: This method only works with centered data, i.e., data which have been shifted by the sample mean so as to have an average of zero. Some practitioners prefer an uncentered coefficient. E.g., five countries are found to have gross national products of 1, 2, 3, 5, and 8 billion dollars and have 11%, 12%, 13%, 15%, and 18% poverty. data: x = (1, 2, 3, 5, 8) and y = (0.11, 0.12, 0.13, 0.15, 0.18). By the usual procedure for finding the angle between two vectors the uncentered correlation coefficient is: Centering the data, that is shifting x by E(x) = 3.8 and y by E(y) = 0.138) yields x = (-2.8, -1.8, -0.8, 1.2, 4.2) and y = (-0.028, , , 0.012, 0.042)
19
Varieties of Meanings in Correlations
Implicit (user views doc) vs. Explicit (user assigns rating to doc) Boolean vs. real-valued utility In practice, must convert user ratings on a form (say on a scale of 1-5) to real-valued utilities Can be fairly complicated mapping E.g., Likeminds function (Greening white paper) Requires understanding user’s interpretation of form
20
Sample Applications Ecommerce
Product recommendations – amazon, netflix Corporate Intranets Recommendation, finding domain experts, … Digital Libraries Finding pages/books people will like Medical Applications Matching patients to doctors, clinical trials, … Customer Relationship Management Matching customer problems to internal experts
21
Well-known recommender systems: Amazon and Netflix
22
Corporate intranets - document recommendation
23
Corporate intranets - “expert” finding
24
Rating interface
25
Resources GroupLens http://citeseer.nj.nec.com/resnick94grouplens.html
Has available data sets, including MovieLens Greening, Dan R. Building Consumer Trust with Accurate Product Recommendations: A White Paper on LikeMinds WebSell 2.1 Shardanand/Maes Sarwar et al.
26
Resources McLaughlin and Herlocker, SIGIR 2004
CoFE CoFE “Collaborative Filtering Engine” Open source Java Reference implementations of many popular CF algorithms
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.