1
Data Mining: Concepts and Techniques
Recommender Systems, May 4, 2018
2
Recommender Systems
RS – a problem of information filtering
RS – a machine learning problem: it seeks to predict the 'rating' that a user would give to an item she/he has not yet considered
Enhance user experience
Assist users in finding information
Reduce search and navigation time
3
Types of RS
Three broad types:
Content based RS
Collaborative RS
Hybrid RS
4
Types of RS – Content based RS
Content based RS highlights
Recommend items similar to those users preferred in the past
User profiling is the key
Items/content usually denoted by keywords
Matching “user preferences” with “item characteristics” … works for textual information
Vector Space Model widely used
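The matching of user preferences against item characteristics in a Vector Space Model can be sketched as follows. This is a minimal illustration only, assuming keyword-count vectors compared by cosine similarity; the item names, the keywords, and the `cosine` helper are hypothetical and do not come from the slides.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse keyword-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical items, each described by a bag of keywords.
items = {
    "article_a": Counter(["python", "data", "mining"]),
    "article_b": Counter(["cooking", "pasta", "recipes"]),
    "article_c": Counter(["data", "clustering", "mining"]),
}

# User profile: sum of the keyword vectors of items the user liked in the past.
liked = ["article_a"]
profile = Counter()
for name in liked:
    profile.update(items[name])

# Rank the items the user has not seen by similarity to the profile.
candidates = [n for n in items if n not in liked]
ranked = sorted(candidates, key=lambda n: cosine(profile, items[n]), reverse=True)
```

Here `ranked` lists the unseen items from most to least similar to the profile; an item sharing no keywords with the profile scores zero.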
5
Types of RS – Content based RS
Content based RS - Limitations
Not all content is well represented by keywords, e.g. images
Items represented by the same set of features are indistinguishable
Users with thousands of purchases are a problem
New user: no history available
6
Types of RS – Collaborative RS
Collaborative RS highlights
Use other users’ recommendations (ratings) to judge an item’s utility
Key is to find users/user groups whose interests match those of the current user
Vector Space Model widely used (directions of vectors are user-specified ratings)
More users, more ratings: better results
Can also account for items dissimilar to the ones seen in the past
Example: Movielens.org
7
Types of Collaborative Filtering
User-based collaborative filtering
Item-based collaborative filtering
8
User-based Collaborative Filtering
Idea: People who agreed in the past are likely to agree again
To predict a user’s opinion for an item, use the opinions of similar users
Similarity between users is decided by looking at their overlap in opinions for other items
9
Example: User-based Collaborative Filtering
          Item 1  Item 2  Item 3  Item 4  Item 5
User 1      8       1       ?       2       7
[Rows for Users 2 to 6 are only partly recoverable: User 2: 5; User 3: 4; User 4: 3; User 5: 6; User 6: none]
10
Similarity between users
          Item 1  Item 2  Item 3  Item 4  Item 5
User 1      8       1       ?       2       7
[Rows for the other users are only partly recoverable: User 2: 5; User 4: 3]
How similar are users 1 and 2?
How similar are users 1 and 5?
How do you calculate similarity?
11
Similarity between users: simple way
          Item 1  Item 2  Item 3  Item 4  Item 5
User 1      8       1       ?       2       7
[Row for User 2 is only partly recoverable: 5]
Only consider items both users have rated
For each item: calculate the difference in the users’ ratings
Take the average of this difference over the items
Average over all items j rated by both User 1 and User 2 of | rating (User 1, Item j) – rating (User 2, Item j) |
The smaller this average difference, the more similar the two users
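The simple similarity measure above can be sketched in a few lines. This is an illustration under assumptions: User 1's ratings are those shown on the slide, while the two other rating rows and the function name `mean_abs_difference` are hypothetical. Note that the result is a distance, so lower means more similar.

```python
def mean_abs_difference(ratings_a, ratings_b):
    """Average |a - b| over the items both users rated (None = not rated).

    Returns None when the two users have no rated item in common.
    """
    diffs = [abs(a - b) for a, b in zip(ratings_a, ratings_b)
             if a is not None and b is not None]
    if not diffs:
        return None
    return sum(diffs) / len(diffs)

user_1 = [8, 1, None, 2, 7]    # Items 1..5, as on the slide
other_a = [2, None, 5, 7, 5]   # hypothetical user
other_b = [9, 3, 6, 3, 7]      # hypothetical user

dist_a = mean_abs_difference(user_1, other_a)  # averaged over Items 1, 4, 5
dist_b = mean_abs_difference(user_1, other_b)  # averaged over Items 1, 2, 4, 5
```

With these made-up rows, `other_b` ends up closer to User 1 than `other_a`, so its rating for Item 3 would carry more weight in a prediction.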
12
Algorithm 1: using entire matrix
Aggregation function: often weighted sum
Weight depends on similarity
[Figure: six users, with the target user in the middle (marked by a red circle); the distance between users reflects how similar they are. The numbers in yellow boxes are the other users’ ratings for Item 3: 5, 7, 7, 8, 4. The ratings of all other users are passed to an aggregation function (often a weighted sum) to decide the predicted rating for the target user.]
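A weighted-sum aggregation over all users might look like the sketch below. The five Item 3 ratings are the yellow-box values from the slide; the similarity weights, their pairing with the ratings, and the function name `predict_weighted` are assumptions made for illustration.

```python
def predict_weighted(ratings, weights):
    """Similarity-weighted average of the other users' ratings for one item."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no user has a non-zero similarity weight")
    return sum(r * w for r, w in zip(ratings, weights)) / total_weight

item_3_ratings = [5, 7, 7, 8, 4]           # ratings for Item 3 from the slide
similarities = [0.9, 0.8, 0.5, 0.2, 0.1]   # hypothetical weights, one per user

prediction = predict_weighted(item_3_ratings, similarities)
```

Dividing by the total weight keeps the prediction on the original rating scale; with equal weights the result reduces to the plain average of the five ratings.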
13
Algorithm 2: K-Nearest-Neighbour
Neighbours are people who have historically had the same taste as our user
Aggregation function: often weighted sum
Weight depends on similarity
[Figure: six users, with the target user in the middle (marked by a red circle); the distance between users reflects how similar they are. The blue area around the target user shows the nearest neighbours (two of the users in this case). The numbers in yellow boxes are the other users’ ratings for Item 3: 5, 7, 7, 8, 4. Only the ratings of the nearest neighbours are passed to an aggregation function (often a weighted sum) to decide the predicted rating for the target user.]
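The nearest-neighbour variant differs only in restricting the aggregation to the k most similar users. A minimal sketch, assuming hypothetical similarity values paired with the Item 3 ratings from the slide and k = 2 to mirror the two neighbours in the picture:

```python
def predict_knn(neighbours, k):
    """Similarity-weighted average over the k most similar users.

    neighbours: list of (similarity, rating) pairs for the target item.
    """
    nearest = sorted(neighbours, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in nearest)
    if total == 0:
        raise ValueError("nearest neighbours have zero total similarity")
    return sum(sim * rating for sim, rating in nearest) / total

# (similarity, rating for Item 3); the similarities are hypothetical.
neighbours = [(0.9, 5), (0.8, 7), (0.5, 7), (0.2, 8), (0.1, 4)]

prediction = predict_knn(neighbours, k=2)
```

Setting k to the number of users gives the same result as using the entire matrix, so the two algorithms can share one implementation.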
14
Item-based Collaborative Filtering
Idea: a user is likely to have the same opinion for similar items [same idea as in Content-Based Filtering]
Similarity between items is decided by looking at how other users have rated them [different from Content-based, where item features are used]
Advantages (compared to user-based CF):
Eases the user cold-start problem
Improves scalability (similarity between items is more stable than between users)
15
Example: Item-based Collaborative Filtering
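          Item 1  Item 2  Item 3  Item 4  Item 5
User 1      8       1       ?       2       7
[Rows for Users 2 to 6 are only partly recoverable: User 2: 5; User 3: 4; User 4: 3; User 5: 6; User 6: none]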
16
Similarity between items
[Table values as extracted, layout lost: ? 2 7 5 4 3 8 6]
Each row in the table holds one user’s ratings for the items
How similar are items 3 and 4?
How similar are items 3 and 5?
How do you calculate similarity?
17
Similarity between items: simple way
[Table values as extracted, layout lost: ? 2 5 7 4 3 6 8]
Each row in the table holds one user’s ratings for the items
Only consider users who have rated both items
For each user: calculate the difference in ratings for the two items
Take the average of this difference over the users
Average over all users i who have rated Items 3 and 4 of | rating (User i, Item 3) – rating (User i, Item 4) |
The smaller this average difference, the more similar the two items
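The same averaging can be applied column-wise to compare items. In the sketch below only the first row (User 1) comes from the slides; the remaining rows and the helper name `item_distance` are hypothetical, since the original table did not survive intact.

```python
# Rows are users, columns are Items 1..5; None marks a missing rating.
ratings = [
    [8, 1, None, 2, 7],   # User 1, as on the slide
    [2, None, 5, 7, 5],   # hypothetical
    [5, 4, 7, 4, 7],      # hypothetical
    [7, 1, 7, 3, 8],      # hypothetical
]

def item_distance(matrix, i, j):
    """Average |r_i - r_j| over users who rated both items (0-based columns)."""
    diffs = [abs(row[i] - row[j]) for row in matrix
             if row[i] is not None and row[j] is not None]
    return sum(diffs) / len(diffs) if diffs else None

d_3_4 = item_distance(ratings, 2, 3)   # Items 3 and 4
d_3_5 = item_distance(ratings, 2, 4)   # Items 3 and 5
```

User 1 is skipped in both comparisons because the Item 3 rating is missing. On this made-up data Item 5 is much closer to Item 3 than Item 4 is.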
18
Aggregation function: often weighted sum
Algorithms: as for user-based, can use nearest neighbours or all items
Weight depends on similarity
[Figure: five items; Item 3 is the one whose rating we need for User 1. Distances to Item 3 indicate similarity. The numbers in yellow boxes give User 1’s ratings for the other items (Item 1: 8, Item 2: 1, Item 4: 2, Item 5: 7). The blue area shows the nearest neighbours, i.e. the items most similar to Item 3 based on past ratings by other users.]
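An item-based prediction for User 1 on Item 3 could be sketched as follows. User 1's ratings are taken from the slides; the item-to-item similarity values and the function name `predict_item_based` are hypothetical.

```python
user_1_ratings = {"Item 1": 8, "Item 2": 1, "Item 4": 2, "Item 5": 7}

# Hypothetical similarity of each item to Item 3 (higher = more similar).
similarity_to_item_3 = {"Item 1": 0.6, "Item 2": 0.1, "Item 4": 0.2, "Item 5": 0.9}

def predict_item_based(user_ratings, similarities, k):
    """Weighted average of the user's own ratings on the k most similar items."""
    nearest = sorted(user_ratings, key=lambda item: similarities[item],
                     reverse=True)[:k]
    total = sum(similarities[item] for item in nearest)
    if total == 0:
        raise ValueError("nearest items have zero total similarity")
    return sum(similarities[item] * user_ratings[item] for item in nearest) / total

prediction = predict_item_based(user_1_ratings, similarity_to_item_3, k=2)
```

Unlike the user-based version, the ratings being aggregated all belong to the target user; other users only contribute through the similarity values.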
19
Types of RS – Collaborative RS
Collaborative RS - Limitations
Different users might use different scales. Possible solution: weighted ratings, i.e. deviations from average rating
Finding similar users/user groups isn’t very easy
New user: no preferences available (user cold start problem)
New item: no ratings available (item cold start problem)
Demographic filtering is required
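The "deviations from average rating" idea can be illustrated with a short sketch. Both users below are hypothetical, as is the helper name `mean_centre`; the point is only that two users with the same tastes but different scales become identical after centring.

```python
def mean_centre(ratings):
    """Replace each rating by its deviation from the user's own average."""
    rated = [r for r in ratings if r is not None]
    if not rated:
        return list(ratings)
    average = sum(rated) / len(rated)
    return [None if r is None else r - average for r in ratings]

generous = [9, 8, None, 10, 9]   # hypothetical user who rates everything high
strict = [4, 3, None, 5, 4]      # hypothetical user, same tastes, lower scale

centred_generous = mean_centre(generous)
centred_strict = mean_centre(strict)
```

Similarity measures computed on the centred values no longer penalise the two users for their different baselines.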
20
Some ways to make a Hybrid RS
Weighted. Ratings of several recommendation techniques are combined together to produce a single recommendation
Switching. The system switches between recommendation techniques depending on the current situation
Mixed. Recommendations from several different recommenders are presented simultaneously (e.g. Amazon)
Cascade. One recommender refines the recommendations given by another
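The weighted and switching strategies are simple enough to sketch. Everything here is an assumption for illustration: the function names, the default weight, and in particular the switching rule (fall back to the content-based score until the user has a minimum number of ratings), which is just one possible reading of "depending on the current situation".

```python
def weighted_hybrid(content_score, collab_score, content_weight=0.5):
    """Blend two predicted ratings into a single one."""
    return content_weight * content_score + (1 - content_weight) * collab_score

def switching_hybrid(content_score, collab_score, n_user_ratings, min_ratings=5):
    """Use the collaborative score only once the user has enough ratings."""
    if n_user_ratings >= min_ratings:
        return collab_score
    return content_score

blended = weighted_hybrid(6.0, 8.0, content_weight=0.25)
new_user = switching_hybrid(6.0, 8.0, n_user_ratings=2)
regular_user = switching_hybrid(6.0, 8.0, n_user_ratings=10)
```

In practice the weight or the switching threshold would be tuned on held-out ratings rather than fixed by hand.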
21
Model-based collaborative filtering
Instead of using ratings directly, develop a model of user ratings
Use the model to predict ratings for new items
To build the model:
Bayesian network (probabilistic)
Clustering (classification)
Rule-based approaches (e.g., association rules between co-purchased items)
22
Model-based collaborative filtering
Cluster Models
Create clusters or groups
Put a customer into a category
Classification simplifies the task of user matching
Better scalability and performance
Lower accuracy than the standard collaborative filtering method
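A cluster model reduces user matching to picking the closest group. The sketch below assumes the clusters have already been built and are summarised by centroids of average ratings; the cluster names, centroid values, and helper functions are all hypothetical.

```python
def distance(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical centroids: average ratings for Items 1..3 within each group.
centroids = {
    "action_fans": [9.0, 2.0, 8.0],
    "drama_fans": [3.0, 9.0, 4.0],
}

def assign_cluster(known_ratings, centroids, known_idx):
    """Pick the cluster whose centroid is closest on the items the customer rated."""
    def dist_to(name):
        centroid = centroids[name]
        return distance(known_ratings, [centroid[i] for i in known_idx])
    return min(centroids, key=dist_to)

customer = [8.0, 1.0]   # ratings for Items 1 and 2; Item 3 is unknown
cluster = assign_cluster(customer, centroids, known_idx=[0, 1])
predicted_item_3 = centroids[cluster][2]   # use the group's average as the prediction
```

The customer is compared with a handful of centroids instead of every other user, which is where the scalability gain comes from; the price is that everyone in a group receives the same prediction.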
23
Possible Improvement in RS
Better understanding of users and items
Social network (social RS)
User level: highlighting interests, hobbies, and keywords people have in common
Item level: linking the keywords to eCommerce