CF Recommenders
DAN Best uncle Dan is checking out Sears to buy his nephew a brand new bike.
When Dan chooses the bike he wants, he gets an important reminder: people who bought this bike were also interested in buying a riding helmet.
DANA Dana, a young mother, is looking to buy jeans for her kids. She tries shopping at the ToysRUS and TCP online stores.
Maybe she’ll find them there. Not found! Dana didn’t find anything she likes, so she decides to check out Sears.com. Maybe she’ll find them there.
When Dana opens sears.com, it automatically opens on the kids section and shows jeans as her top recommended choices.
What are Recommender Systems? A recommender system, or recommendation system, is a subclass of information filtering system that seeks to predict the 'rating' or 'preference' that a user would give to an item. An information filtering system removes redundant or unwanted information from an information stream using (semi)automated or computerized methods before presenting it to a human user. [Source: Wikipedia]
What are Recommender Systems? Common use case: a recommender system analyzes patterns of user interest in products (or items) to provide personalized recommendations that suit a user’s taste.
Recommender Systems – Main Approaches Content Filtering: a profile is created for each user or product to characterize its nature. Examples: a movie profile (genre, actors, year, etc.); a user profile (demographic information, answers provided on a questionnaire, etc.). The recommender system uses the profiles to associate users with matching movies (items). This approach requires gathering external information.
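A minimal sketch of profile matching, to make the idea concrete. The tag-set profiles, the movie names, and the use of Jaccard overlap as the matching score are all assumptions made for illustration; the slides do not prescribe a particular matching function.

```python
# Hypothetical profiles: each movie and the user are described by a set of tags.
movie_profiles = {
    "Movie A": {"comedy", "1990s"},
    "Movie B": {"action", "2010s"},
    "Movie C": {"comedy", "romance", "2010s"},
}
user_profile = {"comedy", "2010s"}  # e.g. built from questionnaire answers


def jaccard(a, b):
    """Overlap between two tag sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_profile(user_tags, item_profiles):
    """Return item names sorted from best to worst profile match."""
    return sorted(
        item_profiles,
        key=lambda name: jaccard(user_tags, item_profiles[name]),
        reverse=True,
    )


ranking = rank_by_profile(user_profile, movie_profiles)
```

Here "Movie C" ranks first because it shares two of its three tags with the user profile. Note that every tag had to be supplied from outside the system, which is the "external information" cost of this approach.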
Recommender Systems – Main Approaches Collaborative Filtering – relies only on past user behavior without requiring the creation of explicit profiles. Examples: User X watched movie Y. User X gave movie Y a 4-star rating.
Recommender Systems – Main Approaches Collaborative Filtering analyzes relationships between users and interdependencies among products to identify new user-item associations. It can address data aspects that are elusive and difficult to profile, it is domain-free, and it is usually more accurate than Content Filtering. It suffers from “cold start”: new users or items without previous data can’t be handled (more on that later).
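A small sketch of the only input collaborative filtering needs: a log of past user behavior. The event format, the names `build_ratings` and `is_cold_start`, and the sample data are hypothetical, chosen to mirror the "User X gave movie Y a 4-star rating" example.

```python
from collections import defaultdict

# Hypothetical behavior log: (user, item, rating). A rating of None stands for
# an implicit event such as "watched" with no explicit score.
events = [
    ("X", "Y", 4.0),
    ("X", "Z", None),
    ("W", "Y", 5.0),
]


def build_ratings(events):
    """Collect explicit ratings into a nested dict: user -> item -> rating."""
    ratings = defaultdict(dict)
    for user, item, rating in events:
        if rating is not None:
            ratings[user][item] = rating
    return dict(ratings)


def is_cold_start(user, ratings):
    """A user with no recorded ratings cannot be compared to other users."""
    return not ratings.get(user)


ratings = build_ratings(events)
```

A user who never appears in the log, or who has only implicit events in this simplified sketch, is a cold-start case: there is nothing to compare against.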
Collaborative Filtering The two primary areas of collaborative filtering are neighborhood methods and latent factor models.
Collaborative Filtering – Neighborhood Methods Computes the relationships between items or users. Some of the methods commonly used for neighborhood-based computation are K-Nearest Neighbors (KNN) and K-Means.
Collaborative Filtering – Neighborhood Methods Example – user-oriented neighborhood method:
Neighborhood formation phase Let the record (or profile) of the target user be u (represented as a vector), and the record of another user be v (v ∈ T). The similarity between the target user, u, and a neighbor, v, can be calculated using Pearson’s correlation coefficient. [Source: CS583, Bing Liu, UIC]
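The formula itself did not survive on this slide. The standard form of Pearson’s correlation coefficient for two users is given below as a reconstruction. The symbols are assumptions: C is the set of items rated by both u and v, r_{u,i} is the rating user u gave to item i, and \bar{r}_u is the average rating of user u (taken here over C).

```latex
\[
sim(\mathbf{u}, \mathbf{v}) =
\frac{\sum_{i \in C} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}
     {\sqrt{\sum_{i \in C} (r_{u,i} - \bar{r}_u)^2}\;
      \sqrt{\sum_{i \in C} (r_{v,i} - \bar{r}_v)^2}}
\]
```

The score ranges from -1 (opposite tastes) through 0 (no linear relationship) to 1 (identical rating pattern).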
Pearson Correlation Score
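A minimal sketch of the Pearson score between two users, assuming each user's record is a dict mapping item to rating and that only co-rated items are compared. The function name `pearson` and the choice to return 0.0 when the score is undefined (fewer than two common items, or no variance) are illustrative assumptions.

```python
from math import sqrt


def pearson(ratings_u, ratings_v):
    """Pearson correlation between two users over the items both have rated."""
    common = [item for item in ratings_u if item in ratings_v]
    if len(common) < 2:
        return 0.0  # assumption: treat "not enough overlap" as no correlation

    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)

    # Numerator: co-variation of the two users around their own means.
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    # Denominator: product of each user's spread around their own mean.
    spread_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    spread_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))

    denom = spread_u * spread_v
    return cov / denom if denom else 0.0
```

Because each user's mean is subtracted first, a user who rates everything high and one who rates everything low can still come out as strongly similar if their preferences move together.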
Example Using Pearson’s correlation coefficients: $w_{D,A} = 0.9$, $w_{D,B} = -0.7$, $w_{D,C} = 0$
Recommendation Phase The rating prediction of item i for target user u is computed from V, the set of k similar users, where $r_{v,i}$ is the rating that user v gave to item i.
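The prediction formula is missing from this slide. A common form for user-based kNN, offered here as a reconstruction rather than a quotation, is $p(u,i) = \bar{r}_u + \frac{\sum_{v \in V} sim(u,v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in V} |sim(u,v)|}$, where $\bar{r}_u$ and $\bar{r}_v$ are the users' average ratings. The sketch below implements that form; the function name and the tuple layout for neighbors are hypothetical.

```python
def predict_user_based(mean_u, neighbors):
    """Predict the target user's rating of one item.

    mean_u:    average rating of the target user u
    neighbors: list of (sim, rating_v_i, mean_v) tuples, one per neighbor v in V,
               where rating_v_i is v's rating of the item and mean_v is v's average
    """
    weighted_dev = sum(sim * (r_vi - mean_v) for sim, r_vi, mean_v in neighbors)
    total_weight = sum(abs(sim) for sim, _, _ in neighbors)
    if total_weight == 0:
        return mean_u  # assumption: fall back to the user's own average
    return mean_u + weighted_dev / total_weight


# Similarity weights from the example (0.9, -0.7, 0); the ratings and
# averages are made-up numbers for illustration only.
prediction = predict_user_based(
    mean_u=3.0,
    neighbors=[(0.9, 5.0, 4.0), (-0.7, 2.0, 3.0), (0.0, 4.0, 4.0)],
)
```

With these made-up numbers, the positively correlated neighbor rated the item above their average and the negatively correlated neighbor rated it below theirs, so both push the prediction up, to about 4.0. The neighbor with weight 0 contributes nothing.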
Issue with the user-based kNN CF The problem with the user-based formulation of collaborative filtering is the lack of scalability: it requires the real-time comparison of the target user to all user records in order to generate predictions. A variation of this approach that remedies this problem is called item-based CF.
Item-based CF The item-based approach works by comparing items based on their pattern of ratings across users. A similarity score is computed for each pair of items i and j.
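The item-similarity formula is missing from this slide. A measure often used for item-based CF is the adjusted cosine similarity, $sim(i,j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}$, where U is the set of users who rated both items and $\bar{r}_u$ is user u's average rating. Treating this as the intended measure is an assumption; the sketch below implements it on a hypothetical nested dict of ratings.

```python
from math import sqrt


def adjusted_cosine(ratings, item_i, item_j):
    """Adjusted cosine similarity of two items.

    ratings: dict user -> {item: rating}. Only users who rated both items
    contribute; each rating is centered on that user's own average.
    """
    num = sq_i = sq_j = 0.0
    for user_ratings in ratings.values():
        if item_i in user_ratings and item_j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            dev_i = user_ratings[item_i] - mean_u
            dev_j = user_ratings[item_j] - mean_u
            num += dev_i * dev_j
            sq_i += dev_i ** 2
            sq_j += dev_j ** 2
    denom = sqrt(sq_i) * sqrt(sq_j)
    return num / denom if denom else 0.0  # assumption: undefined -> 0.0


# Made-up ratings for illustration.
sample = {
    "u1": {"a": 5.0, "b": 4.0, "c": 1.0},
    "u2": {"a": 4.0, "b": 5.0, "c": 2.0},
    "u3": {"a": 1.0, "b": 2.0, "c": 5.0},
}
sim_ab = adjusted_cosine(sample, "a", "b")
sim_ac = adjusted_cosine(sample, "a", "c")
```

In the sample data, items "a" and "b" are liked and disliked by the same users, so their similarity is positive, while "a" and "c" come out negative. Since item-item similarities change slowly, they can be precomputed offline, which is what addresses the scalability issue of the user-based formulation.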
Recommendation phase After computing the similarity between items, we select the set of k items most similar to the target item and generate a predicted value of user u’s rating, where J is the set of k similar items.
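The prediction formula is missing here as well. A common form is the similarity-weighted average of the user's own ratings, $p(u,i) = \frac{\sum_{j \in J} r_{u,j}\, sim(i,j)}{\sum_{j \in J} sim(i,j)}$. The sketch below follows that form under two stated assumptions: J is drawn only from items the user has rated, and only positively similar items are kept so the denominator stays positive. The function name and argument layout are hypothetical.

```python
def predict_item_based(user_ratings, similarities, k=2):
    """Predict user u's rating of a target item i.

    user_ratings: dict item j -> rating r_{u,j} given by user u
    similarities: dict item j -> sim(i, j) for the target item i
    k:            number of most similar rated items to use (the set J)
    """
    candidates = [
        (sim, user_ratings[j])
        for j, sim in similarities.items()
        if j in user_ratings and sim > 0  # assumption: drop non-positive sims
    ]
    top_k = sorted(candidates, reverse=True)[:k]  # highest similarity first
    total_sim = sum(sim for sim, _ in top_k)
    if total_sim == 0:
        return None  # no usable neighbors: cannot predict
    return sum(sim * rating for sim, rating in top_k) / total_sim


# Made-up numbers: item "d" is similar but unrated by the user, "c" is dissimilar.
prediction = predict_item_based(
    user_ratings={"a": 5.0, "b": 3.0, "c": 1.0},
    similarities={"a": 0.8, "b": 0.2, "c": -0.5, "d": 0.9},
    k=2,
)
```

With these made-up numbers, J = {"a", "b"}, and the prediction is (0.8 × 5 + 0.2 × 3) / (0.8 + 0.2), roughly 4.6: the user's rating of the most similar item dominates.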