Pairwise Preference Regression for Cold-start Recommendation Speaker: Yuanshuai Sun
Power point The Problem the Paper will Solve Cold-start problem, there are three types of the problem: 1.Recommendation on existing items for new users; 2.Recommendation on new items for existing users; 3.Recommendation on new items for new users
Power point Cold-start and data sparsity Two Concepts to Clarify Cold-start : in general, a new user or item coming into the system without any description information about it Data sparsity : a user or item in the system with a little information about it, here We there are its data in the system but a bit.
Power point We will talk about traditional recommendation algorithm.
Power point Content-based filtering generates a profile for a user based on the content descriptions of the items previously rated by the user. The major benefit of this approach is that it can recommend users new items, which have not been rated by any users. Here to some extent, content- based filtering method can solve the new items, but just a little.
Power point However the content-based filtering have some drawback: The recommended items of content-based filtering are similar to the items previously consumed by the user. Content-based filtering cannot provide recommendations to new users who have no historical ratings. Another limitation of content-based filtering is that its performance highly depends on the quality of feature generation and selection.
Power point Collaborative filtering typically associates a user with a group of like-minded users, and then recommends items enjoyed by others in the group.
Power point There are some merits using Collaborative Filtering: Collaborative filtering does not require any feature generation and selection method and it can be applied to any domains if user ratings ( either explicit or implicit ) are available. In other words, collaborative filtering is content-independent. Collaborative filtering can provide “serendipitous finding”, whereas content-based filtering cannot. However, collaborative filtering depends on the user ratings greatly, that is to say, it still suffer from the cold-start problem where no historical ratings on items or users are available.
Power point A key challenge in recommender systems including content- based and collaborative filtering is how to provide recommendations at early stage when available data is extremely sparse. The problem is of course more severe when the system newly launches and most users and items are new.
Power point We will talk about our method for cold-start problem.
Power point Profile Construction In this paper, each item is represented by a set of features, denoted as a vector z, where and D is the number of item features, Similarly, each user is represented by a set of user features, denoted as x, where and C is the number of user features. Note that we append a constant with no information is represented as [0,…,0,1] instead of a vector of zero entries.
Power point Profile Construction In traditional collaborative filtering (CF), the ratings given by users on items of interest are used as user profiles to evaluate commonalities between users. In our regression approach, we separate these feedbacks from user profiles. The ratings are utilized as targets that reveal affinities between user features to item features.
Power point Profile Construction In the paper, we have collected three sets of data, including item features, user profiles and the ratings on items given by users. Let index the u-th user as and the i-th content item as, and denote by the interaction between the user and the item, and denotes by the index set of observations { }.
Power point Regression on Pairwise Preference A predictive model relates a pair of vectors, and, to the Rating on the item given by the user. There are various ways to construct joint feature space user/item pairs. We focus on the represented as, a vector of CD entries, where denotes the b-th feature of and denotes the a-th Feather of.
Power point Regression on Pairwise Preference We define a parametric indicator as a bilinear function of x u and z i in the following: (1) where C and D are the dimensionality of user and content features respectively, a, bare feature indices. The weight variable is independent of user and content features and characterizes the affinity of these two factors and in interaction.
Power point Regression on Pairwise Preference The indicator can be equivalently rewritten as Where W is a matrix containing entries, w denotes a column vector stacked from W, and denotes the outer Product of and, a column vector of entries.
Power point The regression coefficients can be optimized in regularization- tion framework, i.e. where is a tradeoff between empirical error and model complexity.
Power point Regression on Pairwise Preference The optimal solution of w is unique and has a closed form of matrix manipulation, i.e. where I is an CD by CD identity matrix. By exploiting the tensor structure, the matrix preparation costs where M and N are the number of items and users respectively. The matrix inverse costs, which becomes the most expensive part if M < CD and.
Power point In this paper, we introduce a personalized pairwise loss in the regression framework. For each user,the loss function is generalized as where denotes the index set of all items the user have rated, the number of ratings given by the user.
Power point Replacing the squares loss by the personalized pairwise loss in the regularization framework, we have the following optimization problem: where u runs over all users. The optimal sulution can be computed in a closed form as well, i.e.
Power point The size in matrix inverse is still CD and the matrix preparation costs same as that of the least squares loss.
Power point We will talk about experiments.
Power point Competitive approaches Most popular Segmented most popular Vibes Affinity 1 Vibes Affinity 2
Power point Most popular Most Popular ( MP ) provides the same recommendation to all users based on global popularity of items. The global popularity of an item is measured as following :
Power point Segmented most popular Segmented Most Popular ( SMP ) divides users into several user segments based on user features ( i.e. age or gender ) and computes predictions of items based on their local popularity within the user segment which a target user belongs to :
Power point Vibes Affinity 1 The algorithm computes item-item affinity based on conditional probability such as
Power point Vibes Affinity 1 Then preference score of each item j for a user u is computed as following :
Power point Vibes Affinity 2 To provide cold start recommendation, for the existing item recommendation for new users, we modified the equation to measure user attribute-item affinity such as
Power point Vibes Affinity 2 For the new item for existing user and the new item for new user we measure user attribute-item feature affinity such as
Power point DataSet We evaluate our approach with two standard movie data sets : MovieLens and EachMovie. We split user ratings into four partitions. We randomly select half of users and the rest as existing users. Similarly, we randomly split items as new and existing items. Then we use partition I for training and partition II, III and IV for test. We summarize available techniques for each partition in the following: Partition I ( recommendation on existing items for new existing users ) : This is the standard case for most existing collaborative filtering techniques, such as user-user, item based collaborative filtering, singular value deposition ( SVD ), etc.
Power point Partition II ( recommendation on existing items for new users ) : For new users without historical ratings, “most popular” strategy that recommends the highly-rated items to new users serves as a strong baseline; Partition III ( recommendation on new items for existing users ) : Content-based filtering can effectively recommend new items to existing users based on the users’ historical ratings and features of items; Partition IV ( recommendation on new items for new users ) : This is a hard case, where “random” strategy is the basic means of collecting ratings.
Power point DataSet
Power point Measurement We evaluate performance of recommender system based on the correctness of ranking rather than prediction accuracy, the normalized Discounted Cumulative Gain ( nDCG ), widely used in information retrieval ( IR ), as performance metric. If you want to konw more infromation about nDCG, please read the article at
Power point