Published by Anne Casey. Modified over 9 years ago.
1
Summary of My Work in the First Year of Graduate School
Reporter: Yuanshuai Sun
E-mail: sunyuan_2008@yahoo.cn
2
Content
1. Recommender System
2. KNN Algorithm — CF
3. Matrix Factorization
4. MF on Hadoop
5. Thesis Framework
3
1 Recommender System A recommender system is a system that recommends items you may be interested in but have not tried yet. For example, if you have bought a book about machine learning, the system would give you a recommendation list that includes books about data mining, pattern recognition, and even programming techniques.
4
1 Recommender System
5
1 But how is the recommendation list obtained?
Machine Learning:
1. Nuclear Pattern Recognition Method and Its Application
2. Introduction to Robotics
3. Data Mining
4. Beauty of Programming
5. Artificial Intelligence
6
1 Recommender System There are many ways to get the list. Recommender systems are usually classified into the following categories, based on how recommendations are made:
1. Content-based recommendations: The user will be recommended items similar to the ones the user preferred in the past.
7
1 Recommender System 2. Collaborative recommendations: The user will be recommended items that people with similar tastes and preferences liked in the past.
[Figure: co-rated items and the top-1 similar user. An item that the similar user likes but the target user has not bought is recommended to the target user.]
8
1 Recommender System 3. Hybrid approaches: These methods combine collaborative and content-based methods, which can help to avoid certain limitations of each. The ways to combine collaborative and content-based methods into a hybrid recommender system can be classified as follows:
1) implementing collaborative and content-based methods separately and combining their predictions;
2) incorporating some content-based characteristics into a collaborative approach;
3) incorporating some collaborative characteristics into a content-based approach;
4) constructing a general unifying model that incorporates both content-based and collaborative characteristics.
9
2 KNN Algorithm — CF
KDD CUP 2011 website: http://kddcup.yahoo.com/index.php
Task: recommending music items based on the Yahoo! Music Dataset.
The dataset is split into two subsets:
- Train data: in the file trainIdx2.txt
- Test data: in the file testIdx2.txt
In each subset, user rating data is grouped by user. The first line for a user is formatted as: | \n
Each of the following lines describes a single rating by that user. Rating line format: \t \n
The scores are integers between 0 and 100 and are withheld from the test set. All user IDs and item IDs are consecutive integers, both starting at zero.
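The grouped-by-user layout described on this slide can be read with a small parser. This is a minimal sketch, not the author's code. The field names were not preserved on the slide, so it assumes that the header line holds a user ID and a rating count separated by `|`, and that each rating line holds an item ID and a score separated by a tab. The names `parse_ratings`, `user_id`, `item_id`, and `score` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_ratings(lines: Iterable[str]) -> Iterator[Tuple[int, Dict[int, int]]]:
    """Yield (user_id, {item_id: score}) for each user block.

    Assumed layout (field meanings are an assumption, not taken from the slide):
      header line:  user_id|number_of_ratings
      rating line:  item_id<TAB>score
    """
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between user blocks
        user_str, count_str = header.split("|")
        ratings: Dict[int, int] = {}
        for _ in range(int(count_str)):
            fields = next(it).rstrip("\n").split("\t")
            # Only the first two fields are used; extra fields are ignored.
            ratings[int(fields[0])] = int(fields[1])
        yield int(user_str), ratings


# Tiny in-memory sample instead of the real trainIdx2.txt file.
sample = ["0|2\n", "10\t90\n", "11\t50\n", "1|1\n", "10\t70\n"]
parsed = dict(parse_ratings(sample))
```

For a real file, the same function could be fed an open file handle, since iterating a text file yields its lines one by one.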
10
2 KNN Algorithm — CF KNN is the algorithm I used when I participated in KDD CUP 2011 with my advisor, Mrs. Lin. KNN is a collaborative recommendation method.
[Figure: co-rated items and the top-1 similar user. The similar user's favorite song that the target user has not seen is recommended to the target user.]
11
2 KNN Algorithm — CF [Figure with axes labeled "user" and "item"]
12
2 KNN Algorithm — CF
1. Cosine distance
2. Pearson correlation coefficient
where S_xy is the set of all items co-rated by both users x and y.
13
2 KNN Algorithm — CF 1. Cosine distance [equations not preserved]
14
2 KNN Algorithm — CF 2. Pearson correlation coefficient [equations not preserved]
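The two similarity measures and the "top-1 similar user" idea can be sketched as follows. The slide formulas were not preserved, so this sketch assumes the textbook definitions, restricted to the co-rated item set S_xy, with Pearson means taken over the co-rated items only. The function names and the toy ratings are hypothetical and are not taken from the KDD CUP data.

```python
import math
from typing import Callable, Dict, Optional

UserRatings = Dict[str, float]  # item -> score for one user


def cosine_sim(rx: UserRatings, ry: UserRatings) -> float:
    """Cosine similarity over the items rated by both users."""
    common = rx.keys() & ry.keys()
    if not common:
        return 0.0
    dot = sum(rx[s] * ry[s] for s in common)
    nx = math.sqrt(sum(rx[s] ** 2 for s in common))
    ny = math.sqrt(sum(ry[s] ** 2 for s in common))
    return dot / (nx * ny) if nx and ny else 0.0


def pearson_sim(rx: UserRatings, ry: UserRatings) -> float:
    """Pearson correlation over the items rated by both users."""
    common = rx.keys() & ry.keys()
    if len(common) < 2:
        return 0.0
    mx = sum(rx[s] for s in common) / len(common)
    my = sum(ry[s] for s in common) / len(common)
    cov = sum((rx[s] - mx) * (ry[s] - my) for s in common)
    sx = math.sqrt(sum((rx[s] - mx) ** 2 for s in common))
    sy = math.sqrt(sum((ry[s] - my) ** 2 for s in common))
    return cov / (sx * sy) if sx and sy else 0.0


def recommend_top1(
    target: str,
    ratings: Dict[str, UserRatings],
    sim: Callable[[UserRatings, UserRatings], float] = cosine_sim,
) -> Optional[str]:
    """Return the most similar user's favorite item that the target has not rated."""
    others = [u for u in ratings if u != target]
    if not others:
        return None
    neighbor = max(others, key=lambda u: sim(ratings[target], ratings[u]))
    unseen = {i: s for i, s in ratings[neighbor].items() if i not in ratings[target]}
    return max(unseen, key=unseen.get) if unseen else None


toy = {
    "alice": {"a": 90, "b": 80},
    "bob": {"a": 90, "b": 80, "c": 100, "d": 40},
    "carol": {"a": 10, "b": 90, "e": 95},
}
suggestion = recommend_top1("alice", toy)
```

With k = 1 this reduces KNN to a single neighbor; a fuller version would aggregate scores over the k most similar users.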
15
2 KNN Algorithm — CF
- trackData.txt: track information, formatted as: | | | |...| \n
- albumData.txt: album information, formatted as: | | |...| \n
- artistData.txt: artist listing, formatted as: \n
- genreData.txt: genre listing, formatted as: \n
16
2 KNN Algorithm — CF
17
2
1. The distance between a parent node and a child node: [equation not preserved], where [symbol not preserved] is the information entropy.
2. Similarity between c1 and c2: [equation not preserved]
18
2 KNN Algorithm — CF
19
2
20
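3 Matrix Factorization
Rating matrix (users u1 to u3, items i1 to i3; ? marks an unknown rating):

      i1   i2   i3
u1     1    3    ?
u2     2    ?    ?
u3     ?    1    3

Users Feature Matrix U (entries x) and Items Feature Matrix V (entries y).

Known ratings:
x11*y11 + x12*y12 = 1
x11*y21 + x12*y22 = 3
x21*y11 + x22*y12 = 2
x31*y21 + x32*y22 = 1
x31*y31 + x32*y32 = 3

Ratings to predict:
x11*y31 + x12*y32 = ?
x21*y21 + x22*y22 = ?
x21*y31 + x22*y32 = ?
x31*y11 + x32*y12 = ?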
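To make the toy system on this slide concrete, the sketch below fits two features per user and per item to the five known ratings and then fills in the four unknown ones. It is only an illustration: it uses plain stochastic gradient steps on the squared error as one possible fitting procedure, and the learning rate, epoch count, seed, and function names are assumptions rather than values from the slides.

```python
import random
from typing import Dict, List, Tuple

# Known ratings from the slide, as (user_index, item_index) -> rating, 0-based.
known: Dict[Tuple[int, int], float] = {
    (0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (2, 1): 1.0, (2, 2): 3.0,
}


def predict(X: List[List[float]], Y: List[List[float]], u: int, i: int) -> float:
    """Inner product of user u's and item i's feature vectors."""
    return sum(xf * yf for xf, yf in zip(X[u], Y[i]))


def factorize(known, n_users, n_items, k=2, lr=0.05, epochs=3000, seed=0):
    """Fit user features X and item features Y to the known ratings."""
    rng = random.Random(seed)
    X = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Y = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in known.items():
            err = r - predict(X, Y, u, i)
            for f in range(k):
                xu, yi = X[u][f], Y[i][f]  # keep old values for a simultaneous step
                X[u][f] += lr * err * yi
                Y[i][f] += lr * err * xu
    return X, Y


X, Y = factorize(known, n_users=3, n_items=3)
# Predictions for the entries marked "?" on the slide.
unknown = [(0, 2), (1, 1), (1, 2), (2, 0)]
filled = {(u, i): predict(X, Y, u, i) for (u, i) in unknown}
```

Because five equations constrain twelve parameters, many factorizations fit the known ratings equally well, so the filled-in values depend on the initialization; in practice a regularization term is commonly added.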
21
3 Matrix Factorization Matrix factorization (abbr. MF), as the name suggests, decomposes a big matrix into the product of several small matrices. Mathematically, we assume a target matrix of size m × n and two factor matrices U and V with k features each, where k << min(m, n), so that the target matrix is approximated by the product of the factor matrices.
22
3 Matrix Factorization Kernel Function. The kernel function decides how the prediction matrix is computed; that is, it is a function that takes the feature matrices U and V as arguments. We can express it as follows: [equation not preserved]
23
3 Matrix Factorization Kernel Function. For the kernel K, one can use one of the following well-known kernels:
- linear
- polynomial
- RBF
- logistic
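The four kernel names can be illustrated with their textbook forms. The slide's own formulas were not preserved, so the definitions below, the default parameters `d`, `c`, `sigma`, and `bias`, and the offset and scale arguments of `predict` are all assumptions made for this sketch.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u: Vector, v: Vector) -> float:
    return dot(u, v)


def polynomial_kernel(u: Vector, v: Vector, d: int = 2, c: float = 1.0) -> float:
    return (c + dot(u, v)) ** d


def rbf_kernel(u: Vector, v: Vector, sigma: float = 1.0) -> float:
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def logistic_kernel(u: Vector, v: Vector, bias: float = 0.0) -> float:
    # Sigmoid of the (optionally shifted) inner product, bounded in (0, 1).
    return 1.0 / (1.0 + math.exp(-(bias + dot(u, v))))


def predict(u: Vector, v: Vector,
            kernel: Callable[[Vector, Vector], float] = linear_kernel,
            offset: float = 0.0, scale: float = 1.0) -> float:
    """Predicted rating from one user vector and one item vector.

    offset and scale are hypothetical rescaling parameters, useful when the
    kernel output is bounded (for example RBF or logistic) but ratings are not.
    """
    return offset + scale * kernel(u, v)
```

With the linear kernel and the default offset and scale, `predict` reduces to the plain inner product used in ordinary matrix factorization.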
24
3 Matrix Factorization We quantify the quality of the approximation with the Euclidean distance, which gives the following objective function: [equation not preserved], where [symbol not preserved] is the predicted value.
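For orientation, one common way to write a Euclidean-distance objective for matrix factorization is sketched here. The symbols $r_{ij}$ (observed rating), $\hat{r}_{ij}$ (predicted value), $u_i$ and $v_j$ (rows of U and V), and $\Omega$ (the set of observed entries) are assumptions introduced for illustration. The slide's own notation was not preserved, and it may also have included a regularization term.

```latex
f(U, V) = \sum_{(i,j) \in \Omega} \left( r_{ij} - \hat{r}_{ij} \right)^{2},
\qquad
\hat{r}_{ij} = K(u_i, v_j)
```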
25
3 Matrix Factorization 1. Alternating Descent Method. This method only works when the loss function is based on the Euclidean distance. So we can get [equation not preserved]. The same applies to the other factor matrix.
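As an illustration of why a Euclidean loss allows alternating updates, consider the simplest setting: a fully observed target matrix $R$, the linear kernel, and the objective $\lVert R - U V^{T} \rVert_F^{2}$. These are assumptions for this sketch, and the slide's own derivation was not preserved. Fixing one factor turns the problem into ordinary least squares for the other, which has a closed-form solution whenever the $k \times k$ Gram matrix is invertible:

```latex
\frac{\partial}{\partial U} \lVert R - U V^{T} \rVert_F^{2}
  = -2 \left( R - U V^{T} \right) V = 0
\;\Longrightarrow\;
U = R V \left( V^{T} V \right)^{-1},
\qquad
V = R^{T} U \left( U^{T} U \right)^{-1}
```

The two updates are applied in turn until the objective stops decreasing.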
26
3 Matrix Factorization 2. Gradient Descent Method. The update rule for U is defined as follows: [equation not preserved], where [equation not preserved]. The same applies to V.
27
3 Matrix Factorization [Figures: Gradient Algorithm; Stochastic Gradient Algorithm]
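The difference between the gradient and stochastic gradient variants can be sketched for the squared-error case. This assumes the linear kernel, the objective $f = \tfrac{1}{2} \sum_{(i,j) \in \Omega} e_{ij}^{2}$ with $e_{ij} = r_{ij} - u_i^{T} v_j$, and a learning rate $\eta$; none of these symbols come from the slides. The batch rule sums over all observed ratings of a user before moving, while the stochastic rule moves after each single rating:

```latex
% Batch gradient step for user i (all observed items j of that user)
u_i \leftarrow u_i + \eta \sum_{j : (i,j) \in \Omega} e_{ij} \, v_j

% Stochastic gradient step for one observed pair (i, j),
% both right-hand sides evaluated with the old u_i and v_j
u_i \leftarrow u_i + \eta \, e_{ij} \, v_j,
\qquad
v_j \leftarrow v_j + \eta \, e_{ij} \, u_i
```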
28
3 Matrix Factorization Online Algorithm. Reference: "Online-Updating Regularized Kernel Matrix Factorization Models for Large-Scale Recommender Systems"
29
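4 MF on Hadoop Loss Function: [equation not preserved]. We update the factor V to reduce the objective function f with conventional gradient descent, as follows: [equation not preserved]; the same applies to the factor matrix U. [equation not preserved], so it is reachable. Here we set [equation not preserved].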
30
4 MF on Hadoop
31
4
32
4
33
4 [Figure: Left Matrix × Right Matrix, with the product shown as a sum of partial products]
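One common way to distribute a matrix product, which the figure on this slide appears to depict, is to slice the left matrix by columns and the right matrix by rows, multiply matching slices independently, and add the partial products. The sketch below shows that identity in plain Python. It is an assumption about what the figure shows, not the author's Hadoop job, and the names `matmul` and `blocked_matmul` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x p) and b (p x m)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def blocked_matmul(a: Matrix, b: Matrix, block: int) -> Matrix:
    """Product computed as a sum of partial products over inner-dimension slices."""
    rows, cols, inner = len(a), len(b[0]), len(b)
    total = [[0] * cols for _ in range(rows)]
    for start in range(0, inner, block):
        stop = min(start + block, inner)
        # "Map" step: each slice pair yields one partial product independently.
        a_slice = [row[start:stop] for row in a]
        b_slice = b[start:stop]
        partial = matmul(a_slice, b_slice)
        # "Reduce" step: partial products are added element by element.
        for i in range(rows):
            for j in range(cols):
                total[i][j] += partial[i][j]
    return total


left = [[1, 2, 3], [4, 5, 6]]
right = [[7, 8], [9, 10], [11, 12]]
product = blocked_matmul(left, right, block=1)
```

Since each partial product depends on only one slice pair, the slices could be processed by separate workers and combined afterwards.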
34
4 MF on Hadoop
35
4 where
36
5 Thesis Framework: Recommendation System
1. Introduction to recommendation systems
2. My work on KNN
3. Matrix factorization in recommendation systems
4. Incremental MF updating using Hadoop