Combining Content-based and Collaborative Filtering Department of Computer Science and Engineering, Slovak University of Technology


1 Combining Content-based and Collaborative Filtering
Gabriela Polčicová (polcicova@dcs.elf.stuba.sk), Pavol Návrat (navrat@elf.stuba.sk)
Department of Computer Science and Engineering, Slovak University of Technology

2 Overview
Information Filtering and its Types
Combined Method
Experiment with Information Filtering Methods
Conclusions

3 Information Filtering (1)
– delivery of relevant information to the people who need it
Types of Information Filtering
– Content-based: for textual documents
– Collaborative: for communities of users
Interests
– information about interests is stored in profiles
– opinions on documents are expressed as ratings
Ratings {i, j, $r_{ij}$}
– for user i and item j, $r_{ij}$ is the value of the rating

4 Information Filtering (2)
Filter
– learning interests
– estimating the value of rating
– choosing recommendations
Input: rated items {user, item, value} and unrated items {user, item}
Output: recommendations {user, item, estimation}

5 Content-based Filtering (1)
Basic idea
– recommending documents based on the content and properties of the document
Profile
– consists of keywords with assigned weights
– only documents matching the profile are recommended
Recommendations
– based on objective, measurable properties

6 Content-based Filtering (2)
Diagram labels:
– documents rated by the user (documents, ratings)
– PROFILE: keywords, phrases with weights
– documents unrated by the user
– documents matching profile => recommended documents
– documents of interest

7 Collaborative Filtering (1)
Basic idea
– automating “word of mouth”
– leveraging the opinions of like-minded users when making decisions
Schema
– collecting users’ opinions
– searching for like-minded users
– making recommendations

8 Collaborative Filtering (2)
Diagram labels:
– profile of current user
– profiles of users 1 to 5
– documents from like-minded users’ profiles => recommended documents

9 Collaborative Filtering (3)
Similarity measure: Pearson Correlation Coefficient

$$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \, \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$

Recommendations computation: weighted sum of ratings

$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci}}{\sum_{i \in U_{cj}} |k_{ci}|}$$
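The two formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that ratings are held in a nested dict `{user: {item: rating}}`, that the sum in the correlation runs over items rated by both users, that the sum in the estimate runs over users who rated the target item, and that each user's mean is taken over all of that user's ratings. The slide does not spell out any of these points, and the function names are hypothetical.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean of a non-empty collection of ratings."""
    return sum(values) / len(values)

def pearson(ratings, c, i):
    """Correlation k_ci over the items rated by both user c and user i."""
    common = set(ratings[c]) & set(ratings[i])
    mean_c = mean(ratings[c].values())  # assumption: mean over all ratings of c
    mean_i = mean(ratings[i].values())  # assumption: mean over all ratings of i
    num = sum((ratings[c][j] - mean_c) * (ratings[i][j] - mean_i) for j in common)
    den = sqrt(sum((ratings[c][j] - mean_c) ** 2 for j in common)
               * sum((ratings[i][j] - mean_i) ** 2 for j in common))
    return num / den if den else 0.0

def predict(ratings, c, item):
    """Estimate of c's rating for item as a correlation-weighted sum of deviations."""
    num = den = 0.0
    for i in ratings:
        if i == c or item not in ratings[i]:
            continue  # only other users who rated the item contribute
        k = pearson(ratings, c, i)
        num += (ratings[i][item] - mean(ratings[i].values())) * k
        den += abs(k)
    # No usable neighbour means no estimate, which is what lowers coverage.
    return mean(ratings[c].values()) + num / den if den else None
```

Returning `None` when no neighbour has a non-zero correlation keeps "cannot estimate" separate from a real estimate.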

10 Combining Content-based and Collaborative Filtering (1)
Computing estimates for missing ratings with the Content-based Filtering method, for each user
Searching for like-minded users
– computing coefficient $k_{ci}$ between the current and the i-th user (from ratings only)
– computing coefficient $k'_{ci}$ between the current and the i-th user (from both ratings and estimates)
New recommendations computation
– using ratings (with coefficients $k_{ci}$) and also ratings with estimates (with coefficients $k'_{ci}$) in a weighted sum of ratings and estimates
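A small sketch of the "searching for like-minded users" step may help show why the second coefficient is useful. It assumes that the content-based estimates are already available as a dict per user and that a real rating takes precedence over an estimate for the same item. The helper names (`with_estimates`, `similarity_pair`) are hypothetical and do not come from the slides.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two {item: value} dicts over their shared items."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0  # too little overlap to measure agreement
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    num = sum((a[j] - mean_a) * (b[j] - mean_b) for j in common)
    den = sqrt(sum((a[j] - mean_a) ** 2 for j in common)
               * sum((b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0

def with_estimates(user_ratings, cbf_estimates):
    """Fill a user's unrated items with content-based estimates."""
    filled = dict(cbf_estimates)
    filled.update(user_ratings)  # assumption: real ratings override estimates
    return filled

def similarity_pair(ratings, estimates, c, i):
    """Return (k_ci from ratings only, k'_ci from ratings plus estimates)."""
    k = pearson(ratings[c], ratings[i])
    k_prime = pearson(with_estimates(ratings[c], estimates[c]),
                      with_estimates(ratings[i], estimates[i]))
    return k, k_prime
```

With very few ratings, two users often share no rated item, so the ratings-only coefficient is zero, while the estimate-filled profiles still overlap.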

11 Datasets for Experiments
Data:
– EachMovie: users' ratings for movies (www.research.digital.com/SRC/eachmovie/)
– IMDB: textual information for CBF, movies' descriptions (www.imdb.com/)
Datasets:
– A: ratings from the period up to Mar 1, 1996 (810 ratings from 71 users)
– B: ratings from the period up to Mar 15, 1996 (2407 ratings from 131 users)
– C: ratings from the period up to Apr 1, 1996 (12290 ratings from 651 users)

12 EachMovie Data and Constant Method
Constant Method: $\hat{r}_{cj} = 5$

13 Experiments with Combination of Content-based and Collaborative Filtering (2)
Dataset
– divide dataset into training set (90%) and test set (10%)
– apply filtering methods and evaluate their performance
Methods using the test and training sets to produce recommendations
– Content-based Filtering method
– Collaborative Filtering method
– Combined Filtering method
Method using the test set to produce recommendations
– Constant method
Evaluation of methods’ performance
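The 90%/10% division could be done as in the sketch below. The slide gives only the proportions, so the random shuffle, the fixed seed, and the function name `split_ratings` are assumptions made for illustration.

```python
import random

def split_ratings(triples, test_fraction=0.1, seed=0):
    """Split (user, item, value) triples into a training list and a test list."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(triples)       # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Each method is then fitted on the training triples and asked to estimate the held-out values in the test triples.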

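14 Metrics
Coverage = percentage of items for which the method is able to compute estimates

$$\text{Accuracy} = \frac{|R \cap L| + |\bar{R} \cap \bar{L}|}{|L| + |\bar{L}|}$$

$$\text{Precision} = \frac{|R \cap L|}{|R|}, \qquad \text{Recall} = \frac{|R \cap L|}{|L|}$$

$$\text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

$$\text{NMAE} = \frac{\sum |r_{ij} - \hat{r}_{ij}|}{n \cdot s}$$

R - set of recommended items
L - set of liked items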
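The metrics on this slide translate directly into set operations. The sketch below assumes that R and L are Python sets of item identifiers, that complements are taken relative to a set of all evaluated items, and that in NMAE n is the number of estimates and s is the width of the rating scale. The slide does not define n and s, so that reading is an assumption, and the function names are hypothetical.

```python
def precision(recommended, liked):
    """Share of recommended items that are liked."""
    return len(recommended & liked) / len(recommended) if recommended else 0.0

def recall(recommended, liked):
    """Share of liked items that are recommended."""
    return len(recommended & liked) / len(liked) if liked else 0.0

def f_measure(recommended, liked):
    """Harmonic mean of precision and recall."""
    p, r = precision(recommended, liked), recall(recommended, liked)
    return 2 * p * r / (p + r) if p + r else 0.0

def accuracy(recommended, liked, all_items):
    """Share of items classified correctly as liked or not liked."""
    not_recommended = all_items - recommended
    not_liked = all_items - liked
    hits = len(recommended & liked) + len(not_recommended & not_liked)
    return hits / len(all_items)

def nmae(actual, estimated, scale):
    """Mean absolute error divided by the rating scale (assumed meaning of s)."""
    total = sum(abs(a - e) for a, e in zip(actual, estimated))
    return total / (len(actual) * scale)
```

For example, with R = {1, 2, 3, 4}, L = {3, 4, 5} and ten items in total, precision is 0.5 and accuracy is 0.7.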

15 Results of Experiments

16 Conclusions
Combination of content-based and collaborative filtering might help in the initial phase
Future work
– weighting of coefficients
– comparing the method with additional methods

17 Content-based Filtering - Vector Representation of Documents and Profiles
$D = (\ldots, \text{computer}, \ldots, \text{learning}, \ldots, \text{machine}, \ldots)$
Document j (computer, machine, learning), weighted by TF-IDF:
$W_j = (0, \ldots, 0, 0.5, 0, \ldots, 0, 0.3, 0, \ldots, 0, 0.2, 0, \ldots, 0)$

$$\text{profile}_i = \sum_{j=1}^{n} r_j \, w_{ij}$$

$$\text{Sim}(W, \text{Profile}) = \frac{W \cdot \text{Profile}}{|W| \, |\text{Profile}|}$$
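A rough sketch of the vector representation follows. It assumes the TF-IDF weights of each document are already computed and stored sparsely as `{term: weight}` dicts, with zero entries omitted. The example weights mirror the 0.5, 0.3 and 0.2 shown on the slide, but the mapping of weights to terms, the second document, and the function names are illustrative assumptions.

```python
from math import sqrt

def build_profile(doc_vectors, ratings):
    """Profile as the rating-weighted sum of the rated documents' term vectors."""
    profile = {}
    for doc, r in ratings.items():
        for term, w in doc_vectors[doc].items():
            profile[term] = profile.get(term, 0.0) + r * w
    return profile

def cosine(a, b):
    """Normalized dot product of two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = (sqrt(sum(w * w for w in a.values()))
            * sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical data: one document about machine learning, one unrelated.
doc_vectors = {
    "d1": {"computer": 0.5, "learning": 0.3, "machine": 0.2},
    "d2": {"cooking": 1.0},
}
profile = build_profile(doc_vectors, {"d1": 2})
```

Here `cosine(doc_vectors["d1"], profile)` is close to 1 because the profile was built from that document alone, while `cosine(doc_vectors["d2"], profile)` is 0 because the two share no terms.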

18 Collaborative Filtering - Example ABCDEFG current1 45 1 35 12 2 1 3 25 3 5 1 4 5 41 424 52425 2

19 Combining Content-based and Collaborative Filtering (2)
Similarity measure: Pearson Correlation Coefficient

$$k_{ci} = \frac{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in I_{ci}} (r_{cj} - \bar{r}_c)^2 \, \sum_{j \in I_{ci}} (r_{ij} - \bar{r}_i)^2}}$$

Recommendations computation: weighted sum of ratings and estimates

$$\hat{r}_{cj} = \bar{r}_c + \frac{\sum_{i \in U_{cj}} (r_{ij} - \bar{r}_i)\, k_{ci} + \sum_{i \in U'_{cj}} (r^{CBF}_{ij} - \bar{r}'_i)\, k'_{ci}}{\sum_{i \in U_{cj}} |k_{ci}| + \sum_{i \in U'_{cj}} |k'_{ci}|}$$
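The weighted sum of ratings and estimates on this slide can be sketched as a single function. This is one reading of the slide, not a confirmed implementation: it assumes that users who actually rated the item contribute their rating weighted by the ratings-only coefficient, and that users who have only a content-based estimate for the item contribute that estimate weighted by the coefficient computed from ratings and estimates. The function name and the tuple layout are hypothetical.

```python
def combined_estimate(mean_c, rated, estimated):
    """Combine neighbours' ratings and content-based estimates for one item.

    rated:     list of (rating, neighbour_mean, k) for users who rated the item
    estimated: list of (cbf_estimate, neighbour_mean, k_prime) for users who
               have only a content-based estimate for the item
    """
    num = (sum((r - m) * k for r, m, k in rated)
           + sum((r - m) * k for r, m, k in estimated))
    den = (sum(abs(k) for _, _, k in rated)
           + sum(abs(k) for _, _, k in estimated))
    # Without any weighted neighbour there is nothing to combine.
    return mean_c + num / den if den else None
```

For instance, `combined_estimate(3.0, [(5, 4, 1.0)], [])` reduces to the plain collaborative estimate and gives 4.0.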

20 Experiments with Combination of Content-based and Collaborative Filtering (1)
Content-based Filtering Method (CBF)
– documents and profiles: vector representation, weighted keywords (TF-IDF)
– estimation computation: normalized dot product of document and profile vectors
Collaborative Filtering (CF)
– Pearson correlation coefficient
– weighted sum of ratings
Combination of CF and CBF
– Pearson correlation coefficients
– weighted sum of ratings and CBF estimations
Constant Method ($\hat{r}_{cj} = 5$)

