A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.


Collaborative Filtering
Method of automating word-of-mouth
Large groups of users collaborate by rating products, services, news articles, etc.
Analyze ratings data of the group to produce recommendations for individual users
  – Find users with similar tastes

Problems with Collaborative Filtering Methods
Performance
  – Prohibitively large dataset
Scalability
  – Will the solution scale to millions of users on the Internet?
Sparsity of data
  – User who has rated few items
  – Item with few ratings

Problems with Collaborative Filtering Methods
Cannot compare users that have no common ratings
                  User 1   User 2
Billy Madison       4
Happy Gilmore                5
Mr. Deeds                    4
50 First Dates      5
Big Daddy                    4
(Ratings on a scale of 1-5)

A Content-Based Approach
Build a feature list for each user based on the content of the items they rated
Compare users' features to make recommendations
Now we can find similarity between users with no common ratings

Data Source
EachMovie Project
  – Compaq Systems Research Center
  – Over 18 months, collected 2,811,983 ratings for 1,628 movies from 72,916 users
  – Ratings given on a 1-5 scale
  – Dataset split into 75% training, 25% testing
Internet Movie Database (IMDb)
  – Huge database of movie information
      Actors, director, genre, plot description, etc.

Creating the Feature List
Retrieve content information for each movie from the IMDb dataset and create a “bag of words”
Throw out common words (e.g., the, and, but)
Calculate the frequency of the remaining words to create the movie's feature list
  – Frequencies weighted based on total number of terms
Example feature list for Goldeneye:
  satellite  2    destroy  2
  xenia      3    london   2
  thriller   2    villain  2
  simon      4    revenge  2
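The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the stop-word set, the tokenizer, and the function name `build_feature_list` are assumptions, and "weighted based on total number of terms" is read here as dividing each count by the number of words that remain after stop-word removal.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the slides do not say which list was used.
STOP_WORDS = {"the", "and", "but", "a", "an", "of", "to", "in", "is"}

def build_feature_list(text, stop_words=STOP_WORDS):
    """Return {word: relative frequency} for the non-stop words in text."""
    # Lowercase and split into simple word tokens (an assumed tokenizer).
    words = [w for w in re.findall(r"[a-z0-9']+", text.lower())
             if w not in stop_words]
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    # Assumed weighting: raw count divided by the number of remaining terms.
    return {word: n / total for word, n in counts.items()}

# Made-up plot snippet, for illustration only.
features = build_feature_list(
    "The villain and the satellite: the villain seeks revenge")
```

With this weighting the values in a feature list sum to 1, so long and short plot descriptions end up on a comparable scale.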

Comparing Users
Each user has a positive and a negative feature list
  – Combine feature lists of movies they have rated
Compare users' feature lists using the Pearson correlation coefficient
Users can be compared with no common ratings
Able to recommend items with few ratings
Users only need to rate a few items to receive recommendations
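A rough sketch of this comparison follows. Several details are assumptions, since the slides do not give them: a rating of 4 or higher counts as positive (`like_threshold`), movie feature lists are combined by summing weights, and a word missing from one list is treated as weight 0 when the two lists are aligned. The feature weights in the example are invented.

```python
import math

def combine(feature_lists):
    """Sum the weights of several {word: weight} feature lists."""
    combined = {}
    for features in feature_lists:
        for word, weight in features.items():
            combined[word] = combined.get(word, 0.0) + weight
    return combined

def build_user_profile(ratings, movie_features, like_threshold=4):
    """Return (positive, negative) feature lists for one user.

    like_threshold is an assumed cutoff on the 1-5 rating scale.
    """
    liked = [movie_features[m] for m, r in ratings.items() if r >= like_threshold]
    disliked = [movie_features[m] for m, r in ratings.items() if r < like_threshold]
    return combine(liked), combine(disliked)

def pearson(a, b):
    """Pearson correlation of two {word: weight} dicts; absent words count as 0."""
    vocab = set(a) | set(b)
    n = len(vocab)
    if n < 2:
        return 0.0
    xs = [a.get(w, 0.0) for w in vocab]
    ys = [b.get(w, 0.0) for w in vocab]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)

# Invented feature weights; the two users share no rated movie.
movie_features = {
    "Billy Madison": {"sandler": 0.5, "comedy": 0.3, "school": 0.2},
    "Happy Gilmore": {"sandler": 0.5, "comedy": 0.3, "golf": 0.2},
    "Goldeneye": {"satellite": 0.5, "villain": 0.5},
}
pos1, neg1 = build_user_profile({"Billy Madison": 4, "Goldeneye": 2}, movie_features)
pos2, neg2 = build_user_profile({"Happy Gilmore": 5}, movie_features)
similarity = pearson(pos1, pos2)
```

Here `similarity` comes out positive even though the two users have no rated movie in common, which is the property the slide highlights.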

Methods
Three methods attempted to improve performance:
  – Clustering of users
  – Random groups of users
  – Compare users directly to items

User Clustering
Simple algorithm, starting with the first user:
  – Compare to existing clusters first
      If similarity is high, merge the user into the cluster
  – Otherwise, compare to each remaining user
  – Stop when the correlation is above the threshold
  – Once a similar user is found, create a new cluster from the two users
Each cluster has the combined feature list of all its users
Not as efficient as possible: O(n^2)
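One possible reading of this clustering pass is sketched below. The threshold value, the summing of feature weights when merging, and the handling of a user with no sufficiently similar partner (kept as a single-member cluster) are assumptions, because the slide leaves them open. Absent words are again treated as weight 0 in the correlation.

```python
import math

def pearson(a, b):
    """Pearson correlation of two {word: weight} dicts; absent words count as 0."""
    vocab = set(a) | set(b)
    n = len(vocab)
    if n < 2:
        return 0.0
    xs = [a.get(w, 0.0) for w in vocab]
    ys = [b.get(w, 0.0) for w in vocab]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def merge_features(a, b):
    """Combine two feature lists by summing weights (assumed merge rule)."""
    merged = dict(a)
    for word, weight in b.items():
        merged[word] = merged.get(word, 0.0) + weight
    return merged

def cluster_users(user_features, threshold=0.8):
    """Greedy single-pass clustering; threshold is a hypothetical value."""
    clusters = []      # each cluster: {"members": [...], "features": {...}}
    assigned = set()
    user_ids = list(user_features)
    for i, uid in enumerate(user_ids):
        if uid in assigned:
            continue
        feats = user_features[uid]
        assigned.add(uid)
        # Step 1: try the existing clusters first.
        home = next((c for c in clusters
                     if pearson(feats, c["features"]) >= threshold), None)
        if home is not None:
            home["members"].append(uid)
            home["features"] = merge_features(home["features"], feats)
            continue
        # Step 2: scan the remaining users, stopping at the first similar one.
        cluster = {"members": [uid], "features": dict(feats)}
        for other in user_ids[i + 1:]:
            if other in assigned:
                continue
            if pearson(feats, user_features[other]) >= threshold:
                cluster["members"].append(other)
                cluster["features"] = merge_features(
                    cluster["features"], user_features[other])
                assigned.add(other)
                break
        # Assumption: a user with no similar partner stays as a singleton.
        clusters.append(cluster)
    return clusters

# Invented feature lists for four users.
users = {
    "u1": {"comedy": 0.6, "golf": 0.3, "drama": 0.1},
    "u2": {"comedy": 0.5, "golf": 0.4, "drama": 0.1},
    "u3": {"drama": 0.7, "war": 0.3},
    "u4": {"comedy": 0.55, "golf": 0.35, "drama": 0.1},
}
clusters = cluster_users(users)
```

Each user may be compared against every cluster and every remaining user, which is where the quadratic cost noted on the slide comes from.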

User Clustering
Once clusters are formed, we can predict ratings for each item
  – For each user, find their 10 nearest neighbors
  – Predicted rating is the average rating of the item from these neighbors
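The prediction step can be sketched as follows, assuming similarity scores to the candidate neighbors have already been computed. The function name `predict_rating` and its data layout are hypothetical. The slide does not say what happens when a neighbor has not rated the item; this sketch skips such neighbors and returns `None` if no neighbor rated it.

```python
def predict_rating(similarities, ratings, item, k=10):
    """Average the item's rating over the k most similar neighbors.

    similarities: {neighbor_id: similarity to the target user}
    ratings:      {neighbor_id: {item: rating}}
    """
    # Pick the k neighbors with the highest similarity scores.
    nearest = sorted(similarities, key=similarities.get, reverse=True)[:k]
    # Assumption: neighbors that have not rated the item are ignored.
    votes = [ratings[n][item] for n in nearest if item in ratings.get(n, {})]
    return sum(votes) / len(votes) if votes else None

# Toy data with k=2 so the least similar neighbor is excluded.
sims = {"a": 0.9, "b": 0.8, "c": 0.1}
neighbor_ratings = {
    "a": {"Big Daddy": 4},
    "b": {"Big Daddy": 5},
    "c": {"Big Daddy": 1},
}
predicted = predict_rating(sims, neighbor_ratings, "Big Daddy", k=2)
```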

Selecting a Random Group
Randomly select 5000 users as a (hopefully) representative sample
As before, find a user's 10 nearest neighbors from the random group
  – Predicted rating is the average rating of the item from these neighbors
Much less work than clustering
  – How much accuracy (if any) will be lost?
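Drawing the reference group is a one-liner with the standard library. In this sketch the function name and the `seed` parameter are assumptions added for reproducibility; the group size of 5000 and the total of 72,916 users come from the slides.

```python
import random

def sample_reference_group(user_ids, group_size=5000, seed=None):
    """Pick group_size distinct users uniformly at random."""
    rng = random.Random(seed)
    user_ids = list(user_ids)
    # Guard against datasets smaller than the requested group.
    return rng.sample(user_ids, min(group_size, len(user_ids)))

# User ids are represented here as plain integers for illustration.
group = sample_reference_group(range(72916), seed=0)
```

Nearest neighbors are then searched only within `group`, so each prediction needs at most 5000 comparisons instead of one per user in the dataset.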

Comparing Users to Items
No collaborative filtering involved
Compare the positive and negative feature lists of the user to the feature list of the item
  – Make prediction based on which feature list has higher correlation with the item
Pretty quick and easy to do
  – How accurate will this be?
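A sketch of this purely content-based prediction is shown below. As before, treating absent words as weight 0 is an assumption, and so is the tie rule (a tie is predicted as "not liked"). The feature weights are invented.

```python
import math

def pearson(a, b):
    """Pearson correlation of two {word: weight} dicts; absent words count as 0."""
    vocab = set(a) | set(b)
    n = len(vocab)
    if n < 2:
        return 0.0
    xs = [a.get(w, 0.0) for w in vocab]
    ys = [b.get(w, 0.0) for w in vocab]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def predict_like(positive, negative, item_features):
    """Predict True if the item correlates better with the positive list."""
    # Assumption: on a tie, predict that the user will not like the item.
    return pearson(positive, item_features) > pearson(negative, item_features)

# Invented feature lists for one user and two items.
positive = {"comedy": 0.6, "golf": 0.4}
negative = {"war": 0.7, "drama": 0.3}
comedy_item = {"comedy": 0.5, "golf": 0.3, "drama": 0.2}
war_item = {"war": 0.8, "drama": 0.2}
likes_comedy = predict_like(positive, negative, comedy_item)
likes_war = predict_like(positive, negative, war_item)
```

Only two correlations are computed per prediction, and no other users are consulted, which matches the slide's "quick and easy" description.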

Analyzing Predictions
Collected 3 metrics to evaluate predictions
  – Accuracy: all items predicted correctly
  – Precision: positive items predicted correctly
  – Recall: unseen positive items predicted correctly
Precision and recall have an inverse relationship
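For reference, the three metrics can be computed as below. This sketch uses the standard textbook definitions over binary like/dislike predictions, which is assumed to be what the slide's shorthand refers to; the function name and the toy data are illustrative.

```python
def evaluate(predicted, actual):
    """Return (accuracy, precision, recall) for parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Toy predictions, not results from the presentation.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
accuracy, precision, recall = evaluate(predicted, actual)
```

Predicting "positive" more readily usually raises recall while lowering precision, which is the inverse relationship the slide mentions.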

Results

Conclusions
Large gain from clustering users
  – Is the extra work worth it?
  – Depends on the application
Purely content-based predictions worked pretty well
  – Simple, fast solution
Random group prediction also performed reasonably well
Problems solved by content-based analysis:
  – Sparsity of data
  – Performance
  – Scalability