Distributed Networks & Systems Lab. Introduction Collaborative filtering Characteristics and challenges Memory-based CF Model-based CF Hybrid CF Recent.


Distributed Networks & Systems Lab

Introduction; Collaborative filtering; Characteristics and challenges; Memory-based CF; Model-based CF; Hybrid CF; Recent advances in CF; Conclusion

Recommendation System. Helps users discover new items that may otherwise be hard to find. A subclass of information filtering systems that seeks to predict the 'rating' or 'preference' that a user would give to an item. Recommender systems identify recommendations autonomously for individual users based on past purchases and searches, and on other users' behavior.


Recommendation System: Content-based, Collaborative Filtering, Hybrid.
Content-based: based on a description of the item and a profile of the user's preferences.
Collaborative filtering: based on collecting and analyzing a large amount of information on users' behaviors, activities or preferences, and predicting what users will like based on their similarity to other users.
Hybrid: a combination of the collaborative filtering and content-based approaches.

Recommendation System: Content-based, Collaborative Filtering, Hybrid.
Collaborative Filtering: Memory-based, Model-based, Hybrid.


Collaborative filtering faces performance challenges that stem from its distinguishing characteristics: data sparsity, scalability, synonymy, gray sheep, and shilling attacks.

Data sparsity: in internet markets, the large variety of products makes the user-item matrix sparse. How can sparse data be processed and matched?

Cold start problem: a new user or item has just entered the system, and it is hard to find similar ones because there is not enough information. Reduced coverage: users' ratings are too few compared with the large number of items in the system.

Users with the same tastes may not be identified as such if they have no co-rated items.

Dimensionality reduction techniques, such as Singular Value Decomposition (SVD), remove unrepresentative or insignificant users or items to reduce the dimensionality of the user-item matrix directly. This reduces sparsity, but has drawbacks: meaningful data may also be discarded, which can decrease recommendation quality.
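A minimal sketch of the dimensionality reduction idea, assuming NumPy is available. The rating matrix, the rank k, and the function name `truncated_svd` are illustrative assumptions, not taken from the slides; treating a missing rating as 0 is a simplification that real systems usually avoid.

```python
import numpy as np


def truncated_svd(ratings: np.ndarray, k: int) -> np.ndarray:
    """Return the rank-k approximation of a user-item rating matrix."""
    # Thin SVD: ratings = u @ diag(s) @ vt, singular values sorted descending.
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]


# Hypothetical 4 users x 4 items matrix; 0 stands in for "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

approx = truncated_svd(ratings, k=2)
print(np.round(approx, 2))
```

The low-rank matrix is dense, so it yields a score for every user-item pair, but whatever the discarded singular values encoded is lost, which is the quality drawback noted above.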

Scalability: large data sizes lead to longer computation times under limited resources. Dimensionality reduction can help, but it requires an expensive extra step (matrix factorization). Incremental SVD algorithms have been suggested to reduce the cost of this step.

Synonymy: the same kind of product can appear under different names, for example "children movie" and "children film". Memory-based CF systems are vulnerable to this problem. Attempted solutions such as intellectual or automatic term expansion offer a partial fix, but have drawbacks of their own.

Gray sheep: users whose tastes are unusual, which makes it hard to make predictions for them. There is no full solution; per-user approaches have been proposed to reduce the problem.

Shilling attacks: product sellers deliberately inject positive and negative ratings to manipulate recommendations. Item-based CF algorithms were much less affected by such attacks than user-based CF algorithms.

Privacy: observing users' personal habits can invade their privacy. Noise: noise increases as diversity increases. Explainability: let users know why the system recommends a specific item.

Memory-based CF: memorizes the rating matrix and issues recommendations based on the relationship between the queried user and item and the rest of the matrix. Uses the entire user-item database, or a sample of it, to make predictions. Assumes every user is part of a group of people with similar interests.

Neighborhood-based CF: the most popular memory-based CF method. Predicts ratings by referring to users whose ratings are similar to the queried user's, or to items that are similar to the queried item. Calculate the similarity or weight, then aggregate the neighbors to get the top-N most frequent items as the recommendation.

Similarity computation is a critical step. For item-based CF, compute the similarity between items. For user-based CF, compute the similarity between users u and v who have both rated the same items.

To get the similarity $w_{u,v}$ between two users u and v, or $w_{i,j}$ between two items i and j, Pearson correlation can be used. It measures the linear dependence between two variables (or users) as a function of their attributes.

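User-based algorithm:
$$w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$$
where the $i \in I$ summations are over the items that both users u and v have rated, and $\bar{r}_u$ is the average rating of the co-rated items of the u-th user.
Item-based algorithm:
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$$
where the $u \in U$ summations are over the users who have rated both items i and j, $r_{u,i}$ is the rating of user u on item i, and $\bar{r}_i$ is the average rating of the i-th item by those users.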
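The user-based Pearson correlation over co-rated items can be sketched in a few lines of Python. The function name `pearson`, the dictionary layout (item id mapped to rating), and the choice to return 0.0 when the correlation is undefined are assumptions made for illustration.

```python
import math


def pearson(ratings_u: dict, ratings_v: dict) -> float:
    """Pearson correlation between two users over their co-rated items."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0  # assumption: too few co-rated items means no similarity
    # Averages are taken over the co-rated items only.
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    numerator = sum(
        (ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common
    )
    spread_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    spread_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if spread_u == 0 or spread_v == 0:
        return 0.0  # assumption: a user with constant ratings has no signal
    return numerator / (spread_u * spread_v)


# Hypothetical ratings; item "d" is ignored because only one user rated it.
user_u = {"a": 1, "b": 2, "c": 3}
user_v = {"a": 2, "b": 4, "c": 6, "d": 1}
print(pearson(user_u, user_v))
```

The item-based variant has the same shape with the roles swapped: iterate over the users who rated both items and center on each item's average.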

Vector cosine-based similarity: used to find the similarity between two documents by treating each document as a vector of word frequencies and computing the cosine of the angle formed by the frequency vectors. For collaborative filtering, treat users or items as vectors of ratings and compute the cosine of the angle formed by the rating vectors.

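Similarity between two items i and j: $w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\| \, \|\vec{j}\|}$. Example: for vector A = {x1, y1} and vector B = {x2, y2}, the cosine similarity is $\frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2} \, \sqrt{x_2^2 + y_2^2}}$.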
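A small Python sketch of cosine similarity between two rating vectors. The function name `cosine_similarity` and the sample vectors are illustrative assumptions; returning 0.0 for an all-zero vector is also an assumption, since the cosine is undefined there.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat an empty rating vector as dissimilar
    return dot / (norm_a * norm_b)


# Two-dimensional case matching A = {x1, y1}, B = {x2, y2}.
vector_a = [3.0, 4.0]
vector_b = [4.0, 3.0]
print(cosine_similarity(vector_a, vector_b))
```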

In neighborhood-based CF, a subset of the nearest neighbors of the active user is chosen based on their similarity to him or her, and a weighted aggregate of their ratings is used to generate predictions for the active user.

To make a prediction for the active user a on a certain item i, we can take a weighted average of all the ratings on that item:
$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$$
where $\bar{r}_a$ and $\bar{r}_u$ are the average ratings of user a and user u on all other rated items, $w_{a,u}$ is the weight between user a and user u, and the summations are over all users $u \in U$ who have rated item i.
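The weighted-average prediction for the active user can be sketched as follows. The function name `predict_user_based`, the tuple layout, and the numbers are made up for illustration; falling back to the active user's own average when no neighbor carries weight is an assumption.

```python
def predict_user_based(active_mean: float, neighbors) -> float:
    """Predict the active user's rating for one item.

    neighbors: iterable of (weight, rating, neighbor_mean) tuples, one per
    neighbor who rated the item, where weight is the similarity w_{a,u}.
    """
    neighbors = list(neighbors)
    weight_total = sum(abs(w) for w, _, _ in neighbors)
    if weight_total == 0:
        return active_mean  # assumption: no usable neighbors
    # Each neighbor contributes its deviation from its own average rating.
    deviation = sum(w * (r - mean_u) for w, r, mean_u in neighbors)
    return active_mean + deviation / weight_total


# Hypothetical neighbors: (w_{a,u}, r_{u,i}, average rating of u).
neighbors = [(1.0, 5.0, 4.0), (0.5, 2.0, 3.0)]
prediction = predict_user_based(3.0, neighbors)
print(round(prediction, 3))
```

Working with deviations from each user's mean compensates for users who rate consistently high or low.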

To predict the rating for U1 on I2,

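For item-based prediction, we can use a simple weighted average to predict the rating $P_{u,i}$ for user u on item i:
$$P_{u,i} = \frac{\sum_{n \in N} r_{u,n} \, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$$
where the summations are over all other items $n \in N$ rated by user u, $w_{i,n}$ is the weight between items i and n, and $r_{u,n}$ is the rating of user u on item n.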
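A sketch of the simple weighted average for item-based prediction. The function name `predict_item_based`, the pair layout, and the sample values are hypothetical; returning None when no similar item has been rated is an assumption.

```python
def predict_item_based(rated_items):
    """Predict a user's rating for a target item.

    rated_items: iterable of (weight, rating) pairs, one per other item the
    user has rated, where weight is that item's similarity to the target.
    """
    rated_items = list(rated_items)
    weight_total = sum(abs(w) for w, _ in rated_items)
    if weight_total == 0:
        return None  # assumption: nothing to base a prediction on
    return sum(w * r for w, r in rated_items) / weight_total


# Hypothetical: the user rated two items with similarity 0.8 and 0.2
# to the target item.
rated = [(0.8, 4.0), (0.2, 2.0)]
print(predict_item_based(rated))
```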

Top-N recommendation: recommend a set of N top-ranked items that will be of interest to a certain user; for example, a returning customer may get a list of recommendations. Top-N recommendation techniques analyze the user-item matrix to discover relations between different users or items and use them to compute recommendations. Association rule mining can also be used to make top-N recommendations.
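One simple way to build a top-N list, sketched under the assumption that the nearest neighbors have already been selected: count how often each item appears among the neighbors and return the most frequent items the active user does not already have. The function name `top_n_items`, the alphabetical tie-break, and the data are illustrative assumptions; this is not the association rule mining approach mentioned above.

```python
from collections import Counter


def top_n_items(active_items, neighbor_items, n):
    """Return the n items most frequent among neighbors, excluding known ones."""
    counts = Counter()
    for items in neighbor_items:
        for item in set(items):  # count each neighbor at most once per item
            if item not in active_items:
                counts[item] += 1
    # Sort by descending frequency; break ties alphabetically (assumption).
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:n]]


# Hypothetical data: the active user already has item "a".
recommended = top_n_items(
    {"a"},
    [["a", "b", "c"], ["b", "c"], ["b", "d"]],
    n=2,
)
print(recommended)
```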

Model-based CF: the design and development of models (machine learning and data mining algorithms) allows the system to learn to recognize complex patterns from training data and then make predictions from the learned models. Classification algorithms can be used as CF models if the user ratings are categorical; regression models and SVD methods can be used for numerical ratings.

Uses a naïve Bayes (NB) strategy to make predictions. Assuming the features are independent given the class, the probability of a certain class given all of the features can be computed; the class with the highest probability is then taken as the predicted class.
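A minimal categorical naïve Bayes sketch in plain Python. The setup is an assumption for illustration: each training row holds two other users' opinions of an item the active user has already rated, and the label is the active user's own opinion. The function names, the add-one (Laplace) smoothing, and the data are all hypothetical.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> counts
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(label, j)][value] += 1
    return class_counts, feature_counts


def predict_nb(model, row, n_values=2):
    """Return the class with the highest (unnormalized) posterior."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for cls, count in class_counts.items():
        score = count / total  # class prior
        for j, value in enumerate(row):
            # Smoothed P(feature j = value | class), features independent.
            score *= (feature_counts[(cls, j)][value] + 1) / (count + n_values)
        if score > best_score:
            best_class, best_score = cls, score
    return best_class


# Hypothetical data: two other users' opinions per item, active user's label.
rows = [("like", "like"), ("like", "dislike"),
        ("dislike", "dislike"), ("dislike", "like")]
labels = ["like", "like", "dislike", "dislike"]
model = train_nb(rows, labels)
print(predict_nb(model, ("like", "dislike")))
```

With many features the product of probabilities can underflow, so a practical version would likely sum log-probabilities instead.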

Clustering CF shows better scalability: predictions are made within much smaller clusters rather than over the entire customer base.
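To illustrate why clustering helps scalability, the sketch below predicts a rating from the members of the user's own cluster only. It assumes cluster assignments have already been computed by some clustering algorithm; the function name `predict_in_cluster`, the unweighted mean, and the data are illustrative assumptions.

```python
def predict_in_cluster(user, item, ratings, cluster_of):
    """Predict a rating as the mean rating of same-cluster users for the item.

    ratings: dict mapping user -> {item: rating}
    cluster_of: dict mapping user -> cluster id (assumed precomputed)
    """
    peers = [
        other for other in ratings
        if other != user
        and cluster_of[other] == cluster_of[user]
        and item in ratings[other]
    ]
    if not peers:
        return None  # assumption: no same-cluster user rated the item
    return sum(ratings[other][item] for other in peers) / len(peers)


# Hypothetical data: u4 sits in a different cluster and is never consulted.
ratings = {
    "u1": {"i1": 5},
    "u2": {"i1": 4, "i2": 5},
    "u3": {"i2": 3},
    "u4": {"i2": 1},
}
cluster_of = {"u1": 0, "u2": 0, "u3": 0, "u4": 1}
print(predict_in_cluster("u1", "i2", ratings, cluster_of))
```

This sketch still scans every user to find peers; a real system would index users by cluster so that only cluster members are visited.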

Memory-based and model-based CF approaches are combined to form hybrid CF approaches, which show some improvement. Examples: probabilistic memory-based CF (PMCF) and personality diagnosis.

Probabilistic memory-based CF (PMCF) combines memory-based and model-based techniques. To address the new user problem, an active learning extension to the PMCF system can be used to actively query a user for additional information. To reduce computation time, PMCF selects a small subset, called the 'profile space', from the entire database of user ratings and makes predictions from this small profile space rather than the whole database. It has better accuracy than Pearson correlation-based CF and model-based CF using naïve Bayes.

Personality diagnosis combines memory-based and model-based CF and keeps the advantages of both. Given the active user's known ratings, we can calculate the probability that he or she is of the same "personality type" as other users, and predict whether he or she will like new items.
