Recommender systems Ram Akella November 26 th 2008.

Slides:



Advertisements
Similar presentations
Recommender Systems & Collaborative Filtering
Advertisements

Content-based Recommendation Systems
DECISION TREES. Decision trees  One possible representation for hypotheses.
Chapter 5: Introduction to Information Retrieval
A Graph-based Recommender System Zan Huang, Wingyan Chung, Thian-Huat Ong, Hsinchun Chen Artificial Intelligence Lab The University of Arizona 07/15/2002.
Data Mining Classification: Alternative Techniques
Jeff Howbert Introduction to Machine Learning Winter Collaborative Filtering Nearest Neighbor Approach.
COLLABORATIVE FILTERING Mustafa Cavdar Neslihan Bulut.
Distance and Similarity Measures
Sean Blong Presents: 1. What are they…?  “[…] specific type of information filtering (IF) technique that attempts to present information items (movies,
Recommender Systems Aalap Kohojkar Yang Liu Zhan Shi March 31, 2008.
Information Retrieval in Practice
Chapter 9 Business Intelligence Systems
CS345 Data Mining Recommendation Systems Netflix Challenge Anand Rajaraman, Jeffrey D. Ullman.
Week 9 Data Mining System (Knowledge Data Discovery)
Recommendations via Collaborative Filtering. Recommendations Relevant for movies, restaurants, hotels…. Recommendation Systems is a very hot topic in.
Computing Trust in Social Networks
Agent Technology for e-Commerce
Recommender systems Ram Akella February 23, 2011 Lecture 6b, i290 & 280I University of California at Berkeley Silicon Valley Center/SC.
1 Introduction to Recommendation System Presented by HongBo Deng Nov 14, 2006 Refer to the PPT from Stanford: Anand Rajaraman, Jeffrey D. Ullman.
CONTENT-BASED BOOK RECOMMENDING USING LEARNING FOR TEXT CATEGORIZATION TRIVIKRAM BHAT UNIVERSITY OF TEXAS AT ARLINGTON DATA MINING CSE6362 BASED ON PAPER.
Personalized Ontologies for Web Search and Caching Susan Gauch Information and Telecommunications Technology Center Electrical Engineering and Computer.
Chapter 5: Information Retrieval and Web Search
Introduction to Data Mining Data mining is a rapidly growing field of business analytics focused on better understanding of characteristics and.
1/16 Final project: Web Page Classification By: Xiaodong Wang Yanhua Wang Haitang Wang University of Cincinnati.
1 Text Categorization  Assigning documents to a fixed set of categories  Applications:  Web pages  Recommending pages  Yahoo-like classification hierarchies.
Julian Keenaghan 1 Personalization of Supermarket Product Recommendations IBM Research Report (2000) R.D. Lawrence et al.
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
Distributed Networks & Systems Lab. Introduction Collaborative filtering Characteristics and challenges Memory-based CF Model-based CF Hybrid CF Recent.
Recommender systems Drew Culbert IST /12/02.
The identification of interesting web sites Presented by Xiaoshu Cai.
Clustering-based Collaborative filtering for web page recommendation CSCE 561 project Proposal Mohammad Amir Sharif
Knowledge Discovery and Data Mining Evgueni Smirnov.
EMIS 8381 – Spring Netflix and Your Next Movie Night Nonlinear Programming Ron Andrews EMIS 8381.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
User Modeling, Recommender Systems & Personalization Pattie Maes MAS 961- week 6.
Presented By :Ayesha Khan. Content Introduction Everyday Examples of Collaborative Filtering Traditional Collaborative Filtering Socially Collaborative.
Toward the Next generation of Recommender systems
Chapter 6: Information Retrieval and Web Search
Introduction to Digital Libraries hussein suleman uct cs honours 2003.
Collaborative Information Retrieval - Collaborative Filtering systems - Recommender systems - Information Filtering Why do we need CIR? - IR system augmentation.
EXAM REVIEW MIS2502 Data Analytics. Exam What Tool to Use? Evaluating Decision Trees Association Rules Clustering.
3-1 Data Mining Kelby Lee. 3-2 Overview ¨ Transaction Database ¨ What is Data Mining ¨ Data Mining Primitives ¨ Data Mining Objectives ¨ Predictive Modeling.
Personalized Course Navigation Based on Grey Relational Analysis Han-Ming Lee, Chi-Chun Huang, Tzu- Ting Kao (Dept. of Computer Science and Information.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
A Content-Based Approach to Collaborative Filtering Brandon Douthit-Wood CS 470 – Final Presentation.
Recommender Systems Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata Credits to Bing Liu (UIC) and Angshul Majumdar.
Data Mining: Knowledge Discovery in Databases Peter van der Putten ALP Group, LIACS Pre-University College LAPP-Top Computer Science February 2005.
User Modeling and Recommender Systems: Introduction to recommender systems Adolfo Ruiz Calleja 06/09/2014.
Information Retrieval and Organisation Chapter 14 Vector Space Classification Dell Zhang Birkbeck, University of London.
Information Design Trends Unit Five: Delivery Channels Lecture 2: Portals and Personalization Part 2.
Personalization Services in CADAL Zhang yin Zhuang Yuting Wu Jiangqin College of Computer Science, Zhejiang University November 19,2006.
User Modeling and Recommender Systems: recommendation algorithms
Eick: kNN kNN: A Non-parametric Classification and Prediction Technique Goals of this set of transparencies: 1.Introduce kNN---a popular non-parameric.
Recommendation Systems By: Bryan Powell, Neil Kumar, Manjap Singh.
1 Text Categorization  Assigning documents to a fixed set of categories  Applications:  Web pages  Recommending pages  Yahoo-like classification hierarchies.
Collaborative Filtering - Pooja Hegde. The Problem : OVERLOAD Too much stuff!!!! Too many books! Too many journals! Too many movies! Too much content!
Analysis of massive data sets Prof. dr. sc. Siniša Srbljić Doc. dr. sc. Dejan Škvorc Doc. dr. sc. Ante Đerek Faculty of Electrical Engineering and Computing.
1 Dongheng Sun 04/26/2011 Learning with Matrix Factorizations By Nathan Srebro.
Information Retrieval in Practice
Data Mining: Concepts and Techniques
Recommender Systems & Collaborative Filtering
Information Organization: Overview
Contextual Intelligence as a Driver of Services Innovation
Collaborative Filtering Nearest Neighbor Approach
Prepared by: Mahmoud Rafeek Al-Farra
Text Categorization Assigning documents to a fixed set of categories
Recommendation Systems
Information Organization: Overview
Bug Localization with Combination of Deep Learning and Information Retrieval A. N. Lam et al. International Conference on Program Comprehension 2017.
Presentation transcript:

Recommender systems Ram Akella November 26 th 2008

Outline  Types of recommendation systems Search-based recommendations Category-based recommendations Collaborative filtering Clustering Association Rules Information filtering Classifiers

Types of recommendation systems

Search-based recommendations  The only visitor types a search query « data mining customer »  The system retrieves all the items that correspond to that query e.g. 6 books  The system recommend some of these books based on general, non-personalized ranking (sales rank, popularity, etc.)

Search-based recommendations  Pros: Simple to implement  Cons: Not very powerful Which criteria to use to rank recommendations? Is it really « recommendations »? The user only gets what he asked

Category-based recommendations  Each item belongs to one category or more.  Explicit / implicit choice: The customer select a category of interest (refine search, opt-in for category-based recommendations, etc.).  « Subjects > Computers & Internet > Databases > Data Storage & Management > Data Mining » The system selects categories of interest on the behalf of the customer, based on the current item viewed, past purchases, etc.  Certain items (bestsellers, new items) are eventually recommended

Category-based recommendations  Pros: Still simple to implement  Cons: Again: not very powerful, which criteria to use to order recommendations? is it really « recommendations »? Capacity highly dependd upon the kind of categories implemented  Too specific: not efficient  Not specific enough: no relevant recommendations

Collaborative filtering  Collaborative filtering techniques « compare » customers, based on their previous purchases, to make recommendations to « similar » customers  It’s also called « social » filtering  Follow these steps: 1. Find customers who are similar (« nearest neighbors ») in term of tastes, preferences, past behaviors 2. Aggregate weighted preferences of these neighbors 3. Make recommendations based on these aggregated, weighted preferences (most preferred, unbought items)

Collaborative filtering  Example: the system needs to make recommendations to customer C  Customer B is very close to C (he has bought all the books C has bought). Book 5 is highly recommended  Customer D is somewhat close. Book 6 is recommended to a lower extent  Customers A and E are not similar at all. Weight=0

Collaborative filtering  Pros: Extremely powerful and efficient Very relevant recommendations (1) The bigger the database, (2) the more the past behaviors, the better the recommendations  Cons: Difficult to implement, resource and time-consuming What about a new item that has never been purchased? Cannot be recommended What about a new customer who has never bought anything? Cannot be compared to other customers  no items can be recommended

Clustering  Another way to make recommendations based on past purchases of other customers is to cluster customers into categories  Each cluster will be assigned « typical » preferences, based on preferences of customers who belong to the cluster  Customers within each cluster will receive recommendations computed at the cluster level

Clustering  Customers B, C and D are « clustered » together. Customers A and E are clustered into another separate group  « Typicical » preferences for CLUSTER are: Book 2, very high Book 3, high Books 5 and 6, may be recommended Books 1 and 4, not recommended at all

Clustering  How does it work?  Any customer that shall be classified as a member of CLUSTER will receive recommendations based on preferences of the group: Book 2 will be highly recommended to Customer F Book 6 will also be recommended to some extent

Clustering  Problem: customers may belong to more than one cluster; clusters may overlap  Predictions are then averaged across the clusters, weighted by participation

Clustering  Pros: Clustering techniques work on aggregated data: faster It can also be applied as a « first step » for shrinking the selection of relevant neighbors in a collaborative filtering algorithm  Cons: Recommendations (per cluster) are less relevant than collaborative filtering (per individual)

Association rules  Clustering works at a group (cluster) level  Collaborative filtering works at the customer level  Association rules work at the item level

Association rules  Past purchases are transformed into relationships of common purchases

Association rules  These association rules are then used to made recommendations  If a visitor has some interest in Book 5, he will be recommended to buy Book 3 as well  Of course, recommendations are constrained to some minimum levels of confidence

Association rules  What if recommendations can be made using more than one piece of information?  Recommendations are aggregated If a visitor is interested in Books 3 and 5, he will be recommended to buy Book 2, than Book 3

Association rules  Pros: Fast to implement Fast to execute Not much storage space required Not « individual » specific Very successful in broad applications for large populations, such as shelf layout in retail stores  Cons: Not suitable if knowledge of preferences change rapidly It is tempting to do not apply restrictive confidence rules

Information filtering  Association rules compare items based on past purchases  Information filtering compare items based on their content  Also called « content-based filtering » or « content-based recommendations »

Information filtering  What is the « content » of an item?  It can be explicit « attributes » or « characteristics » of the item. For example for a film: Action / adventure Feature Bruce Willis Year 1995  It can also be « textual content » (title, description, table of content, etc.) Several techniques exist to compute the distance between two textual documents

Information filtering  How does it work? A textual document is scanned and parsed Word occurrences are counted (may be stemmed) Several words or « tokens » are not taken into account. That includes « stop words » (the, a, for), and words that do not appear enough in documents Each document is transformed into a normed TFIDF vector, size N (Term Frequency / Inverted Document Frequency). The distance between any pair of vector is computed

Information filtering

 An (unrealistic) example: how to compute recommendations between 8 books based only on their title?  Books selected: Building data mining applications for CRM Accelerating Customer Relationships: Using CRM and Relationship Technologies Mastering Data Mining: The Art and Science of Customer Relationship Management Data Mining Your Website Introduction to marketing Consumer behavior marketing research, a handbook Customer knowledge management

Data Data mining your website Mastering Data Mining: The Art and Science of Customer Relationship Management

Information filtering  A customer is interested in the following book: « Building data mining applications for CRM »  The system computes distances between this book and the 7 others  The « closest » books are recommended: #1: Data Mining Your Website #2: Accelerating Customer Relationships: Using CRM and Relationship Technologies #3: Mastering Data Mining: The Art and Science of Customer Relationship Management Not recommended: Introduction to marketing Not recommended: Consumer behavior Not recommended: marketing research, a handbook Not recommended: Customer knowledge management

Information filtering  Pros: No need for past purchase history Not extremely difficult to implement  Cons: « Static » recommendations Not efficient is content is not very informative e.g. information filtering is more suited to recommend technical books than novels or movies

Classifiers  Classifiers are general computational models  They may take in inputs: Vector of item features (action / adventure, Bruce Willis) Preferences of customers (like action / adventure) Relations among items  They may give as outputs: Classification Rank Preference estimate  That can be a neural network, Bayesian network, rule induction model, etc.  The classifier is trained using a training set

Classifiers  Pros: Versatile Can be combined with other methods to improve accuracy of recommendations  Cons: Need a relevant training set