EigenTaste: A Constant Time Collaborative Filtering Algorithm Ken Goldberg Students: Theresa Roeder, Dhruv Gupta, Chris Perkins Industrial Engineering.


EigenTaste: A Constant Time Collaborative Filtering Algorithm
Ken Goldberg
Students: Theresa Roeder, Dhruv Gupta, Chris Perkins
Industrial Engineering and Operations Research; Electrical Engineering and Computer Science
UC Berkeley

CF Problem Definition
A set of objects (movies, books, jokes).
A user rates a subset of the objects.
Based on those ratings, retrieve objects from the complement of this subset.
Criteria:
– Effective: recommended objects should receive high ratings
– Efficient: the online recommendation process should run quickly and be scalable

Some Previous Work
D. Goldberg et al. - Tapestry (1992)
Riedl, Resnick, Konstan et al. - GroupLens (1994- )
Shardanand and Maes - Ringo (1995)
Resnick and Varian (1997)
Breese et al. at Microsoft Research (1998)
Pazzani (1999)
Herlocker et al. - GroupLens (1999)

WWW-based Recommender Systems: Firefly, MovieCritic, MovieLens

EigenTaste Algorithm
1) Principal Component Analysis
2) Universal Queries (dense ratings matrix)
3) Fine-grained ratings bar (captures nuances)
4) Offline and Online Processing
5) Online: Constant time recommendations

Universal Queries
Most CF systems require users to select which items they want to rate, which yields a sparse ratings matrix.
Eigentaste allows users to rate all items based on short, unbiased descriptions (e.g., a film synopsis).
Eigentaste uses a subset of highly discriminatory items as the gauge set.

Continuous Rating Scale (figure: rating bar running from "Disapprove" to "Approve")

EigenTaste Algorithm
$A$ is the n x m normalized rating matrix
– n users
– m objects
$C$ is the k x k reduced correlation matrix
– k objects in the gauge set
– $C = \frac{1}{n} A^{T} A$, with $A$ restricted to the k gauge-set columns
– assumes ratings are continuous with a linear relationship
$E$ is the orthogonal matrix of eigenvectors of $C$
$\Lambda$ is the diagonal matrix of eigenvalues

Correlation Matrix

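EigenTaste
$E C E^{T} = \Lambda$, equivalently $C = E^{T} \Lambda E$
Let $B = A E^{T}$; then $R_B = \frac{1}{n} B^{T} B = E C E^{T} = \Lambda$
– the transformed points are uncorrelated, and the i-th column of $B$ has variance $\lambda_i$
Principal Components (Pearson 1901)
– keep the eigenvectors with the m largest eigenvalues, $E_m$: $B_m = A E_m^{T}$ (here m is the number of retained components)
– choose m based on the "knee" in the eigenvalues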
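The decomposition above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small synthetic ratings matrix; the helper name `eigen_decompose` and the random data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def eigen_decompose(A):
    """Return (E, lam): rows of E are eigenvectors of C = (1/n) A^T A,
    ordered by decreasing eigenvalue; lam holds the eigenvalues."""
    n = A.shape[0]
    C = (A.T @ A) / n                  # k x k correlation matrix
    lam, V = np.linalg.eigh(C)         # eigh: ascending eigenvalues, eigenvectors in columns
    order = np.argsort(lam)[::-1]      # reorder to descending
    return V[:, order].T, lam[order]

# Synthetic stand-in for gauge-set ratings (assumption: 500 users, 5 items).
rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
A = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # normalize each column

E, lam = eigen_decompose(A)
B = A @ E.T                            # B = A E^T
R_B = (B.T @ B) / A.shape[0]           # should equal diag(lam)

B2 = A @ E[:2].T                       # projection onto the first two components
explained = lam[:2].sum() / lam.sum()  # share of variance in the eigen plane
print(np.round(np.diag(R_B), 3), round(float(explained), 3))
```

With normalized columns the diagonal of `R_B` matches the eigenvalues and its off-diagonal entries vanish up to rounding, which is the "uncorrelated" property the slide states.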

Dimensionality Reduction
The first two principal components (eigenvectors) account for nearly 50% of the variation in user ratings.
Project user ratings along the first two principal components: $x = A E_2^{T}$
Facilitates visualization...

Eigen Plane Recursive Clustering

The EigenTaste Algorithm
Offline:
– Compute eigenvectors and project users onto the eigen plane.
– Cluster and compute average ratings for each cluster.
Online:
– Collect ratings for objects in the gauge set
– Project onto the eigen plane
– Find the representative cluster
– Recommend objects based on average ratings within that cluster
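The offline and online phases can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes NumPy, assumes ratings are already normalized, and swaps the recursive clustering for a uniform grid over the eigen plane. The names `offline`, `online`, `cell_of`, and `bins` are hypothetical.

```python
import numpy as np

def cell_of(X, lo, hi, bins):
    """Map eigen-plane coordinates to a grid-cell id (uniform grid, an assumption)."""
    span = np.where(hi > lo, hi - lo, 1.0)
    idx = np.clip(np.floor((X - lo) / span * bins).astype(int), 0, bins - 1)
    return idx[..., 0] * bins + idx[..., 1]

def offline(gauge, others, bins=4):
    """gauge: n x k normalized gauge-set ratings; others: n x q ratings of remaining items."""
    n = gauge.shape[0]
    lam, V = np.linalg.eigh(gauge.T @ gauge / n)
    E2 = V[:, np.argsort(lam)[::-1][:2]].T           # top two eigenvectors, 2 x k
    X = gauge @ E2.T                                 # users projected onto the eigen plane
    lo, hi = X.min(axis=0), X.max(axis=0)
    cells = cell_of(X, lo, hi, bins)
    # Per cluster: items ranked by average rating, computed once offline.
    ranked = {int(c): np.argsort(-others[cells == c].mean(axis=0))
              for c in np.unique(cells)}
    fallback = np.argsort(-others.mean(axis=0))      # used if a cell has no users
    return {"E2": E2, "lo": lo, "hi": hi, "bins": bins,
            "ranked": ranked, "fallback": fallback}

def online(model, new_gauge, top=3):
    """Work here depends on k and `top`, not on the number of users n."""
    x = new_gauge @ model["E2"].T                    # O(k) projection
    c = int(cell_of(x, model["lo"], model["hi"], model["bins"]))
    return model["ranked"].get(c, model["fallback"])[:top]

# Synthetic demo data (assumption): 200 users, 6 gauge items, 10 other items.
rng = np.random.default_rng(1)
gauge = rng.normal(size=(200, 6))
others = rng.normal(size=(200, 10))
model = offline(gauge, others)
rec = online(model, gauge[0], top=3)
print(rec)
```

Ranking items per cluster during the offline step is a design choice that keeps the online step to a projection and a table lookup.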

First Application (1999)
Jester: Recommending Jokes
Sense of humor is difficult to specify.
Advantages:
– Rating process is not altogether unpleasant
– Can evaluate jokes quickly
– Dense ratings matrix (large sample size)
Disadvantages:
– Offensive/Shaggy Dog jokes
– Temporal Effects, Portfolio Effects
– Priming/Masking

Jester: User Interface

System Architecture (diagram; components: Client, Internet, Web Server, CGI, Login Interface, Recommendation Engine, User Rating Profiles, Content Database)

Measure of Effectiveness
Metric: Normalized Mean Absolute Error (NMAE): the average absolute deviation of actual ratings from predicted ratings, normalized over the rating range.
$\mathrm{MAE} = \frac{1}{c} \sum_{i=1}^{c} |r_i - p_i|$
$\mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}$
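The metric can be written as a short function. This is a minimal sketch in plain Python; the function name `nmae` and the example ratings are illustrative, and the rating range of -10 to +10 is an assumption chosen for the example.

```python
def nmae(actual, predicted, r_min, r_max):
    """Normalized mean absolute error over c paired ratings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two equal-length, non-empty rating lists")
    c = len(actual)
    mae = sum(abs(r - p) for r, p in zip(actual, predicted)) / c
    return mae / (r_max - r_min)

# Absolute errors are 1, 3, 2, so MAE = 2.0; the assumed range spans 20.
score = nmae([2.0, -4.0, 6.0], [1.0, -1.0, 8.0], r_min=-10.0, r_max=10.0)
print(score)  # 0.1
```

Dividing by the rating range makes scores comparable across systems that use different rating scales.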

Effectiveness Based on 18,000 users

Computational Complexity
n - number of users
k - number of objects in the gauge set
Nearest Neighbor algorithm:
– Online processing - O(kn)
EigenTaste algorithm:
– Offline processing - O(k^2 n)
– Online processing - O(k)

Effectiveness and Efficiency

Prediction Speed
Algorithm           Time to process 9000 users
Nearest Neighbor    28 hours
EigenTaste          3 minutes
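As a worked reading of these figures (simple arithmetic on the two reported times, assuming both runs processed the same 9000 users):

$$\frac{28\ \text{hours}}{3\ \text{minutes}} = \frac{1680\ \text{min}}{3\ \text{min}} = 560$$

$$\frac{28 \times 3600\ \text{s}}{9000} = 11.2\ \text{s per user}, \qquad \frac{3 \times 60\ \text{s}}{9000} = 0.02\ \text{s per user}$$

That is, the reported EigenTaste time is about 560 times shorter, or roughly 0.02 seconds per user on average versus 11.2 seconds for Nearest Neighbor.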

Current Jester Dataset
62,000 registered users
approx. 3,000,000 ratings

Second Application (2000) Sleeper: Recommending Books

EigenTaste Algorithm
1) Principal Component Analysis
2) Universal Queries (dense ratings matrix)
3) Fine-grained ratings bar (captures nuances)
4) Offline and Online Processing
5) Online: Constant time recommendations
Patent application 21 December 1999 by UC Regents

Eigentaste: A Constant Time Collaborative Filtering Algorithm (to appear: Information Retrieval Journal, 2001)