A clustering algorithm to find groups with homogeneous preferences. J. Díez, J.J. del Coz, O. Luaces, A. Bahamonde. Centro de Inteligencia Artificial, Universidad de Oviedo at Gijón.

A clustering algorithm to find groups with homogeneous preferences. J. Díez, J.J. del Coz, O. Luaces, A. Bahamonde. Centro de Inteligencia Artificial, Universidad de Oviedo at Gijón. Workshop on Implicit Measures of User Interests and Preferences.

The framework to learn preferences
People tend to rate their preferences in a relative way. Which middle circle do you think is larger?

The framework to learn people's preferences
o Regression is not a good idea.
o We use training sets of preference judgments: pairs of vectors (v, u) where someone expresses that he or she prefers v to u.
o From one person ("Me"), the set of judgments {v_i > u_i : i ∈ I_Me} is given to a linear SVM, which learns a function f_Me.
o f_Me is a linear ranking function: f(v_i) > f(u_i) whenever v_i is preferable to u_i.
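As an illustration of this step, here is a minimal Python sketch of learning a linear ranking function from preference judgments via the usual pairwise-difference reduction to binary classification. It assumes scikit-learn and uses LinearSVC as a stand-in for the SVM formulation used in the talk; the function name and data layout are illustrative, not taken from the paper.

# Minimal sketch: learn a linear ranking function f(x) = w·x from
# preference judgments (v preferred to u) by training a linear SVM on
# the difference vectors v - u (label +1) and u - v (label -1).
import numpy as np
from sklearn.svm import LinearSVC

def learn_ranking_function(preference_pairs):
    """preference_pairs: list of (v, u) arrays, with v preferred to u."""
    X, y = [], []
    for v, u in preference_pairs:
        X.append(v - u); y.append(+1)
        X.append(u - v); y.append(-1)
    svm = LinearSVC(fit_intercept=False)  # ranking only needs the direction w
    svm.fit(np.array(X), np.array(y))
    return svm.coef_.ravel()  # rank objects by f(x) = w @ x

Objects can then be ranked for that person simply by sorting them on w @ x.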

The framework to learn people's preferences
How useful is the ranking function f_Me learned from {v_i > u_i : i ∈ I_Me}? Is it reliable and general? Its usefulness depends on its accuracy (generalization error), the number of training examples, and the number of attributes.

The problem addressed
To improve ranking functions, we present a new algorithm for clustering preference criteria. Each person P_k provides a set of preference judgments {v_i^k > u_i^k}, from which a ranking function f_k is learned. Two people, say P2 and P3, are merged into one cluster with a joint ranking function f_{2∪3} only if f_{2∪3} is better than both f_2 and f_3.

Applications
o Information retrieval: Optimizing Search Engines Using Clickthrough Data [Joachims, 2002].
o Personalized recommenders: the Adaptive Route Advisor [Fiechter & Rogers, 2000].
o Analysis of sensory data: panels of experts and consumers, used to test the quality (or the acceptability) of market products.

Baseline approaches
Each person P_k is represented by the ratings he or she gives to objects: P_k = {object_i → rating_i}. If rating_i ≈ rating_j then merge P_i with P_j, where the similarity ≈ is computed with correlation or cosine.
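For concreteness, a small sketch of this baseline similarity between two spectators, computed only over the objects both have rated. The handling of small overlap and the default choice of Pearson correlation are assumptions of the sketch; the slides only say that correlation or cosine is used.

# Baseline similarity between two users, computed on the objects both
# have rated (ratings given as dicts object_id -> rating).
import numpy as np

def baseline_similarity(ratings_a, ratings_b, measure="pearson"):
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # not enough overlap to compare
    a = np.array([ratings_a[o] for o in common], dtype=float)
    b = np.array([ratings_b[o] for o in common], dtype=float)
    if measure == "pearson":
        if a.std() == 0 or b.std() == 0:
            return 0.0
        return float(np.corrcoef(a, b)[0, 1])
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))  # cosine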

Weaknesses of baseline approaches
Correlation and cosine were devised for prediction purposes in collaborative filtering, and they are not easily extendable to clustering:
o Not all people have seen the same objects.
o Two samples of preferences from the same person would not necessarily be considered homogeneous.
o Rating itself is not a good idea (people rate in a relative way).

Our approach: a clustering algorithm
Ranking functions are linear maps, f(x) = w·x, so the weight vectors w codify the rationale behind the preferences. Therefore, we try to merge the data sets whose ranking functions (i.e. weight vectors) are most similar in the cosine sense. A merge is accepted only if the joint ranking function improves on the quality of the individual functions.

Our approach: a clustering algorithm

ClusterPreferencesCriteria(a list of preference judgments (PJ_i : i = 1, …, N)) returns a set of clusters
{
  Clusters = ∅;
  for each i = 1 to N {
    w_i = Learn a ranking hyperplane from PJ_i;
    Clusters = Clusters ∪ {(PJ_i, w_i)};
  }
  repeat {
    let (PJ_1, w_1) and (PJ_2, w_2) be the clusters with the most similar w_1 and w_2;
    w = Learn a ranking hyperplane from PJ_1 ∪ PJ_2;
    if (quality of w >= quality of w_1 + quality of w_2) then
      replace the clusters (PJ_1, w_1) and (PJ_2, w_2) by (PJ_1 ∪ PJ_2, w) in Clusters;
  } until (no new merges can be tested);
  return Clusters;
}
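A compact Python sketch of this merging loop follows. It reuses learn_ranking_function from the earlier sketch and takes a quality(w, pairs) estimator as a parameter (one possibility is the pessimistic estimate sketched after the quality-estimation slides below). The bookkeeping of already-rejected pairs is an implementation detail of this sketch, not something specified in the slides.

import numpy as np
from itertools import combinations

def cosine(w1, w2):
    return float(w1 @ w2 / (np.linalg.norm(w1) * np.linalg.norm(w2)))

def cluster_preference_criteria(pj_list, learn, quality):
    """pj_list: one list of preference-judgment pairs per person."""
    clusters = [(pj, learn(pj)) for pj in pj_list]
    rejected = set()  # pairs of clusters whose merge was already refused
    while True:
        candidates = [(cosine(clusters[a][1], clusters[b][1]), a, b)
                      for a, b in combinations(range(len(clusters)), 2)
                      if frozenset((id(clusters[a]), id(clusters[b]))) not in rejected]
        if not candidates:
            return clusters  # no new merges can be tested
        _, a, b = max(candidates)  # most similar pair of weight vectors
        (pj_a, w_a), (pj_b, w_b) = clusters[a], clusters[b]
        merged_pj = pj_a + pj_b
        w = learn(merged_pj)
        if quality(w, merged_pj) >= quality(w_a, pj_a) + quality(w_b, pj_b):
            clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
            clusters.append((merged_pj, w))
        else:
            rejected.add(frozenset((id(clusters[a]), id(clusters[b]))))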

To estimate the quality of ranking functions
The quality of the ranking functions depends on:
o Accuracy (generalization error)
o The number of training examples
o The number of attributes

To estimate the quality of ranking functions
If we have enough training data:
o Divide the data into a training set proper and a verification set.
o Compute the confidence interval [L, R] of the probability of error when each ranking function is applied to its verification set.
o Quality is 1 − R: the estimated proportion of successful generalizations in the pessimistic case.
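A minimal sketch of this pessimistic estimate is shown below. The slides do not say which confidence interval is used; the normal approximation at 95% confidence is assumed here purely for illustration.

import math

def pessimistic_quality(n_errors, n_verification, z=1.96):
    # Confidence interval [L, R] for the error probability on the
    # verification set (normal approximation, an assumption); quality = 1 - R.
    p = n_errors / n_verification
    half_width = z * math.sqrt(p * (1 - p) / n_verification)
    upper = min(1.0, p + half_width)   # R, the pessimistic error estimate
    return 1.0 - upper

For example, 12 ranking mistakes on 200 verification judgments give pessimistic_quality(12, 200) ≈ 0.907.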

To estimate the quality of ranking functions
If we do not have enough training data:
o The Xi-alpha estimator [Joachims, 2000], proposed for text classification
o Cross-validation
o Other estimators
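As one concrete option for small data sets, here is a cross-validation sketch on the pairwise-difference representation. It assumes scikit-learn and uses the mean cross-validated accuracy as the quality term; this stands in for, rather than implements, the Xi-alpha estimator mentioned in the slide.

import numpy as np
from sklearn.model_selection import GroupKFold, cross_val_score
from sklearn.svm import LinearSVC

def cv_quality(preference_pairs, folds=5):
    X, y = [], []
    for v, u in preference_pairs:       # v preferred to u
        X.append(v - u); y.append(+1)
        X.append(u - v); y.append(-1)
    # Keep the two mirrored examples of each judgment in the same fold,
    # otherwise the estimate would be optimistically biased.
    groups = np.repeat(np.arange(len(preference_pairs)), 2)
    scores = cross_val_score(LinearSVC(fit_intercept=False),
                             np.array(X), np.array(y),
                             groups=groups, cv=GroupKFold(n_splits=folds))
    return float(scores.mean())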

Experimental results
We used a collection of preference judgments taken from EachMovie to simulate reasonable situations in the study of the preferences of groups of people. Training sets are sets of preference judgments.
o People: the 100 spectators with the most ratings.
o Objects: 504 movies, described by the ratings given by another 89 spectators; split into 60% train, 20% verification, 20% test.
o 808 spectators.
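To make the construction of these training sets concrete, here is a sketch of how preference judgments can be derived from ratings of the EachMovie kind: for a given spectator, every pair of movies rated differently yields one judgment. Representing each movie by the ratings of a fixed set of other spectators follows the slide, but the function and data layout are illustrative assumptions.

from itertools import combinations

def preference_judgments(spectator_ratings, movie_features):
    """spectator_ratings: dict movie_id -> rating for one spectator.
    movie_features: dict movie_id -> feature vector (e.g. others' ratings)."""
    pairs = []
    for m1, m2 in combinations(spectator_ratings, 2):
        r1, r2 = spectator_ratings[m1], spectator_ratings[m2]
        if r1 == r2:
            continue  # equal ratings give no preference information
        preferred, other = (m1, m2) if r1 > r2 else (m2, m1)
        pairs.append((movie_features[preferred], movie_features[other]))
    return pairs  # ready for learn_ranking_function(...)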

Experimental results (results chart in the original slide)
