Predicting Sequential Rating Elicited from Humans Aviv Zohar & Eran Marom.

2 Motivation
- Recommender systems base their recommendations on information collected from users
- Humans are not consistent when rating items:
  - Framing problem (scaling)
  - Relative ratings, not absolute
  - Short memory
- Most recommender systems ignore the sequential aspect of the rating process
- We wanted to find out whether this information can be used to better predict unseen future ratings

3 Collected Data
- 12 quotes of different qualities were chosen
- Each user rates all quotes in a random order
- Quotes are easy and quick to rank
- A total of 500 people completed the survey
- Binned the ratings to 5 values

4 Simple Analysis of the Data
- A naïve classifier that picks the most common rating is right 33.8% of the time
- Ordered data seems to have added information
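For concreteness, a minimal sketch of such a majority-rating baseline (the synthetic data below only stands in for the 500-user, 12-quote survey; it is not the authors' data or code):

```python
import numpy as np

def majority_baseline_accuracy(ratings):
    """ratings: array of shape (n_users, n_quotes) with values in {1..5}.
    Always predict the single most common rating and report the accuracy."""
    values, counts = np.unique(ratings, return_counts=True)
    most_common = values[np.argmax(counts)]
    return float(np.mean(ratings == most_common))

# Synthetic stand-in for the survey: 500 users x 12 quotes, 5 rating levels
rng = np.random.default_rng(0)
fake_ratings = rng.integers(1, 6, size=(500, 12))
print(f"Baseline accuracy: {majority_baseline_accuracy(fake_ratings):.1%}")
```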

5 The Graphical Model
[Figure: the graphical model. Legend: q = quote type/quality (unknown), t = person type (unknown), r = quote rating. The diagram shows quote qualities q_1..q_4 and person types t_1, t_2 as parents of the observed ratings, so the rating of quote j by person i depends on that quote's quality q_j and that person's type t_i.]
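Written out, the factorization this structure implies is (a hedged reconstruction; the notation follows the figure legend rather than an equation shown on the slide):

```latex
P(\mathbf{r}, \mathbf{q}, \mathbf{t})
  = \prod_{j} P(q_j)\,
    \prod_{i} P(t_i)\,
    \prod_{i,j} P(r_{ij} \mid q_j, t_i)
```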

6 Dependencies in the Model
[Figure: example V-structures over q_2, t_1, r_{2,1} and t_2, r_{3,2}.]
- Given the data, but not the quotes' qualities, every path between types and qualities is open (conditioning on an observed rating, the common child, opens the path between its quote's quality and its rater's type)
- When the quotes' qualities are given, the types are independent, and vice versa
- Conclusion: types are hard to learn together with qualities
- The learning is exponential in min(N_People, N_Quotes)

7 Likelihood of the Data
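The equation on this slide did not survive the transcript; a hedged reconstruction of the data likelihood implied by the model above, marginalizing the unknown qualities and types, is:

```latex
\mathcal{L}(\theta)
  = P(\mathbf{r} \mid \theta)
  = \sum_{\mathbf{q}} \sum_{\mathbf{t}}
      \prod_{j} P(q_j)\,
      \prod_{i} P(t_i)\,
      \prod_{i,j} P(r_{ij} \mid q_j, t_i)
```

The double sum runs over every joint assignment of qualities and types, which is the exponential blow-up the previous slide points to.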

8 EM - Learning Parameters (known types) Expectation Maximization
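As a concrete illustration, here is a minimal EM sketch for this model with the person types held fixed, as the slide title suggests; the multinomial parameterization, variable names, and smoothing are my assumptions, not the authors' code:

```python
import numpy as np

def em_known_types(R, types, n_qualities=3, n_iters=50, seed=0):
    """EM for the two-layer model with the person types held fixed.
    R: (n_people, n_quotes) integer ratings in {0..V-1}
    types: (n_people,) integer person types in {0..T-1}
    Latent: one quality per quote; parameters: quality prior and P(r | q, t)."""
    rng = np.random.default_rng(seed)
    n_people, n_quotes = R.shape
    V = R.max() + 1
    T = types.max() + 1

    pi = np.full(n_qualities, 1.0 / n_qualities)                # P(q)
    theta = rng.dirichlet(np.ones(V), size=(n_qualities, T))    # P(r | q, t)

    for _ in range(n_iters):
        # E-step: posterior over each quote's quality (log domain for stability)
        log_post = np.zeros((n_quotes, n_qualities))
        for j in range(n_quotes):
            for q in range(n_qualities):
                log_post[j, q] = np.log(pi[q]) + np.sum(
                    np.log(theta[q, types, R[:, j]] + 1e-12))
        log_post -= log_post.max(axis=1, keepdims=True)
        gamma = np.exp(log_post)
        gamma /= gamma.sum(axis=1, keepdims=True)               # (n_quotes, n_qualities)

        # M-step: re-estimate the quality prior and the rating table
        pi = gamma.mean(axis=0)
        theta = np.full((n_qualities, T, V), 1e-6)              # small smoothing
        for j in range(n_quotes):
            for i in range(n_people):
                theta[:, types[i], R[i, j]] += gamma[j]
        theta /= theta.sum(axis=2, keepdims=True)

    return pi, theta, gamma
```

Here `gamma` is the per-quote posterior over qualities; the prediction and multi-phase sketches further below reuse these quantities.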

9 Selecting Strict People Types Cleverly
- Using additional information on users
- Clustering using ratings (sketched below)
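Slide 15 mentions that constant people types were deduced from k-means, so the clustering step could look roughly like this (the scikit-learn usage is an assumption on my part):

```python
import numpy as np
from sklearn.cluster import KMeans

def assign_people_types(R, n_types=2, seed=0):
    """Cluster users by their full rating vectors and use the cluster
    labels as fixed ('strict') people types."""
    km = KMeans(n_clusters=n_types, n_init=10, random_state=seed)
    return km.fit_predict(R)          # one integer type per user

# Example: 500 users x 12 quotes, ratings binned to 5 values
rng = np.random.default_rng(1)
R = rng.integers(0, 5, size=(500, 12))
types = assign_people_types(R, n_types=2)
```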

10 How to Predict the Next Rating
[Figure: the network for a new user, with quote qualities q_1..q_4, the first ratings r_1, r_2 observed, and the remaining ratings unknown. Annotations: "We want to predict this" (the next rating); "Unknown, can be estimated from previous ratings"; "We have a prior from past data".]

11 How to Predict the Next Rating
- Use the previous ratings by the user to estimate the user type, and then predict the following rating
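A minimal sketch of that two-step prediction under the model sketched above; treating the learned tables and the per-quote quality posteriors as inputs is my assumption about how the pieces fit together, not a description of the authors' implementation:

```python
import numpy as np

def predict_next_rating(prev_ratings, prev_qualities, next_quality, pi_t, theta):
    """Predictive distribution over the next rating for one user.
    prev_ratings:   ratings the user already gave (indices into the rating axis)
    prev_qualities: quality posterior for each of those quotes, shape (n_prev, Q)
    next_quality:   quality posterior for the upcoming quote, shape (Q,)
    pi_t:           prior over person types, shape (T,)
    theta:          P(r | q, t), shape (Q, T, V)"""
    # Step 1: posterior over the user's type given the ratings seen so far
    log_post_t = np.log(pi_t + 1e-12)
    for r, q_post in zip(prev_ratings, prev_qualities):
        p_r_given_t = np.einsum('q,qt->t', q_post, theta[:, :, r])
        log_post_t += np.log(p_r_given_t + 1e-12)
    post_t = np.exp(log_post_t - log_post_t.max())
    post_t /= post_t.sum()

    # Step 2: predict the next rating, marginalizing over type and quality
    p_next = np.einsum('t,q,qtv->v', post_t, next_quality, theta)
    return p_next / p_next.sum()
```

The predicted rating would then be the argmax of the returned distribution, matching the "percentage of correct predictions" score used on the next slide.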

12 Experiments
- Learning process:
  - Initialized with random, almost uniform, priors
  - Final network selected by maximum likelihood from 100 EM runs
- Score: percentage of correct predictions
- 5-fold cross-validation was used (a sketch of the protocol follows below)
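A minimal sketch of that protocol, with the EM fit and the scoring function passed in as callables whose exact signatures are my assumptions (this is not the authors' code):

```python
import numpy as np
from sklearn.model_selection import KFold

def best_of_restarts(fit_fn, n_restarts=100):
    """Run one EM fit per seed and keep the highest-likelihood result.
    fit_fn(seed) must return a (log_likelihood, params) pair."""
    return max((fit_fn(seed) for seed in range(n_restarts)), key=lambda r: r[0])

def cross_validated_score(R, fit_fn, score_fn, n_folds=5, seed=0):
    """5-fold cross-validation over users.
    score_fn(params, R_test) should return the fraction of correct predictions."""
    scores = []
    for train_idx, test_idx in KFold(n_folds, shuffle=True, random_state=seed).split(R):
        _, params = best_of_restarts(lambda s: fit_fn(R[train_idx], s))
        scores.append(score_fn(params, R[test_idx]))
    return float(np.mean(scores))
```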

13 Results for Learning with Strictly Selected People Types
- To check whether the order information helps with prediction, the tests were repeated with scrambled order
- Only a slight difference was found:
  - True order – 37.98%
  - Permuted order – 37.40%
- The type of each person is estimated better with each new rating given, so we'll make fewer mistakes over time

14 Adaptively Assigning People Types
- Multi-phase EM: alternating between optimal selection of types and optimal selection of parameters (sketched below)
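One plausible reading of that alternation is a hard type-assignment step wrapped around the EM sketch from slide 8 above; this is my guess at the structure, not the authors' algorithm:

```python
import numpy as np

def multi_phase_em(R, n_types=2, n_qualities=3, n_phases=10, seed=0):
    """Alternate between (a) fitting parameters with the person types held
    fixed and (b) re-assigning each user to the type that best explains
    their ratings. Reuses `em_known_types` from the slide-8 sketch."""
    rng = np.random.default_rng(seed)
    types = rng.integers(0, n_types, size=R.shape[0])   # random initial types

    for _ in range(n_phases):
        # Phase 1: optimal parameters for the current type assignment
        pi, theta, gamma = em_known_types(R, types, n_qualities=n_qualities)

        # Phase 2: optimal type for each user given the fitted parameters:
        # log P(ratings_i | t) = sum_j log sum_q gamma[j, q] * theta[q, t, r_ij]
        n_fit_types = theta.shape[1]
        new_types = np.empty_like(types)
        for i in range(R.shape[0]):
            log_like = np.zeros(n_fit_types)
            for j in range(R.shape[1]):
                log_like += np.log(gamma[j] @ theta[:, :, R[i, j]] + 1e-12)
            new_types[i] = np.argmax(log_like)
        types = new_types

    return types, pi, theta
```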

15 Results for Learning with Dynamic Adaptation of People Types
- Average score is still 0.5% better for the ordered run
- No improvement in predictions was seen compared to constant people types deduced from k-means
- A general rise in score with the progress of ratings is seen, though other underlying aspects of the data also have an effect
[Figure: score per rating position, with "Permuted" and "Ordered" curves.]

16 Numerical Problems when Increasing the Data Size
- Insufficient data for learning the parameter space:
  - 11 rating values, 3 quote types, 2 person types
  - Number of parameters to learn: 11 × 12 × 3 × 2 + 3 = 795
  - Number of predictions from 500 people: 500 × 12 = 6000
- Precision problems:
  - Alas, using more than 400 people implies such small probabilities that we have to use arbitrary-precision tools
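The slide resolves the underflow with arbitrary-precision arithmetic; a common alternative, which the talk does not report using, is to keep the computation in log space with a log-sum-exp, e.g.:

```python
import numpy as np
from scipy.special import logsumexp

def log_marginal_over_quality(log_pi, log_theta, ratings, types):
    """log of sum_q P(q) * prod_i P(r_i | q, t_i) for a single quote,
    computed without ever forming the underflowing product itself.
    log_pi: (Q,) log prior; log_theta: (Q, T, V) log rating table."""
    per_quality = log_pi + np.sum(log_theta[:, types, ratings], axis=1)
    return logsumexp(per_quality)
```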

17 Conclusions and Possible Extensions
- A significant improvement is already gained by very naïve predictors, but we can still do better
- Adding order dependencies improves predictions only slightly
- Multi-phase learning could be done using strict quote qualities while keeping type probabilities for each person
- The model can be extended to encompass more aspects of sequential information
- More data may yield better results
- The algorithms ran fast – the system is scalable to larger data sets