Ranking and Rating Data in Joint RP/SP Estimation by JD Hunt, University of Calgary M Zhong, University of Calgary PROCESSUS Second International Colloquium.


Ranking and Rating Data in Joint RP/SP Estimation
by JD Hunt, University of Calgary, and M Zhong, University of Calgary
PROCESSUS Second International Colloquium, Toronto ON, Canada, June 2005

Overview
- Introduction
  - Context
  - Motivations
- Definitions
  - Revealed Preference Choice
  - Stated Preference Rankings
  - Revealed Preference Ratings
  - Stated Preference Ratings
- Estimation Testbed
  - Concept
  - Synthetic Data Generation
- Results (so far)
- Conclusions (so far)

Introduction
Context
- Common task to estimate logit model utility function for non-existing mode alternatives
- Joint RP/SP estimation available
  - Good for sensitivity coefficients
  - Problems with alternative specific constants (ASC)
Motivation
- Improve situation regarding ASC
- Seeking to expand on joint RP/SP estimation
  - Add rating information
    - 0 to 10 scores
    - Direct utility
- Increase understanding of issues regarding ASC generally

Definitions
- Revealed Preference Choice
- Stated Preference Ranking
- Revealed Preference Ratings
- Stated Preference Ratings
- Linear-in-parameters logit utility function:
  $U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m$
  where $\alpha_{m,k}$ is the sensitivity coefficient and $\beta_m$ is the ASC

Revealed Preference Choice
- Actual behaviour
- Best alternative choice from existing
- Attribute values determined separately
- Indirect utility measure: observe outcome
  $U_m^r = \lambda^r \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^r$

Revealed Preference Choice
- Disaggregate estimation provides
  $U_m^r = \sum_k \alpha'^{r}_{m,k} x_{m,k} + \beta'^{r}_m$
  with
  $\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$
  $\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$

Stated Preference Ranking
- Stated behaviour
- Ranking alternatives from presented set
- Attribute values indicated
- Indirect utility measure: observe outcome
  $U_m^s = \lambda^s \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^s$

Stated Preference Ranking
- Disaggregate (exploded) estimation provides
  $U_m^s = \sum_k \alpha'^{s}_{m,k} x_{m,k} + \beta'^{s}_m$
  with
  $\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$
  $\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$
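The "exploded" treatment of a ranking can be sketched as follows. This is a minimal illustration of the usual exploded-logit idea, not a description of how ALOGIT implements it: a full ranking of M alternatives is treated as M-1 successive best-choice observations, each made from the alternatives that remain once the higher-ranked ones are removed. The function name `explode_ranking` and the example ranking are hypothetical.

```python
def explode_ranking(ranking):
    """Turn one ranking (best alternative first) into a list of
    (chosen alternative, available choice set) pairs."""
    observations = []
    remaining = list(ranking)
    # The last remaining alternative carries no choice information,
    # so stop once only one alternative is left.
    while len(remaining) > 1:
        chosen = remaining[0]
        observations.append((chosen, tuple(remaining)))
        remaining = remaining[1:]
    return observations


# Hypothetical ranking of four alternatives, best first.
for chosen, choice_set in explode_ranking([3, 1, 4, 2]):
    print("chose", chosen, "from", choice_set)
```

With 7 alternatives per observation, as in the testbed described later, each full ranking would yield 6 such pseudo-choices, which is why a full ranking carries more information than a single choice.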

Revealed Preference Ratings
- Stated values for selected and perhaps also unselected alternatives
- Providing 0 to 10 score with associated descriptors
  - 10 = excellent; 5 = reasonable; 0 = terrible
- Attribute values determined separately
- Direct utility measure (scaled?)
  $R_m^g = \theta^g \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^g$

Revealed Preference Ratings
- Regression estimation provides
  $R_m^g = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_m$
  with
  $\alpha'^{g}_{m,k} = \theta^g \alpha_{m,k}$
  $\beta'^{g}_m = \theta^g \beta_m + \beta_m^g$
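A minimal sketch of the regression step, assuming a single attribute and ordinary least squares (the talk reports using MINITAB; this is only an illustration of the relations above). All numeric values (`THETA_G`, `ALPHA`, `BETA`, `BETA_G`, the attribute distribution and the noise level) are hypothetical. The fit returns the products $\theta^g \alpha$ and $\theta^g \beta + \beta^g$; it cannot separate $\theta^g$ from $\alpha$ on its own.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)
# Hypothetical values, chosen only for illustration.
THETA_G, ALPHA, BETA, BETA_G = 0.5, -0.2, 1.0, 5.0

xs = [random.gauss(10.0, 3.0) for _ in range(5000)]
# Rating = theta * (alpha * x + beta) + beta_g + noise
ys = [THETA_G * (ALPHA * x + BETA) + BETA_G + random.gauss(0.0, 0.5) for x in xs]

alpha_prime, beta_prime = fit_line(xs, ys)
print("estimated alpha':", alpha_prime, "target:", THETA_G * ALPHA)
print("estimated beta': ", beta_prime, "target:", THETA_G * BETA + BETA_G)
```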

Stated Preference Ratings
- Stated values for each of set of alternatives
- Providing 0 to 10 score with associated descriptors
  - 10 = excellent; 5 = reasonable; 0 = terrible
- Attribute values indicated
- Provides verification of rankings
- Direct utility measure (scaled?)
  $R_m^h = \theta^h \left[ \sum_k \alpha_{m,k} x_{m,k} + \beta_m \right] + \beta_m^h$

Stated Preference Ratings
- Regression estimation provides
  $R_m^h = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_m$
  with
  $\alpha'^{h}_{m,k} = \theta^h \alpha_{m,k}$
  $\beta'^{h}_m = \theta^h \beta_m + \beta_m^h$

Estimation Testbed
- Specify true parameter values ($\alpha_{m,k}$ and $\beta_m$)
- Generate synthetic observations
  - Assume attribute values and error distributions
  - Sample to get specific error values
  - Calculate utility values using attribute values, true parameter values and error values
  - Develop RP choice observations and SP ranking observations using utility values
  - Develop RP ratings observations and SP ratings observations by scaling utility values to fit within 0 to 10 range
- Test estimation techniques in terms of returning to true parameter values
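The generation steps above can be sketched roughly as below. This is an illustrative outline only: the parameter values, the attribute distribution and the reduced problem size are hypothetical (the talk uses 7 alternatives and 15,000 observations). The linear min-max rescaling used to fit utilities into the 0 to 10 range is also an assumption, since the transcript does not say how that scaling was done. The error standard deviations 2.4, 1.5, 2.1 and 1.8 are the ones the talk quotes for the four observation types.

```python
import random

random.seed(1)

N_ALT, N_ATTR, N_OBS = 3, 2, 1000      # hypothetical, reduced sizes
ALPHA = [[-0.10, -0.05]] * N_ALT       # hypothetical true sensitivities
BETA = [0.0, 0.5, -0.5]                # hypothetical true ASCs
MU_X, SIGMA_X = 10.0, 3.0              # hypothetical attribute distribution


def draw_utilities(sigma_e):
    """One observation: U_m = sum_k alpha_mk * x_mk + beta_m + e_m for every m."""
    utils = []
    for m in range(N_ALT):
        x = [random.gauss(MU_X, SIGMA_X) for _ in range(N_ATTR)]
        systematic = sum(a * v for a, v in zip(ALPHA[m], x)) + BETA[m]
        utils.append(systematic + random.gauss(0.0, sigma_e))
    return utils


def best(utils):
    """Index of the highest-utility alternative (a choice observation)."""
    return max(range(len(utils)), key=lambda m: utils[m])


def ranking(utils):
    """Alternative indices ordered from highest to lowest utility."""
    return sorted(range(len(utils)), key=lambda m: utils[m], reverse=True)


def to_ratings(sample):
    """Assumed scaling: linear min-max map of all utilities onto 0..10."""
    lo = min(min(u) for u in sample)
    hi = max(max(u) for u in sample)
    return [[10.0 * ((v - lo) / (hi - lo)) for v in u] for u in sample]


rp_choices = [best(draw_utilities(2.4)) for _ in range(N_OBS)]
sp_rankings = [ranking(draw_utilities(1.5)) for _ in range(N_OBS)]
rp_ratings = to_ratings([draw_utilities(2.1) for _ in range(N_OBS)])
sp_ratings = to_ratings([draw_utilities(1.8) for _ in range(N_OBS)])
```

Each estimation technique can then be run on these samples and its estimates compared with `ALPHA` and `BETA`.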

True Utility Function
$U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m + e_m$

True Parameter Values

Attribute Values sampled from $N(\mu_{m,k}, \sigma_{m,k})$ with [table of values not reproduced]

Error Values
- Sampled from $N(\mu = 0, \sigma_m)$
- $\sigma_m$ varies by observation type:
  - RP Choice: $\sigma_m = \sigma^r_m = 2.4$
  - SP Rankings: $\sigma_m = \sigma^s_m = 1.5$
  - RP Ratings: $\sigma_m = \sigma^g_m = 2.1$
  - SP Ratings: $\sigma_m = \sigma^h_m = 1.8$

Generated Synthetic Samples
- Each of 4 observation types
  - 7 alternatives for each observation (m = 7)
  - Set of 15,000 observations
- Sometimes considered subsets of alternatives, with overlap across observation types, as indicated below

Testbed Estimations
- RP Choice
- SP Rankings
- Joint RP/SP Data
- Ratings
- Combined RP/SP Data and Ratings

RP Choice
- Used ALOGIT software
- Set $\beta'^{r}_{m=1} = 0$ to avoid over-specification
- Provides:
  $\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$
  $\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$
- Know that $\lambda^r = \pi / (\sqrt{6}\, \sigma^r_m) = 0.534$
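As a quick check of the arithmetic, the scale relation quoted above, $\lambda = \pi / (\sqrt{6}\,\sigma)$, can be evaluated directly. It is the usual relation between the logit scale and the standard deviation of a Gumbel error term. The helper name `logit_scale` is hypothetical; the standard deviations 2.4 (RP choice) and 1.5 (SP rankings) are the ones given in the talk.

```python
import math


def logit_scale(sigma):
    """Logit scale parameter implied by an error standard deviation sigma."""
    return math.pi / (math.sqrt(6.0) * sigma)


print(round(logit_scale(2.4), 3))  # RP choice:   0.534
print(round(logit_scale(1.5), 3))  # SP rankings: 0.855
```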

$\rho^2_0 =$ [value not reproduced]; $\rho^2_c =$ [value not reproduced]

RP Choice Selection frequencies and ASC estimates

RP Choice 2 Selection frequencies and ASC estimates

$\rho^2_0 =$ [value not reproduced]; $\rho^2_c =$ [value not reproduced]

RP Choice 3 Selection frequencies and ASC estimates

$\rho^2_0 =$ [value not reproduced]; $\rho^2_c =$ [value not reproduced]

SP Rankings
- Used ALOGIT software
- Set $\beta'^{s}_{m=1} = 0$ to avoid over-specification
- Provides:
  $\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$
  $\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$
- Know that $\lambda^s = \pi / (\sqrt{6}\, \sigma^s_m) = 0.855$

SP Rankings
- More information with full ranking
- Also confirm against RP above
- ‘ranking version’ available
- estimate using full ranking

RP Rankings: Estimates vs True Values with 15,000 observations [plot: observed vs estimated]

SP vs RP Rankings: ASC translated en bloc to some extent

SP Rankings: Role of $\sigma_{m,k}$
- Impact of changing $\sigma_{m,k}$ used when synthesizing attribute values
  - Sampling from $N(\mu_{m,k}, \sigma_{m,k})$
  - Different $\sigma_{m,k}$ means different spreads on attribute values
  - Impacts relative size of $\sigma^s_m$
- Implications for SP survey design

Attribute Values sampled from $N(\mu_{m,k}, \sigma_{m,k})$ with [table of values not reproduced]

SP Rankings: Role of $\sigma_{m,k}$
- Increasing $\sigma_{m,k}$ improves estimators
  - Roughly proportional
  - Ratio of $\beta_m$ to $\alpha_{m,k}$ maintained
- Use $1.00 \cdot \alpha_{m,k}$ in remaining work here
- Implications for SP survey design
  - More variation in attribute values is better

Joint RP/SP Data
- Two basic approaches for $\alpha_{m,k}$
  - Sequential (Hensher)
    - First estimate $\alpha'^{s}_{m,k}$ using SP observations;
    - Then estimate $\alpha'^{r}_{m,k}$ using RP observations, also forcing ratios among $\alpha'^{r}_{m,k}$ to match those obtained first for $\alpha'^{s}_{m,k}$
  - Simultaneous (Ben-Akiva; Morikawa; Daly; Bradley)
    - Estimate $\alpha'^{r}_{m,k}$ using RP observations and $\alpha'^{r}_{m,k}$ using SP observations and $(\lambda^s / \lambda^r)$ altogether, where $(\lambda^s / \lambda^r)\, \alpha'^{r}_{m,k}$ is used in place of $\alpha'^{s}_{m,k}$
- Little consensus on approach for $\beta_m$

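Joint RP/SP Data
- Used ALOGIT software
- Set $\beta'^{s}_{m=1} = 0$ and $\beta'^{r}_{m=1} = 0$ to avoid over-specification
- Provides:
  $\alpha'^{s}_{m,k} = \lambda^s \alpha_{m,k}$
  $\alpha'^{r}_{m,k} = \lambda^r \alpha_{m,k}$
  $\beta'^{s}_m = \lambda^s \beta_m + \beta_m^s$
  $\beta'^{r}_m = \lambda^r \beta_m + \beta_m^r$
  $\lambda^r / \lambda^s$
- Know that $\lambda^r =$ [value missing] and $\lambda^s = 1.166$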

Joint RP/SP Ranking Estimation for Full Set of RP and SP 15,000 Observations (7 Alternatives for each) [plot: observed vs estimated]

Joint RP/SP Ranking Estimation with 15,000 RP Observations for Alternatives 1-4 and 15,000 SP Observations for Alternatives [range not reproduced] [plot: observed vs estimated]

RP Ratings
- Two potential interpretations of ratings
  - Value provided is a (scaled?) direct utility
  - Value provided is 10x probability of selection
- Issue of reference
  - ‘excellent’ in terms of other people’s travel
  - ‘excellent’ relative to other alternatives for respondent specifically
  - Related to interpretation above
- Here: use direct utility interpretation, and thus reference is in terms of other people’s travel

RP Ratings
- Used MINITAB MLE
- Provides:
  $\alpha'^{g}_{m,k} = \theta^g \alpha_{m,k}$
  $\beta'^{g}_m = \theta^g \beta_m + \beta_m^g$

Estimation of Plotted RP Ratings Values
- $\theta^g$ is found by minimizing the squared error between the estimated sensitivities ($\theta^g \alpha_{m,k}$) and the true values $\alpha_{m,k}$
- The estimated values for $\beta_m$ are then found using $(\beta'^{g}_m - \beta^{g,\min}_m) / \theta^g$ with the above-determined value for $\theta^g$
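One way to read the two steps above is sketched below, under stated assumptions. The scale $\theta^g$ is taken as the least-squares solution of $\alpha'^{g}_{m,k} \approx \theta^g \alpha_{m,k}$, which has the closed form $\theta = \sum \alpha' \alpha / \sum \alpha^2$. The ASCs are then recovered by subtracting an offset and dividing by $\theta$. The function names, the example numbers and the way the offset (the $\beta^{g,\min}_m$ term) is supplied are all hypothetical; the transcript does not define that term precisely.

```python
def fit_theta(alpha_est, alpha_true):
    """Least-squares scale: minimizes sum((a_est - theta * a_true)^2)."""
    num = sum(a_e * a_t for a_e, a_t in zip(alpha_est, alpha_true))
    den = sum(a_t * a_t for a_t in alpha_true)
    return num / den


def recover_beta(beta_est, offset, theta):
    """Apply (beta' - offset) / theta to each estimated constant."""
    return [(b - offset) / theta for b in beta_est]


# Hypothetical numbers: true sensitivities and regression estimates
# that are exactly 0.5 times the true values.
alpha_true = [-0.10, -0.05, -0.20]
alpha_est = [-0.05, -0.025, -0.10]
theta = fit_theta(alpha_est, alpha_true)
print("theta:", theta)
print("recovered ASCs:", recover_beta([5.0, 5.25, 4.75], offset=5.0, theta=theta))
```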

SP Ratings
- Used MINITAB
- Provides:
  $\alpha'^{h}_{m,k} = \theta^h \alpha_{m,k}$
  $\beta'^{h}_m = \theta^h \beta_m + \beta_m^h$

Estimation of Plotted SP Ratings Values
- $\theta^h$ is found by minimizing the squared error between the estimated sensitivities ($\theta^h \alpha_{m,k}$) and the true values $\alpha_{m,k}$
- The estimated values for $\beta_m$ are then found using $(\beta'^{h}_m - \beta^{h,\min}_m) / \theta^h$ with the above-determined value for $\theta^h$

Combined RP/SP Data and Ratings
- Purpose-built software
- Log-Likelihood function:
  $L = \sum_k \mathrm{Prob}(m^{s*}) + \sum_k \mathrm{Prob}(m^{r*}) - w^g \sum_k \sum_m \left( R^{g,\mathrm{obs}}_m - R^{g,\mathrm{mod}}_m \right)^2 - w^h \sum_k \sum_m \left( R^{h,\mathrm{obs}}_m - R^{h,\mathrm{mod}}_m \right)^2$
  where:
  - $m^{r*}$ = selected alternative in RP observation
  - $m^{s*}$ = selected alternative in SP observation
  - $\mathrm{Prob}(m)$ = probability model assigns to alternative $m$

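Combined RP/SP Data and Ratings
$\mathrm{Prob}(m^{s*}) = \dfrac{\exp\left( \left[ \sum_k \alpha'^{s}_{m^*,k} x_{m^*,k} \right] + \beta'^{s}_{m^*} \right)}{\sum_m \exp\left( \left[ \sum_k \alpha'^{s}_{m,k} x_{m,k} \right] + \beta'^{s}_m \right)}$
$\mathrm{Prob}(m^{r*}) = \dfrac{\exp\left( \left[ \sum_k \alpha'^{r}_{m^*,k} x_{m^*,k} \right] + \beta'^{r}_{m^*} \right)}{\sum_m \exp\left( \left[ \sum_k \alpha'^{r}_{m,k} x_{m,k} \right] + \beta'^{r}_m \right)}$
$R^{g,\mathrm{mod}}_m = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_m$
$R^{h,\mathrm{mod}}_m = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_m$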
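A rough sketch of how such a combined objective could be evaluated is given below. It is not the purpose-built software mentioned in the talk. One assumption is labelled loudly: the sketch sums the natural log of each choice probability, on the reading that the slide's "Log-Likelihood" label implies logs, whereas the formula as transcribed writes Prob without a log. The function names and data layout are hypothetical.

```python
import math


def mnl_prob(utils, chosen):
    """Multinomial logit probability of the chosen alternative.
    Utilities are shifted by their maximum for numerical stability."""
    top = max(utils)
    expu = [math.exp(u - top) for u in utils]
    return expu[chosen] / sum(expu)


def combined_objective(sp_obs, rp_obs, rp_ratings, sp_ratings, w_g=1.0, w_h=1.0):
    """sp_obs, rp_obs: lists of (systematic utilities, chosen index).
    rp_ratings, sp_ratings: lists of (observed rating, modelled rating).
    ASSUMPTION: choice terms enter as log-probabilities."""
    value = sum(math.log(mnl_prob(u, c)) for u, c in sp_obs)
    value += sum(math.log(mnl_prob(u, c)) for u, c in rp_obs)
    value -= w_g * sum((obs - mod) ** 2 for obs, mod in rp_ratings)
    value -= w_h * sum((obs - mod) ** 2 for obs, mod in sp_ratings)
    return value


# Tiny hypothetical example: one SP and one RP observation with two
# equally attractive alternatives, plus one rating of each type.
value = combined_objective(
    sp_obs=[([0.0, 0.0], 0)],
    rp_obs=[([0.0, 0.0], 1)],
    rp_ratings=[(5.0, 4.0)],
    sp_ratings=[(7.0, 7.0)],
)
print(value)
```

The weights `w_g` and `w_h` control how strongly the squared rating errors pull the estimates relative to the choice terms. An optimizer would maximize this value over the primed coefficients.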

Combined RP/SP Data and Ratings
- Consider range of results for $\beta^r_m$ for different settings on variables
- Example planned settings
  - Set $\theta^h = 1$
  - Set $w^g$ and $w^h = 1$
  - This ‘anchors’ utilities to values provided in SP Ratings

Conclusions
- Work in progress
  - Not complete, but still discovering things
- $\beta_m$ estimators problematic generally
  - Even with existing alternatives
  - Not as efficient as those for $\alpha_{m,k}$
  - Influenced by variation in attribute values $\sigma_{m,k}$
  - Influenced by frequency of chosen alternatives?
  - T-statistics not a useful guide?
- Ranking (exploded) helps
- Rating also expected to help