Ranking and Rating Data in Joint RP/SP Estimation, by JD Hunt (University of Calgary) and M Zhong (University of Calgary). PROCESSUS Second International Colloquium, Toronto ON, Canada, June 2005.
Overview: Introduction (Context; Motivations). Definitions (Revealed Preference Choice; Stated Preference Rankings; Revealed Preference Ratings; Stated Preference Ratings). Estimation Testbed (Concept; Synthetic Data Generation; Results so far). Conclusions so far.
Introduction. Context: Common task to estimate logit model utility function for non-existing mode alternatives. Joint RP/SP estimation available: good for sensitivity coefficients; problems with alternative specific constants (ASC). Motivation: Improve situation regarding ASC. Seeking to expand on joint RP/SP estimation: add rating information (0 to 10 scores; direct utility). Increase understanding of issues regarding ASC generally.
Definitions: Revealed Preference Choice; Stated Preference Ranking; Revealed Preference Ratings; Stated Preference Ratings. Linear-in-parameters logit utility function: $U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m$, where $\alpha_{m,k}$ is the sensitivity coefficient and $\beta_m$ is the ASC.
Revealed Preference Choice: Actual behaviour. Best alternative choice from existing. Attribute values determined separately. Indirect utility measure (observe outcome): $U^{r}_{m} = \lambda^{r} [ \sum_k \alpha_{m,k} x_{m,k} + \beta_m ] + \beta^{r}_{m}$
Revealed Preference Choice: Disaggregate estimation provides $U^{r}_{m} = \sum_k \alpha'^{r}_{m,k} x_{m,k} + \beta'^{r}_{m}$ with $\alpha'^{r}_{m,k} = \lambda^{r} \alpha_{m,k}$ and $\beta'^{r}_{m} = \lambda^{r} \beta_m + \beta^{r}_{m}$
Stated Preference Ranking: Stated behaviour. Ranking alternatives from presented set. Attribute values indicated. Indirect utility measure (observe outcome): $U^{s}_{m} = \lambda^{s} [ \sum_k \alpha_{m,k} x_{m,k} + \beta_m ] + \beta^{s}_{m}$
Stated Preference Ranking: Disaggregate (exploded) estimation provides $U^{s}_{m} = \sum_k \alpha'^{s}_{m,k} x_{m,k} + \beta'^{s}_{m}$ with $\alpha'^{s}_{m,k} = \lambda^{s} \alpha_{m,k}$ and $\beta'^{s}_{m} = \lambda^{s} \beta_m + \beta^{s}_{m}$
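The "exploded" treatment of a ranking can be sketched as follows: a full ranking of M alternatives is read as M-1 successive best-choice observations, each made from the alternatives that remain once the higher-ranked ones are removed. This is a minimal illustration only, not ALOGIT's implementation; the function names (`explode_ranking`, `ranking_log_likelihood`) and the utility values are hypothetical.

```python
import math
from itertools import permutations


def explode_ranking(ranking):
    """Turn a ranking (best first) into successive (chosen, choice_set) pairs."""
    return [(ranking[i], list(ranking[i:])) for i in range(len(ranking) - 1)]


def logit_prob(chosen, choice_set, utility):
    """Multinomial logit probability of `chosen` within `choice_set`."""
    denom = sum(math.exp(utility[m]) for m in choice_set)
    return math.exp(utility[chosen]) / denom


def ranking_log_likelihood(ranking, utility):
    """Log-likelihood of one full ranking under the exploded logit."""
    return sum(math.log(logit_prob(c, s, utility))
               for c, s in explode_ranking(ranking))


# Hypothetical systematic utilities for three alternatives
utility = {1: 0.5, 2: 0.0, 3: -1.0}
pairs = explode_ranking([1, 2, 3])
ll = ranking_log_likelihood([1, 2, 3], utility)

# Probabilities over all possible rankings sum to one
total = sum(math.exp(ranking_log_likelihood(list(p), utility))
            for p in permutations(utility))
```

Each ranking therefore contributes several pseudo-observations to the likelihood, which is the sense in which a full ranking carries more information than a single best choice.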
Revealed Preference Ratings: Stated values for selected and perhaps also unselected alternatives. Providing 0 to 10 score with associated descriptors (10 = excellent; 5 = reasonable; 0 = terrible). Attribute values determined separately. Direct utility measure (scaled?): $R^{g}_{m} = \theta^{g} [ \sum_k \alpha_{m,k} x_{m,k} + \beta_m ] + \beta^{g}_{m}$
Revealed Preference Ratings: Regression estimation provides $R^{g}_{m} = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_{m}$ with $\alpha'^{g}_{m,k} = \theta^{g} \alpha_{m,k}$ and $\beta'^{g}_{m} = \theta^{g} \beta_m + \beta^{g}_{m}$
Stated Preference Ratings: Stated values for each of set of alternatives. Providing 0 to 10 score with associated descriptors (10 = excellent; 5 = reasonable; 0 = terrible). Attribute values indicated. Provides verification of rankings. Direct utility measure (scaled?): $R^{h}_{m} = \theta^{h} [ \sum_k \alpha_{m,k} x_{m,k} + \beta_m ] + \beta^{h}_{m}$
Stated Preference Ratings: Regression estimation provides $R^{h}_{m} = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_{m}$ with $\alpha'^{h}_{m,k} = \theta^{h} \alpha_{m,k}$ and $\beta'^{h}_{m} = \theta^{h} \beta_m + \beta^{h}_{m}$
Estimation Testbed: Specify true parameter values ($\alpha_{m,k}$ and $\beta_m$). Generate synthetic observations: assume attribute values and error distributions; sample to get specific error values; calculate utility values using attribute values, true parameter values and error values; develop RP choice observations and SP ranking observations using utility values; develop RP ratings observations and SP ratings observations by scaling utility values to fit within 0 to 10 range. Test estimation techniques in terms of returning to true parameter values.
True Utility Function: $U_m = \sum_k \alpha_{m,k} x_{m,k} + \beta_m + e_m$
True Parameter Values
Attribute Values: sampled from $N(\mu_{m,k}, \sigma_{m,k})$ with
Error Values: sampled from $N(\mu = 0, \sigma_m)$. $\sigma_m$ varies by observation type: RP Choice: $\sigma_m = \sigma^{r}_{m} = 2.4$; SP Rankings: $\sigma_m = \sigma^{s}_{m} = 1.5$; RP Ratings: $\sigma_m = \sigma^{g}_{m} = 2.1$; SP Ratings: $\sigma_m = \sigma^{h}_{m} = 1.8$
Generated Synthetic Samples: Each of 4 observation types. 7 alternatives for each observation (m=7). Set of 15,000 observations. Sometimes considered subsets of alternatives with overall across observation types, as indicated below.
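The generation steps of the testbed can be sketched in a few lines of Python. Only the four error standard deviations come from the slides; the parameter values, attribute means and spreads, the use of 3 alternatives and 1,000 observations, and the linear clip-and-rescale mapping of utilities onto the 0 to 10 range (`U_LOW`, `U_HIGH`) are all hypothetical choices made for illustration.

```python
import random

# Error standard deviations by observation type (from the Error Values slide)
SIGMA = {"rp_choice": 2.4, "sp_ranking": 1.5, "rp_rating": 2.1, "sp_rating": 1.8}

# Hypothetical values for illustration only (3 alternatives, 2 attributes)
ALPHA = [[-0.05, -0.10], [-0.05, -0.10], [-0.04, -0.12]]  # alpha_{m,k}
BETA = [0.0, 0.5, -0.3]                                    # beta_m
ATTR_MEAN = [[20.0, 5.0], [30.0, 3.0], [25.0, 4.0]]        # mu_{m,k}
ATTR_SD = [[5.0, 1.0], [5.0, 1.0], [5.0, 1.0]]             # sigma_{m,k}
U_LOW, U_HIGH = -12.0, 6.0  # assumed utility range mapped onto 0..10


def draw_utilities(rng, sigma):
    """Sample attributes and errors; return (attributes, utilities)."""
    xs, us = [], []
    for m in range(len(BETA)):
        x = [rng.gauss(mu, sd) for mu, sd in zip(ATTR_MEAN[m], ATTR_SD[m])]
        u = sum(a * v for a, v in zip(ALPHA[m], x)) + BETA[m] + rng.gauss(0.0, sigma)
        xs.append(x)
        us.append(u)
    return xs, us


def best_choice(us):
    """Index of the alternative with the highest utility."""
    return max(range(len(us)), key=lambda m: us[m])


def full_ranking(us):
    """Alternative indices ordered from highest to lowest utility."""
    return sorted(range(len(us)), key=lambda m: us[m], reverse=True)


def to_ratings(us):
    """Rescale utilities linearly onto 0..10, clipping at the ends (assumed mapping)."""
    return [min(10.0, max(0.0, 10.0 * (u - U_LOW) / (U_HIGH - U_LOW))) for u in us]


def make_observations(kind, n, rng):
    """Generate n (attributes, outcome) pairs for one observation type."""
    obs = []
    for _ in range(n):
        xs, us = draw_utilities(rng, SIGMA[kind])
        if kind == "rp_choice":
            outcome = best_choice(us)
        elif kind == "sp_ranking":
            outcome = full_ranking(us)
        else:
            outcome = to_ratings(us)
        obs.append((xs, outcome))
    return obs


rng = random.Random(2005)
samples = {kind: make_observations(kind, 1000, rng) for kind in SIGMA}
```

Keeping the attribute values alongside each outcome matters because any estimator run on these samples needs both.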
Testbed Estimations: RP Choice; SP Rankings; Joint RP/SP Data; Ratings; Combined RP/SP Data and Ratings
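RP Choice: Used ALOGIT software. Set $\beta'^{r}_{m=1} = 0$ to avoid over-specification. Provides: $\alpha'^{r}_{m,k} = \lambda^{r} \alpha_{m,k}$ and $\beta'^{r}_{m} = \lambda^{r} \beta_m + \beta^{r}_{m}$. Know that $\lambda^{r} = \pi / ( \sqrt{6}\, \sigma^{r}_{m} ) = 0.534$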
RP Choice: Selection frequencies and ASC estimates ($\rho^2_0$ and $\rho^2_c$ values missing)
RP Choice 2: Selection frequencies and ASC estimates ($\rho^2_0$ and $\rho^2_c$ values missing)
RP Choice 3: Selection frequencies and ASC estimates ($\rho^2_0$ and $\rho^2_c$ values missing)
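SP Rankings: Used ALOGIT software. Set $\beta'^{s}_{m=1} = 0$ to avoid over-specification. Provides: $\alpha'^{s}_{m,k} = \lambda^{s} \alpha_{m,k}$ and $\beta'^{s}_{m} = \lambda^{s} \beta_m + \beta^{s}_{m}$. Know that $\lambda^{s} = \pi / ( \sqrt{6}\, \sigma^{s}_{m} ) = 0.855$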
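The two scale values quoted for RP Choice and SP Rankings follow directly from the error standard deviations on the Error Values slide (2.4 and 1.5). A short check, with `logit_scale` as a hypothetical helper name:

```python
import math


def logit_scale(sigma):
    """Logit scale factor lambda = pi / (sqrt(6) * sigma)."""
    return math.pi / (math.sqrt(6.0) * sigma)


lambda_r = logit_scale(2.4)   # RP Choice, rounds to 0.534
lambda_s = logit_scale(1.5)   # SP Rankings, rounds to 0.855
ratio = lambda_s / lambda_r   # equals 2.4 / 1.5 = 1.6
```

Because both factors share the constant $\pi/\sqrt{6}$, the ratio $\lambda^{s}/\lambda^{r}$ depends only on the ratio of the two standard deviations.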
SP Rankings: More information with full ranking. Also confirm against RP above: 'ranking version' available; estimate using full ranking.
RP Rankings: Estimates vs True Values with 15,000 observations (observed vs estimated)
SP vs RP Rankings: ASC translated en bloc to some extent
SP Rankings: Role of $\sigma_{m,k}$. Impact of changing $\sigma_{m,k}$ used when synthesizing attribute values: sampling from $N(\mu_{m,k}, \sigma_{m,k})$; different $\sigma_{m,k}$ means different spreads on attribute values; impacts relative size of $\sigma^{s}_{m}$. Implications for SP survey design.
Attribute Values: sampled from $N(\mu_{m,k}, \sigma_{m,k})$ with
SP Rankings: Role of $\sigma_{m,k}$. Increasing $\sigma_{m,k}$ improves estimators: roughly proportional; ratio of $\beta_m$ to $\alpha_{m,k}$ maintained. Use 1.00 · $\alpha_{m,k}$ in remaining work here. Implications for SP survey design: more variation in attribute values is better.
Joint RP/SP Data: Two basic approaches for $\alpha_{m,k}$. Sequential (Hensher): first estimate $\alpha'^{s}_{m,k}$ using SP observations; then estimate $\alpha'^{r}_{m,k}$ using RP observations, also forcing ratios among $\alpha'^{r}_{m,k}$ to match those obtained first for $\alpha'^{s}_{m,k}$. Simultaneous (Ben-Akiva; Morikawa; Daly; Bradley): estimate $\alpha'^{r}_{m,k}$ using RP observations and SP observations together with $(\lambda^{s}/\lambda^{r})$, where $(\lambda^{s}/\lambda^{r})\, \alpha'^{r}_{m,k}$ is used in place of $\alpha'^{s}_{m,k}$. Little consensus on approach for $\beta_m$.
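Joint RP/SP Data: Used ALOGIT software. Set $\beta'^{s}_{m=1} = 0$ and $\beta'^{r}_{m=1} = 0$ to avoid over-specification. Provides: $\alpha'^{s}_{m,k} = \lambda^{s} \alpha_{m,k}$; $\alpha'^{r}_{m,k} = \lambda^{r} \alpha_{m,k}$; $\beta'^{s}_{m} = \lambda^{s} \beta_m + \beta^{s}_{m}$; $\beta'^{r}_{m} = \lambda^{r} \beta_m + \beta^{r}_{m}$; $\lambda^{r}/\lambda^{s}$. Know that $\lambda^{r}$ = and $\lambda^{s}$ = 1.166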
Joint RP/SP Ranking Estimation for full set of 15,000 RP and SP observations, 7 alternatives for each (observed vs estimated)
Joint RP/SP Ranking Estimation with 15,000 RP Observations for Alternatives 1-4 and 15,000 SP Observations for Alternatives (observed vs estimated)
RP Ratings: Two potential interpretations of ratings: (1) value provided is a (scaled?) direct utility; (2) value provided is 10x probability of selection. Issue of reference: 'excellent' in terms of other people's travel, or 'excellent' relative to other alternatives for respondent specifically; related to interpretation above. Here: use direct utility interpretation, and thus reference is in terms of other people's travel.
RP Ratings: Used MINITAB MLE. Provides: $\alpha'^{g}_{m,k} = \theta^{g} \alpha_{m,k}$ and $\beta'^{g}_{m} = \theta^{g} \beta_m + \beta^{g}_{m}$
Estimation of Plotted RP Ratings Values: $\theta^{g}$ is found by minimizing the squared error between the estimated sensitivities ($\theta^{g} \alpha_{m,k}$) and the true values $\alpha_{m,k}$. The estimated values for $\beta_m$ are then found using $(\beta'^{g}_{m} - \beta^{g}_{m,\min}) / \theta^{g}$ with the above-determined value for $\theta^{g}$.
SP Ratings: Used MINITAB. Provides: $\alpha'^{h}_{m,k} = \theta^{h} \alpha_{m,k}$ and $\beta'^{h}_{m} = \theta^{h} \beta_m + \beta^{h}_{m}$
Estimation of Plotted SP Ratings Values: $\theta^{h}$ is found by minimizing the squared error between the estimated sensitivities ($\theta^{h} \alpha_{m,k}$) and the true values $\alpha_{m,k}$. The estimated values for $\beta_m$ are then found using $(\beta'^{h}_{m} - \beta^{h}_{m,\min}) / \theta^{h}$ with the above-determined value for $\theta^{h}$.
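One way to read the scale-recovery step used for both RP and SP ratings is as a one-parameter least-squares fit. The sketch below rests on two assumptions that the slides do not spell out: that the quantity minimized is $\sum_k (\alpha'_{k} - \theta \alpha_{k})^2$, which gives the closed form $\theta = \sum_k \alpha'_{k} \alpha_{k} / \sum_k \alpha_{k}^2$, and that the subtracted "min" term is the smallest estimated intercept across alternatives. The function names and all numbers are hypothetical.

```python
def fit_theta(alpha_est, alpha_true):
    """Least-squares theta minimizing sum((alpha_est - theta * alpha_true)^2)."""
    num = sum(a_e * a_t for a_e, a_t in zip(alpha_est, alpha_true))
    den = sum(a_t * a_t for a_t in alpha_true)
    return num / den


def recover_beta(beta_est, offset, theta):
    """Shift the estimated intercepts by `offset`, then divide by theta."""
    return [(b - offset) / theta for b in beta_est]


# Hypothetical true sensitivities and a hypothetical rating scale
alpha_true = [-0.05, -0.10, 0.02]
theta_true = 0.8
alpha_est = [theta_true * a for a in alpha_true]  # noise-free "estimates"

theta = fit_theta(alpha_est, alpha_true)

# Hypothetical estimated intercepts; offset assumed to be their minimum
beta_est = [5.0, 5.4, 4.76]
beta = recover_beta(beta_est, min(beta_est), theta)
```

With noise-free inputs the fit returns the scale exactly; with real estimates it returns the scale that best aligns the estimated and true sensitivities under the assumed criterion.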
Combined RP/SP Data and Ratings: Purpose-built software. Log-Likelihood function: $L = \sum_k \mathrm{Prob}(m^{s*}) + \sum_k \mathrm{Prob}(m^{r*}) - w^{g} \sum_k \sum_m (R^{g,\mathrm{obs}}_{m} - R^{g,\mathrm{mod}}_{m})^2 - w^{h} \sum_k \sum_m (R^{h,\mathrm{obs}}_{m} - R^{h,\mathrm{mod}}_{m})^2$ where: $m^{r*}$ = selected alternative in RP observation; $m^{s*}$ = selected alternative in SP observation; $\mathrm{Prob}(m)$ = probability model assigns to alternative $m$
Combined RP/SP Data and Ratings: $\mathrm{Prob}(m^{s*}) = \exp( [\sum_k \alpha'^{s}_{m^{*},k} x_{m^{*},k}] + \beta'^{s}_{m^{*}} ) / \sum_m \exp( [\sum_k \alpha'^{s}_{m,k} x_{m,k}] + \beta'^{s}_{m} )$; $\mathrm{Prob}(m^{r*}) = \exp( [\sum_k \alpha'^{r}_{m^{*},k} x_{m^{*},k}] + \beta'^{r}_{m^{*}} ) / \sum_m \exp( [\sum_k \alpha'^{r}_{m,k} x_{m,k}] + \beta'^{r}_{m} )$; $R^{g,\mathrm{mod}}_{m} = \sum_k \alpha'^{g}_{m,k} x_{m,k} + \beta'^{g}_{m}$; $R^{h,\mathrm{mod}}_{m} = \sum_k \alpha'^{h}_{m,k} x_{m,k} + \beta'^{h}_{m}$
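The combined objective can be sketched as a single function of the four parameter sets. This is not the purpose-built software: it is a minimal illustration with hypothetical names and data layout, and it sums log-probabilities for the choice terms, which is an assumption (the slide writes Prob rather than ln Prob).

```python
import math


def linear_value(alpha, beta, x):
    """sum_k alpha_k * x_k + beta for one alternative."""
    return sum(a * v for a, v in zip(alpha, x)) + beta


def logit_prob(chosen, alphas, betas, xs):
    """Logit probability of the chosen alternative."""
    expu = [math.exp(linear_value(alphas[m], betas[m], xs[m]))
            for m in range(len(xs))]
    return expu[chosen] / sum(expu)


def rating_sse(observed, alphas, betas, xs):
    """Sum over alternatives of squared (observed - modelled) rating."""
    return sum((observed[m] - linear_value(alphas[m], betas[m], xs[m])) ** 2
               for m in range(len(xs)))


def combined_objective(params, data, w_g=1.0, w_h=1.0):
    """params[t] = (alphas, betas) for t in 's', 'r', 'g', 'h'.
    data['s'] and data['r'] hold (chosen, xs) pairs;
    data['g'] and data['h'] hold (observed_ratings, xs) pairs."""
    total = 0.0
    for t in ("s", "r"):
        alphas, betas = params[t]
        for chosen, xs in data[t]:
            # Assumption: log of the probability, not the probability itself
            total += math.log(logit_prob(chosen, alphas, betas, xs))
    for t, w in (("g", w_g), ("h", w_h)):
        alphas, betas = params[t]
        for observed, xs in data[t]:
            total -= w * rating_sse(observed, alphas, betas, xs)
    return total


# Tiny hypothetical example: 2 alternatives, 1 attribute, all parameters zero
zero = ([[0.0], [0.0]], [0.0, 0.0])
params = {t: zero for t in "srgh"}
xs = [[1.0], [2.0]]
data = {"s": [(0, xs)], "r": [(1, xs)],
        "g": [([1.0, 0.0], xs)], "h": [([0.0, 0.0], xs)]}
value = combined_objective(params, data, w_g=2.0, w_h=1.0)
```

The weights `w_g` and `w_h` control how strongly the rating residuals pull the estimates relative to the choice terms; a general-purpose optimizer would maximize this function over `params`.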
Combined RP/SP Data and Ratings: Consider range of results for $\beta^{r}_{m}$ for different settings on variables. Example planned settings: set $\theta^{h} = 1$; set $w^{g}$ and $w^{h} = 1$. This 'anchors' utilities to values provided in SP Ratings.
Conclusions: Work in progress: not complete, but still discovering things. $\beta_m$ estimators problematic generally: even with existing alternatives; not as efficient as those for $\alpha_{m,k}$; influenced by variation in attribute values ($\sigma_{m,k}$); influenced by frequency of chosen alternatives? T-statistics not a useful guide? Ranking (exploded) helps. Rating also expected to help.