Econometrics Chengyuan Yin School of Mathematics.

1 Econometrics Chengyuan Yin School of Mathematics

2 23. Discrete Choice Modeling
Econometrics 23. Discrete Choice Modeling

3 A Microeconomics Platform
Consumers Maximize Utility (!!!)
Fundamental Choice Problem: Maximize U(x1,x2,…) subject to prices and budget constraints
A Crucial Result for the Classical Problem:
  Indirect Utility Function: V = V(p,I)
  Demand System of Continuous Choices
The Integrability Problem: Utility is not revealed by demands

4 Theory for Discrete Choice
Theory is silent about discrete choices
Translation to discrete choice
  Existence of well defined utility indexes: Completeness of rankings
  Rationality: Utility maximization
  Axioms of revealed preferences
  Choice sets and consideration sets – consumers simplify choice situations
Implication for choice among a set of discrete alternatives
Commonalities and uniqueness
  Does this allow us to build “models?”
  What common elements can be assumed?
  How can we account for heterogeneity?
Revealed choices do not reveal utility, only rankings which are scale invariant

5 Choosing Between Two Alternatives
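Modeling the Binary Choice
$U_{i,suv} = \alpha_{suv} + \beta P_{suv} + \gamma_{suv}\,Income + \varepsilon_{i,suv}$
$U_{i,sed} = \alpha_{sed} + \beta P_{sed} + \gamma_{sed}\,Income + \varepsilon_{i,sed}$
Chooses SUV: $U_{i,suv} > U_{i,sed}$, i.e. $U_{i,suv} - U_{i,sed} > 0$
$(\alpha_{suv} - \alpha_{sed}) + \beta(P_{suv} - P_{sed}) + (\gamma_{suv} - \gamma_{sed})\,Income + \varepsilon_{i,suv} - \varepsilon_{i,sed} > 0$
$\varepsilon_i > -[\alpha + \beta(P_{suv} - P_{sed}) + \gamma\,Income]$
where $\alpha = \alpha_{suv} - \alpha_{sed}$, $\gamma = \gamma_{suv} - \gamma_{sed}$ and $\varepsilon_i = \varepsilon_{i,suv} - \varepsilon_{i,sed}$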

6 What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)
Are the attributes “relevant?” Predicting behavior Individual Aggregate Analyze changes in behavior when attributes change

7 Application: 210 Commuters Between Sydney and Melbourne
Available modes = Air, Train, Bus, Car
Observed:
  Choice
  Attributes: Cost, terminal time, other
  Characteristics: Household income
First application: Fly or Other

8 Binary Choice Data
Columns: Choose Air | Gen. Cost | Term Time | Income

9 An Econometric Model
Choose to fly iff $U_{fly} > 0$
$U_{fly} = \alpha + \beta_1 Cost + \beta_2 Time + \gamma Income + \varepsilon$
$U_{fly} > 0 \iff \varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)$
Probability model: For any person observed by the analyst,
$Prob(fly) = Prob[\varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)]$
Note the relationship between the unobserved $\varepsilon$ and the outcome
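The probability model on this slide can be sketched numerically. The block below is a minimal illustration, assuming a symmetric distribution for the disturbance so that Prob[ε > -index] = F(index). The coefficient values and the helper name `prob_fly` are hypothetical and are not estimates from the deck.

```python
import math

def logistic_cdf(z):
    """Logistic CDF, the F() behind the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF, the F() behind the probit model."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_fly(cost, time, income, coef, cdf):
    """Prob(fly) = Prob[eps > -index] = F(index) when F is symmetric."""
    alpha, beta1, beta2, gamma = coef
    index = alpha + beta1 * cost + beta2 * time + gamma * income
    return cdf(index)

# Hypothetical coefficients (alpha, beta1, beta2, gamma), for illustration only.
coef = (1.0, -0.01, -0.03, 0.02)
p_logit = prob_fly(100.0, 60.0, 35.0, coef, logistic_cdf)
p_probit = prob_fly(100.0, 60.0, 35.0, coef, normal_cdf)
print(round(p_logit, 4), round(p_probit, 4))
```

The same index gives different probabilities under the two distributions, which is why the choice of F() has to be made before estimation.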

10 $\alpha + \beta_1 Cost + \beta_2 TTime + \gamma Income$

11 Modeling Approaches
Nonparametric – “relationship”
  Minimal Assumptions
  Minimal Conclusions
Semiparametric – “index function”
  Stronger assumptions
  Robust to model misspecification (heteroscedasticity)
  Still weak conclusions
Parametric – “Probability function and index”
  Strongest assumptions – complete specification
  Strongest conclusions
  Possibly less robust. (Not necessarily)

12 Nonparametric P(Air)=f(Income)

13 Semiparametric
MSCORE: Find b so that the sum over observations of sign(b’x) * sign(y) is maximized, with y coded as -1/+1.
Klein and Spady: Find b to maximize a semiparametric likelihood of G(b’x)
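The maximum score idea can be illustrated with a small grid search. This is a sketch under stated assumptions, not the MSCORE routine used for the deck's results: y is coded 0/1 and mapped to -1/+1 through 2y - 1, the coefficient vector is normalized to unit length (scale is not identified), and the toy data and function names are hypothetical.

```python
import math

def score(b, X, y):
    """Mean of sign(b'x) * (2y - 1): +1 for each correct sign, -1 otherwise."""
    total = 0
    for xi, yi in zip(X, y):
        index = sum(bk * xk for bk, xk in zip(b, xi))
        s = 1 if index >= 0 else -1
        total += s * (2 * yi - 1)
    return total / len(y)

def max_score_grid(X, y, steps=360):
    """Search unit-length b = (cos t, sin t) over a grid of angles."""
    best_b, best_s = None, -2.0
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        b = (math.cos(theta), math.sin(theta))
        s = score(b, X, y)
        if s > best_s:
            best_b, best_s = b, s
    return best_b, best_s

# Toy data: a constant and one regressor; y = 1 when the regressor is positive.
X = [(1.0, -2.0), (1.0, -1.0), (1.0, 1.0), (1.0, 2.0)]
y = [0, 0, 1, 1]
b_hat, s_hat = max_score_grid(X, y)
print(b_hat, s_hat)
```

Many values of b reach the same score, which is one way to see why the individual coefficients carry little meaning on their own.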

14 MSCORE

15 Klein and Spady Semiparametric
Note necessary normalizations. Coefficients are not very meaningful.

16 Parametric: Logit Model

17 Logit vs. MScore
Logit fits worse
MScore fits better, coefficients are meaningless

18 Parametric Model Estimation
How to estimate $\alpha, \beta_1, \beta_2, \gamma$?
It’s not regression
The technique of maximum likelihood
$Prob[y=1] = Prob[\varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)]$
$Prob[y=0] = 1 - Prob[y=1]$
Requires a model for the probability
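As a rough sketch of what maximum likelihood maximizes here, the block below evaluates the binary choice log likelihood, the sum of y log P + (1 - y) log(1 - P), assuming a logistic F(). The data rows and parameter values are made up for illustration, and the function name is hypothetical.

```python
import math

def logit_loglik(params, X, y):
    """Log likelihood: sum of y*log(P) + (1-y)*log(1-P), with P = logistic(params'x)."""
    ll = 0.0
    for xi, yi in zip(X, y):
        index = sum(b * x for b, x in zip(params, xi))
        p = 1.0 / (1.0 + math.exp(-index))
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll

# Toy data: each row is (constant, one regressor); values are illustrative only.
X = [(1.0, 0.5), (1.0, -1.0), (1.0, 2.0), (1.0, -0.3)]
y = [1, 0, 1, 0]

ll_zero = logit_loglik((0.0, 0.0), X, y)   # every P equals 0.5
ll_trial = logit_loglik((0.0, 1.0), X, y)  # a trial parameter vector
print(ll_zero, ll_trial)
```

An optimizer would search over the parameters for the largest value; here the trial vector already improves on the all-zero vector.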

19 Completing the Model: F($\varepsilon$)
The distribution
  Normal: PROBIT, natural for behavior
  Logistic: LOGIT, allows “thicker tails”
  Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
Does it matter?
  Yes, large difference in estimates
  Not much, quantities of interest are more stable.

20 Underlying Probability Distributions for Binary Choice

21 Estimated Binary Choice (Probit) Model
Binomial Probit Model, Maximum Likelihood Estimates
Dependent variable: MODE; Weighting variable: None
Reported statistics: Number of observations, Iterations completed, Log likelihood function, Restricted log likelihood, Chi squared, Degrees of freedom, Prob[ChiSqd > value], Hosmer-Lemeshow chi-squared with P-value and deg.fr.
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Index function for probability: Constant, GC, TTME, HINC
[Numerical values not preserved]

22 Estimated Binary Choice Models
Columns: Variable | LOGIT (Estimate, t-ratio) | PROBIT (Estimate, t-ratio) | EXTREME VALUE (Estimate, t-ratio)
Rows: Constant, GC, TTME, HINC, Log-L, Log-L(0)
[Numerical values not preserved]

23 Effect on Predicted Probability of an Increase in Income
$\alpha + \beta_1 Cost + \beta_2 Time + \gamma(Income + 1)$ ($\gamma$ is positive)

24 Marginal Effects in Probability Models
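Prob[Outcome] = some $F(\alpha + \beta_1 Cost + \dots)$
“Partial effect” = $\partial F(\alpha + \beta_1 Cost + \dots) / \partial$“x” (derivative)
Partial effects are derivatives
Result varies with model
  Logit: $\partial F(\alpha + \beta_1 Cost + \dots) / \partial x = Prob \times (1 - Prob) \times \beta$
  Probit: $\partial F(\alpha + \beta_1 Cost + \dots) / \partial x = \text{Normal density} \times \beta$
Scaling usually erases model differences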
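The two partial effect formulas can be compared directly. The sketch below assumes an index of zero and hypothetical coefficients chosen so that the logit coefficient is about 1.6 times the probit coefficient, a commonly quoted rule of thumb. None of these numbers come from the deck.

```python
import math

def logit_effect(index, beta):
    """Logit partial effect: Prob * (1 - Prob) * beta."""
    p = 1.0 / (1.0 + math.exp(-index))
    return p * (1.0 - p) * beta

def probit_effect(index, beta):
    """Probit partial effect: standard normal density at the index, times beta."""
    density = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    return density * beta

# Hypothetical coefficients: the logit value is about 1.6 times the probit value.
me_logit = logit_effect(0.0, 0.8)
me_probit = probit_effect(0.0, 0.5)
print(me_logit, me_probit)
```

The coefficients differ by 60 percent, yet the implied effects on the probability are nearly identical, which is the sense in which scaling erases model differences.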

25 The Delta Method
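A standard statement of the delta method for partial effects is sketched below as a reminder. The notation is an assumption, since the slide body did not survive: $\hat\beta$ is the coefficient estimate with estimated asymptotic covariance matrix $\hat V$, and $\hat\delta = g(\hat\beta)$ is the vector of partial effects.

```latex
\hat\delta = g(\hat\beta), \qquad
\hat G = \frac{\partial g(\hat\beta)}{\partial \hat\beta'}, \qquad
\text{Est.Asy.Var}[\hat\delta] = \hat G \, \hat V \, \hat G'

% Logit, with \Lambda = \Lambda(\beta'x):
\hat\delta = \Lambda(1-\Lambda)\,\beta, \qquad
\hat G = \Lambda(1-\Lambda)\left[ I + (1 - 2\Lambda)\,\beta x' \right]

% Probit, with \phi = \phi(\beta'x):
\hat\delta = \phi\,\beta, \qquad
\hat G = \phi\left[ I - (\beta'x)\,\beta x' \right]
```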

26 Marginal Effects for Binary Choice
Logit Probit

27 Estimated Marginal Effects
t-ratios of the estimated marginal effects (the estimates themselves are not preserved):
Variable   Logit    Probit   Extreme Value
GC          3.267    3.466    3.354
TTME       -5.042   -5.754   -4.871
HINC        2.193    2.532    2.064

28 Marginal Effect for a Dummy Variable
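$Prob[y_i = 1|x_i,d_i] = F(\beta'x_i + \gamma d_i)$ = conditional mean
Marginal effect of d: $Prob[y_i = 1|x_i,d_i=1] - Prob[y_i = 1|x_i,d_i=0]$
Logit: $\Lambda(\beta'x_i + \gamma) - \Lambda(\beta'x_i)$, where $\Lambda(z) = 1/(1+e^{-z})$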

29 (Marginal) Effect – Dummy Variable
HighIncm = 1(Income > 50)
Partial derivatives of probabilities with respect to the vector of characteristics. They are computed at the means of the Xs. Observations used are All Obs.
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Characteristics in numerator of Prob[Y = 1]: Constant, GC, TTME
Marginal effect for dummy variable is P|1 - P|0: HIGHINCM
[Numerical values not preserved]

30 Computing Effects
Compute at the data means?
  Simple
  Inference is well defined
Average the individual effects
  More appropriate?
  Asymptotic standard errors. (Not done correctly in the literature – terms are correlated!)
Is testing about marginal effects meaningful?
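The two ways of computing effects generally give different numbers. The following sketch contrasts the partial effect at the data means with the average of the individual effects for a logit model. The income values and the coefficients `a` and `b` are hypothetical.

```python
import math

def logit_density(index):
    """Lambda(index) * (1 - Lambda(index)), the logit scale factor."""
    p = 1.0 / (1.0 + math.exp(-index))
    return p * (1.0 - p)

# Hypothetical sample of incomes and hypothetical coefficients (constant a, slope b).
incomes = [10.0, 20.0, 30.0, 40.0, 80.0]
a, b = -2.0, 0.05

# Partial effect evaluated at the mean of the data.
mean_income = sum(incomes) / len(incomes)
effect_at_means = logit_density(a + b * mean_income) * b

# Average partial effect: evaluate the effect for each observation, then average.
average_effect = sum(logit_density(a + b * x) * b for x in incomes) / len(incomes)

print(effect_at_means, average_effect)
```

Because the scale factor is nonlinear in the index, averaging the effects is not the same as evaluating the effect at the average.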

31 Average Partial Effects

32 Elasticities
Elasticity = $\partial \log P / \partial \log x = (\partial P / \partial x)(x / P)$
How to compute standard errors?
  Delta method
  Bootstrap
    Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.)
    Bootstrap model estimation?
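An elasticity rescales a partial effect by x/P. The sketch below uses the mean income (34.55) and mean probability (.2716) reported on the next slide, together with a hypothetical partial effect, because the estimated value did not survive in the transcript.

```python
def elasticity(partial_effect, x, prob):
    """Elasticity of the probability with respect to x: (dP/dx) * (x / P)."""
    return partial_effect * x / prob

mean_income = 34.55   # reported on the bootstrap slide
mean_prob = 0.2716    # reported on the bootstrap slide
hypothetical_me = 0.002  # NOT the estimated effect; illustrative only

eta = elasticity(hypothetical_me, mean_income, mean_prob)
print(eta)
```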

33 Estimated Income Elasticity for Air Choice Model
Results of bootstrap estimation of model. Statistics are centered around the original estimate based on the original full sample of observations.
Result is ETA; bootstrap samples have 840 observations.
Reported: Estimate, RtMnSqDev, Skewness, Kurtosis, Minimum, Maximum
Mean Income = 34.55, Mean P = .2716, Estimated ME and Estimated Elasticity [values not preserved]

34 Odds Ratio – Logit Model Only
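Effect Measure? “Effect of a unit change in x on the odds ratio.”
For the logit model, $Odds = P/(1-P) = \exp(\beta'x)$, so a unit change in $x_k$ multiplies the odds by $\exp(\beta_k)$.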
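For the logit model the odds P/(1 - P) equal exp(index), so a one-unit increase in a regressor multiplies the odds by exp(beta). A minimal check, with a hypothetical index and coefficient:

```python
import math

def logit_prob(index):
    return 1.0 / (1.0 + math.exp(-index))

def odds(index):
    """Odds P / (1 - P) implied by a logit index."""
    p = logit_prob(index)
    return p / (1.0 - p)

beta = 0.3    # hypothetical coefficient
index = -0.7  # hypothetical starting index

# Ratio of the odds after and before a one-unit increase in the regressor.
odds_ratio = odds(index + beta) / odds(index)
print(odds_ratio, math.exp(beta))
```

The ratio does not depend on the starting index, which is what makes it a convenient effect measure for the logit model only.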

35 Ordered Outcomes
E.g.: Taste test, credit rating, course grade
Underlying random preferences: Mapping to observed choices
Strength of preferences
Censoring and discrete measurement
The nature of ordered data

36 Modeling Ordered Choices
Random Utility: $U_{it} = \alpha + \beta'x_{it} + \gamma_i'z_{it} + \varepsilon_{it} = a_{it} + \varepsilon_{it}$
Observe outcome j if utility is in region j
Probability of outcome = probability of cell
$Pr[Y_{it} = j] = F(\mu_j - a_{it}) - F(\mu_{j-1} - a_{it})$
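The cell probabilities can be computed directly from the thresholds. This sketch assumes a normal F() (ordered probit), takes the lowest threshold as minus infinity and the highest as plus infinity, and uses hypothetical interior thresholds and index value.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probs(a, cuts, cdf=normal_cdf):
    """Cell probabilities F(mu_j - a) - F(mu_{j-1} - a) for ordered outcomes.

    `cuts` holds the finite thresholds in increasing order; the outer
    thresholds are -infinity and +infinity, so F there is 0 and 1.
    """
    upper = [cdf(c - a) for c in cuts] + [1.0]
    probs, lower = [], 0.0
    for u in upper:
        probs.append(u - lower)
        lower = u
    return probs

# Hypothetical index a and thresholds giving five ordered outcomes (0..4).
probs = ordered_probs(1.8, [0.0, 1.0, 2.0, 3.0])
print([round(p, 4) for p in probs])
```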

37 Health Care Satisfaction (HSAT)
Self administered survey: Health Care Satisfaction? (0 – 10) Continuous Preference Scale

38 Ordered Probability Model

39 Ordered Probabilities

40 Five Ordered Probabilities

41 Coefficients

42 Effects in the Ordered Probability Model
Assume that βk is positive. Assume that xk increases.
β’x increases. μj - β’x shifts to the left for all 5 cells.
  Prob[y=0] decreases
  Prob[y=1] decreases – the mass shifted out is larger than the mass shifted in.
  Prob[y=2] decreases – same reason.
  Prob[y=3] increases.
  Prob[y=4] increases.
When βk > 0, an increase in xk decreases Prob[y=0] and increases Prob[y=J]. Intermediate cells are ambiguous, but there is only one sign change in the marginal effects from 0 to 1 to … to J.
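The single sign change can be verified numerically. The sketch below assumes an ordered probit with five cells, in which the effect on cell j is [f(μ(j-1) - β’x) - f(μj - β’x)] βk with f the normal density. The thresholds, index value, and coefficient are hypothetical and were chosen so the sign pattern matches the one described above.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def ordered_probit_effects(a, cuts, beta_k):
    """Effect of x_k on each cell probability.

    dProb[y=j]/dx_k = [f(mu_{j-1} - a) - f(mu_j - a)] * beta_k, where the
    density is zero at the infinite outer thresholds.
    """
    dens = [0.0] + [normal_pdf(c - a) for c in cuts] + [0.0]
    return [(dens[j] - dens[j + 1]) * beta_k for j in range(len(cuts) + 1)]

# Hypothetical index value, thresholds, and a positive coefficient.
effects = ordered_probit_effects(1.8, [0.0, 1.0, 2.0, 3.0], 1.0)
signs = [1 if e > 0 else -1 for e in effects]
sign_changes = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
print([round(e, 4) for e in effects], sign_changes)
```

The effects sum to zero because the probabilities must continue to sum to one.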

43 Ordered Probability Model for Health Satisfaction
Ordered Probability Model
Dependent variable: HSAT; underlying probabilities based on Normal
Reported: Number of observations; cell frequencies for outcomes (Y, Count, Freq)
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Index function for probability: Constant, FEMALE, EDUC, AGE, HHNINC, HHKIDS
Threshold parameters for index: Mu(1), Mu(2), Mu(3), Mu(4), Mu(5), Mu(6), Mu(7), Mu(8), Mu(9)
[Numerical values not preserved]

44 Ordered Probability Effects
Marginal effects for ordered probability model
M.E.s for dummy variables are Pr[y|x=1]-Pr[y|x=0]. Names for dummy variables are marked by *.
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
These are the effects on Prob[Y=00] at means: *FEMALE, EDUC, AGE, HHNINC, *HHKIDS
These are the effects on Prob[Y=01] at means: *FEMALE, EDUC, AGE, HHNINC, *HHKIDS
... repeated for all 11 outcomes
These are the effects on Prob[Y=10] at means: *FEMALE, EDUC, AGE, HHNINC, *HHKIDS
[Numerical values not preserved]

45 Ordered Probit Marginal Effects

46 Multinomial Choice Among J Alternatives
• Random Utility Basis
  $U_{itj} = \alpha_{ij} + \beta_i'x_{itj} + \gamma_i'z_{it} + \varepsilon_{ijt}$
  i = 1,…,N; j = 1,…,J(i); t = 1,…,T(i)
• Maximum Utility Assumption
  Individual i will choose alternative j in choice setting t iff $U_{itj} > U_{itk}$ for all $k \neq j$.
• Underlying assumptions
  Smoothness of utilities
  Axioms: Transitive, Complete, Monotonic

47 Utility Functions
The linearity assumption and curvature
The choice set
Deterministic and random components: The “model”
Generic vs. alternative specific components
Attributes and characteristics
Coefficients
  Part worths
  Alternative specific constants
Scaling

48 The Multinomial Logit (MNL) Model
Independent extreme value (Gumbel): $F(\varepsilon_{itj}) = 1 - \exp(-\exp(\varepsilon_{itj}))$ (random part of each utility)
Independence across utility functions
Identical variances (means absorbed in constants)
Same parameters for all individuals (temporary)
Implied probabilities for observed outcomes

49 Specifying Probabilities
• Choice specific attributes (X) vary by choices, multiply by generic coefficients. E.g., TTME, GC
• Generic characteristics (Income, constants) must be interacted with choice specific constants. (Else they fall out of the probability)
• Estimation by maximum likelihood; dij = 1 if person i chooses j
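The multinomial logit probabilities have the familiar form P(j) = exp(V(j)) / Σ exp(V(m)), and the log likelihood adds log P(j) for the alternative each person chose. The sketch below uses generic coefficients on GC and TTME plus alternative specific constants. All coefficient and attribute values are hypothetical, not the estimates from the deck.

```python
import math

def mnl_probs(utilities):
    """P_j = exp(V_j) / sum_m exp(V_m); subtracting the max avoids overflow."""
    vmax = max(utilities.values())
    expv = {j: math.exp(v - vmax) for j, v in utilities.items()}
    total = sum(expv.values())
    return {j: e / total for j, e in expv.items()}

# Hypothetical generic coefficients and alternative specific constants (CAR is the base).
beta_gc, beta_ttme = -0.015, -0.09
asc = {"AIR": 5.0, "TRAIN": 3.5, "BUS": 3.0, "CAR": 0.0}

# Hypothetical attributes (GC, TTME) for one traveler.
attrs = {"AIR": (100.0, 60.0), "TRAIN": (120.0, 35.0),
         "BUS": (110.0, 40.0), "CAR": (90.0, 0.0)}

V = {j: asc[j] + beta_gc * gc + beta_ttme * ttme for j, (gc, ttme) in attrs.items()}
P = mnl_probs(V)

# Log likelihood contribution: d_ij = 1 only for the chosen alternative.
chosen = "CAR"
loglik_i = math.log(P[chosen])
print(P, loglik_i)
```

A characteristic such as income that entered every utility with the same coefficient would cancel from the ratio, which is why it must be interacted with the choice specific constants.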

50 Observed Data
Types of Data
  Individual choice
  Market shares
  Frequencies
  Ranks
Attributes and Characteristics
Choice Settings
  Cross section
  Repeated measurement (panel data)

51 Data on Discrete Choices
Columns: Line | MODE | TRAVEL | INVC | INVT | TTME | GC | HINC
Four rows per traveler, one for each of AIR, TRAIN, BUS, CAR (lines 1–8 and 321–328 shown)
[Numerical values not preserved]

52 Estimated MNL Model
Discrete choice (multinomial logit) model, Maximum Likelihood Estimates
Model estimated: Jan 20, 2004 at 03:05:11PM.
Dependent variable: Choice; Weighting variable: None
Reported statistics: Number of observations, Iterations completed, Log likelihood function, R2=1-LogL/LogL* (Log-L fncn, R-sqrd, RsqAdj), Constants only, Chi-squared[ 2], Prob [ chi squared > value ]
Response data are given as ind. choice. Number of obs.= 210, skipped 0 bad obs.
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z]
Variables: GC, TTME, A_AIR, A_TRAIN, A_BUS
[Numerical values not preserved]

57 Model Fit Based on Log Likelihood
Three sets of predicted probabilities
  No model: $P_{ij} = 1/J$ (.25)
  Constants only: $P_{ij} = (1/N)\sum_i d_{ij}$ [(58,63,30,59)/210 = .276, .300, .143, .281]
  Estimated model: Logit probabilities
Compute log likelihood
Measure improvement in log likelihood with R-squared = 1 – LogL/LogL0 (“Adjusted” for number of parameters in the model.)
NOT A MEASURE OF “FIT!”
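The first two log likelihoods need no estimation and follow from the choice counts quoted on this slide (58, 63, 30, 59 out of 210). The sketch below computes them and the implied R-squared for the constants-only model relative to the no-model baseline; the pairing of counts with mode names is an assumption based on the order Air, Train, Bus, Car used elsewhere in the deck.

```python
import math

# Choice counts from the slide; mode labels assumed to follow Air, Train, Bus, Car.
counts = {"AIR": 58, "TRAIN": 63, "BUS": 30, "CAR": 59}
n = sum(counts.values())
J = len(counts)

# No model: every alternative has probability 1/J.
logl_no_model = n * math.log(1.0 / J)

# Constants only: predicted probabilities equal the sample shares.
logl_constants = sum(c * math.log(c / n) for c in counts.values())

def pseudo_r2(logl, logl0):
    """R-squared = 1 - LogL / LogL0."""
    return 1.0 - logl / logl0

r2_constants = pseudo_r2(logl_constants, logl_no_model)
print(logl_no_model, logl_constants, r2_constants)
```

The same function applies to the estimated model once its log likelihood is available, with either baseline in the denominator.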

58 Fit the Model with Only ASCs
Model with only ASCs: Iterations completed, Log likelihood function, R2=1-LogL/LogL* (Log-L fncn, R-sqrd, RsqAdj), Constants only
  Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z]
  Variables: A_AIR, A_TRAIN, A_BUS
Full model: Log likelihood function, Constants only, Chi-squared[ 2], Prob [ chi squared > value ]
  Variables: GC, TTME, A_AIR, A_TRAIN, A_BUS
[Numerical values not preserved]

59 CLOGIT Fit Measures
Based on the log likelihood
  Log likelihood function; Log-L for Choice model
  R2=1-LogL/LogL* (Log-L fncn, R-sqrd, RsqAdj) against No coefficients and Constants only
  Chi-squared[ 7] and its significance
Based on the model predictions
  Cross tabulation of actual vs. predicted choices. Row indicator is actual, column is predicted.
  Predicted total is F(k,j,i)=Sum(i=1,...,N) P(k,j,i). Column totals may be subject to rounding error.
  Matrix Crosstab has 5 rows and 5 columns: AIR, TRAIN, BUS, CAR, Total
  Values in parentheses show the number of correct predictions by a model with only choice specific constants: AIR (16), TRAIN (19), BUS (4), CAR (17)
[Other numerical values not preserved]

60 Effects of Changes in Attributes on Probabilities
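Partial Effects: Effect of a change in attribute “k” of alternative “m” on the probability that choice “j” will be made is
$\partial P_j / \partial x_{mk} = P_j [1(j = m) - P_m] \beta_k$
Proportional changes: Elasticities
$\partial \log P_j / \partial \log x_{mk} = x_{mk} [1(j = m) - P_m] \beta_k$
Note the cross elasticity ($j \neq m$) is the same for all choices “j.” (IIA)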
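In the multinomial logit model the elasticity of P(j) with respect to attribute k of alternative m is x(m,k) [1(j = m) - P(m)] β(k). The sketch below evaluates it for a hypothetical set of probabilities, attribute level, and coefficient, and shows that every cross elasticity takes the same value.

```python
def mnl_elasticities(probs, m, x_mk, beta_k):
    """Elasticity of each P_j with respect to attribute k of alternative m.

    eta_j = x_mk * beta_k * (1(j == m) - P_m)
    """
    return {j: x_mk * beta_k * ((1.0 if j == m else 0.0) - probs[m]) for j in probs}

# Hypothetical probabilities, attribute level, and coefficient.
probs = {"AIR": 0.28, "TRAIN": 0.30, "BUS": 0.14, "CAR": 0.28}
eta = mnl_elasticities(probs, "AIR", x_mk=100.0, beta_k=-0.015)

own = eta["AIR"]
cross = [eta[j] for j in probs if j != "AIR"]
print(own, cross)
```

The cross effects do not depend on j, so a change in one alternative draws proportionally from all the others.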

61 Elasticities for CLOGIT
Request: ;Effects: attribute (choices where changes occur)
         ;Effects: INVT(*) (INVT changes in all choices)
Elasticity averaged over observations. Effects on probabilities of all choices in the model; * indicates direct elasticity effect of the attribute.
  Attribute is INVT in choice AIR: * Choice=AIR, Choice=TRAIN, Choice=BUS, Choice=CAR
  Attribute is INVT in choice TRAIN: Choice=AIR, * Choice=TRAIN, Choice=BUS, Choice=CAR
  Attribute is INVT in choice BUS: Choice=AIR, Choice=TRAIN, * Choice=BUS, Choice=CAR
  Attribute is INVT in choice CAR: Choice=AIR, Choice=TRAIN, Choice=BUS, * Choice=CAR
[Numerical values not preserved]
Own effect and cross effects: note the effect of IIA on the cross effects.

62 Choice Based Sampling
Over/Underrepresenting alternatives in the data set
Biases in parameter estimates? (Constants only?)
Biases in estimated variances
Weighted log likelihood, weight = $\pi_j / F_j$ for all i, where $\pi_j$ is the true proportion and $F_j$ is the sample proportion choosing alternative j
Fixup of covariance matrix
; Choices = list of names / list of true proportions $
Choice   Air    Train   Bus    Car
True     0.14   0.13    0.09   0.64
Sample   0.28   0.30
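The weights can be sketched from the true proportions listed on this slide and the sample choice counts quoted earlier in the deck (58, 63, 30, 59 of 210). Pairing those counts with Air, Train, Bus, Car is an assumption, and the variable names are hypothetical.

```python
# True population proportions, as listed on the slide.
true_share = {"AIR": 0.14, "TRAIN": 0.13, "BUS": 0.09, "CAR": 0.64}

# Sample counts quoted earlier in the deck; the mode labels are assumed.
sample_count = {"AIR": 58, "TRAIN": 63, "BUS": 30, "CAR": 59}
n = sum(sample_count.values())

# Weight for an observation whose chosen alternative is j: true share / sample share.
weights = {j: true_share[j] / (sample_count[j] / n) for j in true_share}

# Each person's log likelihood term is multiplied by the weight of the chosen mode.
# The weighted sample then reproduces the population shares:
weighted_share = {j: weights[j] * sample_count[j] / n for j in true_share}
print(weights, weighted_share)
```

Oversampled alternatives such as air receive weights below one, while the undersampled car alternative receives a weight above two.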

63 Choice Based Sampling Estimators
Columns: Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z]
Unweighted: GC, TTME, INVT, INVC, A_AIR, AIRxHIN, A_TRAIN, TRAxHIN, A_BUS, BUSxHIN
Weighted: GC, TTME, INVT, INVC, A_AIR, AIRxHIN, A_TRAIN, TRAxHIN, A_BUS, BUSxHIN
[Numerical values not preserved]

64 Changes in Estimated Elasticities
Elasticity averaged over observations. Attribute is GC in choice CAR.
Effects on probabilities of all choices in the model; * indicates direct elasticity effect of the attribute.
  Unweighted: Choice=AIR, Choice=TRAIN, Choice=BUS, * Choice=CAR
  Weighted: Choice=AIR, Choice=TRAIN, Choice=BUS, * Choice=CAR
[Numerical values not preserved]

65 The I.I.D Assumption
$U_{itj} = \alpha_{ij} + \beta_i'x_{itj} + \gamma_i'z_{it} + \varepsilon_{ijt}$
$F(\varepsilon_{itj}) = 1 - \exp(-\exp(\varepsilon_{itj}))$ (random part of each utility)
Independence across utility functions
Identical variances (means absorbed in constants)
Restriction on scaling
Correlation across alternatives?
Implication for cross elasticities (we saw earlier)
Behavioral assumption, independence from irrelevant alternatives (IIA)
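The IIA property can be seen by removing an alternative from the choice set and recomputing the logit probabilities. In the sketch below the utilities are hypothetical; the ratio of the air and car probabilities is unchanged when bus is dropped.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities exp(V_j) / sum_m exp(V_m)."""
    total = sum(math.exp(v) for v in utilities.values())
    return {j: math.exp(v) / total for j, v in utilities.items()}

# Hypothetical deterministic utilities for one traveler.
V = {"AIR": 0.4, "TRAIN": 0.1, "BUS": -0.6, "CAR": 0.3}

p_full = mnl_probs(V)
p_no_bus = mnl_probs({j: v for j, v in V.items() if j != "BUS"})

ratio_full = p_full["AIR"] / p_full["CAR"]
ratio_no_bus = p_no_bus["AIR"] / p_no_bus["CAR"]
print(ratio_full, ratio_no_bus)
```

Both ratios equal exp(V(AIR) - V(CAR)), so the bus share is reallocated to the remaining modes in proportion to their original probabilities.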

