Discrete Choice Modeling William Greene Stern School of Business New York University.


Part 6 Modeling Latent Parameter Heterogeneity

Parameter Heterogeneity
• Fixed and Random Effects Models
  Latent common time invariant “effects”
  Heterogeneity in the level parameter (the constant term) in the model
• General Parameter Heterogeneity in Models
  Discrete: There is more than one type of individual in the population, and parameters differ across types. Produces a Latent Class Model.
  Continuous: Parameters vary randomly across individuals. Produces a Random Parameters Model or a Mixed Model (synonyms).

Latent Class Models
• There are Q types of people, q = 1,…,Q
• For each type, Prob(Outcome | type = q) = $f(y, x \mid \beta_q)$
• Individual i is and remains a member of class q
• An individual will be drawn at random from the population. Prob(in class q) = $\pi_q$
• From the modeler’s point of view: Prob(Outcome) = $\sum_q \pi_q \, \text{Prob}(\text{Outcome} \mid \text{type} = q) = \sum_q \pi_q f(y, x \mid \beta_q)$

Finite Mixture Model
• Prob(Outcome | type = q) = $f(y, x \mid \beta_q)$ depends on the parameter vector $\beta_q$
• Parameters are randomly, discretely distributed among population members, with Prob($\beta = \beta_q$) = $\pi_q$, q = 1,…,Q
• Integrating out the variation across parameters, Prob(Outcome) = $\sum_q \pi_q f(y, x \mid \beta_q)$
• Same model, slightly different interpretation

Estimation Problems
• Estimation of population features
  Latent parameter vectors, $\beta_q$, q = 1,…,Q
  Mixing probabilities, $\pi_q$, q = 1,…,Q
  Probabilities, partial effects, predictions, etc.
  Model structure: the number of classes, Q
• Classification: prediction of class membership for individuals

An Extended Latent Class Model

Log Likelihood for an LC Model
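The slide's formula did not survive in this transcript. As a hedged sketch of the usual form: because individual i stays in one class, the class-specific densities are multiplied over that individual's observations before mixing, giving log L = Σ_i log Σ_q π_q Π_t f(y_it | β_q). The normal density, the function names, and the small panel below are illustrative assumptions, not taken from the slides.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of N[mean, sd^2] at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def lc_log_likelihood(panels, class_params, class_probs):
    """Latent class log likelihood for panel data.

    panels       : list of lists; panels[i] holds the outcomes of individual i
    class_params : list of (mean, sd) pairs, one per class (assumed normal density)
    class_probs  : prior class probabilities pi_q, summing to 1
    """
    total = 0.0
    for y_i in panels:
        mixed = 0.0
        for (mean, sd), pi_q in zip(class_params, class_probs):
            # Class membership is fixed over time, so the densities for all
            # of individual i's observations are multiplied within the class.
            joint = 1.0
            for y in y_i:
                joint *= normal_pdf(y, mean, sd)
            mixed += pi_q * joint
        total += math.log(mixed)
    return total

# Hypothetical unbalanced panel with three individuals.
panels = [[0.8, 1.3], [5.2, 4.6, 5.1], [1.1]]
ll = lc_log_likelihood(panels, [(1.0, 1.0), (5.0, 1.0)], [0.3, 0.7])
print(round(ll, 4))
```

With a single class and π = 1 the function reduces to the ordinary pooled log likelihood, which is a convenient sanity check.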

Example: Mixture of Normals

Unmixing a Mixed Sample
N[1,1] and N[5,1]
  Sample ; 1 - 1000 $
  Calc ; Ran(123457) $
  Create ; lc1 = rnn(1,1) ; lc2 = rnn(5,1) $
  Create ; class = rnu(0,1) $
  Create ; if(class < .3) ylc = lc1 ; (else) ylc = lc2 $
  Kernel ; rhs = ylc $
  Regress ; lhs = ylc ; rhs = one ; lcm ; pts = 2 ; pds = 1 $
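A rough Python analogue of the experiment above, offered as a sketch rather than a reproduction. It draws the same 30/70 mixture of N[1,1] and N[5,1] and fits a two-class normal mixture with a plain EM loop. The EM routine is a stand-in chosen for illustration; it is not claimed to be the estimator the commands above invoke. Python's generator will not reproduce the original draws even with the same seed, and class labels are arbitrary, so class 1 here need not match class 1 in any other output.

```python
import math
import random

def normal_pdf(y, mean, sd):
    """Density of N[mean, sd^2] at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def simulate_mixed_sample(n=1000, seed=123457):
    """Draw ylc = lc1 with probability .3, else lc2 (mirrors the Create steps)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        lc1 = rng.gauss(1.0, 1.0)
        lc2 = rng.gauss(5.0, 1.0)
        sample.append(lc1 if rng.random() < 0.3 else lc2)
    return sample

def fit_two_normals(y, iterations=200):
    """Fit a two-class normal mixture by EM; returns (means, sds, probs)."""
    lo, hi = min(y), max(y)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]  # crude starting values
    sds = [1.0, 1.0]
    probs = [0.5, 0.5]
    for _ in range(iterations):
        # E step: posterior class probabilities for every observation.
        weights = []
        for yi in y:
            parts = [probs[q] * normal_pdf(yi, means[q], sds[q]) for q in range(2)]
            total = sum(parts)
            weights.append([p / total for p in parts])
        # M step: weighted means, standard deviations and class shares.
        for q in range(2):
            wq = sum(w[q] for w in weights)
            means[q] = sum(w[q] * yi for w, yi in zip(weights, y)) / wq
            var = sum(w[q] * (yi - means[q]) ** 2 for w, yi in zip(weights, y)) / wq
            sds[q] = math.sqrt(var)
            probs[q] = wq / len(y)
    return means, sds, probs

ylc = simulate_mixed_sample()
means, sds, probs = fit_two_normals(ylc)
print("means:", [round(m, 3) for m in means])
print("sigmas:", [round(s, 3) for s in sds])
print("class probabilities:", [round(p, 3) for p in probs])
```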

Mixture of Normals
Latent Class / Panel LinearRg Model
Dependent variable             YLC
Number of observations         1000
Log likelihood function
Info. Criterion: AIC =
LINEAR regression model
Model fit with 2 latent classes.
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Model parameters for latent class 1
Constant |        ***
Sigma    |        ***
Model parameters for latent class 2
Constant |        ***
Sigma    |  .95746***
Estimated prior probabilities for class membership
Class1Pr |  .70003***
Class2Pr |  .29997***
Note: ***, **, * = Significance at 1%, 5%, 10% level.

Estimating Which Class

Posterior for Normal Mixture

Estimated Posterior Probabilities
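The class assignment step can be sketched with Bayes' rule: the posterior probability of class q given y is π_q f(y | q) divided by the sum of the same quantity over all classes, and an individual is assigned to the class with the largest posterior. The parameter values below are the true values of the simulated N[1,1] / N[5,1] example with a .3/.7 split, used here as stand-ins for estimates; the function name is an illustrative choice.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of N[mean, sd^2] at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_class_probs(y, means, sds, priors):
    """Posterior Prob(class q | y) for a normal mixture, via Bayes' rule."""
    parts = [p * normal_pdf(y, m, s) for m, s, p in zip(means, sds, priors)]
    total = sum(parts)
    return [part / total for part in parts]

means, sds, priors = [1.0, 5.0], [1.0, 1.0], [0.3, 0.7]
for y in (0.5, 3.0, 5.5):
    post = posterior_class_probs(y, means, sds, priors)
    predicted = max(range(len(post)), key=post.__getitem__) + 1
    print(y, [round(p, 4) for p in post], "-> class", predicted)
```

At y = 3, halfway between the two means with equal variances, the data carry no information and the posterior equals the prior.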

How Many Classes?

More Difficult When the Populations are Close Together

The Technique Still Works
Latent Class / Panel LinearRg Model
Dependent variable YLC
Sample is 1 pds and 1000 individuals
LINEAR regression model
Model fit with 2 latent classes
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Model parameters for latent class 1
Constant |        ***
Sigma    |        ***
Model parameters for latent class 2
Constant |  .90156***
Sigma    |  .86951***
Estimated prior probabilities for class membership
Class1Pr |  .73447***
Class2Pr |  .26553***

Heckman and Singer RE Model
• Random Effects Model
• Random Constants with Discrete Distribution

LCM for Health Status
• Self Assessed Health Status = 0,1,…,10
• Recoded: Healthy = HSAT > 6
• Using only groups observed T = 7 times; N = 887
• Prob = $\Phi$(Age, Educ, Income, Married, Kids)
• 2, 3 classes

Too Many Classes

Two Class Model
Latent Class / Panel Probit Model
Dependent variable HEALTHY
Unbalanced panel has 887 individuals
PROBIT (normal) probability model
Model fit with 2 latent classes
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Model parameters for latent class 1
Constant |  .61652**
AGE      |        ***
EDUC     |  .11759***
HHNINC   |
MARRIED  |
HHKIDS   |
Model parameters for latent class 2
Constant |
AGE      |        ***
EDUC     |
HHNINC   |  .61039***
MARRIED  |
HHKIDS   |  .19465**
Estimated prior probabilities for class membership
Class1Pr |  .56604***
Class2Pr |  .43396***

Partial Effects in LC Model
Partial derivatives of expected val. with respect to the vector of characteristics. They are computed at the means of the Xs.
Conditional Mean at Sample Point    .6116
Scale Factor for Marginal Effects   .3832
B for latent class model is a wghted avrg
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Elasticity
Two class latent class model
AGE      |        ***
EDUC     |  .02904***
HHNINC   |  .12475**
MARRIED  |
HHKIDS   |  .04196**
Pooled Probit Model
AGE      |        ***
EDUC     |  .03219***
HHNINC   |  .16699***
Marginal effect for dummy variable is P|1 - P|0.
MARRIED  |
Marginal effect for dummy variable is P|1 - P|0.
HHKIDS   |  .06754***

Conditional Means of Parameters

Heckman and Singer Model – 3 Points

Heckman/Singer vs. REM
Random Effects Binary Probit Model
Sample is 7 pds and 887 individuals
HEALTHY  | Coefficient | Standard Error | z | Prob. |z|>Z* | 95% Confidence Interval
Constant |
(Other coefficients omitted)
Rho      |  .52565***
$\rho = \sigma^2/(1+\sigma^2)$, so $\sigma^2 = \rho/(1-\rho)$ =
Mean = .33609, Variance =
For the Heckman and Singer model, 3 points a1, a2, a3 = , .50135, ; probabilities p1, p2, p3 = .31094, .45267,
Mean = , variance = .90642
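Two of the figures on this slide can be recovered by arithmetic from the values that survive, assuming the relationships stated there: the random effects variance follows from inverting ρ = σ²/(1+σ²), and the third Heckman and Singer probability follows from the probabilities summing to one. The support points themselves are not recoverable from the transcript, so the discrete mean is not recomputed here.

```python
rho = 0.52565                  # Rho as reported on the slide
sigma2 = rho / (1.0 - rho)     # invert rho = sigma^2 / (1 + sigma^2)

p1, p2 = 0.31094, 0.45267      # the two class probabilities that survive
p3 = 1.0 - p1 - p2             # probabilities must sum to one

print("sigma^2 implied by rho:", round(sigma2, 4))
print("implied p3:", round(p3, 5))
```

The implied variance of the continuous random effect is about 1.108, against the .90642 reported for the three-point discrete distribution.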

An Extended Latent Class Model

Health Satisfaction Model
Latent Class / Panel Probit Model
Dependent variable HEALTHY
Used mean AGE and FEMALE in class probability model
Log likelihood function
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Model parameters for latent class 1
Constant |  .60050**
AGE      |        ***
EDUC     |  .10597***
HHNINC   |
MARRIED  |
HHKIDS   |
Model parameters for latent class 2
Constant |
AGE      |        ***
EDUC     |
HHNINC   |  .59026***
MARRIED  |
HHKIDS   |  .20652***
Estimated prior probabilities for class membership
ONE_1    |        ***   (.56519)
AGEBAR_1 |        *
FEMALE_1 |        ***
ONE_2    |  (Fixed Parameter)   (.43481)
AGEBAR_2 |  (Fixed Parameter)
FEMALE_2 |  (Fixed Parameter)

The EM Algorithm

Implementing EM for LC Models

Random Parameters Models

A Mixed Probit Model

Application – Healthy
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. There are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.)
Variables in the file are:
  DOCTOR = 1(Number of doctor visits > 0)
  HSAT = health satisfaction, coded 0 (low) - 10 (high)
  DOCVIS = number of doctor visits in last three months
  HOSPVIS = number of hospital visits in last calendar year
  PUBLIC = insured in public health insurance = 1; otherwise = 0
  ADDON = insured by add-on insurance = 1; otherwise = 0
  HHNINC = household nominal monthly net income in German marks / (4 observations with income=0 were dropped)
  HHKIDS = children under age 16 in the household = 1; otherwise = 0
  EDUC = years of schooling
  AGE = age in years
  MARRIED = marital status

Estimates of a Mixed Probit Model

Partial Effects are Also Simulated

Simulating Conditional Means for Individual Parameters
Posterior estimates of E[parameters(i) | Data(i)]

Summarizing Simulated Estimates

Correlated Parameters
Random Coefficients Probit Model
Dependent variable HEALTHY
PROBIT (normal) probability model
Simulation based on 25 random draws
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Means for random parameters
Constant |
AGE      |        ***
EDUC     |  .15526***
HHNINC   |  .28023**
MARRIED  |
HHKIDS   |
Partial derivatives of expected val. with respect to the vector of characteristics. They are computed at the means of the Xs.
Conditional Mean at Sample Point    .6351
Scale Factor for Marginal Effects   .3758
AGE      |        ***
EDUC     |  .05835***
HHNINC   |  .10532**
MARRIED  |
HHKIDS   |

Cholesky Matrix
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Means for random parameters
Constant |
AGE      |        ***
EDUC     |  .15526***
HHNINC   |  .28023**
MARRIED  |
HHKIDS   |
Diagonal elements of Cholesky matrix
Constant |  .66612***
AGE      |  .01041***
EDUC     |  .07307***
HHNINC   |  .18897*
MARRIED  |  .47889***
HHKIDS   |  .44804***
Below diagonal elements of Cholesky matrix
lAGE_ONE |
lEDU_ONE |  .07359***
lEDU_AGE |        **
lHHN_ONE |        **
lHHN_AGE |
lHHN_EDU |  .44021***
lMAR_ONE |        **
lMAR_AGE |        ***
lMAR_EDU |
lMAR_HHN |  .07949*
lHHK_ONE |
lHHK_AGE |  .21508***
lHHK_EDU |  .31374***
lHHK_HHN |        ***
lHHK_MAR |        ***

Estimated Parameter Correlation Matrix
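The correlation matrix itself is not in this transcript. As a sketch of how one is typically obtained from an estimated Cholesky factor: if the random parameters are written β_i = β + L w_i with w_i standard normal, their covariance matrix is LL', and dividing by the implied standard deviations gives the correlations. The 3×3 factor below is hypothetical and chosen only for illustration; it is not the estimated matrix from the slides.

```python
import math

def covariance_from_cholesky(L):
    """Return Sigma = L L' for a lower-triangular matrix L (list of rows)."""
    k = len(L)
    return [[sum(L[i][m] * L[j][m] for m in range(k)) for j in range(k)]
            for i in range(k)]

def correlation_from_covariance(S):
    """Scale a covariance matrix to a correlation matrix."""
    k = len(S)
    sd = [math.sqrt(S[i][i]) for i in range(k)]
    return [[S[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

# Hypothetical Cholesky factor: diagonal elements plus below-diagonal elements.
L = [[0.5, 0.0, 0.0],
     [0.2, 0.4, 0.0],
     [-0.1, 0.3, 0.6]]

Sigma = covariance_from_cholesky(L)
R = correlation_from_covariance(Sigma)
for row in R:
    print([round(r, 4) for r in row])
```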

Modeling Parameter Heterogeneity

Hierarchical Probit Model
Random Coefficients Probit Model
Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X
Means for random parameters
Constant |        ***
AGE      |        ***
EDUC     |        ***
HHNINC   |        ***
MARRIED  |  .61453**
HHKIDS   |
Scale parameters for dists. of random parameters
Constant |  .12981***
AGE      |  .01424***
EDUC     |  .00368**
HHNINC   |  .52685***
MARRIED  |  .16399***
HHKIDS   |  .13928***
Heterogeneity in the means of random parameters
cONE_AGE |
cONE_FEM |        ***
cAGE_AGE |
cAGE_FEM |  .01552***
cEDU_AGE |  .00575***
cEDU_FEM |
cHHN_AGE |        ***
cHHN_FEM |
cMAR_AGE |        **
cMAR_FEM |  .20538*
cHHK_AGE |  .01053*
cHHK_FEM |        ***

Mixed Model Estimation
Programs differ on the models fitted, the algorithms, the paradigm, and the extensions provided to the simplest RPM, $\beta_i = \beta + u_i$.
• MLwiN: Multilevel models. Regression and some loglinear models.
• WinBUGS: Mainly for Bayesian applications. MCMC. User specifies the model and constructs the Gibbs Sampler/Metropolis Hastings.
• SAS: Proc Mixed. Classical. Uses primarily a kind of GLS/GMM (method of moments algorithm for loglinear models).
• LIMDEP/NLOGIT: Classical. Mixing done by Monte Carlo integration (maximum simulated likelihood). Numerous linear, nonlinear, loglinear models, multinomial choice models.
• Stata: Classical. GLLAMM. Mixing done by quadrature. (Very, very slow for 2 or more dimensions.) Several loglinear models. Arne Hole has developed a basic RP multinomial logit estimator.
• Ken Train’s Free Gauss Code: Monte Carlo integration. Used by many researchers. Mixed Multinomial Logit model only (but free!).
• Biogeme (Michel Bierlaire): free multinomial logit package.
• R: nlme package. Multilevel linear regression.

Hierarchical Model

Maximum Simulated Likelihood

Monte Carlo Integration

Example: Monte Carlo Integral
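The slide's own example is not in this transcript, so the following is a substitute illustration of the idea: an integral of the form ∫ g(v) φ(v) dv is approximated by averaging g over random draws of v. The integrand chosen here, Φ(a + σv) with v standard normal, is an assumption made because it has the closed form Φ(a/√(1+σ²)), which gives something to check the simulation against. The values of a and σ, the seed, and the number of draws are arbitrary.

```python
import math
import random

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mc_integral(g, draws):
    """Approximate E[g(v)] by the average of g over the supplied draws."""
    return sum(g(v) for v in draws) / len(draws)

rng = random.Random(2024)
draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]

a, sigma = 0.5, 1.0
estimate = mc_integral(lambda v: std_normal_cdf(a + sigma * v), draws)
exact = std_normal_cdf(a / math.sqrt(1.0 + sigma ** 2))

print("simulated:", round(estimate, 4))
print("exact:    ", round(exact, 4))
```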

Simulated Log Likelihood for a Mixed Probit Model
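The formula on this slide did not survive extraction. A minimal sketch of a simulated log likelihood, under the simplifying assumption that only the constant is random (α_i = α + σ v_i, v_i standard normal) and there is a single regressor: for each individual, the product of probit probabilities over that person's periods is averaged over R draws, and the log of that average is summed over individuals. The function name, the tiny panel, and the parameter values are hypothetical.

```python
import math
import random

def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulated_log_likelihood(panels, alpha, beta, sigma, draws):
    """Simulated log likelihood for a random-constant probit.

    panels : panels[i] is a list of (y, x) pairs for individual i, y in {0, 1}
    draws  : draws[i] is the list of R standard normal draws for individual i
    """
    total = 0.0
    for obs, v_i in zip(panels, draws):
        sim = 0.0
        for v in v_i:
            joint = 1.0
            for y, x in obs:
                index = alpha + sigma * v + beta * x
                # (2y - 1) flips the sign of the index when y = 0.
                joint *= std_normal_cdf((2 * y - 1) * index)
            sim += joint
        total += math.log(sim / len(v_i))
    return total

# Hypothetical unbalanced panel: two individuals, (y, x) pairs.
panels = [[(1, 0.2), (0, -0.5), (1, 1.0)], [(0, 0.1), (0, -1.2)]]

# Draws are generated once and reused at every parameter value, so the
# simulated function does not jump around during optimization.
rng = random.Random(7)
draws = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in panels]

print(round(simulated_log_likelihood(panels, 0.1, 0.5, 1.0, draws), 4))
```

Setting sigma to zero removes the heterogeneity, and the function then returns the ordinary pooled probit log likelihood.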

Generating a Random Draw

Drawing Uniform Random Numbers

Quasi-Monte Carlo Integration Based on Halton Sequences
For example, using base p = 5, the integer r = 37 has $b_0 = 2$, $b_1 = 2$, and $b_2 = 1$ ($37 = 1 \times 5^2 + 2 \times 5^1 + 2 \times 5^0$). Then $H(37 \mid 5) = 2 \times 5^{-1} + 2 \times 5^{-2} + 1 \times 5^{-3} = 0.488$.
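The digit expansion in this example can be sketched as a short function: write r in base p, reverse the digits around the radix point, and read off the result. For r = 37 and p = 5 this gives 2/5 + 2/25 + 1/125 = 0.488. The function name is an illustrative choice, and the final step, passing the Halton value through the inverse normal CDF to obtain a normal draw, is a common way to use these points and is shown here as an assumption about the intended use.

```python
from statistics import NormalDist

def halton(r, base):
    """r-th element of the Halton sequence for a prime base (radical inverse)."""
    value = 0.0
    scale = 1.0 / base
    while r > 0:
        r, digit = divmod(r, base)   # peel off the lowest base-p digit b_j
        value += digit * scale       # digit b_j contributes b_j * p^-(j+1)
        scale /= base
    return value

print(round(halton(37, 5), 3))

# First few points in base 2 and base 3; they fill (0, 1) evenly.
print([halton(r, 2) for r in range(1, 5)])
print([round(halton(r, 3), 4) for r in range(1, 5)])

# Transform a Halton point to a standard normal draw by the inverse CDF.
print(round(NormalDist().inv_cdf(halton(37, 5)), 4))
```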

Halton Sequences vs. Random Draws
Halton sequences require far fewer draws than random sampling; for one dimension, about 1/10 as many. This accelerates estimation by a factor of 5 to 10.