Qualitative and Limited Dependent Variable Models ECON 6002 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s notes.

• 16.1 Models with Binary Dependent Variables
• 16.2 The Logit Model for Binary Choice
• 16.3 Multinomial Logit
• 16.4 Conditional Logit
• 16.5 Ordered Choice Models
• 16.6 Models for Count Data
• 16.7 Limited Dependent Variables: Heckman selection model

• Problem: our sample is not a random sample. The data we observe are “selected” by a systematic process that we do not account for.
• We want to learn about the relationship between x and y, but data are available only for observations in which another variable, z*, exceeds a certain value.
• Selection bias occurs when the sample is truncated and the cause of that truncation is correlated with the dependent variable.

• Solution: a technique called Heckit, named after its developer, James Heckman.
• Heckman was awarded the Nobel Prize in 2000 for this contribution: “for his development of theory and methods for analyzing selective samples”. His broader contribution to economics is also enormous (he is considered one of the ten most influential living economists).
• Who did he fly to Stockholm with?

Other examples of selection issues:
• Will GRE scores help us screen MA applicants for 2014?
• Does getting married before “shacking up” help keep marriages out of divorce proceedings?
• Planes coming back from the war with bullet holes?

• For example, you go to George St. to collect data on the drinking habits of MUN students (very convenient, but that is exactly why convenience samples are not valid for general inference!).
• To the extent that the likelihood of someone being there is related to the number of drinks they are going to have, you have a sample selection issue.
• We want our sampling to be random, or at least driven by some exogenous sampling scheme.
• Endogenous sampling leads to biased and inconsistent estimation.
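The George St. example can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the number of drinks is uniform on 0 to 8, and the probability of being on George St. rises linearly with the number of drinks (`p_present` is a hypothetical rule, not an estimate of anything). The point is only that the mean in the convenience sample overstates the population mean.

```python
import random


def simulate_students(n, seed):
    """Simulate a student population and a George St. convenience sample."""
    rng = random.Random(seed)
    population, sampled = [], []
    for _ in range(n):
        drinks = rng.randint(0, 8)          # assumed distribution of drinks
        p_present = 0.1 + 0.1 * drinks      # assumed: heavier drinkers are more likely to be there
        population.append(drinks)
        if rng.random() < p_present:        # endogenous sampling: depends on the outcome itself
            sampled.append(drinks)
    return population, sampled


population, sampled = simulate_students(n=50_000, seed=1)
population_mean = sum(population) / len(population)
sample_mean = sum(sampled) / len(sampled)

print("population mean drinks:", round(population_mean, 2))
print("George St. sample mean:", round(sample_mean, 2))
```

Because inclusion in the sample depends on the outcome being measured, collecting more data on George St. would not remove the gap.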

• Sample selection can arise in many settings and for different reasons, so there are many “sample selection models”.
• For example, selection into the sample may be due to self-selection, with the outcome of interest determined in part by the individual's choice of whether or not to participate in the activity of interest.
• That is why you want your census to be compulsory!

• The Tobit model can be considered a basic type of selection model too.
• More flexible extensions of Tobit are what most people refer to as sample selection models.
• A simple extension is the bivariate sample selection model (as labelled by Cameron and Trivedi), which generalizes the Tobit model by introducing a censoring latent variable that differs from the latent variable generating the outcome of interest.
• Example: something other than the desired number of hours to supply must prompt wives to go to work.
• Amemiya calls this model the Tobit II (the standard Tobit we already know is the Tobit I).

• Consistent estimation under sample selection on unobservables relies on quite strong distributional assumptions.
• Experimental data would allow us to avoid selection problems by using random assignment to a treatment.
• However, experiments can be difficult to implement in economics applications for both cost and ethical reasons.
• The treatment effects approach attempts to apply the experimental approach to observational data (see Cameron and Trivedi MMA, Ch. 25).
• There is a growing body of work using this type of approach.

• We will focus on this simple Tobit II model / bivariate sample selection model / Heckman model / Heckit / Tobit model with stochastic threshold …
• There’s more!!!
• Wooldridge calls the model one with a probit selection equation.
• Others call this model the generalized Tobit model.
• Others call it simply “the” selection model, but there are many selection models.

• Let y2* denote the outcome of interest (say, the wives’ wages or, in our initial example, how many hours to work).
• Tobit assumes that this outcome is observed if y2* > 0.
• A more general model uses a different latent variable, y1*, such that y2* is observed if y1* > 0.
• y1* determines whether to work or not BUT NOT how much to work; y2* does.

• The Heckit technique famously takes into account that the decision to work may be correlated with the expected wage, as in the mroz.dta example.
• We observe wages only for women who work; non-working wives are also in the data, but we have no wage for them.
• If the decision to work is related to some unobservable characteristic that also affects the wage, we are in trouble.
• The decision to work could be informed by many things that have nothing to do with the wage one earns, but wives work if the wage they are offered exceeds their reservation wage… so clearly we have an issue!

The classic application was to labor supply, where
• y1* is the unobserved desire or propensity to work,
• y2 is actual hours worked.
• See Mroz (1987).

• The econometric model describing the situation is composed of two equations. The first is the selection equation (or participation equation), which determines whether the variable of interest is observed.
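A common way to write a selection equation of this kind is sketched below. The symbols (z* for the latent variable, w for the explanatory variable, γ for the coefficients, u for the error, N for the full sample size) are assumptions chosen for illustration and follow the usual textbook convention rather than anything shown on the slide.

```latex
z_i^{*} = \gamma_1 + \gamma_2 w_i + u_i, \qquad i = 1, \dots, N,
\qquad
z_i =
\begin{cases}
1 & \text{if } z_i^{*} > 0 \\
0 & \text{otherwise}
\end{cases}
```

Only the indicator z is observed, so with a normally distributed u this first equation is a probit model.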

• The second equation is the linear model of interest. It is estimated only on the observations for which we have information on y.
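A sketch of the outcome equation and of why estimating it on the selected observations alone is a problem follows. The notation is assumed for illustration: y is observed for n < N observations, the selection index is γ1 + γ2 w with a standard normal error u, e is the outcome error with standard deviation σ_e, and ρ is the correlation between u and e under joint normality.

```latex
y_i = \beta_1 + \beta_2 x_i + e_i, \qquad i = 1, \dots, n, \quad n < N

E\left[\, y_i \mid z_i^{*} > 0 \,\right]
  = \beta_1 + \beta_2 x_i + \rho\,\sigma_e\,\lambda_i,
\qquad
\lambda_i = \frac{\phi(\gamma_1 + \gamma_2 w_i)}{\Phi(\gamma_1 + \gamma_2 w_i)}
```

When ρ ≠ 0, the last term is an omitted variable in a regression of y on x alone, which is the source of the bias in OLS on the selected sample.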

• The estimated inverse Mills ratio is added to the outcome equation as an extra regressor, and the resulting estimating equation is fit on the selected sample.
• This stands in for the missing information: the omitted variable that was biasing OLS.
• A test of whether the errors are correlated, and therefore whether a sample selection correction is needed, can be built as a Wald test on the estimated coefficient of the inverse Mills ratio.
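In assumed notation (tildes mark first-step probit estimates, w is the selection regressor, x the outcome regressor, v the new error term), the estimated inverse Mills ratio and the estimating equation are usually written as:

```latex
\tilde{\lambda}_i
  = \frac{\phi(\tilde{\gamma}_1 + \tilde{\gamma}_2 w_i)}
         {\Phi(\tilde{\gamma}_1 + \tilde{\gamma}_2 w_i)},
\qquad
y_i = \beta_1 + \beta_2 x_i + \beta_{\lambda}\,\tilde{\lambda}_i + v_i,
\qquad i = 1, \dots, n
```

The test described above is then a test of H0: β_λ = 0, since β_λ is zero when the selection and outcome errors are uncorrelated.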

• Both the usual OLS standard errors and the heteroskedasticity-robust standard errors from a second-step regression run manually are incorrect, because they ignore that the inverse Mills ratio is itself estimated. Use Heckit software!
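To make the two steps concrete, here is a minimal sketch on simulated data, using only the Python standard library. Everything in it is an assumption for illustration: the data generating process (true slope 1, error correlation 0.8, selection on x + w + u > 0), the helper names (`solve`, `ols`, `probit`, `heckit_two_step`), and the fixed number of Newton iterations. It produces point estimates only and deliberately reports no standard errors, since naive second-step standard errors would be wrong.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # pdf is phi, cdf is Phi


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def probit(rows, z, iterations=8):
    """Probit by Newton-Raphson, starting from zero coefficients."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # minus the Hessian
        for r, zi in zip(rows, z):
            xb = sum(b * v for b, v in zip(beta, r))
            q = 2 * zi - 1
            lam = q * STD_NORMAL.pdf(xb) / STD_NORMAL.cdf(q * xb)
            weight = lam * (lam + xb)
            for i in range(k):
                grad[i] += lam * r[i]
                for j in range(k):
                    info[i][j] += weight * r[i] * r[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


def simulate(n, rho, seed):
    """Assumed DGP: y = 1 + x + e, observed only when x + w + u > 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, w, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        e = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        data.append((x, w, 1 if x + w + u > 0 else 0, 1.0 + 1.0 * x + e))
    return data


def heckit_two_step(data):
    """Step 1: probit for selection. Step 2: OLS of y on x and the estimated IMR."""
    sel_rows = [[1.0, x, w] for x, w, _, _ in data]
    gamma = probit(sel_rows, [s for _, _, s, _ in data])
    out_rows, y = [], []
    for (x, _, s, yi), r in zip(data, sel_rows):
        if s == 1:
            index = sum(g * v for g, v in zip(gamma, r))
            imr = STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)
            out_rows.append([1.0, x, imr])
            y.append(yi)
    return gamma, ols(out_rows, y)


data = simulate(n=10_000, rho=0.8, seed=0)
selected = [(x, y) for x, _, works, y in data if works]
naive = ols([[1.0, x] for x, _ in selected], [y for _, y in selected])
gamma_hat, corrected = heckit_two_step(data)
print("naive OLS slope on selected sample:", round(naive[1], 3))
print("two-step slope:", round(corrected[1], 3))
print("coefficient on inverse Mills ratio:", round(corrected[2], 3))
```

In this setup w enters the selection equation but not the outcome equation, so it acts as an exclusion restriction. In applied work a dedicated routine (for example Stata's heckman command) should be used so that the reported standard errors account for the estimated first step.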

• The wage equation can also be estimated by full information maximum likelihood.
• The standard errors based on the full information maximum likelihood procedure are smaller than those yielded by the two-step estimation method.

• We use two different data generating processes to explain the decision to work and the wage.
• But there is a subtle difference relative to the Cragg model.
• Here we have a third (unobservable) element explaining the decision to work.
• If that element is also in the error of the main equation, we have a sample selection problem.

• Heckit with normal errors is theoretically identified without any restriction on the regressors.
• In principle, the same regressors can appear in the equations for y1* and y2*.
• In practice you want exclusion restrictions (something in the participation equation that is not in the outcome equation).
• Otherwise you would be relying only on the nonlinearity of the inverse Mills ratio for identification, and the inverse Mills ratio term is actually approximately linear over a wide range of its argument, leading to multicollinearity issues.
• The problem is less severe the better the probit model can discriminate between participants and nonparticipants.
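The near-linearity of the inverse Mills ratio can be checked numerically. The sketch below evaluates it on an arbitrarily chosen grid of index values from -2 to 2 (the grid is an assumption, not a range taken from the slides) and computes its correlation with the index itself.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def inverse_mills(index):
    """Inverse Mills ratio: standard normal pdf over cdf at the index."""
    return STD_NORMAL.pdf(index) / STD_NORMAL.cdf(index)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


grid = [i / 10 for i in range(-20, 21)]   # assumed range of the selection index
imr = [inverse_mills(c) for c in grid]
r = pearson(grid, imr)
print("correlation between index and inverse Mills ratio:", round(r, 3))
```

A correlation this close to -1 means that, when the selection and outcome equations share all their regressors, the inverse Mills ratio is nearly a linear combination of variables already in the outcome equation.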

• You would prefer to use exclusion restrictions, but it can be very difficult to find defensible ones.

Extensions
• Plenty! And variations of the main model.
• And different names for the same models!
• Example: what if the selection process is not just a binary decision but an ordered one? See oheckman (Chiburis and Lokshin, 2007).

Extensions
• Heckit handles linear regression models when there is a selection mechanism.
• However, if the outcome equation involves a dichotomous dependent variable too, we would have a probit selection equation and a probit outcome equation.
• That “double probit model” / “bivariate probit model with selection” can be estimated with heckprob in Stata.

Extensions
• A simpler version is the bivariate probit (biprobit).

Keywords (Principles of Econometrics, 3rd Edition)
• binary choice models
• censored data
• conditional logit
• count data models
• feasible generalized least squares
• Heckit
• identification problem
• independence of irrelevant alternatives (IIA)
• index models
• individual and alternative specific variables
• individual specific variables
• latent variables
• likelihood function
• limited dependent variables
• linear probability model
• logistic random variable
• logit
• log-likelihood function
• marginal effect
• maximum likelihood estimation
• multinomial choice models
• multinomial logit
• odds ratio
• ordered choice models
• ordered probit
• ordinal variables
• Poisson random variable
• Poisson regression model
• probit
• selection bias
• tobit model
• truncated data

Further models
• Survival analysis (time-to-event data analysis)
• Multivariate probit (biprobit, triprobit, mvprobit)

References
• Hoffmann, 2004 for all topics
• Long, S. and J. Freese for all topics
• Cameron and Trivedi’s book for count data
• Agresti, A. (2001) Categorical Data Analysis (2nd ed). New York: Wiley.