Limited Dependent Variable Models ECON 6002 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s notes.


Censoring, Truncation, Sample Selection and Related Models
We now consider two closely related models:
• regression when the dependent variable of interest is incompletely observed (due to censoring or truncation)
• regression when the dependent variable is completely observed but is observed in a selected sample that is not representative of the population

Censoring, Truncation, Sample Selection and Related Models
• OLS regression yields inconsistent estimates because the sample is not representative of the population.
• The first-generation estimation methods require strong distributional assumptions, and even seemingly minor departures from those assumptions, such as heteroskedasticity, can lead to inconsistency.

Censored Data
Figure 16.3 Histogram of Wife’s Hours of Work in 1975

Having censored data means that a substantial fraction of the observations on the dependent variable take a limit value. The regression function is no longer given by (16.30). The least squares estimators of the regression parameters obtained by running a regression of y on x are biased and inconsistent—least squares estimation fails.

Censoring versus Truncation
• Censoring occurs when some of the observations of the dependent variable have been recorded as having reached a limit value, regardless of what their actual value might be.
• For instance, anyone earning $1 million or more per year might be recorded in your dataset at the upper limit of $1 million.

Censoring versus Truncation
• With truncation, we only observe the value of the regressors when the dependent variable takes a certain value (usually a positive one instead of zero).
• With censoring, we observe in principle the value of the regressors for everyone, but not the value of the dependent variable for those whose dependent variable takes a value beyond the limit.

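We give the parameters $\beta_1$ and $\beta_2$ of the latent variable model $y_i^* = \beta_1 + \beta_2 x_i + e_i$ specific values and assume $e_i \sim N(0, \sigma^2 = 16)$.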

• Create N = 200 random values of $x_i$ that are spread evenly (or uniformly) over the interval [0, 20]. These we will keep fixed in further simulations.
• Obtain N = 200 random values $e_i$ from a normal distribution with mean 0 and variance 16.
• Create N = 200 values of the latent variable $y_i^*$.
• Obtain N = 200 values of the observed $y_i$ using $y_i = 0$ if $y_i^* \le 0$ and $y_i = y_i^*$ if $y_i^* > 0$.
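The simulation steps above can be sketched in Python as follows. This is only an illustrative sketch: the transcript does not reproduce the parameter values, so the intercept `beta1 = -9` and slope `beta2 = 1` used here are assumptions, as are the function names and the random seed. The error standard deviation of 4 matches the stated variance of 16.

```python
import numpy as np

def simulate_censored(n=200, beta1=-9.0, beta2=1.0, sigma=4.0, seed=0):
    """Simulate a latent-variable model censored from below at zero.

    beta1 and beta2 are assumed values chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 20.0, size=n)      # regressor, uniform on [0, 20]
    e = rng.normal(0.0, sigma, size=n)      # errors with mean 0, variance sigma**2
    y_star = beta1 + beta2 * x + e          # latent variable
    y = np.maximum(y_star, 0.0)             # observed variable, censored at 0
    return x, y_star, y

def ols(x, y):
    """Return the OLS (intercept, slope) from regressing y on x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

x, y_star, y = simulate_censored()
print("share of limit observations:", np.mean(y == 0))
print("OLS on latent y*:           ", ols(x, y_star))
print("OLS on censored y:          ", ols(x, y))
print("OLS on positive y only:     ", ols(x[y > 0], y[y > 0]))
```

Comparing the three sets of estimates shows the point of the slides: the slope from the censored sample is pulled toward zero relative to the slope from the latent data.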

Figure 16.4 Uncensored Sample Data and Regression Function

Figure 16.5 Censored Sample Data, and Latent Regression Function and Least Squares Fitted Line

The maximum likelihood procedure is called Tobit in honor of James Tobin, winner of the 1981 Nobel Prize in Economics, who first studied this model. The probit probability that $y_i = 0$ is:
$P(y_i = 0) = P(y_i^* \le 0) = 1 - \Phi\left(\frac{\beta_1 + \beta_2 x_i}{\sigma}\right)$

The maximum likelihood estimator is consistent and asymptotically normal, with a known covariance matrix. Using the artificial data the fitted values are:
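As a rough illustration of the maximum likelihood procedure (not the estimates from the original slide, which are not reproduced in the transcript), the sketch below fits a Tobit model censored at zero by minimizing the negative log-likelihood numerically. The data-generating values (-9, 1, and standard deviation 4), the seed, and the function names are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, x, y):
    """Mean negative log-likelihood of a Tobit model censored from below at 0."""
    b1, b2, log_sigma = params
    sigma = np.exp(log_sigma)               # parameterize by log(sigma) so sigma > 0
    xb = b1 + b2 * x
    cens = y <= 0
    # Limit observations contribute P(y* <= 0) = Phi(-xb / sigma).
    ll_cens = norm.logcdf(-xb[cens] / sigma)
    # Non-limit observations contribute the normal density of y given x.
    ll_unc = norm.logpdf((y[~cens] - xb[~cens]) / sigma) - log_sigma
    return -(ll_cens.sum() + ll_unc.sum()) / len(y)

def fit_tobit(x, y):
    """Return (b1, b2, sigma), using OLS estimates as starting values."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_sd = np.std(y - X @ coef)
    start = np.array([coef[0], coef[1], np.log(resid_sd)])
    res = minimize(tobit_negloglik, start, args=(x, y), method="BFGS")
    b1, b2, log_sigma = res.x
    return b1, b2, np.exp(log_sigma)

# Artificial data with assumed parameter values (illustration only).
rng = np.random.default_rng(42)
x = rng.uniform(0, 20, 200)
y = np.maximum(-9 + x + rng.normal(0, 4, 200), 0)

b1_hat, b2_hat, sigma_hat = fit_tobit(x, y)
print("Tobit estimates (b1, b2, sigma):", b1_hat, b2_hat, sigma_hat)
```

Standard errors are omitted here; in practice they would come from the inverse of the Hessian of the full (not averaged) log-likelihood.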

Because the cdf values are positive, the sign of the coefficient does tell the direction of the marginal effect, just not its magnitude. If $\beta_2 > 0$, then as x increases the cdf approaches 1, and the slope of the regression function approaches that of the latent variable model.
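For reference, the standard result behind this statement can be written out as below. This is a sketch for the case of censoring from below at zero with a single regressor; the shorthand $z$ is introduced here and is not from the slides.

```latex
\begin{aligned}
z &= \frac{\beta_1 + \beta_2 x}{\sigma}, \\
E(y \mid x) &= \Phi(z)\,(\beta_1 + \beta_2 x) + \sigma\,\phi(z), \\
\frac{\partial E(y \mid x)}{\partial x}
  &= \beta_2\,\Phi(z) + \frac{\beta_2}{\sigma}\,\phi(z)\,(\beta_1 + \beta_2 x)
     - \sigma\, z\,\phi(z)\,\frac{\beta_2}{\sigma} \\
  &= \beta_2\,\Phi(z).
\end{aligned}
```

Since $0 < \Phi(z) < 1$, the marginal effect has the sign of $\beta_2$ but a smaller magnitude, and it approaches $\beta_2$ as $\Phi(z) \to 1$.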

Figure 16.6 Censored Sample Data, and Regression Functions for Observed and Positive y Values (curves shown: uncensored mean, truncated mean, censored mean)

26.66 is the marginal effect on the observed hours, while the Tobit coefficient itself is the effect on the underlying “unconditional” hours.*
*NB: in all cases the expectation is conditional on the values of the regressors, so do not get confused by the terminology here.

• Estimating the model by OLS with the zero observations in the model would reduce all of the slope coefficients substantially.
• Eliminating the zero observations, as in the OLS regression just shown, even reverses the sign of the effect of years of schooling (though it is a non-significant effect).
• For only women in the labor force, more schooling has no effect on hours worked.
• If you consider the entire population of women, however, more schooling does increase hours, but we can now see that it is likely by encouraging more women into the labor force, not by encouraging those already in the market to work more hours.

There are several marginal effects of potential interest after -tobit-:
- the marginal effect on the expected value of the latent dependent variable (on E(y*), simply given by the Tobit estimate)
- the marginal effect on the expected value of the dependent variable conditional on its being greater than the lower limit (on E(y|x, y>0) = E(y*|x, y>0))
- the marginal effect on the expected value of the observed (that is, zeros included) dependent variable (on E(y|x), given by Expression 16.35)
- the marginal effect on the probability of the dependent variable exceeding the lower limit

By default, Stata reports the effect on the latent variable, which is exactly the same as the coefficient estimated by -tobit-. You will have to specify the -predict()- option in -mfx- to get the other marginal effects. See -help mfx- and -help tobit postestimation-.

- the marginal effect on the expected value of the latent dependent variable (on E(y*), simply given by the Tobit estimate)
- the marginal effect on the expected value of the dependent variable conditional* on its being uncensored, that is, greater than the lower limit (on E(y|x, y>0) = E(y*|x, y>0)):
  mfx compute, predict(e(0,.))
  mfx compute, predict(e(a,b))
*NB: in all cases the expectation is conditional on the values of the regressors, so do not get confused by the terminology here.

- the marginal effect on the expected value of the observed (that is, zeros included) dependent variable (on E(y|x), given by Expression 16.35):
  mfx compute, predict(ys(0,.))
  mfx compute, predict(ys(a,b))
- the marginal effect on the probability of the dependent variable exceeding the lower limit:
  mfx compute, predict(pr(0,.))
  mfx compute, predict(pr(a,b))
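To make the four marginal effects concrete, here is a minimal Python sketch that evaluates them at a given x for a Tobit model censored from below at zero with one regressor. The function name, the dictionary keys, and the example parameter values are assumptions for illustration; the formulas are the standard ones for the normal-errors Tobit model, and the comments indicate which Stata -predict()- option each is meant to mirror.

```python
import numpy as np
from scipy.stats import norm

def tobit_marginal_effects(b1, b2, sigma, x):
    """Marginal effects of x in a Tobit model censored from below at 0."""
    z = (b1 + b2 * x) / sigma
    lam = norm.pdf(z) / norm.cdf(z)         # inverse Mills ratio
    return {
        # Effect on E(y*|x): just the Tobit coefficient.
        "latent": b2,
        # Effect on E(y|x, y>0), analogous to predict(e(0,.)).
        "truncated": b2 * (1.0 - lam * (z + lam)),
        # Effect on E(y|x), zeros included, analogous to predict(ys(0,.)).
        "censored": b2 * norm.cdf(z),
        # Effect on P(y>0|x), analogous to predict(pr(0,.)).
        "prob": (b2 / sigma) * norm.pdf(z),
    }

# Example with assumed values b1 = -9, b2 = 1, sigma = 4, evaluated at x = 9.
effects = tobit_marginal_effects(-9.0, 1.0, 4.0, 9.0)
for name, value in effects.items():
    print(name, round(float(value), 4))
```

At x = 9 the index is zero under these assumed values, so half of the latent effect passes through to the observed mean; at larger x the censored and truncated effects both move toward the latent coefficient.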

Interval data
• Interval data are data recorded in intervals rather than as a continuous variable.
• Survey data are often collected in this way to make it easier for the respondent and to provide greater anonymity in responses to more personal questions, such as those about income and age.
• Income is often reported in intervals of $10,000 and then top-coded at a figure like $100,000 or $130,000.
• In contingent valuation studies, questions to elicit willingness to pay sometimes ask respondents to choose an interval.
Such data are then censored at multiple points, with the observed data y being only the particular interval in which the unobserved y* lies.

Interval data (continued)
• In these cases you have a multi-censored dependent variable.
• In contingent valuation studies, double-bound dichotomous-choice questions are sometimes used to elicit willingness to pay. In these cases you have a doubly-censored dependent variable with two variable limits.
• You are probably guessing that another (less flexible) way to model these cases is by using an ordered regression model. The ordered probit in particular would be quite close to the interval regression model.
• Stata’s intreg will help with this model.

Syntax of Stata’s intreg:
  intreg depvar1 depvar2 [indepvars] [if] [in] [weight] [, options]
By choosing depvar1 and depvar2 smartly you can also fit other models:

  Type of data                        depvar1   depvar2
  point data          a = [a,a]       a         a
  interval data       [a,b]           a         b
  left-censored data  (-inf,b]        .         b
  right-censored data [a,inf)         a         .
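The coding scheme in the table maps directly onto the likelihood contributions of an interval regression with normal errors. The sketch below is a Python illustration of that mapping, not Stata's implementation: the function name is hypothetical, and `NaN` is used here as a stand-in for Stata's missing value `.`.

```python
import numpy as np
from scipy.stats import norm

def interval_loglik(lower, upper, mu, sigma):
    """Per-observation log-likelihood for interval-coded data.

    lower/upper play the role of depvar1/depvar2; NaN marks an open end.
    Rows with both ends missing carry no information and yield NaN.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    mu = np.broadcast_to(np.asarray(mu, dtype=float), lower.shape)
    ll = np.empty_like(lower)

    point = lower == upper                          # [a, a]
    left = np.isnan(lower) & ~np.isnan(upper)       # (-inf, b]
    right = ~np.isnan(lower) & np.isnan(upper)      # [a, +inf)
    interval = ~point & ~left & ~right              # [a, b] with a < b

    ll[point] = norm.logpdf(lower[point], loc=mu[point], scale=sigma)
    ll[left] = norm.logcdf(upper[left], loc=mu[left], scale=sigma)
    ll[right] = norm.logsf(lower[right], loc=mu[right], scale=sigma)
    ll[interval] = np.log(
        norm.cdf(upper[interval], loc=mu[interval], scale=sigma)
        - norm.cdf(lower[interval], loc=mu[interval], scale=sigma)
    )
    return ll

# One observation of each type, with mean 0 and standard deviation 1.
lo = [0.0, -1.96, np.nan, 0.0]
hi = [0.0, 1.96, 0.0, np.nan]
print(interval_loglik(lo, hi, mu=0.0, sigma=1.0))
```

Summing these contributions over observations, with `mu` replaced by a linear index in the regressors, gives the objective that an interval regression maximizes; a Tobit model censored at zero is the special case that mixes point and left-censored rows.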

Key terms (Principles of Econometrics, 3rd Edition)
• binary choice models
• censored data
• latent variables
• likelihood function
• limited dependent variables
• log-likelihood function
• marginal effect
• maximum likelihood estimation
• multinomial choice models
• ordered choice models
• ordered probit
• ordinal variables
• probit
• tobit model
• truncated data

Further models
• Survival analysis (time-to-event data analysis)

References
• Hoffmann, 2004, for all topics
• Long, S. and J. Freese, for all topics
• Agresti, A. (2001) Categorical Data Analysis (2nd ed). New York: Wiley.