LIMITED DEPENDENT VARIABLE MODELS

LIMITED DEPENDENT VARIABLE MODELS. Copyright © Amrapali Roy Barman, Apurva Dey, Jessica Pudussery, Vasundhara Rungta

Introduction
Truncation: Sample data are drawn from a restricted or limited subset of a larger population. The concern is inferring the characteristics of the full population from the restricted sample.
Censoring: A sample in which information on the regressand is available only for some observations is known as a censored sample.

In economics, such a model was first suggested in a pioneering paper by Tobin in 1958, titled “Estimation of Relationships for Limited Dependent Variables”.
DEMAND FOR DURABLE GOODS: He analyzed household expenditure on durable goods as a function of income, using a regression model that took account of the fact that expenditure cannot be negative.

IMPORTANT POINT: There are several observations where the expenditure is zero. This cluster of observations at zero destroys the linearity assumption, so the least squares method is inappropriate.
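A small simulation can illustrate why least squares breaks down when many observations sit at zero. This is only a sketch under assumed values: the intercept, slope, error spread, sample size and all function names are hypothetical and are not taken from Tobin's data or from the slides.

```python
import random

def ols_slope(xs, ys):
    """Slope of a simple least squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=5000, beta0=-1.0, beta1=2.0, sigma=1.0, seed=0):
    """Draw a latent outcome, censor it at zero, and fit OLS to both."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 2.0) for _ in range(n)]
    # Latent (unrestricted) outcome: linear in x with normal errors.
    latent = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    # Observed outcome: negative latent values are recorded as zero.
    censored = [max(0.0, y) for y in latent]
    return ols_slope(xs, latent), ols_slope(xs, censored)

slope_latent, slope_censored = simulate()
print("OLS slope on latent data:  ", round(slope_latent, 3))
print("OLS slope on censored data:", round(slope_censored, 3))
```

In this setup the slope fitted to the censored data comes out well below the assumed true slope of 2, while the slope fitted to the latent data stays close to it.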

Example of censored model: Charitable contributions [Reece (1979)]
Example of truncated model: Suppose we have a sample of AIEEE rejects, those who scored below the 30th percentile. We wish to estimate an IQ equation AIEEE = f(education, age, socio-economic characteristics).
Some other examples:
Number of extramarital affairs [Fair (1977, 1978)]
Number of arrests after release from prison [Witte (1980)]
Annual marketing of new chemical entities [Wiggins (1981)]
Number of hours worked by a woman in the labour force [Quester and Greene (1982)]

AIM OF THE PROJECT: STUDYING LIMITED DEPENDENT VARIABLES
1. Study censored model: We regress pension on age, education, tenure, experience and number of dependents. The data are censored because many people do not receive a pension, so for them pension = 0 in the data.
2. Study truncated model: We regress the GATE (a special programme) score on the language test score and mathematics test score that students received prior to taking the GATE. Students enter the GATE program only if they receive a minimum GATE score of 40, so the model is truncated.

METHODOLOGY AND THEORY
A limited dependent variable Y is defined as a dependent variable whose range is substantively restricted.
In the usual linear regression model we write Yi = β’Xi + ui, where ui ~ N(0, σ²) => Yi ~ N(β’Xi, σ²) => -∞ < Yi < ∞.
However, in many economic applications Yi does not satisfy this condition. Mostly we have Yi ≥ 0. Example: working hours, where 0 ≤ Yi ≤ 24. More generally, a ≤ Yi ≤ b.

To handle this problem, there are two methods:
Non-linear specification: We write Yi = e^(β’Xi + ui). However, inference is a problem in this method.
Latent variable framework: We write Yi* = β’Xi + ui, ui ~ N(0, σu²), with
Yi = Yi* if Yi* > 0
Yi = 0 if Yi* ≤ 0

The two types of limited dependent variable models are:
Censoring: Censoring occurs when the values of the dependent variable are restricted to a range of values, i.e. we observe both Yi = 0 and Yi > 0. When data are censored, the distribution that applies to the sample data is a mixture of a discrete and a continuous distribution. The total probability is 1 as required, so we simply assign the full probability in the censored region to the censoring point, in this case 0.
[Figure: censored distribution, with probability mass P(Yi = 0) at the censoring point and a continuous part with probability P(Yi > 0)]
Truncation: In a truncated model we observe only Yi > 0. Here the density that remains after truncation is rescaled (divided by the probability of being observed) so that its total area is 1.
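The difference between the two sampling schemes can be made concrete with a toy example. The sketch below uses made-up latent values and hypothetical function names: censoring keeps every observation but records non-positive outcomes as 0, while truncation drops those observations altogether.

```python
def censor_at_zero(latent):
    """Censoring: every unit stays in the sample, non-positive outcomes become 0."""
    return [y if y > 0 else 0.0 for y in latent]

def truncate_at_zero(latent):
    """Truncation: units with non-positive outcomes never enter the sample."""
    return [y for y in latent if y > 0]

# Made-up latent outcomes, purely for illustration.
latent = [-1.5, 0.4, 2.0, -0.2, 3.1]

censored = censor_at_zero(latent)
truncated = truncate_at_zero(latent)

print(len(censored), censored)    # same sample size as the latent data
print(len(truncated), truncated)  # smaller sample, positive values only
```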

CENSORED MODEL ESTIMATION: Estimation of the equation Yi = β’Xi + ui by OLS generates inconsistent estimates of β. Mathematical explanation:
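The slide's own derivation is not reproduced in the transcript. A sketch of the standard argument, using the latent variable framework with normal errors and writing \phi and \Phi for the standard normal density and distribution function (symbols assumed here), is:

```latex
\begin{align*}
E[Y_i \mid Y_i > 0]
  &= \beta' X_i + E[u_i \mid u_i > -\beta' X_i] \\
  &= \beta' X_i + \sigma \, \lambda\!\left(\frac{\beta' X_i}{\sigma}\right),
  \qquad \lambda(z) = \frac{\phi(z)}{\Phi(z)}, \\[4pt]
E[Y_i]
  &= P(Y_i > 0)\, E[Y_i \mid Y_i > 0]
   = \Phi\!\left(\frac{\beta' X_i}{\sigma}\right) \beta' X_i
     + \sigma \, \phi\!\left(\frac{\beta' X_i}{\sigma}\right).
\end{align*}
```

Neither conditional mean is linear in X_i. OLS on the positive observations omits the term in \lambda, and OLS on all observations fits a straight line to a non-linear function, so in both cases the estimate of \beta is inconsistent.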

The Inverse Mills Ratio or the hazard rate: For the standard normal distribution, it is the ratio of the probability density function to the cumulative distribution function, λ(z) = φ(z)/Φ(z). A common application of the inverse Mills ratio is to take account of a possible selection bias.
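As a rough numerical illustration, the ratio can be computed with Python's standard library. The function names below are hypothetical, and the second function assumes the latent variable model with normal errors, where the mean of the positive observations is β’Xi + σ λ(β’Xi/σ).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def inverse_mills_ratio(z):
    """Ratio of the standard normal density to its distribution function at z.

    For very negative z the denominator underflows to 0, so this simple
    version is only meant for moderate values of z.
    """
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)

def mean_of_positive_part(xb, sigma):
    """E[Y | Y* > 0] = xb + sigma * lambda(xb / sigma) under normal errors."""
    return xb + sigma * inverse_mills_ratio(xb / sigma)

for z in (-1.0, 0.0, 1.0):
    print(z, round(inverse_mills_ratio(z), 4))
```

The ratio falls as z rises, so the correction term matters most for observations with a low probability of being positive.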

Intuitive explanation: Intuitively, the resulting intercept and slope coefficients are bound to differ from those we would obtain if all the observations were taken into account.

Maximum likelihood estimation (ML): Censored regression models are usually estimated by the maximum likelihood (ML) method.
Observations: Li is a mixture of probability and density. It depends on β and σ.
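The likelihood itself is not shown in the transcript. For censoring at zero with normal errors, the standard Tobit likelihood can be written as below; \phi and \Phi (standard normal density and distribution function) are notation assumed here.

```latex
\begin{align*}
L(\beta, \sigma)
  &= \prod_{Y_i = 0} \left[ 1 - \Phi\!\left(\frac{\beta' X_i}{\sigma}\right) \right]
     \prod_{Y_i > 0} \frac{1}{\sigma}\,
       \phi\!\left(\frac{Y_i - \beta' X_i}{\sigma}\right), \\[4pt]
\log L(\beta, \sigma)
  &= \sum_{Y_i = 0} \log\!\left[ 1 - \Phi\!\left(\frac{\beta' X_i}{\sigma}\right) \right]
   + \sum_{Y_i > 0} \left[ -\log \sigma
       + \log \phi\!\left(\frac{Y_i - \beta' X_i}{\sigma}\right) \right].
\end{align*}
```

The first product is a probability (the chance that an observation is censored) and the second is a density for the uncensored observations, which is the mixture the slide refers to.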

Non-linear estimation: Non-linear estimation gives highly non-linear equations that are difficult to solve.
Difference between ML and NLE? In ML we assume ui ~ N(0, σ²). In NLE we assume only independence of the errors, with no assumption on their distribution.

HECKMAN’S 2-STEP PROCEDURE: A popular alternative to maximum likelihood estimation of the tobit model is Heckman’s two-step, or correction, method.
Step 1: Use the probit estimate to compute an estimate of (β/σ).
Step 2: For positive observations of Y, run a regression of Yi on X1i and X2i.
We get estimates that are consistent but not efficient.
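In the usual textbook form of the two steps, which may differ in detail from the slide's X1i and X2i notation, the second-step regression adds the estimated inverse Mills ratio as an extra regressor. A sketch, with D_i, \hat{\lambda}_i and v_i as assumed notation:

```latex
\begin{align*}
\text{Step 1 (probit):}\quad
  & D_i = \mathbf{1}[Y_i > 0], \qquad
    P(D_i = 1 \mid X_i) = \Phi\!\left(\frac{\beta' X_i}{\sigma}\right)
    \;\Rightarrow\; \widehat{\beta/\sigma}, \\[4pt]
  & \hat{\lambda}_i
    = \frac{\phi\big((\widehat{\beta/\sigma})' X_i\big)}
           {\Phi\big((\widehat{\beta/\sigma})' X_i\big)}, \\[4pt]
\text{Step 2 (OLS, } Y_i > 0\text{):}\quad
  & Y_i = \beta' X_i + \sigma \hat{\lambda}_i + v_i .
\end{align*}
```

The coefficient on \hat{\lambda}_i estimates \sigma. The second-step errors v_i are heteroskedastic, which is one reason the procedure is consistent but not efficient.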

TRUNCATED MODEL ESTIMATION: Regressing Yi on Xi produces an inconsistent estimate of β because of omitted variable bias. Heckman’s two-step method is not possible, as we cannot turn this into a probit model and get an estimate of (β/σ). Non-linear estimation is also difficult to do.
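Maximum likelihood for the truncated model rests on the truncated normal density. The sketch below assumes truncation from below at zero and normal errors; reading the "N" and "D" in the slide's maximum likelihood steps as the numerator and denominator of this density is an interpretation, not something the transcript states.

```latex
\begin{align*}
f(Y_i \mid Y_i > 0)
  &= \frac{\dfrac{1}{\sigma}\,\phi\!\left(\dfrac{Y_i - \beta' X_i}{\sigma}\right)}
          {\Phi\!\left(\dfrac{\beta' X_i}{\sigma}\right)}, \\[6pt]
\log L(\beta, \sigma)
  &= \sum_i \left[
       \log\!\left\{ \frac{1}{\sigma}\,
         \phi\!\left(\frac{Y_i - \beta' X_i}{\sigma}\right) \right\}
       - \log \Phi\!\left(\frac{\beta' X_i}{\sigma}\right)
     \right].
\end{align*}
```

Dividing by \Phi(\beta' X_i / \sigma) is what rescales the density so that it integrates to 1 over the observed range.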

Maximum Likelihood:
Step 2: Maximise Σ(log N - log D)
Step 3: Iterate to convergence
TWO-LIMIT TOBIT MODEL:
Yi* = β’Xi + ui
Yi = L1 if Yi* ≤ L1
Yi = Yi* if L1 < Yi* < L2
Yi = L2 if Yi* ≥ L2

LIKELIHOOD FUNCTION
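The formula for this slide is missing from the transcript. For the two-limit tobit model with normal errors, the standard likelihood has one term for each limit and one for the interior observations (\phi and \Phi are assumed notation for the standard normal density and distribution function):

```latex
\begin{align*}
L(\beta, \sigma)
  = \prod_{Y_i = L_1} \Phi\!\left(\frac{L_1 - \beta' X_i}{\sigma}\right)
    \prod_{L_1 < Y_i < L_2} \frac{1}{\sigma}\,
      \phi\!\left(\frac{Y_i - \beta' X_i}{\sigma}\right)
    \prod_{Y_i = L_2} \left[ 1 - \Phi\!\left(\frac{L_2 - \beta' X_i}{\sigma}\right) \right].
\end{align*}
```

Setting L_1 = 0 and letting L_2 go to infinity recovers the one-limit tobit likelihood for censoring at zero.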

SAS OUTPUT AND RESULT INTERPRETATION

Censored model
Dependent variable:
pension: $ value of employee pension
Explanatory variables:
exper: years of work experience
age: age in years
tenure: years with current employer
educ: years of schooling
depends: number of dependents
The sample is censored, with the lower boundary at 0. We have 616 observations in the sample.

PROC QLIM
The QLIM procedure analyzes limited dependent variable models in which dependent variables take discrete values or a continuous range of values. We use it for models in which the dependent variable is censored or truncated from below, from above, or both. QLIM uses maximum likelihood estimation.
The model is estimated by specifying the endogenous variable to be truncated or censored. The limits of the dependent variable can be specified with the CENSORED or TRUNCATED option in the ENDOGENOUS or MODEL statement when the data are limited by specific values or variables.
The lb= (or ub=) option on the ENDOGENOUS statement indicates the value at which the left (or right) censoring or truncation takes place.

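Censored regression: maximum likelihood SAS commands
  /* data sasuser.censoreddata;  (the rest of this data step is not shown on the slide; run as written it would overwrite the data set) */
  proc qlim data=sasuser.censoreddata;
    model pension = exper age tenure educ depends;
    endogenous pension ~ censored(lb=0);   /* censored from below at 0 */
  run;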

Censored Regression Maximum Likelihood Results
The coefficients of all the explanatory variables have the signs expected a priori and are statistically significant at the 5% level of significance. An increase in experience, educational attainment, tenure or the number of dependents leads to an increase in expected pension, and an increase in age leads to a decrease in the expected value of pension received. Educational attainment contributes most to the increase in expected pension.

Heckman’s method
We specify exactly two MODEL statements when we use this method. One of the models must be a binary probit model; therefore, we must specify the DISCRETE option in the MODEL or ENDOGENOUS statement. Selection for the second model is based on the binary probit model; therefore, we must specify the SELECT option for this model.

Censored regression: Heckman commands
  data sasuser.heck1;
    set sasuser.heck;
    sel = (pension ~= 0);   /* selection indicator: 1 if pension is non-zero */
  run;
  proc qlim data=sasuser.heck1;
    model sel = exper age tenure educ depends / discrete;
    model pension = exper age tenure educ depends / select(sel=1);
  run;

Censored Regression Heckman’s Method Results
The coefficients of all the explanatory variables in our model have the signs expected a priori from the theory.

Truncated Model
Dependent variable:
achiv: the achievement score of the students in the GATE program, which is truncated at the score of 40 since students need a minimum score of 40 to enter the program.
Explanatory variables:
langscore: language score
mathscore: maths score
We have 178 observations in the sample.

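Truncated regression: maximum likelihood commands
  /* data sasuser.truncateddata;  (the rest of this data step is not shown on the slide; run as written it would overwrite the data set) */
  proc qlim data=sasuser.truncateddata;
    model achiv = langscore mathscore;
    endogenous achiv ~ truncated(lb=40);   /* truncated from below at 40 */
  run;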

Truncated regression: maximum likelihood results
The coefficients of all the explanatory variables in our model have the signs expected a priori and are statistically significant at the 1% level of significance. An increase in either the language score or the mathematics score of an individual leads to an increase in achievement in GATE.

Truncated regression: summary statistics and histogram commands
  proc means data=sasuser.truncateddata;
    var achiv langscore mathscore;
  run;
  proc sgplot data=sasuser.truncateddata;
    histogram achiv / scale=count showbins;
    density achiv;
  run;

Histogram of the truncated data

Truncated model: summary statistics and histogram results
The summary statistics of the continuous outcome variable include the mean of achiv and its standard error. The minimum observed value of achiv is 41, which is consistent with truncation at 40. The histogram shows this truncation.

CONCLUSION
Truncated and censored models have a wide range of economic applications, such as the asset holding model of Rosett, the dividend payment model, hazard analysis, etc.

THANK YOU