Chapter 13: Limited Dependent Variables. Zongyi ZHANG, College of Economics and Business Administration

1. Linear Probability Model

Introduction. Sometimes we have a situation where the dependent variable is qualitative in nature: it takes on two (or more) mutually exclusive values. Examples: whether or not a person is in the labor force; union membership.

Linear Probability Model. Examine the choice of whether a family owns a house: Y_i = b1 + b2 X_i + u_i, where Y_i = 1 if the family owns a house, Y_i = 0 if the family does not own a house, and X_i = family income.

Linear Probability Model. We can estimate such a model by OLS, but the results have several problems, discussed below. This is called a linear probability model because E(Y_i | X_i) is the conditional probability that the event (owning a house) will occur given X_i (family income). A minimal estimation sketch follows.
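A minimal sketch of fitting such a linear probability model by OLS, assuming simulated data; the variable names (income, owns_house), the simulated coefficients, and the use of statsmodels are illustrative choices, not part of the slides.

```python
# Linear probability model estimated by OLS on simulated home-ownership data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
income = rng.uniform(10, 100, size=500)                      # hypothetical family income
owns_house = rng.binomial(1, np.clip(0.01 * income, 0, 1))   # binary Y_i

X = sm.add_constant(income)          # adds the intercept b1
lpm = sm.OLS(owns_house, X).fit()
print(lpm.params)                    # estimates of b1 and b2
fitted = lpm.predict(X)              # fitted "probabilities" E(Y_i | X_i)
print(fitted.min(), fitted.max())    # these can fall outside [0, 1]
```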

Derivation. Taking the expected value of the model gives E(Y_i | X_i) = b1 + b2 X_i, since E(u_i) = 0. Let P_i be the probability that Y_i = 1 (the event occurs); then 1 - P_i is the probability that Y_i = 0. Then, by the definition of a mathematical expectation, E(Y_i | X_i) = 0(1 - P_i) + 1(P_i) = P_i.

Derivation. So E(Y_i | X_i) = b1 + b2 X_i = P_i: the conditional expectation of the model equals the conditional probability that the event occurs.
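The two derivation slides can be restated compactly in the same notation (nothing new is added here):

```latex
\begin{align*}
E(Y_i \mid X_i) &= 0\cdot(1-P_i) + 1\cdot P_i = P_i,\\
E(Y_i \mid X_i) &= b_1 + b_2 X_i
  \quad\Longrightarrow\quad P_i = b_1 + b_2 X_i .
\end{align*}
```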

Problems with LPM. The error term is not normally distributed but follows a two-point (Bernoulli) distribution. OLS does not require normally distributed errors for estimation, but normality is assumed for the purposes of hypothesis testing.

Problems with LPM. However, we cannot assume normality for the error term here, because u_i takes on only two values: when Y_i = 1, u_i = 1 - b1 - b2 X_i; when Y_i = 0, u_i = -b1 - b2 X_i. So u_i is not normally distributed but follows a two-point distribution. Note that the OLS point estimates still remain unbiased, and as n rises the estimators tend toward normality.

Problems with LPM. The error term is heteroskedastic. Although E(u_i) = 0, the errors are not homoskedastic: var(u_i) = E(Y_i | X_i)[1 - E(Y_i | X_i)] = P_i(1 - P_i). This is heteroskedastic because the conditional expectation of Y_i, and therefore the error variance, depends on the value taken by X_i.
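The variance formula asserted above follows from the two values u_i can take; a short derivation in the slides' notation (the intermediate steps are mine):

```latex
\begin{align*}
u_i &=
\begin{cases}
1 - b_1 - b_2 X_i = 1 - P_i, & \text{with probability } P_i,\\
-\,b_1 - b_2 X_i = -P_i,     & \text{with probability } 1 - P_i,
\end{cases}\\
\operatorname{var}(u_i) &= E(u_i^2)
  = (1-P_i)^2 P_i + P_i^2 (1-P_i)
  = P_i(1-P_i).
\end{align*}
```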

Problems with LPM. What does this imply? Under heteroskedasticity, OLS estimators are unbiased but not efficient: they do not have minimum variance. We can correct for the heteroskedasticity by transforming the data with the weight w_i = P_i(1 - P_i), i.e., dividing each observation by the square root of w_i. This eliminates the heteroskedasticity.

Problems with LPM. In practice we do not know the true probabilities, so we estimate them: (a) run OLS on the original model; (b) obtain the fitted values Ŷ_i and construct w_i = Ŷ_i(1 - Ŷ_i); (c) run OLS on the transformed (weighted) data. A sketch of this two-step procedure follows.
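A sketch of this two-step weighted least squares procedure, on the same kind of simulated data as the earlier OLS sketch; the clipping of the fitted values is an added safeguard on my part, not something stated on the slide.

```python
# Two-step WLS correction for the LPM's heteroskedasticity (simulated data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
income = rng.uniform(10, 100, size=500)
owns_house = rng.binomial(1, np.clip(0.01 * income, 0, 1))
X = sm.add_constant(income)

# (a)-(b): OLS on the original model, then weights from the fitted values
p_hat = sm.OLS(owns_house, X).fit().predict(X)
p_hat = np.clip(p_hat, 0.01, 0.99)     # safeguard: fitted values may leave (0, 1)
w = p_hat * (1 - p_hat)                # estimate of var(u_i) = P_i(1 - P_i)

# (c): regression on the transformed data; statsmodels takes inverse-variance weights
wls = sm.WLS(owns_house, X, weights=1.0 / w).fit()
print(wls.params)
```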

Problems with LPM. Fitted probabilities falling outside 0 and 1 are the main problem with the LPM. Although in theory P(Y_i = 1 | X_i) lies between 0 and 1, there is no guarantee that the fitted values from the linear model will. One remedy is to estimate by OLS and, where the estimated probabilities lie outside these bounds, set them to 0 or 1.

Problems with LPM. Alternatively, use a probit or logit model, which guarantees that the estimated probabilities fall between these limits (graph on slide).

Problems with LPM. The LPM assumes that the probability increases linearly with the explanatory variables: each unit increase in an X has the same effect on the probability that Y occurs, regardless of the level of X. It is more realistic to assume a smaller effect when the probability is already near 0 or 1, which is the assumption probit and logit make; see the comparison below.
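The contrast can be made explicit by comparing marginal effects; the probit line anticipates the model introduced in the next sections:

```latex
\begin{align*}
\text{LPM:}    &\quad \frac{\partial P_i}{\partial X_i} = b_2
               \quad\text{(the same at every level of } X_i\text{)},\\
\text{Probit:} &\quad \frac{\partial P_i}{\partial X_i} = b_2\,\phi(b_1 + b_2 X_i),
               \quad\text{which shrinks toward } 0 \text{ as } P_i \to 0 \text{ or } 1,
\end{align*}
where $\phi(\cdot)$ is the standard normal density.
```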

2. CDF

Introduction. Probit and logit have an S-shaped probability function. As X increases, the probability of Y increases but never steps outside the 0-1 interval. The relationship between the probability of Y and X is nonlinear: the probability approaches zero at slower and slower rates as X gets small,

Introduction. and it approaches one at slower and slower rates as X gets large. The S-shaped curve can be modeled by a cumulative distribution function (CDF). The CDF of a random variable X is F(x) = P(X ≤ x): it measures the probability that X takes a value less than or equal to a given x.

Introduction. (Graph of F(X) against X on slide.) The CDFs most commonly chosen are the logistic function (logit) and the cumulative normal (probit). Logit and probit are quite different models with different interpretations: the logistic distribution has flatter tails and approaches the axes more slowly, as the numeric comparison below illustrates.
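A small numeric comparison of the two CDFs, evaluated at a few arbitrary points with scipy; the evaluation points and the use of scipy are my own choices, not part of the slides.

```python
# Compare the logit (logistic) and probit (cumulative normal) CDFs at a few points.
import numpy as np
from scipy.stats import logistic, norm

z = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
print("probit F(z):", norm.cdf(z))      # cumulative standard normal
print("logit  F(z):", logistic.cdf(z))  # logistic function
# Both stay strictly inside (0, 1); the logistic CDF has fatter tails,
# so it approaches 0 and 1 more slowly than the normal CDF.
```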

3. Probit

Introduction. Suppose the decision to join a union depends on some unobserved index Z_i, the "propensity to join," for each individual. We do not observe this propensity to join; we just observe union membership or not. So we only observe the dummy variable D,

Introduction. defined as: D = 0 if a worker is nonunion; D = 1 if a worker is a union member. Behind this "observed" dummy variable is the "unobserved" index. Assume Z depends on explanatory variables such as the wage, so Z_i = b1 + b2 X_i, where X_i is the wage of the i-th individual.

Introduction. Each individual's Z index can be expressed as a function of an intercept term and the wage with its attached coefficient (in reality there would be many X's, not just the wage). Suppose there is a critical or threshold level of Z, call it Z_i*: if Z_i > Z_i*, the individual will join; otherwise they will not.

Introduction. Assume Z_i* is distributed normally with the same mean and variance as Z_i. What is the probability that Z_i > Z_i*? In other words, what is the probability that this individual will join?

Introduction. P_i, the probability of joining, is measured by the area under the standard normal curve from -∞ to Z_i, that is, P_i = F(Z_i), where F is the standard normal CDF. Individuals are at different points along this function and have different critical values pushing them into joining, depending on their characteristics.
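A tiny numeric illustration of this statement, using three hypothetical index values; the numbers are arbitrary.

```python
# P_i = F(Z_i): area under the standard normal curve from -infinity to Z_i.
from scipy.stats import norm

for z in (-1.5, 0.0, 1.5):
    print(f"Z_i = {z:+.1f}  ->  P_i = {norm.cdf(z):.3f}")
```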

Introduction. How do we estimate Z_i? Use the inverse of the cumulative normal function: Z_i = F^{-1}(P_i) = b1 + b2 X_i. A sketch of probit estimation along these lines follows.
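A minimal sketch of probit estimation with statsmodels on simulated union-membership data; the variable names (wage, union) and the true coefficient values are illustrative assumptions, not from the slides.

```python
# Probit: P(union = 1 | wage) = F(b1 + b2 * wage), with F the standard normal CDF.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(1)
wage = rng.uniform(5, 40, size=500)                    # hypothetical wages
union = rng.binomial(1, norm.cdf(-2.0 + 0.1 * wage))   # simulated membership

X = sm.add_constant(wage)
probit = sm.Probit(union, X).fit()
print(probit.params)              # estimates of b1 and b2
z_hat = X @ probit.params         # fitted index Z_i = F^{-1}(P_i) = b1 + b2*X_i
print(norm.cdf(z_hat)[:5])        # fitted probabilities P_i for the first few workers
```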