Lecture 14-1 (Wooldridge Ch 17) Linear probability, Probit, and Logit models


Research Method Lecture 14-1 (Wooldridge Ch 17) Linear probability, Probit, and Logit models

Linear probability model (This is from Ch 7) Often the dependent variable is a dummy variable. For example, if you would like to find the determinants of labor force participation, the dependent variable is 1 if the person is working (participates in the labor force) and 0 if the person is not working (does not participate in the labor force).

The simplest way to predict the probability that a person participates in the labor force is to estimate the following model by OLS: inlf=β0+β1x1+…+βkxk+u, where inlf=1 if the person is in the labor force and inlf=0 if the person is not. This type of model is called the linear probability model.

As long as the explanatory variables are not correlated with the error term, this simple method is perfectly fine. However, one caution is necessary. When the dependent variable is a dummy variable, var(u|X)=Xβ[1-Xβ], where Xβ is shorthand notation for (β0+β1x1+…+βkxk). Thus, the variance depends on the values of the x-variables, which means the error is heteroskedastic. So, you should always use robust standard errors.

Exercise Using Mroz.dta, estimate the following linear probability model of labor force participation. inlf=β0+β1nwifeinc+β2educ+β3exper+β4exper2+β5age+β6kidslt6+β7kidsge6+u

Answer
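A minimal Stata sketch of this estimation (assuming the standard Wooldridge Mroz.dta variable names nwifeinc, educ, exper, expersq, age, kidslt6, and kidsge6):

    * linear probability model of labor force participation, with robust standard errors
    use Mroz.dta, clear
    regress inlf nwifeinc educ exper expersq age kidslt6 kidsge6, vce(robust)

The vce(robust) option requests the heteroskedasticity-robust standard errors discussed above.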

One problem with the linear probability model is that the predicted probability can be greater than 1 or smaller than 0. This problem can be avoided by using the Probit or Logit models described below. Nonetheless, note that in many applications the predicted probabilities fall mostly within the [0,1] interval. Thus, the linear probability model is still a reasonable model, and it has been used in many papers.

The probit and logit models Suppose that you would like to know the determinants of the labor force participation of married women. For simplicity, let us consider a model with only one explanatory variable.

In the probit and logit models, we assume that there is an unobserved (latent) variable y* that depends on x: y*=β0+β1x+u. You can think of y* as the utility from participating in the labor force. We do not observe y*; we only observe the actual labor force participation outcome: Y=0 if the person is not working, Y=1 if the person is working.

Then, we assume the following. If Y=0 (the person is not working), then y* must have been negative. If Y=1 (the person is working), then y* must have been nonnegative.

This assumption can be written as: Y=1 if y*≥0 and Y=0 if y*<0. The intuitive interpretation is the following: if the utility from participating in the labor force is negative, the person will not work; if the utility from participating in the labor force is nonnegative, the person will work.

Given data on x and Y, we can compute the likelihood contribution of each person in the data by making a distributional assumption about the error term u. The model is yi*=β0+β1xi+ui, where the i-subscript denotes the observation id.

When we assume that ui follows the standard normal distribution (normal with mean 0 and variance 1), the model is called the Probit model. When we assume that ui follows the logistic distribution, the model is called the Logit model.
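To see what these assumptions mean in practice, here is a small simulated illustration (a sketch, not part of the lecture's exercises; the parameter values 0.5 and 1.0 are arbitrary): we draw a latent y* from a probit data-generating process, observe only the 0/1 outcome, and check that the probit estimator roughly recovers the parameters.

    * hypothetical simulation of the latent-variable model (sketch)
    clear
    set seed 12345
    set obs 1000
    gen x = rnormal()
    gen ystar = 0.5 + 1.0*x + rnormal()   // y* = b0 + b1*x + u, with u ~ N(0,1)
    gen y = (ystar >= 0)                  // we observe only whether y* crosses zero
    probit y x                            // estimates should be close to 0.5 and 1.0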

Probit model: Likelihood function example The model is yi*=β0+β1xi+ui, and we assume that ui~N(0,1). Suppose that you have the following small data set. [Example data table with columns Id, Y, X]

Take the 2nd observation as an example. Since Y2=0 for this observation, we know y2*<0, that is, u2<-β0-β1x2. Thus, the likelihood contribution is L2 = P(y2*<0) = P(u2<-β0-β1x2) = Φ(-β0-β1x2), where Φ(·) denotes the standard normal cumulative distribution function.

Now, take the 3rd observation as an example. Since Y3=1 for this observation, we know y3*≥0, that is, u3≥-β0-β1x3. Thus, the likelihood contribution is L3 = P(y3*≥0) = P(u3≥-β0-β1x3) = 1-Φ(-β0-β1x3) = Φ(β0+β1x3).

Then the likelihood function for the data set is the product of the individual likelihood contributions: L(β0,β1) = L1×L2×…×Ln.

Probit model: Likelihood for the more general case Consider the following model: yi*=β0+β1xi+ui, with ui~N(0,1), for a data set with observations i=1,…,n, observed outcomes Y1,…,Yn, and explanatory variable values x1,…,xn.

Then, we know that P(Yi=0|xi) = P(yi*<0) = P(ui<-β0-β1xi) = Φ(-β0-β1xi) and P(Yi=1|xi) = 1-Φ(-β0-β1xi). The above can be conveniently written as P(Yi=yi|xi) = [1-Φ(-β0-β1xi)]^Yi [Φ(-β0-β1xi)]^(1-Yi). Since the normal distribution is symmetric, we have 1-Φ(-z) = Φ(z). Thus, we have P(Yi=1|xi) = Φ(β0+β1xi) and P(Yi=0|xi) = 1-Φ(β0+β1xi). Thus, the likelihood function is given by L(β0,β1) = ∏i [Φ(β0+β1xi)]^Yi [1-Φ(β0+β1xi)]^(1-Yi), where the product is over i=1,…,n.

In practice, you maximize log(L). The values of the parameters that maximize log(L) are the probit estimates. This maximum likelihood estimation is done automatically by Stata.
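To make the "done automatically" point concrete, here is a sketch of how the same log likelihood could be coded and maximized by hand with Stata's ml command (the variables inlf, educ, and exper are only an illustrative choice; the built-in probit command gives the same estimates):

    * hand-coded probit log likelihood (sketch), maximized with Stata's ml command
    program define myprobit_lf
        args lnf xb
        quietly replace `lnf' = ln(normal(`xb'))  if $ML_y1 == 1
        quietly replace `lnf' = ln(normal(-`xb')) if $ML_y1 == 0
    end
    ml model lf myprobit_lf (inlf = educ exper)
    ml maximize
    * the built-in command maximizes the same log likelihood
    probit inlf educ exper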

The Logit Model: A likelihood function example Consider the following model: yi*=β0+β1xi+ui. In the Logit model, we assume that ui follows the standard logistic distribution (mean 0 and variance π²/3).

The density function of the standard logistic distribution is λ(z) = exp(z)/[1+exp(z)]², and its cumulative distribution function has the closed form Λ(z) = exp(z)/[1+exp(z)].
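As a quick check of this closed form (a sketch evaluated at the arbitrary point z=0.7), the formula gives the same number as Stata's built-in logistic CDF function invlogit():

    display exp(0.7)/(1 + exp(0.7))
    display invlogit(0.7)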

Now, suppose that you have the same small data set as before. Take the 2nd observation as an example. Since Y2=0, it must have been the case that y2*<0, that is, u2<-β0-β1x2. Thus, the likelihood contribution is L2 = Λ(-β0-β1x2) = 1/[1+exp(β0+β1x2)].

Now, take the 3rd observation as an example. Since Y3=1, it must have been the case that y3*≥0, that is, u3≥-β0-β1x3. Thus, the likelihood contribution is L3 = 1-Λ(-β0-β1x3) = Λ(β0+β1x3) = exp(β0+β1x3)/[1+exp(β0+β1x3)].

Thus, the likelihood function for the data set is the product of the individual likelihood contributions: L(β0,β1) = L1×L2×…×Ln.

Logit model: Likelihood for the more general case Consider the following model: yi*=β0+β1xi+ui, where ui follows the standard logistic distribution, for a data set with observations i=1,…,n, observed outcomes Y1,…,Yn, and explanatory variable values x1,…,xn.

Then, we know that P(Yi=1|xi) = Λ(β0+β1xi) = exp(β0+β1xi)/[1+exp(β0+β1xi)] and P(Yi=0|xi) = 1-Λ(β0+β1xi) = 1/[1+exp(β0+β1xi)]. The above can be conveniently written as P(Yi=yi|xi) = [Λ(β0+β1xi)]^Yi [1-Λ(β0+β1xi)]^(1-Yi). Thus, the likelihood function is given by L(β0,β1) = ∏i [Λ(β0+β1xi)]^Yi [1-Λ(β0+β1xi)]^(1-Yi), where the product is over i=1,…,n.

In practice, you maximize log(L). The values of the parameters that maximize log(L) are the logit estimates.

Exercise Consider the following model. y*=β0+β1nwifeinc+β2educ+β3exper+β4exper2+β5age+β6kidslt6+β7kidsge6+u, with inlf=0 if y*<0 (i.e., does not work if y*<0) and inlf=1 if y*≥0 (i.e., works if y*≥0). Using Mroz.dta, estimate the parameters using both the Probit and Logit models.
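A minimal Stata sketch of both estimations (again assuming the standard Mroz.dta variable names):

    * probit and logit estimates of the labor force participation model
    use Mroz.dta, clear
    probit inlf nwifeinc educ exper expersq age kidslt6 kidsge6
    logit  inlf nwifeinc educ exper expersq age kidslt6 kidsge6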

Probit

Logit

Interpreting the parameters Consider the following model: yi*=β0+β1xi+ui. In the probit and logit models, interpretation of the parameters is not straightforward. For expository purposes, suppose that this model describes the determinants of labor force participation.

Probit case Note that Φ(β0+β1xi) = P(Yi=1|xi), the probability that the person participates in the labor force. Therefore, if β1 is positive, an increase in x increases the probability that the person participates in the labor force; if β1 is negative, an increase in x decreases that probability.

Example An increase in non-wife income decreases the probability that the woman works. An increase in education increases the probability that the woman works. An increase in the number of kids younger than 6 decreases the probability that the woman works.

The partial effects of the probit model (continuous variable case) The sign of a parameter tells you whether the probability of labor force participation increases or decreases. We also want to know by how much the probability increases or decreases. This can be done by computing the partial effects (sometimes called the marginal effects).

The increase in the probability due to a one-unit increase in the x-variable can be computed by taking the derivative of Φ(β0+β1xi) with respect to xi. Partial effect (marginal effect) = the increase in the probability that the person works due to a one-unit increase in xi: ∂Φ(β0+β1xi)/∂xi = φ(β0+β1xi)β1. Note that φ(·) here is the standard normal density function, not the cumulative distribution function.

As you can see, the partial effect depends on the value of the x-variable; therefore, it differs across persons in the data. However, we usually want a single summary of the effect of x on the probability. There are two common ways to obtain one: 1. the partial effect at average, and 2. the average partial effect.

Partial effect at average This is the partial effect evaluated at the average value of x. Stata computes it automatically; this is what Stata calls the marginal effect. Average partial effect (APE) You compute the partial effect for each person and then take the average. This is not done automatically, but it is easy to do manually (see the sketch below).
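A sketch of both calculations with Stata's margins command (the two-regressor model with educ and exper is only an illustrative choice; in recent versions of Stata, margins without the atmeans option reports average partial effects directly):

    probit inlf educ exper
    margins, dydx(educ) atmeans    // partial effect of educ at the average of the regressors
    margins, dydx(educ)            // average partial effect of educ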

The partial effect of the probit model (discrete variable case) Consider the following labor force participation model, where Di is a dummy variable equal to 1 if the person lives with her parents and 0 otherwise: yi*=β0+β1xi+β2Di+ui.

You want to know the effect of living with parents on the probability that the woman participates in the labor force. Because this variable is a dummy, the derivative-based partial effect formula described before is not a good approximation. For a dummy variable, a better way to compute the partial effect is given on the next slides.

The partial effect at average for the discrete case is the change in the predicted probability when Di goes from 0 to 1, holding xi at its sample average: Φ(β0+β1x̄+β2) - Φ(β0+β1x̄). This is computed automatically by Stata.

The partial effect for each person i is computed as Φ(β0+β1xi+β2) - Φ(β0+β1xi). Then, the average partial effect is computed as the sample average of these partial effects.

Exercise Using Mroz.dta, estimate the following model: y*=β0+β2educ+β3exper+u, with inlf=0 if y*<0 (i.e., does not work if y*<0), inlf=1 if y*≥0 (i.e., works if y*≥0), and u~N(0,1). Compute the effect of education on labor force participation: compute both the partial effect at average and the average partial effect.

Computing partial effect at average, manually.
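One way to do this by hand in Stata (a sketch; normalden() is Stata's standard normal density function):

    probit inlf educ exper
    summarize educ, meanonly
    scalar educbar = r(mean)
    summarize exper, meanonly
    scalar experbar = r(mean)
    scalar xb_at_mean = _b[_cons] + _b[educ]*educbar + _b[exper]*experbar
    display "PEA of educ = " normalden(xb_at_mean)*_b[educ]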

Computing partial effect at average automatically.
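A sketch using the margins command (older versions of Stata used mfx for the same purpose):

    probit inlf educ exper
    margins, dydx(educ exper) atmeans    // partial effects evaluated at the sample averages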

Computing average partial effect, “manually”.
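A sketch of the manual calculation: compute the partial effect of educ for every observation, then average.

    probit inlf educ exper
    predict xb_hat, xb                     // fitted index b0 + b1*educ + b2*exper
    gen pe_educ = normalden(xb_hat)*_b[educ]
    summarize pe_educ                      // the mean of pe_educ is the APE of educ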

Exercise Use JPSC1.dta to estimate the following model: y*=β0+β2exper+β3(livetogether)+u, with Work=0 if y*<0 (i.e., does not work if y*<0), Work=1 if y*≥0 (i.e., works if y*≥0), and u~N(0,1). Livetogether is a dummy variable equal to 1 for those who live with their parents. Q1. Estimate the effect of living with parents on labor force participation.

Computing partial effect at average, “manually”.
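A sketch of the by-hand calculation for the dummy variable (the lowercase variable names work, exper, and livetogether are assumed from the exercise; normal() is Stata's standard normal CDF):

    probit work exper livetogether
    summarize exper, meanonly
    scalar xb0 = _b[_cons] + _b[exper]*r(mean)    // index at average exper, with livetogether = 0
    display "PEA of livetogether = " normal(xb0 + _b[livetogether]) - normal(xb0)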

Computing the partial effect at average automatically.
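A sketch of the automatic version: declaring livetogether as a factor variable makes margins report the discrete 0-to-1 change in the predicted probability rather than a derivative.

    probit work exper i.livetogether
    margins, dydx(livetogether) atmeans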

Computing the average partial effect “manually”.
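And a sketch of the manual average partial effect for the dummy: for each observation, compare the predicted probability with livetogether switched on versus off, then average the differences.

    probit work exper livetogether
    gen xb_base = _b[_cons] + _b[exper]*exper               // index with livetogether set to 0
    gen pe_live = normal(xb_base + _b[livetogether]) - normal(xb_base)
    summarize pe_live                                       // the mean is the APE of livetogether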