Linear Probability and Logit Models (Qualitative Response Regression Models) SA Quimbo September 2012.



Dummy dependent variables
The dependent variable is qualitative in nature; for example, it takes only two possible values, 0 or 1.
Examples:
- Labor force participation
- Insurance decision
- Voter's choice
- School enrollment decision
- Union membership
- Home ownership
The predicted dependent variable is interpreted as an estimated probability.

Dummy dependent variables (2): discrete choice models
- The agent chooses among discrete alternatives: {commute, walk}.
- The utility-maximizing choice solves Max [U(commute), U(walk)].
- Utility levels are not observed, but choices are, so use a dummy variable for the actual choice.
- Estimate a demand function for public transportation, where Y = 1 if the individual chose to commute and 0 otherwise.

Binary Dependent Variables
Suppose we were to predict whether NFL football teams win individual games, using the reported point spread from sports gambling authorities. For example, if the Packers have a spread of 6 against the Dolphins, the gambling authorities expect the Packers to lose by no more than 6 points.

Binary Dependent Variables (cont.)
Using the techniques we have developed so far, we might regress the win indicator, D_i^Win, on the point spread. How would we interpret the coefficients and predicted values from such a model?

Binary Dependent Variables (cont.)
D_i^Win is either 0 or 1. It does not make sense to say that a 1-point increase in the spread increases D_i^Win by β1. D_i^Win can change only from 0 to 1 or from 1 to 0. Instead of predicting D_i^Win itself, we predict the probability that D_i^Win = 1.

Binary Dependent Variables (cont.)
It can make sense to say that a 1-point increase in the spread increases the probability of winning by β1. Our predicted values of D_i^Win are probabilities of winning.

Linear Probability Model (LPM)
Consider the following model: Y_i = β1 + β2 X_i + u_i, where i indexes families, Y_i = 1 if the family owns a house and 0 otherwise, and X_i = family income. The dichotomous variable is written as a linear function of X_i.

LPM (2)
The predicted values of Y_i can be interpreted as the estimated probability of owning a house, conditional on income, i.e., E(Y_i | X_i) = Pr(Y_i = 1 | X_i).

LPM (3)
Let P_i = probability that Y_i = 1, so the probability that Y_i = 0 is 1 − P_i. What is E(Y_i)?
E(Y_i) = (1)(P_i) + (0)(1 − P_i) = P_i
The linear probability model is Y_i = β1 + β2 X_i + u_i.

LPM (4)
Assuming E(u_i) = 0, then E(Y_i | X_i) = β1 + β2 X_i, or P_i = β1 + β2 X_i, where 0 ≤ P_i ≤ 1.
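As a minimal sketch of how the LPM is fit in practice, the snippet below runs OLS on a simulated 0/1 outcome; the variable names (income, owns_house) and the data-generating process are illustrative, not the lecture's actual dataset.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
income = rng.uniform(5, 40, size=200)                                # hypothetical family income
owns_house = (income + rng.normal(0, 8, size=200) > 20).astype(int)  # 0/1 home-ownership dummy

X = sm.add_constant(income)                  # regressors: [1, X_i]
lpm = sm.OLS(owns_house, X).fit()
print(lpm.params)                            # estimates of beta1, beta2
p_hat = lpm.predict(X)                       # fitted values ~ estimated probabilities
print((p_hat < 0).sum(), (p_hat > 1).sum())  # counts of out-of-bounds "probabilities"
```

The last line previews the boundedness problem discussed below: nothing forces the fitted values to stay in [0, 1].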

Problems in Estimating the LPM
Non-normality of disturbances: from Y_i = β1 + β2 X_i + u_i, we have u_i = Y_i − β1 − β2 X_i.
If Y_i = 1: u_i = 1 − β1 − β2 X_i; if Y_i = 0: u_i = −β1 − β2 X_i.
The u_i's are binomially distributed. OLS estimates remain unbiased, and as the sample size increases the u_i's tend toward normality.

Problems (2): heteroskedastic disturbances
var(u_i) = E[u_i − E(u_i)]² = E(u_i²)
= (1 − β1 − β2 X_i)² P_i + (−β1 − β2 X_i)² (1 − P_i)
= (1 − β1 − β2 X_i)² (β1 + β2 X_i) + (β1 + β2 X_i)² (1 − β1 − β2 X_i)
= (β1 + β2 X_i)(1 − β1 − β2 X_i)
= P_i (1 − P_i)
Thus var(u_i) varies with X_i.

Problems (3)
Transform the model so that the transformed disturbances are homoskedastic: let w_i = P_i(1 − P_i) and divide the regression through by √w_i (weighted least squares).
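A sketch of this weighted-least-squares correction, under the same hypothetical setup as the earlier LPM snippet: first estimate P_i by OLS, then reweight each observation by 1/[P_i(1 − P_i)].

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
income = rng.uniform(5, 40, size=200)
owns_house = (income + rng.normal(0, 8, size=200) > 20).astype(int)
X = sm.add_constant(income)

# Step 1: OLS fitted probabilities, clipped so the weights stay well-defined.
p0 = np.clip(sm.OLS(owns_house, X).fit().predict(X), 0.01, 0.99)
w = p0 * (1 - p0)                                    # estimated var(u_i) = P_i(1 - P_i)

# Step 2: WLS, i.e., dividing the model through by sqrt(w_i).
lpm_wls = sm.WLS(owns_house, X, weights=1.0 / w).fit()
print(lpm_wls.params)
```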

Problems (4)
R² may not be a good measure of model fit.
(Figure: home ownership (0 or 1) plotted against income, with the fitted SRF; the observations cluster along the two horizontal lines Y = 0 and Y = 1.)

Problems (5)
The assumed bounds (0 ≤ E(Y_i | X_i) ≤ 1) can be violated. In Gujarati's example (see next slide), six estimated values are negative and six exceed one.

Example
Hypothetical data on home ownership (Gujarati, p. 588).

Linear vs. Nonlinear Probability Models
(Figure: P plotted against X between 0 and 1, comparing the straight-line SRF of the LPM example, with a slope of 0.10, against the S-shaped CDF of a logistically or normally distributed random variable.)

Binary Dependent Variables (cont.)
We need a procedure to translate our linear regression results into true probabilities: a function that takes a value from −∞ to +∞ and returns a value from 0 to 1.

Binary Dependent Variables (cont.)
We want a translator such that:
- The closer the value from our linear regression model is to −∞, the closer the predicted probability is to 0.
- The closer the value is to +∞, the closer the predicted probability is to 1.
- No predicted probabilities are less than 0 or greater than 1.

Figure 19.2: A Graph of Probability of Success and X

Binary Dependent Variables (cont.)
How can we construct such a translator? How can we estimate it?

Probit/Logit Models (Chapter 19.2)
In common practice, econometricians use two such "translators": probit and logit. The differences between the two models are subtle; for present purposes there is no practical difference between them.

Probit/Logit Models
Both the probit and logit models have the same basic structure:
1. Estimate a latent variable Z using a linear model; Z ranges from negative infinity to positive infinity.
2. Use a nonlinear function to transform Z into a predicted Y value between 0 and 1.

Probit/Logit Model (cont.)
Suppose there is some unobserved continuous variable Z that can take on values from negative infinity to infinity. The higher E(Z) is, the more probable it is that a team will win, a student will graduate, or a consumer will purchase a particular brand.

Probit/Logit Model (cont.)
We call an unobserved variable Z that we use for intermediate calculations a latent variable.

Deriving Probit/Logit
Write the latent variable as Z_i = β0 + β1 X_i + u_i, and suppose we observe Y_i = 1 when Z_i > 0 and Y_i = 0 otherwise.

Deriving Probit/Logit (cont.)
Note: the assumption that the breakpoint falls at 0 is arbitrary; β0 can adjust for whichever breakpoint you might choose to set.

Deriving Probit/Logit (cont.)
We assume we know the distribution of u_i. In the probit model, u_i is assumed to follow the standard normal distribution; in the logit model, u_i is assumed to follow the logistic distribution.

Probit model (one explanatory variable)

Probit model (cont.)

Steps for a probit model

Estimating a Probit/Logit Model (Chapter 19.2)
In practice, how do we implement a probit or logit model? Either model is estimated using a statistical method called the method of maximum likelihood.

Alternative Estimation Method: ungrouped/individual data
Maximum likelihood estimation: choose the values of the unknown parameters (β1, β2) such that the probability of observing the given Y's is as high as possible.

MLE
Recall that the P's are not observed but the Y's are: Pr(Y_i = 1) = P_i and Pr(Y_i = 0) = 1 − P_i. The joint probability of observing the n Y values is
f(Y_1, …, Y_n) = Π_{i=1,…,n} P_i^{Y_i} (1 − P_i)^{1 − Y_i}

MLE (Gujarati, p. 634)
Danao (2013), p. 485: "Under standard regularity conditions, maximum likelihood estimators are consistent, asymptotically normal, and asymptotically efficient. In other words, in large samples, maximum likelihood estimators are consistent, normal, and best."

MLE (2)
Taking the natural logarithm gives the log-likelihood function:
ln f(Y_1, …, Y_n) = Σ Y_i (β1 + β2 X_i) − Σ ln[1 + exp(β1 + β2 X_i)]
Maximize the log-likelihood function by choosing (β1, β2).
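A minimal sketch of this maximization with simulated data: the log likelihood above is coded directly and handed to a numerical optimizer. The data-generating values (−4.0 and 0.2) are hypothetical; a canned routine such as statsmodels' Logit gives the same estimates.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.uniform(5, 40, size=500)
z_true = -4.0 + 0.2 * x                                    # hypothetical true index
y = (rng.uniform(size=500) < 1 / (1 + np.exp(-z_true))).astype(int)

def neg_loglik(b):
    # ln f = sum Y_i (b1 + b2 X_i) - sum ln[1 + exp(b1 + b2 X_i)]
    z = b[0] + b[1] * x
    return -(np.sum(y * z) - np.sum(np.log1p(np.exp(z))))

res = minimize(neg_loglik, x0=[0.0, 0.0])
print(res.x)                                               # MLEs of beta1, beta2
```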

Example: individual data

Interpreting the results
- An iterative procedure is used to reach the maximum of the log-likelihood function.
- Use Z (the standard normal variable) instead of t.
- The pseudo-R² is a more meaningful alternative to R²; or use the count R².
- The LR statistic is the analogue of the F ratio computed in testing the overall significance of the model.
- The estimated slope coefficient measures the estimated change in the logit for a unit change in X.
- The predicted probability (at the mean income) of owning a home is 0.63.
- Equivalently, every unit increase in income increases the odds of owning a home by about 11 percent.
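The last two bullets are simple transformations of the fitted coefficients. The sketch below shows the arithmetic with hypothetical values for the intercept, slope, and mean income (the actual estimates are in the output table that is not reproduced in this transcript).

```python
import numpy as np

b1, b2 = -2.62, 0.105      # hypothetical logit intercept and slope on income
mean_income = 30.0         # hypothetical mean income

z = b1 + b2 * mean_income
p_at_mean = 1 / (1 + np.exp(-z))        # predicted Pr(own home) at the mean income
odds_change = (np.exp(b2) - 1) * 100    # % change in the odds per unit of income
print(round(p_at_mean, 2), round(odds_change, 1))
```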

Pseudo-R²
Danao (p. 487), citing Gujarati (2008): "In binary regressand models, goodness of fit is of secondary importance. What matters are the expected signs of the regression coefficients and their statistical and practical significance."

Logit Model
Consider the home ownership model: Y_i = β1 + β2 X_i + u_i, where i indexes families, Y_i = 1 if the family owns a house and 0 otherwise, and X_i = family income.

Logistic Probability Distribution
PDF: f(x) = exp(x)/[1 + exp(x)]²
CDF: F(a) = exp(a)/[1 + exp(a)]
- A symmetric, unimodal distribution that looks a lot like the normal.
- The CDF and PDF are very easy to evaluate.
- Mean of zero, variance greater than 1 (more variance than the standard normal).

The Logistic Distribution Function
Assume that owning a home is a random event, and that the probability that this event occurs is given by P_i = e^{Z_i}/(1 + e^{Z_i}), where Z_i = β1 + β2 X_i. Then (i) 0 ≤ P_i ≤ 1 and (ii) P_i is a nonlinear function of Z_i, so OLS is not appropriate.
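A quick numerical check of why this functional form keeps the probabilities in bounds: the logistic function maps any index Z in (−∞, ∞) into (0, 1).

```python
import numpy as np

Z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
P = np.exp(Z) / (1 + np.exp(Z))      # P = e^Z / (1 + e^Z)
print(P.round(4))                    # approximately [0.  0.1192  0.5  0.8808  1.]
```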

The Odds Ratio (1)
Since P_i = e^{Z_i}/(1 + e^{Z_i}), the odds in favor of owning a home are P_i/(1 − P_i) = e^{Z_i}.

The Odds Ratio (2)
If the probability of owning a home is 10 percent, then the odds ratio is 0.10/(1 − 0.10), i.e., the odds are 1 to 9 in favor of owning a home.

Logit
ln(P_i/(1 − P_i)) = ln(e^{Z_i}) = Z_i, so ln(P_i/(1 − P_i)) = β1 + β2 X_i.
- The log of the odds ratio is a linear function of X and the parameters β1 and β2.
- P_i ∈ [0, 1], but ln(P_i/(1 − P_i)) ∈ (−∞, ∞).
- L_i = ln(P_i/(1 − P_i)) is called the logit.

The Logit Model
L_i = ln(P_i/(1 − P_i)) = β1 + β2 X_i + u_i
- Although P ∈ [0, 1], logits are unbounded.
- Logits are linear in X, but P is not linear in X.
- L < 0 if the odds ratio is less than 1, and L > 0 if the odds ratio is greater than 1.
- β2 measures the change in L (the log-odds) as X changes by one unit.

Estimating the Logit Model
Problem with individual households/units: ln(1/0) and ln(0/1) are undefined.
Solution: estimate P_i/(1 − P_i) from the data, where P_i = relative frequency = n_i/N_i, N_i = number of families at a given level of X_i (say, income), and n_i = number of those families owning a home.

Example Using Grouped Data
Estimate the home ownership model using grouped data and OLS: Y_i = β1 + β2 X_i + u_i

Estimating (2)
Problem: heteroskedastic disturbances. If the proportion of families owning a home follows a binomial distribution, then in large samples the disturbance of the sample logit has variance 1/[N_i P_i(1 − P_i)].

Estimating (3)
Solution: transform the model so that the new disturbance term is homoskedastic. Consider the weight w_i = N_i P_i(1 − P_i).

Estimating (4)
Estimate the following by OLS:
√w_i L_i = β1 √w_i + β2 √w_i X_i + √w_i u_i
where L_i is the sample logit ln(P_i/(1 − P_i)). Note: the regression has two regressors (√w_i and √w_i X_i) and no constant.
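A sketch of this grouped-data procedure end to end, using made-up income groups in place of the Gujarati table that accompanies the lecture:

```python
import numpy as np
import statsmodels.api as sm

X  = np.array([ 6,  8, 10, 13, 15, 20, 25, 30, 35, 40], dtype=float)   # income level X_i
Ni = np.array([40, 50, 60, 80, 100, 70, 65, 50, 40, 25], dtype=float)  # families at X_i
ni = np.array([ 8, 12, 18, 28,  45, 36, 39, 33, 30, 20], dtype=float)  # homeowners at X_i

P = ni / Ni                                  # relative frequency = estimated P_i
L = np.log(P / (1 - P))                      # sample logit
w = Ni * P * (1 - P)                         # weight w_i
sw = np.sqrt(w)

Lstar = sw * L                               # transformed dependent variable
regressors = np.column_stack([sw, sw * X])   # sqrt(w_i) and sqrt(w_i)*X_i, no constant
fit = sm.OLS(Lstar, regressors).fit()
print(fit.params)                            # estimates of beta1, beta2
```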

Example (2): STATA results

Interpreting the results
A unit increase in weighted income (= √w·X) increases the weighted log-odds (= √w·L) by the estimated coefficient on weighted income.

Interpreting (2)
(Antilog of the estimated coefficient of weighted X − 1) × 100 = the percent change in the odds in favor of owning a house for every unit increase in weighted X.
Predicted probabilities: P = e^V/(1 + e^V), where V is the predicted logit (the predicted L* divided by √w).
How does a unit increase in X affect the predicted probabilities? The effect varies with X.

Estimating a Probit/Logit Model (cont.)
The computer then calculates the β's that assign the highest probability to the outcomes that were observed. The computer can calculate the β's for you; you must know how to interpret them.

Table 19.3: What Point Spreads Say About the Probability of Winning in the NFL: III

Estimating a Probit/Logit Model (cont.)
In a linear regression, we look to coefficients for three elements:
1. Statistical significance: you can still read statistical significance from the slope dZ/dX. The z-statistic reported for probit or logit is analogous to OLS's t-statistic.
2. Sign: if dZ/dX is positive, then dProb(Y)/dX is also positive.

Estimating a Probit/Logit Model (cont.)
- The z-statistic on the point spread is −7.22, which exceeds the 5% critical value of 1.96 in absolute value. The point spread is a statistically significant explanator of winning NFL games.
- The sign of the coefficient is negative: a higher point spread predicts a lower chance of winning.

Estimating a Probit/Logit Model (cont.)
3. Magnitude: the magnitude of dZ/dX has no particular interpretation; we care about the magnitude of dProb(Y)/dX. From the computer output for a probit or logit estimation, you can interpret the statistical significance and sign of each coefficient directly. Assessing magnitude is trickier.

Probit/Logit (cont.)
To predict Prob(Y) for a given X value, begin by calculating the fitted Z value from the estimated linear coefficients. For example, if there is only one explanator X, the fitted value is Z = b0 + b1·X.

Probit/Logit Model (cont.)
Then use the nonlinear function to translate the fitted Z value into a Prob(Y): Prob(Y = 1) = F(Z), where F is the standard normal CDF for the probit model and the logistic CDF e^Z/(1 + e^Z) for the logit model.
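A small sketch of this translation step. The intercept and slope are hypothetical placeholders, not the fitted NFL coefficients from Table 19.3.

```python
import numpy as np
from scipy.stats import norm

b0, b1 = 0.4, -0.07          # hypothetical fitted intercept and slope
x = 5.88                     # value of the explanator (e.g., SPREAD)

z_hat = b0 + b1 * x
prob_probit = norm.cdf(z_hat)              # probit: standard normal CDF
prob_logit = 1 / (1 + np.exp(-z_hat))      # logit: logistic CDF
print(round(prob_probit, 3), round(prob_logit, 3))
```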

Estimating a Probit/Logit Model (cont.)
Problems in interpreting magnitude:
1. The estimated coefficient relates X to Z; we care about the relationship between X and Prob(Y = 1).
2. The effect of X on Prob(Y = 1) varies depending on Z.

Estimating a Probit/Logit Model (cont.)
There are two basic approaches to assessing the magnitude of the estimated coefficient. One approach is to predict Prob(Y) for different values of X, to see how the probability changes as X changes.

Estimating a Probit/Logit Model (cont.)
Note well: the effect of a 1-unit change in X varies greatly depending on the initial value of E(Z), and E(Z) depends on the values of all the explanators.

Estimating a Probit/Logit Model (cont.)
For example, consider the effect of a 1-point change in the point spread when we start 1 standard deviation above the mean, at SPREAD = 5.88 points. Note: in this example there is only one explanator, SPREAD; if we had other explanators, we would have to specify their values for this calculation as well.

Estimating a Probit/Logit Model (cont.)
Step One: calculate the E(Z) values for X = 5.88 and X = 6.88, using the fitted coefficients.
Step Two: plug the E(Z) values into the logistic function to obtain the predicted probabilities.

Estimating a Probit/Logit Model (cont.)
Changing the point spread from 5.88 to 6.88 predicts a 2.4-percentage-point decrease in the team's chance of victory. Note that changing the point spread from 8.88 to 9.88 predicts only a 2.1-percentage-point decrease.
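The sketch below reproduces the mechanics of this "change X and re-predict" approach. The intercept and slope are hypothetical stand-ins for the estimates in Table 19.3, so the printed changes only roughly echo the 2.4 and 2.1 percentage-point figures on the slide.

```python
import numpy as np

def logit_prob(z):
    return 1 / (1 + np.exp(-z))

b0, b1 = 0.16, -0.097          # hypothetical intercept and slope on SPREAD
for spread in (5.88, 8.88):
    p_now = logit_prob(b0 + b1 * spread)
    p_next = logit_prob(b0 + b1 * (spread + 1))
    print(spread, round((p_next - p_now) * 100, 1), "percentage points")
```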

Estimating a Probit/Logit Model (cont.)
The other approach is to use calculus.

Estimating a Probit/Logit Model (cont.)
By the chain rule, dProb(Y = 1)/dX = f(Z)·β1, where f(·) is the density of the assumed error distribution: the logistic density for the logit model and the standard normal density for the probit model.
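A short sketch of this calculus-based marginal effect, again with the same hypothetical coefficients used above:

```python
import numpy as np
from scipy.stats import norm

b0, b1 = 0.16, -0.097              # hypothetical intercept and slope
z = b0 + b1 * 5.88                 # index evaluated at SPREAD = 5.88

logistic_pdf = np.exp(z) / (1 + np.exp(z)) ** 2
print(logistic_pdf * b1)           # logit marginal effect dProb/dX at this point
print(norm.pdf(z) * b1)            # probit analogue using the same index
```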

Estimating a Probit/Logit Model (cont.)
Some econometrics software packages can calculate such "pseudo-slopes" for you. In Stata, the command is dprobit; EViews does not have this function.
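Outside Stata, comparable "pseudo-slopes" are available as well; for instance, statsmodels' get_margeff() reports marginal effects for a fitted probit or logit. The snippet below is a sketch on simulated placeholder data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = (rng.uniform(size=300) < 1 / (1 + np.exp(-(0.5 + 1.2 * x)))).astype(int)

X = sm.add_constant(x)
probit_fit = sm.Probit(y, X).fit(disp=0)
print(probit_fit.get_margeff(at='mean').summary())   # marginal effects at the mean, akin to dprobit
```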

Tobit Model

Censored Regression Model

Truncated Regression Model