9-1 MGMG 522 : Session #9 Binary Regression (Ch. 13)


9-2 Dummy Dependent Variable
- Up to now, our dependent variable has been continuous.
- In some studies, the dependent variable may take on only a few values.
- In this session, we deal with the case where the dependent variable takes on only the values zero and one.
- Note:
  - There are other types of regression that deal with a dependent variable that takes on, say, 3-4 values.
  - The dependent variable need not be a quantitative variable; it can be a qualitative variable as well.

9-3 Linear Probability Model
- Example: Di = β0 + β1X1i + β2X2i + εi --- (1)
- Di is a dummy variable.
- If we estimate (1) by OLS, this is a "Linear Probability Model."
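As a minimal sketch of fitting (1) by OLS (the data are simulated and all variable names are hypothetical, not from the lecture):

```python
import numpy as np

# Linear probability model: regress a 0/1 dummy on two regressors by OLS.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(0.5 + 1.5 * x1 - 1.0 * x2)))
d = (rng.uniform(size=n) < p_true).astype(float)   # D_i is 0 or 1

X = np.column_stack([np.ones(n), x1, x2])          # columns: 1, X1, X2
beta_hat, *_ = np.linalg.lstsq(X, d, rcond=None)   # OLS estimates of beta
d_hat = X @ beta_hat                               # fitted "probabilities"
print("estimates:", beta_hat)
print("fitted range:", d_hat.min(), d_hat.max())   # can stray outside [0, 1]
```

The fitted values play the role of estimated probabilities, which is why the slides that follow examine what goes wrong with this interpretation.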

9-4 Problems of the Linear Probability Model
1. The error term is not normally distributed.
   - This violates classical assumption #7.
   - In fact, the error term is binomially distributed.
   - Hence, hypothesis testing becomes unreliable.
2. The error term is heteroskedastic.
   - Var(εi) = Pi(1 - Pi), where Pi is the probability that Di = 1.
   - Pi changes from one observation to another; therefore, Var(εi) is not constant.
   - This violates classical assumption #5.
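The variance formula on this slide follows from the error's two-point distribution, which a short check can confirm (the probability values below are arbitrary illustrations):

```python
# Given X, the LPM error takes the value 1 - P_i with probability P_i
# and -P_i with probability 1 - P_i, so E[eps_i] = 0 and
# Var(eps_i) = E[eps_i^2] = P_i(1 - P_i), which changes with P_i.
for p in (0.1, 0.3, 0.5, 0.9):
    var = p * (1 - p) ** 2 + (1 - p) * (-p) ** 2   # E[eps^2] directly
    print(f"P_i = {p}: Var(eps_i) = {var:.4f}, P_i(1-P_i) = {p * (1 - p):.4f}")
```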

9-5 Problems of the Linear Probability Model (cont.)
3. R2 is not a reliable measure of overall fit.
   - The R2 reported by OLS will be lower than the true R2.
   - Even for an exceptionally good fit, the R2 reported by OLS can be much lower than 1.
4. D̂i is not bounded between zero and one.
   - Substituting values for X1 and X2 into the regression equation, we can get D̂i > 1 or D̂i < 0.

9-6 Remedies for Problems 1-2
1. The error term is not normal.
   - The OLS estimator does not require that the error term be normally distributed.
   - OLS is still BLUE even though classical assumption #7 is violated.
   - Hypothesis testing is still questionable, however.
2. The error term is heteroskedastic.
   - We can use WLS: divide (1) through by Zi, where Zi = sqrt(P̂i(1 - P̂i)).
   - We don't know Pi, but Pi is the probability that Di = 1, so we use the fitted value P̂i, obtained by substituting X1 and X2 into the regression equation.
   - Dividing (1) through by Zi gives: Di/Zi = β0(1/Zi) + β1(X1i/Zi) + β2(X2i/Zi) + ui, where ui = εi/Zi.
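A sketch of this two-step WLS remedy on simulated data (hypothetical names; the clipping of fitted probabilities is my assumption to keep Zi well-defined, not part of the lecture):

```python
import numpy as np

# Step 1: OLS on the LPM to get fitted probabilities P_hat.
# Step 2: form Z_i = sqrt(P_hat * (1 - P_hat)) and divide (1) through by Z_i.
rng = np.random.default_rng(1)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
d = (rng.uniform(size=n) < 1 / (1 + np.exp(-(0.3 + x1 - x2)))).astype(float)

X = np.column_stack([np.ones(n), x1, x2])
b_ols, *_ = np.linalg.lstsq(X, d, rcond=None)
p_hat = np.clip(X @ b_ols, 0.02, 0.98)   # keep P_hat strictly inside (0, 1)
z = np.sqrt(p_hat * (1 - p_hat))         # Z_i from the estimated variance
b_wls, *_ = np.linalg.lstsq(X / z[:, None], d / z, rcond=None)
print("OLS:", b_ols)
print("WLS:", b_wls)
```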

9-7 Remedies for Problems 3-4
3. R2 is lower than actual.
   - Use Rp2 = the percentage of observations predicted correctly.
   - Set D̂i >= .5 to predict Di = 1 and D̂i < .5 to predict Di = 0.
   - OLS output does not report Rp2 automatically; you must calculate it by hand.
4. D̂i is not bounded between zero and one.
   - Follow this rule to avoid the unboundedness problem: if D̂i > 1, set D̂i = 1; if D̂i < 0, set D̂i = 0.
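Computing Rp2 by hand, as the slide suggests, looks like this (the fitted values and outcomes below are made-up illustrations):

```python
import numpy as np

# R_p^2 = share of observations predicted correctly under the .5 rule.
d_hat = np.array([0.2, 0.7, 1.3, -0.1, 0.55, 0.45])   # fitted values
d     = np.array([0,   1,   1,    0,   1,    1])       # actual outcomes

pred = (d_hat >= 0.5).astype(int)   # predict 1 when D_hat_i >= .5
rp2 = (pred == d).mean()            # 5 of 6 correct here
print(rp2)
```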

9-8 Binomial Logit Model
- To deal with the unboundedness problem, we need another type of regression that mitigates it, called the binomial logit model.
- The binomial logit model deals with unboundedness by using a variant of the cumulative logistic function.
- We no longer model Di directly.
- We use ln[Di/(1-Di)] instead of Di.
- Our model becomes ln[Di/(1-Di)] = β0 + β1X1i + β2X2i + εi --- (2)

9-9 How Does the Logit Model Solve the Unboundedness Problem?
- ln[Di/(1-Di)] = β0 + β1X1i + β2X2i + εi --- (2)
- It can be shown that (2) can be rewritten as Di = 1 / (1 + e^-(β0 + β1X1i + β2X2i + εi)).
- As the value in the exponent approaches +∞, Di approaches 1.
- As the value in the exponent approaches -∞, Di approaches 0.
- The unboundedness problem is now solved. See 13-4 on p. 601 for the proof.
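A quick numerical check of the boundedness claim (a sketch; the index values are arbitrary):

```python
import math

# The logistic transformation keeps D_i strictly between 0 and 1 no matter
# how large or small the index beta0 + beta1*X1 + beta2*X2 + eps becomes.
def logistic(z):
    return 1 / (1 + math.exp(-z))

for z in (-20, -2, 0, 2, 20):
    print(z, logistic(z))
```

Even at an index of -20 or +20, the output stays inside (0, 1), approaching but never reaching the endpoints.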

9-10 Logit Estimation
- Logit coefficients cannot be estimated by OLS because the model is nonlinear in the coefficients.
- Use the maximum likelihood estimator (MLE) instead of OLS.
- MLE is consistent and asymptotically efficient (unbiased and minimum variance in large samples).
- It can be shown that for a linear equation that meets all six classical assumptions plus the normal-error-term assumption, OLS and MLE produce identical coefficient estimates.
- Logit estimation works well for large samples, typically 500 observations or more.
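To make the MLE step concrete, here is a minimal Newton-Raphson maximizer of the logit log-likelihood on simulated data. This is a sketch of what packaged MLE routines do internally, not the lecture's own procedure, and every name in it is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000                                   # large sample, as the slide advises
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
beta_true = np.array([0.5, 1.0, -1.5])
d = (rng.uniform(size=n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

beta = np.zeros(3)
for _ in range(25):                        # Newton-Raphson iterations
    p = 1 / (1 + np.exp(-X @ beta))        # P(D_i = 1) at the current beta
    grad = X.T @ (d - p)                   # score of the log-likelihood
    W = p * (1 - p)
    hess = X.T @ (X * W[:, None])          # information matrix (neg. Hessian)
    step = np.linalg.solve(hess, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:       # stop once the update is negligible
        break
print("MLE estimates:", beta)
```

At convergence the score is zero, i.e. the estimates maximize the likelihood; with n = 1000 they land close to the true coefficients, illustrating consistency.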

9-11 Logit: Interpretation
- β1 measures the impact of a one-unit increase in X1 on ln[Di/(1-Di)], holding the other Xs constant.
- We still cannot use R2 to compare overall goodness of fit, because the variable ln[Di/(1-Di)] is not the same as the Di in a linear probability model.
- Even if we use a quasi-R2, the value we calculate will be lower than its true value.
- Use Rp2 instead.
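Because β1 shifts the log-odds by a fixed amount, the odds themselves are multiplied by e^β1 per unit of X1. A small check (the coefficient and log-odds values are hypothetical):

```python
import math

beta1 = 0.7                        # hypothetical estimated logit coefficient
odds_ratio = math.exp(beta1)       # odds multiplier per one-unit rise in X1
print(odds_ratio)

# Verify with a concrete log-odds value: raising the index by beta1
# scales the odds by exactly e^beta1, whatever the starting point.
logodds = -0.2
odds_before = math.exp(logodds)
odds_after = math.exp(logodds + beta1)
print(odds_after / odds_before)
```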

9-12 Binomial Probit Model
- The binomial probit model deals with the unboundedness problem by using a variant of the cumulative normal distribution:
  Pi = (1/sqrt(2π)) ∫ from -∞ to Zi of e^(-s²/2) ds --- (3)
  where:
  Pi = probability that Di = 1
  Zi = β0 + β1X1i + β2X2i + ...
  s = a standardized normal variable

9-13
- (3) can be rewritten as Zi = F^-1(Pi), where F^-1 is the inverse of the normal cumulative distribution function.
- We also need MLE to estimate the coefficients.
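The normal CDF F and its inverse F^-1 from the slide can be sketched numerically. F comes exactly from the error function; the inverse here is a simple bisection, which is my illustrative choice rather than anything from the lecture:

```python
import math

def F(z):
    """Standard normal CDF, the integral in (3), via math.erf."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def F_inv(p, lo=-10.0, hi=10.0):
    """Invert F by bisection: find Z_i with F(Z_i) = P_i."""
    for _ in range(80):             # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(F(0.0))          # 0.5
print(F_inv(0.975))    # about 1.96, the familiar normal critical value
```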

9-14 Similarities and Differences between Logit and Probit
- Similarities:
  - Graphs of logit and probit look very similar.
  - Both need MLE to estimate coefficients.
  - Both need large samples.
  - The R2 values produced by logit and probit are not appropriate measures of overall fit.
- Differences:
  - Probit takes more computer time to estimate coefficients than logit.
  - Probit is more appropriate for normally distributed variables.
  - However, in an extremely large sample, most variables become approximately normally distributed, and the extra computer time required to run a probit regression may not be worth the benefit of the normality assumption.
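The claim that the two curves look very similar can be checked directly: after rescaling the index by roughly 1.6 (a standard rule of thumb, not a figure from the lecture), the logistic CDF tracks the normal CDF closely over the whole range:

```python
import math

def probit_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def logit_cdf(z):
    return 1 / (1 + math.exp(-z))

# Largest vertical gap between the probit curve at z and the logit curve
# at 1.6*z, scanned over z in [-4, 4] in steps of 0.01.
max_gap = max(abs(probit_cdf(z / 100) - logit_cdf(1.6 * z / 100))
              for z in range(-400, 401))
print(max_gap)
```

The maximum gap is under 0.02 in probability, which is why the fitted probabilities from the two models rarely differ in practice.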