Specification Error II

Aims and Learning Objectives
By the end of this session students should be able to:
1. Understand the causes and consequences of multicollinearity
2. Analyse regression results for possible multicollinearity
3. Understand the nature of endogeneity
4. Analyse regression results for possible endogeneity

Introduction
In this lecture we consider what happens when we violate:
Assumption 7: no exact collinearity (perfect multicollinearity) among the explanatory variables, and
Assumption 3: Cov(Ui, X2i) = Cov(Ui, X3i) = ... = Cov(Ui, Xki) = 0

What is Multicollinearity?
The term "independent variable" means an explanatory variable is independent of the error term, but not necessarily independent of the other explanatory variables.
Definitions:
Perfect multicollinearity: an exact linear relationship between two or more explanatory variables.
Imperfect multicollinearity: two or more explanatory variables are approximately linearly related.

Example: Perfect Multicollinearity
Suppose we want to estimate the following model:

Yi = B1 + B2X2i + B3X3i + Ui

If there is an exact linear relationship between X2 and X3, for example

X3i = 2X2i

then we cannot estimate the individual partial regression coefficients B2 and B3.

This is because substituting the last expression into the first we get:

Yi = B1 + B2X2i + B3(2X2i) + Ui = B1 + (B2 + 2B3)X2i + Ui

If we let A = B2 + 2B3, OLS can estimate the combination A, but no amount of data will separate it into individual estimates of B2 and B3. A numerical sketch of this is given below.
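
As a minimal sketch (assuming only NumPy; the data and coefficients are invented for illustration), perfect collinearity shows up mechanically as a rank-deficient design matrix, so the normal equations have no unique solution:

```python
import numpy as np

# Invented data in which X3 is an exact multiple of X2, as in the
# example above, so the columns of X are linearly dependent.
n = 10
x2 = np.arange(1.0, n + 1)
x3 = 2.0 * x2                        # exact linear relationship X3 = 2*X2
X = np.column_stack([np.ones(n), x2, x3])

y = 1.0 + 3.0 * x2 + np.random.default_rng(0).normal(size=n)

# The design matrix has rank 2 but 3 columns, so X'X is singular and
# b2 and b3 are not separately identified.
print(np.linalg.matrix_rank(X))      # 2
print(np.linalg.det(X.T @ X))        # ~0 up to floating-point noise

# lstsq still returns *a* solution (the minimum-norm one), but it is
# only one of infinitely many; only the combination b2 + 2*b3 is
# pinned down by the data.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b[1] + 2 * b[2])               # close to the true combination 3.0
```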

Example: Imperfect Multicollinearity
Although perfect multicollinearity is theoretically possible, in practice imperfect multicollinearity is what we commonly observe. Typical examples of perfect multicollinearity arise when the researcher makes a mistake, such as including the same variable twice or forgetting to omit a default category for a series of dummy variables.

Consequences of Multicollinearity
OLS remains BLUE; however, there are some adverse practical consequences:
1. No OLS output at all when multicollinearity is exact.
2. Large standard errors and wide confidence intervals.
3. Estimators sensitive to the deletion or addition of a few observations or "insignificant" variables, i.e. non-robust estimators.
4. Estimators may have the "wrong" sign.

Detecting Multicollinearity
There are no formal "tests" for multicollinearity, but several diagnostics are useful (some of these are sketched in code below):
1. Few significant t-ratios despite a high R2 and joint significance of the variables
2. High pairwise correlations between the explanatory variables
3. Examination of partial correlations
4. Auxiliary regressions
5. Variance inflation factors (VIF)
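
A quick first check is the pairwise correlation matrix of the regressors. This sketch assumes NumPy and uses synthetic data constructed to be nearly collinear:

```python
import numpy as np

rng = np.random.default_rng(42)
x2 = rng.normal(size=100)
x3 = 0.95 * x2 + 0.1 * rng.normal(size=100)   # nearly collinear by construction

# A pairwise correlation close to 1 (roughly 0.99 here) is a warning
# sign, though low pairwise correlations do not rule out collinearity
# involving three or more variables at once.
print(np.corrcoef(x2, x3)[0, 1])
```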

Auxiliary Regressions
Regress each explanatory variable Xj on the remaining explanatory variables. The R2 from each of these auxiliary regressions (call it Rj2) shows how strongly Xj is collinear with the other explanatory variables.
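
A sketch of the auxiliary-regression diagnostic (NumPy only; the function and variable names are my own):

```python
import numpy as np

def auxiliary_r2(X):
    """R-squared from regressing each column of X on all the others.

    X: (n, k) array of explanatory variables, without a constant column.
    Returns a list of R_j^2 values; a value near 1 flags that X_j is
    strongly collinear with the remaining regressors.
    """
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ b
        tss = (y - y.mean()) @ (y - y.mean())
        out.append(1.0 - (resid @ resid) / tss)
    return out
```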

Variance Inflation Factor
In the two-variable model (bivariate regression) the variance of the OLS estimator was:

Var(b2) = σ2 / Σx2i2   where x2i = X2i − X̄2 (deviation from the mean)

Extending this to the case of more than two variables leads to the formulae laid out in Lecture 5, or alternatively:

Var(bj) = (σ2 / Σxji2) × 1/(1 − Rj2)

where Rj2 is the R2 from the auxiliary regression of Xj on the other explanatory variables. The factor 1/(1 − Rj2) is the variance inflation factor (VIF): the closer Rj2 is to 1, the more the variance of bj is inflated.
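
The VIF then follows directly from the auxiliary Rj2. A small sketch (the threshold mentioned in the comment is a common rule of thumb, not a formal test):

```python
def vif(r2_j):
    """Variance inflation factor implied by an auxiliary R_j^2.

    VIF_j = 1 / (1 - R_j^2). A common rule of thumb treats VIF > 10
    (i.e. R_j^2 > 0.9) as a sign of serious multicollinearity.
    """
    return 1.0 / (1.0 - r2_j)

# e.g. vif(0.998) == 500.0, the value in the worked example below
```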

Example: Imperfect Multicollinearity
Hypothetical data on weekly family consumption expenditure (CON), weekly family income (INC) and wealth (WLTH):

Family   CON   INC   WLTH
1         70    80    810
2         65   100   1009
3         90   120   1273
4         95   140   1425
5        110   160   1633
6        115   180   1876
7        120   200   2052
8        140   220   2201
9        155   240   2435
10       150   260   2686

Regression Results:

CON = 24.775 + 0.942 INC − 0.0424 WLTH
      (3.669)  (1.1442)  (−0.526)
(t-ratios in parentheses)

R2 = 0.964   ESS = 8,565.554   RSS = 324.446   F = 92.349

R2 is high (96%) and the joint hypothesis (F-test) is significant, but wealth has the wrong sign and neither slope coefficient is individually statistically significant. The sketch below reproduces this regression.
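
A sketch that refits the model from the table above, assuming statsmodels is installed (the printed summary should match the slide's numbers up to rounding):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical consumption data from the table above.
con  = np.array([70, 65, 90, 95, 110, 115, 120, 140, 155, 150], float)
inc  = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], float)
wlth = np.array([810, 1009, 1273, 1425, 1633, 1876, 2052, 2201, 2435, 2686], float)

X = sm.add_constant(np.column_stack([inc, wlth]))
fit = sm.OLS(con, X).fit()
print(fit.summary())   # high R^2 and F, yet insignificant slope t-ratios
```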

Auxiliary Regression Results:

INC = −0.386 + 0.098 WLTH
      (−0.133) (62.04)
(t-ratios in parentheses)

R2 = 0.998   F = 3,849

Variance Inflation Factor: VIF = 1/(1 − 0.998) = 500, confirming that INC and WLTH are very highly collinear.

Remedying Multicollinearity
High multicollinearity occurs because of a lack of adequate information in the sample. Possible remedies:
1. Collect more data with better information.
2. Perform robustness checks.
3. If all else fails, at least point out that the poor model performance might be due to the multicollinearity problem (or it might not).

The Nature of Endogenous Explanatory Variables
In real-world applications we distinguish between:
Exogenous (pre-determined) variables
Endogenous (jointly determined) variables
When one or more explanatory variables are endogenous, there is implicitly a system of simultaneous equations.

Example: Endogeneity
Suppose wages (W) are determined by schooling (S):

Wi = B1 + B2Si + Ui

But unobserved ability affects both wages (through Ui) and the amount of schooling chosen, therefore Cov(S, U) ≠ 0. OLS of the relationship between W and S gives "credit" to education for changes in the disturbances. The resulting OLS estimator is biased upwards (since Cov(Si, Ui) > 0) and, because the problem persists even in large samples, the estimator is also inconsistent. A simulation of this bias is sketched below.
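
A Monte Carlo sketch of the upward bias (NumPy only; all coefficients are invented, with a "true" return to schooling of 0.10):

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta2 = 10_000, 0.10                 # invented true return to schooling

ability = rng.normal(size=n)            # unobserved by the econometrician
s = 12 + 2 * ability + rng.normal(size=n)                # schooling rises with ability
w = 1 + beta2 * s + 0.5 * ability + rng.normal(size=n)   # ability also sits in U

# OLS of W on S: Cov(S, U) > 0, so the slope is biased upwards, and
# the bias does not shrink as n grows (inconsistency).
X = np.column_stack([np.ones(n), s])
b = np.linalg.lstsq(X, w, rcond=None)[0]
print(b[1])   # around 0.30 here, well above the true 0.10
```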

Remedies for Endogeneity
Two options:
1. Try to find a suitable proxy for the unobserved variable.
2. Leave the unobserved variable in the error term but use an instrument for the endogenous explanatory variable (this involves a different estimation technique).

Example
For the wage-schooling model the two options become:
1. Include a proxy for ability directly in the wage equation.
2. Find an instrument Z for education. A valid instrument needs the following properties:
Cov(Z, U) = 0 (exogeneity) and Cov(Z, S) ≠ 0 (relevance)
An IV estimator along these lines is sketched below.
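
A sketch of the IV estimator for this single-regressor case (NumPy only; the instrument and all coefficients are invented, extending the simulation above):

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta2 = 10_000, 0.10
ability = rng.normal(size=n)
z = rng.normal(size=n)                                   # instrument: unrelated to ability
s = 12 + 2 * ability + 1.5 * z + rng.normal(size=n)      # relevance: Cov(Z, S) != 0
w = 1 + beta2 * s + 0.5 * ability + rng.normal(size=n)   # exogeneity: Cov(Z, U) = 0

def iv_slope(y, x, z):
    """IV estimator for one endogenous regressor: Cov(z, y) / Cov(z, x)."""
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    return (zc @ yc) / (zc @ xc)

print(iv_slope(w, s, z))   # close to the true 0.10, unlike the OLS estimate
```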

Hausman Test for Endogeneity
Suppose we wish to test whether S is uncorrelated with U.
Stage 1: Estimate the reduced form Si = π1 + π2Zi + Vi and save the residuals, V-hat.
Stage 2: Add V-hat to the structural equation and test the significance of its coefficient.
Decision rule: if the coefficient on V-hat is significant, reject the null hypothesis of exogeneity.
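
A sketch of this regression-based version of the test (assumes statsmodels; reuses the invented wage/schooling data from the IV sketch above):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 10_000
ability = rng.normal(size=n)
z = rng.normal(size=n)
s = 12 + 2 * ability + 1.5 * z + rng.normal(size=n)
w = 1 + 0.10 * s + 0.5 * ability + rng.normal(size=n)

# Stage 1: reduced form of S on Z; keep the residuals V-hat.
v_hat = sm.OLS(s, sm.add_constant(z)).fit().resid

# Stage 2: structural equation with V-hat added. A significant
# coefficient on V-hat rejects the null that S is exogenous.
fit = sm.OLS(w, sm.add_constant(np.column_stack([s, v_hat]))).fit()
print(fit.tvalues[2], fit.pvalues[2])   # expect a clear rejection with this DGP
```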

Summary
In this lecture we have:
1. Outlined the theoretical and practical consequences of multicollinearity
2. Described a number of procedures for detecting the presence of multicollinearity
3. Outlined the basic consequences of endogeneity
4. Outlined a procedure for detecting the presence of endogeneity