Multicollinearity

What does it mean?
A high degree of correlation amongst the explanatory variables.

What are its consequences?
It may be difficult to separate out the effects of the individual regressors. Standard errors may be overestimated and t-values depressed. Note: a symptom may be a high R² combined with low t-values.

How can you detect the problem?
Examine the correlation matrix of the regressors, and also carry out auxiliary regressions amongst the regressors. Look at the Variance Inflation Factors (VIFs).

NOTE: be careful not to apply t-tests mechanically without checking for multicollinearity. Multicollinearity is a data problem, not a misspecification problem.

Multicollinearity

Sources of multicollinearity:
- Problems in the data-collection method.
- An over-defined model: more predictor variables than the data can support.
- Most economic variables tend to move together.
- Use of lagged values of some explanatory variables as additional regressors.

Multicollinearity

Consequences:
1. In the case of perfect multicollinearity the estimates are indeterminate and their variances are infinite. In the case of partial collinearity the estimators can still be obtained, but their variances tend to be very large and grow as the degree of correlation between the explanatory variables increases.

Multicollinearity

2. Because of the large standard errors, confidence intervals tend to be wider.
3. The coefficient of determination (R²) may nevertheless be high.
4. The OLS estimators and their variances become very sensitive to small changes in the data.

Multicollinearity

Detection of multicollinearity:
Examination of the correlation matrix. If the determinant of the correlation matrix is near 1, there is no multicollinearity; if it is near zero, there is (nearly) perfect multicollinearity.
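
To make the determinant check concrete, here is a minimal sketch (my own illustration with numpy, not part of the original slides): two of the three regressors are built to be nearly collinear, and the determinant of their correlation matrix collapses towards zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two nearly collinear regressors plus one independent regressor.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 is almost a copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

R = np.corrcoef(X, rowvar=False)           # correlation matrix of the regressors
print("det(R) =", round(np.linalg.det(R), 4))   # close to 0 -> severe multicollinearity
```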

Variance Inflation Factor (VIF)

Multicollinearity inflates the variance of an estimator. VIF_j is the j-th diagonal element c_jj of (X'X)^-1 with X in correlation form; equivalently, VIF_j = 1/(1 - R_j²), where R_j² is the R² from regressing X_j on the other regressors. A VIF_j > 5 indicates a serious multicollinearity problem.
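
A minimal sketch of the VIF computation (my own illustration with numpy and statsmodels, not part of the original slides): each VIF is obtained from the auxiliary regression of one regressor on all the others.

```python
import numpy as np
import statsmodels.api as sm

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j of X on the remaining columns (plus a constant)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        r2_j = sm.OLS(X[:, j], sm.add_constant(others)).fit().rsquared
        out.append(1.0 / (1.0 - r2_j))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1,
                     x1 + rng.normal(scale=0.05, size=200),  # nearly collinear with x1
                     rng.normal(size=200)])                   # independent
print(vif(X))   # the first two VIFs are very large, the third is close to 1
```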

Multicollinearity

Farrar-Glauber test: a set of three tests.
- A chi-square test for detecting the existence and severity of multicollinearity.
- An F-test for locating which variables are multicollinear.
- A t-test for finding the pattern of multicollinearity.

Examination of the correlation matrix

A very simple procedure to detect multicollinearity is the inspection of the off-diagonal elements r_ij of the correlation matrix. If the explanatory variables X_i and X_j are linearly independent, r_ij will be close to zero. This is helpful only in detecting pairwise collinearity; it is not sufficient for detecting anything more complex than pairwise relationships. For more complex cases we use the determinant of the correlation matrix: a determinant near zero indicates significant multicollinearity, while a determinant near one indicates no multicollinearity.

Multicollinearity: Definition

Multicollinearity is the condition where the independent variables are related to each other. Causation is not implied by multicollinearity. As any two (or more) variables become more and more closely correlated, the condition worsens and 'approaches singularity'. Since the X's are supposed to be fixed, this is a sample problem. Since multicollinearity is almost always present, it is a problem of degree, not merely of existence.

Multicollinearity: Implications

Consider the following cases.

A) No multicollinearity
The regression would appear to be identical to separate bivariate regressions. This produces variances which are biased upward (too large), making t-tests too small. For multiple regression, this case satisfies the no-multicollinearity assumption.

Multicollinearity: Implications (cont.)

B) Perfect multicollinearity
Some variable X_i is a perfect linear combination of one or more other variables X_j; therefore X'X is singular and |X'X| = 0. In matrix algebra notation, this means that one variable is a perfect linear function of another (e.g. X2 = X1 + 3.2). A model cannot be estimated under such circumstances: the computer dies.
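
A quick illustration of why estimation breaks down (a sketch with numpy, not part of the original slides): a design matrix containing X2 = X1 + 3.2 has a rank-deficient X'X.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

x1 = rng.normal(size=n)
x2 = x1 + 3.2                                # perfect linear function of x1
X = np.column_stack([np.ones(n), x1, x2])    # constant, x1, x2

XtX = X.T @ X
print("det(X'X) =", np.linalg.det(XtX))      # zero up to floating-point rounding
print("rank(X)  =", np.linalg.matrix_rank(X), "of", X.shape[1], "columns")
# X'X is singular (rank deficient), so the OLS solution (X'X)^-1 X'y
# does not exist and the coefficients are indeterminate.
```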

Multicollinearity: Implications (cont.)

C) A high degree of multicollinearity
When the independent variables are highly correlated, the variances and covariances of the B_i's are inflated (t-ratios are lower) and R² tends to be high as well. The B's are unbiased (but perhaps useless due to their imprecise measurement, a result of their variances being too large); in fact they are still BLUE. OLS estimates tend to be sensitive to small changes in the data, and relevant variables may be discarded.
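
A small simulation (a sketch with numpy and statsmodels, not from the original slides) illustrates these points: the individual slopes are estimated with large standard errors even though R² is high, and the estimates can shift when a handful of observations change.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100

x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.02, size=n)     # nearly collinear with x1
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
print(fit.params)     # individual slopes can land far from the true (2, 3) ...
print(fit.bse)        # ... because their standard errors are huge
print(fit.rsquared)   # yet the overall R^2 is high

# Re-estimate after perturbing a handful of observations:
# the coefficient estimates can shift noticeably.
y2 = y.copy()
y2[:5] += rng.normal(size=5)
print(sm.OLS(y2, X).fit().params)
```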

Multicollinearity: Causes

- Sampling mechanism: a poorly constructed design and measurement scheme, or a limited range of the data.
- Statistical model specification: adding polynomial terms or trend indicators.
- Too many variables in the model: the model is over-determined.
- Theoretical specification is wrong: inappropriate construction of theory or even of measurement.

Multicollinearity: Tests/Indicators

|X'X| approaches 0. Since the determinant is a function of the scale of the variables, this measure doesn't help a whole lot on its own. We could, however, use the determinant of the correlation matrix instead, which bounds the range from 0 to 1.

Multicollinearity: Tests/Indicators (cont.)

Tolerance: TOL_j = 1 - R_j², where R_j² is from the auxiliary regression of X_j on the other regressors. If the tolerance equals 1, the variables are unrelated; if TOL_j = 0, they are perfectly correlated.

Variance Inflation Factors (VIFs): VIF_j = 1 / TOL_j.

Interpreting VIFs

No multicollinearity produces VIFs of 1.0. If a VIF is greater than 10.0, then multicollinearity is probably severe: 90% of the variance of X_j is explained by the other X's. In small samples, a VIF of about 5.0 may already indicate problems.

Multicollinearity: Tests/Indicators (cont.)

R² deletes: try all possible models of the X's, including/excluding variables (taken one at a time) and looking at the changes in R² with the inclusion or omission of each variable.

Multicollinearity is of concern when either:
- the F-statistic is significant, but no t-value is; or
- the adjusted R² declines when a new variable is added.

Multicollinearity: Tests/Indicators (cont.)

Other rules of thumb (I would avoid relying on them mechanically):
- Betas are > 1.0 or < -1.0.
- Sign changes occur with the introduction of a new variable.
- The R² is high, but few t-ratios are.

Eigenvalues and the Condition Index: if this topic is beyond Gujarati, it's beyond me.

Multicollinearity: Remedies

- Increase the sample size.
- Omit variables.
- Scale construction/transformation.
- Factor analysis.
- Constrain the estimation, as in the case where you can set the value of one coefficient relative to another.

Multicollinearity: Remedies (cont.)

- Change the design (LISREL, perhaps, or pooled cross-sectional time series).
- Ridge regression: this technique introduces a small amount of bias into the coefficients to reduce their variance (see the sketch after this slide).
- Ignore it: report the adjusted R² and argue that the variable warrants retention in the model.
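
For the ridge regression item above, here is a minimal sketch (my own illustration with numpy, not part of the original slides) using the closed-form estimator (X'X + kI)^-1 X'y on standardized regressors; the penalty k introduces a little bias but tames the variance inflation caused by the collinearity.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator (X'X + k*I)^-1 X'y on standardized regressors."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize so k treats all columns alike
    yc = y - y.mean()
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.02, size=n)        # nearly collinear pair
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

print(ridge(X, y, k=0.0))   # k = 0 is OLS on standardized data: unstable under collinearity
print(ridge(X, y, k=1.0))   # a small penalty stabilizes the estimates at the cost of some bias
```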