Chapter 15, part C. Multicollinearity

V. E. Multicollinearity
Although we have used the term “independent variable” to describe our x-variables, there are very few cases where each x is truly independent of the rest. For example, if we had a model of earnings (y) as a function of age (x1) and experience (x2), it’s safe to say that age and experience are related. Thus they are not truly independent.

Potential Problems
The goal of the regression model is to find x-variables that explain a significant amount of the variability in the y-variable. In other words, each x has a strong correlation with y, which makes for a strong model. But what if there is more correlation between x1 and x2 than there is between x1 and y, or between x2 and y?

A Hypothetical Scenario
Suppose we estimate an Earnings (y) model as y = β0 + β1(Age) + β2(Experience) + ε, and find that the F-test shows the overall relationship to be significant. But we conduct a t-test on β1 and cannot reject the null that β1 = 0. Should we conclude that Age is truly an insignificant variable in an earnings model? Maybe not.
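To make the scenario concrete, here is a minimal sketch in Python with statsmodels. The numbers are invented for illustration: Experience is built to track Age almost perfectly, so the overall F-test can come out significant while the individual t-tests may not.

import numpy as np
import statsmodels.api as sm

# Made-up data: Experience tracks Age almost perfectly.
rng = np.random.default_rng(0)
n = 100
age = rng.uniform(22, 60, n)
experience = age - 22 + rng.normal(0, 1, n)
earnings = 20_000 + 800 * experience + rng.normal(0, 5_000, n)

# Regress Earnings on both collinear x-variables.
X = sm.add_constant(np.column_stack([age, experience]))
model = sm.OLS(earnings, X).fit()

print(model.fvalue, model.f_pvalue)   # overall F-test: significant
print(model.tvalues, model.pvalues)   # individual slopes: may look insignificant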

What’s Going On?
In our scenario, we have a model that exhibits overall significance, but one of our key variables appears to be insignificant. In fact, both might be insignificant. Since it’s safe to say that Experience = f(Age), what we’ve got is a multicollinearity problem: our two independent variables are highly correlated. In fact, it appears that all of the explanatory power of Age is being captured by Experience.

Testing for Multicollinearity
There are more sophisticated tests, but a rule of thumb that will serve us is to compute a correlation matrix of all the independent variables (and the dependent variable). If any two independent variables have a correlation coefficient greater than .7 in absolute value, multicollinearity is a potential problem.
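A sketch of this rule-of-thumb check with pandas, using the same kind of made-up Age/Experience data as above (the .7 cutoff is the slide's):

import numpy as np
import pandas as pd

# Made-up data again: Experience is constructed to track Age closely.
rng = np.random.default_rng(1)
age = rng.uniform(22, 60, 100)
experience = age - 22 + rng.normal(0, 1, 100)
earnings = 20_000 + 800 * experience + rng.normal(0, 5_000, 100)
df = pd.DataFrame({"Earnings": earnings, "Age": age, "Experience": experience})

# Correlation matrix of the dependent and independent variables.
corr = df.corr()
print(corr.round(3))

# Flag any pair of independent variables with |r| > .7.
x_vars = ["Age", "Experience"]
for i, a in enumerate(x_vars):
    for b in x_vars[i + 1:]:
        if abs(corr.loc[a, b]) > 0.7:
            print(f"Potential problem: {a} and {b}, r = {corr.loc[a, b]:.3f}")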

Fixing the Problem
The issue at hand is that it is impossible to separate the effects of x1 on y and x2 on y when x1 and x2 are highly correlated. Your estimates of b1 and b2 are thus unreliable and may even switch signs. The simplest remedy is to drop one of the offending variables. If you believe, and past literature supports you, that Experience is a more reliable determinant of Earnings, then I would drop Age and re-estimate the model.
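Sticking with the made-up Age/Experience data, here is a sketch of the remedy: fit the full model, then drop the weaker variable and re-estimate. Dropping Age should shrink the standard error on the Experience slope noticeably.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Same made-up setup: Experience is nearly a linear function of Age.
rng = np.random.default_rng(2)
age = rng.uniform(22, 60, 100)
experience = age - 22 + rng.normal(0, 1, 100)
earnings = 20_000 + 800 * experience + rng.normal(0, 5_000, 100)
df = pd.DataFrame({"Earnings": earnings, "Age": age, "Experience": experience})

full = smf.ols("Earnings ~ Age + Experience", data=df).fit()
reduced = smf.ols("Earnings ~ Experience", data=df).fit()

# Compare slope estimates and their standard errors: the full model's
# slopes are unstable, while the reduced model pins down Experience.
print(full.params.round(1), full.bse.round(1))
print(reduced.params.round(1), reduced.bse.round(1))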

F. Example
Y is the weekly dollar amount spent on alcohol. Age is the student’s age. Drinks is the weekly number of alcoholic drinks consumed.

Regression Results
What do you make of these results? Make sure you can comment on goodness of fit, significance of the coefficients, and overall significance.
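The slide's printout is not reproduced in this transcript, but results of this kind could come from a sketch along the following lines, assuming the class data sat in a hypothetical file alcohol.csv with columns Alcohol, Age, and Drinks:

import pandas as pd
import statsmodels.formula.api as smf

# "alcohol.csv" and its column names are assumptions, not the actual
# class data set.
df = pd.read_csv("alcohol.csv")
fit = smf.ols("Alcohol ~ Age + Drinks", data=df).fit()
print(fit.summary())  # R-squared, t-test on each slope, overall F-test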

Test for Multicollinearity
The correlation coefficient between Age and Drinks, our two independent variables, is very low: only .066. Thus it is very unlikely that we have a serious multicollinearity problem. Since our estimated coefficients are both significant and have the expected signs, I wouldn’t modify this model.
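Continuing the hypothetical alcohol.csv sketch above, the check itself is one line: correlate the two independent variables.

# Continues the hypothetical alcohol.csv sketch; the slide reports r = .066.
print(df[["Age", "Drinks"]].corr().round(3))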