Statistics 350 Lecture 21
Today Last Day: Tests and partial R^2. Today: Multicollinearity
Multicollinearity When explanatory variables are highly correlated, weird things can happen with the regression analysis. Multicollinearity is said to exist among explanatory variables any time a regression of one of the explanatory variables against the rest yields a strong linear relationship, as measured by a high R^2. One can also attempt to visualize this relationship using a scatter-plot matrix.
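The definition above can be checked numerically by regressing each explanatory variable on the others and reading off R^2. The sketch below is only an illustration: the data are synthetic (x1 is built to be close to x2 + x3), and the helper names `solve` and `r_squared` are made up for this example rather than taken from the course.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def r_squared(y, predictors):
    """R^2 from a least-squares regression of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors  # columns of the design matrix
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(XtX, Xty)  # normal equations
    fitted = [sum(beta[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]
    ybar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssto = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / ssto

# Synthetic (made-up) data: x1 is nearly a linear combination of x2 and x3.
random.seed(350)
n = 100
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
x1 = [a + b + random.gauss(0, 0.2) for a, b in zip(x2, x3)]

r2_x1 = r_squared(x1, [x2, x3])  # x1 regressed on the rest
r2_x2 = r_squared(x2, [x1, x3])
r2_x3 = r_squared(x3, [x1, x2])
print(f"R^2, x1 on (x2, x3): {r2_x1:.3f}")
print(f"R^2, x2 on (x1, x3): {r2_x2:.3f}")
print(f"R^2, x3 on (x1, x2): {r2_x3:.3f}")
```

Note that x2 and x3 are generated independently, so a plot of x2 against x3 alone would show nothing unusual. The strong relationship only appears when each variable is regressed on all of the others.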
Multicollinearity Back to Example:
Multicollinearity Why might this matter? Consider the 3-variable linear regression model: Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + β_3 X_i3 + ε_i. Can view β_1 in the model as the partial regression coefficient for X_1. What is its interpretation?
Multicollinearity If other variables tend to be correlated with X_1, this effect is difficult to isolate and estimate. RESULT:
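A small simulation can make the difficulty concrete. The sketch below makes several assumptions that are not from the lecture: it uses two predictors instead of three so the least-squares formula stays short, sets the true coefficients and error standard deviation to 1, and uses made-up function names (`fit_b1`, `spread_of_b1`). It refits the model on many simulated samples and compares how much the estimate of the X_1 coefficient varies when the predictors are uncorrelated versus highly correlated.

```python
import random
import statistics

def fit_b1(x1, x2, y):
    """Least-squares coefficient on x1 in a regression of y on x1 and x2 (with intercept)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solution of the two centered normal equations for b1
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

def spread_of_b1(rho, n=50, reps=500, seed=21):
    """Standard deviation of b1 across repeated samples when corr(x1, x2) is about rho."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 mixes x1 with fresh noise so the population correlation equals rho
        x2 = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x1]
        # Assumed true model for this demo: beta_1 = beta_2 = 1, error sd = 1
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        estimates.append(fit_b1(x1, x2, y))
    return statistics.stdev(estimates)

sd_low = spread_of_b1(rho=0.0)
sd_high = spread_of_b1(rho=0.95)
print(f"sd of b1, uncorrelated predictors: {sd_low:.3f}")
print(f"sd of b1, correlation 0.95:        {sd_high:.3f}")
```

The true coefficient is the same in both settings. Only the correlation between the predictors changes, and with it the precision of the estimate.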
Multicollinearity Back to example:
Multicollinearity Back to example: Notice: If we did a regression of X_1 on X_2 and X_3, the R^2 is Conclusion:
Multicollinearity Why exactly have we observed this phenomenon? Consider the 3-variable model in the body fat example:
Multicollinearity As a result:
Multicollinearity Detecting multicollinearity in practice:
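One widely used diagnostic, which may or may not be the one intended on this slide, is the variance inflation factor, VIF_k = 1 / (1 - R_k^2), where R_k^2 comes from regressing X_k on the other explanatory variables. For exactly three predictors the VIFs can be read off the diagonal of the inverse of the 3x3 correlation matrix, which needs only the pairwise correlations. The sketch below uses synthetic data and a made-up helper name (`vif_three`). A commonly quoted rule of thumb treats VIF values above 10 as a sign of serious multicollinearity.

```python
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def vif_three(x1, x2, x3):
    """VIFs for exactly three predictors: diagonal of the inverse correlation matrix."""
    r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
    # Determinant of the 3x3 correlation matrix
    det = 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    # Each diagonal cofactor divided by the determinant
    return [(1 - r23 ** 2) / det, (1 - r13 ** 2) / det, (1 - r12 ** 2) / det]

# Synthetic (made-up) data: x1 is close to x2 + x3, so all three VIFs are large.
random.seed(11)
n = 100
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
x1 = [a + b + random.gauss(0, 0.2) for a, b in zip(x2, x3)]

vifs = vif_three(x1, x2, x3)
for name, v in zip(("x1", "x2", "x3"), vifs):
    print(f"VIF for {name}: {v:.1f}")
```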