Published by Justin Hunter. Modified over 8 years ago.
1
Yandell – Econ 216 Chap 15-1 Chapter 15 Multiple Regression Model Building
2
Chapter Goals
After completing this chapter, you should be able to:
- understand model building using multiple regression analysis
- use variable transformations to model nonlinear relationships
- recognize potential problems in multiple regression analysis and take steps to correct them
- apply techniques to obtain a best-fit regression equation
3
Nonlinear Relationships
- The relationship between the dependent variable and an independent variable may not be linear
- A nonlinear model is useful when the scatter diagram indicates a nonlinear relationship
- Example: the quadratic model, in which the second independent variable is the square of the first
4
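Polynomial Regression Model
General form: $y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \dots + \beta_p x_j^p + \varepsilon_i$
where:
- $\beta_0$ = population regression constant
- $\beta_i$ = population regression coefficient on the $i$-th power of variable $x_j$, $j = 1, 2, \dots, k$
- $p$ = order of the polynomial
- $\varepsilon_i$ = model error
If $p = 2$ the model is a quadratic model: $y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \varepsilon_i$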
5
Linear vs. Nonlinear Fit
- Linear fit does not give random residuals
- Nonlinear fit gives random residuals
[Figure: for each fit, a plot of y against x with the fitted curve and a plot of the residuals against x]
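The contrast between the two residual patterns can be reproduced numerically. The following is a minimal sketch, assuming made-up data that roughly follow a curve with a negative squared term; the helper names `solve_linear_system`, `fit_polynomial`, and `residuals` are hypothetical and are not part of the slides or of PHStat.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, order):
    """Least-squares coefficients [b0, b1, ..., bp] from the normal equations."""
    rows = [[x ** j for j in range(order + 1)] for x in xs]
    k = order + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def residuals(xs, ys, coefs):
    """Observed y minus fitted y for each observation."""
    return [y - sum(b * x ** j for j, b in enumerate(coefs))
            for x, y in zip(xs, ys)]


# Made-up illustrative data (an assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [4.6, 5.8, 6.65, 5.9, 4.55, 1.85, -1.3, -6.05]

linear = fit_polynomial(xs, ys, 1)
quadratic = fit_polynomial(xs, ys, 2)
linear_resid = residuals(xs, ys, linear)
quadratic_resid = residuals(xs, ys, quadratic)

print("linear residuals:   ", [round(e, 2) for e in linear_resid])
print("quadratic residuals:", [round(e, 2) for e in quadratic_resid])
```

With data like these, the linear residuals are negative at both ends and positive in the middle, which is the systematic pattern that signals a missing curvilinear term; the quadratic residuals are much smaller and show no such arc.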
6
Quadratic Regression Model
Quadratic models may be considered when the scatter diagram takes on one of the following shapes:
[Figure: four plots of y against $x_1$: $\beta_1 < 0$ and $\beta_1 > 0$ with $\beta_2 > 0$, then $\beta_1 < 0$ and $\beta_1 > 0$ with $\beta_2 < 0$]
- $\beta_1$ = the coefficient of the linear term
- $\beta_2$ = the coefficient of the squared term
7
Testing for Significance: Quadratic Model
Test for Overall Relationship
- F test statistic = MSR / MSE
Testing the Quadratic Effect
- Compare the quadratic model with the linear model
- Hypotheses:
  $H_0: \beta_2 = 0$ (no 2nd-order polynomial term)
  $H_1: \beta_2 \neq 0$ (2nd-order polynomial term is needed)
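For reference, the two tests can be written out as follows. This is a sketch in standard textbook notation rather than a transcription of the slide: $b_2$ (the sample estimate of $\beta_2$), $S_{b_2}$ (its standard error), and $n$ (the number of observations) are symbols assumed here, and the degrees of freedom are those of a quadratic model with a single $x$ variable.

```latex
% Overall test of the quadratic model (2 slope terms, n observations)
F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / 2}{SSE / (n - 3)}

% Test of the quadratic effect, H_0: \beta_2 = 0
t_{STAT} = \frac{b_2 - 0}{S_{b_2}}, \qquad \text{with } n - 3 \text{ degrees of freedom}
```

If the t test rejects $H_0$, the squared term improves on the linear model and should be kept.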
8
Higher Order Models
If $p = 3$ the model is a cubic form: $y = \beta_0 + \beta_1 x_j + \beta_2 x_j^2 + \beta_3 x_j^3 + \varepsilon_i$
[Figure: plot of Y against X showing a cubic curve]
9
Multicollinearity
- Multicollinearity: high correlation exists between two independent variables
- This means the two variables contribute redundant information to the multiple regression model
10
Multicollinearity (continued)
- Including two highly correlated independent variables can adversely affect the regression results
  - No new information is provided
  - Can lead to unstable coefficients (large standard errors and low t-values)
  - Coefficient signs may not match prior expectations
11
Some Indications of Severe Multicollinearity
- Incorrect signs on the coefficients
- Large change in the value of a previous coefficient when a new variable is added to the model
- A previously significant variable becomes insignificant when a new independent variable is added
- The estimate of the standard deviation of the model increases when a variable is added to the model
12
Detect Collinearity (Variance Inflationary Factor)
$\mathrm{VIF}_j$ is used to measure collinearity:
$\mathrm{VIF}_j = \dfrac{1}{1 - r_j^2}$
where $r_j^2$ is the coefficient of determination when the $j$-th independent variable is regressed against the remaining $k - 1$ independent variables.
If $\mathrm{VIF}_j > 5$, $X_j$ is highly correlated with the other explanatory variables.
13
Detect Collinearity in PHStat
PHStat / regression / multiple regression … Check the “variance inflationary factor (VIF)” box
Output for the pie sales example:

Regression Analysis: Price and all other X
Regression Statistics
Multiple R            0.030437581
R Square              0.000926446
Adjusted R Square    -0.075925366
Standard Error        1.21527235
Observations          15
VIF                   1.000927305

- Since there are only two explanatory variables, only one VIF is reported
- VIF is < 5
- There is no evidence of collinearity between Price and Advertising
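The VIF and Adjusted R Square in this output can be checked by hand from the reported R Square. The sketch below assumes the usual formulas $\mathrm{VIF} = 1 / (1 - r^2)$ and adjusted $r^2 = 1 - (1 - r^2)(n - 1)/(n - k - 1)$, with $k = 1$ because Price is regressed on a single other variable; the function names are hypothetical.

```python
def variance_inflation_factor(r_squared_j):
    """VIF for one explanatory variable, given the r-squared of that
    variable regressed on the other explanatory variables."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("r_squared_j must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


def adjusted_r_squared(r_squared, n, k):
    """Adjusted r-squared for n observations and k explanatory variables."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Values copied from the PHStat output for the pie sales example.
r_square = 0.000926446
observations = 15

vif = variance_inflation_factor(r_square)
adj = adjusted_r_squared(r_square, observations, k=1)

print(round(vif, 6))  # 1.000927, agreeing with the reported VIF
print(round(adj, 6))  # -0.075925, agreeing with the reported Adjusted R Square
```

An R Square this close to zero means Price is almost uncorrelated with Advertising, which is why the VIF is barely above its minimum value of 1.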
14
Model Building
- Goal is to develop a model with the best set of independent variables
  - Easier to interpret if unimportant variables are removed
  - Lower probability of collinearity
- Stepwise regression procedure
  - Provides evaluation of alternative models as variables are added
- Best-subset approach
  - Try all combinations and select the best using the highest adjusted $r^2$ and lowest standard error
15
Stepwise Regression
- Idea: develop the least squares regression equation in steps, through forward selection, backward elimination, or standard stepwise regression
- The coefficient of partial determination is the measure of the marginal contribution of each independent variable, given that other independent variables are in the model
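For a model with two independent variables, the coefficient of partial determination is commonly written as follows. This is a sketch in standard notation that does not appear on the slide: $SSR$ is the regression sum of squares, $SST$ the total sum of squares, and $SSR(X_1 \mid X_2)$ the contribution of $X_1$ given that $X_2$ is already in the model.

```latex
SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2)

r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}
                  {SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)}
```

A value near zero says $X_1$ explains little of the variation left over after $X_2$ is accounted for, so a stepwise procedure would be unlikely to add it.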
16
Best Subsets Regression
- Idea: estimate all possible regression equations using all possible combinations of independent variables
- Choose the best fit by looking for the highest adjusted $r^2$ and lowest standard error
- Stepwise regression and best subsets regression can be performed using PHStat
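The best-subsets idea can be sketched in a few lines: fit every non-empty combination of candidate variables and rank the fits by adjusted $r^2$. The example below is an illustration under stated assumptions, not PHStat's procedure: the data are made up (`y` is built from `x1` and `x2` plus small errors, and `x3` is unrelated), and the helper names are hypothetical.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns, y):
    """r-squared of the least-squares fit of y on an intercept plus columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(coefs, row)) for row in rows]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def best_subsets(data, y):
    """Return (adjusted r-squared, variable names) pairs, best first."""
    n = len(y)
    results = []
    for size in range(1, len(data) + 1):
        for subset in combinations(sorted(data), size):
            r2 = r_squared([data[name] for name in subset], y)
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - size - 1)
            results.append((adj, subset))
    return sorted(results, reverse=True)


# Made-up illustrative data (an assumption, not taken from the slides).
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
    "x3": [2, 7, 1, 8, 2, 8, 1, 8, 2, 8],
}
errors = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(data["x1"], data["x2"], errors)]

ranked = best_subsets(data, y)
for adj, subset in ranked:
    print(f"{adj:8.4f}  {', '.join(subset)}")
```

With $k$ candidate variables there are $2^k - 1$ models to fit, so the exhaustive search is practical only for a modest number of candidates.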
17
Aptness of the Model
Diagnostic checks on the model include verifying the assumptions of multiple regression:
- Each $X_i$ is linearly related to $Y$
- Errors have constant variance
- Errors are independent
- Errors are normally distributed
Errors (or residuals) are given by $e_i = Y_i - \hat{Y}_i$
18
Residual Analysis
[Figure: plots of residuals against x contrasting non-constant variance with constant variance, and not independent with independent residuals]
19
The Normality Assumption
- Errors are assumed to be normally distributed
- Standardized residuals can be calculated by computer
- Examine a histogram or a normal probability plot of the standardized residuals to check for normality
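A normal probability plot pairs each ordered residual with the standard normal quantile expected at its rank. The sketch below builds those pairs using only the Python standard library. Two assumptions are worth flagging: the residuals are made up, and the standardization simply divides by the standard error of the estimate, ignoring the leverage adjustment that statistical packages usually apply. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def standardize(residuals, n_predictors):
    """Divide residuals by the standard error of the estimate.

    Simplification: the leverage of each observation is ignored.
    """
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    s_e = sqrt(sse / (n - n_predictors - 1))
    return [e / s_e for e in residuals]


def normal_plot_points(values):
    """Pair each ordered value with the standard normal quantile
    at plotting position (i - 0.5) / n."""
    n = len(values)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, sorted(values)))


# Made-up residuals from a hypothetical one-predictor regression.
resid = [0.5, -1.2, 0.3, 0.9, -0.4, -0.7, 1.1, -0.5]
z = standardize(resid, n_predictors=1)
points = normal_plot_points(z)

for quantile, value in points:
    print(f"{quantile:7.3f}  {value:7.3f}")
```

If the errors are roughly normal, the printed pairs fall close to a straight line when plotted; pronounced curvature suggests skewness or heavy tails.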
20
Model Building Flowchart
1. Choose $X_1, X_2, \dots, X_k$
2. Run regression to find VIFs
3. Any VIF > 5?
   - Yes: more than one?
     - Yes: remove the variable with the highest VIF, then return to step 2
     - No: remove this X, then return to step 2
   - No: continue to step 4
4. Run subsets regression to obtain “best” models in terms of $C_p$
5. Do complete analysis
6. Add curvilinear term and/or transform variables as indicated
7. Perform predictions
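The $C_p$ statistic used to rank the subset models is not defined on the slide. One common textbook form is given here as an assumption: $k$ is the number of independent variables in the candidate model, $T$ is the number of parameters (including the intercept) in the full model, $n$ is the number of observations, and $R_k^2$ and $R_T^2$ are the coefficients of multiple determination of the candidate and full models.

```latex
C_p = \frac{(1 - R_k^2)(n - T)}{1 - R_T^2} - \bigl[\, n - 2(k + 1) \,\bigr]
```

Under this definition, candidate models whose $C_p$ is close to or below $k + 1$ are usually the ones carried forward to the complete analysis.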
21
Chapter Summary
- Described nonlinear regression models
- Described multicollinearity
- Discussed model building
  - Stepwise regression
  - Best subsets regression
- Examined residual plots to check model assumptions