Slide 1: Chapter 14 – Multiple Linear Regression

Concise Managerial Statistics, Kvanli, Pavur, Keeling (Thomson/South-Western, 2006)
Slides prepared by Jeff Heyl, Lincoln University
Slide 2: Multiple Regression Model

Y = β₀ + β₁X₁ + β₂X₂ + ⋯ + βₖXₖ + e

Deterministic component: β₀ + β₁X₁ + β₂X₂ + ⋯ + βₖXₖ

The least squares estimates minimize SSE = Σ(Y − Ŷ)²
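To make the least squares criterion concrete, here is a minimal numpy sketch for a two-predictor model; the data values are invented for illustration and are not from the text.

```python
# A minimal least-squares sketch for Y = b0 + b1*X1 + b2*X2 + e.
# The observations below are invented for illustration only.
import numpy as np

X = np.array([[30.0, 8.0],   # X1, X2 for each observation
              [50.0, 2.0],
              [42.0, 4.0],
              [35.0, 5.0],
              [60.0, 3.0],
              [48.0, 6.0]])
y = np.array([22.0, 26.0, 24.0, 21.0, 30.0, 27.0])

# Prepend a column of ones so the first coefficient estimates beta_0.
X1 = np.column_stack([np.ones(len(y)), X])

# lstsq minimizes SSE = sum((y - X1 @ b)**2), the least squares criterion.
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ b
sse = np.sum((y - y_hat) ** 2)
print("b:", b, "SSE:", sse)
```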
Slide 3: Multiple Regression Model

Figure 14.1: The regression plane Y = β₀ + β₁X₁ + β₂X₂ over the (X₁, X₂) plane, with positive and negative errors e shown as vertical distances from observed points to the plane.
Slide 4: Housing Example

Y  = home square footage (100s)
X₁ = annual income ($1,000s)
X₂ = family size
X₃ = combined years of education beyond high school for all household members
Slide 5: Multiple Regression Model (Figure 14.2)
Slide 6: Multiple Regression Model (Figure 14.3)
Slide 7: Multiple Regression Model (Figure 14.4)
Slide 8: Assumptions of the Multiple Regression Model

1. The errors follow a normal distribution, centered at zero.
2. The errors have a common (constant) variance.
3. The errors are independent.
Slide 9: Errors in Multiple Linear Regression

Figure 14.5: Errors e as vertical deviations from the plane Y = β₀ + β₁X₁ + β₂X₂, illustrated at the points (X₁ = 30, X₂ = 8) and (X₁ = 50, X₂ = 2).
Slide 10: Multiple Regression Model

An estimate of σₑ²:

s² = σ̂ₑ² = SSE / [n − (k + 1)] = SSE / (n − k − 1)
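In code this is a one-liner once the fitted values are in hand; a small sketch, assuming numpy arrays of observed and fitted values (the function name is mine):

```python
import numpy as np

def error_variance(y, y_hat, k):
    """s^2 = SSE / (n - k - 1): unbiased estimate of the error variance,
    where k is the number of predictor variables."""
    resid = np.asarray(y) - np.asarray(y_hat)
    return np.sum(resid ** 2) / (len(resid) - k - 1)
```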
Slide 11: Hypothesis Test for the Significance of the Model

H₀: β₁ = β₂ = ⋯ = βₖ = 0
Hₐ: at least one of the β's ≠ 0

F = MSR / MSE

Reject H₀ if F > F(α; k, n − k − 1)
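A sketch of this test using scipy's F distribution; the helper name is mine, and sse, sst, n, and k are assumed to come from a fitted model:

```python
from scipy import stats

def model_f_test(sse, sst, n, k, alpha=0.05):
    """Overall significance test H0: beta_1 = ... = beta_k = 0.
    F = MSR / MSE; reject H0 if F exceeds the upper-alpha critical value."""
    msr = (sst - sse) / k            # mean square regression
    mse = sse / (n - k - 1)          # mean square error
    f = msr / mse
    f_crit = stats.f.ppf(1 - alpha, k, n - k - 1)
    p_value = stats.f.sf(f, k, n - k - 1)
    return f, f_crit, p_value
```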
Slide 12: Associated F Curve

Figure 14.6: The F curve with ν₁ and ν₂ degrees of freedom; the rejection region for H₀ lies in the upper tail, with area α to the right of F(α; ν₁, ν₂).
Slide 13: Test for H₀: βᵢ = 0

H₀: β₁ = 0 (X₁ does not contribute)    Hₐ: β₁ ≠ 0 (X₁ does contribute)
H₀: β₂ = 0 (X₂ does not contribute)    Hₐ: β₂ ≠ 0 (X₂ does contribute)
H₀: β₃ = 0 (X₃ does not contribute)    Hₐ: β₃ ≠ 0 (X₃ does contribute)

Test statistic: t = bᵢ / s(bᵢ); reject H₀ if |t| > t(α/2; n − k − 1)

(1 − α)·100% confidence interval: bᵢ − t(α/2; n − k − 1)·s(bᵢ) to bᵢ + t(α/2; n − k − 1)·s(bᵢ)
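A sketch that computes these t statistics and confidence intervals directly from the design matrix; the function name is mine, and X1 is assumed to carry a leading column of ones:

```python
import numpy as np
from scipy import stats

def coefficient_tests(X1, y, alpha=0.05):
    """t statistics, two-sided p-values, and (1 - alpha) CIs for each
    coefficient. X1 must already include the leading column of ones."""
    n, p = X1.shape                           # p = k + 1
    XtX_inv = np.linalg.inv(X1.T @ X1)
    b = XtX_inv @ X1.T @ y
    resid = y - X1 @ b
    s2 = resid @ resid / (n - p)              # SSE / (n - k - 1)
    s_b = np.sqrt(s2 * np.diag(XtX_inv))      # standard errors of the b's
    t = b / s_b
    p_values = 2 * stats.t.sf(np.abs(t), n - p)
    t_crit = stats.t.ppf(1 - alpha / 2, n - p)
    return t, p_values, (b - t_crit * s_b, b + t_crit * s_b)
```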
Slide 14: Housing Example (Figure 14.7)
Slide 15: BB Investments Example

BB Investments wants to develop a model to predict the amount of money invested by various clients in its portfolio of high-risk securities.

Y  = investment amount ($)
X₁ = annual income ($1,000s)
X₂ = economic index, showing the expected increase in interest levels, manufacturing costs, and price inflation (1–100 scale)
Slide 16: BB Investments Example (Figure 14.8)
Slide 17: BB Investments Example (Figure 14.9)
Slide 18: BB Investments Example (Figure 14.10)
Slide 19: Coefficient of Determination

SST = total sum of squares = SSᵧ = Σ(Y − Ȳ)² = ΣY² − (ΣY)²/n

R² = 1 − SSE/SST

F = (R²/k) / [(1 − R²)/(n − k − 1)]
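These two formulas translate directly into code; a sketch with assumed inputs sse, sst, n, and k (function name mine):

```python
def r_squared_and_f(sse, sst, n, k):
    """R^2 = 1 - SSE/SST and the equivalent form of the overall F statistic."""
    r2 = 1 - sse / sst
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    return r2, f
```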
Slide 20: Partial F Test

R²c = the value of R² for the complete model
R²r = the value of R² for the reduced model

Test statistic: F = [(R²c − R²r)/v₁] / [(1 − R²c)/v₂]

where v₁ is the number of terms dropped from the complete model and v₂ = n − (number of predictors in the complete model) − 1.
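A sketch of the test, assuming the v₁ and v₂ definitions given above (function name mine):

```python
from scipy import stats

def partial_f_test(r2_complete, r2_reduced, n, k_complete, k_reduced):
    """Partial F test: does the complete model add significantly over the
    reduced one? v1 = terms dropped, v2 = error df of the complete model."""
    v1 = k_complete - k_reduced
    v2 = n - k_complete - 1
    f = ((r2_complete - r2_reduced) / v1) / ((1 - r2_complete) / v2)
    return f, stats.f.sf(f, v1, v2)   # statistic and p-value
```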
Slide 21: Motormax Example

Motormax produces electric motors for home furnaces. The company wants to study the relationship between the dollars spent per week inspecting finished products (X) and the number of motors produced during that week that were returned to the factory by the customer (Y).
Slide 22: Motormax Example (Figure 14.11)
Slide 23: Quadratic Curves

Figure 14.12: Two quadratic relationships between Y and X, panels (a) and (b), showing curvature in opposite directions.
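A quadratic model is still linear regression once X² is added as a column; the sketch below uses np.polyfit on invented Motormax-style numbers, not the book's data:

```python
import numpy as np

# Invented Motormax-style data: weekly inspection dollars (x), returns (y).
x = np.array([100.0, 200.0, 300.0, 400.0, 500.0])
y = np.array([40.0, 28.0, 22.0, 21.0, 24.0])

# Fit Y = b0 + b1*X + b2*X^2; polyfit returns the highest power first.
b2, b1, b0 = np.polyfit(x, y, deg=2)
print(f"Y-hat = {b0:.3f} + {b1:.4f} X + {b2:.6f} X^2")
```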
Slide 24: Motormax Example (Figure 14.13)
Slide 25: Error from Extrapolation

Figure 14.14: Predicted versus actual values of Y plotted against X, illustrating how the fitted model can err when extrapolated beyond the range of the sample data.
Slide 26: Multicollinearity

Multicollinearity occurs when independent variables are highly correlated with each other. It is often detectable through pairwise correlations, readily available in statistical packages. The variance inflation factor can also be used:

VIFⱼ = 1 / (1 − R²ⱼ)

Conclude that severe multicollinearity exists when the maximum VIFⱼ > 10.
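A sketch of the VIF computation, regressing each predictor on all the others with numpy; the function name is mine, and X is assumed to hold one predictor per column with no intercept column:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on all the other predictors."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        # Regress column j on an intercept plus the remaining columns.
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
        resid = X[:, j] - Z @ b
        r2_j = 1 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out[j] = 1 / (1 - r2_j)
    return out   # flag severe multicollinearity if out.max() > 10
```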
Slide 27: Multicollinearity Example (Figure 14.15)
Slide 28: Multicollinearity Example (Figure 14.16)
Slide 29: Multicollinearity Example (Figure 14.17)
Slide 30: Multicollinearity

- The stepwise selection process can help eliminate correlated predictor variables.
- Other advanced procedures, such as ridge regression, can also be applied.
- Care should be taken during the model selection phase, as multicollinearity can be difficult to detect and eliminate.
Slide 31: Dummy Variables

Dummy, or indicator, variables allow for the inclusion of qualitative variables in the model. For example:

X₁ = 1 if female, 0 if male
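In code, building such a dummy is a one-line comparison; the data below are hypothetical:

```python
import numpy as np

# Hypothetical data: encode a qualitative variable as a 0/1 dummy.
sex = np.array(["F", "M", "F", "F", "M"])
x1 = (sex == "F").astype(float)   # X1 = 1 if female, 0 if male
print(x1)                         # [1. 0. 1. 1. 0.]
```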
Slide 32: Dummy Variable Example (Figure 14.18)
Slide 33: Stepwise Procedures

These procedures choose or eliminate variables one at a time, in an effort to avoid including variables that either have no predictive ability or are highly correlated with other predictor variables.

- Forward regression: add one variable at a time until the next variable's contribution is insignificant (a sketch follows below).
- Backward regression: remove one variable at a time, starting with the "worst," until R² drops significantly.
- Stepwise regression: forward regression with the ability to remove variables that become insignificant.
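A minimal sketch of forward selection using a partial F criterion; the function name, the f_to_enter threshold, and the entry rule are my own simplifications of what statistical packages do:

```python
import numpy as np

def forward_select(X, y, f_to_enter=4.0):
    """Greedy forward selection: at each step add the candidate predictor
    with the largest partial F for entering, stopping when no candidate
    exceeds f_to_enter. Returns chosen column indices in order of entry."""
    n, k = X.shape

    def sse_of(cols):
        Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r = y - Z @ b
        return r @ r

    chosen = []
    sse_cur = np.sum((y - y.mean()) ** 2)      # intercept-only SSE = SST
    while len(chosen) < k:
        best, best_f, best_sse = None, f_to_enter, None
        for c in range(k):
            if c in chosen:
                continue
            sse_new = sse_of(chosen + [c])
            df2 = n - (len(chosen) + 1) - 1    # error df after adding c
            f = (sse_cur - sse_new) / (sse_new / df2)
            if f > best_f:
                best, best_f, best_sse = c, f, sse_new
        if best is None:                       # no candidate passes the bar
            break
        chosen.append(best)
        sse_cur = best_sse
    return chosen
```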
Slide 34: Stepwise Regression (Figure 14.19)

Include X₃ → include X₆ → include X₂ → include X₅ → remove X₂ (when X₅ was inserted into the model, X₂ became unnecessary) → include X₇ → remove X₇ (it is insignificant) → stop.

Final model includes X₃, X₅, and X₆.
Slide 35: Checking Model Assumptions

- Assumption 1 (normal distribution): construct a histogram of the residuals.
- Assumption 2 (constant variance): plot the residuals versus the predicted Ŷ values.
- Assumption 3 (independent errors): compute the Durbin-Watson statistic (sketched below).
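The Durbin-Watson statistic is easy to compute from the residuals; a sketch (function name mine):

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum of squared successive residual differences over SSE.
    Values near 2 are consistent with independent errors; values well
    below 2 suggest positive autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```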
Slide 36: Detecting Sample Outliers

- Sample leverages
- Standardized residuals
- Cook's distance measure

Standardized residual = (Yᵢ − Ŷᵢ) / [s·√(1 − hᵢ)]
Slide 37: Cook's Distance Measure

Dᵢ = (standardized residual)² · [1/(k + 1)] · [hᵢ/(1 − hᵢ)] = (Yᵢ − Ŷᵢ)²·hᵢ / [(k + 1)·s²·(1 − hᵢ)²]

Table 14.1: Cutoff values for Dᵢ

  k      | 1 or 2 | 3 or 4 | ≥ 5
  D(max) |   .8   |   .9   | 1.0
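A sketch computing all three outlier diagnostics from the hat matrix; the function name is mine, X1 is assumed to include the column of ones, and the formulas mirror the two slides above:

```python
import numpy as np

def influence_measures(X1, y):
    """Leverages h_i, standardized residuals, and Cook's distances D_i.
    X1 must already include the leading column of ones (p = k + 1 columns)."""
    n, p = X1.shape
    H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T    # hat matrix
    h = np.diag(H)                              # leverages
    resid = y - H @ y
    s2 = resid @ resid / (n - p)
    std_resid = resid / np.sqrt(s2 * (1 - h))
    cooks_d = std_resid ** 2 * h / (p * (1 - h))
    return h, std_resid, cooks_d
```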
Slide 38: Residual Analysis (Figure 14.20)
Slide 39: Residual Analysis (Figure 14.21)
Slide 40: Residual Analysis (Figure 14.22)
Slide 41: Prediction Using Multiple Regression (Figure 14.23)
Slide 42: Prediction Using Multiple Regression (Figure 14.24)
Slide 43: Prediction Using Multiple Regression (Figure 14.25)
Slide 44: Prediction Using Multiple Regression: Confidence and Prediction Intervals

(1 − α)·100% confidence interval for μ(Y|x₀):
  Ŷ − t(α/2; n − k − 1)·s_Ŷ to Ŷ + t(α/2; n − k − 1)·s_Ŷ

(1 − α)·100% prediction interval for an individual Y at x₀:
  Ŷ − t(α/2; n − k − 1)·√(s² + s_Ŷ²) to Ŷ + t(α/2; n − k − 1)·√(s² + s_Ŷ²)
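A sketch of both intervals at a new point x₀; the function name is mine, and X1 and x0 are both assumed to include the leading 1 for the intercept:

```python
import numpy as np
from scipy import stats

def intervals_at(X1, y, x0, alpha=0.05):
    """CI for the mean response and PI for an individual Y at the point x0."""
    n, p = X1.shape
    XtX_inv = np.linalg.inv(X1.T @ X1)
    b = XtX_inv @ X1.T @ y
    resid = y - X1 @ b
    s2 = resid @ resid / (n - p)
    y0 = x0 @ b                                 # point prediction Y-hat
    s_yhat = np.sqrt(s2 * (x0 @ XtX_inv @ x0))  # std. error of the mean
    t = stats.t.ppf(1 - alpha / 2, n - p)
    ci = (y0 - t * s_yhat, y0 + t * s_yhat)
    pm = t * np.sqrt(s2 + s_yhat ** 2)          # wider: adds error variance
    return ci, (y0 - pm, y0 + pm)
```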
Slide 45: Interaction Effects

An interaction effect means that the way the variables occur together has an impact on the prediction of the dependent variable.

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + e
μᵧ = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂
Slide 46: Interaction Effects

Figure 14.26: μᵧ plotted against X₁ for X₂ = 2 and X₂ = 5. (a) With interaction, the lines have different slopes (μᵧ = 18 + 5X₁ and μᵧ = 30 − 10X₁). (b) Without interaction, the lines are parallel (μᵧ = 18 + 15X₁ and μᵧ = 30 + 15X₁).
Slide 47: Quadratic and Second-Order Models

Quadratic effects:
Y = β₀ + β₁X₁ + β₂X₁² + e

Complete second-order model (two predictors):
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + β₄X₁² + β₅X₂² + e

Complete second-order model (three predictors):
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₁X₂ + β₅X₁X₃ + β₆X₂X₃ + β₇X₁² + β₈X₂² + β₉X₃² + e
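Fitting any of these models is ordinary least squares once the design matrix carries the product and squared columns; a sketch for the two-predictor second-order model (function name mine):

```python
import numpy as np

def second_order_design(x1, x2):
    """Design matrix for the complete second-order model in two predictors:
    columns 1, X1, X2, X1*X2, X1^2, X2^2, matching the equation above."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    return np.column_stack([np.ones_like(x1), x1, x2,
                            x1 * x2, x1 ** 2, x2 ** 2])
```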
Slide 48: Financial Example (Figure 14.27)
Slide 49: Financial Example (Figure 14.28)
Slide 50: Financial Example (Figure 14.29)
Slide 51: Financial Example (Figure 14.30)