Published by Joel Tucker. Modified over 9 years ago.
1
12a - 1 © 2000 Prentice-Hall, Inc. Statistics Multiple Regression and Model Building Chapter 12 part I
2
Learning Objectives
1. Explain the Linear Multiple Regression Model
2. Explain Residual Analysis
3. Test Overall Significance
4. Explain Multicollinearity
5. Interpret Linear Multiple Regression Computer Output
3
Types of Regression Models
4
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
5
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
Callout: Expanded in Multiple Regression
6
Linear Multiple Regression Model
Hypothesizing the Deterministic Component
Callout: Expanded in Multiple Regression
7
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
8
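Linear Multiple Regression Model
1. Relationship between 1 dependent & 2 or more independent variables is a linear function:
   $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
   - $Y_i$: dependent (response) variable
   - $X_{1i}, \dots, X_{ki}$: independent (explanatory) variables
   - $\beta_1, \dots, \beta_k$: population slopes
   - $\beta_0$: population Y-intercept
   - $\varepsilon_i$: random error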
9
Population Multiple Regression Model
Bivariate model
10
Sample Multiple Regression Model
Bivariate model
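The equations for these two slides did not survive in the transcript. As a hedged reconstruction, assuming "bivariate model" means a model with two explanatory variables and that the deck uses the usual hat notation for sample estimates, the population and sample models would read:

```latex
\begin{align}
  Y_i &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i
      && \text{(population model)} \\
  Y_i &= \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\varepsilon}_i
      && \text{(sample model)}
\end{align}
```

Here $\hat{\varepsilon}_i$ is the residual, the sample counterpart of the random error $\varepsilon_i$.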
11
Parameter Estimation
Callout: Expanded in Multiple Regression
12
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
13
Multiple Linear Regression Equations
Too complicated by hand! Ouch!
14
Interpretation of Estimated Coefficients
15
Interpretation of Estimated Coefficients
1. Slope ($\hat{\beta}_k$)
   - Estimated Y Changes by $\hat{\beta}_k$ for Each 1 Unit Increase in $X_k$, Holding All Other Variables Constant
   - Example: If $\hat{\beta}_1 = 2$, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising ($X_1$), Given the Number of Sales Reps ($X_2$)
16
Interpretation of Estimated Coefficients
1. Slope ($\hat{\beta}_k$)
   - Estimated Y Changes by $\hat{\beta}_k$ for Each 1 Unit Increase in $X_k$, Holding All Other Variables Constant
   - Example: If $\hat{\beta}_1 = 2$, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising ($X_1$), Given the Number of Sales Reps ($X_2$)
2. Y-Intercept ($\hat{\beta}_0$)
   - Average Value of Y When Every $X_k = 0$
17
Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) & newspaper circulation (000) on the number of ad responses (00). You’ve collected the following data:
Resp   Size   Circ
   1      1      2
   4      8      8
   1      3      1
   3      5      7
   2      6      4
   4     10      6
18
Parameter Estimation Computer Output
Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob>|T|
INTERCEP    1               0.0640           0.2599               0.246     0.8214
ADSIZE      1               0.2049           0.0588               3.656     0.0399
CIRC        1               0.2805           0.0686               4.089     0.0264
Annotations: the Parameter Estimate column gives $\hat{\beta}_0$ (INTERCEP), $\hat{\beta}_1$ (ADSIZE), and $\hat{\beta}_2$ (CIRC).
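The estimates in this output can be reproduced from the six data rows of the example. The following is a minimal sketch, not the software that produced the slide: it solves the least-squares normal equations for exactly two predictors using centered sums of squares and cross products. The function and variable names are illustrative assumptions.

```python
# Ad-response data from the example slide
# (Resp in hundreds, Size in sq. in., Circ in thousands).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def centered_cross_product(u, v):
    """Return sum((u_i - mean(u)) * (v_i - mean(v)))."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_two_predictor_ols(y, x1, x2):
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2."""
    s11 = centered_cross_product(x1, x1)
    s22 = centered_cross_product(x2, x2)
    s12 = centered_cross_product(x1, x2)
    s1y = centered_cross_product(x1, y)
    s2y = centered_cross_product(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only when x1 and x2 are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    n = len(y)
    b0 = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
    return b0, b1, b2


b0, b1, b2 = fit_two_predictor_ols(resp, size, circ)
print(f"INTERCEP {b0:.4f}  ADSIZE {b1:.4f}  CIRC {b2:.4f}")
```

Rounded to four decimals, the three values agree with the Parameter Estimate column (0.0640, 0.2049, 0.2805). With more than two predictors the same idea is usually expressed in matrix form and left to software, which is the point of the "too complicated by hand" slide.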
19
Interpretation of Coefficients Solution
20
Interpretation of Coefficients Solution
1. Slope ($\hat{\beta}_1$)
   - # Responses to Ad Is Expected to Increase by .2049 Hundred (20.49 Responses) for Each 1 Sq. In. Increase in Ad Size, Holding Circulation Constant
21
Interpretation of Coefficients Solution
1. Slope ($\hat{\beta}_1$)
   - # Responses to Ad Is Expected to Increase by .2049 Hundred (20.49 Responses) for Each 1 Sq. In. Increase in Ad Size, Holding Circulation Constant
2. Slope ($\hat{\beta}_2$)
   - # Responses to Ad Is Expected to Increase by .2805 Hundred (28.05 Responses) for Each 1 Unit (1,000) Increase in Circulation, Holding Ad Size Constant
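The "holding constant" reading of a slope can be checked numerically. The sketch below plugs the rounded estimates from the computer output into the fitted equation and compares two hypothetical ads that differ by one square inch at the same circulation. The specific ad size and circulation values are made up for illustration.

```python
# Rounded estimates copied from the computer output on the slide.
B0, B1, B2 = 0.0640, 0.2049, 0.2805


def predicted_responses(ad_size, circulation):
    """Predicted ad responses, in hundreds."""
    return B0 + B1 * ad_size + B2 * circulation


# Hypothetical comparison: same circulation, ad size one square inch apart.
base = predicted_responses(ad_size=4, circulation=5)
bigger_ad = predicted_responses(ad_size=5, circulation=5)

change_hundreds = bigger_ad - base
print(round(change_hundreds, 4), round(change_hundreds * 100, 2))
```

The difference equals the ADSIZE slope whatever circulation value is held fixed, because the circulation term cancels in the subtraction.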
22
Evaluating the Model
Callout: Expanded in Multiple Regression
23
Regression Modeling Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
24
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
25
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
Callouts: New! New! Expanded! New!
26
Variation Measures
27
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
Callouts: New! New! Expanded! New!
28
Coefficient of Multiple Determination
1. Proportion of Variation in Y ‘Explained’ by All X Variables Taken Together
   $R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{SSR}{SS_{yy}}$
2. Never Decreases When New X Variable Is Added to Model
   - Only Y Values Determine $SS_{yy}$
   - Disadvantage When Comparing Models
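As a worked instance of this ratio, the sketch below computes $R^2$ for the ad-response example. It assumes the rounded coefficient estimates from the parameter-estimates output and obtains SSR as $SS_{yy} - SSE$, a decomposition that holds for a least-squares fit (and so only approximately here, since the coefficients are rounded).

```python
# Ad-response data from the example slide.
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]

# Rounded estimates from the computer output.
b0, b1, b2 = 0.0640, 0.2049, 0.2805

mean_y = sum(resp) / len(resp)

# Total variation: depends on the Y values only.
ss_yy = sum((y - mean_y) ** 2 for y in resp)

# Unexplained variation: squared residuals from the fitted equation.
fitted = [b0 + b1 * x1 + b2 * x2 for x1, x2 in zip(size, circ)]
sse = sum((y - f) ** 2 for y, f in zip(resp, fitted))

# Explained variation and its share of the total.
ssr = ss_yy - sse
r_squared = ssr / ss_yy
print(f"SS_yy={ss_yy:.4f}  SSE={sse:.4f}  SSR={ssr:.4f}  R^2={r_squared:.4f}")
```

Because `ss_yy` is fixed by the Y values, adding another X variable can only leave SSE unchanged or shrink it, which is why $R^2$ never decreases.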
29
Residual Analysis
30
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
Callouts: New! New! Expanded! New!
31
Residual Analysis
1. Graphical Analysis of Residuals
   - Plot Estimated Errors vs. $X_i$ Values
     - Difference Between Actual $Y_i$ & Predicted $Y_i$
     - Estimated Errors Are Called Residuals
   - Plot Histogram or Stem-&-Leaf of Residuals
2. Purposes
   - Examine Functional Form (Linear vs. Non-Linear Model)
   - Evaluate Violations of Assumptions
32
Linear Regression Assumptions
1. Mean of Probability Distribution of Error Is 0
2. Probability Distribution of Error Has Constant Variance
3. Probability Distribution of Error Is Normal
4. Errors Are Independent
33
Residual Plot for Functional Form
[Figure: two residual plots, labeled "Add $X^2$ Term" and "Correct Specification".]
34
Residual Plot for Equal Variance
[Figure: two residual plots, labeled "Unequal Variance" (fan-shaped) and "Correct Specification".]
Standardized residuals used typically.
35
Residual Plot for Independence
[Figure: two residual plots, labeled "Not Independent" and "Correct Specification".]
Plots reflect the sequence in which the data were collected.
36
Residual Analysis Computer Output
Obs   Dep Var SALES   Predict Value   Residual   Student Residual   -2-1-0 1 2
  1          1.0000          0.6000     0.4000              1.044   | |** |
  2          1.0000          1.3000    -0.3000             -0.592   | *| |
  3          2.0000          2.0000          0              0.000   | | |
  4          2.0000          2.7000    -0.7000             -1.382   | **| |
  5          4.0000          3.4000     0.6000              1.567   | |*** |
Annotation: the last column is a plot of the standardized (student) residuals.
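The Residual and Student Residual columns can be recomputed from the SALES and Predict Value columns. The sketch below divides each residual by its estimated standard error, $s\sqrt{1 - h_i}$, where $h_i$ is the leverage of observation $i$ in a one-predictor fit. The slide does not show the predictor values, so `x = 1..5` is an assumption chosen to be consistent with the equally spaced predicted values.

```python
import math

# Columns copied from the computer output.
sales = [1, 1, 2, 2, 4]
predicted = [0.6, 1.3, 2.0, 2.7, 3.4]

# ASSUMPTION: predictor values are not printed on the slide; 1..5 is a guess
# that matches the equal 0.7 spacing of the predicted values.
x = [1, 2, 3, 4, 5]

n = len(sales)
residuals = [y - p for y, p in zip(sales, predicted)]

# Estimated standard deviation of the error (n - 2 df for one predictor).
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

# Leverage of each observation in a simple linear regression.
mean_x = sum(x) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
leverage = [1 / n + (xi - mean_x) ** 2 / s_xx for xi in x]

# Studentized residual: residual divided by its estimated standard error.
student = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for obs, (e, t) in enumerate(zip(residuals, student), start=1):
    print(f"{obs}  residual={e:7.4f}  student={t:7.3f}")
```

Under the stated assumption about `x`, the studentized values agree with the Student Residual column to three decimals.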
37
Testing Parameters
38
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
Callouts: New! New! Expanded! New!
39
Testing Overall Significance
1. Shows If There Is a Linear Relationship Between All X Variables Together & Y
2. Uses F Test Statistic
3. Hypotheses
   - $H_0$: $\beta_1 = \beta_2 = \dots = \beta_k = 0$ (No Linear Relationship)
   - $H_a$: At Least One Coefficient Is Not 0 (At Least One X Variable Affects Y)
40
Testing Overall Significance Computer Output
Analysis of Variance
Source    DF   Sum of Squares   Mean Square   F Value   Prob>F
Model      2           9.2497        4.6249    55.440   0.0043
Error      3           0.2503        0.0834
C Total    5           9.5000
Annotations: Model DF = k; Error DF = n - k - 1; C Total DF = n - 1; F Value = MS(Model) / MS(Error); Prob>F is the P-Value.
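The F value and its p-value can be rebuilt from the sums of squares in this table. The sketch below is illustrative only. For the p-value it relies on a closed form for the upper tail of the F distribution that is valid only when the numerator has exactly 2 degrees of freedom, as it does here; a general model would need an F-distribution routine from a statistics library.

```python
# Sums of squares and sizes taken from the ANOVA table.
ss_model = 9.2497
ss_error = 0.2503
n, k = 6, 2  # observations, explanatory variables

df_model = k
df_error = n - k - 1

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error

# Upper-tail probability P(F > f) for an F(2, df_error) distribution.
# This closed form applies ONLY because df_model == 2.
assert df_model == 2
p_value = (1 + df_model * f_value / df_error) ** (-df_error / 2)

print(f"F = {f_value:.2f}, p-value = {p_value:.4f}")
```

The recomputed F differs from the printed 55.440 in the second decimal because the table's sums of squares are already rounded. The p-value rounds to the printed 0.0043, so $H_0$ would be rejected at the usual 0.05 level.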
41
Multicollinearity
42
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
Callouts: New! New! Expanded! New!
43
Multicollinearity
1. High Correlation Between X Variables
2. Coefficients Measure Combined Effect
3. Leads to Unstable Coefficients Depending on X Variables in Model
4. Always Exists -- Matter of Degree
5. Example: Using Both Age & Height as Explanatory Variables in Same Model
44
Detecting Multicollinearity
1. Examine Correlation Matrix
   - Correlations Between Pairs of X Variables Are More than With Y Variable
2. Examine Variance Inflation Factor (VIF)
   - If $VIF_j > 5$, Multicollinearity Exists
3. Few Remedies
   - Obtain New Sample Data
   - Eliminate One Correlated X Variable
45
Correlation Matrix Computer Output
Correlation Analysis
Pearson Corr Coeff / Prob>|R| under H0: Rho=0 / N=6
            RESPONSE    ADSIZE      CIRC
RESPONSE     1.00000   0.90932   0.93117
             0.0       0.0120    0.0069
ADSIZE       0.90932   1.00000   0.74118
             0.0120    0.0       0.0918
CIRC         0.93117   0.74118   1.00000
             0.0069    0.0918    0.0
Annotations: $r_{Y1} = 0.90932$; $r_{Y2} = 0.93117$; $r_{12} = 0.74118$; the diagonal entries are all 1's.
46
Variance Inflation Factors Computer Output
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob>|T|
INTERCEP    1               0.0640           0.2599               0.246     0.8214
ADSIZE      1               0.2049           0.0588               3.656     0.0399
CIRC        1               0.2805           0.0686               4.089     0.0264

Variable   DF   Variance Inflation
INTERCEP    1               0.0000
ADSIZE      1               2.2190
CIRC        1               2.2190
Annotation: $VIF_1 < 5$
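Both multicollinearity diagnostics can be recomputed from the example data. The sketch below calculates the Pearson correlation between the two X variables and then the variance inflation factor as $1 / (1 - r_{12}^2)$. That shortcut is specific to a model with exactly two explanatory variables. In general $VIF_j = 1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing $X_j$ on the other X variables. The helper name is an illustrative assumption.

```python
import math

# Explanatory variables from the ad-response example.
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


r12 = pearson(size, circ)

# With only two X variables, both share the same VIF.
vif = 1 / (1 - r12 ** 2)

print(f"r12 = {r12:.5f}, VIF = {vif:.4f}")
```

The results match the printed 0.74118 and 2.2190. Since 2.2190 is below the slide's threshold of 5, this rule does not flag multicollinearity in the example.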
47
Regression Cautions
48
Regression Cautions
1. Violated Assumptions
2. Relevancy of Historical Data
3. Level of Significance
4. Extrapolation
5. Cause & Effect
49
Extrapolation
[Figure: Y plotted against X. The relevant range of X is labeled "Interpolation"; the regions on either side of it are labeled "Extrapolation".]
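A simple guard against unintended extrapolation is to compare a new X value with the range observed in the sample. The sketch below does this for one variable, using the ad sizes from the earlier example. The function name and the test values are illustrative assumptions.

```python
# Observed ad sizes (sq. in.) from the ad-response example.
size = [1, 8, 3, 5, 6, 10]


def classify_prediction_point(x_new, observed_x):
    """Label a new X value relative to the relevant (observed) range."""
    lowest, highest = min(observed_x), max(observed_x)
    if lowest <= x_new <= highest:
        return "interpolation"
    return "extrapolation"


print(classify_prediction_point(5, size))   # inside the observed range 1..10
print(classify_prediction_point(12, size))  # outside the observed range
```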
50
Cause & Effect
[Figure: Liquor Consumption plotted against # Teachers.]
51
Conclusion
1. Explained the Linear Multiple Regression Model
2. Explained Residual Analysis
3. Tested Overall Significance
4. Explained Multicollinearity
5. Interpreted Linear Multiple Regression Computer Output
52
End of Chapter
Any blank slides that follow are blank intentionally.