12a - 1 © 2000 Prentice-Hall, Inc. Statistics Multiple Regression and Model Building Chapter 12 part I.

1 Statistics: Multiple Regression and Model Building, Chapter 12 part I

2 Learning Objectives
1. Explain the Linear Multiple Regression Model
2. Explain Residual Analysis
3. Test Overall Significance
4. Explain Multicollinearity
5. Interpret Linear Multiple Regression Computer Output

3 Types of Regression Models

5 Regression Modeling Steps (callout: Expanded in Multiple Regression)
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random Error Term
   - Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation

6 Linear Multiple Regression Model: Hypothesizing the Deterministic Component (callout: Expanded in Multiple Regression)

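8 Linear Multiple Regression Model
1. Relationship between 1 dependent & 2 or more independent variables is a linear function:
   $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
   - $Y_i$: dependent (response) variable
   - $X_{1i}, \ldots, X_{ki}$: independent (explanatory) variables
   - $\beta_1, \ldots, \beta_k$: population slopes
   - $\beta_0$: population Y-intercept
   - $\varepsilon_i$: random error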

9 Population Multiple Regression Model (bivariate model)

10 Sample Multiple Regression Model (bivariate model)

11 Parameter Estimation (callout: Expanded in Multiple Regression)

13 Multiple Linear Regression Equations. Too complicated by hand! Ouch!

14 Interpretation of Estimated Coefficients

16 Interpretation of Estimated Coefficients
1. Slope ($\hat{\beta}_k$)
   - Estimated Y Changes by $\hat{\beta}_k$ for Each 1 Unit Increase in $X_k$ Holding All Other Variables Constant
   - Example: If $\hat{\beta}_1 = 2$, then Sales (Y) Is Expected to Increase by 2 for Each 1 Unit Increase in Advertising ($X_1$) Given the Number of Sales Reps ($X_2$)
2. Y-Intercept ($\hat{\beta}_0$)
   - Average Value of Y When All $X_k = 0$

17 Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) & newspaper circulation (000) on the number of ad responses (00). You've collected the following data:
Resp  Size  Circ
   1     1     2
   4     8     8
   1     3     1
   3     5     7
   2     6     4
   4    10     6

18 Parameter Estimation Computer Output
Parameter Estimates
                    Parameter   Standard   T for H0:
Variable    DF       Estimate      Error     Param=0   Prob>|T|
INTERCEP     1         0.0640     0.2599       0.246     0.8214
ADSIZE       1         0.2049     0.0588       3.484     0.0399
CIRC         1         0.2805     0.0686       4.089     0.0264
The Parameter Estimate column gives $\hat{\beta}_0$ (INTERCEP), $\hat{\beta}_1$ (ADSIZE), and $\hat{\beta}_2$ (CIRC).
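The estimates in this output can be reproduced with an ordinary least-squares fit. The following is a minimal sketch, assuming NumPy is available; the variable names (resp, size, circ) are illustrative and not part of the original slides, and the data are the six observations from the Parameter Estimation Example. Each t value is the estimate divided by its standard error.

```python
import numpy as np

# Six observations from the example: responses (hundreds),
# ad size (sq. in.), circulation (thousands).
resp = np.array([1, 4, 1, 3, 2, 4], dtype=float)
size = np.array([1, 8, 3, 5, 6, 10], dtype=float)
circ = np.array([2, 8, 1, 7, 4, 6], dtype=float)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(resp), size, circ])

# Least-squares estimates (intercept, ad-size slope, circulation slope).
coef, *_ = np.linalg.lstsq(X, resp, rcond=None)

# Standard errors from s^2 (X'X)^-1, where s^2 = SSE / (n - k - 1)
# and k + 1 is the number of columns in X.
resid = resp - X @ coef
n, n_params = X.shape
s2 = resid @ resid / (n - n_params)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = coef / se

for name, b, s, t in zip(["INTERCEP", "ADSIZE", "CIRC"], coef, se, t_stats):
    print(f"{name:<9}{b:9.4f}{s:9.4f}{t:9.3f}")
```

np.linalg.lstsq is used here rather than solving the normal equations by hand, which is the step the slides call too complicated to do manually.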

21 Interpretation of Coefficients Solution
1. Slope ($\hat{\beta}_1$)
   - # Responses to Ad Is Expected to Increase by .2049 Hundred (20.49 Responses) for Each 1 Sq. In. Increase in Ad Size Holding Circulation Constant
2. Slope ($\hat{\beta}_2$)
   - # Responses to Ad Is Expected to Increase by .2805 Hundred (28.05 Responses) for Each 1 Unit (1,000) Increase in Circulation Holding Ad Size Constant
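The unit conversion behind these interpretations (responses are recorded in hundreds) can be sketched as follows. The function name and its arguments are hypothetical helpers introduced only for illustration; the two slopes are the estimates reported in the computer output.

```python
# Estimated slopes from the Parameter Estimates output (responses in hundreds).
B_ADSIZE = 0.2049   # per 1 sq. in. of ad size, circulation held constant
B_CIRC = 0.2805     # per 1 unit (1,000) of circulation, ad size held constant


def expected_change_in_responses(d_size=0.0, d_circ=0.0):
    """Expected change in the number of responses (not hundreds) for the
    given changes in ad size (sq. in.) and circulation (thousands)."""
    return 100 * (B_ADSIZE * d_size + B_CIRC * d_circ)


# One extra square inch of ad size, circulation unchanged.
print(round(expected_change_in_responses(d_size=1), 2))
# One extra unit (1,000) of circulation, ad size unchanged.
print(round(expected_change_in_responses(d_circ=1), 2))
```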

22 Evaluating the Model (callout: Expanded in Multiple Regression)

25 Evaluating Multiple Regression Model Steps (callouts: New! New! Expanded! New!)
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity

26 Variation Measures

28 Coefficient of Multiple Determination
1. Proportion of Variation in Y 'Explained' by All X Variables Taken Together
   $R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}} = \frac{SSR}{SS_{yy}}$
2. Never Decreases When New X Variable Is Added to Model
   - Only Y Values Determine $SS_{yy}$
   - Disadvantage When Comparing Models
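Both points can be checked numerically on the ad-response data from this deck. This is a sketch under stated assumptions: NumPy is available, the helper r_squared is a hypothetical name, and the extra "noise" column is an arbitrary made-up variable added only to show that R^2 does not drop when a predictor is added.

```python
import numpy as np

# Ad-response example: responses (hundreds), ad size (sq. in.), circulation (000).
resp = np.array([1, 4, 1, 3, 2, 4], dtype=float)
size = np.array([1, 8, 3, 5, 6, 10], dtype=float)
circ = np.array([2, 8, 1, 7, 4, 6], dtype=float)


def r_squared(y, *predictors):
    """R^2 = SSR / SS_yy for a least-squares fit of y on the predictors
    plus an intercept."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_yy = np.sum((y - y.mean()) ** 2)      # total variation: depends on y only
    ssr = np.sum((fitted - y.mean()) ** 2)   # explained variation
    return ssr / ss_yy


r2_size = r_squared(resp, size)
r2_both = r_squared(resp, size, circ)

# Arbitrary extra column, unrelated to the problem (illustration only).
noise = np.array([3, 1, 4, 1, 5, 9], dtype=float)
r2_noise = r_squared(resp, size, circ, noise)

print(f"size only: {r2_size:.4f}")
print(f"size + circ: {r2_both:.4f}")
print(f"size + circ + noise: {r2_noise:.4f}")
```

Because SS_yy is fixed by the Y values, adding any column can only leave SSR unchanged or raise it, which is why a higher R^2 alone is a weak basis for comparing models with different numbers of X variables.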

29 Residual Analysis

31 Residual Analysis
1. Graphical Analysis of Residuals
   - Plot Estimated Errors vs. $X_i$ Values
     - Difference Between Actual $Y_i$ & Predicted $Y_i$
     - Estimated Errors Are Called Residuals
   - Plot Histogram or Stem-&-Leaf of Residuals
2. Purposes
   - Examine Functional Form (Linear vs. Non-Linear Model)
   - Evaluate Violations of Assumptions

32 Linear Regression Assumptions
1. Mean of Probability Distribution of Error Is 0
2. Probability Distribution of Error Has Constant Variance
3. Probability Distribution of Error Is Normal
4. Errors Are Independent

33 Residual Plot for Functional Form
[Residual plots: "Add $X^2$ Term" vs. "Correct Specification"]

34 Residual Plot for Equal Variance
[Residual plots: "Unequal Variance" (fan-shaped) vs. "Correct Specification"] Standardized residuals are typically used.

35 Residual Plot for Independence
[Residual plots: "Not Independent" vs. "Correct Specification"] Plots reflect the sequence in which the data were collected.

36 Residual Analysis Computer Output
       Dep Var   Predict              Student
Obs      SALES     Value   Residual   Residual    -2 -1 0 1 2
  1     1.0000    0.6000     0.4000      1.044    |    |**  |
  2     1.0000    1.3000    -0.3000     -0.592    |   *|    |
  3     2.0000    2.0000     0           0.000    |    |    |
  4     2.0000    2.7000    -0.7000     -1.382    |  **|    |
  5     4.0000    3.4000     0.6000      1.567    |    |*** |
The last column is a plot of the standardized (student) residuals.
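The Student Residual column can be reproduced as each residual divided by its estimated standard error. The sketch below makes one loud assumption: the slide does not show the X values, so x = 1, 2, 3, 4, 5 is assumed here because a simple linear fit on those values reproduces the predicted values 0.6 through 3.4. Only the standard library is used.

```python
import math

# SALES values from the output; x = 1..5 is an ASSUMPTION (the predictor is
# not shown on the slide) chosen because it reproduces the Predict Value column.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 1.0, 2.0, 2.0, 4.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx          # estimated slope
b0 = y_bar - b1 * x_bar     # estimated intercept

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# s = sqrt(SSE / (n - 2)) estimates the standard deviation of the error.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage of each observation in simple linear regression.
leverage = [1 / n + (xi - x_bar) ** 2 / ss_xx for xi in x]

# Internally studentized residual: e_i / (s * sqrt(1 - h_i)).
student = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for i, (p, e, t) in enumerate(zip(predicted, residuals, student), start=1):
    print(f"{i}  {p:7.4f}  {e:8.4f}  {t:7.3f}")
```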

37 Testing Parameters

39 Testing Overall Significance
1. Shows If There Is a Linear Relationship Between All X Variables Together & Y
2. Uses F Test Statistic
3. Hypotheses
   - $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ (No Linear Relationship)
   - $H_a$: At Least One Coefficient Is Not 0 (At Least One X Variable Affects Y)

40 Testing Overall Significance Computer Output
Analysis of Variance
                    Sum of       Mean
Source     DF      Squares     Square    F Value    Prob>F
Model       2       9.2497     4.6249     55.440    0.0043
Error       3       0.2503     0.0834
C Total     5       9.5000
DF: Model = k, Error = n - k - 1, C Total = n - 1. F Value = MS(Model) / MS(Error). Prob>F is the P-value.
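The arithmetic behind this table can be sketched from the printed sums of squares. The variable names are illustrative. The P-value line relies on a closed form for the upper tail of the F distribution that holds only when the numerator has exactly 2 degrees of freedom, as it does here; for other cases a statistics library would be needed.

```python
# Sums of squares and sample size taken from the ANOVA output.
ss_model = 9.2497
ss_error = 0.2503
n = 6   # observations
k = 2   # X variables (ADSIZE, CIRC)

df_model = k
df_error = n - k - 1
df_total = n - 1

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error

# Special case, valid for exactly 2 numerator degrees of freedom:
# P(F > f) = (1 + 2 f / df_error) ** (-df_error / 2).
p_value = (1 + 2 * f_value / df_error) ** (-df_error / 2)

print(f"MS(Model) = {ms_model:.4f}, MS(Error) = {ms_error:.4f}")
print(f"F = {f_value:.2f}, P-value = {p_value:.4f}")
```

The F value computed from the rounded table entries differs slightly in the second decimal from the printed 55.440, which the software computes at full precision.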

41 Multicollinearity

43 Multicollinearity
1. High Correlation Between X Variables
2. Coefficients Measure Combined Effect
3. Leads to Unstable Coefficients Depending on X Variables in Model
4. Always Exists -- Matter of Degree
5. Example: Using Both Age & Height as Explanatory Variables in Same Model

44 Detecting Multicollinearity
1. Examine Correlation Matrix
   - Correlations Between Pairs of X Variables Are Higher Than Their Correlations With the Y Variable
2. Examine Variance Inflation Factor (VIF)
   - If $VIF_j > 5$, Multicollinearity Exists
3. Few Remedies
   - Obtain New Sample Data
   - Eliminate One Correlated X Variable

45 Correlation Matrix Computer Output
Correlation Analysis
Pearson Corr Coeff / Prob>|R| under H0: Rho=0 / N=6
             RESPONSE     ADSIZE       CIRC
RESPONSE      1.00000    0.90932    0.93117
              0.0        0.0120     0.0069
ADSIZE        0.90932    1.00000    0.74118
              0.0120     0.0        0.0918
CIRC          0.93117    0.74118    1.00000
              0.0069     0.0918     0.0
Annotations: $r_{Y1}$ = 0.90932, $r_{Y2}$ = 0.93117, $r_{12}$ = 0.74118; the diagonal entries are all 1's.

46 Variance Inflation Factors Computer Output
                    Parameter   Standard   T for H0:
Variable    DF       Estimate      Error     Param=0   Prob>|T|
INTERCEP     1         0.0640     0.2599       0.246     0.8214
ADSIZE       1         0.2049     0.0588       3.484     0.0399
CIRC         1         0.2805     0.0686       4.089     0.0264

                     Variance
Variable    DF      Inflation
INTERCEP     1         0.0000
ADSIZE       1         2.2190
CIRC         1         2.2190
Annotation: $VIF_1$ = 2.2190, below the threshold of 5.
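The correlations and the variance inflation factor for this example can be recomputed from the six observations. This is a minimal standard-library sketch; pearson_r is a hypothetical helper name. It uses the fact that with exactly two X variables, the R^2 from regressing one X on the other equals the square of their correlation, so both variables share one VIF.

```python
import math

# Ad-response example data: responses (hundreds), ad size (sq. in.),
# circulation (thousands).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def pearson_r(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


r_y1 = pearson_r(resp, size)   # response vs. ad size
r_y2 = pearson_r(resp, circ)   # response vs. circulation
r_12 = pearson_r(size, circ)   # ad size vs. circulation

# Two-predictor case only: VIF = 1 / (1 - r_12^2) for both X variables.
vif = 1 / (1 - r_12 ** 2)

print(f"r_Y1={r_y1:.5f}  r_Y2={r_y2:.5f}  r_12={r_12:.5f}  VIF={vif:.4f}")
```

With more than two X variables, each VIF would instead require regressing that variable on all the others.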

47 Regression Cautions

48 Regression Cautions
1. Violated Assumptions
2. Relevancy of Historical Data
3. Level of Significance
4. Extrapolation
5. Cause & Effect

49 Extrapolation
[Figure: Y vs. X, with Interpolation inside the Relevant Range and Extrapolation outside it]

50 Cause & Effect
[Figure: Liquor Consumption and # Teachers]

51 Conclusion
1. Explained the Linear Multiple Regression Model
2. Explained Residual Analysis
3. Tested Overall Significance
4. Explained Multicollinearity
5. Interpreted Linear Multiple Regression Computer Output

52 End of Chapter. Any blank slides that follow are blank intentionally.

