Published by Rudolf Hodge. Modified over 9 years ago.
1
Business Statistics, 4e, by Ken Black. © 2003 John Wiley & Sons.
Chapter 15: Building Multiple Regression Models
2
Learning Objectives
– Analyze and interpret nonlinear variables in multiple regression analysis.
– Understand the role of qualitative variables and how to use them in multiple regression analysis.
– Learn how to build and evaluate multiple regression models.
– Learn how to detect influential observations in regression analysis.
3
General Linear Regression Model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + \epsilon$
where
$Y$ = the value of the dependent (response) variable
$\beta_0$ = the regression constant
$\beta_1$ = the partial regression coefficient of independent variable 1
$\beta_2$ = the partial regression coefficient of independent variable 2
$\beta_k$ = the partial regression coefficient of independent variable k
$k$ = the number of independent variables
$\epsilon$ = the error of prediction
4
Nonlinear Models: Mathematical Transformation
– First-order with Two Independent Variables
– Second-order with One Independent Variable
– Second-order with an Interaction Term
– Second-order with Two Independent Variables
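The equations for these four models did not survive in this transcript. As a sketch only, assuming the usual textbook forms written in the notation of the general linear regression model, they are commonly given as:

```latex
\begin{align*}
\text{First-order, two independent variables:} \quad
  & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon \\
\text{Second-order, one independent variable:} \quad
  & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \epsilon \\
\text{Second-order with an interaction term:} \quad
  & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \epsilon \\
\text{Second-order, two independent variables:} \quad
  & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1^2 + \beta_4 X_2^2 + \beta_5 X_1 X_2 + \epsilon
\end{align*}
```

Each model stays linear in the coefficients, so recoding a squared or product term as a new variable lets ordinary multiple regression fit it.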
5
Sales Data and Scatter Plot for 13 Manufacturing Companies

Manufacturer   Sales ($1,000,000)   Number of Manufacturing Representatives
 1                2.1                 2
 2                3.6                 1
 3                6.2                 2
 4               10.4                 3
 5               22.8                 4
 6               35.6                 4
 7               57.1                 5
 8               83.5                 5
 9              109.4                 6
10              128.6                 7
11              196.8                 8
12              280.0                10
13              462.3                11

[Scatter plot: Sales (vertical axis, 0 to 500) versus Number of Representatives (horizontal axis, 0 to 12)]
6
Excel Simple Linear Regression Output for the Manufacturing Example

Regression Statistics
Multiple R           0.933
R Square             0.870
Adjusted R Square    0.858
Standard Error       51.10
Observations         13

            Coefficients   Standard Error   t Stat   P-value
Intercept     -107.03          28.737        -3.72     0.003
numreps         41.026          4.779         8.58     0.000

ANOVA
             df       SS       MS        F     Significance F
Regression    1   192395   192395    73.69     0.000
Residual     11    28721     2611
Total        12   221117
7
Manufacturing Data with Newly Created Variable

Manufacturer   Sales ($1,000,000)   Number of Mfgr Reps (X1)   (No. Mfgr Reps)^2 (X2 = X1^2)
 1                2.1                 2                          4
 2                3.6                 1                          1
 3                6.2                 2                          4
 4               10.4                 3                          9
 5               22.8                 4                         16
 6               35.6                 4                         16
 7               57.1                 5                         25
 8               83.5                 5                         25
 9              109.4                 6                         36
10              128.6                 7                         49
11              196.8                 8                         64
12              280.0                10                        100
13              462.3                11                        121
8
Scatter Plots Using Original and Transformed Data
[Left plot: Sales (0 to 500) versus Number of Representatives (0 to 12)]
[Right plot: Sales (0 to 500) versus Number of Mfg. Reps. Squared (0 to 150)]
9
Computer Output for Quadratic Model to Predict Sales

Regression Statistics
Multiple R           0.986
R Square             0.973
Adjusted R Square    0.967
Standard Error       24.593
Observations         13

            Coefficients   Standard Error   t Stat   P-value
Intercept      18.067          24.673         0.73     0.481
MfgrRp        -15.723           9.5450       -1.65     0.131
MfgrRpSq        4.750           0.776         6.12     0.000

ANOVA
             df       SS       MS         F     Significance F
Regression    2   215069   107534    177.79     0.000
Residual     10     6048      605
Total        12   221117
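The quadratic fit can be reproduced from the 13 data points by regressing sales on the number of representatives and its square. The following is a minimal pure-Python sketch, not the software used for the slides; the helper names `solve` and `ols` are illustrative, and the fit is done through the normal equations for simplicity.

```python
# Quadratic (second-order) fit for the 13-manufacturer example.
sales = [2.1, 3.6, 6.2, 10.4, 22.8, 35.6, 57.1, 83.5,
         109.4, 128.6, 196.8, 280.0, 462.3]          # $1,000,000
reps = [2, 1, 2, 3, 4, 4, 5, 5, 6, 7, 8, 10, 11]     # manufacturing reps


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


reps_sq = [x * x for x in reps]            # newly created variable X2 = X1^2
b0, b1, b2 = ols([reps, reps_sq], sales)

fitted = [b0 + b1 * x + b2 * x * x for x in reps]
mean_y = sum(sales) / len(sales)
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))
sst = sum((y - mean_y) ** 2 for y in sales)
r_squared = 1 - sse / sst

print(f"Sales = {b0:.3f} + {b1:.3f} reps + {b2:.3f} reps^2, R^2 = {r_squared:.3f}")
```

The coefficients and R Square should agree with the computer output for the quadratic model up to rounding.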
10
Tukey’s Four Quadrant Approach
11
Prices of Three Stocks over a 15-Month Period

Stock 1   Stock 2   Stock 3
  41        36        35
  39        36        35
  38        ...       32
  45        51        41
  ...       52        39
  43        55        ...
  47        57        52
  49        58        54
  41        62        65
  35        70        77
  36        72        75
  39        74        ...
  33        83        81
  28       101        92
  31       107        91
12
Regression Models for the Three Stocks
– First-order with Two Independent Variables
– Second-order with an Interaction Term
13
Regression for Three Stocks: First-order, Two Independent Variables

The regression equation is
Stock 1 = 50.9 - 0.119 Stock 2 - 0.071 Stock 3

Predictor      Coef     StDev       T       P
Constant     50.855     3.791   13.41   0.000
Stock 2     -0.1190    0.1931   -0.62   0.549
Stock 3     -0.0708    0.1990   -0.36   0.728

S = 4.570   R-Sq = 47.2%   R-Sq(adj) = 38.4%

Analysis of Variance
Source        DF       SS       MS      F       P
Regression     2   224.29   112.15   5.37   0.022
Error         12   250.64    20.89
Total         14   474.93
14
Regression for Three Stocks: Second-order With an Interaction Term

The regression equation is
Stock 1 = 12.0 + 0.879 Stock 2 + 0.220 Stock 3 - 0.00998 Inter

Predictor        Coef       StDev       T       P
Constant       12.046       9.312    1.29   0.222
Stock 2        0.8788      0.2619    3.36   0.006
Stock 3        0.2205      0.1435    1.54   0.153
Inter       -0.009985    0.002314   -4.31   0.001

S = 2.909   R-Sq = 80.4%   R-Sq(adj) = 75.1%

Analysis of Variance
Source        DF       SS       MS       F       P
Regression     3   381.85   127.28   15.04   0.000
Error         11    93.09     8.46
Total         14   474.93
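Adjusted R-squared can be recomputed from the ANOVA sums of squares as a consistency check on the two stock models. The sketch below assumes the standard formula, 1 - (1 - R²)(n - 1)/(n - k - 1); the function name is illustrative.

```python
def adjusted_r_squared(ss_regression, ss_total, n, k):
    """Adjusted R^2 from ANOVA sums of squares, n observations, k predictors."""
    r2 = ss_regression / ss_total
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Sums of squares taken from the two Analysis of Variance tables (n = 15 months).
first_order = adjusted_r_squared(224.29, 474.93, n=15, k=2)
interaction = adjusted_r_squared(381.85, 474.93, n=15, k=3)

print(f"First-order model:  R-Sq(adj) = {first_order:.1%}")
print(f"Interaction model:  R-Sq(adj) = {interaction:.1%}")
```

With these inputs the first-order model comes out near 38.4% and the interaction model near 75.1%, so adding the interaction term roughly doubles the adjusted fit.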
15
Nonlinear Regression Models: Model Transformation
16
Data Set for Model Transformation Example
Y = Sales ($ million/year), X = Advertising ($ million/year)

ORIGINAL DATA
Company       Y      X
1          2580    1.2
2         11942    2.6
3          9845    2.2
4         27800    3.2
5         18926    2.9
6          4800    1.5
7         14550    2.7

TRANSFORMED DATA
Company      LOG Y      X
1         3.41162     1.2
2         4.077077    2.6
3         3.993216    2.2
4         4.444045    3.2
5         4.277059    2.9
6         3.681241    1.5
7         4.162863    2.7
17
Regression Output for Model Transformation Example

Regression Statistics
Multiple R           0.990
R Square             0.980
Adjusted R Square    0.977
Standard Error       0.054
Observations         7

            Coefficients   Standard Error   t Stat   P-value
Intercept      2.9003          0.0729        39.80     0.000
X              0.4751          0.0300        15.82     0.000

ANOVA
             df       SS       MS         F     Significance F
Regression    1   0.7392   0.7392    250.36     0.000
Residual      5   0.0148   0.0030
Total         6   0.7540
18
Prediction with the Transformed Model
19
Prediction with the Transformed Model (continued)
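The worked prediction on these slides did not survive extraction. As a hedged sketch, the regression of LOG Y on X can be refit from the seven data points and a prediction converted back to the original sales scale by taking the antilog. The advertising level `x_new = 2.0` is a hypothetical input chosen for illustration, and base-10 logarithms are assumed because they reproduce the LOG Y column.

```python
import math

sales = [2580, 11942, 9845, 27800, 18926, 4800, 14550]   # Y, $ million/year
advertising = [1.2, 2.6, 2.2, 3.2, 2.9, 1.5, 2.7]        # X, $ million/year

# Transform the response: the model is fitted to log10(Y), not Y.
log_sales = [math.log10(y) for y in sales]

# Simple least-squares regression of log10(Y) on X.
n = len(sales)
mean_x = sum(advertising) / n
mean_ly = sum(log_sales) / n
sxx = sum((x - mean_x) ** 2 for x in advertising)
sxy = sum((x - mean_x) * (ly - mean_ly)
          for x, ly in zip(advertising, log_sales))
slope = sxy / sxx
intercept = mean_ly - slope * mean_x


def predict_sales(x):
    """Predict log10(Y) from X, then take the antilog to return to sales units."""
    return 10 ** (intercept + slope * x)


x_new = 2.0  # hypothetical advertising level, not taken from the slides
print(f"log10(Y) = {intercept:.4f} + {slope:.4f} X")
print(f"Predicted sales at X = {x_new}: {predict_sales(x_new):.1f}")
```

The fitted intercept and slope should agree with the regression output for the transformed model up to rounding. The key step is that the prediction is made on the log scale and only then exponentiated.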
20
Indicator (Dummy) Variables
– Indicator variables represent qualitative (categorical) variables.
– The number of dummy variables needed for a qualitative variable is the number of categories less one: c - 1, where c is the number of categories.
– For dichotomous variables, such as gender, only one dummy variable is needed. There are two categories (female and male), so c = 2 and c - 1 = 1.
– Example: "Your office is located in which region of the country? ___Northeast ___Midwest ___South ___West" Number of dummy variables = c - 1 = 4 - 1 = 3.
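The c - 1 rule can be illustrated with the four-region question. This is a minimal sketch; the function name `dummy_code` and the choice of the last category (West) as the baseline are assumptions made for the example, since any one category can serve as the baseline.

```python
def dummy_code(value, categories):
    """Return c - 1 indicator values; the last category is the all-zero baseline."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories[:-1]]


regions = ["Northeast", "Midwest", "South", "West"]   # c = 4 categories

for region in regions:
    print(f"{region:10s} -> {dummy_code(region, regions)}")
```

Each region maps to three indicators, and West is represented by all zeros, so a fourth dummy variable would add no information.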
21
Data for the Monthly Salary Example

Observation   Monthly Salary ($1000)   Age (10 Years)   Gender (1=Male, 0=Female)
 1              1.548                    3.2              1
 2              1.629                    3.8              1
 3              1.011                    2.7              0
 4              1.229                    3.4              0
 5              1.746                    3.6              1
 6              1.528                    4.1              1
 7              1.018                    3.8              0
 8              1.190                    3.4              0
 9              1.551                    3.3              1
10              0.985                    3.2              0
11              1.610                    3.5              1
12              1.432                    2.9              1
13              1.215                    3.3              0
14              0.990                    2.8              0
15              1.585                    3.5              1
22
Regression Output for the Monthly Salary Example

The regression equation is
Salary = 0.732 + 0.111 Age + 0.459 Gender

Predictor      Coef      StDev      T       P
Constant     0.7321     0.2356   3.11   0.009
Age         0.11122    0.07208   1.54   0.149
Gender      0.45868    0.05346   8.58   0.000

S = 0.09679   R-Sq = 89.0%   R-Sq(adj) = 87.2%

Analysis of Variance
Source        DF        SS        MS       F       P
Regression     2   0.90949   0.45474   48.54   0.000
Error         12   0.11242   0.00937
Total         14   1.02191
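Because Gender is a 0/1 indicator, its coefficient shifts the intercept and leaves the Age slope unchanged. A small sketch using the reported coefficients shows this; the function name and the example age of 3.5 decades are illustrative assumptions.

```python
# Coefficients as reported in the regression output.
B0, B_AGE, B_GENDER = 0.7321, 0.11122, 0.45868


def predicted_salary(age_decades, gender):
    """Predicted monthly salary in $1000; gender is 1 for male, 0 for female."""
    return B0 + B_AGE * age_decades + B_GENDER * gender


age = 3.5  # hypothetical age of 35 years, expressed in units of 10 years
male = predicted_salary(age, 1)
female = predicted_salary(age, 0)

print(f"Male:   {male:.3f}")
print(f"Female: {female:.3f}")
print(f"Gap:    {male - female:.3f}")
```

The gap equals the Gender coefficient at every age, which is why the model plots as two parallel lines, one for males and one for females.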
23
Regression Model Depicted with Males and Females Separated
[Figure: fitted regression lines plotted separately for males and females; vertical axis from 0.800 to 1.800]
24
Data for Multiple Regression to Predict Crude Oil Production
Y = World Crude Oil Production
X1 = U.S. Energy Consumption
X2 = U.S. Nuclear Generation
X3 = U.S. Coal Production
X4 = U.S. Dry Gas Production
X5 = U.S. Fuel Rate for Autos
25
Model-Building: Search Procedures
– All Possible Regressions
– Stepwise Regression
– Forward Selection
– Backward Elimination
26
All Possible Regressions with Five Independent Variables

Single Predictor: X1; X2; X3; X4; X5
Two Predictors: X1,X2; X1,X3; X1,X4; X1,X5; X2,X3; X2,X4; X2,X5; X3,X4; X3,X5; X4,X5
Three Predictors: X1,X2,X3; X1,X2,X4; X1,X2,X5; X1,X3,X4; X1,X3,X5; X1,X4,X5; X2,X3,X4; X2,X3,X5; X2,X4,X5; X3,X4,X5
Four Predictors: X1,X2,X3,X4; X1,X2,X3,X5; X1,X2,X4,X5; X1,X3,X4,X5; X2,X3,X4,X5
Five Predictors: X1,X2,X3,X4,X5
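The candidate models for an all-possible-regressions search are the non-empty subsets of the predictors, which gives 2^k - 1 models for k predictors. A short sketch using the standard library enumerates them for k = 5:

```python
from itertools import combinations

predictors = ["X1", "X2", "X3", "X4", "X5"]

# Every non-empty subset of the predictors is one candidate regression model.
models = [subset
          for size in range(1, len(predictors) + 1)
          for subset in combinations(predictors, size)]

counts = {size: sum(1 for m in models if len(m) == size)
          for size in range(1, len(predictors) + 1)}

print(counts)
print(len(models), "models in total")
```

For five predictors this yields 5, 10, 10, 5, and 1 models of sizes one through five, 31 in total. The count doubles with each added predictor, which is why the other search procedures exist.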
27
Stepwise Regression
– Perform k simple regressions and select the best as the initial model.
– Evaluate each variable not in the model.
   – If none meets the entry criterion, stop.
   – Otherwise, add the best variable to the model, reevaluate the variables entered earlier, and drop any that are no longer significant.
– Return to the previous step.
28
Forward Selection
– Like stepwise regression, except that variables are not reevaluated after entering the model.
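Forward selection can be sketched in a few lines. This is a minimal illustration under stated assumptions, not MINITAB's implementation: entry is decided by a partial F statistic compared with a threshold of 4.0, fitting uses the normal equations, and the data (`x1`, `x2`, `x3`, `noise`, `y`) are made up for the example, with `y` built from `x1` and `x2` only.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse(columns, y):
    """Residual sum of squares for y regressed on the columns plus an intercept."""
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def forward_selection(candidates, y, f_to_enter=4.0):
    """Add the best remaining variable while its partial F meets f_to_enter."""
    n = len(y)
    selected = []
    current_sse = sse([], y)
    while len(selected) < len(candidates):
        remaining = [name for name in candidates if name not in selected]
        trials = {name: sse([candidates[s] for s in selected + [name]], y)
                  for name in remaining}
        best = min(trials, key=trials.get)
        df_error = n - (len(selected) + 1) - 1
        if df_error <= 0:
            break
        if trials[best] == 0:
            partial_f = float("inf")
        else:
            partial_f = (current_sse - trials[best]) / (trials[best] / df_error)
        if partial_f < f_to_enter:
            break                     # no remaining variable meets the criterion
        selected.append(best)         # entered variables are never reevaluated
        current_sse = trials[best]
    return selected


# Hypothetical data: y depends on x1 and x2, while x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
x2 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
x3 = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, -0.2, 0.1]
y = [5 + 4 * a + 3 * b + e for a, b, e in zip(x1, x2, noise)]

selected = forward_selection({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)
```

On this toy data `x1` enters first and `x2` second. Turning the sketch into stepwise regression would mean adding a removal check after each entry, testing every variable already in the model against an F-to-remove threshold.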
29
Backward Elimination
– Start with the “full model” (all k predictors).
– If all predictors are significant, stop.
– Otherwise, eliminate the most nonsignificant predictor and return to the previous step.
30
Stepwise: Step 1 - Simple Regression Results for Each Independent Variable

Dependent Variable   Independent Variable   t-Ratio   R^2
Y                    X1                     11.77     85.2%
Y                    X2                      4.43     45.0%
Y                    X3                      3.91     38.9%
Y                    X4                      1.08      4.6%
Y                    X5                      3.54     34.2%
31
MINITAB Stepwise Output

Stepwise Regression
F-to-Enter: 4.00   F-to-Remove: 4.00
Response is CrOilPrd on 5 predictors, with N = 26

Step             1        2
Constant    13.075    7.140
USEnCons     0.580    0.772
T-Value      11.77    11.91
FuelRate              -0.52
T-Value               -3.75
S             1.52     1.22
R-Sq         85.24    90.83
32
Multicollinearity
A condition that occurs when two or more of the independent variables of a multiple regression model are highly correlated.
– It is difficult to interpret the estimates of the regression coefficients.
– The t values for the regression coefficients can be inordinately small.
– The standard deviations of the regression coefficients are overestimated.
– The sign of a predictor variable’s coefficient can be the opposite of what is expected.
33
Correlations among Oil Production Predictor Variables

                     Energy Consumption   Nuclear     Coal   Dry Gas   Fuel Rate
Energy Consumption         1               0.856     0.791    0.057      0.796
Nuclear                    0.856           1         0.952   -0.404      0.972
Coal                       0.791           0.952     1       -0.448      0.968
Dry Gas                    0.057          -0.404    -0.448    1         -0.423
Fuel Rate                  0.796           0.972     0.968   -0.423      1
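A correlation table like this one can be screened for pairs of predictors that are likely to cause multicollinearity. The sketch below uses the values below the diagonal of the table; the cutoff of 0.9 is an assumption chosen for illustration, not a rule from the slides.

```python
# Pairwise correlations read from the rows below the diagonal of the table.
correlations = {
    ("Energy Consumption", "Nuclear"): 0.856,
    ("Energy Consumption", "Coal"): 0.791,
    ("Energy Consumption", "Dry Gas"): 0.057,
    ("Energy Consumption", "Fuel Rate"): 0.796,
    ("Nuclear", "Coal"): 0.952,
    ("Nuclear", "Dry Gas"): -0.404,
    ("Nuclear", "Fuel Rate"): 0.972,
    ("Coal", "Dry Gas"): -0.448,
    ("Coal", "Fuel Rate"): 0.968,
    ("Dry Gas", "Fuel Rate"): -0.423,
}

THRESHOLD = 0.9  # hypothetical cutoff for "highly correlated"

# Keep pairs whose absolute correlation meets the cutoff, strongest first.
flagged = sorted(
    ((a, b, r) for (a, b), r in correlations.items() if abs(r) >= THRESHOLD),
    key=lambda item: -abs(item[2]),
)

for a, b, r in flagged:
    print(f"{a} / {b}: r = {r:+.3f}")
```

With this cutoff the flagged pairs are Nuclear with Fuel Rate, Coal with Fuel Rate, and Nuclear with Coal. Pairwise screening of this kind is only a first check, since multicollinearity can also involve three or more predictors jointly.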