
McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Business Statistics: Communicating with Numbers By Sanjiv Jaggia.


1 McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

2 15-2 Analyzing the Winning Percentage in Baseball Sports analysts frequently quarrel over what statistics separate winning teams from the losers. Is a high batting average (BA) the best predictor, or is it a low earned run average (ERA)? Or both? We will fit three regression models and use the statistical significance of the predictors to help decide.

3 15-3 15.1 Tests of Significance LO 15.1 Conduct tests of individual significance. We will find the model that is most reliable for prediction.

4 15-4 LO 15.1

5 15-5 Computer-Generated Output Virtually all statistical software will automatically report a test statistic and a p-value with each coefficient estimate. These values can be used to test whether the regression coefficient differs from zero. LO 15.1

6 15-6 Example 15.1 LO 15.1 We test β₂, the coefficient on ERA, against H₀: β₂ = 0 and reject H₀. Because both β₁ and β₂ differ significantly from 0, we keep both variables in the model. This model is acceptable, but we still need to find the best model.

7 15-7 Intervals for the Parameters LO 15.1 Read the 95% confidence interval for each coefficient from the Excel output (the Lower 95% and Upper 95% columns). If the interval does not contain 0, the coefficient is significant at the 5% level.

8 15-8 Excel Output for Model 1 and Model 2

Model 1 (BA)
            Coefficients  Standard Error  t Stat     P-value   Lower 95%  Upper 95%
Intercept   -0.27312      0.282599        -0.96646   0.342089  -0.852     0.305757
BA          3.005389      1.097601        2.738145   0.010619  0.757056   5.253722

Model 2 (ERA)
            Coefficients  Standard Error  t Stat     P-value   Lower 95%  Upper 95%
Intercept   0.950371      0.091644        10.37028   4.29E-11  0.762648   1.138095
ERA         -0.11055      0.022384        -4.93875   3.28E-05  -0.1564    -0.0647
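The t statistics and 95% limits in this output can be reproduced from each coefficient and its standard error. The following is a minimal sketch, assuming n = 30 observations (so df = 28 for the one-predictor models) and the standard table value t(0.025, 28) ≈ 2.0484. The function name is illustrative and not part of the slides.

```python
# Critical value t(0.025, df=28) from a standard t table.
# Assumption: n = 30 teams, one predictor, so df = 30 - 1 - 1 = 28.
T_CRIT_28 = 2.0484


def t_stat_and_interval(coef, std_err, t_crit=T_CRIT_28):
    """Return (t statistic, lower limit, upper limit) for H0: beta = 0."""
    t_stat = coef / std_err
    margin = t_crit * std_err
    return t_stat, coef - margin, coef + margin


# Slope estimates and standard errors copied from the Excel output.
ba = t_stat_and_interval(3.005389, 1.097601)    # Model 1, BA
era = t_stat_and_interval(-0.11055, 0.022384)   # Model 2, ERA

for name, (t, lo, hi) in [("BA", ba), ("ERA", era)]:
    significant = abs(t) > T_CRIT_28  # two-tailed test at alpha = 0.05
    print(f"{name}: t = {t:.3f}, 95% CI = ({lo:.4f}, {hi:.4f}), "
          f"significant = {significant}")
```

Both intervals exclude 0, which agrees with the p-values below 0.05 in the output.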

9 15-9 Synopsis of the Introductory Case We compare three models using the measures below. Model 1 is acceptable because the p-value of BA < 0.05, but its R² is only 0.2112. Model 2 is acceptable because the p-value of ERA < 0.05, but its R² is only 0.4656. Model 3 is best because the p-values of BA and ERA are both < 0.05 and its adjusted R² is 0.6945 (the highest). Prediction with this model is therefore the most reliable.

10 15-10 Selecting Best Model Build alternative models in which every independent variable is significant, judged by comparing its p-value with the level of significance (α), which is usually 0.05. Then select the model with the highest R² or adjusted R² (use adjusted R² when the models have different numbers of independent variables).
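The selection rule can be sketched in a few lines of Python, using the p-values and R² figures reported for the three baseball models. The helper names and the dictionary layout are illustrative assumptions, and n = 30 is taken from the Observations line of the Excel output. R² for Models 1 and 2 is converted to adjusted R² so that all three models are compared on the same measure.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def select_best(models, alpha=0.05):
    """Keep models whose predictors all have p-value < alpha,
    then return the one with the highest adjusted R-squared."""
    eligible = [m for m in models if all(p < alpha for p in m["p_values"])]
    return max(eligible, key=lambda m: m["adj_r2"]) if eligible else None


N = 30  # observations in the baseball data
models = [
    {"name": "Model 1 (BA)", "p_values": [0.010619],
     "adj_r2": adjusted_r2(0.2112, N, 1)},
    {"name": "Model 2 (ERA)", "p_values": [3.28e-05],
     "adj_r2": adjusted_r2(0.4656, N, 1)},
    {"name": "Model 3 (BA, ERA)", "p_values": [4.3e-05, 1.95e-07],
     "adj_r2": 0.6945},
]

best = select_best(models)
print(best["name"])
```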

11 15-11 Test of Joint Significance (Finding if the model is effective as a whole)

12 15-12 Example 15.4

13 15-13 The p-value We use the p-value (Significance F) from the Excel output.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.84592414
R Square            0.71558765
Adjusted R Square   0.69452006
Standard Error      0.03754897
Observations        30

ANOVA
            df   SS        MS       F         Significance F
Regression  2    0.09578   0.04789  33.96629  4.24917E-08
Residual    27   0.038068  0.00141
Total       29   0.133848

            Coefficients  Standard Error  t Stat    P-value   Lower 95%     Upper 95%
Intercept   0.12689704    0.18222         0.696396  0.492133  -0.246986863  0.500781
BA          3.27544574    0.672308        4.871944  4.3E-05   1.895984134   4.654907
ERA         -0.11526024   0.016657        -6.91967  1.95E-07  -0.14943738   -0.08108
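The F statistic, R², and adjusted R² in this output all follow from the sums of squares in the ANOVA table. The sketch below recomputes them in Python. For the p-value it uses a closed form for the F upper tail that holds only when the numerator has exactly 2 degrees of freedom, as it does here. That shortcut is an addition for illustration and is not how the slides obtain the value, which is simply read from Excel.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ssr, df_reg = 0.09578, 2       # regression
sse, df_res = 0.038068, 27     # residual
sst = ssr + sse                # total sum of squares

msr = ssr / df_reg             # mean square regression
mse = sse / df_res             # mean square error
f_stat = msr / mse

r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (df_reg + df_res) / df_res

# Upper-tail probability of F(2, df_res).
# This closed form is valid ONLY for 2 numerator degrees of freedom.
p_value = (1 + df_reg * f_stat / df_res) ** (-df_res / 2)

print(f"F = {f_stat:.3f}, R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
print(f"Significance F = {p_value:.3e}")
```

The results agree with the Excel figures up to rounding of the sums of squares.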

14 15-14 Conclusion The p-value (Significance F = 4.25E-08, essentially 0) < α (0.05), so we reject H₀. This means the model as a whole is effective.

15 15-15 Synopsis of the Introductory Case Goodness-of-fit measures indicated that including both batting average and ERA is most appropriate. In Model 3, explanatory variables are individually significant and the regression is jointly significant. We can conclude that both batting average and earned run average are good predictors of overall winning percentage.

16 15-16 Another Example We want to predict the return of a stock (y) based on its P/E (x₁) and P/S (x₂) ratios. The data file is dow2010 on S:\jslee\qm3620\data. We test 3 models. Model 1: y = β₀ + β₁x₁ + ε Model 2: y = β₀ + β₁x₂ + ε Model 3: y = β₀ + β₁x₁ + β₂x₂ + ε

17 15-17 Another Example In Case 1, Model 1: y = β₀ + β₁x₁ + ε has p-value < 0.05 and R² = 0.39. The model is effective. Model 2: y = β₀ + β₁x₂ + ε has p-value > 0.05 and R² = 0.02. The model is ineffective. Model 3: y = β₀ + β₁x₁ + β₂x₂ + ε has p-value (from the ANOVA table) < 0.05 and adjusted R² = 0.36. The model is effective. The most effective model is Model 1.

18 15-18 Another Example In Case 2, Model 1: y = β₀ + β₁x₁ + ε has p-value < 0.05 and R² = 0.28. The model is effective. Model 2: y = β₀ + β₁x₂ + ε has p-value < 0.05 and R² = 0.21. The model is effective. Model 3: y = β₀ + β₁x₁ + β₂x₂ + ε has p-value (from the ANOVA table) < 0.05 and adjusted R² = 0.3. The model is effective. The most effective model is Model 3. True? It depends.

19 15-19 One Rule Even if a multiple regression model has one or more insignificant independent variables (p-value > 0.05), it may still be the most effective model if the p-value from the ANOVA table is < 0.05 and its adjusted R² is the highest.

20 15-20 Homework We want to predict quarterback salaries based on PC (pass completion rate: x₁), TD (touchdown scores: x₂), and/or age (x₃). The data, quarterback_salaries, is on s:\jslee\qm3620\data. Compare the following models by generating Excel regression outputs and find the best model. Model 1: y = β₀ + β₁x₁ + ε Model 2: y = β₀ + β₁x₂ + ε Model 3: y = β₀ + β₁x₃ + ε Model 4: y = β₀ + β₁x₁ + β₂x₂ + ε Model 5: y = β₀ + β₁x₁ + β₂x₃ + ε Model 6: y = β₀ + β₁x₂ + β₂x₃ + ε Model 7: y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε
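The seven homework models are exactly the non-empty subsets of the three predictors. A short Python sketch that enumerates them in the same order is given here. The labels "PC", "TD", and "Age" stand in for x₁, x₂, and x₃, and the "salary ~ ..." strings are only a printing convention, not a call into any regression library.

```python
from itertools import combinations

predictors = ["PC", "TD", "Age"]  # x1, x2, x3 from the homework

# All non-empty subsets: 3 one-variable, 3 two-variable,
# and 1 three-variable model, for 7 candidates in total.
candidate_models = [
    subset
    for size in range(1, len(predictors) + 1)
    for subset in combinations(predictors, size)
]

for number, subset in enumerate(candidate_models, start=1):
    print(f"Model {number}: salary ~ " + " + ".join(subset))
```

Each candidate would then be fitted (in Excel, as the homework asks) and compared on predictor p-values, Significance F, and adjusted R².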

