Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Analysis of Variance Statistics for Managers Using Microsoft ® Excel 4 th Edition
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-2 Chapter Goals After completing this chapter, you should be able to: Recognize situations in which to use analysis of variance (ANOVA) Understand different analysis of variance designs Evaluate assumptions of the model Perform a single-factor ANOVA and interpret the results Conduct and interpret a Tukey-Kramer post-analysis to determine which means are different Analyze two-factor analysis of variance tests Conduct and interpret a Tukey-Kramer post-analysis procedure to determine which factors are different
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-3 General ANOVA Analysis Investigator controls one or more independent variables Called factors or treatment variables One factor contains three or more levels or groups or categories/classifications Other factors contains two or more levels or groups or categories/classifications Experimental design: the plan used to test the hypothesis
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-4 One-Factor ANOVA Also known as Completely Randomized Design and One-way ANOVA Experimental units (subjects) are assigned randomly to treatments Subjects are assumed homogeneous Only one factor or independent variable With three or more treatment levels Analyzed by one-factor analysis of variance
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-5 One-Factor Analysis of Variance Evaluates the difference among the means of three or more groups Examples: Accident rates for 1 st, 2 nd, and 3 rd shift Expected mileage for five brands of tires Assumptions Populations are normally distributed (test with Box plot or Normal Probability Plot) Populations have equal variances (use Levene’s Test for Homogeneity of Variance) Samples are randomly and independently drawn
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-6 Why Analysis of Variance? We could compare the means in pairs using a t test for difference of means Each t test contains Type 1 error The total Type 1 error with k pairs of means is 1- (1 - ) k If there are 5 means and you use =.05 Must perform 10 comparisons Type I error is 1 – (.95) 10 =.40 40% of the time you will reject the null hypothesis of equal means in favor of the alternative even when the null is true!
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-7 Hypotheses of One-Factor ANOVA All population means are equal i.e., no treatment effect At least one population mean is different i.e., there is a treatment effect Does not mean that all population means are different (some pairs may be the same)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-8 One-Factor ANOVA All Means are the same: The Null Hypothesis is True (No Treatment Effect)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-9 One-Factor ANOVA At least one mean is different: The Null Hypothesis is NOT true (Treatment Effect is present) or (continued)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap One-Factor ANOVA F Test Example You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the.05 significance level, is there a difference in mean driving distance? Club 1 Club 2 Club
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap One-Factor ANOVA Example: Scatter Diagram Distance Club 1 Club 2 Club Club 1 2 3
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap SUMMARY GroupsCountSumAverageVariance Club Club Club ANOVA Source of Variation SSdfMSFP-valueF crit Between Groups E Within Groups Total ANOVA -- Single Factor: Excel Output EXCEL: Tools | Data Analysis | ANOVA: Single Factor
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap F = One-Factor ANOVA Example Solution H 0 : μ 1 = μ 2 = μ 3 H 1 : μ i not all equal =.05 df 1 = 2 df 2 = 12 Test Statistic: p-value: 4.99E-05 Decision: Conclusion: Reject H 0 at = 0.05 There is evidence that at least one μ i differs from the rest 0 =.05 Reject H 0 Do not reject H 0
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap The Tukey-Kramer Procedure Tells which population means are significantly different e.g.: μ 1 = μ 2 μ 3 Done after rejection of equal means in ANOVA Allows pair-wise comparisons Compare absolute mean differences with critical range x μ 1 = μ 2 μ 3
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap The Tukey-Kramer Procedure: Example 1. Compute absolute mean differences: Club 1 Club 2 Club Find a Q U value from the table in appendix E.9 with c = 3 (across the table) and n – c = 15 – 3 = 12 degrees of freedom (down the table) for the desired level of ( =.05 used here): Get MSW from the ANOVA output.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap The Tukey-Kramer Procedure: Example 5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at 5% level of significance. 3. Compute Critical Range: 4. Compare: (continued) PhStat does all the calculations for you but you must input the Q value
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Tukey-Kramer in PHStat
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Two-Factor ANOVA (With or Without Replication) Examines the effect of Two or more factors of interest on the dependent variable e.g.: Percent carbonation and line speed on soft drink bottling process Interaction between the different levels of these two factors e.g.: Does the effect of one particular percentage of carbonation depend on which level the line speed is set?
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Assumptions Populations are normally distributed Populations have equal variances Independent random samples are drawn (continued) Two-Factor ANOVA (With or Without Replication)
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Two-Factor ANOVA Summary Table Without Replication Source of Variation Degrees of Freedom Sum of Square s Mean Squares F Statistic p-value Rows Factor A r – 1SSAMSA MSA/ MSE f (F A ) Columns Factor B c – 1SSBMSB MSB/ MSE f (F B ) Error (r-1)(c-1)SSEMSE Total r c–1 SST
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Two-Factor ANOVA Summary Table With Replication Source of Variation Degrees of Freedom Sum of Squares Mean Squares F Statistic p-value Sample Factor A (Row) r – 1SSA MSA = SSA/(r – 1) MSA/ MSE f (F A ) Columns Factor B c – 1SSB MSB = SSB/(c – 1) MSB/ MSE f (F B ) Interaction (AB) (r – 1)(c – 1)SSAB MSAB = SSAB/ [(r – 1)(c – 1)] MSAB/ MSE f (F A&B ) Within Error r c n ’ – 1) SSE MSE = SSE/[r c n ’ – 1)] Total r c n ’ – 1 SST
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Two-Factor ANOVA: The F Test Statistic F Test for Factor B Effect F Test for Interaction Effect H 0 : μ 1A = μ 2A = μ 3A = H 1 : Not all means are equal H 0 : Interaction of A and B is equal to zero H 1 : Interaction of A and B is not zero F Test for Factor A Effect H 0 : μ 1B = μ 2B = μ 3B = H 1 : Not all means are equal Only for Two-Factor with replication
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap As production manager, you want to see if 3 filling machines have different mean filling times when used with 5 types of boxes. At the.05 level, is there a difference in machines, in boxes? Box Machine1 Machine2 Machine Two-Factor ANOVA Without Replication
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Summary Table Source of Variation Degrees of Freedom Sum of Squares Mean Square.6309 Rows ( Boxes ) = Columns ( Machines ) = Total(3 · 5)-1 = P-ValueF.6543 Error (5-1)(3-1) = 8
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Two-Factor ANOVA With Replication As production manager, you want to see if 3 filling machines have different mean filling times when used with 5 types of boxes. At the.05 level, is there a difference in machines, in boxes? Is there an interaction? Box Machine1 Machine2 Machine
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Summary Table Source of Variation Degrees of Freedom Sum of Squares Mean Square Sample ( Boxes ) = Columns ( Machines ) = Total3 · 5 ·2 -1 = P-ValueF.0277 Within (Error) E-09 (5-1)(3-1) = 8 Interaction 5·3·(2-1)=
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap The Tukey-Kramer Procedure: With Replication Factor A 1.No Macro - compute by formula only 2.MSW (Within) from ANOVA Printout 3.Q from Table E.9 in the Book page 860 alpha =.05 or.01 r is the number of levels of factor A (across table) rc(n’-1) (down the table) c is the number of levels of factor B n’ is the number of replications
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap The Tukey-Kramer Procedure: With Replication Factor B 1.No Macro compute by formula only 2.MSW (Within) from ANOVA Printout 3.Q from Table E.9 in the Book page 860 alpha =.05 or.01 c is the number of levels of factor B (across table) rc(n’-1) (down the table) r is the number of levels of factor A n’ is the number of replications
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap Chapter Summary Described one-factor analysis of variance ANOVA assumptions ANOVA test for difference in c means The Tukey-Kramer procedure for multiple comparisons Described two-factor analysis of variance Examined effects of multiple factors Examined interaction between factors for the model with and without replicated observations The Tukey-Kramer procedure for multiple comparisons for both factor A and factor B for the model with replication