Copyright © 2004 Pearson Education, Inc.
Chapter 11 Analysis of Variance 11-1 Overview 11-2 One-Way ANOVA 11-3 Two-Way ANOVA Copyright © 2004 Pearson Education, Inc.
Overview and One-Way ANOVA Section 11-1 & 11-2 Overview and One-Way ANOVA Created by Erin Hodgess, Houston, Texas Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Overview Analysis of variance(ANOVA) is a method for testing the hypothesis that three or more population means are equal. For example: H0: µ1 = µ2 = µ3 = . . . µk H1: At least one mean is different A method to test more than two population means are equal similar to the way we tested two population means equal in Chapter 8. Discussion of why one cannot just test two samples at a time at bottom of page 615. Copyright © 2004 Pearson Education, Inc.
ANOVA methods require the F-distribution 1. The F-distribution is not symmetric; it is skewed to the right. 2. The values of F can be 0 or positive, they cannot be negative. 3. There is a different F-distribution for each pair of degrees of freedom for the numerator and denominator. Critical values of F are given in Table A-5 page 616 of text Because the procedures used in this chapter require complicated calculations, the use and interpretation of computer software or a TI-83 Plus calculator is strongly encouraged. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. F - distribution Figure 11-1 Copyright © 2004 Pearson Education, Inc.
An Approach to Understanding ANOVA One-Way ANOVA An Approach to Understanding ANOVA 1. Understand that a small P-value (such as 0.05 or less) leads to the rejection of the null hypothesis of equal means. With a large P-value (such as greater than 0.05), fail to reject the null hypothesis of equal means. 2. Develop an understanding of the underlying rationale by studying the example in this section. page 617 of text The requirements of normality and equal variances are somewhat relaxed, because the methods in this section work reasonably well unless a population has a distribution that is very nonnormal or the population variances differ by large amounts. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. 3. Become acquainted with the nature of the SS (sum of squares) and MS (mean square) values and their role in determining the F test statistic, but use statistical software packages or a calculator for finding those values. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Definition Treatment (or factor) A treatment(or factor) is a property or characteristic that allows us to distinguish the different populations from another. Use computer software or TI-83 Plus for ANOVA calculations if possible The method used is called one-way analysis of variance because we use a single property, or characteristic, for categorizing the populations. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. One-Way ANOVA Assumptions 1. The populations have approximately normal distributions. 2. The populations have the same variance 2 (or standard deviation ). 3. The samples are simple random samples. 4. The samples are independent of each other. 5. The different samples are from populations that are categorized in only one way. page 617 of text The requirements of normality and equal variances are somewhat relaxed, because the methods in this section work reasonably well unless a population has a distribution that is very nonnormal or the population variances differ by large amounts. Copyright © 2004 Pearson Education, Inc.
Procedure for testing: H0: µ1 = µ2 = µ3 = . . . 1. Use STATDISK, Minitab, Excel, or a TI- 83 Calulator to obtain results. 2. Identify the P-value from the display. 3. Form a conclusion based on these criteria: If P-value , reject the null hypothesis of equal means. If P-value > , fail to reject the null hypothesis of equal means. Example at bottom of page 618. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Readability Scores Given the readability scores summarized in Table 11-1 and a significance level of = 0.05, use STATDISK, Minitab, Excel, or a TI-83 PLUS calculator to test the claim that the three samples come from populations with means that are not all the same. H0: 1 = 2 = 3 H1: At least one of the means is different from the others. Please refer to the displays on the next slide. The displays all show a P-value of 0.000562 or 0.001. Because the P-value is less than the significance level of = 0.05, we reject the null hypothesis of equal means. This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Readability Scores This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Readability Scores (cont’d) Example: Readability Scores (cont’d) This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Readability Scores Given the readability scores summarized in Table 11-1 and a significance level of = 0.05, use STATDISK, Minitab, Excel, or a TI-83 PLUS calculator to test the claim that the three samples come from populations with means that are not all the same. There is sufficient evidence to support the claim that the three population means are not all the same. We conclude that those books have readability levels that are not all the same. This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
ANOVA Fundamental Concept Estimate the common value of 2 using 1. The variance between samples (also called variation due to treatment) is an estimate of the common population variance 2 that is based on the variability among the sample means. 2. The variance within samples (also called variation due to error) is an estimate of the common population variance 2 based on the sample variances. Even with heavy use of technology with this topic, it is important to have a basic understanding of underlying principles. page 620 of text Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Relationships Among Components of ANOVA Figure 11-2 at top of page 619 Copyright © 2004 Pearson Education, Inc.
ANOVA Fundamental Concept Test Statistic for One-Way ANOVA F = variance between samples variance within samples A excessively large F test statistic is evidence against equal population means. Copyright © 2004 Pearson Education, Inc.
Calculations with Equal Sample Sizes Variance between samples = nsx 2 where sx = variance of samples means 2 Variance within samples = sp 2 where sp = pooled variance (or the mean of the sample variances) 2 Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Sample Calculations Use Table 11-2 to calculate the variance between samples, variance within samples, and the F test statistic. This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Sample Calculations Use Table 11-2 to calculate the variance between samples, variance within samples, and the F test statistic. s2x = (5.5)2 + (6.0)2 + (6.0)2 – [(5.5+6.0+6.0)2/3] 3–1 s2x = 102.25 – 102.08333 = 0.0833 2 This is exercise #7 on page 521. ns2x = 4(0.0833) = 0.3332 Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Sample Calculations Use Table 11-2 to calculate the variance between samples, variance within samples, and the F test statistic. s2p = 3.0 + 2.0 + 2.0 = 2.333 3 F= ns2x = 0.3332 = 0.1428 s2p 2.3333 This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Critical Value of F Right-tailed test Degree of freedom with k samples of the same size n numerator df = k – 1 denominator df = k(n – 1) page 621 of text Copyright © 2004 Pearson Education, Inc.
Calculations with Unequal Sample Sizes F = = variance between samples variance within samples (ni – 1)si (ni – 1) 2 ni(xi – x)2 k –1 where x = mean of all sample scores combined k = number of population means being compared ni = number of values in the ith sample xi = mean values in the ith sample si = variance of values in the ith sample 2 Copyright © 2004 Pearson Education, Inc.
Key Components of ANOVA Method SS(total), or total sum of squares, is a measure of the total variation (around x) in all the sample data combined. Formula 11-1 SS(total) = (x – x)2 Copyright © 2004 Pearson Education, Inc.
Key Components of ANOVA Method SS(treatment) is a measure of the variation between the samples. In one-way ANOVA, SS(treatment) is sometimes referred to as SS(factor). Because it is a measure of variability between the sample means, it is also referred to as SS (between groups) or SS (between samples). SS(treatment) = n1(x1 – x)2 + n2(x2 – x)2 + . . . nk(xk – x)2 = ni(xi - x)2 Formula 11-2 Copyright © 2004 Pearson Education, Inc.
Key Components of ANOVA Method SS(error) is a sum of squares representing the variability that is assumed to be common to all the populations being considered. middle of page 623 of text Copyright © 2004 Pearson Education, Inc.
Key Components of ANOVA Method SS(error) is a sum of squares representing the variability that is assumed to be common to all the populations being considered. SS(error) = (n1 –1)s1 + (n2 –1)s2 + (n3 –1)s3 . . . nk(xk –1)si = (ni – 1)si 2 2 2 2 2 Formula 11-3 Copyright © 2004 Pearson Education, Inc.
Key Components of ANOVA Method SS(total) = SS(treatment) + SS(error) Formula 11-4 Copyright © 2004 Pearson Education, Inc.
MS (treatment) is mean square for treatment, obtained as follows: Mean Squares (MS) Sum of Squares SS(treatment) and SS(error) divided by corresponding number of degrees of freedom. MS (treatment) is mean square for treatment, obtained as follows: Formula 11-5 MS (treatment) = SS (treatment) k – 1 Copyright © 2004 Pearson Education, Inc.
MS (error) is mean square for error, obtained as follows: Mean Squares (MS) MS (error) is mean square for error, obtained as follows: Formula 11-6 MS (error) = SS (error) N – k MS (total) = SS (total) N – 1 Formula 11-7 Copyright © 2004 Pearson Education, Inc.
Test Statistic for ANOVA with Unequal Sample Sizes MS (treatment) MS (error) Formula 11-8 Numerator df = k – 1 Denominator df = N – k Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: Readability Scores This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Section 11-3 Two-Way ANOVA Created by Erin Hodgess, Houston, Texas Copyright © 2004 Pearson Education, Inc.
Two-Way Analysis of Variance Two-Way ANOVA involves two factors. The data are partitioned into subcategories called cells. page 631 of text Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: New York Marathon Runners This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Definition There is an interaction between two factors if the effect of one of the factors changes for different categories of the other factor. page 632 of text It is important to consider the interaction rather than just conducting another one-way ANOVA test on another factor. Example at bottom of page 632. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: New York Marathon Runners This is exercise #7 on page 521. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Assumptions 1. For each cell, the sample values come from a population with a distribution that is approximately normal. 2. The populations have the same variance 2. 3. The samples are simple random samples. 4. The samples are independent of each other. 5. The sample values are categorized two ways. 6. All of the cells have the same number of sample values. page 631 of text Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Two-Way ANOVA calculations are quite involved, so we will assume that a software package is being used. Minitab and Excel can do two-way ANOVA, but STATDISK and the TI-83 Plus calculator cannot. Copyright © 2004 Pearson Education, Inc.
Figure 11-3 Procedure for Two-Way ANOVA page 634 of text Example continued on page 635 Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: New York Marathon Runners Given the Minitab printout from the New York Marathon data, work through the procedure for two-way ANOVA. Check for Interaction Effects: H0: There are no interaction effects. H1: There are interaction effects. F = MS(interaction) = 10,521,304 = 1.17 MS(error) 9,028,477 This is exercise #7 on page 521. The P-value is shown as 0.329, so we fail to reject the null hypothesis of no interaction between the two factors. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: New York Marathon Runners Given the Minitab printout from the New York Marathon data, work through the procedure for two-way ANOVA. Check for Row Effects: H0: There are no effects from the row factors. H1: There are row effects. F = MS(gender) = 15,225,413 = 1.69 MS(error) 9,028,477 This is exercise #7 on page 521. The P-value is shown as 0.206, so we fail to reject the null hypothesis of no effects from gender. Copyright © 2004 Pearson Education, Inc.
Copyright © 2004 Pearson Education, Inc. Example: New York Marathon Runners Given the Minitab printout from the New York Marathon data, work through the procedure for two-way ANOVA. Check for Column Effects: H0: There are no effects from the column factors. H1: There are column effects. F = MS(age) = 46,043,490 = 5.10 MS(error) 9,028,477 This is exercise #7 on page 521. The P-value is shown as 0.014. We have significance at the 0.05 level, but not the 0.01 level. We reject the null hypothesis of no effects from age. Copyright © 2004 Pearson Education, Inc.
Special Case: One Observation Per Cell and No Interaction There is no variation within individual cells and those samples cannot be calculated. If it seems reasonable to assume (based on knowledge about the circumstances) that there is no interaction between the two factors, make the assumption and then proceed as before to test the following two hypotheses separately: H0 : There are no effects from the row factor. H0 : There are no effects from the column factor Example on page 636 of text Copyright © 2004 Pearson Education, Inc.