Comparing Three or More Means


1 Chapter 13 Comparing Three or More Means

2 Section 13.1 Comparing Three or More Means (One-Way Analysis of Variance)

3 Objectives
Verify the requirements to perform a one-way ANOVA
Test a hypothesis regarding three or more means using one-way ANOVA

4 Analysis of Variance (ANOVA) is an inferential method used to test the equality of three or more population means.

5 CAUTION! Do not test $H_0: \mu_1 = \mu_2 = \mu_3$ by conducting three separate hypothesis tests, because the probability of making a Type I error will be much higher than $\alpha$.
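To see how quickly the error rate grows, here is a minimal sketch. It assumes, purely for illustration, that the three pairwise tests are independent (in practice they share data, so the exact rate differs). The function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one Type I error across num_tests tests,
    under the simplifying assumption that the tests are independent."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Three pairwise tests (mu1 vs mu2, mu1 vs mu3, mu2 vs mu3), each at alpha = 0.05.
rate = familywise_error_rate(0.05, 3)
print(f"{rate:.4f}")  # 0.1426
```

Under this assumption the chance of at least one false rejection is roughly 14%, nearly three times the nominal 5%.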

6 Objective 1 Verify the requirements to perform a one-way ANOVA

7 Requirements of a One-Way ANOVA Test
1. There are k simple random samples, one from each of k populations.
2. The k samples are independent of each other; that is, the subjects in one group cannot be related in any way to subjects in a second group.
3. The populations are normally distributed.
4. The populations have the same variance; that is, each treatment group has the population variance $\sigma^2$.

8 Verifying the Requirement of Equal Population Variance
The one-way ANOVA procedures may be used provided that the largest sample standard deviation is no more than twice the smallest sample standard deviation.
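The rule of thumb above can be sketched as a small check. The helper name `equal_variance_rule_ok` and the sample values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from statistics import stdev


def equal_variance_rule_ok(samples):
    """Return (ok, sds): ok is True when the largest sample standard
    deviation is no more than twice the smallest one."""
    sds = [stdev(s) for s in samples]
    return max(sds) <= 2 * min(sds), sds


# Hypothetical samples: standard deviations 2.0 and 3.0, so the rule is met.
ok_pass, sds_pass = equal_variance_rule_ok([[2.0, 4.0, 6.0], [1.0, 4.0, 7.0]])

# Hypothetical samples: standard deviations 2.0, 4.0, 1.0, so 4.0 > 2 * 1.0.
ok_fail, sds_fail = equal_variance_rule_ok(
    [[2.0, 4.0, 6.0], [1.0, 5.0, 9.0], [3.0, 4.0, 5.0]]
)
print(ok_pass, ok_fail)
```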

9 Parallel Example 1: Verifying the Requirements of ANOVA
The following data represent the weight (in grams) of pennies minted at the Denver mint in three years (1990, 1995, and a third year). Verify that the requirements for performing a one-way ANOVA are satisfied.

10

11 Solution
1. The 3 samples are simple random samples.
2. The samples were obtained independently.
3. Normal probability plots for the 3 years follow. All of the plots are roughly linear, so the normality assumption is satisfied.

12-14 [Normal probability plots for the three years]

15 Solution 4. The sample standard deviations are computed for each sample using Minitab and shown on the following slide. The largest standard deviation is not more than twice the smallest standard deviation, so the requirement of equal population variances is considered satisfied.

16 Descriptive Statistics
Variable N Mean Median TrMean StDev SE Mean
Variable Minimum Maximum Q1 Q3

17 Objective 2 Test a Hypothesis Regarding Three or More Means Using One-Way ANOVA

18 The basic idea in one-way ANOVA is to determine if the sample data could come from populations with the same mean, $\mu$, or if the sample evidence suggests that at least one sample comes from a population whose mean is different from the others. To make this decision, we compare the variability among the sample means to the variability within each sample.

19 We call the variability among the sample means the between-sample variability, and the variability of each sample the within-sample variability. If the between-sample variability is large relative to the within-sample variability, we have evidence to suggest that the samples come from populations with different means.

20 ANOVA F-Test Statistic
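The analysis of variance F-test statistic is given by
$$F = \frac{\text{mean square due to treatment}}{\text{mean square due to error}} = \frac{MST}{MSE}$$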

21 Computing the F-Test Statistic
Step 1: Compute the sample mean of the combined data set by adding up all the observations and dividing by the number of observations. Call this value $\bar{x}$.
Step 2: Find the sample mean for each sample (or treatment). Let $\bar{x}_1$ represent the sample mean of sample 1, $\bar{x}_2$ represent the sample mean of sample 2, and so on.
Step 3: Find the sample variance for each sample (or treatment). Let $s_1^2$ represent the sample variance for sample 1, $s_2^2$ represent the sample variance for sample 2, and so on.

22 Computing the F-Test Statistic
Step 4: Compute the sum of squares due to treatments, SST, and the sum of squares due to error, SSE.
Step 5: Divide each sum of squares by its corresponding degrees of freedom ($k-1$ and $n-k$, respectively) to obtain the mean squares MST and MSE.
Step 6: Compute the F-test statistic: $F = \dfrac{MST}{MSE}$
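The six steps can be sketched in a few lines of Python. This is an illustration, not the textbook's software output: the function name `one_way_anova` and the three small samples are hypothetical, and the sums of squares use the forms that appear in the worked penny example, $SST = \sum n_i(\bar{x}_i - \bar{x})^2$ and $SSE = \sum (n_i - 1)s_i^2$.

```python
from statistics import mean, variance


def one_way_anova(samples):
    """Follow Steps 1-6 and return the sums of squares, mean squares, and F."""
    k = len(samples)                                   # number of treatments
    n = sum(len(s) for s in samples)                   # total observations
    grand_mean = sum(sum(s) for s in samples) / n      # Step 1
    means = [mean(s) for s in samples]                 # Step 2
    variances = [variance(s) for s in samples]         # Step 3 (sample variances)
    # Step 4: between-sample and within-sample sums of squares.
    sst = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    sse = sum((len(s) - 1) * v for s, v in zip(samples, variances))
    mst = sst / (k - 1)                                # Step 5
    mse = sse / (n - k)
    return {"SST": sst, "SSE": sse, "MST": mst, "MSE": mse, "F": mst / mse}  # Step 6


# Hypothetical data: three treatments with three observations each.
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(result)
```

For these made-up samples the sample means are 2, 3, and 6 with a common variance of 1, so SST = 26, SSE = 6, and F = 13.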

23 Parallel Example 2: Computing the F-Test Statistic
Compute the F-test statistic for the penny data.

24 Solution Step 1: Compute the overall mean. Step 2: Find the sample mean for each treatment (year).

25 Solution Step 3: Find the sample variance for each treatment (year).

26 Solution Step 4: Compute the sum of squares due to treatment, SST, and the sum of squares due to error, SSE.
$SST = 11(\bar{x}_1 - \bar{x})^2 + 11(\bar{x}_2 - \bar{x})^2 + 11(\bar{x}_3 - \bar{x})^2 = 0.0009$
$SSE = (11-1)(0.0005) + (11-1)(0.0006) + (11-1)(0.0002) = 0.013$

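27 Solution Step 5: Compute the mean square due to treatment, MST, and the mean square due to error, MSE.
$MST = \dfrac{SST}{k-1} = \dfrac{0.0009}{2} \approx 0.0005$ and $MSE = \dfrac{SSE}{n-k} = \dfrac{0.013}{30} \approx 0.0004$

28 Solution Step 6: Compute the F-statistic.
$F = \dfrac{MST}{MSE} = \dfrac{0.0005}{0.0004} = 1.25$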

29 Solution ANOVA Table:
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F-Test Statistic
Treatment             0.0009           2                    0.0005         1.25
Error                 0.013            30                   0.0004
Total                 0.0139           32
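The table's arithmetic can be rechecked from its own entries. The sketch below uses only the printed sums of squares and degrees of freedom. Because those printed values are themselves rounded, neither F value below should be read as exact.

```python
# Sums of squares and sample sizes as printed on the slides.
sst, sse = 0.0009, 0.013
k, n = 3, 33                      # 3 years, 11 pennies per year

mst = sst / (k - 1)               # 0.00045, printed as 0.0005
mse = sse / (n - k)               # about 0.000433, printed as 0.0004

f_unrounded = mst / mse           # ratio of the unrounded mean squares
f_table = 0.0005 / 0.0004         # ratio of the rounded table entries

print(f"total SS = {sst + sse:.4f}")
print(f"F from unrounded mean squares = {f_unrounded:.2f}")
print(f"F from rounded table entries  = {f_table:.2f}")
```

The table's 1.25 comes from dividing the rounded mean squares; carrying more digits gives a value closer to 1.04. Either way the statistic is small.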

30 Decision Rule in the One-Way ANOVA Test
If the P-value is less than the level of significance, $\alpha$, reject the null hypothesis.
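As an illustration of the decision rule, the sketch below applies it to the penny example's F = 1.25 with 2 and 30 degrees of freedom. It relies on a closed form for the right-tail area of the F distribution that holds only when the numerator degrees of freedom equal 2 (that is, three treatments). The function name is hypothetical, and $\alpha = 0.05$ is an assumed level, since the slides do not state one for this example.

```python
def f_pvalue_numerator_df_2(f_stat: float, df_error: int) -> float:
    """Right-tail area P(F > f_stat) for an F distribution with exactly
    2 numerator degrees of freedom and df_error denominator degrees of
    freedom. This closed form is not valid for other numerator df."""
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)


alpha = 0.05                                   # assumed level of significance
p_value = f_pvalue_numerator_df_2(1.25, 30)    # F and error df from the ANOVA table
reject = p_value < alpha

print(f"P-value = {p_value:.3f}, reject H0: {reject}")
```

With a P-value near 0.30, the rule gives no grounds to reject the hypothesis of equal mean weights at the assumed 0.05 level.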

31 Section 13.2 Post Hoc Tests on One-Way Analysis of Variance

32 Objectives Perform the Tukey Test

33 When the results from a one-way ANOVA lead us to conclude that at least one population mean is different from the others, we can make additional comparisons between the means to determine which means differ significantly. The procedures for making these comparisons are called multiple comparison methods.

34 Objective 1 Perform the Tukey Test

35 The computation of the test statistic for Tukey’s test for comparing pairs of means follows the same logic as the test for comparing two means from independent sampling. However, the standard error that is used is
$$SE = \sqrt{\frac{s^2}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where $s^2$ is the mean square error estimate (MSE) of $\sigma^2$ from the one-way ANOVA, $n_1$ is the sample size from population 1, and $n_2$ is the sample size from population 2.

36 Test Statistic for Tukey’s Test
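The test statistic for Tukey’s test when testing $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$ is given by
$$q_0 = \frac{\bar{x}_2 - \bar{x}_1}{\sqrt{\frac{s^2}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$\bar{x}_2 > \bar{x}_1$ (the smaller sample mean is subtracted from the larger)
$s^2$ is the mean square error estimate of $\sigma^2$ (MSE) from ANOVA
$n_1$ is the sample size from population 1
$n_2$ is the sample size from population 2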

37 Critical Value for Tukey’s Test
The critical value for Tukey’s test using a familywise error rate $\alpha$ is given by $q_{\alpha,\nu,k}$, where
$\nu$ is the degrees of freedom due to error ($n-k$)
$k$ is the total number of means being compared

38 Parallel Example 1: Finding the Critical Value from the Studentized Range Distribution
Find the critical value from the Studentized range distribution with $\nu = 13$ degrees of freedom and $k = 4$ means being compared, using a familywise error rate $\alpha = 0.01$.
Find the critical value from the Studentized range distribution with $\nu = 64$ degrees of freedom and $k = 6$ means being compared, using a familywise error rate $\alpha = 0.05$.

39 Solution

40 Tukey’s Test
After rejecting the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, the following steps can be used to compare pairs of means for significant differences, provided that
1. There are k simple random samples from k populations.
2. The k samples are independent of each other.
3. The populations are normally distributed.
4. The populations have the same variance.
Step 1: Arrange the sample means in ascending order.

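41 Tukey’s Test
Step 2: Compute the pairwise differences $\bar{x}_i - \bar{x}_j$, where $\bar{x}_i > \bar{x}_j$.
Step 3: Compute the test statistic, $q_0 = \dfrac{\bar{x}_i - \bar{x}_j}{\sqrt{\frac{s^2}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$, for each pairwise difference.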

42 Tukey’s Test
Step 4: Determine the critical value, $q_{\alpha,\nu,k}$, where $\alpha$ is the level of significance (the familywise error rate).
Step 5: If $q_0 > q_{\alpha,\nu,k}$, reject the null hypothesis $H_0: \mu_i = \mu_j$ and conclude that the means are significantly different.
Step 6: Compare all pairwise differences to identify which means differ.
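The steps of Tukey's test can be sketched as follows. The function name `tukey_pairwise`, the group labels, the sample means, and the MSE of 9.0 are all hypothetical. The critical value 3.532 is the slides' $q_{0.05,24,3}$, which fits three groups of 9 observations; it is passed in as a number because looking it up requires a Studentized range table or statistical software.

```python
from itertools import combinations
from math import sqrt


def tukey_pairwise(means, sizes, mse, q_crit):
    """Return (smaller_label, larger_label, q0, reject) for every pair.

    means and sizes are dicts keyed by group label; mse is the mean square
    error from the ANOVA; q_crit is the Studentized range critical value.
    """
    ordered = sorted(means, key=means.get)           # Step 1: ascending order
    results = []
    for low, high in combinations(ordered, 2):
        diff = means[high] - means[low]              # Step 2: larger minus smaller
        se = sqrt(mse / 2 * (1 / sizes[low] + 1 / sizes[high]))
        q0 = diff / se                               # Step 3: test statistic
        results.append((low, high, q0, q0 > q_crit)) # Step 5: compare with q_crit
    return results


# Hypothetical summary statistics: three groups of 9, MSE = 9.0 (so SE = 1).
means = {"A": 10.0, "B": 12.0, "C": 15.0}
sizes = {"A": 9, "B": 9, "C": 9}
results = tukey_pairwise(means, sizes, mse=9.0, q_crit=3.532)

for low, high, q0, reject in results:                # Step 6: summarize all pairs
    print(f"{high} - {low}: q0 = {q0:.2f}, reject H0: {reject}")
```

With these made-up numbers the statistics are about 2, 5, and 3, so only the C versus A comparison exceeds 3.532.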

43 Parallel Example 2: Performing Tukey’s Test
Suppose that there is sufficient evidence to reject $H_0: \mu_1 = \mu_2 = \mu_3$ using a one-way ANOVA. The ANOVA provides the mean square error and the three sample means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$, with $n_1 = n_2 = n_3 = 9$. Use Tukey’s test to determine which pairwise means are significantly different using a familywise error rate of $\alpha = 0.05$.

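44 Solution Step 1: The means, in ascending order, are $\bar{x}_1$, $\bar{x}_3$, $\bar{x}_2$.
Step 2: We next compute the pairwise differences for each pair, subtracting the smaller sample mean from the larger sample mean: $\bar{x}_2 - \bar{x}_1$, $\bar{x}_2 - \bar{x}_3$, and $\bar{x}_3 - \bar{x}_1$.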

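45 Solution Step 3: Compute the test statistic $q_0$ for each pairwise difference.
$\bar{x}_2 - \bar{x}_1$: $q_0 = 6.22$
$\bar{x}_2 - \bar{x}_3$: $q_0 = 4.42$

46 Solution $\bar{x}_3 - \bar{x}_1$: $q_0 = 1.79$
Step 4: Find the critical value using an $\alpha = 0.05$ familywise error rate with $\nu = n - k = 27 - 3 = 24$ and $k = 3$. Then $q_{0.05,24,3} = 3.532$.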

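47 Solution Step 5: Since 6.22 and 4.42 are greater than 3.532, but 1.79 is less than 3.532, we reject $H_0: \mu_1 = \mu_2$ and $H_0: \mu_2 = \mu_3$ but not $H_0: \mu_1 = \mu_3$. Step 6: The conclusions of Tukey’s test are that $\mu_2$ differs significantly from both $\mu_1$ and $\mu_3$, while $\mu_1$ and $\mu_3$ do not differ significantly.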

48 Section 13.3 The Randomized Complete Block Design

49 Objectives
Conduct analysis of variance on the randomized complete block design
Perform the Tukey test

50 In the completely randomized design, the researcher manipulates a single factor and fixes it at two or more levels and then randomly assigns experimental units to a treatment. This design is not always sufficient because the researcher may be aware of additional factors that cannot be fixed at a single level throughout the experiment.

51 The randomized block design is an experimental design that captures more information and therefore reduces experimental error.

52 “In Other Words” A block is a method for controlling experimental error. Blocks should form a homogeneous group. For example, if age is thought to explain some of the variability in the response variable, we can remove age from the experimental error by forming blocks of experimental units with the same age.

53 CAUTION! When we block, we are not interested in determining whether the block is significant. We only want to remove experimental error to reduce the mean square error.

54 Objective 1 Conduct Analysis of Variance on the Randomized Complete Block Design

55 Requirements for Analyzing Data from a Randomized Complete Block Design
1. The response variable for each of the k populations is normally distributed.
2. The response variable for each of the k populations has the same variance; that is, each treatment group has population variance $\sigma^2$.

56 Parallel Example 2: Analyzing the Randomized Complete Block Design
A rice farmer is interested in the effect of four fertilizers on fruiting period. He randomly selects four rows from his field that have been planted with the same seed and divides each row into four segments. The fertilizers are then randomly assigned to the four segments. Assume that the environmental conditions are the same for each of the four rows. The data given in the next slide represent the fruiting period, in days, for each row/fertilizer combination. Is there sufficient evidence to conclude that the fruiting period for the four fertilizers differs at the $\alpha = 0.05$ level of significance?

57
        Fertilizer 1   Fertilizer 2   Fertilizer 3   Fertilizer 4
Row 1   13.7           14.0           16.2           17.1
Row 2   13.6           14.4           15.3           16.9
Row 3   12.2           11.7           13.0           14.1
Row 4   15.0           16.0           15.9           17.3

58 Solution We wish to test $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ versus $H_1$: at least one of the means is different. We first verify the requirements for the test: Normal probability plots for the data from each of the four fertilizers indicate that the normality requirement is satisfied.

59 Solution $s_1 = 1.144$, $s_2 = 1.775$, $s_3 = 1.449$, $s_4 = 1.509$. Since the largest standard deviation is not more than twice the smallest standard deviation, the assumption of equal variances is not violated.

60 Solution ANOVA output from Minitab:
Two-way ANOVA: Fruiting period versus Fertilizer, Row
Source DF SS MS F P
Fertilizer
Row
Error
Total

61 Solution Since the P-value is < 0.001, we reject the null hypothesis and conclude that there is a difference in fruiting period for the four fertilizers. Note, we are not interested in testing whether the fruiting periods among the blocks are equal.
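The block-design sums of squares can be recomputed directly from the fruiting-period data. The sketch below is not MINITAB's routine: the function name `rcbd_anova` is hypothetical, and the code uses the standard partition of total variation into treatment, block, and error pieces for one observation per block/treatment combination.

```python
def rcbd_anova(table):
    """table[i][j] is the response for block i (row) and treatment j (column)."""
    r, c = len(table), len(table[0])                 # blocks, treatments
    grand = sum(map(sum, table)) / (r * c)
    block_means = [sum(row) / c for row in table]
    treat_means = [sum(row[j] for row in table) / r for j in range(c)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_treat = r * sum((m - grand) ** 2 for m in treat_means)
    ss_block = c * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_error = c - 1, (r - 1) * (c - 1)
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {"SS_treat": ss_treat, "SS_block": ss_block, "SS_error": ss_error,
            "df_treat": df_treat, "df_error": df_error, "F_treat": f_treat}


# Fruiting period (days): rows are field rows (blocks), columns are fertilizers 1-4.
fruiting = [
    [13.7, 14.0, 16.2, 17.1],
    [13.6, 14.4, 15.3, 16.9],
    [12.2, 11.7, 13.0, 14.1],
    [15.0, 16.0, 15.9, 17.3],
]
anova = rcbd_anova(fruiting)
print({name: round(value, 3) for name, value in anova.items()})
```

The fertilizer F statistic comes out near 22.4 on 3 and 9 degrees of freedom, which is in line with the very small P-value reported in the solution.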

62 Objective 2 Perform the Tukey Test

63 Once the null hypothesis of equal population means is rejected, we can proceed to determine which means differ significantly using Tukey’s test. The steps are identical to those for comparing means in the one-way ANOVA. The critical value is $q_{\alpha,\nu,k}$ using a familywise error rate of $\alpha$ with $\nu = (r-1)(c-1)$ = the error degrees of freedom (r is the number of blocks and c is the number of treatments) and k is the number of means being tested.

64 Parallel Example 3: Multiple Comparisons Using Tukey’s Test
Use Tukey’s test to determine which pairwise means differ for the data presented in Example 2 with a familywise error rate of $\alpha = 0.05$, using MINITAB.

65 Tukey Simultaneous Tests
Response Variable Fruiting period
All Pairwise Comparisons among Levels of Fertilizer
[MINITAB interval plot: Lower, Center, and Upper limits for each pairwise difference, with Fertilizer = 1, 2, and 3 subtracted in turn from each higher-numbered fertilizer]

66 Tukey Simultaneous Tests
Response Variable Fruiting period
All Pairwise Comparisons among Levels of Fertilizer
[MINITAB table: Difference of Means, SE of Difference, T-Value, and Adjusted P-Value for each pairwise comparison, with Fertilizer = 1, 2, and 3 subtracted in turn from each higher-numbered fertilizer]

67 Section 13.4 Two-Way Analysis of Variance

68 Objectives
Analyze a two-way ANOVA design
Draw interaction plots
Perform the Tukey test

69 Objective 1 Analyze a Two-Way Analysis of Variance Design

70 Recall that there are two ways to deal with factors: (1) control the factors by fixing them at a single level or by fixing them at different levels, and (2) randomize so that their effect on the response variable is minimized. In both the completely randomized design and the randomized complete block design, we manipulated one factor to see how varying it affected the response variable.

71 In a Two-Way Analysis of Variance design, two factors are used to explain the variability in the response variable. We deal with the two factors by fixing them at different levels. We refer to the two factors as factor A and factor B. If factor A has n levels and factor B has m levels, we refer to the design as an $n \times m$ factorial design.

72 Parallel Example 1: A 2 x 4 Factorial Design
Suppose the rice farmer is interested in comparing the fruiting period for not only the four fertilizer types, but for two different seed types as well. The farmer divides his plot into 16 identical subplots. He randomly assigns each seed/fertilizer combination to two of the subplots and obtains the fruiting periods shown on the following slide. Identify the main effects. What does it mean to say there is an interaction effect between the two factors?

73
             Fertilizer 1   Fertilizer 2   Fertilizer 3   Fertilizer 4
Seed Type A  13.5, 13.9     14.1, 13.5     15.2, 14.7     17.1, 16.4
Seed Type B  14.4, 15.0     15.4, 14.7     15.3, 15.9     16.9, 17.3

74 Solution The two factors are A: fertilizer type and B: seed type. Since all levels of factor A are combined with all levels of factor B, we say that the factors are crossed. The main effect of factor A is the change in fruiting period that results from changing the fertilizer type. The main effect of factor B is the change in fruiting period that results from changing the seed type. We say that there is an interaction effect if the effect of fertilizer on fruiting period varies with seed type.

75 Requirements for the Two-Way Analysis of Variance
1. The populations from which the samples are drawn must be normal.
2. The samples are independent.
3. The populations all have the same variance.

76 In a two-way ANOVA, we test three separate hypotheses.
Hypotheses Regarding Interaction Effect
H0: there is no interaction effect between the factors
H1: there is interaction between the factors
Hypotheses Regarding Main Effects
H0: there is no effect of factor A on the response variable
H1: there is an effect of factor A on the response variable
H0: there is no effect of factor B on the response variable
H1: there is an effect of factor B on the response variable

77 Whenever conducting a two-way ANOVA,
we always first test the hypothesis regarding interaction effect. If the null hypothesis of no interaction is rejected, we do not interpret the result of the hypotheses involving the main effects. This is because the interaction clouds the interpretation of the main effects.

78 Parallel Example 3: Examining a Two-Way ANOVA
Recall the rice farmer who is interested in determining the effect of fertilizer and seed type on the fruiting period of rice. Assume that probability plots indicate that it is reasonable to assume that the data come from populations that are normally distributed.
a) Verify the requirement of equal population variances.
b) Use MINITAB to test whether there is an interaction effect between fertilizer type and seed type.
c) If there is no significant interaction, determine if there is a significant difference in the means for
i) the 4 fertilizers
ii) the 2 seed types

79 Solution a) The standard deviations for each treatment combination are given in the table below:
             Fertilizer 1   Fertilizer 2   Fertilizer 3   Fertilizer 4
Seed Type A  0.283          0.424          0.354          0.495
Seed Type B  0.424          0.495          0.424          0.283
Since the largest standard deviation, 0.495, is not more than twice the smallest standard deviation, 0.283, the assumption of equal variances is met.

80 Solution MINITAB output:
Analysis of Variance for Fruiting period
Source DF SS MS F P
Fertilizer
Seed
Fert*Seed
Error
Total
b) The P-value for the interaction term is > 0.05, so we fail to reject the null hypothesis and conclude that there is not sufficient evidence of an interaction effect.

81 Solution c) i) Since the P-value for fertilizer is given as 0.000, we reject the null hypothesis and conclude that the mean fruiting period is different for at least one of the 4 types of fertilizer. ii) Since the P-value for seed type is found to be 0.004, we reject the null hypothesis and conclude that the mean fruiting period is different for the two seed types.
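The three F statistics behind these conclusions can be recomputed from the 2 x 4 data. The sketch below is not MINITAB's routine, and the function name `two_way_anova` is hypothetical. The observations are the slide values, with one value in each Fertilizer 2 cell (13.5 for seed type A, 14.7 for seed type B) filled in as an assumption so that the cells match the reported cell means of 13.8 and 15.05.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with replication.

    cells[i][j] is the list of observations for row-factor level i and
    column-factor level j; every cell must hold the same number of values.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(x for row in cells for cell in row for x in cell)
    cell_means = [[mean(cell) for cell in row] for row in cells]
    row_means = [mean(row) for row in cell_means]
    col_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

    ss_row = b * n * sum((m - grand) ** 2 for m in row_means)
    ss_col = a * n * sum((m - grand) ** 2 for m in col_means)
    ss_cells = n * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_row - ss_col
    ss_error = sum(
        (x - cm) ** 2
        for row, mrow in zip(cells, cell_means)
        for cell, cm in zip(row, mrow)
        for x in cell
    )
    mse = ss_error / (a * b * (n - 1))               # error df = N - ab
    return {
        "F_row": ss_row / (a - 1) / mse,
        "F_col": ss_col / (b - 1) / mse,
        "F_interaction": ss_inter / ((a - 1) * (b - 1)) / mse,
        "MSE": mse,
    }


# Rows: seed types A and B. Columns: fertilizers 1-4. Two subplots per cell.
# The second value in each Fertilizer 2 cell is an assumption (see lead-in).
rice = [
    [[13.5, 13.9], [14.1, 13.5], [15.2, 14.7], [17.1, 16.4]],
    [[14.4, 15.0], [15.4, 14.7], [15.3, 15.9], [16.9, 17.3]],
]
fit = two_way_anova(rice)
print({name: round(value, 3) for name, value in fit.items()})
```

Under that assumption the interaction F is below 1, while the seed and fertilizer F statistics are about 16.1 and 37.2, consistent with the P-values quoted in the solution.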

82 Objective 2 Draw Interaction Plots

83 Constructing Interaction Plots
Step 1: Compute the mean value of the response variable within each cell. In addition, compute the row mean value of the response variable and the column mean value of the response variable for each level of each factor.
Step 2: In a Cartesian plane, label the horizontal axis for each level of factor A. The vertical axis will represent the mean value of the response variable. For each level of factor A, plot the mean value of the response variable for each level of factor B. Draw straight lines connecting the points for the common level of factor B. You should have as many lines as there are levels of factor B. The more difference there is in the slopes of the two lines, the stronger the evidence of interaction.

84 Parallel Example 4: Drawing an Interaction Plot
Draw an interaction plot for the data from the rice farmer example.

85 Solution The cell means are given in the table below.
             Fertilizer 1   Fertilizer 2   Fertilizer 3   Fertilizer 4
Seed Type A  13.7           13.8           14.95          16.75
Seed Type B  14.7           15.05          15.6           17.1

86 Note that the lines have fairly similar slopes between points which indicates no interaction between fertilizer and seed type.
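The points of the interaction plot, and the segment-by-segment changes that the eye compares, can be tabulated without a plotting library. The sketch below assumes the same completed data as the cell-means table (13.5 and 14.7 as the second Fertilizer 2 observations for seed types A and B). The variable names are hypothetical.

```python
from statistics import mean

# Two observations per seed/fertilizer cell; fertilizers 1-4 in order.
# The second value in each Fertilizer 2 cell is an assumption chosen to
# reproduce the reported cell means.
cells = {
    "Seed Type A": [[13.5, 13.9], [14.1, 13.5], [15.2, 14.7], [17.1, 16.4]],
    "Seed Type B": [[14.4, 15.0], [15.4, 14.7], [15.3, 15.9], [16.9, 17.3]],
}

# Step 1: cell means, one line of the plot per seed type.
cell_means = {seed: [mean(obs) for obs in row] for seed, row in cells.items()}

# Change in mean fruiting period between consecutive fertilizers; on evenly
# spaced axis labels these are the slopes of the line segments.
slopes = {
    seed: [later - earlier for earlier, later in zip(m, m[1:])]
    for seed, m in cell_means.items()
}

for seed in cells:
    print(seed, [round(v, 2) for v in cell_means[seed]],
          [round(v, 2) for v in slopes[seed]])
```

Both lines rise across every segment and never cross, which is the pattern the slide describes as indicating no interaction.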

87 Objective 3 Perform the Tukey Test

88 Once we reject the hypothesis of equal population means for either factor, we proceed to determine which means differ significantly using Tukey’s test. The steps are identical to those for one-way ANOVA. However, the critical value is $q_{\alpha,\nu,k}$, where
$\alpha$ is the familywise error rate
$\nu = N - ab$, where N is the total sample size, a is the number of levels for factor A, and b is the number of levels for factor B
$k$ is the number of means being tested for the factor

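89 The standard error is $SE = \sqrt{\dfrac{MSE}{m}}$, where m is the number of observations averaged in each mean being compared: the product of the number of levels of the other factor and the number of observations within each cell.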

90 Parallel Example 7: Multiple Comparisons Using Tukey’s Test
For the rice farming example, use MINITAB to perform Tukey’s test to determine which means differ for the four types of fertilizer using a familywise error rate of $\alpha = 0.05$.

91 Solution Tukey 95.0% Simultaneous Confidence Intervals
Response Variable Fruiting period
All Pairwise Comparisons among Levels of Fertilizer
[MINITAB interval plot: Lower, Center, and Upper limits for each pairwise difference, with Fertilizer = 1, 2, and 3 subtracted in turn from each higher-numbered fertilizer]

92 Solution Tukey Simultaneous Tests
Response Variable Fruiting period
All Pairwise Comparisons among Levels of Fertilizer
[MINITAB table: Difference of Means, SE of Difference, T-Value, and Adjusted P-Value for each pairwise comparison, with Fertilizer = 1, 2, and 3 subtracted in turn from each higher-numbered fertilizer]

