Comparing Three or More Means: ANOVA (One-Way Analysis of Variance)
Lesson ANOVA - A
Objectives
Verify the requirements to perform a one-way ANOVA
Test a claim regarding three or more means using one-way ANOVA
Vocabulary
ANOVA – Analysis of Variance: an inferential method used to test the equality of three or more population means
Robust – small departures from the requirement of normality will not significantly affect the results
Mean square – an average of squared values (for example, variance is a mean square)
MST – mean square due to treatment
MSE – mean square due to error
F-statistic – ratio of two mean squares
ANOVA
We have a situation where we want to compare three populations
We want to compare the three means (μ1, μ2, and μ3)
The method of Analysis of Variance (ANOVA) is used to test whether the three means are equal
The same method also applies to four groups (with μ1 through μ4), to five groups (with μ1 through μ5), and so forth
Why ANOVA
Why don't we use 3 separate hypothesis tests?
  Test whether μ1 = μ2
  Test whether μ1 = μ3
  Test whether μ2 = μ3
Two reasons
If each test has a Type I error probability of 0.05, then doing three tests in a row accumulates a higher overall Type I error probability (about 0.14, if the tests are treated as independent)
It can be time consuming to compute, particularly if we have more than three populations
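The 0.14 figure can be checked with a short calculation. The sketch below assumes the individual tests are independent, which is a simplification because pairwise tests that share samples are not strictly independent. The function names are illustrative and do not come from the lesson.

```python
from math import comb


def familywise_error(alpha: float, num_tests: int) -> float:
    """Chance of at least one Type I error across num_tests tests.

    Assumes the tests are independent and each is run at level alpha.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def pairwise_tests(num_groups: int) -> int:
    """Number of two-sample comparisons needed to cover num_groups means."""
    return comb(num_groups, 2)


for k in (3, 4, 5):
    m = pairwise_tests(k)
    rate = familywise_error(0.05, m)
    print(f"{k} groups: {m} pairwise tests, overall Type I error ~ {rate:.3f}")
```

For three groups this evaluates 1 − 0.95³, which is about 0.143. The number of pairwise tests also grows quickly as groups are added, which is the second reason given above.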
ANOVA: Null Hypothesis We begin by claiming that all the populations have the same mean The null hypothesis (for three populations) is thus H0: μ1 = μ2 = μ3 Examples are Whether three different drugs have the same mean effects Whether three different groups of students have the same mean test scores
ANOVA: Null Hypothesis
[Diagram: population distributions when the null hypothesis is true (equal means) and when a possible alternative hypothesis is true (different means)]
ANOVA: Alternative Hypothesis The alternative hypothesis would be the opposite of the null hypothesis If it is not true that all three means are the same, then at least one of them is different from the other two The three means do not have to all be different The alternative hypothesis would thus be Ha: At least one of the population means (one of μ1, μ2, or μ3) is different from the others
One-way ANOVA Test Requirements
There are k simple random samples from k populations
The k samples are independent of each other; that is, the subjects in one group cannot be related in any way to subjects in a second group
The populations are normally distributed
The populations have the same variance; that is, each treatment group has population variance σ²
ANOVA Requirements Verification
ANOVA is robust: its accuracy is not seriously affected if the populations are somewhat non-normal or do not quite have the same variances, particularly if the sample sizes are roughly equal
Verifying the normality requirement: use normality plots
Verifying the equal population variances requirement: the largest sample standard deviation is no more than twice the smallest
ANOVA – Analysis of Variance
Computing the F-test Statistic
1. Compute the sample mean of the combined data set, $\bar{x}$
2. Find the sample mean of each treatment (sample), $\bar{x}_i$
3. Find the sample variance of each treatment (sample), $s_i^2$
4. Compute the mean square due to treatment, MST
5. Compute the mean square due to error, MSE
6. Compute the F-test statistic:

$$F = \frac{\text{mean square due to treatment}}{\text{mean square due to error}} = \frac{MST}{MSE}$$

$$MST = \frac{\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2}{k - 1} \qquad MSE = \frac{\sum_{i=1}^{k} (n_i - 1)s_i^2}{n - k}$$

Here k is the number of treatments, $n_i$ is the size of sample i, and n is the total number of observations.
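The steps for the F-test statistic can be sketched in a few lines of Python using only the standard library. The function name and the three small samples are made up for illustration; they are not data from the lesson.

```python
from statistics import mean, variance


def f_statistic(samples):
    """Return (MST, MSE, F) for a list of samples, one list per treatment."""
    k = len(samples)                                   # number of treatments
    sizes = [len(s) for s in samples]                  # n_i
    n = sum(sizes)                                     # total observations
    grand_mean = sum(sum(s) for s in samples) / n      # mean of combined data
    means = [mean(s) for s in samples]                 # sample means
    variances = [variance(s) for s in samples]         # sample variances s_i^2

    # Mean square due to treatment: spread of sample means around the grand mean
    mst = sum(n_i * (m - grand_mean) ** 2 for n_i, m in zip(sizes, means)) / (k - 1)
    # Mean square due to error: pooled spread within the samples
    mse = sum((n_i - 1) * v for n_i, v in zip(sizes, variances)) / (n - k)
    return mst, mse, mst / mse


# Hypothetical toy data: three treatments with three observations each
toy_samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
mst, mse, f = f_statistic(toy_samples)
print(f"MST = {mst:.3f}, MSE = {mse:.3f}, F = {f:.3f}")
```

Each toy sample has variance 1, so MSE is 1 and the F statistic equals MST. The large value reflects the third sample mean sitting far from the other two.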
MSE
MSE, the mean square due to error, measures how different the observations within each sample are from each other
It compares only observations within the same sample
Larger values correspond to more spread within the samples
This mean square is approximately the same as the population variance
MST
MST, the mean square due to treatment, measures how different the samples are from each other
It compares the different sample means
Larger values correspond to more spread-out sample means
Under the null hypothesis, this mean square is approximately the same as the population variance
ANOVA – Analysis of Variance Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F-test Statistic | F Critical Value |
|---|---|---|---|---|---|
| Treatment | SST = Σ ni(x̄i – x̄)² | k – 1 | MST | MST/MSE | F α, k–1, n–k |
| Error | SSE = Σ (ni – 1)si² | n – k | MSE | | |
| Total | SST + SSE | n – 1 | | | |

Just like the table we saw in tests for homogeneity or independence. Each mean square is its sum of squares divided by its degrees of freedom; for example,

$$\frac{\sum (n_i - 1)s_i^2}{n - k} = MSE$$
ANOVA: Calculation & Interpretation Calculation: We use an F-test. Our calculator and most software will calculate a p-value Interpretation: Same as in all inference tests (Remember the three C’s!!)
Example 1 Researcher Jelodar Gholamali wanted to determine the effectiveness of various treatments on the glucose level of diabetic rats. He randomly assigned diabetic albino rats into four treatment groups. Group 1 served as a control group; group 2 was served a dietary supplement, fenugreek; group 3 was served a dietary supplement, garlic; and group 4 was served a dietary supplement, onion. The ideas came from Persian folklore. After 15 days of treatment, blood glucose was measured in milligrams per deciliter (mg/dL). The table summarizes the results.
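Example 1 Data (blood glucose, mg/dL)

| Animal | Control | Fenugreek | Garlic | Onion |
|---|---|---|---|---|
| Rat 1 | 288.1 | 229.1 | 177.4 | 299.7 |
| Rat 2 | 296.8 | 240.7 | 202.2 | 258.3 |
| Rat 3 | 267.8 | 239.4 | 163.4 | 286.8 |
| Rat 4 | 256.7 | 207.7 | 184.7 | 244.0 |
| Rat 5 | 292.1 | 225.7 | 197.9 | 267.1 |
| Rat 6 | 282.9 | 230.8 | 164.6 | 297.1 |
| Rat 7 | 260.3 | 206.6 | 193.9 | 249.9 |
| Rat 8 | 283.8 | 213.3 | 158.1 | 265.1 |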
Verify the Requirements
SRS: Stated in the problem
Independent: Different rats in each group
Normal: Normality plots for each sample are reasonable
Same Variance: 1-Var Stats for each sample shows that the largest stdev = 21.18 is less than twice the smallest stdev = 13.49 (2 × 13.49 = 26.98)
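The equal-variance rule of thumb can be checked directly from the Example 1 data. This is a minimal sketch that stands in for the calculator's 1-Var Stats; the dictionary layout and function name are illustrative choices, not part of the lesson.

```python
from statistics import stdev

# Blood glucose (mg/dL) from the Example 1 data table
glucose = {
    "Control":   [288.1, 296.8, 267.8, 256.7, 292.1, 282.9, 260.3, 283.8],
    "Fenugreek": [229.1, 240.7, 239.4, 207.7, 225.7, 230.8, 206.6, 213.3],
    "Garlic":    [177.4, 202.2, 163.4, 184.7, 197.9, 164.6, 193.9, 158.1],
    "Onion":     [299.7, 258.3, 286.8, 244.0, 267.1, 297.1, 249.9, 265.1],
}


def equal_variance_rule(samples):
    """Rule of thumb: largest sample stdev is no more than twice the smallest."""
    sds = {name: stdev(values) for name, values in samples.items()}
    smallest = min(sds.values())
    largest = max(sds.values())
    return sds, largest <= 2 * smallest


sds, requirement_met = equal_variance_rule(glucose)
for name, sd in sds.items():
    print(f"{name:10s} stdev = {sd:.2f}")
print("Equal-variance rule satisfied:", requirement_met)
```

The smallest and largest values should agree with the 13.49 and 21.18 quoted on the slide, so the rule is satisfied for these four groups.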
Excel ANOVA Output
Classical Approach: Test statistic > Critical value … reject the null hypothesis
P-value Approach: P-value < α (0.05) … reject the null hypothesis
TI Instructions
Enter each population's or treatment's raw data into a list
Press STAT, highlight TESTS, and select F: ANOVA(
Enter the list names for each sample or treatment after "ANOVA(", separated by commas
Close the parenthesis and hit ENTER
Example: ANOVA(L1,L2,L3)
Example 2 An environmentalist wanted to determine if the mean acidity of rain differed among Alaska, Florida, and Texas. He randomly selected six rain dates at each of the three locations and obtained the data in the following table:

| Rain Dates | Alaska | Florida | Texas |
|---|---|---|---|
| Date 1 | 5.41 | 4.87 | 5.46 |
| Date 2 | 5.39 | 5.18 | 6.29 |
| Date 3 | 4.90 | 4.40 | 5.57 |
| Date 4 | 5.14 | 5.12 | 5.15 |
| Date 5 | 4.80 | 4.89 | 5.45 |
| Date 6 | 5.24 | 5.06 | 5.30 |
Example 2 Data
Run an appropriate test
Hypotheses:
H0: μ1 = μ2 = μ3 (mean acidity in rain is the same)
Ha: at least one mean is different (the mean acidity in rain is not the same in all 3 states)
Verify the Requirements
SRS: Random selection stated in the problem
Independent: Cities are independent (??)
Normal: Normality plots for each sample are ok, as stated in the problem
Same Variance: 1-Var Stats for each sample shows that the largest stdev, 0.397, is less than twice the smallest stdev, 0.252 (2 × 0.252 = 0.504)
TI-83 Calculator Output
One-way ANOVA
F=5.81095
p=.013532
Factor df=2 SS=1.1675 MS=0.58375
Error df=15 SS=1.50686 MS=.100457
Sxp=0.31695
Calculations: F = MST / MSE = 5.81, df = 2, 15, p-value = 0.0135, Fcritical = 3.68 (from F-table)
Interpretation: Since the p-value (0.014) < α = 0.05, we reject the null hypothesis and conclude that at least one location's mean rainfall acidity is different from the others.
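The calculator output can be reproduced from the Example 2 data with a short standard-library sketch. Two caveats: the function names are illustrative, and the p-value shortcut used here is the closed form of the F-distribution tail that holds only when the numerator degrees of freedom equal 2 (three groups). For other designs a statistics package or an F-table is needed.

```python
from math import sqrt

# Rain acidity readings from the Example 2 table
rain = {
    "Alaska":  [5.41, 5.39, 4.90, 5.14, 4.80, 5.24],
    "Florida": [4.87, 5.18, 4.40, 5.12, 4.89, 5.06],
    "Texas":   [5.46, 6.29, 5.57, 5.15, 5.45, 5.30],
}


def anova_table(samples):
    """Return the one-way ANOVA quantities for a dict of samples."""
    groups = list(samples.values())
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Factor (treatment) and error sums of squares
    ss_factor = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_factor, df_error = k - 1, n - k
    ms_factor, ms_error = ss_factor / df_factor, ss_error / df_error
    return {
        "df_factor": df_factor, "ss_factor": ss_factor, "ms_factor": ms_factor,
        "df_error": df_error, "ss_error": ss_error, "ms_error": ms_error,
        "F": ms_factor / ms_error,
        "Sxp": sqrt(ms_error),  # pooled standard deviation
    }


def p_value_two_numerator_df(f, df_error):
    """Upper-tail F probability; valid ONLY when numerator df = 2."""
    return (1 + 2 * f / df_error) ** (-df_error / 2)


table = anova_table(rain)
p_value = p_value_two_numerator_df(table["F"], table["df_error"])
print(f"F = {table['F']:.5f}, p = {p_value:.6f}")
print(f"Factor: df={table['df_factor']} SS={table['ss_factor']:.4f} MS={table['ms_factor']:.5f}")
print(f"Error:  df={table['df_error']} SS={table['ss_error']:.5f} MS={table['ms_error']:.6f}")
print(f"Sxp = {table['Sxp']:.5f}")
print("Reject H0 at alpha = 0.05:", p_value < 0.05)
```

The printed values should agree with the TI-83 output above up to rounding, and the decision matches the interpretation: the p-value falls below 0.05, so the null hypothesis is rejected.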
Summary and Homework
Summary
ANOVA is a method that tests whether three or more means are equal
One-way ANOVA is applicable when there is only one factor that differentiates the groups
Not rejecting H0 means that there is not sufficient evidence to say that the group means are unequal
Rejecting H0 means that there is sufficient evidence to say that the group means are unequal
Homework
pg 685-691; 1-4, 6, 7, 11, 13, 14