MARE 250 Dr. Jason Turner Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA): a method for comparing the means of more than two populations ANOVA
Research Question: Are there differences in the mean number of total urchins across locations at Onekahakaha?
ANOVA Null hypothesis H0: μ(shallow) = μ(middle) = μ(deep). Alternative hypothesis Ha: Not all means are equal.
ANOVA Why not run multiple t-tests? 1. The number of t-tests increases with the number of groups and becomes cognitively difficult to track. 2. ↑ Number of analyses = ↑ probability of committing a Type I error. The probability of committing at least one Type I error is the experiment-wise error rate. μ1, μ2, μ3
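To make both points concrete, here is a minimal sketch of how the number of pairwise t-tests and the experiment-wise error rate grow with the number of groups. It assumes every test is run at α = 0.05 and that the tests are independent, which pairwise t-tests on shared groups are not exactly, so the rate is an approximation. The function names are made up for illustration.

```python
from math import comb


def pairwise_test_count(groups: int) -> int:
    """Number of two-sample t-tests needed to compare every pair of groups."""
    return comb(groups, 2)


def experimentwise_error_rate(tests: int, alpha: float = 0.05) -> float:
    """P(at least one Type I error) over `tests` tests, assuming independence."""
    return 1 - (1 - alpha) ** tests


for g in (3, 5, 10):
    k = pairwise_test_count(g)
    rate = experimentwise_error_rate(k)
    print(f"{g} groups -> {k} t-tests, experiment-wise error rate ~ {rate:.3f}")
```

With three groups (shallow, middle, deep) there are already three t-tests, and under these assumptions the chance of at least one Type I error rises from 0.05 to roughly 0.14.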
Assumptions for One-Way ANOVA The four assumptions carried over from t-test hypothesis testing: 1. Random samples 2. Independent samples 3. Normal populations (or large samples) 4. Equal variances (standard deviations) One-Way ANOVA
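Assumption 4 can be screened quickly by comparing sample standard deviations. The sketch below uses a common rule of thumb that is not stated in these slides: treat the equal-variance assumption as reasonable when the largest sample standard deviation is less than twice the smallest. The urchin counts are made-up illustrative numbers, not the Onekahakaha data.

```python
from statistics import stdev


def sd_ratio(samples: dict[str, list[float]]) -> float:
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in samples.values()]
    return max(sds) / min(sds)


# Hypothetical urchin counts per location (illustration only).
counts = {
    "shallow": [12, 15, 11, 14, 13],
    "middle": [7, 9, 6, 8, 10],
    "deep": [5, 4, 7, 6, 3],
}

ratio = sd_ratio(counts)
# Rule of thumb (an assumption, not from the slides): a ratio below 2 is acceptable.
print(f"largest/smallest SD = {ratio:.2f}; equal variances plausible: {ratio < 2}")
```

Minitab and other packages offer formal tests for equal variances; this ratio is only a first look.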
A one-way analysis of variance (ANOVA) tests the hypothesis that the means of several populations are equal. The null hypothesis for the test is that all population means (level means) are the same – H0: μ1 = μ2 = μ3. The alternative hypothesis is that one or more population means differ from the others – Ha: Not all means are equal. ANOVA
H0: μ1 = μ2 = μ3. Ha: Not all means are equal. One-way ANOVA: Urchins versus Location. Table columns: Source, DF, SS, MS, F, P; rows: Location, Error, Total. We reject the null hypothesis that all means are equal and accept the alternative that not all means are equal. Is that all? ANOVA
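The DF, SS, MS and F columns of a one-way ANOVA table can be computed by hand, which shows where the F statistic comes from. The sketch below uses made-up urchin counts, not the actual Onekahakaha data, and the function name is illustrative. It leaves out the P column, which needs the F distribution with (DF Location, DF Error) degrees of freedom from a statistics package or table.

```python
from statistics import mean


def one_way_anova_table(samples, factor_name="Location"):
    """Return the DF, SS, MS and F entries of a one-way ANOVA table."""
    groups = list(samples.values())
    pooled = [x for g in groups for x in g]
    grand_mean = mean(pooled)

    # Between-group (factor) variation: group means around the grand mean.
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (error) variation: observations around their own group mean.
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_factor = len(groups) - 1
    df_error = len(pooled) - len(groups)
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error

    return {
        factor_name: {"DF": df_factor, "SS": ss_factor, "MS": ms_factor,
                      "F": ms_factor / ms_error},
        "Error": {"DF": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"DF": df_factor + df_error, "SS": ss_factor + ss_error},
    }


# Hypothetical urchin counts per location (illustration only).
counts = {
    "shallow": [12, 15, 11, 14, 13],
    "middle": [7, 9, 6, 8, 10],
    "deep": [5, 4, 7, 6, 3],
}

table = one_way_anova_table(counts)
print(f"{'Source':<10}{'DF':>4}{'SS':>10}{'MS':>10}{'F':>8}")
for source, row in table.items():
    cells = "".join(
        f"{row[key]:>{width}.2f}" if key in row else " " * width
        for key, width in (("SS", 10), ("MS", 10), ("F", 8))
    )
    print(f"{source:<10}{row['DF']:>4}{cells}")
```

A large F means the variation between location means is large relative to the variation within locations, which is evidence against H0.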
Allow you to determine the relations among all the means. Several methods: Tukey, Fisher’s LSD, Dunnett’s, Bonferroni, Scheffé, etc. Most focus on Tukey. Multiple Comparisons
3 ways to test: 1) Confidence Intervals - default on “older” Minitab versions - less intuitive than other methods 2) Grouping Information - Just answers, no details - easy to interpret 3) Simultaneous Tests - t-tests run after ANOVA - provides details; interpret like t-test Multiple Comparisons
Tukey's method Tukey's method compares the means for each pair of factor levels using a family error rate to control the rate of Type I error. Results are presented as a set of confidence intervals for the difference between pairs of means. Use the intervals to determine whether the means are different: if an interval does not contain zero, there is a statistically significant difference between the corresponding means; if the interval does contain zero, the difference between the means is not statistically significant.
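The interval rule above is simple enough to write as a one-line check. In this minimal sketch the intervals are invented for illustration (they are not Minitab output for the urchin data), and `means_differ` is a hypothetical helper name.

```python
def means_differ(interval):
    """True when a confidence interval for a difference in means excludes zero."""
    low, high = interval
    return low > 0 or high < 0


# Hypothetical 95% simultaneous intervals for (first mean - second mean).
intervals = {
    ("deep", "middle"): (-1.2, 4.8),
    ("deep", "shallow"): (2.1, 8.3),
    ("middle", "shallow"): (0.4, 6.9),
}

for (a, b), ci in intervals.items():
    verdict = "significantly different" if means_differ(ci) else "not significantly different"
    print(f"{a} vs. {b}: {ci} -> {verdict}")
```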
Tukey 95% Simultaneous Confidence Intervals
Deep vs. Middle = Not significantly different Deep vs. Shallow = Significantly different Middle vs. Shallow = Significantly different
Tukey Grouping Information Deep vs. Middle = Not significantly different Deep vs. Shallow = Significantly different Middle vs. Shallow = Significantly different
Tukey Test (using GLM) Deep vs. Middle = Not significantly different Deep vs. Shallow = Significantly different Middle vs. Shallow = Significantly different
Non-Parametric Version of ANOVA Requires independent samples with similarly distributed data. Can be used regardless of normality or sample size. Based on the mean of the ranks of the data, not the mean or variance (like Mann-Whitney). If the variation in mean ranks is large, reject the null. Uses a p-value like ANOVA. Last Resort/Not Resort: low sample size, “bad” data. Kruskal-Wallis
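To show what "based on the mean of the ranks" means, here is a minimal sketch that ranks the pooled data, computes each group's mean rank, and combines them into the Kruskal-Wallis H statistic, H = 12 / (N(N+1)) · Σ Rᵢ²/nᵢ − 3(N+1). It omits the tie correction that statistical packages apply, and it does not compute the p-value, which comes from a chi-square approximation with (groups − 1) degrees of freedom. The data are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Return (H without tie correction, mean rank per group)."""
    pooled = [x for values in samples.values() for x in values]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    mean_ranks, weighted = {}, 0.0
    start = 0
    for name, values in samples.items():
        group_ranks = ranks[start:start + len(values)]
        start += len(values)
        mean_ranks[name] = sum(group_ranks) / len(values)
        weighted += sum(group_ranks) ** 2 / len(values)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, mean_ranks


# Hypothetical urchin counts per location (illustration only).
counts = {
    "shallow": [12, 15, 11, 14, 13],
    "middle": [7, 9, 6, 8, 10],
    "deep": [5, 4, 7, 6, 3],
}
h, mean_ranks = kruskal_wallis_h(counts)
print("mean ranks:", mean_ranks)
print(f"H = {h:.2f}")
```

The further apart the mean ranks are, the larger H becomes and the stronger the evidence against the null.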
Non-Parametric Version of ANOVA Does not have a multiple comparisons test (like Tukey’s). Will need to run separate pairwise tests (Mann-Whitney, the rank-based counterpart of the t-test) to test for differences between individual groups. Kruskal-Wallis
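The pairwise follow-up can be sketched as a loop over every pair of locations. The block below computes only the Mann-Whitney U statistic for each pair; the p-values would come from a statistics package or table. Because several tests are run, it also reports a Bonferroni-adjusted per-test α. That adjustment is an assumption borrowed from the multiple comparisons methods, not something the slides prescribe here. Data and function names are illustrative.

```python
from itertools import combinations


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs (x, y) with x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def pairwise_u(samples, family_alpha=0.05):
    """U statistic for every pair of groups, plus a Bonferroni per-test alpha."""
    pairs = list(combinations(samples, 2))
    per_test_alpha = family_alpha / len(pairs)  # assumption: Bonferroni adjustment
    results = {(a, b): mann_whitney_u(samples[a], samples[b]) for a, b in pairs}
    return results, per_test_alpha


# Hypothetical urchin counts per location (illustration only).
counts = {
    "shallow": [12, 15, 11, 14, 13],
    "middle": [7, 9, 6, 8, 10],
    "deep": [5, 4, 7, 6, 3],
}

results, per_test_alpha = pairwise_u(counts)
for (a, b), u in results.items():
    print(f"{a} vs. {b}: U = {u}")
print(f"compare each p-value against alpha = {per_test_alpha:.4f}")
```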
When Do I Do the What Now? If data are normal, use ANOVA; otherwise, use Kruskal-Wallis. “Well, whenever I'm confused, I just check my underwear. It holds the answer to all the important questions.” – Grandpa Simpson