1
ANOVA test
2
ANOVA A specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. ANOVA is a hypothesis test appropriate for comparing the means of a continuous variable across two or more independent comparison groups. The test statistic must take into account the sample sizes, sample means and sample standard deviations in each of the comparison groups.
3
ANOVA The ANOVA approach:
Consider an example with 4 independent groups and a continuous outcome measure. The sample data are organized as follows:

              Group 1   Group 2   Group 3   Group 4
Sample size   n1        n2        n3        n4
Sample mean   x̄1        x̄2        x̄3        x̄4
Sample SD     s1        s2        s3        s4
4
ANOVA 1. Hypothesis building: Ho: μ1 = μ2 = μ3 = μ4
Ha: The means are not all equal. For ANOVA, the alternative hypothesis captures any difference in means and includes, for example, the situations where all the means are unique, where one differs from the other three, where two differ, and so on.
5
ANOVA 2. The test statistic for testing Ho: (μ1 = μ2 = μ3 = μ4 )
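For reference, the usual one-way ANOVA test statistic can be written as below. This is the textbook form and is assumed to match the slide; the symbols are k groups, N total observations, group sizes n_j, group means x̄_j and the overall mean x̄.

```latex
F = \frac{\mathrm{MSB}}{\mathrm{MSE}}
  = \frac{\displaystyle \sum_{j=1}^{k} n_j \,(\bar{x}_j - \bar{x})^2 \,/\, (k-1)}
         {\displaystyle \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \,/\, (N-k)}
```

The numerator measures variation between the group means and the denominator measures variation within the groups.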
6
ANOVA 3. The critical value is found in a table of probability values for the F distribution with degrees of freedom df1 = k-1 and df2 = N-k, where k is the number of groups and N is the total number of observations.
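As a sketch of step 3, the critical value can also be computed in R with qf() instead of being read from a table. The values k = 3, N = 18 and alpha = 0.05 are assumptions chosen for illustration.

```r
# Illustrative values (assumptions): 3 groups, 18 observations, 5% significance level
k <- 3
N <- 18
alpha <- 0.05

df1 <- k - 1   # numerator degrees of freedom
df2 <- N - k   # denominator degrees of freedom

# Upper-tail critical value of the F distribution
f_crit <- qf(1 - alpha, df1, df2)
print(f_crit)
```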
7
ANOVA 4. Decision: We reject H0 if Ft < Fc, that is, when the calculated statistic Fc exceeds the tabulated critical value Ft. We do not reject H0 if Ft ≥ Fc.
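The decision step could be sketched in R as follows. The helper anova_decision() and the numbers passed to it are hypothetical and only illustrate the rule: reject H0 when the calculated F statistic exceeds the tabulated critical value.

```r
# Hypothetical helper: compares a calculated F statistic with the critical value
anova_decision <- function(f_calc, df1, df2, alpha = 0.05) {
  f_crit  <- qf(1 - alpha, df1, df2)                   # tabulated critical value
  p_value <- pf(f_calc, df1, df2, lower.tail = FALSE)  # equivalent p-value
  decision <- if (f_calc > f_crit) "reject H0" else "do not reject H0"
  list(f_crit = f_crit, p_value = p_value, decision = decision)
}

# Illustrative call with made-up values: 4 groups, 24 observations
res <- anova_decision(f_calc = 4.2, df1 = 3, df2 = 20)
print(res)
```

Comparing the p-value with alpha leads to the same decision as comparing the two F values.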
8
Example A study is designed to test whether the mean N content of the soil differs among three samples (of 6 observations each) taken at three different positions. The data are shown below:

Point 1   Point 2   Point 3
1200      1000      890
1000      1100      650
980       700       1100
900       800       900
750       500       400
800       700       350
9
ANOVA Is there a statistically significant difference in the mean N content when comparing the three samples from the three different positions? We will run the ANOVA using the 4-step approach.
10
library(MASS)  # load the MASS package
# Observations listed point by point: Point 1, then Point 2, then Point 3
data <- c(1200, 1000, 980, 900, 750, 800,
          1000, 1100, 700, 800, 500, 700,
          890, 650, 1100, 900, 400, 350)
f <- c("Point1", "Point2", "Point3")
k <- 3  # number of treatment levels
n <- 6  # observations per treatment
tm <- gl(k, n, n * k, labels = f)  # labels in blocks of n, matching the data order
av <- aov(data ~ tm)
summary(av)
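As a cross-check, the F statistic for this example can be computed by hand from the between-group and within-group sums of squares and compared with the aov() result. This is a minimal sketch; it assumes the data vector is ordered point by point (Point 1, Point 2, Point 3), and the names soil, n_content and point are illustrative.

```r
# Soil N data, assumed to be ordered Point 1, Point 2, Point 3 (6 values each)
soil <- data.frame(
  n_content = c(1200, 1000, 980, 900, 750, 800,
                1000, 1100, 700, 800, 500, 700,
                890, 650, 1100, 900, 400, 350),
  point = gl(3, 6, labels = c("Point1", "Point2", "Point3"))
)

k <- nlevels(soil$point)   # number of groups
N <- nrow(soil)            # total number of observations

group_n    <- tapply(soil$n_content, soil$point, length)
group_mean <- tapply(soil$n_content, soil$point, mean)
grand_mean <- mean(soil$n_content)

# Between-group and within-group sums of squares
ssb <- sum(group_n * (group_mean - grand_mean)^2)
sse <- sum((soil$n_content - ave(soil$n_content, soil$point))^2)

f_manual <- (ssb / (k - 1)) / (sse / (N - k))
f_aov    <- summary(aov(n_content ~ point, data = soil))[[1]][["F value"]][1]

print(c(manual = f_manual, aov = f_aov))
```

The two values should agree, which confirms that the treatment labels line up with the observations.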
11
ANOVA Analysis of Variance
Comparison of means/variance between more than two samples
Can also compare within a multi-factorial design
Assumptions of linear model apply
Can be followed up with a post-hoc test to determine which groups are significantly different (e.g. Tukey HSD)
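The assumption checks and the post-hoc test mentioned above could be sketched as follows, using the soil N data from the earlier example. The object names are illustrative, and shapiro.test() and bartlett.test() are one common choice of checks, not the only one.

```r
# Soil N data, assumed to be ordered Point 1, Point 2, Point 3 (6 values each)
soil <- data.frame(
  n_content = c(1200, 1000, 980, 900, 750, 800,
                1000, 1100, 700, 800, 500, 700,
                890, 650, 1100, 900, 400, 350),
  point = gl(3, 6, labels = c("Point1", "Point2", "Point3"))
)

fit <- aov(n_content ~ point, data = soil)

# Normality of the residuals
print(shapiro.test(residuals(fit)))

# Homogeneity of variances across groups
print(bartlett.test(n_content ~ point, data = soil))

# Post-hoc pairwise comparisons with Tukey's HSD
tukey <- TukeyHSD(fit)
print(tukey)
```

A post-hoc test is normally interpreted only when the overall ANOVA is significant.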
12
ANOVA
13
Exercise 5 Does the quantity of N fertilizer applied influence crop growth?
1 kg.ha 10 kg.ha 100 kg.ha 1.00 1.10 0.87 0.95 1.15 0.88 0.98 1.13 0.92 0.94 1.18 0.97 1.16 0.90
14
Implementation in R
A two-step process:
1. Run a linear model: lm(response~terms)
2. Apply the outcome of the linear model to ANOVA: anova(lm)
Or use the function aov(response~terms), which runs the linear model followed by ANOVA (balanced design).
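A minimal sketch of both routes, using R's built-in PlantGrowth dataset (an assumption made for illustration; it is not part of the slides). Both routes should give the same F statistic for this balanced design.

```r
data(PlantGrowth)  # built-in dataset: plant weight for 3 groups of 10 plants

# Route 1: two steps, linear model followed by anova()
fit_lm <- lm(weight ~ group, data = PlantGrowth)
tab_lm <- anova(fit_lm)

# Route 2: one step with aov()
fit_aov <- aov(weight ~ group, data = PlantGrowth)
tab_aov <- summary(fit_aov)[[1]]

print(tab_lm)
print(tab_aov)
```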
15
Exercise 6 In this exercise we will use the internal dataset “ChickWeight”.
Ho: Diet does not affect 20-day-old chicken weight.
Create a boxplot of the data. Since we are looking at differences in the mean, can you see any means that look different?
Conduct an ANOVA analysis.
Conduct a post-hoc test (TukeyHSD).
16
Exercise 6 – R code
data(ChickWeight)    # R is case-sensitive: data(), not Data()
attach(ChickWeight)  # attach(), not Attach()
boxplot(weight[Time==20]~Diet[Time==20])
summary(lm(weight[Time==20]~Diet[Time==20]))
summary(aov(weight[Time==20]~Diet[Time==20]))
TukeyHSD(aov(weight[Time==20]~Diet[Time==20]))
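An alternative sketch of the same analysis that subsets the day-20 records once and passes them through the data argument. The names chicks20, fit and tukey are illustrative.

```r
data(ChickWeight)

# Keep only the measurements taken on day 20
chicks20 <- subset(ChickWeight, Time == 20)

# Boxplot of weight by diet
boxplot(weight ~ Diet, data = chicks20,
        xlab = "Diet", ylab = "Weight at day 20 (g)")

# One-way ANOVA of weight on diet
fit <- aov(weight ~ Diet, data = chicks20)
print(summary(fit))

# Post-hoc pairwise comparisons between the diets
tukey <- TukeyHSD(fit)
print(tukey)
```

Passing data = avoids attach(), which can mask other objects with the same names in the workspace.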
17
Exercise 7 Confirm the Excel test from Exercise 2 within R.
The file can be imported into R (“crop.RData”)
18
References
http://www.r-tutor.com
Chun and Griffith (2013). Spatial Statistics and Geostatistics (Book)
Dr. Sullivan Notes
Master of Photogrammetry and Geoinformatics (Dr. Rawirl Notes)