Analysis of Variance Harry R. Erwin, PhD


1 Analysis of Variance Harry R. Erwin, PhD
School of Computing and Technology University of Sunderland

2 Resources
Crawley, MJ (2005) Statistics: An Introduction Using R. Wiley.
Gonick, L., and Smith, W. (1993) The Cartoon Guide to Statistics. HarperResource (for fun).

3 When is Anova Used?
All explanatory variables are categorical (unquantified and unordered).
The explanatory variables are called ‘factors’; each has two or more levels.
If there is one factor with two levels, use Student’s t.
If there is one factor with three or more levels, use one-way Anova.
If there are two factors, use two-way Anova.
For three factors, use three-way Anova, and so on.
If every combination of factor levels is present, you have a factorial design, allowing you to study interactions between variables (and order no longer matters!).
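The choice between Student's t and one-way Anova can be sketched in R. This is a minimal illustration on simulated data; the data frames `two` and `three`, their columns `y` and `A`, and the group means are all invented for the example. With a single two-level factor, the equal-variance t test and the Anova F test agree (F is the square of t).

```r
set.seed(1)

# Hypothetical data: one factor with two levels, ten replicates each
two <- data.frame(
  y = c(rnorm(10, mean = 3), rnorm(10, mean = 5)),
  A = factor(rep(c("a", "b"), each = 10))
)

# One factor, two levels: Student's t (equal variances assumed)
tt <- t.test(y ~ A, data = two, var.equal = TRUE)

# The same data through one-way Anova
f2 <- summary(aov(y ~ A, data = two))[[1]]

# With two levels the F statistic equals t squared
print(c(t_squared = unname(tt$statistic)^2, F = f2[["F value"]][1]))

# Hypothetical data: one factor with three levels, so use one-way Anova
three <- data.frame(
  y = rnorm(30, mean = rep(c(3, 5, 4), each = 10)),
  A = factor(rep(c("a", "b", "c"), each = 10))
)
print(summary(aov(y ~ A, data = three)))
```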

4 The Basic Idea of Anova
You compare means by comparing variances. (Picture)
Compute the overall variance of the data: $s^2 = \mathrm{SSY}/(kn-1)$, where SSY is the total sum of squares and $kn-1$ is its degrees of freedom ($k$ treatments, $n$ replicates each).
If the treatment means are significantly different, the sum of squares computed from the individual treatment means will be smaller than the sum of squares computed from the overall mean.
SSE = the error sum of squares, computed from the individual treatment means; degrees of freedom = $k(n-1)$.
SSA = SSY - SSE; degrees of freedom = $k-1$.
Finally, use an F test, $F = \dfrac{\mathrm{SSA}/(k-1)}{\mathrm{SSE}/(k(n-1))}$, to determine whether SSA is significant.
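The arithmetic on this slide can be checked by hand in R. The sketch below uses simulated data (the values of `k`, `n`, the factor `A` and the response `y` are assumptions made for illustration) and compares the hand-computed F ratio with the one reported by `aov`.

```r
set.seed(42)
k <- 3   # number of treatments (assumed for the example)
n <- 10  # replicates per treatment (assumed for the example)
A <- factor(rep(c("a", "b", "c"), each = n))
y <- rnorm(k * n, mean = rep(c(3, 5, 4), each = n))

# Total sum of squares: deviations from the overall mean, df = kn - 1
SSY <- sum((y - mean(y))^2)

# Error sum of squares: deviations from each treatment mean, df = k(n - 1)
SSE <- sum((y - ave(y, A))^2)

# Treatment sum of squares by subtraction, df = k - 1
SSA <- SSY - SSE

# F ratio of the two mean squares, and its p-value
F_ratio <- (SSA / (k - 1)) / (SSE / (k * (n - 1)))
p_value <- pf(F_ratio, k - 1, k * (n - 1), lower.tail = FALSE)

# The same quantities as computed by aov
tab <- summary(aov(y ~ A))[[1]]
print(c(by_hand = F_ratio, aov = tab[["F value"]][1]))
```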

5 Anova Tools
model<-aov(y~A)
summary(model) tells you whether the SSA is significant.
plot(model) checks for constancy of variance and normality of errors.
Demonstration ( )
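A minimal sketch of the three calls on simulated data (the data frame `d` and its contents are invented here). `plot()` on an `aov` object uses the same diagnostic plots as `lm`; selecting `which = 1` and `which = 2` gives the residuals-versus-fitted plot (constancy of variance) and the normal Q-Q plot (normality of errors).

```r
set.seed(2)

# Hypothetical one-factor data set
d <- data.frame(
  y = rnorm(30, mean = rep(c(3, 5, 4), each = 10)),
  A = factor(rep(c("a", "b", "c"), each = 10))
)

model <- aov(y ~ A, data = d)

# Anova table: is the treatment sum of squares significant?
print(summary(model))
p_value <- summary(model)[[1]][["Pr(>F)"]][1]

# Diagnostics: constancy of variance, then normality of errors
plot(model, which = 1)  # residuals vs fitted values
plot(model, which = 2)  # normal Q-Q plot of the residuals
```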

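6 Demonstration
# Read the data and make its columns visible by name
oneway<-read.table("oneway.txt",header=TRUE)
attach(oneway)
names(oneway)
[1] "ozone" "garden"
# Plot the 20 observations in order, with the overall mean as a horizontal line
plot(1:20,ozone,ylim=c(0,8),ylab="y",xlab="order")
abline(h=mean(ozone))
# Draw each observation's deviation from the overall mean
for(i in 1:20)lines(c(i,i),c(mean(ozone),ozone[i]))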

7 Variance from Mean

8 Separating the two gardens
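The picture for this slide is not in the transcript. A rough sketch of how such a plot could be drawn: residuals are measured from each garden's own mean instead of the overall mean. Because oneway.txt is not available here, the data frame below is a simulated stand-in; only the column names `ozone` and `garden` and the 20 observations come from the demonstration, while the values and the level names "A" and "B" are assumptions.

```r
set.seed(7)

# Simulated stand-in for oneway.txt (values and level names are invented)
oneway <- data.frame(
  ozone  = c(rnorm(10, mean = 3), rnorm(10, mean = 5)),
  garden = factor(rep(c("A", "B"), each = 10))
)

# Mean of the garden each observation belongs to
garden_mean <- ave(oneway$ozone, oneway$garden)

plot(1:20, oneway$ozone, ylim = range(oneway$ozone),
     ylab = "y", xlab = "order",
     pch = ifelse(oneway$garden == "A", 1, 16))

# One horizontal line per garden, at that garden's mean
for (g in levels(oneway$garden)) {
  idx <- which(oneway$garden == g)
  lines(range(idx), rep(mean(oneway$ozone[idx]), 2))
}

# Deviation of each observation from its own garden mean
for (i in 1:20) lines(c(i, i), c(garden_mean[i], oneway$ozone[i]))

# The residual lines are shorter overall than those from the overall mean
SSY <- sum((oneway$ozone - mean(oneway$ozone))^2)
SSE <- sum((oneway$ozone - garden_mean)^2)
print(c(SSY = SSY, SSE = SSE))
```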

9 Analysis
summary(aov(ozone~garden))
Df Sum Sq Mean Sq F value Pr(>F)
garden **
Residuals (the Mean Sq entry in this row is the residual mean square)
plot(aov(ozone~garden)) gives plots similar to the lm plots, but less informative.

10 Investigating Factor Levels
summary.aov() allows you to do hypothesis testing.
The effects of the factor levels are usually more interesting than the hypothesis test.
To see them, use summary.lm() from the regression lecture:
summary.lm(aov(ozone~garden))
Discuss (pages of text)
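A sketch of how the `summary.lm()` output relates to the factor-level means, assuming R's default treatment contrasts. The data are a simulated stand-in for oneway.txt (values and the level names "A" and "B" are invented). Under treatment contrasts the intercept is the mean of the first level, and the second coefficient is the difference between the two garden means.

```r
set.seed(7)

# Simulated stand-in for oneway.txt (values and level names are invented)
oneway <- data.frame(
  ozone  = c(rnorm(10, mean = 3), rnorm(10, mean = 5)),
  garden = factor(rep(c("A", "B"), each = 10))
)

model <- aov(ozone ~ garden, data = oneway)

# Coefficient table: estimates, standard errors, t values, p-values
coefs <- coef(summary.lm(model))
print(coefs)

# Compare with the raw garden means
means <- tapply(oneway$ozone, oneway$garden, mean)
print(means)

# Intercept = mean of garden A; gardenB = mean of B minus mean of A
intercept  <- coefs["(Intercept)", "Estimate"]
difference <- coefs["gardenB", "Estimate"]
```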

11 Plotting ANOVA
Box and whisker plots
  Show the nature of the variation within each treatment.
  Show skew.
Bar plots with error bars
  Preferred by many journals.
  Demo (pages )
  Show when means are significantly different.
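Both plot types can be sketched with base graphics. The data frame `d` is simulated for the example, and the error bars use one pooled standard error of a mean taken from the residual mean square of the Anova; that is one common choice, assumed here, and other error-bar conventions exist.

```r
set.seed(3)
n <- 10  # replicates per treatment (assumed for the example)

# Hypothetical one-factor data set
d <- data.frame(
  y = rnorm(3 * n, mean = rep(c(3, 5, 4), each = n)),
  A = factor(rep(c("a", "b", "c"), each = n))
)

# Box and whisker plot: spread and skew within each treatment
boxplot(y ~ A, data = d, ylab = "y")

# Bar plot of treatment means with error bars
means <- tapply(d$y, d$A, mean)
model <- aov(y ~ A, data = d)

# Pooled standard error of a mean: sqrt(residual mean square / n)
se <- sqrt(sum(residuals(model)^2) / df.residual(model) / n)

# barplot() returns the x positions of the bar centres
mids <- barplot(means, ylim = c(0, max(means) + 2 * se), ylab = "mean y")

# Flat-headed arrows drawn both ways make the error bars
arrows(mids, means - se, mids, means + se, angle = 90, code = 3, length = 0.1)
```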

12 Factorial Experiments
All combinations of factors present.
Highly desirable.
Allow us to investigate interactions.
summary()
summary.lm()
Demo of simplification.
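A sketch of a two-factor factorial analysis and one simplification step, on simulated data (the factors `A` and `B`, their levels, and the response are invented, and the simulated response has no true interaction). The full model includes the interaction; `update()` removes it, and `anova()` compares the two fits to see whether the simpler model is adequate.

```r
set.seed(11)

# Every combination of A (2 levels) and B (3 levels), 5 replicates each
d <- expand.grid(rep = 1:5, A = c("a1", "a2"), B = c("b1", "b2", "b3"))
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2") + as.integer(d$B))

# Full factorial model: main effects plus the A:B interaction
full <- aov(y ~ A * B, data = d)
print(summary(full))

# Simplification: drop the interaction term and compare the two models
reduced <- update(full, . ~ . - A:B)
comparison <- anova(reduced, full)
print(comparison)

# Effect sizes for the factor levels in the simplified model
print(summary.lm(reduced))
```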

13 Pseudoreplication
aov and lme can handle complicated error structures.
Avoid pseudoreplication in two kinds of design:
  Nested sampling
  Split-plot analysis
You can average away spatial pseudoreplication and conduct individual ANOVAs for each time.
This approach has some weaknesses (page 180).
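A sketch of averaging away spatial pseudoreplication, on simulated data (six plots, two treatments, four subsamples per plot; all names and values are invented). The naive analysis treats the 24 subsamples as independent and claims 22 residual degrees of freedom; averaging to one value per plot leaves the honest 4. An `Error()` term in `aov` is shown as an alternative way to declare the plot-level error structure.

```r
set.seed(5)

# Six plots, three per treatment (hypothetical design)
plots <- data.frame(
  plot      = factor(1:6),
  treatment = factor(rep(c("control", "treated"), each = 3))
)
plot_effect <- rnorm(6, sd = 1)  # plot-to-plot variation

# Four subsamples within each plot: these are pseudoreplicates
d <- plots[rep(1:6, each = 4), ]
d$y <- 10 + 2 * (d$treatment == "treated") +
  plot_effect[as.integer(d$plot)] + rnorm(24, sd = 0.5)

# Naive analysis: wrongly treats subsamples as independent replicates
naive <- aov(y ~ treatment, data = d)

# Average away the pseudoreplication: one mean per plot
plot_means <- aggregate(y ~ plot + treatment, data = d, FUN = mean)
averaged <- aov(y ~ treatment, data = plot_means)

print(c(naive_df = df.residual(naive), averaged_df = df.residual(averaged)))
print(summary(averaged))

# Alternative: keep all the data but declare the error structure
nested <- aov(y ~ treatment + Error(plot), data = d)
print(summary(nested))
```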

14 Random Effects and Nested Designs
Mixed effects models: both fixed (affecting the mean) and random (affecting the variance) effects in the explanatory variables. Affected by grouping. Page 179 for categorisation.
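A minimal mixed-effects sketch using `lme` from the nlme package, which is assumed to be installed (it ships with standard R distributions). The data are simulated: `treatment` is a fixed effect that shifts the mean, and `group` is a random effect that adds variance between groups. All names and values are invented for the example.

```r
library(nlme)
set.seed(9)

# Eight groups of five observations; treatment is applied to whole groups
d <- data.frame(
  group     = factor(rep(1:8, each = 5)),
  treatment = factor(rep(c("control", "treated"), each = 20))
)
group_effect <- rnorm(8, sd = 1.5)  # random effect: between-group variation
d$y <- 10 + 2 * (d$treatment == "treated") +
  group_effect[as.integer(d$group)] + rnorm(40, sd = 0.5)

# Fixed effect: treatment; random effect: an intercept for each group
model <- lme(y ~ treatment, random = ~ 1 | group, data = d)

# Fixed effects describe the means
print(fixef(model))

# Variance components: between-group and residual
print(VarCorr(model))
```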

15 Longitudinal Data
Repeated measurements from an individual.
Common in medical studies.
Be critical!
Can separate age effects from cohort effects.
Response is a measurement series.

16 Derived Variable Analysis
Reduce each individual's measurement series to summary statistics, which averages away the pseudoreplication, and analyse those statistics.
Weak if explanatory variables change over time.
Watch for variation in:
  random effects (VCA)
  serial correlation
  measurement error
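A sketch of derived variable analysis on simulated longitudinal data (ten individuals measured at six times; all names and values are invented). Each individual's series is reduced to a single derived variable, here the slope of a straight-line fit against time, and the slopes are then compared between treatments with an ordinary one-way Anova.

```r
set.seed(13)

# Ten individuals, each measured at six times (hypothetical design)
long <- expand.grid(time = 1:6, id = factor(1:10))
long$treatment <- factor(ifelse(as.integer(long$id) <= 5, "control", "treated"))
true_slope <- ifelse(long$treatment == "treated", 1.0, 0.5)
long$y <- 2 + true_slope * long$time + rnorm(nrow(long), sd = 0.3)

# Derived variable: one fitted slope per individual
slopes <- sapply(split(long, long$id), function(d) {
  coef(lm(y ~ time, data = d))[["time"]]
})

# One row per individual, so the replicates are now independent
derived <- data.frame(
  id        = factor(names(slopes), levels = levels(long$id)),
  slope     = unname(slopes),
  treatment = factor(rep(c("control", "treated"), each = 5))
)

# Ordinary one-way Anova on the derived variable
model <- aov(slope ~ treatment, data = derived)
print(summary(model))
```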

