Presentation is loading. Please wait.

Presentation is loading. Please wait.

Two-way ANOVA with significant interactions

Similar presentations


Presentation on theme: "Two-way ANOVA with significant interactions"— Presentation transcript:

1 Two-way ANOVA with significant interactions
Statistical Data Analysis - Lecture /04/03 Two-way ANOVA with significant interactions In the previous lecture we saw that the full model for a two-way ANOVA includes an interaction term Remember an interaction term models the possible interaction of the two main effects If the interaction term is significant (non-zero) in our ANOVA table then we cannot say anything about the main effects in isolation because they are confounded The concept of confounding is important in statistics If two factors are confounded this means that the main effects do not alter the response independently of each other Statistical Data Analysis - Lecture /04/03

2 Statistical Data Analysis - Lecture15 - 08/04/03
Example The data on the following slide come from an experiment carried out by a statistics class They planted 100 seeds in each of 48 boxes and applied two treatments The boxes were left uncovered or covered The students watered the seed boxes each day with a preset amount of water (coded levels 1 through 6) So the 48 boxes were divided into 24 left uncovered and 24 left covered. The watering levels were assigned in equal proportions, so 4 boxes out of 24 receive level 1, 4 receive level 2 and so on Therefore the design is balanced (equal numbers of replicates for each treatment combination) At the end of two weeks the students counted how many seeds had germinated in each box Statistical Data Analysis - Lecture /04/03

3 Statistical Data Analysis - Lecture15 - 08/04/03
Example In this experiment: The number of seeds that germinated is our response LIGHT and WATER are our two factors The factor LIGHT has two levels (uncovered/covered) The factor WATER has six levels corresponding to the amount of water given Therefore our probability model is In words we say that “the number of seeds that germinate is dependent on whether the box is covered or not and the amount of water the box receives plus some effect due to the interaction of these two factors. We expect the residuals in this model to be Normally distributed with mean zero and variance sigma squared” Statistical Data Analysis - Lecture /04/03

4 Statistical Data Analysis - Lecture15 - 08/04/03
What do we know? What do we know before we carry out this experiment? Plants need light to live (to carry out photosynthesis) Plants need water to live Plants do not survive (well) on water alone Plants do not survive at all on light alone Too much water is not good for plants that don’t live in the water Therefore, we expect that there will be an interaction between the factors LIGHT and WATER Statistical Data Analysis - Lecture /04/03

5 Two way interaction with ANOVA
Statistical Data Analysis - Lecture /04/03

6 Two-way ANOVA with missing values
From the table on the previous slide we can see that the experimenters did not get a result from the 4th uncovered replicate with watering level 5 Some statistical packages have a little trouble with missing values or unbalanced designs. Minitab does. R does not. In general an experiment with two factors with levels I and J respectively is balanced if is an integer, where Statistical Data Analysis - Lecture /04/03

7 Statistical Data Analysis - Lecture15 - 08/04/03
Two-way ANOVA in R Read in the data seeds<-read.csv(“seeds.csv”) Make sure the factors are coded as factors seeds$Light<-as.factor(seeds$Light) seeds$Water<-as.factor(seeds$Water) Fit the model fit<-aov(Count~Light*Water,data=seeds) Check the model plot(fit) Get the ANOVA table anova(fit) Statistical Data Analysis - Lecture /04/03

8 Statistical Data Analysis - Lecture15 - 08/04/03
Analysis of Variance Table Response: Count Df Sum Sq Mean Sq F value Pr(>F) Light Water < 2.2e-16 *** Light:Water e-11 *** Residuals --- From the ANOVA table we can see that the P-value for the hypothesis test that the interactions are zero is <0.001 I.e. there is very strong evidence against the null hypothesis that the interactions are zero This is what we expected to see However, do the model assumptions hold? If the responses can range from zero to one hundred (no seeds germinated – all seeds germinated), then this is a clear violation of the assumption of normality Does it matter? Statistical Data Analysis - Lecture /04/03

9 Statistical Data Analysis - Lecture15 - 08/04/03

10 Statistical Data Analysis - Lecture15 - 08/04/03

11 Normality assumptions
Norplot is approximately linear Therefore the assumption of Normality whilst not strictly true is at least not violated too badly The least squares line (regressing residuals on Z-scores) is of y = -1.4E x, or approximately y = x = 5.6x I.e. the residuals are approximately normally distributed with mean 0 and standard deviation 5.6 Therefore +/- 3sd = +/-16.8 We have one residual of –17.25 (z-score 3.08) so not too bad Statistical Data Analysis - Lecture /04/03

12 Statistical Data Analysis - Lecture15 - 08/04/03

13 Statistical Data Analysis - Lecture15 - 08/04/03
Pred-res plot The pred-res plot is hard to interpret with only 4 observations per group We might be concerned about the water level 6 data (all the counts are zero) What happens when we analyse the experiment without it? Use the subs parameter in aov, e.g. fit2<-aov(Count~Light*Water,data=seeds ,subs=-((1:48)[Water==6])) Makes little difference to the ANOVA table or the residual plots Statistical Data Analysis - Lecture /04/03

14 A graphical display of two-way data (interaction plots)
We can construct LSD plots for two-way data The process is just the same as for one-way, but now we have a mean for each combination of the factor levels, i.e. E.g. we have 2 levels for light and 6 levels for water, so we have 12 fitted means and standard deviations. The degrees of freedom are now min(nij-1, i=1..I, j=1..J) (3 for our example) However, it may be more informative to look at a plot of how the means vary with the combinations of the factors Statistical Data Analysis - Lecture /04/03

15 Statistical Data Analysis - Lecture15 - 08/04/03
Interaction plots Interaction plots plot the means for each combination of the levels of the factors The plot places the levels of one factor on the x-axis and the mean levels of the other factor on the y-axis, i.e Factors 1 to I on the x-axis on the y-axis A point is then plotted for each data mean It is traditional to join the means with the same j level in the y-axis factor with a line I.e. join up If the lines are parallel and not overlapping then there is little or no evidence of an interaction in the data If the lines overlap or are not parallel, then this is taken as evidence of a significant interaction For the examples we’re considering it should make no difference whether we use data means or fitted means Statistical Data Analysis - Lecture /04/03

16 Statistical Data Analysis - Lecture15 - 08/04/03

17 Statistical Data Analysis - Lecture15 - 08/04/03
Interpretation We know from our ANOVA table that the interaction is significant This is represented by the overlapping lines We can see that the uncovered boxes have more successful germinations with less water than the uncovered boxes We can see that water level 6 is fatal regardless of light We can see that light is less important than water (in that we can achieve similar germination levels with more water and no light It looks like the covered boxes may be more robust to water level Statistical Data Analysis - Lecture /04/03


Download ppt "Two-way ANOVA with significant interactions"

Similar presentations


Ads by Google