Experimental design
Experiments vs. observational studies
Manipulative experiments: the only way to prove causal relationships, BUT
- manipulations are limited in space and time
- manipulations have side effects
Example of side effects – exclosures for grazing
Exclosures have a significantly higher density of small rodents. Why?
The fence poles are perfect perching sites for birds of prey
Laboratory, field, natural trajectory (NTE), and natural snapshot (NSE) experiments (Diamond 1986)
NTE/NSE = Natural Trajectory/Snapshot Experiment
Observational studies (e.g. for correlation between environment and species, or estimates of plot characteristics)
Random vs. regular sampling plan
Take care: even if the plots are located randomly, some of them will (in a finite area) lie close to each other, and so they might be “auto-correlated”. A regular pattern maximizes the distance between neighbouring plots.
Regular design: gives biased results when the sampled area has some regular structure (e.g. regular furrows) with the same period as the grid spacing. Otherwise it is the better design: it provides better coverage of the area and also enables the use of special permutation tests.
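The contrast between the two sampling plans can be illustrated with a short sketch. The square study area, its side length, the 5 x 5 grid, and all function names are assumptions made for illustration; they do not come from the slides.

```python
import math
import random

def random_plots(n, side, rng):
    """n plot centres placed uniformly at random in a side x side square."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]

def regular_plots(k, side):
    """k x k plot centres on a regular grid, each in the middle of its cell."""
    step = side / k
    return [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(k) for j in range(k)]

def min_neighbour_distance(points):
    """Smallest distance between any two plot centres."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])

rng = random.Random(1)          # fixed seed so the layout is reproducible
side, k = 100.0, 5              # hypothetical 100 m x 100 m area, 25 plots
grid = regular_plots(k, side)
rand = random_plots(k * k, side, rng)

print("regular grid, closest pair:", min_neighbour_distance(grid))
print("random plan,  closest pair:", min_neighbour_distance(rand))
```

With the same number of plots, the random plan will almost always contain at least one pair of plots much closer together than the grid spacing, which is where spatial autocorrelation is most likely to appear.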
Manipulative experiments: a frequent trade-off between feasibility and the requirements of correct statistical design and power of the tests.
To maximize the power of the test, you need to maximize the number of independent experimental units.
For feasibility and realism, you need plots of sufficient size to avoid edge effects.
Important: treatments are randomly assigned to plots.
Completely randomized design
Typical analysis: one-way ANOVA
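A minimal sketch of a completely randomized design and its one-way ANOVA, written with the Python standard library only. The treatment names and the response values are invented for illustration, and the helper names are hypothetical.

```python
import random

def randomize_treatments(treatments, replicates, rng):
    """Completely randomized design: shuffle treatment labels over all plots."""
    labels = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(labels)
    return labels

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

rng = random.Random(42)
# One label per plot, in plot order; treatment names are hypothetical.
layout = randomize_treatments(["control", "fertilized", "limed"], 3, rng)

# Hypothetical responses, invented for illustration only.
responses = {"control": [4, 5, 6], "fertilized": [7, 8, 9], "limed": [5, 6, 7]}
f_value, df1, df2 = one_way_anova_f(list(responses.values()))

print(layout)
print(f_value, df1, df2)
```

The F statistic would then be compared with the F distribution with `df1` and `df2` degrees of freedom; that step is left out here to avoid a dependency on a statistics library.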
Regular patterns of treatment locations are often used. They usually maximize the distance between plots receiving the same treatment and so minimize their spatial dependence. The danger is similar to that of a regular sampling pattern, i.e. an inherent periodicity in the environment, but this is usually very unlikely.
When randomizing, your treatment allocation could by chance also come out clumped. A regular pattern helps to avoid possible “clumping” of plots with the same treatment.
Randomized complete blocks
For repeated measurements: adjust the blocks (and even the randomization) after the baseline measurement.
ANOVA: the TREAT x BLOCK interaction is the error term.
If the block has strong explanatory power, the RCB design is stronger than the completely randomized one.
If the block has no explanatory power, the RCB design is weak: the blocks use up error degrees of freedom without reducing the unexplained variability.
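The effect of blocking can be sketched as follows, assuming one plot per treatment in each block. The data are invented so that blocks differ strongly; function and variable names are hypothetical.

```python
import random

def randomize_within_blocks(treatments, n_blocks, rng):
    """Each block receives every treatment once, in its own random order."""
    return [rng.sample(treatments, len(treatments)) for _ in range(n_blocks)]

def rcb_anova(table):
    """table[b][t] = response of treatment t in block b (one plot per cell).

    Returns (F with TREAT x BLOCK as error, F when blocks are ignored).
    """
    n_blocks, n_treat = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in table]
    treat_means = [sum(row[t] for row in table) / n_blocks
                   for t in range(n_treat)]
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_treat - ss_block   # TREAT x BLOCK interaction
    df_treat = n_treat - 1
    df_error = (n_treat - 1) * (n_blocks - 1)
    f_rcb = (ss_treat / df_treat) / (ss_error / df_error)
    # Same data analysed as if completely randomized (block effect in error).
    df_crd = n_treat * (n_blocks - 1)
    f_crd = (ss_treat / df_treat) / ((ss_block + ss_error) / df_crd)
    return f_rcb, f_crd

treatments = ["control", "fertilized", "limed"]      # hypothetical names
layout = randomize_within_blocks(treatments, 3, random.Random(7))

# Hypothetical responses: rows are blocks, columns follow `treatments`.
table = [[10, 12, 11],
         [20, 23, 21],
         [30, 32, 32]]
f_rcb, f_crd = rcb_anova(table)
print(layout)
print(f_rcb, f_crd)
```

In this invented example most of the variability lies between blocks, so the test against the TREAT x BLOCK interaction is far more sensitive than the test that ignores blocks. With blocks that explain nothing, the RCB test would only lose error degrees of freedom.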
Reminder: covariates
Use of covariates (covariables) and analysis of covariance: another way to filter out “noise” and decrease the unexplained variability.
Latin square design
In most cases a rather weak test if analyzed as a Latin square (i.e. column and row taken as factors in an incomplete three-way ANOVA).
Again, useful to avoid clumping of the same treatment.
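One simple way to lay out a Latin square is to start from a cyclic square and permute its rows and columns at random. This is only a sketch: it does not sample uniformly from all possible Latin squares, and the treatment labels and function name are hypothetical.

```python
import random

def random_latin_square(treatments, rng):
    """Latin square built from a cyclic square with shuffled rows and columns."""
    n = len(treatments)
    # Cyclic square: row i is the treatment order shifted by i positions.
    square = [[(i + j) % n for j in range(n)] for i in range(n)]
    rng.shuffle(square)                      # permute rows
    cols = list(range(n))
    rng.shuffle(cols)                        # permute columns
    return [[treatments[row[c]] for c in cols] for row in square]

treatments = ["A", "B", "C", "D"]            # hypothetical treatment labels
square = random_latin_square(treatments, random.Random(3))
for row in square:
    print(" ".join(row))
```

Every treatment occurs exactly once in each row and each column, so plots with the same treatment cannot clump in one part of the field.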
Most frequent errors - pseudoreplications
Cited 4000+ times
Note: B is in fact not pseudoreplication if the analysis correctly reflects the hierarchical design of the data.
Logic of experiments in ecology: is pseudoreplication a pseudoissue?
Oksanen, L. Logic of experiments in ecology: is pseudoreplication a pseudoissue? Oikos 94: 27-38.
Hurlbert divides experimental ecologists into 'those who do not see any need for dispersion (of replicated treatments and controls) and those who do recognize its importance and take whatever measures are necessary to achieve a good dose of it'. Experimental ecologists could also be divided into those who do not see any problems with sacrificing spatial and temporal scales in order to obtain replication, and those who understand that appropriate scale must always have priority over replication.
Reminder: Type I and Type III SS
If the design is balanced, you don't need to care.
In non-balanced designs:
- Type I (sequential): the order of predictors IS important
- Type III, Type VI
What is the interaction?
Log transformation and the interaction
If the interaction = 0, we expect pure additivity: Effect of A+B = Effect of A + Effect of B (e.g. fertilization increases height by 5 cm).
Null model for interaction: $X_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}$
Model with interaction: $X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$
Often a biologically more plausible null model is pure multiplicativity (e.g. fertilization increases height by 20%).
Null model for interaction: $X_{ijk} = \mu \cdot \alpha_i \cdot \beta_j \cdot \varepsilon_{ijk}$
Then: $\log X_{ijk} = \log(\mu \cdot \alpha_i \cdot \beta_j \cdot \varepsilon_{ijk}) = \log \mu + \log \alpha_i + \log \beta_j + \log \varepsilon_{ijk}$
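A small numerical sketch of the same point, using noise-free cell means of a 2 x 2 design. The baseline of 10, the +20% effect of factor A and the +50% effect of factor B are invented for illustration.

```python
import math

def interaction_contrast(cells):
    """Interaction contrast of a 2 x 2 table: X11 - X10 - X01 + X00."""
    return cells[(1, 1)] - cells[(1, 0)] - cells[(0, 1)] + cells[(0, 0)]

# Purely multiplicative cell means (hypothetical values):
# factor A multiplies the response by 1.2, factor B by 1.5.
base = 10.0
cells = {(a, b): base * (1.2 ** a) * (1.5 ** b)
         for a in (0, 1) for b in (0, 1)}

raw = interaction_contrast(cells)
logged = interaction_contrast({k: math.log(v) for k, v in cells.items()})

print("interaction contrast, raw scale:", raw)
print("interaction contrast, log scale:", logged)
```

On the raw scale the contrast is non-zero (18 - 12 - 15 + 10 = 1), so an additive ANOVA would report an interaction. After the log transformation it vanishes up to rounding error, because the effects are purely multiplicative.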
If you log-transform in a factorial ANOVA: think more about the meaning of the interaction; the distributional properties of the response variable are (usually) less important.
Fixed and random factors
Fertilization experiment in three countries
The meaning of the test differs depending on whether country is a factor with a fixed or a random effect.

      COUNTRY   FERTIL   NOSPEC
  1   CZ        0.000     9.000
  2   CZ        0.000     8.000
  3   CZ        0.000     6.000
  4   CZ        1.000     4.000
  5   CZ        1.000     5.000
  6   CZ        1.000     4.000
  7   UK        0.000    11.000
  8   UK        0.000    12.000
  9   UK        0.000    10.000
 10   UK        1.000     3.000
 11   UK        1.000     4.000
 12   UK        1.000     3.000
 13   NL        0.000     5.000
 14   NL        0.000     6.000
 15   NL        0.000     7.000
 16   NL        1.000     6.000
 17   NL        1.000     6.000
 18   NL        1.000     8.000
Country is a fixed factor (i.e., we are interested in the three plots only)

Summary of all Effects; design: (new.sta)
1-COUNTRY, 2-FERTIL (12 = COUNTRY x FERTIL interaction)

Effect   df Effect   MS Effect   df Error   MS Error    F          p-level
1        2            2.16667    12         1.055556     2.05263   .171112
2        1           53.38889    12         1.055556    50.57895   .000012
12       2           26.05556    12         1.055556    24.68421   .000056

Country is a random factor (i.e., the three plots are considered as a random selection of all plots of this type in Europe - [to make Brussels happy])

Summary of all Effects; design: (new.sta)
1-COUNTRY, 2-FERTIL (12 = COUNTRY x FERTIL interaction)

Effect   df Effect   MS Effect   df Error   MS Error    F          p-level
1        2            2.16667    12          1.05556     2.05263   .171112
2        1           53.38889     2         26.05556     2.04904   .288624
12       2           26.05556    12          1.05556    24.68421   .000056
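The mean squares and F ratios above can be recomputed by hand from the 18 observations. The sketch below uses only the Python standard library; the variable names are hypothetical, and p-levels are not computed because that would need an F distribution from a statistics library.

```python
# NOSPEC values from the data table, keyed by (COUNTRY, FERTIL).
data = {
    ("CZ", 0): [9, 8, 6],    ("CZ", 1): [4, 5, 4],
    ("UK", 0): [11, 12, 10], ("UK", 1): [3, 4, 3],
    ("NL", 0): [5, 6, 7],    ("NL", 1): [6, 6, 8],
}
countries, levels, n = ["CZ", "UK", "NL"], [0, 1], 3

def mean(xs):
    return sum(xs) / len(xs)

grand = mean([x for v in data.values() for x in v])
c_mean = {c: mean([x for f in levels for x in data[(c, f)]]) for c in countries}
f_mean = {f: mean([x for c in countries for x in data[(c, f)]]) for f in levels}
cell = {k: mean(v) for k, v in data.items()}

# Sums of squares for a balanced two-way design with n replicates per cell.
ss_country = n * len(levels) * sum((m - grand) ** 2 for m in c_mean.values())
ss_fertil = n * len(countries) * sum((m - grand) ** 2 for m in f_mean.values())
ss_inter = n * sum((cell[(c, f)] - c_mean[c] - f_mean[f] + grand) ** 2
                   for c in countries for f in levels)
ss_resid = sum((x - cell[k]) ** 2 for k, v in data.items() for x in v)

ms_country = ss_country / (len(countries) - 1)
ms_fertil = ss_fertil / (len(levels) - 1)
ms_inter = ss_inter / ((len(countries) - 1) * (len(levels) - 1))
ms_resid = ss_resid / (len(countries) * len(levels) * (n - 1))

# Fixed country: fertilization is tested against the residual.
f_fertil_fixed = ms_fertil / ms_resid
# Random country: fertilization is tested against COUNTRY x FERTIL.
f_fertil_random = ms_fertil / ms_inter

print(round(f_fertil_fixed, 5), round(f_fertil_random, 5))
```

The only difference between the two analyses is the denominator of the fertilization test: the residual mean square (12 df) when country is fixed, the interaction mean square (2 df) when country is random.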
Nested design (“split-plot”)
Two explanatory variables, Treatment and Plot; Plot is a random factor nested in Treatment. Accordingly, there are two error terms: the effect of Treatment is tested against Plot, and the effect of Plot against the residual variability:
F(Treat) = MS(Treat)/MS(Plot)
F(Plot) = MS(Plot)/MS(Resid) [often not of interest]
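The two error terms can be sketched on a tiny balanced example: two treatments, two plots per treatment, two subsamples per plot. All values and labels are invented for illustration.

```python
# Hypothetical data: data[treatment][plot] = subsamples taken within that plot.
data = {
    "A": {"p1": [10, 12], "p2": [14, 16]},
    "B": {"p3": [20, 22], "p4": [18, 20]},
}

def mean(xs):
    return sum(xs) / len(xs)

all_values = [x for plots in data.values() for v in plots.values() for x in v]
grand = mean(all_values)

ss_treat = ss_plot = ss_resid = 0.0
n_plots = 0
for plots in data.values():
    treat_values = [x for v in plots.values() for x in v]
    treat_mean = mean(treat_values)
    ss_treat += len(treat_values) * (treat_mean - grand) ** 2
    for v in plots.values():
        n_plots += 1
        ss_plot += len(v) * (mean(v) - treat_mean) ** 2   # Plot within Treat
        ss_resid += sum((x - mean(v)) ** 2 for x in v)    # within plots

df_treat = len(data) - 1
df_plot = n_plots - len(data)
df_resid = len(all_values) - n_plots

ms_treat = ss_treat / df_treat
ms_plot = ss_plot / df_plot
ms_resid = ss_resid / df_resid

f_treat = ms_treat / ms_plot      # Treatment tested against Plot
f_plot = ms_plot / ms_resid       # Plot tested against residual
# For contrast only: ignoring the plots and pooling all subsamples as if they
# were independent replicates (pseudoreplication) inflates the F ratio.
f_pooled = ms_treat / ((ss_plot + ss_resid) / (df_plot + df_resid))

print(f_treat, f_plot, f_pooled)
```

In this invented example the correct test of Treatment has only 2 error degrees of freedom, while the pooled test pretends to have 6 and gives a much larger F ratio.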
Split plot (main plots and split plots - two error levels)
ROCK is the MAIN PLOT factor, PLOT is a random factor nested in ROCK, and TREATMENT is the within-plot (split-plot) factor. Two error levels:
F(ROCK) = MS(ROCK)/MS(PLOT)
F(TREA) = MS(TREA)/MS(PLOT*TREA)
Following changes in time
Non-replicated BACI (Before-After-Control-Impact)
Analysed by two-way ANOVA with factors Time (before/after) and Location (control/impact).
Of main interest: the Time*Location interaction (i.e., the temporal change differs between control and impact locations).
In fact, in non-replicated BACI the test is based on pseudoreplication. It should NOT be used in experimental setups. In impact assessments, it is often the best possibility. (The best need not always be good enough.)
Replicated BACI - repeated measurements
Usually analysed by “univariate repeated measures ANOVA”. This is in fact a split-plot design, where TREATment is the main-plot effect, time is the within-plot effect, and individuals (or experimental units) are nested within a treatment. Of main interest is the TIME*TREAT interaction.