Choosing and using statistics to test ecological hypotheses Botany 332 Lab Tutorial Department of Biological Sciences University of Alberta November 2004
Flowchart of the experimental process (Underwood 1997):
OBSERVATIONS: Patterns in space or time
MODELS: Explanations or theories
HYPOTHESIS: Predictions based on model
NULL HYPOTHESIS: Logical opposite to hypothesis
EXPERIMENT: Critical test of null hypothesis
INTERPRETATION: Retain Ho (Null Hypothesis) = Refute hypothesis and model; Reject Ho (Null Hypothesis) = Support hypothesis and model
Ecological experiments OBSERVE things. Come up with MODELS (explanations or theories) to explain your observations. Based on your model, come up with a testable HYPOTHESIS (and a NULL hypothesis). Design an EXPERIMENT to test your null hypothesis statistically. Conduct the experiment and collect DATA. Use STATISTICS with your data to TEST the null hypothesis. INTERPRET your results. Did you reject or retain the null hypothesis? Repeat!
Testing (null) hypotheses statistically Recall we can’t prove our hypothesis, so we try to disprove a null hypothesis instead! Null hypothesis = opposite of our actual hypothesis H0 = Null Hypothesis HA = Alternative hypothesis
Testing (null) hypotheses statistically We formally test hypotheses using statistics Which statistical test to use? Depends on your experimental design, data and your hypotheses It’s important to understand the basics of statistical hypothesis testing
Testing (null) hypotheses statistically Based on assumptions about the data, statistics tell us the probability of obtaining results at least as extreme as those observed if the null hypothesis were true (P-value). If P is small enough, we can reject the null hypothesis (result is “statistically significant”). What’s “small enough”? By convention: P < 0.05 Reject null hypothesis (our hypothesis is supported) P ≥ 0.05 Retain null hypothesis (our hypothesis is not supported). Retaining the null hypothesis does not prove that it is true.
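One way to see what a P-value measures is a permutation test: shuffle the group labels many times and count how often the shuffled data show a difference at least as large as the one observed. The sketch below is illustrative only. The biomass values, the function name `permutation_p_value`, and the number of permutations are all assumptions made for this example, not part of the lab data.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=1):
    """Estimate a two-sided P-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under H0 the group labels are interchangeable
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical plant biomass (g) with and without fertilizer
fertilized = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7]
control = [10.2, 11.0, 9.8, 10.9, 11.5, 10.4]

p = permutation_p_value(fertilized, control)
print(f"P = {p:.4f}")
print("Reject H0" if p < 0.05 else "Retain H0")
```

The returned value is the fraction of shuffles that look as extreme as the real data, which is the sense in which a small P makes the null hypothesis hard to maintain.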
Testing (null) hypotheses statistically Many statistical methods also tell us the effect size or proportion of variation in the dependent variable explained by the independent variable. e.g. Regression and correlation P-values H0 = No relationship between variables HA = Relationship between variables R² (variation explained) Can have significant P-values but very small R²
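The last point, a significant P-value alongside a very small R², is easiest to see with a large sample and a weak relationship. The sketch below simulates such data; the sample size, slope, and seed are arbitrary assumptions, and the P-value uses a normal approximation to the t distribution, which is reasonable only because n is large.

```python
import math
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient computed from sums of squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Simulated data: a weak true relationship buried in noise (assumed values)
rng = random.Random(42)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.15 * xi + rng.gauss(0, 1) for xi in x]

r = pearson_r(x, y)
r_squared = r ** 2
# Test statistic for H0: no correlation, with n - 2 degrees of freedom
t_stat = r * math.sqrt((n - 2) / (1 - r_squared))
# Two-sided P-value from the normal approximation (adequate for large n only)
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"R^2 = {r_squared:.3f}, t = {t_stat:.2f}, approximate P = {p_approx:.2g}")
```

With thousands of observations even a relationship that explains only a few percent of the variation is statistically detectable, so the P-value and R² answer different questions.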
Choosing and using statistics Determine what kinds of data you have Describe your data Choose an appropriate statistical test Perform the test Report and interpret the results
What kinds of data do you have? Categorical Fertilizer addition, species identity Continuous and discrete Biomass, height, number of bites Independent and Dependent variables
Describe your data Measures of central tendency: mean, median. Measures of dispersion: variance, standard deviation, standard error, range, quartiles.
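These summary measures can all be computed with Python's standard `statistics` module. The sketch below uses made-up biomass values; the variable names are illustrative, and the quartiles depend on the interpolation method (here the module's default), so other software may report slightly different values.

```python
import math
import statistics

# Hypothetical biomass measurements (g)
biomass = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Central tendency
mean_value = statistics.mean(biomass)
median_value = statistics.median(biomass)

# Dispersion
variance = statistics.variance(biomass)          # sample variance (n - 1 denominator)
std_dev = statistics.stdev(biomass)              # sample standard deviation
std_error = std_dev / math.sqrt(len(biomass))    # standard error of the mean
data_range = max(biomass) - min(biomass)
q1, q2, q3 = statistics.quantiles(biomass, n=4)  # quartiles, default method

print(f"mean = {mean_value}, median = {median_value}")
print(f"variance = {variance:.3f}, SD = {std_dev:.3f}, SE = {std_error:.3f}")
print(f"range = {data_range}, quartiles = {q1}, {q2}, {q3}")
```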
Descriptive Statistics – Visual Aids Boxplots - median, upper and lower quartiles, whiskers (fences), outliers Histograms - separate, stackbar, or paired Error Bar Plots
Describe your data Normal vs. non-normal distributions histograms, Q-Q plots, K-S test (significant means non-normal)
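A Q-Q plot pairs each sorted observation with the value expected at the same position in a normal distribution; points that fall close to a straight line suggest approximate normality. The sketch below only computes the pairs (plotting is left to your software of choice). The function name `qq_points`, the sample heights, and the plotting positions (i + 0.5)/n are assumptions; other conventions for plotting positions exist.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Normal distribution with the same mean and SD as the sample
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [
        (fitted.inv_cdf((i + 0.5) / n), value)
        for i, value in enumerate(ordered)
    ]


# Hypothetical plant heights (cm)
heights = [3.1, 2.9, 3.0, 3.4, 2.7, 3.2, 3.0, 2.8]
for theoretical, observed in qq_points(heights):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```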
Data transformation If your data are non-normal: use non-parametric statistics, or transform your data (square-root transform, log transform). Log-normal Distribution: Many data sets like grain-size of sediments, geochemical concentrations, etc. have a very skewed and long-tailed distribution. In general, such distributions arise when the observed quantities have errors that depend on products rather than sums. It therefore follows that the logarithm of the data may be normally distributed. Hence, taking the logarithm of your data may make the transformed distribution look normal. If this is the case, you can apply standard statistical techniques applicable to normal distributions on your log distribution and convert the results (e.g., mean, standard deviation) back to get the proper units.
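A short sketch of the log transform and back-transform, using made-up, strongly skewed values. Note that the mean of the logs, converted back to the original units, is the geometric mean of the data, which is generally smaller than the arithmetic mean for skewed data.

```python
import math
from statistics import mean

# Hypothetical, strongly right-skewed concentrations
concentrations = [1.0, 10.0, 100.0]

# Log-transform, summarize on the log scale, then convert back
log_values = [math.log10(v) for v in concentrations]
mean_of_logs = mean(log_values)
back_transformed_mean = 10 ** mean_of_logs  # geometric mean, in original units

arithmetic_mean = mean(concentrations)

print(f"mean of log10 values   = {mean_of_logs}")
print(f"back-transformed mean  = {back_transformed_mean}")
print(f"arithmetic mean        = {arithmetic_mean}")
```

When reporting back-transformed results, state that the analysis was done on log-transformed data so readers know which kind of mean they are looking at.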
Choose your statistical test Choose statistical tests based on your hypothesis, experimental design and the data you have collected Parametric tests assume data are normal, non-parametric tests do not Many textbooks have recipes or flowcharts for choosing statistics Check with your TA’s
Common statistical tests Chi-squared test t-test (Mann-Whitney U test) One-way ANOVA (Kruskal-Wallis test) Two-way ANOVA ANCOVA ANOVA with covariate Correlation and regression
Chi-squared test For analysis of tables of counts or frequencies Good with categorical variables Non-parametric # plants Germinated Not Germinated Outcrossed 14 10 Inbred 6
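The chi-squared statistic compares observed counts with the counts expected if rows and columns were independent. The sketch below works through a 2 x 2 table by hand. The counts are hypothetical and are not the germination data from the lab; the function name `chi_squared_2x2` is made up; no continuity correction is applied; and the P-value formula used is valid only for 1 degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and P-value for a 2 x 2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories are independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # Upper-tail probability of the chi-squared distribution with df = 1
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = treatment A, treatment B; columns = yes, no
counts = [[20, 10],
          [10, 20]]
chi2, p_value = chi_squared_2x2(counts)
print(f"chi-squared = {chi2:.2f}, df = 1, P = {p_value:.4f}")
```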
t-test For analysis of a categorical independent variable (2 categories) and a continuous dependent variable Samples may be paired (measurements on same individual) or independent (measurements on two sets of individuals) Assumes data are normally distributed (non-parametric alternative: Mann-Whitney U)
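For two independent samples, the t statistic is the difference between the group means divided by its standard error. The sketch below computes the classic pooled-variance version, which assumes equal variances in the two groups. The data and the function name `pooled_t` are hypothetical, and the P-value itself is left to statistical software because it requires the t distribution.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Independent two-sample t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_error, df


# Hypothetical measurements on two separate sets of plants
group_1 = [5.0, 6.0, 7.0, 8.0, 9.0]
group_2 = [1.0, 2.0, 3.0, 4.0, 5.0]

t_stat, df = pooled_t(group_1, group_2)
print(f"t = {t_stat:.2f}, df = {df}")
# The two-tailed critical value for df = 8 at the 0.05 level is 2.306,
# so a |t| larger than that corresponds to P < 0.05.
```

For paired samples, the same idea applies to the per-individual differences rather than to the two groups separately.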
ANOVA Analysis of Variance examines variation within and between groups For analysis of categorical independent variables (2 or more categories) and a continuous dependent variable Assumes data are normally distributed (non-parametric alternative: Kruskal-Wallis)
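The F statistic in a one-way ANOVA is the ratio of between-group to within-group variation, each divided by its degrees of freedom. The sketch below computes it by hand for three hypothetical groups; the function name `one_way_anova_f` and the data are assumptions, and the P-value would normally come from statistical software.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical flower counts under three treatments
treatments = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_between, df_within = one_way_anova_f(treatments)
print(f"F = {f_stat:.1f}, df = {df_between},{df_within}")
```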
ANOVA One-way ANOVA: single independent variable; main effect. Two-way ANOVA: two independent variables; main effects and interaction terms. Significant result means at least one group differed from another. Use post-hoc tests to test for differences among individual treatments.
ANCOVA Analysis of Covariance For analysis of categorical independent variables (2 or more categories), a continuous dependent variable, and a covariate Effects of covariate removed before testing for effect of independent variable(s)
Correlation and regression Tests for relationships between two (or more) continuous variables Important to consider both significance (P-value) and effect size (R²)
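A simple linear regression can be fitted by least squares, and R² then follows from the residual and total sums of squares. The sketch below uses a small hypothetical data set; the function name `linear_regression` is made up for this example, and the significance test is omitted.

```python
from statistics import mean


def linear_regression(x, y):
    """Least-squares slope, intercept, and R^2 for y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - (unexplained variation / total variation)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_residual / ss_total


# Hypothetical data: x = plant height, y = number of flowers
height = [1.0, 2.0, 3.0, 4.0, 5.0]
flowers = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept, r_squared = linear_regression(height, flowers)
print(f"flowers = {intercept:.2f} + {slope:.2f} * height, R^2 = {r_squared:.2f}")
```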
Report statistical results What’s important? Test used and assumptions tested Test statistic (t, F, R², χ², etc.) Significance (P-value) Sample size / degrees of freedom How to report results? Text Figures Tables
[Figure: Number of flowers per plant. ANOVA, F = 1.8, df = 1,83, P = 0.17]
Interpret your results Remember to relate results/tests to your original hypotheses Correlation ≠ causation (P > 0.05) ≠ bad Recognize trends even when not statistically significant Talk to your TAs if you have any questions
SPSS walkthrough Data entry and transformation Descriptive statistics Creating figures Analyses Chi-square (inbreeding data) t-test / ANOVA (inbreeding data) ANCOVA (tomato data) Correlation and regression (inbreeding data)