Introduction to Power Analysis
G. Quinn & M. Keough, 2003
Do not copy or distribute without permission of the authors.
Power of a test
– the probability of detecting an effect if it exists
– the probability of rejecting an incorrect H0
Power = 1 − β, where β is the Type II error rate
[Figure: distributions of the test statistic under H0 and HA, showing the region where H0 is retained vs the region where H0 is rejected, and the corresponding Type I and Type II errors]
Statistical power depends on:
Effect size (ES)
– the size of the difference between treatments
– large effects are easier to detect
Background variation
– variation between experimental units (σ², estimated by s²)
– the greater the background variability, the less likely we are to detect effects
Sample size (n) for each treatment group
– increasing the sample size makes effects easier to detect
Significance level (α)
– the Type I error rate
– as α decreases, β increases and power decreases
Power analysis
If α is fixed (usually at 0.05), then power depends on the effect size, the background variation and the sample size: power increases as ES√n / σ increases.
Exact formula depends on statistical test (i.e. different for t, F etc.)
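As a minimal sketch (not from the original slides), this is what such a calculation looks like for a two-sided, two-sample t-test: power is the area of the noncentral t distribution beyond the critical values of the central t distribution.

```python
# Minimal sketch: power of a two-sided, two-sample t-test.
from scipy import stats

def two_sample_power(es, sigma, n, alpha=0.05):
    """Power with n replicates per group, raw effect size es and SD sigma."""
    df = 2 * n - 2                          # residual degrees of freedom
    ncp = (es / sigma) * (n / 2) ** 0.5     # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # area of the noncentral t beyond the two critical values
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)

# hypothetical numbers: raw ES = 1.0, sigma = 1.0, n = 10 per group
print(round(two_sample_power(es=1.0, sigma=1.0, n=10), 2))
```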
[Figure: sampling distributions of common test statistics (z, χ², t, F)]
[Figure: power vs sample size, showing good returns at small n followed by diminishing returns as n increases]
A posteriori power
Post-hoc (a posteriori) power analysis
If the conclusion is non-significant, report the power of the experiment to detect the relevant effect size, i.e. solve the power equation for a specific ES.
Karban (1993) Ecology 74:9-19
Plant growth and reproduction in response to reduced herbivory. Two treatments:
– normal herbivore damage
– reduced herbivore damage
– n = 31 plants in each treatment
For plant growth: F1,60 = 0.51, P = 0.48 ns
Karban (1993) Ecology 74:9-19
Power to detect effects:
– small effect (ES = 0.10): power = 0.11
– medium effect (ES = 0.25): power = 0.50
– large effect (ES = 0.40): power = 0.88
Effect size (ES) = √(MSGroups / MSResidual), i.e. SDGroups / SDReps – see Cohen (1992)
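As a hedged illustration (not part of the original slides), these power values can be approximately reproduced with statsmodels by treating the tabulated ES values as Cohen's f for a one-way ANOVA with two groups of 31 plants:

```python
# Sketch: post-hoc power for the Karban (1993) growth comparison,
# assuming a one-way ANOVA with k = 2 groups and n = 31 plants per group.
from statsmodels.stats.power import FTestAnovaPower

solver = FTestAnovaPower()
for label, f in [("small", 0.10), ("medium", 0.25), ("large", 0.40)]:
    power = solver.power(effect_size=f, nobs=2 * 31, alpha=0.05, k_groups=2)
    print(f"{label} effect (f = {f}): power = {power:.2f}")
```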
A priori power analysis: sample size determination
To determine the appropriate sample size a priori, we need to know:
– what power we want
– the background variation (from a pilot study or previous literature)
– what ES we wish to be able to detect if it occurs
Solve power equation for n
Required sample size
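A standard approximation of this equation for a two-sample comparison (two-sided test, Type I error rate α, Type II error rate β) is:

```latex
% approximate replicates needed per group (two-sample t-test)
n \approx \frac{2\,(z_{\alpha/2} + z_{\beta})^{2}\,\sigma^{2}}{\mathrm{ES}^{2}}
```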
Example of a priori power analysis
Effects of fish predation on mudflat crabs. Two treatments:
– cage vs cage control
Pilot study:
– counts of crabs in 3 plots
– variance was 19 (so s² = 19)
– mean was 20
Aims:
– to detect a 50% increase in crab numbers due to caging, i.e. an increase from 20 to about 30, so ES = 50% (or 10 crabs per plot)
– to be 80% sure of picking up such an effect if it occurred, so power = 0.80
How many replicate plots are required for each treatment, i.e. what is the required n?
Basic power calculation (t-test): ES = 10 crabs per plot, s = √19 = 4.36
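As a hedged cross-check (not from the original slides), the same calculation can be run in statsmodels by standardizing the effect size:

```python
# Sketch for the crab example: replicates per treatment needed to detect
# a 10-crab difference with s = 4.36 (s^2 = 19) at 80% power, alpha = 0.05.
from statsmodels.stats.power import TTestIndPower

d = 10 / 4.36   # standardized effect size (raw ES / SD)
n_per_group = TTestIndPower().solve_power(effect_size=d, power=0.80, alpha=0.05)
print(n_per_group)   # continuous solution; round up to whole plots per treatment
```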
Detecting a more subtle effect: the target ES is halved.
If the data are more variable: the variance is doubled.
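A sketch (same hypothetical crab numbers) of how the required n responds when the target effect is halved or the variance is doubled; under the normal approximation, n scales with σ²/ES²:

```python
# Sketch: sensitivity of the required sample size to ES and variance
# (hypothetical crab numbers: ES = 10 crabs per plot, s^2 = 19).
from statsmodels.stats.power import TTestIndPower

def n_required(es, var, power=0.80, alpha=0.05):
    d = es / var ** 0.5   # standardized effect size
    return TTestIndPower().solve_power(effect_size=d, power=power, alpha=alpha)

print(n_required(es=10, var=19))   # baseline
print(n_required(es=5, var=19))    # target effect halved
print(n_required(es=10, var=38))   # variance doubled
```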
Minimum detectable effect size
If an ES can't be determined in advance:
– specify a target power and solve the power equation for ES
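A sketch of this reverse calculation: fix n, α and power, and solve for the smallest standardized effect that can be detected (the n = 10 replicates per group used here is an assumed value, not from the slides):

```python
# Sketch: minimum detectable standardized effect size for a fixed design.
from statsmodels.stats.power import TTestIndPower

d_min = TTestIndPower().solve_power(nobs1=10, power=0.80, alpha=0.05)
print(d_min)   # smallest Cohen's d detectable with 80% power at alpha = 0.05
```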
ES vs Sample Size (a priori)
Effect size: how big?
– what size of effect is biologically important?
– how big an effect do we want to detect if it occurs?
An effect size is…
– t-test (2-sample): the difference between means
– simple linear regression: the slope (or a change in slope)
– ANOVA (1-way): the differences between means
– χ² (2 × 2 table): the difference in proportions
Effect size: where from?
– biological knowledge
– previous work/literature
– compliance requirements (e.g. water quality)
Specification of effect size
Easy for 2 groups:
– the difference between the 2 means
[Figure: central and noncentral t distributions]
Specification of effect size
Harder for more than 2 groups. Consider 4 groups with a 50% difference from the smallest to the largest mean:
– μ1 = μ2 = μ3 < μ4?
– μ1 < μ2 < μ3 < μ4?
– μ1 = μ2 < μ3 = μ4?
The shape of the alternative distribution depends on the particular pattern (see the sketch after the figures below).
[Figure: the central F distribution, P(y)]
[Figure: one-way ANOVA with range of means = 10: power vs number of replicates for three patterns of means (3 vs 1, 2 vs 2, linear)]
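A sketch (assumed s and n, not from the slides) of why the pattern matters: patterns with the same range of means give different values of Cohen's f, and therefore different power for the same number of replicates.

```python
# Sketch: same range of group means (10), different patterns, different power.
import numpy as np
from statsmodels.stats.power import FTestAnovaPower

s = 10.0          # assumed within-group standard deviation
n_per_group = 8   # assumed replicates per group
patterns = {
    "3 vs 1": [10, 10, 10, 20],
    "linear": [10, 13.33, 16.67, 20],
    "2 vs 2": [10, 10, 20, 20],
}

solver = FTestAnovaPower()
for label, means in patterns.items():
    f = np.std(means) / s   # Cohen's f = SD of the group means / within-group SD
    power = solver.power(effect_size=f, nobs=4 * n_per_group,
                         alpha=0.05, k_groups=4)
    print(f"{label}: f = {f:.2f}, power = {power:.2f}")
```

With these assumed numbers the "2 vs 2" pattern spreads the means most widely, so it has the largest f and the highest power.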
Estimate of variance
– other work on the same system
– published work on similar systems
– pilot studies
It must be an estimate of the same kind of variance
– e.g. paired vs two-sample t-test: the variance of the differences vs the variance of each sample
Power analysis requires:
– a clear understanding of the kind of statistical model to be used (including the formal tests)
– careful thought about the important effects; this is the hardest step, especially for interactions
– an estimate of variance
– the significance level to be used
– the desired level of confidence (power)
– an understanding of non-centrality parameters for complex designs
Cautions
– variance estimates may be uncertain: allow for extra samples in case of larger-than-expected variation
– terminology varies: "realized" (post-hoc) power, Cohen's effect sizes, raw vs standardized ES
Options for study planning: n
Inputs: level of significance, desired power, target effect size, estimate of variation
Output: the calculated sample size, or a curve of power vs n
Add a "safety" factor to the calculated n.
Options for planning: ES
Inputs: significance level, desired level of confidence, estimate of variation, suggested sample size (or range)
Output: the effect that should be detected, or a curve of ES vs n
Power calculations
Charts & tables are available in many books
Software will do these calculations:
– Gpower
– PiFace & Java applets
– review at:
– some statistical packages (but check what they do!)
Power vs ES (post hoc)