
1 Statistics review
Basic concepts: variability measures, distributions, hypotheses, types of error
Common analyses: t-tests, one-way ANOVA, two-way ANOVA, regression

2 Variance
Ecological rule #1: everything varies… but how much does it vary?

3 Variance
s² = Σ(xᵢ − x̄)² / (n − 1), where x̄ is the sample mean

4 Variance
s² = Σ(xᵢ − x̄)² / (n − 1)

5 Variance
s² = Σ(xᵢ − x̄)² / (n − 1). What is the variance of 4, 3, 3, 2? Answer: 2/3
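As a quick check of the arithmetic above, a minimal Python sketch (standard library only) that computes the sample variance with the n − 1 denominator:

```python
# Sample variance, matching s^2 = sum((x_i - xbar)^2) / (n - 1)
def sample_variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

print(sample_variance([4, 3, 3, 2]))  # 0.666..., i.e. 2/3, as on the slide
```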

6 Variance variants
1. Standard deviation (s, or SD) = √(variance). Advantage: expressed in the same units as the data.

7 Variance variants
2. Standard error (S.E.) = s / √n. Advantage: indicates the reliability of the estimate of the mean.

8 How to report
We observed 29.7 (± 5.3) grizzly bears per month (mean ± S.E.).
A mean (± SD) of 29.7 (± 7.4) grizzly bears was seen per month.
[Figure: bar with error bars marking +1 and −1 S.E. or SD.]
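A minimal sketch of how the SD and S.E. would be computed and reported; the monthly counts here are invented for illustration, not the data behind the 29.7 figure:

```python
import math

counts = [31, 24, 38, 27, 35, 22, 30, 29, 33, 28]  # hypothetical monthly bear counts

n = len(counts)
mean = sum(counts) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in counts) / (n - 1))  # standard deviation
se = sd / math.sqrt(n)                                          # standard error of the mean

print(f"We observed {mean:.1f} (+/- {se:.1f}) bears per month (mean +/- S.E.)")
print(f"A mean (+/- SD) of {mean:.1f} (+/- {sd:.1f}) bears was seen per month")
```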

9 Distributions
Normal: quantitative (continuous) data. Poisson: count (frequency) data.

10 Normal distribution
About 68% of the data fall within 1 SD of the mean.

11 Poisson distribution
Mostly, nothing happens (lots of zeros). [Figure: Poisson frequency distribution with the mean marked.]

12 Poisson distribution
Frequency (count) data; lots of zero (or minimum-value) observations; variance increases with the mean.

13 What do you do with Poisson data?
Correct for the correlation between mean and variance by log-transforming y (but log(0) is undefined!)
Use non-parametric statistics (but low power)
Use a generalized linear model specifying a Poisson distribution
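As one possible illustration of the third option, a hedged sketch of fitting a Poisson generalized linear model with statsmodels; the variable names (seedlings, light) and the data are invented for this example:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical count data: seedlings per plot vs. a continuous predictor
df = pd.DataFrame({
    "seedlings": [0, 0, 1, 0, 2, 3, 0, 5, 1, 0, 4, 7],
    "light":     [1, 2, 3, 2, 5, 6, 1, 8, 4, 2, 7, 9],
})

# Generalized linear model with a Poisson error distribution (log link by default)
model = smf.glm("seedlings ~ light", data=df, family=sm.families.Poisson()).fit()
print(model.summary())
```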

14 Hypotheses Null (Ho): no effect of our experimental treatment, “status quo” Alternative (Ha): there is an effect

15 Whose null hypothesis? The conditions for rejecting Ho are very strict, whereas accepting Ho is easy (just a matter of not finding grounds to reject it). A criminal trial? Environmental protection? Industrial viability? Exotic plant species? The WTO?

16 Hypotheses Null (Ho) and alternative (Ha): always mutually exclusive
So if Ha is treatment>control…

17 Types of error
             Reject Ho       Accept Ho
Ho true      Type 1 error    (correct)
Ho false     (correct)       Type 2 error

18 Types of error
We usually ensure only a 5% chance of a type 1 error (i.e. alpha = 0.05). The ability to avoid a type 2 error is called power.
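For a concrete sense of the alpha/power trade-off, a small sketch using statsmodels' power calculator for a two-sample t-test; the effect size of 0.5 is an arbitrary example value:

```python
from statsmodels.stats.power import TTestIndPower

# Sample size per group needed to detect a medium effect (Cohen's d = 0.5)
# with alpha = 0.05 (5% chance of type 1 error) and power = 0.8
# (80% chance of avoiding a type 2 error)
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # roughly 64 per group
```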

19 Statistics review
Basic concepts: variability measures, distributions, hypotheses, types of error
Common analyses: t-tests, one-way ANOVA, two-way ANOVA, regression

20 The t-test
Asks: do two samples come from different populations? [Figure: data for samples A and B; Ho says no (one population), the alternative says yes (two populations).]

21 The t-test
Depends on whether the difference between samples is much greater than the difference within samples. [Figure: samples A and B where the between-sample difference >> within-sample variation.]

22 The t-test
Depends on whether the difference between samples is much greater than the difference within samples. [Figure: samples A and B where the between-sample difference < within-sample variation.]

23 The t-test
t-statistic = (difference between means) / (standard error within each sample)
            = (x̄₁ − x̄₂) / √(sp²/n₁ + sp²/n₂), where sp² is the pooled variance

24 The t-test
How many degrees of freedom? df = (n₁ − 1) + (n₂ − 1)
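A minimal sketch of the same calculation with scipy; the two small samples are invented for illustration:

```python
from scipy import stats

a = [4.1, 5.3, 4.8, 5.0, 4.6]  # hypothetical sample A
b = [5.9, 6.4, 5.7, 6.8, 6.1]  # hypothetical sample B

# Two-sample t-test with pooled variance; df = (n1 - 1) + (n2 - 1) = 8
t_stat, p_value = stats.ttest_ind(a, b, equal_var=True)
print(t_stat, p_value)
```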

25 T-tables
v          0.10    0.05    0.025
1          3.078   6.314   12.706
2          1.886   2.920   4.303
3          1.638   2.353   3.182
4          1.533   2.132   2.776
infinity   1.282   1.645   1.960
Two samples, each n = 3, with a t-statistic of 2.50: significantly different?

26 T-tables
Two samples, each n = 3, with a t-statistic of 2.50: significantly different? No! With df = (3 − 1) + (3 − 1) = 4, t must exceed 2.776 (the 0.025 column, i.e. two-tailed alpha = 0.05).
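The table lookup can be reproduced with scipy; a small sketch, where t.ppf(0.975, df) gives the two-tailed 5% critical value (the 0.025 column above):

```python
from scipy import stats

df = (3 - 1) + (3 - 1)               # two samples, each n = 3
critical_t = stats.t.ppf(0.975, df)  # upper 2.5% point, i.e. two-tailed alpha = 0.05
print(critical_t)                    # about 2.776, so t = 2.50 is not significant
```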

27 If you have two samples with similar n and S.E., why do you know instantly that they are not significantly different if their error bars overlap? (Refer to the t-table on slide 25.)

28 If you have two samples with similar n and S.E., why do you know instantly that they are not significantly different if their error bars overlap?
If the error bars (± 1 S.E.) overlap, the difference in means is < 2 × S.E., i.e. the t-statistic is < 2, and for any df t must be > 1.96 to be significant. Careful! It doesn't work the other way around: if the error bars do not overlap, you cannot conclude that the samples are significantly different.

29 One-way ANOVA
A generalization of the t-test; can have more than 2 samples. Ho: all samples are the same. Ha: at least one sample is different.

30 One-way ANOVA
A generalization of the t-test; can have more than 2 samples. [Figure: data for samples A, B and C; under Ho all come from one population, under Ha at least one differs.]

31 One-way ANOVA
Just like the t-test, it compares differences between samples to differences within samples.
t-test statistic (t) = difference between means / standard error within sample
ANOVA statistic (F) = MS between groups / MS within groups
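A minimal sketch of a one-way ANOVA in scipy on three invented samples:

```python
from scipy import stats

a = [2.1, 2.5, 1.9, 2.3, 2.2]  # hypothetical growth rates, species A
b = [2.8, 3.1, 2.9, 3.3, 3.0]  # species B
c = [2.0, 2.2, 2.4, 2.1, 1.8]  # species C

# F = MS between groups / MS within groups
f_stat, p_value = stats.f_oneway(a, b, c)
print(f_stat, p_value)
```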

32 Mean squares
MS = sum of squares / df. Analogous to variance.

33 Variance: s² = Σ(xᵢ − x̄)² / (n − 1). The numerator is the sum of squared differences (the sum of squares).

34 ANOVA tables
                             df       SS     MS     F          p
Treatment (between groups)   df(X)    SSX    MSX    MSX/MSE    look up!
Error (within groups)        df(E)    SSE    MSE
Total                        df(T)    SST

35 Do three species of palms differ in growth rate? We have 5 observations per species. Complete the table!
                             df        SS     MS    F    p
Treatment (between groups)             69
Error (within groups)        k(n−1)
Total                                  104

36 Hint: for the total df, remember that we calculate total SS as if there were no groups… (same incomplete table as slide 35)

37 At alpha = 0.05, the critical value F(2,12) = 3.89
                             df    SS     MS      F      p
Treatment (between groups)   2     69     34.5    11.8   ?
Error (within groups)        12    35     2.92
Total                        14    104
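The critical value and the missing p-value in the table can be filled in with scipy's F distribution; a small sketch using the sums of squares from the slide:

```python
from scipy import stats

df_treatment, df_error = 2, 12
f_observed = 34.5 / 2.92                                # MS treatment / MS error, about 11.8

print(stats.f.ppf(0.95, df_treatment, df_error))        # critical F(2,12), about 3.89
print(stats.f.sf(f_observed, df_treatment, df_error))   # p-value, well below 0.05
```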

38 Two-way ANOVA
Just like one-way ANOVA, except it subdivides the treatment SS into: treatment 1, treatment 2, and the interaction 1 × 2.

39 Two-way ANOVA
Suppose we wanted to know if moss grows thicker on the north or south side of trees, and we look at 10 aspen and 10 fir trees:
Aspect (2 levels, so 1 df)
Tree species (2 levels, so 1 df)
Aspect × species interaction (1 df × 1 df = 1 df)
Error? k(n − 1) = 4 × (10 − 1) = 36

40 Two-way ANOVA table
                        df    SS            MS            F
Aspect                  1     SS(Aspect)    MS(Aspect)    MS(Aspect)/MSE
Species                 1     SS(Species)   MS(Species)   MS(Species)/MSE
Aspect × Species        1     SS(Int)       MS(Int)       MS(Int)/MSE
Error (within groups)   36    SSE           MSE
Total                   39    SST
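A hedged sketch of how such a table could be produced with statsmodels; the column names (thickness, aspect, species) and the tiny data frame are invented stand-ins for the moss example:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One row per measurement: moss thickness, aspect of the tree, tree species
df = pd.DataFrame({
    "thickness": [3.1, 2.4, 4.0, 2.9, 3.5, 2.2, 4.2, 3.0],
    "aspect":    ["north", "south"] * 4,
    "species":   ["aspen"] * 4 + ["fir"] * 4,
})

# Two-way ANOVA with the aspect x species interaction
model = smf.ols("thickness ~ aspect * species", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```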

41 Interactions
A combination of treatments gives a non-additive effect. [Figure: response on north vs. south aspects for alder and fir.]

42 Interactions
A combination of treatments gives a non-additive effect: anything not parallel! [Figure: two panels plotting responses against north vs. south.]

43 Careful! If you log-transformed your variables, the absence of an interaction on the log scale corresponds to a multiplicative effect on the original scale: log(a) + log(b) = log(ab). [Figure: the same data plotted as y and as log(y) against north vs. south.]

44 Regression Problem: to draw a straight line through the points that best explains the variance

45 Regression Problem: to draw a straight line through the points that best explains the variance

46 Regression Problem: to draw a straight line through the points that best explains the variance

47 Regression
Test with F, just like ANOVA:
F = (variance explained by the x-variable / df) / (variance still unexplained / df)
Variance explained = the change in squared line lengths; variance unexplained = the residual squared line lengths (the vertical distances from the points to the line).

48 Regression
Test with F, just like ANOVA:
F = (variance explained by the x-variable / df) / (variance still unexplained / df)
In regression, each x-variable will normally have 1 df.

49 Regression
Test with F, just like ANOVA:
F = (variance explained by the x-variable / df) / (variance still unexplained / df)
Essentially a cost-benefit analysis: is the benefit in variance explained worth the cost of using up degrees of freedom?

50 Regression example
Total variance for 32 data points is 300 units. An x-variable is then regressed against the data, accounting for 150 units of variance. What is the R²? What is the F ratio?

51 Regression example
Total variance for 32 data points is 300 units. An x-variable is then regressed against the data, accounting for 150 units of variance. What is the R²? What is the F ratio?
R² = 150/300 = 0.5
F(1,30) = (150/1) / ((300 − 150)/30) = 150/5 = 30
Why is df error = 30?
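The arithmetic above can be checked directly; a small sketch, assuming only the sums of squares given on the slide:

```python
from scipy import stats

n = 32
ss_total, ss_explained = 300.0, 150.0
ss_residual = ss_total - ss_explained          # 150 units still unexplained

df_model, df_error = 1, n - 2                  # 32 points - slope - intercept = 30
r_squared = ss_explained / ss_total            # 0.5
f_ratio = (ss_explained / df_model) / (ss_residual / df_error)

print(r_squared, f_ratio)                      # 0.5 and 30.0
print(stats.f.sf(f_ratio, df_model, df_error)) # corresponding p-value
```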

52 ANCOVA
In regression, x-variables can be continuous or categorical. E.g. to convert a treatment (size) with two levels (small, large) into a regression variable, we could code small = 0, large = 1. Then Y = B × size + constant, where the constant is the mean value for small and B is the difference between large and small. Test the significance of "size" in the regression.

53 ANCOVA
In an analysis of covariance, we look at the effect of a treatment (categorical) while accounting for a covariate (continuous). [Figure: growth rate (g/day) vs. plant height (cm) for plants fertilized with N+P and with N only.]

54 ANCOVA
1. Fit the full model (categorical treatment, covariate, interaction)
2. Test for the interaction (if significant, stop!)
3. Test for differences in intercept
[Figure: growth rate (g/day) vs. plant height (cm) for the N+P and N treatments: no interaction (parallel slopes), but the intercepts differ.]
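A hedged sketch of this ANCOVA procedure with statsmodels; the column names (growth, height, fert) and the data are invented for illustration:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: growth rate (g/day), plant height (cm), fertilizer treatment
df = pd.DataFrame({
    "growth": [1.2, 1.5, 1.9, 2.3, 2.6, 1.6, 2.0, 2.4, 2.8, 3.1],
    "height": [10, 15, 20, 25, 30, 10, 15, 20, 25, 30],
    "fert":   ["N"] * 5 + ["N+P"] * 5,
})

# 1. Fit the full model: treatment, covariate, and their interaction
full = smf.ols("growth ~ height * fert", data=df).fit()
print(sm.stats.anova_lm(full, typ=2))  # 2. if the height:fert interaction is significant, stop

# 3. Otherwise drop the interaction and test for a difference in intercepts
reduced = smf.ols("growth ~ height + fert", data=df).fit()
print(reduced.summary())
```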

