
1 POPULATION DYNAMICS
Required background knowledge:
- Data and variability concepts
- Data collection
- Measures of central tendency and dispersion (mean, median, mode, variance, stdev)
- Normal distribution and SE
- Student's t-test and 95% confidence intervals
- Chi-Square tests
- MS Excel

2 STATISTICS: z-DISTRIBUTION
IF n is very, very large, we use the Z distribution to calculate normal deviates:
$Z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
If n is not large, we must use the t distribution:
$t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$ (Equation 3)

3 But first... WHY do we do all this? It is an integral part of science: HYPOTHESIS TESTING
- Observation: a pattern, rigorously described
- Model: an explanation or theory (maybe more than one)
- Hypothesis: a prediction deduced from the model
- Null hypothesis (H0): generated as a falsification test
- Test: an experiment
IF H0 rejected: model supported
IF H0 accepted: model wrong

4 HYPOTHESIS TESTING
1. Collect data
2. Analyse data
3. Set up hypotheses:
   H0 = results are due to CHANCE alone
   H1 = results are significant and are not due to chance alone
4. Test hypotheses:
   - Determine the significance level for hypothesis testing (α, termed 'alpha'). Usually either α = 0.05 or α = 0.01
   - Calculate the probability value (p)
   - If p < α then reject H0; accept H1 (i.e. results are significant and are NOT due to chance alone)
   - If p > α then accept H0; reject H1 (i.e. results are not significant and may be due to chance alone)
The p-value is a measure of certainty on a scale that runs from 1.00 down through 0.05 to 0.01: a p-value above α is not significant, a p-value below α is significant. These are proportions. If expressed as %:
   - p < 0.05: you can say with 95% certainty that the pattern you have observed is not due to chance alone
   - p < 0.01: you can say with 99% certainty that the pattern you have observed is not due to chance alone
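The decision rule in step 4 can be sketched in a few lines of Python. The function name `decide` and the wording of the returned labels are illustrative assumptions, not part of the slides; "do not reject H0" is used here as the more cautious way of saying "accept H0".

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be a probability between 0 and 1")
    if p_value < alpha:
        return "reject H0"         # significant at this alpha
    return "do not reject H0"      # not significant at this alpha


print(decide(0.03))                # tested at alpha = 0.05
print(decide(0.03, alpha=0.01))    # same p-value, stricter alpha
```

The same p-value can be significant at α = 0.05 and not significant at α = 0.01, which is why α is fixed before the test.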


6 First, some important concepts about t-tests…

7 STATISTICS: t-DISTRIBUTION
Because it is based on the normal distribution, the t distribution has all the attributes of the normal distribution:
- Completely symmetrical
- Area under any part of the curve reflects the proportion of t values involved
- etc.
The shape of the t distribution varies with v (degrees of freedom: n - 1): the bigger the n, the less spread the distribution.
[Figure: frequency (%) histogram of height (mm); t-distribution curves for v = 1, 5, 10 and 100]
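To see how the shape changes with v, the sketch below evaluates the standard Student's t density at the centre (t = 0) and in the tail (t = 3) for the four values of v shown on the slide. The helper name `t_pdf` is an assumption made for this illustration.

```python
import math


def t_pdf(t: float, v: int) -> float:
    """Probability density of Student's t with v degrees of freedom."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)


for v in (1, 5, 10, 100):
    # A higher peak at t = 0 and a lower density at t = 3 mean less spread.
    print(v, round(t_pdf(0.0, v), 4), round(t_pdf(3.0, v), 4))
```

As v grows, the peak rises and the tails thin out, so the curve approaches the normal distribution.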

8 STATISTICS: t-DISTRIBUTION CONCEPTS
Tails of the t-distribution
Example: if our sample size is 11 (v = 10), what is the value of t beyond which 10% (0.1) of the curve is enclosed? There are two possible t-values:
- One-tailed hypothesis testing, α(1): all 0.1 lies in one tail, e.g. H0: μ = 25, H1: μ < 25
OR
- Two-tailed hypothesis testing, α(2): 0.05 lies in each tail, e.g. H0: μ = 25, H1: μ ≠ 25

9 STATISTICS: T-DISTRIBUTION: CONCEPTS
Critical values
Just as the p-value is compared with α, the t-statistic, $t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, is compared with a critical t-value: on one side of the critical value the result is not significant, on the other side it is significant.
Example: for α(2) = 0.05 the critical values are -2.064 and 2.064. If the t-statistic > 2.064 OR < -2.064 then reject H0; accept H1 (i.e. results are significant and are NOT due to chance alone).

10 STATISTICS: T-DISTRIBUTION: CONCEPTS
Critical values are found on the t-tables

 v   α(2):  0.5     0.2     0.1     0.05
     α(1):  0.25    0.1     0.05    0.025
 1          1.000   3.078   6.314   12.706
 2          0.816   1.886   2.920   4.303
 3          0.765   1.638   2.353   3.182
 4          0.741   1.533   2.132   2.776
 5          0.727   1.476   2.015   2.571
 6          0.718   1.440   1.943   2.447
 7          0.711   1.415   1.895   2.365
 8          0.706   1.397   1.860   2.306
 9          0.703   1.383   1.833   2.262
 10         0.700   1.372   1.812   2.228
 11         0.697   1.363   1.796   2.201
 12         0.695   1.356   1.782   2.179
 13         0.694   1.350   1.771   2.160
 14         0.692   1.345   1.761   2.145
 15         0.691   1.341   1.753   2.131
 16         0.690   1.337   1.746   2.120
 17         0.689   1.333   1.740   2.110
 18         0.688   1.330   1.734   2.101
 19         0.688   1.328   1.729   2.093
 20         0.687   1.325   1.725   2.086
 21         0.686   1.323   1.721   2.080
 22         0.686   1.321   1.717   2.074
 23         0.685   1.319   1.714   2.069
 24         0.685   1.318   1.711   2.064
 25         0.684   1.316   1.708   2.060

If our sample size is 11 (v = 10), what is the value of t beyond which 10% (0.1) of the curve is enclosed (i.e. what is the critical value of t)?
- One-tailed, α(1) = 0.1, v = 10: critical value 1.372 (shown as -1.372 for the lower tail)
- Two-tailed, α(2) = 0.1 (0.05 in each tail), v = 10: critical values -1.812 and 1.812
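A table lookup can be cross-checked numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and then searches by bisection for the t-value that leaves the required area in the tail. The function names, the number of integration steps and the search strategy are all assumptions made for this illustration; a statistics package would normally be used instead.

```python
import math


def t_pdf(t: float, v: int) -> float:
    """Probability density of Student's t with v degrees of freedom."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)


def upper_tail_area(t: float, v: int, steps: int = 2000) -> float:
    """Area under the curve beyond t (t >= 0), via Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, v) + t_pdf(t, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 - total * h / 3


def critical_t(alpha: float, v: int, tails: int = 2) -> float:
    """Positive t-value that leaves alpha / tails of the curve in one tail."""
    target = alpha / tails
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, v) > target:   # widen until the answer is bracketed
        lo, hi = hi, hi * 2
    for _ in range(60):                      # bisection
        mid = (lo + hi) / 2
        if upper_tail_area(mid, v) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(critical_t(0.10, 10, tails=1), 3))   # one-tailed, v = 10
print(round(critical_t(0.10, 10, tails=2), 3))   # two-tailed, v = 10
print(round(critical_t(0.05, 24, tails=2), 3))   # two-tailed, v = 24
```

The three calls correspond to the table entries for α(1) = 0.1 and α(2) = 0.1 at v = 10, and α(2) = 0.05 at v = 24.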

11 Steps of Student t-tests:
1. Establish hypotheses (determine if the test is one-tailed or two-tailed)
   One tail: H1 has > or < in it
   Two tail: H1 has ≠ in it
2. Determine: n, x̄, μ, s and v (n - 1)
3. Calculate the t-statistic using $t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$
4. Determine the significance level for hypothesis testing (α, termed 'alpha'). Usually either α = 0.05 or α = 0.01 (the total tail area: all in one tail for a one-tailed test, split equally between the two tails for a two-tailed test)
5. Find the critical value of t, $t_{\alpha(1\text{ or }2),\,v}$, by looking it up in the t-table
6. Compare the t-statistic with the critical value to know if you should accept or reject H0

12 STATISTICS: T-DISTRIBUTION: EXAMPLE
OBSERVATION MADE: The mean nitrate concentration of water in all the upstream tributaries of a large river prior to intensive agriculture is 22 mg.l⁻¹. Afterwards the mean nitrate concentration in 25 of these tributaries is 24.23 mg.l⁻¹ and s = 4.24 mg.l⁻¹.
Nitrate (before agriculture): μ = 22 mg.l⁻¹, n = ALL tributaries
Nitrate (after agriculture): x̄ = 24.23 mg.l⁻¹, n = 25 sample tributaries
Based on this observation we want to determine if the intensification of agricultural practices has resulted in a significant change to the nitrate concentration of the freshwater resources.
HOW? We need to determine the probability that the sample (n = 25, x̄ = 24.23 mg.l⁻¹) could be randomly generated from a population with μ = 22 mg.l⁻¹.

13 Student t-tests: steps for calculation
What is the probability that the sample (n = 25, x̄ = 24.23 mg.l⁻¹) could be randomly generated from a population with μ = 22 mg.l⁻¹?
1. Establish hypotheses: H0: μ = 22, H1: μ ≠ 22
2. Determine n, x̄, μ, s and v (n - 1): n = 25, x̄ = 24.23, μ = 22.00, s = 4.24, v = 24
3. Calculate the t-statistic:
   $s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{4.24}{\sqrt{25}} = \dfrac{4.24}{5} = 0.848$
   $t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}} = \dfrac{24.23 - 22}{0.848} = \dfrac{2.23}{0.848} = 2.629$
4. Determine the significance level (α): either α = 0.05 or α = 0.01. Here α = 0.05
5. Find the critical value of t, $t_{\alpha(1\text{ or }2),\,v}$, from the t-table. One tail or two tail? Go to the hypotheses: H0: μ = 22, H1: μ ≠ 22, so the test is two-tailed (0.025 in each tail) and the value to look up is $t_{0.05(2),\,24}$
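The arithmetic in steps 2 and 3 can be reproduced with a short Python sketch. The variable names are assumptions chosen for readability; the numbers are the ones given in the nitrate example.

```python
import math

n = 25           # tributaries sampled after agriculture
x_bar = 24.23    # sample mean nitrate concentration (mg/l)
mu = 22.0        # population mean before agriculture (mg/l)
s = 4.24         # sample standard deviation (mg/l)

v = n - 1                      # degrees of freedom
se = s / math.sqrt(n)          # standard error of the mean
t_stat = (x_bar - mu) / se     # t-statistic

print(v, round(se, 3), t_stat)
```

The unrounded t-statistic is a little under 2.63, which the slides report as 2.629.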

14 The critical value of $t_{0.05(2),\,24}$ = 2.064, i.e. 0.025 of the curve lies below -2.064 and 0.025 lies above 2.064

15 STATISTICS: T-DISTRIBUTION: EXAMPLE
What is the probability that the sample (n = 25, x̄ = 24.23 mg.l⁻¹) could be randomly generated from a population with μ = 22 mg.l⁻¹?
1. Establish hypotheses: H0: μ = 22, H1: μ ≠ 22
2. Determine n, x̄, μ, s and v (n - 1): n = 25, x̄ = 24.23, μ = 22.00, s = 4.24, v = 24
3. Calculate the t-statistic: t = 2.629
4. Determine significance level (α): α = 0.05
5. Find the critical value of t: critical value = 2.064
6. Compare t-statistic with critical value: t = 2.629 > critical value
SO… this means it is very unlikely that a random sample (size 25) would generate a mean of 24.23 mg.l⁻¹ from a population with a mean of 22 mg.l⁻¹. So unlikely, in fact, that we don't believe it can happen by chance… Reject H0 and accept H1
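Step 6 can also be expressed as a p-value. The sketch below compares the t-statistic with the tabulated critical value and then estimates the two-tailed p-value by integrating the t density with Simpson's rule. The helper names and the integration settings are assumptions for this illustration, not something the slides prescribe.

```python
import math


def t_pdf(t: float, v: int) -> float:
    """Probability density of Student's t with v degrees of freedom."""
    coef = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return coef * (1 + t * t / v) ** (-(v + 1) / 2)


def two_tailed_p(t_stat: float, v: int, steps: int = 2000) -> float:
    """Area in both tails beyond |t_stat|, via Simpson's rule on [0, |t_stat|]."""
    t_abs = abs(t_stat)
    if t_abs == 0:
        return 1.0
    h = t_abs / steps
    total = t_pdf(0.0, v) + t_pdf(t_abs, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    central_half = total * h / 3          # area between 0 and |t_stat|
    return 2 * (0.5 - central_half)


t_stat, v, alpha = 2.629, 24, 0.05
critical = 2.064                          # t 0.05(2), 24 from the t-table

reject_h0 = abs(t_stat) > critical        # critical-value comparison
p_value = two_tailed_p(t_stat, v)         # equivalent p-value comparison
print(reject_h0, p_value < alpha)
```

Both comparisons lead to the same decision: the t-statistic lies beyond the critical value exactly when the p-value is below α.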

16 STATISTICS: T-DISTRIBUTION: EXAMPLES
Nitrate (before agriculture): μ = 22 mg.l⁻¹, n = ALL tributaries
Nitrate (after agriculture): x̄ = 24.23 mg.l⁻¹, n = 25 sample tributaries
What we can then say is that the before and after nitrate levels in the water are (statistically) significantly different from each other (p < 0.05).
We are not making any judgment about whether there is more nitrate in the water after than before, only that the concentrations are different… though some things are self-evident!

17 Now you try…
25 intertidal crabs were exposed to air at 24.3 °C, and their body temperatures were measured.
Q: Is the mean body temperature of this species of crab the same as the ambient air temperature of 24.3 °C?
H0: μ = 24.3 °C, i.e. crab body temp is NOT different from ambient temp
H1: μ ≠ 24.3 °C, i.e. crab body temp IS different from ambient temp
Student-t steps to follow:
1. Establish hypotheses
2. Determine: n, x̄, μ, s and v (n - 1)
3. Calculate the t-statistic
4. Determine significance level (α)
5. Find the critical value of t
6. Compare t-statistic with critical value

18 Now you try…
25 intertidal crabs were exposed to air at 24.3 °C, and their body temperatures were measured.
Q: Is the mean body temperature of this species of crab the same as the ambient air temperature of 24.3 °C?
Switch to Excel and do the calculations

 Crab ID   Body temp (°C)
 1         25.80
 2         24.60
 3         26.10
 4         22.90
 5         25.10
 6         27.30
 7         24.00
 8         24.50
 9         23.90
 10        26.20
 11        24.30
 12        24.60
 13        23.30
 14        25.50
 15        28.10
 16        24.80
 17        23.50
 18        26.30
 19        25.40
 20        25.50
 21        23.90
 22        27.00
 23        24.80
 24        22.90
 25        25.40
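For readers without Excel, the same calculation can be sketched in Python with the standard `statistics` module. The list below is the crab data from the slide in order of crab ID (1 to 25); the variable names are assumptions.

```python
import math
import statistics

# Body temperatures (deg C) for crab IDs 1 to 25.
body_temp = [25.8, 24.6, 26.1, 22.9, 25.1, 27.3, 24.0, 24.5, 23.9, 26.2,
             24.3, 24.6, 23.3, 25.5, 28.1, 24.8, 23.5, 26.3, 25.4, 25.5,
             23.9, 27.0, 24.8, 22.9, 25.4]
mu = 24.3                            # ambient air temperature under H0

n = len(body_temp)
v = n - 1                            # degrees of freedom
x_bar = statistics.mean(body_temp)   # sample mean
s = statistics.stdev(body_temp)      # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)                # standard error of the mean
t_stat = (x_bar - mu) / se

print(n, v, round(x_bar, 3), round(s, 3), round(se, 3), round(t_stat, 4))
```

`statistics.stdev` uses the n - 1 divisor, which matches the sample standard deviation s used throughout these slides.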

19 Now you try…
Q: Is the mean body temperature of this species of crab the same as the ambient air temperature of 24.3 °C?
Results so far:
3. t-statistic: t = 2.7128
4. Significance level: α = 0.05
5. Critical value of t to look up: $t_{\alpha(1\text{ or }2),\,v}$

20 Critical value to look up in the t-table: $t_{0.05(2),\,v}$

21 Now you try…
Q: Is the mean body temperature of this species of crab the same as the ambient air temperature of 24.3 °C?
H0: μ = 24.3 °C [i.e. crab body temp is NOT different from ambient temp]
H1: μ ≠ 24.3 °C [i.e. crab body temp IS different from ambient temp]
α = 0.05
t = 2.713 (2.7128)
Critical value = 2.064 (0.025 of the curve lies beyond each of -2.064 and 2.064)
t = 2.7128 > critical value = 2.064, so the t-statistic falls in the upper rejection region: REJECT H0


23 STATISTICS: 95% CONFIDENCE INTERVALS
When we express dispersion around some measure of central tendency, we normally use Standard Deviation: x̄ ± s
The t-distribution allows us to calculate the 95% (or 99%) confidence intervals around an estimate of the population mean. In other words, what are the limits around our estimate of the population mean WITHIN which we can be 95% (or 99%) confident that the REAL value of the population mean lies?
To do this, we need a set of t-tables (two-tailed, α(2), with 0.025 in each tail for a 95% interval), v (n - 1) and $s_{\bar{x}}$

24 STATISTICS: 95% CONFIDENCE INTERVALS
To do this, we need a set of t-tables, v (n - 1) and $s_{\bar{x}}$
IF x̄ = 42.3 mm, n = 26 (v = 25) and $s_{\bar{x}}$ = 2.15
Then the 95% Confidence Interval (CI) around the mean is calculated as:
$s_{\bar{x}} \times t_{\alpha(2),\,v} = 2.15 \times 2.06 = 4.429$
The Confidence Interval expression is then written as: 42.3 mm ± 4.43 mm
i.e. we are 95% confident that μ lies between 37.87 and 46.73 mm (x̄ - 4.43 mm and x̄ + 4.43 mm, with 0.025 of the curve beyond each limit)
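The confidence interval arithmetic can be checked with a minimal Python sketch. The variable names are assumptions; the critical value 2.06 is the two-tailed table value for α = 0.05 and v = 25 used on the slide.

```python
x_bar = 42.3     # sample mean (mm)
se = 2.15        # standard error of the mean
t_crit = 2.06    # t 0.05(2), 25 from the t-table

margin = se * t_crit                               # half-width of the 95% CI
lower, upper = x_bar - margin, x_bar + margin      # limits of the 95% CI

print(f"{x_bar} mm ± {margin:.2f} mm ({lower:.2f} to {upper:.2f} mm)")
```

For a 99% interval the only change would be the critical value, looked up at α(2) = 0.01 instead of 0.05.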


26 Understanding stats…
Types of Data
The type of data collected influences their statistical analysis:
- Nominal data: gender, colour, species, genus, class, town, country, model etc. (e.g. Male, Female; Blue, Red, Black, White)
- Continuous data: concentration, depth, height, weight, temperature, rate etc. (e.g. 121.34 g, 162.18 g and 180.01 g on a scale from 100 g to 200 g)
- Discrete data: numbers per unit space, numbers per entity etc. (e.g. 5 people)

27 Understanding stats…
1. DATA type: nominal, continuous or discrete
2. Distribution: normal, binomial, Poisson… etc.
3. Data type + distribution determine the choice of statistical test: z-tests, t-tests, ANOVA… etc., or chi-squared
Data do NOT have to be normally distributed


29 Testing Patterns in Discrete (count) Data: the Chi-Square Test
Examples of count data:
- Number of petals per flower
- Number of segments per insect leg
- Number of worms per quadrat
- Number of white cars on campus… etc.
You can convert continuous data to discrete data by assigning data to data classes.
[Figure: frequency histogram of height (m), in classes from 1.2 m to 2.2 m]
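Assigning continuous measurements to data classes can be sketched as follows. The heights are hypothetical values invented for this illustration (they are not from the slides), and they are stored in whole centimetres so that the class boundaries are exact.

```python
from collections import Counter

# Hypothetical heights in cm (not from the slides).
heights_cm = [123, 148, 152, 155, 161, 164, 167, 171, 178, 193]
CLASS_WIDTH_CM = 10


def height_class(h_cm: int) -> str:
    """Label of the 10 cm class that a height falls into, e.g. '160-169'."""
    lower = (h_cm // CLASS_WIDTH_CM) * CLASS_WIDTH_CM
    return f"{lower}-{lower + CLASS_WIDTH_CM - 1}"


# Count how many measurements fall into each class.
counts = Counter(height_class(h) for h in heights_cm)
for label in sorted(counts):
    print(label, counts[label])
```

The resulting class counts are discrete data, so they can be analysed with a chi-squared test.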

30 STATISTICS: CHI-SQUARED TESTS
We often want to determine if the population from which we have obtained count data conforms to a certain prediction = Goodness of Fit
A geneticist raises a progeny of 134 flowers from this cross: [cross diagram not reproduced]
Hypothesised (EXPECTED) ratio: 3 : 1 OR ¾ : ¼ OR 0.75 : 0.25
Observed numbers (n = 134): 113 yellow, 21 green
OBSERVED ratio: 113 : 21 OR 5.4 : 1
Expected numbers: 100.5 yellow (= 134 * 0.75), 33.5 green (= 134 * 0.25)
Q: Does the OBSERVED ratio differ (SIGNIFICANTLY) from the EXPECTED ratio?
$\chi^2 = \sum \left[ \dfrac{(O - E)^2}{E} \right]$ (Equation 4)
Where O = Observed, E = Expected
The bigger the difference between O and E, the greater the χ². When there is no difference, χ² will be ZERO.

31 STATISTICS: CHI-SQUARED TESTS
Steps of χ² tests:
1. Establish hypotheses
2. Determine Observed and Expected frequencies
3. Calculate the χ²-statistic using $\chi^2 = \sum \left[ \dfrac{(O - E)^2}{E} \right]$
4. Determine significance level for hypothesis testing (α = 0.05 or α = 0.01)
5. Find the critical value of χ², $\chi^2_{\alpha,\,v}$, using the χ² table, where v = number of categories (K) - 1
6. Compare χ²-statistic with critical value
7. If χ²-statistic > critical value, reject H0 (significant differences between O and E)
8. If χ²-statistic < critical value, accept H0 (no significant differences between O and E)
NB: must always use counts (frequencies), NOT percentages or proportions

32 STATISTICS: CHI-SQUARED TESTS
Does the OBSERVED ratio (113:21) differ (SIGNIFICANTLY) from the Expected (100.5:33.5) ratio?
1. Establish hypotheses
   H0: Observed and expected ratios are not significantly different
   H1: Observed and expected ratios are significantly different
2. Determine Observed and Expected frequencies
   Yellow flowers: Observed = 113; Expected = 100.5
   Green flowers: Observed = 21; Expected = 33.5
3. Calculate the χ²-statistic (yellow flowers + green flowers):
   $\chi^2 = \dfrac{(113 - 100.5)^2}{100.5} + \dfrac{(21 - 33.5)^2}{33.5} = 1.555 + 4.664 = 6.219 \approx 6.22$
4. Determine significance level for hypothesis testing (α = 0.05 or α = 0.01)
5. Find the critical value of χ²: $\chi^2_{\alpha,\,v}$
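Steps 2 and 3 for the flower example can be sketched in Python. The dictionary layout and variable names are assumptions; the counts and the 3 : 1 ratio come from the example.

```python
observed = {"yellow": 113, "green": 21}    # observed counts
ratio = {"yellow": 3, "green": 1}          # hypothesised 3 : 1 ratio

n = sum(observed.values())
ratio_total = sum(ratio.values())

# Expected counts: share of the total implied by the hypothesised ratio.
expected = {k: n * r / ratio_total for k, r in ratio.items()}

# Chi-squared statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

print(expected, round(chi_sq, 2))
```

Note that the calculation uses the raw counts, not percentages or proportions, as the steps of the χ² test require.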

33 Degrees of Freedom (v) = K - 1, where K = number of categories
In this case there are two categories (yellow-flowering and green-flowering), so v = (2 - 1) = 1
Critical value: $\chi^2_{0.05,\,v} = \chi^2_{0.05,\,1} = 3.841$
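For v = 1 the tabulated critical value can be cross-checked without a χ² table, because a χ² variable with one degree of freedom is the square of a standard normal deviate. The sketch below relies on that property and on Python's standard `statistics.NormalDist`; it applies to v = 1 only, and the variable names are assumptions.

```python
from statistics import NormalDist

alpha = 0.05

# For v = 1 only: the upper-alpha point of chi-squared equals the square of
# the two-tailed standard normal critical value.
z = NormalDist().inv_cdf(1 - alpha / 2)
chi_sq_critical = z ** 2

print(round(z, 3), round(chi_sq_critical, 3))
```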

34 STATISTICS: CHI-SQUARED TESTS
Q: Does the OBSERVED ratio (113:21) differ (SIGNIFICANTLY) from the Expected (100.5:33.5) ratio?
1. Establish hypotheses
   H0: Observed and expected ratios are not significantly different
   H1: Observed and expected ratios are significantly different
2. Determine Observed and Expected frequencies
   Yellow flowers: Observed = 113; Expected = 100.5
   Green flowers: Observed = 21; Expected = 33.5
3. χ²-statistic = 6.22
4. Determine significance level for hypothesis testing (α = 0.05 or α = 0.01)
5. Critical value = 3.841
6. χ²-statistic > critical value, therefore reject H0
A: The observed ratio is significantly different from the expected ratio

35 STATISTICS: CHI-SQUARED TESTS
Now you try…
A plant geneticist has done some crossing between plants and come up with the following numbers of different seeds [table not reproduced]
Q: Has the geneticist sampled from a population having a ratio of 9:3:3:1?
H0: Population sampled has YS:YW:GS:GW seeds in the ratio 9:3:3:1
H1: Population sampled does not have YS:YW:GS:GW seeds in the ratio 9:3:3:1
1. Establish hypotheses
2. Determine Observed and Expected frequencies
3. Calculate the χ²-statistic using $\chi^2 = \sum \left[ \dfrac{(O - E)^2}{E} \right]$
4. Determine significance level for hypothesis testing (α = 0.05 or α = 0.01)
5. Find the critical value of χ², $\chi^2_{\alpha,\,v}$, using the χ² table
6. Compare χ²-statistic with critical value
7. If χ²-statistic > critical value, reject H0 (significant differences between O and E)
8. If χ²-statistic < critical value, accept H0 (no significant differences between O and E)

36 STATISTICS: CHI-SQUARED TESTS
Now you try…
Q: Has the geneticist sampled from a population having a ratio of 9:3:3:1?
Switch to Excel

37 STATISTICS: CHI-SQUARED TESTS
Now you try…
Q: Has the geneticist sampled from a population having a ratio of 9:3:3:1?
Results so far:
3. χ²-statistic: χ² = 8.97
4. Significance level: α = 0.05
5. Critical value of χ² to look up in the χ² table: $\chi^2_{\alpha,\,v}$

38 What is the critical value of χ²? Critical value: $\chi^2_{0.05,\,3}$

39 STATISTICS: CHI-SQUARED TESTS
Now you try…
Q: Has the geneticist sampled from a population having a ratio of 9:3:3:1?
3. χ²-statistic: χ² = 8.97
4. Significance level: α = 0.05
5. Critical value = 7.815
6. Compare χ²-statistic with critical value: 8.97 > 7.815
7. χ²-statistic > critical value: reject the Null Hypothesis that the sample was drawn from a population showing a 9:3:3:1 ratio of YS:YW:GS:GW
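The same conclusion can be reached through a p-value. With four seed categories v = 3, and for that particular case the χ² distribution function has a closed form in terms of the error function, so the standard library is enough. The function name is an assumption, and the formula below is valid for v = 3 only.

```python
import math


def chi_sq_upper_tail_v3(x: float) -> float:
    """P(chi-squared with 3 degrees of freedom > x). Valid for v = 3 only."""
    cdf = math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    return 1 - cdf


chi_sq = 8.97        # statistic reported on the slide
alpha = 0.05

p_value = chi_sq_upper_tail_v3(chi_sq)
reject_h0 = p_value < alpha
print(round(p_value, 4), reject_h0)

# Cross-check: the tabulated critical value should leave about 0.05 in the tail.
print(round(chi_sq_upper_tail_v3(7.815), 3))
```

Comparing p with α gives the same decision as comparing the χ²-statistic with the critical value.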

40 STATISTICS: CHI-SQUARED TESTS… final word…
IF Expected Counts are LESS than ONE, then you must combine the categories
NB: By combining data you reduce the value of K and also v


42 Which stats test to use?
DATA:
- Continuous: looking for probabilities: Z-TESTS; comparing two means: T-TESTS
- Discrete: Chi-squared
Use "Getting started with data.xls" for further advice

