Published by Gwendoline Robertson. Modified over 9 years ago.
1
Statistics in IB Biology Error bars, standard deviation, t-test and more
2
Error bars
Added to a graph to show either:
◦ the range of the data, or
◦ the standard deviation of the data
[Figure: grade on test (%) plotted against time spent studying for test (minutes). Labelled point: x = 120, y = 94, the mean test score at 120 min. The error bar runs from the lowest to the highest test score at 120 min.]
3
Error bars
Discuss how error bars affect the meaning of these graphs.
[Figure: "Time studying v. grades", percent grade plotted against hours.]
4
Mean and Standard Deviation
Calculate the means for these two sets of data:
◦ Set 1: 11, 9, 10, 9, 8, 10, 11, 12
◦ Set 2: 2, 17, 5, 14, 9, 6, 16, 11
The mean is the same (10), but the range is very different.
Standard deviation describes the spread of values around the mean.
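The comparison on this slide can be checked with a short sketch. It uses Python's standard `statistics` module; the helper name `summarize` is only illustrative, and `pstdev` is the version of standard deviation that divides by the number of values.

```python
import statistics

set_1 = [11, 9, 10, 9, 8, 10, 11, 12]
set_2 = [2, 17, 5, 14, 9, 6, 16, 11]

def summarize(values):
    """Return (mean, range, population standard deviation) for a list of numbers."""
    mean = statistics.mean(values)
    value_range = max(values) - min(values)
    sd = statistics.pstdev(values)  # population SD: divides by n
    return mean, value_range, sd

for name, data in (("Set 1", set_1), ("Set 2", set_2)):
    mean, value_range, sd = summarize(data)
    print(f"{name}: mean = {mean}, range = {value_range}, SD = {sd:.2f}")
```

Both sets share a mean of 10, while the range and the standard deviation of Set 2 are several times larger than those of Set 1.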
5
Standard Deviation
For your reference only:
1. Find the mean of the values
2. Subtract each value from the mean
3. Square each difference from the mean (Can you see why?)
4. Find the sum of the squared differences
5. Divide by the number of values (This is called the variance)
6. Take the square root
Find the standard deviation function on your calculator.
Standard deviation tutorial
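The six steps above can be written out one line per step. This is a minimal sketch, not a replacement for the calculator function: the name `standard_deviation` is illustrative, and it divides by the number of values exactly as step 5 says. Many calculators also offer a "sample" version that divides by one less than the number of values, which gives a slightly larger result.

```python
import math

def standard_deviation(values):
    """Standard deviation following the six steps on the slide (divides by n)."""
    n = len(values)
    mean = sum(values) / n                          # 1. find the mean
    differences = [mean - v for v in values]        # 2. subtract each value from the mean
    squared = [d ** 2 for d in differences]         # 3. square each difference
    total = sum(squared)                            # 4. sum the squared differences
    variance = total / n                            # 5. divide by the number of values
    return math.sqrt(variance)                      # 6. take the square root

example = [11, 9, 10, 9, 8, 10, 11, 12]  # Set 1 from the previous slide
print(round(standard_deviation(example), 3))
```

Squaring in step 3 keeps negative and positive differences from cancelling each other out when they are added in step 4.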
6
Standard deviation
When the data are normally distributed (in a bell shape, not irregularly scattered), about 68% of values are within one standard deviation of the mean and about 95% are within two standard deviations.
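One way to see the 68% figure is to simulate bell-shaped data and count. The sketch below is only an illustration: the mean of 50, the standard deviation of 5, the sample size and the fixed random seed are all made-up assumptions, not values from the slides.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)
values = [rng.gauss(50, 5) for _ in range(100_000)]  # hypothetical mean 50, SD 5

mean = sum(values) / len(values)
sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def fraction_within(values, mean, sd, k):
    """Fraction of values lying within k standard deviations of the mean."""
    low, high = mean - k * sd, mean + k * sd
    return sum(low <= v <= high for v in values) / len(values)

within_1 = fraction_within(values, mean, sd, 1)
within_2 = fraction_within(values, mean, sd, 2)
print(f"within 1 SD: {within_1:.1%}, within 2 SD: {within_2:.1%}")
```

The counts should land close to 68% and 95%; data that are not normally distributed will generally not follow this pattern.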
7
Standard deviation
Small standard deviations indicate tightly clustered data.
Large standard deviations indicate widely spread data.
Which graph (top or bottom) would have a larger standard deviation? Explain!
8
Standard deviation questions
1. If US women have a mean height of 64 inches and an SD of 2.5 inches, 68% of US women would fall into what range of heights? 95% of women?
2. How does the SD of "pro beach volleyball women" compare to the SD of US women in general?
9
Analyzing Data Chi-Square
10
T-test
Tests continuous data
Compares the means of two groups
Degrees of freedom =
◦ total # of samples - 2
Requirements:
◦ Normal distribution
◦ Should have 10+ samples (more = more reliable results)

Chi-square
Tests categorical data
Compares the frequencies of multiple (2+) groups
Degrees of freedom =
◦ # of categories - 1, or
◦ (# of rows - 1)(# of columns - 1)
Requirements:
◦ Random sample
◦ 20+ samples, 5+ per category
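The three degrees-of-freedom rules in this comparison are simple enough to write as one-line functions. The function names below are illustrative only.

```python
def t_test_df(n_group_1, n_group_2):
    """t-test: total number of samples across both groups, minus 2."""
    return n_group_1 + n_group_2 - 2

def goodness_of_fit_df(n_categories):
    """Chi-square with one variable: number of categories minus 1."""
    return n_categories - 1

def independence_df(n_rows, n_columns):
    """Chi-square with two variables: (rows - 1)(columns - 1), counting data cells only."""
    return (n_rows - 1) * (n_columns - 1)

print(t_test_df(10, 10))        # two groups of 10 samples each
print(goodness_of_fit_df(6))    # six faces of a die
print(independence_df(2, 3))    # a table with 2 rows and 3 columns of values
```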
11
Chi-square: Goodness of fit test
One variable only
Example: rolling a die. Is it a fair die?
◦ H0 = null hypothesis, results match expectations
◦ H1 = a real difference between results and expectations
Degrees of freedom = # of categories - 1 = 6 - 1 = 5

Results from 60 rolls of a single die
Number:      1    2    3    4    5    6
Frequency:   12   6    8    11   8    15
12
Using the chi-square formula
There were 60 rolls and each number should have an equal chance of coming up, so 60 / 6 = 10 is the expected value for each number.

Results from 60 rolls of a single die
Number:               1    2    3    4    5    6
Observed frequency:   12   6    8    11   8    15
Expected frequency:   10   10   10   10   10   10
13
Using the chi-square formula
Why square the difference between O and E?
Why divide by the expected value?

Results from 60 rolls of a single die
Number:               1     2     3     4     5     6
Observed frequency:   12    6     8     11    8     15
Expected frequency:   10    10    10    10    10    10
(O-E):                2     -4    -2    1     -2    5
(O-E)²:               4     16    4     1     4     25
(O-E)²/E:             0.4   1.6   0.4   0.1   0.4   2.5
14
Using the chi-square formula
Sum of the (O-E)²/E values = 0.4 + 1.6 + 0.4 + 0.1 + 0.4 + 2.5 = 5.4
χ² = 5.4 with 5 degrees of freedom (df)

Results from 60 rolls of a single die
Number:               1     2     3     4     5     6
Observed frequency:   12    6     8     11    8     15
Expected frequency:   10    10    10    10    10    10
(O-E):                2     -4    -2    1     -2    5
(O-E)²:               4     16    4     1     4     25
(O-E)²/E:             0.4   1.6   0.4   0.1   0.4   2.5
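The table arithmetic can be reproduced in a few lines. This sketch uses the die-roll counts from the slides; the function name `chi_square` is illustrative, and the calculation is simply the sum of (O-E)²/E over all categories.

```python
observed = [12, 6, 8, 11, 8, 15]            # frequencies for faces 1 to 6
total_rolls = sum(observed)                  # 60 rolls
expected = [total_rolls / len(observed)] * len(observed)  # 10 per face for a fair die

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi2 = chi_square(observed, expected)
df = len(observed) - 1                       # number of categories minus 1
print(f"chi-square = {chi2:.1f} with {df} df")
```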
15
Using the chi-square formula
p = 0.05 is the standard cutoff for significance
What does a p-value of 0.05 mean?
χ² = 5.4 with 5 df
Is the die fair? Discuss.
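In practice the χ² value is compared with a critical value looked up for the right degrees of freedom. The sketch below assumes the usual printed chi-square table at p = 0.05; the dictionary and function names are illustrative, and only df 1 to 5 are included. If the null hypothesis were true, a χ² above the critical value would be expected in only about 5% of experiments.

```python
# Critical chi-square values at p = 0.05, copied from a standard chi-square table.
CRITICAL_CHI_SQUARE_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def is_significant(chi2, df):
    """True if chi2 exceeds the p = 0.05 critical value for the given df."""
    return chi2 > CRITICAL_CHI_SQUARE_05[df]

# Die example from the slides: chi-square = 5.4 with 5 df.
print(is_significant(5.4, 5))
```

A result of `False` means the null hypothesis is not rejected at p = 0.05. It does not prove the null hypothesis is true.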
16
Chi-square: independence test
Two or more variables
Example: blood pressure medicine

Blood pressure levels of patients on Medicine A
            High   Normal   Low   Total
Medicine    110    350      40    500
Placebo     150    320      30    500
Total       260    670      70    1000

df = (columns - 1)(rows - 1)* = (3-1)(2-1) = 2 df
* rows / columns of values, NOT totals or labels
17
Using chi-square
Expected value = (row total)(column total) / overall total
Example: on medicine with high blood pressure
Expected = (500)(260) / 1000 = 130

Blood pressure levels of patients on Medicine A
            High   Normal   Low   Total
Medicine    110    350      40    500
Placebo     150    320      30    500
Total       260    670      70    1000
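The same expected-value rule can be applied to every cell of the table at once. This is a sketch using the counts from the slide; the variable names are illustrative. The χ² it prints comes out at roughly 8.93, in line with the value quoted on the next slide (the small gap looks like a rounding difference).

```python
observed = {
    "Medicine": [110, 350, 40],   # High, Normal, Low
    "Placebo":  [150, 320, 30],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
column_totals = [sum(column) for column in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count for each cell = (row total)(column total) / overall total
expected = {
    name: [row_totals[name] * col_total / grand_total for col_total in column_totals]
    for name in observed
}

chi2 = sum(
    (o - e) ** 2 / e
    for name in observed
    for o, e in zip(observed[name], expected[name])
)
df = (len(observed) - 1) * (len(column_totals) - 1)  # data cells only, no totals

print(expected["Medicine"][0])                 # medicine + high blood pressure
print(f"chi-square = {chi2:.2f} with {df} df")
```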
18
Using chi-square
The χ² value is 8.92
Discuss what this means.
20
t-test
Tells whether the difference between two sets of data is significant (meaningful and unlikely to be caused by chance)
Requirements for the t-test
◦ Sample size of at least 10 (according to IB)
◦ Normally distributed data
◦ Continuous data
21
t-test
The t-test gives a p-value (0 – 1)
The p-value is the probability of seeing a difference between the means at least this large through random chance / sampling error alone, if the two groups did not really differ
The cutoff for scientific "significance" is often p = 0.05
◦ p < 0.05 is significant
◦ p > 0.05 is NOT significant
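To show what sits behind the p-value, here is a minimal sketch of a two-sample t-test that assumes equal variances in the two groups. The two data sets are made up for illustration, and the function name is hypothetical. Python's standard library has no t-distribution, so instead of a p-value the sketch compares the t statistic with the critical value from a standard t-table (2.101 for 18 df, two-tailed, p = 0.05). An online calculator or statistics package would report the p-value directly.

```python
import math
import statistics

# Hypothetical measurements, 10 samples per group (made-up numbers).
group_a = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11]
group_b = [14, 15, 13, 16, 15, 14, 13, 15, 16, 14]

def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # total number of samples minus 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df

t, df = pooled_t_statistic(group_a, group_b)

CRITICAL_T_05_DF_18 = 2.101  # two-tailed, p = 0.05, 18 df, from a standard t-table
significant = abs(t) > CRITICAL_T_05_DF_18
print(f"t = {t:.2f}, df = {df}, significant at p = 0.05: {significant}")
```

The t statistic grows when the means are further apart and shrinks when the spread within each group is larger, which is why both the means and the standard deviations matter.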
22
t-test
Note that the mean is the same in all groups of one color.
In which case is the difference between the green and blue groups most likely to be significant?
How would a t-test show this?
23
Need to calculate a t-test?
Use graphpad.com
◦ http://graphpad.com/quickcalcs/index.cfm
◦ Has t-test, standard deviation, and many other calculators
(You will also need to know Chi-square)
24
Correlation and Causation
Correlated variables show a shared trend
◦ Inverse correlation: one variable goes up as the other goes down
Causal relationships mean a change in one variable causes the change in another
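One common way to put a number on a shared trend is the Pearson correlation coefficient r, which is not named on the slide and is offered here only as an illustration. The data below are made up. An r near +1 means both variables rise together, and an r near -1 means an inverse correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance_sum / (spread_x * spread_y)

# Hypothetical data (made-up numbers).
hours_studied = [1, 2, 3, 4, 5]
grade = [60, 65, 75, 80, 90]          # rises with hours: positive correlation
hours_of_tv = [90, 80, 75, 65, 60]    # falls as hours rise: inverse correlation

print(round(pearson_r(hours_studied, grade), 3))
print(round(pearson_r(hours_studied, hours_of_tv), 3))
```

A strong r only describes how closely two variables move together. It says nothing about whether one causes the other.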
25
Correlation and Causation
[Figure: amount plotted against time of year (month).]
Explain what is wrong with this "study".
Read about a current debate on lead and crime here!