Published by Corey Isabel Crawford. Modified over 9 years ago.
2
Statistical analysis
3
Why?? (besides making your life difficult …)
Scientists must collect data AND analyze it.
Does your data support your hypothesis? Is it valid?
Statistics helps us find relationships between sets of data.
You are the scientist now; you must be comfortable with analyzing your data.
4
Let’s look at two sets of data
Sample 1: -10, 0, 10, 20, 30
Sample 2: 8, 9, 10, 11, 12
What can you tell me about this data???
5
Mean: the “average” of the data, or the central tendency
Sample 1: -10, 0, 10, 20, 30
Mean = (-10 + 0 + 10 + 20 + 30) / 5 = 10
Sample 2: 8, 9, 10, 11, 12
Mean = (8 + 9 + 10 + 11 + 12) / 5 = 10
Is this analysis complete??? NO!
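The mean calculation on this slide can be checked with a few lines of Python. This is only a sketch; the names `mean`, `sample_1` and `sample_2` are chosen here for illustration and do not come from the slides.

```python
# Arithmetic mean: add up the values, then divide by how many there are.
def mean(values):
    return sum(values) / len(values)

sample_1 = [-10, 0, 10, 20, 30]
sample_2 = [8, 9, 10, 11, 12]

mean_1 = mean(sample_1)
mean_2 = mean(sample_2)
print(mean_1, mean_2)  # 10.0 10.0
```

Both samples share the same mean even though they look very different, which is why the mean alone is not a complete analysis.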
6
Range: how far is the spread? Largest # - smallest #
Sample 1: -10, 0, 10, 20, 30
Range = 30 - (-10) = 40
Sample 2: 8, 9, 10, 11, 12
Range = 12 - 8 = 4
Does this data help? Yes, Sample 1 is more dispersed.
Obvious? Perhaps, but now it is shown mathematically.
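A minimal sketch of the same range calculation in Python. The helper name `data_range` is an assumption made for this example.

```python
# Range: largest value minus smallest value.
def data_range(values):
    return max(values) - min(values)

sample_1 = [-10, 0, 10, 20, 30]
sample_2 = [8, 9, 10, 11, 12]

range_1 = data_range(sample_1)
range_2 = data_range(sample_2)
print(range_1, range_2)  # 40 4
```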
7
Something more … standard deviation
SD is a measure of how individual data points are dispersed around the mean.
8
Assuming a normal data distribution (bell curve):
About 68% of all collected values lie within +/- 1 SD of the mean
About 95% of all collected values lie within +/- 2 SD of the mean
So what???
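The 68% and 95% figures are rounded properties of the normal distribution. As a quick check, and assuming Python 3.8 or later (where `statistics.NormalDist` is available), they can be reproduced like this; the helper name `share_within` is made up for the example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = share_within(1)
within_2_sd = share_within(2)
print(f"{within_1_sd:.1%}, {within_2_sd:.1%}")  # 68.3%, 95.4%
```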
9
Standard deviation
A small SD indicates the data values are clustered around the mean
May also indicate few extreme data points
A large SD indicates the data values are spread out
May also indicate extreme data points
Outliers??
10
Standard deviation
11
Let’s practice …
13
Let’s compare …
Sample 1: SD = 15.8
Sample 2: SD = 1.58
How can I use this in my lab?
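The formula used on the slides did not survive in this transcript. The sketch below assumes the sample standard deviation (dividing by n - 1), because that is the version that reproduces the two values quoted here. The function name `sample_sd` is chosen for illustration.

```python
from math import sqrt

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator; an assumption)."""
    n = len(values)
    m = sum(values) / n
    # Squared distance of every data point from the mean.
    squared_deviations = [(x - m) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (n - 1))

sample_1 = [-10, 0, 10, 20, 30]
sample_2 = [8, 9, 10, 11, 12]

sd_1 = sample_sd(sample_1)
sd_2 = sample_sd(sample_2)
print(round(sd_1, 1), round(sd_2, 2))  # 15.8 1.58
```

Same mean, very different SD: Sample 1 is spread ten times more widely around 10 than Sample 2.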
14
Error bars
Error bars represent the variability of your data. They can show:
STANDARD DEVIATION
range
measurement uncertainties
15
Error bars
On a bar graph, the bar represents the mean of your data and the error bars represent +/- 1 sd.
[Figure: bar graph with the mean and sd labeled]
16
Error bars
On a line graph, the point represents the mean of your data and the error bars represent +/- 1 sd.
[Figure: line graph with the mean and sd labeled]
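To make the two error bar slides concrete, here is a small sketch that works out where the bar (or point) and the ends of its error bars would sit for Sample 1. It only computes the numbers and does not draw anything; the function name `error_bar_limits` is hypothetical.

```python
from statistics import mean, stdev

def error_bar_limits(values):
    """Return (mean, lower end, upper end) for error bars of +/- 1 sd."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation
    return m, m - sd, m + sd

sample_1 = [-10, 0, 10, 20, 30]
center, lower, upper = error_bar_limits(sample_1)
print(center, round(lower, 2), round(upper, 2))  # 10 -5.81 25.81
```

In a plotting tool, `center` would be the bar height or point, and the error bar would run from `lower` to `upper`.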
17
t-test
A t-test determines whether the difference between 2 sample means is statistically significant
Is the difference significant? Is the difference due to your variable?? Or is it random chance?? How valid is your data?
The t-test gives a p value (probability): how likely a difference at least this large would be if only random chance were at work
A p value of 0.05 (5%) means chance alone would produce such a difference only 5% of the time; this corresponds to 95% confidence … Key word!!!!!
You want 95% confidence or higher (a p value of 0.05 or lower)! Then you can conclude that your difference IS LIKELY DUE TO YOUR VARIABLE
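The slides do not say which version of the t-test is meant. The sketch below assumes the pooled (equal variance) two-sample t-test, since that is the version whose degrees of freedom are (n1 + n2) - 2. The function name `pooled_t` is made up for this example.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """t statistic for a pooled two-sample t-test (assumed variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error

sample_1 = [-10, 0, 10, 20, 30]
sample_2 = [8, 9, 10, 11, 12]

t_value = pooled_t(sample_1, sample_2)
print(t_value)  # 0.0
```

Both samples have a mean of 10, so t is 0: there is no difference between the means to test, however different the spreads are.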
18
t-test
For tests, you do NOT need to calculate t-values, but you must be able to read a t-chart!!
For internal assessments, you may use calculators or Excel to calculate t-values.
19
This is the range you are hoping for: the difference between your samples has a HIGH probability of being due to your variable (and not chance).
You need to be able to calculate degrees of freedom.
20
Calculating degrees of freedom
df = (n1 + n2) - 2
n1 = size of sample 1
n2 = size of sample 2
2 = # of samples
21
Calculating degrees of freedom
df = (n1 + n2) - 2
Population 1: -10, 0, 10, 20, 30, so n1 = 5
Population 2: 8, 9, 10, 11, 12, so n2 = 5
df = (5 + 5) - 2
df = 8
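The degrees of freedom formula is short enough to write as a one-line function. The name `degrees_of_freedom` is an assumption for this sketch.

```python
def degrees_of_freedom(sample_a, sample_b):
    """df = (n1 + n2) - 2 for two samples."""
    return (len(sample_a) + len(sample_b)) - 2

population_1 = [-10, 0, 10, 20, 30]
population_2 = [8, 9, 10, 11, 12]

df = degrees_of_freedom(population_1, population_2)
print(df)  # 8
```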
22
Using the t-table
If df = 8 and t = 3.5, is this a significant difference?
Yes: p is less than 0.01, so there is less than a 1% probability that a difference this large is due to chance
Therefore, we can be more than 99% confident that the difference in the data is due to our variable
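Reading a t-table amounts to comparing your t value with the critical values in the row for your df. The t-table from the slide is not in this transcript, so the sketch below assumes a two-tailed table and includes only two rows of standard critical values. The names `CRITICAL_T` and `strictest_level_passed` are made up for the example.

```python
# Two-tailed critical t values from a standard t-table.
# Only two rows are included here (an assumption, to keep the sketch short).
CRITICAL_T = {
    8: {0.05: 2.306, 0.01: 3.355},
    18: {0.05: 2.101, 0.01: 2.878},
}

def strictest_level_passed(t_value, df):
    """Smallest tabulated p level whose critical value is exceeded, else None."""
    passed = [p for p, critical in CRITICAL_T[df].items()
              if abs(t_value) > critical]
    return min(passed) if passed else None

level = strictest_level_passed(3.5, df=8)
print(level)  # 0.01
```

Because 3.5 is larger than 3.355, the difference is significant at the 0.01 level, which matches the "less than 1%" reading on the slide.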
23
Other options, less commonly used in our class
Median: the middle #, when arranged in numeric order
Sample 1: -10, 0, 10, 20, 30; Median = 10
Sample 2: 8, 9, 10, 11, 12; Median = 10
Mode: the # that occurs most often
Sample 1: -10, 0, 10, 20, 30; No mode
Sample 2: 8, 9, 10, 11, 12; No mode
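A short sketch of the median and mode for the two samples. Python's built-in `statistics.mode` always returns some value, so this example uses a small hypothetical helper, `modes`, that returns an empty list when no value repeats (the "No mode" case on the slide).

```python
from collections import Counter
from statistics import median

def modes(values):
    """Values that occur most often, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once: no mode
    return [value for value, count in counts.items() if count == highest]

sample_1 = [-10, 0, 10, 20, 30]
sample_2 = [8, 9, 10, 11, 12]

median_1, median_2 = median(sample_1), median(sample_2)
modes_1, modes_2 = modes(sample_1), modes(sample_2)
print(median_1, median_2)  # 10 10
print(modes_1, modes_2)    # [] []
```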
24
Some practice: looking at plant height

Height in sun (cm)    Height in shade (cm)
124                   131
120                   60
153                   131
98                    160
124                   212
141                   117
156                   131
128                   95
139                   145
117                   118

Calculate the mean for both samples
Sun = 130 cm
Shade = 130 cm
25
Some practice: looking at plant height
Calculate the range for both samples
Sun = 58 cm
Shade = 152 cm
26
Some practice: looking at plant height
Calculate the median for both samples
Sun = 126 cm
Shade = 131 cm
If there is an even # of data points, find the average of the two middle numbers
27
Some practice: looking at plant height
Calculate the mode for both samples
Sun = 124 cm
Shade = 131 cm
28
Some practice: looking at plant height
Calculate the sd for both samples
Sun = 17.56 cm
Shade = 39.85 cm
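The answers to the plant height practice slides (mean, range, median, mode and sd) can be checked in one go. This sketch uses Python's `statistics` module, where `stdev` is the sample standard deviation; the function name `summarize` is chosen for illustration.

```python
from statistics import mean, median, mode, stdev

# Plant heights in cm, taken from the practice table.
sun = [124, 120, 153, 98, 124, 141, 156, 128, 139, 117]
shade = [131, 60, 131, 160, 212, 117, 131, 95, 145, 118]

def summarize(heights):
    return {
        "mean": mean(heights),
        "range": max(heights) - min(heights),
        "median": median(heights),  # averages the two middle values if n is even
        "mode": mode(heights),
        "sd": round(stdev(heights), 2),
    }

sun_stats = summarize(sun)
shade_stats = summarize(shade)
print(sun_stats)
print(shade_stats)
```

The two groups share a mean of 130 cm, so the spread measures are what tell them apart.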
29
Some practice: looking at plant height
Sun: sd = 17.56 cm
Low sd indicates even (close) distribution of data points
More reliable (consistent)
Shade: sd = 39.85 cm
High sd indicates wide spread of data points
MAY indicate a problem with your experimental design
30
Some practice: looking at plant height
If t = 1.5, is this a significant difference?
No: with df = (10 + 10) - 2 = 18, t = 1.5 is below the critical value of 2.101 (two-tailed, p = 0.05)
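The "No" on this slide follows from the sample sizes. As a sketch, assuming a two-tailed t-table and the usual 0.05 cutoff, the check looks like this; the constant name `CRITICAL_T_P05_DF18` is made up for the example.

```python
# Ten plants were measured in each condition.
n_sun, n_shade = 10, 10
df = (n_sun + n_shade) - 2

# Two-tailed critical t for p = 0.05 and df = 18, from a standard t-table.
CRITICAL_T_P05_DF18 = 2.101

t_value = 1.5  # the value given on the slide
is_significant = abs(t_value) > CRITICAL_T_P05_DF18
print(df, is_significant)  # 18 False
```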
31
Be careful: correlation vs. cause
Observations (and carefully chosen data) may imply a CORRELATION, but do NOT necessarily demonstrate a cause
The average global temperature has increased over the past 100 years.
The number of pirates in the world has decreased over the past 100 years.
Therefore, a decreased number of pirates causes increased global temperatures
32
Be careful: correlation vs. cause
No, no!
33
Be careful: correlation vs. cause
To discern a CAUSE, a valid EXPERIMENT must be done
Other scientists must also be able to repeat your experiment
34
Last word …
Remember, it is ALWAYS better to SHOW that your experiment failed to support your hypothesis than to lie about it being a success!!!
35
Any questions?