

1 Statistical Analysis IB Topic 1

2 IB assessment statements:  By the end of this topic, I can …: 1. State that error bars are a graphical representation of the variability of data 2. Calculate the mean and standard deviation of a set of values 3. State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of values fall within one standard deviation of the mean 4. Explain how the standard deviation is useful for comparing the means and spread of data between two or more samples 5. Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables 6. Explain that the existence of a correlation does not establish that there is a causal relationship between two variables

3 Why study statistics?  Scientists use the scientific method when designing experiments  Observations and experiments result in the collection of measurable data  Statistics is a branch of mathematics which allows us to sample small portions and draw conclusions about the larger population

4 Words to know and love …  Mean Average of the data points Sum of the values divided by the number of values  Range Measures the spread of data Difference between the largest and smallest values Very large or very small values are called outliers

5 More words …  Standard deviation (SD) A measure of how data are dispersed or spread around the mean Determined by mathematical formula (which you do NOT need to know) Use your calculator or online program  Error bars Graphical representation of the variability of data Error bars can show either the range of data OR the SD Look at Figures 1.1 and 1.2 in your book (pgs. 3, 4)
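The calculator steps for mean, range and SD can also be sketched in a few lines of Python. This is a minimal illustration, not the textbook's method: the heights below are hypothetical values (not the bean plant data from the book), and the helper name `summarize` is made up for this example. It assumes the sample standard deviation (dividing by n - 1) is wanted, which is what `statistics.stdev` computes.

```python
import statistics

# Hypothetical plant heights in cm (illustrative values, not from the textbook)
heights_cm = [10.0, 12.0, 14.0, 16.0, 18.0]

def summarize(values):
    """Return the mean, range, and sample standard deviation of a list of numbers."""
    mean = statistics.mean(values)
    data_range = max(values) - min(values)  # largest value minus smallest value
    sd = statistics.stdev(values)           # sample SD: divides by n - 1
    return mean, data_range, sd

mean, data_range, sd = summarize(heights_cm)
print(f"mean = {mean:.2f} cm, range = {data_range:.2f} cm, SD = {sd:.2f} cm")
```

Many calculators report both a sample SD and a population SD; `statistics.pstdev` gives the population version if that is the one required.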

6 Standard Deviation  In a normal distribution, about 68% of all values lie within +/- 1 SD of the mean  This rises to about 95% for +/- 2 SD from the mean  The SD tells us how tightly the data points are clustered around the mean Clustered together = small SD Spread out = large SD
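The 68% and 95% figures can be checked numerically. The sketch below assumes an ideal normal distribution and uses `NormalDist` from Python's standard `statistics` module; the function name `fraction_within` is made up for this example.

```python
from statistics import NormalDist

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area under the curve between -k and +k standard deviations
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2):
    print(f"within +/- {k} SD: {fraction_within(k):.1%}")
```

Real data are only approximately normal, so the fractions observed in a sample will usually differ a little from these ideal values.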

7 Graphical Interpretation

8 Why is this useful?  SD tells you how many extremes are in the data  Questions: What is the shape of the graph of a normal distribution of data points? If there are 100 bean plants represented by the bell curve, how many will be within one standard deviation of the mean?

9 Comparing the means and spread of data between two or more samples  Open your book to page 6 and look at the data table for the bean plants  First, calculate the mean  Look at the data – how would you describe the values for both sets of data?  How can we quantify your observations about the variability of the data? Find the standard deviation Use your calculator  Don’t worry about the equation (unless you want to)

10 Options …  TI 83, 84 http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-83.html  TI 86 http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-86.html  Online calculator http://www.graphpad.com/quickcalcs/ttest1.cfm

11 Back to the bean plants  SD in sunlight = 17.68 cm  SD in shade = 47.02 cm  Looking at the means alone, it appears there is no difference between the two sets of data  However, the high SD of the plants grown in the shade tells us what? How confident can we be in the data? What conclusions can we draw from looking at the mean alone?

12 Question… If all the data values are equal, such as 7, 7, 7, 7, what is the standard deviation of this set of four data points?

13 Answer  0, if all values are the same, there’s no deviation from the mean

14 Question  If the daily temperatures of city A range from 10 °C to 30 °C for one month, the mean temperature may be 20 °C. Another city, B, may also have a mean temperature of 20 °C for the same month. However, the range of city B is only 15 °C to 25 °C.  Which city has temperatures with a higher standard deviation?  Which city allows a more accurate prediction of the weather, and why?

15 Answer  City A has a higher standard deviation  City B, since it has a narrower range of temperatures and therefore a lower standard deviation

16 Significant difference and the t-test  The t-test is used to determine whether or not the difference between two sets of data is a significant (real) difference  We use a Table of t values (page 8) You do not need to memorize this! This is a tool scientists use

17 How to navigate the table  Probability (p) Shown at the bottom of the table: the probability that chance alone could produce the difference p = 0.50 means the difference would arise by chance 50% of the time  This is not a significant difference p = 0.05 means the probability that the difference is due to chance is only 5%  Statisticians are never 100% certain, but like to be at least 95% certain  Degrees of freedom Sum of the sample sizes of the two groups minus two

18 Practice  Looking at the table …  If the degrees of freedom are 9 and the given value of t is 2.60, the table shows that this t value is greater than the critical value of 2.26.  Looking at the bottom of the table, the probability that chance alone could produce the result is only 5% This means we can be 95% confident that the difference is significant

19 Worked example 1.4  Compare two groups of barnacles living on a rocky shore. Measure the width of their shells to see if a significant size difference is found depending on how close they live to the water. One group lives between 0 and 10 meters above the water level. The second group lives between 10 and 20 meters above the water level  The width of the shells was measured in mm. 15 shells were measured from each group. The means suggest that barnacles living closer to the water have larger shells. If the value of t is 2.25, is that a significant difference?

20 Answer to worked example 1.4  The degrees of freedom are 28 (15 + 15 - 2)  Refer to the table …  2.25 is just above the critical value of 2.05  P = 0.05, so the probability that chance alone could produce that result is only 5%  The confidence level is 95%  We are 95% confident that the difference between the barnacles is significant  Barnacles living nearer the water have a significantly larger shell than those living 10 meters or more away from the water
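The table lookup in this answer can be sketched as a short Python check. This is an illustration under stated assumptions: the dictionary holds only the two critical values quoted in these slides for p = 0.05 (2.26 for 9 degrees of freedom, 2.05 for 28), and the names `CRITICAL_T_P05`, `degrees_of_freedom` and `is_significant` are made up for this example. A full table of t values has many more rows.

```python
# Critical t values at p = 0.05, as quoted in the slides.
# Only two rows are included here; the textbook table lists many more.
CRITICAL_T_P05 = {9: 2.26, 28: 2.05}

def degrees_of_freedom(n1, n2):
    """Sum of the two sample sizes minus two."""
    return n1 + n2 - 2

def is_significant(t_value, df, table=CRITICAL_T_P05):
    """True if t reaches the critical value for this df at p = 0.05."""
    return abs(t_value) >= table[df]

df = degrees_of_freedom(15, 15)          # two groups of 15 barnacles
print(df, is_significant(2.25, df))      # compares 2.25 with the value for df = 28
```

If the calculated t is below the critical value, the difference is not significant at the 5% level.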

21 Worked example 1.5  The heights of 16-year-old girls from the UK and US were compared. The means indicate that the British girls are taller than the US girls. The sample size was 50 girls from each country. The calculated t is 2.00.  What are the degrees of freedom used to determine the probability that the differences between the two groups are due to chance?  Using the given t value of 2.00 with your calculated degrees of freedom, what is the probability that chance alone can produce a difference in the heights of these girls?  How confident are we that the British girls are taller than the US girls based on this sample size?

22 Answers  Degrees of freedom = 98 (50 + 50 - 2)  P = 0.05 or 5%  95% confident

23 Correlation and Causation  Observing something can suggest correlation  Experiments provide a test which shows cause  Observations without an experiment can only show a correlation.

24 Question  For years we have known that there is a high positive correlation between smoking and lung cancer. Does this high positive correlation prove that smoking causes lung cancer?  How can the cause of lung cancer be determined?

25 Answer  No, it does not prove that smoking causes lung cancer. Only data collected from a well-designed experiment can show cause  For many years, scientists have performed experiments to study the cause of the correlation that was originally observed. Currently, scientific evidence shows that smoking increases the chances of contracting lung cancer. Experiments show cause.

26 Africanized Honey Bees (AHB)  Are there any volunteers who can summarize the relationship between correlation and causation using this example?

27 Cormorants  Example of using a mathematical correlation test  The correlation coefficient r ranges from +1 (complete positive correlation) through 0 (no correlation) to -1 (complete negative correlation)
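The slide does not say which correlation test is used, so the sketch below assumes the Pearson correlation coefficient, one common way of getting an r between -1 and +1. The function name `pearson_r` and the sample numbers are hypothetical; they are not the cormorant data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the deviations from each mean
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: y rises in step with x, then falls in step with x
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # complete positive correlation
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # complete negative correlation
```

Even an r close to +1 or -1 only describes how closely two variables move together; it does not show that one causes the other.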

28 Exit slip  Tear and share paper  Name, date, period upper right hand corner  Title: Exit Slip

29 Exit Slip  1. What is the shape of the graph of a normal distribution of data points?  2. If there are 100 bean plants represented by the bell curve, how many will be within one standard deviation of the mean?  3. What is standard deviation used for?  4. What is an error bar?  5. If p = 0.05, how confident can I be in my data? Is this an acceptable level of confidence in science?

