Stats Notes

We often calculate averages of numbers to summarize the data we’ve collected.
Sometimes important information can get “blurred” in the average. For instance, how much your data varies:
- If your data is clustered closely around the average, it has a small standard deviation.
- If your data is scattered widely on both sides of the average, it has a large standard deviation.
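To see how two data sets can share an average but differ in spread, here is a small Python sketch. The numbers are made up for illustration (they are not our class data), and it uses the sample standard deviation, which divides by n - 1.

```python
import statistics

# Hypothetical data: both sets have the same average (170 cm).
clustered = [168, 169, 170, 171, 172]
scattered = [150, 160, 170, 180, 190]

for name, data in [("clustered", clustered), ("scattered", scattered)]:
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    print(f"{name}: mean = {mean:.1f}, standard deviation = {sd:.2f}")

# clustered: mean = 170.0, standard deviation = 1.58
# scattered: mean = 170.0, standard deviation = 15.81
```

The averages are identical, so only the standard deviation reveals that the second set is far more spread out.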
Students’ Height in cm

Which will have a higher average height, guys or girls?

Hypothesis:
Average Height / cm ±
Guys:
Girls:

*Looking at our data, any guess as to which data set has a higher standard deviation?

Average height of guys:
Average height of girls:
When you have calculated an average from one set of numbers and you are comparing it to another average, a test needs to be done to see if these averages are significantly different from each other based on the individual data points.
- Why would two averages that come out to the same number not necessarily come from the same set of data?
- Why would two averages that come out to very different numbers come from sets of data that are mostly the same?
Example 1: Two data sets with different averages

Data values: 5 4 3 0.005 500

Calculate the average:
Average:

- Do these averages accurately represent most of the data?
- Would you say most of the data actually differs between the two sets?
The test that’s performed on the data is called a T-Test. It gives us a number (a p-value) that tells us whether our two sets of data are significantly different. A T-Test calculates a T value that takes into account several things about your data sets:
- The difference in the means of your data
- The sample size
- The standard deviation
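As a rough sketch of how those three ingredients combine, the function below computes a T value by hand. It assumes the unequal-variance (Welch) form of the test; the function name `welch_t` and the height values are hypothetical and only for illustration.

```python
import math
import statistics

def welch_t(sample1, sample2):
    """Return the T value for two independent samples (unequal variances assumed)."""
    n1, n2 = len(sample1), len(sample2)                    # sample sizes
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # Variance is the standard deviation squared.
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical heights in cm (not our class data)
group_a = [172, 175, 178, 180, 185]
group_b = [160, 163, 165, 168, 169]
print(f"T = {welch_t(group_a, group_b):.2f}")  # T = 4.72
```

A bigger difference in means, a larger sample size, or a smaller standard deviation all push the T value further from zero.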
A p-value is a number that gives the probability that a difference in averages as large as ours could arise from random chance alone, that is, when our data sets actually aren’t that different from each other. For our data sets to be considered significantly different from each other, the p-value has to be BELOW 5% (0.05). Although different studies may use different standards for what is an acceptable p-value, the standard in most science trials is 5%.
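One way to get a feel for what a p-value means is to shuffle the group labels and count how often chance alone produces a gap as big as the one observed. Note that this is a permutation test, not the T-Test itself; it is shown here only to illustrate the idea. The function name and the data are hypothetical.

```python
from itertools import combinations
import statistics

def permutation_p_value(sample1, sample2):
    """Two-tailed p-value: share of relabelings whose gap in means
    is at least as large as the observed gap."""
    pooled = sample1 + sample2
    n1, n2 = len(sample1), len(sample2)
    observed = abs(statistics.mean(sample1) - statistics.mean(sample2))
    total = sum(pooled)
    relabelings = 0
    at_least_as_extreme = 0
    # Try every possible way of choosing which values belong to group 1.
    for indices in combinations(range(len(pooled)), n1):
        sum1 = sum(pooled[i] for i in indices)
        gap = abs(sum1 / n1 - (total - sum1) / n2)
        relabelings += 1
        if gap >= observed - 1e-12:  # small tolerance for rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / relabelings

# Hypothetical heights in cm (not our class data)
group_a = [172, 175, 178, 180, 185]
group_b = [160, 163, 165, 168, 169]
p = permutation_p_value(group_a, group_b)
print(f"p = {p:.4f}")  # p = 0.0079
print("significantly different" if p < 0.05 else "not significantly different")
```

Here only 2 of the 252 possible relabelings give a gap as large as the real one, so the p-value is well below 0.05.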
There are two types of tests that can be performed with a T-Test, depending on what you want to know:
- Two-tailed test: when you only want to know if your data sets are significantly different from each other
- One-tailed test: when you want to test if one set of data is significantly bigger (or smaller) than your second set of data

*Which test is most appropriate for our hypothesis on guys’ and girls’ heights at ASW?
Online T-Test Calculator (also under the class resources tab on Moodle)
http://studentsttest.com/

Instructions:
- Select “groups have unequal variance”
- Select one or two tails (depends on your hypothesis)
- Type the first data set in a column into the field for group 1, and the second data set into the field for group 2
- Hit calculate to get your p-value
- Note that the average and standard deviation for each data set are also displayed
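If you would rather run the test in Python than in the online calculator, the sketch below follows the same steps. It assumes SciPy is installed (version 1.6 or later for the `alternative` option), and the heights are hypothetical, not our class data.

```python
from scipy import stats

# Hypothetical heights in cm
guys = [172, 175, 178, 180, 185]
girls = [160, 163, 165, 168, 169]

# equal_var=False corresponds to "groups have unequal variance".
two_tailed = stats.ttest_ind(guys, girls, equal_var=False)

# alternative="greater" tests whether the first group is bigger (one tail).
one_tailed = stats.ttest_ind(guys, girls, equal_var=False, alternative="greater")

print(f"two-tailed p = {two_tailed.pvalue:.4f}")
print(f"one-tailed p = {one_tailed.pvalue:.4f}")
print(f"n1 = {len(guys)}, n2 = {len(girls)}")
```

When the difference goes in the direction you predicted, the one-tailed p-value is half the two-tailed one, so pick the number of tails from your hypothesis before looking at the result.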
Write your formal results from our height data here:
p-value: (must be below 0.05 to be significantly different)
n values:
n1 = number of data points in data set one
n2 = number of data points in data set two
Does an unladen European or African swallow have a faster air speed?

Hypothesis:
Average Air Speed of an Unladen Swallow / m s⁻¹ ±0.5
European African
9 14 10 15 11 13 8 18 19 12

Formal Results:
Does your mom have a larger mass before or after McDonald’s opened up next to ASW? Hypothesis:
Formal Results:

Table 1: The mass of your mom / kg before and after McDonald’s was built next to ASW