Describing Quantitative Data Numerically Symmetric Distributions Mean, Variance, and Standard Deviation.


1 Describing Quantitative Data Numerically Symmetric Distributions Mean, Variance, and Standard Deviation

2 Symmetric Distributions
When the distribution of a set of data is at least approximately symmetric, we are free to choose our measure of center for describing a “typical” value.
We can use either the mean or the median.

3 Finding the Mean of a Distribution
The mean of a set of numbers is the arithmetic average. We find this value by adding together all of the values and then dividing by the number of values we added together.
The formula for the mean is:
$$\bar{x} = \frac{\sum x}{n}$$

4 Let’s See the Formula in Action
Consider Babe Ruth’s home run (HR) data:
54 59 35 41 46 25 47 60
54 46 49 46 41 34 22
A check of a dotplot indicates that the distribution is approximately symmetric.
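The dotplot itself did not survive in this transcript. As a rough stand-in, here is a minimal Python sketch that prints a text dotplot of the same 15 values; the function name `text_dotplot` and the layout are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def text_dotplot(values):
    """Return one line per integer from min to max, with one dot per occurrence."""
    counts = Counter(values)  # missing values count as 0
    lines = []
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>3} | " + "." * counts[v])
    return "\n".join(lines)

plot = text_dotplot(home_runs)
print(plot)
```

Reading the output from top to bottom, the dots cluster in the 40s with a few values on either side, which is what "approximately symmetric" refers to here.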

5 So… the first step is to add all the values:
54 + 59 + 35 + 41 + 46 + 25 + 47 + 60 + 54 + 46 + 49 + 46 + 41 + 34 + 22 = 659
Now we need to divide that sum by the number of values we added together:
659 / 15 = 43.9333

6 So the mean of the data is 43.9333.
Now, if we wish to talk about the “typical” number of home runs for Babe Ruth (and we ALWAYS wish to talk about the context of our data!), we could say something like…
On average, Babe Ruth hit approximately 44 home runs per season during the 15 seasons in this data set.
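The add-then-divide procedure can be checked in a few lines of Python. This is only a sketch of the arithmetic described above; the variable names are illustrative.

```python
# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

total = sum(home_runs)   # step 1: add all the values
n = len(home_runs)       # how many values were added
mean = total / n         # step 2: divide the sum by the count

print(total, n, f"{mean:.4f}")  # 659 15 43.9333
```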

7 Remember that although the center is a very important part of our description, we also need to look at the spread of the distribution.
When we use the mean as our measure of center, we use the standard deviation as our measure of spread.
We can think of standard deviation as “an average distance of values from the mean.”
To calculate the standard deviation by hand, we’ll make a data table…

8 Calculating Standard Deviation
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

9
  x      x̄        x − x̄     (x − x̄)²
 54   43.9333    10.0667    101.3384
 59   43.9333    15.0667    227.0054
 35   43.9333    -8.9333     79.8038
 41   43.9333    -2.9333      8.6042
 46   43.9333     2.0667      4.2712
 25   43.9333   -18.9333    358.4698
 47   43.9333     3.0667      9.4046
 60   43.9333    16.0667    258.1388
 54   43.9333    10.0667    101.3384
 46   43.9333     2.0667      4.2712
 49   43.9333     5.0667     25.6714
 46   43.9333     2.0667      4.2712
 41   43.9333    -2.9333      8.6042
 34   43.9333    -9.9333     98.6704
 22   43.9333   -21.9333    481.0696
SUM               0.0005   1770.9333
(The sum of the x − x̄ column is essentially 0.)

10 Creating the Data Table
The first part of our formula indicates that we need to find the distance from the mean for each of our values: (x − x̄).
x − x̄
54 − 43.9333 = 10.0667
15.0667
-8.9333
-2.9333
2.0667
-18.9333
3.0667
16.0667
10.0667
2.0667
5.0667
2.0667
-2.9333
-9.9333
-21.9333
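The same column of distances can be produced programmatically. A minimal Python sketch, assuming the 15 values from the slide; the names `mean` and `deviations` are illustrative.

```python
# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

mean = sum(home_runs) / len(home_runs)

# Signed distance of each value from the mean: x - x̄
deviations = [x - mean for x in home_runs]

for x, d in zip(home_runs, deviations):
    print(f"{x:>3} - {mean:.4f} = {d:>9.4f}")
```

Values above the mean give positive distances and values below it give negative ones.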

11 Now that we know the individual distances for each value, we want to find an “average” of those distances.
To find an average we have to add all the values together.
We find, though, that the sum of those values is always zero.
Why? Because some of the values are above the mean (positive values) and some are below (negative). The positives and negatives cancel each other out.
So what values can we use to find the “average” distance from the mean for a set of values?
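The claim that the distances always sum to zero can also be checked algebraically. A short derivation, writing n for the number of values and using the definition of the mean as the sum divided by n:

```latex
\sum (x - \bar{x}) = \sum x - n\bar{x} = \sum x - n \cdot \frac{\sum x}{n} = 0
```

The 0.0005 that appears in the data table is rounding error from using 43.9333 in place of the exact mean.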

12 One way to get rid of the negative values in these distances is to square each of the values. That’s exactly what our formula tells us to do: (x − x̄)²
Once we have these values, to find the average we must add them together.
(x − x̄)²
101.3384
227.0054
79.8038
8.6042
4.2712
358.4698
9.4046
258.1388
101.3384
4.2712
25.6714
4.2712
8.6042
98.6704
481.0696
SUM = 1770.9333
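A quick Python check of the squared distances and their sum, again as a sketch with illustrative names. Note one assumption: the code squares the unrounded distances, so individual entries can differ from the slide's column in the last decimal places (the slide appears to square the rounded distances), while the total agrees to four decimals.

```python
# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

mean = sum(home_runs) / len(home_runs)

# Squaring removes the sign, so positives and negatives no longer cancel
squared_deviations = [(x - mean) ** 2 for x in home_runs]
sum_of_squares = sum(squared_deviations)

print(f"{sum_of_squares:.4f}")  # 1770.9333
```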

13 The final step in finding an average is to divide by the number of values we added together, but our formula is a little different here.
Instead of dividing by the total number of values we added together, we divide by 1 less than the total.
Why? We have taken a “sample” of the data instead of every piece of data in the population. Since another “sample” would produce a slightly different mean, it would also produce a slightly different standard deviation. Dividing by 1 less than the total number of values added together will give us a slightly larger spread to account for this sampling variation.
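To see how much the choice of divisor matters for this data set, here is a small Python comparison. It relies on the standard library `statistics` module, where `variance` divides by n − 1 and `pvariance` divides by n; the variable names are illustrative.

```python
import statistics

# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

sample_var = statistics.variance(home_runs)       # divides by n - 1 (14)
population_var = statistics.pvariance(home_runs)  # divides by n (15)

print(f"divide by n - 1: {sample_var:.4f}")
print(f"divide by n:     {population_var:.4f}")
```

Dividing by n − 1 always gives the larger of the two results, and the gap shrinks as the number of values grows.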

14 So, we divide the “sum of the squared deviations” by n − 1.
We have now calculated everything inside the square root sign.
This value is an important one. It is called the variance, S²:
S² = 1770.9333 / 14 ≈ 126.4952

15 Since the units of the variance are not the same as our original units, we have one more calculation we must make.
The square root of the variance will restore the original units and give us the “average distance from the mean,” the standard deviation.
S = 11.2470
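The whole hand calculation can be strung together in Python as a check. This is a sketch of the steps described in the slides (sum of squared distances, divide by n − 1, square root), with a cross-check against `statistics.stdev` from the standard library; the variable names are illustrative.

```python
import math
import statistics

# Babe Ruth's home run totals, as listed on the slide
home_runs = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

n = len(home_runs)
mean = sum(home_runs) / n
sum_of_squares = sum((x - mean) ** 2 for x in home_runs)

variance = sum_of_squares / (n - 1)  # units: home runs squared
std_dev = math.sqrt(variance)        # back in the original units: home runs

# Cross-check against the library's sample standard deviation
assert math.isclose(std_dev, statistics.stdev(home_runs))

print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")
```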

16 TI-Tips: Mean, Variance, & Standard Deviation
Find the Mean
Enter the data into a list
2nd STAT
MATH
3:mean(list name)
If you have used a frequency list:
3:mean(data list, freq list)

17 TI-Tips: Find the Variance
Enter the data in a list
2nd STAT
MATH
8:variance(list name)
If you have used a frequency list:
8:variance(data list, freq list)

18 TI-Tips: Find the Standard Deviation
Enter the data in a list
2nd STAT
MATH
7:stdDev(list name)
If you have used a frequency list:
7:stdDev(data list, freq list)
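For readers without a TI calculator, roughly the same three results can be reproduced in Python. This is a sketch, not a description of the calculator's internals: the `values` and `freqs` lists are a hypothetical frequency-list encoding of the Babe Ruth data, and the sample (n − 1) versions of variance and standard deviation are used so the results match the hand calculation.

```python
import statistics

# Hypothetical data list / frequency list encoding of the 15 home run totals
values = [22, 25, 34, 35, 41, 46, 47, 49, 54, 59, 60]
freqs = [1, 1, 1, 1, 2, 3, 1, 1, 2, 1, 1]

# Expand the frequency list back into one entry per observation
expanded = [v for v, f in zip(values, freqs) for _ in range(f)]

mean = statistics.mean(expanded)
variance = statistics.variance(expanded)  # sample variance, divides by n - 1
std_dev = statistics.stdev(expanded)      # sample standard deviation

print(f"mean = {mean:.4f}")
print(f"variance = {variance:.4f}")
print(f"stdDev = {std_dev:.4f}")
```

Expanding the frequencies keeps the code close to the definitions; for very large frequency counts a weighted formula would avoid building the long list.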

19 Additional Resources
Practice of Statistics: Pg 30-34, 43-46
Homework: HW 1.2: 1a-d
