2.4 Measures of Variation Prob & Stats Mrs. O’Toole
Objectives
Students will learn how to:
Find the range of a data set
Find the variance and standard deviation of a population and of a sample
Use the Empirical Rule to interpret standard deviation
Definitions
Range – the difference between the maximum and minimum data entries in the set (the data must be quantitative)
Range = Maximum - Minimum
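A quick way to check a range by hand is to compute it in a few lines of Python. This is only a sketch: the data values and the helper name `data_range` are made up for illustration and do not come from the lesson.

```python
# Hypothetical data set (not from the lesson), in thousands of dollars.
salaries = [41, 38, 39, 45, 47]

def data_range(values):
    """Return the maximum entry minus the minimum entry."""
    return max(values) - min(values)

print(data_range(salaries))  # 47 - 38 = 9
```

Note that the range uses only two entries, the largest and the smallest, so it says nothing about how the entries in between are spread out.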
Definitions
The deviation of an entry, x, in a population data set is the difference between the entry and the mean, μ, of the data set.
Deviation of x = x - μ
Try it yourself
Find the deviation of each starting salary for Corporation B, listed below.
Starting salaries for Corporation B (in thousands of dollars)
Mean starting salary = 415/10 = 41.5
Starting salaries for Corporation B (in thousands of dollars)
Q: What is the sum of the deviations? A: Zero
Q: Will this be the case for all data sets? A: Yes
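Why the answer is "Yes" for every data set can be seen with one line of algebra. The index notation (x_i for the i-th entry) is added here for the derivation and is not used on the slides.

```latex
\sum_{i=1}^{N} (x_i - \mu)
  = \sum_{i=1}^{N} x_i - N\mu
  = N\mu - N\mu
  = 0,
\qquad \text{because } \mu = \frac{1}{N}\sum_{i=1}^{N} x_i .
```

In words: the entries above the mean and the entries below the mean always cancel exactly.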
Note:
Because the sum of the deviations is always zero, it doesn’t make sense to find the average deviation.
Instead, we square each deviation.
The sum of squares is what you get when you add up all the squared deviations.
The mean of the squared deviations is called the population variance.
Definition
The population variance of a data set of N entries is:
σ² = Σ(x - μ)² / N
(σ is the lowercase Greek letter sigma)
Definition
The population standard deviation of a data set of N entries is the square root of the population variance.
σ = √(Σ(x - μ)² / N)
Steps for finding the population standard deviation: 1. Find the mean of the data set. 2. Find the deviation of each entry. 3. Square each deviation. 4. Add to get the sum of squares. 5. Divide by N to get the population variance. 6. Take the square root to get the population standard deviation.
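The six steps above can be sketched in Python, one line per step. The salary values are an assumption: the Corporation B table is not reproduced in these notes, so the list below uses ten values chosen to be consistent with the stated total of 415 and sum of squares of 1102.5. The function name `population_std_dev` is also just illustrative.

```python
import math

# ASSUMED data (the table is not shown in the notes): ten values whose sum is
# 415 and whose squared deviations add up to 1102.5, in thousands of dollars.
salaries = [40, 23, 41, 50, 49, 32, 41, 29, 52, 58]

def population_std_dev(values):
    n = len(values)
    mean = sum(values) / n                     # Step 1: mean of the data set
    deviations = [x - mean for x in values]    # Step 2: deviation of each entry
    squares = [d ** 2 for d in deviations]     # Step 3: square each deviation
    sum_of_squares = sum(squares)              # Step 4: sum of squares
    variance = sum_of_squares / n              # Step 5: divide by N
    return math.sqrt(variance)                 # Step 6: take the square root

sigma = population_std_dev(salaries)
print(round(sigma, 1))  # 10.5
```

Python's built-in `statistics.pstdev` does the same calculation in one call, but writing out the steps makes it easier to check work done by hand.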
Try it yourself
Find the population standard deviation for the salaries in the previous example:
Starting salaries for Corporation B (in thousands of dollars)
Answer: σ = √(1102.5/10) = √110.25 = 10.5
Definition
The sample variance, s², and sample standard deviation, s, of a data set of n entries are:
s² = Σ(x - x̄)² / (n - 1)
s = √(s²) = √(Σ(x - x̄)² / (n - 1))
(x̄ is the sample mean)
Note: the only difference in the calculation is that, for technical reasons, we divide by one less than the number of entries.
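To see the effect of dividing by n - 1 instead of n, here is a small Python sketch of the sample formulas. The salary values are an assumption: the Corporation B table is not reproduced in these notes, so the list uses ten values consistent with the stated total of 415 and sum of squares of 1102.5. The helper name `sample_variance` is illustrative.

```python
import math
import statistics

# ASSUMED data (the table is not shown in the notes), in thousands of dollars.
salaries = [40, 23, 41, 50, 49, 32, 41, 29, 52, 58]

def sample_variance(values):
    n = len(values)
    mean = sum(values) / n                                  # sample mean
    sum_of_squares = sum((x - mean) ** 2 for x in values)   # same as before
    return sum_of_squares / (n - 1)                         # divide by n - 1, not n

s_squared = sample_variance(salaries)
s = math.sqrt(s_squared)
print(s_squared, round(s, 2))

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(s, statistics.stdev(salaries))
```

Because the divisor is smaller, the sample standard deviation always comes out a little larger than the population standard deviation computed from the same numbers.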
Try it yourself
Find the sample standard deviation for the salary data you used in the previous example.
Starting salaries for Corporation B (in thousands of dollars)
Answer: s = √(1102.5/9) = √122.5 ≈ 11.1
Interpreting Standard Deviation
Remember that standard deviation is a measure of the typical amount that a data entry deviates from the mean.
This means the more the entries are spread out, the greater the standard deviation.
See p.87 in your text for examples
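A small comparison can make this concrete. The two data sets below are made up for illustration (they are not the textbook's examples): both have the same mean, but one is much more spread out.

```python
import statistics

# Two hypothetical data sets with the same mean (5) but different spread.
tight = [4, 5, 5, 6]
spread = [1, 3, 7, 9]

for name, data in [("tight", tight), ("spread", spread)]:
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population standard deviation
    print(name, mean, round(sd, 2))
```

The entries in `spread` sit farther from the mean, so its standard deviation is larger even though the means match.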
The Empirical Rule
For data with a symmetric (bell-shaped) distribution:
About 68% of the data fall within one standard deviation of the mean.
About 95% of the data fall within two standard deviations of the mean.
About 99.7% of the data fall within three standard deviations of the mean.
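To apply the rule, compute the three intervals mean ± 1, 2, and 3 standard deviations. The sketch below does this for a hypothetical bell-shaped data set with mean 64 and standard deviation 2.5; those numbers and the function name `empirical_rule_intervals` are assumptions for illustration, and the percentages are only approximate even when the data really are bell-shaped.

```python
def empirical_rule_intervals(mean, std_dev):
    """Map each approximate percentage to its (low, high) interval."""
    percentages = {1: 68.0, 2: 95.0, 3: 99.7}  # k standard deviations -> percent
    return {
        pct: (mean - k * std_dev, mean + k * std_dev)
        for k, pct in percentages.items()
    }

# Hypothetical bell-shaped data: mean 64, standard deviation 2.5.
for pct, (low, high) in empirical_rule_intervals(64, 2.5).items():
    print(f"About {pct}% of the data fall between {low} and {high}")
```

For these numbers the intervals are 61.5 to 66.5, 59 to 69, and 56.5 to 71.5. The rule should not be used for data that are clearly skewed.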
Homework 2.4 p.92 (1-4, 13-18)