Section 3.2 Measures of Dispersion
1. Range
2. Variance
3. Standard deviation
4. Empirical Rule for bell-shaped distributions
5. Chebyshev’s Inequality for any distribution
Range The range of a set of data is the difference between the maximum value and the minimum value. Range = (maximum value) – (minimum value)
EXAMPLE The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Find the range. Range = 43 – 5 = 38 minutes
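The range calculation above can be reproduced with a minimal Python sketch. The names `travel_times` and `data_range` are illustrative choices, not part of the original slides.

```python
# Travel times (in minutes) for the seven employees, copied from the example.
travel_times = [23, 36, 23, 18, 5, 26, 43]


def data_range(values):
    """Return the range: (maximum value) - (minimum value)."""
    return max(values) - min(values)


print(data_range(travel_times))  # 38
```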
Variance The population variance is the sum of squared deviations about the population mean divided by the number of observations in the population, N. That is, it is the mean of the squared deviations about the population mean.
The population variance is symbolically represented by σ² (lowercase Greek sigma squared): σ² = Σ(x_i − μ)² / N.
EXAMPLE Population Variance The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Compute the population variance of these data. Recall that μ = 174/7 ≈ 24.9 minutes.
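x_i     μ        x_i − μ     (x_i − μ)²
23      24.86    −1.86       3.45
36      24.86    11.14       124.16
23      24.86    −1.86       3.45
18      24.86    −6.86       47.02
5       24.86    −19.86      394.31
26      24.86    1.14        1.31
43      24.86    18.14       329.16
(Squared deviations are computed from the unrounded mean.)
Σ(x_i − μ)² ≈ 902.86
σ² = 902.86 / 7 ≈ 129.0 minutes²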
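As a check on the population variance of the travel-time data, here is a minimal Python sketch that follows the definition directly (sum of squared deviations about μ, divided by N). The function name `population_variance` is an illustrative choice.

```python
# Travel times (in minutes) for all seven employees: the whole population.
travel_times = [23, 36, 23, 18, 5, 26, 43]


def population_variance(values):
    """Mean of the squared deviations about the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


sigma_sq = population_variance(travel_times)
print(round(sigma_sq, 1))  # 129.0 (minutes squared)
```

The standard library function `statistics.pvariance` computes the same quantity and can serve as a cross-check.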
The sample variance is computed by determining the sum of squared deviations about the sample mean and then dividing this result by n − 1.
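EXAMPLE Sample Variance For the travel time data, assume we obtained the following simple random sample: 5, 36, 26. Compute the sample variance of the travel time.
Travel Time, x_i    Sample Mean, x̄    Deviation about the Mean, x_i − x̄    Squared Deviation about the Mean, (x_i − x̄)²
5                   22.33             −17.33                               300.44
36                  22.33             13.67                                186.78
26                  22.33             3.67                                 13.44
(Squared deviations are computed from the unrounded mean.)
Σ(x_i − x̄)² ≈ 500.67
s² = 500.67 / (3 − 1) ≈ 250.3 square minutes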
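The sample variance for the sample 5, 36, 26 can be verified with a short Python sketch. It differs from the population formula only in the divisor, n − 1 instead of N. The function name `sample_variance` is an illustrative choice.

```python
# Simple random sample of travel times (in minutes) from the example.
sample = [5, 36, 26]


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


s_sq = sample_variance(sample)
print(round(s_sq, 1))  # 250.3 (square minutes)
```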
Standard Deviation The standard deviation of a set of sample values is a measure of variation of values about the mean.
Population standard deviation: σ = square root of the population variance. Sample standard deviation: s = square root of the sample variance, so that s = √(s²) = √(Σ(x_i − x̄)² / (n − 1)).
EXAMPLE Population Standard Deviation The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Compute the population standard deviation of these data. Recall from the last objective that σ² ≈ 129.0 minutes². Therefore, σ = √129.0 ≈ 11.4 minutes.
EXAMPLE Sample Standard Deviation Recall that the sample data 5, 26, 36 result in a sample variance of 250.3 square minutes. Use this result to determine the sample standard deviation. s = √250.3 ≈ 15.8 minutes.
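Both standard deviations in these examples can be checked with Python's standard library. This is a minimal sketch: `statistics.pstdev` divides by N (population) and `statistics.stdev` divides by n − 1 (sample). The variable names are illustrative.

```python
import statistics

# Population: travel times (in minutes) for all seven employees.
travel_times = [23, 36, 23, 18, 5, 26, 43]
# Sample: three travel times drawn from that population.
sample = [5, 26, 36]

sigma = statistics.pstdev(travel_times)  # square root of the population variance
s = statistics.stdev(sample)             # square root of the sample variance

print(round(sigma, 1))  # 11.4 (minutes)
print(round(s, 1))      # 15.8 (minutes)
```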
EXAMPLE Comparing Standard Deviations [Figure: Wait Time at Wendy’s; Wait Time at McDonald’s]
Determine the standard deviation of the waiting time for Wendy’s and for McDonald’s. Which is the better company in terms of waiting times?
Sample standard deviation for Wendy’s: minutes Sample standard deviation for McDonald’s: minutes
The Empirical Rule for bell-shaped distributions
For many lists of observations, especially if their histogram is bell-shaped:
1. Roughly 68% of the observations lie within 1 standard deviation of the average.
2. Roughly 95% of the observations lie within 2 standard deviations of the average.
[Figure: bell curve with 68% of the area between Ave − s.d. and Ave + s.d., and 95% between Ave − 2 s.d. and Ave + 2 s.d.]
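The 68% and 95% figures can be compared with the areas under an exact normal curve. The sketch below assumes a perfectly normal distribution, which real data only approximate. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4, and 99.7 percent.
    print(k, round(100 * share_within(k), 1))
```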
The Empirical Rule
EXAMPLE Using the Empirical Rule The following data represent the serum HDL cholesterol of the 54 female patients of a family doctor
(a) Compute the population mean and standard deviation.
(b) Draw a histogram to verify that the data are bell-shaped.
(c) Determine the percentage of patients that have serum HDL within 3 standard deviations of the mean according to the Empirical Rule.
(d) Determine the percentage of patients that have serum HDL between 34 and 69.1 according to the Empirical Rule.
(e) Determine the actual percentage of patients that have serum HDL between 34 and 69.1. (Use the raw data directly, not the Empirical Rule, for this question. See how close the Empirical Rule approximation was!)
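(a) Using a TI-83 Plus graphing calculator or Excel, we find the population mean μ and the population standard deviation σ. (b) [Figure: histogram of the serum HDL data]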
(c) According to the Empirical Rule, 99.7% of the patients have serum HDL within 3 standard deviations of the mean.
(d) Since 34.0 is 2 standard deviations below the mean and 69.1 is 1 standard deviation above it, 13.5% + 34% + 34% = 81.5% of patients will have a serum HDL between 34.0 and 69.1 according to the Empirical Rule.
(e) 45 out of the 54, or 83.3%, of the patients have a serum HDL between 34.0 and 69.1.
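The arithmetic in parts (d) and (e) can be sketched in Python by adding up the Empirical Rule's one-standard-deviation bands. The band percentages are the usual textbook values (2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%). The names `BAND_PERCENT` and `empirical_percent` are illustrative and do not appear in the slides.

```python
# Percent of observations in each one-standard-deviation band under the
# Empirical Rule, keyed by the band's lower edge (in standard deviations
# from the mean). For example, key -2 is the band from mean - 2 s.d. to mean - 1 s.d.
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}


def empirical_percent(z_low, z_high):
    """Approximate percent of observations between z_low and z_high s.d. from the mean."""
    return sum(BAND_PERCENT[z] for z in range(z_low, z_high))


approx = empirical_percent(-2, 1)  # 13.5 + 34 + 34
actual = 45 / 54 * 100             # count taken from part (e)

print(approx)            # 81.5
print(round(actual, 1))  # 83.3
```

The two figures differ by under two percentage points, which illustrates how the rule behaves on roughly bell-shaped data.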
Chebyshev’s Inequality: an empirical rule for distributions of any shape
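Chebyshev’s Inequality, as commonly stated, says that for any data set and any k > 1, at least (1 − 1/k²) · 100% of the observations lie within k standard deviations of the mean. A minimal Python sketch of that bound follows; the function name `chebyshev_min_percent` is an illustrative choice.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations of the mean.

    Valid for any distribution shape, but only informative for k > 1.
    """
    if k <= 1:
        raise ValueError("Chebyshev's Inequality requires k > 1")
    return (1 - 1 / k**2) * 100


print(chebyshev_min_percent(2))            # 75.0
print(round(chebyshev_min_percent(3), 1))  # 88.9
```

Because the bound must hold for every distribution, it is much weaker than the Empirical Rule's 95% and 99.7% for bell-shaped data.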
EXAMPLE Using Chebyshev’s Theorem Using the data from the previous example, use Chebyshev’s Theorem to (a) determine the percentage of patients that have serum HDL within 3 standard deviations of the mean. (b) determine the actual percentage of patients that have serum HDL between 34 and