Published by Mae Smith. Modified over 9 years ago.
1
Chapter 2 Descriptive Statistics
Section 2.3 Measures of Variation
Figure 2.31: Repair Times for Personal Computers at Two Service Centers
Figure 2.31 indicates that we need measures of variation to express how the two distributions differ.
2
Measures of Variation
Range: the largest minus the smallest measurement
Population variance $\sigma^2$ (pronounced sigma squared): the average of the squared deviations of all the population measurements from the population mean
Population standard deviation $\sigma$ (pronounced sigma): the square root of the variance
3
The Range
Range = largest measurement - smallest measurement
The range measures the interval spanned by all the data.
Example 2.3: Internists’ Salaries (in thousands of dollars)
127 132 138 141 144 146 152 154 165 171 177 192 241
Range = 241 - 127 = 114 ($114,000)
4
The Variance
Population: $X_1, X_2, \ldots, X_N$, with population variance $\sigma^2$
Sample: $x_1, x_2, \ldots, x_n$, with sample variance $s^2$
5
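The Variance
For a population of size $N$ with mean $\mu$, the population variance $\sigma^2$ is defined as
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
For a sample of size $n$ with mean $\bar{x}$, the sample variance $s^2$ is defined as
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
and is a point estimate for $\sigma^2$.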
6
The Standard Deviation
Population standard deviation, $\sigma$: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation, $s$: $s = \sqrt{s^2}$
7
Example 2.5
Consider the population of profit margins for five of the best big companies in America as rated by Forbes magazine on its website on March 16, 2005. These profit margins are 8%, 10%, 15%, 12% and 5%.
Population mean: $\mu = (8 + 10 + 15 + 12 + 5)/5 = 50/5 = 10$
Population variance: $\sigma^2 = [(8-10)^2 + (10-10)^2 + (15-10)^2 + (12-10)^2 + (5-10)^2]/5 = 58/5 = 11.6$
Population standard deviation: $\sigma = \sqrt{11.6} \approx 3.406$
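The arithmetic in Example 2.5 can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the five margins are treated as a complete population, so the divisor is N rather than N - 1.

```python
import math

# Profit margins (in percent) from Example 2.5, treated as a full population.
margins = [8, 10, 15, 12, 5]

N = len(margins)
mu = sum(margins) / N                               # population mean
sigma_sq = sum((x - mu) ** 2 for x in margins) / N  # divide by N, not N - 1
sigma = math.sqrt(sigma_sq)                         # population standard deviation

print(f"mean = {mu:.1f}, variance = {sigma_sq:.1f}, std dev = {sigma:.3f}")
```

The standard library's statistics.pvariance and statistics.pstdev compute the same population quantities.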
8
Example 2.6 The Car Mileage Case
Sample variance and standard deviation for the first five car mileages from Table 2.1: 30.8, 31.7, 30.1, 31.6, 32.1
$\bar{x} = 156.3/5 = 31.26$
$s^2 = 2.572/4 = 0.643$
$s = \sqrt{0.643} \approx 0.8019$
9
Sample variance and standard deviation for all car mileages from Table 2.1: the point estimate of the variance of all cars is $s^2 = 0.638793$ mpg$^2$, and the point estimate of the standard deviation of all cars is $s = 0.7992$ mpg.
10
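The computational formula for the sample variance:
$$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
Example 2.7 The Payment Time Case
Consider the sample of 65 payment times in Table 2.2. Applying the formula with $n = 65$ gives the sample variance $s^2$ and the sample standard deviation $s$, measured in days.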
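The computational formula is algebraically the same as the definitional one. Table 2.2 is not reproduced in this transcript, so the sketch below uses the five car mileages from Example 2.6 instead; the variable names are illustrative.

```python
import math

# First five car mileages from Example 2.6 (Table 2.2 is not available here).
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]
n = len(mileages)
x_bar = sum(mileages) / n

# Definitional formula: squared deviations from the sample mean, divided by n - 1.
s2_definitional = sum((x - x_bar) ** 2 for x in mileages) / (n - 1)

# Computational formula: needs only the sum and the sum of squares.
sum_x = sum(mileages)
sum_x_sq = sum(x * x for x in mileages)
s2_computational = (sum_x_sq - sum_x ** 2 / n) / (n - 1)

s = math.sqrt(s2_definitional)
print(f"s^2 = {s2_definitional:.3f} (definitional), {s2_computational:.3f} (computational)")
print(f"s = {s:.4f}")
```

Both routes give 0.643, matching Example 2.6. In floating-point work the definitional form is usually the safer choice, because the computational form subtracts two large, nearly equal numbers.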
11
The Empirical Rule for Normal Populations
If a population has mean $\mu$ and standard deviation $\sigma$ and is described by a normal curve (a symmetrical, bell-shaped distribution), then
1. 68.26% of the population measurements lie within one standard deviation of the mean: $[\mu - \sigma, \mu + \sigma]$
2. 95.44% of the population measurements lie within two standard deviations of the mean: $[\mu - 2\sigma, \mu + 2\sigma]$
3. 99.73% of the population measurements lie within three standard deviations of the mean: $[\mu - 3\sigma, \mu + 3\sigma]$
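The three percentages can be reproduced from the normal distribution itself. The sketch below uses the identity that, for a normal population, the fraction within k standard deviations of the mean equals erf(k / sqrt(2)); the function name is illustrative.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * fraction_within(k):.2f}%")
```

Computed this way, the first two values come out as 68.27% and 95.45%. The 68.26% and 95.44% quoted in the rule are what one gets from a normal table rounded to four decimal places.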
12
Empirical Rule (figure: normal curve showing about 68%, 95%, and 99.7% of the measurements within one, two, and three standard deviations of the mean)
13
Tolerance Intervals
An interval that contains a specified percentage of the individual measurements in a population is called a tolerance interval.
The one, two, and three standard deviation intervals around $\mu$ given in (1), (2) and (3) are tolerance intervals containing, respectively, 68.26 percent, 95.44 percent and 99.73 percent of the measurements in a normally distributed population.
The three-sigma interval is often interpreted as a tolerance interval that contains almost all of the measurements in a normally distributed population.
14
Example 2.8 The Car Mileage Case
Using the sample mean $\bar{x}$ (in mpg) and the standard deviation $s = 0.7992$ mpg of the car mileages:
68.26% of all individual cars will have mileages in the range $[\bar{x} - s, \bar{x} + s]$
95.44% of all individual cars will have mileages in the range $[\bar{x} - 2s, \bar{x} + 2s]$
99.73% of all individual cars will have mileages in the range $[\bar{x} - 3s, \bar{x} + 3s]$
15
Example 2.8 The Car Mileage Case
16
Skewness and the Empirical Rule
The Empirical Rule holds for normally distributed populations.
This rule also approximately holds for populations having mound-shaped (single-peaked) distributions that are not very skewed to the right or left.
For example, recall the distribution of the 65 payment times: it indicates that the Empirical Rule holds.
17
Chebyshev’s Theorem
Let $\mu$ and $\sigma$ be a population’s mean and standard deviation. Then, for any value $k > 1$, at least $100(1 - 1/k^2)\%$ of the population measurements lie in the interval $[\mu - k\sigma, \mu + k\sigma]$.
Use only for non-mound distributions:
  Skewed, but not extremely skewed. (If a population is extremely skewed, it is best to measure variation by using percentiles.)
  Bimodal
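As a quick check of the bound, the sketch below evaluates 100(1 - 1/k^2) for a few values of k. The function name is illustrative.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is stated for k > 1")
    return 100 * (1 - 1 / k ** 2)

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2f}%")
```

For k = 2 and k = 3 this gives 75% and 88.89%. Both are well below the Empirical Rule's 95.44% and 99.73%, because Chebyshev's bound assumes nothing about the shape of the distribution.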
18
Empirical Rule and Chebyshev’s Theorem
Interval          Empirical Rule                           Chebyshev’s Theorem
Population        Normal; mound-shaped, not very skewed    Any population, but not extremely skewed
[μ-σ, μ+σ]        About 68%
[μ-2σ, μ+2σ]      About 95%                                At least 75%
[μ-3σ, μ+3σ]      About 99.7%                              At least 88.89%
19
Exercise: Shipping Times (Chebyshev’s Theorem and the Empirical Rule)
A group of 13 students is studying in Istanbul, Turkey, for five weeks. As part of their study of the local economy, they each purchased an Oriental rug and arranged for its shipment back to the United States. The shipping time, in days, for each rug was
31 31 42 39 42 43 34 30 28 36 37 35 40
Estimate the percentage of shipping times that are within two standard deviations of the mean.
Is it likely to take 2 months for a delivery?
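One possible way to work the exercise is sketched below. It rests on two assumptions that the exercise does not state: the 13 times are treated as a sample (divisor n - 1), and "2 months" is taken to be 60 days.

```python
import statistics

shipping_days = [31, 31, 42, 39, 42, 43, 34, 30, 28, 36, 37, 35, 40]

mean = statistics.mean(shipping_days)
s = statistics.stdev(shipping_days)           # sample standard deviation (n - 1)

low, high = mean - 2 * s, mean + 2 * s        # two-standard-deviation interval
share_within = sum(low <= d <= high for d in shipping_days) / len(shipping_days)

chebyshev_floor = 1 - 1 / 2 ** 2              # at least 75% for any distribution
z_two_months = (60 - mean) / s                # assumption: 2 months = 60 days

print(f"mean = {mean}, s = {s:.2f}, interval = [{low:.1f}, {high:.1f}]")
print(f"observed share within 2 s: {share_within:.0%} (Chebyshev: at least {chebyshev_floor:.0%})")
print(f"z score of a 60-day delivery: {z_two_months:.2f}")
```

Under these assumptions the mean is 36 days and s is about 5.02 days. All 13 observed times fall inside the two-standard-deviation interval, and a 60-day delivery lies more than four standard deviations above the mean, so it would be very unusual.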
20
z Scores
For any $x$ in a population or sample, the associated z score is
$$z = \frac{x - \text{mean}}{\text{standard deviation}}$$
The z score is the number of standard deviations that $x$ is from the mean.
A positive z score is for $x$ above (greater than) the mean.
A negative z score is for $x$ below (less than) the mean.
21
Example 2.9
Population of profit margins for five big American companies: 8%, 10%, 15%, 12%, 5%
$\mu = 10\%$, $\sigma = 3.406\%$
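The z scores for Example 2.9 follow directly from the definition. A minimal sketch, with illustrative names and the margins treated as a population:

```python
# Profit margins (in percent) from Example 2.9.
margins = [8, 10, 15, 12, 5]

mu = sum(margins) / len(margins)
sigma = (sum((x - mu) ** 2 for x in margins) / len(margins)) ** 0.5

# z score: number of standard deviations each margin lies from the mean.
z_scores = [(x - mu) / sigma for x in margins]

for x, z in zip(margins, z_scores):
    print(f"margin {x:>2}% -> z = {z:+.3f}")
```

The 10% margin sits exactly at the mean (z = 0), while 15% and 5% lie about 1.47 standard deviations above and below it.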
22
Coefficient of Variation
Measures the size of the standard deviation relative to the size of the mean:
Coefficient of Variation = (standard deviation / mean) × 100%
Used to:
  Compare the relative variability of values about the mean
  Compare the relative variability of populations or samples with different means and different standard deviations
  Measure risk
23
Table: Rates of Return for Assets A and B
Which of these two single assets is better?
Year                                        Asset A (house)    Asset B (stock)
5 years ago                                 11.3%              9.4%
4 years ago                                 12.5               17.1
3 years ago                                 13.0               10.3
2 years ago                                 12.0               11.2
1 year ago                                  12.2               13.5
Total                                       61.0               61.5
Average rate of return (expected return)    12.2%              12.3%
Standard deviation                          0.63               3.08
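The comparison can be made concrete by computing each asset's coefficient of variation. The sketch below assumes the table's standard deviations are sample standard deviations (divisor n - 1), which reproduces the 0.63 for Asset A and gives roughly 3.09 for Asset B. The function name is illustrative.

```python
import statistics

asset_a = [11.3, 12.5, 13.0, 12.0, 12.2]   # house, yearly rates of return in percent
asset_b = [9.4, 17.1, 10.3, 11.2, 13.5]    # stock, yearly rates of return in percent

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (sample version)."""
    return statistics.stdev(values) / statistics.mean(values) * 100

cv_a = coefficient_of_variation(asset_a)
cv_b = coefficient_of_variation(asset_b)
print(f"Asset A: CV = {cv_a:.2f}%")
print(f"Asset B: CV = {cv_b:.2f}%")
```

The average returns are nearly equal, but Asset B's coefficient of variation is about five times that of Asset A. On this measure, Asset B carries much more risk per unit of expected return.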
24
Section 2.4 Percentiles, Quartiles and Box-and-Whiskers Display
For a set of measurements arranged in increasing order, the $p$th percentile is a value such that $p$ percent of the measurements fall at or below the value and $(100 - p)$ percent of the measurements fall at or above the value.
The first quartile $Q_1$ is the 25th percentile.
The second quartile (or median) $M_d$ is the 50th percentile.
The third quartile $Q_3$ is the 75th percentile.
The interquartile range $IQR$ is $Q_3 - Q_1$.
25
One Procedure for Calculating Percentiles
1. Arrange the measurements in increasing order.
2. Calculate the index $i = (p/100)\,n$, where $n$ is the number of measurements.
   A. If $i$ is not an integer, the next integer greater than $i$ denotes the position of the $p$th percentile in the ordered arrangement.
   B. If $i$ is an integer, then the $p$th percentile is the average of the measurements in positions $i$ and $i + 1$ in the ordered arrangement.
26
Example 2.10 DVD Recorder Satisfaction
20 customer satisfaction ratings:
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10
$Q_1 = (7 + 8)/2 = 7.5$
$M_d = (8 + 8)/2 = 8$
$Q_3 = (9 + 9)/2 = 9$
$IQR = Q_3 - Q_1 = 9 - 7.5 = 1.5$
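The percentile procedure can be sketched in Python and checked against Example 2.10. The sketch assumes the index is i = (p/100)n, which is consistent with the quartiles in the example. The function name is illustrative, and other textbooks and libraries use slightly different percentile conventions.

```python
import math

def percentile(data, p):
    """p-th percentile (0 < p < 100) using the index rule i = (p / 100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)            # step 1: arrange in increasing order
    n = len(ordered)
    i = p * n / 100                   # step 2: calculate the index
    if i != int(i):
        position = math.ceil(i)       # A: next integer greater than i (1-based)
        return ordered[position - 1]
    i = int(i)
    return (ordered[i - 1] + ordered[i]) / 2   # B: average of positions i and i + 1

ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]
q1, median, q3 = (percentile(ratings, p) for p in (25, 50, 75))
print(f"Q1 = {q1}, Md = {median}, Q3 = {q3}, IQR = {q3 - q1}")
```

With n = 20 the indices for the three quartiles are 5, 10 and 15, all integers, so each quartile is the average of two adjacent ordered ratings.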
27
Five-Number Summary
1. Smallest value
2. First quartile
3. Median
4. Third quartile
5. Largest value
28
Using stem-and-leaf displays to find percentiles
(a) The 75th percentile of the 65 payment times, and a five-number summary
29
(b) The 5th percentile of the 60 bottle design ratings, and a five-number summary
30
Box-and-Whiskers Plots
The box plots the:
  first quartile, $Q_1$
  median, $M_d$
  third quartile, $Q_3$
  inner fences, located $1.5 \times IQR$ away from the quartiles: $Q_1 - (1.5 \times IQR)$ and $Q_3 + (1.5 \times IQR)$
  outer fences, located $3 \times IQR$ away from the quartiles: $Q_1 - (3 \times IQR)$ and $Q_3 + (3 \times IQR)$
31
The “whiskers” are dashed lines that plot the range of the data:
  A dashed line drawn from the box below $Q_1$ down to the smallest measurement
  Another dashed line drawn from the box above $Q_3$ up to the largest measurement
Example: 20 customer satisfaction ratings:
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10
32
Outliers
Outliers are measurements that are very different from most of the other measurements, because they are either very much larger or very much smaller than most of the other measurements.
Outliers lie beyond the fences of the box-and-whiskers plot:
  Measurements between the inner and outer fences are mild outliers
  Measurements beyond the outer fences are severe outliers
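The fences and outlier classes for the 20 satisfaction ratings can be sketched as follows. The quartiles are taken from Example 2.10. The handling of a measurement that falls exactly on a fence is an assumption made here (it is not counted as beyond that fence), and the names are illustrative.

```python
ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]

q1, q3 = 7.5, 9.0          # quartiles from Example 2.10
iqr = q3 - q1

inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # inner fences
outer = (q1 - 3 * iqr, q3 + 3 * iqr)       # outer fences

def classify(x):
    """Label a measurement relative to the fences (on-fence values are not 'beyond')."""
    if x < outer[0] or x > outer[1]:
        return "severe"
    if x < inner[0] or x > inner[1]:
        return "mild"
    return "not an outlier"

mild = [x for x in ratings if classify(x) == "mild"]
severe = [x for x in ratings if classify(x) == "severe"]
print(f"inner fences: {inner}, outer fences: {outer}")
print(f"mild outliers: {mild}, severe outliers: {severe}")
```

Under these assumptions the inner fences are 5.25 and 11.25 and the outer fences are 3.0 and 13.5. The ratings 3, 5 and 5 are mild outliers and the rating 1 is a severe outlier.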
34
Section 2.5 Describing Qualitative Data
Pie charts of the proportion (as percent) of all cars sold in the United States by different manufacturers, 1970 versus 1997
35
Population and Sample Proportions
Population: $X_1, X_2, \ldots, X_N$, with population proportion $p$
Sample: $x_1, x_2, \ldots, x_n$, with sample proportion $\hat{p}$
$\hat{p}$ is the point estimate of $p$.
36
Example 2.11 The Marketing Ethics Case
117 out of 205 marketing researchers disapproved of the action taken in a hypothetical scenario.
$X = 117$, number of researchers who disapprove
$n = 205$, number of researchers surveyed
Sample proportion: $\hat{p} = X/n = 117/205 \approx 0.57$
37
Bar Chart: Percentage of Automobiles Sold by Manufacturer, 1970 versus 1997
38
Pie Chart: Percentage of Automobiles Sold by Manufacturer, 1997
39
A Bar Chart of U.S. Automobile Sales in 1997
40
Misleading Graphs and Charts: Scale Break
Break the vertical scale to exaggerate the effect.
Figure: Mean Salaries at a Major University, 2002 - 2005
41
Misleading Graphs and Charts: Scale Effects
Compress or stretch the vertical axis to exaggerate or minimize the effect.
Figure: Mean Salary Increases at a Major University, 2002 - 2005
42
Weighted Means
Sometimes, some measurements are more important than others.
Assign numerical “weights” to the data; the weights measure the relative importance of each value.
Calculate the weighted mean as
$$\text{weighted mean} = \frac{\sum_i w_i x_i}{\sum_i w_i}$$
where $w_i$ is the weight assigned to the $i$th measurement $x_i$.
43
Example 2.12
June 2001 unemployment rates in the U.S. by region
We want the mean unemployment rate for the U.S.
44
Calculate it as a weighted mean, so that the bigger the region, the more heavily it counts in the mean.
The data values are the regional unemployment rates.
The weights are the sizes of the regional labor forces.
Note that the unweighted mean is 4.55%, which underestimates the true rate by 0.03 percentage points. That is, 0.0003 × 144.7 million = 43,410 workers.
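A weighted mean is straightforward to compute. The regional figures of Example 2.12 are not reproduced in this transcript, so the sketch below uses made-up rates and labor-force sizes purely for illustration. They are not the June 2001 data, and the function name is also illustrative.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical illustration values, NOT the data of Example 2.12.
rates = [4.0, 5.0, 6.0]            # unemployment rate per region, in percent
labor_force = [10.0, 30.0, 60.0]   # labor force per region, in millions

unweighted = sum(rates) / len(rates)
weighted = weighted_mean(rates, labor_force)
print(f"unweighted mean = {unweighted:.2f}%, weighted mean = {weighted:.2f}%")
```

In this made-up case the largest region also has the highest rate, so the unweighted mean (5.0%) understates the weighted mean (5.5%). Example 2.12 describes the same kind of gap.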