Copyright © 2015, 2011, 2008 Pearson Education, Inc. Chapter 6, Unit A, Slide 1 Putting Statistics to Work 6
Unit 6A: Characterizing Data
Definition
The distribution of a variable (or data set) refers to the way its values are spread over all possible values. A distribution can be shown visually with a table or graph.
Measures of Center in a Distribution
The mean is what we most commonly call the average value. It is defined as follows:
mean = (sum of all values) / (total number of values)
The median is the middle value in the sorted data set (or halfway between the two middle values if the number of values is even).
The mode is the most common value (or group of values) in a distribution.
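The three definitions above can be sketched in Python. This is a minimal illustration and not part of the original slides; the function names and the sample list are assumptions chosen for the example.

```python
from collections import Counter


def mean(values):
    """Sum of all values divided by the total number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data, or halfway between the two
    middle values when the number of values is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """Most common value, returned as a list so that a tie
    (a group of equally common values) can be reported."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


sample = [2, 3, 3, 5, 7, 10]  # hypothetical data, not from the slides
print(mean(sample), median(sample), modes(sample))
```

Returning a list from `modes` is a design choice made here so that a distribution with more than one most common value is handled without special cases.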
Example
Eight grocery stores sell the PR energy bar for the following prices:
$1.09 $1.29 $1.29 $1.35 $1.39 $1.49 $1.59 $1.79
Find the mean, median, and mode for these prices.
Solution
The mean is the sum of the prices divided by the number of prices:
mean = ($1.09 + $1.29 + $1.29 + $1.35 + $1.39 + $1.49 + $1.59 + $1.79) / 8 = $11.28 / 8 = $1.41
Example (cont)
Median: sort the data in ascending order:
$1.09 $1.29 $1.29 $1.35 $1.39 $1.49 $1.59 $1.79
Because there are eight prices (an even number), there are two values in the middle of the list, $1.35 and $1.39, with three values below them and three values above them. The median lies halfway between these two values, which we calculate by adding them and dividing by 2:
median = ($1.35 + $1.39) / 2 = $1.37
The mode is $1.29, the only price that occurs more than once.
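The energy bar results can be checked with Python's standard `statistics` module. This is a sketch only: the variable names are assumptions, and the prices are stored as whole cents (an assumption made here to avoid floating-point rounding).

```python
import statistics

# Prices from the example, converted to integer cents
prices_cents = [109, 129, 129, 135, 139, 149, 159, 179]

mean_cents = sum(prices_cents) / len(prices_cents)
median_cents = statistics.median(prices_cents)  # averages the two middle values
mode_cents = statistics.mode(prices_cents)      # most common value

print(f"mean   = ${mean_cents / 100:.2f}")
print(f"median = ${median_cents / 100:.2f}")
print(f"mode   = ${mode_cents / 100:.2f}")
```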
Effects of Outliers
An outlier is a data value that is much higher or much lower than almost all other values.
Consider the following data set of contract offers:
$0 $0 $0 $0 $10,000,000
The mean contract offer is
mean = ($0 + $0 + $0 + $0 + $10,000,000) / 5 = $2,000,000
As this example shows, outliers can pull the mean upward (or downward). The median and mode of these data are both $0, so they are not affected by the outlier.
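A short Python sketch makes the effect of the outlier concrete by computing each measure with and without the $10,000,000 offer. The variable names are assumptions introduced for illustration.

```python
import statistics

offers = [0, 0, 0, 0, 10_000_000]  # contract offers from the slide
without_outlier = offers[:-1]      # the same data with the outlier dropped

mean_with = statistics.mean(offers)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(offers)
median_without = statistics.median(without_outlier)
mode_with = statistics.mode(offers)

print("mean with outlier:   ", mean_with)
print("mean without outlier:", mean_without)
print("median with outlier: ", median_with)
print("median without:      ", median_without)
print("mode with outlier:   ", mode_with)
```

Only the mean changes when the outlier is removed; the median stays at 0 in both cases.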
Example
A newspaper surveys wages for assembly workers in regional high-tech companies and reports an average of $22 per hour. The workers at one large firm immediately request a pay raise, claiming that they work as hard as employees at other companies but their average wage is only $19. The management rejects their request, telling them that they are overpaid because their average wage, in fact, is $23. Can both sides be right? Explain.
Example (cont)
Solution
Both sides can be right if they are using different definitions of average. In this case, the workers may be using the median while management uses the mean. For example, imagine that there are only five workers at the firm and their wages are $19, $19, $19, $19, and $39. The median of these five wages is $19 (as the workers claimed), but the mean is $23 (as management claimed).
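The five-worker scenario can be verified directly. In this sketch the variable names are assumptions; the wages are the ones imagined in the solution.

```python
import statistics

wages = [19, 19, 19, 19, 39]  # hourly wages from the solution, in dollars

workers_average = statistics.median(wages)   # the "average" the workers may use
management_average = statistics.mean(wages)  # the "average" management may use

print("median wage:", workers_average)
print("mean wage:  ", management_average)
```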
Shapes of Distributions
Figure: two single-peaked (unimodal) distributions and a double-peaked (bimodal) distribution.
Symmetry
A distribution is symmetric if its left half is a mirror image of its right half. A distribution that is not symmetric must have values that tend to be more spread out on one side than the other. In this case we say the distribution is skewed.
Skewness
A distribution is left-skewed if its values are more spread out on the left side.
A distribution is right-skewed if its values are more spread out on the right side.
Definition: Symmetry and Skewness
A single-peaked distribution is symmetric if its left half is a mirror image of its right half.
A single-peaked distribution is left-skewed if its values are more spread out on the left side of the mode.
A single-peaked distribution is right-skewed if its values are more spread out on the right side of the mode.
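The slides define skewness visually. One common numerical stand-in, which is not taken from the slides, is the moment-based skewness coefficient: it is negative when the longer tail is on the left, positive when it is on the right, and zero for a perfectly symmetric data set. The sketch below assumes that measure; the function names, the `tolerance` cutoff, and all three data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness coefficient: third central moment divided
    by the 1.5 power of the second. Assumes the values are not all equal."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m3 = sum((x - center) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def describe_shape(values, tolerance=0.1):
    """Label a data set using the sign of the coefficient.
    The tolerance is an arbitrary cutoff chosen for this sketch."""
    g = skewness(values)
    if g > tolerance:
        return "right-skewed"
    if g < -tolerance:
        return "left-skewed"
    return "roughly symmetric"


# Hypothetical data sets, chosen only to illustrate the three cases
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]
left_tail = [0, 6, 7, 7, 7, 8, 8, 9]

for data in (symmetric, right_tail, left_tail):
    print(describe_shape(data))
```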
Example
For each of the following situations, state whether you expect the distribution to be symmetric, left-skewed, or right-skewed. Explain.
a. Heights of a sample of 100 women
b. Number of books read during the school year by fifth graders
c. Speeds of cars on a road where a visible patrol car is using radar to detect speeders
Example (cont)
Solution
a. The distribution of heights of women is symmetric, because roughly equal numbers of women are shorter and taller than the mean and extremes of height are rare on either side of the mean.
b. The distribution of the number of books read is right-skewed. Most fifth-grade children read a moderate number of books during the school year, but a few voracious readers will read far more than most other students. These students will therefore be outliers with high values for the number of books read, creating a tail on the right side of the distribution.
Example (cont)
Solution
c. Drivers usually slow down when they are aware of a patrol car looking for speeders. Few if any drivers will be exceeding the speed limit, but some drivers slow to well below the speed limit. The distribution of speeds is therefore left-skewed, with a mode near the speed limit but a few cars going well below the speed limit.
Variation
Variation describes how widely data values are spread out about the center of a distribution.
Figure: three distributions with increasing variation from left to right.
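This unit describes variation only qualitatively. As a rough sketch, two common numerical measures of spread, the range and the standard deviation, can be compared across data sets that share the same center. Both measures and all three data sets below are assumptions introduced for illustration and do not come from the slides.

```python
import statistics

# Hypothetical data sets with the same mean (50) and increasing spread
low_variation = [49, 50, 50, 50, 51]
medium_variation = [40, 45, 50, 55, 60]
high_variation = [10, 30, 50, 70, 90]


def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


for name, data in [
    ("low", low_variation),
    ("medium", medium_variation),
    ("high", high_variation),
]:
    spread = statistics.pstdev(data)  # population standard deviation
    print(f"{name:>6}: range = {value_range(data)}, std dev = {spread:.2f}")
```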
Example
How would you expect the symmetry and variation to differ between times in the Olympic marathon and times in the New York marathon? Explain.
Solution
The Olympic marathon invites only elite runners, whose times are likely to be clustered not far above world record times. The New York marathon allows runners of all abilities, whose times are spread over a very wide range. Therefore, the variation among the times should be greater in the New York marathon than in the Olympic marathon.