AP Statistics Chapter 4 Part 2 Displaying and Summarizing Quantitative Data
Learning Goals 1.Know how to display the distribution of a quantitative variable with a histogram, a stem-and-leaf display, or a dotplot. 2.Know how to display the relative position of a quantitative variable with a Cumulative Frequency Curve and how to analyze the Cumulative Frequency Curve. 3.Be able to describe the distribution of a quantitative variable in terms of its shape. 4.Be able to describe any anomalies or extraordinary features revealed by the display of a variable.
Learning Goals 5.Be able to determine the shape of the distribution of a variable by knowing something about the data. 6.Know the basic properties and how to compute the mean and median of a set of data. 7.Understand the properties of a skewed distribution. 8.Know the basic properties and how to compute the standard deviation and IQR of a set of data.
Learning Goals 9.Understand which measures of center and spread are resistant and which are not. 10.Be able to select a suitable measure of center and a suitable measure of spread for a variable based on information about its distribution. 11.Be able to describe the distribution of a quantitative variable in terms of its shape, center, and spread.
DOTPLOTS Quantitative Data
Learning Goal 1: Dotplots What is a dot plot? A dot plot is a plot that displays a dot for each value in a data set along a number line. If there are multiple occurrences of a specific value, then the dots will be stacked vertically.
Learning Goal 1: Dotplots A dotplot is a simple display. It just places a dot along an axis for each case in the data. The dotplot to the right shows Kentucky Derby winning times, plotting each race as its own dot. You may see a dotplot displayed horizontally or vertically.
Learning Goal 1: Dotplots To construct a dot plot: 1.Draw a horizontal line. 2.Label it with the name of the variable. 3.Mark regular values of the variable (scale) on it. 4.For each observation, place a dot above its value on the number line. [Figure: Sodium in Cereals]
Learning Goal 1: Dotplots - Example: The following data shows the length of 50 movies in minutes. Construct a dot plot for the data. 64, 64, 69, 70, 71, 71, 71, 72, 73, 73, 74, 74, 74, 74, 75, 75, 75, 75, 75, 75, 76, 76, 76, 77, 77, 78, 78, 79, 79, 80, 80, 81, 81, 81, 82, 82, 82, 83, 83, 83, 84, 86, 88, 89, 89, 90, 90, 92, 94, 120. Figure 2-5 Length of 50 Movies
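The construction steps can be tried on the 50 movie lengths with a short Python sketch. It prints a text dotplot turned on its side (values run down the page, dots stack to the right), which is one of the two orientations mentioned earlier. The function name `text_dotplot` and the row layout are illustrative choices, not part of the original slides.

```python
from collections import Counter

# Movie lengths in minutes, copied from the example above.
lengths = [64, 64, 69, 70, 71, 71, 71, 72, 73, 73, 74, 74, 74, 74, 75, 75,
           75, 75, 75, 75, 76, 76, 76, 77, 77, 78, 78, 79, 79, 80, 80, 81,
           81, 81, 82, 82, 82, 83, 83, 83, 84, 86, 88, 89, 89, 90, 90, 92,
           94, 120]

def text_dotplot(data):
    """Return one row per whole-number value on the scale, one dot per observation."""
    counts = Counter(data)
    # Walk the full scale (not just observed values) so gaps stay visible.
    return [f"{value:>4} | {'.' * counts[value]}"
            for value in range(min(data), max(data) + 1)]

dotplot_rows = text_dotplot(lengths)
print("\n".join(dotplot_rows))
```

Because every value from 64 to 120 gets a row, the long empty stretch between 94 and 120 shows up as a visible gap.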
Learning Goal 1: Dotplots – Frequency Table Data The following frequency distribution shows the number of defectives observed by a quality control officer over a 30 day period. Construct a dot plot for the data.
Learning Goal 1: Dotplots – Solution
Learning Goal 1: Dotplots – Your Turn One of Professor Weiss’s sons wanted to add a new DVD player to his home theater system. He used the Internet to shop and went to pricewatch.com. There he found 16 quotes on different brands and styles of DVD players. Construct a dotplot for these data.
Learning Goal 1: Think Before You Draw Remember the “Make a picture” rule? Now that we have options for data displays, you need to Think carefully about which type of display to make. Before making a stem-and-leaf display, a histogram, or a dotplot, check the Quantitative Data Condition: The data are values of a quantitative variable whose units are known.
Learning Goal 2 Know how to display the relative position of a quantitative variable with a Cumulative Frequency Curve and how to analyze the Cumulative Frequency Curve.
OGIVE - CUMULATIVE FREQUENCY CURVE Quantitative Data
Learning Goal 2: Cumulative Frequency and the Ogive A histogram displays the distribution of a quantitative variable, but it tells us little about the relative standing (percentile, quartile, etc.) of an individual observation. For this information, we use a Cumulative Frequency graph, called an Ogive (pronounced O-JIVE).
Learning Goal 2: Measures of Relative Standing How many measurements lie below the measurement of interest? This is measured by the pth percentile. [Figure: a distribution with the pth percentile marked at x; p% of the measurements lie below x and (100 − p)% lie above.]
Learning Goal 2: Percentile The pth percentile is a value such that p percent of the observations fall below or at that value.
Learning Goal 2: Special Percentiles – Deciles and Quartiles Deciles and quartiles are special percentiles. Deciles divide an ordered data set into 10 equal parts. Quartiles divide the ordered data set into 4 equal parts. We usually denote the deciles by D1, D2, D3, …, D9. We usually denote the quartiles by Q1, Q2, and Q3.
Learning Goal 2: Special Percentiles – Deciles and Quartiles There are 9 deciles and 3 quartiles. Q1 = first quartile = P25. Q2 = second quartile = P50. Q3 = third quartile = P75. D1 = first decile = P10. D2 = second decile = P20. … D9 = ninth decile = P90.
Learning Goal 2: Percentile - Examples 90% of all men (16 and older) earn more than $319 per week (Bureau of Labor Statistics), so 10% earn $319 or less and $319 is the 10th percentile. 50th percentile = Median. 25th percentile = Lower Quartile (Q1). 75th percentile = Upper Quartile (Q3).
Learning Goal 2: Calculating Percentile The percentile corresponding to a given data value, say x, in a set is obtained by using the following formula: Percentile = (number of values at or below x) / (total number of values in the data set) × 100.
Learning Goal 2: Calculating Percentile - Example Example: The shoe sizes, in whole numbers, for a sample of 12 male students in a statistics class were as follows: 13, 11, 10, 13, 11, 10, 8, 12, 9, 9, 8, and 9. What is the percentile rank for a shoe size of 12?
Learning Goal 2: Calculating Percentile - Solution Solution: First, we need to arrange the values from smallest to largest. The ordered array is given below: 8, 8, 9, 9, 9, 10, 10, 11, 11, 12, 13, 13. Observe that the number of values at or below the value of 12 is 10.
Learning Goal 2: Calculating Percentile - Solution Solution (continued): The total number of values in the data set is 12. Thus, using the formula, the corresponding percentile is (10/12) × 100 ≈ 83.3. The value of 12 corresponds to approximately the 83rd percentile.
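The shoe-size calculation can be checked with a few lines of Python. This is a minimal sketch of the "count the values at or below x, divide by the total" rule used in the solution; the function name `percentile_rank` is an illustrative choice.

```python
def percentile_rank(data, x):
    """Percent of the values in data that are at or below x."""
    at_or_below = sum(1 for value in data if value <= x)
    return 100 * at_or_below / len(data)

# Shoe sizes of the 12 students from the example above.
shoe_sizes = [13, 11, 10, 13, 11, 10, 8, 12, 9, 9, 8, 9]

rank_of_12 = percentile_rank(shoe_sizes, 12)
print(round(rank_of_12, 1))  # 83.3
```

The data do not need to be sorted first for this count; ordering them, as the solution does, simply makes the count easier to do by hand.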
Learning Goal 2: Calculating Percentile - Example Example: The data given below represent the 19 countries with the largest numbers of total Olympic medals – excluding the United States, which had 101 medals – for the 1996 Atlanta games. Find the 65th percentile for the data set. 63, 65, 50, 37, 35, 41, 25, 23, 27, 21, 17, 17, 20, 19, 22, 15, 15, 15, 15.
Learning Goal 2: Calculating Percentile - Solution Solution: First, we need to arrange the data set in order. The ordered set is: 15, 15, 15, 15, 17, 17, 19, 20, 21, 22, 23, 25, 27, 35, 37, 41, 50, 63, 65. Next, compute the position of the percentile. Here n = 19 and k = 65. Thus, c = (19 × 65)/100 = 12.35. Because c is not a whole number, we round up to 13.
Learning Goal 2: Calculating Percentile - Solution Solution (continued): Thus, the 13th value in the ordered data set will correspond to the 65th percentile. That is, P65 = 27.
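The position rule from this solution (c = n × k / 100, rounded up when c is not a whole number) can be sketched in Python as below. The slides only cover the case where c is not whole; the handling of a whole-number c (averaging the c-th and (c+1)-th values) is an assumption based on a common textbook convention, and the function name `percentile_value` is illustrative.

```python
def percentile_value(data, k):
    """Return the k-th percentile (k a whole number from 1 to 99) by the position rule."""
    if not 0 < k < 100:
        raise ValueError("k must be between 1 and 99")
    ordered = sorted(data)
    n = len(ordered)
    whole, remainder = divmod(n * k, 100)  # c = n*k/100, kept exact with integers
    if remainder != 0:
        position = whole + 1               # c is not whole: round up
        return ordered[position - 1]       # positions count from 1
    # Assumption (not stated in the slides): when c is whole,
    # average the c-th and (c+1)-th ordered values.
    return (ordered[whole - 1] + ordered[whole]) / 2

# Total medals for the 19 countries in the example above.
medals = [63, 65, 50, 37, 35, 41, 25, 23, 27, 21, 17, 17, 20, 19, 22,
          15, 15, 15, 15]

p65 = percentile_value(medals, 65)
print(p65)  # 27
```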
Learning Goal 2: Cumulative Frequency What is a cumulative frequency for a class? The cumulative frequency for a specific class in a frequency table is the sum of the frequencies for all values at or below the given class.
Learning Goal 2: Cumulative Frequency Tables The cumulative frequency for a class is the sum of all the frequencies up to and including that class.
Learning Goal 2: Cumulative Frequency Tables Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. [Table columns: Class, Frequency, Percentage, Cumulative Frequency, Cumulative Percentage, with a Total row.]
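A cumulative frequency table for the ordered array above can be built with the Python sketch below. The class limits are not given here, so the classes used (width 10, starting at 10, each running from its lower bound up to but not including the next) are an assumption, as are the function and key names.

```python
# Ordered array from the example above.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43,
        44, 46, 53, 58]

# Assumption: classes "10 to under 20", "20 to under 30", ..., "50 to under 60".
class_lower_bounds = [10, 20, 30, 40, 50]
class_width = 10

def cumulative_table(values, lower_bounds, width):
    """Return one row per class with frequency, percentage, and running totals."""
    rows = []
    running_total = 0
    for lower in lower_bounds:
        upper = lower + width
        frequency = sum(1 for v in values if lower <= v < upper)
        running_total += frequency
        rows.append({
            "class": f"{lower} to under {upper}",
            "freq": frequency,
            "pct": 100 * frequency / len(values),
            "cum_freq": running_total,
            "cum_pct": 100 * running_total / len(values),
        })
    return rows

table = cumulative_table(data, class_lower_bounds, class_width)
for row in table:
    print(row)
```

The last cumulative frequency always equals the number of observations (here 20), which is a quick check that no value fell outside the classes.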
Learning Goal 2: Cumulative Frequency Curve - Ogive A line graph that depicts cumulative frequencies. Used to Find Quartiles and Percentiles.
Learning Goal 2: Constructing an Ogive 1.Make a frequency table and add a cumulative frequency column. 2.To fill in the cumulative frequency column, add the counts in the frequency column that fall in or below the current class interval. 3.Label and scale the axes and title the graph. The horizontal axis shows the classes and the vertical axis shows the cumulative frequency or relative cumulative frequency. 4.Begin the ogive at zero on the vertical axis and at the lower boundary of the first class on the horizontal axis. Then plot the upper class boundary against the cumulative frequency for each class.
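The construction steps can be sketched in Python as follows. The class boundaries and frequencies are assumptions: they come from grouping the 20-value ordered array of the cumulative frequency table example into classes of width 10 starting at 10. The sketch joins the points with straight segments, and `value_at_cumulative` reads the graph "sideways" by linear interpolation; both function names are illustrative.

```python
from itertools import accumulate

# Assumed class boundaries and frequencies (see the lead-in).
boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]

def ogive_points(boundaries, frequencies):
    """Pair each boundary with its cumulative frequency, starting at zero."""
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, cumulative))

def value_at_cumulative(points, target):
    """Estimate the data value at which the cumulative frequency reaches target."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (x1 - x0) * (target - c0) / (c1 - c0)
    raise ValueError("target is outside the range of the ogive")

points = ogive_points(boundaries, frequencies)
print(points)

# Median estimate: the value reached at half of the 20 observations.
median_estimate = value_at_cumulative(points, 10)
print(median_estimate)  # 32.0
```

The same lookup with a target of 25% or 75% of the total gives estimates of the quartiles.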
Learning Goal 2: Ogive - Example
Learning Goal 2: Cumulative Frequency Curve – Example The frequencies of the scores of 80 students in a test are given in the following table. Complete the corresponding cumulative frequency table. A suitable table is as follows:
Learning Goal 2: Cumulative Frequency Curve – Example The information provided by a cumulative frequency table can be displayed in graphical form by plotting the cumulative frequencies given in the table against the upper class boundaries and joining these points with a smooth curve. Construct the Cumulative Frequency Curve. The cumulative frequency curve corresponding to the data is as follows:
Learning Goal 2: Cumulative Frequency Curve – Class Problem The results obtained by 200 students in a mathematics test are given in the following table. Draw a cumulative frequency curve and use it to estimate: a)The median mark. b)The number of students who scored less than 22 marks. c)The pass mark if 120 students passed the test. d)The minimum mark required to obtain an A grade if 10% of the students received an A grade.
Learning Goal 3 Be able to describe the distribution of a quantitative variable in terms of its shape.
Learning Goal 3: What is the Shape of the Distribution? 1.Does the histogram have a single central peak or several separated peaks? 2.Is the histogram symmetric? 3.Do any unusual features stick out? In any graph, look for the overall pattern and any striking deviations from that pattern.
Learning Goal 3: Shape, Center, and Spread When describing a distribution, make sure to always talk about three things: shape, center, and spread… Actually you should comment on four things when describing a distribution. The three above and any deviations from the shape. These deviations from the shape are called ‘outliers’ and will be discussed later.
Learning Goal 3: Shape - Peaks Does the histogram have a single central peak or several separated peaks? Peaks in a histogram are also called modes. A histogram with one main peak is dubbed unimodal; histograms with two peaks are bimodal; histograms with three or more peaks are called multimodal.
Learning Goal 3: Shape: Unimodal - Example Unimodal – single peak.
Learning Goal 3: Shape: Bimodal - Example Bimodal - two peaks.
Learning Goal 3: Shape: Multimodal - Example Multimodal – three or more peaks.
Learning Goal 3: Shape: Bimodal or Multimodal A bimodal or multimodal distribution might indicate that the data are from two or more different populations.
Learning Goal 3: Shape: Uniform A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform or rectangular. In a uniform distribution, every class has about the same frequency, so there is no mode. A uniform distribution is symmetrical with the added property that the bars are the same height.
Learning Goal 3: Shape: Uniform - Example Uniform – no mode, symmetrical.
Learning Goal 3: Shape: Modal Comparison
Learning Goal 3: Shape: Symmetrical In a symmetrical distribution, the data values are evenly distributed on both sides of the mean. If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.
Learning Goal 3: Shape: Symmetrical - Example Symmetrical – The distribution’s shape is generally the same if folded down the middle.
Learning Goal 3: Shape: Skewed The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail. In the figure below, the histogram on the left is said to be skewed left, while the histogram on the right is said to be skewed right.
Learning Goal 3: Shape: Skewed Right - Example In a skewed right distribution, most of the data values fall to the left, and the “tail” of the distribution is to the right.
Learning Goal 3: Shape: Skewed Left - Example In a skewed left distribution, most of the data values fall to the right, and the “tail” of the distribution is to the left.
Learning Goal 3: Shape: Skewed - Comparison A distribution is skewed to the right if the right tail is longer than the left tail. A distribution is skewed to the left if the left tail is longer than the right tail.
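The tail comparison can be illustrated with a rough Python check that measures how far each end of the data lies from the median. This is only a crude sketch, not a formal test of skewness: a single extreme value can decide the answer, so it should be read alongside a graph. The function name `longer_tail` is illustrative; the data are the Olympic medal counts from the percentile example.

```python
from statistics import median

def longer_tail(data):
    """Report which tail reaches farther from the median: 'left', 'right', or 'neither'."""
    middle = median(data)
    left_reach = middle - min(data)
    right_reach = max(data) - middle
    if right_reach > left_reach:
        return "right"
    if left_reach > right_reach:
        return "left"
    return "neither"

# Total medals for the 19 countries in the percentile example.
medals = [63, 65, 50, 37, 35, 41, 25, 23, 27, 21, 17, 17, 20, 19, 22,
          15, 15, 15, 15]

tail = longer_tail(medals)
print(tail)  # right
```

Here the median is 22, the smallest value is only 7 below it, and the largest is 43 above it, so the right tail is the longer one.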
Learning Goal 3: Shape: Other Common Terms Hump – high bar Valley – between 2 peaks Gap – no data
Learning Goal 3: Shapes
Learning Goal 4 Be able to describe any anomalies or extraordinary features revealed by the display of a variable.
Learning Goal 4: Overall Pattern - Anything Unusual? Do any unusual features stick out? Sometimes it’s the unusual features that tell us something interesting or exciting about the data. You should always mention any stragglers, or outliers, that stand off away from the body of the distribution. Are there any gaps in the distribution? If so, we might have data from more than one group.
Learning Goal 4: Deviations from the Overall Pattern Outliers – individual observations that fall outside the overall pattern of the distribution. They are extreme values, either high or low. Causes: 1.Data mistake 2.Special nature of some observations
Learning Goal 4: Outliers An Outlier falls far from the rest of the data.
Learning Goal 4: Outliers Outliers are observations that lie outside the overall pattern of a distribution. Always look for outliers and try to explain them. The overall pattern is fairly symmetrical except for two states, Alaska and Florida, that clearly do not belong to the main trend. Alaska and Florida have unusual representation of the elderly in their population. A large gap in the distribution is typically a sign of an outlier.
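The idea that a large gap often signals an outlier can be explored with a short Python sketch that finds the widest gap between neighboring values once the data are sorted. It uses the 50 movie lengths from the dotplot example; the function name `largest_gap` is illustrative, and deciding whether a gap is "large" remains a judgment call made by looking at the display.

```python
# Movie lengths in minutes, from the dotplot example.
lengths = [64, 64, 69, 70, 71, 71, 71, 72, 73, 73, 74, 74, 74, 74, 75, 75,
           75, 75, 75, 75, 76, 76, 76, 77, 77, 78, 78, 79, 79, 80, 80, 81,
           81, 81, 82, 82, 82, 83, 83, 83, 84, 86, 88, 89, 89, 90, 90, 92,
           94, 120]

def largest_gap(data):
    """Return (gap size, value below the gap, value above the gap)."""
    ordered = sorted(data)
    gaps = [(high - low, low, high) for low, high in zip(ordered, ordered[1:])]
    return max(gaps)

gap = largest_gap(lengths)
print(gap)  # (26, 94, 120)
```

The 26-minute gap between 94 and 120 is far wider than any other, which points to the 120-minute movie as a possible outlier worth explaining.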
Learning Goal 5 Be able to determine the shape of the distribution of a variable by knowing something about the data.
Learning Goal 5: Determine the Shape of a Distribution - Example
Learning Goal 5: Determine the Shape of a Distribution - Example It’s often a good idea to think about what the distribution of a data set might look like before we collect the data. What do you think the distribution of each of the following data sets will look like? 1.Number of Miles run by Saturday morning joggers at a park. Roughly symmetric, slightly skewed right. 2.Hours spent by U.S. adults watching football on Thanksgiving Day. Bimodal. Many people watch no football, others watch most of one or more games. 3.Amount of winnings of all people playing a particular state’s lottery last week. Strongly skewed to the right, with almost everyone at $0, a few small prizes, with the winner an outlier.
Learning Goal 5: Determine the Shape of a Distribution – Your Turn Consider a data set containing IQ scores for the general public. What shape would you expect a histogram of this data set to have? a.Symmetric b.Skewed to the left c.Skewed to the right d.Bimodal
Learning Goal 5: Determine the Shape of a Distribution – Your Turn Consider a data set of the scores of students on a very easy exam in which most score very well but a few score very poorly. What shape would you expect a histogram of this data set to have? a.Symmetric b.Skewed to the left c.Skewed to the right d.Bimodal