Published by Erlin Sudirman. Modified over 6 years ago.
1
Probability & Statistics: Describing Quantitative Data
2
Describing Quantitative Data
T. Serino “A picture is worth a thousand words.” To describe quantitative data is to explain how the data is distributed. What is the shape of the data? Where is the center of the data? How spread out is the data? In order to describe the distribution of quantitative data, you will need some vocabulary.
3
Think Before You Draw…
Remember the “Make a picture” rule? Now that we have options for data displays, you need to Think carefully about which type of display to make. Before making a stem-and-leaf display, a histogram, or a dotplot, check the Quantitative Data Condition: the data are values of a quantitative variable whose units are known.
4
Shape, Center, and Spread
When describing a distribution, always tell about three things: shape, center, and spread.
5
What is the Shape of the Distribution?
Does the histogram have a single, central hump or several separated humps? Is the histogram symmetric? Do any unusual features stick out?
6
Humps
Does the histogram have a single, central hump or several separated humps? Humps in a histogram are called modes. A histogram with one main peak is called unimodal; a histogram with two peaks is bimodal; a histogram with three or more peaks is multimodal.
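The vocabulary above can be sketched in code. This is a rough illustration only: it assumes the histogram is given as a list of bar heights (made-up numbers here), and the helper names `count_peaks` and `describe_modality` are hypothetical. It counts every local maximum, so it would over-count on a noisy histogram, where deciding which humps are "main" peaks still takes judgment. It would also report a perfectly flat histogram as having one peak.

```python
def count_peaks(heights):
    """Count local maxima in a list of histogram bar heights."""
    # Collapse runs of equal heights so a flat-topped hump counts once.
    collapsed = [h for i, h in enumerate(heights) if i == 0 or h != heights[i - 1]]
    # Pad with zeros so a hump at either edge can still be detected.
    padded = [0] + collapsed + [0]
    return sum(
        1
        for left, mid, right in zip(padded, padded[1:], padded[2:])
        if mid > left and mid > right
    )


def describe_modality(heights):
    """Translate the number of peaks into the vocabulary from the slide."""
    peaks = count_peaks(heights)
    if peaks == 0:
        return "no clear mode"
    if peaks == 1:
        return "unimodal"
    if peaks == 2:
        return "bimodal"
    return "multimodal"


# Hypothetical bar heights, made up for illustration.
print(describe_modality([1, 3, 7, 4, 2]))        # one hump
print(describe_modality([2, 6, 3, 1, 4, 8, 5]))  # two humps
```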
7
Humps (cont.)
A bimodal histogram has two apparent peaks.
8
Modes are one way to describe the SHAPE of the graph. The humps of the graph are called MODES.
9
A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform. (Figure: histogram of Proportion of Wins.)
10
Symmetry
Is the histogram symmetric? If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.
11
Symmetry (cont.)
The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail. In the accompanying figure, the histogram on the left is skewed left, while the histogram on the right is skewed right.
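The "longer tail" rule can be turned into a rough check. This is a crude heuristic sketch, not a standard skewness statistic: it compares how far the minimum and the maximum sit from the median. The function name `skew_direction` and the `tolerance` parameter are assumptions made for this illustration, and the data are made up. A single outlier is enough to stretch a tail, so always look at the picture as well.

```python
import statistics


def skew_direction(values, tolerance=0.2):
    """Guess the skew by comparing tail lengths on each side of the median."""
    med = statistics.median(values)
    left_tail = med - min(values)    # how far the data stretch to the left
    right_tail = max(values) - med   # how far the data stretch to the right
    longer = max(left_tail, right_tail)
    # Treat tails of similar length (within the tolerance) as symmetric.
    if longer == 0 or abs(right_tail - left_tail) <= tolerance * longer:
        return "roughly symmetric"
    return "skewed right" if right_tail > left_tail else "skewed left"


# Hypothetical data sets, made up for illustration.
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 4, 5]))
print(skew_direction([1, 1, 2, 2, 3, 4, 9]))
print(skew_direction([1, 6, 7, 8, 8, 9, 9]))
```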
12
Symmetry is another way to describe SHAPE:
Does the graph have a tail?
13
Anything Unusual?
Do any unusual features stick out? Sometimes it’s the unusual features that tell us something interesting or exciting about the data. You should always mention any stragglers, or outliers, that stand off away from the body of the distribution. Are there any gaps in the distribution? If so, we might have data from more than one group.
14
Are there GAPS in your data?
The following histogram has outliers: there are three cities in the leftmost bar. There is a relatively large gap in this data. Three cities have a much lower number of people per housing unit than the other cities.
15
Are there groups or clusters of data?
If so, your data may not all be of the same type, may come from different sources, or may contain more than one group. There are two main clusters of data here: one ranging from approximately 7 to 12 and the other ranging from approximately 25 to 37.
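Gaps and clusters can also be looked for numerically. The sketch below is one simple approach, not the only one: sort the values and start a new cluster whenever the jump to the next value exceeds a chosen threshold. The function name `split_into_clusters`, the `min_gap` threshold, and the data values are all assumptions made for this illustration (the values were only chosen to fall in the two ranges mentioned above).

```python
def split_into_clusters(values, min_gap):
    """Group sorted values, starting a new cluster at each gap wider than min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > min_gap:
            clusters.append([current])    # gap found: begin a new cluster
        else:
            clusters[-1].append(current)  # still inside the same cluster
    return clusters


# Hypothetical values, made up for illustration.
data = [30, 7, 12, 25, 9, 37, 8, 27, 10, 33]
for cluster in split_into_clusters(data, min_gap=5):
    print(min(cluster), "to", max(cluster), "->", cluster)
```

The choice of `min_gap` is a judgment call: too small and ordinary spacing looks like a gap, too large and real groups get merged.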
16
Where is the CENTER of the distribution?
It's easy to identify the center of data that is roughly uniform or unimodal and symmetric. In the two example graphs, the center of the first is at approximately 0 and the center of the second is at approximately 0.5.
17
The center of a skewed graph or a multimodal graph is not as easy to determine or use. Certain measures of center can be meaningless or may not be useful for these types of graphs. In the first example graph, there is no data in the center, so the center may have no meaning at all. In the second, the mode, median, and mean are each in a different location. Which measure should be used?
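To see the three measures land in different places, here is a small sketch using Python's standard `statistics` module. The data set is made up for illustration and is deliberately skewed to the right.

```python
import statistics

# Hypothetical right-skewed data set, made up for illustration.
values = [1, 1, 1, 2, 2, 3, 4, 6, 16]

mode = statistics.mode(values)      # most frequent value
median = statistics.median(values)  # middle value of the sorted data
mean = statistics.mean(values)      # arithmetic average

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

For this made-up data the mode is 1, the median is 2, and the mean is 4. The single large value, 16, pulls the mean toward the long right tail while the median barely notices it.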
18
Center of a Distribution – Median
The median is the value with exactly half the data values below it and half above it. It is the middle data value (once the data values have been ordered), and it divides the histogram into two equal areas. It has the same units as the data.
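Finding the median by hand follows the definition directly: order the values, then take the middle one. The sketch below assumes the common convention of averaging the two middle values when the count is even, which the slide does not spell out. The function name `median_of` and the data are made up for illustration.

```python
def median_of(values):
    """Return the middle value of the data once it has been ordered."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[middle]
    # Even count: average the two middle values (a common convention).
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical data, made up for illustration.
print(median_of([5, 1, 3]))     # odd number of values
print(median_of([4, 1, 3, 2]))  # even number of values
```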
19
Spread: Home on the Range
How SPREAD out is the distribution? Always report a measure of spread along with a measure of center when describing a distribution numerically. The range of the data is the difference between the maximum and minimum values: Range = max – min. A disadvantage of the range is that a single extreme value can make it very large and, thus, not representative of the data overall.
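A quick sketch of that disadvantage, using made-up numbers and a hypothetical helper named `data_range`: adding one extreme value to a tightly packed data set changes the range dramatically.

```python
def data_range(values):
    """Range = max - min, for a non-empty collection of numbers."""
    return max(values) - min(values)


# Hypothetical data, made up for illustration.
typical = [2, 3, 3, 4, 5]
with_extreme = typical + [40]  # the same data plus one extreme value

print(data_range(typical))       # 5 - 2 = 3
print(data_range(with_extreme))  # 40 - 2 = 38
```

Five of the six values still sit between 2 and 5, yet the range jumps from 3 to 38.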
20
Example
Below is a histogram of the Average Wind Speed for every day in 1989. Describe the distribution of the data. The data values range from a minimum of 0 to a maximum of about 9, so the range of the data is about 9. The high value may be an outlier. There is a large gap between 6.5 and 8.5. The main cluster of data seems to represent wind speeds between 0 and 2.5 mph. The distribution is unimodal and skewed to the right. The mode is about 0.8 mph. The center (median) daily wind speed is about 1.90 mph. Can we say more?
21
Example: The following graph displays the time it takes for a warehouse to retrieve parts for customer orders. Describe the distribution of the data. What is the modality: unimodal, bimodal, or multimodal? Where is (are) the mode(s)? Give the number or range of numbers where the mode(s) occur(s). Or is the graph uniform? Is the graph somewhat symmetric? Is the graph skewed? Is it skewed left or right?
22
Example (cont.): Is there anything unusual about the graph? Gaps, outliers, clusters? Where do they occur? Where is the center of the graph? What is the range of the data? What are the maximum and minimum values of the data?
23
Why is the graph shaped like this?
The following graph displays the time it takes for a warehouse to retrieve parts for customer orders. What real life aspects could account for the shape of this graph? This graph is obviously bimodal. It seems like there are two groups of parts: one group takes between 1 and 5 minutes to retrieve and the other group takes 6 to 12 minutes. It is possible that the two groups are separated by weight. Maybe the warehouse employees need to use a machine to retrieve the heavier parts, whereas the lighter parts can be retrieved more quickly by hand. There are other possibilities. Maybe some parts are kept in a warehouse that is a bit farther away from the customer than the main warehouse.
24
Try this.
The dot plot to the right shows Kentucky Derby winning times, plotting each race as its own dot.
a) How many Kentucky Derby winning times were greater than 150 seconds?
b) Approximate the best Kentucky Derby winning time.
c) Why do you think there is a large gap in the middle of the data? (What real life situation may have caused this gap?)
25
Mathematical Decision Making