3.2 Picturing Distributions of Data LEARNING GOAL Be able to create and interpret basic bar graphs, dotplots, pie charts, histograms, stem-and-leaf plots, line charts, and time-series diagrams. Page 99
Definition The distribution of a variable refers to the way its values are spread over all possible values. We can summarize a distribution in a table or show a distribution visually with a graph.
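As a rough illustration of summarizing a distribution in a table, the sketch below tallies frequencies and relative frequencies. The function name `frequency_table` and the sample grades are hypothetical and are not taken from the textbook.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (frequency, relative_frequency)} for a list of values."""
    counts = Counter(values)
    total = sum(counts.values())
    # Sort categories so the table prints in a stable, readable order.
    return {cat: (n, n / total) for cat, n in sorted(counts.items())}

# Hypothetical sample of letter grades (NOT the textbook's data).
grades = ["A", "B", "C", "C", "B", "C", "D", "A", "C", "F"]

for grade, (freq, rel) in frequency_table(grades).items():
    print(f"{grade}: frequency={freq}, relative frequency={rel:.0%}")
```

The relative frequencies always sum to 1 (100%), which is a quick check that the table is complete.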
Bar Graphs, Dotplots, and Pareto Charts A bar graph is one of the simplest ways to picture a distribution. Bar graphs are commonly used for qualitative data. Each bar represents the frequency (or relative frequency) of one category: the higher the frequency, the longer the bar. The bars can be either vertical or horizontal.
Let’s create a vertical bar graph from the essay grade data in Table 3.1.
• Because the highest frequency is 9 (the frequency for C grades), we chose to make the vertical scale run from 0 to 10. This ensures that even the tallest bar does not quite touch the top of the graph.
• The graph should not be too short or too tall. In this case, it looks about right to choose a total height of 5 centimeters (as shown in the text), which is convenient because it means that each centimeter of height corresponds to a frequency of 2.
• The height of each bar should be proportional to its frequency. For example, because each centimeter of height corresponds to a frequency of 2, the bar representing a frequency of 4 should have a height of 2 centimeters.
• Because the data are qualitative, the widths of the bars have no special meaning, and there is no reason for them to touch each other. We therefore draw them with uniform widths.
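The scaling arithmetic in the steps above can be sketched as follows. The helper `bar_heights_cm` is a hypothetical name, and the grade counts are illustrative placeholders chosen only to agree with the two values the slides mention (a frequency of 9 for C and a bar with frequency 4); see Table 3.1 for the actual data.

```python
def bar_heights_cm(frequencies, scale_max, graph_height_cm):
    """Convert frequencies to bar heights in centimeters.

    scale_max is the top of the vertical scale; graph_height_cm is the
    drawn height of that scale on paper.
    """
    units_per_cm = scale_max / graph_height_cm  # 10 / 5 = 2 per centimeter
    return {cat: f / units_per_cm for cat, f in frequencies.items()}

# Illustrative counts only (assumption); C = 9 matches the slide text.
essay_grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}

heights = bar_heights_cm(essay_grades, scale_max=10, graph_height_cm=5)
for grade, h in heights.items():
    print(f"{grade}: {h} cm")
```

With a scale from 0 to 10 drawn 5 centimeters tall, a frequency of 4 gives a 2-centimeter bar, and the tallest bar (frequency 9) stays just below the top of the graph.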
Important Labels for Graphs
Title/caption: The graph should have a title or caption (or both) that explains what is being shown and, if applicable, lists the source of the data.
Vertical scale and label: Numbers along the vertical axis should clearly indicate the scale. The numbers should line up with the tick marks, the marks along the axis that precisely locate the numerical values. Include a label that describes the variable shown on the vertical axis.
Important Labels for Graphs (cont.)
Horizontal scale and label: The categories should be clearly indicated along the horizontal axis. (Tick marks may not be necessary for qualitative data, but should be included for quantitative data.) Include a label that describes the variable shown on the horizontal axis.
Legend: If multiple data sets are displayed on a single graph, include a legend or key to identify the individual data sets.
A dotplot is a variation on a bar graph in which we use dots rather than bars to represent the frequencies. Each dot represents one data value.
Figure 3.2 Dotplot for the essay grade data in Table 3.1.
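A plain-text dotplot is one simple way to see the idea that each dot stands for one data value. The sketch below is only an illustration: `text_dotplot` is a hypothetical helper, the dots run horizontally rather than vertically, and the counts are placeholders rather than the data of Table 3.1.

```python
def text_dotplot(frequencies, dot="o"):
    """Return a horizontal dotplot with one dot per data value."""
    width = max(len(str(cat)) for cat in frequencies)
    rows = [f"{str(cat):<{width}} | {dot * n}" for cat, n in frequencies.items()]
    return "\n".join(rows)

# Illustrative counts only (assumption, not the textbook's table).
essay_grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}
print(text_dotplot(essay_grades))
```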
A bar graph in which the bars are arranged in frequency order is often called a Pareto chart.
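Arranging bars in frequency order amounts to sorting the categories by their counts. The sketch below assumes "frequency order" means highest frequency first; `pareto_order` is a hypothetical name and the counts are illustrative placeholders.

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Illustrative counts only (assumption, not the textbook's table).
essay_grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}

for grade, freq in pareto_order(essay_grades):
    print(f"{grade}: {'#' * freq}")
```

Note that sorting by frequency discards the natural A-to-F order of the grades, which is why this arrangement suits categories that have no inherent order.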
TIME OUT TO THINK Would it be practical to make a dotplot for the population data in Figure 3.3? Would it make sense to make a Pareto chart for data concerning SAT scores? Explain.
Definitions A bar graph consists of bars representing frequencies (or relative frequencies) for particular categories. The bar lengths are proportional to the frequencies. A dotplot is similar to a bar graph, except each individual data value is represented with a dot. A Pareto chart is a bar graph with the bars arranged in frequency order. Pareto charts make sense only for data at the nominal level of measurement.
Pie Charts Pie charts are usually used to show relative frequency distributions. A circular pie represents the total relative frequency of 100%, and the sizes of the individual slices, or wedges, represent the relative frequencies of different categories. Pie charts are used almost exclusively for qualitative data.
Figure 3.5 Party affiliations of registered voters in Rochester County.
Definition A pie chart is a circle divided so that each wedge represents the relative frequency of a particular category. The wedge size is proportional to the relative frequency. The entire pie represents the total relative frequency of 100%.
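Because the whole pie corresponds to 100%, each wedge's central angle is its relative frequency times 360 degrees. The sketch below works this out; `wedge_angles`, the category labels, and the counts are all hypothetical and are not the data of Figure 3.5.

```python
def wedge_angles(frequencies):
    """Return the central angle in degrees for each category's wedge."""
    total = sum(frequencies.values())
    return {cat: 360 * n / total for cat, n in frequencies.items()}

# Hypothetical counts for three made-up categories (assumption).
voters = {"Party A": 40, "Party B": 35, "Party C": 25}

for party, angle in wedge_angles(voters).items():
    print(f"{party}: {angle:.0f} degrees")
```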
A graph in which the bars have a natural order and the bar widths have specific meaning is called a histogram. The bars in a histogram touch each other because there are no gaps between the categories.
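To see why histogram bar widths carry meaning, it can help to count values into equal-width bins by hand. The sketch below is one minimal way to do that; `bin_counts` is a hypothetical helper, the data values are made up, and each bin is assumed to include its lower edge but not its upper edge.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the binned range are ignored
            counts[index] += 1
    return counts

# Made-up quantitative data (assumption, not from the textbook).
values = [3, 7, 12, 15, 18, 21, 24, 26, 33, 38]
counts = bin_counts(values, start=0, width=10, n_bins=4)

for i, c in enumerate(counts):
    lo = i * 10
    print(f"{lo:>2} to {lo + 10:<2} | {'#' * c}")
```

The bins are adjacent with no gaps between them, which is why the bars of a histogram touch.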
The stem-and-leaf plot (or stemplot) looks somewhat like a histogram turned sideways, except in place of bars we see a listing of data for each category. (Table 3.3 is on page 91.)
Figure 3.9 Stem-and-leaf plot for the energy use data from Table 3.3 (columns: Stem, Leaves).
Another type of stem-and-leaf plot lists the individual data values. For example, the first row shows the data values 0.3 and 0.7.
Figure 3.10 Stem-and-leaf plot showing numerical data, in this case the per-person carbon dioxide emissions from Table 3.11.
Definitions A histogram is a bar graph showing a distribution for quantitative data (at the interval or ratio level of measurement); the bars have a natural order and the bar widths have specific meaning. A stem-and-leaf plot (or stemplot) is somewhat like a histogram turned sideways, except in place of bars we see a listing of data.
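A numerical stem-and-leaf plot can be sketched by splitting each value into a stem (the whole-number part) and a leaf (the tenths digit). The code below assumes nonnegative values recorded to one decimal place; `stem_and_leaf` is a hypothetical helper, and apart from 0.3 and 0.7 (which the slides mention) the values are made up.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} for nonnegative values with one decimal place."""
    rows = defaultdict(list)
    for v in sorted(values):
        # round() guards against floating-point noise such as 0.7 * 10.
        stem, leaf = divmod(round(v * 10), 10)
        rows[stem].append(leaf)
    # Include empty stems so gaps in the data stay visible.
    return {stem: rows.get(stem, []) for stem in range(min(rows), max(rows) + 1)}

# 0.3 and 0.7 appear in the slide text; the remaining values are invented.
data = [0.7, 0.3, 1.2, 2.4, 1.9, 4.1]

for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The row for stem 0 lists leaves 3 and 7, representing the data values 0.3 and 0.7.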
TIME OUT TO THINK What additional information would you need to create a stem-and-leaf plot for the ages of actresses when they won Academy Awards (from Table 3.12, page 106)? What would the stem-and-leaf plot look like?
Line Charts Definition A line chart shows a distribution of quantitative data as a series of dots connected by lines. For each dot, the horizontal position is the center of the bin it represents and the vertical position is the frequency value for the bin.
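The dot positions described in this definition can be computed directly from the bins. The sketch below assumes equal-width bins; `line_chart_points` is a hypothetical name and the bin counts are made up.

```python
def line_chart_points(start, width, counts):
    """Return (bin center, frequency) pairs, one dot per bin."""
    return [(start + (i + 0.5) * width, c) for i, c in enumerate(counts)]

# Made-up bin counts for bins 0-10, 10-20, 20-30, 30-40 (assumption).
points = line_chart_points(start=0, width=10, counts=[2, 3, 3, 2])

for x, y in points:
    print(f"dot at x={x}, y={y}")
```

Connecting consecutive dots with straight segments gives the line chart; overlaying the same bins as bars gives the corresponding histogram.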
Figure 3.12 Line chart for the energy use data, with a histogram overlaid for comparison.
Time-Series Diagrams Definition A histogram or line chart in which the horizontal axis represents time is called a time-series diagram.
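For a time-series diagram, the observations are placed in time order along the horizontal axis, and the change from one period to the next hints at any trend. The sketch below illustrates that idea only; the function names are hypothetical and the numbers are invented, not taken from any table in the text.

```python
def time_series_points(observations):
    """Return (time, value) pairs sorted by time, ready to plot left to right."""
    return sorted(observations.items())

def changes(points):
    """Return (time, change from the previous period) for each later period."""
    return [(t2, v2 - v1) for (t1, v1), (t2, v2) in zip(points, points[1:])]

# Invented yearly values, deliberately listed out of order (assumption).
observations = {2003: 12, 2001: 10, 2002: 15}

points = time_series_points(observations)
print(points)
print(changes(points))
```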
Figure 3.14 Time-series diagram for the homicide rate data of Table 3.14.
TIME OUT TO THINK Look for data to extend Table 3.14 and Figure 3.14 (previous slide) beyond 2005. Do you see any trend in the homicide rate?