1
WEEK 4 Picturing Distributions of Data
Graphs, graphs and YES!! More graphs!
2
Graphs are used to try to tell a story.
3
Graphs and Charts: Pictorial Presentation of Data
Graphs and charts provide a direct sense of proportion. With graphics, visible spatial features substitute for abstract numbers. So why do it? © 2008 McGraw-Hill Higher Education
4
Do you present yourself like this?
Speaker notes Do you present yourself like this? [HAVE AUDIENCE ANSWER QUESTION.] Why would you not present yourself like this? Do you think this man is taken seriously? What do you think would happen if he tried to speak to someone in the Ministry of Health about some information related to a BCC campaign? Would he even be let in? So, if you know that you would not be taken seriously if you presented yourself like this, then . . .
5
So why would you present your data like this?
Speaker notes Why would you present your data like this? Would most people be able to get the message from this data if it was presented in this STATA output? [ALLOW COMMENTS] No, it is too busy and it is difficult to interpret. The way you present your data can greatly affect how usable the data will be.
6
Or this? Speaker notes And why would you present your data like this? Can anyone tell me what some problems may be with this chart? POSSIBLE ANSWERS: No title. No axis labels. The colors are difficult to read. (You should never put a dark color on a dark background.) The green color is too bright.
7
This is Better! Use of ITNs in Zambia Speaker notes
What is improved in this slide compared to the last one (other than the data points themselves)? POSSIBLE ANSWERS: Title. Axis labels. Data labels. The colors are easy to read.
8
Effective presentation
Clear, Concise, Actionable, Attractive. Speaker notes Regardless of which communication format you use, the information should be presented in a clear, concise way, with key findings and recommendations that are actionable. What does this mean?
9
Effective presentation
For all communication formats it is important to ensure that there is: Consistency (font, colors, punctuation, terminology, line/paragraph spacing). An appropriate amount of information (less is more). Appropriate content and format for the audience (scientific community, journalists, politicians). Speaker notes An appropriate amount of information will be determined by your audience and format. Policymakers may do better with direct and concise summaries of key points, whereas the scientific community will want more detail. On a PowerPoint slide, try to limit the text to six lines with no more than six words per line, balance text with graphics, and make sure that there are not too many slides. One way to ensure that you create consistent materials is to decide on a template for the document, presentation, or graph before you produce it. You can then give these guidelines to the different people involved in the process, and then only have to do minor formatting at the end.
10
Summarizing data
Tables: Simplest way to summarize data. Data is presented as absolute numbers or percentages. Charts and graphs: Visual representation of data. Usually data is presented using percentages. Speaker notes The two main ways of summarizing data are by using tables and charts or graphs. A table is the simplest way of summarizing a set of observations. A table has rows and columns containing data, which can be in the form of absolute numbers or percentages, or both. Graphs are pictorial representations of numerical data and should be designed so that they convey at a single look the general patterns of the data. Generally, the data in a graph is in the form of percentages. Although graphs are easier to read than tables, they provide less detail. The loss of detail may be offset by a better understanding of the data. Tables and graphs are used to convey a message; stimulate thinking; and portray trends, relationships, and comparisons. The most informative graphs are simple and self-explanatory. Tables can be good for side-by-side comparisons, but can lack visual impact when used on a slide in a presentation.
11
Points to remember
Ensure the graphic has a title. Label the components of your graphic. Indicate the source of the data with the date. Provide the number of observations (n=xx) as a reference point. Add a footnote if more information is needed. Speaker notes To make the graphic as self-explanatory as possible, there are several things to include: Every table or graph should have a title or heading. The x- and y-axes of a graph should be labeled, with value labels such as a percentage sign, and a legend should be included. Cite the source of your data and give the date when the data was collected or published. Provide the sample size or the number of people to which the graph refers. Include a footnote if the graphic isn’t self-explanatory. These points will pre-empt questions and explain the data. In the next several slides, we’ll see examples of these points.
12
Sec. 3.2 Objective In this section, we will look at the most common methods for displaying distributions of data. A distribution of data values refers to the way the values are spread out over the chosen categories. You will be able to create and interpret: basic bar graphs, dotplots, pie charts, histograms, line charts, and time-series diagrams.
13
Let’s do this!
14
Use the right type of graphic
Charts and graphs »» MAIN ONES: Bar chart: comparisons, categories of data. Histogram: represents relative frequency of continuous data. Line graph: displays trends over time, continuous data (ex. cases per month). Pie chart: shows percentages or proportional share. BUT MANY OTHER TYPES ALSO EXIST. Speaker notes We’re going to review the most commonly used charts and graphs in Excel/PowerPoint. Later we’ll have you use data to create your own graphics, which may go beyond those presented here. Bar charts are used to compare data across categories. A histogram looks similar to a bar chart but is a statistical graph that represents the frequency of values of a quantity by vertical rectangles of varying heights and widths. The width of each rectangle is in proportion to the class interval under consideration, and its area represents the relative frequency of the phenomenon in question. A histogram is not a histogram just because the bars touch. The bars in a bar graph can touch if you want them to, but they don’t have to; touching bars in a bar graph don’t mean anything. In a histogram, however, the bars must touch. This is because the data elements we are recording are numbers that are grouped and form a continuous range from left to right. There are no gaps in the numbers along the bottom axis. This is what makes a histogram. Line graphs display trends over time for continuous data (ex. cases per month). Pie charts show percentages or the contribution of each value to a total. When there are more than 4 categories, it is best to switch to a bar chart so that the graphic is readable.
15
Types of Graphs and Levels of Measurement
For nominal/ordinal variables, use pie charts and bar charts. For interval/ratio variables, use histograms and polygons (line graphs).
16
Graphing and Table Guidelines
Choose a design based on a variable’s level of measurement, study objectives, and targeted audience. A good graphic simplifies, not complicates. A good graph is self-explanatory. Produce rough drafts and seek advice. Adhere to inclusiveness and exclusiveness. Provide a descriptive title and indicate the source of material. Scrutinize computer-generated graphics.
17
Sec. 3.2 Pie Charts A pie chart is a circle divided so that each wedge represents the relative frequency of a particular category. The wedge size is proportional to the relative frequency. The entire pie represents the total relative frequency of 100%. When the wedge sizes represent simple fractions, it’s easy to create a pie chart, but when there are numerous categories, a pie chart may not be the best way to represent the data.
18
The Pie Chart: EXAMPLE »The Race and Ethnicity of the Elderly
Pie chart: a graph showing the differences in frequencies or percentages among categories of a variable. The categories are displayed as segments of a circle whose pieces add up to 100 percent of the total frequencies.
19
Too many categories can be messy!
Slice labels: 2.8%, 0.8%, 0.6%, 0.5%, 8.3%, 87.7%. N = 35,919,174. Figure 3.1 Annual Estimates of U.S. Population 65 Years and Over by Race, 2003
20
We can reduce some of the categories
Slice labels: 4%, 8.3%, 87.7%. N = 35,919,174. Figure 3.2 Annual Estimates of U.S. Population 65 Years and Over, 2003
21
Pie chart Speaker notes
A pie chart displays the contribution of each value to a total. In this chart, the values always add up to 100. What should be added to this chart to provide the reader with more information? What should be changed about this chart to make it more readable? POSSIBLE ANSWERS: The color scheme, which is currently too bright. The title should be more specific and indicate whether these are numbers or percentages. If these are percentages, that should be shown on the data labels, and the n, or number of cases, should be indicated to provide context.
22
Pie chart Percentage of all confirmed malaria cases treated by quarter, Country X, 2011 Speaker notes A pie chart displays the contribution of each value to a total. In this case we used the chart to show the contribution of each quarter to the entire year. For example, the first quarter contributed the largest percentage of enrolled patients. To improve the understanding of the pie chart, we’ve added a more descriptive title and value labels. On the previous chart, we couldn’t tell whether the values were numbers or percentages. Adding the sample size lets us know the total number of observations. It is also important to have charts that are attractive, easy to look at and easy to read. The chart on the previous page was so colorful that it was distracting; the colors were so bright that it was hard to look at the chart, let alone read it. While these colors are not the most interesting, they let the reader focus on the chart. The last chart was an exaggeration, but make sure that you do not make the same mistake on a smaller scale. Limit the slices to 4-6. For extra pizzazz, contrast the most important slice either with color or by exploding the slice. N=257
23
Circle Graphs (Pie Charts)
A circle graph, or pie chart, is used to represent categorical data. It consists of a circular region partitioned into disjoint sections, with each section representing a part or percentage of the whole. A circle graph shows how parts are related to the whole.
24
Example 10-2 Construct a circle graph for the information in the table, which is based on information taken from a U.S. Bureau of the Census Report (2006).
25
Example (continued) The entire circle represents the total 299 million people. The measure of the central angle (an angle whose vertex is at the center of the circle) of each sector of the graph is proportional to the fraction or percentage of the population the section represents.
26
Example 10-2 (continued) For example, the sector for the under-5 group is that group’s fraction of the 299 million total, or approximately 7% of the circle. Because the entire circle is 360°, that same fraction of 360°, or about 24°, should represent the under-5 group.
27
Example (continued) The table shows the number of degrees for each age group.
28
Example (continued)
29
Sec. 3.2 Example 2 Create a pie chart from the essay grade data. Each sector is 10 degrees.
Grade  Frequency
A      4
B      7
C      9
D      3
F      2
Total  25
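As a rough check on a hand-drawn pie chart, the central-angle rule from Example 10-2 (each category's fraction of the total, times 360°) can be sketched in a few lines of Python. The function name `sector_angles` is an illustrative choice, not something from the slides.

```python
def sector_angles(frequencies):
    """Return the central angle, in degrees, for each category."""
    total = sum(frequencies.values())
    # Each wedge gets the category's share of the full 360-degree circle.
    return {cat: 360 * count / total for cat, count in frequencies.items()}

# Essay grade data from the example above.
grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}
angles = sector_angles(grades)
for grade, angle in angles.items():
    print(f"{grade}: {angle:.1f} degrees")
```

The angles should add up to 360°, which is a quick way to catch an arithmetic slip before drawing.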
30
Sec. 3.2 Bar Graphs A bar graph is typically used for qualitative (categorical) data. Each bar represents the frequency (or relative frequency) of one category. The bars can be either vertical or horizontal.
31
Bar Chart (Graph) A series of vertical or horizontal bars, with the length of a bar representing the percentage frequency of a category of a nominal/ordinal variable. Bar charts are especially useful for conveying a sense of competition among categories.
32
The Bar Graph: EXAMPLE»»The Living Arrangements and Labor Force Participation of the Elderly
Bar graph: a graph showing the differences in frequencies or percentages among categories of a variable. The categories are displayed as rectangles of equal width with their height proportional to the frequency or percentage of the category.
33
N=13,886,000 Figure 3.3 Living Arrangements of Males (65 and Older) in the United States, 2000
34
Can display more info by splitting sex
Figure 3.4 Living Arrangement of U.S. Elderly (65 and Older) by Gender, 2003
35
Figure 3.5 Percent of Men and Women 55 Years and Over in the Civilian Labor Force, 2002
36
Characteristics of Bar Graphs
Sec. 3.2 Characteristics of Bar Graphs When the data is qualitative, the widths of the bars have no special meaning, so there is no reason for them to be touching, and they should be drawn with uniform widths. The graph should have a title or caption that explains what is being shown. The vertical axis should be labeled and scaled appropriately. The tick marks should be evenly spaced, and the range of values between each pair of marks should be the same. The horizontal axis should be labeled and each category should be indicated (there is no need for tick marks if the data is qualitative). The graph should include a legend if multiple data sets are displayed on a single graph.
37
Bar Graphs A bar graph typically has spaces between the bars and is used to depict categorical data. The bars representing Tom, Dick, Mary, Joy, and Jane could be placed in any order.
38
Double-Bar Graphs A double bar graph can be used to make comparisons in data.
39
Other Bar Graphs The table shows various types of shoes worn by students in one class and the approximate percentages of students who wore them.
40
Other Bar Graphs The figure depicts a percentage bar graph with that same data.
41
Other Bar Graphs The table shows information about the expenditures of a business over a period of years.
42
Other Bar Graphs The figure depicts a stacked bar graph with that same data.
43
Sec. 3.2 Example 1 Create a vertical bar graph from the essay grade data in section 3.1.
Grade  Frequency
A      4
B      7
C      9
D      3
F      2
Total  25
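Before drawing the graph by hand, it can help to preview the bar lengths. The sketch below is only a text approximation: it prints horizontal bars (not the vertical graph the exercise asks for), and the helper name `text_bar_graph` is made up for illustration.

```python
def text_bar_graph(frequencies, symbol="#"):
    """Return one text line per category; bar length equals the frequency."""
    width = max(len(str(cat)) for cat in frequencies)
    lines = []
    for cat, count in frequencies.items():
        # Category label, a separator, the bar, then the frequency as a data label.
        lines.append(f"{str(cat):<{width}} | {symbol * count} {count}")
    return lines

# Essay grade data from the example above.
grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}
for line in text_bar_graph(grades):
    print(line)
```

The bars are separated and uniform in width, matching the characteristics listed for bar graphs of qualitative data.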
44
Sec. 3.2 Dotplots A dotplot is a variation of a bar graph in which each dot represents one data value and the total number of dots represents the frequency. Dotplots are convenient when making graphs of raw data, because you can tally the data by making a dot for each value; you can then convert the graph to a bar chart so it looks more formal.
45
Dot Plots A dot plot, or line plot, provides a quick and simple way of organizing numerical data. Dot plots are typically used when there is only one group of data with fewer than 50 values.
46
Dot Plots Suppose the 30 students in Abel’s class received the following test scores:
47
Dot Plots A dot plot for the class scores consists of a horizontal number line on which each score is denoted by a dot, or an x, above the corresponding number-line value. The number of x’s above each score indicates how many times each score occurred.
48
Dot plots Four students scored 82.
A gap occurs between scores 88 and 97. Two students scored 72. The score 52 is an outlier. Scores 97 and 98 form a cluster.
49
Dot Plots Outlier: a data point whose value is significantly greater than or less than other values. Cluster: an isolated group of points. Gap: a large space between data points. Mode: the data value(s) that occur most often.
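The terms above can be illustrated with a small Python sketch. The score list is made up for illustration (the class data on the slide is not reproduced in the text), and the helper names and the gap threshold of 9 points are assumptions, since the slides do not say how large a space must be to count as a gap.

```python
from collections import Counter

def dot_plot(values):
    """Return text rows 'value | x x x', one per distinct value, ascending."""
    counts = Counter(values)
    return [f"{v:>3} | {'x ' * counts[v]}".rstrip() for v in sorted(counts)]

def modes(values):
    """Return the value(s) that occur most often."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def gaps(values, min_size):
    """Return (low, high) pairs of neighbouring distinct values at least min_size apart."""
    distinct = sorted(set(values))
    return [(a, b) for a, b in zip(distinct, distinct[1:]) if b - a >= min_size]

# Hypothetical scores, chosen only to echo the features named on the slide.
scores = [52, 72, 72, 75, 79, 82, 82, 82, 82, 85, 88, 97, 98]

for row in dot_plot(scores):
    print(row)
print("mode(s):", modes(scores))
print("gaps of 9 or more:", gaps(scores, 9))
```

With this made-up data, the large space below 72 isolates 52 as a candidate outlier, and the space between 88 and 97 separates the 97 and 98 cluster from the rest.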
50
Dot Plots If a dot plot is constructed on grid paper, then shading in the squares with x’s and adding a vertical axis depicting the scale allows the formation of a bar graph.
51
Sec. 3.2 Pareto Chart A Pareto chart is a bar graph with the bars arranged in frequency order (either high to low or low to high). Pareto charts make sense only for data at the nominal level. Ex. ~ It wouldn’t make sense to create a Pareto chart for the essay grade data because it would put the grades out of order (C, B, A, D, F). [Figures: bar graph; Pareto chart]
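The reordering described above can be checked with a short Python sketch. It sorts the essay grade frequencies from high to low; `pareto_order` is an illustrative name, not a function from the slides.

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Essay grade data used in the Sec. 3.2 examples.
grades = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}
ordered = pareto_order(grades)
print([cat for cat, _ in ordered])
```

The resulting order is C, B, A, D, F, which scrambles the natural A-to-F ranking and shows why a Pareto chart does not suit ordered categories such as grades.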
52
Sec. 3.2 Histograms A histogram is a bar graph that shows a distribution of quantitative (numerical) data. Not only does the y-axis have numerical meaning, but so does the x-axis (therefore the bar widths have meaning). Just like the tick marks on the y-axis, the tick marks on the x-axis must be evenly spaced and represent the same range of values between each one. The bars touch each other because there are no gaps between the categories. Each bar includes its starting value but not its ending value (ex. the red bar runs from 0 up to, but not including, 20; the pink bar runs from 20 up to, but not including, 40; and so on). Refer to table 3.4 on p.91
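The rule that a bar includes its starting value but not its ending value can be sketched as a simple binning routine. The data values are made up for illustration, and `bin_counts` is a hypothetical helper rather than anything defined in the slides.

```python
def bin_counts(values, start, width, n_bins):
    """Count values in half-open bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        # Values outside the covered range are ignored in this sketch.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical data: 20 falls in the second bin and 40 in the third.
data = [0, 5, 19, 20, 20, 39, 40, 59]
print(bin_counts(data, start=0, width=20, n_bins=3))
```

Because every bin has the same width and the bins share their edges, the bars of the resulting histogram touch with no gaps.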
53
Histograms and Bar Graphs
A histogram is made up of adjoining rectangles, or bars. The bars are all the same width. The scale on the vertical axis must be uniform.
54
Histograms and Bar Graphs
A distinguishing feature between histograms and bar graphs is that there is no ordering that has to be done among the bars of the bar graph, whereas there is an order for a histogram.
55
The Histogram Histogram: a graph showing the differences in frequencies or percentages among categories of an interval-ratio variable. The categories are displayed as contiguous bars, with width proportional to the width of the category and height proportional to the frequency or percentage of that category.
56
What’s wrong with this one?
Figure 3.7 Age Distribution of U.S. Population 65 Years and Over, 2000
57
Population Pyramid: Country Z, 2008 (a “double” horizontal histogram)
Speaker notes This is a population pyramid. It is basically two histograms presented side by side. On the right you can see males and on the left you see females. The bins shown are five-year age categories. Population pyramids are useful for presenting descriptive data about your population of interest or study population.
58
The following two slides are applications of the histogram. They examine, by gender, age distribution patterns in the U.S. population for 1955 and 2010 (projected). Notice that in both figures, age groups are arranged along the vertical axis, whereas the frequencies (in millions of people) are along the horizontal axis. Each age group is classified by males on the left and females on the right. Because this type of histogram reflects age distribution by gender, it is also called an age-sex pyramid.
60
Histograms – Example A What are the class widths?
Histograms are a useful way to illustrate the frequency distribution of continuous data. For example, the data in the table below show the lung volume of a group of students.
Lung volume (litres)  Frequency
2.5–2.9               2
3.0–3.4               5
3.5–3.9               8
4.0–4.4               11
4.5–4.9               9
5.0–5.4               4
5.5–5.9               1
What are the class widths?
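One way to answer the question is sketched below. It assumes the common textbook convention that the class width is the difference between consecutive lower class limits; the slides leave the method unstated, so treat this as one reasonable reading.

```python
def class_widths(lower_limits):
    """Return the difference between each pair of consecutive lower class limits."""
    return [round(b - a, 2) for a, b in zip(lower_limits, lower_limits[1:])]

# Lower limits of the lung volume classes in the table above (litres).
lower_limits = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
widths = class_widths(lower_limits)
print(widths)
```

Under that convention every class is 0.5 litres wide, so the classes are of equal width.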
61
Histograms – Example A Since the classes are of equal width, a standard histogram can be drawn, using the frequencies for height. Remember to give your histogram a title. The vertical axis represents the frequency. There are no gaps between the bars because the data are continuous. The horizontal axis represents the lung volume (litres) and contains the classes of the frequency distribution.
62
The Frequency Polygon Frequency polygon: a graph showing the differences in frequencies or percentages among categories of an interval-ratio variable. Points representing the frequencies of each category are placed above the midpoint of the category and are joined by straight lines.
63
Histograms – Example A The shape of the line is the frequency distribution. A frequency polygon is obtained by joining the mid-points of the top of the histogram bars.
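The points of a frequency polygon can be worked out before any drawing. The sketch below uses the lung volume table from Example A and takes each midpoint as the average of the stated class limits, which is an assumption about how the midpoints are meant to be computed; `polygon_points` is an illustrative name.

```python
def polygon_points(classes):
    """Return (midpoint, frequency) pairs for classes given as (low, high, frequency)."""
    return [(round((low + high) / 2, 2), freq) for low, high, freq in classes]

# Lung volume classes (litres) and frequencies from Example A.
classes = [
    (2.5, 2.9, 2), (3.0, 3.4, 5), (3.5, 3.9, 8), (4.0, 4.4, 11),
    (4.5, 4.9, 9), (5.0, 5.4, 4), (5.5, 5.9, 1),
]
points = polygon_points(classes)
for midpoint, freq in points:
    print(midpoint, freq)
```

Joining these points in order with straight line segments gives the polygon; the peak sits at the 4.0–4.4 class, which has the highest frequency.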
64
Sec. 3.2 Example 3 Create a histogram from the data below that shows the ages of Academy Award-winning actresses from 1970 to 2007. Age Number of Actresses 20-29 8 30-39 18 40-49 50-59 60-69 2 70-79 1 80-89
65
Sec. 3.2 Line Charts A line chart shows a distribution of data using a series of dots connected by lines. When the data is qualitative, each dot is positioned horizontally on the tick mark of its category and vertically at the value that corresponds to the frequency. When the data is quantitative, each dot is positioned horizontally in the middle of its bin and vertically at the value that corresponds to the frequency.
66
Line graph Number of Clinicians* Working in Each Clinic During Years 1-4, Country Y Speaker notes A line graph should be used to display trends over time and is particularly useful when there are many data points. In this case we have 4 data points for each clinic. We improved the graph by adding a label to the y-axis, a title and a footnote. In some settings, clinicians may mean only doctors, but the footnote lets the reader know that in this case we are referring to both doctors and nurses. *Includes doctors and nurses.
67
Caution: Line Graph Number of Clinicians* Working in Each Clinic During Years 1-4, Country Y Speaker notes What is wrong with this line graph? If you look closely you can see that the X axis should be years, but instead it is clinics. Make sure that the right data is always charted on the axes, or else you may end up with a graph that cannot be interpreted like this one. *Includes doctors and nurses.
68
Source: Adapted from U.S. Bureau of the Census, Center for International Research, International Data Base, 2003. Figure Population of Japan, Age 55 and Over, 2000, 2010, and 2020
69
Time Series Charts (another type of line graph)
Time series chart: a graph displaying changes in a variable at different points in time. It shows time (measured in units such as years or months) on the horizontal axis and the frequencies (percentages or rates) of another variable on the vertical axis.
70
Source: Federal Interagency Forum on Aging Related Statistics, Older Americans 2004: Key Indicators of Well Being, 2004. Figure Percentage of Total U. S. Population 65 Years and Over, 1900 to 2050
71
Source: U.S. Bureau of the Census, “65+ in America,” Current Population Reports,
1996, Special Studies, P23-190, Table 6-1. Figure Percentage Currently Divorced Among U.S. Population 65 Years and Over, by Gender, 1960 to 2040
72
Time-Series Diagrams Sec. 3.2
A time-series diagram is a histogram or line chart in which the horizontal axis represents time
73
Other types of graphs and data « pictures »
74
The Statistical Map: The Geographic Distribution of the Elderly
We can display dramatic geographical changes in American society by using a statistical map. Maps are especially useful for describing geographical variations in variables, such as population distribution, voting patterns, crime rates, or labor force participation.
77
Pictographs A pictograph is used to represent tallies of categories. For example, categorical data might be seen in determining the month that the most automobiles were sold. The month is the category. A symbol, or icon, is used to represent a quantity of items.
78
Pictographs A symbol or an icon is used to represent a quantity of items. A legend tells what the symbol represents.
79
Pictographs All graphs need titles. If needed, legends should be shown.
80
Pictographs The number of students in each teacher’s fifth-grade class at Hillview is depicted in tabular representation in the table and in a pictograph.
81
BUT »» Let’s be careful!
82
Statistical Follies Graphs may be intentionally or mistakenly distorted. Make sure any claimed differences in scores are real and not simply a distortion of the graphic. Use computer graphics carefully and edit the output. Rely on the computer as simply a drawing tool.
83
Distortions in Graphs Graphs not only quickly inform us; they can quickly deceive us. Because we are often more interested in general impressions than in detailed analyses of the numbers, we are more vulnerable to being swayed by distorted graphs. What are graphical distortions? How can we recognize them?
84
Shrinking and Stretching the Axes: Visual Confusion
Probably the most common distortions in graphical representations occur when the distance along the vertical or horizontal axis is altered in relation to the other axis. Axes can be stretched or shrunk to create any desired result.
85
Shrinking and Stretching the Axes: Visual Confusion
86
Distortions with Picture Graphs
Another way to distort data with graphs is to use pictures to represent quantitative information. The problem with picture graphs is that the visual impression received is created by the picture’s total area rather than by its height (the graphs we have discussed so far rely heavily on height).
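The size of this distortion can be estimated with a small sketch. It assumes the picture is enlarged proportionally, so that its width grows by the same factor as its height; `area_ratio` is an illustrative name.

```python
def area_ratio(height_ratio):
    """Area ratio of a picture scaled proportionally to height_ratio times its height."""
    # Width and height both grow by the same factor, so area grows by its square.
    return height_ratio ** 2

for ratio in (2, 3):
    print(f"height x{ratio} -> area x{area_ratio(ratio)}")
```

Under that assumption, a value that is only twice as large is drawn with four times the area, so the picture overstates the difference.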
87
Statistics in Practice
The following graphs are particularly suitable for making comparisons among groups: - Bar chart - Frequency polygon - Time series chart
88
Source: Smith, 2003. This bar chart compares elderly males and females who live alone by age, gender, and race or Hispanic origin. It shows that the percentage of elderly who live alone varies not only by age but also by both race and gender. Figure 3.17 Percentage of College Graduates among People 55 years and over by age and sex, 2002
89
Source: Stoops, Nicole. 2004. “Educational Attainment in the United States: 2003.”
Current Population Reports, P Washington D.C.: U.S. Government Printing Office. This frequency polygon compares years of school completed by black Americans age 25 to 64 and 65 years and older with that of all Americans in the same age groups. Figure 3.18 Years of School Completed in the United States by Race and Age, 2003
90
Why use charts and graphs?
What do you lose? The ability to examine the numeric detail offered by a table; potentially, the ability to see additional relationships within the data; potentially, time: often we get caught up in selecting colors and formatting charts when a simply formatted table is sufficient. What do you gain? The ability to direct readers’ attention to one aspect of the evidence; the ability to reach readers who might otherwise be intimidated by the same data in a tabular format; the ability to focus on the bigger picture rather than perhaps minor technical details. We do this as an in-class exercise, where students pair up and construct a chart based on a table from the text or handed out in class and then answer the two questions above.
91
Summary
Make sure that you present your data in a consistent format. Use the right graph for the right data and the right audience. Label the components of your graphic (title, axis). Indicate the source of the data and the number of observations (n=xx). Add a footnote for more explanation. Speaker notes In summary, [READ BULLETS]
92
Sec. 3.2 Summary
Bar graph: each bar represents the frequency of a category.
Dotplot: similar to a bar graph, but there is a dot for each piece of data that falls into a certain category. All the dots added up give the frequency for that category.
Pareto chart: a bar graph arranged in frequency order. Remember that this would only make sense for a nominal level of measurement.
Pie chart: a circle that is divided into wedges that represent the relative frequency of a category.
Histogram: a bar graph in which the data is quantitative.
Stem-and-leaf plot: a table that represents either qualitative data or quantitative data by dividing that data into two parts.
Line chart: a series of points connected by line segments, in which each point represents the frequency of a category.
Time-series diagram: a histogram or line chart in which the x-axis represents time.