1
Honors Statistics Chapter 3 Part 1
Displaying and Describing Categorical Data
2
Learning Objectives
Rubric:
Level 1 – Know the objectives.
Level 2 – Fully understand the objectives.
Level 3 – Use the objectives to solve simple problems.
Level 4 – Use the objectives to solve more advanced problems.
Level 5 – Adapt and apply the objectives to different and more complex problems.

Summarize the distribution of a categorical variable with a frequency table.
Display the distribution of a categorical variable with a bar chart or pie chart.
Recognize misleading statistics.
Know how to make and examine a contingency table.
Be able to make and examine a segmented bar chart of the conditional distribution of a variable for two or more categories.
3
Learning Objectives
Describe the distribution of a categorical variable in terms of its possible values and relative frequency.
Understand how to examine the association (independence or dependence) between categorical variables by comparing conditional and marginal percentages.
Know what Simpson’s paradox is and be able to recognize when it occurs.
4
Learning Objective 1: Distributions
Definition: Distribution. The pattern of variation of a variable: what values the variable takes and how often it takes these values. A distribution tells us the possible values a variable takes as well as how often those values occur (frequency or relative frequency).
5
Learning Objective 1: Proportion & Percentage (Relative Frequencies)
The proportion of the observations that fall in a certain category is the frequency (count) of observations in that category divided by the total number of observations:
proportion = (frequency of that class) / (sum of all frequencies)
The percentage is the proportion multiplied by 100. Proportions and percentages are also called relative frequencies.
6
Learning Objective 1: Frequency, Proportion, & Percentage - Example
If 4 students out of 40 received an “A”, then 4 is the frequency, 4/40 = 0.10 is the proportion (relative frequency), and 10% is the percentage (0.10 · 100 = 10%).
7
Learning Objective 1: Frequency Table
A frequency table is a listing of the possible values for a variable, together with the number of observations and/or the relative frequency for each value. Frequency tables are often used to organize categorical data. Frequency tables display the category names and the counts of the number of data values in each category. Relative frequency tables also display the category names, but they give the percentages (and/or proportions) rather than the counts for each category.
8
Learning Objective 1: Class Problem
A stock broker has been following different stocks over the last month and has recorded whether each stock is up, the same, or down in value. The results are shown in the table below.
What is the variable of interest?
What type of variable is it?
Add proportions to this frequency table.

Performance of stock   Count
Up                     21
Same                   7
Down                   12
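As a rough sketch of the last task, the proportions can be computed by tallying the observations and dividing each count by the total. The list of raw observations below is rebuilt from the slide's counts, and the variable names are illustrative assumptions, not part of the original problem.

```python
from collections import Counter

# Raw observations rebuilt from the slide's counts (Up 21, Same 7, Down 12).
observations = ["Up"] * 21 + ["Same"] * 7 + ["Down"] * 12

# Frequency table: category -> count.
counts = Counter(observations)
total = sum(counts.values())

# Proportion = frequency of the category / sum of all frequencies.
proportions = {category: count / total for category, count in counts.items()}

print(f"{'Performance':<12}{'Count':>6}{'Proportion':>12}")
for category in ("Up", "Same", "Down"):
    print(f"{category:<12}{counts[category]:>6}{proportions[category]:>12.3f}")
```

The proportions should add up to 1, which is a quick check that no category was missed.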
9
Learning Objective 2: Graphs for Categorical Variables
Displaying categorical data
Frequency tables can be difficult to read. Sometimes it is easier to analyze a distribution by displaying it with a bar graph or pie chart.

Frequency Table
Format               Count of Stations
Adult Contemporary   1556
Adult Standards      1196
Contemporary Hit     569
Country              2066
News/Talk            2179
Oldies               1060
Religious            2014
Rock                 869
Spanish Language     750
Other Formats        1579
Total                13838

Relative Frequency Table
Format               Percent of Stations
Adult Contemporary   11.2
Adult Standards      8.6
Contemporary Hit     4.1
Country              14.9
News/Talk            15.7
Oldies               7.7
Religious            14.6
Rock                 6.3
Spanish Language     5.4
Other Formats        11.4
Total                99.9
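The relative frequency table can be derived from the frequency table. The sketch below recomputes the percentages from the station counts, rounding to one decimal place (the rounding rule is an assumption based on how the table is printed). It also shows why the percent column totals 99.9 rather than 100: each entry is rounded before the entries are added.

```python
# Station counts copied from the frequency table on the slide.
station_counts = {
    "Adult Contemporary": 1556,
    "Adult Standards": 1196,
    "Contemporary Hit": 569,
    "Country": 2066,
    "News/Talk": 2179,
    "Oldies": 1060,
    "Religious": 2014,
    "Rock": 869,
    "Spanish Language": 750,
    "Other Formats": 1579,
}

total = sum(station_counts.values())

# Percentage = 100 * count / total, rounded to one decimal place.
percents = {fmt: round(100 * n / total, 1) for fmt, n in station_counts.items()}

# Adding the rounded entries can give slightly less (or more) than 100.
rounded_total = round(sum(percents.values()), 1)

for fmt, pct in percents.items():
    print(f"{fmt:<20}{pct:>6.1f}")
print(f"{'Total':<20}{rounded_total:>6.1f}")
```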
10
Learning Objective 2: Graphs for Categorical Variables
Use pie charts and bar graphs to summarize categorical variables.
Bar Graph: A graph that displays a vertical bar for each category.
Pie Chart: A circle having a “slice of pie” for each category.
[Diagram: categorical data is graphed with a pie chart or a bar chart]
11
Learning Objective 2: Graphs for Categorical Variables
Because the variable is categorical, the data in the graph can be ordered any way we want (alphabetical, by increasing value, by year, by personal preference, etc.).
[Figures: example bar charts and pie charts]
12
Learning Objective 2: Bar Graphs
Bar graphs are used for summarizing a categorical variable. Bar graphs display a vertical bar for each category. The height of each bar represents either the count (“frequency”) or the percentage (“relative frequency”) for that category. It is usually easier to compare categories with a bar graph than with a pie chart. A bar chart stays true to the area principle. The bars are separated to emphasize the fact that each class is a separate category.
13
Learning Objective 2: Bar Graphs
Either counts (frequency bar chart) or proportions (relative frequency bar chart) may be shown on the y-axis. This will not change the shape or relationships of the graph. Make sure all graphs have a descriptive title and that the axes are labeled (this is true for all graphs).
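A small sketch can check the claim that switching from counts to proportions does not change the shape of the graph. It reuses the stock counts from the class problem earlier in the deck; the helper `relative_heights` is a hypothetical name introduced only for this illustration.

```python
import math

# Counts from the stock-performance class problem.
counts = {"Up": 21, "Same": 7, "Down": 12}
total = sum(counts.values())
proportions = {category: n / total for category, n in counts.items()}


def relative_heights(values):
    """Scale bar heights so that the tallest bar has height 1."""
    tallest = max(values.values())
    return {category: v / tallest for category, v in values.items()}


count_shape = relative_heights(counts)
proportion_shape = relative_heights(proportions)

# Every bar keeps the same height relative to the tallest bar.
same_shape = all(
    math.isclose(count_shape[c], proportion_shape[c]) for c in counts
)
print(same_shape)
```

Only the numbers on the y-axis change; the bars keep the same heights relative to one another.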
14
Learning Objective 2: Bar Graphs - Procedure
A bar chart is a graphical device for depicting categorical data.
On one axis (usually the vertical axis), pick an appropriate scale for frequency, relative frequency, or percentage, and label it.
On the other axis (usually the horizontal axis), specify the labels that are used for each of the categories.
Using a bar of fixed width (to maintain the area principle) drawn above each class label, extend the height appropriately.
Title the graph.
15
Learning Objective 2: Bar Graphs Example
Construct a bar graph for the following table of absences today by grade level.

Grade Level   Absences Today
6th           7
7th           12
8th           4
16
Learning Objective 2: Bar Graphs Example - Solution
Step One: Draw your axes.
17
Learning Objective 2: Bar Graphs Example - Solution
Step Two: Scale and label your axes.
[Figure: vertical axis “# of Absences” with ticks at 5, 10, and 15; horizontal axis “Grade Level” with categories 6th, 7th, and 8th]
18
Learning Objective 2: Bar Graphs Example - Solution
Step Three: Plot your data.
[Figure: data values marked on axes “# of Absences” (ticks at 5, 10, and 15) and “Grade Level” (6th, 7th, 8th)]
19
Learning Objective 2: Bar Graphs Example - Solution
Step Four: Fill in your bars.
[Figure: bars filled in on axes “# of Absences” (ticks at 5, 10, and 15) and “Grade Level” (6th, 7th, 8th)]
20
Learning Objective 2: Bar Graphs Example - Solution
Step Five: Title the graph.
[Figure: completed bar graph titled “Absences in Each Grade Level”, with axes “# of Absences” (ticks at 5, 10, and 15) and “Grade Level” (6th, 7th, 8th)]
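The same five steps can be roughed out in code. The sketch below is a plain-text approximation, not the slide's figure: it draws horizontal bars made of `#` characters instead of vertical bars, and the function name `text_bar_chart` is an assumption made for this example. It still carries a title, both axis labels, and one fixed-width bar per category.

```python
# Absences today by grade level, from the example table.
absences = {"6th": 7, "7th": 12, "8th": 4}


def text_bar_chart(data, title, category_label, value_label):
    """Build a labeled text bar chart with one '#' per unit of count."""
    lines = [title, f"{category_label} | {value_label}"]
    for category, count in data.items():
        # Each bar has the same thickness (one text row), so bar length
        # alone carries the value, in keeping with the area principle.
        lines.append(f"{category:>4} | {'#' * count} {count}")
    return "\n".join(lines)


chart = text_bar_chart(
    absences,
    title="Absences in Each Grade Level",
    category_label="Grade Level",
    value_label="# of Absences",
)
print(chart)
```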
21
Learning Objective 2: Graphs for Categorical Variables
Many students spend lots of time constructing graphs only to forget the labels. It is imperative to communicate the data with the proper labels and scaling. Unless specifically directed to do so, do not create a pie chart. Statisticians prefer bar charts to pie charts because bar charts are easier to create and make categories easier to compare.
22
Learning Objective 2: Pie Charts
When you are interested in parts of the whole (relative frequency or percentages), a pie chart might be your display of choice. Pie charts show the whole group of cases as a circle. They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.
23
Learning Objective 2: Pie Charts - Procedure
A pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
First draw a circle; then subdivide the circle into sectors that correspond in area to the relative frequency for each category.
Since there are 360 degrees in a circle, a category with a relative frequency of 0.25 would take up 0.25(360) = 90 degrees of the circle.
“Good practice” requires including a title and either wedge labels or a legend.
24
Learning Objective 2: Pie Chart Example
Construct a pie chart for the table on U.S. sources of electricity below.
25
Learning Objective 2: Pie Chart Example - Solution
Step 1: Convert the percentage or relative frequency of each category to an angle measurement.
Step 2: Draw a circle and divide it into sectors using the angles calculated.

Angle
0.51 · 360 = 183.6°
0.06 · 360 = 21.6°
0.16 · 360 = 57.6°
0.21 · 360 = 75.6°
0.03 · 360 = 10.8°
Full circle: 360°
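Step 1 can be sketched as a one-line conversion. The list below holds only the five relative frequencies that survive in the slide text; the category names are not reproduced here, and the five values sum to 0.97, so the sketch does not assume the list is complete. The function name `sector_angle` is an illustrative assumption.

```python
def sector_angle(relative_frequency):
    """Angle in degrees of the pie sector for one category."""
    return round(relative_frequency * 360, 1)


# Relative frequencies as they appear on the slide (category names not shown).
relative_frequencies = [0.51, 0.06, 0.16, 0.21, 0.03]

angles = [sector_angle(rf) for rf in relative_frequencies]

for rf, angle in zip(relative_frequencies, angles):
    print(f"{rf:.2f} * 360 = {angle} degrees")
```

For instance, `sector_angle(0.25)` gives 90.0, matching the quarter-circle case in the procedure.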
26
Learning Objective 2: Pie Chart Example - Solution
Step 3: Using “Good practices” include a title and either wedge labels or legend.
27
Learning Objective 2: Pie Chart
28
Learning Objective 3: Misleading Statistics
“There are three kinds of lies: lies, damned lies, and statistics.” Benjamin Disraeli
29
Learning Objective 3: Misleading Statistics
Survey problems
  Choice of sample
  Question phrasing
Misleading graphs
  Scale
  Missing numbers
  Pictographs
Correlation vs. causation
Self-interest study
Partial pictures
Deliberate distortions
Mistakes
30
Learning Objective 3: Misleading Statistics
Questions to Ask When Looking at Data and/or Graphs. Is the information presented correctly? Is the graph trying to influence you? Does the scale use a regular interval? What impression is the graph giving you?
31
Learning Objective 3: Misleading Statistics
The best data displays observe a fundamental principle of graphing data called the area principle. The area principle says that the area occupied by a part of the graph should correspond to the magnitude of the value it represents. Violations of the area principle are a common way of misleading with statistics.
32
Learning Objective 3: Misleading Statistics
Adjusting the scale of a graph is a common way to mislead (or lie) with statistics, because the distorted graph no longer follows the area principle. Example:
33
Learning Objective 3: Misleading Statistics - Why is this graph misleading?
The title tells the reader what to think (that there are huge increases in price).
The scale moves from 0 to 80,000 in the same amount of space as 80,000 to 81,000.
The actual increase in price is 2,000 pounds, which is less than a 3% increase.
The graph shows the second bar as being 3 times the size of the first bar, which implies that the price tripled (a 200% increase).
The graph violates the area principle.
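The gap between the real change and the drawn change can be worked out numerically. The sketch below assumes prices of 80,000 and 82,000 pounds (the slide gives only the 2,000 pound increase and the axis values, so the exact prices are an assumption) and models the distorted axis as described: the first tick interval covers 0 to 80,000 and every later interval covers 1,000. The function names are hypothetical.

```python
def percent_increase(old, new):
    """Actual percentage change from old to new."""
    return 100 * (new - old) / old


def drawn_height(value, first_tick=80_000, step=1_000):
    """Bar height in tick intervals on the distorted axis.

    The first interval spans 0..first_tick; each later interval spans `step`.
    """
    if value <= first_tick:
        return value / first_tick
    return 1 + (value - first_tick) / step


# Assumed prices consistent with the 2,000 pound increase on the slide.
old_price, new_price = 80_000, 82_000

actual_increase = percent_increase(old_price, new_price)
height_ratio = drawn_height(new_price) / drawn_height(old_price)

print(f"Actual increase: {actual_increase}%")
print(f"Second bar is drawn {height_ratio} times as tall as the first")
```

Under these assumptions the second bar is drawn three times as tall even though the price rose by only 2.5%.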
34
Learning Objective 3: Misleading Statistics - A more accurate graph:
An unbiased title.
A scale with a regular interval.
This shows a more accurate picture of the increase.
Follows the area principle.
35
Learning Objective 3: Misleading Statistics
Why is this graph misleading? The scale does not have a regular interval.
36
Learning Objective 3: Misleading Statistics
Graphs in the news can be misleading. The margin of error is the amount (usually in percentage points) that the results can be “off by.” Be wary of data with large margins of error.
37
Learning Objective 3: Misleading Statistics
From CNN.com
38
Learning Objective 3: Misleading Statistics
Problems:
The difference between Democrats and Republicans (and between Democrats and Independents) is 8 percentage points (62 – 54). Because the margin of error is 7 percentage points, the true difference could be much smaller than the graph suggests.
The graph implies that the Democrats were 8 times more likely to agree with the decision. In truth, they were only slightly more likely to agree with the decision.
The graph does not accurately show that a majority of all groups interviewed agreed with the decision.
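A quick way to see why the 8-point gap is weak evidence is to put the margin of error around each percentage and check whether the ranges overlap. This is only a sketch: it assumes the 7-point margin applies to each group's percentage separately, which the slide does not spell out, and the function name `interval` is illustrative.

```python
MARGIN_OF_ERROR = 7  # percentage points, as stated on the slide


def interval(percent, margin=MARGIN_OF_ERROR):
    """Range of plausible values: percent plus or minus the margin."""
    return (percent - margin, percent + margin)


democrats = interval(62)   # (55, 69)
others = interval(54)      # Republicans and Independents: (47, 61)

# Two ranges overlap when each one starts before the other ends.
ranges_overlap = democrats[0] <= others[1] and others[0] <= democrats[1]

print(democrats, others, ranges_overlap)
```

The ranges overlap between 55 and 61, so the poll alone cannot rule out that the groups agreed at nearly the same rate. Note also that the lower end for Democrats is still above 50.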
39
Learning Objective 3: Misleading Statistics
CNN.com updates the graph:
40
Learning Objective 3: Area Principle - Pictographs
Double the length, width, and height of a cube, and the volume increases by a factor of eight.
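The same reasoning explains why scaled pictures exaggerate. As a minimal sketch (the function names are assumptions for this example), scaling every dimension of a picture by a factor k multiplies its area by k squared and the volume of a 3-D drawing by k cubed.

```python
def area_factor(k):
    """How much the area grows when every length is multiplied by k."""
    return k ** 2


def volume_factor(k):
    """How much the volume grows when every length is multiplied by k."""
    return k ** 3


# Doubling each dimension to show a value that merely doubled:
print(area_factor(2))    # a flat pictograph looks 4 times as large
print(volume_factor(2))  # a 3-D pictograph looks 8 times as large
```

To follow the area principle, a pictograph for a value that doubles should double in area, not in every dimension.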
41
Learning Objective 3: Misleading Statistics
What’s Wrong With This Picture? You might think that a good way to show the Titanic data is with this display:
42
Learning Objective 3: Misleading Statistics
The ship display makes it look like most of the people on the Titanic were crew members, with a few passengers along for the ride. When we look at each ship, we see the area taken up by the ship, instead of the length of the ship. The ship display violates the area principle: The area occupied by a part of the graph should correspond to the magnitude of the value it represents.
43
Learning Objective 3: Misleading Statistics
Missing Numbers