1
Analyzing Categorical Data
1.1 Analyzing Categorical Data
2
(1.1) Bar & Pie Graphs, Marginal & Conditional Distributions
I can…
Make a bar graph of the distribution of categorical data.
Recognize when a pie chart can be used.
Identify what makes some graphs deceptive.
Use a two-way table of counts to answer questions involving marginal and conditional distributions.
Describe the relationship between two categorical variables in context by comparing the appropriate conditional distributions.
Construct bar graphs to display the relationship between two categorical variables.
3
Categorical Data The distribution of a categorical variable lists the categories and gives either the count or the percent of individuals who fall in each category. Round-off error could exist if percents do not add to 100% but are close.
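A minimal sketch of how a distribution (counts and percents) could be computed, and how round-off error can show up. The three responses below are made up for illustration and are not from the slides; the function name `distribution` is also just a placeholder.

```python
from collections import Counter

def distribution(values):
    """Return {category: (count, percent)} with percents rounded to 0.1."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}

# Hypothetical data: one response in each of three categories.
responses = ["red", "green", "blue"]
dist = distribution(responses)

# Each category is 33.3%, so the rounded percents add to 99.9, not 100.
percent_sum = round(sum(pct for _, pct in dist.values()), 1)
print(dist)
print(percent_sum)
```

The percents fall just short of 100% only because each one was rounded, which is the round-off error described above.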
4
Example: Smart-phone users by age group (2011)
Would it be better to make a pie chart or a bar graph in this example?
5
Bar graphs & Pie charts Use a pie chart only when you want to emphasize each category’s relationship to the whole. Bar graphs are easier to read and make comparisons easier.
6
Smart-phone Users by Age Group (2011)
(Bar graph; vertical axis marked from 10 to 60.) What is good? What is bad?
7
Graphs: Good & Bad Bar graphs help us compare by looking at the heights of bars. Our eyes, however, react to the area of the bars as well as to their height. Beware of the pictograph. Beware of the scale on the y-axis.
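A rough sketch of the arithmetic behind these two warnings. The numbers (a picture enlarged by a factor of 2, bars of 50 and 60, an axis starting at 40) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def pictograph_ratios(scale):
    """A picture enlarged by `scale` in both height and width:
    return (height ratio, area ratio)."""
    return scale, scale ** 2

def apparent_ratio(smaller, larger, axis_start):
    """Ratio of drawn bar heights when the y-axis starts at `axis_start`."""
    return (larger - axis_start) / (smaller - axis_start)

# Doubling the height of a pictograph quadruples the area our eyes react to.
height_ratio, area_ratio = pictograph_ratios(2)

# Hypothetical bars of 50 and 60: the true ratio is 1.2,
# but with the axis starting at 40 the taller bar is drawn twice as tall.
true_ratio = apparent_ratio(50, 60, 0)
truncated_ratio = apparent_ratio(50, 60, 40)

print(height_ratio, area_ratio)
print(true_ratio, truncated_ratio)
```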
8
Misleading Graphs
10
Two Way Tables, Marginal & Conditional Distributions
Which superpower would you prefer? Invisibility, Super-strength, Telepathy, Flying, Freeze time
11
Two-way Table: Do boys & girls differ in their superpower preference?
(WRITE THIS TABLE ON THE BOARD)
Superpower        Female    Male    TOTAL
Invisibility
Super-strength
Telepathy
Flying
Freeze time
TOTAL
From a sample of 200 children from the United Kingdom aged 9-17, selected from …
Discuss categorical variables: the row variable is opinion about superpower and the column variable is gender. In a two-way table it is okay to switch the locations of the variables. We will later learn about explanatory and response variables.
12
First look at the distribution of each categorical variable separately: Marginal Distribution
Calculate the marginal distribution (in percents) of superpower preferences.
Superpower        Percent
Invisibility      /200=
Super-strength
Telepathy
Flying
Freeze time
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
We could also have calculated the marginal distribution of gender: Female 115/200 and Male 85/200.
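A short sketch of the marginal calculation, using only counts that appear in these slides: the gender totals 115 and 85, and the Invisibility counts 17 and 13 from the conditional distribution table. The function name `marginal_percents` is a placeholder, not something from the textbook.

```python
def marginal_percents(totals):
    """Turn row or column totals into percents of the grand total."""
    grand_total = sum(totals.values())
    return {name: 100 * count / grand_total for name, count in totals.items()}

# Column totals from the slides: 115 females and 85 males (200 children).
gender_totals = {"Female": 115, "Male": 85}
gender_marginal = marginal_percents(gender_totals)

# Invisibility row total: 17 females + 13 males, out of 200 children.
invisibility_percent = 100 * (17 + 13) / 200

print(gender_marginal)
print(invisibility_percent)
```

The other superpower rows work the same way once their row totals are filled in.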
13
Bar Graph of Marginal Distribution
Observations
14
Next look at relationships between categorical variables: Conditional Distributions
A conditional distribution of a variable describes the values of that variable among individuals who have a specific value of another variable. We use the term conditional because, in our example, this distribution describes only the children who satisfy the condition that they are female or that they are male.
15
Calculate the conditional distribution of responses for females and males. (Write this on the board)
Superpower        %Female    %Male
Invisibility      17/115=    13/85=
Super-strength
Telepathy
Flying
Freeze time
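A small sketch of the Invisibility row of this calculation. The counts 17, 13, 115, and 85 come from the slides; the helper name `conditional_percent` and the rounding to one decimal place are assumptions.

```python
def conditional_percent(count, group_total):
    """Percent of one group (e.g. females) giving a particular response."""
    return round(100 * count / group_total, 1)

# Invisibility: 17 of 115 females and 13 of 85 males.
female_invisibility = conditional_percent(17, 115)
male_invisibility = conditional_percent(13, 85)

print(female_invisibility, male_invisibility)
```

Each percent is taken out of its own group total (115 or 85), not out of all 200 children. That is what makes the two columns comparable even though the groups differ in size.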
16
Make a side-by-side bar graph comparing the preferences of males and females.
(Write this on the board)
17
Do boys and girls differ in their preference of Superpower?
Four-step Process
State: What is the relationship between gender and the answer to the question “What superpower would you prefer?”
Plan: We suspect gender might influence a child’s preference about superpowers. So we should compare the conditional distribution of responses for females alone and for males alone.
18
Do: We can make a table and a side-by-side bar graph comparing the preferences of males and females using percents (since the numbers of females and males are different).
Conclude: Based on the sample data, females were more likely to prefer _____ than males. Males were more likely to prefer _____ than females. Females were slightly more likely to choose _______ than males. Males and females were equally likely to choose _________.
19
Association versus Causation
We can say there is an association between the categorical variables gender and superpower preference. Often we confuse association with causation. When we study quantitative variables, we will talk about correlation. Even a strong association between two categorical variables can be influenced by other variables lurking in the background.
20
Assignment: Read pp. 8-22 (omit the Simpson’s Paradox reading). Do p. 22 (11, 13, 15, 17, 19, 21, 23, 25, 27-32).