Published by Jemimah Parks. Modified over 7 years ago.
1
Smart Start
In June 2003, Consumer Reports published an article on some sport-utility vehicles they had tested recently. They reported some basic information about each of the vehicles and the results of some tests conducted by their staff. Among other things, the article told the brand of each vehicle, its price, and whether it had standard or automatic transmission. They reported the vehicle’s fuel economy, its acceleration (number of seconds to go from 0 to 60 mph), and its braking distance to stop from 60 mph. The article also rated each vehicle’s reliability as much better than average, better than average, average, worse, or much worse than average.
Describe the W’s of the information.
List the variables and indicate whether each variable is categorical or quantitative. If the variable is quantitative, tell its units.
2
Chapter 1.1 Analyzing Categorical Data
3
The Titanic – The data
Here are some data about the passengers and crew on board.
Columns: Survival, Age, Sex, Class
Values shown: Dead, Alive; Adult; Male, Female; Third, Crew, First
4
Review from Chapter 1
Identify the 5 W’s and How of the data:
Who –
What –
When –
Where –
Why –
How –
5
Frequency Tables Example:
A _____________ like ticket class with only a few categories is easy to read. A table for a variable with dozens or hundreds of categories is much harder to read.

Class    Count
First      325
Second     285
Third      706
Crew       885
6
Frequency Tables
Counts are useful, but sometimes we want to know the fraction or _______________ of the data in each category. Multiply by 100 to express the proportion as a _____________.
_________________ _________________ table: Displays the _____________________ rather than the _____________ of the values in each ____________________.
They describe the distribution of ______________ variable because they name possible categories and tell how frequently each occurs.

Class    %
First
Second
Third
Crew
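The arithmetic behind a relative frequency table can be sketched in a few lines. This is only an illustration using the Titanic class counts from the frequency table in these slides; the variable names are made up for the example.

```python
# Counts by ticket class, taken from the frequency table in the slides.
counts = {"First": 325, "Second": 285, "Third": 706, "Crew": 885}

total = sum(counts.values())  # everyone on board

# Relative frequency = count / total; multiply by 100 for a percent.
proportions = {cls: n / total for cls, n in counts.items()}
percents = {cls: round(100 * p, 1) for cls, p in proportions.items()}

for cls in counts:
    print(f"{cls:<7}{counts[cls]:>5}{percents[cls]:>7.1f}%")
```

The proportions should add up to 1 (the percents to 100, apart from rounding), which is a quick check that no category was left out.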
7
Frequency Table Example
Fill in the table with your M&M data.

Color    Frequency    Relative Frequency    Percent
Blue
Red
Green
Orange
Yellow
Brown
Total
8
Bar Chart - Characteristics
Definition: Bar chart –
Should always obey the _____________________. That is, their widths have to be the _______ so that their ____________ determines the area and is _________________ to the count.
Should have _______________ between the bars to indicate that they are freestanding bars that could be ________________ into any order.
They can be vertical or horizontal.
9
Bar Chart - Example Create a bar chart for your M&M data:
10
Bar Chart – Relative Frequency
You can replace the counts with percentages and create a _______________ ________________ bar chart. The sum of the relative frequencies is 100%. Here is the relative frequency bar chart of the Titanic data:
11
Relative Frequency Bar Chart - Example
Create a relative frequency bar chart for your M&M data:
12
Pie Charts Often difficult to construct by hand.
13
Misleading Graphs: Example
In the regular season, basketball player Dwyane Wade averaged 7.5 assists per game and 2.2 steals per game. Explain what is wrong with the following graph and how to make it better.
Although the heights of the basketballs are correct, our eyes respond to the areas of the basketballs. The area of the basketball representing assists is over 9 times as big. It would be better to use equally wide bars for assists and steals so that only the height of the bars determines the area.
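A short calculation shows why the picture exaggerates the difference. This sketch assumes each basketball image is scaled so that its width grows in the same proportion as its height; that is an assumption about how the pictograph was drawn, not something stated on the slide.

```python
assists_per_game = 7.5
steals_per_game = 2.2

# Ratio the graph is meant to convey (the heights of the two basketballs).
height_ratio = assists_per_game / steals_per_game

# Assumption: width scales with height, so area scales with the square.
area_ratio = height_ratio ** 2

print(f"height ratio: {height_ratio:.2f}")
print(f"area ratio:   {area_ratio:.2f}")
```

With equally wide bars the area ratio would equal the height ratio, so the eye would see the same comparison the numbers make.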
14
Misleading Graphs: Example
Here are two possible bar graphs of the data table. Which one could be considered deceptive? Why?

Previous Ownership   Count   Percent (%)
None                    85          17.0
PC                      60          12.0
Macintosh              355          71.0
Total                  500         100.0
15
Chapter 2 - Continued Two-Way Tables and Marginal Distributions, Relationships between Categorical Variables: Conditional Distributions
16
Two-Way Tables
To answer this question, our first step is to _________________ ______________
Two-Way Table:

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201
17
Each cell of the table gives the ________________________ _______________________________________

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201

The ___________ of the table, both on the right and at the bottom, give _____________.
The bottom line of the table is the ____________________ of the __________________.
The right column is the _________________________ of the variable ____________.
When presented like this, in the _____________ of a contingency table, the __________________________ of one of the variables is called ________________________
18
Find the cell containing 118. Find the cell containing 178.

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201

More third class passengers survived, but does that mean that third class passengers were more likely to survive? Keep in mind, there were more third class passengers on the ship. Therefore, to compare them fairly we need to express them as __________.
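The comparison between the two cells can be checked with a quick calculation. The sketch below divides each survivor count by the size of its own class, using the numbers from the two-way table; the dictionary names are illustrative only.

```python
# Survivor counts and class sizes from the two-way table in the slides.
survivors = {"Second": 118, "Third": 178}
class_totals = {"Second": 285, "Third": 706}

# Share of each class that survived: count divided by that class's total.
survival_share = {cls: survivors[cls] / class_totals[cls] for cls in survivors}

for cls, share in survival_share.items():
    print(f"{cls}: {survivors[cls]} of {class_totals[cls]} survived ({share:.1%})")
```

The raw counts and the shares point in opposite directions here, which is why the counts alone cannot answer a "more likely" question.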
19
Percent of what …? For every cell there are three choices of percent:
20
Percent of What?
Each different percent affects the ________ of the problem.
What percent of survivors were in 2nd class? Who –
What percent were 2nd class passengers who survived?
What percent of 2nd class passengers survived?
Key words –
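The three questions use the same cell (118 second class survivors) but divide by three different totals. Here is a minimal sketch of that arithmetic with the Titanic totals from the two-way table; the variable names are chosen for the example.

```python
cell = 118           # 2nd class passengers who survived
row_total = 711      # all survivors
column_total = 285   # all 2nd class passengers
table_total = 2201   # everyone on board

# "What percent of survivors were in 2nd class?"  -> divide by the row total
percent_of_row = 100 * cell / row_total

# "What percent were 2nd class passengers who survived?" -> divide by the table total
percent_of_table = 100 * cell / table_total

# "What percent of 2nd class passengers survived?" -> divide by the column total
percent_of_column = 100 * cell / column_total

print(f"row %:    {percent_of_row:.1f}")
print(f"table %:  {percent_of_table:.1f}")
print(f"column %: {percent_of_column:.1f}")
```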
21
Super Bowl Example A recent Gallup poll asked 1,008 Americans age 18 and over whether they planned to watch the upcoming Super Bowl. The pollster also asked those who planned to watch whether they were looking forward more to seeing the football game or the commercials. The results are summarized in the table.
22
Example – Finding Marginal Distributions
               Male  Female  Total
Game            279     200    479
Commercials      81     156    237
Won’t Watch     132     160    292
Total           492     516  1,008

Question: What’s the marginal distribution of the responses?
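One way to check an answer to this question is to divide each row total by the grand total. The sketch below does that with the Total column of the Super Bowl table; the names are illustrative.

```python
# Row totals (the right-hand margin) from the Super Bowl table.
response_totals = {"Game": 479, "Commercials": 237, "Won't Watch": 292}

n = sum(response_totals.values())  # number of people polled

# Marginal distribution of responses, as percents of everyone polled.
marginal = {resp: round(100 * count / n, 1) for resp, count in response_totals.items()}

for resp, pct in marginal.items():
    print(f"{resp:<12}{pct:>6.1f}%")
```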
23
Conditional Distributions
A ___________ ______________ shows the _______________ of one variable, for only the individuals who ___________________________________________ _______________________________________
The conditional distribution of ______ ______ conditional on _____________
The conditional distribution of ticket class, conditional on _______________:

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711

         First  Second  Third  Crew  Total
Dead       122     167    528   673   1490
24
Conditional Distributions
What is the conditional relative distribution of survival status among the third class?

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201
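A conditional distribution restricts attention to one column and divides by that column's total. A minimal sketch for the Third class column, with names chosen for the example:

```python
# The Third class column of the two-way table.
third_class = {"Alive": 178, "Dead": 528}

third_total = sum(third_class.values())  # all third class passengers

# Distribution of survival status among third class passengers only.
conditional = {status: count / third_total for status, count in third_class.items()}

for status, share in conditional.items():
    print(f"{status}: {share:.1%}")
```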
25
Independence
If the ____________ ______________ are the ______, we conclude that the variables are _____ _______________. Therefore, they are ______________ of each other.
If the conditional distributions _________, we conclude that the variables are somehow associated. Therefore, they are ______ ____________ of one another.
26
Conditional Distributions
We can create a picture of this table by using two pie charts. We want to restrict the ______ in the problem to ____________ and then restrict the ________to _______________.
27
Segmented Bar Charts We can display the Titanic information by dividing up bars rather than using circles. The resulting segmented bar chart
28
Segmented Bar Charts
29
Super Bowl Example – Finding Conditional Distributions
               Male  Female  Total
Game            279     200    479
Commercials      81     156    237
Won’t Watch     132     160    292
Total           492     516  1,008

Question: How do the conditional distributions of interest in the commercials differ for men and women?
Answer:
Conclusion:
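To compare men and women, each column can be turned into percents of its own total. The sketch below is one way to do the arithmetic with the counts from the table; the nested dictionary layout is just a convenient choice for the example.

```python
# Counts from the Super Bowl table, grouped by gender (one column each).
table = {
    "Male": {"Game": 279, "Commercials": 81, "Won't Watch": 132},
    "Female": {"Game": 200, "Commercials": 156, "Won't Watch": 160},
}

conditional = {}
for gender, responses in table.items():
    gender_total = sum(responses.values())  # 492 men, 516 women
    # Percent of this gender giving each response.
    conditional[gender] = {
        resp: round(100 * count / gender_total, 1)
        for resp, count in responses.items()
    }

for gender, dist in conditional.items():
    print(gender, dist)
```

Reading across the two printed distributions, response by response, is the comparison the question asks for.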
30
Titanic Example – Conditional Distribution
We can turn our Titanic question around and look at the distribution of survival for each ticket class (column percent).
What is the conditional distribution of survival status among ticket classes?

         First  Second  Third  Crew  Total
Alive      203     118    178   212    711
Dead       122     167    528   673   1490
Total      325     285    706   885   2201
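Column percents for the whole table can be computed in one pass. The helper below, `column_percents`, is a hypothetical function written for this illustration; it simply divides every cell by its column total.

```python
def column_percents(table):
    """Return each column's cells as percents of that column's total."""
    result = {}
    for column, cells in table.items():
        column_total = sum(cells.values())
        result[column] = {
            row: round(100 * count / column_total, 1)
            for row, count in cells.items()
        }
    return result


# Titanic two-way table from the slides, stored one ticket class per column.
titanic = {
    "First": {"Alive": 203, "Dead": 122},
    "Second": {"Alive": 118, "Dead": 167},
    "Third": {"Alive": 178, "Dead": 528},
    "Crew": {"Alive": 212, "Dead": 673},
}

survival_by_class = column_percents(titanic)
for ticket_class, dist in survival_by_class.items():
    print(ticket_class, dist)
```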
31
Titanic Example – Column %
Conclusion: Looking at how the percentages change across each row, it looks like your class mattered.
32
Super Bowl Example – Association between Variables
Question: Does it seem that there is an association between interest in Super Bowl TV coverage and a person’s gender?
Steps:

               Male  Female  Total
Game            279     200    479
Commercials      81     156    237
Won’t Watch     132     160    292
Total           492     516  1,008
33
Example – Association between Variables
34
Just Checking
A Statistics class reports the following data on Sex and Eye Color for students in the class.

          Blue  Brown  Green/Other  Total
Males        6     20            6     32
Females      4     16           12     32
Total       10     36           18     64
35
Just Checking
What percent of females are brown-eyed?
What percent of brown-eyed students are female?
What percent of students are brown-eyed females?
What’s the distribution of Eye Color?
What’s the conditional distribution of Eye Color for the males?
Compare the percent who are female among the blue-eyed students to the percent of all students who are female.
Does it seem that Eye Color and Sex are independent? Explain.