Displaying and describing categorical data

Presentation transcript:

Lecture 2: Displaying and describing categorical data

Make a picture
Large tables are inconvenient: we see many, many rows but cannot observe anything (see next slide).

It has about 100 rows

Make a picture
In the previous table, what if we wanted to see the proportions of freshmen, sophomores, juniors, and seniors on the Commodores football team? We would have to draw a chart. A chart should let the eye immediately capture the differences between proportions.

A frequency table
We first summarize the table we have into a shorter one:
    Freshmen      34
    Sophomores    25
    Juniors       30
    Seniors       14

This table is still a bit too hard
We can, of course, compare 4 numbers. But what if we had more rows? Say, ages 0-2, 2-4, 4-6, and so on. Or what if the numbers are large: compare 10123248 to 10123419.
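One way to make the four counts easier to compare is to turn them into relative frequencies (each count divided by the total). The sketch below is only an illustration in Python, which the lecture itself does not use; the variable names are our own, and the counts are copied from the frequency table above.

```python
# Class-year counts copied from the frequency table in the lecture.
counts = {"Freshmen": 34, "Sophomores": 25, "Juniors": 30, "Seniors": 14}

total = sum(counts.values())  # number of players in the table

# Relative frequency = count / total, expressed here as a percentage.
relative = {year: 100 * n / total for year, n in counts.items()}

for year, pct in relative.items():
    print(f"{year:<11}{counts[year]:>4}{pct:>7.1f}%")
```

Percentages stay comparable no matter how large the raw counts are, which is exactly the difficulty with numbers like 10123248 and 10123419.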

A bar chart
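The chart image from the slide is not reproduced in this transcript. As a rough stand-in, here is a minimal Python sketch that draws a text bar chart of the same class-year counts; the function name `text_bar_chart` is hypothetical, and a plotting library would normally be used instead.

```python
# Class-year counts copied from the frequency table in the lecture.
counts = {"Freshmen": 34, "Sophomores": 25, "Juniors": 30, "Seniors": 14}


def text_bar_chart(counts, symbol="#"):
    """Return one line per category; each bar's length equals the count."""
    width = max(len(label) for label in counts)
    return [f"{label:<{width}} | {symbol * n} {n}" for label, n in counts.items()]


chart = text_bar_chart(counts)
print("\n".join(chart))
```

Because every bar starts at the same place and its length is the count, the eye compares lengths directly instead of reading numbers.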

A pie chart

And many more!
Just open MS Word and hit “Insert chart”. Why is this one bad?

Exploring the relationship
A single football player has two categorical “properties”: say, year of study and position. We want to know: are they related or “independent”? That is, if a player is a senior, can we confidently say that he is most probably not a wide receiver?

Let’s switch to the book: Titanic survivors
             First Class   Second Class   Third Class   Crew   Total
    Alive            203            118           178    212     711
    Dead             122            167           528    673    1490
    Total            325            285           706    885    2201
Let’s identify the “who”s and the “what”s. Can we now say that someone from the first class had a better chance of surviving?

The bad thing about the table of counts is that we see too much. We see that 203 first-class passengers survived versus 178 from the third class. But then we look down at the totals and see 325 versus 706.

Instead of “Alive + Total” we now have only one number to compare
Dividing each count by its class total gives percentages:
             First Class   Second Class   Third Class   Crew
    Alive            62%            41%           25%    24%
    Dead             38%            59%           75%    76%
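The percentages on this slide come from dividing each count by the total for its class (its column). A minimal Python sketch of that arithmetic, using the counts from the Titanic table; the variable names are our own and Python is not part of the lecture.

```python
# Counts from the Titanic table, in the order First, Second, Third, Crew.
classes = ["First", "Second", "Third", "Crew"]
alive = [203, 118, 178, 212]
dead = [122, 167, 528, 673]

# For each class, divide by that class's total to get
# the percentage alive and the percentage dead within the class.
survival_given_class = {}
for name, a, d in zip(classes, alive, dead):
    class_total = a + d
    survival_given_class[name] = (
        round(100 * a / class_total),
        round(100 * d / class_total),
    )

for name, (pct_alive, pct_dead) in survival_given_class.items():
    print(f"{name:<7} alive {pct_alive:>2}%   dead {pct_dead:>2}%")
```

Each class now contributes a single survival percentage, so classes of very different sizes (325 versus 706 people) can be compared fairly.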

Conditional distributions
We can also ask, for example: what share of the surviving passengers were in the first class? In the second class? And so on. Mathematically we ask: what is the proportion of first-class passengers CONDITIONED on the fact that they survived?

We get the following table
             First   Second   Third    Crew   Total
    Alive      203      118     178     212     711
             28.6%    16.6%   25.0%   29.8%
The first column reads: 203 out of 711 survivors were from the first class. Or: 28.6% of all survivors were from the first class.
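Here we condition on survival: every count in the “Alive” row is divided by the 711 survivors. A short Python sketch of the calculation, with variable names chosen only for illustration:

```python
# Survivor counts by class, from the "Alive" row of the Titanic table.
alive = {"First": 203, "Second": 118, "Third": 178, "Crew": 212}

total_alive = sum(alive.values())  # all survivors

# Distribution of class among survivors, as percentages with one decimal.
class_given_alive = {c: round(100 * n / total_alive, 1) for c, n in alive.items()}

for c, pct in class_given_alive.items():
    print(f"{c:<7}{alive[c]:>4} of {total_alive}  = {pct}%")
```

Note the different denominator: here it is the row total (survivors), whereas the 62% figure for the first class used the column total (everyone in first class).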

Rule of thumb
The rule of thumb is: we have a table with one property as rows (alive/dead) and another property as columns (class). We then restrict ourselves to one particular row or column. Say, “how are the survivors distributed among the classes?” This means that we care only about survivors, so we condition on the fact that one survived.

Bar chart again
We express survivor percentages depending on class.

One more bar chart
And here is a side-by-side chart of survivors vs. nonsurvivors.

We (almost) see that the survival chance DEPENDS on the class
If all conditional distributions (conditioned on what?) were the same, we would say that survival chances and class are INDEPENDENT.
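To see how far the Titanic data are from independence, we can compare the survival rate within each class to the overall survival rate. The Python sketch below is only a descriptive comparison under our own choice of names, not a formal statistical test.

```python
# Counts from the Titanic table.
alive = {"First": 203, "Second": 118, "Third": 178, "Crew": 212}
total = {"First": 325, "Second": 285, "Third": 706, "Crew": 885}

# Overall survival rate, ignoring class (the marginal distribution).
overall = sum(alive.values()) / sum(total.values())

# Under independence, every class would have roughly this same rate.
# The gap shows how far each class's conditional rate is from it.
gaps = {c: alive[c] / total[c] - overall for c in alive}

print(f"overall survival rate: {100 * overall:.1f}%")
for c, gap in gaps.items():
    print(f"{c:<7} rate {100 * alive[c] / total[c]:.1f}%  gap {100 * gap:+.1f} points")
```

The first class sits far above the overall rate while the third class and the crew sit below it, which is what “survival depends on class” means here.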

Homework
Read chapter 2. Work through the examples and carefully read the “what can go wrong” section.
Do p.33+: 1, 4, 5, 6, 17, 31, 34, 37bce, 41abd