Data Analysis for Two-Way Tables


Data Analysis for Two-Way Tables Chapter 2.5

Two-way tables An experiment has a two-way, or block, design if two categorical factors are studied with several levels of each factor. Two-way tables organize data about two categorical variables obtained from a two-way, or block, design. (There are now two ways to group the data.) First factor: age (group by age). Second factor: education (record education).

Two-way tables We call education the row variable and age group the column variable. Each combination of values for these two variables is called a cell. For each cell, we can compute a proportion by dividing the cell entry by the total sample size. The collection of these proportions would be the joint distribution of the two variables.
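As a rough illustration of the joint distribution described above, the sketch below divides every cell count by the total sample size. The counts and category labels are hypothetical placeholders, not the census figures from the slides, and the function name `joint_distribution` is made up for this example.

```python
# Hypothetical counts (NOT the census data): rows are education levels,
# columns are age groups.
counts = {
    "No college": {"25-34": 10, "35-54": 30},
    "College":    {"25-34": 30, "35-54": 30},
}

def joint_distribution(table):
    """Return each cell count divided by the total sample size."""
    total = sum(sum(row.values()) for row in table.values())
    return {
        row_label: {col: n / total for col, n in row.items()}
        for row_label, row in table.items()
    }

joint = joint_distribution(counts)
for row_label, row in joint.items():
    print(row_label, {col: f"{p:.1%}" for col, p in row.items()})
```

The proportions over all cells add up to 1, because every individual falls into exactly one cell.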

Marginal distributions We can look at each categorical variable separately in a two-way table by studying the row totals and the column totals. They represent the marginal distributions, expressed in counts or percentages. (They are written as if in a margin.) Data source: 2000 U.S. census.

The marginal distributions can then be displayed on separate bar graphs, typically expressed as percents instead of raw counts. Each graph represents only one of the two variables, completely ignoring the second one. The marginal distributions summarize each categorical variable independently. But the two-way table actually describes the relationship between both categorical variables. The cells of a two-way table represent the intersection of a given level of one categorical factor and a given level of the other categorical factor.
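A minimal sketch of how the marginal distributions could be computed from row and column totals. The table is hypothetical (it does not reproduce the census data), and the helper name `marginal_distributions` is an assumption made for this example.

```python
# Hypothetical counts (NOT the census data): rows are education levels,
# columns are age groups.
counts = {
    "No college": {"25-34": 10, "35-54": 30},
    "College":    {"25-34": 30, "35-54": 30},
}

def marginal_distributions(table):
    """Return (row marginals, column marginals) as proportions of the grand total."""
    row_totals = {r: sum(row.values()) for r, row in table.items()}
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    grand_total = sum(row_totals.values())
    row_marginal = {r: t / grand_total for r, t in row_totals.items()}
    col_marginal = {c: t / grand_total for c, t in col_totals.items()}
    return row_marginal, col_marginal

education_marginal, age_marginal = marginal_distributions(counts)
print("Education:", {k: f"{v:.0%}" for k, v in education_marginal.items()})
print("Age group:", {k: f"{v:.0%}" for k, v in age_marginal.items()})
```

Each marginal distribution ignores the other variable entirely, which is why the two results say nothing about how education and age are related.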

Conditional Distribution In the table below, the 25 to 34 age group occupies the first column. To find the complete distribution of education in this age group, look only at that column. Compute each count as a percent of the column total. These percents should add up to 100% because all persons in this age group fall into one of the education categories. These four percents together are the conditional distribution of education, given the 25 to 34 age group. Data source: 2000 U.S. census.

Conditional distributions The percents within the table represent the conditional distributions. Comparing the conditional distributions allows you to describe the “relationship” between both categorical variables. Here the percents are calculated by age range (columns). 29.30% = 11071 / 37785 = cell total / column total.
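The column-percent calculation can be sketched as follows. Only the pair 11071 and 37785 comes from the slide; the small table is hypothetical, and `column_percents` is a name invented for this example.

```python
def column_percents(table):
    """Divide each cell by its column total (conditional distribution per column)."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        r: {col: n / col_totals[col] for col, n in row.items()}
        for r, row in table.items()
    }

# The single cell quoted on the slide: cell total / column total.
slide_cell_percent = f"{11071 / 37785:.2%}"
print(slide_cell_percent)

# Hypothetical counts (NOT the census data) to show a full conditional distribution.
counts = {
    "No college": {"25-34": 10, "35-54": 30},
    "College":    {"25-34": 30, "35-54": 30},
}
conditional = column_percents(counts)
print({r: f"{row['25-34']:.0%}" for r, row in conditional.items()})
```

Within any one column the percents add up to 100%, which is a quick check that the right total was used as the denominator.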

The conditional distributions can be graphically compared using side-by-side bar graphs of one variable for each value of the other variable. Here, the percents are calculated by age range (columns).

Music and wine purchase decision What is the relationship between type of music played in supermarkets and type of wine purchased? We want to compare the conditional distributions of the response variable (wine purchased) for each value of the explanatory variable (music played). Therefore, we calculate column percents. Calculations: When no music was played, there were 84 bottles of wine sold. Of these, 30 were French wine. 30/84 = 0.357, so 35.7% of the wine sold was French when no music was played: 35.7% = 30 / 84 = cell total / column total. We calculate the column conditional percents similarly for each of the nine cells in the table.

Does background music in supermarkets influence customer purchasing decisions? For every two-way table, there are two sets of possible conditional distributions: wine purchased for each kind of music played (column percents), and music played for each kind of wine purchased (row percents).
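To make the difference between the two sets of conditional distributions concrete, here is a small sketch that computes both from one table. Only the figures 30 and 84 for the no-music column come from the text; every other count is an illustrative assumption, and the function names are hypothetical.

```python
# Rows: type of wine purchased. Columns: type of music played.
# Illustrative counts; only "30 French out of 84 with no music" is from the text.
counts = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}

def column_percents(table):
    """Wine purchased for each kind of music played."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {r: {c: n / col_totals[c] for c, n in row.items()}
            for r, row in table.items()}

def row_percents(table):
    """Music played for each kind of wine purchased."""
    return {r: {c: n / sum(row.values()) for c, n in row.items()}
            for r, row in table.items()}

by_music = column_percents(counts)
by_wine = row_percents(counts)
print(f"French wine among no-music sales: {by_music['French']['None']:.1%}")
print(f"No-music share of French wine sales: {by_wine['French']['None']:.1%}")
```

The same cell gives two different percents depending on the denominator, so the choice should follow the question: column percents answer it when music is the explanatory variable.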

Simpson’s paradox An association or comparison that holds for all of several groups can reverse direction when the data are combined (aggregated) to form a single group. This reversal is called Simpson’s paradox. Example: Hospital death rates. On the surface, Hospital B would seem to have a better record. But once patient condition is taken into account, we see that Hospital A in fact has a better record for both patient conditions (good and poor). Here, patient condition was the lurking variable.
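The reversal can be checked numerically. The sketch below uses illustrative counts chosen to reproduce the pattern described; they are assumptions, not figures taken from the slide.

```python
# (deaths, patients) for each hospital and patient condition.
# Illustrative counts only, chosen to show the reversal.
data = {
    "A": {"good": (6, 600), "poor": (57, 1500)},
    "B": {"good": (8, 600), "poor": (8, 200)},
}

# Death rate within each patient condition.
by_condition = {
    hospital: {cond: deaths / patients
               for cond, (deaths, patients) in groups.items()}
    for hospital, groups in data.items()
}

# Death rate after aggregating over patient condition.
overall = {
    hospital: sum(d for d, _ in groups.values()) / sum(n for _, n in groups.values())
    for hospital, groups in data.items()
}

for hospital in data:
    rates = ", ".join(f"{cond} {rate:.1%}" for cond, rate in by_condition[hospital].items())
    print(f"Hospital {hospital}: overall {overall[hospital]:.1%} ({rates})")
```

With these counts Hospital A has the lower death rate within each condition, yet the higher rate overall, because it treats far more patients in poor condition.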