Relations in Categorical Data

When a researcher studies the relationship between two variables and both variables are numerical, scatterplots, the correlation coefficient, and regression analysis are useful tools. But when the variables are categorical, the study of a possible relationship begins with a two-way table. Let’s consider an example in which a retailer studies, over a period of time, the types of wine that are sold. The three types are French, Italian, and all remaining wines, labeled Other. It is also of interest to see whether the type of music playing in the store influences the type of wine purchased. So, each time a bottle of wine is sold, the type of music playing in the store (None, French music, or Italian music) must be noted. Let’s add some numbers and put the table on the next slide.

                  Music
Wine       None   French   Italian
French
Italian      11        1        19
Other

You will notice all these numbers add up to 243. This number is the total number of bottles of wine sold during the study. Because the type of music was noted for every bottle sold, 243 is also the total across the three types of music. On the next slide I will add a column and a row to the table. The column will contain the total for each type of wine and the overall number of bottles, and the row will contain the number of bottles sold under each type of music and the same overall total.

                  Music
Wine       None   French   Italian   Total
French
Italian
Other
Total

As before, the counts add up to 243, the total number of bottles of wine sold during the study, and every bottle is also counted under the type of music that was playing when it was sold. So, the Total column is really the distribution of the types of wine sold, and we could make pie charts and bar graphs from this information. This column is often called the marginal distribution of the row variable because it is written in the margin of the table. The Total row has a similar interpretation: it is the marginal distribution of the column variable, the type of music.
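The Total column and Total row described above can be sketched in Python. Only the Italian row (11, 1, 19) and the grand total of 243 can be read from this transcript; the French and Other rows below are assumed values, taken from the version of this wine-and-music example that appears in introductory statistics textbooks, and they are consistent with the total of 243. The names `counts`, `row_totals`, `col_totals`, and `wine_marginal` are illustrative.

```python
# Bottles sold, keyed as counts[wine][music].
# ASSUMPTION: the French and Other rows are not legible in the transcript;
# they are textbook values consistent with the stated total of 243.
counts = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}
music_types = ["None", "French", "Italian"]

# Total column: bottles sold of each wine type.
row_totals = {wine: sum(row.values()) for wine, row in counts.items()}

# Total row: bottles sold while each type of music was playing.
col_totals = {m: sum(counts[w][m] for w in counts) for m in music_types}

# Both margins must add up to the same grand total.
grand_total = sum(row_totals.values())

# Marginal distribution of wine type, as proportions of all bottles sold.
wine_marginal = {wine: n / grand_total for wine, n in row_totals.items()}

print("Row totals:   ", row_totals)
print("Column totals:", col_totals)
print("Grand total:  ", grand_total)
```

A pie chart or bar graph of `wine_marginal` would display the marginal distribution of the row variable.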

In this example the retailer’s real interest is in understanding the types of wine sold. Thus the type of wine sold is the response variable, and we put it in the rows. The type of music is thought to explain the type of wine sold and is thus the explanatory variable (in this example it could also be called the treatment variable); we put it in the columns. To explore the possible relationship between the variables, we look at each value of the explanatory variable (here, each column) and divide each count in that column by the column total. We see the result on the next slide. Check my work, please!
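The divide-by-column-total step can be sketched as follows. As a caveat, only the Italian row and the total of 243 survive in this transcript; the French and Other counts are assumed textbook values for this example, and the function name `column_proportions` is hypothetical.

```python
# Bottles sold, keyed as counts[wine][music].
# ASSUMPTION: French and Other rows are textbook values, not from the transcript.
counts = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}

def column_proportions(table):
    """Divide every count by the total of the column it sits in."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        label: {c: row[c] / col_totals[c] for c in columns}
        for label, row in table.items()
    }

# conditional[wine][music] is the share of bottles sold under that music
# that were of that wine type.
conditional = column_proportions(counts)

for wine, row in conditional.items():
    print(wine, {music: round(p, 3) for music, p in row.items()})
```

Each column of `conditional` sums to 1, which is a quick way to check the arithmetic by hand.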

                  Music
Wine       None   French   Italian   Total
French
Italian
Other
Total         1        1         1       1

When no music was playing, wines other than French or Italian made up the majority of wines purchased. French was the second most common, followed by Italian. When French music was playing, French wine made up a larger share of purchases, very few people bought Italian wine, and the percent who bought other wines decreased. When Italian music was playing, French wine made up the same percent of bottles purchased as when no music was playing, but Italian wine made up a larger share and other wines a smaller share. So, the music played does seem to matter: French wines sell in greater proportion when French music is played than when it is not, and the same is true for Italian wines and Italian music.
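One way to take up the "check my work" invitation is to recompute the column proportions exactly and test the statements above. This is a sketch under an assumption: only the Italian row and the total of 243 are legible in this transcript, so the French and Other counts are assumed textbook values for this example. The helper `share` is hypothetical.

```python
from fractions import Fraction

# Bottles sold, keyed as counts[wine][music].
# ASSUMPTION: French and Other rows are textbook values, not from the transcript.
counts = {
    "French":  {"None": 30, "French": 39, "Italian": 30},
    "Italian": {"None": 11, "French": 1,  "Italian": 19},
    "Other":   {"None": 43, "French": 35, "Italian": 35},
}
music_types = ["None", "French", "Italian"]

def share(wine, music):
    """Exact proportion of bottles sold under `music` that were `wine`."""
    column_total = sum(counts[w][music] for w in counts)
    return Fraction(counts[wine][music], column_total)

# Print each conditional distribution as percentages.
for music in music_types:
    parts = ", ".join(
        f"{wine} {float(share(wine, music)):.1%}" for wine in counts
    )
    print(f"Music = {music}: {parts}")

# Statements from the slide, written as checks.
other_is_majority_without_music = share("Other", "None") > Fraction(1, 2)
french_same_under_italian_music = share("French", "Italian") == share("French", "None")
french_up_under_french_music = share("French", "French") > share("French", "None")
italian_up_under_italian_music = share("Italian", "Italian") > share("Italian", "None")
```

Using `Fraction` rather than floating-point division keeps the "same percent" comparison exact.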