Chapter 11 The Chi-Square Test of Association/Independence Target Goal: I can perform a chi-square test for association/independence to determine whether.

Presentation transcript:

Chapter 11 The Chi-Square Test of Association/Independence Target Goal: I can perform a chi-square test for association/independence to determine whether there is convincing evidence of an association between two categorical variables. 11.2b h.w: pg. 728: 49, 51,

The chi-square test can also be used to show evidence that there is a relationship between two categorical variables. Use it if you have independent SRSs from several populations, where one variable is categorical and the other records which sample each individual came from. Or, if you have a single SRS with each individual classified according to two categorical variables. Or, if you have an entire population with each individual classified according to two categorical variables.

Ex: Smoking and SES. An example that classifies observations from a single population in two ways: by smoking habits and SES. In a study of heart disease in male federal employees, researchers classified 356 volunteer subjects according to their socioeconomic status (SES) and their smoking status.

Observed Counts for Smoking and SES: a two-way table with smoking status (Current, Former, Never) in the rows and SES (High, Middle, Low) in the columns, plus row and column totals. This is a 3x3 table with added margin totals. Even though this example is different from comparing several proportions, we can still apply the chi-square test; the null hypothesis is still that there is no relationship between the row variable and the column variable.

The Chi-Square Test of Association/Independence. Use the chi-square test of association/independence to test the null hypothesis H0: there is no relationship between two categorical variables, when you have a two-way table from a single SRS with each individual classified according to both of two categorical variables.

SES cont. SES is the explanatory variable, so we need to compare the column percents that give the conditional distribution of smoking within each SES category.

Calculate Column Percents: 51/211 = about 24.2% of the high-SES group are current smokers. Fill in the rest of the table.
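The column-percent step can be sketched in Python. This is only an illustration: the function name `column_percents` and the small 2x2 table are assumptions made for the example, not the study's counts. The only figure taken from the slide is 51/211.

```python
def column_percents(table):
    """Return each cell as a percent of its column total.

    `table` is a list of rows; each row is a list of observed counts.
    """
    col_totals = [sum(col) for col in zip(*table)]
    return [[100.0 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]


# The one figure given on the slide: 51 current smokers among 211 high-SES subjects.
high_ses_current = 100.0 * 51 / 211

# Hypothetical counts, for illustration only (rows = response, columns = groups).
demo = [[10, 30],
        [40, 20]]
demo_percents = column_percents(demo)

print(round(high_ses_current, 1))
print(demo_percents)
```

Each column of the result sums to 100, which is a quick check that the percents were taken within columns (the explanatory variable) rather than within rows.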

Column percents for Smoking and SES: smoking status (Current, Former, Never, Total) in the rows, SES (High, Middle, Low) in the columns. What do the column percents suggest?

There is a negative association between smoking and SES. The lower the SES, the more likely to smoke.

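Computing Expected Cell Counts: expected count = (row total x column total) / table total. For current smokers in the high-SES group: (116 x 211) / 356 = about 68.75.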
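The expected-count rule (row total times column total, divided by the table total) can be sketched as follows. The function name `expected_counts` and the 2x2 demo table are assumptions for illustration; only the totals 116, 211, and 356 come from the slides.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Single cell using the totals from the slides:
# 116 current smokers, 211 high-SES subjects, 356 subjects in all.
current_high_expected = 116 * 211 / 356

# Hypothetical counts, for illustration only.
demo = [[10, 30],
        [40, 20]]
demo_expected = expected_counts(demo)

print(round(current_high_expected, 2))
print(demo_expected)
```

The expected counts keep the same row and column totals as the observed table, which is a useful sanity check when filling the table by hand.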

Expected Counts for Smoking and SES: smoking status (Current, Former, Never) in the rows, SES (High, Middle, Low) in the columns, with row and column totals.

Chi-square Test for Association/Independence. Step 1: State. We want to perform a test of
H0: There is no association between smoking and SES.
Ha: There is an association between smoking and SES.

Step 2: Plan. If conditions are met, we should carry out a chi-square test of association/independence. Random: The subjects were volunteers, so we may not be able to generalize our results. Large Sample Size: To use chi-square we must check all expected counts. We did this, and all expected counts are ≥ 1 and no more than 20% are < 5.

Independence: Because we are sampling without replacement, we need to check the 10% condition. It is safe to assume that the total number of male federal employees is at least 10(356) = 3560. Thus, knowing the values of both variables for one person gives us no meaningful information about the variables for another person. So, individual observations are independent.
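The Large Sample Size check from Step 2 can be written as a small helper. This is a sketch under the rule stated on the slide (all expected counts at least 1, and no more than 20% of them below 5); the function name `large_counts_ok` and the example tables are hypothetical.

```python
def large_counts_ok(expected):
    """Check the expected-count condition stated in Step 2.

    All expected counts must be at least 1, and no more than 20% of
    them may be below 5.
    """
    cells = [e for row in expected for e in row]
    all_at_least_one = all(e >= 1 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return all_at_least_one and share_below_five <= 0.20


# Hypothetical expected-count tables, for illustration only.
fine = [[20.0, 20.0], [30.0, 30.0]]
one_tiny_cell = [[0.5, 10.0], [10.0, 10.0]]   # a cell below 1
too_many_small = [[4.0, 10.0], [10.0, 10.0]]  # 25% of cells below 5

print(large_counts_ok(fine))
print(large_counts_ok(one_tiny_cell))
print(large_counts_ok(too_many_small))
```

The check is applied to the expected counts, not the observed counts.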

Step 3: Carry out the inference procedure. The test statistic is χ2 = sum over all cells of (observed - expected)^2 / expected. Calculate by hand with df = (r-1)(c-1) = (3-1)(3-1) = 4. Or, with a calculator, enter the observed counts into matrix A. Note: the calculator will calculate the expected counts for you when you execute the χ2 test.
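A by-hand version of Step 3 can be sketched in Python. The function name `chi_square_statistic` and the 2x2 demo table are assumptions for illustration, not the smoking and SES counts; the code sums (observed - expected)^2 / expected over all cells and uses df = (r-1)(c-1).

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of no association.
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts, for illustration only.
demo = [[10, 30],
        [40, 20]]
demo_statistic, demo_df = chi_square_statistic(demo)
print(round(demo_statistic, 2), demo_df)
```

For a 3x3 table such as the smoking and SES example, the same function would return df = 4.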

Note: if working by hand, you must compute the expected counts yourself (or write a calculator program to do it). With the calculator: enter the observed values in matrix A, then choose STAT:TESTS: χ2-Test. The calculator enters the expected values in matrix B. P-value = [value not preserved in this transcript]. Note: the association does not mean that SES causes smoking behavior.
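The calculator reports the P-value directly. As a rough cross-check for the df = 4 case only, the upper-tail area of the chi-square distribution has a simple closed form, P(χ2 > x) = e^(-x/2) (1 + x/2). The sketch below uses that formula; the function name `chi_square_p_value_df4` is hypothetical, and the input 9.488 is the standard 5% critical value for df = 4, not the statistic from the study.

```python
import math


def chi_square_p_value_df4(statistic):
    """Upper-tail probability of a chi-square distribution with df = 4.

    Uses the closed form exp(-x/2) * (1 + x/2), which holds only for df = 4.
    """
    half = statistic / 2.0
    return math.exp(-half) * (1.0 + half)


# 9.488 is the tabled 5% critical value for df = 4.
p_at_critical = chi_square_p_value_df4(9.488)
print(round(p_at_critical, 3))
```

For other degrees of freedom, use the calculator, a chi-square table, or statistical software instead of this formula.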

Step 4: Conclude. Interpret the results in context. With a P-value this low, we reject the null hypothesis at the α = 0.01 level and conclude that there is strong evidence of an association between smoking and SES in the population of male federal employees.

Computer Output

Follow-up Analysis. Start by examining which cells in the two-way table show large deviations between the observed and expected counts. Then look at the individual components to see which terms contribute most to the chi-square statistic. Minitab output for the wine and music study displays the individual components that contribute to the chi-square statistic.

Follow-up Analysis. Looking at the output, we see that just two of the nine components that make up the chi-square statistic contribute about 14 (almost 77%) of the total χ2 statistic. We are led to a specific conclusion: sales of Italian wine are strongly affected by Italian and French music.
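The follow-up analysis described above can be sketched by computing each cell's component, (observed - expected)^2 / expected, and ranking the cells by size. The function names and the 2x2 table are assumptions for illustration; they are not the wine and music data, which are not reproduced in this transcript.

```python
def chi_square_components(observed):
    """Return the (observed - expected)^2 / expected term for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[(obs - r * c / grand_total) ** 2 / (r * c / grand_total)
             for obs, c in zip(row, col_totals)]
            for row, r in zip(observed, row_totals)]


def largest_components(observed, k=2):
    """Return the k largest components as (value, row index, column index)."""
    components = chi_square_components(observed)
    cells = [(value, i, j)
             for i, row in enumerate(components)
             for j, value in enumerate(row)]
    return sorted(cells, reverse=True)[:k]


# Hypothetical counts, for illustration only.
demo = [[10, 30],
        [40, 20]]
demo_components = chi_square_components(demo)
demo_total = sum(sum(row) for row in demo_components)
top_two = largest_components(demo, k=2)
print(demo_components)
print(top_two)
```

Dividing the sum of the largest components by the total statistic gives the share of χ2 they account for, which is the kind of comparison made in the wine and music example.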