Testing for an Association between two Categorical Variables

Presentation transcript:

Testing for an Association between two Categorical Variables
STAT 250, Dr. Kari Lock Morgan
SECTION 7.2: χ² test for association

Painkillers and Miscarriage
Is use of painkillers during pregnancy associated with miscarriage?
Scientists interviewed 1009 women soon after they got a positive pregnancy test about their use of painkillers around the time of conception or in the early weeks of pregnancy.
The researchers then kept track of which of the pregnancies ended in miscarriage.
Li, D-K., et al. (2003). "Exposure to non-steroidal anti-inflammatory drugs during pregnancy and risk of miscarriage: population based cohort study," British Medical Journal, 327(7411): 1.

Painkillers and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No painkiller            103              659     762
Aspirin                    5               17      22
Ibuprofen                 13               40      53
Acetaminophen             24              148     172
TOTAL                    145              864    1009
Does this data provide evidence that these two variables are associated?

Two Categorical Variables
The statistics behind a χ² test extend easily to two categorical variables.
A χ² test for association (often called a χ² test for independence) tests for an association between two categorical variables.
Everything is the same as in a chi-square goodness-of-fit test, except:
The hypotheses
The expected counts
The degrees of freedom for the χ²-distribution

Hypotheses
General hypotheses:
H0: The two variables are not associated
Ha: The two variables are associated
Painkillers and miscarriage:
H0: Type of painkiller taken is not associated with whether or not pregnancy ends in miscarriage
Ha: Type of painkiller taken is associated with whether or not pregnancy ends in miscarriage

Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No painkiller                                     762
Aspirin                                            22
Ibuprofen                                          53
Acetaminophen                                     172
TOTAL                    145              864    1009

Expected Count
Give the expected count for Aspirin, Miscarriage.
Options: 2.1, 3.16, 4.72, 5.65
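The slide leaves the expected-count formula to the lecture. The sketch below assumes the standard rule for a two-way table (expected count = row total × column total / sample size); the function name `expected_counts` and the dictionary layout are illustrative choices, not part of the course materials.

```python
# Observed counts from the painkiller table: (miscarriage, no miscarriage).
observed = {
    "No painkiller": (103, 659),
    "Aspirin": (5, 17),
    "Ibuprofen": (13, 40),
    "Acetaminophen": (24, 148),
}


def expected_counts(table):
    """Expected count per cell if the two variables are not associated.

    Uses the standard rule: row total * column total / sample size.
    """
    row_totals = {row: sum(cells) for row, cells in table.items()}
    col_totals = [sum(col) for col in zip(*table.values())]
    n = sum(row_totals.values())
    return {
        row: tuple(row_totals[row] * c / n for c in col_totals)
        for row in table
    }


expected = expected_counts(observed)
for row, cells in expected.items():
    print(row, [round(e, 2) for e in cells])
```

For the Aspirin, Miscarriage cell this works out to 22 × 145 / 1009, which rounds to 3.16.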

Chi-Square Statistic
Observed (expected)
                 Miscarriage    No Miscarriage   TOTAL
No painkiller    103 (109.5)    659 (652.5)        762
Aspirin          5 ( )          17 (18.8)           22
Ibuprofen        13 (7.6)       40 (45.4)           53
Acetaminophen    24 (24.7)      148 (147.3)        172
TOTAL            145            864               1009

Chi-Square Statistic
                 Miscarriage    No Miscarriage
No painkiller    103 (109.5)    659 (652.5)
Aspirin          5 (3.16)       17 (18.8)
Ibuprofen        13 (7.6)       40 (45.4)
Acetaminophen    24 (24.7)      148 (147.3)
Give the contribution to the χ² statistic for the Aspirin, Miscarriage category.
Options: 0.7, 1.07, 1.7, 2.07
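As a rough check on the table above, the sketch below computes each cell's contribution, (observed - expected)² / expected, and sums them to get the χ² statistic. The helper name `chi_square_contributions` is hypothetical; the counts are the ones on the slide.

```python
# Rows: No painkiller, Aspirin, Ibuprofen, Acetaminophen.
# Columns: Miscarriage, No Miscarriage.
observed = [[103, 659], [5, 17], [13, 40], [24, 148]]


def chi_square_contributions(obs):
    """Return (observed - expected)^2 / expected for every cell."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(obs):
        row_out = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            row_out.append((o - e) ** 2 / e)
        contributions.append(row_out)
    return contributions


contrib = chi_square_contributions(observed)
chi2_stat = sum(sum(row) for row in contrib)

print("Aspirin, Miscarriage contribution:", round(contrib[1][0], 2))
print("chi-square statistic:", round(chi2_stat, 3))
```

The total agrees with the χ² = 6.168 reported on the StatKey slide, and the Ibuprofen, Miscarriage cell contributes the most.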

StatKey
χ² = 6.168

What Next?
χ² = 6.168

Randomization Distribution
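The transcript does not describe how StatKey builds this distribution. One common approach, assumed here, is to shuffle the miscarriage outcomes among the 1009 women while keeping the painkiller group sizes fixed, recompute χ² for each shuffle, and report the proportion of shuffles with a statistic at least as large as the observed one. The function names, the seed, and the number of repetitions are illustrative.

```python
import random

# Rows: No painkiller, Aspirin, Ibuprofen, Acetaminophen.
# Columns: Miscarriage, No Miscarriage.
observed = [[103, 659], [5, 17], [13, 40], [24, 148]]


def chi_square(obs):
    """Chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(obs):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            total += (o - e) ** 2 / e
    return total


def randomization_p_value(obs, reps=2000, seed=0):
    """Estimate the p-value by shuffling outcomes across fixed groups."""
    rng = random.Random(seed)
    group_sizes = [sum(row) for row in obs]
    n_miscarriage = sum(row[0] for row in obs)
    n = sum(group_sizes)
    outcomes = [1] * n_miscarriage + [0] * (n - n_miscarriage)
    observed_stat = chi_square(obs)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)
        table, start = [], 0
        for size in group_sizes:
            m = sum(outcomes[start:start + size])
            table.append([m, size - m])
            start += size
        if chi_square(table) >= observed_stat:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps


p_value = randomization_p_value(observed)
print("randomization p-value:", p_value)
```

Because the shuffle keeps both sets of totals fixed, the expected counts are the same in every simulated table; only the observed counts change.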

Conclusion
Can we conclude that type of painkiller taken is associated with having a miscarriage?
Options: Yes, No

Conclusion
Can we conclude that type of painkiller taken is not associated with having a miscarriage?
Options: Yes, No

Chi-Square (χ²) Distribution
If each of the expected counts is at least 5, AND if the null hypothesis is true, then the χ² statistic follows a χ²-distribution, with degrees of freedom equal to
df = (number of rows - 1)(number of columns - 1)
Painkillers and Miscarriage: df = (4 - 1)(2 - 1) = 3
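To make the two conditions concrete, the sketch below checks the smallest expected count, computes the degrees of freedom, and evaluates the upper-tail area of a χ²-distribution using only the standard library. The helper `chi2_sf` is a hypothetical name; it relies on the standard recurrence for the upper incomplete gamma function and handles integer degrees of freedom only. Whether the resulting p-value is appropriate depends on the expected-count condition, which the code reports rather than enforces.

```python
import math

# Rows: No painkiller, Aspirin, Ibuprofen, Acetaminophen.
# Columns: Miscarriage, No Miscarriage.
observed = [[103, 659], [5, 17], [13, 40], [24, 148]]


def chi2_sf(x, df):
    """P(X >= x) for a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 1:
        tail, a = math.erfc(math.sqrt(half)), 0.5  # closed form for df = 1
    else:
        tail, a = math.exp(-half), 1.0             # closed form for df = 2
    # Recurrence: Q(a + 1, x/2) = Q(a, x/2) + (x/2)^a e^(-x/2) / Gamma(a + 1)
    while 2 * a < df:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return tail


row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

min_expected = min(r * c / n for r in row_totals for c in col_totals)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = chi2_sf(6.168, df)

print("smallest expected count:", round(min_expected, 2))
print("all expected counts at least 5:", min_expected >= 5)
print("df:", df)
print("upper-tail area beyond 6.168:", round(p_value, 3))
```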

Theoretical Distribution
                 Miscarriage    No Miscarriage
No painkiller    103 (109.5)    659 (652.5)
Aspirin          5 (3.16)       17 (18.8)
Ibuprofen        13 (7.6)       40 (45.4)
Acetaminophen    24 (24.7)      148 (147.3)
Can we also use the theoretical χ² distribution to get the p-value?
Options: Yes, No

NSAIDs?
Headline coming out of this paper: "Use of NSAIDs in pregnancy increases risk of miscarriage."
NSAIDs (nonsteroidal anti-inflammatory drugs) are a special class of painkillers that includes aspirin and ibuprofen (but not acetaminophen).
Is taking NSAIDs (or not) associated with miscarriage?

NSAIDs and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No painkiller            103              659     762
Aspirin                    5               17      22
Ibuprofen                 13               40      53
Acetaminophen             24              148     172
TOTAL                    145              864    1009

                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs                127              807     934
NSAIDs                    18               57      75
TOTAL                    145              864    1009
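The second table comes from merging rows of the first: aspirin and ibuprofen are NSAIDs, while "no painkiller" and acetaminophen are not. A minimal sketch of that regrouping, with a hypothetical `collapse_rows` helper:

```python
# Observed counts: (miscarriage, no miscarriage).
observed = {
    "No painkiller": (103, 659),
    "Aspirin": (5, 17),
    "Ibuprofen": (13, 40),
    "Acetaminophen": (24, 148),
}

# Aspirin and ibuprofen are the NSAIDs in this table.
NSAID_ROWS = {"Aspirin", "Ibuprofen"}


def collapse_rows(table, nsaid_rows):
    """Merge the four painkiller rows into No NSAIDs / NSAIDs."""
    collapsed = {"No NSAIDs": [0, 0], "NSAIDs": [0, 0]}
    for row, cells in table.items():
        key = "NSAIDs" if row in nsaid_rows else "No NSAIDs"
        for j, count in enumerate(cells):
            collapsed[key][j] += count
    return {key: tuple(cells) for key, cells in collapsed.items()}


collapsed = collapse_rows(observed, NSAID_ROWS)
print(collapsed)
```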

NSAIDs and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs                127              807     934
NSAIDs                    18               57      75
TOTAL                    145              864    1009
How should we analyze this data?
Test for a difference in proportions using a randomization test
Test for a difference in proportions using the z-statistic and normal distribution
Chi-square test for association
Any of the above
None of the above

Two Categorical Variables with Two Categories
If you are testing for an association between two categorical variables, each with two categories, a test for a difference in proportions and a chi-square test for association will give you identical p-values.
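The sketch below illustrates this equivalence on the NSAIDs table. It assumes a two-sided alternative and a pooled standard error for the z-statistic, neither of which is spelled out on the slide; under those assumptions z² equals the χ² statistic, so the normal and χ² (df = 1) p-values coincide. Function names are illustrative.

```python
import math

# Rows: No NSAIDs, NSAIDs. Columns: Miscarriage, No Miscarriage.
observed = [[127, 807], [18, 57]]


def two_proportion_z(x1, n1, x2, n2):
    """z-statistic for p1 - p2 using the pooled proportion (assumed)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def chi_square(obs):
    """Chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(obs):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            total += (o - e) ** 2 / e
    return total


# Proportion with miscarriage: NSAIDs (18 of 75) vs. No NSAIDs (127 of 934).
z = two_proportion_z(18, 75, 127, 934)
chi2_stat = chi_square(observed)

p_from_z = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal tail area
p_from_chi2 = math.erfc(math.sqrt(chi2_stat / 2))  # chi-square tail, df = 1

print("z squared:", round(z ** 2, 3), "chi-square:", round(chi2_stat, 3))
print("p-values:", round(p_from_z, 4), round(p_from_chi2, 4))
```

A randomization test for the difference in proportions should give a similar, though not exactly identical, p-value, since it is estimated by simulation.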

Hypotheses
H0: Taking NSAIDs around the time of conception or early in pregnancy is not associated with having a miscarriage
Ha: Taking NSAIDs around the time of conception or early in pregnancy is associated with having a miscarriage

Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs                127              807     934
NSAIDs                    18               57      75
TOTAL                    145              864    1009
What is the expected count for the NSAIDs, Miscarriage cell?
Options: 10.8, 12.5, 15.6, 16.4

Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs        127           807                934
NSAIDs           18 (10.8)     57                  75
TOTAL            145           864               1009
What is the contribution to the chi-square statistic for the NSAIDs, Miscarriage cell?
Options: 3.21, 4.13, 4.84, 5.4

StatKey

Conclusion
Can we conclude that taking NSAIDs around the time of conception or in early pregnancy is associated with having a miscarriage?
Options: Yes, No

Conclusion
Can we conclude that taking NSAIDs around the time of conception or in early pregnancy causes increased risk of miscarriage?
Options: Yes, No

That's Not All!
A much more recent study (March 2014) reexamined this issue.
Daniel, S., et al. (2014). "Fetal exposure to nonsteroidal anti-inflammatory drugs and spontaneous abortions," Canadian Medical Association Journal, 186(5).

NSAIDs and Miscarriage

Results

???
The first study found a significant association between NSAIDs and miscarriage, with those taking NSAIDs having a significantly higher risk of miscarriage.
The second study found a significant association between NSAIDs and miscarriage, with those taking NSAIDs having a significantly lower risk of miscarriage.
What's going on?

To Do
Read Section 7.2
Do HW 7.2 (due Friday, 4/17)