Testing for an Association between two Categorical Variables
STAT 250, Dr. Kari Lock Morgan
SECTION 7.2: χ² test for association (7.2)
Painkillers and Miscarriage
Is use of painkillers during pregnancy associated with miscarriage?
Scientists interviewed 1009 women soon after they got a positive pregnancy test about their use of painkillers around the time of conception or in the early weeks of pregnancy.
The researchers then kept track of which of the pregnancies ended in miscarriage.
Li, D-K., et al. (2003). “Exposure to non-steroidal anti-inflammatory drugs during pregnancy and risk of miscarriage: population based cohort study,” British Medical Journal, 327(7411): 1.
Painkillers and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No painkiller        103            659           762
Aspirin                5             17            22
Ibuprofen             13             40            53
Acetaminophen         24            148           172
TOTAL                145            864          1009
Does this data provide evidence that these two variables are associated?
Two Categorical Variables
The statistics behind a χ² test easily extend to two categorical variables.
A χ² test for association (often called a χ² test for independence) tests for an association between two categorical variables.
Everything is the same as a chi-square goodness-of-fit test, except:
  The hypotheses
  The expected counts
  The degrees of freedom for the χ²-distribution
Hypotheses
General hypotheses:
  H0: The two variables are not associated
  Ha: The two variables are associated
Painkillers and miscarriage:
  H0: Type of painkiller taken is not associated with whether or not pregnancy ends in miscarriage
  Ha: Type of painkiller taken is associated with whether or not pregnancy ends in miscarriage
Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No painkiller                                     762
Aspirin                                            22
Ibuprofen                                          53
Acetaminophen                                     172
TOTAL                145            864          1009
Expected Count
                 Miscarriage   No Miscarriage   TOTAL
No painkiller                                     762
Aspirin                                            22
Ibuprofen                                          53
Acetaminophen                                     172
TOTAL                145            864          1009
Give the expected count for Aspirin, Miscarriage.
  2.1
  3.16
  4.72
  5.65
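As a rough sketch of how these expected counts can be filled in: under H0 (no association), the usual expected count for a cell is (row total × column total) / sample size. The function and variable names below are illustrative, not from the slides.

```python
# Row and column totals copied from the painkiller/miscarriage table.
row_totals = {"No painkiller": 762, "Aspirin": 22, "Ibuprofen": 53, "Acetaminophen": 172}
col_totals = {"Miscarriage": 145, "No Miscarriage": 864}
n = 1009  # total number of women in the study


def expected_count(row_total, col_total, n):
    """Expected count if the two variables are not associated."""
    return row_total * col_total / n


# Expected count for every (row, column) cell of the table.
expected = {
    (row, col): expected_count(rt, ct, n)
    for row, rt in row_totals.items()
    for col, ct in col_totals.items()
}

print(round(expected[("Aspirin", "Miscarriage")], 2))  # 3.16
```

The expected counts always add back up to the sample size, which is a quick sanity check on the arithmetic.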
Chi-Square Statistic
Observed (expected)
                 Miscarriage    No Miscarriage   TOTAL
No painkiller    103 (109.5)     659 (652.5)       762
Aspirin            5 (     )      17 (18.8)         22
Ibuprofen         13 (7.6)        40 (45.4)         53
Acetaminophen     24 (24.7)      148 (147.3)       172
TOTAL            145             864              1009
Chi-Square Statistic
                 Miscarriage    No Miscarriage
No painkiller    103 (109.5)     659 (652.5)
Aspirin            5 (3.16)       17 (18.8)
Ibuprofen         13 (7.6)        40 (45.4)
Acetaminophen     24 (24.7)      148 (147.3)
Give the contribution to the χ² statistic for the Aspirin, Miscarriage category.
  0.7
  1.07
  1.7
  2.07
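A minimal sketch of the calculation, assuming the standard per-cell contribution (observed − expected)² / expected, summed over all cells. The helper name `chi_square_contributions` is made up for illustration.

```python
# Observed counts; columns are Miscarriage, No Miscarriage.
observed = [
    [103, 659],  # No painkiller
    [5, 17],     # Aspirin
    [13, 40],    # Ibuprofen
    [24, 148],   # Acetaminophen
]


def chi_square_contributions(table):
    """Return (observed - expected)^2 / expected for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        contributions.append([])
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            contributions[i].append((obs - exp) ** 2 / exp)
    return contributions


contributions = chi_square_contributions(observed)
chi2_stat = sum(sum(row) for row in contributions)

print(round(contributions[1][0], 2))  # Aspirin, Miscarriage: 1.07
print(round(chi2_stat, 3))            # 6.168
```

Most of the statistic comes from the Ibuprofen, Miscarriage cell, where 13 miscarriages were observed against about 7.6 expected.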
StatKey
χ² = 6.168
What Next?
χ² = 6.168
What next?
Randomization Distribution
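One way to build a randomization distribution by hand, sketched under the assumption that we shuffle the miscarriage outcomes across the women (keeping both sets of totals fixed) and recompute the chi-square statistic each time. This is only an illustration of the idea, not StatKey's actual implementation; the function names, the number of repetitions, and the seed are all arbitrary choices.

```python
import random

# Observed counts; columns are Miscarriage, No Miscarriage.
observed = [[103, 659], [5, 17], [13, 40], [24, 148]]


def chi_square(table):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            total += (obs - exp) ** 2 / exp
    return total


def randomization_p_value(table, reps=1000, seed=0):
    """Proportion of shuffled tables with a statistic at least as large as observed."""
    rng = random.Random(seed)
    # One painkiller-group label and one outcome label per woman.
    groups = [i for i, row in enumerate(table) for _ in range(sum(row))]
    outcomes = [j for row in table for j, count in enumerate(row) for _ in range(count)]
    observed_stat = chi_square(table)
    n_rows, n_cols = len(table), len(table[0])
    extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)  # break any link between group and outcome
        shuffled = [[0] * n_cols for _ in range(n_rows)]
        for g, o in zip(groups, outcomes):
            shuffled[g][o] += 1
        if chi_square(shuffled) >= observed_stat:
            extreme += 1
    return extreme / reps


p_value = randomization_p_value(observed)
print(p_value)
```

The p-value will vary a little with the seed and the number of repetitions, as it does between StatKey runs.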
Conclusion Can we conclude that type of painkiller taken is associated with having a miscarriage? Yes No
Conclusion Can we conclude that type of painkiller taken is not associated with having a miscarriage? Yes No
Chi-Square (χ²) Distribution
If each of the expected counts is at least 5, AND if the null hypothesis is true, then the χ² statistic follows a χ²-distribution, with degrees of freedom equal to
df = (number of rows - 1)(number of columns - 1)
Painkillers and Miscarriage: df = (4 - 1)(2 - 1) = 3
Theoretical Distribution
                 Miscarriage    No Miscarriage
No painkiller    103 (109.5)     659 (652.5)
Aspirin            5 (3.16)       17 (18.8)
Ibuprofen         13 (7.6)        40 (45.4)
Acetaminophen     24 (24.7)      148 (147.3)
Can we also use the theoretical χ² distribution to get the p-value?
  Yes
  No
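The question above comes down to the two ingredients of the theoretical approach: the degrees of freedom and the "every expected count at least 5" condition. A small sketch of both checks follows; the helper names are hypothetical.

```python
# Observed counts; columns are Miscarriage, No Miscarriage.
observed = [[103, 659], [5, 17], [13, 40], [24, 148]]


def expected_counts(table):
    """Expected counts under H0: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def degrees_of_freedom(table):
    """df = (number of rows - 1)(number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


expected = expected_counts(observed)
df = degrees_of_freedom(observed)
min_expected = min(min(row) for row in expected)
condition_met = min_expected >= 5

print(df, round(min_expected, 2), condition_met)  # 3 3.16 False
```

Here the smallest expected count (Aspirin, Miscarriage) is below 5, so the condition for the χ²-distribution is not met for this table.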
NSAIDs?
Headline coming out of this paper: Use of NSAIDs in pregnancy increases risk of miscarriage
NSAIDs (nonsteroidal anti-inflammatory drugs) are a special class of painkillers that includes aspirin and ibuprofen (but not acetaminophen).
Is taking NSAIDs or not associated with miscarriage?
NSAIDs and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No painkiller        103            659           762
Aspirin                5             17            22
Ibuprofen             13             40            53
Acetaminophen         24            148           172
TOTAL                145            864          1009

                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs            127            807           934
NSAIDs                18             57            75
TOTAL                145            864          1009
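The second table is the first one with rows combined: aspirin and ibuprofen are the NSAIDs, while no painkiller and acetaminophen count as no NSAIDs. A short sketch of that collapsing step, with illustrative variable names:

```python
# (Miscarriage, No Miscarriage) counts for each painkiller group.
painkiller_table = {
    "No painkiller": (103, 659),
    "Aspirin": (5, 17),
    "Ibuprofen": (13, 40),
    "Acetaminophen": (24, 148),
}

# Aspirin and ibuprofen are NSAIDs; acetaminophen is not.
nsaids = {"Aspirin", "Ibuprofen"}

collapsed = {"No NSAIDs": [0, 0], "NSAIDs": [0, 0]}
for drug, (miscarriage, no_miscarriage) in painkiller_table.items():
    key = "NSAIDs" if drug in nsaids else "No NSAIDs"
    collapsed[key][0] += miscarriage
    collapsed[key][1] += no_miscarriage

print(collapsed)  # {'No NSAIDs': [127, 807], 'NSAIDs': [18, 57]}
```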
NSAIDs and Miscarriage
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs            127            807           934
NSAIDs                18             57            75
TOTAL                145            864          1009
How should we analyze this data?
  Test for a difference in proportions using a randomization test
  Test for a difference in proportions using the z-statistic and normal distribution
  Chi-Square Test for Association
  Any of the above
  None of the above
Two Categorical Variables with Two Categories
If you are testing for an association between two categorical variables, each with two categories, a two-sided test for a difference in proportions and a chi-square test for association will give you identical p-values.
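To see why the two approaches agree, here is a sketch comparing them on the NSAIDs table. It assumes the z-test uses the pooled proportion for the standard error and a two-sided alternative; under those assumptions z² equals the χ² statistic. The tail probabilities are computed with `math.erfc` from the standard library rather than a statistics package.

```python
import math

# Observed counts; columns are Miscarriage, No Miscarriage.
observed = [[127, 807],  # No NSAIDs
            [18, 57]]    # NSAIDs

# Test for a difference in proportions (pooled standard error).
n_no, n_yes = sum(observed[0]), sum(observed[1])
p_no = observed[0][0] / n_no    # miscarriage proportion, no NSAIDs
p_yes = observed[1][0] / n_yes  # miscarriage proportion, NSAIDs
pooled = (observed[0][0] + observed[1][0]) / (n_no + n_yes)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_no + 1 / n_yes))
z = (p_yes - p_no) / se
p_value_z = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area

# Chi-square test for association.
row_totals = [n_no, n_yes]
col_totals = [sum(col) for col in zip(*observed)]
n = n_no + n_yes
chi2_stat = sum(
    (obs - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, obs in enumerate(row)
)
p_value_chi2 = math.erfc(math.sqrt(chi2_stat / 2))  # upper tail, chi-square with df = 1

print(round(z ** 2, 3), round(chi2_stat, 3))
print(round(p_value_z, 4), round(p_value_chi2, 4))
```

The two printed pairs match: squaring the z-statistic gives the χ² statistic, and the two-sided normal p-value equals the upper-tail χ² p-value with 1 degree of freedom.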
Hypotheses H0: taking NSAIDs around the time of conception or early in pregnancy is not associated with having a miscarriage Ha: taking NSAIDs around the time of conception or early in pregnancy is associated with having a miscarriage
Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs            127            807           934
NSAIDs                18             57            75
TOTAL                145            864          1009
What is the expected count for the NSAIDs, Miscarriage cell?
  10.8
  12.5
  15.6
  16.4
Expected Counts
                 Miscarriage   No Miscarriage   TOTAL
No NSAIDs            127            807           934
NSAIDs                18 (10.8)      57            75
TOTAL                145            864          1009
What is the contribution to the chi-square statistic for the NSAIDs, Miscarriage cell?
  3.21
  4.13
  4.84
  5.4
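For reference, the arithmetic for the NSAIDs, Miscarriage cell can be written out as below. The symbols O (observed count) and E (expected count) are notation assumed here, not taken from the slides. Carrying one extra decimal place in E keeps the rounding from drifting.

```latex
E = \frac{\text{row total} \times \text{column total}}{n}
  = \frac{75 \times 145}{1009} \approx 10.78,
\qquad
\frac{(O - E)^2}{E} = \frac{(18 - 10.78)^2}{10.78} \approx 4.84
```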
StatKey
Conclusion Can we conclude that taking NSAIDs around the time of conception or in early pregnancy is associated with having a miscarriage? Yes No
Conclusion Can we conclude that taking NSAIDs around the time of conception or in early pregnancy causes increased risk of miscarriage? Yes No
That’s Not All!
A much more recent study (March 2014) reexamined this issue.
Daniel, S. et al. (2014). Fetal exposure to nonsteroidal anti-inflammatory drugs and spontaneous abortions, Canadian Medical Association Journal, 186(5).
NSAIDs and Miscarriage
Results
???
The first study found a significant association between NSAIDs and miscarriage, with those taking NSAIDs having a significantly higher risk of miscarriage.
The second study found a significant association between NSAIDs and miscarriage, with those taking NSAIDs having a significantly lower risk of miscarriage.
WHAT’S GOING ON????
To Do Read Section 7.2 Do HW 7.2 (due Friday, 4/17)