Categorical Data Analysis Chapter 3: Inference for Contingency Tables
Estimation of Association Parameters: difference of proportions (point and interval estimators); relative risk; odds ratio; example.
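A minimal sketch of the three estimators with Wald-type 95% intervals, assuming large samples. The function name `association_measures` is hypothetical, and the counts are the aspirin table from this chapter with fatal and non-fatal heart attacks combined into a single "heart attack" column, a simplification made here for illustration only.

```python
import math

def association_measures(a, b, c, d, z=1.96):
    """Estimates for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; the first column counts the event of interest.
    Returns (estimate, lower, upper) for each measure, using Wald intervals.
    """
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2

    # Difference of proportions: interval built on the original scale.
    diff = p1 - p2
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    # Relative risk: interval built on the log scale, then exponentiated.
    rr = p1 / p2
    se_log_rr = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))

    # Odds ratio: interval built on the log scale, then exponentiated.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    return {
        "diff": (diff, diff - z * se_diff, diff + z * se_diff),
        "rr": (rr, rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)),
        "or": (odds_ratio,
               odds_ratio * math.exp(-z * se_log_or),
               odds_ratio * math.exp(z * se_log_or)),
    }

# Placebo: 18 + 171 = 189 heart attacks, 10845 without.
# Aspirin:  5 +  99 = 104 heart attacks, 10933 without.
result = association_measures(189, 10845, 104, 10933)
for name, (est, lo, hi) in result.items():
    print(f"{name}: {est:.4f}  95% CI ({lo:.4f}, {hi:.4f})")
```

Because heart attacks are rare in both groups, the relative risk and the odds ratio come out close to each other.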
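I×J Contingency Tables: inference on the difference of proportions, the relative risk (RR), and the odds ratio. Are X and Y independent? $H_0$: X and Y are independent, i.e. $\pi_{ij} = \pi_{i+}\pi_{+j}$ for all i and j.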
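Measure the Lack of Independence: Pearson chi-squared statistic: $X^2 = \sum_{i,j} (n_{ij} - \hat\mu_{ij})^2 / \hat\mu_{ij}$. Likelihood ratio (LR) test statistic: $G^2 = 2 \sum_{i,j} n_{ij} \log(n_{ij} / \hat\mu_{ij})$. Here $\hat\mu_{ij} = n_{i+} n_{+j} / n$ is the estimated expected count under independence. If the statistic is large relative to its chi-squared reference distribution, we have strong evidence against independence.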
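The two statistics can be sketched in a few lines, assuming the usual estimated expected counts (row total × column total / n) under independence. The function names are hypothetical, and the counts are taken from the oral contraceptive example in this chapter.

```python
import math

def expected_counts(table):
    """Estimated expected counts under independence: n_i+ * n_+j / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_x2(table):
    """Pearson statistic: sum of (observed - expected)^2 / expected."""
    mu = expected_counts(table)
    return sum((n_ij - m_ij) ** 2 / m_ij
               for row, mu_row in zip(table, mu)
               for n_ij, m_ij in zip(row, mu_row))

def lr_g2(table):
    """LR statistic: 2 * sum of observed * log(observed / expected).

    Cells with a zero count contribute nothing to the sum.
    """
    mu = expected_counts(table)
    return 2 * sum(n_ij * math.log(n_ij / m_ij)
                   for row, mu_row in zip(table, mu)
                   for n_ij, m_ij in zip(row, mu_row)
                   if n_ij > 0)

oc_table = [[23, 34], [35, 132]]  # rows: used / never used; columns: yes / no
x2 = pearson_x2(oc_table)
g2 = lr_g2(oc_table)

# For df = 1 the chi-squared tail probability is erfc(sqrt(x / 2)).
p_x2 = math.erfc(math.sqrt(x2 / 2))
p_g2 = math.erfc(math.sqrt(g2 / 2))
print(f"X^2 = {x2:.3f} (p = {p_x2:.4f}), G^2 = {g2:.3f} (p = {p_g2:.4f})")
```

The closed-form tail probability used here holds only for df = 1; larger tables need a general chi-squared distribution function.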
Tests for Independence: for the Pearson or LR test, the df of the chi-squared test is the dimension of the whole parameter space ($\Theta$) minus the dimension of the hypothesized parameter space ($\Theta_0$), i.e. df = dim($\Theta$) – dim($\Theta_0$).
Poisson sampling: df = IJ - (I+J-1) = (I-1)(J-1). Single multinomial sampling: df = (IJ-1) - (I+J-2) = (I-1)(J-1). Independent multinomial sampling (row totals fixed): df = I(J-1) - (J-1) = (I-1)(J-1). All three sampling schemes give the same df.
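As a quick arithmetic check, the three parameter counts can be compared directly. This is only a sketch; the function names are hypothetical.

```python
def df_poisson(I, J):
    # IJ free means overall, I + J - 1 free parameters under independence.
    return I * J - (I + J - 1)

def df_single_multinomial(I, J):
    # IJ - 1 free probabilities overall, (I - 1) + (J - 1) under independence.
    return (I * J - 1) - (I + J - 2)

def df_independent_multinomial(I, J):
    # I rows with J - 1 free probabilities each, one common row of J - 1 under H0.
    return I * (J - 1) - (J - 1)

for I in range(2, 6):
    for J in range(2, 6):
        values = {df_poisson(I, J),
                  df_single_multinomial(I, J),
                  df_independent_multinomial(I, J)}
        # All three counts collapse to the same value, (I - 1)(J - 1).
        assert values == {(I - 1) * (J - 1)}
```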
Example: Oral Contraceptive vs. Heart Attack
Case-control study: retrospective sampling; column totals were fixed.
Oral contraceptives    Heart attack: Yes    Heart attack: No
Used                   23                   34
Never used             35                   132
Total                  58                   166
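With column totals fixed by design, the odds ratio is the natural summary, because it takes the same value whether it is computed from the rows or from the columns. A minimal sketch with a Wald 95% interval on the log scale, assuming the large-sample standard error sqrt(1/a + 1/b + 1/c + 1/d); the variable names are hypothetical.

```python
import math

used_yes, used_no = 23, 34
never_yes, never_no = 35, 132

# Odds of heart attack for users divided by the odds for never-users.
odds_ratio = (used_yes / used_no) / (never_yes / never_no)

# Same quantity from the column side: odds of use among cases vs. controls.
odds_ratio_by_columns = (used_yes / never_yes) / (used_no / never_no)

# Wald interval for log(odds ratio), then back-transformed.
se_log_or = math.sqrt(1 / used_yes + 1 / used_no + 1 / never_yes + 1 / never_no)
z = 1.96
ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"odds ratio = {odds_ratio:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
```

The difference of proportions and the relative risk of heart attack are not estimable from this design, since the numbers of cases and controls were chosen by the investigators.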
Follow-up Chi-squared Tests: Pearson and standardized residuals; partitioning chi-squared.
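Residuals: Pearson residual: $e_{ij} = (n_{ij} - \hat\mu_{ij}) / \sqrt{\hat\mu_{ij}}$. Standardized Pearson residual: $(n_{ij} - \hat\mu_{ij}) / \sqrt{\hat\mu_{ij} (1 - p_{i+})(1 - p_{+j})}$, where $\hat\mu_{ij} = n_{i+} n_{+j} / n$, $p_{i+} = n_{i+}/n$ and $p_{+j} = n_{+j}/n$.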
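A short sketch of both residuals, assuming expected counts estimated under independence and the standardization by (1 - p_i+)(1 - p_+j). The function name `residuals` is hypothetical; the counts are the oral contraceptive table from this chapter.

```python
import math

def residuals(table):
    """Return (Pearson residuals, standardized Pearson residuals) per cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    pearson, standardized = [], []
    for i, row in enumerate(table):
        pearson_row, standardized_row = [], []
        for j, n_ij in enumerate(row):
            mu = row_totals[i] * col_totals[j] / n      # expected count
            p_row = row_totals[i] / n                   # p_{i+}
            p_col = col_totals[j] / n                   # p_{+j}
            pearson_row.append((n_ij - mu) / math.sqrt(mu))
            standardized_row.append(
                (n_ij - mu) / math.sqrt(mu * (1 - p_row) * (1 - p_col)))
        pearson.append(pearson_row)
        standardized.append(standardized_row)
    return pearson, standardized

oc_table = [[23, 34], [35, 132]]
pearson, standardized = residuals(oc_table)
for label, row in zip(["used", "never used"], standardized):
    print(label, [round(value, 3) for value in row])
```

In a 2×2 table all four standardized residuals have the same absolute value, and its square equals the Pearson chi-squared statistic, so the residuals become more informative in larger tables.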
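Partitioning Chi-squared: a way to describe the association in an I×J table. Partition the I×J table into (I-1)(J-1) 2×2 subtables. The chi-squared statistic with df = (I-1)(J-1) is then the sum of (I-1)(J-1) independent chi-squared components, each with df = 1. The decomposition is exact for the LR statistic and approximate for the Pearson statistic.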
Rules for Independent Partitioning: (1) the df for the subtables must sum to (I-1)(J-1); (2) each cell count $n_{ij}$ must appear as a cell count in one and only one subtable; (3) each marginal total ($n_{i+}$ or $n_{+j}$) must be a marginal total for one and only one subtable.
Example: Aspirin vs. Heart Attack
Prospective sampling; row totals were fixed.
           Fatal H.A.    Non-fatal H.A.    No H.A.
Placebo    18            171               10845
Aspirin    5             99                10933
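One partition of this 2×3 table that follows the rules for independent partitioning compares fatal with non-fatal heart attacks first, then any heart attack with none. The sketch below, which assumes the LR statistic and uses hypothetical names, checks that the two components add up to the statistic for the full table.

```python
import math

def lr_g2(table):
    """LR statistic 2 * sum n_ij * log(n_ij / mu_ij) under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            if n_ij > 0:  # zero cells contribute nothing
                mu = row_totals[i] * col_totals[j] / n
                total += n_ij * math.log(n_ij / mu)
    return 2 * total

aspirin = [[18, 171, 10845],
           [5, 99, 10933]]

# Subtable 1: fatal vs. non-fatal heart attack (columns 1 and 2 only).
fatal_vs_nonfatal = [[18, 171],
                     [5, 99]]

# Subtable 2: any heart attack (columns 1 + 2 combined) vs. none.
attack_vs_none = [[18 + 171, 10845],
                  [5 + 99, 10933]]

g2_total = lr_g2(aspirin)             # df = 2
g2_type = lr_g2(fatal_vs_nonfatal)    # df = 1
g2_any = lr_g2(attack_vs_none)        # df = 1

print(f"G^2 total = {g2_total:.3f}")
print(f"G^2 fatal vs. non-fatal = {g2_type:.3f}")
print(f"G^2 any attack vs. none = {g2_any:.3f}")
print(f"sum of components = {g2_type + g2_any:.3f}")
```

Comparing the two components shows which contrast carries most of the association. The same subtables evaluated with the Pearson statistic would add up only approximately.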
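Ordinal X: Trend Tests
Test for a linear trend alternative: $M^2 = (n-1) r^2$, where r is the sample correlation between the row scores and the column scores; under independence $M^2$ is approximately chi-squared with df = 1. Choice of scores.
Example: Table 2.8, 1996 General Social Survey. Income (rows) by job satisfaction (columns: very dissatisfied, little dissatisfied, moderately satisfied, very satisfied):
<15K: 1, 3, 10, 6
15K-25K: 2, 7
25K-40K: 14, 12
>40K: 9, 11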
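A sketch of the linear trend statistic M^2 = (n-1) r^2, where r is the correlation between row and column scores weighted by the cell counts. The function name `trend_test` is hypothetical. It is applied here to the aspirin table from this chapter, with scores that are assumptions made for illustration: 0/1 for placebo/aspirin and 1, 2, 3 for fatal, non-fatal, and no heart attack.

```python
import math

def trend_test(table, row_scores, col_scores):
    """Return (r, M^2, p-value) for the linear trend test with df = 1."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    mean_u = sum(u * t for u, t in zip(row_scores, row_totals)) / n
    mean_v = sum(v * t for v, t in zip(col_scores, col_totals)) / n

    # Sums of squares and cross-products, weighted by the counts.
    ss_u = sum((u - mean_u) ** 2 * t for u, t in zip(row_scores, row_totals))
    ss_v = sum((v - mean_v) ** 2 * t for v, t in zip(col_scores, col_totals))
    ss_uv = sum((u - mean_u) * (v - mean_v) * n_ij
                for u, row in zip(row_scores, table)
                for v, n_ij in zip(col_scores, row))

    r = ss_uv / math.sqrt(ss_u * ss_v)
    m2 = (n - 1) * r ** 2
    p_value = math.erfc(math.sqrt(m2 / 2))  # chi-squared tail, df = 1
    return r, m2, p_value

aspirin = [[18, 171, 10845],
           [5, 99, 10933]]

r, m2, p_value = trend_test(aspirin, row_scores=[0, 1], col_scores=[1, 2, 3])
print(f"r = {r:.4f}, M^2 = {m2:.2f}, p = {p_value:.2e}")

# A linear rescaling of the scores leaves M^2 unchanged.
_, m2_rescaled, _ = trend_test(aspirin, [0, 1], [10, 20, 30])
```

Only the relative spacing of the scores matters, which is why the choice of scores deserves attention when the categories are not evenly spaced.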
Exact Test for Independence: the chi-squared tests are large-sample tests. They are valid only when the sample size is large enough that the expected frequencies are greater than or equal to 5 for 80% or more of the categories (cells).
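The rule of thumb can be checked mechanically before running a chi-squared test. A minimal sketch, assuming expected frequencies estimated under independence; the function name is hypothetical and the counts are the oral contraceptive table from this chapter.

```python
def expected_frequency_check(table, threshold=5.0):
    """Return (share of cells with expected count >= threshold, smallest expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    share = sum(e >= threshold for e in expected) / len(expected)
    return share, min(expected)

oc_table = [[23, 34], [35, 132]]
share, smallest = expected_frequency_check(oc_table)
print(f"share of cells with expected count >= 5: {share:.0%}")
print(f"smallest expected count: {smallest:.2f}")

# The large-sample tests are considered adequate when the share is at least 80%.
large_sample_ok = share >= 0.8
```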
Fisher's Exact Test: consider a 2×2 table. Under the three sampling methods, what is the distribution of $n_{11}$ conditional on $n_{1+}$, $n_{2+}$, $n_{+1}$, $n_{+2}$? Example: Table 3.8.
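Under independence, conditioning on both sets of margins gives $n_{11}$ a hypergeometric distribution, whichever of the three sampling schemes produced the table. The sketch below computes that distribution and a one-sided exact p-value. The function names are hypothetical, and the oral contraceptive table from this chapter is used as input purely for illustration.

```python
from math import comb

def hypergeom_pmf(k, n1, n2, c1):
    """P(n11 = k) given row totals n1, n2 and first column total c1."""
    return comb(n1, k) * comb(n2, c1 - k) / comb(n1 + n2, c1)

def fisher_one_sided(table):
    """P(n11 >= observed) under independence, with all margins fixed."""
    (a, b), (c, d) = table
    n1, n2, c1 = a + b, c + d, a + c
    upper = min(n1, c1)  # largest value n11 can take
    return sum(hypergeom_pmf(k, n1, n2, c1) for k in range(a, upper + 1))

oc_table = [[23, 34], [35, 132]]
p_one_sided = fisher_one_sided(oc_table)
print(f"one-sided exact p-value: {p_one_sided:.4f}")

# Sanity check: the probabilities over the whole support add up to 1.
n1, n2, c1 = 57, 167, 58
support = range(max(0, c1 - n2), min(n1, c1) + 1)
total_probability = sum(hypergeom_pmf(k, n1, n2, c1) for k in support)
```

The one-sided alternative here is a positive association (odds ratio greater than 1); a two-sided version needs an additional convention for which tables count as "at least as extreme".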