1
Analysis of Categorical Data
2
Types of Tests
Data in 2 X 2 Tables (covered previously)
* Comparing two population proportions using independent samples (Fisher’s Exact Test)
* Comparing two population proportions using dependent samples (McNemar’s Test)
* Relative Risk (RR), Odds Ratios (OR), Risk Difference, Attributable Risk (AR), & NNT/NNH
Data in r X c Tables
* Tests of Independence/Association and Homogeneity
3
Cervical-Cancer and Age at First Pregnancy – 2 X 2 Data Table
These data come from a case-control study to examine the potential relationship between age at first pregnancy and cervical cancer. In this study we will be comparing the proportion of women who had their first pregnancy at or before the age of 25 among cases and controls, because researchers suspected that an early age at first pregnancy leads to increased risk of developing cervical cancer.
4
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy

                           Age at 1st Pregnancy
Disease Status             Age < 25    Age > 25    Row Totals
Cervical Cancer (Case)         42           7           49
Healthy (Control)             203         114          317
Column Totals                 245         121          366
5
Previously
We have:
* compared the proportions of women with the risk factor in both groups (p1 vs. p2) using the z-test, a CI for (p1 – p2) & Fisher’s Exact Test.
* computed the Odds Ratio (OR) and found a CI for the population OR.
6
Development of a Test Statistic to Measure Lack of Independence
One way to generalize the question of interest to the researchers is to think of it as follows: Q: Is there an association between cervical cancer status and whether or not a woman had her 1st pregnancy at or before the age of 25?
7
Development of a Test Statistic to Measure Lack of Independence
If there is not an association, we say that the variables are independent. In the probability notes we saw that two events A and B are said to be independent if P(A|B) = P(A).
8
Development of a Test Statistic to Measure Lack of Independence
In the context of our study this would mean P(Age < 25|Cancer Status) = P(Age < 25) i.e. knowing something about disease status tells you nothing about the presence of the risk factor of having their first pregnancy at or before age 25.
9
Development of a Test Statistic to Measure Lack of Independence
P(Age < 25) = 245/366 = .6694
In this study 66.94% of the women sampled had their first pregnancy at or before the age of 25. When we consider this percentage conditioning on disease status we see that the relationship required for independence does not hold for these data.
P(Age < 25|Cervical Cancer) = 42/49 = .8571
P(Age < 25|Healthy Control) = 203/317 = .6404
Under independence, both should be equal to .6694.
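The comparison of unconditional and conditional proportions can be checked with a few lines of Python. This is a minimal sketch using the counts from the 2 X 2 table (row totals 49 and 317); the variable names are illustrative and not part of the original slides.

```python
# Counts from the cervical cancer case-control table.
cases_early, cases_total = 42, 49            # cervical cancer (cases)
controls_early, controls_total = 203, 317    # healthy (controls)

n = cases_total + controls_total             # 366 women in total
early_total = cases_early + controls_early   # 245 with the risk factor

# Unconditional proportion with first pregnancy at or before age 25.
p_early = early_total / n

# Proportions conditional on disease status.
p_early_given_case = cases_early / cases_total
p_early_given_control = controls_early / controls_total

print(f"P(Age < 25)                   = {p_early:.4f}")
print(f"P(Age < 25 | Cervical Cancer) = {p_early_given_case:.4f}")
print(f"P(Age < 25 | Healthy Control) = {p_early_given_control:.4f}")
```

If the two variables were independent, all three printed values would agree up to sampling variation.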
10
Development of a Test Statistic to Measure Lack of Independence
Of course the observed differences could be due to random variation, and in truth it may be the case that disease and risk factor status are independent. Therefore we need a means of assessing how different the observed results are from what we would expect to see if these two factors were independent.
11
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy

                           Age at 1st Pregnancy
Disease Status             Age < 25      Age > 25      Row Totals
Cervical Cancer (Case)      42 (a)         7 (b)        49 (R1)
Healthy (Control)          203 (c)       114 (d)       317 (R2)
Column Totals              245 (C1)      121 (C2)      366 (n)
12
Development of a Test Statistic to Measure Lack of Independence
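The unconditional probability of risk presence for these data is given by:
$$P(\text{Age} < 25) = \frac{C_1}{n}$$
From this table we can calculate the conditional probability of having the risk factor of early pregnancy given the disease status of the subject as follows:
$$P(\text{Age} < 25 \mid \text{Cervical Cancer}) = \frac{a}{R_1}$$
and setting these equal we have
$$\frac{a}{R_1} = \frac{C_1}{n}$$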
13
Development of a Test Statistic to Measure Lack of Independence
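Thus we expect the frequency in the a cell to be equal to:
$$E_a = \frac{R_1 C_1}{n}$$
Similarly we find the following expected frequencies for the other cells making up the 2 X 2 table:
$$E_b = \frac{R_1 C_2}{n}, \qquad E_c = \frac{R_2 C_1}{n}, \qquad E_d = \frac{R_2 C_2}{n}$$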
14
Development of a Test Statistic to Measure Lack of Independence
In general we denote the observed frequency in the ith row and jth column as $O_{ij}$, or just O for short. We denote the expected frequency for the ith row and jth column as $E_{ij} = R_i C_j / n$, or just E for short.
15
Development of a Test Statistic to Measure Lack of Independence
To measure how different our observed results are from what we expected to see if the two variables in question were independent, we intuitively should look at the difference between the observed (O) and expected (E) frequencies, i.e. O – E or more specifically $O_{ij} - E_{ij}$. However, this will give too much weight to differences where these frequencies are both large in size.
16
Development of a Test Statistic to Measure Lack of Independence
One test statistic that addresses the “size” of the frequencies issue is Pearson’s Chi-Square (χ²):
$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Notice this test statistic still uses (O – E) as the basic building block. This statistic will be large when the observed frequencies do NOT match the expected values for independence.
17
Chi-square Distribution (χ²)
This is a graph of the chi-square distribution with 4 degrees of freedom. The area to the right of Pearson’s chi-square statistic gives the p-value. The p-value is always the area to the right!
18
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy

                           Age at 1st Pregnancy
Disease Status             Age < 25      Age > 25      Row Totals
Cervical Cancer (Case)      42 (O11)       7 (O12)      49 (R1)
Healthy (Control)          203 (O21)     114 (O22)     317 (R2)
Column Totals              245 (C1)      121 (C2)      366 (n)
19
Calculating Expected Frequencies
Cervical Cancer and Age at 1st Pregnancy (expected frequencies in parentheses)

                           Age at 1st Pregnancy
Disease Status             Age < 25         Age > 25         Row Totals
Cervical Cancer (Case)      42 (32.80)        7 (16.20)       49 (R1)
Healthy (Control)          203 (212.20)     114 (104.80)     317 (R2)
Column Totals              245 (C1)         121 (C2)         366 (n)
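The expected frequencies can be reproduced from the row and column totals using the rule (row total × column total) / n. The following is a small sketch in plain Python; the variable names are illustrative only.

```python
# Observed counts: rows = (case, control), columns = (Age < 25, Age > 25).
observed = [[42, 7],
            [203, 114]]

row_totals = [sum(row) for row in observed]        # R1, R2
col_totals = [sum(col) for col in zip(*observed)]  # C1, C2
n = sum(row_totals)

# Expected count under independence: E_ij = R_i * C_j / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

The same three lines work unchanged for any r X c table, since nothing in them depends on the table being 2 X 2.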
20
Calculating the Pearson Chi-square
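The worked calculation for this slide did not survive as text, so the following is a hedged sketch of how the statistic can be computed from the observed and expected frequencies for the cervical cancer table. It sums (O – E)²/E over the four cells; the variable names are assumptions made for illustration.

```python
# Observed counts: rows = (case, control), columns = (Age < 25, Age > 25).
observed = [[42, 7],
            [203, 114]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson's chi-square: sum over all cells of (O - E)^2 / E,
# where E = (row total * column total) / n.
chi_square = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        e = r * c / n
        chi_square += (observed[i][j] - e) ** 2 / e

print(f"chi-square = {chi_square:.3f}")
```

The result agrees with the test statistic value of 9.011 entered into the probability calculator on the next slide.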
21
Chi-square Probability Calculator in JMP
Enter the test statistic value and df and the p-value is automatically calculated.
p-value = P(χ² > 9.011) = .0027
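The same p-value can be obtained without JMP. A 2 X 2 table has (2 – 1)(2 – 1) = 1 degree of freedom, and a chi-square variable with 1 df is the square of a standard normal, so its right-tail area can be written with the complementary error function. This sketch uses only the Python standard library and applies to the 1-df case only.

```python
import math

chi_square = 9.011  # test statistic from the cervical cancer table
df = 1              # (rows - 1) * (columns - 1) for a 2 X 2 table

# For df = 1: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"p-value = {p_value:.4f}")
```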
22
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy
Conclusion: We have strong evidence to suggest that age at first pregnancy and cervical cancer status are NOT independent, and that they are associated or related (p = .0027). In particular we found that the proportion of women having their first pregnancy at or before the age of 25 was higher amongst women with cervical cancer than for those without.
23
Other things we could do…
Odds Ratio (OR) and CI for OR - case-control study means no RR.
Fisher’s Exact Test - Pearson’s chi-square is an approximation that requires “large” sample sizes
* typically we would like all Eij > 5
* or at least 80% of cells should have Eij > 5
* thus the approximation should be good here as both of these conditions are met for this study.
24
Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease
In this study a random sample of 538 patients diagnosed with some form of Hodgkin’s Disease was taken and the histological type: nodular sclerosis (NS), mixed cellularity (MC), lymphocyte predominance (LP), or lymphocyte depletion (LD) was recorded along with the outcome from standard treatment which was recorded as being none, partial, or complete remission. Q: Is there an association between type of Hodgkin’s and response to treatment? If so, what is the nature of the relationship?
25
Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease
                 None    Partial    Positive    Row Totals
LD                44       10          18           72
LP                12       18          74          104
MC                58       54         154          266
NS                12       16          68           96
Column Totals    126       98         314        n = 538

Some Probabilities of Potential Interest
Probability of Positive Response to Treatment
P(positive) = 314/538 = .5836
Probability of Positive Response to Treatment Given Disease Type
P(positive|LD) = 18/72 = .2500
P(positive|LP) = 74/104 = .7115
P(positive|MC) = 154/266 = .5789
P(positive|NS) = 68/96 = .7083
Notice the conditional probabilities are not equal to the unconditional!
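These conditional probabilities can be recomputed from the table as row proportions. The sketch below is illustrative; note that the LP "Partial" count (18) and the NS "None" count (12) are inferred here from the row and column totals, which is an assumption about the two cells that did not survive in the slide text.

```python
# Rows: histological type; columns: (none, partial, positive) response.
table = {
    "LD": [44, 10, 18],
    "LP": [12, 18, 74],   # 18 inferred from the row total of 104
    "MC": [58, 54, 154],
    "NS": [12, 16, 68],   # 12 inferred from the row total of 96
}

n = sum(sum(counts) for counts in table.values())
positive_total = sum(counts[2] for counts in table.values())

# Unconditional probability of a positive response.
p_positive = positive_total / n
print(f"P(positive) = {p_positive:.4f}")

# Probability of a positive response given disease type.
p_positive_given_type = {
    kind: counts[2] / sum(counts) for kind, counts in table.items()
}
for kind, p in p_positive_given_type.items():
    print(f"P(positive|{kind}) = {p:.4f}")
```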
26
Mosaic plot of the results
Response to Treatment vs. Histological Type
Clearly we see that LP and NS respond most favorably to treatment, with over 70% of those sampled experiencing complete remission, whereas lymphocyte depletion has a majority (61.1%) of patients having no response to treatment. A statistical test at this point seems unnecessary as it seems clear that there is an association between the type of Hodgkin’s disease and the response to treatment; nonetheless we will proceed…
27
Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease
Observed frequencies with expected frequencies in parentheses:

                 None            Partial         Positive          Row Totals
LD                44 (16.86)      10 (13.11)      18 (42.02)           72
LP                12 (24.36)      18 (18.94)      74 (60.69)          104
MC                58 (62.30)      54 (48.45)     154 (155.25)         266
NS                12 (22.48)      16 (17.49)      68 (56.03)           96
Column Totals    126              98             314               n = 538
28
Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease
We have strong evidence of an association between the type of Hodgkin’s and response to treatment (p < .0001).
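The test behind this conclusion can be sketched in plain Python. The table has (4 – 1)(3 – 1) = 6 degrees of freedom, and for an even number of degrees of freedom the chi-square right-tail area has a closed form, so no statistics library is needed. The LP "Partial" and NS "None" counts are inferred from the margins, and the helper name `chi2_sf_even_df` is hypothetical.

```python
import math

# Rows: LD, LP, MC, NS; columns: none, partial, positive.
observed = [[44, 10, 18],
            [12, 18, 74],
            [58, 54, 154],
            [12, 16, 68]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson's chi-square: sum of (O - E)^2 / E with E = R_i * C_j / n.
chi_square = sum(
    (observed[i][j] - r * c / n) ** 2 / (r * c / n)
    for i, r in enumerate(row_totals)
    for j, c in enumerate(col_totals)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_sf_even_df(x, df):
    """Right-tail area P(chi2 > x); valid only when df is even."""
    half = x / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


p_value = chi2_sf_even_df(chi_square, df)
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.2e}")
```

The p-value is far below .0001, consistent with the conclusion stated on the slide.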
29
Measures of Association Between Two Categorical Variables
This can be applied to the cervical cancer case-control study.
30
Measures of Association Between Two Categorical Variables
This can be used for general r x c tables. This can be used for the Hodgkin’s example:
31
Measures of Association Between Two Categorical Variables
For the Hodgkin’s study
32
Measures of Association Between Two Categorical Variables
There are lots of other measures of association. When both variables are nominal the previous measures are fine, and there are certainly many more. For cases where both variables are ordinal, common measures include Kendall’s tau and Somers’ D. In some cases we wish to measure the degree of exact agreement between two nominal or ordinal variables measured using the same levels or scales, in which case we generally use Cohen’s Kappa (κ).
33
Measures of Association Between Two Categorical Variables
Cohen’s Kappa (κ) – measures the degree of agreement between two variables on the same scales.
Example 3: Medicare Study – General health at baseline and 2-yr. follow-up, how well do they agree?
κ > .75 excellent agreement
.4 < κ < .75 good agreement
0 < κ < .4 marginal agreement
There is fairly good agreement between the general assessment of overall health at baseline and at follow-up. However, there appears to be some general trend for improvement as well.
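The Medicare table itself is not reproduced in the text, so the sketch below uses a small made-up 3 X 3 table purely to illustrate the usual definition of kappa: κ = (p_o – p_e) / (1 – p_e), where p_o is the observed proportion on the diagonal and p_e is the proportion of agreement expected by chance from the margins. The function name and the counts are hypothetical.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts (rows and columns on the same scale)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Observed agreement: proportion of counts on the main diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n

    # Chance agreement: sum over levels of (row share * column share).
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2

    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings (NOT the Medicare data): rows = baseline, columns = follow-up.
example = [[20, 5, 1],
           [4, 30, 6],
           [1, 5, 28]]

print(f"kappa = {cohens_kappa(example):.3f}")
```

Kappa equals 1 when every observation falls on the diagonal and 0 when the agreement is exactly what chance alone would produce.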
34
Testing for Lack of Symmetry
Bowker’s Test of Symmetry is a generalization of McNemar’s Test to r x r tables where the row and column variables are on the same scale. The general health of the subjects in the Medicare study is an example of where this test could be used, as both the health at baseline and at follow-up are recorded using the same 5-point ordinal scale.
35
Bowker’s Test of Symmetry
X                 1       2       …       r       Row Totals
1                O11     O12      …      O1r
2                O21     O22      …      O2r
…
r                Or1     Or2      …      Orr
Column Totals

The test looks for the frequencies to be generally larger on one side of the diagonal than the other.
36
Bowker’s Test of Symmetry
When will this test statistic be “large”? If there was a general trend or tendency for X > Y or for X < Y, then we would expect the off-diagonal cells of the table to be larger on one side than the other. For example, if Y tended to be larger than X, perhaps indicating an improvement in health, then we expect the frequencies above the diagonal to be larger than those below.
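The formula for the test statistic is not preserved in the text. In its usual form, Bowker's statistic compares each pair of cells mirrored across the diagonal, summing (O_ij – O_ji)² / (O_ij + O_ji) over all pairs with i < j, and refers the result to a chi-square distribution with r(r – 1)/2 degrees of freedom. The sketch below assumes that form; the function name and the example counts are hypothetical, and skipping pairs that are both zero is one common convention rather than something stated in the slides.

```python
def bowker_statistic(table):
    """Return (statistic, df) for Bowker's test of symmetry on an r x r table of counts."""
    r = len(table)
    statistic = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            above, below = table[i][j], table[j][i]
            if above + below > 0:  # skip empty mirrored pairs to avoid 0/0
                statistic += (above - below) ** 2 / (above + below)
    df = r * (r - 1) // 2
    return statistic, df


# Hypothetical 3 x 3 table: rows = X (baseline), columns = Y (follow-up).
# Counts above the diagonal are larger, i.e. Y tends to exceed X.
example = [[30, 15, 6],
           [5, 40, 12],
           [2, 4, 25]]

stat, df = bowker_statistic(example)
print(f"Bowker statistic = {stat:.3f}, df = {df}")
```

For a 2 X 2 table the sum has a single term, (O12 – O21)² / (O12 + O21), which is McNemar's statistic without continuity correction.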
37
Bowker’s Test of Symmetry
Symmetry of Disagreement
Bowker’s test suggests the differences are asymmetric (p < .0001). Examining the percentages suggests a majority of patients either stayed the same or improved in each group based on baseline score. Therefore it is reasonable to state that we have evidence that in general subjects’ health stayed the same or, if it did change, it was generally for the better (p < .0001).
38
Other Approaches
Wilcoxon Signed-Rank Test for the paired differences in the ordinal health score (p < .0001).
Direct examination of the distribution of the changes in general health score (Follow-up – Baseline).
The plot on the right shows the change in general health vs. baseline health. With the exception of those with the lowest health at baseline, a majority (50%+) of patients stayed the same. The shading for improvement is larger than the shading for health decline. There is a slight advantage for improvement vs. decline in health.
39
Other Tests for Categorical Data
* Chi-square Test for Trend in Binomial Proportions – tests whether or not p1 < p2 < p3 < … < pk where 1, 2, …, k are levels of an ordinal variable, i.e. a 2 X k table.
* Chi-square Goodness-of-Fit Tests – used to test whether observations come from some hypothesized distribution.
* Cochran-Mantel-Haenszel Test – looks at whether or not there is a relationship in a 2 X 2 table situation adjusting for the level of a third factor. For example, is there a relationship between heavy drinking (Y or N) and lung cancer (Y or N) adjusting for smoking status?