Published by Vivian King. Modified over 9 years ago.
1
Contingency Tables
1. Explain the χ² Test of Independence
2. Measures of Association
2
Contingency Tables
• Tables representing all combinations of levels of explanatory and response variables
• Numbers in the table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts
3
2x2 Tables
• Each variable has 2 levels
  – Explanatory Variable – Groups (typically based on demographics, exposure)
  – Response Variable – Outcome (typically presence or absence of a characteristic)
4
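2x2 Tables - Notation

              | Outcome Present | Outcome Absent | Group Total
--------------+-----------------+----------------+------------
Group 1       | n11             | n12            | n1.
Group 2       | n21             | n22            | n2.
--------------+-----------------+----------------+------------
Outcome Total | n.1             | n.2            | n..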
5
χ² Test of Independence
6
1. Shows If a Relationship Exists Between 2 Qualitative Variables
  – One Sample Is Drawn
  – Does Not Show Causality
2. Assumptions
  – Multinomial Experiment
  – All Expected Counts ≥ 5
3. Uses Two-Way Contingency Table
7
χ² Test of Independence Contingency Table
1. Shows the Number of Observations From 1 Sample Classified Jointly by 2 Qualitative Variables
[Figure: contingency table layout, labeled "Levels of variable 1" and "Levels of variable 2"]
9
χ² Test of Independence Hypotheses & Statistic
1. Hypotheses
  – H0: Variables Are Independent
  – Ha: Variables Are Related (Dependent)
2. Test Statistic
  $\chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})}$
  where $n_{ij}$ is the observed count and $E(n_{ij})$ is the expected count in cell (i, j)
  Degrees of Freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns
12
χ² Test of Independence Expected Counts
1. Statistical Independence Means Joint Probability Equals Product of Marginal Probabilities
2. Compute Marginal Probabilities & Multiply for Joint Probability
3. Expected Count Is Sample Size Times Joint Probability
13
Expected Count Example
Marginal probability = 112/160
Marginal probability = 78/160
Joint probability = (112/160)(78/160)
Expected count = 160 · (112/160)(78/160) = 54.6
19
Expected Count Calculation
Expected count = (row total × column total) / sample size
112·78/160 = 54.6    112·82/160 = 57.4
48·78/160 = 23.4     48·82/160 = 24.6
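The expected-count rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `expected_counts` is hypothetical, and the only inputs are the marginal totals that appear in the example (112 and 48 for one variable, 78 and 82 for the other, 160 in total).

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: (row total * column total) / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must sum to the same sample size")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginal totals taken from the expected count example (n = 160).
expected = expected_counts([112, 48], [78, 82])

for row in expected:
    print([round(value, 1) for value in row])
```

Each expected count is the sample size times the product of two marginal probabilities, which simplifies to the product of the two marginal totals divided by the sample size.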
22
χ² Test of Independence Example
You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the .05 level, is there evidence of a relationship?
23
χ² Test of Independence Solution
H0: No Relationship
Ha: Relationship
α = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): [Figure: χ² distribution with upper-tail rejection region, α = .05]
Expected counts: 170·132/286, 170·154/286, 116·132/286, 154·116/286, so E(n_ij) ≥ 5 in all cells
Test Statistic: χ² = 54.29
Decision: Reject at α = .05
Conclusion: There is evidence of a relationship
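The assumption check and the decision step can be reproduced with a short Python sketch. It assumes only the marginal totals (170, 116 and 132, 154) and the test statistic (54.29) shown in the solution. The helper name `chi2_upper_tail_df1` is hypothetical, and it relies on the fact that a χ² variable with 1 degree of freedom is a squared standard normal, so its upper-tail probability is erfc(√(x/2)). The value 3.841 is the usual tabled critical value for α = .05 with df = 1.

```python
import math

# Marginal totals from the solution slide (n = 286).
totals_a = [170, 116]
totals_b = [132, 154]
n = 286

# Expected counts under independence, and the "all expected counts >= 5" check.
expected = [[a * b / n for b in totals_b] for a in totals_a]
assumption_ok = all(value >= 5 for row in expected for value in row)


def chi2_upper_tail_df1(x):
    """P(chi-square with 1 df > x), using chi2_1 = Z**2 for a standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


chi2_stat = 54.29          # test statistic reported in the solution
critical_value = 3.841     # tabled chi-square critical value, alpha = .05, df = 1
p_value = chi2_upper_tail_df1(chi2_stat)
reject_h0 = chi2_stat > critical_value

print("expected counts:", [[round(v, 2) for v in row] for row in expected])
print("assumption satisfied:", assumption_ok)
print("reject H0 at alpha = .05:", reject_h0)
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent ways to reach the decision.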
33
Siskel and Ebert

           |              Ebert              |
    Siskel |      Con        Mix        Pro  |    Total
-----------+---------------------------------+----------
       Con |       24          8         13  |       45
       Mix |        8         13         11  |       32
       Pro |       10          9         64  |       83
-----------+---------------------------------+----------
     Total |       42         30         88  |      160
34
Siskel and Ebert (observed counts, with expected counts beneath)

           |              Ebert              |
    Siskel |      Con        Mix        Pro  |    Total
-----------+---------------------------------+----------
       Con |       24          8         13  |       45
           |     11.8        8.4       24.8  |     45.0
-----------+---------------------------------+----------
       Mix |        8         13         11  |       32
           |      8.4        6.0       17.6  |     32.0
-----------+---------------------------------+----------
       Pro |       10          9         64  |       83
           |     21.8       15.6       45.6  |     83.0
-----------+---------------------------------+----------
     Total |       42         30         88  |      160
           |     42.0       30.0       88.0  |    160.0

Pearson chi2(4) = 45.3569    p < 0.001
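The Pearson statistic for this table can be recomputed from the observed counts alone. The following is a sketch in plain Python, not the software that produced the output above. The function name `pearson_chi2` is hypothetical, and the p-value line uses the closed-form upper tail that holds only for 4 degrees of freedom, exp(-x/2)(1 + x/2).

```python
import math

# Observed counts from the Siskel and Ebert table (rows: Siskel, columns: Ebert).
observed = [
    [24, 8, 13],
    [8, 13, 11],
    [10, 9, 64],
]


def pearson_chi2(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


chi2_stat, df = pearson_chi2(observed)

# Upper-tail probability; this closed form is valid for df = 4 only.
p_value = math.exp(-chi2_stat / 2) * (1 + chi2_stat / 2)

print(f"chi2({df}) = {chi2_stat:.4f}, p < 0.001: {p_value < 0.001}")
```

The result should agree with the reported Pearson chi2(4) = 45.3569 up to rounding.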
35
Yates’ Statistic
• Method of testing for association in 2x2 tables when the sample size is moderate (total observations between 6 and 25)
36
Measures of association
  – Relative Risk
  – Odds Ratio
  – Absolute Risk
37
Relative Risk
• Ratio of the probability that the outcome characteristic is present for one group, relative to the other
• Sample proportions with characteristic from groups 1 and 2:
  $\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$
38
Relative Risk
• Estimated Relative Risk:
  $RR = \frac{\hat{\pi}_1}{\hat{\pi}_2}$
• 95% Confidence Interval for Population Relative Risk:
  $\left( RR \cdot e^{-1.96\sqrt{v}},\; RR \cdot e^{1.96\sqrt{v}} \right)$ with $v = \frac{1 - \hat{\pi}_1}{n_{11}} + \frac{1 - \hat{\pi}_2}{n_{21}}$
39
Relative Risk
• Interpretation
  – Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
  – Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
  – Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1
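A small Python sketch can make the relative risk estimate and its interpretation concrete. The counts below are hypothetical (30 of 100 with the outcome in group 1, 15 of 100 in group 2) and are not from any study cited in these slides. The interval is the usual large-sample interval built on the log scale, which is an assumption about the intended method, and `relative_risk_ci` is an illustrative name.

```python
import math


def relative_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Estimated relative risk with a large-sample (log-scale) confidence interval."""
    p1 = n11 / n1_total            # sample proportion with the outcome, group 1
    p2 = n21 / n2_total            # sample proportion with the outcome, group 2
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21   # estimated variance of log(RR)
    half_width = z * math.sqrt(v)
    return rr, rr * math.exp(-half_width), rr * math.exp(half_width)


# Hypothetical counts, for illustration only.
rr, lower, upper = relative_risk_ci(n11=30, n1_total=100, n21=15, n2_total=100)

if lower > 1:
    verdict = "higher probability of the outcome for group 1"
elif upper < 1:
    verdict = "lower probability of the outcome for group 1"
else:
    verdict = "no conclusion that the probabilities differ"

print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}): {verdict}")
```

With these hypothetical counts the whole interval lies above 1, so the first interpretation rule applies.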
40
Example - Coccidioidomycosis and TNF-α Antagonists
Research Question: Is there a risk of developing coccidioidomycosis associated with arthritis therapy?
Groups: Patients receiving tumor necrosis factor α (TNF-α) antagonists versus patients not receiving TNF-α antagonists (all patients arthritic)
Source: Bergstrom, et al (2004)
41
Example - Coccidioidomycosis and TNF-α Antagonists
Group 1: Patients on TNF-α antagonists
Group 2: Patients not on TNF-α antagonists
Entire CI above 1: conclude higher risk if on TNF-α antagonists
42
Odds Ratio
• Odds of an event is the probability it occurs divided by the probability it does not occur
• Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
• Sample odds of the outcome for each group:
  $odds_1 = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}}$
43
Odds Ratio
• Estimated Odds Ratio:
  $OR = \frac{odds_1}{odds_2} = \frac{n_{11} n_{22}}{n_{12} n_{21}}$
• 95% Confidence Interval for Population Odds Ratio:
  $\left( OR \cdot e^{-1.96\sqrt{v}},\; OR \cdot e^{1.96\sqrt{v}} \right)$ with $v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$
44
Odds Ratio
• Interpretation
  – Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
  – Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
  – Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1
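The odds ratio can be sketched the same way. The 2x2 counts below are hypothetical (30 present and 70 absent in group 1, 15 present and 85 absent in group 2). The interval is the common large-sample interval on the log scale, assumed here rather than taken from the slides, and `odds_ratio_ci` is an illustrative name.

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Estimated odds ratio with a large-sample (log-scale) confidence interval."""
    odds1 = n11 / n12              # sample odds of the outcome, group 1
    odds2 = n21 / n22              # sample odds of the outcome, group 2
    odds_ratio = odds1 / odds2
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22   # estimated variance of log(OR)
    half_width = z * math.sqrt(v)
    return odds_ratio, odds_ratio * math.exp(-half_width), odds_ratio * math.exp(half_width)


# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(n11=30, n12=70, n21=15, n22=85)

contains_one = lower <= 1 <= upper
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f}), contains 1: {contains_one}")
```

Because the odds ratio uses only the four cell counts, the same calculation applies to case-control data, where group proportions cannot be estimated directly.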
45
Example - NSAIDs and GBM
• Case-Control Study (Retrospective)
  – Cases: 137 Self-Reporting Patients with Glioblastoma Multiforme (GBM)
  – Controls: 401 Population-Based Individuals matched to cases with respect to demographic factors
Source: Sivak-Sears, et al (2004)
46
Example - NSAIDs and GBM
Interval is entirely below 1; NSAID use appears to be lower among cases than among controls
47
Absolute Risk
• Difference between the proportions with an outcome characteristic for 2 groups
• Sample proportions with characteristic from groups 1 and 2:
  $\hat{\pi}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{\pi}_2 = \frac{n_{21}}{n_{2.}}$
48
Absolute Risk
• Estimated Absolute Risk:
  $AR = \hat{\pi}_1 - \hat{\pi}_2$
• 95% Confidence Interval for Population Absolute Risk:
  $AR \pm 1.96 \sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_{1.}} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_{2.}}}$
49
Absolute Risk
• Interpretation
  – Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is positive
  – Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is negative
  – Do not conclude that the probability of the outcome differs for the two groups if the interval contains 0
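For the absolute risk, the comparison value is 0 rather than 1. The sketch below again uses hypothetical counts (30 of 100 in group 1, 15 of 100 in group 2) and assumes the standard large-sample interval for a difference of two proportions. `absolute_risk_ci` is an illustrative name.

```python
import math


def absolute_risk_ci(n11, n1_total, n21, n2_total, z=1.96):
    """Difference in sample proportions with a large-sample confidence interval."""
    p1 = n11 / n1_total
    p2 = n21 / n2_total
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1_total + p2 * (1 - p2) / n2_total)
    return diff, diff - z * se, diff + z * se


# Hypothetical counts, for illustration only.
diff, lower, upper = absolute_risk_ci(n11=30, n1_total=100, n21=15, n2_total=100)

entirely_positive = lower > 0
print(f"AR = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), "
      f"entirely positive: {entirely_positive}")
```

An interval that is entirely positive corresponds to the first interpretation rule: a higher probability of the outcome for group 1.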
50
Example - Coccidioidomycosis and TNF-α Antagonists
Group 1: Patients on TNF-α antagonists
Group 2: Patients not on TNF-α antagonists
Interval is entirely positive; TNF-α antagonist use is associated with higher risk
51
Ordinal Explanatory and Response Variables
• Pearson’s Chi-square test can be used to test associations among ordinal variables, but more powerful methods exist
• When theories exist that the association is directional (positive or negative), measures exist to describe and test for these specific alternatives from independence:
  – Gamma
  – Kendall’s τ_b
52
Concordant and Discordant Pairs
• Concordant Pairs - Pairs of individuals where one individual scores “higher” on both ordered variables than the other individual
• Discordant Pairs - Pairs of individuals where one individual scores “higher” on one ordered variable and “lower” on the other variable than the other individual
• C = # Concordant Pairs, D = # Discordant Pairs
  – Under Positive association, expect C > D
  – Under Negative association, expect C < D
  – Under No association, expect C ≈ D
53
Example - Alcohol Use and Sick Days
• Alcohol Risk (Without Risk, Hardly any Risk, Some to Considerable Risk)
• Sick Days (0, 1-6, ≥7)
• Concordant Pairs - Pairs of respondents where one scores higher on both alcohol risk and sick days than the other
• Discordant Pairs - Pairs of respondents where one scores higher on alcohol risk and the other scores higher on sick days
Source: Hermansson, et al (2003)
54
Example - Alcohol Use and Sick Days
Concordant Pairs: Each individual in a given cell is concordant with each individual in cells “Southeast” of theirs
Discordant Pairs: Each individual in a given cell is discordant with each individual in cells “Southwest” of theirs
55
Example - Alcohol Use and Sick Days
56
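Measures of Association
Goodman and Kruskal’s Gamma: $\hat{\gamma} = \frac{C - D}{C + D}$
Kendall’s τ_b: $\hat{\tau}_b = \frac{C - D}{0.5 \sqrt{\left(n_{..}^2 - \sum_i n_{i.}^2\right)\left(n_{..}^2 - \sum_j n_{.j}^2\right)}}$
When there’s no association between the ordinal variables, the population-based values of these measures are 0. Statistical software packages provide these tests.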
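Counting concordant and discordant pairs by the "Southeast" and "Southwest" rule is easy to sketch in Python. The alcohol-use counts are not reproduced in this transcript, so the sketch borrows the Siskel and Ebert table from the earlier slide and treats Con < Mix < Pro as an ordered scale, which is an assumption made for illustration. The function names are hypothetical, and the τ_b line uses the standard tie-corrected form.

```python
import math

# Siskel (rows) by Ebert (columns), categories ordered Con < Mix < Pro.
table = [
    [24, 8, 13],
    [8, 13, 11],
    [10, 9, 64],
]


def concordant_discordant(counts):
    """Count concordant (C) and discordant (D) pairs in an ordered two-way table."""
    concordant = discordant = 0
    n_rows, n_cols = len(counts), len(counts[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # rows below cell (i, j)
                for m in range(n_cols):
                    if m > j:                   # "Southeast": concordant
                        concordant += counts[i][j] * counts[k][m]
                    elif m < j:                 # "Southwest": discordant
                        discordant += counts[i][j] * counts[k][m]
    return concordant, discordant


def gamma_and_tau_b(counts):
    """Goodman and Kruskal's gamma and Kendall's tau-b from a two-way table."""
    c, d = concordant_discordant(counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    gamma = (c - d) / (c + d)
    denom = 0.5 * math.sqrt(
        (n ** 2 - sum(r ** 2 for r in row_totals))
        * (n ** 2 - sum(col ** 2 for col in col_totals))
    )
    return gamma, (c - d) / denom


C, D = concordant_discordant(table)
gamma, tau_b = gamma_and_tau_b(table)
print(f"C = {C}, D = {D}, gamma = {gamma:.3f}, tau_b = {tau_b:.3f}")
```

Here C exceeds D, so both measures are positive, consistent with the two critics tending to agree. Tied pairs are ignored by gamma but enter the denominator of τ_b, which is why τ_b is the smaller of the two.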
57
Example - Alcohol Use and Sick Days