Contingency Tables 1. Explain χ² Test of Independence 2. Measure of Association

Contingency Tables
– Tables representing all combinations of levels of explanatory and response variables
– Numbers in the table represent counts of the number of cases in each cell
– Row and column totals are called marginal counts

2x2 Tables
Each variable has 2 levels
– Explanatory Variable – Groups (Typically based on demographics, exposure)
– Response Variable – Outcome (Typically presence or absence of a characteristic)

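2x2 Tables - Notation
              | Outcome Present   Outcome Absent | Group Total
--------------+----------------------------------+------------
Group 1       | n11               n12            | n1.
Group 2       | n21               n22            | n2.
--------------+----------------------------------+------------
Outcome Total | n.1               n.2            | n..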

χ² Test of Independence

1. Shows If a Relationship Exists Between 2 Qualitative Variables
   – One Sample Is Drawn
   – Does Not Show Causality
2. Assumptions
   – Multinomial Experiment
   – All Expected Counts ≥ 5
3. Uses Two-Way Contingency Table

χ² Test of Independence Contingency Table
1. Shows # Observations From 1 Sample Jointly in 2 Qualitative Variables
[Table figure: one dimension lists the levels of variable 1, the other the levels of variable 2]

χ² Test of Independence Hypotheses & Statistic
1. Hypotheses
   – H0: Variables Are Independent
   – Ha: Variables Are Related (Dependent)
2. Test Statistic
   \[ \chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})} \]
   where n_{ij} is the observed count and E(n_{ij}) is the expected count in cell (i, j)
   Degrees of Freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns

χ² Test of Independence Expected Counts
1. Statistical Independence Means Joint Probability Equals Product of Marginal Probabilities
2. Compute Marginal Probabilities & Multiply for Joint Probability
3. Expected Count Is Sample Size Times Joint Probability

Expected Count Example
Marginal probabilities = 112/160 (one variable) and 78/160 (other variable)
Joint probability = (112/160)(78/160)
Expected count = 160 · (112/160)(78/160) = 54.6

Expected Count Calculation
Expected count = (row total · column total) / sample size
Expected counts for the four cells: 112·78/160, 112·82/160, 48·78/160, 48·82/160
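The three steps above (marginal probabilities, joint probability, expected count) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `expected_counts` is made up here, and treating 112 and 48 as row totals and 78 and 82 as column totals is an assumption, since the transcript does not say which margin belongs to which variable.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence.

    Each cell is n * (row marginal probability) * (column marginal probability),
    which simplifies to (row total * column total) / n.
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same sample size")
    return [[n * (r / n) * (c / n) for c in col_totals] for r in row_totals]


# Margins from the slide example (row/column assignment is an assumption).
expected = expected_counts([112, 48], [78, 82])
for row in expected:
    print([round(value, 1) for value in row])
```

The first cell reproduces the 54.6 worked out on the slide, and the four expected counts add up to the sample size of 160.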

χ² Test of Independence Example
You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the .05 level, is there evidence of a relationship?

χ² Test of Independence Solution
E(n_ij) ≥ 5 in all cells
Expected counts: 170·132/286, 170·154/286, 116·132/286, 154·116/286

χ² Test of Independence Solution
H0: No Relationship
Ha: Relationship
α = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): 3.841 (upper tail of the χ² distribution with 1 df, α = .05)
Test Statistic: χ² = 54.29
Decision: Reject at α = .05
Conclusion: There is evidence of a relationship
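The decision step can be checked numerically. The sketch below relies on a standard identity that holds only for 1 degree of freedom: a χ² variable with 1 df is the square of a standard normal, so its upper-tail probability equals erfc(sqrt(x/2)). The helper name `chi2_upper_tail_df1` is invented for this illustration, and the statistic 54.29 is taken from the slide rather than recomputed, because the observed table is not part of the transcript.

```python
import math


def chi2_upper_tail_df1(x):
    """P(X > x) for a chi-square variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


alpha = 0.05
chi2_stat = 54.29  # test statistic reported on the slide

p_value = chi2_upper_tail_df1(chi2_stat)
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.3g}, reject H0: {reject_h0}")
# Sanity check: the tail area beyond the usual 1-df critical value is about .05.
print(f"tail area beyond 3.841 = {chi2_upper_tail_df1(3.841):.4f}")
```

For tables with more than 1 degree of freedom this shortcut does not apply, and a statistics library or a χ² table would be needed instead.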

Siskel and Ebert
           |              Ebert              |
    Siskel |     Con       Mix       Pro     |   Total
-----------+---------------------------------+----------
       Con |      24         8        13     |      45
       Mix |       8        13        11     |      32
       Pro |      10         9        64     |      83
-----------+---------------------------------+----------
     Total |      42        30        88     |     160

Siskel and Ebert (observed counts, with expected counts beneath)
           |              Ebert              |
    Siskel |     Con       Mix       Pro     |   Total
-----------+---------------------------------+----------
       Con |      24         8        13     |      45
           |    11.8       8.4      24.8     |    45.0
-----------+---------------------------------+----------
       Mix |       8        13        11     |      32
           |     8.4       6.0      17.6     |    32.0
-----------+---------------------------------+----------
       Pro |      10         9        64     |      83
           |    21.8      15.6      45.6     |    83.0
-----------+---------------------------------+----------
     Total |      42        30        88     |     160
           |    42.0      30.0      88.0     |   160.0
Pearson chi2(4) = 45.3569   p < 0.001
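The Pearson statistic reported for this table can be reproduced directly from the observed counts. The following is a minimal pure-Python sketch, not the software that produced the output above; the function name `pearson_chi2` is chosen here for illustration.

```python
def pearson_chi2(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / n.
            exp = row_totals[i] * col_totals[j] / n
            statistic += (obs - exp) ** 2 / exp

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


# Rows: Siskel (Con, Mix, Pro); columns: Ebert (Con, Mix, Pro).
siskel_ebert = [
    [24, 8, 13],
    [8, 13, 11],
    [10, 9, 64],
]

chi2_stat, df = pearson_chi2(siskel_ebert)
print(f"Pearson chi2({df}) = {chi2_stat:.4f}")
```

The result agrees with the reported value of 45.3569 on 4 degrees of freedom.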

Yates’ Statistic
Method of testing for association for 2x2 tables when sample size is moderate (total observations between 6 and 25)
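The slide does not show the formula, so the sketch below assumes the usual continuity-corrected form, in which 0.5 is subtracted from each |observed - expected| before squaring (and never more than the difference itself). The function name `yates_chi2` and the 2x2 counts are hypothetical and serve only to show the mechanics.

```python
def yates_chi2(table):
    """Continuity-corrected chi-square for a 2x2 table (assumed Yates form)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            exp = row_totals[i] * col_totals[j] / n
            # Shrink each absolute deviation by 0.5, but not below zero.
            deviation = max(abs(table[i][j] - exp) - 0.5, 0.0)
            statistic += deviation ** 2 / exp
    return statistic


# Hypothetical counts with 20 observations in total (not from the slides).
hypothetical_table = [
    [8, 2],
    [3, 7],
]

corrected = yates_chi2(hypothetical_table)
print(f"Yates-corrected chi2(1) = {corrected:.4f}")
```

For these hypothetical counts the uncorrected Pearson statistic would be about 5.05, so the correction makes the test more conservative.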

Measures of association
– Relative Risk
– Odds Ratio
– Absolute Risk

Relative Risk
– Ratio of the probability that the outcome characteristic is present for one group, relative to the other
– Sample proportions with characteristic from groups 1 and 2:
  \[ \hat{p}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{p}_2 = \frac{n_{21}}{n_{2.}} \]

Relative Risk
– Estimated Relative Risk:
  \[ RR = \frac{\hat{p}_1}{\hat{p}_2} \]
– 95% Confidence Interval for Population Relative Risk (standard large-sample form):
  \[ \left( RR \cdot e^{-1.96\sqrt{v}},\; RR \cdot e^{1.96\sqrt{v}} \right), \qquad v = \frac{1 - \hat{p}_1}{n_{11}} + \frac{1 - \hat{p}_2}{n_{21}} \]
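A small sketch of the relative risk calculation, assuming the common log-scale large-sample interval RR·exp(±1.96·sqrt(v)) with v = (1 - p1)/n11 + (1 - p2)/n21. The function name `relative_risk_ci` and the counts are hypothetical; the counts from the published examples are not available in this transcript.

```python
import math


def relative_risk_ci(n11, n12, n21, n22, z=1.96):
    """Estimated relative risk and an approximate 95% CI from 2x2 counts.

    n11, n12: group 1 with outcome present / absent
    n21, n22: group 2 with outcome present / absent
    """
    p1 = n11 / (n11 + n12)
    p2 = n21 / (n21 + n22)
    rr = p1 / p2
    v = (1 - p1) / n11 + (1 - p2) / n21  # variance of log(RR), assumed form
    half_width = z * math.sqrt(v)
    return rr, rr * math.exp(-half_width), rr * math.exp(half_width)


# Hypothetical counts: 30 of 100 in group 1 and 15 of 100 in group 2.
rr, rr_low, rr_high = relative_risk_ci(30, 70, 15, 85)
print(f"RR = {rr:.2f}, 95% CI = ({rr_low:.2f}, {rr_high:.2f})")
```

With these made-up counts the whole interval lies above 1, which under the interpretation rule would point to a higher outcome probability in group 1.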

Relative Risk
Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

Example - Coccidioidomycosis and TNFα-antagonists
Research Question: Risk of developing Coccidioidomycosis associated with arthritis therapy?
Groups: Patients receiving tumor necrosis factor α (TNFα) versus patients not receiving TNFα (all patients arthritic)
Source: Bergstrom, et al (2004)

Example - Coccidioidomycosis and TNFα-antagonists
Group 1: Patients on TNFα
Group 2: Patients not on TNFα
Entire CI above 1 ⇒ Conclude higher risk if on TNFα

Odds Ratio
– Odds of an event is the probability it occurs divided by the probability it does not occur
– Odds ratio is the odds of the event for group 1 divided by the odds of the event for group 2
– Sample odds of the outcome for each group:
  \[ odds_1 = \frac{n_{11}}{n_{12}} \qquad odds_2 = \frac{n_{21}}{n_{22}} \]

Odds Ratio
– Estimated Odds Ratio:
  \[ OR = \frac{odds_1}{odds_2} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} \]
– 95% Confidence Interval for Population Odds Ratio (standard large-sample form):
  \[ \left( OR \cdot e^{-1.96\sqrt{v}},\; OR \cdot e^{1.96\sqrt{v}} \right), \qquad v = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}} \]
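The odds ratio can be sketched the same way, assuming the common log-scale interval OR·exp(±1.96·sqrt(v)) with v = 1/n11 + 1/n12 + 1/n21 + 1/n22. The function name `odds_ratio_ci` and the counts are hypothetical, since the counts from the published example are not in the transcript.

```python
import math


def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Estimated odds ratio and an approximate 95% CI from 2x2 counts.

    n11, n12: group 1 with outcome present / absent
    n21, n22: group 2 with outcome present / absent
    """
    odds1 = n11 / n12
    odds2 = n21 / n22
    odds_ratio = odds1 / odds2
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22  # variance of log(OR), assumed form
    half_width = z * math.sqrt(v)
    return (
        odds_ratio,
        odds_ratio * math.exp(-half_width),
        odds_ratio * math.exp(half_width),
    )


# Hypothetical counts: 30 present / 70 absent in group 1, 15 / 85 in group 2.
or_hat, or_low, or_high = odds_ratio_ci(30, 70, 15, 85)
print(f"OR = {or_hat:.2f}, 95% CI = ({or_low:.2f}, {or_high:.2f})")
```

Because the odds ratio is symmetric in the roles of rows and columns, the same calculation applies to retrospective (case-control) data, where the relative risk cannot be estimated directly.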

Odds Ratio
Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is above 1
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is below 1
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 1

Example - NSAIDs and GBM
Case-Control Study (Retrospective)
– Cases: 137 Self-Reporting Patients with Glioblastoma Multiforme (GBM)
– Controls: 401 Population-Based Individuals matched to cases with respect to demographic factors
Source: Sivak-Sears, et al (2004)

Example - NSAIDs and GBM
Interval is entirely below 1: NSAID use appears to be lower among cases than among controls

Absolute Risk
– Difference between the proportions of cases with the outcome characteristic for the 2 groups
– Sample proportions with characteristic from groups 1 and 2:
  \[ \hat{p}_1 = \frac{n_{11}}{n_{1.}} \qquad \hat{p}_2 = \frac{n_{21}}{n_{2.}} \]

Absolute Risk
– Estimated Absolute Risk:
  \[ AR = \hat{p}_1 - \hat{p}_2 \]
– 95% Confidence Interval for Population Absolute Risk (standard large-sample form):
  \[ AR \pm 1.96 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_{1.}} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_{2.}}} \]
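A sketch of the absolute risk (difference in proportions), assuming the usual large-sample Wald interval AR ± 1.96·SE. The function name `absolute_risk_ci` and the counts are hypothetical and chosen only for illustration.

```python
import math


def absolute_risk_ci(n11, n12, n21, n22, z=1.96):
    """Estimated difference in proportions and an approximate 95% CI.

    n11, n12: group 1 with outcome present / absent
    n21, n22: group 2 with outcome present / absent
    """
    n1 = n11 + n12
    n2 = n21 + n22
    p1 = n11 / n1
    p2 = n21 / n2
    ar = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return ar, ar - z * se, ar + z * se


# Hypothetical counts: 30 of 100 in group 1 and 15 of 100 in group 2.
ar, ar_low, ar_high = absolute_risk_ci(30, 70, 15, 85)
print(f"AR = {ar:.3f}, 95% CI = ({ar_low:.3f}, {ar_high:.3f})")
```

Unlike the ratio measures, the reference value for "no difference" here is 0 rather than 1.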

Absolute Risk
Interpretation
– Conclude that the probability that the outcome is present is higher (in the population) for group 1 if the entire interval is positive
– Conclude that the probability that the outcome is present is lower (in the population) for group 1 if the entire interval is negative
– Do not conclude that the probability of the outcome differs for the two groups if the interval contains 0

Example - Coccidioidomycosis and TNFα-antagonists
Group 1: Patients on TNFα
Group 2: Patients not on TNFα
Interval is entirely positive: TNFα is associated with higher risk

Ordinal Explanatory and Response Variables
– Pearson’s Chi-square test can be used to test associations among ordinal variables, but more powerful methods exist
– When theories exist that the association is directional (positive or negative), measures exist to describe and test for these specific alternatives from independence:
   – Gamma
   – Kendall’s τ_b

Concordant and Discordant Pairs
– Concordant Pairs - Pairs of individuals where one individual scores “higher” on both ordered variables than the other individual
– Discordant Pairs - Pairs of individuals where one individual scores “higher” on one ordered variable and “lower” on the other variable than the other individual
– C = # Concordant Pairs, D = # Discordant Pairs
   – Under Positive association, expect C > D
   – Under Negative association, expect C < D
   – Under No association, expect C ≈ D

Example - Alcohol Use and Sick Days
– Alcohol Risk (Without Risk, Hardly any Risk, Some to Considerable Risk)
– Sick Days (0, 1-6, ≥ 7)
– Concordant Pairs - Pairs of respondents where one scores higher on both alcohol risk and sick days than the other
– Discordant Pairs - Pairs of respondents where one scores higher on alcohol risk and the other scores higher on sick days
Source: Hermansson, et al (2003)

Example - Alcohol Use and Sick Days
– Concordant Pairs: Each individual in a given cell is concordant with each individual in cells “Southeast” of theirs
– Discordant Pairs: Each individual in a given cell is discordant with each individual in cells “Southwest” of theirs
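The "Southeast" and "Southwest" counting rule can be sketched as follows. The alcohol and sick-day counts are not available in this transcript, so the sketch reuses the Siskel and Ebert table from earlier in the deck, treating Con < Mix < Pro as ordered categories; that reuse and the function name `concordant_discordant` are assumptions made for illustration.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    n_rows = len(table)
    n_cols = len(table[0])
    concordant = 0
    discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                # Cells below and to the right ("Southeast") are concordant.
                concordant += table[i][j] * sum(table[k][j + 1:])
                # Cells below and to the left ("Southwest") are discordant.
                discordant += table[i][j] * sum(table[k][:j])
    return concordant, discordant


# Rows: Siskel (Con, Mix, Pro); columns: Ebert (Con, Mix, Pro).
siskel_ebert = [
    [24, 8, 13],
    [8, 13, 11],
    [10, 9, 64],
]

C, D = concordant_discordant(siskel_ebert)
print(f"C = {C}, D = {D}")
```

Pairs that share a row or a column are tied on at least one variable and are counted in neither C nor D.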

Example - Alcohol Use and Sick Days

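Measures of Association
– Goodman and Kruskal’s Gamma:
  \[ \hat{\gamma} = \frac{C - D}{C + D} \]
– Kendall’s τ_b:
  \[ \hat{\tau}_b = \frac{C - D}{0.5 \sqrt{\left( n_{..}^2 - \sum_i n_{i.}^2 \right) \left( n_{..}^2 - \sum_j n_{.j}^2 \right)}} \]
– When there’s no association between the ordinal variables, the population-based values of these measures are 0
– Statistical software packages provide these tests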
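Both measures can be computed by brute force from their definitions: expand the table into individual observations, classify every pair as concordant, discordant, or tied, and plug the counts into gamma = (C - D)/(C + D) and the tie-adjusted τ_b. The sketch below assumes those standard definitions, uses hypothetical function names, and again borrows the Siskel and Ebert table (Con < Mix < Pro) because the alcohol data are not in the transcript.

```python
import math
from itertools import combinations


def gamma_and_tau_b(table):
    """Goodman-Kruskal gamma and Kendall's tau-b via pairwise comparison."""
    # One (row index, column index) pair per individual in the table.
    observations = [
        (i, j)
        for i, row in enumerate(table)
        for j, count in enumerate(row)
        for _ in range(count)
    ]

    conc = disc = 0
    for (r1, c1), (r2, c2) in combinations(observations, 2):
        direction = (r1 - r2) * (c1 - c2)
        if direction > 0:
            conc += 1      # same ordering on both variables
        elif direction < 0:
            disc += 1      # opposite ordering
        # direction == 0: tied on at least one variable, not counted

    n = len(observations)
    row_sq = sum(sum(row) ** 2 for row in table)
    col_sq = sum(sum(col) ** 2 for col in zip(*table))

    gamma = (conc - disc) / (conc + disc)
    tau_b = (conc - disc) / (0.5 * math.sqrt((n * n - row_sq) * (n * n - col_sq)))
    return gamma, tau_b


siskel_ebert = [
    [24, 8, 13],
    [8, 13, 11],
    [10, 9, 64],
]

gamma, tau_b = gamma_and_tau_b(siskel_ebert)
print(f"gamma = {gamma:.4f}, tau-b = {tau_b:.4f}")
```

Gamma ignores tied pairs entirely, while τ_b adjusts the denominator for them, which is why τ_b comes out smaller in magnitude on a table with many ties.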

Example - Alcohol Use and Sick Days
