March 28 Analyses of binary outcomes 2 x 2 tables


March 28
Analyses of binary outcomes
2 x 2 tables
Relative Risks and Relative Odds (Odds Ratio)
Reading: Lee 6.1 through 6.5; C & S Chapter 3 (G, I, K, L, M, O)

Estimating a Single Proportion
p = proportion in population with characteristic
Take random sample of size n
x = number in sample with characteristic
p̂ = x/n is the estimate of p
SE(p̂) = SQRT(p̂(1 - p̂)/n)
Assumptions: n is large enough for CLT
Then p̂ is approximately normally distributed
95% CI for p: p̂ ± 1.96 SE(p̂)
Test of Ho: p = p0: Z = (p̂ - p0) / SQRT(p0(1 - p0)(1/n))

Example
p = proportion favoring a certain candidate
n = 625
x = 300 favor the candidate
p̂ = 300/625 = 0.48 is the estimate of p
SE(p̂) = SQRT(0.48(0.52)/625) = 0.020
95% CI for p: 0.48 ± 1.96 (0.02) = 0.48 ± 0.04
Note: Pollsters frequently use n = 625, which gives a margin of error of about ± 0.04
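The arithmetic in this example can be checked with a short script. This is a minimal sketch in Python rather than the SAS used later in these slides; the function name `proportion_ci` is made up for illustration.

```python
import math

def proportion_ci(x, n, z=1.96):
    """Return (estimate, standard error, lower, upper) for a single proportion."""
    p_hat = x / n
    # Large-sample standard error, valid when n is large enough for the CLT
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, p_hat - z * se, p_hat + z * se

# Candidate example from the slide: 300 of 625 favor the candidate
p_hat, se, lower, upper = proportion_ci(300, 625)
print(f"p_hat = {p_hat:.2f}, SE = {se:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The printed interval should agree with 0.48 ± 0.04 up to rounding.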

1-Sample Z-Test: Matched Pair Data
                Control Positive   Control Negative
Case Positive          a                  b
Case Negative          c                  d
Analysis is done on the discordant pairs b and c
Called McNemar's chi-square
Ho: p = 0.5 where n = b + c and x = b
Z = (b/(b+c) - 0.5)/sqrt(.5*.5/(b+c))
Z = (b - c)/sqrt(b+c)
χ² = (b - c)²/(b+c)

Example – Vitamin Use/Disease (440 Pairs)
                 Control Vitamin +   Control Vitamin -
Case Vitamin +          100                  50
Case Vitamin -           90                 200
Ho: p = 0.5 where n = 140 and b = 50
χ² = (50 - 90)²/(50 + 90) = 11.43 (p = .0007)
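McNemar's statistic for the vitamin example can be reproduced as follows. This is a sketch in Python (not SAS); the helper name `mcnemar` is hypothetical, and the two-sided p-value is taken from the standard normal distribution via `math.erfc`.

```python
import math

def mcnemar(b, c):
    """McNemar's test using only the discordant pair counts b and c."""
    z = (b - c) / math.sqrt(b + c)
    chi2 = z * z
    # Two-sided p-value for a standard normal Z (same as chi-square with 1 df)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, chi2, p_value

# Discordant pairs from the slide: b = 50, c = 90
z, chi2, p_value = mcnemar(50, 90)
print(f"Z = {z:.2f}, chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

The 300 concordant pairs (100 and 200) do not enter the calculation at all.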

Comparing Two Proportions
Ho: p1 = p2
Ha: p1 ≠ p2
p̂1 = x1/n1
p̂2 = x2/n2
p̂ = (x1+x2)/(n1+n2) (this is the pooled proportion)
Z = (p̂2 - p̂1) / SQRT(p̂(1 - p̂)(1/n1 + 1/n2))
Compare to standard normal distribution
Assume n1 and n2 are large enough to use normal approximation

CI: Difference in Proportions
95% CI for difference in proportions:
(p̂2 - p̂1) ± 1.96 SQRT(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2)

Example – Asthma and SES
X1 is number with asthma in high SES group
X2 is number with asthma in low SES group
Ho: p1 = p2
Ha: p1 ≠ p2
p̂1 = 30/160 = 0.188
p̂2 = 40/140 = 0.286
p̂ = 70/300 = 0.233 (this is the pooled proportion)
Z = (0.286 - 0.188) / SQRT(0.233(.767)(1/160 + 1/140)) = 0.098/0.049 = 2.01
χ² = Z² = 4.03
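The asthma comparison can be worked through in code. This is a minimal Python sketch (the slides themselves use SAS); `two_proportion_test` is an illustrative name. The test uses the pooled proportion as on the slide, while the confidence interval assumes the usual unpooled large-sample standard error, which is an assumption on my part since the transcript does not show that formula.

```python
import math

def two_proportion_test(x1, n1, x2, n2, z_crit=1.96):
    """Z test of Ho: p1 = p2 (pooled SE) and a 95% CI for p2 - p1 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    # Assumed interval: unpooled (Wald) standard error for the difference
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return z, p_value, diff - z_crit * se_diff, diff + z_crit * se_diff

# High SES: 30 of 160 with asthma; low SES: 40 of 140 with asthma
z, p_value, ci_lower, ci_upper = two_proportion_test(30, 160, 40, 140)
print(f"Z = {z:.2f}, Z^2 = {z * z:.2f}, p = {p_value:.4f}")
print(f"95% CI for p2 - p1: ({ci_lower:.3f}, {ci_upper:.3f})")
```

Squaring Z gives the chi-square statistic for the same table.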

2 by 2 Table
              Factor Present   Factor Absent   Total
Sample 1            a                b         n1 = a + b
Sample 2            c                d         n2 = c + d
Total             a + c            b + d
χ² = (a + b + c + d)(ad - bc)² / [(a + c)(b + d)(a + b)(c + d)]

2 by 2 Table
              Have Asthma   No Asthma   Total
High SES           30           130      n1 = 160
Low SES            40           100      n2 = 140
Total              70           230
χ² = (30 + 130 + 40 + 100)(3000 - 5200)² / [(70)(230)(160)(140)] = 4.03
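The shortcut formula for a 2 x 2 table is easy to verify numerically. A small Python sketch follows (not SAS); the function name `chi_square_2x2` is made up for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2 x 2 table using the shortcut formula."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Asthma by SES table: a = 30, b = 130, c = 40, d = 100
chi2 = chi_square_2x2(30, 130, 40, 100)
print(f"chi-square = {chi2:.4f}")
```

This is the uncorrected statistic, with no continuity adjustment.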

Relative Risks and Relative Odds
              Factor Present   Factor Absent   Total
Sample 1            a                b         n1 = a + b
Sample 2            c                d         n2 = c + d
Total             a + c            b + d
RR = [a/(a+b)] / [c/(c+d)]
RO = (a/b) / (c/d) = ad/bc
If a+b is approximately equal to b and c+d is approximately equal to d (that is, a and c are small), then RR will be close to RO

Calculation of RR and RO
              Have Asthma   No Asthma   Total
High SES           30           130       160
Low SES            40           100       140
Total              70           230
RR Asthma (High v Low SES): RR = (30/160) / (40/140) = 0.66
RO Asthma (High v Low SES): RO = (30/130) / (40/100) = 0.58
34% lower risk of asthma in high SES compared to low SES
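The two measures can be computed side by side. This is a brief Python sketch, not part of the original slides; the function names are illustrative only.

```python
def relative_risk(a, b, c, d):
    """RR = risk in row 1 divided by risk in row 2."""
    return (a / (a + b)) / (c / (c + d))

def relative_odds(a, b, c, d):
    """RO (odds ratio) = odds in row 1 divided by odds in row 2 = ad/bc."""
    return (a * d) / (b * c)

# Rows: high SES, low SES; columns: have asthma, no asthma
rr = relative_risk(30, 130, 40, 100)
ro = relative_odds(30, 130, 40, 100)
print(f"RR = {rr:.2f}, RO = {ro:.2f}")
print(f"Risk reduction in high SES: {100 * (1 - rr):.0f}%")
```

Because asthma is not rare in this sample (70 of 300), the RO of 0.58 is noticeably further from 1 than the RR of 0.66.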

Confidence Interval for Relative Risk
This CI looks a little different from the usual form
It is calculated on the log scale
Distribution of possible RR values is skewed
  Can't be less than zero
  Can be extremely large positive values
Usually transformed back when presented
Calculated automatically by SAS
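The transcript does not preserve the interval formula itself. The sketch below assumes the standard large-sample method: build the interval for ln(RR) using SE = sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)), then exponentiate. It is written in Python rather than SAS, and `relative_risk_ci` is a hypothetical name.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """RR with a CI computed on the log scale and transformed back."""
    rr = (a / (a + b)) / (c / (c + d))
    # Assumed standard error of ln(RR) (usual large-sample formula)
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Asthma by SES table: a = 30, b = 130, c = 40, d = 100
rr, rr_lower, rr_upper = relative_risk_ci(30, 130, 40, 100)
print(f"RR = {rr:.4f}, 95% CI = ({rr_lower:.4f}, {rr_upper:.4f})")
```

Under this assumption the limits agree with the "Cohort (Col1 Risk)" line of the SAS output shown later in these slides.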

Confidence Interval for Odds Ratio
Similar to CI for relative risk
Can calculate by hand easily
SAS calculates automatically
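A hand calculation for the odds ratio interval might look like the following. The formula is not shown in the transcript, so this Python sketch assumes the common log-scale method with SE of ln(RO) = sqrt(1/a + 1/b + 1/c + 1/d); `odds_ratio_ci` is an illustrative name.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a CI computed on the log scale and transformed back."""
    ro = (a * d) / (b * c)
    # Assumed standard error of ln(RO): square root of the sum of reciprocal cell counts
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ro) - z * se_log)
    upper = math.exp(math.log(ro) + z * se_log)
    return ro, lower, upper

# Asthma by SES table: a = 30, b = 130, c = 40, d = 100
ro, ro_lower, ro_upper = odds_ratio_ci(30, 130, 40, 100)
print(f"RO = {ro:.4f}, 95% CI = ({ro_lower:.4f}, {ro_upper:.4f})")
```

Under this assumption the limits agree with the "Case-Control (Odds Ratio)" line of the SAS output shown later in these slides.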

Notes About CI for RR and RO
Confidence intervals are not symmetric around the point estimate
Cannot use RR ± SE notation
Point estimate: 0.66, 95% CI (0.43 – 0.99)
  Lower limit is 0.23 below the point estimate
  Upper limit is 0.33 above the point estimate

Odds Ratio Property
              Have Asthma   No Asthma   Total
High SES           30           130       160
Low SES            40           100       140
Total              70           230
RO Asthma (High v Low SES): RO = (30/130) / (40/100) = 0.58
RO High SES (Asthma v No Asthma): RO = (30/40) / (130/100) = 0.58
Same answer – not true for RR
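This property can be checked directly by computing both measures in both directions. The Python sketch below is illustrative only; the variable names are made up.

```python
# Cell counts: rows are high/low SES, columns are asthma/no asthma
a, b, c, d = 30, 130, 40, 100

# Odds ratio of asthma (high v low SES) and of high SES (asthma v no asthma)
ro_asthma = (a / b) / (c / d)
ro_ses = (a / c) / (b / d)

# Relative risk computed in the same two directions
rr_asthma = (a / (a + b)) / (c / (c + d))
rr_ses = (a / (a + c)) / (b / (b + d))

print(f"RO: {ro_asthma:.4f} vs {ro_ses:.4f}")
print(f"RR: {rr_asthma:.4f} vs {rr_ses:.4f}")
```

Both odds ratios reduce to ad/bc, while the two relative risks use different denominators and so differ.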

Cohort Versus Case-Control Study
Cohort (Prospective)
  Find a population of low SES persons and a population of high SES persons
  For each person determine if he/she has asthma
Case-Control (Retrospective)
  Find a group of persons with asthma and a group of persons without asthma
  Determine if each person is of low or high SES

What Can Be Estimated
Cohort (Prospective)
  Can estimate probability of asthma for both low and high SES groups
  Can compute relative risk of asthma (high versus low SES)
Case-Control (Retrospective)
  Cannot estimate probability of asthma
  Cannot estimate relative risk of asthma (high versus low SES)
  Can estimate relative odds (high versus low SES)
  If the disease is fairly rare, the RO approximates the RR

Cohort Versus Case-Control
Cohort (Prospective)
  May not be possible to do
Case-Control (Retrospective)
  May be the only way to assess risk factors for disease

USING SAS
DATA asthma;
INFILE DATALINES;
INPUT ses asthma count;
/* Insert the 2 x 2 table; the variable count contains the number in each cell */
DATALINES;
1 1 30
1 2 130
2 1 40
2 2 100
;
PROC FREQ DATA=asthma;
TABLES ses*asthma/CHISQ RELRISK;  /* CHISQ: get chi-square value; RELRISK: get RR and RO */
WEIGHT count;                     /* Very important statement! */
RUN;

Table of ses by asthma
ses       asthma
Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |     30 |    130 |    160
         |  10.00 |  43.33 |  53.33
         |  18.75 |  81.25 |
         |  42.86 |  56.52 |
---------+--------+--------+
       2 |     40 |    100 |    140
         |  13.33 |  33.33 |  46.67
         |  28.57 |  71.43 |
         |  57.14 |  43.48 |
---------+--------+--------+
Total          70      230      300
            23.33    76.67   100.00
Ho: p1 = p2
p̂1 = 30/160 = 0.1875
p̂2 = 40/140 = 0.2857
Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      4.0262    0.0448
Likelihood Ratio Chi-Square    1      4.0234    0.0449
Continuity Adj. Chi-Square     1      3.4959    0.0615
Mantel-Haenszel Chi-Square     1      4.0128    0.0452

Table of ses by asthma
ses       asthma
Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |     30 |    130 |    160
         |  10.00 |  43.33 |  53.33
         |  18.75 |  81.25 |
         |  42.86 |  56.52 |
---------+--------+--------+
       2 |     40 |    100 |    140
         |  13.33 |  33.33 |  46.67
         |  28.57 |  71.43 |
         |  57.14 |  43.48 |
---------+--------+--------+
Total          70      230      300
            23.33    76.67   100.00
Row1 = High SES, Row2 = Low SES, Col1 = Have asthma
Ho: RR = 1 or RO = 1
RR = 0.1875/0.2857 = 0.6563
RO = (30/130) / (40/100) = 0.5769
Estimates of the Relative Risk (Row1/Row2)
Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      0.5769        0.3361        0.9904
Cohort (Col1 Risk)             0.6563        0.4331        0.9943
Cohort (Col2 Risk)             1.1375        1.0003        1.2935

Adjusting for Other Factors
Other factors must be categorical
Estimated RR and RO are pooled estimates across all combinations of adjustment variables
Analysis is called Mantel-Haenszel χ²
PROC FREQ DATA=asthma;
WEIGHT count;
TABLES gender*ses*asthma/CHISQ CMH;
RUN;
In the TABLES statement: gender = adjustment variable(s), ses = risk factor of interest, asthma = dependent variable

Adjusting for Other Factors
Perhaps some or all of the SES/asthma relationship is due to sex/gender
PROC FREQ DATA=asthma;
WEIGHT count;
TABLES gender*ses*asthma/CHISQ CMH;
RUN;
In the TABLES statement: gender = adjustment variable(s), ses = risk factor of interest, asthma = dependent variable

USING SAS
DATA asthma;
INFILE DATALINES;
INPUT gender ses asthma count;
/* First four data lines (gender = 1): 2 x 2 table for men */
/* Last four data lines (gender = 2): 2 x 2 table for women */
DATALINES;
1 1 1 10
1 1 2 70
1 2 1 10
1 2 2 30
2 1 1 20
2 1 2 60
2 2 1 30
2 2 2 70
;
PROC FREQ DATA=asthma;
TABLES gender*ses*asthma/CHISQ CMH;
WEIGHT count;
RUN;

Analysis for men
Table of ses by asthma
ses       asthma
Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |     10 |     70 |     80
         |   8.33 |  58.33 |  66.67
         |  12.50 |  87.50 |
         |  50.00 |  70.00 |
---------+--------+--------+
       2 |     10 |     30 |     40
         |   8.33 |  25.00 |  33.33
         |  25.00 |  75.00 |
         |  50.00 |  30.00 |
---------+--------+--------+
Total          20      100      120
            16.67    83.33   100.00
Estimates of the Relative Risk (Row1/Row2)
Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      0.4286        0.1616        1.1366
Cohort (Col1 Risk)             0.5000        0.2269        1.1018

Analysis for women
Table of ses by asthma
ses       asthma
Frequency|
Percent  |
Row Pct  |
Col Pct  |       1|       2|  Total
---------+--------+--------+
       1 |     20 |     60 |     80
         |  11.11 |  33.33 |  44.44
         |  25.00 |  75.00 |
         |  40.00 |  46.15 |
---------+--------+--------+
       2 |     30 |     70 |    100
         |  16.67 |  38.89 |  55.56
         |  30.00 |  70.00 |
         |  60.00 |  53.85 |
---------+--------+--------+
Total          50      130      180
            27.78    72.22   100.00
Estimates of the Relative Risk (Row1/Row2)
Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      0.7778        0.4010        1.5087
Cohort (Col1 Risk)             0.8333        0.5139        1.3513

Pooled Analyses
Estimates of the Common Relative Risk (Row1/Row2)
Type of Study     Method              Value     95% Confidence Limits
-------------------------------------------------------------------------
Case-Control      Mantel-Haenszel    0.6491       0.3749       1.1241
  (Odds Ratio)    Logit              0.6443       0.3725       1.1147
Cohort            Mantel-Haenszel    0.7222       0.4794       1.0879
  (Col1 Risk)     Logit              0.7251       0.4801       1.0951
Breslow-Day Test for Homogeneity of the Odds Ratios
------------------------------
Chi-Square        0.9880
DF                     1
Pr > ChiSq        0.3202
The Breslow-Day test tests if the odds ratio is the same for men and women
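The Mantel-Haenszel point estimates in this output can be reproduced from the two gender-specific tables. The following is a Python sketch using the standard Mantel-Haenszel weighting (each stratum's cross-products divided by its total); it does not reproduce the confidence limits or the Breslow-Day test, and `mantel_haenszel` is an illustrative name.

```python
def mantel_haenszel(strata):
    """Pooled odds ratio and relative risk across a list of (a, b, c, d) tables."""
    or_num = or_den = rr_num = rr_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        or_num += a * d / n          # weighted ad
        or_den += b * c / n          # weighted bc
        rr_num += a * (c + d) / n    # row 1 events times row 2 total
        rr_den += c * (a + b) / n    # row 2 events times row 1 total
    return or_num / or_den, rr_num / rr_den

# One (a, b, c, d) table per gender; rows are high/low SES, columns asthma/no asthma
men = (10, 70, 10, 30)
women = (20, 60, 30, 70)
or_mh, rr_mh = mantel_haenszel([men, women])
print(f"Mantel-Haenszel OR = {or_mh:.4f}, RR = {rr_mh:.4f}")
```

The pooled estimates fall between the gender-specific values (0.4286 and 0.7778 for the odds ratio), as expected for a weighted combination.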

Class Exercise
Among the 668 patients in TOMHS randomized to active treatment, 74 experienced a CVD event during the study. Among the 234 patients randomized to placebo, 38 had a CVD event.
Compute the RR of CVD, for active versus placebo
Compute the RO of CVD, for active versus placebo
Use SAS to create the 2 x 2 table
Using SAS compute the χ² statistic
Using SAS compute the 95% CI for the RR and RO above