Categorical Data Analysis: Stratified Analyses, Matching, and Agreement Statistics
Biostatistics 510, 13-15 March 2007
Carla Talarico

Overview
Variable stratification
Cochran-Mantel-Haenszel (CMH) statistics
Matching and matched data
Agreement statistics
– McNemar’s Test
– Cohen’s Kappa

Stratification by a Third Variable
E: exposure of interest
D: disease outcome
C: third variable, e.g., confounder
[Diagram relating E, D, and C; the relationship in question is marked “?”]

Confounding
The estimated effect of exposure on disease may be distorted by a third variable (a “confounder”)
Reflects the fact that epidemiologic research is conducted among humans with unevenly distributed characteristics
Results from a lack of comparability between the exposed and unexposed groups in the base population

Controlling for Confounding
Design phase of studies
– Randomization in experimental studies
– Restriction
– Matching
Analysis phase
– Stratified analysis
– Model fitting

Stratified Analyses: The CMH Option in SAS
Gives a stratified statistical analysis of the relationship between Exposure (E) and Disease (D), after controlling for a Confounder (C):
    proc freq;
      tables C * E * D / cmh;
    run;
Can simultaneously stratify by multiple confounders (here C1 and C2):
    proc freq;
      tables C1 * C2 * E * D / cmh;
    run;

Estimates of Common Relative Risk for 2x2 Tables
Adjusted odds ratio (OR) and relative risk (RR) for stratified 2x2 tables, with 95% confidence limits (CL)
Obtain OR and RR estimates for the association between Exposure and Disease, adjusted for the Confounder
For this course, report the Mantel-Haenszel estimate of the common odds ratio, OR_MH
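The Mantel-Haenszel common odds ratio can be worked by hand as OR_MH = Σ(a_i d_i / n_i) / Σ(b_i c_i / n_i), summing over strata. The sketch below is in Python rather than SAS, purely as an illustration. The function name, the cell layout (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without), and the counts are all hypothetical and do not come from the course data.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio for a list of 2x2 tables.

    Each stratum is (a, b, c, d): a = exposed with disease,
    b = exposed without, c = unexposed with disease, d = unexposed without.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts for two strata of a confounder (illustration only).
strata = [(10, 20, 5, 20), (6, 3, 10, 10)]

# Crude OR: collapse the strata into one table and ignore the confounder.
a, b, c, d = (sum(cells) for cells in zip(*strata))
crude_or = (a * d) / (b * c)

adjusted_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, OR_MH = {adjusted_or:.2f}")
```

In this made-up example the crude and adjusted estimates differ, which is the pattern a stratified analysis is meant to reveal.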

Breslow-Day Test for Homogeneity of the Odds Ratios
For stratified 2x2 tables
Null hypothesis is that the ORs are equal across all strata
– Under H0, the test statistic follows a χ² distribution with q – 1 df, where q is the number of strata
Alternative hypothesis is that at least one stratum-specific OR differs from the other stratum-specific ORs

χ²_BD (con’t)
If H0 is rejected by the χ²_BD test:
– There is evidence of heterogeneity of the ORs across strata; it is not appropriate to report the adjusted common OR
– Report the stratum-specific ORs when effect modification is present
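As a rough illustration of what the Breslow-Day statistic measures, the Python sketch below (not SAS) compares each stratum's observed exposed-and-diseased count with the count expected if every stratum shared the Mantel-Haenszel odds ratio. It assumes the unadjusted form of the statistic, without Tarone's correction. The function names, cell layout, and counts are hypothetical.

```python
import math


def mantel_haenszel_or(strata):
    """Common odds ratio; each stratum is (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def expected_a(a, b, c, d, psi):
    """Expected count in cell a given the table margins and odds ratio psi."""
    n1, n0, m1 = a + b, c + d, a + c
    if abs(psi - 1.0) < 1e-12:
        return n1 * m1 / (n1 + n0)
    # Solve A(n0 - m1 + A) = psi (n1 - A)(m1 - A) for A.
    qa = 1.0 - psi
    qb = (n0 - m1) + psi * (n1 + m1)
    qc = -psi * n1 * m1
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    lo, hi = max(0, m1 - n0), min(n1, m1)
    for root in ((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)):
        if lo <= root <= hi:  # keep the root that gives valid cell counts
            return root
    raise ValueError("no admissible root for this stratum")


def breslow_day(strata):
    """Return (statistic, df) for the unadjusted Breslow-Day test."""
    psi = mantel_haenszel_or(strata)
    stat = 0.0
    for a, b, c, d in strata:
        n1, n0, m1 = a + b, c + d, a + c
        ea = expected_a(a, b, c, d, psi)
        eb, ec, ed = n1 - ea, m1 - ea, n0 - m1 + ea
        var = 1.0 / (1 / ea + 1 / eb + 1 / ec + 1 / ed)
        stat += (a - ea) ** 2 / var
    return stat, len(strata) - 1


# Hypothetical strata with clearly different odds ratios (2.0 and 0.5).
stat, df = breslow_day([(10, 20, 5, 20), (3, 6, 10, 10)])
print(f"Breslow-Day chi-square = {stat:.3f} on {df} df")
```

When all stratum-specific ORs are identical the statistic is zero; the larger it is relative to a χ² with q – 1 df, the stronger the evidence of heterogeneity.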

CMH Statistic 1: Nonzero Correlation
Tests the null hypothesis of no association vs. the alternative hypothesis that there is a linear association between the row and column variables in at least one stratum
Both row and column variables have to be ordinal
Under H0, the statistic ~ χ² with 1 df

CMH Statistic 2: Row Mean Scores Differ
Tests the null hypothesis of no association vs. the alternative hypothesis that the mean scores of the table rows are unequal for at least one stratum
Useful only when the column variable is ordinal
Under H0, the statistic ~ χ² with (r – 1) df, where r is the number of rows

CMH Statistic 3: General Association
Tests the null hypothesis of no association vs. the alternative hypothesis that there is some kind of association between the row and column variables for at least one stratum
Does not require the row or column variable to be ordinal
Under H0, the statistic ~ χ² with (r – 1)(c – 1) df, where r and c are the numbers of rows and columns
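For stratified 2x2 tables all three CMH statistics have 1 df and reduce to the same value. The Python sketch below (an illustration, not the SAS implementation) computes that value as (Σa_i – ΣE[a_i])² / ΣVar(a_i), using the hypergeometric mean and variance of cell a in each stratum and no continuity correction. The function name, cell layout, and counts are hypothetical.

```python
import math


def cmh_2x2(strata):
    """CMH chi-square (1 df) for a list of 2x2 tables given as (a, b, c, d)."""
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        n1, n0 = a + b, c + d          # row totals (exposed, unexposed)
        m1, m0 = a + c, b + d          # column totals (disease, no disease)
        observed += a
        expected += n1 * m1 / n
        variance += n1 * n0 * m1 * m0 / (n * n * (n - 1))
    stat = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square with 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for two strata (illustration only).
stat, p_value = cmh_2x2([(10, 20, 5, 20), (6, 3, 10, 10)])
print(f"CMH chi-square = {stat:.3f}, p = {p_value:.3f}")
```

With a single stratum the statistic equals the Pearson chi-square multiplied by (n – 1)/n, which gives a quick sanity check on the arithmetic.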

Matching
Control for confounding more efficiently than if the matching had not been performed
Done in the design phase of a study
Gain statistical efficiency in effect estimation

Matching (con’t)
Select comparison participants into a study such that they are the same (or nearly the same) on certain variable(s)
A matched design requires a matched analysis
Once you match on a variable, the effect of that variable cannot be estimated in your data set

Matched Data and the AGREE Option in SAS
AGREE option computes tests and measures of agreement for square tables (where the number of rows equals the number of columns)
    title "McNemar's Test for highchol and hibmi for pill and non-pill";
    proc freq data=pairs;
      tables hichol1*hichol2 hibmi1*hibmi2 / agree norow nocol;
    run;

AGREE Option in SAS
AGREE option generates:
– McNemar’s Test
– Kappa
– Weighted Kappa

McNemar’s Test of Symmetry for Matched Samples
For 2x2 tables
Appropriate when you have data from matched pairs of subjects with a dichotomous (yes/no) outcome
Null hypothesis of marginal homogeneity
– Werner data set of matched pairs, comparing the proportion with high cholesterol among women who take the birth control pill to the proportion with high cholesterol among women who do not take the pill
Under H0, the test statistic follows a χ² distribution with 1 df

McNemar’s Test for Matched Proportions
Werner data set with age-matched pairs
[Table: frequency and percent of pairs, No Pill: High Chol (rows) by Pill: High Chol (columns), with totals; cell values not preserved in the transcript]
χ²_M = (21 – 23)² / (21 + 23) = 4/44 ≈ 0.09
There are 92 pairs
…% of the NoPill group have high chol
…% of the Pill group have high chol
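McNemar's statistic uses only the discordant pairs, here the 21 and 23 that appear in the slide's formula. The Python sketch below (an illustration, not SAS output) reproduces that arithmetic and adds a p-value from the χ² distribution with 1 df. The function name is hypothetical, and no continuity correction is applied, matching the formula on the slide.

```python
import math


def mcnemar(n_discordant_1, n_discordant_2):
    """McNemar chi-square (1 df) from the two discordant-pair counts."""
    diff = n_discordant_1 - n_discordant_2
    stat = diff ** 2 / (n_discordant_1 + n_discordant_2)
    # Upper tail of a chi-square with 1 df, via the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Discordant counts taken from the slide's formula: (21 - 23)^2 / (21 + 23).
stat, p_value = mcnemar(21, 23)
print(f"McNemar chi-square = {stat:.4f}, p = {p_value:.3f}")
```

A statistic this small gives little evidence against marginal homogeneity for these pairs.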

Simple Kappa Coefficient (Cohen’s Kappa)
Measure of inter-rater agreement, corrected for chance
Scale from -1 to +1
– Κ = +1 when there is perfect agreement
– Κ = 0 when the agreement equals that expected by chance
Magnitude of Kappa reflects the strength of the agreement, beyond chance
Κ = (P0 – Pe) / (1 – Pe), where P0 is the observed proportion of agreement and Pe is the proportion of agreement expected by chance

Cohen’s Kappa (con’t)
SAS gives 95% CI for Kappa
Kappa Guidelines (Landis and Koch)

Kappa Statistic    Strength of Agreement
< 0.00             Poor
0.00 – 0.20        Slight
0.21 – 0.40        Fair
0.41 – 0.60        Moderate
0.61 – 0.80        Substantial
0.81 – 1.00        Almost perfect
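The kappa formula Κ = (P0 – Pe) / (1 – Pe) and the Landis and Koch labels can be checked on a small square table. The Python sketch below is an illustration only, not the SAS AGREE computation, and it produces no confidence interval. The function names and the ratings table are hypothetical, and treating each band's upper bound as inclusive is an assumption about values that fall between the printed bands (for example 0.205).

```python
def cohens_kappa(table):
    """Simple kappa for a square table of counts (rater 1 in rows, rater 2 in columns)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch_label(kappa):
    """Strength-of-agreement label; band upper bounds treated as inclusive (assumption)."""
    if kappa < 0.0:
        return "Poor"
    for upper, label in [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"),
                         (0.80, "Substantial")]:
        if kappa <= upper:
            return label
    return "Almost perfect"


# Hypothetical 2x2 ratings table (illustration only).
ratings = [[20, 5],
           [10, 15]]
kappa = cohens_kappa(ratings)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```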

Good Resources for Categorical Data Analysis and SAS
SAS: Categorical Data Analysis Using The SAS System, by Maura E. Stokes, Charles S. Davis, and Gary G. Koch. 2nd Ed., SAS Institute Inc., Cary, NC.
See pages … of Biostat 510 course pack
Kappa: “The Measurement of Observer Agreement for Categorical Data,” by J. Richard Landis and Gary G. Koch. Biometrics 33(1): …, 1977