Lecture 13: Case-control studies: introduction to matching


Lecture 13: Case-control studies: introduction to matching Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II Department of Public Health Sciences Medical University of South Carolina Spring 2015

Matching: overview. Control confounding in the design stage. Residual confounding may occur with matched variables, or with other covariates. Matched analysis: pairs of subjects with the same exposure status (in a case-control study) are non-informative, so only pay attention to discordant pairs.

Matching (definition). Matching refers to the selection of a reference series (unexposed subjects in a cohort study or controls in a case-control study) that is identical, or nearly so, to the index series with respect to the distribution of one or more potentially confounding factors. (Rothman and Greenland, Modern Epidemiology, 1998)

Matching. Matching is most commonly done in case-control studies: cases (usually those with the disease of interest) are matched with controls (those without the disease) based on the value of a certain variable. It can also be employed in other studies: exposed members of a cohort are matched with unexposed members and followed longitudinally.

Types of Matching. Individual matching: one or more reference subjects with matching-factor values equal to those of the index subject. Frequency matching: selection of an entire stratum of reference subjects with matching-factor values equal to those of a stratum of index subjects.

Advantages of Matching. Controls for factors that are unknown or unmeasurable: matching siblings can control for many genetic and environmental factors. Improves precision (study validity). Convenience: narrows down a large number of controls. Can reduce sampling variability: less heterogeneity across paired subjects. Can increase statistical power. Can improve efficiency by reducing the sample size needed for a study.

Disadvantages of Matching. Lose the ability to assess the matched variable as a risk factor. Additional time and expense. The decision is irreversible. Requires special analytic techniques. If the matching variable is not an independent risk factor for the outcome, matching is wasteful. If the matching variable is not an independent risk factor for the outcome but is associated with the risk factor, matching is wasteful and inefficient. Potentially alters the distribution of the risk factor.

Elements of Matching. The unit of analysis is the pair of components with something in common, for example twins, siblings, or neighborhood controls. The two components are selected to have different exposures (RCT or cohort study) or different outcomes (case-control study).

Elements of Matching. Matched pairs can be used in different study designs: cross-sectional, cohort, case-control, randomized clinical trial.

Elements of Matching. Same-subject matching (within subject): data pairs could be measured on the same subject. Examples: compare skin-grafting techniques by applying each treatment to the same person; compare intra-ocular pressure in both eyes of one subject, after giving each eye a different treatment. Notice that the data structure may bear a superficial resemblance to multi-level analysis: both analytic strategies take "groups" into account.

Overmatching. The factor to be matched is (partially or wholly) on the causal pathway between the risk factor and the outcome. Example. Study goal: assess the association between alcohol consumption and heart disease (case-control study). Matching variable: sleep apnea (want to control for conditions that may influence oxygenation). If sleep apnea is a condition that occurs as a result of alcohol consumption and is a risk factor for heart disease, then matching forces cases and controls to have a more similar distribution of alcohol consumption and accordingly attenuates the odds ratio. The true association between alcohol consumption and heart disease cannot be assessed.

Example 1: Number of Sick Days. Twin study (children): 10 pairs of twins. One twin is immunized; the other twin is not. Subjects are followed through the school year. This is a matched study: each pair of twins is a matched set for the clinical trial.

Example 1: Number of Sick Days. [Table with columns Pair, Control, Treated, Control - Treated; one row per twin pair (pairs 1 to 10).]

Example 1: Number of Sick Days. Mean of column 3 (Control - Treated): $\bar{d} = 1.5$ days. Standard deviation: $s_d = \sqrt{\sum_{i=1}^{n} (d_i - \bar{d})^2 / (n - 1)} = 1.78$ days.

Example 1: Number of Sick Days. Standard error: $SE = s_d / \sqrt{n} = 1.78 / \sqrt{10} = 0.56$ days. 95% confidence interval: 1.5 ± 1.96 × 0.56, that is, 1.5 (0.40, 2.60) more sick days in the control group. Paired t-test: p = 0.026.
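The paired calculation can be reproduced from the summary statistics alone. The following is a minimal Python sketch, assuming only the values reported on the slides (mean difference 1.5, standard deviation 1.78, 10 pairs). The function name `paired_summary` is hypothetical, and the interval uses the normal multiplier 1.96 as on the slide rather than a t critical value.

```python
import math

def paired_summary(mean_diff, sd_diff, n_pairs, z=1.96):
    """Standard error, test statistic, and CI for a mean paired difference."""
    se = sd_diff / math.sqrt(n_pairs)       # standard error of the mean difference
    t_stat = mean_diff / se                 # refer to a t distribution with n_pairs - 1 df
    ci = (mean_diff - z * se, mean_diff + z * se)
    return se, t_stat, ci

# Summary values taken from the sick-days example.
se, t_stat, ci = paired_summary(1.5, 1.78, 10)
print(f"SE = {se:.2f}, t = {t_stat:.2f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The test statistic would be compared with a t distribution on 9 degrees of freedom to obtain the p-value quoted on the slide.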

Example 1: Unmatched analysis? Control mean = 4.6 days. Treated mean = 3.1 days. Difference between means: estimated benefit = 1.5 days, 95% CI (-0.47, 3.47). Two-sample t-test: p = 0.145. [Table of Control and Treated sick-day counts.]

Example 1: Number of Sick Days. Conclusion: a study based on matched pairs can be more powerful than an unmatched study, with tighter confidence limits and an easier route to statistical significance. In this case, the unmatched analysis did not result in bias (we still estimated 1.5 fewer sick days in the treated group).

Statistical Methods for Analyzing Matched Data. Unmatched study: one entry for each subject.

                 Exposure
                 E      Ē
Outcome   D      a      b
          D̄      c      d

Statistical Methods for Analyzing Matched Data. Matched cohort study: one entry for each pair.

                   Exposed
                   D      D̄
Unexposed   D      a      b
            D̄      c      d

Statistical Methods for Analyzing Matched Data. Matched case-control study: one entry for each pair.

              Control
              E      Ē
Case   E      a      b
       Ē      c      d

Statistical Methods for Analyzing Matched Data. Matched case-control study: matched pair both exposed, add 1 to a; case exposed and control unexposed, add 1 to b; control exposed and case unexposed, add 1 to c; matched pair both unexposed, add 1 to d. Only discordant pairs (in cells "b" and "c") give useful information.
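The counting rule for the pair table can be sketched in a few lines of Python. This is only an illustration: the function name `tally_pairs`, the tuple layout (case exposure first, control exposure second), and the small example list are assumptions and are not taken from the lecture data.

```python
from collections import Counter

def tally_pairs(pairs):
    """Count matched case-control pairs into cells a, b, c, d.

    Each pair is (case_exposed, control_exposed), coded 1 = exposed, 0 = unexposed.
    """
    cell_for = {(1, 1): "a", (1, 0): "b", (0, 1): "c", (0, 0): "d"}
    counts = Counter(cell_for[(case, control)] for case, control in pairs)
    return {cell: counts.get(cell, 0) for cell in "abcd"}

# Hypothetical illustration data, not from the lecture.
example_pairs = [(1, 1), (1, 0), (1, 0), (0, 1), (0, 0), (1, 0)]
cells = tally_pairs(example_pairs)
print(cells)
```

Only the "b" and "c" entries of the result feed into the matched odds ratio.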

Statistical Methods for Analyzing Matched Data. Discordant pairs → estimate the odds ratio! Let p1 = probability of exposure in cases and p0 = probability of exposure in controls. Then the probability of cell "b" = p1(1-p0) and the probability of cell "c" = p0(1-p1), so the ratio b/c estimates p1(1-p0) / [p0(1-p1)], which is the odds ratio.

Statistical Methods for Analyzing Matched Data. Formula for the standard error of the log odds ratio: $SE(\ln OR) = \sqrt{1/b + 1/c}$. 95% confidence limits: lnOR ± 1.96(SE(lnOR)) (must then exponentiate the confidence limits of lnOR to obtain the 95% CI for the odds ratio).

Statistical Methods for Analyzing Matched Data. McNemar's test: chi-squared statistic for matched data, with 1 degree of freedom.
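The slide does not show the formula for McNemar's statistic, so the sketch below assumes the usual textbook form, (b - c)² / (b + c), with an optional continuity-corrected variant, (|b - c| - 1)² / (b + c). The function name `mcnemar` is hypothetical. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal. The counts in the usage line are the discordant pairs from the preterm-birth example later in the lecture.

```python
import math

def mcnemar(b, c, continuity=False):
    """McNemar chi-squared statistic (1 df) and p-value from discordant pair counts."""
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)             # continuity-corrected variant
    stat = diff ** 2 / (b + c)
    # Upper tail of chi-squared with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = mcnemar(31, 14)
print(f"chi-squared = {stat:.2f}, p = {p_value:.3f}")
```

Which variant the lecturer used is not stated, so both are offered here.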

Example 2: Matched Case-Control Study. Research question: is there an association between the amount of time a mother spends on her feet during a pregnancy and the likelihood of preterm birth? Study sample: 223 matched case-control pairs of women who had given birth at a local hospital, 1992-93. Disease: 1 = preterm birth (<37 weeks gestation), 0 = no preterm birth. Exposure: 1 = mother's work required standing, 0 = mother's work did not require standing.

Example 2 (continued). Matching: each case (disease = 1) was matched with a control (disease = 0) on the basis of maternal age (difference < 3 years) and parity (1, 0). Four possible exposure combinations of matched pairs: both case and control exposed; case exposed and control non-exposed; case non-exposed and control exposed; both case and control non-exposed.

A Look at the Raw Data. [Table with columns ID / Matched Pair, Preterm Birth (case/control), Age, Parity, Work Standing (exposure); two rows per matched pair.]

Example 2 (continued)

                        Control (Standing)   Control (Not Standing)
Case (Standing)                147                    31
Case (Not Standing)             14                    31

Note: Relevant information is confined to discordant pairs. The OR cannot be estimated in studies in which all matched pairs have the same level of exposure.

Example 2 (continued). Odds ratio for a matched case-control study: the ratio of the number of positive to negative discordant pairs. 31 pairs (in which the exposed member experienced the outcome and the non-exposed member did not); 14 pairs (in which these outcomes were reversed). Odds ratio = 31/14 = 2.21.

Example 2 (continued). Approximate 95% confidence interval: take two standard errors on each side of the estimated log odds ratio, then exponentiate the result (take the anti-logarithm). The confidence interval ranges from 1.16 to 4.22. Conclusion: standing is associated with preterm birth. Note: McNemar's chi-square statistic provides the p-value for testing the null hypothesis of no association between exposure and disease.
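The interval for the matched odds ratio can be checked numerically. This is a minimal Python sketch, assuming the standard error of the log odds ratio is sqrt(1/b + 1/c) for discordant counts b and c. The function name `matched_odds_ratio` is hypothetical. The multiplier is a parameter because this slide uses two standard errors while the general formula in the lecture uses 1.96.

```python
import math

def matched_odds_ratio(b, c, z=1.96):
    """Matched-pair odds ratio b/c with a confidence interval built on the log scale."""
    odds_ratio = b / c
    se_log = math.sqrt(1 / b + 1 / c)       # SE of ln(OR) from discordant pairs only
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log)   # exponentiate back to the OR scale
    upper = math.exp(log_or + z * se_log)
    return odds_ratio, lower, upper

# Discordant pairs from the preterm-birth example.
or_2se = matched_odds_ratio(31, 14, z=2.0)   # multiplier of 2, as on this slide
or_196 = matched_odds_ratio(31, 14)          # multiplier of 1.96
print(or_2se)
print(or_196)
```

With the multiplier of 2 the limits agree with the slide to two decimals; with 1.96 the interval is slightly narrower.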

Example 2: What happens if you ignore the matching? Four possible combinations of matched pairs: (1) unexposed control, unexposed case; (2) unexposed control, exposed case; (3) exposed control, unexposed case; (4) exposed control, exposed case. Collapsing over pairs gives the unmatched table:

            Unexposed   Exposed
Control       1 + 2      3 + 4
Case          1 + 3      2 + 4

Unmatched Data from the same study

            Standing   Not Standing
Preterm       178           45
Term          161           62

OR = (178 × 62) / (45 × 161) = 1.52; 95% CI = 0.98, 2.36
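The step from the pair table to the unmatched table, and the unmatched odds ratio, can be sketched as follows. The function names are hypothetical, and the confidence interval assumes the usual log-scale (Woolf) standard error, sqrt(1/a + 1/b + 1/c + 1/d), which the slide does not state explicitly. The both-unexposed pair count of 31 is inferred here from the 223 pairs and the unmatched margins.

```python
import math

def collapse_pairs(a, b, c, d):
    """Collapse a pair-level table into unmatched 2x2 counts.

    a: both exposed; b: only case exposed; c: only control exposed; d: neither.
    Returns (cases exposed, cases unexposed, controls exposed, controls unexposed).
    """
    return a + b, c + d, a + c, b + d

def unmatched_odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Cross-product odds ratio with a log-scale confidence interval."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    se_log = math.sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Pair counts from the preterm-birth example (d = 31 inferred from the totals).
counts = collapse_pairs(147, 31, 14, 31)
result = unmatched_odds_ratio(*counts)
print(counts)
print(result)
```

Comparing this result with the matched estimate of 31/14 shows how much the pairing information changes the answer.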

When Matching is ignored? There is a noticeable difference between the matched and unmatched analyses. Matched: OR = 2.21 (1.16, 4.22). Unmatched: OR = 1.52 (0.98, 2.36). The unmatched analysis ignores any correlation in exposure status between the case and control in the matched pair. If this correlation is substantial, then the unmatched analysis gives a biased result.

Should a matched analysis always be used for matched data? If there is no evidence of a correlation within pairs, should you still proceed with a matched analysis? NOT NECESSARILY. Matched analyses can give an unstable result if the sample size is too small.

Example 3: Matched Cohort Study. Research question: is there an association between vasectomy and myocardial infarction? Study sample: 4830 exposed-unexposed pairs of men. Matching variables: age (5-year band), current smoking status (yes/no). Disease outcome: 1 = MI, 0 = no MI. Exposure: 1 = vasectomy, 0 = no vasectomy.

Example 3: Matched Cohort Study Analysis. The previous odds ratio computational methodology applies to pair-matched cross-sectional or cohort studies with binary outcomes. Matching: each matched pair contains one exposed and one unexposed individual. Four possible outcome combinations of matched pairs: unexposed has no disease / exposed has no disease; unexposed has no disease / exposed has disease; unexposed has disease / exposed has no disease; unexposed has disease / exposed has disease.

Example 3 (continued)

                      No Vasectomy: MI   No Vasectomy: No MI
Vasectomy: MI                 0                   20
Vasectomy: No MI             16                4,794

Note: Relevant information is confined to discordant pairs. The OR cannot be estimated in studies in which all matched pairs have the same disease outcome.

Example 3: OR computation. The odds ratio of a matched cohort study may be estimated by taking the ratio of the number of positive to negative discordant pairs. 20 pairs (in which the exposed member experienced the outcome and the non-exposed member did not); 16 pairs (in which these outcomes were reversed). Odds ratio = 20/16 = 1.25 (95% CI 0.65, 2.41). The interval includes 1, so there is no evidence of an association between having a vasectomy and suffering an MI.

Matched analysis in modeling. Use "conditional logistic regression": it produces a matched OR and confidence interval, controls for confounders, and ignores pairs that are concordant on all variables. Logistic regression in unmatched data is "unconditional logistic regression". Either one of these can be done using PROC LOGISTIC.

Conditional logistic regression in SAS:

proc logistic data=one;
  strata ID;                                    /* one stratum per matched set */
  model outcome(event='1') = expose cov1 cov2;  /* model the odds of outcome = 1 */
run;
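For 1:1 matching with a single binary exposure and no covariates, the conditional likelihood is simple enough to fit by hand, which shows why concordant pairs drop out. The Python sketch below is an illustration under those assumptions, not a substitute for PROC LOGISTIC; the function name `conditional_mle` is hypothetical. Each discordant pair contributes exp(beta) / (1 + exp(beta)) if the case is the exposed member and 1 / (1 + exp(beta)) otherwise, while concordant pairs contribute a constant 1/2.

```python
import math

def conditional_mle(b, c, tol=1e-10, max_iter=50):
    """Newton-Raphson fit of the 1:1 matched conditional logistic model
    with one binary exposure, using only the discordant counts b and c."""
    n = b + c
    beta = 0.0
    for _ in range(max_iter):
        p = 1 / (1 + math.exp(-beta))   # P(case is the exposed member | discordant pair)
        score = b - n * p               # first derivative of the log-likelihood
        info = n * p * (1 - p)          # negative second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    p = 1 / (1 + math.exp(-beta))
    se = 1 / math.sqrt(n * p * (1 - p))  # SE of beta at the fitted value
    return beta, se

beta, se = conditional_mle(31, 14)
print(math.exp(beta), se)
```

The fitted exp(beta) agrees with the ratio of discordant pairs, and the standard error agrees with sqrt(1/b + 1/c), so in this simplest case the model reproduces the hand calculation.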

Summary: reasons to match. Control for confounding in the design phase: nuisance variables, not important predictors you hope to assess. Improve study efficiency: useful if sample size is limited. May clarify or simplify decision-making about control recruitment.

Summary: problems with matching. Risk of over-matching, or unnecessary matching. May add a layer of complexity and difficulty to the study implementation. May be difficult or impossible to find a match for some individuals, and may therefore add expense. Once matching is done, it cannot be undone. Must (usually) use matched analysis.