
Causal Diagrams -- DAGs
[Diagram: nodes C, E, and D; the link from E to D is marked “?”]
– DAGs = directed acyclic graphs; aka chain graphs
– Consist of nodes (variables) and arrows
– “Directed”: all arrows have one-way direction and depict causal relationships
– “Acyclic”: there is never a complete circle (i.e., no factor can cause itself)
– Better than the rough criteria for confounding when planning studies and analyses
– Identifies pitfalls of adjusting and not adjusting for certain variables
– Frontier of epidemiologic theory
– Research Question: Does E cause D?
– Forces investigator to conceptualize system

Use of DAGs to Identify What is Not Confounding
RQ: Does lack of folate intake cause birth defects?
[Diagram: Folate Intake, Birth Defects, and Stillbirths; the link from Folate Intake to Birth Defects is marked “?”]
– Stillbirths are a “common effect” of both the exposure and disease, not a common cause
– Common effects are called “colliders”
– Adjusting for colliders OPENS paths
– Will actually result in bias. It is harmful.
Hernan AJE 2002
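A minimal sketch of why adjusting for a collider introduces bias. All probabilities below are hypothetical and are not taken from the slide or from Hernan's paper: exposure and disease are made independent by construction, and stillbirth risk is assumed to rise with both.

```python
from itertools import product

# Hypothetical population (assumed values, for illustration only).
P_E = 0.5   # P(low folate intake)
P_D = 0.1   # P(birth defect), independent of exposure by construction

def p_stillbirth(e, d):
    """Assumed stillbirth risk: raised by exposure and by disease (a collider)."""
    return 0.02 + 0.10 * e + 0.30 * d

def cell_masses(stillbirth=None):
    """Probability mass of each (exposure, disease) cell,
    optionally restricted to one stillbirth stratum."""
    cells = {}
    for e, d in product((1, 0), repeat=2):
        mass = (P_E if e else 1 - P_E) * (P_D if d else 1 - P_D)
        if stillbirth is not None:
            ps = p_stillbirth(e, d)
            mass *= ps if stillbirth else 1 - ps
        cells[(e, d)] = mass
    return cells

def odds_ratio(cells):
    """Exposure-disease odds ratio from the four cell masses."""
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])

or_crude = odds_ratio(cell_masses())                    # no true association
or_stillbirths = odds_ratio(cell_masses(stillbirth=1))  # stratum: stillbirths
or_livebirths = odds_ratio(cell_masses(stillbirth=0))   # stratum: live births

print(or_crude, or_stillbirths, or_livebirths)
```

Under these assumptions the crude odds ratio is null, while both stillbirth strata show a spurious association, which is the sense in which stratifying on a collider "opens" a path.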

DAGs point out special issue when estimating direct effects
RQ: Does aspirin prevent CHD in a pathway other than through platelet aggregation?
[Diagram 1: Aspirin, Platelet Aggregation, and Coronary Heart Disease; the link from Aspirin to Coronary Heart Disease is marked “?”]
– Assumes no common cause of platelet aggregation and D
– Would be correct to adjust
[Diagram 2: as Diagram 1, plus Genetic factors (not measured)]
– But if we assume a common cause (e.g., genetic component)
– Would be incorrect to adjust OR not to adjust for platelet aggregation
– Need other statistical methods to resolve
Cole and Hernan IJE 2002

Confounding and Interaction: Part III
Methods to reduce confounding
– during study design:
  » Randomization
  » Restriction
  » Matching
– during study analysis:
  » Stratified analysis
    Forming “Adjusted” Summary Estimates
    Concept of weighted average (Woolf’s Method, Mantel-Haenszel Method)
    Handling more than one potential confounder
    Role of an analysis plan
Another design technique: Instrumental variables
Quantitative assessment of unmeasured confounding
Limitations of stratification
– motivation for multivariable regression
Limitations of conventional adjustment
– motivation for other “causal” techniques

Effect-Measure Modification
[Tables: Crude; Stratified by No Caffeine Use and Heavy Caffeine Use]
RR crude = 1.7
RR no caffeine use = 2.4
RR caffeine use = 0.7

. cs delayed smoking, by(caffeine)
(numeric values of the Stata output are not preserved in the transcript)
      caffeine | RR   [95% Conf. Interval]   M-H Weight
    no caffeine |
 heavy caffeine |
          Crude |
   M-H combined |
Test of homogeneity (M-H) chi2(1) =      Pr>chi2 =

Report interaction; confounding is not relevant

Report vs Ignore Effect-Measure Modification? Some Guidelines
Is an art form: requires weighing clinical, statistical, and practical considerations

Does AZT after needlesticks prevent HIV?
[Tables: Crude; Stratified by Minor Severity and Major Severity]
OR crude = 0.61
Stratum-specific (in the order shown on the slide): OR = 0.0, OR = 0.35
Report or ignore interaction?

General Framework for Stratification
Design phase: Create a DAG
– Decide which variables to control for
Implementation phase: measure the confounders (or other variables needed to block path)
Analysis phase: Report Effect-Measure Modification? (assess clinical, statistical, and practical considerations)
– yes: Report stratum-specific estimates
– no: Derive summary “adjusted” estimate, then decide which variables to adjust for in final estimate
  » none: Report crude estimate, 95% CI, p value
  » some: Report adjusted estimate, 95% CI, p value

Assuming Interaction is not Present, Form a Summary of the Unconfounded Stratum-Specific Estimates
Construct a weighted average
– Assign weights to the individual strata
– Summary Adjusted Estimate = Weighted Average of the stratum-specific estimates
– a simple mean is a weighted average where the weights are equal to 1
– which weights to use depends on type of effect estimate desired (OR, RR, RD), characteristics of the data, and goal of research
– e.g., Woolf’s method, Mantel-Haenszel method, Standardization (see text)

Forming a Summary Adjusted Estimate for Stratified Data
[Tables: Crude; Stratified by Minor Severity and Major Severity]
OR crude = 0.61
Stratum-specific (in the order shown on the slide): OR = 0.0, OR = 0.35
How would you weight these strata?

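Summary Estimators: Woolf’s Method
aka directly pooled or precision estimator
Woolf’s estimate for adjusted odds ratio:
$$\mathrm{OR}_{\mathrm{Woolf}} = \exp\!\left(\frac{\sum_i w_i \ln(\mathrm{OR}_i)}{\sum_i w_i}\right)$$
– where $w_i = \left(\frac{1}{a_i} + \frac{1}{b_i} + \frac{1}{c_i} + \frac{1}{d_i}\right)^{-1}$, with $a_i, b_i, c_i, d_i$ the four cell counts of the 2x2 table in stratum $i$ and $\mathrm{OR}_i = a_i d_i / (b_i c_i)$
– $w_i$ is the inverse of the variance of the stratum-specific log(odds ratio)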

Calculating a Summary Effect Using the Woolf Estimator
e.g., AZT use, severity of needlestick, and HIV
[Tables: Crude; Stratified by Minor Severity and Major Severity]
OR crude = 0.61
Stratum-specific (in the order shown on the slide): OR = 0.0, OR = 0.35
Problem: cannot take log of 0; cannot divide by zero

Summary Adjusted Estimator: Woolf’s Method
Conceptually straightforward
Best when:
– number of strata is small
– sample size within each stratum is large
Cannot be calculated when any cell in any stratum is zero because log(0) is undefined
– 1/2 cell corrections have been suggested but are subject to bias
Formulae for Woolf’s summary estimates for other measures (e.g., risk ratio, RD) available in texts and software documentation
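A minimal sketch of Woolf's directly pooled odds ratio as an inverse-variance weighted average on the log scale. The cell layout (a, b, c, d with OR = ad/bc), the function name, and the example counts are assumptions for illustration; they are not the AZT data.

```python
import math

def woolf_or(strata, z=1.96):
    """Woolf (directly pooled) odds ratio with an approximate 95% CI.

    strata: list of (a, b, c, d) cell counts, one 2x2 table per stratum,
    laid out so that the stratum odds ratio is a*d / (b*c).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in strata:
        if 0 in (a, b, c, d):
            # log(0) or division by zero: the estimator is undefined here.
            raise ValueError("Woolf's method is undefined with a zero cell")
        log_or = math.log((a * d) / (b * c))
        weight = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # inverse variance of log OR
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    se = math.sqrt(1.0 / weight_total)
    return math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)

# Hypothetical strata, each with a stratum-specific OR of 2.0.
hypothetical_strata = [(10, 20, 5, 20), (30, 10, 30, 20)]
woolf_estimate, woolf_lower, woolf_upper = woolf_or(hypothetical_strata)
print(woolf_estimate, woolf_lower, woolf_upper)
```

The explicit error for a zero cell mirrors the limitation noted above; a 1/2-cell correction is deliberately not applied in this sketch.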

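Summary Adjusted Estimators: Mantel-Haenszel
Mantel-Haenszel estimate for odds ratios
– $\mathrm{OR}_{MH} = \dfrac{\sum_i a_i d_i / N_i}{\sum_i b_i c_i / N_i} = \dfrac{\sum_i w_i \, \mathrm{OR}_i}{\sum_i w_i}$
– $w_i = b_i c_i / N_i$, where $a_i, b_i, c_i, d_i$ are the four cell counts of the 2x2 table in stratum $i$, $N_i = a_i + b_i + c_i + d_i$, and $\mathrm{OR}_i = a_i d_i / (b_i c_i)$
– $w_i$ is the inverse of the variance of the stratum-specific odds ratio under the null hypothesis (OR = 1)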

Summary Adjusted Estimator: Mantel-Haenszel
– Relatively resistant to the effects of large numbers of strata with few observations
– Resistant to cells with a value of “0”
– Computationally easy
– Most commonly used in commercial software
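A short sketch of the Mantel-Haenszel summary odds ratio, showing that it stays defined when a stratum contains a zero cell. The cell layout (OR = ad/bc), the function name, and the counts are hypothetical and chosen only for illustration.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: list of (a, b, c, d) cell counts per stratum,
    laid out so that the stratum odds ratio is a*d / (b*c).
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata with a common odds ratio of 2.0.
homogeneous = [(10, 20, 5, 20), (30, 10, 30, 20)]
or_mh_homogeneous = mantel_haenszel_or(homogeneous)

# Hypothetical strata where the first stratum has a zero cell (stratum OR = 0).
with_zero_cell = [(0, 10, 5, 20), (4, 16, 10, 14)]
or_mh_zero_cell = mantel_haenszel_or(with_zero_cell)

print(or_mh_homogeneous, or_mh_zero_cell)
```

Because each stratum contributes products of counts rather than a logarithm, a stratum with an empty cell simply adds zero to one of the sums.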

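Calculating a Summary Adjusted Effect Using the Mantel-Haenszel Estimator
[Tables: Crude; Stratified by Minor Severity and Major Severity]
OR crude = 0.61
Stratum-specific (in the order shown on the slide): OR = 0.0, OR = 0.35
OR MH = (worked calculation not preserved in the transcript)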

Calculating a Summary Effect in Stata
To stratify by a third variable:
– cs var_case var_exposed, by(var_third_variable)
– cc var_case var_exposed, by(var_third_variable)
Default summary estimator is Mantel-Haenszel
– “, pool” will also produce Woolf’s method
epitab command: Tables for epidemiologists

Calculating a Summary Effect Using the Mantel-Haenszel Estimator
e.g., AZT use, severity of needlestick, and HIV
[Tables: Crude; Stratified by Minor Severity and Major Severity]
OR crude = 0.61
Stratum-specific (in the order shown on the slide): OR = 0.0, OR = 0.35

. cc HIV AZTuse, by(severity) pool
(most numeric values of the Stata output are not preserved in the transcript)
       severity | OR   [95% Conf. Interval]   M-H Weight
          minor |
          major |
          Crude |
Pooled (direct) | ...
   M-H combined |
Test of homogeneity (B-D) chi2(1) = 0.60   Pr>chi2 =
Test that combined OR = 1: Mantel-Haenszel chi2(1) = 6.06   Pr>chi2 =

Calculating a Summary Effect Using the Mantel-Haenszel Estimator
In addition to the odds ratio, Mantel-Haenszel estimators are also available in Stata for:
– risk ratio: “cs var_case var_exposed, by(var_third_variable)”
– rate ratio: “ir var_case var_exposed var_time, by(var_third_variable)”
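For readers who want to see what a Mantel-Haenszel risk ratio computes, here is a small sketch written independently of Stata. The cell layout (a = exposed cases, b = exposed non-cases, c = unexposed cases, d = unexposed non-cases), the function name, and the counts are assumptions for illustration.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio for cumulative-incidence data.

    strata: list of (a, b, c, d) with
      a = exposed cases,   b = exposed non-cases,
      c = unexposed cases, d = unexposed non-cases.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n    # exposed cases x unexposed total
        denominator += c * (a + b) / n  # unexposed cases x exposed total
    return numerator / denominator

# Hypothetical strata, each with a stratum-specific risk ratio of 2.0
# (risks 0.2 vs 0.1 and 0.6 vs 0.3).
hypothetical_cohort = [(20, 80, 10, 90), (30, 20, 15, 35)]
rr_mh = mantel_haenszel_rr(hypothetical_cohort)
print(rr_mh)
```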

After Confounding is Managed: Confidence Interval Estimation and Hypothesis Testing for the Mantel-Haenszel Estimator
e.g., AZT use, severity of needlestick, and HIV

. cc HIV AZTuse, by(severity) pool
(most numeric values of the Stata output are not preserved in the transcript)
       severity | OR   [95% Conf. Interval]   M-H Weight
          minor |
          major |
          Crude |
Pooled (direct) | ...
   M-H combined |
Test of homogeneity (B-D) chi2(1) = 0.60   Pr>chi2 =
Test that combined OR = 1: Mantel-Haenszel chi2(1) = 6.06   Pr>chi2 =

What does the p value = mean?

Mantel-Haenszel Confidence Interval and Hypothesis Testing
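The formulas on this slide are not preserved in the transcript. As a hedged illustration only, the sketch below computes the standard Mantel-Haenszel chi-square statistic (1 degree of freedom, no continuity correction) for the null hypothesis that the common odds ratio is 1. The cell layout, function names, and counts are assumptions, and the sketch is not claimed to reproduce Stata's output digit for digit.

```python
import math

def chi2_1df_pvalue(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel chi-square test that the summary odds ratio equals 1.

    strata: list of (a, b, c, d) per stratum, where `a` is the count in the
    exposed-case cell; (a + b) and (a + c) are the margins through that cell.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = (observed - expected) ** 2 / variance
    return chi2, chi2_1df_pvalue(chi2)

# Hypothetical strata (not the AZT data).
example_chi2, example_p = mantel_haenszel_chi2([(10, 20, 5, 20), (30, 10, 30, 20)])
null_chi2, null_p = mantel_haenszel_chi2([(10, 10, 10, 10)])
print(example_chi2, example_p)
```

The helper `chi2_1df_pvalue` can also be applied directly to a reported statistic, such as the chi2(1) value shown in the Stata output on the previous slide.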

Mantel-Haenszel Techniques
– Mantel-Haenszel estimators
– Mantel-Haenszel chi-square statistic
– Mantel’s test for trend (dose-response)

Spermicides, maternal age & Down Syndrome
[Tables: Crude; Stratified by Age < 35 and Age > 35]
ORs as shown on the slide: OR = 3.4, OR = 5.7, OR = 3.5
Which answer should you report as “final”?
What undesired feature has stratification caused?

Effect of Adjustment on Precision (Variance)
Adjustment can increase or decrease standard errors (and CIs) depending upon:
– Nature of outcome (interval scale vs. binary)
– Measure of association desired
– Method of adjustment (Woolf vs M-H vs MLE)
– Strength of association between potential confounding factor and exposure/disease
Complex and difficult to memorize
Good news: adjustment for strong confounders removes bias and often improves precision
Bad news: adjustment for less-than-strong confounders can often (but not always) worsen precision

Effect of Adjustment on Precision
[Tables: Crude; Stratified by Matches Absent and Matches Present]
OR crude = 21.0 (95% CI: )
OR matches = 21.0
OR no matches = 21.0
OR adj = 21.0 (95% CI: )

Whether or not to accept the “adjusted” summary estimate instead of the crude?
Methodologic literature is inconsistent on this
– Bias-variance tradeoff
Scientifically most rigorous approach is to:
– Create the DAG and identify potential confounders
– Prior to adjustment, create two lists of potential confounders
  » “A” List: Those factors for which you will accept the adjusted result no matter how small the difference from the crude.
    Factors strongly believed to be confounders
  » “B” List: Those factors for which you will accept the adjusted result only if it meaningfully differs from the crude (with some pre-specified difference, e.g., 5 to 10%).
    “Change-in-estimate” approach
    Factors you are less sure about
For some analyses, may have no factors on A list. For other analyses, no factors on B list.
Always putting all factors on A list may seem conservative, but is not necessarily the right thing to do in light of the penalty of statistical imprecision
Bias control paramount
Need for tradeoffs

Choosing the crude or adjusted estimate? Assume all factors are on B list and a 10% change-in-estimate rule is in place
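A minimal sketch of a 10% change-in-estimate rule. Conventions differ on the denominator and on whether to work on the log scale; taking the change relative to the crude estimate is an assumption made here, and the function names are hypothetical.

```python
def percent_change(crude, adjusted):
    """Absolute change of the adjusted estimate, as a percentage of the crude."""
    return abs(adjusted - crude) / crude * 100.0

def choose_estimate(crude, adjusted, threshold_pct=10.0):
    """Return which estimate to report for a B-list factor.

    The adjusted estimate is accepted only if it differs from the crude
    by at least the pre-specified threshold.
    """
    if percent_change(crude, adjusted) >= threshold_pct:
        return "adjusted"
    return "crude"

# Crude and adjusted ORs are both 21.0 in the matches example, so no change.
matches_choice = choose_estimate(21.0, 21.0)

# Hypothetical pair of estimates that differ by 50%.
hypothetical_choice = choose_estimate(2.0, 3.0)
print(matches_choice, hypothetical_choice)
```

Fixing `threshold_pct` before looking at the data is what keeps the rule from becoming a way to pick the preferred answer.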

No Role for Statistical Testing for Confounding
Testing for statistically significant differences between crude and adjusted measures is inappropriate
– e.g., examining an association for which a factor is a known confounder (say age in the association between hypertension and CAD)
– if the study has a small sample size, even large differences between crude and adjusted measures may not be statistically different
  » yet, we know confounding is present
  » therefore, the difference between crude and adjusted measures cannot be ignored as merely chance
  » bias must be prevented and hence the adjusted estimate is preferred
  » we must live with whatever effects we see after adjustment for a factor for which there is a strong a priori belief about confounding
– the issue of confounding is one of bias, not of sampling error. Other than in RCTs, we’re not concerned that sampling error is causing confounding, and therefore we don’t have to worry about testing for the role of chance

Spermicides, maternal age & Down Syndrome
[Tables: Crude; Stratified by Age < 35 and Age > 35]
ORs as shown on the slide: OR = 3.4, OR = 5.7, OR = 3.5
Which answer should you report as “final”?

Stratifying by Multiple Potential Confounders
[Tables: Crude; Stratified into six age-by-smoking strata]
– <40 smokers, 40-60 smokers, >60 smokers
– <40 non-smokers, 40-60 non-smokers, >60 non-smokers

The Need for Evaluation of Joint Confounding
Variables that evaluated alone show no confounding may show confounding when evaluated jointly
[Tables: Crude; Stratified by Factor 1 alone; by Factor 2 alone; by Factor 1 & 2]

WHO Causal Model of Coronary Heart Disease
Murray et al. Population Health Metrics 2003

Approaches for When More than One Potential Confounder is Present
Backward vs forward variable selection strategies
– relevant both for stratification and multivariable regression modeling (“model selection”)
Backwards Strategy
– initially evaluate all potential confounders together (i.e., look for joint confounding)
– preferred because in nature variables act together
– Procedure:
  » with all potential confounders considered, form adjusted estimate. This is the “gold standard”
  » of variables on the B list, one variable can then be dropped and the adjusted estimate is re-calculated (adjusted for remaining variables)
  » if the dropping of the first variable results in a non-meaningful (e.g., < 5 or 10%) change compared to the gold standard, it can be eliminated
  » continue until no more variables can be dropped (i.e., all remaining variables are relevant)
– Problem:
  » With many potential confounders and multiple stratified analyses, p values (too small) & confidence intervals (too narrow) lose their nominal interpretation
  » Active area of methodologic interpretation
  » With many potential confounders, cells become very sparse and many strata provide no information
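The backward procedure above can be sketched as a loop around any function that returns an adjusted estimate for a given set of adjustment variables. Everything named here (`backward_select`, the variable names, and the lookup table of estimates) is hypothetical; a real analysis would compute each estimate by stratification or regression.

```python
def backward_select(estimate, a_list, b_list, threshold=0.10):
    """Backward deletion of B-list variables by change-in-estimate.

    estimate: callable taking a frozenset of variable names and returning
              the adjusted estimate for that adjustment set.
    a_list:   variables that are always adjusted for.
    b_list:   variables that may be dropped.
    threshold: relative change versus the fully adjusted "gold standard"
               below which dropping a variable is considered non-meaningful.
    """
    gold = estimate(frozenset(a_list) | frozenset(b_list))
    kept = list(b_list)
    dropped = True
    while dropped:
        dropped = False
        for var in kept:
            trial = [v for v in kept if v != var]
            trial_estimate = estimate(frozenset(a_list) | frozenset(trial))
            if abs(trial_estimate - gold) / gold < threshold:
                kept = trial      # dropping `var` barely moves the estimate
                dropped = True
                break             # restart the scan with the reduced set
    final_set = list(a_list) + kept
    return final_set, estimate(frozenset(final_set))

# Hypothetical adjusted odds ratios for each adjustment set.
HYPOTHETICAL_ESTIMATES = {
    frozenset({"age", "smoking", "alcohol"}): 2.00,  # gold standard
    frozenset({"age", "smoking"}): 2.05,
    frozenset({"age", "alcohol"}): 2.60,
    frozenset({"age"}): 2.70,
}

selected, final_or = backward_select(
    HYPOTHETICAL_ESTIMATES.__getitem__, ["age"], ["smoking", "alcohol"]
)
print(selected, final_or)
```

In this made-up example, dropping alcohol changes the estimate by 2.5% and is allowed, while dropping smoking changes it by 30% or more and is not. Each candidate is compared against the gold standard, as in the procedure described above.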

Approaches for When More than One Potential Confounder is Present
Forward Strategy
– start with the variable that has the biggest “change-in-estimate” impact when evaluated individually
– then add the variable with the second biggest impact
– keep this variable if its presence meaningfully changes the adjusted estimate
– procedure continues until no other added variable has an important impact
– Advantage: avoids the initial sparse cell problem of backwards approach
– Problem: does not evaluate joint confounding effects of many variables
Multiple analyses again lead to problems in interpreting p values and CIs

An Analysis Plan
Available methods often arbitrary and invite fishing for desired answers
Solution: Analysis plan
Written before the data are analyzed
Content
– Detailed description of the techniques to be used to analyze data, step by step
– Forms the basis of “Statistical Analysis” section in manuscripts
– Parameters/rules/logic to guide key decisions:
  » which variables will be assessed for interaction and for adjustment?
  » what p value will be used to guide reporting of interaction?
  » what is a meaningful change-in-estimate threshold between two estimates (e.g., 10%) to determine model selection?
Utility: A plan helps to keep the analysis:
– Focused
– Transparent
– Reproducible
– Honest (avoids p value shopping)

Instrumental Variables to Manage Confounding
[Diagram: E and D (link marked “?”), with C1, C2, Unmeasured C, and Instrumental variable (IV)]
– IV must be related to E but nothing else
– Assess association between IV and D to estimate E-D relationship
RQ: Does length of stay determine neonatal outcomes?
[Diagram: Length of stay and Neonatal outcomes (link marked “?”), with Prenatal complications, Unmeasured C, and Hour of birth]
Malkin et al. Health Serv. Res., 2000

Residual Confounding: Four Mechanisms
Categorization of confounder too broad
– e.g., Association between natural menopause and prevalent CHD (Szklo and Nieto, 2007)
Misclassification of confounders
– Can be differential or non-differential with respect to exposure and disease
– If non-differential, will lead to adjusted estimates somewhere in between crude and true adjusted
– If differential, can lead to a variety of unpredictable directions of bias

Residual Confounding Mechanisms – cont’d
Variable used for adjustment is imperfect surrogate for true confounder
[Diagram: Periodontal disease and CAD (link marked “?”), with Inflammatory Predisposition and CRP level]
Unmeasured confounders
[Diagram: E and D (link marked “?”), with Age and Unmeasured C]

Quantitative Analysis of Unmeasured Confounding
Can back calculate to determine how a confounder would need to act in order to spuriously cause any apparent odds ratio.
Example: OR = 2.0
[Table; most cell values not preserved in the transcript]
Columns:
– Prevalence of “high” level of unmeasured confounder
– Association between unmeasured confounder and disease (risk ratio)
– Association between unmeasured confounder and exposure (prevalence ratio)
Rows:
– A (low prevalence scenario) = 7
– B (high prevalence scenario) = 3.4
Winkelstein et al., AJE 1984
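One common way to do this back-calculation is the external-adjustment formula for a binary unmeasured confounder with no true exposure effect. The sketch below works on the risk ratio scale, which approximates the odds ratio only when the disease is rare. The function names and the numeric scenario are hypothetical and are not the values from the slide's table.

```python
def spurious_rr(p_exposed, p_unexposed, rr_confounder_disease):
    """Apparent exposure-disease risk ratio produced purely by a binary
    confounder, assuming the exposure has no effect.

    p_exposed / p_unexposed: prevalence of the confounder among the
    exposed and the unexposed.
    """
    excess = rr_confounder_disease - 1.0
    return (p_exposed * excess + 1.0) / (p_unexposed * excess + 1.0)

def required_confounder_rr(target_rr, p_exposed, p_unexposed):
    """Confounder-disease risk ratio needed to produce `target_rr` spuriously."""
    gap = p_exposed - target_rr * p_unexposed
    if gap <= 0:
        raise ValueError("no finite confounder effect can produce this target")
    return 1.0 + (target_rr - 1.0) / gap

# Hypothetical scenario: confounder prevalence 10% in the unexposed and a
# prevalence ratio of 3 (30% in the exposed); apparent risk ratio of 2.0.
needed_rr = required_confounder_rr(2.0, p_exposed=0.30, p_unexposed=0.10)
round_trip = spurious_rr(0.30, 0.10, needed_rr)
print(needed_rr, round_trip)
```

The `ValueError` branch reflects a useful bound: if the confounder is not sufficiently more common among the exposed, no confounder-disease association can explain the observed estimate.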

Stratification to Manage Confounding
Advantages
– straightforward to implement and comprehend
– easy way to evaluate interaction
Limitations
– Requires continuous variables to be discretized
  » loses information; possibly results in “residual confounding”
  » Discretizing often brings less precision
– Deteriorates with multiple confounders
  » e.g., suppose 4 confounders with 3 levels
  » 3x3x3x3 = 81 strata needed
  » unless huge sample, many cells have “0”s and strata have undefined effect measures
– Solution: Mathematical modeling (multivariable regression), e.g.
  » linear regression
  » logistic regression
  » proportional hazards regression
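The strata count in the example can be checked by enumeration. The sketch below assumes 4 confounders with 3 levels each, as in the slide; the sample sizes passed to the helper are hypothetical.

```python
from itertools import product

CONFOUNDERS = 4
LEVELS = 3

# Every combination of confounder levels defines one stratum.
strata = list(product(range(LEVELS), repeat=CONFOUNDERS))
n_strata = len(strata)      # 3 x 3 x 3 x 3
n_cells = n_strata * 4      # each stratum holds one 2x2 table

def mean_subjects_per_cell(sample_size):
    """Average number of subjects per 2x2 cell if spread evenly."""
    return sample_size / n_cells

print(n_strata, n_cells, mean_subjects_per_cell(648))
```

Even a perfectly even spread leaves few subjects per cell at moderate sample sizes, and real data are far from even, which is why empty cells appear so quickly.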

Limitation of Conventional Stratification (and Regression)
RQ: Does coffee use cause CAD?
[Diagram: Coffee and CAD (link marked “?”), with Cholesterol level and Behavioral factors (unmeasured)]
RQ: Does HAART prevent AIDS/Death?
[Diagram: HAART and AIDS (link marked “?”), with CD4 count and Severity of HIV (unmeasured)]
Simultaneous desire to control for cholesterol/CD4 to manage confounding and NOT to control because they are intermediary variables

When factors are simultaneously confounders and intermediaries, conventional techniques fail and “causal methods” are needed
Causal methods: g-estimation, structural nested models, marginal structural models
Cole et al, AJE 2003

Regression is ahead but don’t forget about the simple techniques …..
“Because of the increased ease and availability of computer software, the last few years have seen a flourishing of the use of multivariate analysis in the biomedical literature. These highly sophisticated mathematic models, however, rarely eliminate the need to examine carefully the raw data by means of scatter diagrams, simple n x k table, and stratified analyses.” Szklo and Nieto 2007
“The widespread availability and user-friendly nature of computer software make the method accessible to some data analysts who may not have had adequate instruction in its appropriate applications. When they are misapplied, multivariate techniques have the potential to contribute to incorrect model development, misleading results, and inappropriate interpretation of the effect of hypothesized confounders.” Friis and Sellers, 2009
“Statistical software is like raising the gas pedal in a car for a 4 year old.” Peter Bacchetti (UCSF), date unknown

Next Tuesday (12/9/08)
– 8:45 to 10:15: Journal Club
– 1:30 to 3:00 pm: Mitch Katz “Conceptual approach to multivariable regression”
  » Note chapters in his textbook
– 3:15 to 4:45: Last Small Group Section
  » Web-based course evaluation
  » Bring laptop
– Distribute Final Exam (on line)
  » Exam due 12/16 in hands of Olivia by 4 pm, by [...] or China Basin 5700