
Confounding and Interaction: Part II
• Methods to reduce confounding
  – during study design:
    » Randomization
    » Restriction
    » Matching
  – during study analysis:
    » Stratified analysis
• Interaction
  – What is it? How to detect it?
  – Additive vs. multiplicative interaction
  – Comparison with confounding
  – Statistical testing for interaction
  – Implementation in Stata

Confounding
[Diagram: exposure (E), disease (D), and a confounder (C)]
Confounder: ANOTHER PATHWAY TO GET TO THE DISEASE (a mixing of effects)
RQ: Is E associated with D independent of C?

Methods to Prevent or Manage Confounding
[Diagrams: two alternative exposure-confounder-disease structures]
By prohibiting at least one “arm” of the exposure-confounder-disease structure, confounding is precluded.

Randomization to Reduce Confounding
• Definition: random assignment of subjects to exposure (e.g., treatment) categories
  [Diagram: All subjects → Randomize → Exposed (treatment) / Unexposed]
• Distribution of any variable is theoretically the same in the exposed group as in the unexposed group
  – Theoretically, there can be no association between exposure and any other variable
• One of the most important inventions of the 20th century!

Randomization to Reduce Confounding
[Diagram: exposure-confounder-disease structure]
Randomization prevents confounding
Explains the exalted role of randomization in clinical research

Randomization to Reduce Confounding
[Diagram: All subjects → Randomize → Exposed / Unexposed]
• Applicable only for ethically assignable exposures (i.e., interventions, experiments)
  – Not for naturally occurring exposures (e.g., air pollution)
• Special strength of randomization is its ability to control the effect of confounding variables of which the investigator is unaware
  – Because the distribution of any variable is theoretically the same across randomization groups
• Does not, however, always eliminate confounding!
  – By chance alone, there can be imbalance
  – Less of a problem in large studies
  – Techniques exist to ensure balance of certain variables
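The point about chance imbalance can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `mean_abs_imbalance`, the binary confounder with 50% prevalence, the 200 repetitions, and the two study sizes are all hypothetical choices, not part of the original slides.

```python
import random

def mean_abs_imbalance(n_subjects, prevalence=0.5, n_trials=200, seed=0):
    """Average absolute difference in confounder prevalence between two randomized arms."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Each subject carries a binary confounder (hypothetical, e.g. smoker yes/no).
        subjects = [rng.random() < prevalence for _ in range(n_subjects)]
        rng.shuffle(subjects)  # random assignment: first half exposed, second half unexposed
        half = n_subjects // 2
        exposed, unexposed = subjects[:half], subjects[half:]
        total += abs(sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed))
    return total / n_trials

small = mean_abs_imbalance(20)    # small study: 10 subjects per arm
large = mean_abs_imbalance(2000)  # large study: 1000 subjects per arm
print(f"mean imbalance, n=20: {small:.3f}; n=2000: {large:.3f}")
```

Under these assumptions the average imbalance in the confounder is markedly smaller in the large study, which is why chance confounding is less of a concern as sample size grows.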

Restriction to Reduce Confounding
But what if we cannot randomize?
• AKA specification
• Definition: restrict enrollment to only those subjects who have a specific value/range of the confounding variable
  – e.g., when age is a confounder: include only subjects of the same narrow age range

Restriction to Reduce Confounding
[Diagram: Birth order (E) and Down syndrome (D), with C restricted to age 30 only]
No variability in C precludes association with E and D

Restriction to Prevent Confounding
• Particularly useful when the confounder is quantitative in scale but difficult to measure
• e.g.
  – RQ: Is there an association between sexual behavior and acquisition of HHV-8 infection?
  – Issue: Is the association confounded by injection drug use?
  – Problem: degree of injection drug use is difficult to measure
  – Solution: restrict to subjects with no injection drug use, thereby precluding the need to measure degree of injection drug use
  – Cannon et al. NEJM 2001
    » Restricted to persons denying injection drug use
[Diagram: Commercial sex and HHV-8, with injection drug use as a possible confounder (?)]

Restriction to Reduce Confounding
• Advantages:
  – conceptually straightforward
  – handles difficult-to-quantitate variables
  – can also be used in analysis phase
• Disadvantages:
  – may limit number of eligible subjects
  – inefficient to screen subjects, then not enroll
  – “residual confounding” may persist if restriction categories are not sufficiently narrow (e.g., “20 to 30 years old” might be too broad)
  – limits generalizability (but don’t worry about this)
  – not possible to evaluate the relationship of interest at different levels of the restricted variable (i.e., cannot assess interaction)
• Bottom line
  – not used as much as it should be

Matching to Reduce Confounding
• A complex topic
• Definition: unexposed (or non-case) subjects are chosen only if they match the subjects of the comparison group (exposed subjects or cases, respectively) in terms of the confounder in question
• Mechanics depend upon study design:
  – e.g., cohort study: unexposed subjects are “matched” to exposed subjects according to their values for the potential confounder
    » e.g., matching on race
      One unexposed Black subject enrolled for each exposed Black subject
      One unexposed Asian subject enrolled for each exposed Asian subject
  – e.g., case-control study: non-diseased controls are “matched” to diseased cases
    » e.g., matching on age
      One control age 50 enrolled for each case age 50
      One control age 70 enrolled for each case age 70

Matching to Reduce Confounding
[Diagrams: exposure-confounder-disease structures for the cohort design and the case-control design]
Also illustrates a limitation

Advantages of Matching
1. Useful in preventing confounding by factors which would be nearly impossible to manage in the analysis phase
  – e.g., “neighborhood” is a nominal variable with multiple values (complex nominal variable)
  – e.g., case-control study of the effect of a BCG vaccine in preventing TB (Int J Tub Lung Dis. 2006)
    » Cases: newly diagnosed TB in Brazil
    » Controls: persons without TB
    » Exposure: receipt of a BCG vaccine
    » Potential confounder: neighborhood (village) of residence; related to ambient TB incidence and practices regarding BCG vaccine
    » Control sampling: relying upon random sampling without attention to neighborhood may (especially in a small study) result in choosing no controls from some of the neighborhoods seen in the case group (i.e., cases and controls lack overlap). Matching on neighborhood ensures overlap.
    » Even if all neighborhoods seen in the case group were represented in the control group, adjusting for neighborhood with “analysis phase” strategies is problematic

If you choose to stratify to manage confounding, the number of strata may be unwieldy.
[Diagram: crude table stratified by neighborhood: Mission, Castro, Pacific Heights, Marina, Sunset, Richmond]
Matching avoids this.

Advantages of Matching
2. By ensuring a balanced number of cases and controls (in a case-control study) or exposed/unexposed (in a cohort study) within the various strata of the confounding variable, statistical precision may be increased

Smoking, Matches, and Lung Cancer
A. Random sample of controls
  Crude: OR crude = 8.8
  Stratified: OR CF+ = OR smokers = 1.0; OR CF- = OR non-smokers = 1.0
  OR adj = 1.0 (0.31 to 3.2)
B. Controls matched on smoking
  Stratified: OR CF+ = OR smokers = 1.0; OR CF- = OR non-smokers = 1.0
  OR adj = 1.0 (0.40 to 2.5)
Little known benefit of matching: improved precision

Disadvantages of Matching
1. Finding appropriate matches may be difficult and expensive. Therefore, the gains in statistical efficiency can be offset by losses in overall efficiency.
2. In a case-control study, the factor used to match subjects cannot itself be evaluated as a risk factor for the disease. In general, matching decreases the robustness of a study to address secondary questions.
3. Decisions are irrevocable: if you happened to match on an intermediary factor, you have lost the ability to evaluate the role of the exposure in question via that pathway.
  – e.g., study of the effect of sexual activity on cervical cancer: matching on HPV status precludes the ability to look at sexual activity
4. If the potential confounding factor really isn’t a confounder, statistical precision can be worse than with no matching.
Think carefully before you match, and seek advice.

Stratification to Reduce Confounding
Strategies in the analysis phase:
• Goal: evaluate the relationship between the exposure and outcome in strata homogeneous with respect to potentially confounding variables
• Each stratum is a mini-example of restriction!
• CF = confounding factor
[Diagram: crude table stratified into CF Level 1, CF Level 2, and CF Level 3]

Smoking, Matches, and Lung Cancer
[Diagram: crude table (OR crude) stratified into smokers (OR CF+ = OR smokers) and non-smokers (OR CF- = OR non-smokers)]
• OR crude = 8.8
• Each stratum is unconfounded with respect to smoking
• OR smokers = 1.0
• OR non-smokers = 1.0
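How a crude odds ratio can be elevated while every stratum shows no association can be sketched in a few lines. The cell counts below are hypothetical (the slide's tables did not survive in the transcript), so the crude value here is not the slide's 8.8. The helper name `odds_ratio` and the cell order (exposed cases, unexposed cases, exposed controls, unexposed controls) are likewise assumptions.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)

# Hypothetical counts: carrying matches (exposure) vs lung cancer, by smoking status.
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (2, 18, 10, 90),
}

# The crude table is the cell-by-cell sum of the stratum tables.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

print("crude OR:", round(odds_ratio(*crude), 2))
for name, cells in strata.items():
    print(name, "OR:", round(odds_ratio(*cells), 2))
```

With these made-up counts, both stratum-specific odds ratios equal 1.0 while the crude odds ratio is well above 1, the same pattern the slide reports.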

Stratifying by Multiple Confounders with More than 2 Levels
Potential confounders: age and smoking
• To control for multiple confounders simultaneously, must construct mutually exclusive and exhaustive strata
[Diagram: crude table]

Stratifying by Multiple Potential Confounders
[Diagram: crude table stratified into six strata: <40 smokers, 40-60 smokers, >60 smokers, <40 non-smokers, 40-60 non-smokers, >60 non-smokers]
Each of these strata is unconfounded by age and smoking
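Constructing mutually exclusive and exhaustive strata amounts to taking every combination of the confounder levels. A minimal sketch, assuming the three age bands and two smoking categories named on the slide (the variable names are illustrative):

```python
from itertools import product

age_levels = ["<40", "40-60", ">60"]
smoking_levels = ["smokers", "non-smokers"]

# Every subject falls into exactly one (age, smoking) combination.
strata = [f"{age} {smoking}" for age, smoking in product(age_levels, smoking_levels)]

print(len(strata), "strata:", strata)
```

The number of strata is the product of the numbers of levels, which is why stratification becomes unwieldy as confounders or levels are added.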

Adjusted Estimate from the Stratified Analyses
• After the strata have been formed, what next?
• Goal: create a single unconfounded (“adjusted”) estimate for the relationship in question
  – e.g., relationship between matches and lung cancer after adjustment (controlling) for smoking
• Process: summarize the unconfounded estimates from the two (or more) strata to form a single overall unconfounded “adjusted” estimate
  – e.g., summarize the odds ratios from the smoking stratum and non-smoking stratum into one odds ratio
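One common way to summarize stratum-specific odds ratios is the Mantel-Haenszel estimator, the "M-H combined" figure in the Stata output later in this deck. The sketch below is an illustration, not the deck's own computation: the function name `mantel_haenszel_or`, the cell order (exposed cases, unexposed cases, exposed controls, unexposed controls), and all counts are hypothetical.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR over strata given as (a, b, c, d) tuples."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical strata in which both stratum-specific ORs equal 1.0.
no_association = [(80, 20, 40, 10), (2, 18, 10, 90)]
print("adjusted OR:", round(mantel_haenszel_or(no_association), 2))

# Hypothetical strata with ORs of 2.0 and 3.0: the summary falls between them.
similar_effects = [(20, 10, 10, 10), (30, 10, 10, 10)]
print("adjusted OR:", round(mantel_haenszel_or(similar_effects), 2))
```

The summary is a weighted average of the stratum-specific estimates, so it is only meaningful when those estimates are reasonably similar.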

Smoking, Matches, and Lung Cancer
[Diagram: crude table (OR crude) stratified into smokers (OR CF+ = OR smokers) and non-smokers (OR CF- = OR non-smokers)]
• OR crude = 8.8
• OR smokers = 1.0
• OR non-smokers = 1.0
• OR adjusted = 1.0

Smoking, Caffeine Use and Delayed Conception
[Diagram: crude table stratified into no caffeine use and heavy caffeine use]
• RR crude = 1.7
• RR no caffeine use = 2.4
• RR caffeine use = 0.7
Is it appropriate to summarize these two stratum-specific estimates into a single number?
Stanton and Gray. AJE 1995

Underlying Assumption Needed to Form a Summary of the Unconfounded Stratum-Specific Estimates
• If the relationship between the exposure and the outcome varies meaningfully in a clinical/biologic sense across strata of a third variable:
  – it is not appropriate to create a single summary estimate of all of the strata
• i.e., when you summarize across strata, the assumption is that no “interaction” is present

Interaction
• Definition
  – when the magnitude of a measure of association (between exposure and disease) meaningfully differs according to the value of some third variable
• Synonyms
  – Effect modification
  – Effect-measure modification
  – Heterogeneity of effect
• Proper terminology
  – e.g., smoking, caffeine use, and delayed conception
    » Caffeine use modifies the effect of smoking on the risk for delayed conception.
    » There is interaction between caffeine use and smoking in the risk for delayed conception.
    » Caffeine is an effect modifier in the relationship between smoking and delayed conception.

[Figure: RR = 3.0 vs. RR = 11.2]

[Figure: RR = 0.7 vs. RR = 2.4]

Interaction is everywhere
• Susceptibility to infectious diseases
  – e.g.,
    » exposure: sexual activity
    » disease: HIV infection
    » effect modifier: chemokine receptor phenotype
• Susceptibility to non-infectious diseases
  – e.g.,
    » exposure: smoking
    » disease: lung cancer
    » effect modifier: genetic susceptibility to smoke
• Susceptibility to drugs (efficacy and side effects)
    » effect modifier: genetic susceptibility to drug
• But in practice to date, difficult to document
  – Genomics may change this

Smoking, Caffeine Use and Delayed Conception: Additive vs Multiplicative Interaction
[Diagram: crude table stratified into no caffeine use and heavy caffeine use]
• Crude: RR = 1.7; RD = 0.07
• No caffeine use: RR = 2.4; RD = 0.12
• Heavy caffeine use: RR = 0.7; RD = [missing in source]
RD = risk difference = risk in exposed - risk in unexposed (the text unfortunately calls this attributable risk)
Risk ratios are used to assess multiplicative interaction; risk differences are used to assess additive interaction.

Additive vs Multiplicative Interaction
• Assessment of whether interaction is present depends upon the measure of association
  – ratio measure (multiplicative interaction) or difference measure (additive interaction)
  – Hence, the term effect-measure modification
• Absence of multiplicative interaction implies presence of additive interaction (exception: no association)
[Figure: additive interaction present, multiplicative interaction absent: RR = 3.0, RD = 0.3 vs. RR = 3.0, RD = 0.1]
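The scale dependence can be checked numerically. In the sketch below the stratum risks are not on the slide; they are back-calculated from the reported RR and RD (unexposed risk = RD / (RR - 1)), and the helper name `rr_and_rd` is an assumption.

```python
def rr_and_rd(risk_exposed, risk_unexposed):
    """Return (risk ratio, risk difference) for one stratum."""
    return risk_exposed / risk_unexposed, risk_exposed - risk_unexposed

# Risks implied by RR = 3.0 with RD = 0.3, and RR = 3.0 with RD = 0.1.
strata = {
    "stratum 1": (0.45, 0.15),
    "stratum 2": (0.15, 0.05),
}

results = {name: rr_and_rd(*risks) for name, risks in strata.items()}
for name, (rr, rd) in results.items():
    print(f"{name}: RR = {rr:.1f}, RD = {rd:.1f}")
```

Both strata share the same risk ratio, yet their risk differences differ because the unexposed (baseline) risk differs between strata. If the baseline risks were identical, a common RR would also give a common RD.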

Additive vs Multiplicative Interaction
• Absence of additive interaction implies presence of multiplicative interaction
[Figure: multiplicative interaction present, additive interaction absent: RR = 3.0, RD = 0.1 vs. RR = 1.7, RD = 0.1]

Additive vs Multiplicative Interaction
• Presence of multiplicative interaction may or may not be accompanied by additive interaction
[Figure panels: “Additive interaction present” and “No additive interaction”; values shown: RR = 2.0, RD = 0.1; RR = 3.0, RD = 0.4; RR = 3.0, RD = 0.1]

Additive vs Multiplicative Interaction
• Presence of additive interaction may or may not be accompanied by multiplicative interaction
[Figure panels: “Multiplicative interaction absent” and “Multiplicative interaction present”; values shown: RR = 3.0, RD = 0.1; RR = 3.0, RD = 0.4; RR = 2.0, RD = 0.1; RR = 3.0, RD = 0.2]

Additive vs Multiplicative Interaction
• Presence of qualitative multiplicative interaction is always accompanied by qualitative additive interaction
[Figure: multiplicative and additive interaction both present, e.g., smoking, caffeine, delayed conception]

Additive vs Multiplicative Scales
• Which do you want to use?
• Multiplicative measures (e.g., risk ratio)
  – favored measure when looking for causal association (etiologic research)
  – not dependent upon background incidence of disease
• Additive measures (e.g., risk difference):
  – readily translated into impact of an exposure (or intervention) in terms of absolute number of outcomes prevented
    » e.g., 1/risk difference = number needed to treat to prevent (or avert) one case of disease, or number of exposed persons one needs to take the exposure away from to avert one case of disease
  – very dependent upon background incidence of disease
  – gives “public health impact” of the exposure

Additive vs Multiplicative Scales
• Causally related but minor public health importance
  – Risk ratio = 2
  – Risk difference = 0.0001 - 0.00005 = 0.00005
  – Need to eliminate exposure in 20,000 persons to avert one case of disease
• Causally related and major public health importance
  – RR = 2
  – RD = 0.2 - 0.1 = 0.1
  – Need to eliminate exposure in 10 persons to avert one case of disease
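The two scenarios can be reproduced from the relation "number of persons = 1 / risk difference". In this sketch the underlying risks are inferred from RR = 2 together with the stated 20,000 and 10 persons (an assumption wherever the transcript does not show them), and the function name `persons_per_case_averted` is hypothetical.

```python
def persons_per_case_averted(risk_exposed, risk_unexposed):
    """Number of exposed persons who must lose the exposure to avert one case (1 / RD)."""
    return 1.0 / (risk_exposed - risk_unexposed)

# Same risk ratio (2.0) in both scenarios, very different absolute impact.
minor = persons_per_case_averted(0.0001, 0.00005)
major = persons_per_case_averted(0.2, 0.1)

print("minor public health importance:", round(minor), "persons")
print("major public health importance:", round(major), "persons")
```

The identical risk ratio hides a 2,000-fold difference in absolute impact, which is the argument for reporting difference measures when public health impact is the question.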

Smoking, Family History and Cancer: Additive vs Multiplicative Interaction
[Diagram: crude table stratified into family history absent and family history present]
• Family history absent: risk ratio = 2.0; RD = 0.05
• Family history present: risk ratio = 2.0; RD = 0.20
No multiplicative interaction, but additive interaction is present
• If etiology is the goal, the risk ratio is sufficient
• If the goal is to define sub-groups of persons to target:
  – Rather than ignoring the interaction, it is worth reporting that only 5 persons with a family history have to be prevented from smoking to avert one case of cancer

Confounding vs Interaction
• We discovered interaction by performing stratification as a means to evaluate for confounding
  – This is where the similarities between confounding and interaction end!
• Confounding
  – An extraneous or nuisance pathway that an investigator hopes to prevent or rule out
• Interaction
  – A more detailed description of the relationship between the exposure and disease
  – A richer description of the biologic or behavioral system under study
  – A finding to be reported, not a bias to be eliminated

Smoking, Caffeine Use and Delayed Conception
[Diagram: crude table stratified into no caffeine use and heavy caffeine use]
• RR crude = 1.7
• RR no caffeine use = 2.4
• RR caffeine use = 0.7
• RR adjusted = 1.4 (95% CI = 0.9 to 2.1)
Is this the best “final” answer? Here, adjustment is contraindicated.
When interaction is present, confounding becomes irrelevant!

Chance as a cause of interaction?
Are all non-identical stratum-specific estimates indicative of interaction?
[Diagram: crude table stratified into age > 35 and age < 35]
• OR crude = 3.5
• OR age >35 = 5.7
• OR age <35 = 3.4
Should we report interaction here?

Statistical Tests of Interaction: Test of Homogeneity (heterogeneity)
• Null hypothesis: the individual stratum-specific estimates of the measure of association differ only by random variation (chance or sampling error)
  – i.e., the strength of association is homogeneous across all strata
  – i.e., there is no interaction
• Alternative: there is heterogeneity (i.e., no homogeneity)
• If the test of homogeneity is “significant” (small p value), we reject the null in favor of the alternative hypothesis
• A variety of formal tests are available with the same general format, following a chi-square distribution:

  $$\chi^2_{N-1} = \sum_{i=1}^{N} \frac{(\text{effect}_i - \text{summary effect})^2}{\operatorname{var}(\text{effect}_i)}$$

• where:
  – effect_i = stratum-specific measure of association
  – var(effect_i) = variance of stratum-specific measure of association
  – summary effect = summary adjusted effect
  – N = number of strata of third variable
• For ratio measures of effect, e.g., OR, log transformations are used:

  $$\chi^2_{N-1} = \sum_{i=1}^{N} \frac{(\ln(\text{effect}_i) - \ln(\text{summary effect}))^2}{\operatorname{var}(\ln(\text{effect}_i))}$$

• The test statistic will have a chi-square distribution with degrees of freedom of one less than the number of strata
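The general format of the test can be sketched for risk ratios on the log scale. Everything specific here is an assumption made for illustration: the cohort counts are hypothetical, the summary effect is an inverse-variance weighted mean of the log risk ratios (Stata's M-H based test may use a different summary and so give a somewhat different statistic), and the p value uses a closed form that holds only for 1 degree of freedom.

```python
import math

def log_rr_and_variance(cases_exp, n_exp, cases_unexp, n_unexp):
    """ln(RR) and its large-sample variance for one cohort stratum."""
    log_rr = math.log((cases_exp / n_exp) / (cases_unexp / n_unexp))
    variance = 1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp
    return log_rr, variance

def homogeneity_test(strata):
    """Chi-square statistic and degrees of freedom for equality of stratum-specific RRs."""
    effects = [log_rr_and_variance(*s) for s in strata]
    weights = [1 / var for _, var in effects]
    # Summary effect: inverse-variance weighted mean of ln(RR) (an assumed choice).
    summary = sum(w * eff for w, (eff, _) in zip(weights, effects)) / sum(weights)
    chi2 = sum((eff - summary) ** 2 / var for eff, var in effects)
    return chi2, len(strata) - 1

# Hypothetical strata: (exposed cases, exposed total, unexposed cases, unexposed total).
strata = [(30, 100, 10, 100), (14, 100, 20, 100)]  # RR = 3.0 and RR = 0.7
chi2, df = homogeneity_test(strata)
p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, valid for df = 1 only
print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.4f}")
```

A small p value here says the stratum-specific risk ratios differ by more than sampling error would explain; for more than two strata a general chi-square tail function would be needed in place of the `erfc` shortcut.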

Tests of Homogeneity with Stata
1. Determine crude measure of association
  – e.g., for a cohort study, command: cs outcome-variable exposure-variable
  – for smoking, caffeine, delayed conception:
    » exposure variable = “smoking”
    » outcome variable = “delayed”
    » third variable = “caffeine”
  – command is: cs delayed smoking
2. Determine stratum-specific estimates by levels of third variable
  – command: cs outcome-var exposure-var, by(third-variable)
  – e.g., cs delayed smoking, by(caffeine)

. cs delayed smoking
[Stata output: 2×2 table of cases and noncases by smoking (exposed vs. unexposed), with risk, risk difference, and risk ratio and their 95% confidence intervals. Surviving values: 90 cases, 734 noncases, 824 total; chi2(1) = 5.97. Remaining values missing in source.]
. cs delayed smoking, by(caffeine)
[Stata output: RR, 95% confidence interval, and M-H weight for the no caffeine and heavy caffeine strata, followed by the crude and M-H combined estimates and the test of homogeneity (M-H), chi2(1) and Pr>chi2. Values missing in source.]
What does the p value mean?

Reporting or Ignoring Interaction
• When to report or ignore interaction is not clear cut.
  – A clinical, statistical, and practical decision
  – Clinical: Is the magnitude of stratum-specific differences substantively (clinically) important?
  – Statistical
    » There are inherent limitations in the power of the test of homogeneity
    » Only relatively large effect sizes or large sample sizes can achieve p < 0.05
    » One approach is to report interaction for p < 0.10 to 0.20 if the magnitude of differences is high enough
    » However, the meaning of the p value is no different than in other contexts
  – Practical: How complicated is the story?
    » i.e., if it is not too complicated to report stratum-specific estimates, it is often more revealing to report potential interaction than to ignore it.

Report vs Ignore Interaction? Some Guidelines
The decision is an art form: it requires consideration of both clinical and statistical significance.

When Assessing the Association Between an Exposure and a Disease, What are the Possible Effects of a Third Variable?
[Diagram: third-variable roles labeled C, EM (+ / _), I, and D]
• Confounding: ANOTHER PATHWAY TO GET TO THE DISEASE
• Effect Modifier (Interaction): MODIFIES THE EFFECT OF THE EXPOSURE
• Intermediary Variable: ON CAUSAL PATHWAY
• No Effect