Novel Approaches to Adjusting for Confounding: Propensity Scores, Instrumental Variables and MSMs Matthew Fox Advanced Epidemiology.

What are the exposures you are interested in studying?

Assuming I could guarantee you that you would not create bias, which approach is better: randomization or adjustment for every known variable?

What is intention to treat analysis?

Yesterday
Causal diagrams (DAGs)
– Discussed the rules of DAGs
– Goes beyond statistical methods and forces us to use prior causal knowledge
– Teaches us that adjustment can CREATE bias
Helps identify a sufficient set of confounders
– Not how to adjust for them

This week
Beyond stratification and regression
– New approaches to adjusting for (not “controlling”) confounding
– Instrumental variables
– Propensity scores (confounder scores)
– Marginal structural models
Time dependent confounding

Given the problems with the odds ratio, why does everyone use it?

Non-collapsibility of the OR (Excel; SAS)
Odds ratio collapsibility but confounding

            C+              C-              Total
            E+      E-      E+      E-      E+      E-
Disease+    400     300     240     180     640     480
Disease-    100     200     360     720     460     920
Total       500     500     600     900     1100    1400
Risk        0.80    0.60    0.40    0.20    0.58    0.34
Odds        4.00    1.50    0.67    0.25    1.39    0.52

            C+      C-      Crude    Adj
RR          1.33    2.00    1.6968   1.5496
OR          2.67    2.67    2.67     2.67
RD          0.2     0.2     0.23896  0.2
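The table above can be checked by hand or with a few lines of code. The following is a minimal sketch in Python (not the Excel or SAS files the slide links to) that recomputes the stratum-specific, crude and Mantel-Haenszel measures from the slide's cell counts. The names `strata`, `measures`, `mh_rr` and `mh_rd` are illustrative choices, not part of the original material.

```python
# Cell counts from the slide, per stratum of C:
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "C+": (400, 500, 300, 500),
    "C-": (240, 600, 180, 900),
}


def measures(a, n1, b, n0):
    """Return (RR, OR, RD) for a single 2x2 table."""
    risk1, risk0 = a / n1, b / n0
    odds1, odds0 = a / (n1 - a), b / (n0 - b)
    return risk1 / risk0, odds1 / odds0, risk1 - risk0


# Crude table: add the two strata cell by cell.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

tables = list(strata.values())
# Mantel-Haenszel summary RR and RD (n1 + n0 is the stratum size).
mh_rr = (sum(a * n0 / (n1 + n0) for a, n1, b, n0 in tables)
         / sum(b * n1 / (n1 + n0) for a, n1, b, n0 in tables))
mh_rd = (sum((a * n0 - b * n1) / (n1 + n0) for a, n1, b, n0 in tables)
         / sum(n1 * n0 / (n1 + n0) for a, n1, b, n0 in tables))

for name, cells in strata.items():
    print(name, [round(m, 4) for m in measures(*cells)])
print("crude", [round(m, 4) for m in measures(*crude)])
print("MH RR", round(mh_rr, 4), "MH RD", round(mh_rd, 4))
```

The OR comes out the same in both strata and in the crude table, while the crude RR and RD both differ from their Mantel-Haenszel summaries. That is the slide's point: an unchanged OR does not mean there is no confounding.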

Solution: SAS Code
title "Crude relative risk model";
proc genmod data=odds descending;
  model d = e / link=log dist=bin;
run;
title "Adjusted relative risk model";
proc genmod data=odds descending;
  model d = e c / link=log dist=bin;
run;

Results
Model crude: Exp(0.5288) = 1.6968
Crude RR was 1.6968

Model adjusted: Exp(0.3794) = 1.461 MH RR was 1.55

STATA
glm d e, family(binomial) link(log)
glm d e c, family(binomial) link(log)

What about risk differences?

Solution: SAS Code
title "Crude risk differences model";
proc genmod data=odds descending;
  model d = e / link=identity dist=bin;
run;
title "Adjusted risk differences model";
proc genmod data=odds descending;
  model d = e c / link=identity dist=bin;
run;

Results
Model crude: 0.239
Crude RD = 0.23896

Results Adjusted model : 0.20 MH RD = 0.20

STATA
glm d e, family(binomial) link(identity)
glm d e c, family(binomial) link(identity)
glm d e c c.e#c.c, family(binomial) link(identity)

Novel approaches to controlling confounding

Limitations of Stratification and Regression
Stratification/regression work well with point exposures with complete follow-up and sufficient data to adjust
– Limited data on confounders or small cells
– No counterfactual for some people in our dataset
  Regression often estimates parameters
– Time dependent exposures and confounding
  A common situation
With time dependence, DAGs get complex

Randomization and Counterfactuals
Ideally, evidence comes from RCTs
– Randomization gives the expectation that the unexposed can stand in for the counterfactual ideal
Full exchangeability: $E(p_1 = q_1,\ p_2 = q_2,\ p_3 = q_3,\ p_4 = q_4)$
– In expectation, assuming no other bias:
  $[\Pr(Y^{a=1}=1) - \Pr(Y^{a=0}=1)] = [\Pr(Y=1 \mid A=1) - \Pr(Y=1 \mid A=0)]$
Since we assign A, $RR_{AC} = 1$
– If we can’t randomize, what can we do to approximate randomization?

How randomization works
Randomized Controlled Trial
Randomization strongly predicts exposure (ITT)
[Diagram: DAG with nodes Randomization, A, D, C1, C2, C3]

A typical observational study
Observational Study
[Diagram: DAG with nodes A, D, C1, C2, C3 and a “?”]

A typical observational study
Observational Study
Regression/stratification seeks to block the backdoor path from A to D by averaging A-D associations within levels of $C_x$
[Diagram: DAG with nodes A, D, C1, C2, C3]

Approach 1: Instrumental Variables

Intention to treat analysis
In an RCT we assign the exposure
– e.g. assign people to take an aspirin a day vs. not
– But not all will take aspirin when told to, and others will take it even if told not to
What to do with those who don’t “obey”?
– The paradigm of intention to treat analysis says to analyze subjects in the group to which they were assigned
Maintains the benefits of randomization
Biases towards the null at worst

Instrumental variables
An approach to dealing with confounding using a single variable
– Works along the same lines as randomization
Commonly used approach in economics, yet rarely used in medical research
– Suggests we are either behind the times or they are hard to find
– Partly privileged in economics because little adjustment data exists

Instrumental variables
An instrument (I):
– A variable that satisfies 3 conditions:
  1. Strongly associated with exposure
  2. Has no effect on outcome except through A (E)
  3. Shares no common causes with outcome
Ignore E-D relationship
– Measure association between I and D
– This is not confounded
– Approximates an ITT approach

Adjust the IV estimate Can optionally adjust IV estimate to estimate the effect of A (exposure) – But differs from randomization If an instrument can be found, has the advantage we can adjust for unknown confounders – This is the benefit we get from randomization?

Intention to Treat (IV Ex 1)
A (Exposure): Aspirin vs. Placebo
Outcome: First MI
Instrument: Randomized assignment
[Diagram: DAG with nodes Randomization, Therapy, MI, Confounders]
Condition 1: Predictor of A?
Condition 2: No direct effect on the outcome?
Condition 3: No common causes with outcome?

Confounding by indication (IV Ex 2)
A (Exposure): COX2 inhibitor vs NSAID
Outcome: GI complications
Instrument: Physician’s previous prescription
[Diagram: DAG with nodes Previous Px, COX2/NSAID, GI comp, Indications]
Regression (17 confounders): no effect
– RD: -0.06/100; 95% CI -0.26 to 0.14
IV: Protective effect of COX-2
– RD: -1.31/100; -2.42 to -0.20
Compatible with trial results
– RD: -0.65/100; -1.08 to -0.22

Unknown confounders (IV Ex 3)
A (Exposure): Childhood dehydration
Outcome: Adult high blood pressure
Instrument: 1st year summer climate
[Diagram: DAG with nodes 1st year climate, dehydration, High BP, SES]
Hypothesized that the hottest/driest summers in infancy would be associated with severe infant diarrhea/dehydration, and consequently higher blood pressure in adulthood. For 3,964 women born 1919-1940, a mean summer temperature in the 1st year of life 1 SD (1.3 ºC) above the mean was associated with 1.12 mmHg (95% CI: 0.33, 1.91) higher adult systolic blood pressure, and mean summer rainfall 1 SD (33.9 mm) above the mean was associated with lower systolic blood pressure (-1.65 mmHg, 95% CI: -2.44, -0.85).

Optionally we can adjust for “non-compliance”
Optionally, if we want to estimate the A-D relationship, not I-D, we can adjust:
– $RD_{ID} / RD_{IE}$
– Inflate the IV estimator to adjust for the lack of perfect correlation between I and E
– If I perfectly predicts E then $RD_{IE} = 1$, so adjustment does nothing
Like per protocol analysis
– But adjusted for confounders
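The ratio on this slide can be illustrated with a small worked example. This is a minimal sketch in Python. The trial counts are hypothetical, invented only to show the arithmetic, and the function name `wald_iv_risk_difference` is an illustrative choice rather than anything from the lecture.

```python
def wald_iv_risk_difference(cases_i1, n_i1, cases_i0, n_i0, exposed_i1, exposed_i0):
    """Return (RD_ID, RD_IE, RD_ID / RD_IE) for a binary instrument I.

    RD_ID: risk difference in the outcome D between instrument groups (the ITT effect).
    RD_IE: difference in the proportion exposed between instrument groups.
    """
    rd_id = cases_i1 / n_i1 - cases_i0 / n_i0
    rd_ie = exposed_i1 / n_i1 - exposed_i0 / n_i0
    if rd_ie == 0:
        raise ValueError("Instrument does not predict exposure; the ratio is undefined.")
    return rd_id, rd_ie, rd_id / rd_ie


# Hypothetical aspirin trial (numbers are made up for illustration):
# 1000 assigned to aspirin, 800 of whom take it, 40 first MIs;
# 1000 assigned to placebo, 100 of whom take aspirin anyway, 54 first MIs.
rd_id, rd_ie, iv_rd = wald_iv_risk_difference(40, 1000, 54, 1000, 800, 100)
print(round(rd_id, 4), round(rd_ie, 4), round(iv_rd, 4))
```

Because the denominator is below 1 whenever compliance is imperfect, the adjusted estimate is always further from the null than the ITT estimate. A denominator near zero (a weak instrument) inflates the estimate, and any bias in it, dramatically.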

Too good to be true? Maybe
The assumptions needed for an instrument are untestable from the data
– Can only determine if I is associated with A
Failure to meet the assumptions can cause strong bias
– Particularly if we have a “weak” instrument

Approach 2: Propensity Scores

Comes out of a world of large datasets (Health insurance data) Cases where we have a small (relative to the size of the dataset) exposed population and lots and lots of potential comparisons in the unexposed group – And lots of covariate data to adjust for Then we have luxury of deciding who to include in study as a comparison group based on a counterfactual definition

Propensity Score Model each subject’s propensity to receive the index condition as a function of confounders – Model is independent of outcomes, so good for rare disease, common exposure Use the propensity score to balance assignment to index or reference by: – Matching – Stratification – Modeling

Propensity Scores
The propensity score for subject i is:
– Probability of being assigned to treatment A = 1 vs. reference A = 0 given a vector $x_i$ of observed covariates: $e(x_i) = \Pr(A_i = 1 \mid X_i = x_i)$
In other words, the propensity score is:
– Probability that the person got the exposure given anything else we know about them

Why estimate the probability a subject receives a certain treatment when it is known what treatment they received?

How Propensity Scores Work
Quasi-experiment
– Using the probability a subject would have been treated (propensity score) to adjust the estimate of the treatment effect, we simulate an RCT
Two subjects with equal propensity, one E+, one E-
– We can think of these two as “randomly assigned” to groups, since they have the same probability of being treated, given their covariates
– Assumes we have enough observed data that within levels of propensity E is truly random

Propensity Scores: Smoking and Colon Cancer Have info on people’s covariates: – Alcohol use, sex, weight, age, exercise, etc: Person A is a smoker, B is not – Both had 85% predicted probability of smoking If “propensity” to smoke is same, only difference is 1 smoked and 1 didn’t – This is essentially what randomization does – B is the counterfactual for A assuming a correct model for predicting smoking

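Obtaining Propensity Scores in SAS
Calculate propensity score
proc logistic descending; /* descending: model Pr(exposure = 1), assuming 0/1 coding */
  model exposure = cov_1 cov_2 … cov_n;
  output out = pscoredat pred = pscore;
run;
Either match subjects on propensity score or adjust for propensity score
proc logistic data = pscoredat descending; /* use the dataset that now holds pscore */
  model outcome = exposure pscore;
run;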
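For intuition about what the first model produces, here is a minimal sketch in Python rather than SAS. It assumes only a handful of categorical covariates, so the propensity score can be estimated without a regression model as the proportion exposed within each covariate pattern (a saturated model). The records are hypothetical and illustrative only, and the names `records`, `pscore` and `no_overlap` are not from the lecture.

```python
from collections import Counter

# Hypothetical records, illustrative only: (covariate pattern, smoker 0/1).
records = (
    [(("male", "drinker"), 1)] * 17 + [(("male", "drinker"), 0)] * 3
    + [(("female", "nondrinker"), 1)] * 2 + [(("female", "nondrinker"), 0)] * 18
    + [(("male", "nondrinker"), 1)] * 5
)

# Count everyone, and the exposed, within each covariate pattern.
total = Counter(pattern for pattern, _ in records)
exposed = Counter(pattern for pattern, smoker in records if smoker == 1)

# Propensity score: Pr(exposed | covariate pattern).
pscore = {pattern: exposed[pattern] / n for pattern, n in total.items()}

# Patterns where everyone (or no one) is exposed have no counterfactual comparison.
no_overlap = [pattern for pattern, score in pscore.items() if score in (0.0, 1.0)]

for pattern, score in pscore.items():
    print(pattern, round(score, 2))
print("no overlap:", no_overlap)
```

In this made-up data a smoker and a non-smoker in the first pattern share a propensity of 0.85, as in the smoking example. The last pattern has no unexposed members, so no conclusion should be drawn there.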

Pros and Cons of PS Pros – Adjustment for 1 confounder – Allows estimation of the exposure and fitting a final model without ever seeing the outcome – Allows us to see parts of data we really should not be drawing conclusions on b/c no counterfactual Cons – Only works if have good overlap in pscores – Does not fix conditioning on a collider problem – Doesn’t deal with unmeasured confounders

Study of effect of neighborhood segregation on IMR

Approach 3: Marginal Structural Models

Time Dependent Confounding Time dependent confounding: 1) Time dependent covariate that is a risk factor for or predictive of the outcome and also predicts subsequent exposure Problematic if also: 2) Past exposure history predicts subsequent level of covariate

Example Observational study of subjects infected with HIV – E = HAART therapy – D = All cause mortality – C = CD4 count

Time Dependent Confounding
[Diagram: two DAGs over nodes A0, A1, C0, C1, D, labeled 1) and 2)]

Failure of Traditional Methods
Want to estimate causal effect of A on D
– Can’t stratify on C (it’s an intermediate)
– Can’t ignore C (it’s a confounder)
Solution: rather than stratify, weight
– Equivalent to standardization
Create pseudo-population where $RR_{CE} = 1$
– Weight each person by the “inverse probability of treatment” they actually received
– Weighting doesn’t cause the problems pooling did
– In the DAG, remove the arrow from C to A; don’t box

Remember back to the SMR

         Crude           C1              C0
         E+      E-      E+      E-      E+      E-
D+       350     70      300     20      50      50
D-       1650    1130    1200    180     450     950
Total    2000    1200    1500    200     500     1000
Risk     0.18    0.06    0.2     0.1     0.1     0.05
RR       3.0             2.0             2.0

The SMR asks, what if the exposed had also been unexposed?
Keep the exposed as observed (E+) and fill the E- column with the counts the exposed would have had at the unexposed risk in each stratum:

SMR      Crude           C1              C0
         E+      E-      E+      E-      E+      E-
D+       350     175     300     150     50      25
D-       1650    1825    1200    1350    450     475
Total    2000    2000    1500    1500    500     500
Risk     0.175   0.0875  0.2     0.1     0.1     0.05
RR       2.0             2.0             2.0

Crude now equals the adjusted. No need to adjust.
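The SMR arithmetic on this slide can be reproduced directly. This is a minimal sketch in Python using the slide's stratum counts; the dictionary layout and variable names are illustrative choices.

```python
# Stratum counts from the slide: (cases, total) for exposed and unexposed.
strata = {
    "C1": {"exposed": (300, 1500), "unexposed": (20, 200)},
    "C0": {"exposed": (50, 500), "unexposed": (50, 1000)},
}

# Observed cases among the exposed.
observed = sum(s["exposed"][0] for s in strata.values())

# Expected cases if the exposed had experienced the unexposed risk in each stratum.
expected = sum(
    s["exposed"][1] * s["unexposed"][0] / s["unexposed"][1]
    for s in strata.values()
)

smr = observed / expected
print(observed, expected, smr)
```

The expected count is the sum of the E- cases in the stratified SMR table (150 + 25), so the ratio matches the standardized crude RR of 2.0.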

Could also ask, what if everyone was both exposed, unexposed?
Give both columns the full stratum total (1700 in C1, 1500 in C0) and apply each stratum's exposed and unexposed risks:

         Crude           C1              C0
         E+      E-      E+      E-      E+      E-
D+       490     245     340     170     150     75
D-       2710    2955    1360    1530    1350    1425
Total    3200    3200    1700    1700    1500    1500
Risk     0.153   0.077   0.2     0.1     0.1     0.05
RR       2.0             2.0             2.0

What is Inverse Probability Weighting (IPW)?
Weight each subject by the inverse probability of the treatment received
Probability of treatment is:
– p(receiving treatment received | covariates)
– Adjust # of E+ and E- subjects in C strata
Weighting breaks E-C link only
– Now Marginal (Crude) = Causal Effect
But that’s what we just did

Calculate the weights
Calculate p(receiving treatment received | C)
For C=1, E=1
– PT = 1500/1700 = 0.88
– IPTW = 1/0.88 = 1.13
For C=1, E=0
– PT = 200/1700 = 0.12
– IPTW = 1/0.12 = 8.50

         C1              C0
         E+      E-      E+      E-
Total    1500    200     500     1000
PT       0.88    0.12    0.33    0.67
IPTW     1.13    8.50    3.00    1.50

Multiply cell number by the weights

Apply the weights
Pseudo population

         C1              C0
         E+      E-      E+      E-
D+       340     170     150     75
D-       1360    1530    1350    1425
Total    1700    1700    1500    1500
Risk     0.2     0.1     0.1     0.05
RR       2.0             2.0

Collapse
Pseudo population

         Crude
         E+      E-
D+       490     245
D-       2710    2955
Total    3200    3200
Risk     0.153   0.077
RR       2.0

Broke link between C and E without stratification, so no problem of conditioning on collider
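The weighting and collapsing steps can be traced end to end with the slide's counts. This is a minimal sketch in Python, not the SAS approach cited later in the deck; the names `cells`, `weights`, `pseudo` and `marginal_risk` are illustrative.

```python
# Counts from the slide, keyed by (C, E): (cases, total).
cells = {
    (1, 1): (300, 1500), (1, 0): (20, 200),
    (0, 1): (50, 500), (0, 0): (50, 1000),
}

# Stratum sizes: 1700 for C=1 and 1500 for C=0.
stratum_n = {c: sum(n for (ci, _), (_, n) in cells.items() if ci == c) for c in (1, 0)}

# IPTW = 1 / p(treatment received | C) = stratum size / cell size.
weights = {(c, e): stratum_n[c] / n for (c, e), (_, n) in cells.items()}

# Pseudo-population: multiply every cell count by its weight.
pseudo = {key: (d * weights[key], n * weights[key]) for key, (d, n) in cells.items()}


def marginal_risk(e):
    """Collapse over C in the pseudo-population and return the risk for exposure level e."""
    cases = sum(pseudo[(c, e)][0] for c in (1, 0))
    total = sum(pseudo[(c, e)][1] for c in (1, 0))
    return cases / total


rr = marginal_risk(1) / marginal_risk(0)
print({k: round(w, 2) for k, w in weights.items()})
print(round(marginal_risk(1), 3), round(marginal_risk(0), 3), round(rr, 2))
```

The weighted counts match the "everyone both exposed and unexposed" standardization, so the collapsed RR in the pseudo-population equals the stratum-specific RR of 2.0.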

Pseudo-population
The “pseudo-population” breaks the link between the confounder and the exposure without stratification
– Note this is different from stratifying
– Create a standard population without confounding
By creating multiple copies of people, standard errors will be biased
– Use robust standard errors to adjust

Robins and Hernán “The IPTW method effectively simulates the data that would be observed had, contrary to fact, exposure been conditionally randomized”

Time Dependent Confounding
Extend method to time dependent confounders
– Predict p(receiving treatment actually received at time $t_1$ | covariates, treatment at $t_0$)
Probability of treatment through $t_1$ is:
p(receiving treatment received at $t_0$) * p(receiving treatment received at $t_1$)
See Hernán for SAS code (not hard): scwgt command, robust SE (repeated statement)
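The product rule above can be written as a short function. This is a minimal sketch in Python with hypothetical per-interval probabilities. In practice each probability would come from a fitted treatment model, which is not shown here, and the function name `iptw` is an illustrative choice.

```python
from math import prod


def iptw(prob_received_by_interval):
    """Inverse probability of the treatment history actually received.

    Each entry is p(receiving the treatment actually received at that time |
    covariates and treatment history up to that time).
    """
    p_history = prod(prob_received_by_interval)
    if p_history <= 0:
        raise ValueError("Treatment history has zero estimated probability.")
    return 1 / p_history


# Hypothetical subject: probability 0.8 of the treatment received at t0,
# then 0.5 of the treatment received at t1 given covariates and treatment at t0.
weight = iptw([0.8, 0.5])
print(weight)
```

Small probabilities at any one time point multiply through, which is why weights can become very large over long follow-up.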

Time Dependent Confounding
[Diagram: two DAGs over nodes E0, E1, C0, C1, D: 1) Before IPTW, 2) After IPTW]

Limitations of MSMs
Very sensitive to weights
Still need to be able to predict the exposure
– The method solves the structural problem, but we still need the data to be able to accurately predict exposure
Still have to get the model right
