Propensity Scores October 2012 Alexander M. Walker MD, DrPH Extensive parts of this presentation incorporate the work of John D. Seeger, PharmD, DrPH.


“In mathematics, you don't understand things. You just get used to them.” - John von Neumann

Research Goal
- Compare two treatments with respect to a health or economic outcome
- “Counterfactual” ideal: if the same people had received B instead of A, how would their outcomes have differed?
- What is achievable: “similar”, not “same”. Comparable treatment groups … insofar as you can tell!

Pictures for Confounding

Comparison of Heterogeneous Groups
[Figure: groups E1 and E2]

Internal Composition May Differ
[Figure: groups E1 and E2]

Risks that Depend on Subgroup Status
[Figure: groups E1 and E2, with affected individuals at 50% and 15% in the subgroups of each]

Internal Risk Factor Heterogeneity Creates Differences in Group Risk
These differences in risk are due to the covariate structure of the compared populations, not to the differential effects of E1 and E2.
[Figure: groups E1 and E2]
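As a small worked example of the point above, the sketch below gives both groups the same subgroup risks (the 50% and 15% shown in the figure) and changes only the subgroup mix. The compositions (70/30 and 30/70) and the names `RISK`, `crude_risk`, `e1`, and `e2` are assumptions chosen for illustration; they are not from the presentation.

```python
# Subgroup risks as shown on the slide: 50% and 15%.
RISK = {"high": 0.50, "low": 0.15}

def crude_risk(composition):
    """Overall risk of a group, given the share of each subgroup (shares sum to 1)."""
    return sum(share * RISK[subgroup] for subgroup, share in composition.items())

# Assumed compositions (hypothetical): E1 is mostly high-risk, E2 mostly low-risk.
e1 = {"high": 0.70, "low": 0.30}
e2 = {"high": 0.30, "low": 0.70}

risk_e1 = crude_risk(e1)
risk_e2 = crude_risk(e2)

# The treatments have no effect in this toy setup, yet the crude risks differ.
print(f"E1 crude risk: {risk_e1:.3f}")
print(f"E2 crude risk: {risk_e2:.3f}")
print(f"Crude difference: {risk_e1 - risk_e2:.3f}")
```

The whole crude difference here comes from composition, which is the sense in which covariate structure, rather than treatment, creates the difference in group risk.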

Propensity Scores to Create Populations with Similar Covariate Structure

Covariate Heterogeneity
E1 has more Yellow; E2 has more Gray.
[Figure: groups E1 and E2]

Covariate Status as a Predictor of Treatment
Gray predicts E2; Yellow predicts E1.
[Figure: groups E1 and E2]

Propensity Scores
- PS is the predicted probability of treatment, given all the covariates
- Matching on the PS creates study populations that have balance on the covariates
- Perfect for a single, dichotomous covariate
- Not perfect, but very good for multiple covariates

Propensity for Covariate Patterns
Think of orange and green as two distinct covariate patterns that have the same predicted Pr(E1): Pr(E1) = x.
[Figure: groups E1 and E2]

Conditioning on Propensity Permits Unconfounded Comparisons
Gathering subjects with identical propensity, Pr(E1) = x, puts all individuals with covariate patterns orange and green into the same stratum. At a given propensity level, there is no association between treatment and covariate patterns.
[Figure: groups E1 and E2]
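A numeric sketch of the same idea: two covariate patterns with the same propensity are pooled into one stratum, and the share of each pattern is then identical in the two treatment groups. The counts below and the helper names `propensity` and `share_of_pattern` are hypothetical, picked so that both patterns have Pr(E1) = 0.3.

```python
# Hypothetical counts within one propensity stratum (not from the presentation).
stratum = {
    "orange": {"E1": 60, "E2": 140},
    "green": {"E1": 30, "E2": 70},
}

def propensity(counts):
    """Observed Pr(E1) for one covariate pattern."""
    return counts["E1"] / (counts["E1"] + counts["E2"])

def share_of_pattern(pattern, treatment):
    """Fraction of a treatment group that carries the given covariate pattern."""
    total = sum(counts[treatment] for counts in stratum.values())
    return stratum[pattern][treatment] / total

for pattern, counts in stratum.items():
    print(pattern, "Pr(E1) =", round(propensity(counts), 3))

orange_in_e1 = share_of_pattern("orange", "E1")
orange_in_e2 = share_of_pattern("orange", "E2")
print("share orange in E1:", round(orange_in_e1, 3))
print("share orange in E2:", round(orange_in_e2, 3))
```

Because both patterns have the same propensity, knowing the treatment says nothing about which pattern a subject has, which is the "no association" statement on the slide.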

Formal Expression
Propensity(x) ≡ P(T=1|x) = E(T|x)
The propensity associated with level x of the covariate X is the probability that treatment is present (equivalently, is “B” as opposed to “A”), given level x. This is in turn equal to the expected value of treatment, given x.
Note that the definition does not specify the parametric form of Propensity(x). The examples in this talk use a logistic function; others, including nonparametric functions, are also used.
Notation: a single capital letter denotes a variable; a single lowercase letter denotes a particular value of that variable.

Probability Calculus
Under propensity matching, how do x (covariate status) and t (treatment status) relate to one another?
1. Pr( x, t | p ) = Pr( x | p ) Pr( t | x, p )   (probability theory)
2. Pr( t | x, p ) = Pr( t | p )   (p incorporates all information about t that is in x)
Therefore: Pr( x, t | p ) = Pr( x | p ) Pr( t | p )

Pr( x, t | p ) = Pr( x | p ) Pr( t | p )
Given a particular value of the propensity score, that is, at P = p, the covariates X and the treatment T are independent, and hence uncorrelated. This holds at each level of P individually and therefore collectively as well (“conditionally on P”), so X cannot confound the association between T and any outcome.
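The step "p incorporates all information about t that is in x" can be spelled out. The following is a sketch of the standard argument, writing e(x) for the propensity function; the symbol e is introduced here for illustration and is not used on the slides.

```latex
\begin{align*}
e(x) &\equiv \Pr(T = 1 \mid X = x), \qquad p = e(x) \\
\Pr(T = 1 \mid X = x,\, P = p) &= \Pr(T = 1 \mid X = x) = p
  && \text{($p$ is a function of $x$)} \\
\Pr(T = 1 \mid P = p) &= E\big[\, E[T \mid X] \,\big|\, e(X) = p \,\big]
  = E\big[\, e(X) \,\big|\, e(X) = p \,\big] = p \\
\Rightarrow \quad \Pr(t \mid x, p) &= \Pr(t \mid p) \\
\Pr(x, t \mid p) &= \Pr(x \mid p)\, \Pr(t \mid x, p) = \Pr(x \mid p)\, \Pr(t \mid p)
\end{align*}
```

Both conditional probabilities of treatment equal p, so once p is known, x adds nothing further about t.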

Matching on Propensity Scores

Propensity Matching: Method
- Identify candidate predictors of treatment B v A
- Perform a logistic regression of B v A
- Obtain from the regression a “predicted” probability of B v A
- Sort all members of A and B according to this propensity
- Match A patients to B patients on the propensity
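The last two steps can be sketched as a greedy 1:1 nearest-neighbour match without replacement. The presentation does not say which matching algorithm was used, so the greedy rule, the caliper of 0.05, the function name `greedy_match`, and the subject IDs and scores below are all assumptions made for illustration.

```python
def greedy_match(treated, controls, caliper=0.05):
    """1:1 nearest-neighbour matching on the propensity score, without replacement.

    treated, controls: lists of (subject_id, propensity) tuples.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = sorted(controls, key=lambda c: c[1])
    pairs = []
    for t_id, t_ps in sorted(treated, key=lambda t: t[1]):
        if not available:
            break
        # Closest remaining control on the propensity scale.
        best = min(available, key=lambda c: abs(c[1] - t_ps))
        # Accept the pair only if it is within the caliper.
        if abs(best[1] - t_ps) <= caliper:
            pairs.append((t_id, best[0]))
            available.remove(best)
    return pairs

# Hypothetical subjects: (ID, propensity for B).
group_b = [("b1", 0.62), ("b2", 0.35), ("b3", 0.90)]
group_a = [("a1", 0.30), ("a2", 0.36), ("a3", 0.60), ("a4", 0.10), ("a5", 0.65)]

pairs = greedy_match(group_b, group_a, caliper=0.05)
print(pairs)
```

In this toy data, b3 has no A patient within the caliper and stays unmatched; dropping such subjects is the price of comparability in the matched cohort.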

Duragesic and Long-Acting Opioids

                                   Duragesic   LA Opioids
N                                  (garbled in the transcript as “5042,”)
[Age category, label lost] years   29%         10%
Male                               35%         49%
Periph Vasc Disease                4%          1%
Sx of Abd or Pelvis                18%         10%
> 2 hospitalizations, 6 mo         9%          3%
30 days NonRx Costs                $1,136      $746

Straightforward Regression

proc logistic data = mother.propensity2 descending;
  model DuragesicUser =
        DischCostIndex EncCostIndex RxCostIndex OtherCostIndex
        RxCostPrior1 OtherCostPrior1 AnyRx
        OneDisch TwoDisch ThreePlusDisch
        AnyICD443 AnyICD719 AnyICD724 AnyICD787 AnyICD789
        q3_95_new q4_95 q1_96 q2_96 q3_96 q4_96
        q1_97 q2_97 q3_97 q4_97 q1_98 q2_98 q3_98 q4_98
        hmo men young old
        / rl;
  where enrbaseflag = 1 and validindex = 1 and sameday = 0
        and medicare = 0 and malignant = 0;
  output out = mother.propensity3 p = score;
run;

Propensity Output
[Table: columns Obs, PATIENT, score; the values are not preserved in the transcript]

Matching on Propensity
Choose from E2 a sample that matches E1 in size.
[Figure: at Pr(E1) = x, group E1 alongside E2 (sample) and E2 (residual)]

Matching on Propensity
At every level of propensity in the constructed cohorts, Pr(E1) = 0.5. Therefore, treatment is uncorrelated with propensity. You can collapse all the propensity-matched groups together to form a cohort in which all covariate patterns are uncorrelated with treatment, and there will be no confounding bias from those covariates.
[Figure: E1 and E2 with Pr(E1) = 0.5]

[Figure: propensity strata I, II, III, IV, V]

Duragesic and Long-Acting Opioids

                                   Duragesic   LA Opioids
N                                  (garbled in the transcript as “5042,”)
[Age category, label lost] years   29%         10%
Male                               35%         49%
Periph Vasc Disease                4%          1%
Sx of Abd or Pelvis                18%         10%
> 2 hospitalizations, 6 mo         9%          3%
30 days NonRx Costs                $1,136      $746

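Propensity-Matched Cohorts

                                   Duragesic   LA Opioids
N                                  (not preserved in the transcript)
[Age category, label lost] years   26%         25%
Male                               36%         33%
Periph Vasc Disease                4%          3%
Sx of Abd or Pelvis                17%         18%
> 2 hospitalizations, 6 mo         8% (only one value preserved)
30 days NonRx Costs                $1,084      $1,043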

Pharmacoepidemiol Drug Saf Jul;14(7):

Do Statins Affect Risk of AMI?
- The purpose of the study was to assess whether statins affect the risk of acute myocardial infarction (AMI)
- There are strong predictors of statin use that also affect the risk of AMI
- How to design an observational study?
- Note: we would not ordinarily use observational data for efficacy questions, but this serves as a suitable test case because there is a known gold standard

Good Clinical Practice Creates Confounding
NCEP ATP II guidelines (1993)

Risk Category                 LDL to initiate drug Tx   LDL goal of drug Tx
No CHD and <2 Risk Factors    ≥190                      <160
No CHD and ≥2 Risk Factors    ≥160                      <130
With CHD                      >130                      ≤100

+Risk Factors: age (45 M, 55 F), diabetes, smoking, HTN, low HDL, family history of premature CHD
-Risk Factor: high HDL
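To see why following the guideline creates confounding, the table can be read as a treatment rule. The sketch below is an illustration only: the function name `drug_indicated` is hypothetical, and reading the comparison symbols that were lost from the transcript as "at or above" for patients without CHD is an assumption.

```python
def drug_indicated(ldl, has_chd, n_risk_factors):
    """Whether the slide's NCEP ATP II table would start drug treatment.

    ldl is in mg/dL. The >= comparisons for patients without CHD are an
    assumed reading of symbols dropped from the transcript.
    """
    if has_chd:
        return ldl > 130  # the slide prints ">130" for patients with CHD
    threshold = 160 if n_risk_factors >= 2 else 190
    return ldl >= threshold

# Three hypothetical patients with the same LDL of 170 mg/dL.
low_risk = drug_indicated(170, has_chd=False, n_risk_factors=0)
multi_risk = drug_indicated(170, has_chd=False, n_risk_factors=2)
with_chd = drug_indicated(170, has_chd=True, n_risk_factors=0)
print(low_risk, multi_risk, with_chd)
```

At the same LDL level, only the patients at higher baseline risk of AMI are started on a drug, so treated patients are systematically the higher-risk ones.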

Gold Standard for the Effect of Statins CARE Trial Results Sacks FM, et al N Engl J Med. 1996;335:1001-9

Data Source
Fallon Community Health Plan
- Central Massachusetts HMO
- ~200,000 members
Claims data available on:
- Enrollment (age, sex, date)
- Ambulatory care visits
- Hospitalization
- Pharmacy dispensings (drug & quantity)
- Laboratory tests (tests & results)

Patient Entry, Analytic Sequence of 9 Blocks
1) Apply eligibility criteria
   - FCHP member for at least 1 year
   - At least one physician visit in last year
   - LDL, HDL, TG levels in last 6 months
   - At least one physician visit in cohort accrual block
   - No PAD diagnosis before index date
   - Not current statin user
2) Estimate propensity score (statin initiation)
3) Match statin initiators with non-initiators
4) Repeat for all blocks of time
5) Follow matched groups for diagnosis of MI
[Figure: ~35,000 members; all Fallon members with any LDL > 130 mg/dl; require 1 year enrollment]

Propensity Score Matching: Month of 1/1/94
Total subjects in cohort (36,050)
- Current Statin Users (1,501)
- Statin Initiators, Eligible (77)
- Statin Initiators, Not Eligible (34)
- Non Statin Users, Eligible (9,639)
- Non Statin Users, Not Eligible (24,799)

“Typical” Statin Initiator and Non-Initiator

MI Outcome (Unmatched)
HR = 2.11: 111% (46%-204%) risk increase
[Figure: cumulative incidence of MI by months of follow-up, statin initiators vs. statin non-initiators]

Calculate Propensity Score
- Predict treatment: statin initiation vs. not, in each 6-month period of cohort accrual
- Using baseline covariates
- Obtain fitted value from regression
- Fitted value is the propensity score
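A minimal sketch of "fitted value is the propensity score", using only the Python standard library. The study itself used SAS PROC LOGISTIC with many baseline covariates; here the data are hypothetical, there is a single dichotomous covariate, and the fitting routine (`fit_logistic`, plain gradient ascent) is a simple stand-in rather than the study's method.

```python
import math

def predict(beta, x):
    """Fitted probability of treatment for covariate vector x."""
    z = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, treated, steps=5000, lr=0.5):
    """Fit a logistic model by plain gradient ascent on the mean log-likelihood."""
    beta = [0.0] * (len(rows[0]) + 1)  # beta[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, t in zip(rows, treated):
            err = t - predict(beta, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical baseline data: one dichotomous covariate (say, prior CHD).
# Without the covariate, 2 of 8 subjects start a statin; with it, 6 of 8 do.
rows = [[0]] * 8 + [[1]] * 8
statin = [1, 1, 0, 0, 0, 0, 0, 0] + [1, 1, 1, 1, 1, 1, 0, 0]

beta = fit_logistic(rows, statin)
propscore = [predict(beta, x) for x in rows]
print(round(propscore[0], 3), round(propscore[-1], 3))
```

With one dichotomous covariate the fitted values should approach the observed treatment proportions in each covariate level (0.25 and 0.75 here), which is the sense in which the propensity score is "perfect for a single, dichotomous covariate".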

Construct Rich Model
- More than 8 events per covariate leads to unbiased estimates
- Many more persons exposed to drug of interest than study outcomes
- In drug safety studies, usually the outcome is rare
- Therefore can control for more covariates when exposure is the dependent variable than when outcome is
Cepeda S, et al. Am J Epidemiol 2003;158:
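The arithmetic behind this argument can be sketched as follows. The counts (400 exposed subjects, 40 outcomes) are hypothetical, the helper `max_covariates` is introduced only for illustration, and treating the number of exposed subjects as the "events" of the exposure model is an assumption about how the rule is applied.

```python
def max_covariates(n_events, events_per_covariate=8):
    """Largest number of covariates that keeps strictly more than
    `events_per_covariate` events per covariate."""
    return max((n_events - 1) // events_per_covariate, 0)

# Hypothetical drug safety study: exposure is common, the outcome is rare.
n_exposed = 400
n_outcomes = 40

covariates_in_exposure_model = max_covariates(n_exposed)
covariates_in_outcome_model = max_covariates(n_outcomes)
print("propensity (exposure) model:", covariates_in_exposure_model)
print("outcome model:", covariates_in_outcome_model)
```

Under these assumed counts the exposure model can carry roughly ten times as many covariates as an outcome model, which is the motivation for building a rich propensity model.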

*build model for 9501;
proc logistic descending data=new1;
  model statin =
        male smok obes age9501 ang9501 usa9501 chf9501 isch9501
        ath9501 cva9501 usa9501 mi9501 olmi9501 htn9501 tia9501
        afib9501 ascv9501 hth9501 ost9501 cvs9501 htdx9501 circ9501
        cond9501 rvsc9501 hhd9501 dysr9501 hrt9501 ns9501 ins9501
        diab9501 skca9501 depr9501 adj9501 schz9501 deb9501 rheu9501
        days9501 lres9501 tres9501 hres9501 hbac9501 cvhp9501 ekg9501
        cvrx9501 cvvs9501 llab9501 lab9501 cvdg9501 hosp9501 rx9501
        vist9501 diag9501;
  output out=psmodel pred=PROPSCORE;
run;

Propensity Regression Parameter Estimates

Output File – Propensity Scores
[Table: columns Obs, ID, STATIN, PROPSCORE; the values are not preserved in the transcript]

Balance Achieved by Matching
Only 1 of 52 variables significantly different at P<0.05
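A balance check of this kind can be sketched with a two-sided test of proportions for each covariate. The presentation does not say which test was used, so the pooled z-test below is an assumption (a paired test may be more appropriate for matched pairs), and the covariate names and counts are hypothetical.

```python
import math

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical matched cohorts of 500 each:
# covariate -> (count in initiators, count in non-initiators).
counts = {"male": (180, 165), "diabetes": (60, 58), "smoker": (120, 85)}
n_per_group = 500

p_values = {
    name: two_proportion_p_value(a, n_per_group, b, n_per_group)
    for name, (a, b) in counts.items()
}
imbalanced = [name for name, p in p_values.items() if p < 0.05]
print("imbalanced at P<0.05:", imbalanced)

# With 52 covariates tested at 0.05, some differences are expected by chance.
expected_by_chance = 52 * 0.05
print("expected by chance among 52 tests:", round(expected_by_chance, 1))
```

Seen this way, 1 significant difference among 52 is fewer than the 2.6 that chance alone would be expected to produce under perfect balance.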

MI Outcome (After Matching)
HR = 0.69: 31% (7%-48%) risk reduction
[Figure: cumulative incidence of MI by months of follow-up, statin initiators vs. statin non-initiators]
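The percentages on the two outcome slides follow directly from the hazard ratios. The sketch below shows the conversion in both directions; the function names are hypothetical, and the interval endpoints on the HR scale are back-calculated from the percentages shown, not read from the slides.

```python
def percent_change_in_risk(hazard_ratio):
    """Percent change in risk implied by a hazard ratio (negative means reduction)."""
    return (hazard_ratio - 1.0) * 100.0

def hazard_ratio_from_percent(percent_change):
    """Hazard ratio implied by a percent change in risk."""
    return 1.0 + percent_change / 100.0

unmatched = round(percent_change_in_risk(2.11))  # risk increase before matching
matched = round(percent_change_in_risk(0.69))    # risk reduction after matching
print(unmatched, matched)

# Intervals on the HR scale, back-calculated from the percentages on the slides.
unmatched_ci = [round(hazard_ratio_from_percent(p), 2) for p in (46, 204)]
matched_ci = [round(hazard_ratio_from_percent(p), 2) for p in (-48, -7)]
print(unmatched_ci, matched_ci)
```

Before matching the estimate points to harm and after matching to benefit, so the change in sign, not just in size, is what the matching accounts for.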

Interpreting Propensity Coefficients

When Is the Model Sufficient?

Early Matching Results

New Variables Suggested post hoc for the Propensity Score
Cardiac Disease
- Cardiovascular: diagnoses, hospitalizations, outpatient visits, medications
- EKGs
- Number of labs
- Number of lipid labs
Other Causes of “Medicalization”
- Schizophrenia
- Adjustment Disorder
- Depression
- Non-Skin CA
- Skin CA
- Debility
- Rheumatic Disease

Imbalance on Non-Included Variables

Non-Included Variables (NIVs) Are Predictors of Statin Initiation

New Ranking of Predictors

Balance on New Variables

Thank You!