May 24, 2000, NIDA/NIMH Substance Abuse Conference: Research Designs, Statistical Strategies for Dealing with Selection Bias in Treatment Delivery, and Limitations

Presentation transcript:

May 24, 2000, NIDA/NIMH Substance Abuse Conference

Slide 1: Research Designs, Statistical Strategies for Dealing with Selection Bias in Treatment Delivery, and Limitations
Naihua Duan, UCLA and RAND, May 2000
- Selection bias in treatment assignment/delivery
- Research designs
- Mitigating overt selection bias
- Dealing with hidden selection bias
- Discussions

Slide 2: Selection Bias in Treatment Delivery
- In naturalistic settings:
  - Pre-treatment health → treatment delivered
  - Pre-treatment health → outcome
- Treated group dissimilar from untreated group
- Direct comparison of treated vs. untreated results in a biased estimate of the treatment effect
- Need to mitigate selection bias in order to assess the treatment effect more appropriately
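
To make the selection mechanism on this slide concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not taken from the talk: the function name `simulate`, the 0.8/0.2 treatment probabilities, the coefficient 2.0 on health, and the true effect of 1.0.

```python
import random

def simulate(n=20000, true_effect=1.0, seed=0):
    """Naive treated-vs-untreated difference when health drives both
    treatment delivery and outcome (all numbers are illustrative)."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        health = rng.gauss(0.0, 1.0)              # pre-treatment health
        p_treat = 0.8 if health > 0 else 0.2      # health -> treatment delivered
        t = 1 if rng.random() < p_treat else 0
        # health -> outcome, plus the true treatment effect and noise
        y = 2.0 * health + true_effect * t + rng.gauss(0.0, 1.0)
        (treated if t else untreated).append(y)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

naive_estimate = simulate()
print(f"true effect = 1.0, naive difference = {naive_estimate:.2f}")
```

Under these assumed numbers the naive difference lands well above the true effect, because the treated group starts out healthier.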

Slide 3: Selection Bias in Treatment Delivery: Typology
- Overt selection bias
  - Treatment related to covariates: $T \not\perp X$
  - Given covariates, treatment independent of outcome: $T \perp Y \mid X$ (ignorability)
  - Like a stratified randomized experiment
- Hidden selection bias
  - Given covariates, treatment still related to outcome: $T \not\perp Y \mid X$
- Rosenbaum (1995) Observational Studies, Springer-Verlag

Slide 4: Rubin Causal Model
Potential outcomes (only one of the two is observed for each unit):

Unit    Under treatment    Under control    Unit-level effect
1       $Y_1^T$            $Y_1^C$          $Y_1^T - Y_1^C$
...     ...                ...              ...
m       $Y_m^T$            $Y_m^C$          $Y_m^T - Y_m^C$
m+1     $Y_{m+1}^T$        $Y_{m+1}^C$      $Y_{m+1}^T - Y_{m+1}^C$
...     ...                ...              ...
m+n     $Y_{m+n}^T$        $Y_{m+n}^C$      $Y_{m+n}^T - Y_{m+n}^C$

- Challenging missing data problem
- Missing at random (ignorable missingness)

Slide 5: Research Designs
- Ideal randomized clinical trial (RCT)
- Imperfect RCT with noncompliance
- Randomized encouragement design (RED)
- Observational studies
- Settings: controlled vs. naturalistic
- Treatment assignment/delivery: mandated vs. choice
- Treated vs. untreated groups: balance vs. imbalance
- Research questions: efficacy vs. adoption, program effect, and efficacy
- Analytic strength: internal validity vs. external validity

Slide 6: Randomized Clinical Trial
- Intensive efforts made to mandate assignment

Slide 7: Randomized Encouragement Design
- Encouragement: training, providing information, case management, reducing barriers (child care, transportation, flexible hours, reducing co-payment, ...), decorating the waiting room, ...

Slide 8: Randomized Encouragement Design: Features
- Analogous to a marketing experiment
- Encouragement → higher adoption rate? → better overall outcomes? → better outcomes for new users?
- Naturalistic, incorporates user preferences, facilitates choice
- Broader participation, external validity, dissemination
- References: Zelen (1979 NEJM; 1990 Stat. in Medicine: randomized consent design); Holland (1988), in Clogg CC, ed., Sociological Methodology; Hirano et al. (2000, Biostatistics); Wells et al. (2000, JAMA); Duan et al. (2000, manuscript)

Slide 9: Mitigating Overt Selection Bias
- Assume overt selection bias: $T \not\perp X$
- Assume no hidden selection bias: $T \perp Y \mid X$
- Covariate adjustment through ANCOVA
- Stratification (through propensity score method)
- Matching (through propensity score method)

Slide 10: Covariate Adjustment
- $Y = \alpha + T\beta + X\gamma \; (+\, TX\delta) + \varepsilon$
- Extrapolation can be risky when imbalance is substantial
[Figure: Y plotted against X (pre-treatment health) for the groups T = 1 and T = 0]
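
The covariate-adjustment model on this slide can be sketched as an ordinary least-squares fit of Y on an intercept, T, and X (without the interaction term). The code below is an illustration under assumptions: the helper names `solve` and `fit_ancova`, the coefficient names, and the simulated data (true treatment effect 1.0, health coefficient 2.0) are all invented here, and the outcome model is deliberately linear so that the adjustment is correctly specified.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ancova(t, x, y):
    """Least-squares fit of y = alpha + beta*t + gamma*x; returns [alpha, beta, gamma]."""
    rows = [[1.0, ti, xi] for ti, xi in zip(t, x)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Illustrative data: healthier people are treated more often (assumed numbers).
rng = random.Random(1)
t, x, y = [], [], []
for _ in range(5000):
    health = rng.gauss(0.0, 1.0)
    treat = 1.0 if rng.random() < (0.8 if health > 0 else 0.2) else 0.0
    t.append(treat)
    x.append(health)
    y.append(2.0 * health + 1.0 * treat + rng.gauss(0.0, 1.0))

alpha_hat, beta_hat, gamma_hat = fit_ancova(t, x, y)
print(f"adjusted treatment effect = {beta_hat:.2f}")
```

The fit recovers the treatment coefficient here only because the simulated outcome really is linear in X; with a misspecified model or little overlap between groups, the same code would still return a number, which is the risk the slide warns about.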

Slide 11: Limitations for Covariate Adjustment
- Extrapolation can be risky when imbalance is substantial
  - Compares apples and oranges, relies on the model to adjust
  - Careful model diagnosis is essential
  - Multivariate imbalance might be more problematic
- Why so popular?
  - Ease of push-button analysis
  - Almost always gives an answer
  - Could be a bad answer!

Slide 12: Stratification When Covariate Is Univariate
- Stratify, then compare by stratum
- Compare apples and apples, oranges and oranges
[Figure: Y plotted against X (pre-treatment health) for the groups T = 1 and T = 0]

Slide 13: Stratification: Procedure
- Stratify, then compare treated vs. untreated by stratum
  - Two-sample comparison within each stratum
  - ANCOVA within each stratum
- Assess interactions across strata
- Synthesize treatment effects across strata
  - Weighted average
  - Overall intervention effect on treated
  - Overall intervention effect on untreated
  - Overall intervention effect on entire pool
- Can be specified as ANCOVA with interactions
- Nonparametric regression of Y on X, stratified by T
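
The stratify-then-synthesize procedure can be sketched for a single covariate as follows. This is an illustration, not the speaker's implementation: the function name `stratified_effect`, the use of equal-sized quantile strata, weighting by stratum size (which targets the effect on the entire pool), and the simulated data are all assumptions.

```python
import random

def stratified_effect(x, t, y, n_strata=5):
    """Stratify on quantiles of x, take the treated-minus-untreated
    difference in means within each stratum, and average the differences
    weighted by stratum size.

    Returns (estimate, skipped); skipped counts strata lacking either
    treated or untreated units, where no comparison is possible.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(order) // n_strata
    total, weight, skipped = 0.0, 0, 0
    for s in range(n_strata):
        lo = s * size
        hi = len(order) if s == n_strata - 1 else lo + size
        idx = order[lo:hi]
        y1 = [y[i] for i in idx if t[i] == 1]
        y0 = [y[i] for i in idx if t[i] == 0]
        if not y1 or not y0:
            skipped += 1          # severe imbalance: this stratum gives no answer
            continue
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        total += len(idx) * diff
        weight += len(idx)
    return (total / weight if weight else None), skipped

# Illustrative data with a true treatment effect of 1.0 (assumed numbers).
rng = random.Random(4)
x, t, y = [], [], []
for _ in range(10000):
    health = rng.gauss(0.0, 1.0)
    treat = 1 if rng.random() < (0.8 if health > 0 else 0.2) else 0
    x.append(health)
    t.append(treat)
    y.append(2.0 * health + 1.0 * treat + rng.gauss(0.0, 1.0))

estimate, skipped = stratified_effect(x, t, y)
print(f"stratified estimate = {estimate:.2f}, strata skipped = {skipped}")
```

Replacing the stratum-size weights with the number of treated (or untreated) units per stratum would give the corresponding effect on the treated (or untreated).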

Slide 14: Covariate Adjustment, Nonparametric Version
- OK for low-dimensional X
- Curse of dimensionality for high-dimensional X
[Figure: Y plotted against X (pre-treatment health) for the groups T = 1 and T = 0]

Slide 15: Stratification: Features
- Why not used as widely as ANCOVA?
  - Does not always give an answer
  - Provides warning where imbalance is too severe
  - Not a push-button operation, but not difficult
- How to stratify?
  - Clinical judgement
  - Usually not critical; sensitivity analysis recommended
  - Cochran-Rubin-Rosenbaum recommend 5 strata
- How to stratify with multi-dimensional covariates?
  - Curse of dimensionality
  - Use propensity score method to reduce dimensionality

Slide 16: Propensity Score Method
- Assume overt selection bias, no hidden selection bias: $T \perp Y \mid X$
- $\pi = \pi(X) = P(T = 1 \mid X)$ is the propensity score
  - Example: $\operatorname{logit}(\pi(X)) = \alpha + X\beta$
- $\pi(X)$ is a balancing score (most parsimonious): $T \perp X \mid \pi(X)$
- Given $\pi(X)$, treatment independent of outcome: $T \perp Y \mid \pi(X)$
- Need only stratify by propensity score
  - Other dimensions of X can be neglected in assessing treatment effect

Slide 17: Propensity Score Method: Procedure
- Estimate $\pi(X) = P(T = 1 \mid X)$
  - Logistic regression of T on X
- Stratify sample (X, T, and Y) by estimated $\pi(X)$ or $X\beta$
  - Sort out apples and oranges
- Analyze each stratum, compare treated vs. untreated
  - Two-sample comparison within stratum
  - ANCOVA within stratum
- Assess interactions across strata
- Synthesize treatment effects across strata
  - Weighted average...
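
The procedure above can be sketched end to end with two covariates. This is a minimal illustration under assumptions: the logistic model is fitted by plain gradient ascent (a stand-in for whatever logistic regression routine one would normally use), the helper names `fit_logistic` and `stratified_effect` are hypothetical, and the simulated data use an assumed true treatment effect of 1.0.

```python
import math
import random

def fit_logistic(rows, t, steps=200, rate=1.0):
    """Fit logit P(T=1|X) = a + X b by gradient ascent on the mean log-likelihood."""
    coef = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(coef)
        for r, ti in zip(rows, t):
            lin = coef[0] + sum(c * v for c, v in zip(coef[1:], r))
            resid = ti - 1.0 / (1.0 + math.exp(-lin))
            grad[0] += resid
            for j, v in enumerate(r):
                grad[j + 1] += resid * v
        coef = [c + rate * g / n for c, g in zip(coef, grad)]
    return coef

def stratified_effect(score, t, y, n_strata=5):
    """Size-weighted average of within-stratum differences, strata = score quantiles."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    size = len(order) // n_strata
    total, weight = 0.0, 0
    for s in range(n_strata):
        hi = len(order) if s == n_strata - 1 else (s + 1) * size
        idx = order[s * size:hi]
        y1 = [y[i] for i in idx if t[i] == 1]
        y0 = [y[i] for i in idx if t[i] == 0]
        if y1 and y0:                       # skip strata with no comparison
            total += len(idx) * (sum(y1) / len(y1) - sum(y0) / len(y0))
            weight += len(idx)
    return total / weight

# Illustrative data: treatment and outcome both depend on x1 + x2.
rng = random.Random(2)
rows, t, y = [], [], []
for _ in range(3000):
    x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p = 1.0 / (1.0 + math.exp(-0.8 * (x1 + x2)))
    ti = 1 if rng.random() < p else 0
    rows.append([x1, x2])
    t.append(ti)
    y.append(x1 + x2 + 1.0 * ti + rng.gauss(0.0, 1.0))

coef = fit_logistic(rows, t)
# Only the ranking matters for stratification, so the linear predictor suffices.
score = [coef[0] + coef[1] * r[0] + coef[2] * r[1] for r in rows]
estimate = stratified_effect(score, t, y)
y1 = [yi for yi, ti in zip(y, t) if ti == 1]
y0 = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(y1) / len(y1) - sum(y0) / len(y0)
print(f"naive = {naive:.2f}, propensity-stratified = {estimate:.2f}")
```

With five strata some residual imbalance remains within each stratum, so the stratified estimate is closer to the assumed true effect than the naive one but not exactly equal to it.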

Slide 18: Propensity Score Method: Stratification for X
- Stratify, then compare by stratum
- Compare apples and apples, oranges and oranges
[Figure: covariate space with axes $X_1$ and $X_k$, with $X\beta$ marked]

Slide 19: Propensity Score Method: Stratification for Y
- Stratify, then compare by stratum
- Compare apples and apples, oranges and oranges
[Figure: Y plotted against $X\beta$ for the groups T = 0 and T = 1]

Slide 20: Propensity Score Method: Model Specification
- Specification of propensity score model
  - Lean towards over-fitting vs. under-fitting?
  - Model diagnosis: are the covariates balanced across treatment groups within each stratum?
  - Stratify by propensity score and key covariates (one or two)?
- Model misspecification less serious than ANCOVA?
  - Only rank of estimated propensity score is used
  - Stratification not sensitive to minor perturbations in model
  - Limited empirical evidence (Drake 1993 Biometrics, Dehejia and Wahba 1999 JASA)

Slide 21: Propensity Score Method: Options
- Stratification
- Matching (case-control)
  - Curse of dimensionality relevant, less critical
  - Mahalanobis distance matching
  - Match on propensity score (+ a few key covariates?)
  - Design stage vs. analysis stage
  - Primary vs. secondary data collection
- ANCOVA: regress Y on T and propensity score (+ a few key covariates? + interactions?)
  - Nonparametric regression? Stratified by T?
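
As an illustration of the matching option, the sketch below pairs each treated unit with the nearest not-yet-used untreated unit on a scalar score and averages the paired differences. Several things here are assumptions made for brevity: greedy 1:1 matching without replacement is only one of many matching schemes, the function name `greedy_match` is hypothetical, and the simulation matches on the true propensity score, whereas in practice the score would be estimated.

```python
import math
import random

def greedy_match(treated, controls):
    """Pair each treated (score, y) with the nearest unused control on the score.

    Returns a list of (y_treated, y_control) pairs.
    """
    available = list(controls)
    pairs = []
    for score_t, y_t in treated:
        if not available:
            break
        j = min(range(len(available)), key=lambda i: abs(available[i][0] - score_t))
        pairs.append((y_t, available.pop(j)[1]))
    return pairs

# Illustrative data: few treated, many controls, true effect 1.0 (assumed numbers).
rng = random.Random(3)
treated, controls = [], []
for _ in range(3000):
    x = rng.gauss(0.0, 1.0)
    score = 1.0 / (1.0 + math.exp(2.0 - x))       # P(T = 1 | x) = expit(-2 + x)
    is_treated = rng.random() < score
    y = 2.0 * x + (1.0 if is_treated else 0.0) + rng.gauss(0.0, 1.0)
    (treated if is_treated else controls).append((score, y))

pairs = greedy_match(treated, controls)
att = sum(yt - yc for yt, yc in pairs) / len(pairs)
naive = (sum(y for _, y in treated) / len(treated)
         - sum(y for _, y in controls) / len(controls))
print(f"naive = {naive:.2f}, matched estimate = {att:.2f}")
```

Because every treated unit is matched, the averaged difference targets the effect on the treated; matches in regions where treated units outnumber controls can be poor, which is worth checking before trusting the estimate.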

Slide 22: Dimension Reduction
- Fundamental challenge in ANCOVA
  - Valid assessment of treatment effect can be obtained using nonparametric regression of Y on X, stratified by T
  - Curse of dimensionality
  - No obvious way to reduce dimensionality?
- Propensity score method is an elegant way to reduce dimensionality
- Alternative dimension reduction methods?
  - Slicing regression (Duan and Li 1991 Annals of Statistics, Li 1991 JASA): use inverse regression of X on Y...

Slide 23: Propensity Score Method: References
- Rosenbaum and Rubin (1983 Biometrika, 1984 JASA)
- Lavori, Dawson, and Mueller (1994 Stat. in Medicine)
- Rosenbaum (1995) Observational Studies, Springer-Verlag
- Rubin (1997 Annals of Internal Medicine)
- D'Agostino (1998 Stat. in Medicine)
- Normand et al. (2000 manuscript)
- Hirano et al. (2000 manuscript)

Slide 24: Dealing with Hidden Selection Bias
- $T \not\perp Y \mid X$
- Very challenging problem, no easy solutions
  - Given X, how does treatment depend on outcome?
  - Overt selection bias can be made to look like stratified randomized experiment
  - Hidden selection bias cannot be made to…
- Rosenbaum-Rubin's sensitivity analysis
- Instrumental variable analysis a la Rubin Causal Model
- Selection modeling

Slide 25: Rosenbaum's Sensitivity Analysis: General Principle
- How robust is the observed treatment effect against hidden selection bias?
  - Analogous to pattern mixture model for missing data
- Formulate a family of plausible models for hidden selection bias (from mild to severe)
- Assess treatment effect under each model
- Determine how much hidden selection bias wipes out treatment effect
  - Is this much hidden selection bias realistic?
- Specificity analysis

Slide 26: Unobserved Confounder Model
- $\operatorname{logit}(\pi(X_i)) = \alpha + X_i\beta + \gamma U_i$, with $0 \le U_i \le 1$
- $\gamma > 0$: maximum impact of unobserved hidden bias
- $\Gamma = \exp(\gamma)$ is the upper bound on the odds ratio between the $\pi(X_i)$'s of units with the same X
- Example: 2 x 2 table (analyzed with Fisher's exact test)
  - Worst case scenario for hidden bias: unobserved health is a perfect predictor of survival, and healthy patients are more likely to receive treatment
  - $U_i = 1$ for all survivors; $U_i = 0$ for all deceased
  - Null distribution is a tilted hypergeometric distribution
  - Given $\Gamma$, derive the P-value under the tilted hypergeometric distribution
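
For the 2 x 2 example, the worst-case calculation can be sketched directly. Under the scenario on this slide, each assignment of treatment with k treated survivors is weighted by Γ^k, so the count of treated survivors follows a tilted (noncentral) hypergeometric distribution. The function name `sensitivity_p_value` and the table counts below are hypothetical, chosen only for illustration.

```python
from math import comb

def sensitivity_p_value(treated_survivors, n_treated, n_survivors, n_total, gamma):
    """Upper bound on the one-sided P-value for a 2 x 2 table when hidden
    bias can multiply the odds of treatment by at most gamma.

    gamma = 1 reproduces the one-sided Fisher exact test.
    """
    n_deceased = n_total - n_survivors
    lo = max(0, n_treated - n_deceased)
    hi = min(n_treated, n_survivors)
    # Unnormalized tilted hypergeometric weights for k treated survivors.
    weights = {
        k: comb(n_survivors, k) * comb(n_deceased, n_treated - k) * gamma ** k
        for k in range(lo, hi + 1)
    }
    tail = sum(w for k, w in weights.items() if k >= treated_survivors)
    return tail / sum(weights.values())

# Hypothetical table: 20 treated (16 survive), 20 untreated (8 survive).
p_by_gamma = {
    g: sensitivity_p_value(treated_survivors=16, n_treated=20,
                           n_survivors=24, n_total=40, gamma=g)
    for g in (1, 2, 4)
}
for g, p in p_by_gamma.items():
    print(f"Gamma = {g}: worst-case P-value = {p:.4f}")
```

Scanning Γ upward until the bound crosses the chosen significance level gives the amount of hidden bias that would wipe out the observed effect.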

Slide 27: Rosenbaum's Sensitivity Analysis: Limitations
- Does not give THE answer (should we expect one?)
- Rosenbaum's sensitivity analysis is based on permutation test (tilted by hidden selection bias)
  - Permutation test is the foundation for randomized trials, but rarely used: heavy computation burden
  - Used more in recent years, e.g., COMMIT
- Special software required for tilted permutation test
  - Programming logic not difficult
  - Very heavy computation burden
  - Inertia for users to stay with familiar packages

Slide 28: Instrumental Variable (IV) Analysis for RED, a la Rubin Causal Model
- Encouragement intervention serves as instrumental variable
- Assume binary intervention (I = 0, 1) and binary treatment (T = 0, 1)

T(0)    T(1)    Category
0       0       Never takers
0       1       Compliers (new users)
1       0       Defiers (assumed to be absent)
1       1       Always takers

- Very likely different beyond observed characteristics

Slide 29: IV Analysis: Observed Compliance Status
(C = compliers, N = never takers, A = always takers, D = defiers)
- I = 0:
  - Untreated: C or N
  - Treated: A or D
- I = 1:
  - Untreated: N or D
  - Treated: C or A
- Randomized encouragement design: compliance status distributed similarly across intervention groups
- %(C) = %(treated | I = 1) - %(treated | I = 0) = %(untreated | I = 0) - %(untreated | I = 1)

Slide 30: IV Analysis: Intervention Effect by Subgroups
- Key assumption: effect of encouragement intervention mediated entirely through treatment (exclusion restriction)
- Always takers and never takers: no treatment variation
  - → no intervention effect [exclusion restriction]
  - → cannot assess treatment effect
- Intervention effect manifested entirely through compliers

Slide 31: Complier Average Causal Effect
- Treatment "efficacy" on compliers: CACE = Program effect / Incremental adoption rate
  - Program effect = intent-to-treat effect for encouragement intervention on outcome
  - Incremental adoption rate = intent-to-treat effect for encouragement intervention on adoption
- Distribute intervention effect on outcome over compliers
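
The CACE ratio can be worked through on a tiny constructed data set. The numbers are hypothetical: each arm has 5 never takers, 3 compliers, and 2 always takers, no defiers, and treatment adds 2.0 to the outcome of anyone who receives it. The function names `arm` and `cace` are likewise invented for this sketch.

```python
def arm(encouraged):
    """Build (encouraged, treated, outcome) records for one randomized arm."""
    rows = []
    rows += [(encouraged, 0, 1.0)] * 5                                # never takers
    rows += [(encouraged, encouraged, 2.0 + 2.0 * encouraged)] * 3    # compliers
    rows += [(encouraged, 1, 5.0)] * 2                                # always takers
    return rows

def mean(values):
    return sum(values) / len(values)

def cace(records):
    """Program effect (ITT on outcome) divided by incremental adoption (ITT on treatment)."""
    enc = [r for r in records if r[0] == 1]
    ctl = [r for r in records if r[0] == 0]
    itt_outcome = mean([y for _, _, y in enc]) - mean([y for _, _, y in ctl])
    itt_adoption = mean([t for _, t, _ in enc]) - mean([t for _, t, _ in ctl])
    return itt_outcome / itt_adoption

records = arm(0) + arm(1)
cace_estimate = cace(records)
print(f"CACE estimate = {cace_estimate:.2f}")
```

Here the intent-to-treat effect on outcome is 2.7 - 2.1 = 0.6 and the incremental adoption rate is 0.5 - 0.2 = 0.3, so the ratio recovers the assumed effect of 2.0 among compliers.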

Slide 32: IV Analysis: External Validity
- Treatment effect estimable only for compliers (new users)
  - Intrinsic limitation of design (RED or imperfect RCT)
- Should we be concerned about treatment effect for always takers and never takers?
  - Yes for efficacy trials, less so for RED
  - Never takers might never adopt treatment voluntarily
    - Mandate vs. choice
    - Universal dissemination vs. practical dissemination
  - Always takers more critical; absent for new treatments
- Presence of defiers likely to cancel some intervention effect
  - IV estimate is conservative for true CACE

Slide 33: IV Analysis: Discussions
- Exclusion restriction needs to be entertained carefully
  - Likelihood and Bayesian methods available under weaker assumptions
- Non-randomized encouragement design (observational studies with instrumental variables)
  - Example: McClellan et al. (1994 JAMA), distance to alternative types of hospitals
- IV analysis usually deflates precision substantially
  - Bias-variance trade-off?
  - Combine propensity score analysis with IV analysis?

Slide 34: IV Analysis: References
- Sommer and Zeger (1991 Stat. in Medicine)
- Angrist, Imbens, and Rubin (1996 JASA)
- Imbens and Rubin (1997 Annals of Statistics)
- Little and Yau (1998 Psych Methods)
- Hirano, Imbens, Rubin, and Zhou (2000 Biostatistics)
- Wells et al. (2000, manuscript)

Slide 35: Discussions
- Formulate research questions
  - Treatment effect for whom? Adoption?
- Careful design usually more effective than analytic solutions
  - Matching to avoid severe imbalance
- Promising methods for mitigating overt selection bias
  - Careful modeling warranted
  - Propensity score method worth exploring
  - Nonparametric regression worth exploring
- Hidden selection bias very challenging
  - Rosenbaum's sensitivity analysis warranted
  - IV analysis and selection model require careful assessment