Sensitivity Analysis for Interval-Censored Discrete Failure Time Data: Application to ACTG 181 Daniel Scharfstein Department of Biostatistics Johns Hopkins.

Slides:



Advertisements
Similar presentations
Sensitivity Analysis of Randomized Trials with Missing Data Daniel Scharfstein Department of Biostatistics Johns Hopkins University
Advertisements

Industry Issues: Dataset Preparation for Time to Event Analysis Davis Gates Schering Plough Research Institute.
Controlling for Time Dependent Confounding Using Marginal Structural Models in the Case of a Continuous Treatment O Wang 1, T McMullan 2 1 Amgen, Thousand.
Point Estimation Notes of STAT 6205 by Dr. Fan.
Pattern Recognition and Machine Learning
1 Goodness-of-Fit Tests with Censored Data Edsel A. Pena Statistics Department University of South Carolina Columbia, SC [ Research.
Chapter 7 Title and Outline 1 7 Sampling Distributions and Point Estimation of Parameters 7-1 Point Estimation 7-2 Sampling Distributions and the Central.
Informative Censoring Addressing Bias in Effect Estimates Due to Study Drop-out Mark van der Laan and Maya Petersen Division of Biostatistics, University.
Cox Model With Intermitten and Error-Prone Covariate Observation Yury Gubman PhD thesis in Statistics Supervisors: Prof. David Zucker, Prof. Orly Manor.
April 25 Exam April 27 (bring calculator with exp) Cox-Regression
Goodness of Fit of a Joint Model for Event Time and Nonignorable Missing Longitudinal Quality of Life Data – A Study by Sneh Gulati* *with Jean-Francois.
Introduction  Bayesian methods are becoming very important in the cognitive sciences  Bayesian statistics is a framework for doing inference, in a principled.
Maximum likelihood (ML) and likelihood ratio (LR) test
1 Unsupervised Learning With Non-ignorable Missing Data Machine Learning Group Talk University of Toronto Monday Oct 4, 2004 Ben Marlin Sam Roweis Rich.
BIOST 536 Lecture 3 1 Lecture 3 – Overview of study designs Prospective/retrospective  Prospective cohort study: Subjects followed; data collection in.
Chapter 11 Survival Analysis Part 2. 2 Survival Analysis and Regression Combine lots of information Combine lots of information Look at several variables.
Introduction to Survival Analysis Seminar in Statistics 1 Presented by: Stefan Bauer, Stephan Hemri
Lecture 9 Today: –Log transformation: interpretation for population inference (3.5) –Rank sum test (4.2) –Wilcoxon signed-rank test (4.4.2) Thursday: –Welch’s.
Modeling clustered survival data The different approaches.
Maximum likelihood (ML)
Sample Size Determination
Sample Size Determination Ziad Taib March 7, 2014.
Analysis of Complex Survey Data
Mixture Modeling Chongming Yang Research Support Center FHSS College.
Survival analysis Brian Healy, PhD. Previous classes Regression Regression –Linear regression –Multiple regression –Logistic regression.
Acute ischemic stroke (AIS) AISbrain artery Acute ischemic stroke (AIS) is a major cause of disability and death in adults. AIS is caused by a clot in.
Estimation Basic Concepts & Estimation of Proportions
Essentials of survival analysis How to practice evidence based oncology European School of Oncology July 2004 Antwerp, Belgium Dr. Iztok Hozo Professor.
NASSER DAVARZANI DEPARTMENT OF KNOWLEDGE ENGINEERING MAASTRICHT UNIVERSITY, 6200 MAASTRICHT, THE NETHERLANDS 22 OCTOBER 2012 Introduction to Survival Analysis.
G Lecture 121 Analysis of Time to Event Survival Analysis Language Example of time to high anxiety Discrete survival analysis through logistic regression.
8.1 Inference for a Single Proportion
Statistical approaches to analyse interval-censored data in a confirmatory trial Margareta Puu, AstraZeneca Mölndal 26 April 2006.
Lecture 8: Generalized Linear Models for Longitudinal Data.
Bayesian Analysis and Applications of A Cure Rate Model.
Learning Theory Reza Shadmehr logistic regression, iterative re-weighted least squares.
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
Chapter 7 Point Estimation
HSRP 734: Advanced Statistical Methods July 17, 2008.
Introduction to Survival Analysis Utah State University January 28, 2008 Bill Welbourn.
Modeling Cure Rates Using the Survival Distribution of the General Population Wei Hou 1, Keith Muller 1, Michael Milano 2, Paul Okunieff 1, Myron Chang.
HSRP 734: Advanced Statistical Methods July 31, 2008.
Empirical Efficiency Maximization: Locally Efficient Covariate Adjustment in Randomized Experiments Daniel B. Rubin Joint work with Mark J. van der Laan.
POSTER TEMPLATE BY: Weighted Kaplan-Meier Estimator for Adaptive Treatment Strategies in Two-Stage Randomization Designs Sachiko.
Biostatistics Case Studies 2006 Peter D. Christenson Biostatistician Session 4: An Alternative to Last-Observation-Carried-Forward:
Empirical Likelihood for Right Censored and Left Truncated data Jingyu (Julia) Luan University of Kentucky, Johns Hopkins University March 30, 2004.
01/20151 EPI 5344: Survival Analysis in Epidemiology Cox regression: Introduction March 17, 2015 Dr. N. Birkett, School of Epidemiology, Public Health.
Lecture 4: Likelihoods and Inference Likelihood function for censored data.
Satistics 2621 Statistics 262: Intermediate Biostatistics Jonathan Taylor and Kristin Cobb April 20, 2004: Introduction to Survival Analysis.
Statistics Sampling Distributions and Point Estimation of Parameters Contents, figures, and exercises come from the textbook: Applied Statistics and Probability.
SS r SS r This model characterizes how S(t) is changing.
Week 21 Order Statistics The order statistics of a set of random variables X 1, X 2,…, X n are the same random variables arranged in increasing order.
Topic 19: Survival Analysis T = Time until an event occurs. Events are, e.g., death, disease recurrence or relapse, infection, pregnancy.
Joint Modelling of Accelerated Failure Time and Longitudinal Data By By Yi-Kuan Tseng Yi-Kuan Tseng Joint Work With Joint Work With Professor Jane-Ling.
Paz Bailey G 1, Sternberg M 1, Puren AJ 2, Markowitz LE 1, Ballard R 1, Delany S 3, Hawkes S 4, Nwanyanwu O 1, Ryan C 1, and Lewis DA 5 1. NCHHSTP, CDC.
Week 21 Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.
Confidential and Proprietary Business Information. For Internal Use Only. Statistical modeling of tumor regrowth experiment in xenograft studies May 18.
Survival time treatment effects
Comparing Cox Model with a Surviving Fraction with regular Cox model
Chapter 8: Inference for Proportions
Special Topics In Scientific Computing
Cox Regression Model Under Dependent Truncation
Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II
Pattern Recognition and Machine Learning
Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.
Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.
Applied Statistics and Probability for Engineers
Kaplan-Meier survival curves and the log rank test
Subgroup analysis on time-to-event: a Bayesian approach
A Bayesian Design with Conditional Borrowing of Historical Data in a Rare Disease Setting Peng Sun* July 30, 2019 *Joint work with Ming-Hui Chen, Yiwei.
Presentation transcript:

Sensitivity Analysis for Interval-Censored Discrete Failure Time Data: Application to ACTG 181 Daniel Scharfstein Department of Biostatistics Johns Hopkins University Collaborators: Michelle Shardell (PhD Student) Sam Bozzette (PI of ACTG 181)

ACTG 181  Natural history study of advanced HIV disease.  204 participants were scheduled to be monitored for CMV shedding in the urine every 4 weeks and in the blood every 12 weeks.  Shedding time was discretized into three-month quarters.  No death during the one year follow-up period.  At baseline, 69 (135) participants were classified as having high (low) CD4 counts.

ACTG 181: Research Questions 1. What are the shedding-time distributions for the high and low baseline CD4 groups? 2. What is the effect of baseline CD4 count on the risk of shedding?

ACTG 181: Challenge  Participants miss visits  Missingness may be related to the shedding time High CD4Low CD4 Left Censored23%36% Interval Censored10%12% Right Censored57%41% Exactly Observed10%11%

Data Structure and Notation  M=4  T = Shedding time (0,1,…,M)  T = “M+1” if no shedding by one year  Let p t = P(T = t), t =0,…M+1  Z = Baseline CD status (1:high; 0:low)

Missed visits 0 4 “5” LR Shedding- Shedding+Shedding  Observed Data ? T  [L, R]

Coarsening at Random (CAR)  Pattern mixture model  HR (1991), GLR (1997)

Coarsening at Random (CAR)  Pattern mixture model  HR (1991), GLR (1997) Failure and censoring processes Failure process only

Coarsening at Random (CAR)  Selection model  HR (1991), GLR (1997)

Estimation under CAR  Turnbull (1976, JRSS-B) proposed EM algorithm to estimate the distribution of T.  Tu, Meng, and Pagano (1993) extend Turnbull’s method to estimate parameters in a discrete-time Cox model.  Sun (1997) estimates parameters in a continuation-ratio model using maximum likelihood.

Coarsening at Random (CAR)  CAR is untestable without auxiliary information.  CAR is often considered implausible by scientific experts.  For example, Sam Bozzette, PI of ACTG 181, believes that the nature of the missed clinic visits relates to shedding time.

NCAR Models (PM)  Exponential tilting (B-N & C, 1989)  Rotnitzky, Robins, and colleagues

NCAR Models (S)  WLOG, we assume that q(l,l,r)=0.  q(t,l,r) is interpreted as the log probability ratio of having interval [l,r] comparing a subject with T=t vs. T=l.  q=0 iff CAR.

NCAR Models - Theorems Theorem 1: Suppose Model (PM,S) holds for a specified function q. If P(L=t,R=t)>0 for all t, then the distribution of T is uniquely identified. Theorem 2: For specified function q, the uniquely identified distribution of T (Theorem 1) in conjunction with Model (S) yields a joint distribution for (L,R,T) which marginalizes to the population distribution of (L,R).  So, the NCAR models (i.e., q) are untestable.  Sensitivity analysis

Inference for Survival Curves  Estimation proceeds via EM algorithm.  Re-weighted version of Turnbull’s self-consistency equation.  Standard errors via Louis’ method. Observed data indicators

Inference for Subgroup Effects  Continuation Ratio Model  Estimation via EM algorithm.  Standard errors using Louis’ method

Simulation Study  M=4  n = 100/arm  Z = 0,1   = 0 or 0.75  For each treatment group, exp{  z } is the probability ratio of having interval [l, r] comparing those with T=r to those with T=l.   z = -log(2), 0, or log(2)

Simulation Results:  = 0 True  0 True  1 BiasCoverage -log(2) log(2) log(2)log(2) CAR Analysis

Simulation Results:  = 0.75 True  0 True  1 Bias Coverage -log(2) log(2) log(2)log(2) Truth

ACTG 181 – Censoring Bias Function  exp{ } is the CD4-specific probability ratio of having interval [3 mos., 12 mos.] comparing those who begin shedding at 12 mos. to those who begin shedding at 3 mos.  exp{ } is the CD4-specific probability ratio of dropping out just after baseline comparing those who do not begin within 12 mos. to those who begin shedding within 3 mos. from baseline.

ACTG 181: Elicitation Schematic

ACTG 181: Bozzette’s Responses  Among participants with low baseline CD4, those who begin shedding within 3 months of testing negative at baseline are less likely to drop out (be interval-censored) than those who begin shedding after (at) 12 months.  Sam believes that those with low baseline CD4 have probably not managed their HIV well, and the least healthy of this group have the greatest motivation to make visits. The most healthy of this group are likely to shed later and may feel less motivated to comply with the visit schedule.  exp{ } and exp{ } are between 1 and 3.

ACTG 181: Bozzette’s Responses  Among participants with high baseline CD4, those who begin shedding within 3 months of testing negative at baseline are more likely to drop out (be interval-censored) than those who begin shedding after (at) 12 months.  Sam believes that those with high baseline CD4 have likely managed their HIV well, and the most healthy of this group will continue to do so. However, he believes that the least healthy of this group are more likely to miss visits and shed earlier.  exp{ } and exp{ } are between 1/3 and 1

Redistribution under CAR and NCAR

ACTG 181: Results

ACTG 181 Results

Summary  We have presented a formal sensitivity analysis approach for analyzing informative interval-censored, discrete time-to-event data. Future Directions  Formal Bayesian approach  Asymptotic theory for continuous time.  High dimensional covariates  Elicitation from experts