Survival Analysis: From Square One to Square Two. Yin Bun Cheung, Ph.D., and Paul Yip, Ph.D.
Lecture structure: basic concepts; Kaplan-Meier analysis; Cox regression; computer practice.
What's in a name? Time-to-event data; failure-time data; censored data (unobserved outcome).
Types of censoring: loss to follow-up during the study period; study closure.
Examples of survival analysis: 1. Marital status & mortality. 2. Medical treatments & tumor recurrence & mortality in cancer patients. 3. Size at birth & developmental milestones in infants.
Why survival analysis? Censoring (time of event not observed); unequal follow-up time.
What is time? What is the origin of time? In epidemiology: age (birth as time 0)? Calendar time since a baseline survey?
What is the origin of time? In clinical trials: since randomisation? Since treatment begins? Since onset of exposure?
The choice of origin of time: onset of continuous exposure; randomisation to treatment; strongest effect on the hazard.
Types of survival analysis: 1. Non-parametric method: Kaplan-Meier analysis. 2. Semi-parametric method: Cox regression. 3. Parametric method.
Square 1 to square 2: this lecture focuses on two commonly used methods, the Kaplan-Meier method and the Cox regression model.
KM survival curve (table legend: d = death, c = censored, surv = survival)
KM survival curve
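The table and curve from these slides are not reproduced here, so the following is a minimal sketch of how a Kaplan-Meier survival estimate can be computed by hand. The function name `kaplan_meier` and the six data points are made up for illustration and are not taken from the lecture. The sketch assumes the usual convention that a subject censored at time t is still counted as at risk at t.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, n_at_risk, deaths, survival)] at each distinct death time.

    times  : follow-up time of each subject
    events : 1 = death observed (d), 0 = censored (c)
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    rows = []
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        n_at_risk = sum(1 for u in times if u >= t)
        d = deaths[t]
        # Multiply by the conditional probability of surviving past t
        surv *= 1 - d / n_at_risk
        rows.append((t, n_at_risk, d, surv))
    return rows

# Made-up illustrative data, not from the lecture
times = [2, 3, 3, 5, 6, 8]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
for t, n, d, s in km:
    print(f"t={t}  at risk={n}  d={d}  surv={s:.3f}")
```

The survival estimate only changes at observed death times; censored subjects simply leave the risk set afterwards.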
No. of expected deaths. Expected deaths in group A at time i, assuming equality in survival: $E_{Ai} = \dfrac{(\text{no. at risk in group A})_i \times (\text{deaths})_i}{(\text{total no. at risk})_i}$. Total expected deaths in group A: $E_A = \sum_i E_{Ai}$.
Log rank test: a comparison of the number of expected and observed deaths. The larger the discrepancy, the less plausible the null hypothesis of equality.
An approximation: the log rank test statistic is often approximated by $X^2 = (O_A - E_A)^2/E_A + (O_B - E_B)^2/E_B$, where $O_A$ and $E_A$ are the observed and expected numbers of deaths in group A, etc.
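As a rough sketch of the expected-death calculation and the approximate statistic described above, the following pure-Python function accumulates the expected deaths for two groups over all distinct death times and then forms X². The function name `log_rank_approx` and the small data set are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def log_rank_approx(times_a, events_a, times_b, events_b):
    """Return (O_A, E_A, O_B, E_B, X2) for two groups.

    events: 1 = death observed, 0 = censored.
    """
    all_times = times_a + times_b
    all_events = events_a + events_b
    death_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    exp_a = exp_b = 0.0
    for t in death_times:
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in group B
        d = sum(1 for u, e in zip(all_times, all_events) if u == t and e == 1)
        # Expected deaths at this time, assuming equal survival in both groups
        exp_a += n_a * d / (n_a + n_b)
        exp_b += n_b * d / (n_a + n_b)

    obs_a, obs_b = sum(events_a), sum(events_b)
    x2 = (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return obs_a, exp_a, obs_b, exp_b, x2

# Made-up illustrative data, not from the lecture
obs_a, exp_a, obs_b, exp_b, x2 = log_rank_approx(
    [1, 3, 5], [1, 1, 0],
    [2, 4, 6], [1, 0, 1],
)
print(obs_a, round(exp_a, 2), obs_b, round(exp_b, 2), round(x2, 3))
```

The total expected deaths always equal the total observed deaths, which is a useful check on a hand calculation.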
Proportional hazard assumption: log rank test preferred (PH true); Breslow test preferred (non-PH).
Risk, conditional risk, hazard
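The content of this slide is not preserved beyond its title. One common way to formalise the three quantities, which may differ in notation from the original slide, is sketched below. The symbols $T$ (time of event), $S(t)$ (survival function) and $\Delta t$ (a short interval) are assumptions introduced here.

```latex
% Risk: probability that the event has occurred by time t
F(t) = P(T \le t) = 1 - S(t)

% Conditional risk: probability of the event in a short interval,
% given survival up to the start of that interval
P(t \le T < t + \Delta t \mid T \ge t)

% Hazard: conditional risk per unit time as the interval shrinks
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
```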
Another look at PH: log rank test preferred (PH true); Breslow test preferred (non-PH). [Figure: two panels plotting hazard against time.]
Cox regression model: handles ≥1 exposure variables. Covariate effects are given as hazard ratios. Semi-parametric: only assumes proportional hazards.
Cox model in the case of a single variable: 1. $h_i(t) = h_B(t)\exp(B X_i)$. 2. $h_j(t) = h_B(t)\exp(B X_j)$. 3. $h_i(t)/h_j(t) = \exp[B(X_i - X_j)]$. $\exp(B)$ is a hazard ratio.
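To illustrate why the baseline hazard cancels in step 3, here is a small numerical sketch. The baseline hazard function and the coefficient value are hypothetical, chosen only for illustration. The point is that the ratio of hazards for X = 1 versus X = 0 is the same at every time t and equals exp(B).

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline hazard h_B(t); any positive function would do
    return 0.01 + 0.002 * t

def hazard(t, x, b):
    # Cox model with a single variable: h(t) = h_B(t) * exp(B * x)
    return baseline_hazard(t) * math.exp(b * x)

b = math.log(2)  # assumed coefficient, so that exp(B) = 2

# Hazard ratio for X = 1 versus X = 0 at several time points
ratios = [hazard(t, 1, b) / hazard(t, 0, b) for t in (1, 5, 10)]
print([round(r, 6) for r in ratios])
```

Although the hazards themselves change with time, their ratio does not, which is the proportional hazard assumption in numerical form.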
Test of proportional hazard assumption: scaled Schoenfeld residuals; Grambsch-Therneau test; test for treatment-period interaction. Example: mortality of widows.
Computer practice: a clinical trial of stage I bladder tumor, Thiotepa vs Control. Data from StatLib.
Data structure: the two most important variables are time to recurrence (>0) and the indicator of failure/censoring (0=censored; 1=recurrence) (coding depends on software).
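A minimal sketch of this data layout, assuming one record per patient: the class name `Subject`, the field names and the three records are hypothetical and are not taken from the StatLib data set. The checks mirror the two constraints above: time must be positive, and the indicator must be 0 or 1.

```python
from dataclasses import dataclass

@dataclass
class Subject:
    time: float    # time to recurrence or to censoring; must be > 0
    status: int    # 0 = censored, 1 = recurrence
    thiotepa: int  # hypothetical treatment flag: 1 = thiotepa, 0 = control

    def __post_init__(self):
        if self.time <= 0:
            raise ValueError("time must be greater than 0")
        if self.status not in (0, 1):
            raise ValueError("status must be 0 (censored) or 1 (recurrence)")

# Made-up records for illustration only
records = [
    Subject(time=4, status=1, thiotepa=0),
    Subject(time=10, status=0, thiotepa=1),
    Subject(time=7, status=1, thiotepa=1),
]
n_events = sum(r.status for r in records)
print(f"{n_events} recurrences among {len(records)} subjects")
```

Some packages use the opposite coding or a separate censoring indicator, so it is worth validating the indicator before any analysis.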
KM estimates [Figure: Kaplan-Meier curves for the Thiotepa and Control groups.]
Log rank test: chi2(1) = 1.52, Pr>chi2 = 0.22
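The reported p-value can be checked from the test statistic using only the standard library. This sketch relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The function name `chi2_df1_pvalue` is made up for this example.

```python
import math

def chi2_df1_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # For 1 df, X^2 = Z^2 with Z standard normal, so
    # P(X^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2))

p = chi2_df1_pvalue(1.52)
print(round(p, 2))  # 0.22
```

This agrees with the value shown on the slide, so the difference in recurrence between the two arms is not statistically significant at the conventional 5% level.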
Cox regression models
Test of PH assumption: Grambsch-Therneau test for PH in model II. Thiotepa: P=0.55. Number of tumors: P=0.60.
Major References (Examples): Ex 1. Cheung. Int J Epidemiol 2000;29:93-99. Ex 2. Sauerbrei et al. J Clin Oncol 2000;18: Ex 3. Cheung et al. Int J Epidemiol 2001;30:66-74.
Major References (General): Allison. Survival Analysis Using the SAS® System. Collett. Modelling Survival Data in Medical Research. Fisher, van Belle. Biostatistics: A Methodology for the Health Sciences.