Univariate Survival Analysis Prof. L. Duchateau Ghent University
Model specification
Most survival models are defined in terms of the hazard
$h_i(t) = h_0(t) \exp(x_i^T \beta)$
with
  $h_i(t)$ the hazard at time $t$ for subject $i$
  $h_0(t)$ the baseline hazard at time $t$
  $x_i$ the incidence vector for subject $i$
  $\beta$ the parameter vector
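Hazard function
Density function: $f(t)$
Cumulative distribution function: $F(t) = P(T \le t) = \int_0^t f(s)\,ds$
Survival function: $S(t) = P(T > t) = 1 - F(t)$
Hazard function: $h(t) = f(t) / S(t)$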
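The relations between these four functions can be checked numerically. The sketch below is an illustration only: the Weibull distribution and its shape and scale values are arbitrary assumptions, not taken from the course data. It uses base R's `dweibull` and `pweibull`.

```r
# Hypothetical Weibull parameters, chosen only for illustration
shape <- 1.5
scale <- 2
tt <- c(0.5, 1, 2)                                    # evaluation times

f <- dweibull(tt, shape, scale)                       # density f(t)
S <- pweibull(tt, shape, scale, lower.tail = FALSE)   # survival S(t) = 1 - F(t)
h <- f / S                                            # hazard h(t) = f(t) / S(t)
H <- -log(S)                                          # cumulative hazard

# Closed-form Weibull hazard and cumulative hazard in R's parameterisation
h.closed <- (shape / scale) * (tt / scale)^(shape - 1)
H.closed <- (tt / scale)^shape

all.equal(h, h.closed)
all.equal(H, H.closed)
```

Both comparisons should agree up to floating-point error, which is the sense in which any one of the four functions determines the others.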
Alternative models
Hazard model:
  Baseline hazard function parametric
  Baseline hazard function unspecified
  Summary measure: hazard ratio
Accelerated failure time (AFT) model:
  Typically parametric
  Summary measure: accelerator factor
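Parametric hazard model: Analytical solution
Assume constant baseline hazard $h_0(t) = \lambda$ (exponential lifetimes) with only control and treated group
$h_i(t) = \lambda \exp(\beta x_i)$
with $x_i = 0$ for control and $x_i = 1$ for treated
Likelihood for exponential:
$L(\lambda, \beta) = \prod_{i=1}^{n} \left(\lambda \exp(\beta x_i)\right)^{\delta_i} \exp\left(-\lambda \exp(\beta x_i)\, y_i\right)$
with $y_i$ the at risk time and $\delta_i$ the event indicator of subject $i$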
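Likelihood specification constant hazard
Define $D_T$ ($D_C$) as number of events in treated (control) group
$D_T = \sum_{i: x_i = 1} \delta_i$ and $D_C = \sum_{i: x_i = 0} \delta_i$
Define $y_T$ ($y_C$) as at risk time in treated (control) group
$y_T = \sum_{i: x_i = 1} y_i$ and $y_C = \sum_{i: x_i = 0} y_i$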
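Solution from likelihood specification
Maximise the log likelihood function
$\log L(\lambda, \beta) = (D_T + D_C) \log \lambda + D_T \beta - \lambda \left(y_C + y_T \exp(\beta)\right)$
leading to
$\hat\lambda = D_C / y_C$ and $\exp(\hat\beta) = \dfrac{D_T / y_T}{D_C / y_C}$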
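The maximisation step can be written out as follows. This is a sketch of the intermediate algebra, using $D_T$, $D_C$ for the event counts and $y_T$, $y_C$ for the at risk times as in the R code of the next slide; the steps are a reconstruction, not copied from the original slide.

```latex
\begin{align*}
\log L(\lambda,\beta) &= (D_T + D_C)\log\lambda + D_T\,\beta
                         - \lambda\,\bigl(y_C + y_T e^{\beta}\bigr) \\
\frac{\partial \log L}{\partial \lambda} &= \frac{D_T + D_C}{\lambda}
                         - \bigl(y_C + y_T e^{\beta}\bigr) = 0 \\
\frac{\partial \log L}{\partial \beta} &= D_T - \lambda\, y_T e^{\beta} = 0
   \quad\Rightarrow\quad \lambda e^{\beta} = \frac{D_T}{y_T} \\
D_T + D_C &= \lambda\, y_C + \lambda\, y_T e^{\beta} = \lambda\, y_C + D_T
   \quad\Rightarrow\quad \hat\lambda = \frac{D_C}{y_C} \\
\exp(\hat\beta) &= \frac{D_T / y_T}{\hat\lambda} = \frac{D_T / y_T}{D_C / y_C}
\end{align*}
```

The estimated hazard in each group is therefore the number of events divided by the total at risk time, and the hazard ratio is the ratio of these two rates.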
Analytical solution reconstitution data
#The analytical solution
DT<-sum(stat[trt==1]); DC<-sum(stat[trt==0])
yT<-sum(timerec[trt==1]); yC<-sum(timerec[trt==0])
lambda<-DC/yC; HR<-(DT/yT)/(DC/yC)
lambda; HR
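Since the reconstitution data are not included here, the closed-form solution can be tried on simulated data. The sketch below assumes exponential event times with made-up true values (baseline hazard 0.5, hazard ratio 0.6) and independent exponential censoring; only the variable names `timerec`, `stat` and `trt` follow the slide.

```r
# Hypothetical simulated data; true values are assumptions for illustration
set.seed(1)
n <- 200
trt <- rep(c(0, 1), each = n / 2)
event.time <- rexp(n, rate = 0.5 * 0.6^trt)   # hazard 0.5 (control), 0.3 (treated)
cens.time <- rexp(n, rate = 0.1)              # independent censoring
timerec <- pmin(event.time, cens.time)        # observed time
stat <- as.numeric(event.time <= cens.time)   # 1 = event, 0 = censored

# Closed-form maximum likelihood estimates
DT <- sum(stat[trt == 1]); DC <- sum(stat[trt == 0])
yT <- sum(timerec[trt == 1]); yC <- sum(timerec[trt == 0])
lambda <- DC / yC
HR <- (DT / yT) / (DC / yC)

# First derivatives of the log likelihood, evaluated at the estimates
score.lambda <- (DT + DC) / lambda - (yC + yT * HR)
score.beta <- DT - lambda * yT * HR
c(lambda = lambda, HR = HR, score.lambda = score.lambda, score.beta = score.beta)
```

Both score values should be zero up to rounding, confirming that the closed-form estimates are a stationary point of the log likelihood.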
Exercise
Obtain the analytical solution for the diagnosis data set
First rework the data:
#Read the data
diag<-read.table("timetodiag.csv",header=TRUE,sep=";")
#Create the column vectors, one per variable
timetodiag<-c(diag$t1,diag$t2)
stat<-c(diag$c1,diag$c2)
technique<-c(rep(0,106),rep(1,106))
dogid<-c(diag$dogid,diag$dogid)
diagnosis<-data.frame(dogid=dogid,technique=technique,timetodiag=timetodiag,stat=stat)
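Variance of the estimates?
Obtain the Hessian, i.e., the matrix of the second derivatives of the log likelihood, which is
$H(\lambda, \beta) = \begin{pmatrix} -(D_T + D_C)/\lambda^2 & -y_T \exp(\beta) \\ -y_T \exp(\beta) & -\lambda\, y_T \exp(\beta) \end{pmatrix}$
The information matrix is then
$I(\lambda, \beta) = -H(\lambda, \beta)$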
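Inverse of observed information matrix
The observed information matrix is thus
$I(\hat\lambda, \hat\beta) = \begin{pmatrix} (D_T + D_C)/\hat\lambda^2 & y_T \exp(\hat\beta) \\ y_T \exp(\hat\beta) & \hat\lambda\, y_T \exp(\hat\beta) \end{pmatrix}$
and the asymptotic variance-covariance matrix is
$V(\hat\lambda, \hat\beta) = I(\hat\lambda, \hat\beta)^{-1}$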
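Variance estimators
#The observed information matrix
I<-matrix(data=c((DT+DC)/(lambda^2),yT*HR,yT*HR,lambda*yT*HR),nrow=2,ncol=2)
V<-solve(I)
#sqrt(V) gives NaN off the diagonal because the covariance is negative;
#the standard errors are the square roots of the diagonal elements
V;sqrt(V)
     [,1] [,2]
[1,]
[2,]
     [,1] [,2]
[1,]       NaN
[2,]  NaN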
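The inversion can be tried without the data, using only the four summary statistics. The sketch below uses made-up counts and at risk times (assumptions for illustration) and extracts the standard errors with `sqrt(diag(V))`, which avoids the `NaN` entries. It also compares `V` with the closed-form inverse obtained by hand at the estimates; that closed form is a derivation added here, not taken from the slide.

```r
# Hypothetical summary statistics (not from the course data)
DT <- 30; DC <- 45       # events in treated / control group
yT <- 120; yC <- 100     # at risk time in treated / control group
lambda <- DC / yC
HR <- (DT / yT) / (DC / yC)

# Observed information matrix for (lambda, beta), as on the slide
I <- matrix(c((DT + DC) / lambda^2, yT * HR,
              yT * HR,              lambda * yT * HR), nrow = 2, ncol = 2)
V <- solve(I)

# Standard errors: square roots of the diagonal elements only
se <- sqrt(diag(V))
names(se) <- c("lambda", "beta")
se

# Closed-form inverse at the estimates (uses lambda * yT * HR = DT)
V.closed <- matrix(c(lambda^2 / DC, -lambda / DC,
                     -lambda / DC,  1 / DT + 1 / DC), nrow = 2, ncol = 2)
all.equal(V, V.closed)
```

The variance of the log hazard ratio reduces to `1/DT + 1/DC`, so its precision depends only on the number of events in each group.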
Exercise
Obtain the asymptotic variance estimates for the parameters of the diagnosis data set
Maximizer solution reconstitution data
#(negative) loglikelihood exponential with l=exp(p[1]), beta=p[2]
loglikelihood.exponential<-function(p){
  cumhaz<-exp(p[1])*timerec*(exp(p[2]*trt))
  hazard<-stat*log(exp(p[1])*exp(p[2]*trt))
  loglik<-sum(hazard)-sum(cumhaz)
  -loglik}
#Apply minimizer to minus loglikelihood function
res<-nlm(loglikelihood.exponential,c(-1,0))
res; lambda<-exp(res$estimate[1]); HR<-exp(res$estimate[2])
lambda; HR
Variances from maximizer solution
#Apply minimizer to obtain Hessian matrix
res<-nlm(loglikelihood.exponential,c(-1,0),hessian=TRUE)
#Inverse Hessian: variance-covariance matrix of (log(lambda), beta)
solve(res$hessian)
     [,1] [,2]
[1,]
[2,]
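To see that the maximizer reproduces the analytical solution, both can be run on the same simulated data. The sketch below is an illustration under assumptions: the data are simulated with made-up true values, and `negloglik` is a compact rewrite of the negative log likelihood on the (log(lambda), beta) scale, not the slide's own function. The closed-form variance of beta, `1/DT + 1/DC`, is derived by hand from the inverse information matrix.

```r
# Hypothetical simulated data; true values are assumptions for illustration
set.seed(2)
n <- 200
trt <- rep(c(0, 1), each = n / 2)
event.time <- rexp(n, rate = 0.5 * 0.6^trt)
cens.time <- rexp(n, rate = 0.1)
timerec <- pmin(event.time, cens.time)
stat <- as.numeric(event.time <= cens.time)

# Negative log likelihood with p[1] = log(lambda), p[2] = beta
negloglik <- function(p) {
  loghaz <- p[1] + p[2] * trt
  -(sum(stat * loghaz) - sum(exp(loghaz) * timerec))
}
res <- nlm(negloglik, c(-1, 0), hessian = TRUE)
lambda.nlm <- exp(res$estimate[1])
HR.nlm <- exp(res$estimate[2])
V.nlm <- solve(res$hessian)        # covariance of (log(lambda), beta)

# Closed-form counterparts
DT <- sum(stat[trt == 1]); DC <- sum(stat[trt == 0])
yT <- sum(timerec[trt == 1]); yC <- sum(timerec[trt == 0])
lambda.cf <- DC / yC
HR.cf <- (DT / yT) / (DC / yC)
Vbeta.cf <- 1 / DT + 1 / DC

rbind(nlm = c(lambda.nlm, HR.nlm, V.nlm[2, 2]),
      closed.form = c(lambda.cf, HR.cf, Vbeta.cf))
```

The two rows should agree closely. Small differences in the variance are expected because `nlm` approximates the Hessian numerically.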
Use parameters of interest as input
#(negative) loglikelihood exponential with l=p[1], HR=p[2]
loglikelihood.exponentialHR<-function(p){
  cumhaz<-p[1]*timerec*(exp(log(p[2])*trt))
  hazard<-stat*log(p[1]*exp(log(p[2])*trt))
  loglik<-sum(hazard)-sum(cumhaz)
  -loglik}
#Apply minimizer to obtain Hessian matrix
res<-nlm(loglikelihood.exponentialHR,c(lambda,HR),hessian=TRUE,iterlim=1)
solve(res$hessian)
     [,1] [,2]
[1,]
[2,]
Exercise
Obtain the parameter estimates and their variance for the diagnosis data set using the maximizer
Standard software?
#Univariate model-exponential
library(survival)
res.unadjust<-survreg(Surv(timerec,stat)~trt,dist="exponential",data=reconstitution)
res.unadjust
summary(res.unadjust)
#Note: survreg reports coefficients on the loglinear scale,
#so these values differ from the lambda and HR obtained before
lambda<-res.unadjust$coef[1]; beta<-res.unadjust$coef[2]; HR<-exp(beta)
lambda; beta; HR
Loglinear model representation
Hazard model with parametric baseline hazard can be rewritten in a loglinear model representation
$\log T_i = \mu + \alpha x_i + \sigma E_i$
Most often used:
Examples Weibull distributions – varying
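Survival function for Weibull hazard model
Assume
$h_i(t) = \lambda \rho t^{\rho - 1} \exp(\beta x_i)$
so that the survival function is
$S_i(t) = \exp\left(-\lambda t^{\rho} \exp(\beta x_i)\right)$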
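Survival function for Weibull loglinear model
Assume
$\log T_i = \mu + \alpha x_i + \sigma E_i$
with $E_i$ following the standard Gumbel distribution, $S_E(e) = \exp(-\exp(e))$
From this follows that
$E_i = (\log T_i - \mu - \alpha x_i) / \sigma$
and thus
$S_i(t) = P(T_i > t) = P\left(E_i > (\log t - \mu - \alpha x_i) / \sigma\right)$
Based on the Gumbel assumption, the survival function becomes
$S_i(t) = \exp\left(-t^{1/\sigma} \exp(-\mu/\sigma - \alpha x_i/\sigma)\right)$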
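Two presentations for Weibull event times
$S_i(t) = \exp\left(-\lambda t^{\rho} \exp(\beta x_i)\right)$ and $S_i(t) = \exp\left(-t^{1/\sigma} \exp(-\mu/\sigma - \alpha x_i/\sigma)\right)$
and thus:
$\rho = 1/\sigma$, $\lambda = \exp(-\mu/\sigma)$, $\beta = -\alpha/\sigma$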
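Two presentations for exponential event times
$S_i(t) = \exp\left(-\lambda t \exp(\beta x_i)\right)$ and $S_i(t) = \exp\left(-t \exp(-\mu - \alpha x_i)\right)$
and thus:
$\lambda = \exp(-\mu)$, $\beta = -\alpha$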
survreg function
#unadjusted model-exponential
res.unadjust<-survreg(Surv(timerec,stat)~trt,dist="exponential",data=reconstitution)
res.unadjust; summary(res.unadjust)
mu<-res.unadjust$coef[1]; alpha<-res.unadjust$coef[2]
lambda<-exp(-mu); beta<- -alpha; HR<-exp(beta)
lambda; beta; HR
Exercise
Obtain the parameter estimates of the diagnosis data set using the survreg function in R
survreg function variances
#unadjusted model-exponential
res.unadjust<-survreg(Surv(timerec,stat)~trt,dist="exponential",data=reconstitution)
res.unadjust; summary(res.unadjust)
mu<-res.unadjust$coef[1]; alpha<-res.unadjust$coef[2]
lambda<-exp(-mu); beta<- -alpha; HR<-exp(beta)
lambda; beta; HR
res.unadjust$var
            (Intercept) trt
(Intercept)
trt
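The back-transformation from the loglinear scale can be checked against the closed-form solution on simulated data. The sketch below is an illustration under assumptions: the data are simulated with made-up true values, and the closed-form variances `1/DC` and `1/DT + 1/DC` for `mu` and `alpha` are derived by hand from the inverse information matrix on the log scale, not taken from the slides. It requires the `survival` package.

```r
library(survival)

# Hypothetical simulated data; true values are assumptions for illustration
set.seed(3)
n <- 200
trt <- rep(c(0, 1), each = n / 2)
event.time <- rexp(n, rate = 0.5 * 0.6^trt)
cens.time <- rexp(n, rate = 0.1)
timerec <- pmin(event.time, cens.time)
stat <- as.numeric(event.time <= cens.time)

fit <- survreg(Surv(timerec, stat) ~ trt, dist = "exponential")
mu <- unname(coef(fit)[1]); alpha <- unname(coef(fit)[2])
lambda <- exp(-mu); beta <- -alpha; HR <- exp(beta)

# Closed-form counterparts from event counts and at risk times
DT <- sum(stat[trt == 1]); DC <- sum(stat[trt == 0])
yT <- sum(timerec[trt == 1]); yC <- sum(timerec[trt == 0])

rbind(survreg = c(lambda, HR, fit$var[1, 1], fit$var[2, 2]),
      closed.form = c(DC / yC, (DT / yT) / (DC / yC), 1 / DC, 1 / DT + 1 / DC))
```

Changing the sign of both coefficients leaves the variance-covariance matrix unchanged, which is why the variance of `alpha` can be used directly as the variance of `beta`.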
The delta method on variance
Obtaining the variance of $\hat\lambda = \exp(-\hat\mu)$ using the variance of $\hat\mu$
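The delta method - general
Original parameters $\theta = (\theta_1, \ldots, \theta_k)^T$
Interest in univariate continuous function $g(\theta)$
Use one term Taylor expansion of $g(\hat\theta)$
$g(\hat\theta) \approx g(\theta) + \left(\dfrac{\partial g(\theta)}{\partial \theta}\right)^T (\hat\theta - \theta)$
with
$\mathrm{Var}\left(g(\hat\theta)\right) \approx \left(\dfrac{\partial g(\theta)}{\partial \theta}\right)^T \mathrm{Var}(\hat\theta) \left(\dfrac{\partial g(\theta)}{\partial \theta}\right)$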
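The delta method - specific
Interest in univariate continuous function $g(\mu) = \exp(-\mu) = \lambda$
The one term Taylor expansion of $g(\hat\mu)$
$\exp(-\hat\mu) \approx \exp(-\mu) - \exp(-\mu)(\hat\mu - \mu)$
With
$\mathrm{Var}(\hat\lambda) \approx \exp(-2\mu)\, \mathrm{Var}(\hat\mu) = \lambda^2\, \mathrm{Var}(\hat\mu)$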
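The delta method can also be applied numerically, which helps when the derivative is awkward to write down. The sketch below is an illustration, not part of the course code: `delta.var` is a hypothetical helper that approximates the gradient by central differences, and the estimates and variance-covariance matrix for `(mu, alpha)` are made-up values.

```r
# Hypothetical helper: delta-method variance of g(theta), numerical gradient
delta.var <- function(g, theta, V, eps = 1e-6) {
  grad <- sapply(seq_along(theta), function(j) {
    step <- replace(numeric(length(theta)), j, eps)
    (g(theta + step) - g(theta - step)) / (2 * eps)   # central difference
  })
  as.numeric(t(grad) %*% V %*% grad)
}

# Made-up estimates and variance-covariance matrix for (mu, alpha)
theta <- c(mu = 0.8, alpha = 0.5)
V <- matrix(c(0.022, -0.022,
              -0.022, 0.056), nrow = 2, ncol = 2)

# lambda = exp(-mu) and HR = exp(-alpha)
Vlambda <- delta.var(function(th) exp(-th[1]), theta, V)
VHR <- delta.var(function(th) exp(-th[2]), theta, V)

# Analytical delta-method results for comparison
Vlambda.closed <- exp(-theta[["mu"]])^2 * V[1, 1]      # lambda^2 * Var(mu)
VHR.closed <- exp(-theta[["alpha"]])^2 * V[2, 2]       # HR^2 * Var(alpha)
c(Vlambda = Vlambda, Vlambda.closed = Vlambda.closed,
  VHR = VHR, VHR.closed = VHR.closed)
```

The numerical and analytical values should match to several digits. The same helper extends to functions of both parameters, where the covariance term also enters.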
survreg function variances
#unadjusted model-exponential-variances of transformed variables
lambda<-exp(-mu); beta<- -alpha; HR<-exp(beta)
Vlambda<-res.unadjust$var[1,1]*(lambda^2)
Vbeta<-res.unadjust$var[2,2]
Exercise
Obtain the variances of the parameter estimates of the diagnosis data set using the survreg function in R and applying the delta method