Multivariate Survival Analysis Alternative approaches Prof. L. Duchateau Ghent University.

1 Multivariate Survival Analysis Alternative approaches Prof. L. Duchateau Ghent University

2 Overview
- The different approaches
  - The marginal model
  - The fixed effects model
  - The stratified model
  - The copula model
  - The frailty model
- Efficiency comparisons

3 The marginal model
- The marginal model approach consists of two stages
  - Stage 1: Fit the model without taking the clustering into account
  - Stage 2: Adjust for the clustering in the data

4 Consistency of marginal model parameter estimates
- The ML estimate $\hat\beta$ from the Independence Working Model (IWM) is a consistent estimator for $\beta$ (Huster, 1989)
- More generally, the ML estimate ($\hat\beta$ and baseline parameters) from the IWM is also a consistent estimator for the whole parameter vector
- The parameter $\beta$ refers to the whole population

5 Adjusting the variance of IWM estimates
- The variance estimate based on the inverse of the information matrix is an inconsistent estimator of Var($\hat\beta$)
- One possible solution: jackknife estimation
- General expression of the jackknife estimator (Wu, 1986), with N the number of observations and a the number of parameters

6 The grouped jackknife estimator
- For clustered observations: grouped jackknife estimator, with s the number of clusters
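
A sketch of the two estimators in LaTeX, assuming the usual delete-one form. The symbol $\hat\beta_{-i}$ (the estimate obtained after removing observation i, or cluster i in the grouped version) is notation introduced here for illustration. With s = 100 clusters and a = 2 parameters, the grouped factor (s - a)/s equals 0.98, the constant that appears in the R code for the reconstitution data.

```latex
% General jackknife estimator (Wu, 1986): delete one observation at a time
\widehat{\mathrm{Var}}_{J}(\hat\beta)
  = \frac{N-a}{N} \sum_{i=1}^{N}
    (\hat\beta_{-i} - \hat\beta)(\hat\beta_{-i} - \hat\beta)^{T}

% Grouped jackknife estimator: delete one cluster at a time
\widehat{\mathrm{Var}}_{GJ}(\hat\beta)
  = \frac{s-a}{s} \sum_{i=1}^{s}
    (\hat\beta_{-i} - \hat\beta)(\hat\beta_{-i} - \hat\beta)^{T}
```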

7 Reconstitution: jackknife
#Jackknife estimator
library(survival)
bdel<-rep(NA,100)  #one leave-one-cow-out estimate per cow (100 cows)
#survreg fits a log-linear model, hence the sign change to get the log hazard ratio
b1<- -survreg(Surv(timerec,stat)~trt,data=reconstitution,dist="exponential")$coeff[2]
for (i in 1:100){
  temp<-reconstitution[reconstitution$cowid!=i,]
  bdel[i]<- -survreg(Surv(timerec,stat)~trt,data=temp,dist="exponential")$coeff[2]
}
#0.98=(s-a)/s with s=100 clusters and a=2 parameters
var.robust<-0.98*sum((bdel-b1)^2);stderr.robust<-sqrt(var.robust)
var.robust;stderr.robust

8 Reconstitution: jackknife in R?
#Jackknife estimator using cluster() and robust=T command
analexp.jc<-survreg(Surv(timerec,stat)~trt+cluster(cowid),robust=T,dist="exponential")
Error in score %*% vv : non-conformable arguments

9 Example marginal model with jackknife estimator
- Example: Time to reconstitution with drug versus placebo
- Estimates from the IWM with the time-constant hazard rate assumption are given by
- Grouped jackknife = approximation

10 Jackknife estimator
- Adjusts for clustering
- Reconstitution example: jackknife estimator is smaller
- Time to diagnosis example?

11 Jackknife estimator
- Adjusts for clustering
- Reconstitution example: jackknife estimator is smaller
- What is the general picture then? Simulation

12 Jackknife estimator - simulations (1)
- Is the jackknife estimate always smaller than the estimate from the unadjusted model?
- Generate data from the frailty model with
- We generate 2000 datasets, each of 100 pairs of two subjects, for the settings
  1. Matched clusters, no censoring
  2. 20% of clusters with 2 treated or 2 untreated subjects, no censoring
  3. Matched clusters, 20% censoring

13 Jackknife estimator - simulations (2)
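
A minimal sketch of one replicate of setting 1 (matched clusters, no censoring), assuming a constant baseline hazard and a gamma frailty. The values of `lambda`, `theta` and `beta` are hypothetical, as are all variable names; a full study along the lines of the slide would repeat this over 2000 datasets and over the three settings.

```r
library(survival)

set.seed(2024)
s      <- 100    # number of clusters (pairs)
lambda <- 0.2    # hypothetical constant baseline hazard
theta  <- 0.5    # hypothetical frailty variance
beta   <- 0.5    # hypothetical log hazard ratio for treatment

# One gamma frailty per cluster, with mean 1 and variance theta
u <- rgamma(s, shape = 1 / theta, rate = 1 / theta)

# Matched clusters: one untreated (0) and one treated (1) subject per pair
sim <- data.frame(id = rep(1:s, each = 2), trt = rep(c(0, 1), times = s))
sim$time <- rexp(2 * s, rate = lambda * u[sim$id] * exp(beta * sim$trt))
sim$stat <- 1    # no censoring

# Independence working model; survreg is log-linear, hence the sign change
fit <- survreg(Surv(time, stat) ~ trt, data = sim, dist = "exponential")
b_full   <- -unname(coef(fit)["trt"])
se_unadj <- sqrt(diag(fit$var))[2]   # from the inverse information matrix

# Grouped jackknife: refit after deleting one cluster at a time
b_del <- sapply(1:s, function(i) {
  fit_i <- survreg(Surv(time, stat) ~ trt, data = sim[sim$id != i, ],
                   dist = "exponential")
  -unname(coef(fit_i)["trt"])
})
a <- 2                               # number of parameters in the model
se_jack <- sqrt((s - a) / s * sum((b_del - b_full)^2))

c(estimate = b_full, se_unadjusted = se_unadj, se_jackknife = se_jack)
```

Comparing `se_unadjusted` with `se_jackknife` across many replicates shows in which settings the adjusted standard error is smaller or larger than the unadjusted one.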

14 The fixed effects model
- The fixed effects model is given by
  with the fixed effect for cluster i,
- Assume for simplicity

15 The fixed effects model: ML solution
- General survival likelihood expression
- For fixed effects model using assumptions

16 Reconstitution: fixed effects model
#Fixed effects model
library(survival)
res.fixed<-survreg(Surv(timerec,stat)~trt+as.factor(cowid),dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)

17 Output treatment effect
- Treatment effect for reconstitution data using R function survreg (log-linear model)

18 Parameter interpretation
- corresponds to constant hazard of untreated udder quarter of cow 1
- corresponds to constant hazard of untreated udder quarter of cow i
  - Cowid65 ≈ 0
  - Cowid100: exp(-21+18.8)=0.11
- Treatment effect: HR=exp(0.185)=1.203 with 95% CI [0.83;1.75]
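
The arithmetic on this slide can be checked directly. In the small sketch below, the standard error on the log scale is back-calculated from the reported interval rather than taken from the model output, which is an assumption made purely for illustration.

```r
# Hazard ratio for treatment from the log hazard ratio 0.185
hr <- exp(0.185)
round(hr, 3)                  # 1.203

# Value reported for cow 100
cow100 <- exp(-21 + 18.8)
round(cow100, 2)              # 0.11

# Approximate standard error implied by the 95% CI [0.83; 1.75]
se_log <- (log(1.75) - log(0.83)) / (2 * qnorm(0.975))

# Wald interval rebuilt on the log scale and transformed back
ci <- exp(0.185 + c(-1, 1) * qnorm(0.975) * se_log)
round(ci, 2)
```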

19 Investigate cow characteristic: heifer
#Fixed effects model
res.fixed<-survreg(Surv(timerec,stat)~heifer+as.factor(cowid),dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)

20 Output heifer effect
- Heifer effect for reconstitution data introducing heifer first in the model
- Hazard ratio impossibly high

21 Add cow characteristic: heifer after cowid?
#Fixed effects model
res.fixed<-survreg(Surv(timerec,stat)~trt+as.factor(cowid)+heifer,dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)

22 Example: between cluster covariate (2)
- Heifer effect for reconstitution data introducing cowid first in the model
- Hazard ratio equal to 1

23 Exercise
- Investigate method and type of fracture in diagnosis data

24 Note on overparametrisation and confounding

25 Cell means model: no overparametrisation
- Milk reduction as a function of low and high inoculation dose

26 Factor effects model: overparametrisation
- Milk reduction as a function of low and high inoculation dose

27

28 Confounding between temperature in °F and °C
- Effect of temperature on bacterial growth (log(CFU))

29 Temperature in °C vs °F

30 Conversion from °F to °C

31 Infinite number of model representations

32 Confounding between blocks and block factors
- Cow factor is not confounded with treatment factor

33 Fitting model with cow and treatment

34 Model with cow and treatment vs cow alone

35 Adding the heifer factor

36 Infinite number of model representations

37 Example: heifer - cowid confounded
- There is complete confounding between fixed heifer effect and cowid
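
The confounding can be made visible through the rank of the design matrix. The sketch below uses a hypothetical toy design (6 cows, 2 udder quarters each), not the reconstitution data: a covariate that is constant within cow, such as heifer, is a linear combination of the intercept and the cow indicators, whereas treatment varies within cow and is not.

```r
# Toy design: 6 cows with 2 udder quarters each (hypothetical data)
cowid  <- factor(rep(1:6, each = 2))
heifer <- rep(c(1, 0, 1, 0, 0, 1), each = 2)   # constant within cow
trt    <- rep(c(0, 1), times = 6)              # varies within cow

# Design matrix with cow, treatment and heifer
X <- model.matrix(~ trt + cowid + heifer)
ncol(X)        # 8 columns: intercept, trt, 5 cow indicators, heifer
qr(X)$rank     # 7: one column is redundant

# Without heifer the design has full column rank,
# so treatment is not confounded with cow
X0 <- model.matrix(~ trt + cowid)
qr(X0)$rank == ncol(X0)
```

Because one column is redundant, the software has to drop or fix one parameter, and which one it drops depends on the order of the terms. This is consistent with the different heifer estimates obtained when heifer enters before or after cowid.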

38 The stratified model
- Based on the Cox model, where the baseline hazard function is now left unspecified
- Cox (1972) showed that if only the order of the events matters, the survival likelihood reduces to the partial likelihood

39 Partial likelihood for the stratified model
- The stratified model is given by
- Maximisation of partial likelihood

40 Reconstitution: stratified model
#stratified Cox model
library(survival)
res.strat<-coxph(Surv(timerec,stat)~trt+strata(cowid),data=reconstitution)
res.strat
summary(res.strat)

41 Example for bivariate data
- The partial likelihood for reconstitution data
- Estimates
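
For uncensored matched pairs the stratified partial likelihood takes a simple form: only the first event in each pair contributes, so the estimate is the log of the ratio between the number of pairs in which the treated subject fails first and the number in which the untreated subject fails first. The sketch below checks this against `coxph` on simulated data; all parameter values and variable names are hypothetical.

```r
library(survival)

set.seed(11)
s <- 100                                   # number of pairs
u <- rgamma(s, shape = 2, rate = 2)        # hypothetical cluster frailties
dat <- data.frame(id = rep(1:s, each = 2), trt = rep(c(0, 1), times = s))
dat$time <- rexp(2 * s, rate = 0.2 * u[dat$id] * exp(0.5 * dat$trt))
dat$stat <- 1                              # no censoring

# Stratified Cox model, one stratum per pair
fit <- coxph(Surv(time, stat) ~ trt + strata(id), data = dat)

# Treatment status of the subject failing first in each pair
first_trt <- sapply(split(dat, dat$id), function(d) d$trt[which.min(d$time)])
n1 <- sum(first_trt == 1)                  # treated subject fails first
n0 <- sum(first_trt == 0)                  # untreated subject fails first

c(coxph = unname(coef(fit)), closed_form = log(n1 / n0))
```

The cluster frailties drop out of the comparison within each pair, which is why stratification removes the cluster effect without having to estimate it.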

42 Exercise
- Fit the stratified model for the diagnosis data

43 The copula model
- The copula model is often considered to be a two-stage model
- First obtain the population (marginal) survival functions for each subject in a cluster.
- The copula function then links these population survival functions to generate the joint survival function (Frees et al., 1996).

44 Example of copula model
- Time to diagnosis of being healed

45 Bivariate copula model likelihood
- Four different possible contributions of a cluster
- Estimated population survival functions are inserted, only copula parameters unknown

46 The Clayton copula
- The Clayton copula (Clayton, 1978) is
- The Clayton copula belongs to the family of Archimedean copulas, i.e.,
  with in the Clayton copula case
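
A sketch of the Clayton copula in R, using the parametrisation implied by the expression `s1^(-theta)+s2^(-theta)-1` in the likelihood code on the following slides. The function names `clayton`, `p` and `q` are illustrative, and the Archimedean form is written here as C(u, v) = p(q(u) + q(v)), an assumed notation.

```r
# Clayton copula for two marginal survival probabilities u and v (theta > 0)
clayton <- function(u, v, theta) (u^(-theta) + v^(-theta) - 1)^(-1 / theta)

# Archimedean representation: generator p and its inverse q
p <- function(x, theta) (1 + theta * x)^(-1 / theta)
q <- function(u, theta) (u^(-theta) - 1) / theta

theta <- 2
direct      <- clayton(0.7, 0.4, theta)
archimedean <- p(q(0.7, theta) + q(0.4, theta), theta)
c(direct = direct, archimedean = archimedean)   # the two values agree

# If one margin equals 1, the copula returns the other margin
clayton(0.7, 1, theta)

# Kendall's tau for the Clayton copula
tau <- theta / (theta + 2)
tau
```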

47 Clayton copula likelihood
- Two censored observations
- Observation j censored
- No observations censored

48 Example Clayton copula (1)
- For diagnosis of being healed data, first fit separate models for RX and US technique
- For instance, separate parametric models

49 Fitting the copula: two stage approach
#Clayton copula for time to diagnosis
library(survival)
timetodiag<-read.table("c:\\docs\\onderwijs\\survival\\flames\\diag.csv",header=T,sep=";")
t1<-timetodiag$t1/30;t2<-timetodiag$t2/30;c1<-timetodiag$c1;c2<-timetodiag$c2
#Stage 1: separate Weibull margins, S(t)=exp(-l*t^r)
surv1<-survreg(Surv(t1,c1)~1);l1<-exp(-surv1$coeff/surv1$scale);r1<-(1/surv1$scale)
surv2<-survreg(Surv(t2,c2)~1);l2<-exp(-surv2$coeff/surv2$scale);r2<-(1/surv2$scale)
s1<-exp(-l1*t1^(r1));f1<-s1*r1*l1*t1^(r1-1)
s2<-exp(-l2*t2^(r2));f2<-s2*r2*l2*t2^(r2-1)
#Stage 2: maximise over theta only
#Contributions: both censored, only subject 1 observed, only subject 2 observed, both observed
#The single-event terms enter with a minus sign:
#log(-dS/dt1)=-(1+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)
loglikcon.gamma<-function(theta){
  P<-s1^(-theta)+s2^(-theta)-1
  loglik<- -(1-c1)*(1-c2)*(1/theta)*log(P)-
    c1*(1-c2)*((1+1/theta)*log(P)+(theta+1)*log(s1)-log(f1))-
    c2*(1-c1)*((1+1/theta)*log(P)+(theta+1)*log(s2)-log(f2))+
    c1*c2*(log(1+theta)-(2+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)-(theta+1)*log(s2)+log(f2))
  -sum(loglik)}
nlm(loglikcon.gamma,c(0.5))

50 Example Clayton copula (2)
- Estimates for marginal models are
- Based on these estimates we obtain
  which can be inserted in the likelihood expression, which is then maximized for $\theta$

51 Exercise
- Fit the copula model to the diagnosis data as a one-stage model

52 Fitting the copula: one stage approach
#Clayton copula for time to diagnosis - one stage
#t1, t2, c1, c2 as defined for the two stage approach
loglikcon3.gamma<-function(param){
  theta<-param[1];l1<-param[2];l2<-param[3];r1<-param[4];r2<-param[5]
  s1<-exp(-l1*t1^(r1));f1<-s1*r1*l1*t1^(r1-1)
  s2<-exp(-l2*t2^(r2));f2<-s2*r2*l2*t2^(r2-1)
  P<-s1^(-theta)+s2^(-theta)-1
  #single-event terms enter with a minus sign, as log(-dS/dt1)=-(1+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)
  loglik<- -(1-c1)*(1-c2)*(1/theta)*log(P)-
    c1*(1-c2)*((1+1/theta)*log(P)+(theta+1)*log(s1)-log(f1))-
    c2*(1-c1)*((1+1/theta)*log(P)+(theta+1)*log(s2)-log(f2))+
    c1*c2*(log(1+theta)-(2+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)-(theta+1)*log(s2)+log(f2))
  -sum(loglik)
}
nlm(loglikcon3.gamma,c(0.5,1,1,1,1))

53 Example Clayton copula (3)
- For parametric marginal models, the likelihood can also be maximized simultaneously for all parameters, leading to
- Thus, for small sample sizes, the two-stage approach can differ substantially from the one-stage approach

54 Example Clayton copula (4)
- Alternatives can be used for marginal survival functions
  - Nonparametric
  - Semiparametric
  leading to

55 The frailty model
- The 'shared' frailty model is given by
  with the frailty
- An alternative formulation is given by
  with

56 The gamma frailty model
- The gamma frailty distribution is the easiest choice
  with and
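
A sketch of the one-parameter gamma frailty distribution, assuming the parametrisation with shape 1/θ and rate 1/θ (mean 1, variance θ). This is the parametrisation used in the `qgamma` calls on the later slides; the value θ = 0.394 is taken from those slides and the variable names are illustrative.

```r
theta <- 0.394
shape <- 1 / theta
rate  <- 1 / theta

# Moments of a gamma(shape, rate) variable
fr_mean <- shape / rate       # equals 1
fr_var  <- shape / rate^2     # equals theta

# 5%, 50% and 95% quantiles of the frailty
qgamma(c(0.05, 0.5, 0.95), shape = shape, rate = rate)

# Numerical check of the mean by integrating u * f(u)
m1 <- integrate(function(u) u * dgamma(u, shape, rate), 0, Inf)$value
c(mean = fr_mean, variance = fr_var, mean_numeric = m1)
```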

57 Marginal likelihood for the gamma frailty model
- Start from conditional (on frailty) likelihood
  with containing the baseline hazard parameters, e.g., for Weibull

58 Marginal likelihood: integrating out the frailties …
- Integrate out frailties using distribution
  with

59 Closed form expression for marginal likelihood
- Integration leads to (homework)
  and taking log and summing over s clusters
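
A sketch of the closed-form result in LaTeX, written to agree term by term with the R function `likelihood.weibul` on the later slide. The notation is assumed for illustration: $d_i$ is the number of events in cluster i, $\delta_{ij}$ the event indicator, $y_{ij}$ the observed time, $\xi$ the baseline hazard parameters, and $H_{ij,c}(y_{ij}) = H_0(y_{ij}) \exp(x_{ij}^{T}\beta)$ the conditional cumulative hazard without the frailty.

```latex
% Marginal likelihood contribution of cluster i after integrating out u_i
L_{marg,i}(\xi, \theta, \beta)
  = \frac{\Gamma(d_i + 1/\theta)\, \theta^{d_i}
          \prod_{j} \left[ h_0(y_{ij}) \exp(x_{ij}^{T}\beta) \right]^{\delta_{ij}}}
         {\Gamma(1/\theta)
          \left( 1 + \theta \sum_{j} H_{ij,c}(y_{ij}) \right)^{1/\theta + d_i}}

% Marginal log-likelihood: take the log and sum over the s clusters
l_{marg}(\xi, \theta, \beta)
  = \sum_{i=1}^{s} \Big[ d_i \log\theta - \log\Gamma(1/\theta)
      + \log\Gamma(1/\theta + d_i)
      - (1/\theta + d_i) \log\Big( 1 + \theta \sum_{j} H_{ij,c}(y_{ij}) \Big)
      + \sum_{j} \delta_{ij} \big( x_{ij}^{T}\beta + \log h_0(y_{ij}) \big) \Big]
```

In the R function, the ratio $\Gamma(1/\theta + d_i)/\Gamma(1/\theta)$ is computed as the product of $d_i + 1/\theta - k$ for k = 1, ..., $d_i$.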

60 Maximisation of marginal likelihood leads to estimates
- Marginal likelihood no longer contains frailties. By maximisation estimates of are obtained
- Furthermore, the asymptotic variance-covariance matrix can be obtained as the inverse of the observed information matrix
  with the Hessian matrix with entries

61 Entries of Hessian matrix from marginal likelihood
- As an example, the entry of the Hessian matrix for is given by

62 Example for the parametric gamma frailty model
- Consider time to first insemination data
- Assume Weibull distributed event times and model the heifer effect
- We have the following conditional functions

63 R program: read the data
#read data
setwd("c://docs//onderwijs//survival//Flames//notas//")
insemfix<-read.table("insemfix.csepv",header=T,sep=",")
#Create four column vectors, four different variables
herd<-insemfix$herdnr;timeto<-(insemfix$end*12/365.25)
stat<-insemfix$score;heifer<-insemfix$par2
#Derive some values: n=number of herds, di=number of events by herd, r=total number of events
n<-length(levels(as.factor(herd)))
di<-aggregate(stat,by=list(herd),FUN=sum)[,2];r<-sum(di)

64 R program: the function
#Observable likelihood weibull
#l=exp(p[1]), theta=exp(p[2]), beta=p[3], rho=exp(p[4])
#r=total number of events, di=number of events by herd
likelihood.weibul<-function(p){
  cumhaz<-exp(heifer*p[3])*(timeto^(exp(p[4])))*exp(p[1])
  cumhaz<-aggregate(cumhaz,by=list(herd),FUN=sum)[,2]
  lnhaz<-stat*(heifer*p[3]+log((exp(p[4])*timeto^(exp(p[4])-1))*exp(p[1])))
  lnhaz<-aggregate(lnhaz,by=list(herd),FUN=sum)[,2]
  lik<-r*log(exp(p[2]))-
    sum((di+1/exp(p[2]))*log(1+cumhaz*exp(p[2])))+sum(lnhaz)+
    sum(sapply(di,function(x) ifelse(x==0,0,log(prod(x+1/exp(p[2])-seq(1,x))))))
  #return minus the log-likelihood, since nlm minimises
  -lik}

65 R program: the output
res<-nlm(likelihood.weibul,c(log(0.128),log(0.39),0.15,log(1.76)),hessian=T)
lambda<-exp(res$estimate[1])
theta<-exp(res$estimate[2])
beta<-res$estimate[3]
rho<-exp(res$estimate[4])

66 Time to first insemination: effect of heifer with herd as cluster
- ML estimates, annotated as: monthly hazard rate; monotone increasing; variance of frailties; within herd heifer effect; hazard ratio with 95% CI

67 Using parfm library
library(parfm)
#Create four column vectors, four different variables
herd<-as.factor(insemfix$herdnr);timeto<-(insemfix$end*12/365.25)
stat<-insemfix$score;heifer<-insemfix$par2
insem<-data.frame(herd=herd,timeto=timeto,stat=stat,heifer=heifer)
parfm(Surv(timeto,stat)~heifer,cluster="herd",data=insem,frailty="gamma")

68 Interpretation of frailty variance
- The parameter $\theta$ refers to the variability at the hazard level: difficult to interpret!
- Maybe plot the hazard function for subjects with a particular frailty

69 Plotting hazard of insemination for multiparous cows
#Interpretation of parameters
lambda<-0.174;theta<-0.394;rho<-1.769
#convert lambda from the monthly to the daily time scale
lambda<-lambda*((365.25/12)^(-rho))
time<-seq(1,350);timet<-time+29.5
#hazard for frailty 1 and for the 5% and 95% quantiles of the frailty distribution
h1f0<-lambda*rho*time^(rho-1)
h1f05<-qgamma(0.05,1/theta,1/theta)*lambda*rho*time^(rho-1)
h1f95<-qgamma(0.95,1/theta,1/theta)*lambda*rho*time^(rho-1)

70 Plotting hazard of insemination for multiparous cows
#Hazards
par(mfrow=c(1,2));par(adj=0.5);par(cex=1.2)
#only the multiparous hazards exist at this point, so the axis range uses h1f05 and h1f95
plot(c(0,360),c(min(h1f05),max(h1f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h1f0,lty=1,lwd=3);lines(timet,h1f05,lty=1,lwd=1)
lines(timet,h1f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Multiparous cows")

71 Exercise
- Plot hazard of insemination for heifers

72 Plotting hazard of insemination for heifers
#Interpretation of parameters
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
h2f0<-lambda*rho*exp(beta)*time^(rho-1)
h2f05<-qgamma(0.05,1/theta,1/theta)*lambda*rho*exp(beta)*time^(rho-1)
h2f95<-qgamma(0.95,1/theta,1/theta)*lambda*rho*exp(beta)*time^(rho-1)

73 Plotting hazard of insemination
#Hazards
par(mfrow=c(1,2));par(adj=0.5);par(cex=1.2)
plot(c(0,360),c(min(h1f05,h2f05),max(h1f95,h2f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h1f0,lty=1,lwd=3);lines(timet,h1f05,lty=1,lwd=1)
lines(timet,h1f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Multiparous cows")
par(adj=0.5)
plot(c(0,360),c(min(h1f05,h2f05),max(h1f95,h2f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h2f0,lty=1,lwd=3);lines(timet,h2f05,lty=1,lwd=1)
lines(timet,h2f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Heifers")

74 Interpretation of frailty variance
- The parameter $\theta$ refers to the variability at the hazard level: difficult to interpret!
- Figure panels: Multiparous cows, Heifers

75 Transformation to median
- Density function of transformation of random variable

76 Median of Weibull distribution
- To find the median survival time for cluster i, set the survival function for cluster i equal to 0.5

77 Density of median for Weibull distribution
- The density function is then
  with and

78 Density of median for Weibull distribution
- Leading to
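
A sketch that checks the density of the median numerically, assuming a Weibull baseline with herd-specific survival function exp(-u λ t^ρ) and a gamma frailty with mean 1 and variance θ. The density expression is the one coded in `calcm` on the next slide; the function names `dens_median` and `cdf_median` and the integration range are illustrative choices.

```r
# Estimates quoted on the slides (monthly scale), lambda converted to days
theta  <- 0.394
rho    <- 1.769
lambda <- 0.174 * (365.25 / 12)^(-rho)

# Herd-specific median: M = (log(2) / (lambda * U))^(1 / rho)
# Density of M obtained by transforming the gamma density of U
dens_median <- function(m) {
  rho * (log(2) / (theta * lambda))^(1 / theta) * m^(-(1 + rho / theta)) *
    exp(-log(2) / (theta * lambda * m^rho)) / gamma(1 / theta)
}

# Distribution function of M directly from the gamma frailty:
# P(M <= m) = P(U >= log(2) / (lambda * m^rho))
cdf_median <- function(m) {
  1 - pgamma(log(2) / (lambda * m^rho), shape = 1 / theta, rate = 1 / theta)
}

# The integral of the density over an interval matches the CDF difference
area <- integrate(dens_median, 20, 200)$value
diff_cdf <- cdf_median(200) - cdf_median(20)
c(integral = area, cdf_difference = diff_cdf)
```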

79 Plotting density function of median for multiparous cows
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
#Medians
calcm<-function(m){
  rho*(log(2)/(theta*lambda))^(1/theta)*(1/m)^(1+rho/theta)*
    (1/gamma(1/theta))*exp(-log(2)/(theta*lambda*m^(rho)))}
timedens<-seq(1,200,1)
densmd1<-sapply(timedens,calcm)
plot(c(0,230),c(min(densmd1),max(densmd1)),type='n',xlab="Median time to first insemination (days)",ylab="Density function median")
lines(timedens+29.5,densmd1,lty=1,lwd=3)

80 Exercise
- Plot density of median for heifers

81 Plotting density function of median for multiparous cows and heifers
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
#Medians
calcm<-function(m){
  rho*(log(2)/(theta*lambda))^(1/theta)*(1/m)^(1+rho/theta)*
    (1/gamma(1/theta))*exp(-log(2)/(theta*lambda*m^(rho)))}
timedens<-seq(1,200,1)
densmd1<-sapply(timedens,calcm)
#heifers: calcm reads lambda from the global environment, so rescale it before the second call
lambda<-lambda*exp(beta)
densmd2<-sapply(timedens,calcm)
plot(c(0,230),c(min(densmd1,densmd2),max(densmd1,densmd2)),type='n',xlab="Median time to first insemination (days)",ylab="Density function median")
lines(timedens+29.5,densmd1,lty=1,lwd=3);lines(timedens+29.5,densmd2,lty=2,lwd=3)
legend(130,0.015,legend=c("Multiparous","Heifer"),lty=c(1,2))

82 Variability of median time to first insemination between herds

83 Exercise
- Derive the density function for the percentage survival at a particular time t

84 Transformation to percentage survival
- The percentage in cluster i with first insemination at time t is given by
- Thus
- and

85 Interpretation of frailty variance in terms of % events at time t
- The density function is then obtained by
- and thus
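
A sketch of the derivation in LaTeX, assuming a Weibull baseline with cumulative hazard $\lambda t^{\rho}$ and a gamma frailty with mean 1 and variance $\theta$. The symbols $P_i(t)$ and $c_t$ are notation introduced here for illustration; for heifers, $\lambda$ is replaced by $\lambda \exp(\beta)$.

```latex
% Fraction of cluster i with first insemination by time t
P_i(t) = 1 - \exp(-u_i \lambda t^{\rho}), \qquad c_t = \lambda t^{\rho}

% Invert for the frailty and differentiate
u_i = \frac{-\log(1 - P_i)}{c_t}, \qquad
\frac{du}{dp} = \frac{1}{(1 - p)\, c_t}

% Gamma frailty density
f_U(u) = \frac{u^{1/\theta - 1} \exp(-u/\theta)}{\theta^{1/\theta}\, \Gamma(1/\theta)}

% Density of the fraction with the event at time t, for 0 < p < 1
f_P(p) = f_U\!\left( \frac{-\log(1 - p)}{c_t} \right) \frac{1}{(1 - p)\, c_t}
       = \frac{\left( -\log(1 - p) \right)^{1/\theta - 1}
               (1 - p)^{1/(\theta c_t) - 1}}
              {(\theta c_t)^{1/\theta}\, \Gamma(1/\theta)}
```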

86 Variability of % first insemination at time t between herds
- Figure panels: Multiparous cows, Heifers

87 Efficiency comparisons in the reconstitution data example
- Estimates (se) for reconstitution data

