Published by Helen Collins. Modified over 9 years ago.
1
Multivariate Survival Analysis: Alternative approaches
Prof. L. Duchateau, Ghent University
2
Overview
The different approaches
  The marginal model
  The fixed effects model
  The stratified model
  The copula model
  The frailty model
Efficiency comparisons
3
The marginal model
The marginal model approach consists of two stages:
Stage 1: Fit the model without taking the clustering into account.
Stage 2: Adjust for the clustering in the data.
4
Consistency of marginal model parameter estimates
The ML estimate $\hat\beta$ from the Independence Working Model (IWM) is a consistent estimator for $\beta$ (Huster, 1989).
More generally, the ML estimate ($\hat\beta$ and the baseline parameters) from the IWM is also a consistent estimator.
The parameter $\beta$ refers to the whole population.
5
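Adjusting the variance of IWM estimates
The variance estimate based on the inverse of the information matrix of the IWM is an inconsistent estimator of Var($\hat\beta$).
One possible solution: jackknife estimation.
General expression of the jackknife estimator (Wu, 1986):
$\frac{N-a}{N}\sum_{i=1}^{N}(\hat\beta_{-i}-\hat\beta)(\hat\beta_{-i}-\hat\beta)^T$
with N the number of observations, a the number of parameters and $\hat\beta_{-i}$ the estimate obtained after deleting observation i.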
6
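The grouped jackknife estimator
For clustered observations, use the grouped jackknife estimator:
$\frac{s-a}{s}\sum_{i=1}^{s}(\hat\beta_{-i}-\hat\beta)(\hat\beta_{-i}-\hat\beta)^T$
with s the number of clusters, a the number of parameters and $\hat\beta_{-i}$ the estimate obtained after deleting cluster i.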
7
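Reconstitution: jackknife
#Jackknife estimator: refit the IWM leaving out one cow (cluster) at a time
library(survival)
bdel<-rep(NA,100)
#survreg fits the loglinear model, so the sign is reversed to obtain the log hazard ratio
b1<- -survreg(Surv(timerec,stat)~trt,data=reconstitution,dist="exponential")$coeff[2]
for (i in 1:100){
  temp<-reconstitution[reconstitution$cowid!=i,]
  bdel[i]<- -survreg(Surv(timerec,stat)~trt,data=temp,dist="exponential")$coeff[2]
}
#0.98=(s-a)/s with s=100 clusters and a=2 parameters
var.robust<-0.98*sum((bdel-b1)^2)
stderr.robust<-sqrt(var.robust)
var.robust;stderr.robust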
8
Reconstitution: jackknife in R?
#Jackknife estimator using cluster() and robust=T command
analexp.jc<-survreg(Surv(timerec,stat)~trt+cluster(cowid),robust=T,dist="exponential")
Error in score %*% vv : non-conformable arguments
9
Example marginal model with jackknife estimator
Example: Time to reconstitution with drug versus placebo
Estimates are obtained from the IWM under a time-constant hazard rate assumption
Grouped jackknife = approximation
10
Jackknife estimator
Adjusts for clustering
Reconstitution example: the jackknife estimate is smaller than the estimate from the unadjusted model
Time to diagnosis example?
11
Jackknife estimator
Adjusts for clustering
Reconstitution example: the jackknife estimate is smaller than the estimate from the unadjusted model
What is then the picture? Simulation
12
Jackknife estimator: simulations (1)
Is the jackknife estimate always smaller than the estimate from the unadjusted model?
Generate data from the frailty model.
We generate 2000 datasets, each of 100 pairs of two subjects, for the settings
1. Matched clusters, no censoring
2. 20% of clusters with 2 treated or 2 untreated subjects, no censoring
3. Matched clusters, 20% censoring
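A minimal sketch of setting 1 (matched clusters, no censoring) may help to see how such a simulation can be organised. The parameter values (theta, lambda, beta) and the number of replicates are assumptions chosen for illustration, not the values used in the slides. Instead of calling survreg, the sketch uses the closed-form ML estimate of the exponential model, which gives the same log hazard ratio for a single binary covariate.

```r
# Sketch: model-based vs grouped jackknife variance of the log hazard ratio
# under the independence working model (IWM) with exponential event times.
# All parameter values below are assumptions made for illustration only.
set.seed(1)
s      <- 100      # number of clusters (pairs)
theta  <- 0.25     # assumed frailty variance
lambda <- 0.2      # assumed baseline hazard
beta   <- 0.3      # assumed log hazard ratio

# Closed-form exponential ML estimate of the log hazard ratio:
# log(events / time at risk) in treated minus untreated subjects
loghr <- function(time, status, trt) {
  log(sum(status[trt == 1]) / sum(time[trt == 1])) -
    log(sum(status[trt == 0]) / sum(time[trt == 0]))
}

one_dataset <- function() {
  cluster <- rep(1:s, each = 2)
  trt     <- rep(c(0, 1), times = s)                  # matched pairs
  u       <- rep(rgamma(s, shape = 1 / theta, rate = 1 / theta), each = 2)
  time    <- rexp(2 * s, rate = u * lambda * exp(beta * trt))
  status  <- rep(1, 2 * s)                            # no censoring
  b       <- loghr(time, status, trt)
  # Model-based variance from the IWM information matrix
  v_model <- 1 / sum(status[trt == 1]) + 1 / sum(status[trt == 0])
  # Grouped jackknife: leave out one cluster at a time, a = 2 parameters
  bdel <- sapply(1:s, function(i) {
    keep <- cluster != i
    loghr(time[keep], status[keep], trt[keep])
  })
  v_jack <- (s - 2) / s * sum((bdel - b)^2)
  c(model = v_model, jackknife = v_jack)
}

# 200 replicates keep the run short; the slides use 2000 datasets
res <- t(replicate(200, one_dataset()))
colMeans(res)
```

Changing how trt is assigned within clusters, or adding censoring, gives the other two settings, so the two variance estimates can be compared across designs.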
13
Jackknife estimator: simulations (2)
14
The fixed effects model
The fixed effects model is given by
$h_{ij}(t) = h_0(t)\exp(x_{ij}^T\beta + c_i)$
with $c_i$ the fixed effect for cluster i
Assume for simplicity
15
The fixed effects model: ML solution
General survival likelihood expression
For the fixed effects model, using the assumptions
16
Reconstitution: fixed effects model
#Fixed effects model
res.fixed<-survreg(Surv(timerec,stat)~trt+as.factor(cowid),dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)
17
Output treatment effect
Treatment effect for reconstitution data using the R function survreg (loglinear model)
18
Parameter interpretation
One parameter corresponds to the constant hazard of the untreated udder quarter of cow 1
Another corresponds to the constant hazard of the untreated udder quarter of cow i
Cowid65 ≈ 0
Cowid100: exp(-21+18.8)=0.11
Treatment effect: HR=exp(0.185)=1.203 with 95% CI [0.83;1.75]
19
Investigate cow characteristic: heifer
#Fixed effects model
res.fixed<-survreg(Surv(timerec,stat)~heifer+as.factor(cowid),dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)
20
Output heifer effect
Heifer effect for reconstitution data, introducing heifer first in the model
Hazard ratio impossibly high
21
Add cow characteristic: heifer after cowid?
#Fixed effects model
res.fixed<-survreg(Surv(timerec,stat)~trt+as.factor(cowid)+heifer,dist="exponential",data=reconstitution)
res.fixed
summary(res.fixed)
22
Example: between cluster covariate (2)
Heifer effect for reconstitution data, introducing cowid first in the model
Hazard ratio equal to 1
23
Exercise
Investigate method and type of fracture in the diagnosis data
24
Note on overparametrisation and confounding
25
Cell means model: no overparametrisation
Milk reduction as a function of low and high inoculation dose
26
Factor effects model: overparametrisation
Milk reduction as a function of low and high inoculation dose
28
Confounding between temperature in °F and °C
Effect of temperature on bacterial growth (log(CFU))
29
Temperature in °C vs °F
30
Conversion from °F to °C
31
Infinite number of model representations
32
Confounding between blocks and block factors
Cow factor is not confounded with treatment factor
33
Fitting model with cow and treatment
34
Model with cow and treatment vs cow alone
35
Adding the heifer factor
36
Infinite number of model representations
37
Example: heifer - cowid confounded
There is complete confounding between the fixed heifer effect and cowid
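The confounding can be made visible by looking at the rank of the design matrix. The following sketch uses a small hypothetical data set (4 cows with 2 udder quarters each, invented for illustration) and compares a cow-level covariate (heifer) with a within-cow covariate (trt).

```r
# Hypothetical toy data: 4 cows with 2 udder quarters each
cowid  <- rep(1:4, each = 2)
heifer <- rep(c(1, 0, 1, 0), each = 2)   # constant within cow
trt    <- rep(c(0, 1), times = 4)        # varies within cow

# Design matrix with cow dummies and the cow-level covariate heifer
X_heifer    <- model.matrix(~ as.factor(cowid) + heifer)
rank_heifer <- qr(X_heifer)$rank

# Design matrix with cow dummies and the within-cow covariate trt
X_trt    <- model.matrix(~ as.factor(cowid) + trt)
rank_trt <- qr(X_trt)$rank

c(columns = ncol(X_heifer), rank = rank_heifer)   # one column is redundant
c(columns = ncol(X_trt), rank = rank_trt)         # full column rank
```

The heifer column is a linear combination of the intercept and the cow dummies, so its effect cannot be separated from the cow effects, whereas the treatment column adds information of its own.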
38
The stratified model
Based on the Cox model, where the baseline hazard function is now left unspecified
Cox (1972) showed that, if only the order of the events matters, the survival likelihood reduces to the partial likelihood
39
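Partial likelihood for the stratified model
The stratified model is given by
$h_{ij}(t) = h_{i0}(t)\exp(x_{ij}^T\beta)$
with $h_{i0}(t)$ the unspecified baseline hazard of cluster i
Maximisation of the partial likelihood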
40
Reconstitution: stratified model
#stratified Cox model
library(survival)
res.strat<-coxph(Surv(timerec,stat)~trt+strata(cowid),data=reconstitution)
res.strat
summary(res.strat)
41
Example for bivariate data
The partial likelihood for reconstitution data
Estimates
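For matched pairs (one treated and one untreated subject per cluster) the stratified partial likelihood takes a simple form: a pair contributes only when its first observed time is an event, and the estimate reduces to the log of the ratio of the two counts. The sketch below checks this against coxph on simulated data. The data-generating values are assumptions for illustration; this is not the reconstitution data.

```r
library(survival)

# Hypothetical matched-pair data: one treated and one untreated subject per pair
set.seed(2)
s    <- 100
pair <- rep(1:s, each = 2)
trt  <- rep(c(0, 1), times = s)
u    <- rep(rgamma(s, shape = 2, rate = 2), each = 2)   # assumed cluster effect
ev   <- rexp(2 * s, rate = u * 0.2 * exp(0.3 * trt))    # assumed event times
cens <- rexp(2 * s, rate = 0.05)                        # assumed censoring times
time <- pmin(ev, cens)
stat <- as.numeric(ev <= cens)

# Stratified Cox fit with one stratum per pair
fit <- coxph(Surv(time, stat) ~ trt + strata(pair))

# Index of the subject with the smaller time in each pair
first  <- tapply(seq_along(time), pair, function(idx) idx[which.min(time[idx])])
# A pair is informative only if that smaller time is an event
inform <- stat[first] == 1
n1 <- sum(trt[first][inform] == 1)   # treated subject fails first
n0 <- sum(trt[first][inform] == 0)   # untreated subject fails first

c(coxph = unname(coef(fit)), closed_form = log(n1 / n0))
```

Pairs in which the first observed time is censored drop out of the partial likelihood, which is one reason the stratified model can lose efficiency.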
42
Exercise
Fit the stratified model for the diagnosis data
43
The copula model
The copula model is often considered to be a two-stage model.
First, obtain the population (marginal) survival functions for each subject in a cluster.
The copula function then links these population survival functions to generate the joint survival function (Frees et al., 1996).
44
Example of copula model
Time to diagnosis of being healed
45
Bivariate copula model likelihood
Four different possible contributions of a cluster
Estimated population survival functions are inserted; only the copula parameters are unknown
46
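The Clayton copula
The Clayton copula (Clayton, 1978) is
$S(t_1,t_2) = \left(S_1(t_1)^{-\theta} + S_2(t_2)^{-\theta} - 1\right)^{-1/\theta}$
The Clayton copula belongs to the family of Archimedean copulas, i.e.,
$S(t_1,t_2) = p\left[p^{-1}(S_1(t_1)) + p^{-1}(S_2(t_2))\right]$
with $p(s) = (1+\theta s)^{-1/\theta}$ in the Clayton copula case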
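As a small illustration, the Clayton joint survival function can be coded directly in the form used by the R programs in these slides (the quantity P in the likelihood code). The function names clayton_surv and clayton_tau are hypothetical helpers, not part of any package; the relation tau = theta/(theta+2) is the standard Kendall's tau for this parametrisation.

```r
# Clayton joint survival function; s1 and s2 are marginal survival probabilities
clayton_surv <- function(s1, s2, theta) {
  (s1^(-theta) + s2^(-theta) - 1)^(-1 / theta)
}

# Kendall's tau implied by the Clayton copula
clayton_tau <- function(theta) theta / (theta + 2)

clayton_surv(0.5, 1, theta = 2)       # reduces to the first margin
clayton_surv(0.5, 0.5, theta = 2)     # joint survival under dependence
clayton_surv(0.5, 0.5, theta = 1e-6)  # theta near 0: close to independence
clayton_tau(2)
```

Setting one margin to 1 returns the other margin, and letting theta approach 0 brings the joint survival close to the product of the margins.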
47
Clayton copula likelihood
Two censored observations
Observation j censored
No observations censored
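The contributions can be written out by differentiating the Clayton joint survival function. The following is a sketch of that derivation, with $P = S_1(y_1)^{-\theta} + S_2(y_2)^{-\theta} - 1$, $f_j$ the marginal density of observation j, and k denoting the other observation in the pair; the notation is chosen here for illustration and may differ from the original slide.

```latex
\begin{align*}
\text{Two censored observations:}\quad
  & S(y_1,y_2) = P^{-1/\theta} \\
\text{Observation } j \text{ censored, event for } k:\quad
  & -\frac{\partial S(y_1,y_2)}{\partial y_k}
    = P^{-(1+1/\theta)}\, S_k(y_k)^{-(\theta+1)}\, f_k(y_k) \\
\text{No observations censored:}\quad
  & \frac{\partial^2 S(y_1,y_2)}{\partial y_1\,\partial y_2}
    = (1+\theta)\, P^{-(2+1/\theta)}
      \prod_{j=1}^{2} S_j(y_j)^{-(\theta+1)}\, f_j(y_j)
\end{align*}
```

The second line covers two of the four possible contributions (event for observation 1 only, or for observation 2 only). Taking logs of these expressions gives the terms of the log-likelihood.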
48
Example Clayton copula (1)
For the diagnosis of being healed data, first fit separate models for the RX and US techniques
For instance, separate parametric models
49
Fitting the copula: two stage approach
#Clayton copula for time to diagnosis
library(survival)
timetodiag<-read.table("c:\\docs\\onderwijs\\survival\\flames\\diag.csv",header=T,sep=";")
t1<-timetodiag$t1/30;t2<-timetodiag$t2/30
c1<-timetodiag$c1;c2<-timetodiag$c2
#Stage 1: separate Weibull margins, S(t)=exp(-l*t^r)
surv1<-survreg(Surv(t1,c1)~1);l1<-exp(-surv1$coeff/surv1$scale);r1<-(1/surv1$scale)
surv2<-survreg(Surv(t2,c2)~1);l2<-exp(-surv2$coeff/surv2$scale);r2<-(1/surv2$scale)
s1<-exp(-l1*t1^(r1));f1<-s1*r1*l1*t1^(r1-1)
s2<-exp(-l2*t2^(r2));f2<-s2*r2*l2*t2^(r2-1)
#Stage 2: negative Clayton log-likelihood in theta with the margins held fixed
loglikcon.gamma<-function(theta){
  P<-s1^(-theta)+s2^(-theta)-1
  loglik<- -(1-c1)*(1-c2)*(1/theta)*log(P)+  #both censored
    c1*(1-c2)*(-(1+1/theta)*log(P)-(theta+1)*log(s1)+log(f1))+  #event for 1 only
    c2*(1-c1)*(-(1+1/theta)*log(P)-(theta+1)*log(s2)+log(f2))+  #event for 2 only
    c1*c2*(log(1+theta)-(2+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)-(theta+1)*log(s2)+log(f2))  #both events
  -sum(loglik)
}
nlm(loglikcon.gamma,c(0.5))
50
Example Clayton copula (2)
Estimates for the marginal models are obtained first
Based on these estimates we obtain the estimated marginal survival and density functions, which can be inserted in the likelihood expression, which is then maximized for $\theta$
51
Exercise
Fit the copula model to the diagnosis data as a one-stage model
52
Fitting the copula: one stage approach
#Clayton copula for time to diagnosis, one stage
#param=(theta,l1,l2,r1,r2); t1,t2,c1,c2 as defined for the two stage approach
loglikcon3.gamma<-function(param){
  theta<-param[1];l1<-param[2];l2<-param[3];r1<-param[4];r2<-param[5]
  s1<-exp(-l1*t1^(r1));f1<-s1*r1*l1*t1^(r1-1)
  s2<-exp(-l2*t2^(r2));f2<-s2*r2*l2*t2^(r2-1)
  P<-s1^(-theta)+s2^(-theta)-1
  loglik<- -(1-c1)*(1-c2)*(1/theta)*log(P)+  #both censored
    c1*(1-c2)*(-(1+1/theta)*log(P)-(theta+1)*log(s1)+log(f1))+  #event for 1 only
    c2*(1-c1)*(-(1+1/theta)*log(P)-(theta+1)*log(s2)+log(f2))+  #event for 2 only
    c1*c2*(log(1+theta)-(2+1/theta)*log(P)-(theta+1)*log(s1)+log(f1)-(theta+1)*log(s2)+log(f2))  #both events
  -sum(loglik)
}
nlm(loglikcon3.gamma,c(0.5,1,1,1,1))
53
Example Clayton copula (3)
For parametric marginal models, the likelihood can also be maximized simultaneously for all parameters
Thus, for small sample sizes, the two-stage approach can differ substantially from the one-stage approach
54
Example Clayton copula (4)
Alternatives can be used for marginal survival functions
Nonparametric
Semiparametric
55
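The frailty model
The ‘shared’ frailty model is given by
$h_{ij}(t) = h_0(t)\,u_i\exp(x_{ij}^T\beta)$
with $u_i$ the frailty
An alternative formulation is given by
$h_{ij}(t) = h_0(t)\exp(x_{ij}^T\beta + w_i)$
with $w_i = \log u_i$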
56
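The gamma frailty model
Gamma frailty distribution is the easiest choice
$f_U(u) = \frac{u^{1/\theta-1}\exp(-u/\theta)}{\theta^{1/\theta}\,\Gamma(1/\theta)}$
with $E(U)=1$ and $Var(U)=\theta$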
57
Marginal likelihood for the gamma frailty model
Start from the conditional (on frailty) likelihood
with the parameter vector containing the baseline hazard parameters, e.g., $\lambda$ and $\rho$ for Weibull
58
Marginal likelihood: integrating out the frailties
Integrate out the frailties using the frailty distribution
59
Closed form expression for marginal likelihood
Integration leads to a closed form expression (homework)
Taking the log and summing over the s clusters gives the marginal log-likelihood
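The integration step can be checked numerically for a single cluster. In the sketch below, d is the number of events in the cluster and H the summed cumulative hazard of its members without the frailty; the numerical values are assumptions for illustration. The product of the hazards of the observed events does not depend on the frailty and is left out as a common factor.

```r
# Assumed illustrative values for one cluster
theta <- 0.4    # frailty variance
d     <- 2      # number of events in the cluster
H     <- 1.3    # summed cumulative hazard of the cluster (without frailty)

# Frailty-dependent part of the conditional likelihood times the gamma density
integrand <- function(u) {
  u^d * exp(-u * H) * dgamma(u, shape = 1 / theta, rate = 1 / theta)
}
numeric_value <- integrate(integrand, lower = 0, upper = Inf)$value

# Closed form obtained by recognising a gamma integral
closed_form <- gamma(d + 1 / theta) * theta^d /
  (gamma(1 / theta) * (1 + theta * H)^(d + 1 / theta))

c(numeric = numeric_value, closed = closed_form)
```

Taking the log of the closed form gives d*log(theta) + log(Gamma(d+1/theta)/Gamma(1/theta)) - (d+1/theta)*log(1+theta*H), which matches the terms that appear in the function likelihood.weibul of the R program.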
60
Maximisation of marginal likelihood leads to estimates
Marginal likelihood no longer contains frailties.
By maximisation, estimates of the parameters are obtained
Furthermore, the asymptotic variance-covariance matrix can be obtained as the inverse of the observed information matrix, based on the Hessian matrix of the marginal log-likelihood
61
Entries of Hessian matrix from marginal likelihood As an example, the entry of the Hessian matrix for is given by
62
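Example for the parametric gamma frailty model
Consider time to first insemination data
Assume Weibull distributed event times and model the heifer effect
We have the following conditional functions
$h_{ij}(t) = \lambda\rho t^{\rho-1}\,u_i\exp(\beta x_{ij})$
$H_{ij}(t) = \lambda t^{\rho}\,u_i\exp(\beta x_{ij})$
with $x_{ij}$ the heifer indicator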
63
R program: read the data
#read data
setwd("c://docs//onderwijs//survival//Flames//notas//")
insemfix<-read.table("insemfix.csepv",header=T,sep=",")
#Create four column vectors, four different variables
herd<-insemfix$herdnr;timeto<-(insemfix$end*12/365.25)
stat<-insemfix$score;heifer<-insemfix$par2
#Derive some values: n herds, di events per herd, r events in total
n<-length(levels(as.factor(herd)))
di<-aggregate(stat,by=list(herd),FUN=sum)[,2];r<-sum(di)
64
R program: the function
#Observable likelihood weibull
#l=exp(p[1]), theta=exp(p[2]), beta=p[3], rho=exp(p[4])
#r=total number of events, di=number of events by herd
likelihood.weibul<-function(p){
  cumhaz<-exp(heifer*p[3])*(timeto^(exp(p[4])))*exp(p[1])
  cumhaz<-aggregate(cumhaz,by=list(herd),FUN=sum)[,2]
  lnhaz<-stat*(heifer*p[3]+log((exp(p[4])*timeto^(exp(p[4])-1))*exp(p[1])))
  lnhaz<-aggregate(lnhaz,by=list(herd),FUN=sum)[,2]
  lik<-r*log(exp(p[2]))-
    sum((di+1/exp(p[2]))*log(1+cumhaz*exp(p[2])))+sum(lnhaz)+
    sum(sapply(di,function(x) ifelse(x==0,0,log(prod(x+1/exp(p[2])-seq(1,x))))))
  -lik
}
65
R program: the output
res<-nlm(likelihood.weibul,c(log(0.128),log(0.39),0.15,log(1.76)),hessian=T)
lambda<-exp(res$estimate[1])
theta<-exp(res$estimate[2])
beta<-res$estimate[3]
rho<-exp(res$estimate[4])
66
Time to first insemination: effect of heifer with herd as cluster
ML estimates
$\lambda$: monthly hazard rate
$\rho$: monotone increasing hazard
$\theta$: variance of frailties
$\beta$: within herd heifer effect
$\exp(\beta)$: hazard ratio with 95 % CI
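Standard errors and confidence intervals follow from the Hessian returned by nlm. The sketch below shows the steps on a hypothetical toy problem (exponential event times with a log-rate parameter, simulated here for illustration), because the insemination data are not available in this transcript. The same steps would apply to the object res from the R program, where the heifer effect is the third parameter.

```r
# Hypothetical toy example: exponential event times, log-rate parametrisation
set.seed(3)
time <- rexp(50, rate = 0.2)
stat <- rep(1, 50)

# Negative log-likelihood in p = log(rate)
negloglik <- function(p) -(sum(stat) * p - exp(p) * sum(time))

res_toy <- nlm(negloglik, 0, hessian = TRUE)

# Inverse of the observed information matrix gives the asymptotic variance
vcov_hat <- solve(res_toy$hessian)
se       <- sqrt(diag(vcov_hat))

# 95% CI on the log scale, then back-transformed
est <- res_toy$estimate
ci  <- exp(est + c(-1, 1) * qnorm(0.975) * se)
c(rate = exp(est), lower = ci[1], upper = ci[2])
```

Because nlm minimises the negative log-likelihood, its Hessian is already the observed information matrix, so no sign change is needed before inverting. Building the interval on the log scale and exponentiating keeps the limits positive, which is the usual choice for a hazard ratio.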
67
Using parfm library
library(parfm)
#Create four column vectors, four different variables
herd<-as.factor(insemfix$herdnr);timeto<-(insemfix$end*12/365.25)
stat<-insemfix$score;heifer<-insemfix$par2
insem<-data.frame(herd=herd,timeto=timeto,stat=stat,heifer=heifer)
parfm(Surv(timeto,stat)~heifer,cluster="herd",data=insem,frailty="gamma")
68
Interpretation of frailty variance
The parameter $\theta$ refers to the variability at the hazard level: difficult to interpret!
Maybe plot the hazard function for subjects with a particular frailty
69
Plotting hazard of insemination for multiparous cows
#Interpretation of parameters
lambda<-0.174;theta<-0.394;rho<-1.769
#Rescale lambda from months to days
lambda<-lambda*((365.25/12)^(-rho))
time<-seq(1,350);timet<-time+29.5
#Hazard for frailty 1 and for the 5% and 95% quantiles of the frailty distribution
h1f0<-lambda*rho*time^(rho-1)
h1f05<-qgamma(0.05,1/theta,1/theta)*lambda*rho*time^(rho-1)
h1f95<-qgamma(0.95,1/theta,1/theta)*lambda*rho*time^(rho-1)
70
Plotting hazard of insemination for multiparous cows
#Hazards
#h2f05 and h2f95 (heifers) set the common y-axis range and must be computed before this plot
par(mfrow=c(1,2));par(adj=0.5);par(cex=1.2)
plot(c(0,360),c(min(h1f05,h2f05),max(h1f95,h2f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h1f0,lty=1,lwd=3);lines(timet,h1f05,lty=1,lwd=1)
lines(timet,h1f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Multiparous cows")
71
Exercise
Plot hazard of insemination for heifers
72
Plotting hazard of insemination for heifers
#Interpretation of parameters
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
h2f0<-lambda*rho*exp(beta)*time^(rho-1)
h2f05<-qgamma(0.05,1/theta,1/theta)*lambda*rho*exp(beta)*time^(rho-1)
h2f95<-qgamma(0.95,1/theta,1/theta)*lambda*rho*exp(beta)*time^(rho-1)
73
Plotting hazard of insemination
#Hazards
par(mfrow=c(1,2));par(adj=0.5);par(cex=1.2)
#Left panel: multiparous cows
plot(c(0,360),c(min(h1f05,h2f05),max(h1f95,h2f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h1f0,lty=1,lwd=3);lines(timet,h1f05,lty=1,lwd=1)
lines(timet,h1f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Multiparous cows")
#Right panel: heifers
par(adj=0.5)
plot(c(0,360),c(min(h1f05,h2f05),max(h1f95,h2f95)),type='n',xlab="Time after calving (days)",ylab="hazard")
lines(timet,h2f0,lty=1,lwd=3);lines(timet,h2f05,lty=1,lwd=1)
lines(timet,h2f95,lty=1,lwd=1)
par(adj=0);text(1,0.14,"Heifers")
74
Interpretation of frailty variance
The parameter $\theta$ refers to the variability at the hazard level: difficult to interpret!
[Figure: hazard functions in two panels, Multiparous cows and Heifers]
75
Transformation to median
Density function of transformation of random variable
76
Median of Weibull distribution
To find the median survival time for cluster i, put $S_i(t) = 0.5$
77
Density of median for Weibull distribution
The density function is then obtained by a change of variable from the frailty to the median
78
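Density of median for Weibull distribution
Leading to
$f_M(m) = \rho\left(\frac{\log 2}{\theta\lambda}\right)^{1/\theta}\left(\frac{1}{m}\right)^{1+\rho/\theta}\frac{1}{\Gamma(1/\theta)}\exp\left(-\frac{\log 2}{\theta\lambda m^{\rho}}\right)$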
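The density of the median can be checked against the gamma frailty distribution itself. With median $M = (\log 2/(\lambda U))^{1/\rho}$, the event $M \le m$ is the same as $U \ge \log 2/(\lambda m^{\rho})$, so integrating the density over an interval should reproduce a difference of gamma probabilities. The sketch below uses the parameter values quoted in the slides for multiparous cows; the helper cdf_median and the integration limits are introduced here for illustration.

```r
# Parameter values taken from the slides (lambda rescaled from months to days)
lambda <- 0.174; theta <- 0.394; rho <- 1.769
lambda <- lambda * ((365.25 / 12)^(-rho))

# Density of the cluster-specific median, as coded in the slides
calcm <- function(m) {
  rho * (log(2) / (theta * lambda))^(1 / theta) * (1 / m)^(1 + rho / theta) *
    (1 / gamma(1 / theta)) * exp(-log(2) / (theta * lambda * m^rho))
}

# Distribution function of the median derived directly from the gamma frailty:
# M <= m  if and only if  U >= log(2) / (lambda * m^rho)
cdf_median <- function(m) {
  1 - pgamma(log(2) / (lambda * m^rho), shape = 1 / theta, rate = 1 / theta)
}

by_integration <- integrate(calcm, lower = 1, upper = 100)$value
by_cdf         <- cdf_median(100) - cdf_median(1)
c(by_integration = by_integration, by_cdf = by_cdf)
```

For heifers the same check applies after replacing lambda by lambda*exp(beta).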
79
Plotting density function of median for multiparous cows
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
#Medians
calcm<-function(m){
  rho*(log(2)/(theta*lambda))^(1/theta)*(1/m)^(1+rho/theta)*(1/gamma(1/theta))*exp(-log(2)/(theta*lambda*m^(rho)))
}
timedens<-seq(1,200,1)
densmd1<-sapply(timedens,calcm)
plot(c(0,230),c(min(densmd1),max(densmd1)),type='n',xlab="Median time to first insemination (days)",ylab="Density function median")
lines(timedens+29.5,densmd1,lty=1,lwd=3)
80
Exercise
Plot density of median for heifers
81
Plotting density function of median for multiparous cows and heifers
lambda<-0.174;theta<-0.394;rho<-1.769;beta<- -0.153
lambda<-lambda*((365.25/12)^(-rho))
#Medians
calcm<-function(m){
  rho*(log(2)/(theta*lambda))^(1/theta)*(1/m)^(1+rho/theta)*(1/gamma(1/theta))*exp(-log(2)/(theta*lambda*m^(rho)))
}
timedens<-seq(1,200,1)
densmd1<-sapply(timedens,calcm)
#Heifers: lambda is multiplied by exp(beta) before calcm is evaluated again
lambda<-lambda*exp(beta)
densmd2<-sapply(timedens,calcm)
plot(c(0,230),c(min(densmd1,densmd2),max(densmd1,densmd2)),type='n',xlab="Median time to first insemination (days)",ylab="Density function median")
lines(timedens+29.5,densmd1,lty=1,lwd=3);lines(timedens+29.5,densmd2,lty=2,lwd=3)
legend(130,0.015,legend=c("Multiparous","Heifer"),lty=c(1,2))
82
Variability of median time to first insemination between herds
83
Exercise
Derive the density function for the percentage survival at a particular time t
84
Transformation to percentage survival
The percentage in cluster i with first insemination at time t is a function of the frailty of cluster i
85
Interpretation of frailty variance in terms of % events at time t
The density function is then obtained by a change of variable from the frailty to the percentage
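A possible way to code this transformation is sketched below, assuming the Weibull gamma frailty model of the insemination example, so that the proportion with first insemination by time t in a herd with frailty U is taken to be $P = 1 - \exp(-U\lambda t^{\rho})$ for multiparous cows. Solving for U and applying the change of variable gives the density. The function names, the time point t = 100 and the integration limits are assumptions for illustration.

```r
# Parameter values taken from the slides (lambda rescaled from months to days)
lambda <- 0.174; theta <- 0.394; rho <- 1.769
lambda <- lambda * ((365.25 / 12)^(-rho))
t_days <- 100   # assumed time point, in days on the model's time scale

# Density of the herd-specific proportion P = 1 - exp(-U * lambda * t^rho)
dens_pct <- function(p, t) {
  c_t <- lambda * t^rho
  u   <- -log(1 - p) / c_t                 # frailty that gives proportion p
  dgamma(u, shape = 1 / theta, rate = 1 / theta) / ((1 - p) * c_t)
}

# Distribution function of P, directly from the gamma frailty
cdf_pct <- function(p, t) {
  pgamma(-log(1 - p) / (lambda * t^rho), shape = 1 / theta, rate = 1 / theta)
}

by_integration <- integrate(dens_pct, lower = 0.01, upper = 0.99, t = t_days)$value
by_cdf         <- cdf_pct(0.99, t_days) - cdf_pct(0.01, t_days)
c(by_integration = by_integration, by_cdf = by_cdf)
```

For heifers, lambda would again be replaced by lambda*exp(beta). Plotting dens_pct over a grid of p for a few time points shows how the between-herd spread in the proportion inseminated changes over time.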
86
Variability of % first insemination at time t between herds
[Figure: two panels, Multiparous cows and Heifers]
87
Efficiency comparisons in the reconstitution data example
Estimates (se) for reconstitution data