Workshop frailty models Luc Duchateau, Rosemary Nguti and Paul Janssen
Contents The statistical package R The data set: time to first insemination Inference for shared gamma frailty models –Theoretical considerations –Fitting parametric and semiparametric models More flexible shared gamma frailty models –Time-varying covariates –Smoothing splines Other approaches for frailty models –Choice of the frailty density –A Bayesian approach for frailty models
The statistical package R Freeware (unlike Splus) Powerful statistical package Data handling a bit less powerful Downloadable from the Internet –Base –Libraries From within R Download library yourself
Installing R via Internet Go to Choose option download Choose closest mirrorsite Choose operating system Windows (95 and later) Choose subdirectory base Choose rw1090.exe
Installing additional packages Automatically within R –Menu item « Packages» « Install Packages from CRAN … » Choose « survival » –New session, specify « library(survival) » Download zipped package from CRAN –Unzip it under directory « c:\Program Files\R\rw1091\library »
The data set: Time to first insemination Database of regional Dairy Herd Improvement Association (DHIA) –Milk recording service –Artificial insemination –Select sample –Subset of 2567 cows from 49 dairy farms
Fixed covariates data set insemfix.dat
Time-varying covariates data set insemtvc.dat
Fitting parametric frailty models (1) Read the data –insemfix<-read.table("c://insemfix.dat",header=T) Create vectors heifer<- herd stat timeto insem$heifer
Fitting parametric frailty models (2) Derive quantities n, Di and e –n<-length(levels(as.factor(herd)) 49 –Di<-aggregate(stat,by=list(herd),FUN=sum)[,2] –e<-sum(Di) … … 34
Fitting parametric frailty models (3) Observable likelihood for constant hazard
Fitting parametric frailty models (4) Calculate observable likelihood for parameters –h=1/p[1], =1/p[2], =p[3] Cumulative hazard Hazard Timeto*exp(heifer*p[3])/p [1] cumhaz aggregate(cumhaz,by=list(herd),FUN=sum)[,2]cumhaz stat*log(exp(heifer*p[3])/p [1])) lnhaz aggregate(lnhaz,by=list(herd),FUN=sum)[,2]lnhaz
Fitting parametric frailty models (5) likelihood.exponential<-function(p){ cumhaz<-(timeto*exp(heifer*p[3]))/p[1] cumhaz<-aggregate(cumhaz,by=list(herd),FUN=sum)[,2] lnhaz<-stat*log(exp(heifer*p[3])/p[1]) lnhaz<-aggregate(lnhaz,by=list(herd),FUN=sum)[,2] lik<-e*log(1/p[2])-n*log(gamma(p[2]))+sum(log(gamma(Di+p[2])))- sum((Di+p[2])*log(1+cumhaz/p[2]))+sum(lnhaz) -lik}
Fitting parametric frailty models (6) initial<-c(100,17,0) t<-nlm(likelihood.exponential,initial,print.level=2) h<-1/t$estimate[1] h theta<-1/t$estimate[2] theta beta<-t$estimate[3] beta
Interpretation From , the heterogeneity parameter, to density for median time to event calcm<-function(m){ (log(2)/(theta*h))^(1/theta) *(1/m)^(1+1/theta)* (1/gamma(1/theta)) *exp(-log(2)/(theta*h*m)) } med<-seq(1,100) fmed<-sapply(med,calcm) plot(med,fmed,type="l",xlab="Median time",ylab="Density")
Interpretation
Fitting semiparametric models library(survival) coxph(Surv(timeto,stat)~heifer+frailty(herd)) Call: coxph(formula = Surv(timeto, stat) ~ heifer + frailty(herd)) coef se(coef) se2 Chisq DF p heifer e-08 frailty(herd) e+00 Iterations: 10 outer, 23 Newton-Raphson Variance of random effect= I-likelihood = Degrees of freedom for terms= Likelihood ratio test=281 on 41.9 df, p=0 n= 2579 HR=exp(-0.24)=0.79
Time-varying covariates data Contribtion to the denominator: t ij =10 : 2.29 t ij =55 : 2.61
Fitting Cox models with time-varying covariates (1) #Read data insemtvc<-read.table("c://insemtvc.dat",header=T) #Create four column vectors, four different variables herd<-insemtvc$herd begin<-insemtvc$begin end<-insemtvc$end stat<-insemtvc$stat ureum<-insemtvc$ureum heifer<-insemtvc$heifer library(survival) coxph(Surv(begin,end,stat)~heifer+ureum+frailty(herd))
Fitting Cox models with time-varying covariates (2) Call: coxph(formula = Surv(begin, end, stat) ~ heifer + ureum + frailty(herd)) coef se(coef) se2 Chisq DF p heifer e-07 ureum e-01 frailty(herd) e+00 Iterations: 10 outer, 23 Newton-Raphson Variance of random effect= I-likelihood = Degrees of freedom for terms= Likelihood ratio test=251 on 41.3 df, p=0 n= HR=exp( )=0.997
Fitting Cox models with smoothing splines (1) #Read data insemtvc<-read.table("c://insemtvc.dat",header=T) #Create four column vectors, four different variables herd<-insemtvc$herd begin<-insemtvc$begin end<-insemtvc$end stat<-insemtvc$stat ureum<-insemtvc$ureum heifer<-insemtvc$heifer library(survival) fit<-coxph(Surv(begin,end,stat)~heifer+pspline(ureum,nterm=3,theta=0.5)) temp<-order(ureum) plot(ureum[temp],predict(fit,type='terms')[temp,2],type='l',xlab='Ureum', ylab='Risk score')
Fitting Cox models with smoothing splines (2) Call: coxph(formula = Surv(begin, end, stat) ~ heifer + pspline(ureum, nterm = 3, theta = 0.5)) coef se(coef) se2 Chisq DF p heifer e-07 pspline(ureum, nterm = 3, e-01 pspline(ureum, nterm = 3, e-02 Iterations: 1 outer, 3 Newton-Raphson Theta= 0.5 Degrees of freedom for terms= Likelihood ratio test=30.2 on 3.18 df, p=1.56e-06 n= 23076
Fitting Cox models with smoothing splines (3)
Fitting the shared normal frailty model (1) #Read data insemfix<- read.table("c://docs//onderwijs//frailtymodels//insemfix.dat",header=T) #Create four column vectors, four different variables herd<-insemfix$herd timeto<-insemfix$timeto-min(insemfix$timeto) stat<-insemfix$stat heifer<-insemfix$heifer coxph(Surv(timeto,stat)~heifer+frailty(herd,distribution="gaussian"))
Fitting the shared normal frailty model (2) Call: coxph(formula = Surv(timeto, stat) ~ heifer + frailty(herd, distribution = "gaussian")) coef se(coef) se2 Chisq DF p heifer e-08 frailty(herd, distributio e+00 Iterations: 6 outer, 14 Newton-Raphson Variance of random effect= Degrees of freedom for terms= Likelihood ratio test=276 on 39.5 df, p=0 n= 2579