Workshop Semiparametric frailty models

Workshop Semiparametric frailty models coxph and coxme

The data set: Time to first insemination
- Database of regional Dairy Herd Improvement Association (DHIA)
- Milk recording service
- Artificial insemination
- Select sample
- Subset of 2567 cows from 49 dairy farms

Fixed covariates data set insemfix.dat

Time-varying covariates data set insemtvc.dat

Fitting semiparametric models with fixed covariates

    library(survival)
    setwd("c://docs//onderwijs//survival//Flames//")
    insemfix<-read.table("insemfix.dat",header=T)
    # data=insemfix added so that timeto, stat, heifer and herd are found in the data frame
    coxph(Surv(timeto,stat)~heifer+frailty(herd),data=insemfix)

Output:

    Call:
    coxph(formula = Surv(timeto, stat) ~ heifer + frailty(herd))

                  coef  se(coef) se2    Chisq DF   p
    heifer        -0.24 0.0432   0.0430  30.9  1.0 2.7e-08
    frailty(herd)                       205.3 40.9 0.0e+00

    Iterations: 10 outer, 23 Newton-Raphson
         Variance of random effect= 0.123   I-likelihood = -16953
    Degrees of freedom for terms=  1.0 40.9
    Likelihood ratio test=281  on 41.9 df, p=0  n= 2579

Hazard ratio for heifer: HR=exp(-0.24)=0.79
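
The hazard ratio quoted on this slide follows directly from the printed coefficient. A minimal sketch in base R, assuming a Wald-type 95% interval built from the printed coef and se(coef); the interval itself is not shown on the slide, and the names b, se, hr and ci are illustrative only:

```r
# Values copied from the printed coxph output for heifer
b  <- -0.24    # coef
se <- 0.0432   # se(coef)

# Hazard ratio and Wald-type 95% confidence interval (assumed form: exp(b +/- z * se))
hr <- exp(b)
ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)

round(c(HR = hr, lower = ci[1], upper = ci[2]), 3)
```

An interval that stays below 1 agrees with the small p-value printed for heifer.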

Exercise
- Can we maximise the penalised partial likelihood at once?
- Depict the penalised partial likelihood as a function of theta
- The PPL is the second element of the loglik component of the fitted coxph object (not of the data frame insemfix), e.g., for a fit stored as coxfit: coxfit$loglik[2]

Penalised partial log likelihood

    detloglik<-function(x) {coxph(Surv(timeto,stat)~heifer+frailty(herd,theta=x),data=insemfix)$loglik[2]}
    thetlist<-seq(0.05,0.15,0.005)
    logliklist<-sapply(thetlist,detloglik)
    plot(thetlist,logliklist,type='l',xlab='theta',ylab='penalised partial loglikelihood')
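
Since insemfix.dat is not distributed with R, the same grid evaluation can be tried on a data set that is. The sketch below assumes that the kidney data shipped with the survival package is an acceptable stand-in, with patient id playing the role of herd and sex the role of heifer; the names ppl.kidney, theta.grid and ppl are illustrative. Following the slide, loglik[2] is taken as the penalised partial log likelihood.

```r
library(survival)  # provides coxph(), frailty() and the kidney data set

# Assumption: kidney stands in for insemfix.dat (clusters = patients, not herds)
ppl.kidney <- function(x) {
  # theta = x fixes the frailty variance, so only the inner loop is run
  coxph(Surv(time, status) ~ sex + frailty(id, theta = x),
        data = kidney)$loglik[2]
}

theta.grid <- seq(0.1, 1, 0.1)
ppl <- sapply(theta.grid, ppl.kidney)

# Grid value at which the evaluated curve is highest
theta.grid[which.max(ppl)]

# plot(theta.grid, ppl, type = 'l') draws the curve in the same way as on the slide
```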

Exercise
- Do we obtain the maximum profile likelihood?
- Depict the profile likelihood as a function of theta
- Theta can be fixed at x by frailty(herd,theta=x)
- The marginal likelihood can be obtained from the fitted coxph object (not from the data frame insemfix), e.g., for a fit stored as coxfit: coxfit$history[[1]]$c.loglik

Profile log likelihood

    detloglik<-function(x) {coxph(Surv(timeto,stat)~heifer+frailty(herd,theta=x),data=insemfix)$history[[1]]$c.loglik}
    thetlist<-seq(0.05,0.15,0.005)
    logliklist<-sapply(thetlist,detloglik)
    plot(thetlist,logliklist,type='l',xlab='theta',ylab='profile loglikelihood')

Adjusting the number of outer iterations

    coxfit<-coxph(Surv(timeto,stat)~heifer+frailty(herd,eps=0.00000001),data=insemfix,outer.max=100)
    coxfit

Output:

    Call:
    coxph(formula = Surv(timeto, stat) ~ heifer + frailty(herd, eps = 1e-08), outer.max = 100)

                              coef   se(coef) se2   Chisq DF   p
    heifer                    -0.241 0.0431   0.043  31.1  1.0 2.4e-08
    frailty(herd, eps = 1e-08                       194.5 38.5 0.0e+00

    Iterations: 21 outer, 39 Newton-Raphson
         Variance of random effect= 0.0878   I-likelihood = -16952.1
    Degrees of freedom for terms=  1.0 38.5
    Likelihood ratio test=277  on 39.5 df, p=0  n= 2579

Time-varying covariates data
Contribution to the denominator:
- tij=10 : 2.29
- tij=55 : 2.61

Fitting Cox models with time-varying covariates (1)

    #Read data
    setwd("c://docs//onderwijs//survival//Flames//")
    insemtvc<-read.table("insemtvc.dat",header=T)
    coxph(Surv(begin,end,stat)~ureum+frailty(herd,eps=0.0000001),
          outer.max=100,data=insemtvc)

Fitting Cox models with time-varying covariates (2)

    > coxph(Surv(begin,end,stat)~ureum+frailty(herd,eps=0.0000001),
            outer.max=100,data=insemtvc)
    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum + frailty(herd, eps = 1e-07),
        data = insemtvc, outer.max = 100)

                              coef    se(coef) se2    Chisq   DF  p
    ureum                     -0.0525 0.0187   0.0186    7.85   1 0.0051
    frailty(herd, eps = 1e-07                         1238.47 165 0.0000

    Iterations: 35 outer, 82 Newton-Raphson
         Variance of random effect= 0.334   I-likelihood = -69868.9
    Degrees of freedom for terms=   1 165
    Likelihood ratio test=2066  on 166 df, p=0  n= 93608

Alternative semiparametric models
- Unadjusted and marginal semiparametric model
- Stratified semiparametric model
- Semiparametric frailty model

Unadjusted semiparametric model

    ureumtv.unadjust<-coxph(Surv(begin,end,stat)~ureum,data=insemtvc)
    summary(ureumtv.unadjust)

Output:

    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum, data = insemtvc)

      n= 93608
             coef exp(coef) se(coef)     z     p
    ureum -0.0273     0.973   0.0162 -1.68 0.093

          exp(coef) exp(-coef) lower .95 upper .95
    ureum     0.973       1.03     0.943      1.00

    Rsquare= 0   (max possible= 0.779 )
    Likelihood ratio test= 2.83  on 1 df,   p=0.0926
    Wald test            = 2.82  on 1 df,   p=0.0929
    Score (logrank) test = 2.82  on 1 df,   p=0.0929

Marginal semiparametric model

    ureumtv.marg<-coxph(Surv(begin,end,stat)~ureum+cluster(herd),data=insemtvc)
    summary(ureumtv.marg)

Output:

    Call:
    coxph(formula=Surv(begin,end,stat)~ureum+cluster(herd),data=insemtvc)

      n= 93608
             coef exp(coef) se(coef) robust se      z    p
    ureum -0.0273     0.973   0.0162    0.0285 -0.957 0.34

          exp(coef) exp(-coef) lower .95 upper .95
    ureum     0.973       1.03      0.92      1.03

    Rsquare= 0   (max possible= 0.779 )
    Likelihood ratio test= 2.83  on 1 df,   p=0.0926
    Wald test            = 0.92  on 1 df,   p=0.338
    Score (logrank) test = 2.82  on 1 df,   p=0.0929,   Robust = 0.91  p=0.339

Stratified semiparametric model

    ureumtv.strat<-coxph(Surv(begin,end,stat)~ureum+strata(herd),data=insemtvc)
    summary(ureumtv.strat)

Output:

    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum + strata(herd), data = insemtvc)

      n= 93608
             coef exp(coef) se(coef)     z     p
    ureum -0.0588     0.943   0.0198 -2.97 0.003

          exp(coef) exp(-coef) lower .95 upper .95
    ureum     0.943       1.06     0.907      0.98

    Rsquare= 0   (max possible= 0.444 )
    Likelihood ratio test= 8.86  on 1 df,   p=0.00291
    Wald test            = 8.83  on 1 df,   p=0.00296
    Score (logrank) test = 8.83  on 1 df,   p=0.00296

Semiparametric frailty model

    ureumtv.frail<-coxph(Surv(begin,end,stat)~ureum+frailty(herd,eps=0.0000001),outer.max=100,data=insemtvc)
    summary(ureumtv.frail)

Output:

    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum + frailty(herd, eps = 1e-07),
        data = insemtvc, outer.max = 100)

      n= 93608
                              coef    se(coef) se2    Chisq  DF  p      exp(coef) lower.95 upper.95
    ureum                     -0.0525 0.0187   0.0186    7.85   1 0.0051 0.949     0.915    0.984
    frailty(herd, eps = 1e-07                         1238.5  165 0.0000

    Iterations: 35 outer, 60 Newton-Raphson
         Variance of random effect= 0.334   I-likelihood = -69868.9
    Degrees of freedom for terms=   1 165
    Rsquare= 0.022   (max possible= 0.779 )
    Likelihood ratio test= 2066  on 166 df,   p=0
    Wald test            = 7.85  on 166 df,   p=1

Conclusions
- With adjustment for herd, either by stratification (p=0.003) or by frailties (p=0.0051), the ureum effect is significant
- Without adjustment for herd, whether the standard error is left as is (unadjusted model, p=0.093) or made robust (marginal model, p=0.34), the ureum effect is not significant
- Why? Plot the herd specific ureum effect as a function of mean herd ureum concentration
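
The four fits can be put side by side to make the contrast concrete. The sketch below only re-uses the coefficients and standard errors printed on the preceding slides and recomputes z and a two-sided normal p-value from them; the data frame fits and its column names are illustrative, and small differences from the printed p-values are due to rounding of the printed inputs.

```r
# Coefficients and standard errors copied from the printed summaries
fits <- data.frame(
  model = c("unadjusted", "marginal", "stratified", "frailty"),
  coef  = c(-0.0273, -0.0273, -0.0588, -0.0525),
  se    = c(0.0162, 0.0285, 0.0198, 0.0187)  # marginal model: robust se
)

fits$HR <- exp(fits$coef)           # hazard ratio per unit of ureum
fits$z  <- fits$coef / fits$se      # Wald statistic
fits$p  <- 2 * pnorm(-abs(fits$z))  # two-sided p-value

print(fits, digits = 3)
```

The two herd-adjusted fits give a larger estimated effect (about -0.05 to -0.06) than the two fits that ignore herd in the linear predictor (-0.027), which is what drives the difference in significance.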

Exercise
Plot percentage events as a function of mean herd ureum concentration, using the following scheme:
- Derive average ureum conc. for each cow
      avureum.cow<-tapply(insemtvc$ureum,list(insemtvc$cowid),mean)
- Derive average ureum conc. for each herd
- Derive percentage events
- Plot

Mean ureum and %insemination

    setwd("c://docs//onderwijs//survival//Flames//")
    insemtvc<-read.table("insemtvc.dat",header=T)
    # convenience copies (not used further on this slide)
    herd<-insemtvc$herdnr;timeto<-(insemtvc$end*12/365.25)
    stat<-insemtvc$stat;heifer<-insemtvc$par2
    # average ureum per cow, herd of each cow, average ureum per herd
    avureum.cow<-tapply(insemtvc$ureum,list(insemtvc$cowid),mean)
    herd.cow<-tapply(insemtvc$herd,list(insemtvc$cowid),mean)
    avureum.herd<-tapply(avureum.cow,list(herd.cow),mean)
    # number of inseminations (events) per herd
    numinsem.herd<-tapply(insemtvc$stat,insemtvc$herd,sum)
    # number of animals per herd: count cows, not rows, because insemtvc has several rows per cow
    totanim.herd<-tapply(herd.cow,list(herd.cow),length)
    percinsem<- 100*numinsem.herd/totanim.herd
    plot(avureum.herd,percinsem,xlab="Ureum",ylab="Insemination (%)")
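
Because the time-varying data set holds several rows per cow, the number of rows in a herd is not the number of animals in that herd. A small sketch on made-up data (the data frame toy and its values are hypothetical; only the column names herd, cowid and stat follow the slides) shows the difference:

```r
# Hypothetical counting-process data: cow 11 and cow 21 each contribute two rows
toy <- data.frame(
  herd  = c(1, 1, 1, 2, 2),
  cowid = c(11, 11, 12, 21, 21),
  stat  = c(0, 1, 0, 0, 0)
)

rows.herd   <- tapply(toy$stat, toy$herd, length)                        # rows per herd
cows.herd   <- tapply(toy$cowid, toy$herd, function(x) length(unique(x)))  # animals per herd
events.herd <- tapply(toy$stat, toy$herd, sum)                           # events per herd

# Percentage of animals with an event, per herd
perc.events <- 100 * events.herd / cows.herd
perc.events
```

Dividing by rows.herd instead of cows.herd would understate the percentage for every herd whose cows have more than one record.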

Exercise
Plot frailty term as a function of mean herd ureum concentration, using the following scheme:
- Save frailties from Cox frailty model
- Derive average ureum conc. for each herd
- Plot

Mean ureum and frailties

    ureumtv.frail<-coxph(Surv(begin,end,stat)~ureum+
        frailty(herd,eps=0.0000001),outer.max=100,data=insemtvc)
    avureum.cow<-tapply(insemtvc$ureum,list(insemtvc$cowid),mean)
    # herd.cow must be defined before it is used to average over herds
    herd.cow<-tapply(insemtvc$herd,list(insemtvc$cowid),mean)
    avureum.herd<-tapply(avureum.cow,list(herd.cow),mean)
    stat.herd<-tapply(insemtvc$stat,list(insemtvc$herd),sum)
    results<-data.frame(frailty=ureumtv.frail$frail,ureum=avureum.herd,stat=stat.herd)
    # herds without any insemination are marked with a different symbol
    results0<-results[results$stat==0,]
    plot(results$frailty,results$ureum,xlab="Frailty",ylab="Mean ureum concentration")
    points(results0$frailty,results0$ureum,pch=3)

Mean ureum and frailties plot

Exercise
- Derive herd specific ureum effects
- Fit the model for each herd separately

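Herd specific ureum effect

    avureum.cow<-tapply(insemtvc$ureum,list(insemtvc$cowid),mean)
    herd.cow<-tapply(insemtvc$herd,list(insemtvc$cowid),mean)
    avureum.herd<-tapply(avureum.cow,list(herd.cow),mean)
    # fit the Cox model within a single herd and return the ureum coefficient
    fitmod<-function(herdnr) {coxph(Surv(begin,end,stat)~ureum,subset=(insemtvc$herd==herdnr),data=insemtvc)$coef}
    # loop over each herd once (sorted, to match the order of avureum.herd) rather than over every cow
    ureumeffect.herd<-sapply(sort(unique(insemtvc$herd)),fitmod)

Output:

    Error in fitter(X, Y, strats, offset, init, control, weights = weights, :
      Can't fit a Cox model with zero failures

The fit fails as soon as a herd without any insemination is reached.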

Exercise
- Refit the frailty model and the marginal model leaving out the herds without inseminations

Results deleting farms without inseminations

    insemtvcr<-insemtvc[insemtvc$herd!=1302100819 & insemtvc$herd!=1302310377 &
        insemtvc$herd!=3103311094 & insemtvc$herd!=3402102723 &
        insemtvc$herd!=3403801839 & insemtvc$herd!=4300110030,]
    ureumtvcr.unadjust<-coxph(Surv(begin,end,stat)~ureum,data=insemtvcr)
    summary(ureumtvcr.unadjust)

Output:

    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum, data = insemtvcr)

      n= 90232
             coef exp(coef) se(coef)     z     p lower .95 upper .95
    ureum -0.0433     0.958   0.0163 -2.65 0.008     0.927     0.989

    Rsquare= 0   (max possible= 0.789 )
    Likelihood ratio test= 7.04  on 1 df,   p=0.00795
    Wald test            = 7.02  on 1 df,   p=0.00804
    Score (logrank) test = 7.02  on 1 df,   p=0.00805

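Deleting farms without inseminations

    #alternative
    # stat.herd (number of inseminations per herd) is computed on the "Mean ureum and frailties" slide
    herdstat<-data.frame(herd=names(stat.herd),numinsem=stat.herd)
    insemtvcr2<-merge(insemtvc,herdstat,by="herd")
    # keep only herds with at least one insemination
    # (step implied by the slide title; it is not shown on the original slide)
    insemtvcr2<-insemtvcr2[insemtvcr2$numinsem>0,]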
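
Both approaches amount to counting events per herd and keeping the herds with a positive count. A minimal sketch on made-up data, assuming only the column names herd and stat from the slides (the data frame toy, its values, and the names events.herd, keep and toy.r are hypothetical), avoids typing herd numbers by hand:

```r
# Hypothetical data: herd 2 has no events at all
toy <- data.frame(
  herd = c(1, 1, 2, 2, 3),
  stat = c(0, 1, 0, 0, 1)
)

# Number of events per herd, then the herd labels with at least one event
events.herd <- tapply(toy$stat, toy$herd, sum)
keep <- names(events.herd)[events.herd > 0]

# Keep only rows belonging to those herds
toy.r <- toy[toy$herd %in% keep, ]
toy.r
```

The same three lines applied to insemtvc would give a reduced data set without hard-coded herd identifiers.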

Exercise
- Fit the frailty model assuming the lognormal distribution for the frailties

Fitting the shared normal frailty model

    coxph(Surv(begin,end,stat)~ureum+frailty(herd,dist="gaussian",eps=0.00001),outer.max=100,data=insemtvc)

Output:

    Call:
    coxph(formula = Surv(begin, end, stat) ~ ureum + frailty(herd, dist = "gaussian",
        eps = 1e-05), data = insemtvc, outer.max = 100)

                              coef    se(coef) se2    Chisq   DF  p
    ureum                     -0.0513 0.0188   0.0186    7.48   1 0.0062
    frailty(herd, dist = "gau                         1581.56 165 0.0000

    Iterations: 6 outer, 21 Newton-Raphson
         Variance of random effect= 0.411
    Degrees of freedom for terms=   1 165
    Likelihood ratio test=2030  on 166 df, p=0  n= 93608

Lognormal frailties versus normal random effects
- Assume U has a lognormal distribution, with U=exp(W) and W~N(μ,σ²)
- The gamma density with mean 1 and variance θ is given by

  $$ f_U(u) = \frac{u^{1/\theta - 1}\exp(-u/\theta)}{\theta^{1/\theta}\,\Gamma(1/\theta)} $$

- For θ = 0.1, 0.5, 1, 1.5, plot the gamma density function and the lognormal density, choosing values for μ and σ² so that
  - Var(exp(W)) = θ and μ = 0
  - Var(exp(W)) = θ and E(U) = 1

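Lognormal frailties versus normal random effects (2)
- With the mean μ of W equal to 0 and Var(U) = θ, we have

  $$ \mathrm{Var}(U) = e^{\sigma^2}\left(e^{\sigma^2}-1\right) = \theta \quad\Rightarrow\quad \sigma^2 = \log\left(\frac{1+\sqrt{1+4\theta}}{2}\right) $$

- With E(U) = 1 and Var(U) = θ, we have

  $$ E(U) = e^{\mu+\sigma^2/2} = 1 \quad\Rightarrow\quad \mu = -\frac{\sigma^2}{2}, \qquad \mathrm{Var}(U) = e^{\sigma^2}-1 = \theta \quad\Rightarrow\quad \sigma^2 = \log(\theta+1) $$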
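
The two parameter choices used in the plotting code on the next slide (s1 with mean 0 for W, and s2 with m2 for E(U)=1) can be checked against the standard lognormal moment formulas. A short sketch in base R; the helper names lnorm.mean and lnorm.var are illustrative and not part of the slides:

```r
theta <- c(0.1, 0.5, 1, 1.5)

# Mean and variance of U = exp(W) with W ~ N(m, s2)
lnorm.mean <- function(m, s2) exp(m + s2 / 2)
lnorm.var  <- function(m, s2) (exp(s2) - 1) * exp(2 * m + s2)

# Case 1: E(W) = 0 and Var(U) = theta
s2.a <- log((1 + sqrt(1 + 4 * theta)) / 2)
m.a  <- 0

# Case 2: E(U) = 1 and Var(U) = theta
s2.b <- log(theta + 1)
m.b  <- -0.5 * s2.b

# Both variance columns should reproduce theta, and mean.b should be 1
cbind(theta,
      var.a  = lnorm.var(m.a, s2.a),
      mean.b = lnorm.mean(m.b, s2.b),
      var.b  = lnorm.var(m.b, s2.b))
```

In the first case the frailty mean exceeds 1 (it equals exp(s2.a/2)), which is why the two lognormal curves differ for the same θ.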

Plotting densities

    theta<-c(0.1,0.5,1,1.5)
    # variance of W when E(W)=0 and Var(exp(W))=theta
    s1<-log((1+sqrt(1+4*theta))/2)
    # variance and mean of W when E(U)=1 and Var(U)=theta
    s2<-log(theta+1)
    m2<- -0.5*s2
    u<-seq(0,3,0.01)
    # dlnorm and dgamma are vectorised over u, so no loop over u is needed
    # lognormal densities with E(W)=0
    dlnorm.theta0.1.EWeq0<-dlnorm(u,0,sqrt(s1[1]))
    dlnorm.theta0.5.EWeq0<-dlnorm(u,0,sqrt(s1[2]))
    dlnorm.theta1.EWeq0<-dlnorm(u,0,sqrt(s1[3]))
    dlnorm.theta1.5.EWeq0<-dlnorm(u,0,sqrt(s1[4]))
    # lognormal densities with E(U)=1
    dlnorm.theta0.1.EUeq1<-dlnorm(u,m2[1],sqrt(s2[1]))
    dlnorm.theta0.5.EUeq1<-dlnorm(u,m2[2],sqrt(s2[2]))
    dlnorm.theta1.EUeq1<-dlnorm(u,m2[3],sqrt(s2[3]))
    dlnorm.theta1.5.EUeq1<-dlnorm(u,m2[4],sqrt(s2[4]))
    # gamma densities with mean 1 and variance theta (shape=1/theta, rate=1/theta)
    dgam.theta0.1.EUeq1<-dgamma(u,1/theta[1],1/theta[1])
    dgam.theta0.5.EUeq1<-dgamma(u,1/theta[2],1/theta[2])
    dgam.theta1.EUeq1<-dgamma(u,1/theta[3],1/theta[3])
    dgam.theta1.5.EUeq1<-dgamma(u,1/theta[4],1/theta[4])