Published by Vincent Poole. Modified over 8 years ago.
Joint Modelling of Accelerated Failure Time and Longitudinal Data
By Yi-Kuan Tseng
Joint work with Professor Jane-Ling Wang and Professor Fushing Hsieh
Tseng Y.K., Hsieh F., and Wang J.L. (2005). Biometrika, 92, pp. 587-603.
I. Introduction
CD4 counts and time to AIDS (or death)
Self and Pawitan (1992), DeGruttola and Tu (1994), Tsiatis et al. (1995), Faucett and Thomas (1996), Wulfsohn and Tsiatis (1997), Bycott and Taylor (1998), Dafni and Tsiatis (1998), Tsiatis and Davidian (2001)
Taylor et al. (1994), LaValley and DeGruttola (1996), Henderson et al. (2000), Wang and Taylor (2001), Xu and Zeger (2001)
f(t): a vector of known functions of time t
U(t): a stochastic process
Bayesian approaches
Conditional score approaches
Two-stage partial likelihood approaches
- truncation causes bias
Joint likelihood approaches
- robust to the distribution of random effects
- unbiased
- efficient
The accelerated failure time (AFT) model is an attractive alternative when the proportional hazards assumption fails.
For time-independent covariates X:
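As a reminder of the standard form, one common parameterization for a scalar, time-independent covariate is sketched below. The sign convention for β (positive β shortens survival) and the symbol U for the baseline failure time are assumptions made here for illustration.

```latex
S(t \mid X) = S_0\!\left(t\, e^{\beta X}\right),
\qquad \text{equivalently} \qquad
\log T = -\beta X + \log U, \quad U \sim S_0 .
```

Under this convention the covariate rescales the time axis by the factor e^{βX} rather than multiplying the hazard.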
For time-dependent covariates X(t), we consider the AFT model in Cox and Oakes (1984):
Biological meaning: allows the entire covariate history to influence subject-specific risk.
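A minimal sketch of how the whole covariate history can enter a time-dependent AFT model, assuming the usual time transformation psi(t) = integral from 0 to t of exp(beta * X(s)) ds, with psi(T) distributed according to the baseline S_0. The function names `psi` and `event_time`, the scalar beta, and the trapezoid rule are illustrative choices, not taken from the slides.

```python
import math

def psi(t, beta, x, n_steps=1000):
    """Trapezoid approximation of integral_0^t exp(beta * x(s)) ds.

    x is a callable giving the covariate value at time s (hypothetical input).
    """
    if t <= 0:
        return 0.0
    h = t / n_steps
    total = 0.5 * (math.exp(beta * x(0.0)) + math.exp(beta * x(t)))
    for k in range(1, n_steps):
        total += math.exp(beta * x(k * h))
    return total * h

def event_time(u, beta, x, tol=1e-8, t_max=1e6):
    """Solve psi(T) = u for T by bisection; psi is increasing in t."""
    if u <= 0:
        raise ValueError("u must be positive")
    lo, hi = 0.0, 1.0
    while psi(hi, beta, x) < u:  # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
        if hi > t_max:
            raise ValueError("psi never reaches u below t_max")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid, beta, x) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With a constant covariate, `psi(t, beta, x)` reduces to `t * exp(beta * x)`, which is the time-independent case. Drawing `u` from a baseline distribution and calling `event_time` gives one way to simulate failure times that depend on the full trajectory.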
For an absolutely continuous S_0, the hazard rate function with covariate history:
If the baseline hazard is unspecified, the expression corresponds to a semi-parametric model.
Robins and Tsiatis (1992): rank estimating equation
Lin and Ying (1995): asymptotic consistency and normality
Hsieh (2003): over-identified estimating equation
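A worked form of the hazard under a time-transformation AFT model may help here. It assumes a scalar β and the transformation ψ(t) defined in the first line. The symbols ψ and X̄(t) (covariate history up to t) are notation introduced for this sketch, not taken from the slide.

```latex
\psi(t) = \int_0^t e^{\beta X(s)}\, ds, \qquad
S\{t \mid \bar X(t)\} = S_0\{\psi(t)\},
\\[4pt]
\lambda\{t \mid \bar X(t)\}
  = -\frac{d}{dt} \log S_0\{\psi(t)\}
  = \lambda_0\{\psi(t)\}\, \psi'(t)
  = \lambda_0\!\left\{ \int_0^t e^{\beta X(s)}\, ds \right\} e^{\beta X(t)} .
```

The chain rule step is why absolute continuity of S_0 is needed: the baseline hazard λ_0 must exist at the transformed time ψ(t).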
Goal of the study: provide effective estimators for β, with an unspecified baseline hazard, and for the parameters of the longitudinal process.
Different assumptions on baseline hazard:
-- Wulfsohn and Tsiatis (1997): discrete baseline hazard with jumps at event times
-- Our assumption: the baseline hazard is a step function.
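To make the step-function assumption concrete, here is a small sketch of a piecewise-constant baseline hazard and its cumulative hazard. The `knots` and `levels` arguments are hypothetical inputs chosen for illustration; the slides do not say where the steps are placed.

```python
import bisect
import math

def step_hazard(u, knots, levels):
    """Baseline hazard equal to levels[j] on [knots[j], knots[j+1]).

    knots must be increasing with knots[0] == 0; the last level extends
    to infinity. Both arguments are illustrative, not from the slides.
    """
    if u < 0:
        raise ValueError("u must be non-negative")
    j = bisect.bisect_right(knots, u) - 1
    return levels[j]

def cumulative_hazard(u, knots, levels):
    """Integral of the step hazard from 0 to u."""
    total = 0.0
    for j, level in enumerate(levels):
        left = knots[j]
        right = knots[j + 1] if j + 1 < len(knots) else math.inf
        if u <= left:
            break
        total += level * (min(u, right) - left)
    return total

def baseline_survival(u, knots, levels):
    """S_0(u) = exp(-cumulative hazard)."""
    return math.exp(-cumulative_hazard(u, knots, levels))
```

A step function keeps the baseline flexible while leaving only finitely many levels to estimate.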
II. Joint AFT and Longitudinal model
Model for longitudinal data:
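A minimal simulation sketch of one plausible longitudinal model. It assumes a linear subject-specific trajectory X_i(t) = b_1i + b_2i * t, bivariate normal random effects b_i with mean (μ1, μ2) and covariance entries σ11, σ12, σ22, and independent normal measurement error with variance σe². This form matches the parameter names in the simulation tables, but it is an assumption rather than a formula taken from this slide. The function names are hypothetical.

```python
import math
import random

def draw_random_effects(mu, sigma, rng):
    """Draw (b1, b2) from a bivariate normal via a 2x2 Cholesky factor."""
    (s11, s12), (_, s22) = sigma
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2

def simulate_subject(times, mu, sigma, sigma_e2, rng):
    """Return (t_ij, W_ij) pairs with W_ij = b1 + b2 * t_ij + noise."""
    b1, b2 = draw_random_effects(mu, sigma, rng)
    sd_e = math.sqrt(sigma_e2)
    return [(t, b1 + b2 * t + rng.gauss(0.0, sd_e)) for t in times]
```

For example, `simulate_subject([0.0, 1.0, 2.0], (1.0, 0.5), ((0.01, -0.001), (-0.001, 0.001)), 0.25, random.Random(0))` uses the target values listed in the simulation study.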
Model for survival:
Joint likelihood:
Assumptions
-- noninformative censoring
-- noninformative measurement schedule t_ij
Both are independent of future covariate history and random effects b_i.
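Under the two noninformativeness assumptions, a joint likelihood of this kind typically factorizes, conditionally on the random effects, into a survival part, a longitudinal part, and the random-effects density. The sketch below uses notation assumed for illustration: V_i is the observed time, Δ_i the event indicator, W_ij the measurement taken at t_ij, n_i the number of measurements for subject i, and θ the full parameter set.

```latex
L(\theta) = \prod_{i=1}^{n} \int
  f(V_i, \Delta_i \mid b_i;\, \beta, \lambda_0)\,
  \Bigl\{ \prod_{j=1}^{n_i} f(W_{ij} \mid b_i;\, \sigma_e^2) \Bigr\}\,
  f(b_i;\, \mu, \Sigma)\; db_i .
```

The integral over b_i is what makes direct maximization awkward and motivates treating the random effects as missing data.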
III. EM Algorithm
M-step:
Complete data likelihood:
Therefore, we have
There is no closed-form expression for β. We may maximize the conditional likelihood by a numerical method.
E-step:
To compute E_i(.), we need knowledge of the conditional density of b_i given the observed data, which can be expressed as:
To derive E_i(.), we may generate M multivariate normal draws of b_i. The accuracy increases as M increases. To obtain higher accuracy with less computing time, we may follow the suggestion in Wei and Tanner (1990): use a small value of M in the initial iterations of the algorithm, and increase M as the algorithm moves closer to convergence.
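The Monte Carlo step can be sketched as a self-normalized weighted average over normal draws, with M growing across EM iterations. The weighting by a user-supplied function, the names `mc_expectation` and `m_schedule`, and the geometric growth rule are all assumptions made for illustration. The slides do not give the exact sampling scheme or schedule.

```python
import math
import random

def mc_expectation(h, sampler, weight, m, rng):
    """Self-normalized Monte Carlo estimate of a conditional expectation.

    sampler(rng) returns one draw of the random effects (e.g. from a
    normal distribution); weight(b) is a non-negative factor such as a
    survival likelihood term. Both are hypothetical callables.
    """
    num = 0.0
    den = 0.0
    for _ in range(m):
        b = sampler(rng)
        w = weight(b)
        num += w * h(b)
        den += w
    return num / den

def m_schedule(iteration, m_start=50, growth=1.5, m_max=5000):
    """Small M in early EM iterations, larger M near convergence."""
    return min(m_max, int(m_start * growth ** iteration))
```

In an EM loop one would call `m_schedule(k)` at iteration `k` and pass the result as `m`, so that early, rough iterations stay cheap and later ones are more accurate.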
We encounter two difficulties when estimating the standard error of the estimate of β:
The EM algorithm involves missing information
- Remedies in Louis (1982) and McLachlan and Krishnan (1997) are valid for a finite-dimensional parameter space.
No explicit profile likelihood
- Need projection onto all other parameters
- However, it is very hard to derive due to λ_0
Bootstrap technique in Efron (1994):
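A generic nonparametric bootstrap sketch for standard errors: resample whole subjects with replacement, refit, and take the spread of the refitted estimates. The `fit` argument is a hypothetical stand-in for the full EM fit and must return a single number. The exact resampling scheme used in the talk is not recoverable from the slide, so this is an assumption.

```python
import random
import statistics

def bootstrap_se(subjects, fit, n_boot=100, seed=0):
    """Return (bootstrap mean, bootstrap SD) of fit over resampled data.

    subjects: list with one entry per subject (all of that subject's data).
    fit: callable mapping a list of subjects to a scalar estimate.
    """
    rng = random.Random(seed)
    n = len(subjects)
    estimates = []
    for _ in range(n_boot):
        resample = [subjects[rng.randrange(n)] for _ in range(n)]
        estimates.append(fit(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Resampling at the subject level keeps each subject's longitudinal measurements and event time together, which preserves the within-subject dependence.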
IV. Simulation Studies
Sample size n=100 with 100 MC replications
(i) Normal random effects without censoring
          β        μ1       μ2       σ11      σ12       σ22      σe²
target    1        1        0.5      0.01     -0.001    0.001    0.25
mean      1.0075   0.9955   0.5013   0.0087   -0.0011   0.0009   0.2528
SD        0.0945   0.0163   0.0055   0.0015   0.0002    0.0002   0.0135

(ii) Normal random effects with censoring
          β        μ1       μ2       σ11      σ12       σ22      σe²
target    1        1        0.5      0.01     -0.001    0.001    0.25
mean      0.9918   0.9944   0.5015   0.0083   -0.0011   0.0009   0.2516
SD        0.1272   0.0249   0.0056   0.0023   0.0004    0.0002   0.0198
(iii) Nonnormal random effects with censoring
                   β        μ1       μ2       σ11      σ12       σ22      σe²
target             1        1        0.5      0.01     -0.001    0.001    0.25
empirical target   1        0.9993   0.6758   0.0104   -0.0058   0.1358   0.2753
mean               0.9950   1.0007   0.6682   0.0099   -0.0006   0.1627   0.2500
SD                 0.1091   0.0140   0.0535   0.0004   0.0036    0.0318   0.0223
V. Application to Medfly Data
The medfly (Mediterranean fruit fly) data:
-- From Carey et al. (1998)
-- We focus on the 251 female medflies with the highest egg production (>1150).
-- Range of event time from 22 to 99
-- Range of total reproduction from 1151 to 2349
-- No censoring or missing data
Relationship between daily egg laying and mortality
-- Proportionality is violated (scaled Schoenfeld residual test, p-value 0.00305)
Initial model:
Log-transformed model:
Parameter estimates derived from the original data and 100 bootstrap samples under the joint AFT model
                 β         μ1       μ2        σ11      σ12       σ22      σe²
fitted values    -0.4340   2.1227   -0.1442   0.3701   -0.0482   0.0068   0.8944
bootstrap mean   -0.4313   2.1112   -0.1429   0.3651   -0.0483   0.0066   0.8958
bootstrap SD     0.0115    0.0375   0.0051    0.0353   0.0002    0.0005   0.0223
Fitting incomplete medfly data:
-- Randomly select 1-7 days as the corresponding schedule times for each individual.
-- Then, add the day of death as the last schedule time. Therefore, each individual may have 2-8 repeated measurements.
-- The sub data set is further censored by an exponential distribution with mean 500 (20% censoring rate).
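The thinning procedure described above can be sketched as follows. It assumes days are numbered from 1 and that the random days are drawn without replacement from the days before death. The slides do not say how measurements falling after a censoring time are handled, so the censoring helper only returns the observed time and event indicator. The names `incomplete_schedule` and `censor` are hypothetical.

```python
import random

def incomplete_schedule(death_day, rng, max_picks=7):
    """Pick 1 to max_picks distinct days before death, then append the day of death."""
    if death_day < 2:
        raise ValueError("need at least one day before death")
    earlier_days = range(1, death_day)
    k = min(rng.randint(1, max_picks), len(earlier_days))
    picks = sorted(rng.sample(earlier_days, k))
    return picks + [death_day]

def censor(death_day, rng, mean=500.0):
    """Apply exponential censoring; return (observed time, event indicator)."""
    c = rng.expovariate(1.0 / mean)  # expovariate takes the rate, i.e. 1 / mean
    return min(float(death_day), c), death_day <= c
```

Each call to `incomplete_schedule` yields between 2 and 8 measurement days, ending with the day of death.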
Parameter estimates derived from the incomplete data and 100 bootstrap samples under the joint AFT model
                 β         μ1       μ2        σ11      σ12       σ22      σe²
fitted values    -0.3890   2.2011   -0.1665   0.2833   -0.0382   0.0051   0.9775
bootstrap mean   -0.3526   2.1986   -0.1575   0.2862   -0.0398   0.0057   0.9712
bootstrap SD     0.0323    0.0461   0.0074    0.0351   0.0046    0.0006   0.0570
The End