1
Lecture 3: Parametric Survival Modeling
Parametric models: examples and nuances in R
2
Parametric Distributions
We’ve discussed a variety of parametric distributions: exponential, Weibull, log-normal, log-logistic, gamma, …
But how do we “fit” a model?
Model parameterizations
Inclusion of coefficients
3
Modeling Homogeneous Population
Relatively “simple”. Once we’ve determined the distribution, we need to estimate its parameters. For example, the exponential has a single parameter, the rate λ.
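As a minimal sketch of what estimating the parameters means for the exponential, the censored-data maximum likelihood estimate of the rate is the number of observed events divided by the total follow-up time. The example assumes the aml data from the survival package (used later in the lecture); the variable names are illustrative.

```r
library(survival)  # provides the aml data set

# Exponential MLE with right censoring: events / total follow-up time
n_events   <- sum(aml$status)   # status == 1 marks an observed death
total_time <- sum(aml$time)     # censored subjects still contribute follow-up
lam_hat    <- n_events / total_time

lam_hat       # estimated hazard rate
1 / lam_hat   # estimated mean survival time
```

The same estimate is what an intercept-only exponential survreg fit reports, on the log scale, as the intercept log(1 / lam_hat).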
4
Covariates
We frequently want to adjust survival for covariates.
Two main approaches: the accelerated failure time (AFT) model and the multiplicative model.
5
Accelerated Failure Time
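Under the AFT model, two populations differ only by a constant rescaling of time: T1 has the same distribution as c·T2, i.e. S1(t) = S2(t/c), where c is a constant. The expected survival time and the median survival time for population 1 are therefore c times those of population 2.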
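The time-scaling idea can be checked numerically. The sketch below assumes a Weibull baseline for population 2 and a hypothetical factor c = 2; all values are illustrative and not taken from the lecture.

```r
# Population 2: Weibull survival times (illustrative shape and scale)
shape  <- 1.5
scale2 <- 20
c_fac  <- 2   # hypothetical acceleration factor

# If T1 = c * T2, then T1 is Weibull with the same shape and a scale multiplied by c
scale1 <- c_fac * scale2

med1  <- qweibull(0.5, shape, scale1)
med2  <- qweibull(0.5, shape, scale2)
mean1 <- scale1 * gamma(1 + 1 / shape)
mean2 <- scale2 * gamma(1 + 1 / shape)

med1 / med2     # ratio of medians equals c
mean1 / mean2   # ratio of means equals c

# The survival curves satisfy S1(t) = S2(t / c)
t  <- c(5, 10, 30)
S1 <- pweibull(t, shape, scale1, lower.tail = FALSE)
S2_rescaled <- pweibull(t / c_fac, shape, scale2, lower.tail = FALSE)
all.equal(S1, S2_rescaled)
```

Note that it is the time axis that is stretched by c; the survival probability at a fixed time t is not multiplied by c.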
6
Accelerated Failure Time
Data include:
Failure time T > 0
Vector of covariates Z' = (Z1, Z2, …, Zp), which may be quantitative or qualitative
Log-transform T to take a linear-model approach.
7
Accelerated Failure Time
When Z = 0, S0(t) is the survival function of exp(μ + σW).
8
Accelerated Failure Time
First consider 2 populations that only differ by 1 unit in zk
9
Accelerated Failure Time
First consider 2 populations that only differ by 1 unit in zk
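A sketch of the standard argument, assuming the usual log-linear AFT form with μ, σ and W as on the earlier slide. The symbol γ for the regression coefficients is an assumption and is not taken from the slides.

```latex
\log T = \mu + \gamma' Z + \sigma W
\;\Longrightarrow\;
T = e^{\gamma' Z}\, e^{\mu + \sigma W},
\qquad
S(t \mid Z) = S_0\!\left(t\, e^{-\gamma' Z}\right).

% Two populations identical except that z_k differs by one unit:
T_{z_k + 1} \overset{d}{=} e^{\gamma_k}\, T_{z_k}
\;\Longrightarrow\;
\frac{\operatorname{median}(T_{z_k + 1})}{\operatorname{median}(T_{z_k})}
= \frac{E[T_{z_k + 1}]}{E[T_{z_k}]}
= e^{\gamma_k}.
```

Under these assumptions, exp(γ_k) is the factor by which a 1-unit increase in z_k stretches (or shrinks) survival time.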
10
Exponential Models in R
Recall: the parameterization of the exponential is the same in R: rexp(n, rate)
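A quick check that R's rate argument is the hazard λ in S(t) = exp(-λt). The rate value, time points and seed below are illustrative.

```r
lam <- 0.05              # illustrative rate
t   <- c(1, 10, 50)

# R's rate argument is the constant hazard: S(t) = exp(-rate * t)
all.equal(pexp(t, rate = lam, lower.tail = FALSE), exp(-lam * t))

set.seed(1)
x <- rexp(1e5, rate = lam)
mean(x)   # should be close to 1 / lam = 20
```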
11
Exponential Models in R
We can run an exponential survival model in R using survreg(formula, data, dist).
R gives us: the intercept μ (and any coefficients) on the log-time scale.
But we can find: the hazard rate. In a model with no covariates, λ = exp(-μ).
12
Exponential Models in R
The distribution of any T is exponential with a constant hazard rate that depends on its covariate values. Because survreg models log time, we can interpret exp(-coefficient) as the hazard ratio corresponding to a 1-unit increase in the covariate.
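A minimal sketch of recovering the group-specific hazard rates and the hazard ratio from an exponential survreg fit, using the aml data and its treatment indicator x. The object names are illustrative.

```r
library(survival)

# Exponential AFT fit with the treatment indicator x from the aml data
fit <- survreg(Surv(time, status) ~ x, data = aml, dist = "exponential")

mu  <- coef(fit)[["(Intercept)"]]
gam <- coef(fit)[["xNonmaintained"]]

# survreg models log(T), so the hazard is exp(-(linear predictor))
lam_maintained    <- exp(-mu)
lam_nonmaintained <- exp(-(mu + gam))

hr <- exp(-gam)   # hazard ratio, Nonmaintained vs. Maintained
c(lam_maintained = lam_maintained, lam_nonmaintained = lam_nonmaintained, hr = hr)
```

A negative coefficient on the log-time scale (shorter survival) corresponds to a hazard ratio above 1.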
13
Weibull Models in R
Recall: now our scale parameter is no longer 1.
Unlike the exponential, the parameterization for the Weibull is different in R.
Random Weibull generation: rweibull(n, shape, scale)
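To make the difference concrete, the sketch below assumes the lecture's Weibull form S(t) = exp(-λ t^α) (the form used in the model-check code later on) and maps it to R's shape and scale arguments. The numeric values are illustrative.

```r
alpha <- 1.2     # illustrative shape
lam   <- 0.02    # illustrative lambda in S(t) = exp(-lam * t^alpha)

# R uses S(t) = exp(-(t / scale)^shape), so shape = alpha and scale = lam^(-1/alpha)
r_scale <- lam^(-1 / alpha)

t <- c(1, 10, 50)
all.equal(pweibull(t, shape = alpha, scale = r_scale, lower.tail = FALSE),
          exp(-lam * t^alpha))

# Random draws under the lecture's parameterization
set.seed(1)
x <- rweibull(5, shape = alpha, scale = r_scale)
```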
14
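Weibull Models in R
Again we can run a Weibull model in R, but the parameterization is different here too: survreg(formula, data, dist)
R gives us: the intercept μ and the scale σ on the log-time scale.
But we can find: α = 1/σ and λ = exp(-μ/σ), so that S(t) = exp(-λ t^α).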
15
AML Example in R
Survival in patients with Acute Myelogenous Leukemia.
Data:
23 subjects
Time to death
Censoring indicator
Treatment: standard course of chemotherapy, or chemo extended ('maintenance') for additional cycles.
16
AML Dataset
> library(survival)
> aml
Columns: time, status, x (x is Maintained for rows 1-11 and Nonmaintained for rows 12-23)
17
AML model in R: exponential (no covariates)
> library(MASS)
> library(survival)
> sdat<-Surv(aml$time, aml$status)
> exp_fit<-survreg(sdat~1, dist="exponential")
> #exp_fit<-survreg(sdat~1, dist="weibull", scale=1) alternative
> summary(exp_fit)
Call:
survreg(formula = sdat ~ 1, dist = "exponential")
Value Std. Error z p
(Intercept) e-53
Scale fixed at 1
Exponential distribution
Loglik(model)= Loglik(intercept only)=
Number of Newton-Raphson Iterations: 4
n= 23
18
Checking Exponential Model Fit
19
Model Checks: Exponential
###Model checks for exponential
# emp_fit is not defined on the slides; assumed here to be the Kaplan-Meier fit
emp_fit<-survfit(sdat~1)
par(mfrow=c(1,3))
lam_hat<-exp(-exp_fit$coefficients)
logHt<-log(-log(emp_fit$surv))
logt<-log(emp_fit$time)
# First model check: plot log cumulative hazard vs. log time
# (under an exponential model this is a line with slope 1 and intercept log(lambda))
plot(logt, logHt, lwd=2, type="l", xlab="log(t)", ylab="log(H(t))")
points(logt, logHt, pch=16)
abline(log(lam_hat), 1, lwd=2, col="red")
# Second model check: plot of H(t) vs. t (a line through the origin with slope lambda)
Ht<- -log(emp_fit$surv)
t<-emp_fit$time
plot(t, Ht, lwd=2, type="l", xlab="time", ylab="H(t)")
points(t, Ht, pch=16)
abline(0, lam_hat, lwd=2, col="red")
# Third model check: fitted exponential survival curve over the empirical curve
fit.dat<-exp(-lam_hat*c(0:150))
plot(emp_fit, xlab="Time", ylab="Survival Fraction")
lines(c(0:150), fit.dat, lwd=2, col=2)
20
Let's Look at Some Specifics for Exponential
Exponential model: first estimate lambda, then answer:
12-month survival?
Median survival?
Mean survival?
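These quantities follow directly from the fitted rate. The sketch below refits the intercept-only exponential model so that it is self-contained; the object names are illustrative, and the formulas assume S(t) = exp(-λt).

```r
library(survival)

exp_fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "exponential")
lam_hat <- exp(-coef(exp_fit)[[1]])   # rate from the log-time intercept

S12    <- exp(-lam_hat * 12)   # S(t) = exp(-lambda * t) at t = 12
med    <- log(2) / lam_hat     # solves S(t) = 0.5
mean_t <- 1 / lam_hat          # mean of an exponential distribution

round(c(S12 = S12, median = med, mean = mean_t), 3)
```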
21
AML model in R: Weibull (no covariates)
> weib_fit<-survreg(sdat~1, dist="weibull", scale=0)
> summary(weib_fit)
Call:
survreg(formula = sdat ~ 1, dist = "weibull", scale = 0)
Value Std. Error z p
(Intercept) e-63
Log(scale) e-01
Scale=
Weibull distribution
Loglik(model)= Loglik(intercept only)=
Number of Newton-Raphson Iterations: 5
n= 23
22
Model Checks: Weibull
23
Model Checks: Weibull
###Model checks for weibull
# emp_fit is not defined on the slides; assumed here to be the Kaplan-Meier fit
emp_fit<-survfit(sdat~1)
# weib_fit$scale is sigma itself (summary() prints its log), so no exp() is applied
alp_hat<-1/weib_fit$scale
lam_hat<-exp(-weib_fit$coefficients[1]/weib_fit$scale)
logHt<-log(-log(emp_fit$surv))
logt<-log(emp_fit$time)
# Plot log cumulative hazard vs. log time
# (under a Weibull model this is a line with slope alpha and intercept log(lambda))
plot(logt, logHt, lwd=2, type="l", xlab="log(t)", ylab="log(H(t))")
points(logt, logHt, pch=16)
abline(log(lam_hat), alp_hat, lwd=2, col="red")
# Plot of fitted survival function vs. empirical
fit.dat<-exp(-lam_hat*c(0:150)^alp_hat)
plot(emp_fit, xlab="Time", ylab="Survival Fraction")
lines(c(0:150), fit.dat, lwd=2, col=2)
24
Let's Look at Some Specifics for Weibull
First estimate lambda and alpha, then answer:
12-month survival?
Median survival?
Mean survival?
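A sketch of the corresponding Weibull calculations, assuming the parameterization S(t) = exp(-λ t^α) with α = 1/scale and λ = exp(-intercept/scale). The model is refit here so the block is self-contained; object names are illustrative.

```r
library(survival)

weib_fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "weibull")
mu    <- coef(weib_fit)[[1]]
sigma <- weib_fit$scale        # the scale itself, not log(scale)

alp_hat <- 1 / sigma
lam_hat <- exp(-mu / sigma)

S12    <- exp(-lam_hat * 12^alp_hat)                       # survival at t = 12
med    <- (log(2) / lam_hat)^(1 / alp_hat)                 # solves S(t) = 0.5
mean_t <- lam_hat^(-1 / alp_hat) * gamma(1 + 1 / alp_hat)  # Weibull mean

round(c(S12 = S12, median = med, mean = mean_t), 3)
```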
25
Compare Weibull/Exponential Fits to the Empirical Distribution (no covariates)
26
Empirical Distribution: What about specific times (no covariates)?
12-month survival = 0.74
Median survival = 27 months
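The empirical figures can be read off a Kaplan-Meier fit. This is a minimal sketch using survfit from the survival package; the object names are illustrative.

```r
library(survival)

km <- survfit(Surv(time, status) ~ 1, data = aml)

s12 <- summary(km, times = 12)$surv          # Kaplan-Meier estimate of S(12)
med <- unname(summary(km)$table["median"])   # first time the curve drops to 0.5 or below

round(c(S12 = s12, median = med), 2)
```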
27
What about relative to the empirical distribution (no covariates)?
Empirical: 12-month survival = 74%; median survival = 27 months
Exponential model: 12-month survival = 73%; median survival = 26.1 months; mean survival = 37.7 months
Weibull model: 12-month survival = 75.5%; median survival = 27.3 months; mean survival = 36.9 months
28
What about covariates….
29
AML model in R: exponential (with covariate)
> exp_fit2<-survreg(sdat~x, dist="exponential", data=aml)
> summary(exp_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "exponential")
Value Std. Error z p
(Intercept) e-27
xNonmaintained e-02
Scale fixed at 1
Exponential distribution
Loglik(model)= Loglik(intercept only)=
Chisq= 4.06 on 1 degrees of freedom, p=
Number of Newton-Raphson Iterations: 4
n= 23
30
Exponential Fit by Group
31
What about estimates by group?
Maintained? Non-maintained?
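A sketch of the group-specific exponential estimates, assuming the same conversion as in the intercept-only case (rate = exp(-linear predictor)). The model is refit so the block runs on its own; object names are illustrative.

```r
library(survival)

exp_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "exponential")
b <- coef(exp_fit2)

# Linear predictor (log mean survival time) for each treatment group
lp  <- c(Maintained = b[[1]], Nonmaintained = b[[1]] + b[[2]])
lam <- exp(-lp)   # group-specific hazard rates

S12 <- exp(-12 * lam)    # survival at t = 12
med <- log(2) / lam      # median survival

round(rbind(S12 = S12, median = med), 3)
```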
32
AML model in R: Weibull (with covariates)
> weib_fit2<-survreg(sdat~x, dist="weibull", data=aml, scale=0)
> summary(weib_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "weibull", scale = 0)
Value Std. Error z p
(Intercept) e-43
xNonmaintained e-02
Log(scale) e-01
Scale=
Weibull distribution
Loglik(model)= Loglik(intercept only)=
Chisq= 5.31 on 1 degrees of freedom, p=
Number of Newton-Raphson Iterations: 5
n= 23
33
Weibull fit
34
What about estimates by group?
Maintained? Non-maintained?
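A sketch of the group-specific Weibull estimates, assuming S(t | x) = exp(-λ_x t^α) with α = 1/scale and λ_x = exp(-(linear predictor)/scale). The model is refit so the block is self-contained; object names are illustrative.

```r
library(survival)

weib_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")
b     <- coef(weib_fit2)
sigma <- weib_fit2$scale   # the scale itself, not log(scale)

# Linear predictor on the log-time scale for each treatment group
lp  <- c(Maintained = b[[1]], Nonmaintained = b[[1]] + b[[2]])
alp <- 1 / sigma            # common shape
lam <- exp(-lp / sigma)     # group-specific lambda

S12 <- exp(-lam * 12^alp)          # survival at t = 12
med <- (log(2) / lam)^(1 / alp)    # median survival

round(rbind(S12 = S12, median = med), 3)
```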
35
Exponential and Weibull Fits for Maintained vs. Non-maintained
Red = Weibull; Blue = exponential
36
Empirical Distribution: What about specific survival times (with covariate)?
Maintained: 12-month survival = 91%; median survival = 31 months
Non-maintained: 12-month survival = 58%; median survival = 23 months
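The by-group empirical values come from a stratified Kaplan-Meier fit. A minimal sketch, with illustrative object names:

```r
library(survival)

km_x <- survfit(Surv(time, status) ~ x, data = aml)

s12 <- summary(km_x, times = 12)           # one row per treatment group
med <- summary(km_x)$table[, "median"]     # median survival by group

data.frame(group = s12$strata, S12 = round(s12$surv, 2), median = unname(med))
```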
37
Comparisons
Maintained
Empirical: 12-month survival = 91%; median survival = 31 months
Exponential model: 12-month survival = 82%; median survival = 41.9 months
Weibull model: 12-month survival = 88%; median survival = 45.9 months
Non-maintained
Empirical: 12-month survival = 58%; median survival = 23 months
Exponential model: 12-month survival = 60%; median survival = 16.1 months
Weibull model: 12-month survival = 66%; median survival = 18 months
38
Compare Exponential & Empirical Distribution (with covariates)
39
Compare Weibull & Empirical Distribution (with covariates)
41
Multiplicative Hazard Rate Models
Hazard rate of an individual with covariate vector z is: h(t | z) = h0(t) c(β'z)
In these models h0(t) may be parametric or an arbitrary non-negative function.
The most common link function, proposed by Cox, is c(β'z) = exp(β'z).
42
Multiplicative Hazard Rate Models
Key feature is proportional hazards
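A short derivation of why the hazards are proportional, assuming the usual multiplicative form with baseline hazard h_0(t), link c(·) and coefficient vector β (standard notation, assumed here):

```latex
h(t \mid z) = h_0(t)\, c(\beta' z)
\;\Longrightarrow\;
\frac{h(t \mid z_1)}{h(t \mid z_2)}
= \frac{h_0(t)\, c(\beta' z_1)}{h_0(t)\, c(\beta' z_2)}
= \frac{c(\beta' z_1)}{c(\beta' z_2)},

% With the Cox link c(\beta' z) = \exp(\beta' z):
\frac{h(t \mid z_1)}{h(t \mid z_2)} = \exp\!\left\{\beta'(z_1 - z_2)\right\}.
```

The baseline hazard cancels, so the ratio does not depend on t: the two hazard curves are proportional at every time point.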
43
Multiplicative Hazard Rate Model
These parametric models are very similar to the semi-parametric Cox proportional hazards models we will discuss later.
The AFT models using the exponential/Weibull are also classified as multiplicative models because of their proportional hazards property. This is not true for any other parametric distribution.
Since Cox models are so commonly used, it is rare to see a parametric implementation of these models.
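The proportional hazards property of the Weibull AFT model can be checked numerically. The sketch below assumes the hazard h(t | z) = α λ_z t^(α-1) with α = 1/scale and λ_z = exp(-(μ + γz)/scale); the helper haz() and the other names are hypothetical and only for illustration.

```r
library(survival)

weib_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")
mu    <- coef(weib_fit2)[[1]]
gam   <- coef(weib_fit2)[[2]]   # coefficient for Nonmaintained on the log-time scale
sigma <- weib_fit2$scale
alp   <- 1 / sigma

# Weibull hazard for covariate value z (0 = Maintained, 1 = Nonmaintained)
haz <- function(t, z) {
  lam_z <- exp(-(mu + gam * z) / sigma)
  alp * lam_z * t^(alp - 1)
}

t <- c(5, 20, 50)
haz(t, 1) / haz(t, 0)   # the same value at every time point
exp(-gam / sigma)       # implied hazard ratio, equal to the ratio above
```

The AFT coefficient and the proportional hazards coefficient are linked by β = -γ/σ, which reduces to β = -γ for the exponential (σ = 1).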
44
Advantages of Parametric Models
If we correctly characterize the underlying distribution, our estimates will be more precise than semi- and non-parametric estimates. This means we may have greater power to identify relationships between our outcome and predictors. However…
45
Disadvantages of Parametric Models
If we use the wrong distribution, problems can arise.
The distribution is often chosen based on the shape of the model without covariates; this can (and will) change as covariates are added.
Alternatively, use intuition/theory about what the dependency is expected to be, BUT the time-dependency is what is left over after conditioning on covariates, so we are also likely to fail here.
46
Brief SAS Code
/************************************/
/* Accelerated Failure Time Models */
/*Exponential models: 1st is intercept only, second is with the covariate*/
proc lifereg data=aml;
model time*status(0) = /dist=exponential;
run;
proc lifereg data=aml;
class x;
model time*status(0) = x/dist=exponential;
run;
47
Brief SAS Code
/************************************/
/* Accelerated Failure Time Models */
/*Weibull models: 1st is intercept only, second is with the covariate*/
proc lifereg data=aml;
model time*status(0) = /dist=weibull;
run;
proc lifereg data=aml;
class x;
model time*status(0) = x/dist=weibull;
run;
48
Example of SAS Output
Analysis of Maximum Likelihood Parameter Estimates
Parameter DF Estimate Standard Error 95% Confidence Limits Chi-Square Pr > ChiSq
Intercept 1 3.6288 0.2357 3.1668 4.0907 237.02 <.0001
Scale 1.0000 0.0000
Weibull Scale 8.8781
Weibull Shape
Lagrange Multiplier Statistics
Parameter Chi-Square Pr > ChiSq
Scale 0.3305 0.5654
49
Next Time Likelihoods!!!