Lecture 3: Parametric Survival Modeling

Name: Lecture 3: Parametric Survival Modeling
Uploaded: 2017-08-19T23:45:50+00:00
Duration: PTM20S47
Channel: Jewel Jefferson
Description: Lecture 3: Parametric Survival Modeling

Lecture 3: Parametric Survival Modeling
Parametric models; example and nuances in R

Parametric Distributions
We’ve discussed a variety of parametric distributions: exponential, Weibull, log-normal, log-logistic, gamma, …
But how do we “fit” a model? Model parameterizations; inclusion of coefficients.

Modeling Homogeneous Population
Relatively “simple”. Once we’ve determined the distribution, we need to estimate the parameters. For example, for the exponential we estimate the single hazard rate λ.
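For the exponential case with right censoring, the maximum likelihood estimate has a closed form: the number of observed events divided by the total follow-up time. A minimal sketch of that calculation, assuming hypothetical toy data (the vectors `time` and `status` below are illustrative and not from the lecture):

```r
# Hypothetical toy data (not from the lecture): follow-up times and
# event indicators (1 = event observed, 0 = right-censored).
time   <- c(2, 5, 7, 7, 10, 12, 15)
status <- c(1, 1, 0, 1, 1, 0, 1)

# Exponential MLE with right censoring: events / total time at risk.
lam_hat <- sum(status) / sum(time)

# Fitted survival function S(t) = exp(-lambda * t).
S_hat <- function(t) exp(-lam_hat * t)

lam_hat    # estimated constant hazard
S_hat(5)   # estimated probability of surviving past t = 5
```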

Covariates
We frequently want to adjust survival for covariates.
Two main approaches: the accelerated failure time (AFT) model and the multiplicative model.

Accelerated Failure Time
Under the AFT model for two populations, the expected survival time and the median survival time for population 1 are c times those of population 2, where c is a constant. Survival at time t follows the same rescaling of time: $S_1(t) = S_2(t/c)$.

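Data include: failure time T > 0; vector of covariates Z' = (Z1, Z2, …, Zp), which may be quantitative or qualitative. Log-transform T for a linear model approach: $Y = \log T = \mu + \gamma' Z + \sigma W$, where $\gamma$ is the vector of regression coefficients and $W$ is the error term.

When Z = 0, $S_0(t)$ is the survival function of $e^{\mu + \sigma W}$.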

First consider two populations that differ only by 1 unit in $z_k$.
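A one-unit difference in a covariate under the log-linear AFT model rescales time by a constant factor. The sketch below illustrates this for a single covariate z, assuming the error term W has the standard extreme-value (minimum) distribution so that T is Weibull; the parameter values `mu`, `gam`, and `sigma` are hypothetical.

```r
# Hypothetical AFT parameters (illustrative only).
mu    <- 3.0   # intercept
gam   <- 0.5   # coefficient of the single covariate z
sigma <- 0.8   # scale of the error term W

# Assumption: W is standard extreme-value (minimum), so
# T = exp(mu + gam * z + sigma * W) is Weibull with
# S(t | z) = exp(-(t / exp(mu + gam * z))^(1 / sigma)).
S <- function(t, z) exp(-(t / exp(mu + gam * z))^(1 / sigma))

# Median survival time at covariate value z.
med <- function(z) exp(mu + gam * z) * log(2)^sigma

t <- c(5, 10, 20, 40)
# A one-unit increase in z rescales time by exp(gam):
all.equal(S(t, 1), S(t * exp(-gam), 0))   # the two curves coincide
med(1) / med(0)                           # equals exp(gam)
```

Here exp(gam) plays the role of the constant c: medians (and means) in the z = 1 population are exp(gam) times those in the z = 0 population.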

Exponential Models in R
Recall: the parameterization for the exponential is the same in R: rexp(n, rate), where rate is the hazard λ.
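A short check of R's exponential parameterization, assuming a hypothetical hazard rate of 0.05: the `rate` argument is the hazard itself, so the survival function, median, and mean follow directly.

```r
rate <- 0.05            # hypothetical hazard rate lambda
t    <- c(1, 12, 24)

# R's exponential functions use the rate (hazard) directly:
# S(t) = exp(-rate * t), mean = 1 / rate, median = log(2) / rate.
pexp(t, rate = rate, lower.tail = FALSE)   # survival probabilities
exp(-rate * t)                             # same values by hand
qexp(0.5, rate = rate)                     # median, log(2) / rate

set.seed(1)                                # reproducible draws
x <- rexp(10000, rate = rate)
mean(x)                                    # close to 1 / rate
```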

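We can run an exponential survival model in R using survreg(formula, data, dist). R gives us the intercept $\hat\mu$ on the log-time scale, but we can find $\hat\lambda = \exp(-\hat\mu)$. In a model with no covariates, $S(t) = \exp(-\hat\lambda t)$.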

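With covariates, the distribution of any T is exponential with constant hazard rate $\lambda(Z) = \exp(-(\mu + \gamma' Z))$, where $\gamma$ is the vector of survreg coefficients. We can interpret $\exp(-\gamma_k)$ as the hazard ratio corresponding to a 1 unit increase in the covariate $z_k$.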

Weibull Models in R
Recall: now our scale parameter is no longer 1.
Unlike the exponential, the parameterization for the Weibull is different in R. Random Weibull generation: rweibull(n, shape, scale)
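The difference in parameterization can be sketched as follows, assuming the lecture's Weibull form is S(t) = exp(-λ t^α) (the form used in the model-check code later in the lecture). R's Weibull functions instead use S(t) = exp(-(t/scale)^shape). The values of `alpha` and `lambda` are hypothetical.

```r
alpha  <- 1.2    # hypothetical Weibull shape
lambda <- 0.02   # hypothetical lambda in S(t) = exp(-lambda * t^alpha)

# R's *weibull functions use S(t) = exp(-(t / scale)^shape), so
# shape = alpha and scale = lambda^(-1 / alpha).
r_shape <- alpha
r_scale <- lambda^(-1 / alpha)

t <- c(1, 12, 24)
pweibull(t, shape = r_shape, scale = r_scale, lower.tail = FALSE)
exp(-lambda * t^alpha)                      # same values by hand

set.seed(1)                                 # reproducible draws
x <- rweibull(5, shape = r_shape, scale = r_scale)
```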

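Weibull Models in R
Again we can run a Weibull model in R, but the parameterization is different here too: survreg(formula, data, dist). R gives us the intercept $\hat\mu$ and Log(scale) $= \log\hat\sigma$, but we can find $\hat\alpha = 1/\hat\sigma$ and $\hat\lambda = \exp(-\hat\mu/\hat\sigma)$.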

AML Example in R
Survival in patients with acute myelogenous leukemia.
Data: 23 subjects; time to death; censoring indicator; treatment, either a standard course of chemotherapy or chemotherapy extended ('maintenance') for additional cycles.

AML Dataset
> library(survival)
> aml
   time status x
   …    …      Maintained (11 rows)
   …    …      Nonmaintained (12 rows)

AML model in R: exponential (no covariates)
> library(MASS)
> library(survival)
> sdat <- Surv(aml$time, aml$status)
> exp_fit <- survreg(sdat ~ 1, dist = "exponential")
> # exp_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 1)  # alternative
> summary(exp_fit)
Call:
survreg(formula = sdat ~ 1, dist = "exponential")
Value Std. Error z p
(Intercept) e-53
Scale fixed at 1
Exponential distribution
Loglik(model)= Loglik(intercept only)=
Number of Newton-Raphson Iterations: 4
n= 23

Checking Exponential Model Fit

Model Checks: Exponential
### Model checks for exponential
# emp_fit is not defined on the slides; assumed here to be the Kaplan-Meier fit
emp_fit <- survfit(sdat ~ 1)
par(mfrow = c(1, 3))
lam_hat <- exp(-coef(exp_fit))
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# First model check: plot log cumulative hazard vs. log time
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), 1, lwd = 2, col = "red")
# Second model check: plot of H(t) vs. t
Ht <- -log(emp_fit$surv)
t <- emp_fit$time
plot(t, Ht, lwd = 2, type = "l", xlab = "time", ylab = "H(t)")
points(t, Ht, pch = 16)
abline(0, lam_hat, lwd = 2, col = "red")
# Third model check: fitted survival curve over the empirical curve
fit.dat <- exp(-lam_hat * c(0:150))
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)

Let's Look at Some Specifics for Exponential
Exponential model: first estimate lambda, then: 12 month survival? Median survival? Mean survival?
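One way to answer these questions from the intercept-only fit is sketched below. It assumes the `aml` data shipped with the survival package and uses the exponential identities S(t) = exp(-λt), median = log(2)/λ, and mean = 1/λ; the variable names are illustrative.

```r
library(survival)

# Intercept-only exponential fit to the aml data.
exp_fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "exponential")

# survreg models log(T), so the intercept is -log(lambda).
lam_hat <- unname(exp(-coef(exp_fit)))

surv12    <- exp(-lam_hat * 12)   # 12-month survival
med_surv  <- log(2) / lam_hat     # median survival
mean_surv <- 1 / lam_hat          # mean survival

round(c(lambda = lam_hat, S12 = surv12, median = med_surv, mean = mean_surv), 3)
```

The results should be close to the exponential figures quoted in the lecture's comparison (about 73%, 26.1 months, and 37.7 months).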

AML model in R: Weibull (no covariates)
> weib_fit <- survreg(sdat ~ 1, dist = "weibull", scale = 0)
> summary(weib_fit)
Call:
survreg(formula = sdat ~ 1, dist = "weibull", scale = 0)
Value Std. Error z p
(Intercept) e-63
Log(scale) e-01
Scale=
Weibull distribution
Loglik(model)= Loglik(intercept only)=
Number of Newton-Raphson Iterations: 5
n= 23

Model Checks: Weibull

Model Checks: Weibull
### Model checks for Weibull
# emp_fit: the empirical (Kaplan-Meier) fit, e.g. survfit(sdat ~ 1)
# weib_fit$scale holds sigma itself (summary() prints its log), so no exp() is needed
alp_hat <- 1 / weib_fit$scale
lam_hat <- exp(-coef(weib_fit)[1] / weib_fit$scale)
logHt <- log(-log(emp_fit$surv))
logt <- log(emp_fit$time)
# Plot log cumulative hazard vs. log time
plot(logt, logHt, lwd = 2, type = "l", xlab = "log(t)", ylab = "log(H(t))")
points(logt, logHt, pch = 16)
abline(log(lam_hat), alp_hat, lwd = 2, col = "red")
# Plot of fitted survival function vs. empirical
fit.dat <- exp(-lam_hat * c(0:150)^alp_hat)
plot(emp_fit, xlab = "Time", ylab = "Survival Fraction")
lines(c(0:150), fit.dat, lwd = 2, col = 2)

Let's Look at Some Specifics for Weibull
Weibull model: first estimate lambda and alpha, then: 12 month survival? Median survival? Mean survival?
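A sketch of how these quantities could be computed from the intercept-only Weibull fit, assuming the `aml` data from the survival package and the form S(t) = exp(-λ t^α). Note that a `survreg` object stores σ itself in `$scale`, while `summary()` prints Log(scale). Variable names are illustrative.

```r
library(survival)

weib_fit <- survreg(Surv(time, status) ~ 1, data = aml, dist = "weibull")

mu    <- unname(coef(weib_fit)[1])   # intercept on the log-time scale
sigma <- weib_fit$scale              # scale sigma (not its log)

alp_hat <- 1 / sigma                 # Weibull shape alpha
lam_hat <- exp(-mu / sigma)          # lambda in S(t) = exp(-lambda * t^alpha)

surv12    <- exp(-lam_hat * 12^alp_hat)          # 12-month survival
med_surv  <- (log(2) / lam_hat)^(1 / alp_hat)    # median survival
mean_surv <- exp(mu) * gamma(1 + sigma)          # mean survival

round(c(alpha = alp_hat, lambda = lam_hat, S12 = surv12,
        median = med_surv, mean = mean_surv), 3)
```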

Compare Weibull/Exponential Fits to the Empirical Distribution (no covariates)

Empirical Distribution: What about specific times (no covariates)?
12 month survival = 0.74; median survival = 27 months
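These empirical values can be reproduced by hand, assuming the empirical distribution here is the Kaplan-Meier (product-limit) estimate. The sketch below uses only base R on the `aml` data; variable names are illustrative.

```r
library(survival)   # used here only for the aml data set

time   <- aml$time
status <- aml$status

# Distinct event times, with the number at risk and number of events at each.
ev_times <- sort(unique(time[status == 1]))
n_risk   <- sapply(ev_times, function(u) sum(time >= u))
n_event  <- sapply(ev_times, function(u) sum(time == u & status == 1))

# Product-limit (Kaplan-Meier) estimate just after each event time.
km <- cumprod(1 - n_event / n_risk)

surv12   <- km[max(which(ev_times <= 12))]   # S(12)
med_surv <- min(ev_times[km <= 0.5])         # first time with S(t) <= 0.5

round(surv12, 2)
med_surv
```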

What about relative to the empirical distribution (no covariates)?
Empirical: 12 month survival = 74%; median survival = 27 months
Exponential model: 12 month survival = 73%; median survival = 26.1 months; mean survival = 37.7 months
Weibull model: 12 month survival = 75.5%; median survival = 27.3 months; mean survival = 36.9 months

What about covariates….

AML model in R: exponential (with covariate)
> exp_fit2 <- survreg(sdat ~ x, dist = "exponential", data = aml)
> summary(exp_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "exponential")
Value Std. Error z p
(Intercept) e-27
xNonmaintained e-02
Scale fixed at 1
Exponential distribution
Loglik(model)= Loglik(intercept only)=
Chisq= 4.06 on 1 degrees of freedom, p=
Number of Newton-Raphson Iterations: 4
n= 23

Exponential Fit by Group

What about estimates by group?
Maintained? Non-maintained?
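A sketch of the group-specific estimates from the exponential model with the treatment covariate, assuming the `aml` data from the survival package. Each group's hazard is exp(-linear predictor), and the hazard ratio for Nonmaintained versus Maintained is exp(-coefficient). Variable names are illustrative.

```r
library(survival)

exp_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "exponential")
b <- coef(exp_fit2)   # (Intercept), xNonmaintained on the log-time scale

# Group-specific hazard rates: lambda = exp(-linear predictor).
lam <- c(Maintained    = unname(exp(-b[1])),
         Nonmaintained = unname(exp(-(b[1] + b[2]))))

# Hazard ratio, Nonmaintained vs. Maintained: exp(-coefficient).
hr <- unname(exp(-b[2]))

surv12   <- exp(-12 * lam)   # 12-month survival by group
med_surv <- log(2) / lam     # median survival by group

round(rbind(lambda = lam, S12 = surv12, median = med_surv), 3)
hr
```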

AML model in R: Weibull (with covariates)
> weib_fit2 <- survreg(sdat ~ x, dist = "weibull", data = aml, scale = 0)
> summary(weib_fit2)
Call:
survreg(formula = sdat ~ x, data = aml, dist = "weibull", scale = 0)
Value Std. Error z p
(Intercept) e-43
xNonmaintained e-02
Log(scale) e-01
Scale=
Weibull distribution
Loglik(model)= Loglik(intercept only)=
Chisq= 5.31 on 1 degrees of freedom, p=
Number of Newton-Raphson Iterations: 5
n= 23

Weibull fit

What about estimates by group?
Maintained? Non-maintained?
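For the Weibull model, one option is to let `predict()` do the work. The sketch below, assuming the `aml` data from the survival package, takes the median as the 0.5 quantile of the fitted distribution and computes 12-month survival from the linear predictor; `newd` is an illustrative data frame with one row per treatment group.

```r
library(survival)

weib_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")

# One row per treatment group, keeping the coding used in aml.
newd <- data.frame(x = sort(unique(aml$x)))

# Median survival by group: the 0.5 quantile of the fitted distribution.
med_surv <- predict(weib_fit2, newdata = newd, type = "quantile", p = 0.5)

# 12-month survival by group from S(t) = exp(-(t / exp(lp))^(1 / sigma)).
lp     <- predict(weib_fit2, newdata = newd, type = "lp")
surv12 <- exp(-(12 / exp(lp))^(1 / weib_fit2$scale))

names(med_surv) <- names(surv12) <- as.character(newd$x)
round(rbind(S12 = surv12, median = med_surv), 2)
```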

Exponential and Weibull Fits for Maintained vs. Non-maintained
Red = Weibull; blue = exponential

Empirical Distribution: What about specific survival times (with covariate)?
Maintained: 12 month survival = 91%; median survival = 31 months
Non-maintained: 12 month survival = 58%; median survival = 23 months
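The group-specific empirical values can be obtained with `survfit()`, again assuming the empirical distribution is the Kaplan-Meier estimate and using the `aml` data from the survival package. Variable names are illustrative.

```r
library(survival)

# Kaplan-Meier curves by treatment group.
km_grp <- survfit(Surv(time, status) ~ x, data = aml)

# 12-month survival in each group.
s12    <- summary(km_grp, times = 12)
surv12 <- setNames(s12$surv, as.character(s12$strata))

# Median survival in each group (0.5 quantile of each curve).
med_surv <- quantile(km_grp, probs = 0.5, conf.int = FALSE)

round(surv12, 2)
med_surv
```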

Comparisons
Group            Estimate            12 month survival   Median survival
Maintained       Empirical           91%                 31 months
Maintained       Exponential model   82%                 41.9 months
Maintained       Weibull model       88%                 45.9 months
Non-maintained   Empirical           58%                 23 months
Non-maintained   Exponential model   60%                 16.1 months
Non-maintained   Weibull model       66%                 18 months

Compare Exponential & Empirical Distribution (with covariates)

Compare Weibull & Empirical Distribution (with covariates)

Multiplicative Hazard Rate Models
The hazard rate of an individual with covariate vector z is $h(t \mid z) = h_0(t)\, c(\beta' z)$. In these models $h_0(t)$ may be parametric or an arbitrary non-negative function. The most common link function, proposed by Cox, is $c(\beta' z) = \exp(\beta' z)$.

Multiplicative Hazard Rate Models
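Key feature is proportional hazards: the ratio of the hazard rates of two individuals with covariate vectors $z_1$ and $z_2$ does not depend on t (under the Cox link it equals $\exp(\beta'(z_1 - z_2))$).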

Multiplicative Hazard Rate Model
These parametric models are very similar to the semi-parametric Cox proportional hazards models we will discuss later… The AFT models using the exponential/Weibull are also classified as multiplicative models due to their proportional hazards property. This is not true for any other parametric distribution. Since Cox models are so commonly used, it is rare to see a parametric implementation of these models.
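The link between the two views can be sketched for the Weibull fit: with log T = μ + γz + σW, the hazard is (1/σ) t^(1/σ - 1) exp(-(μ + γz)/σ), so the proportional hazards coefficient is β = -γ/σ. The code below assumes the `aml` data from the survival package; `haz` is an illustrative helper, not a library function.

```r
library(survival)

weib_fit2 <- survreg(Surv(time, status) ~ x, data = aml, dist = "weibull")
mu    <- unname(coef(weib_fit2)["(Intercept)"])
gam   <- unname(coef(weib_fit2)["xNonmaintained"])   # AFT coefficient
sigma <- weib_fit2$scale

# Weibull AFT -> proportional hazards: beta = -gamma / sigma.
beta <- -gam / sigma
hr   <- exp(beta)   # hazard ratio, Nonmaintained vs. Maintained

# Fitted hazard for group z (0 = Maintained, 1 = Nonmaintained).
haz <- function(t, z) {
  (1 / sigma) * t^(1 / sigma - 1) * exp(-(mu + gam * z) / sigma)
}

t <- c(1, 6, 12, 24, 48)
haz(t, 1) / haz(t, 0)   # the same value at every t, equal to hr
```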

Advantages of Parametric Models
If we correctly characterize the underlying distribution, our estimates will be more precise than semi- and non-parametric estimates. This means we may have greater power to identify relationships between our outcome and predictors. However…

Disadvantages of Parametric Models
If we use the wrong distribution, problems can arise. The distribution is often chosen based on the shape of the model without covariates, but this can (and will) change as covariates are added. Alternatively, use intuition or theory about what the time dependency is expected to be. BUT the time dependency is what is left over after conditioning on covariates, so we are also likely to fail here.

Brief SAS Code
/************************************/
/* Accelerated Failure Time Models */
/* Exponential models: 1st is intercept only, 2nd is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=exponential;
run;
proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=exponential;
run;

Brief SAS Code
/************************************/
/* Accelerated Failure Time Models */
/* Weibull models: 1st is intercept only, 2nd is with the covariate */
proc lifereg data=aml;
  model time*status(0) = / dist=weibull;
run;
proc lifereg data=aml;
  class x;
  model time*status(0) = x / dist=weibull;
run;

Example of SAS Output
Analysis of Maximum Likelihood Parameter Estimates
Parameter   DF   Estimate   Standard Error   95% Confidence Limits   Chi-Square   Pr > ChiSq
Intercept    1     3.6288           0.2357      3.1668     4.0907       237.02       <.0001
Scale              1.0000           0.0000
Weibull Scale 8.8781
Weibull Shape
Lagrange Multiplier Statistics
Parameter   Chi-Square   Pr > ChiSq
Scale           0.3305       0.5654

Next Time Likelihoods!!!
