Lecture 12: Cox Proportional Hazards Model Introduction
Cox Proportional Hazards Model
- Names
  - Cox regression
  - Semi-parametric proportional hazards
  - Proportional hazards model
  - Multiplicative hazards model
- When? 1972
- Why?
  - Allows adjustment for covariates (continuous and categorical) in a survival setting
  - Allows prediction of survival based on a set of covariates
  - Analogous to linear and logistic regression in many ways
Cox PHM Notation (K & M)
- Data on n individuals:
  - Tj: time on study for individual j
  - δj: event indicator for individual j (1 if the event was observed, 0 if censored)
  - Zj: vector of covariates for individual j
- More complicated: Zj(t)
  - Covariates are time dependent
  - May change with time/age
Basic Model
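In the notation above, the basic model is conventionally written as follows. This is a sketch in standard notation, assuming p covariates Z = (Z_1, …, Z_p) and writing β for the regression coefficient vector:

```latex
h(t \mid Z) = h_0(t)\,\exp\!\left(\beta^{\top} Z\right)
            = h_0(t)\,\exp\!\left(\sum_{k=1}^{p} \beta_k Z_k\right)
```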
Comments on the Basic Model
- h0(t): arbitrary baseline hazard
  - Notice that it varies with t
- β: regression coefficient vector
  - Each component is interpreted as a log hazard ratio
- Semi-parametric form
  - Non-parametric baseline hazard
  - Parametric form assumed for covariate effects only
Linear Model Formulation
- Usual formulation
- Coding of covariates is similar to linear and logistic regression (and other generalized linear models)
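Taking logs of the hazard relative to baseline gives the linear-predictor form. A sketch, assuming the model h(t | Z) = h_0(t) exp(β'Z) with p covariates and coefficient vector β:

```latex
\log\!\left[\frac{h(t \mid Z)}{h_0(t)}\right] = \beta^{\top} Z = \sum_{k=1}^{p} \beta_k Z_k
```

Unlike linear or logistic regression, there is no intercept term: it is absorbed into the baseline hazard h_0(t).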
Refresher of Coding Covariates
- Should be nothing new
- Two kinds of “independent” variables
  - Quantitative
  - Qualitative
- Quantitative variables are continuous
  - Need to determine scale
    - Units
    - Transformations?
- Qualitative variables are generally categorical
  - Ordered
  - Nominal
- Coding affects interpretation
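As a small illustration of how coding drives interpretation, the sketch below shows R's default treatment (dummy) coding for a two-level factor. The toy vector `treat` and its level names are assumptions chosen to mirror a placebo vs. gamma interferon comparison; they are not data from the lecture.

```r
# Hypothetical two-level treatment factor; the first level is the reference.
treat <- factor(c("placebo", "rIFN-g", "placebo", "rIFN-g"),
                levels = c("placebo", "rIFN-g"))

# Default treatment (dummy) coding: an intercept column plus a 0/1 indicator
# that equals 1 for the non-reference level.
X <- model.matrix(~ treat)
print(X)

# Choosing the other level as the reference changes which group the
# indicator marks, and therefore how a fitted coefficient is read.
treat_flipped <- relevel(treat, ref = "rIFN-g")
X_flipped <- model.matrix(~ treat_flipped)
print(X_flipped)
```

The indicator column name is the variable name followed by the level name, which is why coefficient labels such as `treatrIFN-g` appear in fitted-model output.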
Why Proportional
- Hazard ratio
  - Does not depend on t (i.e., it is constant over time)
  - The hazards are proportional (they differ by a constant multiplicative factor)
  - Also referred to (sometimes) as the relative risk
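The reason for the name can be seen by taking the ratio of hazards for two covariate vectors. A sketch, assuming the model h(t | Z) = h_0(t) exp(β'Z) and two arbitrary individuals with covariates Z and Z* (symbols introduced here for illustration):

```latex
\frac{h(t \mid Z)}{h(t \mid Z^{*})}
  = \frac{h_0(t)\,\exp(\beta^{\top} Z)}{h_0(t)\,\exp(\beta^{\top} Z^{*})}
  = \exp\!\left\{\beta^{\top}(Z - Z^{*})\right\}
```

The baseline hazard cancels, so the ratio does not involve t.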
Simple Example
- One covariate: Z = 1 for the new treatment, Z = 0 for the standard treatment
- Hazard ratio: h(t | Z = 1) / h(t | Z = 0) = exp{β_trt}
- Interpretation: exp{β_trt} is the relative risk of having the event in the new treatment group vs. the standard treatment group
- Interpretation: At any point in time, the risk of the event in the new treatment group is exp{β_trt} times the risk in the standard treatment group
Fig 3. Cantù M G et al. JCO 2002;20:1232-1237 ©2002 by American Society of Clinical Oncology
Hazard Ratios
- Hazard function: the instantaneous rate of death at time t given survival to time t, loosely P(die at time t | survived to time t)
- Assumption: “proportional hazards”
  - The relative risk (hazard ratio) does not depend on time
  - That is, “the relative risk is constant over time”
  - But that is still vague…
- Hypothetical example: Assume the hazard ratio is 0.5
  - Patients in the new therapy group are at half the risk of death of those in the standard treatment group, at any given point in time
Hazard Ratios
- Hazard ratio = h(t | new treatment) / h(t | standard treatment)
- Makes the assumption that this ratio is constant over time
Interpretation Again
- For any fixed point in time, individuals in the new treatment group have half the risk of death of individuals in the standard treatment group.
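A numerical sketch may make the interpretation concrete. It assumes, purely for illustration, a constant hazard of 0.10 per month in the standard group (the proportional hazards model itself does not require a constant baseline) and the hypothetical hazard ratio of 0.5. Under proportional hazards the survival functions then satisfy S_new(t) = S_std(t)^HR.

```r
# Assumed values, for illustration only.
h_std <- 0.10                 # constant hazard in the standard group (per month)
hr    <- 0.5                  # hypothetical hazard ratio, new vs. standard
times <- c(1, 6, 12, 24)      # months

h_new <- hr * h_std           # hazard in the new treatment group at every time

# Survival under a constant hazard: S(t) = exp(-h * t).
S_std <- exp(-h_std * times)
S_new <- exp(-h_new * times)

# The hazard ratio is 0.5 at every time point, while the ratio of
# survival probabilities changes with time.
print(data.frame(times, S_std, S_new, surv_ratio = S_new / S_std))
```

Halving the hazard at every time point does not halve the probability of death by a given time; the hazard ratio describes instantaneous risk among those still alive.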
A Slightly More Complicated Example
- What if we had 2 binary covariates?
- How is the hazard ratio estimated in this case?
- What about the proportional hazards assumption?
A Slightly More Complicated Example (continued)
- Consider a model that includes two binary covariates
- Our model looks like:
- From this we can estimate 4 possible hazard rates
- And if we “compare” the different hazards by taking the ratio we get
- But what does this mean in terms of proportional hazards?
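With two binary covariates, written here as Z_1 and Z_2 (hypothetical labels, since the covariate names are not given), and assuming no interaction term, the model, the four hazards, and their ratios relative to the reference group can be sketched as:

```latex
\begin{aligned}
h(t \mid Z_1, Z_2) &= h_0(t)\,\exp(\beta_1 Z_1 + \beta_2 Z_2) \\[4pt]
h(t \mid 0, 0) &= h_0(t) \\
h(t \mid 1, 0) &= h_0(t)\,e^{\beta_1} \\
h(t \mid 0, 1) &= h_0(t)\,e^{\beta_2} \\
h(t \mid 1, 1) &= h_0(t)\,e^{\beta_1 + \beta_2} \\[4pt]
\frac{h(t \mid 1, 0)}{h(t \mid 0, 0)} &= e^{\beta_1}, \qquad
\frac{h(t \mid 0, 1)}{h(t \mid 0, 0)} = e^{\beta_2}, \qquad
\frac{h(t \mid 1, 1)}{h(t \mid 0, 0)} = e^{\beta_1 + \beta_2}
\end{aligned}
```

Each ratio is free of t, so under this model all four hazard curves are proportional to one another.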
Hazard ratio is not always valid…
Let’s Think About the Likelihood…
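One common way to motivate the likelihood, sketched here under the assumption of no tied event times, is to condition on the risk set. Write R(t_i) for the set of individuals still at risk just before the event time t_i and Z_(i) for the covariates of the individual who fails at t_i. Given that exactly one event occurs at t_i among those in R(t_i), the probability that it is the individual who actually failed is

```latex
\frac{h(t_i \mid Z_{(i)})}{\sum_{j \in R(t_i)} h(t_i \mid Z_j)}
  = \frac{h_0(t_i)\,\exp(\beta^{\top} Z_{(i)})}
         {\sum_{j \in R(t_i)} h_0(t_i)\,\exp(\beta^{\top} Z_j)}
  = \frac{\exp(\beta^{\top} Z_{(i)})}
         {\sum_{j \in R(t_i)} \exp(\beta^{\top} Z_j)}
```

The baseline hazard cancels, which is why it can be left unspecified.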
Partial Likelihood
- The partial likelihood is defined as
- Where
  - j = 1, 2, …, n indexes individuals
  - No ties: t1 < t2 < … < tD are the ordered event times
  - Z(i)k is the kth covariate associated with the individual whose failure time is ti
  - R(ti) = Yi is the risk set at time ti
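In the notation just listed, the partial likelihood is usually written as below. This is the standard no-ties form, assuming D distinct event times and p covariates, with Z_jk the kth covariate of individual j:

```latex
L(\beta) = \prod_{i=1}^{D}
  \frac{\exp\!\left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right)}
       {\sum_{j \in R(t_i)} \exp\!\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}
```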
Things to Notice
- The numerator depends only on information from the patient who experiences the event
- The denominator incorporates information across all patients in the risk set
Constructing the Likelihood Without Censoring…
- Say we have the following data on n = 5 subjects
- Observed times and event indicators:
  - ti = 11, 12, 14, 16, 21
  - δi = 1, 1, 1, 1, 1
- And a single binary covariate
  - zi = 0, 1, 0, 1, 1
Constructing the Likelihood First let’s construct our risk set for each unique time
Constructing the Likelihood Now, we can construct our likelihood…
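For the uncensored data (t_i = 11, 12, 14, 16, 21 with z_i = 0, 1, 0, 1, 1), a worked sketch follows. Subjects are numbered 1 to 5 in order of their observed times, a labelling introduced here for convenience, and β is the coefficient of the binary covariate.

```latex
\begin{aligned}
R(11) &= \{1,2,3,4,5\}, \quad R(12) = \{2,3,4,5\}, \quad R(14) = \{3,4,5\}, \quad
R(16) = \{4,5\}, \quad R(21) = \{5\} \\[4pt]
L(\beta) &= \frac{1}{2 + 3e^{\beta}} \cdot \frac{e^{\beta}}{1 + 3e^{\beta}} \cdot
            \frac{1}{1 + 2e^{\beta}} \cdot \frac{e^{\beta}}{2e^{\beta}} \cdot
            \frac{e^{\beta}}{e^{\beta}}
          = \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})(1 + 2e^{\beta})}
\end{aligned}
```

Each factor has exp(β z) for the failing subject in the numerator and the sum of exp(β z) over the risk set in the denominator. At β = 0 every subject in a risk set is equally likely to fail, giving L(0) = (1/5)(1/4)(1/3)(1/2)(1) = 1/120.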
Constructing the Likelihood
- But what if we have censoring?
- Consider the revised data:
- Observed times and event indicators:
  - ti = 11, 12, 14, 16, 21
  - δi = 1, 1, 0, 1, 0
- And a single binary covariate
  - zi = 0, 1, 0, 1, 1
Constructing the Likelihood Again let’s construct our risk set for each unique time
Constructing the Likelihood And again we can construct our likelihood…
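For the censored data (event indicators 1, 1, 0, 1, 0), only the event times 11, 12, and 16 contribute factors. A worked sketch, again numbering subjects 1 to 5 in order of their observed times (a labelling introduced here for convenience):

```latex
\begin{aligned}
R(11) &= \{1,2,3,4,5\}, \quad R(12) = \{2,3,4,5\}, \quad R(16) = \{4,5\} \\[4pt]
L(\beta) &= \frac{1}{2 + 3e^{\beta}} \cdot \frac{e^{\beta}}{1 + 3e^{\beta}} \cdot
            \frac{e^{\beta}}{2e^{\beta}}
          = \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})}
\end{aligned}
```

The censored subjects (3 and 5) never appear in a numerator, but they do contribute to the denominators of every risk set they belong to.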
Estimation
- The log-likelihood
- Maximize the log-likelihood to solve for estimates of β
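Taking the log of the no-ties partial likelihood gives the function to be maximized, and differentiating gives the score equations. A sketch in standard notation, assuming D distinct event times, p covariates, risk sets R(t_i), and Z_(i)k the kth covariate of the individual failing at t_i:

```latex
\begin{aligned}
LL(\beta) &= \sum_{i=1}^{D} \left[ \sum_{k=1}^{p} \beta_k Z_{(i)k}
  - \log \sum_{j \in R(t_i)} \exp\!\left( \sum_{k=1}^{p} \beta_k Z_{jk} \right) \right] \\[6pt]
U_k(\beta) &= \frac{\partial LL(\beta)}{\partial \beta_k}
  = \sum_{i=1}^{D} \left[ Z_{(i)k}
  - \frac{\sum_{j \in R(t_i)} Z_{jk}\,\exp(\beta^{\top} Z_j)}
         {\sum_{j \in R(t_i)} \exp(\beta^{\top} Z_j)} \right],
  \qquad k = 1, \dots, p
\end{aligned}
```

The estimates solve U_k(β) = 0 for all k: at the solution, the observed covariate value at each event time is balanced against its risk-set weighted average.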
Estimation
- Maximize the log-likelihood to solve for estimates of β
- Score equations and information matrices are found using standard approaches
- Solving for estimates can be done numerically (e.g., Newton-Raphson)
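As a sketch of the numerical step, the log partial likelihood for the small censored example (t_i = 11, 12, 14, 16, 21; event indicators 1, 1, 0, 1, 0; z_i = 0, 1, 0, 1, 1) can be maximized directly in base R. The helper name `log_pl` is introduced here for illustration. It assumes a single covariate and no tied event times, and it uses a one-dimensional search via `optimize` rather than Newton-Raphson.

```r
# Toy data from the censored example.
time  <- c(11, 12, 14, 16, 21)
delta <- c(1, 1, 0, 1, 0)      # 1 = event observed, 0 = censored
z     <- c(0, 1, 0, 1, 1)

# Log partial likelihood for one covariate, assuming no tied event times.
log_pl <- function(beta, time, delta, z) {
  eta <- beta * z
  contrib <- sapply(which(delta == 1), function(i) {
    at_risk <- time >= time[i]               # risk set at the i-th event time
    eta[i] - log(sum(exp(eta[at_risk])))     # numerator minus log denominator
  })
  sum(contrib)
}

# One-dimensional maximization over an assumed search interval.
fit <- optimize(log_pl, interval = c(-5, 5), maximum = TRUE,
                time = time, delta = delta, z = z, tol = 1e-8)
beta_hat <- fit$maximum

print(beta_hat)          # estimated log hazard ratio
print(exp(beta_hat))     # estimated hazard ratio
```

For these data the maximizer also has a closed form, β̂ = log(√2/3) ≈ -0.752, which gives a check on the numerical result. With several covariates a multivariate optimizer, or Newton-Raphson using the score and information matrix, would be needed instead.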
Tests of the Model
- Testing that βk = 0 for all k = 1, 2, …, p
- Three main tests
  - Chi-square/Wald test
  - Likelihood ratio test
  - Score test
- Under the null hypothesis, all three statistics have an asymptotic chi-square distribution with p degrees of freedom
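For reference, the three statistics for H0: β = 0 are commonly written as below. This is a sketch in standard notation, with β̂ the maximum partial likelihood estimate, LL the log partial likelihood, U(β) the score vector, and I(β) the observed information matrix; these symbols are introduced here.

```latex
\begin{aligned}
X^2_{W}  &= \hat{\beta}^{\top}\, I(\hat{\beta})\, \hat{\beta} \\
X^2_{LR} &= 2\left[ LL(\hat{\beta}) - LL(0) \right] \\
X^2_{SC} &= U(0)^{\top}\, I(0)^{-1}\, U(0)
\end{aligned}
```

The score test needs only quantities evaluated at β = 0, so it can be computed without fitting the model.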
Example: CGD
- Study examining the impact of gamma interferon treatment on infection in people with chronic granulomatous disease
- 203 subjects
- Main variable of interest is treatment
  - Placebo
  - Gamma interferon
- Other variables
  - Demographics (age, height, weight)
  - Steroid use
  - Pattern of inheritance
  - Treatment center
  - …
- Outcome: Time to first major infection
Cox PHM Approach
> library(survival)   # provides Surv() and coxph()
> data(cgd)
> st<-Surv(cgd$time, cgd$infect)
> reg1<-coxph(st~cgd$treat)
> reg1
Call:
coxph(formula = st ~ cgd$treat)

                 coef exp(coef) se(coef)     z       p
cgd$treatrIFN-g -1.09     0.337    0.268 -4.06 4.9e-05

Likelihood ratio test=18.9  on 1 df, p=1.36e-05
n= 203, number of events= 76
> attributes(reg1)
$names
 [1] "coefficients"      "var"               "loglik"            "score"             "iter"              "linear.predictors"
 [7] "residuals"         "means"             "concordance"       "method"            "n"                 "nevent"
[13] "terms"             "assign"            "wald.test"         "y"                 "formula"           "xlevels"
[19] "contrasts"         "call"

$class
[1] "coxph"
Results
> summary(reg1)
Call:
coxph(formula = st ~ cgd$treat)

  n= 203, number of events= 76

                   coef exp(coef) se(coef)      z Pr(>|z|)
cgd$treatrIFN-g -1.0864    0.3374   0.2677 -4.059 4.93e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                exp(coef) exp(-coef) lower .95 upper .95
cgd$treatrIFN-g    0.3374      2.964    0.1997    0.5702

Likelihood ratio test= 18.92  on 1 df,   p=1.364e-05
Wald test            = 16.47  on 1 df,   p=4.933e-05
Score (logrank) test = 18.07  on 1 df,   p=2.124e-05
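The reported z statistic, Wald test, and confidence interval can be reproduced from the printed coefficient and standard error alone. The sketch below hard-codes the two rounded values from the summary output (so results agree only up to rounding) and assumes the usual normal approximation; the variable names are introduced here for illustration.

```r
# Values copied from the summary output (rounded as printed).
coef_trt <- -1.0864
se_trt   <-  0.2677

z_stat <- coef_trt / se_trt                          # Wald z statistic
wald   <- z_stat^2                                   # Wald chi-square on 1 df
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE)   # two-sided p-value

hr <- exp(coef_trt)                                  # estimated hazard ratio
ci <- exp(coef_trt + c(-1, 1) * qnorm(0.975) * se_trt)  # 95% CI for the HR

print(round(c(z = z_stat, wald = wald), 3))
print(signif(p_wald, 3))
print(round(c(HR = hr, lower = ci[1], upper = ci[2]), 4))
```

Read this way, the estimated hazard of infection at any time in the gamma interferon group is about one third of that in the placebo group, with a 95% interval of roughly 0.20 to 0.57.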
Fitting More Covariates in R
> reg2<-coxph(st~treat+steroids+inherit+hos.cat+sex+age+weight, data=cgd)
> reg2
Call:
coxph(formula = st ~ treat + steroids + inherit + hos.cat + sex + age + weight, data = cgd)

                            coef exp(coef) se(coef)      z       p
treatrIFN-g             -1.2025     0.300   0.2828 -4.253 2.1e-05
steroids                 1.7743     5.896   0.5852  3.032 2.4e-03
inheritautosomal         0.6169     1.853   0.2824  2.184 2.9e-02
hos.catUS:other          0.0589     1.061   0.3208  0.184 8.5e-01
hos.catEurope:Amsterdam -0.5687     0.566   0.4432 -1.283 2.0e-01
hos.catEurope:other     -0.6232     0.536   0.4956 -1.257 2.1e-01
sexfemale               -0.6193     0.538   0.3872 -1.600 1.1e-01
age                     -0.0861     0.917   0.0336 -2.566 1.0e-02
weight                   0.0235     1.024   0.0127  1.858 6.3e-02

Likelihood ratio test=41.2  on 9 df, p=4.65e-06
n= 203, number of events= 76
Next Time
- More on constructing our hypothesis tests