1
Lecture 12: Cox Proportional Hazards Model
Introduction
2
Cox Proportional Hazards Model
Names
  Cox regression
  Semi-parametric proportional hazards
  Proportional hazards model
  Multiplicative hazards model
When? 1972
Why?
  Allows for adjustment of covariates (continuous and categorical) in a survival setting
  Allows prediction of survival based on a set of covariates
  Analogous to linear and logistic regression in many ways
3
Cox PHM Notation (K & M)
Data on n individuals:
  Tj: time on study for individual j
  δj: event indicator for individual j (1 if the event was observed, 0 if censored)
  Zj: vector of covariates for individual j
More complicated: Zj(t)
  Covariates are time dependent
  May change with time/age
4
Basic Model
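In the K & M notation, the basic model is usually written as below. Here β is the regression coefficient vector, Z the covariate vector, and p the number of covariates.

$$
h(t \mid Z) = h_0(t)\,\exp\!\left(\beta^{\top} Z\right) = h_0(t)\,\exp\!\left(\sum_{k=1}^{p} \beta_k Z_k\right)
$$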
5
Comments on the Basic Model
h0(t): Arbitrary baseline hazard
  Notice that it varies by t
β: Regression coefficient vector
  Interpretation is a log hazard ratio
Semi-parametric form
  Non-parametric baseline hazard
  Parametric form assumed for covariate effects only
6
Linear Model Formulation
Usual formulation
Coding of covariates is similar to linear and logistic regression (and other generalized linear models)
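A sketch of the linear formulation, obtained by dividing the model by the baseline hazard and taking logs (β is the coefficient vector, Z the covariate vector, p the number of covariates):

$$
\ln\!\left[\frac{h(t \mid Z)}{h_0(t)}\right] = \beta^{\top} Z = \sum_{k=1}^{p} \beta_k Z_k
$$

The right-hand side is a linear predictor like the one in linear or logistic regression, except that there is no intercept term: any constant is absorbed into the baseline hazard h0(t).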
7
Refresher of Coding Covariates
Should be nothing new
Two kinds of “independent” variables
  Quantitative
  Qualitative
Quantitative are continuous
  Need to determine scale
    Units
    Transformations?
Qualitative are generally categorical
  Ordered
  Nominal
Coding affects interpretation
8
Why Proportional
Hazard ratio
  Does not depend on t (i.e. it is constant over time)
  The hazards are proportional (they differ by a constant multiplicative factor)
  Also referred to (sometimes) as the relative risk
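To see why the ratio is free of t, compare two individuals with covariate vectors Z and Z* (the symbol Z* is a label introduced here for the comparison individual). The baseline hazard cancels:

$$
\frac{h(t \mid Z)}{h(t \mid Z^{*})}
= \frac{h_0(t)\,\exp(\beta^{\top} Z)}{h_0(t)\,\exp(\beta^{\top} Z^{*})}
= \exp\!\left\{\beta^{\top}\,(Z - Z^{*})\right\}
$$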
9
Simple Example
One covariate: treatment group (new vs. standard)
Hazard ratio: exp{βtrt}
Interpretation: exp{βtrt} is the risk of having the event in the new treatment group relative to the standard treatment group
Interpretation: At any point in time, the risk of the event in the new treatment group is exp{βtrt} times the risk in the standard treatment group
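A worked version of this example, assuming the treatment covariate is coded Z_trt = 1 for the new treatment and Z_trt = 0 for the standard treatment (this 0/1 coding is an assumption, not stated in the slide):

$$
h(t \mid Z_{trt}) = h_0(t)\,\exp(\beta_{trt} Z_{trt}),
\qquad
\frac{h(t \mid Z_{trt} = 1)}{h(t \mid Z_{trt} = 0)}
= \frac{h_0(t)\,\exp(\beta_{trt})}{h_0(t)}
= \exp(\beta_{trt})
$$

For instance, a hazard ratio of 0.5 corresponds to β_trt = ln(0.5) ≈ −0.693.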
10
Fig 3. Cantù M G et al. JCO 2002;20:1232-1237
©2002 by American Society of Clinical Oncology
11
Hazard Ratios
Hazard function = P(die at time t | survived to time t)
Assumption: “proportional hazards”
  The relative risk does not depend on time
  That is, “the relative risk is constant over time”
  But that is still vague…
Hypothetical example: Assume the hazard ratio is 0.5
  Patients in the new therapy group are at half the risk of death of those on the standard treatment, at any given point in time
12
Hazard Ratios
Hazard ratio = (hazard function, new treatment) / (hazard function, standard treatment)
Makes the assumption that this ratio is constant over time
13
Interpretation Again
At any fixed point in time, individuals in the new treatment group have half the risk of death of those in the standard treatment group.
14
A Slightly More Complicated Example
What if we had 2 binary covariates?
How is the hazard ratio estimated in this case?
What about the proportional hazards assumption?
15
A Slightly More Complicated Example
Consider a model that includes
16
A Slightly More Complicated Example
Our model looks like:
17
A Slightly More Complicated Example
From this we can estimate 4 possible hazard rates
18
A Slightly More Complicated Example
And if we “compare” the different hazards by taking the ratio we get
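A sketch of this example with two generic binary covariates Z1 and Z2, each coded 0/1, and no interaction term. The names Z1 and Z2 and the no-interaction form are assumptions made for illustration. The model and its four possible hazards would be:

$$
h(t \mid Z_1, Z_2) = h_0(t)\,\exp(\beta_1 Z_1 + \beta_2 Z_2)
$$

$$
\begin{aligned}
h(t \mid 0, 0) &= h_0(t) \\
h(t \mid 1, 0) &= h_0(t)\,e^{\beta_1} \\
h(t \mid 0, 1) &= h_0(t)\,e^{\beta_2} \\
h(t \mid 1, 1) &= h_0(t)\,e^{\beta_1 + \beta_2}
\end{aligned}
$$

Taking ratios against the group with Z1 = Z2 = 0:

$$
\frac{h(t \mid 1, 0)}{h(t \mid 0, 0)} = e^{\beta_1},
\qquad
\frac{h(t \mid 0, 1)}{h(t \mid 0, 0)} = e^{\beta_2},
\qquad
\frac{h(t \mid 1, 1)}{h(t \mid 0, 0)} = e^{\beta_1 + \beta_2}
$$

In each ratio h0(t) cancels, so none of them depends on t.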
19
A Slightly More Complicated Example
But what does this mean in terms of proportional hazards?
20
Hazard ratio is not always valid…
21
Let’s Think About the Likelihood…
22
Let’s Think About the Likelihood…
23
Let’s Think About the Likelihood…
24
Let’s Think About the Likelihood…
25
Partial Likelihood
The partial likelihood is defined as
Where
  j = 1, 2, …, n
  No ties: t1 < t2 < … < tD
  Z(i)k is the kth covariate associated with the individual whose failure time is ti
  R(ti) = Yi is the risk set at time ti
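In K & M's form, with D distinct ordered event times and p covariates, the partial likelihood referred to above is usually written as:

$$
L(\beta) = \prod_{i=1}^{D}
\frac{\exp\!\left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right)}
     {\sum_{j \in R(t_i)} \exp\!\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}
$$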
26
Things to Notice
Numerator only depends on information from a patient who experiences the event
The denominator incorporates information across all patients in the risk set
27
Constructing the Likelihood
Without Censoring…
Say we have the following data on n = 5 subjects
Observed times and event indicators:
  ti = 11, 12, 14, 16, 21
  δi = 1, 1, 1, 1, 1
And a single binary covariate
  zi = 0, 1, 0, 1, 1
28
Constructing the Likelihood
First let’s construct our risk set for each unique time
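For the five uncensored subjects, the risk sets can be written out directly. Numbering the subjects 1 to 5 in the order their times are listed is a labeling assumption made here. A subject is in R(t_i) if their observed time is at least t_i:

$$
\begin{array}{c|l|l}
t_i & R(t_i) & z \text{ values in } R(t_i) \\ \hline
11 & \{1, 2, 3, 4, 5\} & 0, 1, 0, 1, 1 \\
12 & \{2, 3, 4, 5\}    & 1, 0, 1, 1 \\
14 & \{3, 4, 5\}       & 0, 1, 1 \\
16 & \{4, 5\}          & 1, 1 \\
21 & \{5\}             & 1
\end{array}
$$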
29
Constructing the Likelihood
Now, we can construct our likelihood…
30
Constructing the Likelihood
Now, we can construct our likelihood…
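A worked version for the uncensored data (times 11, 12, 14, 16, 21 with z = 0, 1, 0, 1, 1), assuming a single coefficient β. Each factor has exp(β z) for the subject who fails in the numerator and the sum of exp(β z) over everyone still at risk in the denominator:

$$
L(\beta)
= \frac{1}{2 + 3e^{\beta}}
\cdot \frac{e^{\beta}}{1 + 3e^{\beta}}
\cdot \frac{1}{1 + 2e^{\beta}}
\cdot \frac{e^{\beta}}{2e^{\beta}}
\cdot \frac{e^{\beta}}{e^{\beta}}
= \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})(1 + 2e^{\beta})}
$$

The last two factors equal 1/2 and 1 regardless of β, because everyone left in those risk sets has the same covariate value.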
31
Constructing the Likelihood
But what if we have censoring?
Consider the revised data:
Observed times and event indicators:
  ti = 11, 12, 14, 16, 21
  δi = 1, 1, 0, 1, 0
And a single binary covariate
  zi = 0, 1, 0, 1, 1
32
Constructing the Likelihood
Again let’s construct our risk set for each unique time
33
Constructing the Likelihood
And again we can construct our likelihood…
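A worked version for the censored data, numbering the subjects 1 to 5 in the order listed (a labeling assumption). Events occur only at t = 11, 12, and 16, so the product has three factors. The subjects censored at 14 and 21 never appear in a numerator, but they count in the denominator of every risk set they belong to:

$$
R(11) = \{1, 2, 3, 4, 5\}, \qquad R(12) = \{2, 3, 4, 5\}, \qquad R(16) = \{4, 5\}
$$

$$
L(\beta)
= \frac{1}{2 + 3e^{\beta}}
\cdot \frac{e^{\beta}}{1 + 3e^{\beta}}
\cdot \frac{e^{\beta}}{2e^{\beta}}
= \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})}
$$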
34
Estimation
The log-likelihood
Maximize the log-likelihood to solve for estimates of β
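Taking the log of the partial likelihood (no ties, D distinct event times, p covariates) gives the following form, shown here in K & M's notation:

$$
LL(\beta)
= \sum_{i=1}^{D} \sum_{k=1}^{p} \beta_k Z_{(i)k}
- \sum_{i=1}^{D} \ln\!\left[\sum_{j \in R(t_i)} \exp\!\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)\right]
$$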
35
Estimation
Maximize the log-likelihood to solve for estimates of β
Score equations and information matrices are found using standard approaches
Solving for estimates can be done numerically (e.g. Newton-Raphson)
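As an illustration of the numerical step, here is a minimal base-R sketch of Newton-Raphson for a single covariate, applied to the five uncensored observations from the likelihood example. The function names `score_info` and `newton_cox` are hypothetical helpers written for this sketch, the code assumes no tied event times, and it is not meant to reproduce how `coxph` is implemented.

```r
# Toy data: the uncensored example (times, event indicators, binary covariate)
time   <- c(11, 12, 14, 16, 21)
status <- c(1, 1, 1, 1, 1)
z      <- c(0, 1, 0, 1, 1)

# Score U(b) and information I(b) of the partial log-likelihood
# for one covariate, assuming no tied event times (hypothetical helper)
score_info <- function(b, time, status, z) {
  U <- 0
  I <- 0
  for (i in which(status == 1)) {
    at_risk <- time >= time[i]            # risk set R(t_i)
    w  <- exp(b * z[at_risk])             # relative hazards in the risk set
    m1 <- sum(w * z[at_risk]) / sum(w)    # weighted mean of z
    m2 <- sum(w * z[at_risk]^2) / sum(w)  # weighted mean of z^2
    U <- U + (z[i] - m1)                  # observed minus expected covariate
    I <- I + (m2 - m1^2)                  # weighted variance of z
  }
  c(U = U, I = I)
}

# Newton-Raphson update: b_new = b + U(b) / I(b), starting from b = 0
newton_cox <- function(time, status, z, b = 0, tol = 1e-8, max_iter = 25) {
  for (iter in seq_len(max_iter)) {
    si   <- score_info(b, time, status, z)
    step <- si[["U"]] / si[["I"]]
    b    <- b + step
    if (abs(step) < tol) break
  }
  b
}

b_hat <- newton_cox(time, status, z)
b_hat        # estimated log hazard ratio
exp(b_hat)   # estimated hazard ratio
```

The same functions work for the censored version of the data by changing `status`, since censored subjects enter only through the risk sets.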
36
Tests of the Model
Testing that βk = 0 for all k = 1, 2, …, p
Three main tests
  Chi-square/Wald test
  Likelihood ratio test
  Score test
Under the null hypothesis, all three have an asymptotic chi-square distribution with p degrees of freedom
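For the global null hypothesis that all coefficients are zero, the three statistics take the following standard forms. The notation is assumed here: the estimate (written with a hat) maximizes the partial log-likelihood LL, U is the score vector, and I is the information matrix.

$$
\chi^2_{W} = \hat{\beta}^{\top}\, I(\hat{\beta})\, \hat{\beta},
\qquad
\chi^2_{LR} = 2\left[\,LL(\hat{\beta}) - LL(0)\,\right],
\qquad
\chi^2_{SC} = U(0)^{\top}\, I(0)^{-1}\, U(0)
$$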
37
Example: CGD
Study examining the impact of gamma interferon treatment on infection in people with chronic granulomatous disease
203 subjects
Main variable of interest is treatment
  Placebo
  Gamma interferon
Other variables
  Demographics (age, height, weight)
  Steroid use
  Pattern of inheritance
  Treatment center
  …
Outcome: Time to first major infection
39
Cox PHM Approach
> library(survival)
> data(cgd)
> st<-Surv(cgd$time, cgd$infect)
> reg1<-coxph(st~cgd$treat)
> reg1
Call:
coxph(formula = st ~ cgd$treat)
                coef exp(coef) se(coef) z p
cgd$treatrIFN-g e-05
Likelihood ratio test=18.9 on 1 df, p=1.36e-05
n= 203, number of events= 76
> attributes(reg1)
$names
 [1] "coefficients" "var" "loglik" "score" "iter" "linear.predictors"
 [7] "residuals" "means" "concordance" "method" "n" "nevent"
[13] "terms" "assign" "wald.test" "y" "formula" "xlevels"
[19] "contrasts" "call"
$class
[1] "coxph"
40
Results
> summary(reg1)
Call:
coxph(formula = st ~ cgd$treat)
n= 203, number of events= 76
                coef exp(coef) se(coef) z Pr(>|z|)
cgd$treatrIFN-g e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
                exp(coef) exp(-coef) lower .95 upper .95
cgd$treatrIFN-g
Likelihood ratio test= on 1 df, p=1.364e-05
Wald test = on 1 df, p=4.933e-05
Score (logrank) test = on 1 df, p=2.124e-05
41
Fitting More Covariates in R
> reg2<-coxph(st~treat+steroids+inherit+hos.cat+sex+age+weight, data=cgd)
> reg2
Call:
coxph(formula = st ~ treat + steroids + inherit + hos.cat + sex + age + weight, data = cgd)
                        coef exp(coef) se(coef) z p
treatrIFN-g e-05
steroids e-03
inheritautosomal e-02
hos.catUS:other e-01
hos.catEurope:Amsterdam e-01
hos.catEurope:other e-01
sexfemale e-01
age e-02
weight e-02
Likelihood ratio test=41.2 on 9 df, p=4.65e-06
n= 203, number of events= 76
42
Next Time
More on constructing our hypothesis tests…