Lecture 12: Cox Proportional Hazards Model


Lecture 12: Cox Proportional Hazards Model Introduction

Cox Proportional Hazards Model
Names
  Cox regression
  Semi-parametric proportional hazards
  Proportional hazards model
  Multiplicative hazards model
When? 1972
Why?
  Allows adjustment for covariates (continuous and categorical) in a survival setting
  Allows prediction of survival based on a set of covariates
  Analogous to linear and logistic regression in many ways

Cox PHM Notation (K & M)
Data on n individuals:
  Tj: time on study for individual j
  δj: event indicator for individual j (1 if the event was observed, 0 if censored)
  Zj: vector of covariates for individual j
More complicated: Zj(t)
  Covariates are time dependent
  May change with time/age

Basic Model
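The formula on this slide is not reproduced in the transcript. A sketch of the standard Cox model, assuming the notation introduced above and p covariates indexed by k (both borrowed from later slides), would be:

```latex
\[
h(t \mid Z) = h_0(t)\,\exp\!\left(\beta^{T} Z\right)
            = h_0(t)\,\exp\!\left(\sum_{k=1}^{p} \beta_k Z_k\right)
\]
```

Here h_0(t) is the baseline hazard, Z the covariate vector, and β the vector of regression coefficients.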

Comments on the Basic Model
h0(t): arbitrary baseline hazard
  Notice that it varies with t
β: regression coefficient vector
  Interpretation is a log hazard ratio
Semi-parametric form
  Non-parametric baseline hazard
  Parametric form assumed for covariate effects only

Linear Model Formulation
Usual formulation
Coding of covariates is similar to linear and logistic regression (and other generalized linear models)
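The "usual formulation" is not shown in the transcript. Assuming the standard Cox model h(t | Z) = h_0(t) exp(β'Z), taking logs would give a linear predictor of the same kind used in linear and logistic regression:

```latex
\[
\log\!\left[\frac{h(t \mid Z)}{h_0(t)}\right] = \beta^{T} Z = \sum_{k=1}^{p} \beta_k Z_k
\]
```

There is no intercept term: any constant would be absorbed into the baseline hazard h_0(t).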

Refresher of Coding Covariates
Should be nothing new
Two kinds of “independent” variables
  Quantitative
  Qualitative
Quantitative variables are continuous
  Need to determine scale
    Units
    Transformations?
Qualitative variables are generally categorical
  Ordered
  Nominal
Coding affects interpretation

Why Proportional
Hazard ratio
  Does not depend on t (i.e. it is constant over time)
  The hazards are proportional (they differ by a constant multiplicative factor)
  Also referred to (sometimes) as the relative risk
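As a sketch of why the ratio is free of t, assume the standard Cox form and compare two individuals with covariate vectors Z and Z* (Z* is a label introduced here for illustration). The baseline hazard cancels:

```latex
\[
\frac{h(t \mid Z)}{h(t \mid Z^{*})}
  = \frac{h_0(t)\,\exp(\beta^{T} Z)}{h_0(t)\,\exp(\beta^{T} Z^{*})}
  = \exp\!\left[\beta^{T}\,(Z - Z^{*})\right]
\]
```

The right-hand side depends only on the covariates and β, so the two hazards stay in the same proportion at every time point.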

Simple Example
One covariate:
Hazard ratio:
Interpretation: exp{βtrt} is the hazard of the event in the new treatment group relative to the standard treatment group
Interpretation: At any point in time, the risk of the event in the new treatment group is exp{βtrt} times the risk in the standard treatment group
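The covariate definition and hazard ratio for this slide are not shown in the transcript. A plausible reconstruction, assuming a single indicator Z coded 1 for the new treatment and 0 for the standard treatment (the coding is an assumption), is:

```latex
\[
h(t \mid Z) = h_0(t)\,\exp(\beta_{trt} Z), \qquad
\frac{h(t \mid Z = 1)}{h(t \mid Z = 0)}
  = \frac{h_0(t)\,\exp(\beta_{trt})}{h_0(t)}
  = \exp(\beta_{trt})
\]
```

For instance, β_trt = log(0.5) ≈ -0.693 corresponds to a hazard ratio of 0.5.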

Fig 3. Cantù M G et al. JCO 2002;20:1232-1237 ©2002 by American Society of Clinical Oncology

Hazard Ratios
Hazard function = P(die at time t | survived to time t)
Assumption: “proportional hazards”
  The relative risk does not depend on time
  That is, “the relative risk is constant over time”
  But that is still vague…
Hypothetical example: Assume the hazard ratio is 0.5
  Patients in the new therapy group are at half the risk of death of those in the standard treatment group, at any given point in time

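Hazard Ratios
Hazard ratio = (hazard of death in the new treatment group) / (hazard of death in the standard treatment group)
This makes the assumption that the ratio is constant over time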

Interpretation Again
At any fixed point in time, individuals in the new treatment group have half the risk of death of individuals in the standard treatment group.

A Slightly More Complicated Example What if we had 2 binary covariates? How is the hazard ratio estimated in this case? What about the proportional hazards assumption?

A Slightly More Complicated Example Consider a model that includes

A Slightly More Complicated Example Our model looks like:

A Slightly More Complicated Example From this we can estimate 4 possible hazard rates

A Slightly More Complicated Example And if we “compare” the different hazards by taking the ratio we get

A Slightly More Complicated Example But what does this mean in terms of proportional hazards?
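The formulas for this example are not shown in the transcript. As a sketch, assume two binary covariates Z_1 and Z_2 entering as main effects only, with no interaction term (both the names and the absence of an interaction are assumptions). The model and its four possible hazards would be:

```latex
\[
h(t \mid Z_1, Z_2) = h_0(t)\,\exp(\beta_1 Z_1 + \beta_2 Z_2)
\]
\[
\begin{aligned}
h(t \mid 0, 0) &= h_0(t), &
h(t \mid 1, 0) &= h_0(t)\,e^{\beta_1}, \\
h(t \mid 0, 1) &= h_0(t)\,e^{\beta_2}, &
h(t \mid 1, 1) &= h_0(t)\,e^{\beta_1 + \beta_2}
\end{aligned}
\]
\[
\frac{h(t \mid 1, 0)}{h(t \mid 0, 0)} = e^{\beta_1}, \qquad
\frac{h(t \mid 0, 1)}{h(t \mid 0, 0)} = e^{\beta_2}, \qquad
\frac{h(t \mid 1, 1)}{h(t \mid 0, 0)} = e^{\beta_1 + \beta_2}
\]
```

None of these ratios involves t, so under this model every pair of groups has proportional hazards, and each covariate's hazard ratio is the same at both levels of the other covariate.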

Hazard ratio is not always valid…

Let’s Think About the Likelihood…

Partial Likelihood
The partial likelihood is defined as
Where
  j = 1, 2, …, n
  No ties: t1 < t2 < … < tD
  Z(i)k is the kth covariate associated with the individual whose failure time is ti
  R(ti) = Yi is the risk set at time ti
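The defining formula is missing from the transcript. Assuming the standard Cox partial likelihood with D distinct event times and p covariates, written with the symbols the slide defines, it would read:

```latex
\[
L(\beta) = \prod_{i=1}^{D}
  \frac{\exp\!\left(\sum_{k=1}^{p} \beta_k Z_{(i)k}\right)}
       {\sum_{j \in R(t_i)} \exp\!\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)}
\]
```

The baseline hazard h_0(t) does not appear, which is what allows β to be estimated without specifying it.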

Things to Notice
The numerator depends only on information from the patient who experiences the event
The denominator incorporates information across all patients in the risk set

Constructing the Likelihood
Without censoring…
Say we have the following data on n = 5 subjects
Observed times and event indicators:
  ti = 11, 12, 14, 16, 21
  δi = 1, 1, 1, 1, 1
And a single binary covariate
  zi = 0, 1, 0, 1, 1

Constructing the Likelihood First let’s construct our risk set for each unique time
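The risk-set table for this slide is not in the transcript. A minimal base R sketch, assuming the risk set at an event time is every subject whose observed time is at least that time, and using the five uncensored observations given above (the variable names are illustrative):

```r
# Data from the slide: five subjects, all events observed
time   <- c(11, 12, 14, 16, 21)
status <- c(1, 1, 1, 1, 1)
z      <- c(0, 1, 0, 1, 1)

# Distinct event times (censored times would be excluded here)
event_times <- sort(unique(time[status == 1]))

# Risk set at each event time: indices of subjects still under observation
risk_sets <- lapply(event_times, function(tt) which(time >= tt))
names(risk_sets) <- paste0("t=", event_times)

print(risk_sets)
```

Each risk set loses one subject as we move to the next event time, ending with subject 5 alone at t = 21.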

Constructing the Likelihood Now, we can construct our likelihood…
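The likelihood itself is not shown in the transcript. As a sketch for the five uncensored subjects, assume a single coefficient β and risk sets made up of everyone with an observed time at or after the event time. Each event time then contributes one factor: exp(β z) for the subject who fails, divided by the sum of exp(β z) over the risk set.

```latex
\[
L(\beta) =
\frac{1}{2 + 3e^{\beta}} \cdot
\frac{e^{\beta}}{1 + 3e^{\beta}} \cdot
\frac{1}{1 + 2e^{\beta}} \cdot
\frac{e^{\beta}}{2e^{\beta}} \cdot
\frac{e^{\beta}}{e^{\beta}}
= \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})(1 + 2e^{\beta})}
\]
```

The factors correspond to the event times 11, 12, 14, 16 and 21, in that order. The last factor equals 1 because the final subject is the only one left at risk.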

Constructing the Likelihood
But what if we have censoring?
Consider the revised data:
Observed times and event indicators:
  ti = 11, 12, 14, 16, 21
  δi = 1, 1, 0, 1, 0
And a single binary covariate
  zi = 0, 1, 0, 1, 1

Constructing the Likelihood Again let’s construct our risk set for each unique time

Constructing the Likelihood And again we can construct our likelihood…
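The risk sets and likelihood for the censored data are not shown in the transcript. As a sketch, number the subjects 1 to 5 in order of observed time (an assumption) and keep only the event times 11, 12 and 16. The subjects censored at 14 and 21 never contribute a numerator, but they remain in the risk sets while they are under observation:

```latex
\[
R(11) = \{1, 2, 3, 4, 5\}, \qquad
R(12) = \{2, 3, 4, 5\}, \qquad
R(16) = \{4, 5\}
\]
\[
L(\beta) =
\frac{1}{2 + 3e^{\beta}} \cdot
\frac{e^{\beta}}{1 + 3e^{\beta}} \cdot
\frac{e^{\beta}}{2e^{\beta}}
= \frac{e^{\beta}}{2\,(2 + 3e^{\beta})(1 + 3e^{\beta})}
\]
```

Compared with the uncensored case, the factors for times 14 and 21 drop out, while the denominators for times 11 and 12 are unchanged.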

Estimation
The log-likelihood
Maximize the log-likelihood to solve for estimates of β
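The log-likelihood formula is not in the transcript. Assuming the standard partial likelihood with D distinct event times and no ties, taking logs would give:

```latex
\[
LL(\beta) = \sum_{i=1}^{D} \sum_{k=1}^{p} \beta_k Z_{(i)k}
  \;-\; \sum_{i=1}^{D} \log\!\left[\,\sum_{j \in R(t_i)}
        \exp\!\left(\sum_{k=1}^{p} \beta_k Z_{jk}\right)\right]
\]
```

The first sum collects the linear predictors of the subjects who fail. The second penalizes each event time by the log of the total relative hazard in its risk set.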

Estimation
Maximize the log-likelihood to solve for estimates of β
Score equations and information matrices are found using standard approaches
Solving for estimates can be done numerically (e.g. Newton-Raphson)
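To make the numerical step concrete, here is a minimal base R sketch that maximizes the log partial likelihood for the small censored data set from the earlier slide (times 11, 12, 14, 16, 21 with indicators 1, 1, 0, 1, 0). It uses R's one-dimensional `optimize` rather than Newton-Raphson, and the function name `log_pl` and the search interval are illustrative choices, not part of the lecture.

```r
# Censored toy data from the earlier slide
time   <- c(11, 12, 14, 16, 21)
status <- c(1, 1, 0, 1, 0)   # 1 = event observed, 0 = censored
z      <- c(0, 1, 0, 1, 1)   # single binary covariate

# Log partial likelihood for a single coefficient beta (no tied event times)
log_pl <- function(beta) {
  ll <- 0
  for (i in which(status == 1)) {
    at_risk <- time >= time[i]   # risk set at this event time
    ll <- ll + beta * z[i] - log(sum(exp(beta * z[at_risk])))
  }
  ll
}

# One-dimensional numerical maximization over a wide interval
fit <- optimize(log_pl, interval = c(-10, 10), maximum = TRUE)
beta_hat <- fit$maximum

print(beta_hat)       # estimated log hazard ratio
print(exp(beta_hat))  # estimated hazard ratio
```

For this data set the score equation can also be solved by hand: it reduces to 9e^{2β} = 2, so the maximizer is β = log(√2/3) ≈ -0.75, which the numerical result should match closely.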

Tests of the Model
Testing that βk = 0 for all k = 1, 2, …, p
Three main tests
  Chi-square/Wald test
  Likelihood ratio test
  Score test
Under the null hypothesis, all three statistics have an asymptotic chi-square distribution with p degrees of freedom
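The test statistics are not written out in the transcript. As a sketch under the standard large-sample theory, let U(β) be the score vector, I(β) the information matrix, and LL(β) the log partial likelihood (these symbols are introduced here for illustration). The three statistics for the global null hypothesis β = 0 would be:

```latex
\[
\chi^2_{W} = \hat{\beta}^{T}\, I(\hat{\beta})\, \hat{\beta}, \qquad
\chi^2_{LR} = 2\left[\,LL(\hat{\beta}) - LL(0)\,\right], \qquad
\chi^2_{SC} = U(0)^{T}\, I(0)^{-1}\, U(0)
\]
```

The score test needs only quantities evaluated at β = 0, so it can be computed without fitting the model.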

Example: CGD
Study examining the impact of gamma interferon treatment on infection in people with chronic granulomatous disease
203 subjects
Main variable of interest is treatment
  Placebo
  Gamma interferon
Other variables
  Demographics (age, height, weight)
  Steroid use
  Pattern of inheritance
  Treatment center
  …
Outcome: Time to first major infection

Cox PHM Approach
    > library(survival)
    > data(cgd)
    > st<-Surv(cgd$time, cgd$infect)
    > reg1<-coxph(st~cgd$treat)
    > reg1
    Call:
    coxph(formula = st ~ cgd$treat)
                     coef exp(coef) se(coef)     z       p
    cgd$treatrIFN-g -1.09     0.337    0.268 -4.06 4.9e-05
    Likelihood ratio test=18.9 on 1 df, p=1.36e-05
    n= 203, number of events= 76
    > attributes(reg1)
    $names
     [1] "coefficients" "var" "loglik" "score" "iter" "linear.predictors"
     [7] "residuals" "means" "concordance" "method" "n" "nevent"
    [13] "terms" "assign" "wald.test" "y" "formula" "xlevels" "contrasts" "call"
    $class
    [1] "coxph"

Results
    > summary(reg1)
    Call:
    coxph(formula = st ~ cgd$treat)
      n= 203, number of events= 76
                       coef exp(coef) se(coef)      z Pr(>|z|)
    cgd$treatrIFN-g -1.0864    0.3374   0.2677 -4.059 4.93e-05 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
                    exp(coef) exp(-coef) lower .95 upper .95
    cgd$treatrIFN-g    0.3374      2.964    0.1997    0.5702
    Likelihood ratio test= 18.92 on 1 df, p=1.364e-05
    Wald test            = 16.47 on 1 df, p=4.933e-05
    Score (logrank) test = 18.07 on 1 df, p=2.124e-05
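The hazard ratio, confidence interval and Wald test in this output can be rebuilt from the coefficient and its standard error alone. The base R sketch below does so with the rounded values printed above; the variable names are illustrative, and small differences from the printed output come from that rounding.

```r
# Rounded values copied from the summary(reg1) output
coef_hat <- -1.0864
se_hat   <- 0.2677

# Hazard ratio and 95% Wald confidence interval on the hazard-ratio scale
hr <- exp(coef_hat)
ci <- exp(coef_hat + c(-1, 1) * qnorm(0.975) * se_hat)

# Wald statistic: z = coef / se, and z^2 is chi-square with 1 df under H0
z_stat     <- coef_hat / se_hat
wald_chisq <- z_stat^2
wald_p     <- pchisq(wald_chisq, df = 1, lower.tail = FALSE)

# p-value for the reported likelihood ratio statistic (18.92 on 1 df)
lrt_p <- pchisq(18.92, df = 1, lower.tail = FALSE)

print(round(c(HR = hr, lower = ci[1], upper = ci[2]), 4))
print(c(wald_chisq = wald_chisq, wald_p = wald_p, lrt_p = lrt_p))
```

The interval is computed on the log hazard scale and then exponentiated, which is why it is not symmetric around the hazard ratio of 0.3374.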

Fitting More Covariates in R
    > reg2<-coxph(st~treat+steroids+inherit+hos.cat+sex+age+weight, data=cgd)
    > reg2
    Call:
    coxph(formula = st ~ treat + steroids + inherit + hos.cat + sex + age + weight, data = cgd)
                               coef exp(coef) se(coef)      z       p
    treatrIFN-g             -1.2025     0.300   0.2828 -4.253 2.1e-05
    steroids                 1.7743     5.896   0.5852  3.032 2.4e-03
    inheritautosomal         0.6169     1.853   0.2824  2.184 2.9e-02
    hos.catUS:other          0.0589     1.061   0.3208  0.184 8.5e-01
    hos.catEurope:Amsterdam -0.5687     0.566   0.4432 -1.283 2.0e-01
    hos.catEurope:other     -0.6232     0.536   0.4956 -1.257 2.1e-01
    sexfemale               -0.6193     0.538   0.3872 -1.600 1.1e-01
    age                     -0.0861     0.917   0.0336 -2.566 1.0e-02
    weight                   0.0235     1.024   0.0127  1.858 6.3e-02
    Likelihood ratio test=41.2 on 9 df, p=4.65e-06
    n= 203, number of events= 76

Next Time
More on constructing our hypothesis tests…