Lecture 11: cohort analysis (part 2): Survival analysis and Poisson regression Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II Department of Public Health Sciences Medical University of South Carolina Spring 2015
Survival analysis Basic form: Kaplan-Meier method Takes into account different follow-up for individuals, different risk sets at each event Can produce Kaplan-Meier survival curves Graphical method for comparing time-to-event between different exposure groups Step function, with jumps at the failure times
Kaplan-Meier method 5 “failures” (events) 10 censored before disease onset or study end 5 still alive and healthy at study end 25% observed disease rate (5/20; this underestimates the true rate) “Product-limit estimate” is a more accurate way to calculate the rate, taking into account who is still at risk when each event occurs 17 at risk when first event occurred; 14 at risk when 2nd event occurred, etc.
Kaplan-Meier method 1st event: 17 at risk 2nd event: 14 at risk 3rd event: 11 at risk 4th event: 9 at risk 5th event: 7 at risk Probability of avoiding “failure” during entire study period: (16/17) * (13/14) * (10/11) * (8/9) * (6/7) = 0.605 Probability of “failure” = (1 – 0.605) = 0.395
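The product-limit arithmetic above can be reproduced in a few lines. This is a minimal sketch using the risk-set sizes from the slide; it assumes exactly one failure at each event time (which the slide's fractions imply), and the variable names are illustrative only.

```python
from math import prod

# Number at risk just before each of the 5 events, as listed on the slide.
at_risk = [17, 14, 11, 9, 7]

# Assumption: exactly one failure at each event time.
events = [1] * len(at_risk)

# Conditional probability of surviving each event time: (n - d) / n.
conditional_survival = [(n - d) / n for n, d in zip(at_risk, events)]

# Product-limit estimate: multiply the conditional survival probabilities.
survival = prod(conditional_survival)
failure = 1 - survival

print(f"P(no failure) = {survival:.3f}")  # 0.605
print(f"P(failure)    = {failure:.3f}")   # 0.395
```

Each censored subject simply drops out of later risk sets, which is why the estimate (0.395) exceeds the crude 25%.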
Kaplan-Meier survival curve
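Kaplan-Meier method If there are no withdrawals (i.e. if there are no censored observations): product-limit survival estimate = [(# still event-free)/(# in cohort)], so estimated probability of failure = [(# of cases)/(# in cohort)] e.g. 3 cases in a cohort of 5: (4/5)*(3/4)*(2/3) = 2/5 event-free, and 1 – 2/5 = 3/5 = (# of cases)/(# in cohort)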
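The no-censoring case can be checked with exact fractions. This sketch assumes a cohort of 5 with 3 events occurring one at a time, matching the example product on the slide; the names are illustrative.

```python
from fractions import Fraction

cohort_size = 5  # no censoring: everyone is followed to an event or study end
n_cases = 3      # assumption: events occur one at a time

survival = Fraction(1)
at_risk = cohort_size
for _ in range(n_cases):
    # One failure among those still at risk, then the risk set shrinks by one.
    survival *= Fraction(at_risk - 1, at_risk)
    at_risk -= 1

print(survival)      # 2/5
print(1 - survival)  # 3/5
```

The intermediate terms cancel, leaving (survivors)/(cohort size) for survival and (cases)/(cohort size) for failure.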
Kaplan-Meier method Can be used to compare 2 exposure groups (chi-square statistic): e.g. 5-year survival Can be used to calculate stratified comparisons to account for possible confounding Can be used to compare more than 2 exposure groups
Survival analysis with models Regression models More flexible and powerful than Kaplan-Meier methods Can explicitly take into account time to event, rather than sequential risk sets with undefined length of time elapsed between events
Survival analysis Cox proportional hazards modeling Basic assumption: hazard of outcome in exposed group is a constant multiple of hazard of outcome in unexposed group
Survival analysis (constant HR)
Survival analysis Need to estimate B (constant multiplication factor determining risk in “exposed”) Convenient to define it as an exponential function (therefore we model the log hazard)
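In symbols, the two points above can be sketched as follows. The notation is assumed here rather than taken from the slide: h_0(t) is the hazard in the unexposed group and h_1(t) the hazard in the exposed group.

```latex
h_1(t) = B \, h_0(t), \qquad B = e^{\beta}
\quad\Longrightarrow\quad
\log h_1(t) = \log h_0(t) + \beta
```

Under this parameterization, β is the log hazard ratio and B = e^β is the hazard ratio, which is positive for any value of β.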
Survival analysis This gives Cox proportional hazards models similar characteristics to logistic regression models Multiplicative model; assumption of exponential risk increases
Survival analysis Model estimated in Cox proportional hazards regression:
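For reference, the standard form of the Cox model with k covariates is shown below. The symbols (h_0 for the baseline hazard, x_1 ... x_k for covariates, β_1 ... β_k for coefficients) are conventional notation assumed here.

```latex
h(t \mid x_1, \ldots, x_k) = h_0(t) \, \exp(\beta_1 x_1 + \cdots + \beta_k x_k)
\quad\Longleftrightarrow\quad
\log \frac{h(t \mid x_1, \ldots, x_k)}{h_0(t)} = \beta_1 x_1 + \cdots + \beta_k x_k
```

For a one-unit increase in x_j with the other covariates held fixed, the hazard ratio is e^{β_j}, and it does not depend on t.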
Cox proportional hazards model Notable differences with logistic regression: Model does not estimate value or shape of baseline hazard function Under the proportional hazards assumption, we can estimate betas even with baseline hazard function unspecified We may specify a baseline hazard function if desired (e.g. Weibull proportional hazards model)
Cox proportional hazards model Notable differences with logistic regression: If hazard ratio is not constant over time, the proportional hazards assumption is violated One option: stratify on follow-up time Alternative: fit an interaction term (main exposure * follow-up time)
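One common way to write the exposure-by-time interaction is sketched below. It assumes a single binary exposure x and a linear dependence on follow-up time t; other functions of time (e.g. log t) could be used in place of t.

```latex
h(t \mid x) = h_0(t) \, \exp(\beta_1 x + \beta_2 \, x \, t)
\quad\Longrightarrow\quad
\mathrm{HR}(t) = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = \exp(\beta_1 + \beta_2 t)
```

If β_2 = 0, the hazard ratio reduces to the constant e^{β_1}, so a test of β_2 = 0 is one check of the proportional hazards assumption.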
Poisson regression Used with person-time data Estimate adjusted relative rate
Poisson regression Similar to Cox model: Uses time-to-event data Different from Cox model: Does not assemble individual comparison group for each event at the time it occurs Rather, calculates rates (events per person-time) in different exposure-covariate strata to produce regression coefficients Produces true estimate of rate ratio
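The idea of computing rates per stratum and comparing them can be illustrated with a two-group example. The counts and person-years below are hypothetical, not from the lecture. With a single binary exposure and a log person-time offset, the exponentiated Poisson regression coefficient equals this crude rate ratio; the confidence interval uses the usual large-sample standard error for a log rate ratio.

```python
from math import exp, log, sqrt

# Hypothetical person-time data (illustrative values only).
strata = {
    "exposed":   {"events": 30, "person_years": 1000.0},
    "unexposed": {"events": 15, "person_years": 1500.0},
}

# Rate = events per unit of person-time in each stratum.
rates = {g: s["events"] / s["person_years"] for g, s in strata.items()}
rate_ratio = rates["exposed"] / rates["unexposed"]

# Log rate ratio; this is what the exposure coefficient estimates.
beta = log(rate_ratio)

# Large-sample standard error of the log rate ratio: sqrt(1/a + 1/b).
se = sqrt(1 / strata["exposed"]["events"] + 1 / strata["unexposed"]["events"])
ci = (exp(beta - 1.96 * se), exp(beta + 1.96 * se))

print(f"rates per person-year: {rates}")
print(f"rate ratio = {rate_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Adjusting for covariates requires fitting the full regression model; this sketch covers only the unadjusted comparison.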
Poisson regression Assumes outcome (count variable) has a Poisson distribution Count variable (integer) Minimum 0, no upper bound Mean and variance are equal Distribution shape depends on mean Skewed right for low mean (e.g. 1, 2) Approximately normal for higher mean (e.g. 10 or higher)
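These distributional properties can be checked numerically from the Poisson probability mass function. The sketch below truncates the sum at k = 100, an assumption that is harmless for the small means used here. The helper names are illustrative.

```python
from math import exp, factorial, sqrt

def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

def summarize(mean, k_max=100):
    """Return (mean, variance, skewness) computed from the truncated pmf."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, mean) for k in ks]
    m = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - m) ** 2 * p for k, p in zip(ks, probs))
    skew = sum((k - m) ** 3 * p for k, p in zip(ks, probs)) / var**1.5
    return m, var, skew

for mean in (1, 2, 10):
    m, var, skew = summarize(mean)
    print(f"mean={m:.3f}  variance={var:.3f}  skewness={skew:.3f}")
```

The computed variance matches the mean in each case, and the skewness equals 1/sqrt(mean), so it shrinks toward 0 (a more symmetric, normal-like shape) as the mean grows.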
Summary Poisson regression: Can obtain adjusted rates, rate ratios Requires person-years of follow-up to calculate denominator of rate Cox proportional hazards regression: Can obtain adjusted hazard ratio based on time-to-event Fair comparison of individuals at risk (i.e. better than logistic regression)