
Survival Analysis

Statistical methods for analyzing longitudinal data on the occurrence of events. Events may include death, injury, onset of illness, recovery from illness (binary or dichotomous variables), or transition above or below the clinical threshold of a meaningful continuous variable (e.g. CD4 counts). Accommodates data from randomized clinical trials and cohort studies.

Randomized Clinical Trial (RCT)
[Diagram: target population → disease-free, at-risk cohort → random assignment → intervention or control → disease or disease-free, followed over time]

Randomized Clinical Trial (RCT)
[Diagram: target population → patient population → random assignment → treatment or control → cured or not cured, followed over time]

Randomized Clinical Trial (RCT)
[Diagram: target population → patient population → random assignment → treatment or control → dead or alive, followed over time]

Survival analysis
Primary focus is "time-to-event" or "survival" time, e.g. time until death, time until recurrence, time until remission, or time until the CD4 count drops below a certain level.
Events may be "positive" or "negative", e.g. time until a tumor shrinks 20%, time until death, or time until an alcoholic relapses and begins drinking again.
Can use single and combined endpoints: time until death and time until CD4 count declines are single endpoints, while time until either CD4 count declines or death occurs is a combined endpoint.
Problem: the event of interest may never be observed!

Censoring
Most survival analyses must deal with a key problem called censoring. Censoring occurs when the event of interest is not observed, for whatever reason, so we do not know the exact "survival" time. There are generally three reasons why censoring occurs:
1) A person does not experience the event before the study ends.
2) A person is lost to follow-up during the study period.
3) A person "withdraws" from the study for whatever reason.

The incidence rate of death for renal replacement therapy (RRT) patients
[Figure: survival times of eight patients at risk of death on RRT. Each patient's line starts at the start of RRT and ends at death, recovery of renal function, loss to follow-up, death due to a competing cause, or the end of follow-up; each patient's status is marked as event or censored. The inclusion period was […], whereas follow-up ended on 31 December 2005.]
Example: Survival time on RRT, events and censored observations
Incident RRT patients in the ERA-EDTA Registry were included in an analysis of patient survival on RRT. As in most survival studies, patients were recruited over a period of time (the inclusion period) and were observed up to a specific date (31 December 2005, the end of the follow-up period). During this period the event of interest was "death while on RRT", whereas censoring took place at recovery of renal function, at loss to follow-up, and at 31 December 2005.
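The idea of events and censored observations contributing to an incidence rate can be sketched in a few lines of Python. The eight records below are hypothetical (the figure's actual values are not available in the transcript), and the function name `incidence_rate` is an illustrative choice. Censored patients add no events, but their follow-up time still counts in the person-time denominator.

```python
# Hypothetical follow-up records for eight RRT patients: (years followed, event flag).
# event = 1: death on RRT was observed; event = 0: the observation was censored
# (recovery of renal function, loss to follow-up, or end of follow-up).
patients = [
    (1.0, 1), (2.0, 0), (3.0, 1), (4.0, 0),
    (0.5, 1), (1.5, 0), (2.5, 1), (5.5, 0),
]

def incidence_rate(records):
    """Events divided by total person-time at risk (censored time included)."""
    events = sum(event for _, event in records)
    person_time = sum(years for years, _ in records)
    return events / person_time

rate = incidence_rate(patients)
print(f"Incidence rate: {rate:.2f} deaths per person-year")
```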

Assumptions related to censoring
At any time, patients who are censored have the same survival prospects as those who continue to be followed.
– This is sometimes problematic: e.g. when calculating survival on dialysis, censoring at the time of transplantation is needed because these patients are no longer at risk of death on dialysis. However, dialysis patients on the transplant waiting list do not have the same prospects as dialysis patients who are not on the waiting list.
Survival probabilities are assumed to be the same for subjects recruited early and late in the study.
– This may be tested by splitting the cohort into patients recruited early and patients recruited late, and checking whether their survival curves differ.

Kaplan Meier method
Used to estimate survival probabilities and to compare survival of different groups.
Observed survival times are first sorted in ascending order, starting with the patient with the shortest survival time, and presented in a table.
Example 2: Survival probability in RRT patients due to diabetes mellitus and other causes
In a sample of 50 RRT patients taken from a study on diabetes mellitus, survival time started running at the moment a patient was included in the study, in this case at the start of RRT. Patients were followed until death or censoring. The survival probability was calculated using the Kaplan Meier method. Subsequently, the survival of patients with ESRD due to diabetes mellitus was compared to the survival of those with ESRD due to other causes.

Kaplan Meier method
At the start of the study all 50 patients were alive: proportion surviving and cumulative survival were both 1.00.
When the first patient died on day 34 after the start of RRT, the proportion surviving was 49/50 = 0.98 = 98%. To calculate the cumulative survival, this proportion was multiplied by the cumulative survival of 1.00 from the previous step, so the cumulative survival dropped to 0.98.
When the second patient died on day 35, the proportion surviving was 48/49 ≈ 0.98. To obtain the cumulative survival at day 35, this proportion was again multiplied by the cumulative survival from the previous step, so the cumulative survival dropped that day to 0.96.
On day 57, however, a patient was withdrawn alive from the study (censored). The proportion surviving that day was 47/47 = 1.00, as this patient did not die but was withdrawn alive. As a result the cumulative survival did not drop that day but remained unchanged.
[Table: columns are Time in days; Number at risk; Deaths; Withdrawn alive (censored); Proportion surviving on this day; Cumulative survival; Cumulative mortality]
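The step-by-step calculation described above can be sketched as a small product-limit routine. This is a minimal illustration, not the authors' code: the function name `kaplan_meier` is hypothetical, and apart from the deaths on days 34 and 35 and the censoring on day 57, the data are assumptions. In particular, a third death on day 40 is invented so that 47 patients are at risk on day 57 as in the text, and the remaining 46 patients are censored at an arbitrary day 2000.

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns rows of (time, number at risk, deaths, censored, cumulative survival).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cumulative = 1.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = censored = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1] == 1:
                deaths += 1
            else:
                censored += 1
            i += 1
        # Proportion surviving this time point, multiplied into the running product.
        cumulative *= (n_at_risk - deaths) / n_at_risk
        rows.append((t, n_at_risk, deaths, censored, cumulative))
        # Both deaths and censored patients leave the risk set afterwards.
        n_at_risk -= deaths + censored
    return rows

# Days 34, 35 and 57 follow the text; day 40 and day 2000 are hypothetical.
times = [34, 35, 40, 57] + [2000] * 46
events = [1, 1, 1, 0] + [0] * 46

rows = kaplan_meier(times, events)
for t, n, d, c, s in rows:
    print(f"day {t:>4}: at risk {n:>2}, deaths {d}, censored {c}, cumulative survival {s:.2f}")
```

Note how the censoring on day 57 shrinks the risk set without changing the cumulative survival.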

Kaplan Meier method
Cumulative survival is the probability of surviving the next period multiplied by the probability of having survived all previous periods.
All subjects at risk, including those who do not experience the event during the observation period, contribute survival time to the denominator of the incidence rate.
Censoring reduces the number of persons at risk without affecting the cumulative survival.

Kaplan Meier method
The median survival is the point in time, measured from the time of inclusion, at which the cumulative survival drops below 50%; in this case it is 1708 days.
It is not related to the number of deaths or to the number of subjects still at risk.
Why mean survival is used less frequently:
– Survival data are mostly highly skewed.
– With censoring, one does not know if and when the person will experience the event, which complicates the calculation of the mean.
– To calculate a mean survival one would need to wait until all persons had experienced the event.
[Table: columns are Time in days; Number at risk; Deaths; Withdrawn alive (censored); Proportion surviving on this day; Cumulative survival; Cumulative mortality]
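Reading the median off a Kaplan Meier curve can be sketched as follows. The curve values are hypothetical except for the 1708-day crossing point quoted in the text, and `median_survival` is an illustrative name. The sketch uses the common convention of the first time at which cumulative survival is at or below 0.5, and returns `None` when heavy censoring keeps the curve above 50%.

```python
def median_survival(curve, threshold=0.5):
    """Return the first time at which cumulative survival is at or below threshold.

    curve: list of (time, cumulative survival) pairs in ascending time order.
    Returns None if the curve never reaches the threshold.
    """
    for time, survival in curve:
        if survival <= threshold:
            return time
    return None

# Hypothetical curve; only the crossing at day 1708 comes from the text.
curve = [(34, 0.98), (35, 0.96), (400, 0.80), (900, 0.62), (1708, 0.49), (2100, 0.41)]

print(median_survival(curve))
```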

Log-rank Test
Most popular method of comparing the survival of groups.
Takes the whole follow-up period into account.
Tests the null hypothesis that there is no difference between the populations being studied in the probability of an event at any time point.
Result shown for the example: P = 0.04

Example: Remission time of acute leukemia
Purpose: evaluate the drug's ability to maintain remissions
Patients randomly assigned
Study terminated after 1 year
Different follow-up times due to sequential enrollment
6-MP: 6,6,6,7,10,22,23,6+,9+,10+,11+,17+,19+,20+,25+,32+,32+,34+,35+
Placebo: 1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23
(A "+" marks a censored time.)

Example: Remission time of acute leukemia
6-MP (Group = 1): 6,6,6,6+,7,9+,10,10+,11+,17+,19+,20+,22,23,25+,32+,32+,34+,35+
Placebo (Group = 2): 1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23
In JMP, a censoring column is used (1 denotes censored times, 0 non-censored), e.g. for Group 1 the first 8 observations are 6, 6, 6, 6+, 7, 9+, 10, 10+.
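As a small illustration of the coding described above, the "+" notation can be converted into a time column and a censoring column. This is a Python sketch rather than JMP itself, and `parse_times` is a hypothetical helper; it follows the slide's convention of 1 for censored and 0 for non-censored.

```python
def parse_times(text):
    """Split '6,6+,7' style input into times and a censoring indicator.

    The indicator follows the slide's convention: 1 = censored, 0 = event observed.
    """
    times, censor = [], []
    for token in text.split(","):
        token = token.strip()
        censor.append(1 if token.endswith("+") else 0)
        times.append(int(token.rstrip("+")))
    return times, censor

six_mp = "6,6,6,6+,7,9+,10,10+,11+,17+,19+,20+,22,23,25+,32+,32+,34+,35+"
times, censor = parse_times(six_mp)

# First 8 observations for Group 1 as (time, censor) pairs.
print(list(zip(times[:8], censor[:8])))
```

Many other tools use the opposite convention (1 = event observed), so the indicator usually needs checking before data are moved between packages.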

Example: Remission time of acute leukemia
[Plot legend: Group 1 = 6-MP, Group 2 = Placebo]
We can clearly see that the time in remission (the "survival" time) is longer for the treatment (6-MP) group than for the control group. The log-rank and Wilcoxon tests comparing the "survival" experience of the two groups indicate that a statistically significant difference exists (p < .0001).
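For readers who want to see what the log-rank comparison does under the hood, here is a minimal pure-Python sketch of the standard two-group log-rank statistic, applied to the remission times exactly as listed in this transcript. It is not the JMP computation, the function name `logrank_test` is illustrative, and here the event flag is 1 for an observed relapse and 0 for a censored time (the reverse of the JMP censoring column). The p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # Numbers still at risk just before time t in each group.
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        # Events occurring exactly at time t in each group.
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e == 1)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e == 1)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the margins.
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)

    chi2 = (observed1 - expected1) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Remission times as listed in the transcript; event = 1 (relapse), 0 (censored).
six_mp_times = [6, 6, 6, 6, 7, 9, 10, 10, 11, 17, 19, 20, 22, 23, 25, 32, 32, 34, 35]
six_mp_events = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
placebo_times = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
placebo_events = [1] * len(placebo_times)

chi2, p_value = logrank_test(six_mp_times, six_mp_events, placebo_times, placebo_events)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
```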

Retrospective cohort study: From December 2003 BMJ: Aspirin, ibuprofen, and mortality after myocardial infarction: retrospective cohort study

What the Kaplan Meier method and the log-rank test can and cannot do…
Together the Kaplan Meier method and the log-rank test provide an opportunity to:
– Estimate survival probabilities, and
– Compare survival between groups
However:
– One cannot adjust for confounding variables, i.e. no multivariate analysis is possible
– They do not provide an estimate of the effect size and the related confidence interval
→ In those cases one needs a regression technique like the Cox proportional hazards model (Cox PH Model)

Cox Proportional Hazard Model Before we can talk about the Cox PH model we need to consider some characteristics and terminology associated with survival time distributions. Here survival times might be time until death, but these times can also represent other outcomes such as time until remission, time until relapse, etc.

Introduction to survival distributions
T_i, the event time for individual i, is a random variable having a probability distribution. Different models for survival data are distinguished by different choices for the distribution of T_i.

Describing Survival Distributions
The idea is this: assume that times-to-event for individuals in your dataset follow a continuous probability distribution (typically a right-skewed distribution, generally not normal!). For every possible time T_i after baseline, there is a certain probability that an individual will have an event at exactly time T_i. For example, human beings have a certain probability of dying at ages 3, 25, 80, and 140: P(T=3), P(T=25), P(T=80), and P(T=140). These probabilities are obviously vastly different.

Probability density function: f(t)
In the case of human longevity, T_i is unlikely to follow a normal distribution, because the probability of death is not highest in middle age, but at the beginning and end of life.
Hypothetical data: people have a high chance of dying in their 70s and 80s, BUT they have a smaller chance of dying in their 90s and 100s, because few people live long enough to die at these ages.

Probability density function: f(t)
Shows how failure times are distributed. If we had no censoring, a histogram of the survival times of, say, ESRD patients would give us an impression of what the probability density function, f(t), looks like. The smoothed curve added to the histogram is a visualization of f(t) based upon a sample of patients with ESRD.

Survival function: 1 - F(t)
The goal of survival analysis is to estimate and compare survival experiences of different groups. Survival experience is described by the cumulative survival function: S(t) = 1 - F(t) = P(T > t), where F(t) is the CDF of f(t) and is "more interesting" than f(t).
Example: if t = 100 years, S(100) = S(t=100) is the probability of surviving beyond 100 years.

Cumulative Survival
Same hypothetical data, plotted as cumulative distribution rather than density. [Figure: cumulative survival curve, shown alongside the earlier plot of f(t)]

Cumulative survival, S(t) = P(T > t)
[Figure annotations: S(20) = P(T > 20) and S(80) = P(T > 80)]
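When there is no censoring, S(t) = P(T > t) can be estimated simply as the fraction of subjects whose event time exceeds t. The sketch below uses a hypothetical list of ages at death and an illustrative function name, `empirical_survival`; with censored data the Kaplan Meier estimate would be needed instead.

```python
def empirical_survival(event_times, t):
    """Fraction of subjects with event time strictly greater than t (no censoring)."""
    return sum(1 for x in event_times if x > t) / len(event_times)

# Hypothetical ages at death for twelve people, all fully observed.
ages_at_death = [2, 45, 62, 71, 74, 78, 81, 83, 85, 88, 90, 96]

s20 = empirical_survival(ages_at_death, 20)   # estimate of S(20) = P(T > 20)
s80 = empirical_survival(ages_at_death, 80)   # estimate of S(80) = P(T > 80)
f80 = 1 - s80                                 # cumulative failure F(80) = 1 - S(80)

print(f"S(20) = {s20:.3f}, S(80) = {s80:.3f}, F(80) = {f80:.3f}")
```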

Hazard Function h(t): a new concept
Hazard rate is an instantaneous incidence rate. Think of it like the rate of change of your chance of dying, like a speedometer on a car racing towards death. [Figure: hazard plotted against age]

Hazard function h(t)
In words: given that you survive to time t, the probability per unit time that you will succumb to the event in the next instant.
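The slide's own formula did not survive in the transcript. The standard textbook definition that matches the verbal description, written with the f(t) and S(t) notation used elsewhere in this deck, is given here as a reference sketch:

```latex
h(t) \;=\; \lim_{\Delta t \to 0} \frac{P\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
      \;=\; \frac{f(t)}{S(t)}
```

The second equality is what separates the hazard from the density: f(t) is scaled up by 1/S(t) because only those still event-free at time t remain at risk.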

Hazard h(t) vs. Density f(t)
This is subtle, but the idea is:
When you are born, you have a certain probability of dying at any age; that's the probability density.
– Example: a woman born today has, say, a 1% chance of dying at 80 years.
However, as you survive for a while, your probabilities keep changing (think: conditional probability).
– Example: a woman who is 79 today has, say, a 5% chance of dying at 80 years.
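The two example percentages are linked by conditional probability. The sketch below treats age in whole years and assumes, purely for illustration, that 20% of women survive to age 79; that survival figure is not in the slides, but it is the value that makes the 1% and 5% examples consistent with each other.

```python
# Unconditional probability, seen from birth, of dying at age 80 (from the slide).
p_die_at_80 = 0.01

# Assumed probability of being alive at age 79 (hypothetical, not from the slide).
p_alive_at_79 = 0.20

# Conditional probability of dying at 80 given survival to 79:
# P(die at 80 | alive at 79) = P(die at 80) / P(alive at 79).
p_die_at_80_given_alive_at_79 = p_die_at_80 / p_alive_at_79

print(f"{p_die_at_80_given_alive_at_79:.0%}")
```

This mirrors the relationship between the hazard and the density: the hazard is the density divided by the probability of having survived that long.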

A possible set of probability density, failure, survival, and hazard functions.
F(t) = cumulative failure = P(T ≤ t)
S(t) = cumulative survival = P(T > t)
h(t) = hazard function
f(t) = density function

Cox Proportional Hazards Model
Model for the hazard function as a function of covariates/predictors/independent variables. The interpretation of the estimated coefficients in the model is similar to that of the coefficients in a logistic regression model.
Logistic Regression → Odds Ratios (OR)
Cox PH Model → Hazard Ratio (HR)
In order to understand the distinction between ORs and HRs we need to discuss the difference between incidence rates and proportions.

Incidence Rate vs. Proportion
Incidence (hazard) rate: number of new cases of disease per population at risk per unit time (or mortality rate, if the outcome is death).
Cumulative incidence: proportion of the population at risk that develops the disease in a given time period.
Hazard or rate ratio (HR) is the ratio of incidence rates.
Risk ratio (RR) is the ratio of proportions (cumulative incidences); the odds ratio (OR) is the ratio of the corresponding odds.
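To make the distinction concrete, the following sketch computes a rate ratio, a risk ratio, and an odds ratio from the same cohort. All counts and person-years are hypothetical, and the helper names are illustrative. The three measures differ because the rate uses person-time in the denominator, while risk and odds use counts of people.

```python
# Hypothetical cohort: cases, people at risk at baseline, person-years of follow-up.
exposed = {"cases": 30, "people": 100, "person_years": 400.0}
unexposed = {"cases": 15, "people": 100, "person_years": 450.0}

def rate(group):
    """Incidence rate: new cases per person-year at risk."""
    return group["cases"] / group["person_years"]

def risk(group):
    """Cumulative incidence: proportion of people who became cases."""
    return group["cases"] / group["people"]

def odds(group):
    """Odds of becoming a case: cases divided by non-cases."""
    return group["cases"] / (group["people"] - group["cases"])

rate_ratio = rate(exposed) / rate(unexposed)
risk_ratio = risk(exposed) / risk(unexposed)
odds_ratio = odds(exposed) / odds(unexposed)

print(f"rate ratio = {rate_ratio:.2f}, risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```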

Cox Proportional Hazards Model
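The equation on this slide was an image and is missing from the transcript. For reference, the Cox proportional hazards model is usually written as below; the symbols h_0(t) for the baseline hazard, x_1, …, x_p for the covariates and β_1, …, β_p for the coefficients are the conventional ones and are assumed here rather than taken from the slide.

```latex
h(t \mid x_1, \dots, x_p) \;=\; h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right)
```

The baseline hazard h_0(t) is left unspecified, and the covariates act multiplicatively on it, which is where the name "proportional hazards" comes from.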

Hazard Ratio (HR)
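The slide's formula is not in the transcript. Assuming the usual Cox form h(t | x) = h_0(t) exp(βx) with a single covariate x (notation assumed, not taken from the slide), the hazard ratio comparing covariate value x* with value x works out as:

```latex
\mathrm{HR} \;=\; \frac{h(t \mid x^{*})}{h(t \mid x)}
            \;=\; \frac{h_0(t)\,e^{\beta x^{*}}}{h_0(t)\,e^{\beta x}}
            \;=\; e^{\beta\,(x^{*} - x)}
```

The baseline hazard cancels, so the ratio does not depend on t. For a dichotomous covariate coded 0/1 this gives HR = e^β, and for a continuous covariate a c-unit increase gives HR = e^{cβ}.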

Hazard Ratio (HR) – for dichotomous covariates

Hazard Ratio (HR) – for continuous covariates
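As a numerical companion to the hazard ratio slides, the sketch below turns a Cox regression coefficient into a hazard ratio for a dichotomous covariate and for a multi-unit change in a continuous covariate. The coefficients and standard error are hypothetical, the function names are illustrative, and the confidence interval is a Wald-type interval (coefficient ± 1.96 standard errors, then exponentiated), which is an assumption about how such intervals are commonly computed rather than something stated in the slides.

```python
import math

def hazard_ratio(beta, delta=1.0):
    """Hazard ratio for a delta-unit increase in a covariate with coefficient beta."""
    return math.exp(beta * delta)

def hazard_ratio_ci(beta, se, delta=1.0, z=1.96):
    """Wald-type confidence interval for the hazard ratio (approximately 95% for z = 1.96)."""
    lower = math.exp(delta * (beta - z * se))
    upper = math.exp(delta * (beta + z * se))
    return lower, upper

# Dichotomous covariate coded 0/1 (hypothetical coefficient and standard error).
beta_group, se_group = 0.70, 0.20
hr_group = hazard_ratio(beta_group)
ci_group = hazard_ratio_ci(beta_group, se_group)

# Continuous covariate: hazard ratio for a 10-unit increase (hypothetical coefficient).
beta_age = 0.05
hr_age_10 = hazard_ratio(beta_age, delta=10)

print(f"group HR = {hr_group:.2f}, CI = ({ci_group[0]:.2f}, {ci_group[1]:.2f})")
print(f"HR per 10 units = {hr_age_10:.2f}")
```

An interval that contains 1 means the data are compatible with no difference in hazard between the groups being compared.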

Example: Remission time for acute leukemia

The estimated HR associated with not receiving the 6-MP therapy is 4.02 with a CI (1.698, ) and the estimated HR associated with doubling the WBC is 4.92 with a CI (2.65, 9.73).

Example: Remission time for acute leukemia
The estimated HR for males vs. females is 1.30; however, the CI includes 1, so we cannot say there is an increased risk of recurrence for males. This is further supported by the p-value = .5596.

Summary of Survival Analysis
Survival analysis involves making inferences about the time until an event occurs.
Due to the prospective nature of these studies there are frequently censored time observations.
The Kaplan-Meier Method allows us to describe both visually and numerically the survival experience of subjects in our study.
The log-rank test allows us to compare the survival experience of subjects across treatment groups.
The Cox Proportional Hazards Model allows us to examine the relationship between the survival experience of subjects and covariates that might be related to their survival, or to look at group/treatment differences adjusted for other covariates.