EPI 5344: Survival Analysis in Epidemiology. Introduction to concepts and basic methods (slides 01/2014). February 25, 2014. Dr. N. Birkett, Department of Epidemiology.


EPI 5344: Survival Analysis in Epidemiology
Introduction to concepts and basic methods
February 25, 2014
Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

Survival concepts (1)
Cohort studies
  – Follow up a pre-defined group of people for a period of time, which can be:
      – the same time for everyone
      – different times for different people
  – Determine which people achieve a specified outcome.
  – Outcomes could be many different things, such as:
      – Death (any cause or cause-specific)
      – Onset of new disease
      – Resumption of smoking in someone who had quit
      – Recidivism for drug use or criminal activity
      – Change in a numerical measure such as blood pressure (longitudinal data analysis)

Survival concepts (2)
Cohort studies
  – Traditional approach to cohorts assumes everyone is followed for the same time
      – incidence proportion
      – logistic regression modeling
  – If follow-up time varies, what do you do with subjects who don’t make it to the end of the study?
      – Censoring
  – Cohort studies can provide more information than presence/absence of outcome.
      – Time when outcome occurred
      – Type of outcome (competing outcomes)
  – Can look at rate or speed of development of outcome
      – incidence rate
      – person-time

Survival concepts (3)
Time to event analysis
  – Survival Analysis (general term)
  – Life tables
  – Kaplan-Meier curves
  – Actuarial methods
  – Log-rank test
  – Cox modeling (proportional hazards)
Strong link to engineering
  – Failure time studies

Survival concepts (4)
Analysis of cohort studies (from epidemiology)
  – Incidence proportion (cumulative incidence)
      – Select a point in time as the end of follow-up.
      – Compare groups using t-test, CIR (RR)
      – Issues include:
          – What point in time to use?
          – What if not all subjects remain under follow-up that long?
          – Ignores information from subjects who don’t get outcome or reach the time point
          – What is incidence proportion for the outcome ‘death’ if we set the follow-up time to 200 years? It will always be 100%.

Survival concepts (5)
Analysis of cohort studies (from epidemiology)
  – Incidence rate (density)
      – Based on person-time of follow-up
      – Can include information on drop-outs, etc.
      – Closely linked to survival analysis methods

Survival concepts (6)
Cumulative incidence
  – The probability of becoming ill over a pre-defined period of time.
  – No units
  – Range: 0 to 1
Incidence density (rate)
  – The rate at which people get ill during person-time of follow-up
      – Units: 1/time or cases/person-time
      – Range: 0 to +∞
  – Very closely related to hazard rate.
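A minimal sketch of the two measures, written in Python rather than SAS so it can be run anywhere. The ten-person cohort below is hypothetical and is not course data; the function names are illustrative only.

```python
# Hypothetical cohort: (years of follow-up, 1 if the person became ill, else 0).
# Everyone without the outcome was followed for the full 5 years (no losses).
cohort = [(5.0, 0), (2.0, 1), (5.0, 0), (3.5, 1), (5.0, 0),
          (1.0, 1), (5.0, 0), (5.0, 0), (4.5, 1), (5.0, 0)]

def cumulative_incidence(records):
    """Proportion who became ill over the period: no units, range 0 to 1."""
    cases = sum(event for _, event in records)
    return cases / len(records)

def incidence_density(records):
    """Cases per unit of person-time: units 1/time, range 0 to +infinity."""
    cases = sum(event for _, event in records)
    person_time = sum(years for years, _ in records)
    return cases / person_time

ci = cumulative_incidence(cohort)
rate = incidence_density(cohort)
print(f"Cumulative incidence over 5 years: {ci:.2f}")
print(f"Incidence density: {rate:.4f} cases per person-year")
```

The cumulative incidence only makes sense here because nobody was lost; the incidence density would still be computable if some people had shorter follow-up.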

Measuring Time (1)
Need to consider:
  – Units to use to measure time
      – Normally, years/months/days
      – Time of events is usually measured as ‘calendar time’
      – Other measures are possible (e.g. hours)
  – ‘Scale’ to be used
      – time on study
      – age
      – calendar date
  – Time ‘0’ (‘origin of time’)
      – The point when time starts

Time Scale (1)
Time of events is usually measured as ‘calendar time’
Can be represented by ‘time lines’ in a graph
Conceptual idea used in analyses
  – Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
  – Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
  – Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
  – Patient #4 enters on July 13, 2001 & dies on December 12, 2002

[Figure: time lines for Patients #1 to #4; D = death, C = censored]

Time Scale (2)
In survival analysis, focus is commonly on ‘study time’
  – How long after a patient starts follow-up do their events occur?
  – Particularly common choice for RCTs
  – Need to define a ‘time 0’, the point when study time starts accumulating for each patient.
Most epidemiologists recommend using ‘age’ as the time scale for etiological studies
  – We’ll focus on time since a defining event, but remember this for the future.

Origin of Time (1)
Choice of time ‘0’ affects analysis
  – can produce very different regression coefficients and model fit
Preferred origin is often unavailable
More than one origin may make sense
  – no clear criterion to choose which to use

Time ‘0’ (2)
No best time ‘0’ for all situations
  – Depends on study objectives and design
RCT of a treatment (Rx)
  – ‘0’ = date of randomization
Prognostic study
  – ‘0’ = date of disease onset
  – Inception cohort
  – Often use: date of disease diagnosis

Time ‘0’ (3)
‘Point source’ exposure
Date of event
  – Hiroshima atomic bomb
  – Dioxin spill, Seveso, Italy

Time ‘0’ (4)
Chronic exposure
  – Date of study entry
  – Date of first exposure
  – Age (preferred origin/time scale)
Issues
  – There often is no first exposure (or no clear date of 1st exposure)
  – Recruitment long after 1st exposure
      – Immortal person-time
      – Lack of info on early events.
  – ‘Attained age’ as time scale

Time ‘0’ (5)
Calendar time can be very important
  – studies of incidence/mortality trends
In survival analysis, focus is on ‘study time’
  – How long after a patient starts follow-up do their events occur?
Need to change time lines to reflect new time scale
  – Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
  – Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
  – Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
  – Patient #4 enters on July 13, 2001 & dies on December 12, 2002

[Figures: time lines for Patients #1 to #4 redrawn for the change of time scale; D = death, C = censored]

[Figure: Study course for patients in cohort]


Time ‘0’ (5)
Can be interested in more than one ‘event’ and thus more than one ‘time to event’
An example
  – Patients treated for malignant melanoma
  – Treated with ‘A’ or ‘B’
  – Expected to influence both time to relapse and survival

Time ‘0’ (6)
Some studies have more than one outcome event
Let’s use this to illustrate SAS code to compute time-to-event.
Four time points:
  – Date of surgery: time ‘0’
  – Relapse
  – Death
  – Last follow-up (if still alive without relapse)
Event #1: earliest of relapse/death/end
Event #2: earliest of death/end

Time ‘0’
How do we compute the ‘time on study’ for each of these events?
  – Convert to days (weeks, months, years) from time ‘0’ for each person
  – SAS reads date data using a ‘date format’ and stores each date as the number of days since Jan 1, 1960.
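As an illustration of the conversion, here is a small Python sketch (Python is used for all added examples; it is not the course's SAS code). It uses Patient #1 from the time-line example. The helper names `sas_date_value` and `time_on_study` are hypothetical, and the 30.4 days-per-month divisor is the one used in the SAS step in this deck.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS stores a date as days since this day

def sas_date_value(d):
    """Number of days from Jan 1, 1960 to d (the value SAS would store)."""
    return (d - SAS_EPOCH).days

def time_on_study(origin, end, days_per_unit=1.0):
    """Elapsed time from time '0' to the end date, in the chosen unit."""
    return (end - origin).days / days_per_unit

# Patient #1: enters on Feb 15, 2000 and dies on Nov 8, 2000.
entry = date(2000, 2, 15)
death = date(2000, 11, 8)

days = time_on_study(entry, death)          # time on study in days
months = time_on_study(entry, death, 30.4)  # in months, 30.4 days per month
print(sas_date_value(entry), sas_date_value(death), days, round(months, 2))
```

Subtracting the two stored day counts gives the same answer as subtracting the dates directly, which is why the SAS step can simply take differences of date variables.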


SAS code to create event variables

    data melanoma;
      set melanoma;
      /* dfsevent = 1 if the patient relapsed or died, 0 otherwise */
      dfsevent = 1 - (date_of_relapse = .)*(date_of_death = .);
      /* survevent = 1 if the patient died, 0 if alive at the end of follow-up */
      survevent = (date_of_death ne .);
      /* Times are in months (30.4 days per month) from the date of surgery */
      if (survevent = 0) then survtime = (date_of_last - date_of_surg)/30.4;
      else survtime = (date_of_death - date_of_surg)/30.4;
      if (dfsevent = 0) then dfstime = (date_of_last - date_of_surg)/30.4;
      else if (date_of_relapse ne .) then dfstime = (date_of_relapse - date_of_surg)/30.4;
      else if (date_of_relapse = . and date_of_death ne .) then dfstime = (date_of_death - date_of_surg)/30.4;
      else dfstime = .E;
    run;
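For readers without SAS, the same logic can be sketched in Python. This is an illustrative translation, not the course's code: the function name `event_variables` and the two example patients are hypothetical, and a missing date is represented by `None` instead of the SAS missing value.

```python
from datetime import date

DAYS_PER_MONTH = 30.4  # same divisor as the SAS step

def event_variables(surg, relapse, death, last):
    """Derive disease-free and overall survival variables for one patient.

    relapse and death are None when the event has not occurred;
    last is the date of last follow-up.
    """
    surv_event = 1 if death is not None else 0
    dfs_event = 1 if (relapse is not None or death is not None) else 0

    # Overall survival ends at death, otherwise at last follow-up.
    surv_end = death if surv_event else last
    # Disease-free survival ends at relapse if there was one,
    # otherwise at death, otherwise at last follow-up.
    if relapse is not None:
        dfs_end = relapse
    elif death is not None:
        dfs_end = death
    else:
        dfs_end = last

    return {
        "dfs_event": dfs_event,
        "dfs_time": (dfs_end - surg).days / DAYS_PER_MONTH,
        "surv_event": surv_event,
        "surv_time": (surv_end - surg).days / DAYS_PER_MONTH,
    }

# Hypothetical patient A: relapsed, then died.
a = event_variables(date(2010, 1, 1), date(2010, 7, 1),
                    date(2011, 1, 1), date(2011, 1, 1))
# Hypothetical patient B: alive without relapse at last follow-up.
b = event_variables(date(2010, 1, 1), None, None, date(2012, 1, 1))
print(a)
print(b)
```

Like the SAS step, the sketch takes the relapse date whenever one is recorded, which assumes relapse cannot be dated after death.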


Survival curve (1)
What can we do with data which includes time-to-event?
Might be nice to see a picture of the number of people surviving from the start to the end of follow-up.

Sample Data: Mortality, no losses
[Table: Year; # still alive; # dying in the year]

[Figure] Not the right axis for a survival curve

Survival curve (2)
Previous graph has a problem
  – What if some people were lost to follow-up?
  – Plotting the number of people still alive would effectively say that the lost people had all died.

Sample Data: Mortality
[Table: Year; # still alive; # dying in the year; Lost to follow-up]


Survival curve (2), continued
Instead
  – A true survival curve plots the probability of surviving.


Survival Curves (1)
Primary outcome is ‘time to event’
Also need to know ‘type of event’

Person | Type  | Time
1      | Death | 100
2      | Alive | 200
3      | Lost  | 150
4      | Death | 65
And so on

Survival Curves (2)
Censored
  – People who do not have the targeted outcome (e.g. death)
For now, assume no censoring
How do we represent the ‘time’ data in a statistical method?
  – Histogram of death times: f(t)
  – Survival curve: S(t)
  – Hazard curve: h(t)
To know one is to know them all

Histogram of death time
  – Skewed to right
  – pdf or f(t)
  – CDF or F(t)
      – Area under ‘pdf’ from ‘0’ to ‘t’
[Figure: histogram of death times with the area F(t) up to time t]

Survival curves (3)
Plot % of group still alive (or % dead)
S(t) = survival curve = % still surviving at time ‘t’ = P(survive to time ‘t’)
Mortality rate = 1 – S(t) = F(t) = cumulative incidence
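With no censoring, S(t) and F(t) can be read directly off the observed death times. The Python sketch below uses ten hypothetical death times (not course data) and treats S(t) as the proportion whose death time is strictly greater than t, which is one common convention.

```python
# Hypothetical death times (e.g. in months) for 10 people, no censoring.
death_times = [2, 3, 3, 5, 8, 8, 8, 12, 15, 20]

def survival(t, times):
    """S(t): proportion of the group still alive just after time t."""
    return sum(1 for x in times if x > t) / len(times)

def cumulative_incidence(t, times):
    """F(t) = 1 - S(t): proportion who have died by time t."""
    return 1.0 - survival(t, times)

for t in (0, 3, 8, 20):
    print(t, survival(t, death_times), cumulative_incidence(t, death_times))
```

Once some subjects are censored this simple proportion is no longer valid, which is the reason for the life-table and Kaplan-Meier methods named earlier.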

[Figure: deaths CI(t) = 1 – S(t) and survival S(t), plotted against t]

‘Rate’ of dying
Consider these 2 survival curves
Which has the better survival profile?
  – Both have S(3) = 0


Survival curves (4)
Most people would prefer to be in group ‘A’ rather than group ‘B’.
  – Death rate is lower in first two years.
  – Will live longer than in population ‘B’
Concept is called:
  – Hazard: survival analysis/stats
  – Force of mortality: demography
  – Incidence rate/density: epidemiology
DEFINITION
  – h(t) = rate of dying at time ‘t’ GIVEN that you have survived to time ‘t’
  – Similar to asking the speed of your car given that you are two hours into a five hour trip from Ottawa to Toronto
Slight detour and then back to main theme

Survival Curves (5)
Conditional Probability
h(t0) = rate of failing at ‘t0’ conditional on surviving to t0
Requires the ‘conditional survival curve’: S*(t) = S(t) / S(t0), for t ≥ t0
Essentially, you are re-scaling S(t) so that S*(t0) = 1.0

[Figure: survival curve with S(t0) marked at time t0]

S*(t) = survival curve conditional on surviving to ‘t0’
CI*(t) = failure/death/cumulative incidence at ‘t’ conditional on surviving to ‘t0’
Hazard at t0 is defined as: ‘the slope of CI*(t) at t0’
Also called:
  – Hazard (instantaneous)
  – Force of mortality
  – Incidence rate
  – Incidence density
Range: 0 to ∞
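The definition of the hazard as a slope can be written out as a short derivation. This is a sketch under two assumptions that the slides do not state explicitly: that the conditional curve is S*(t) = S(t)/S(t0), which is what re-scaling so that S*(t0) = 1 implies, and that S(t) is differentiable with density f(t) = -S'(t).

```latex
\begin{align*}
S^{*}(t)  &= \frac{S(t)}{S(t_0)}, \qquad t \ge t_0
  && \text{(conditional survival, so } S^{*}(t_0) = 1\text{)} \\
CI^{*}(t) &= 1 - S^{*}(t)
  && \text{(conditional cumulative incidence)} \\
h(t_0)    &= \left.\frac{d}{dt}\, CI^{*}(t)\right|_{t = t_0}
           = -\frac{S'(t_0)}{S(t_0)}
           = \frac{f(t_0)}{S(t_0)}
  && \text{(slope of } CI^{*} \text{ at } t_0\text{)}
\end{align*}
```

Because S(t0) can be much smaller than 1, the hazard is not bounded by 1, which is consistent with the stated range of 0 to ∞.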

Some relationships
If the rate of disease is small: CI(t) ≈ H(t), where H(t) is the cumulative hazard
If we assume h(t) is constant (= ID): CI(t) ≈ ID*t
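A quick numerical check of the approximation, sketched in Python. It relies on the standard identity CI(t) = 1 - exp(-H(t)), with H(t) = ID*t when the hazard is constant; the rates and the 5-year follow-up below are hypothetical values chosen for illustration.

```python
import math

def ci_exact(rate, t):
    """Cumulative incidence under a constant hazard: 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)

def ci_approx(rate, t):
    """Rare-disease approximation: CI(t) is roughly ID * t."""
    return rate * t

for rate in (0.001, 0.01, 0.1, 0.3):
    exact = ci_exact(rate, 5)
    approx = ci_approx(rate, 5)
    print(f"ID = {rate}: exact = {exact:.4f}, approx = {approx:.4f}")
```

The two values agree closely when ID*t is small and drift apart as it grows; at ID = 0.3 over 5 years the approximation exceeds 1, which is impossible for a probability.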

Some survival functions (1)
Exponential
  – h(t) = λ
  – S(t) = exp(–λt)
Underlies most of the ‘standard’ epidemiological formulae.
Assumes that the hazard is constant over time
  – Big assumption which is not usually true


Some survival functions (2)
Weibull
  – h(t) = λ γ t^(γ–1)
  – S(t) = exp(–λ t^γ)
Allows fitting a broader range of hazard functions
Assumes hazard is monotonic
  – Always increasing (or decreasing)
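The Weibull formulas can be explored with a short Python sketch. The parameter names `lam` and `gamma` stand for λ and γ, and the numeric values are hypothetical. Setting γ = 1 recovers the exponential model with constant hazard λ.

```python
import math

def weibull_hazard(t, lam, gamma):
    """h(t) = lam * gamma * t**(gamma - 1), for t > 0."""
    return lam * gamma * t ** (gamma - 1)

def weibull_survival(t, lam, gamma):
    """S(t) = exp(-lam * t**gamma)."""
    return math.exp(-lam * t ** gamma)

times = [0.5, 1.0, 2.0, 4.0]
for gamma in (0.5, 1.0, 2.0):
    hazards = [round(weibull_hazard(t, 0.2, gamma), 4) for t in times]
    print(f"gamma = {gamma}: hazards at {times} = {hazards}")
```

The printed hazards decrease with time for γ < 1, stay constant for γ = 1, and increase for γ > 1, which is the monotonic behaviour described above.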


[Figure: Hazard curves (2)]

[Figure: Hazard curves (3)]

Some survival functions (3)
All these functions assume that everyone eventually gets the outcome event.
Suppose this isn’t true:
  – Cures occur
  – Immunity
Mixture models
  – S(t) = (1 – π) exp(–λt) + π
  – S(t) → π as t → ∞
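A small Python sketch of the mixture formula shows the curve levelling off at π instead of falling to zero. The values of λ and π are hypothetical, and the exponential component is just the one written on the slide.

```python
import math

def mixture_survival(t, lam, pi):
    """S(t) = (1 - pi) * exp(-lam * t) + pi, where pi is the cured fraction."""
    return (1.0 - pi) * math.exp(-lam * t) + pi

lam, pi = 0.2, 0.3  # hypothetical hazard among the non-cured and cured fraction
for t in (0, 5, 20, 100):
    print(t, round(mixture_survival(t, lam, pi), 4))
```

At t = 0 everyone is alive, and for large t the survival probability approaches the cured fraction of 0.3.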

Some survival functions (4)
Piece-wise exponential
  – Divide follow-up into intervals
  – The hazard is constant within an interval but can differ across intervals (e.g. ‘0’ for cure)
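One way to compute S(t) for a piece-wise exponential model is to add up hazard × time spent in each interval and exponentiate the negative of the total. The Python sketch below does this; the cut points and hazards are hypothetical, with a hazard of 0 in the last interval to mimic a cure.

```python
import math

def piecewise_survival(t, cuts, hazards):
    """S(t) for a hazard that is constant within each interval.

    cuts[j] is the start of interval j (cuts[0] must be 0) and hazards[j]
    is the hazard on [cuts[j], cuts[j+1]); the last interval is open-ended.
    """
    cum_hazard = 0.0
    for j, rate in enumerate(hazards):
        start = cuts[j]
        end = cuts[j + 1] if j + 1 < len(cuts) else math.inf
        if t <= start:
            break
        # Time actually spent in this interval up to t.
        cum_hazard += rate * (min(t, end) - start)
    return math.exp(-cum_hazard)

cuts = [0, 2, 5]            # intervals: 0-2, 2-5, and 5 onward (years)
hazards = [0.3, 0.1, 0.0]   # hazard drops to 0 after year 5
for t in (1, 2, 5, 10):
    print(t, round(piecewise_survival(t, cuts, hazards), 4))
```

Because the hazard is 0 after year 5, the survival curve stays flat at exp(-0.9) from then on.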


Some survival functions (5)
Piece-wise exponential
  – Divide follow-up into intervals
  – The hazard is constant within an interval but can differ across intervals (e.g. ‘0’ for cure)
Gompertz model
  – Uses a functional form for S(t) which can level off at a fixed, non-zero value as time increases (when the hazard decays over time)

Censoring (1)
So much for theory
In the real world, we run into practical issues:
  – May only know that subject was disease-free up to time ‘t’ but then you lost track of them
  – May only know subject got disease before time ‘t’
  – May only know subject got disease between two exam dates.
  – May know subject must have been outcome-free for the first ‘x’ years of follow-up (immortal person-time)
  – Can’t measure time to infinite precision
      – Often only know year of event
  – Exact time of event might not even exist in theory

Censoring (2)
Three main kinds of censoring
  – Right censoring
      – The time of the event is known to be later than some time
      – Subject moves to Australia after three years of follow-up
          – We only know that they died some time after 3 years.
  – Left censoring
      – The time of the event is known to be before some time
          – Looking at age of menarche, starting with a group of 12 year old girls.
          – Some girls are already menstruating
  – Interval censoring
      – The event is known to have occurred between two known times
          – Annual HIV test
          – Negative on Jan 1, 2012
          – Positive on Jan 1 of a later year
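One convenient way to hold all three kinds of censoring in a data set is to record, for each subject, a lower and an upper bound on the event time. The Python sketch below is an illustration of that idea only: the function name, the bound convention (lower = 0 means nothing is known about how early the event happened, upper = infinity means nothing is known about how late), and the interval example values are assumptions, not part of the course material.

```python
import math

def censoring_type(lower, upper):
    """Classify an observation from the bounds on its event time."""
    if lower == upper:
        return "exact"
    if upper == math.inf:
        return "right-censored"      # event is some time after `lower`
    if lower == 0:
        return "left-censored"       # event is some time before `upper`
    return "interval-censored"       # event is between `lower` and `upper`

observations = {
    "moved away after 3 years of follow-up": (3, math.inf),
    "already menstruating at age 12": (0, 12),
    "negative at test in year 1, positive in year 2 (hypothetical)": (1, 2),
    "died at 4.5 years": (4.5, 4.5),
}
for label, (lower, upper) in observations.items():
    print(f"{label}: {censoring_type(lower, upper)}")
```

Storing bounds rather than a single time plus a flag means the same structure covers exact, right-, left-, and interval-censored observations.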


Censoring (3)
Right censoring is most commonly considered
  – Type 1 censoring
      – The censoring time is ‘fixed’ (under control of investigator)
      – Singly censored: everyone has the same censoring time, commonly due to the study ending on a specific date
  – Type 2 censoring
      – Terminate study after a fixed number of events has happened
      – Most common in lab studies
  – Random censoring
      – Observation terminated for reason not under investigator’s control
      – Varying reasons for drop-out
      – Varying entry times

Censoring (4)
Right censoring is most commonly considered
  – Event of interest is death but at the end of their follow-up, subject is still alive.
Administrative censoring
Loss to follow-up
  – A patient moves away or is lost without having experienced event of interest
Drop-out
  – Patient dropped from study due to protocol violation, etc.
Competing risks
  – Death occurs due to a competing event
We know something about these patients.
  – Discarding them would ‘waste’ information

[Figure: Study course for patients in cohort]

Censoring (5)
Standard analysis ignores the method used to generate censoring.
Type 1/2 methods are fine
‘Random’ censoring can be a problem.
  – Informative vs. uninformative censoring
Standard analyses require ‘uninformative’ censoring
  – The development of the outcome in subjects who are censored must be the same as in the subjects who remained in follow-up

Censoring (6)
Informative vs. uninformative censoring
  – RCT of new therapy with serious side effects.
      – Patients on this Rx can tolerate side effects until near death. Then, they drop out.
      – Observed mortality rate in this group will be 0 (per 100,000), because deaths occur only after drop-out
  – Control therapy has no side effects
      – Patients do not drop out near death.
  – Strong bias

Type of censoring | May violate assumption of independence of censoring/survival | If assumption is violated, likely direction of bias on CIR estimate
Deaths from other causes when there are common risk factors* | Yes | Underestimation
Failure to follow-up contacts | Yes | Underestimation
Migration | Yes | Variable
Administrative censoring | Unlikely§ | Variable
* In cause-specific incidence or mortality studies
§ More likely in studies with a prolonged accrual period in the presence of secular trends.
