Introduction to Survival Analysis October 19, 2004 Brian F. Gage, MD, MSc with thanks to Bing Ho, MD, MPH Division of General Medical Sciences.


Presentation goals Survival analysis compared w/ other regression techniques What is survival analysis When to use survival analysis Univariate method: Kaplan-Meier curves Multivariate methods: Cox-proportional hazards model Parametric models Assessment of adequacy of analysis Examples

Regression vs. Survival Analysis

What is survival analysis? Model time to failure or time to event Unlike linear regression, survival analysis has a dichotomous (binary) outcome Unlike logistic regression, survival analysis analyzes the time to an event –Why is that important? Able to account for censoring Can compare survival between 2+ groups Assess relationship between covariates and survival time

Importance of censored data Why is censored data important? What is the key assumption of censoring?

Types of censoring Subject does not experience event of interest Incomplete follow-up Lost to follow-up Withdraws from study Dies (if not being studied) Left or right censored

When to use survival analysis Examples Time to death or clinical endpoint Time in remission after treatment of disease Recidivism rate after addiction treatment When one believes that 1+ explanatory variable(s) explain the differences in time to an event Especially when follow-up is incomplete or variable

Relationship between survivor function and hazard function Survivor function, S(t), defines the probability of surviving longer than time t; this is what the Kaplan-Meier curves show. Hazard function is the negative derivative of the log of the survivor function over time, h(t) = -d[ln S(t)]/dt = -S'(t)/S(t) –instantaneous risk of event at time t, given survival up to t (conditional failure rate) Survivor and hazard functions can be converted into each other
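The conversion between the two functions can be written out explicitly. The block below is the standard textbook relationship rather than anything taken from the slides; f(t) (the density of event times) and H(t) (the cumulative hazard) are symbols introduced here for illustration.

```latex
\begin{align}
h(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      = \frac{f(t)}{S(t)} = -\frac{d}{dt}\,\ln S(t), \\
H(t) &= \int_0^t h(u)\,du, \qquad S(t) = \exp\{-H(t)\}.
\end{align}
```

As a quick check, a constant hazard h(t) = \lambda gives H(t) = \lambda t and S(t) = exp(-\lambda t), the exponential survival curve.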

Approach to survival analysis Like other statistics we have studied we can do any of the following w/ survival analysis: Descriptive statistics Univariate statistics Multivariate statistics

Descriptive statistics Average survival When can this be calculated? What test would you use to compare average survival between 2 cohorts? Average hazard rate Total # of failures divided by total observed survival time (units are therefore 1/t or 1/pt-yrs) An incidence rate, with higher values indicating more events per unit time
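The average hazard rate described above can be sketched in a few lines of Python. The function name and the five-patient data set are hypothetical, made up purely for illustration; follow-up is assumed to be recorded in years, with 1 marking an observed failure and 0 a censored subject.

```python
def average_hazard_rate(durations, events):
    """Failures per unit of observed time (e.g., per patient-year)."""
    total_time = sum(durations)
    if total_time <= 0:
        raise ValueError("total observed time must be positive")
    return sum(events) / total_time

# Hypothetical cohort: follow-up in years, 1 = failure, 0 = censored.
durations = [2.0, 3.5, 1.0, 4.0, 2.5]
events = [1, 0, 1, 0, 1]

rate = average_hazard_rate(durations, events)  # 3 failures over 13 patient-years
print(f"{rate:.3f} events per patient-year")
```

Censored subjects contribute their follow-up time to the denominator but no event to the numerator, which is why this rate can be computed even when follow-up is incomplete.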

Univariate method: Kaplan-Meier survival curves Also known as product-limit formula Accounts for censoring Generates the characteristic “stair step” survival curves Does not account for confounding or effect modification by other covariates When is that a problem? When is that OK?
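To make the product-limit formula concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not a substitute for a statistics package: the function name and the six-subject data set are hypothetical, and subjects censored at an event time are treated as still at risk at that time (the usual convention).

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)       # everyone leaving the risk set at time t
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:                        # the curve steps down only at event times
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]          # events and censored subjects both leave
    return curve

# Hypothetical data: 1 = event observed, 0 = censored.
durations = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 3))
```

Each factor (1 - d/n) is the conditional probability of surviving past an event time given survival up to it; censored subjects shrink the risk set n without producing a step, which is how the method accounts for censoring.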

Time to Cardiovascular Adverse Event in VIGOR Trial

Comparing Kaplan-Meier curves Log-rank test can be used to compare survival curves Less commonly used test: Wilcoxon, which places greater weight on events near time 0. Hypothesis test (test of significance) H0: the survival curves are the same H1: the survival curves are different Compares observed to expected event counts at each event time Test statistic is compared to a χ² distribution
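The observed-versus-expected logic of the log-rank test can be sketched as follows for two groups. This is a minimal illustration, assuming the standard hypergeometric variance and a 1-degree-of-freedom chi-square reference distribution; the function name and both toy cohorts are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) on 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk overall
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)         # events overall
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n   # observed minus expected in group A
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("test is undefined: no informative event times")
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Hypothetical cohorts in which group A fails earlier than group B.
chi_sq, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(round(chi_sq, 2), round(p, 4))
```

Under H0 the expected number of events in group A at each event time is proportional to its share of the risk set; the test sums the departures from that expectation over all event times.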

Comparing multiple Kaplan-Meier curves Multiple pair-wise comparisons produce cumulative Type I error – multiple comparison problem Instead, compare all curves at once analogous to using ANOVA to compare > 2 cohorts Then use judicious pair-wise testing
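The size of the multiple comparison problem is easy to quantify. The sketch below assumes the pairwise tests are independent, which is only an approximation, and shows a Bonferroni-adjusted threshold as one common way to keep pair-wise testing judicious; Bonferroni is an assumption here, not a method named in the slides.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise tests."""
    n_pairs = comb(n_groups, 2)
    fwer = 1 - (1 - alpha) ** n_pairs    # assumes independent tests
    bonferroni_alpha = alpha / n_pairs   # per-test threshold after correction
    return n_pairs, fwer, bonferroni_alpha

n_pairs, fwer, bonf = familywise_error(4)
print(n_pairs, round(fwer, 3), round(bonf, 4))  # 6 0.265 0.0083
```

With four survival curves, six uncorrected pairwise log-rank tests at alpha = 0.05 carry roughly a one-in-four chance of at least one spurious finding.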

Limit of Kaplan-Meier curves What happens when you have several covariates that you believe contribute to survival? Example Smoking, hyperlipidemia, diabetes, hypertension, contribute to time to myocardial infarct Can use stratified K-M curves – for 2 or maybe 3 covariates Need another approach – multivariate Cox proportional hazards model is most common -- for many covariates (think multivariate regression or logistic regression rather than a Student’s t-test or the odds ratio from a 2 x 2 table)

Multivariate method: Cox proportional hazards Needed to assess effect of multiple covariates on survival Cox-proportional hazards is the most commonly used multivariate survival method Easy to implement in SPSS, Stata, or SAS Parametric approaches are an alternative, but they require stronger assumptions about h(t).

Cox proportional hazards model Works with hazard model Conveniently separates baseline hazard function from covariates Baseline hazard function over time –h(t) = h0(t)exp(B1X); there is no separate intercept B0, because any constant is absorbed into h0(t) Covariates are time independent B1 is used to calculate the hazard ratio, exp(B1), which is similar to the relative risk Semiparametric: h0(t) is left unspecified Coefficients are estimated by maximizing a partial likelihood function

Cox proportional hazards model, continued Can handle both continuous and categorical predictor variables (think: logistic, linear regression) Without knowing baseline hazard h0(t), can still calculate coefficients for each covariate, and therefore hazard ratio Assumes multiplicative risk: this is the proportional hazard assumption Can be compensated in part with interaction terms
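The claim that coefficients can be estimated without knowing the baseline hazard can be illustrated with a small sketch. It fits a single-covariate Cox model by Newton-Raphson on the log partial likelihood, in which the baseline hazard cancels out of every term. The function name and the eight-subject data set are hypothetical, the data contain no tied event times, and this is a teaching sketch rather than how SAS, Stata, or SPSS implement the fit.

```python
import math

def fit_cox_one_covariate(times, events, x, n_iter=25, tol=1e-9):
    """Newton-Raphson on the Cox log partial likelihood (one covariate).

    Returns (beta, standard_error). The baseline hazard never appears.
    """
    beta = 0.0
    for _ in range(n_iter):
        grad, info = 0.0, 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue                  # censored subjects add no term
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            w = [math.exp(beta * x_j) for x_j in risk]
            s0 = sum(w)
            mean = sum(x_j * w_j for x_j, w_j in zip(risk, w)) / s0
            second = sum(x_j * x_j * w_j for x_j, w_j in zip(risk, w)) / s0
            grad += x_i - mean            # score contribution
            info += second - mean ** 2    # observed information contribution
        if info == 0:
            raise ValueError("no information about beta in these data")
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1 / math.sqrt(info)

# Hypothetical data: x = 1 marks an exposed subject; every event is observed.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [1, 0, 1, 1, 0, 1, 0, 0]
beta, se = fit_cox_one_covariate(times, events, x)
print(round(beta, 3), round(math.exp(beta), 2))  # coefficient and hazard ratio
```

Each event contributes the covariate of the subject who failed minus the risk-weighted average covariate among those still at risk, so only the ordering of event times matters, not the shape of the baseline hazard.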

Limitations of Cox PH model Does not accommodate variables that change over time Luckily most variables (e.g. gender, ethnicity, or congenital condition) are constant –If necessary, one can program time-dependent variables –When might you want this? Baseline hazard function, h0(t), is never specified You can estimate h0(t) accurately if you need to estimate S(t).

Hazard ratio What is the hazard ratio and how do you calculate it from your parameters, β? How do we estimate the relative risk from the hazard ratio (HR)? How do you determine significance of the hazard ratios (HRs)? Confidence intervals Chi-square test
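As a worked illustration of these questions: the hazard ratio for a one-unit increase in a covariate is exp(β), a Wald confidence interval is exp(β ± z·SE), and the Wald chi-square on 1 degree of freedom is (β/SE)². The sketch below assumes a Wald-type test and interval; the function name and the example coefficient and standard error are hypothetical.

```python
import math

def hazard_ratio_summary(beta, se, z=1.96):
    """Hazard ratio, 95% Wald CI, Wald chi-square (1 df), and p-value."""
    hr = math.exp(beta)
    ci = (math.exp(beta - z * se), math.exp(beta + z * se))
    wald_chi_sq = (beta / se) ** 2
    p_value = math.erfc(math.sqrt(wald_chi_sq / 2))  # chi-square upper tail, 1 df
    return hr, ci, wald_chi_sq, p_value

# Hypothetical coefficient: beta = ln(2) with a standard error of 0.2.
hr, ci, wald, p = hazard_ratio_summary(math.log(2), 0.2)
print(round(hr, 2), tuple(round(v, 2) for v in ci), round(wald, 2))
```

A confidence interval that excludes 1 corresponds to a Wald p-value below 0.05, so the two ways of judging significance agree.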

Assessing model adequacy Multiplicative assumption Proportional assumption: covariate effects do not change with time, so the hazard ratios are constant over time Three general ways to examine model adequacy Graphically Mathematically Computationally: Time-dependent variables (extended model)

Model adequacy: graphical approaches Several graphical approaches Do the survival curves intersect? Log-minus-log plots Observed vs. expected plots
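The reasoning behind the log-minus-log plot can be checked numerically. Under proportional hazards S1(t) = S0(t)^HR, so log(-log S1(t)) = log(HR) + log(-log S0(t)) and the two transformed curves differ by a constant. The sketch below uses hypothetical exponential survival curves with HR = 2; in practice the inputs would be Kaplan-Meier estimates for each group.

```python
import math

def log_minus_log(curve):
    """Map (t, S(t)) points to (t, log(-log S(t))); skips S(t) equal to 0 or 1."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

times = [1, 2, 3, 4, 5]
baseline = [(t, math.exp(-0.1 * t)) for t in times]  # hypothetical S0(t)
exposed = [(t, s ** 2.0) for t, s in baseline]       # S1(t) = S0(t)^HR, HR = 2

gaps = [b - a for (_, a), (_, b) in
        zip(log_minus_log(baseline), log_minus_log(exposed))]
print([round(g, 4) for g in gaps])  # every gap equals log(2)
```

Roughly parallel log-minus-log curves therefore support the proportional hazards assumption, while curves that converge, diverge, or cross suggest the hazard ratio changes over time.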

Testing model adequacy mathematically with a goodness-of-fit test Uses a test of significance (hypothesis test) Chi-square distribution with one degree of freedom p value for each coefficient Does not show how a coefficient deviates from the PH assumption

Example: Tumor Extent 3000 patients derived from SEER cancer registry and Medicare billing information Exploring the relationship between tumor extent and survival Hypothesis is that more extensive tumor involvement is related to poorer survival

Kaplan-Meier curves by tumor extent: log-rank χ² test, p < .0001

Example: Tumor Extent Tumor extent may not be the only covariate that affects survival Multiple medical comorbidities may be associated with poorer outcome Ethnic and gender differences may contribute Cox proportional hazards model can quantify these relationships

Example: Tumor Extent Test proportional hazards assumption with log-minus-log plot Perform Cox PH regression Examine significant coefficients and corresponding hazard ratios

Example: Tumor Extent SAS PHREG procedure output, Analysis of Maximum Likelihood Estimates. Columns: Variable, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits, Variable Label. Rows: age (two rows, one labeled "<age<=80"), race (black, other), comorb (three rows), DISTANT, REGIONAL, LIPORAL, PHARYNX, treat (both, rad, none). [Numeric values not preserved in the transcript.]

Summary Survival analysis quantifies time to a single, dichotomous event Handles censored data well Survival and hazard can be mathematically converted to each other Kaplan-Meier survival curves can be compared statistically and graphically Cox proportional hazards models help distinguish individual contributions of covariates to survival, provided certain assumptions are met.