

Survival Analysis: A Brief Introduction


3 1. Survival Function, Hazard Function
In many medical studies, the primary endpoint is time until an event occurs (e.g. death, remission).
Data are typically subject to censoring (e.g. when a study ends before the event occurs).
Survival Function: a function describing the proportion of individuals surviving to or beyond a given time.
Notation:
◦ T: survival time of a randomly selected individual
◦ t: a specific point in time
◦ Survival Function: $S(t) = P(T \ge t)$

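4 Hazard Function/Rate
Hazard Function $\lambda(t)$: the instantaneous failure rate at time t, given that the subject has survived up to time t. That is,
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$$
Here f(t) is the probability density function of the survival time T. That is, $f(t) = \frac{dF(t)}{dt}$, where F(t) is the cumulative distribution function of T: $F(t) = P(T \le t)$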
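
The definitions on this slide imply a standard identity linking the hazard and the survival function. A short derivation, assuming T is continuous with S(0) = 1, writing $\lambda(t)$ for the hazard and introducing $\Lambda(t)$ for the cumulative hazard (a symbol not used in the slides):

```latex
\begin{align*}
\lambda(t) &= \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
\Lambda(t) &= \int_0^t \lambda(u)\,du = -\log S(t), \\
S(t) &= \exp\{-\Lambda(t)\}.
\end{align*}
```

The second equality in the first line uses $S(t) = 1 - F(t)$, so that $S'(t) = -f(t)$.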

5 2. The Key Word is 'Censoring'
Because of censoring, many common data analysis procedures cannot be adopted directly. For example, one could use the logistic regression model to model the relationship between survival probability and some relevant covariates.
◦ However, one should use customized logistic regression procedures designed to account for censoring.

6 Key Assumption: Independent Censoring
Those still at risk at time t in the study are a random sample of the population at risk at time t, for all t.
This assumption means that the hazard function, λ(t), can be estimated in a fair/unbiased/valid way.

7 3A. Kaplan-Meier (Product-Limit) Estimator of the Survival Curve
The Kaplan-Meier estimator is the nonparametric maximum likelihood estimate of S(t). It is a product of the form
$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$
where $n_i$ is the number of subjects alive just before time $t_i$ and $d_i$ denotes the number who died at time $t_i$.

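8 Kaplan-Meier Curve, Example
| Time $t_i$ | # at risk | # events | $\hat{S}(t_i)$ |
|---|---|---|---|
| $t_1$ | 20 | 2 | [1-(2/20)]*1.00 = 0.90 |
| $t_2$ | 18 | 0 | [1-(0/18)]*0.90 = 0.90 |
| $t_3$ | 15 | 1 | [1-(1/15)]*0.90 = 0.84 |
| $t_4$ | 14 | 2 | [1-(2/14)]*0.84 = 0.72 |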
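
The running product in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the presenters' code: the function name `kaplan_meier` is made up for illustration, and the inputs are the at-risk and event counts that appear in the slide's bracketed expressions.

```python
def kaplan_meier(at_risk, events):
    """Return the product-limit survival estimate after each listed time."""
    surv = 1.0
    estimates = []
    for n_i, d_i in zip(at_risk, events):
        # Multiply by the conditional probability of surviving this time point.
        surv *= 1.0 - d_i / n_i
        estimates.append(surv)
    return estimates


# Counts taken from the slide: (# at risk, # events) at four successive times.
at_risk = [20, 18, 15, 14]
events = [2, 0, 1, 2]

km = kaplan_meier(at_risk, events)
print([round(s, 2) for s in km])
```

A time point with no events (the second row) leaves the estimate unchanged, which is why the curve only steps down at event times.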

9 Kaplan-Meier Curve

10 Figure 1. Plot of survival distribution functions for the NCI and the SCI Groups. The Y-axis is the probability of not declining to GDS 3 or above. The X-axis is the time (in years) to decline. (Barry Reisberg et al., 2010; Alzheimer's & Dementia; in press.)

11 3B. Comparing Survival Functions
[Figure: survival distribution function versus time for three groups: High, Medium, Low]

12 Log-Rank Test
The log-rank test
◦ tests whether the survival functions are statistically equivalent
◦ is a large-sample chi-square test that uses the observed and expected cell counts across the event times
◦ has maximum power when the ratio of hazards is constant over time.

13 Wilcoxon Test
The Wilcoxon test
◦ weights the observed number of events minus the expected number of events by the number at risk across the event times
◦ can be biased if the pattern of censoring is different between the groups.

14 Log-rank versus Wilcoxon Test
◦ The log-rank test is more sensitive than the Wilcoxon test to differences between groups at later points in time.
◦ The Wilcoxon test is more sensitive than the log-rank test to differences between groups at early points in time.
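
The difference between the two tests comes down to the weight applied at each event time. The sketch below is an illustration under stated assumptions, not the slides' own procedure: it handles two groups only, uses the usual hypergeometric variance, takes "Wilcoxon" to mean Gehan's weighting by the number at risk, and runs on a small made-up data set. The function name `weighted_logrank` is hypothetical.

```python
def weighted_logrank(times, events, groups, weight="logrank"):
    """Chi-square statistic (1 df) comparing two groups coded 0 and 1.

    times  : observed follow-up time for each subject
    events : 1 if the event was observed, 0 if censored
    groups : 0 or 1 for each subject
    weight : "logrank" (weight 1) or "wilcoxon" (weight = number at risk)
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    num = 0.0
    var = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)                       # at risk, both groups
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        w = 1.0 if weight == "logrank" else float(n)
        num += w * (d1 - d * n1 / n)                              # observed minus expected
        if n > 1:
            var += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    # Note: var is zero (division error) if one group is never at risk.
    return num * num / var


# Made-up data: group 0 fails early, group 1 fails late, no censoring.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [0, 0, 0, 1, 1, 1]

lr = weighted_logrank(times, events, groups, "logrank")
wx = weighted_logrank(times, events, groups, "wilcoxon")
print(round(lr, 3), round(wx, 3))
```

Because the number at risk shrinks over time, the Wilcoxon weights emphasize early event times, whereas the log-rank test treats all event times equally.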

15 4. Two Parametric Distributions
Here we present the two most notable models for the distribution of T.
Exponential distribution:
Weibull distribution:
◦ Its survival function:
◦ Thus:
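
The formulas on this slide did not survive in the transcript. As a reference point, here is one common parametrization, offered as an assumption rather than as the slide's own notation (rate $\rho > 0$, shape $\gamma > 0$, hazard written $\lambda(t)$):

```latex
\begin{align*}
\text{Exponential:}\quad & \lambda(t) = \rho, & S(t) &= \exp(-\rho t), \\
\text{Weibull:}\quad & \lambda(t) = \rho\,\gamma\, t^{\gamma - 1}, & S(t) &= \exp(-\rho t^{\gamma}), \\
\text{thus:}\quad & \log\{-\log S(t)\} = \log\rho + \gamma \log t.
\end{align*}
```

Under this parametrization the exponential distribution is the Weibull special case $\gamma = 1$, and the last line says that a plot of $\log\{-\log S(t)\}$ against $\log t$ is a straight line with slope $\gamma$.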

16 Weibull Hazard Function, Plot
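
The shape of the Weibull hazard can be tabulated without plotting. The sketch below assumes the parametrization $\lambda(t) = \rho\,\gamma\,t^{\gamma-1}$, which may differ from the one used on the original slide; the function name `weibull_hazard` and the chosen parameter values are illustrative only.

```python
def weibull_hazard(t, rho, gamma):
    """Weibull hazard rho * gamma * t**(gamma - 1) (assumed parametrization)."""
    return rho * gamma * t ** (gamma - 1)


# Evaluate the hazard at a few time points for three shape values.
time_points = (0.5, 1.0, 2.0)
for gamma in (0.5, 1.0, 2.0):
    values = [round(weibull_hazard(t, 1.0, gamma), 3) for t in time_points]
    print(gamma, values)
```

With shape below 1 the hazard decreases over time, with shape equal to 1 it is constant (the exponential case), and with shape above 1 it increases.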

17 5. Regression Models
The Exponential and the Weibull distributions inspired two parametric regression approaches:
1. Parametric proportional hazard model (this model can be generalized to a semi-parametric model: the Cox proportional hazard model)
2. Accelerated failure time model

18 Proportional Hazard Model
In a regression model for survival analysis, one can try to model the dependence on the explanatory variables x by taking the (new) hazard rate to be
$$\lambda(t \mid x) = \lambda_0(t)\, c(\beta, x)$$
where $\lambda_0(t)$ is the baseline hazard. Since hazard rates are positive, it is natural to choose the function c such that c(β, x) is positive irrespective of the values of x.

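19 Proportional Hazard Model
Thus a good choice is $c(\beta, x) = \exp(\beta^\top x)$.
The resulting proportional hazard model, with baseline hazard $\lambda_0(t)$, is
$$\lambda(t \mid x) = \lambda_0(t) \exp(\beta^\top x)$$
For the Weibull distribution and for the Exponential distribution, $\lambda_0(t)$ is the hazard function of the corresponding distribution.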

20 Accelerated Failure Time Model
For the Weibull distribution (including the Exponential distribution), the proportional hazard model is equivalent to a log-linear model in survival time T, in which log T is a linear function of the covariates plus an error term.
Here the error term can be shown to follow the 2-parameter Extreme Value distribution.
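
The equivalence stated on this slide can be checked in a few lines. The derivation below is a sketch under an assumed Weibull parametrization (rate $\rho$, shape $\gamma$, regression coefficients $\beta$); the symbols $\alpha_0$, $\alpha$, $\sigma$, and $W$ are introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
\lambda(t \mid x) &= \rho\,\gamma\,t^{\gamma-1}\exp(\beta^\top x)
  \;\Longrightarrow\;
  S(t \mid x) = \exp\{-\rho\, t^{\gamma} \exp(\beta^\top x)\}, \\
U &= \rho\, T^{\gamma} \exp(\beta^\top x)
  \;\Longrightarrow\; P(U > u) = e^{-u}
  \quad\text{(unit exponential)}, \\
\log T &= \underbrace{-\tfrac{1}{\gamma}\log\rho}_{\alpha_0}
  \;+\; \underbrace{\left(-\tfrac{1}{\gamma}\beta\right)^{\!\top}}_{\alpha^\top} x
  \;+\; \underbrace{\tfrac{1}{\gamma}}_{\sigma}\, W,
  \qquad W = \log U, \quad P(W > w) = \exp(-e^{w}).
\end{align*}
```

So $\log T$ is linear in x with an extreme-value error, and the coefficients of the two models are related by $\alpha = -\beta/\gamma$. A covariate that raises the hazard ($\beta > 0$) shortens the expected log survival time ($\alpha < 0$). The Exponential case corresponds to $\gamma = 1$, that is, $\sigma = 1$.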

21 Apply Both Models Simultaneously
If the underlying distribution for T is Weibull or Exponential, one can apply both regression models simultaneously to reflect different aspects of the survival process. That is:
◦ Prediction of degree of decline using the Weibull proportional hazard model
◦ Prediction of time of decline using the accelerated failure time model

22 An Example
In a recent paper (Reisberg et al., 2010), we applied both regression models to a dementia study conducted at NYU. The results are shown next.

23

24 6. Cox Proportional Hazards Model

25 Parametric versus Nonparametric Models
Parametric models require that
◦ the distribution of survival time is known
◦ the hazard function is completely specified except for the values of the unknown parameters.
Examples include the Weibull model, the exponential model, and the log-normal model.

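26 Parametric versus Nonparametric Models
Properties of nonparametric models are
◦ the distribution of survival time is unknown
◦ the hazard function is unspecified.
An example is the Cox proportional hazards model (strictly semiparametric: the baseline hazard is unspecified, but the covariate effects are modeled parametrically).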

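27 Cox Proportional Hazards Model
$$\lambda(t \mid x) = \lambda_0(t) \exp(\beta_1 x_1 + \cdots + \beta_k x_k)$$
◦ $\lambda_0(t)$: baseline hazard function; involves time but not predictor variables
◦ $\beta_1 x_1 + \cdots + \beta_k x_k$: linear function of a set of predictor variables; does not involve time
◦ β = 0 → hazard ratio = 1: the two groups have the same survival experience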
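
Because the baseline hazard cancels when two subjects are compared, the hazard ratio depends only on the coefficients and the covariate values. A minimal sketch, with a hypothetical function name `hazard_ratio` and a made-up coefficient value:

```python
import math


def hazard_ratio(beta, x_a, x_b):
    """Hazard ratio of subject a versus subject b under a Cox model.

    The baseline hazard cancels, leaving exp(beta . (x_a - x_b)).
    """
    lin_a = sum(b * x for b, x in zip(beta, x_a))
    lin_b = sum(b * x for b, x in zip(beta, x_b))
    return math.exp(lin_a - lin_b)


# One binary covariate (1 = treated, 0 = control); beta = 0.7 is made up.
hr = hazard_ratio([0.7], [1], [0])
hr_null = hazard_ratio([0.0], [1], [0])
print(round(hr, 3), hr_null)
```

With β = 0 the ratio is exactly 1, matching the slide's statement that the two groups then have the same survival experience.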

28 Popularity of the Cox Model
The Cox proportional hazards model
◦ provides the primary information desired from a survival analysis, hazard ratios and adjusted survival curves, with a minimum number of assumptions
◦ is a robust model where the regression coefficients closely approximate the results from the correct parametric model.

29 Partial Likelihood
Partial likelihood differs from maximum likelihood because
◦ it does not use the likelihoods for all subjects
◦ it only considers likelihoods for subjects that experience the event
◦ it considers subjects as part of the risk set until they are censored.

30 Partial Likelihood
| Subject | Survival Time | Status |
|---|---|---|
| C | 2.0 | 1 |
| B | 3.0 | 1 |
| A | 4.0 | 0 |
| D | 5.0 | 1 |
| E | 6.0 | 0 |

31 Partial Likelihood

32 Partial Likelihood

33 Partial Likelihood
The overall likelihood is the product of the individual likelihoods. That is, $L = \prod_j L_j$, where j indexes the subjects who experience the event.
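
The five-subject example can be worked through numerically. The sketch below makes several assumptions that are not in the slides: status 1 means the event was observed and 0 means censored, there is a single covariate whose values (the dictionary `x`) are invented for illustration, and there are no tied event times. The function name `partial_likelihood` is hypothetical.

```python
import math

# (subject, survival time, status) from the example table; status 1 is
# assumed to mean the event was observed and 0 to mean censored.
data = [("C", 2.0, 1), ("B", 3.0, 1), ("A", 4.0, 0), ("D", 5.0, 1), ("E", 6.0, 0)]

# Hypothetical single covariate per subject (not given in the slides).
x = {"A": 1.0, "B": 0.0, "C": 1.0, "D": 0.0, "E": 1.0}


def partial_likelihood(beta, data, x):
    """Cox partial likelihood for one covariate, assuming no tied event times."""
    total = 1.0
    for subject, time, status in data:
        if status != 1:
            continue  # censored subjects contribute no factor of their own
        # Risk set: everyone whose observed time is at least this event time.
        risk_set = [s for s, t, _ in data if t >= time]
        denom = sum(math.exp(beta * x[s]) for s in risk_set)
        total *= math.exp(beta * x[subject]) / denom
    return total


pl_null = partial_likelihood(0.0, data, x)
pl_one = partial_likelihood(1.0, data, x)
print(pl_null, pl_one)
```

Only C, B, and D contribute factors, with risk sets of size 5, 4, and 2. The censored subjects A and E appear only in the denominators, up to the time they are censored.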

34 7. SAS Programs for Survival Analysis
There are three SAS procedures for analyzing survival data: LIFETEST, PHREG, and LIFEREG.
◦ PROC LIFETEST is a nonparametric procedure for estimating the survivor function, comparing the underlying survival curves of two or more samples, and testing the association of survival time with other variables.
◦ PROC PHREG is a semiparametric procedure that fits the Cox proportional hazards model and its extensions.
◦ PROC LIFEREG is a parametric regression procedure for modeling the distribution of survival time with a set of concomitant variables.

35 Proc LIFETEST
The Kaplan-Meier (K-M) survival curves and related tests (Log-Rank, Wilcoxon) can be generated using SAS PROC LIFETEST:
PROC LIFETEST DATA=SAS-data-set <options>;
    TIME variable <*censor(list)>;
    STRATA variable <(list)> <...variable <(list)>>;
    TEST variables;
RUN;

36 Proc PHREG
Cox (proportional hazards) regression is performed using SAS PROC PHREG:
proc phreg data=rsmodel.colon;
    model surv_mm*status(0,2,4) = sex yydx / risklimits;
run;

37 Proc LIFEREG
Accelerated failure time regression is performed using SAS PROC LIFEREG:
proc lifereg data=subset outest=OUTEST(keep=_scale_);
    model (lower, hours) = yrs_ed yrs_exp / d=normal;
    output out=OUT xbeta=Xbeta;
run;

38 Selected References
◦ PD Allison (1995). Survival Analysis Using SAS: A Practical Guide. SAS Publishing.
◦ JD Kalbfleisch and RL Prentice (2002). The Statistical Analysis of Failure Time Data. Wiley-Interscience.

39 Questions?