April 18 Intro to survival analysis Le 11.1 – 11.2


April 18 Intro to survival analysis Le 11.1 – 11.2 Not covered in C & S

Intro to Survival Data
Our voyage so far…
  Continuous outcome data: t-tests, linear regression, ANOVA
  Categorical data: odds ratios, relative risk, chi-square tests, logistic regression
New scenario: time-to-event data
  Categorical outcome (yes/no)
  Follow-up time

Rationale
We want to take into account not just whether a patient has an event of interest but also the amount of time from some starting point until the event.
A patient who dies 2 weeks after a diagnosis of cancer should be considered differently than a patient who dies 2 years after diagnosis.

Goals
Describe the rate (probability) of the event over time; this is called the survival function
Compare the survival function among groups
Examine risk factors for having the event, taking into consideration the time of the event

Kaplan-Meier survival curve
[Figure: Survival After Diagnosis of Lung Cancer]
S(t) is the probability of surviving to at least time t
Read from the curve: S(200) = 0.37

Comparing Two Survival Curves

Time To ?
Death after diagnosis of cancer
CVD event after enrollment in a study
Re-arrest after release from prison
Divorce after marriage
Survival analyses are better described as “time to event” analyses
Note: The event does not have to be inevitable

Kaplan-Meier Life Curves

Nature of Data
Definitive starting point (subject becomes “at risk”)
Definitive ending point
  If the subject had the event: date of the event
  If the subject did not have the event: date last known not to have had the event
Analyses are based on two factors:
  Had event or did not have event (0/1 variable)
  Length of time followed (ending date minus starting date)
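A minimal Python sketch of how the two factors could be derived from dates (the course itself uses SAS; the function name `followup` and the dates below are hypothetical, chosen only for illustration):

```python
from datetime import date

def followup(start, end, had_event):
    """Return (days followed, event indicator) for one subject."""
    return (end - start).days, 1 if had_event else 0

# Hypothetical dates, for illustration only.
# Subject who died: the ending point is the date of death.
died = followup(date(2004, 1, 10), date(2004, 3, 22), had_event=True)
# Subject last known alive: the ending point is the date of last contact.
alive = followup(date(2004, 1, 10), date(2004, 4, 19), had_event=False)

print(died, alive)
```

Each subject ends up as one pair: a length of follow-up and a 0/1 indicator.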

Examples
Death after diagnosis of cancer
  Starting point: date of diagnosis
  Ending point: date of death or date last known to be alive
Divorce after marriage
  Starting point: date of marriage
  Ending point: date of divorce or date last known to be still married

Censoring
Up to a certain point in time the patient is known not to have had the event, but it is unknown whether the patient had the event after that time.
This is called right censoring.

Reasons for Censoring
Patient is no longer followed (thus event status is not known after a certain date)
Patient has a different event that makes the primary event impossible
  Primary event: death from cancer, but patient dies from CHD
  Primary event: divorce, but one spouse dies
Study could end or patient becomes lost
Patient is no longer “at risk” for study purposes

Censoring example
Follow-up for the study is 365 days
A patient survives 245 days and then is lost
At that point, we KNOW that they survived 245 days, but we do NOT KNOW whether they survived days 246 through 365
If we exclude them from any end-point calculations, we ignore 245 days' worth of information

Types of censoring
Uninformative: “lost” status is not related to outcome
  Those lost are similar to those not lost (usually not true)
Informative: “lost” status is related to outcome
  Those who are lost are more likely to be dead than those not lost
Most methods assume we have uninformative censoring
  Could be true if, say, an entire clinical center closes

Example of Follow-up Times
[Figure: follow-up timelines for four couples; x-axis: years since marriage (0 to 10); D = divorced, C = censored]
  Divorced after 6 years (D)
  Has been married 10 years at time of analyses (C)
  One spouse dies after 3 years (C)
  No contact with couple after 5 years (C)
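As a sketch, the four couples in this example can be coded as one (time, event) pair each, which is the form survival methods work with. This is Python for illustration (the course uses SAS), and the variable names are assumptions:

```python
# Each couple is coded as (years followed, event indicator).
# event = 1 means divorced (D); event = 0 means censored (C).
couples = [
    (6, 1),   # divorced after 6 years
    (10, 0),  # still married after 10 years at time of analysis
    (3, 0),   # one spouse died after 3 years
    (5, 0),   # no contact with couple after 5 years
]

n_events = sum(event for _, event in couples)
n_censored = len(couples) - n_events
total_followup_years = sum(years for years, _ in couples)

print(n_events, n_censored, total_followup_years)
```

Only one couple contributes an event, but all four contribute follow-up time.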

Survival Function Estimation
Patients are followed for different lengths of time
We would like to use all the data to estimate the survival function
  Patients followed for 1 year can help estimate the survival function in the first year
  Patients followed for 2 years can help estimate the survival function in the first 2 years

Life Table Calculation
100 couples married in 2002, followed 2 years
100 couples married in 2003, followed 1 year
Follow-up through 2004
Year 1 of follow-up: 200 couples at risk, 10 D (5 each from 2002 and 2003 marriages), 190 still married
  Of these 190, 95 (the 2003 marriages) are censored (C) and 95 continue into year 2
Year 2 of follow-up: 95 couples at risk, 8 D, 87 still married
S(1) = 190/200 = .95
S(2) = S(1) * S(2 | S > 1) = .95 * 87/95 = .870 (87/95 is about .92)
Note: S(1) is estimated with more precision than S(2)
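The same life table arithmetic as a short Python sketch (illustrative only; the course uses SAS, and the variable names are assumptions). Each year contributes a conditional survival probability, and the running product gives S(1) and S(2):

```python
# Life table for the marriage example: (number at risk, number divorced) per year.
intervals = [
    (200, 10),  # year 1: all 200 couples at risk, 10 divorces
    (95, 8),    # year 2: only the 95 remaining 2002 couples at risk, 8 divorces
]

survival = []
s = 1.0
for at_risk, divorced in intervals:
    conditional = 1 - divorced / at_risk  # P(survive this year | survived to its start)
    s *= conditional
    survival.append(s)

print([round(x, 3) for x in survival])
```

Year 2 is based on 95 couples rather than 200, which is why S(2) is estimated less precisely than S(1).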

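Estimating Survival Curves
Kaplan-Meier method
  Also called the product-limit or life-table curve
  For each time ti where 1 or more events occur, take the number who die at that time (di) over the number still at risk just before it (ni); the proportion surviving that time is 1 - di/ni
  Multiply these proportions over all event times up to t:
  $S(t) = \prod_{t_i \le t} \left(1 - d_i/n_i\right)$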

Calculating Kaplan-Meier estimates
ti    ni (number at risk)   di   1-di/ni   S(ti)
 6    21                    3    0.8571    0.8571
 7    17                    1    0.9412    0.8067
10    16                    1    0.9375    0.7563
13    14                    2    0.8571    0.6483
S(13) = 0.8571 x 0.9412 x 0.9375 x 0.8571 = 0.6483
SAS calculates these automatically
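A minimal Python sketch of the product-limit calculation for these four event times (illustrative; the course relies on SAS for this). The number of deaths at time 10 is taken as 1, an inference from 1-di/ni = 0.9375 = 15/16; the function name is an assumption:

```python
# (time, number at risk n_i, number of deaths d_i)
rows = [(6, 21, 3), (7, 17, 1), (10, 16, 1), (13, 14, 2)]

def kaplan_meier(rows):
    """Running product of (1 - d_i/n_i) over the event times."""
    s = 1.0
    out = []
    for t, n, d in rows:
        s *= 1 - d / n
        out.append((t, round(s, 4)))
    return out

estimates = kaplan_meier(rows)
for t, s in estimates:
    print(t, s)
```

Each estimate is the previous one multiplied by the proportion surviving the current event time.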

Questions
What is the survival rate over time for persons diagnosed with lung cancer?
Is the survival rate over time different for different types of cancer?
Are patient characteristics related to survival?

Comparing Two Survival Curves

How do we describe this data?
Logistic regression?
  Model risk of death
  Would ignore the amount of follow-up time
Linear regression?
  Model survival time
  How do you handle those who died vs. those who survived?
  Survival times are not normally distributed (all > 0)
Need new methods that incorporate follow-up time information
  Survival or time-to-event analyses

Comparing survival curves
For any time point, can see the probability of survival for either group
Median survival time: the point where the probability of surviving = 50%
Rank tests: compare entire curves
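As a sketch of how a median survival time can be read from an estimated curve: take the first time at which the estimate falls to 50% or below. The points used here are four rows from the large-cell PROC LIFETEST output in these slides; the Python function name is an assumption:

```python
# (time in days, estimated S(t)) at selected death times, large-cell stratum
km_points = [(133, 0.5556), (143, 0.5185), (156, 0.4815), (162, 0.4444)]

def median_survival(points):
    """Return the first time at which S(t) drops to 0.5 or below, or None."""
    for t, s in points:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 50% within the follow-up shown

median_days = median_survival(km_points)
print(median_days)
```

If the curve never drops to 50%, the median cannot be estimated from the data, which the sketch signals by returning None.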

Estimating survival curves
Survival curve estimates become less precise over time
SAS can produce confidence intervals for the survival curve
95% CI of form;
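The formula itself is not preserved in this transcript. One common choice, assumed here and not necessarily what SAS uses by default, is the normal-approximation interval S(t) ± 1.96 × SE, truncated to stay between 0 and 1. A Python sketch using two rows of the large-cell output from these slides:

```python
def linear_ci(s, se, z=1.96):
    """Normal-approximation 95% CI for S(t), truncated to [0, 1] (an assumed form)."""
    lower = max(0.0, s - z * se)
    upper = min(1.0, s + z * se)
    return lower, upper

lo12, hi12 = linear_ci(0.9630, 0.0363)    # day 12: S = 0.9630, SE = 0.0363
lo200, hi200 = linear_ci(0.3292, 0.0913)  # day 200: S = 0.3292, SE = 0.0913

print(round(lo12, 3), round(hi12, 3))
print(round(lo200, 3), round(hi200, 3))
```

The interval at day 200 is much wider than at day 12, which is the loss of precision over time noted above.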

Testing survival curves
Formal statistical tests exist: the log-rank test and the Wilcoxon test
Both assess whether survival distributions are equal
  Null hypothesis: survival distributions (curves) are equal
  Alternative hypothesis: survival distributions (curves) are not equal; one is greater/less than the other
Each compares the survival distributions in a slightly different way
  Log-rank test is more powerful when the relative risk is constant
  Wilcoxon test is more powerful for detecting short-term risk
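A minimal sketch of the two-group log-rank statistic in Python, using the standard observed-minus-expected form with a hypergeometric variance. This is an illustration, not the SAS implementation, and the two small data sets are made up:

```python
import math

def logrank(group1, group2):
    """Two-group log-rank test.

    Each group is a list of (time, event) pairs, event = 1 for the event
    and 0 for a censored observation. Returns (chi_square, p_value), 1 df.
    """
    event_times = sorted({t for t, e in group1 + group2 if e == 1})
    observed1 = expected1 = variance = 0.0
    for t in event_times:
        n1 = sum(1 for time, _ in group1 if time >= t)  # at risk in group 1
        n2 = sum(1 for time, _ in group2 if time >= t)  # at risk in group 2
        d1 = sum(1 for time, e in group1 if time == t and e == 1)
        d2 = sum(1 for time, e in group2 if time == t and e == 1)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n  # expected events in group 1 under the null
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # upper tail, chi-square 1 df
    return chi_square, p_value

# Made-up data, for illustration only.
group_a = [(6, 1), (7, 1), (10, 0), (13, 1), (16, 1)]
group_b = [(1, 1), (2, 1), (3, 1), (4, 0), (5, 1)]
chi_square, p_value = logrank(group_a, group_b)
print(round(chi_square, 3), round(p_value, 4))
```

Every event time gets equal weight here. The Wilcoxon test weights each term by the number at risk, so early differences count more.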

USING SAS
Obs  Age  Cell      death  SurvTime
  1   69  squamous      1        72   (patient died 72 days after diagnosis)
  2   64  squamous      1       411
 10   70  squamous      0       100   (patient alive after 100 days but status after that time is unknown)
 11   81  squamous      1        42
 12   63  squamous      1         8
 13   63  squamous      1       144
 14   52  squamous      0        25
 15   48  squamous      1        11
 23   41  large         1       200
 24   66  large         1       156
 25   62  large         0       182
 26   60  large         1       143

PROC LIFETEST PLOTS = (s);             /* Tells SAS to draw life table plot */
  WHERE cell in('squamous','large');
  TIME survtime*death(0);              /* Tells SAS that values of 0 are censored observations */
  STRATA cell;                         /* Tells SAS to compute life table estimates separately for each cell type */
RUN;

RUNNING ON SATURN (UNIX)
GOPTIONS DEVICE = png htext=0.8 htitle=1 ftext=swissb gsfmode=replace;
PROC LIFETEST PLOTS = (s);
  WHERE cell in('squamous','large');
  TIME survtime*death(0);
  STRATA cell;
RUN;
Creates a file called sasgraph.png
FTP the file over to the PC and insert it into Word (Insert / Picture / From File)

PROC LIFETEST OUTPUT
Summary of the Number of Censored and Uncensored Values
Stratum  Cell       Total  Failed  Censored  Percent Censored
1        large         27      26         1              3.70
2        squamous      35      31         4             11.43
---------------------------------------------------------------
Total                  62      57         5              8.06

Test of Equality over Strata
Test        Chi-Square  DF  Pr > Chi-Square
Log-Rank        0.8226   1           0.3644
Wilcoxon        0.0520   1           0.8197
-2Log(LR)       1.0218   1           0.3121
Tests equality of the 2 survival functions
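As a check on how the last column follows from the first, the p-value for a chi-square statistic with 1 degree of freedom can be computed with the complementary error function. A Python sketch (the function name is an assumption; small differences from the SAS output come from the rounded chi-square values):

```python
import math

def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

# Chi-square values as printed in the output above
tests = {"Log-Rank": 0.8226, "Wilcoxon": 0.0520, "-2Log(LR)": 1.0218}
p_values = {name: chi2_1df_pvalue(x) for name, x in tests.items()}

for name, p in p_values.items():
    print(name, round(p, 4))
```

All three p-values are well above 0.05, so none of the tests rejects equality of the two survival curves.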

Stratum 1: Cell = large
Product-Limit Survival Estimates
SurvTime   Survival  Failure  Survival Standard Error  Number Failed  Number Left
   0.000     1.0000        0                        0              0           27
  12.000     0.9630   0.0370                   0.0363              1           26
  15.000     0.9259   0.0741                   0.0504              2           25
  19.000     0.8889   0.1111                   0.0605              3           24
  43.000     0.8519   0.1481                   0.0684              4           23
  49.000     0.8148   0.1852                   0.0748              5           22
  52.000     0.7778   0.2222                   0.0800              6           21
  53.000     0.7407   0.2593                   0.0843              7           20
 100.000     0.7037   0.2963                   0.0879              8           19
 103.000     0.6667   0.3333                   0.0907              9           18
 105.000     0.6296   0.3704                   0.0929             10           17
 111.000     0.5926   0.4074                   0.0946             11           16
 133.000     0.5556   0.4444                   0.0956             12           15
 143.000     0.5185   0.4815                   0.0962             13           14
 156.000     0.4815   0.5185                   0.0962             14           13
 162.000     0.4444   0.5556                   0.0956             15           12
 164.000     0.4074   0.5926                   0.0946             16           11
 177.000     0.3704   0.6296                   0.0929             17           10
 182.000*         .        .                        .             17            9
 200.000     0.3292   0.6708                   0.0913             18            8
The SurvTime and Survival columns are the X-Y points for the life table graph
First death after 12 days
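The standard errors in this output can be reproduced, at least for the first rows, with Greenwood's formula. The slides do not state which formula SAS uses, so treat that as an assumption. A Python sketch for the first three death times:

```python
import math

# (time, number at risk just before the time, deaths at the time), large-cell stratum
rows = [(12, 27, 1), (15, 26, 1), (19, 25, 1)]

def km_with_greenwood(rows):
    """Kaplan-Meier estimate with Greenwood standard error at each death time."""
    s = 1.0
    running_sum = 0.0
    out = []
    for t, n, d in rows:
        s *= 1 - d / n
        running_sum += d / (n * (n - d))  # Greenwood term for this death time
        se = s * math.sqrt(running_sum)
        out.append((t, round(s, 4), round(se, 4)))
    return out

results = km_with_greenwood(rows)
for row in results:
    print(row)
```

The values can be compared with the Survival and Survival Standard Error columns at days 12, 15, and 19.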

Stratum 1: Cell = large
Product-Limit Survival Estimates
SurvTime   Survival  Failure  Survival Standard Error  Number Failed  Number Left
   0.000     1.0000        0                        0              0           27
  12.000     0.9630   0.0370                   0.0363              1           26
  15.000     0.9259   0.0741                   0.0504              2           25
  19.000     0.8889   0.1111                   0.0605              3           24
S(0) = 1
S(12) = .9630 (26/27)
S(15) = .9259 (25/27), which is also 26/27 * 25/26
S(19) = .8889 (24/27)
What is S(17)?
The estimated survival function is a step function
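Because the estimate is a step function, its value between two death times is the value at the most recent death time. A small Python sketch of that lookup (the function name is an assumption), applied to the question above:

```python
# (death time, S(t) from that time until the next death), large-cell stratum
steps = [(0, 1.0), (12, 0.9630), (15, 0.9259), (19, 0.8889)]

def survival_at(t, steps):
    """Value of the step function at time t: the estimate at the latest listed time <= t."""
    value = None
    for time, s in steps:
        if time <= t:
            value = s
        else:
            break
    return value

s17 = survival_at(17, steps)
print(s17)
```

No one died between day 15 and day 19, so the estimate at day 17 is unchanged from day 15.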

Stratum 2: Cell = squamous
Product-Limit Survival Estimates
SurvTime   Survival  Failure  Survival Standard Error  Number Failed  Number Left
   0.000     1.0000        0                        0              0           35
   1.000          .        .                        .              1           34
   1.000     0.9429   0.0571                   0.0392              2           33
   8.000     0.9143   0.0857                   0.0473              3           32
  10.000     0.8857   0.1143                   0.0538              4           31
  11.000     0.8571   0.1429                   0.0591              5           30
  15.000     0.8286   0.1714                   0.0637              6           29
  25.000     0.8000   0.2000                   0.0676              7           28
  25.000*         .        .                        .              7           27
  30.000     0.7704   0.2296                   0.0713              8           26
2 patients died after 1 day
* marks a censored observation
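A Python sketch of how tied deaths and a censored observation enter the product-limit calculation, using the first nine squamous patients to leave the risk set as read from this output (illustrative only; variable names are assumptions):

```python
from collections import Counter

# (time, event) for the first nine squamous patients to leave the risk set;
# event = 1 is a death, event = 0 is a censored observation.
early = [(1, 1), (1, 1), (8, 1), (10, 1), (11, 1), (15, 1), (25, 1), (25, 0), (30, 1)]
n_total = 35  # patients in the stratum; the rest are still at risk after day 30

deaths = Counter(t for t, e in early if e == 1)
leaving = Counter(t for t, _ in early)

at_risk = n_total
s = 1.0
estimates = {}
for t in sorted(leaving):
    if deaths[t] > 0:
        s *= 1 - deaths[t] / at_risk  # tied deaths are handled in one step
        estimates[t] = round(s, 4)
    at_risk -= leaving[t]  # deaths and censored patients both leave the risk set

print(estimates)
print(at_risk)
```

The censored patient at day 25 does not lower the curve at day 25 but does shrink the risk set, so the death at day 30 is divided by 27 rather than 28.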

Crossing Survival Curves
Validity of the tests requires that risk in one group is always greater than risk in the other group
When survival curves cross, terms used in calculating the test statistic cancel out
  Gives a test statistic value near zero
  P-value is larger than it should be
Graph the survival curves to check for crossing
  If they cross, use an alternative method

Censoring vs. missing data
Censoring is a special case of having missing data
Missing: don’t know whether or not the person had the outcome
Censoring: don’t know whether or not the person eventually had the outcome, but know they did not have it during the time they were followed

Statistical Techniques for Censored Data
Kaplan-Meier (life table analysis)
  Survival curves
Log-rank and Wilcoxon significance tests
  Tests to compare survival curves
Cox proportional hazards regression
  Relates covariates to survival