Introduction to Survival Analysis Utah State University January 28, 2008 Bill Welbourn.



Objectives of this Talk
• Clarify what survival data is.
• Explain what makes survival data special.
• Example 1 – Survival estimation for a single population.
• Example 2 – Survival comparison for two populations via infant ALL data.
• Motivate the need for “special” methods for analyzing survival data.

Objectives of this Talk (cont)
• Introduce risk estimation for survival data via the Cox-Proportional Hazards Model.
• Example 3 – Infant ALL data revisited, analyzed using the Cox-Proportional Hazards Model.

What is Survival Data?
• Data that deal with the time until the occurrence of any well-defined event.
• The response is binary (event occurred or not) and does not have to be death versus survival.

Examples of Events
• Death
• Response to a treatment
• Development of a disease in someone at high risk
• Resumption of smoking by someone who had quit
• Cancellation of service by a credit card customer
• Relapse of a patient in whom disease had been in remission

Complete Data
• The value of each sample unit is observed or known.
• Ex.) Compute the average test score for a sample of 5 students: 90, 80, 76, 85, 82.

Why is Survival Data Special?
• Censored data: the event of interest may not be observed, or the exact times-to-event of all the units are not known.
Examples:
• The event of interest is death, but at the time of analysis the patient is still alive.
• A patient was lost to follow-up without having experienced the event of interest.

Examples (cont)
• The event of interest is death caused by cancer. A patient may die of an unrelated cause, such as an automobile accident.
• A patient is dropped from the study without having experienced the event of interest because of a major protocol violation.

Types of Censoring
• Right censoring: a survival time is not known exactly but is known to be greater than some value.

Types of Censoring (cont)
• Left censoring: a failure time is only known to be before a certain time.
Ex.) Event of interest: development of a disease. At the time of examination, a 50-year-old participant was found to have already developed the disease of interest, but there was no record of the exact time.

Types of Censoring (cont)
• Interval censoring: objects of interest are not constantly monitored, so the event of interest is only known to have occurred between times a and b.
Ex.) At age 45, the patient did not have the disease. His age at diagnosis was between 45 and 50.
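The three censoring types can all be viewed as an interval that is known to contain the event time. The sketch below illustrates that idea under stated assumptions: the `Observation` class and its field names are hypothetical, and the right-censored age of 60 is an invented value used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Event time known only to lie between lower and upper (hypothetical helper)."""
    lower: float
    upper: float

    def kind(self) -> str:
        if self.lower == self.upper:
            return "exact"
        if math.isinf(self.upper):
            return "right-censored"
        if self.lower == 0:
            return "left-censored"
        return "interval-censored"


# Ages in years. The first value is hypothetical; the other two follow the slides.
event_free_at_60 = Observation(60, math.inf)    # event time only known to exceed 60
had_disease_by_50 = Observation(0, 50)          # event happened before age 50
diagnosed_45_to_50 = Observation(45, 50)        # event happened between 45 and 50

for obs in (event_free_at_60, had_disease_by_50, diagnosed_45_to_50):
    print(obs, "->", obs.kind())
```

Writing every observation as a pair of bounds makes it plain that complete data are the special case in which the two bounds coincide.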

Survival Estimation
• Example 1 - A hypothetical clinical trial: Suppose that 10 patients enroll in a clinical trial at the beginning of 1988. During 1988, 6 patients die. At the beginning of 1989, 20 additional patients enroll in the trial. During 1989, 3 patients who enrolled in the trial at the beginning of 1988 die, and 15 patients who enrolled in the trial at the beginning of 1989 die. We are asked to estimate the one-year and two-year survival for these patients.

FOLLOW-UP TIME | PARTICIPANTS TRACKED | DEATHS | CENSORED OBSERVATIONS | ESTIMATED SURVIVAL PROBABILITY
TOTALS: | | 24 | 6 |
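The counts in Example 1 are enough to work through a life-table estimate by hand. The sketch below is one way to do it, assuming the analysis takes place at the end of 1989, so that the 5 survivors of the 1989 cohort are censored after one year of follow-up and the single survivor of the 1988 cohort after two. The function name `life_table` is hypothetical.

```python
def life_table(intervals):
    """Cumulative survival from (at_risk, deaths) counts per follow-up interval.

    Assumes censoring happens only at the end of an interval, so everyone
    counted as at risk is followed for the whole interval.
    """
    survival = 1.0
    estimates = []
    for at_risk, deaths in intervals:
        survival *= (at_risk - deaths) / at_risk
        estimates.append(survival)
    return estimates


# Year 1 of follow-up: all 30 patients (10 + 20) are tracked; 6 + 15 = 21 die.
# Year 2 of follow-up: only the 4 survivors of the 1988 cohort are tracked; 3 die.
one_year, two_year = life_table([(30, 21), (4, 3)])
print(f"1-year survival: {one_year:.3f}")   # 9/30
print(f"2-year survival: {two_year:.3f}")   # (9/30) * (1/4)
```

Under these assumptions the deaths total 24 and the censored observations total 6, matching the totals row of the table. A naive two-year estimate based only on the 10 patients with two full years of follow-up would be 1/10 = 0.10, which ignores the first-year information contributed by the 1989 cohort.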

Survival Comparison
• Example 2 - Among children with acute lymphoblastic leukemia (ALL), a small percentage (approximately 3%) are diagnosed in the first year of life; this is referred to as infant ALL.
• Generally the outcome for infant ALL is much poorer than that for other children, of whom about 75% go into a quick remission and never have their disease return (i.e., are cured).

Survival Comparison (Ex 2 cont)
• For infant ALL, probably 65% will die of their disease. While the outcome of ALL in these very small babies is not good, there is nevertheless substantial known heterogeneity in outcome based on patient characteristics, with some subgroups doing much better and some much worse than the general outcome in infants.
• In this exercise, we will examine whether survival among ALL infants differs depending on age at diagnosis (0-5 mo. vs. 6-11 mo.).

Survival Comparison (Ex 2 cont)
• Hypothesis test setup: The null states that survival among ALL infants is the same, irrespective of the age at diagnosis. The alternative states that survival among infants diagnosed with ALL at 0-5 months is a constant power (at any follow-up time) of the survival among infants diagnosed with ALL at 6-11 months.
• More precisely, the alternative states that the hazard rates for the two infant ALL groups are proportional through time.
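The two statements of the alternative are equivalent, which a short derivation makes explicit. The notation below is assumed rather than taken from the slides: h_E and S_E denote the hazard and survival functions for infants diagnosed at 0-5 months, h_L and S_L those for infants diagnosed at 6-11 months, and theta > 0 is the constant hazard ratio.

```latex
\begin{align*}
h_E(t) &= \theta\, h_L(t) \quad \text{for all } t, \\
S_E(t) &= \exp\!\Big(-\int_0^t h_E(u)\,du\Big)
        = \exp\!\Big(-\theta \int_0^t h_L(u)\,du\Big)
        = \big[S_L(t)\big]^{\theta}.
\end{align*}
% The null hypothesis corresponds to \theta = 1, i.e. S_E(t) = S_L(t).
```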

Structure of Survival Data
• The following SAS output provides an overview of collected survival data.

Survival Comparison (Ex 2 cont)
• To test these hypotheses, we use the Log-Rank Test.
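For readers without SAS at hand, the two-sample Log-Rank Test can be sketched in a few lines of plain Python. This is a minimal illustration, not the PROC LIFETEST implementation: the function name `logrank_test` is hypothetical, and the follow-up times are made-up values, not the infant ALL data. At each distinct event time the observed deaths in group 1 are compared with the number expected if both groups shared one hazard.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-sample log-rank test; returns (chi-square statistic, p-value).

    times*: follow-up times; events*: 1 if the event was observed, 0 if censored.
    """
    sample = [(t, e, 1) for t, e in zip(times1, events1)]
    sample += [(t, e, 2) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in sample if e == 1})

    observed1 = expected1 = variance = 0.0
    for t in event_times:
        # Subjects still under observation just before time t, per group.
        n1 = sum(1 for u, _, g in sample if u >= t and g == 1)
        n2 = sum(1 for u, _, g in sample if u >= t and g == 2)
        # Events occurring exactly at time t, per group.
        d1 = sum(1 for u, e, g in sample if u == t and e == 1 and g == 1)
        d2 = sum(1 for u, e, g in sample if u == t and e == 1 and g == 2)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n
        if n > 1:
            variance += n1 * n2 * d * (n - d) / (n * n * (n - 1))

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical follow-up times in months (NOT the infant ALL data).
early = ([2, 3, 5, 7, 8], [1, 1, 1, 0, 1])
late = ([6, 9, 12, 14, 15], [1, 0, 1, 0, 0])
chi2, p = logrank_test(*early, *late)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Censored subjects contribute to the at-risk counts up to their censoring time and then drop out, which is how the test uses partial follow-up instead of discarding it.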

Survival Comparison (Ex 2 cont)
• The Log-Rank Test from SAS's PROC LIFETEST yields p<0.01, so there is evidence in this case to reject the null hypothesis. These data indicate that there is a statistically significant difference in survival between children diagnosed with ALL at 0-5 months and children diagnosed with ALL at 6-11 months (p<0.01). The data suggest that survival is better among children diagnosed with ALL later in infancy.

Survival Comparison (Ex 2 cont)
• Assess Goodness-of-Fit (PH assumption).

Confounding Factors
• Recall, a confounding factor for an association of interest (in this case, the relationship between age at diagnosis and survival) must itself be associated with the outcome of interest (survival) and with the exposure of interest (age at diagnosis).
• Let’s examine whether abnormality for CHR11Q23 is a confounding factor for our example.

Confounding Factors (cont.)
• To assess whether CHR11Q23 is associated with survival, we use the Log-Rank Test. SAS reports a p-value <0.01. These data indicate that there is a statistically significant difference in survival between children with an abnormality at the CHR11Q23 locus and children without the abnormality (p<0.01). The data suggest that survival is better among children without the abnormality.

Confounding Factors (cont.)
• To assess whether CHR11Q23 is associated with age at ALL diagnosis, we use Categorical Data Analysis. These data indicate that the odds of CHR11Q23 abnormality among children diagnosed with ALL at 0-5 months are 2.78 times those among children diagnosed with ALL at 6-11 months (95% CI for OR, 1.19 – 6.48).
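An odds ratio with its confidence interval can be computed directly from a 2x2 table. The sketch below uses the Wald (Woolf) interval on the log scale, which is an assumption about the method since the slides do not say which interval SAS reported. The function name `odds_ratio_ci` is hypothetical, and the counts are invented for illustration because the underlying table is not given.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT the infant ALL data:
# rows = age at diagnosis (0-5 mo., 6-11 mo.), columns = abnormality (yes, no).
or_hat, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_hat:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval of this kind is symmetric around the estimate on the log scale. The reported interval has that shape: the geometric mean of 1.19 and 6.48 is about 2.78.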

Confounding Factors (cont.)
• Thus, the data suggest that CHR11Q23 is associated with both survival (outcome) and age at ALL diagnosis (exposure). CHR11Q23 appears to be a confounder, and so we should control for the factor in the analysis.
• After controlling for CHR11Q23, these data still suggest that survival is better among infants diagnosed with ALL later in infancy, but the evidence of the association has decreased (p=0.03).

Why the use of “Special” Statistical Methods for Survival Data?
• More precisely, since we have a binary response, why not use categorical data analysis methods (e.g., 2xC contingency tables, logistic regression) to analyze survival data?

“Special” Methods (cont)
• The Log-Rank Test and the Score Test from Logistic Regression are essentially equivalent when all censored observations equal the maximum follow-up time.
• Biased results could arise from the use of categorical data analysis methods if censoring is uniform through follow-up time in one group but occurs at the maximum follow-up time in the second group.

“Special” Methods (cont)
• With categorical data analysis methods, uniform censoring through follow-up time in both groups could lead to bias toward the null hypothesis.
• If censoring occurs at the beginning of the follow-up time for each group, categorical data analysis methods could lead to bias toward the alternative hypothesis.

“Special” Methods (cont)
• If censoring does not occur, categorical data analysis methods cannot be applied: every subject then has an observed event, so the binary response shows no variation.
• In summary, survival analysis methods exist to handle the censoring of observations.

Risk Estimation for Survival Data
• The Log-Rank Test provides a means of testing for an association in survival. The Cox-Proportional Hazards (CPH) Model provides a regression extension so that risk estimation in survival can be made.
• The risk estimate for the CPH Model is the hazard ratio.

GLM versus CPH Model
• GLM – Parametric Models:
• CPH – Semi-parametric Model:
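For reference, the standard textbook forms of the two model families are sketched below. The notation (link function g, covariates X_1, ..., X_p, baseline hazard h_0) is assumed rather than taken from the slides.

```latex
\begin{align*}
\text{GLM:}\quad & g\big(E[Y \mid X_1,\dots,X_p]\big)
   = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p, \\
\text{CPH:}\quad & h(t \mid X_1,\dots,X_p)
   = h_0(t)\,\exp\big(\beta_1 X_1 + \dots + \beta_p X_p\big).
\end{align*}
% GLM: the distribution of Y is fully specified (parametric).
% CPH: the baseline hazard h_0(t) is left unspecified, and only the
% covariate effects are parametric, hence "semi-parametric".
```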

Survival Comparison (CPH)
• Example 3 – Let’s revisit the infant ALL data and analyze using the CPH Model.
• The null hypothesis states that the hazard rate among ALL infants is the same, irrespective of the age at diagnosis. The alternative states that the hazard rate (at any follow-up time) among infants diagnosed with ALL at 6-11 months is a constant multiple of the hazard rate among infants diagnosed with ALL at 0-5 months.

Survival Comparison (Ex 3 cont)
• To test these hypotheses, we use the Cox-Proportional Hazards Regression Model.
• The CPH Model:
• Model under the null:

Survival Comparison (Ex 3 cont)
• Model under the alternative:
• SAS's PROC PHREG reports a p-value from the Likelihood Ratio Test that is essentially equivalent to the result of the Log-Rank Test. This is expected, as the hypotheses are the same for the CPH Test and the Log-Rank Test.
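One way to write the two-group model and its hypotheses is sketched below. The coding of the indicator X is an assumption made for illustration; the slides do not state how the age-at-diagnosis variable was coded in SAS.

```latex
% Assumed coding: X = 1 if diagnosed at 0-5 months, X = 0 if diagnosed at 6-11 months.
\begin{align*}
\text{CPH model:}\quad & h(t \mid X) = h_0(t)\, e^{\beta X}, \\
\text{under the null } (\beta = 0):\quad & h(t \mid X) = h_0(t), \\
\text{under the alternative } (\beta \neq 0):\quad &
   \frac{h(t \mid X = 1)}{h(t \mid X = 0)} = e^{\beta} \quad \text{for all } t.
\end{align*}
% The hazard ratio e^{\beta} does not depend on t, which is the
% proportional hazards (PH) assumption.
```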

Survival Comparison (Ex 3 cont)
• These data indicate the risk of death among infants diagnosed with ALL at 0-5 months is 2.10 times that of infants diagnosed with ALL at 6-11 months (95% CI for RR, 1.23 – 3.60).
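The reported ratio and interval are linked to the fitted regression coefficient through the exponential function. The sketch below works backwards from the reported interval, assuming it is a Wald interval computed on the log scale (the slides do not say which interval SAS produced). Both function names are hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Coefficient and standard error implied by a Wald CI on the log scale."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    beta = (log_lower + log_upper) / 2        # log hazard ratio
    se = (log_upper - log_lower) / (2 * z)    # standard error of beta
    return beta, se


def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio exp(beta) with its Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


beta, se = log_scale_summary(1.23, 3.60)      # interval reported on the slide
hr, lo, hi = hazard_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, se = {se:.3f}, HR = {hr:.2f} ({lo:.2f} to {hi:.2f})")
```

The midpoint of the interval on the log scale exponentiates to roughly 2.10, in agreement with the reported point estimate, which is consistent with the Wald assumption.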

Survival Comparison (Ex 3 cont)
• Survival curves from SAS's PROC PHREG:

Survival Comparison (Ex 3 cont)
• Assess Goodness-of-Fit (PH assumption).

Confounding Factors Revisited
• As with the Log-Rank procedure, we can control for confounding factors in the CPH Model.
• The interpretation of the adjusted RR mirrors that of other regression techniques.

Adjusted RR Interpretation
• After controlling for CHR11Q23 abnormality, these data indicate the risk of death among infants diagnosed with ALL at 0-5 months is 1.82 times that of infants diagnosed with ALL at 6-11 months (95% CI for RR, 1.05 – 3.15).

Questions?