Introduction to Survival Analysis August 3 and 5, 2004.


1 Introduction to Survival Analysis August 3 and 5, 2004

2 Overview What is survival analysis? Introduction to Kaplan-Meier methods. Introduction to Cox proportional hazards methods (Thursday) Recommended reading in Walker: Chapters 21-22

3 What is survival analysis? Statistical methods for analyzing longitudinal data on the occurrence of events. Events may include death, injury, onset of illness, recovery from illness (binary variables) or transition above or below the clinical threshold of a meaningful continuous variable (e.g. CD4 counts). Accommodates data from randomized clinical trial or cohort study design.

4 Randomized Clinical Trial (RCT) [Diagram: target population → disease-free, at-risk cohort → random assignment to intervention or control; over time, each arm splits into disease and disease-free.]

5 Randomized Clinical Trial (RCT) [Diagram: target population → patient population → random assignment to treatment or control; over time, each arm splits into cured and not cured.]

6 Randomized Clinical Trial (RCT) [Diagram: target population → patient population → random assignment to treatment or control; over time, each arm splits into dead and alive.]

7 Cohort study (prospective/retrospective) [Diagram: target population → disease-free cohort → exposed and unexposed groups; over time, each group splits into disease and disease-free.]

8 Examples of survival analysis in medicine

9 RCT: Women’s Health Initiative (JAMA, 2001) On hormones On placebo Cumulative incidence

10 Retrospective cohort study: From December 2003 BMJ: Aspirin, ibuprofen, and mortality after myocardial infarction: retrospective cohort study

11 Objectives of survival analysis
– Estimate time-to-event for a group of individuals, such as time until a second heart attack for a group of MI patients.
– Compare time-to-event between two or more groups, such as treated vs. placebo MI patients in a randomized controlled trial.
– Assess the relationship of covariables to time-to-event: for example, do weight, insulin resistance, or cholesterol influence survival time of MI patients?
Note: expected time-to-event = 1/incidence rate (when the rate is constant over time)
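The note that expected time-to-event equals 1/incidence rate holds when the rate is constant over time. A short derivation under that assumption (the symbols S, T, and h are introduced here for illustration and do not appear on the slide):

```latex
\begin{aligned}
S(t) &= P(T > t) = e^{-ht}, \\
E[T] &= \int_0^\infty S(t)\,dt = \int_0^\infty e^{-ht}\,dt = \frac{1}{h}.
\end{aligned}
```

For instance, a constant rate of 5 cases per 1000 person-years (h = 0.005 per year) would correspond to an expected 1/0.005 = 200 years until the event.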

12 Survival Analysis: Terms
Time-to-event: the time from entry into a study until a subject has a particular outcome.
Censoring: subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they die or have the outcome of interest. They are counted as alive or disease-free for the time they were enrolled in the study.
– If dropout is related to both outcome and treatment, dropouts may bias the results.

13 Why use survival analysis?
1. Why not compare mean time-to-event between your groups using a t-test or linear regression? That approach ignores censoring.
2. Why not compare the proportion of events in your groups using odds ratios or logistic regression? That approach ignores time.

14 Data Structure: survival analysis
Time variable: $t_i$ = time at last disease-free observation or time at event
Censoring variable: $c_i = 1$ if the subject had the event; $c_i = 0$ if no event by time $t_i$

15 Choice of time of origin. Note varying start times.

16 Count every subject’s time since their baseline data collection.
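A minimal sketch of how the time and censoring variables might be built when subjects enter the study on different dates. The subject labels, dates, and the helper name `to_survival_rows` are hypothetical and chosen only for illustration; time is counted in days from each subject's own baseline.

```python
from datetime import date

# Hypothetical records: (subject, baseline date, last-contact date, had_event)
records = [
    ("A", date(2003, 1, 15), date(2003, 7, 15), False),  # dropped out
    ("B", date(2003, 3, 1), date(2004, 3, 1), False),    # event-free at study end
    ("C", date(2003, 2, 10), date(2003, 9, 10), True),   # had the event
]

def to_survival_rows(records):
    """Return (subject, t_i in days since baseline, c_i) tuples."""
    rows = []
    for subject, baseline, last_seen, had_event in records:
        t_i = (last_seen - baseline).days  # clock starts at each subject's own baseline
        c_i = 1 if had_event else 0        # 1 = event, 0 = censored
        rows.append((subject, t_i, c_i))
    return rows

rows = to_survival_rows(records)
for row in rows:
    print(row)
```

Calendar dates drop out entirely: only the elapsed time since baseline and the event indicator are carried forward into the analysis.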

17 Survival function Gives the probability of surviving past a certain time. For example, the probability of surviving beyond 10 years, 50 years, or 100 years. One goal of survival analysis is to estimate and compare survival experiences of different groups. Survival experience is described by the survival function: $S(t) = P(T > t)$, where $T$ is the time of the event.

18 Introduction to Kaplan-Meier Non-parametric estimate of the survival function. Commonly used to describe survivorship of study population/s. Commonly used to compare two study populations. Intuitive graphical presentation.

19 Survival Data (right-censored) [Figure: follow-up timelines for Subjects A–E from beginning of study to end of study; x-axis: time in months.] 1. Subject E dies at 4 months.

20 Corresponding Kaplan-Meier Curve [Figure: survival curve starting at 100%; x-axis: time in months.] Subject E dies at 4 months. Probability of surviving to 4 months is 100% = 5/5. Fraction surviving this death = 4/5.

21 Survival Data [Figure: follow-up timelines for Subjects A–E from beginning of study to end of study; x-axis: time in months.] 1. Subject E dies at 4 months. 2. Subject A drops out after 6 months. 3. Subject C dies at 7 months.

22 Corresponding Kaplan-Meier Curve [Figure: survival curve starting at 100%; x-axis: time in months.] Subject C dies at 7 months. Fraction surviving this death = 2/3.

23 Survival Data [Figure: follow-up timelines for Subjects A–E from beginning of study to end of study; x-axis: time in months.] 1. Subject E dies at 4 months. 2. Subject A drops out after 6 months. 3. Subject C dies at 7 months. 4. Subjects B and D survive for the whole year-long study period.

24 Corresponding Kaplan-Meier Curve [Figure: survival curve starting at 100%; x-axis: time in months.] Product-limit estimate of survival = P(surviving event 1 | at risk up to failure 1) × P(surviving event 2 | at risk up to failure 2) = 4/5 × 2/3 = 0.5333

25 The product limit estimate The probability of surviving the entire year, taking into account censoring = (4/5)(2/3) = 53%. NOTE: >40% (2/5) because the one drop-out survived at least a portion of the year, AND <60% (3/5) because we don't know if the one drop-out would have survived until the end of the year.
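The product-limit calculation for Subjects A–E can be sketched in a few lines. This is an illustrative from-scratch version, not code from the presentation; the function name `kaplan_meier` and the list layout are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns [(event_time, survival_probability), ...]."""
    survival = 1.0
    curve = []
    # The estimate only changes at times where at least one event occurred.
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= (at_risk - deaths) / at_risk   # fraction surviving this event time
        curve.append((t, survival))
    return curve

# Subjects A, B, C, D, E: months of follow-up and event indicator (1 = died, 0 = censored)
times = [6, 12, 7, 12, 4]
events = [0, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
print(curve)
```

Subject A, censored at 6 months, counts in the risk set at month 4 but not at month 7, which is why the second factor is 2/3 rather than 3/4.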

26 Comparing 2 groups Use log-rank test to test the null hypothesis of no difference between survival functions of the two groups.
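A sketch of the two-group log-rank statistic, written from the standard observed-minus-expected form with a hypergeometric variance. The function name and both data sets are hypothetical; the p-value uses the chi-square distribution on 1 degree of freedom, and the function assumes at least one event time with subjects at risk in both groups (otherwise the variance is zero).

```python
import math

def logrank_test(times1, events1, times2, events2):
    """Two-group log-rank test; returns (chi-square statistic, p-value on 1 df)."""
    pooled = [(t, e, 1) for t, e in zip(times1, events1)] + \
             [(t, e, 2) for t, e in zip(times2, events2)]
    observed1 = expected1 = variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e == 1}):
        n = sum(1 for u, _, _ in pooled if u >= t)              # at risk, both groups
        n1 = sum(1 for u, _, g in pooled if u >= t and g == 1)  # at risk, group 1
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)   # events at t, both groups
        d1 = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 1)
        observed1 += d1
        expected1 += d * n1 / n  # expected group-1 events under the null hypothesis
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value

# Hypothetical follow-up times in months; 1 = event, 0 = censored
group1 = ([4, 6, 7, 12, 12], [1, 0, 1, 0, 0])
group2 = ([2, 3, 5, 8, 12], [1, 1, 1, 1, 0])
chi2, p_value = logrank_test(*group1, *group2)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```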

27 Caveats Survival estimates can be unreliable toward the end of a study when there are small numbers of subjects at risk of having an event.

28 WHI and breast cancer Small numbers left

29 Limitations of Kaplan-Meier Mainly descriptive Doesn’t control for covariates Requires categorical predictors Can’t accommodate time-dependent variables

30 Introduction to Cox Regression History “Regression Models and Life-Tables” by D.R. Cox, published in 1972, is one of the most frequently cited journal articles in statistics and medicine

31 Introduction to Cox Regression Also called proportional hazards regression Multivariate regression technique where time-to-event (taking into account censoring) is the dependent variable. Estimates covariate-adjusted hazard ratios. – A hazard ratio is a ratio of incidence, or hazard, rates

32 Introduction to Cox Regression Distinction between rate and proportion: Incidence rate: number of new cases of disease per population at-risk per unit time – Hazard rate: Instantaneous incidence rate; probability that, given you survived disease-free up to time t, you succumb to the disease in the next instant. Cumulative incidence (or cumulative risk): proportion of new cases that develop in a given time period

33 Rates vs. risks Relationship between risk and rates: $R(t) = 1 - e^{-ht}$, where $h$ is a constant rate and $R(t)$ is the cumulative risk by time $t$.

34 Rates vs. risks For example, if the rate is 5 cases/1000 person-years, then the chance of developing disease over 10 years is: $1 - e^{-0.005(10)} = 1 - e^{-0.05} \approx 0.049$, or about 4.9%. Compare to 0.005(10) = 5%. The two agree closely because the loss of persons at risk because they have developed disease within the period of observation is small relative to the size of the total group.

35 Rates vs. risks If the rate is 50 cases/1000 person-years, then the chance of developing disease over 10 years is: $1 - e^{-0.05(10)} = 1 - e^{-0.5} \approx 0.39$, or about 39%. Compare to 0.05(10) = 50%.
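The two examples can be checked numerically. This sketch assumes a constant rate, under which cumulative risk is 1 − exp(−rate × time); the function name `cumulative_risk` is illustrative.

```python
import math

def cumulative_risk(rate_per_person_year, years):
    """Cumulative risk over `years`, assuming a constant incidence rate."""
    return 1 - math.exp(-rate_per_person_year * years)

results = {}
for rate in (0.005, 0.05):
    exact = cumulative_risk(rate, 10)
    approx = rate * 10  # naive rate x time approximation
    results[rate] = (exact, approx)
    print(f"rate={rate}: risk={exact:.4f}, rate x time={approx:.2f}")
```

The rate × time shortcut works well for the rare outcome and overstates risk noticeably for the common one, because people who have already developed disease are no longer at risk.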

36 Introduction to Cox Regression Distinction between hazard/rate ratio and odds ratio/risk ratio: Hazard ratio: ratio of hazard rates. Odds/risk ratio: ratio of proportions. By taking time into account, you use more information than just a binary yes/no, and gain power/precision. Logistic regression aims to estimate the odds ratio; Cox regression aims to estimate the hazard ratio.

37 Example: Study of publication bias, by Kaplan-Meier methods. From: Publication bias: evidence of delayed publication in a cohort study of clinical research projects. BMJ 1997;315:640-645 (13 September)

38 Univariate Cox regression
Table 4 Risk factors for time to publication using univariate Cox regression analysis
Characteristic          # not published   # published   Hazard ratio (95% CI)
Null                    29                23            1.00
Non-significant trend   16                4             0.39 (0.13 to 1.12)
Significant             47                99            2.32 (1.47 to 3.66)
Interpretation: Studies with significant results have about 2.3 times the rate of publication of studies with null results.
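A reported hazard ratio and its confidence interval can be converted back to the underlying Cox coefficient. The sketch below assumes the interval is a Wald-type 95% interval that is symmetric on the log scale, which the slide does not state; the function name is illustrative.

```python
import math

def coef_from_hazard_ratio(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log hazard ratio, its standard error, z, and two-sided p-value."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # CI half-width on log scale
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, z, p

# "Significant" row of Table 4: HR 2.32 (1.47 to 3.66)
beta, se, z, p = coef_from_hazard_ratio(2.32, 1.47, 3.66)
print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.4f}")
```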

39 Example: Study of mortality in Academy Award-winning screenwriters (multivariate), by Kaplan-Meier methods. From: Longevity of screenwriters who win an academy award: longitudinal study. BMJ 2001;323:1491-1496 (22-29 December)

40 Table 2. Death rates for screenwriters who have won an academy award.
Analysis                       Relative increase in death rate for winners*
Basic analysis                 37 (10 to 70)
Adjusted analysis
  Demographic:
    Year of birth              32 (6 to 64)
    Sex                        36 (10 to 69)
    Documented education       39 (12 to 73)
    All three factors          33 (7 to 65)
  Professional:
    Film genre                 37 (10 to 70)
    Total films                39 (12 to 73)
    Total four star films      40 (13 to 75)
    Total nominations          43 (14 to 79)
    Age at first film          36 (9 to 68)
    Age at first nomination    32 (6 to 64)
    All six factors            40 (11 to 76)
  All nine factors             35 (7 to 70)
* Values are percentages (95% confidence intervals) and are adjusted for the factor indicated.
Basic analysis: HR=1.37; interpretation: 37% higher incidence of death for winners compared with nominees.
All nine factors: HR=1.35; interpretation: 35% higher incidence of death for winners compared with nominees even after adjusting for potential confounders.

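41 The model
$h_i(t) = h_0(t)\,\exp(\beta_1 x_{i1} + \cdots + \beta_k x_{ik})$
Components:
– $h_0(t)$: a baseline hazard function that is left unspecified but must be positive (= the hazard when all covariates are 0). It can take on any form.
– $\exp(\beta_1 x_{i1} + \cdots + \beta_k x_{ik})$: a linear function of a set of k fixed covariates that is exponentiated (= the relative risk).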

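42 The model The point is to compare the hazard rates of individuals who have different covariates: $HR = \frac{h_1(t)}{h_2(t)} = \frac{h_0(t)\,e^{\beta x_1}}{h_0(t)\,e^{\beta x_2}} = e^{\beta (x_1 - x_2)}$. The baseline hazard cancels, so the ratio does not depend on time. Hence the name proportional hazards: the hazard functions should be strictly parallel on the log scale.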
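A small numerical illustration of why the hazards are proportional: in a hazard of the Cox form, baseline(t) × exp(beta × x), the ratio between two individuals is the same at every time point and for any baseline. The two baseline functions and the value of `beta` below are hypothetical.

```python
import math

def hazard(t, x, beta, baseline):
    """Cox-form hazard for a single covariate: baseline(t) * exp(beta * x)."""
    return baseline(t) * math.exp(beta * x)

# Two arbitrary baseline hazards (assumed for illustration); both are positive.
baselines = {
    "constant": lambda t: 0.02,
    "increasing": lambda t: 0.01 * (1 + t) ** 1.5,
}
beta = 0.7  # hypothetical log hazard ratio per unit of x

all_ratios = {}
for name, h0 in baselines.items():
    # Compare an individual with x = 1 to one with x = 0 at several time points.
    ratios = [hazard(t, 1, beta, h0) / hazard(t, 0, beta, h0) for t in (0.5, 1, 5, 10)]
    all_ratios[name] = ratios
    print(name, [round(r, 4) for r in ratios])
```

Every ratio equals exp(beta) regardless of which baseline is used, which is why the baseline can be left unspecified when only relative hazards are of interest.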

43 Evaluation of proportional hazards assumption.

44 Characteristics of Cox Regression Cox models the effect of predictors and covariates on the hazard rate but leaves the baseline hazard rate unspecified. Does NOT assume knowledge of absolute risk. Estimates relative rather than absolute risk.
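One way to see how the coefficients can be estimated without a baseline hazard is the partial likelihood, in which each event contributes the failing subject's relative hazard divided by the sum over everyone still at risk. The sketch below handles a single covariate, uses the Breslow form when event times tie, and finds the maximum by a crude grid search rather than the Newton-type iterations used in statistical software. The data and names are hypothetical.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Partial log-likelihood for one covariate (Breslow form if event times tie)."""
    loglik = 0.0
    for i, (t_i, c_i) in enumerate(zip(times, events)):
        if c_i != 1:
            continue  # censored subjects contribute only through the risk sets
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        denom = sum(math.exp(beta * x[j]) for j in risk_set)
        loglik += beta * x[i] - math.log(denom)  # baseline hazard never appears
    return loglik

# Hypothetical data: follow-up time, event indicator (1 = event), binary covariate
times = [2, 3, 4, 5, 6, 7, 9, 12]
events = [1, 1, 1, 0, 1, 1, 0, 0]
x = [1, 1, 0, 1, 1, 0, 0, 0]

# Crude grid search over beta in [-3, 3] in steps of 0.01
grid = [b / 100 for b in range(-300, 301)]
beta_hat = max(grid, key=lambda b: cox_partial_loglik(b, times, events, x))
print(f"beta_hat ~ {beta_hat:.2f}, hazard ratio ~ {math.exp(beta_hat):.2f}")
```

Only the ordering of event times and the composition of each risk set enter the calculation, which is consistent with the model estimating relative rather than absolute risk.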

45 Assumptions of Cox Regression Proportional hazards assumption: the hazard for any individual is a fixed proportion of the hazard for any other individual Multiplicative risk

46 Survival analysis: Example

47 Kaplan-Meier estimates of stress fracture-free survivorship by BMC at baseline. Groups: <1800 g (n=15); 1800-2199 g (n=55); ≥2200 g (n=52)

48 Kaplan-Meier estimates of stress fracture-free survivorship by levels of daily calcium intake at baseline. Groups: <800 mg/day (n=22); 800-1499 mg/day (n=63); 1500+ mg/day (n=36)

49 Kaplan-Meier estimates of stress fracture-free survivorship by previous stress fracture. Groups: previous fracture (n=39); no previous fracture (n=83)

50 [Figure: survival curves by lean mass. Groups: lowest quartile of lean mass; middle two quartiles; highest quartile of lean mass]

51

52 Risk Factors                                         Hazard Ratio (95% CI)
History of menstrual irregularity prior to baseline    2.91 (0.81, 10.43)
BMC <1800 g                                             3.70 (1.31, 10.46)
Low calcium (<800 mg/d)                                 3.60 (1.12, 11.59)
Stress fracture prior to baseline                       5.45 (1.48, 20.08)
Fat mass (per kg)                                       1.05 (0.91, 1.21)
**All analyses are stratified on site and menstrual status at baseline, and adjusted for age and spine Z-score at baseline using Cox Regression.

53 Other protective factors                             Hazard Ratio (95% CI)
Spine BMD (per 1-standard deviation increase)           0.54 (0.30, 0.96)
Every 100-mg/d calcium (continuous)                     0.90 (0.81, 0.99)
Lean mass (per kg), time-dependent                      0.91 (0.81, 1.02)
Change in lean mass (per kg)                            0.83 (0.56, 1.24)
Menarche (per 1-year older)                             0.55 (0.34, 0.90)
**All analyses are stratified on site and menstrual status at baseline, and adjusted for age and spine Z-score at baseline (except spine Z score) using Cox Regression.
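Because risk is multiplicative in a Cox model, a hazard ratio quoted per unit of a continuous covariate can be rescaled to a larger change by raising it to a power. This sketch assumes the log-linear effect holds across the whole range considered, which the tables alone cannot confirm; the function name is illustrative.

```python
def rescale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio for a change of `units`, given the ratio per one unit."""
    return hr_per_unit ** units

# Calcium: 0.90 per 100 mg/d, so a 500 mg/d increase is 5 units.
hr_calcium_500 = rescale_hazard_ratio(0.90, 5)
# Fat mass: 1.05 per kg, so a 5 kg difference is 5 units.
hr_fat_5kg = rescale_hazard_ratio(1.05, 5)
print(round(hr_calcium_500, 2), round(hr_fat_5kg, 2))
```

Rescaling the point estimate this way says nothing about the width of the confidence interval at the new scale; the interval limits would need the same transformation.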

