Published by Merilyn Dennis. Modified over 9 years ago.
1
EPI 5344: Survival Analysis in Epidemiology
Actuarial and Kaplan-Meier methods
February 24, 2015
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
01/2015
2
Analyzing Survival Data (1)
Three methods to analyze survival data:
–Parametric methods
–Semi-parametric methods
–Non-parametric methods
3
Analyzing Survival Data (2)
Parametric methods
–Assume one of the functions discussed earlier.
    –Usually assume the probability distribution
    –Estimating the hazard function is key
–Estimate the parameters directly
    –Use Maximum Likelihood Estimation methods
–Has greatest statistical power provided that the model is correct
–Will be discussed in a later session
4
Analyzing Survival Data (3)
Semi-parametric methods (do not ‘estimate’ S(t))
–Assume that there is a parametric relationship between the hazard in different treatment/exposure groups
    –E.g. males have twice the hazard of females
–BUT, leave the hazard function unspecified (non-parametric)
    –Can be any form, including cure models
–Cox modeling is the most common method used
    –Proportional Hazard assumption is commonly used but not essential
    –More later in course
5
Analyzing Survival Data (4)
Non-parametric methods
–Make no assumption about the survival curve, distribution function, etc.
–Common approach used in epidemiology and medicine.
–Actuarial method (life-table/Cutler-Ederer)
    –Treat time as ‘intervals’
    –Does not need exact time of the event
    –Used for 100+ years by demographers
–Kaplan-Meier (product-limit) method
    –Most frequent approach for RCTs
    –Requires knowing the actual time of event
6
Actuarial Method: Key Concept
Divide the follow-up period into smaller time units
–Often use 1-year intervals
    –Can be: days, months, decades, etc.
–Intervals don’t have to be the same size but usually are
Compute survival in each interval
Combine these into an overall estimate of S(t)
7
Consider a simple example:
Follow 1000 people for 3 years and count the number that die in each year

Year   # at start   # dying
0      1000         100
1      900          90
2      810          81

What is Cumulative Incidence over 3 years?
Standard Epi formula:
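The formula itself did not survive on this slide. A hedged reconstruction, assuming the "standard Epi formula" is the usual cumulative incidence (new cases divided by the number at risk at the start) applied to the three yearly counts of the example; the symbol CI is shorthand introduced here:

```latex
\[
\mathrm{CI}_{0\text{--}3}
  = \frac{\text{deaths over 3 years}}{\text{number at risk at start}}
  = \frac{100 + 90 + 81}{1000}
  = \frac{271}{1000}
  = 0.271
\]
```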
8
Another view:
How can you still be alive after 3 years?
–Don’t die in year 1 and
–Don’t die in year 2 and
–Don’t die in year 3
9
[Figure: probability tree running from Year 0 to Year 3. In each year i, one branch leads to DEAD with probability p_i and the other continues with probability 1 - p_i.]
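Read as a formula, being alive after 3 years means surviving each year in turn. A short sketch, assuming p_i denotes the probability of dying in year i given survival to the start of that year (the labelling the tree suggests); S(3) and CI(3) are notation introduced here:

```latex
\[
S(3) = (1 - p_1)(1 - p_2)(1 - p_3),
\qquad
\mathrm{CI}(3) = 1 - S(3)
\]
```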
10
Our simple example:

Year   # at start   # dying
0      1000         100
1      900          90
2      810          81

Apply the formula:
Same Answer! Why? No losses/censoring
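The agreement between the two approaches can be checked numerically. This is a minimal sketch using the three yearly counts from the example; the variable names are illustrative only.

```python
# Yearly counts from the simple example: (number at start, number dying).
years = [(1000, 100), (900, 90), (810, 81)]

# Standard cumulative incidence: all deaths divided by the starting cohort.
ci_direct = sum(d for _, d in years) / years[0][0]

# Product of the yearly conditional survival probabilities.
survival = 1.0
for n_start, n_dying in years:
    survival *= 1 - n_dying / n_start
ci_product = 1 - survival

print(round(ci_direct, 3), round(ci_product, 3))
```

With no losses, each year's denominator is exactly the previous year's survivors, so the product telescopes to the direct ratio.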
11
Columns: A = Year; B = # people still alive; C = # people dying in this year; D = Prob die in year; E = Prob survive this year; F = Cum. prob of surviving to this year; G = Cum. prob of dying by this year.
(D, E: Conditional Probs. F, G: Cumulative Probs.)

A      B        C       D     E     F      G
1990   10,000   2,000   0.2   0.8   0.8    0.2
1991   8,000    1,600   0.2   0.8   0.64   0.36
1992   6,400    1,280   0.2   0.8   0.51   0.49
1993   5,120    1,024   0.2   0.8   0.41   0.59
1994   4,096    820     0.2   0.8   0.33   0.67
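The column-by-column build-up of this table can be sketched in a few lines. The function name `life_table` and the dictionary keys are hypothetical; the counts are the ones shown for 1990 to 1994.

```python
def life_table(rows):
    """rows: list of (year, alive_at_start, deaths). Returns one dict per year."""
    table = []
    cum_survival = 1.0
    for year, alive, deaths in rows:
        p_die = deaths / alive        # column D: conditional prob. of dying
        p_survive = 1 - p_die         # column E: conditional prob. of surviving
        cum_survival *= p_survive     # column F: cumulative survival
        table.append({
            "year": year,
            "p_die": p_die,
            "p_survive": p_survive,
            "cum_survival": cum_survival,
            "cum_death": 1 - cum_survival,   # column G
        })
    return table


rows = [(1990, 10000, 2000), (1991, 8000, 1600), (1992, 6400, 1280),
        (1993, 5120, 1024), (1994, 4096, 820)]
table = life_table(rows)
for r in table:
    print(r["year"], round(r["p_die"], 2), round(r["p_survive"], 2),
          round(r["cum_survival"], 2), round(r["cum_death"], 2))
```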
12
SAMPLE DATA: MORTALITY AND LOSSES

Year   # people still alive   # people ‘lost’ in year   # people dying in this year
1990   10,000                 5,000                     1,500
1991   3,500                  1,750                     525
1992   1,225                  612                       184
1993   429                    215                       64
1994   150                    75                        23
13
Actuarial Method (1)
Consider the first interval of time:
–10,000 people ‘at risk’ at start of interval
–1,500 die
–5,000 are ‘lost’ before end of interval
Is the probability of death 1,500/10,000?
NO
14
Actuarial Method (2)
‘Lost’ people are only at risk of ‘dying’ until they are lost.
When are they lost?
–We don’t know.
Losses could follow any pattern.
16
Actuarial Method (3)
The Actuarial ASSUMPTION (two forms)
–‘Lost’ subjects are ‘at risk’ for one-half of the interval
–Only one-half of lost subjects are ‘at risk’ for the interval.
For 1990, this implies:
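The expression for 1990 is missing from the slide. A reconstruction under the half-interval assumption, using the 1990 counts from the sample data (10,000 at risk, 5,000 lost, 1,500 deaths); the layout of the expression is an assumption, but the result matches the 7,500 and 0.2 shown in the life table:

```latex
\[
\Pr(\text{die in 1990})
  = \frac{1{,}500}{10{,}000 - \tfrac{1}{2}(5{,}000)}
  = \frac{1{,}500}{7{,}500}
  = 0.2
\]
```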
17
Actuarial Method (4)
This is identical to the standard formula for estimating Cumulative Incidence learned in Epi 1.
18
Columns: A = Year; B = # people still alive; C = # lost; D = # people dying in this year; E = Effective # at risk; F = Prob die in year; G = Prob survive this year; H = Cum. prob of surviving to this year; I = Cum. prob of dying by this year.
(F, G: Conditional Probs. H, I: Cumulative Probs.)

A      B        C       D       E       F     G     H     I
1990   10,000   5,000   1,500   7,500   0.2   0.8   0.8   0.2
1991   3,500    1,750   525     2,625
1992   1,225    612     184     919
1993   429      215     64      322
1994   150      75      23      113
19
Actuarial Method (5)
Now, consider 1991
(Assume that you survive to the start of 1991)
The standard epidemiology formula gives:
20
Actuarial Method (6)
What is: Prob(died by 1991)?
AND SO ON
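Neither 1991 expression survived extraction. A hedged reconstruction from the 1991 row of the sample data (3,500 at risk, 1,750 lost, 525 deaths) and the 1990 survival probability of 0.8; the notation is introduced here, and the results agree with the 0.2 and 0.36 in the completed table:

```latex
\begin{align*}
\Pr(\text{die in 1991} \mid \text{alive at start of 1991})
  &= \frac{525}{3{,}500 - \tfrac{1}{2}(1{,}750)}
   = \frac{525}{2{,}625} = 0.2 \\
\Pr(\text{died by end of 1991})
  &= 1 - (0.8)(0.8) = 0.36
\end{align*}
```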
21
Columns: A = Year; B = # people still alive; C = # lost; D = # people dying in this year; E = Effective # at risk; F = Prob die in year; G = Prob survive this year; H = Cum. prob of surviving to this year; I = Cum. prob of dying by this year.

A      B        C       D       E       F     G     H      I
1990   10,000   5,000   1,500   7,500   0.2   0.8   0.8    0.2
1991   3,500    1,750   525     2,625   0.2   0.8   0.64   0.36
1992   1,225    612     184     919     0.2   0.8   0.51   0.49
1993   429      215     64      322     0.2   0.8   0.41   0.59
1994   150      75      23      113     0.2   0.8   0.33   0.67
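The same table can be generated from the raw counts. A minimal sketch, assuming the half-interval rule for losses; `actuarial_table` is a hypothetical helper name, and the effective numbers are left unrounded (321.5 and 112.5 rather than 322 and 113).

```python
def actuarial_table(rows):
    """rows: (year, alive_at_start, lost, deaths) for each interval.

    Returns (year, effective_n, p_die, p_survive, cum_survival, cum_death).
    """
    out = []
    cum_survival = 1.0
    for year, alive, lost, deaths in rows:
        # Actuarial assumption: lost subjects count as half a person at risk.
        effective = alive - lost / 2
        p_die = deaths / effective
        p_survive = 1 - p_die
        cum_survival *= p_survive
        out.append((year, effective, p_die, p_survive,
                    cum_survival, 1 - cum_survival))
    return out


sample = [(1990, 10000, 5000, 1500), (1991, 3500, 1750, 525),
          (1992, 1225, 612, 184), (1993, 429, 215, 64), (1994, 150, 75, 23)]
result = actuarial_table(sample)
for year, eff, q, p, s, f in result:
    print(year, eff, round(q, 2), round(p, 2), round(s, 2), round(f, 2))
```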
22
Actuarial Method (7) ‘The Algebra’
Compute these for each interval.
Gives columns A through G
23
Actuarial Method (7) ‘The Algebra’ – part 2
Compute these for each interval.
Gives columns H and I: the cumulative probabilities
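The algebra itself is not recoverable from the extracted slides. The following is a sketch of the usual actuarial life-table quantities, consistent with the worked tables; the symbols n_j (alive at start of interval j), w_j (lost), d_j (deaths), n'_j, q_j, p_j, S_j and F_j are assumptions of this note rather than the lecturer's notation:

```latex
\begin{align*}
n'_j &= n_j - \tfrac{1}{2} w_j   && \text{effective number at risk} \\
q_j  &= d_j / n'_j               && \text{conditional probability of dying in interval } j \\
p_j  &= 1 - q_j                  && \text{conditional probability of surviving interval } j \\
S_j  &= \prod_{k \le j} p_k      && \text{cumulative probability of surviving} \\
F_j  &= 1 - S_j                  && \text{cumulative probability of dying}
\end{align*}
```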
24
Kaplan-Meier: Key Concept
Similar approach to the actuarial method EXCEPT:
–Use exact times for each outcome to define ‘intervals’ rather than a fixed-length interval
Compute a new survival value at every time where an outcome event occurs
–Can ignore times with censored events
–Exclude censored people from the ‘at risk’ group
25
Kaplan-Meier: Risk set
At any time ‘t’ during follow-up, there will be a group of people still under active follow-up
–Excludes
    –People with previous outcomes
    –People who have been censored prior to ‘t’
These are the only people at risk of having an outcome at time ‘t’
Called the RISK SET at time ‘t’
26
Kaplan-Meier: Formulae
Compute at each time when an event happens (t_i)
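The formulae were images and are missing here. The standard product-limit form is sketched below, assuming d_i is the number of events at time t_i and n_i is the size of the risk set just before t_i (symbols introduced for this note):

```latex
\[
\hat{p}_i = 1 - \frac{d_i}{n_i},
\qquad
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
\]
```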
27
Data: 5, 12, 25, 26, 27, 28, 32+, 33+, 34+, 37, 39, 40+, 42+
(‘+’ marks a censored time)

‘i’   time   # deaths   # in risk set   S(t) for t_i ≤ t < t_(i+1)   Cumulative S(t)
0     0      -          -               -                             1.0
1     5      1          13              12/13 = 0.923                 0.92
2     12     1          12              11/12 = 0.917                 0.85
3     25     1          11              10/11 = 0.909                 0.77
4     26     1          10              9/10 = 0.9                    0.69
5     27     1          9               8/9 = 0.889                   0.62
6     28     1          8               7/8 = 0.875                   0.54
7     32     0          7               7/7 (1.0)                     0.54
8     33     0          6               6/6 (1.0)                     0.54
9     34     0          5               5/5 (1.0)                     0.54
10    37     1          4               3/4 = 0.75                    0.40
11    39     1          3               2/3 = 0.667                   0.27
12    40     0          2               2/2 (1.0)                     0.27
13    42     0          1               1/1 (1.0)                     0.27
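The table can be reproduced with a short product-limit routine. This is a minimal sketch, not a replacement for a statistics package; the function name `kaplan_meier` and its return format are hypothetical, and subjects censored at an event time are treated as still at risk at that time (a common convention, assumed here).

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times:  follow-up time for each subject
    events: True if the outcome occurred at that time, False if censored
    Returns (time, n_at_risk, n_deaths, S(t)) for each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing this time so that ties are handled.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            steps.append((t, n_at_risk, deaths, survival))
        # Events and censored subjects both leave the risk set.
        n_at_risk -= leaving
    return steps


times = [5, 12, 25, 26, 27, 28, 32, 33, 34, 37, 39, 40, 42]
events = [True] * 6 + [False] * 3 + [True] * 2 + [False] * 2
km = kaplan_meier(times, events)
for t, n, d, s in km:
    print(t, n, d, round(s, 3))
```

Censored times produce no row of their own: they only shrink the risk set used at the next event time, which is why the curve stays flat between 28 and 37.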
29
Confidence Intervals for S_i(t)
Greenwood’s Formula
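The formula on this slide was an image. Greenwood's formula is usually stated as Var[S(t)] ≈ S(t)^2 × Σ d_i / (n_i (n_i - d_i)), summed over event times t_i ≤ t. The sketch below applies it to the 13-subject example and forms plain S ± 1.96 × SE limits clipped to [0, 1]. That interval type is an assumption of this sketch (some software uses transformed intervals instead), and the function name `greenwood` is hypothetical.

```python
import math


def greenwood(steps, z=1.96):
    """steps: (time, n_at_risk, n_deaths) at each event time, in time order.

    Returns (time, S, se, lower, upper) per event time, where se is the
    Greenwood standard error and the limits are S +/- z*se clipped to [0, 1].
    """
    survival = 1.0
    var_sum = 0.0  # running sum of d / (n * (n - d))
    out = []
    for t, n, d in steps:
        survival *= 1 - d / n
        if d < n:
            var_sum += d / (n * (n - d))
            se = survival * math.sqrt(var_sum)
        else:
            se = 0.0  # S(t) is 0 here; the Greenwood term is undefined
        lower = max(0.0, survival - z * se)
        upper = min(1.0, survival + z * se)
        out.append((t, survival, se, lower, upper))
    return out


# Event times, risk-set sizes and deaths from the 13-subject example.
steps = [(5, 13, 1), (12, 12, 1), (25, 11, 1), (26, 10, 1),
         (27, 9, 1), (28, 8, 1), (37, 4, 1), (39, 3, 1)]
result = greenwood(steps)
for t, s, se, lo, hi in result:
    print(t, round(s, 3), round(se, 3), round(lo, 3), round(hi, 3))
```

Up to t = 28 there is no censoring, so the Greenwood variance reduces to the binomial form S(1 - S)/13, which gives a quick sanity check on the arithmetic.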
31
Median Survival (1)
Mean survival
–hard to estimate and has limited value
–due to right skewing of survival distribution
Median survival is more useful:
–The time by which 50% of the cohort will have had the outcome
–S(t) = 0.50
With no censoring, easy to get:
–Use the usual formula for a median with the survival times.
32
Median Survival (2)
With censoring, you need to solve:
–S(t) = 0.5
–You can do this directly in the KM plot
–Can be computed as well
–Is based on the rank order of the survival times, not on the actual times
    –Except at the median itself
95% CI can be obtained
–Complex formula/method
–Tend to be very wide.
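Reading the median off a Kaplan-Meier step function can be sketched as follows. The rule used (first event time at which S(t) is at or below 0.5) is one common convention and is an assumption here; `median_survival` is a hypothetical helper, and the two curves are the Kaplan-Meier estimates from the 13-subject example and from the 10-subject numerical example.

```python
def median_survival(steps, level=0.5):
    """steps: (time, S(t)) at each event time, in increasing time order.

    Returns the first time at which S(t) is at or below `level`,
    or None if the curve never falls that far (heavy censoring).
    """
    for t, s in steps:
        if s <= level:
            return t
    return None


# Kaplan-Meier curve for the 13-subject example (exact fractions).
km_13 = [(5, 12 / 13), (12, 11 / 13), (25, 10 / 13), (26, 9 / 13),
         (27, 8 / 13), (28, 7 / 13), (37, 21 / 52), (39, 14 / 52)]
# Kaplan-Meier curve for the 10-subject numerical example.
km_10 = [(22, 0.889), (29, 0.778), (46, 0.622), (61, 0.467)]

print(median_survival(km_13), median_survival(km_10))
```

Passing a different `level` gives other percentiles with the same rule, for example `level=0.75` for the time by which 25% have had the outcome.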
33
K-M: A Couple Of Notes
If the last time corresponds to an ‘event’, then S(t_last) must be 0.
–This does NOT mean that everyone in the group dies.
If the last time corresponds to a ‘censoring’, then S(t_last) will be non-zero.
–The mean survival time will be biased
    –Under-estimated
34
Numerical example

ID   Time (mons)   Censored
1    14            X
2    22
3    29
4    37            X
5    45            X
6    46
7    61
8    76            X
9    92            X
10   111           X
35
Actuarial Method
Columns: A = Year; B = # people under follow-up; C = # lost; D = # people dying in this year; E = Effective # at risk; F = Prob die in year; G = Prob survive this year; H = S(t).

A      B    C   D   E     F       G       H
0-1    10   0   0   10    0       1       1
1-2    10   1   1   9.5   0.105   0.895   0.895
2-3    8    0   1   8     0.125   0.875   0.783
3-4    7    2   1   6     0.167   0.833   0.652
4-5    4    0   0   4     0       1       0.652
5-6    4    0   1   4     0.25    0.75    0.489
6-7    3    1   0   2.5   0       1       0.489
7-8    2    1   0   1.5   0       1       0.489
8-9    1    0   0   1     0       1       0.489
9-10   1    1   0   0.5   0       1       0.489
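The interval counts can be derived from the individual follow-up times rather than tallied by hand. A minimal sketch, assuming 12-month intervals closed on the left ([0, 12), [12, 24), and so on) and the half-interval rule for losses; `actuarial_from_times` is a hypothetical name. It reproduces the S(t) column, and it places the subject censored at 111 months in the 9-10 year interval, since 111/12 is about 9.25 years.

```python
def actuarial_from_times(times, censored, width=12, n_intervals=10):
    """Bin follow-up times into fixed-width intervals and apply the
    actuarial method.

    Returns one row per interval:
    (interval index, n under follow-up, n lost, n deaths, effective n, S).
    """
    rows = []
    survival = 1.0
    n = len(times)
    for k in range(n_intervals):
        lo, hi = k * width, (k + 1) * width
        # Censoring flags of the subjects whose follow-up ends in this interval.
        in_bin = [c for t, c in zip(times, censored) if lo <= t < hi]
        lost = sum(1 for c in in_bin if c)
        deaths = len(in_bin) - lost
        effective = n - lost / 2
        if effective > 0:
            survival *= 1 - deaths / effective
        rows.append((k, n, lost, deaths, effective, survival))
        n -= lost + deaths
    return rows


times = [14, 22, 29, 37, 45, 46, 61, 76, 92, 111]           # months
censored = [True, False, False, True, True, False, False, True, True, True]
rows = actuarial_from_times(times, censored)
for k, n, lost, deaths, eff, s in rows:
    print(f"{k}-{k + 1}", n, lost, deaths, eff, round(s, 3))
```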
36
Kaplan-Meier method

‘i’   time   # deaths   # in risk set   Prob die in interval   Prob survive interval   S(t_i)
0     0      -          -               -                      -                       1.0
1     22     1          9               0.111                  0.889                   0.889
2     29     1          8               0.125                  0.875                   0.778
3     46     1          5               0.200                  0.800                   0.622
4     61     1          4               0.250                  0.750                   0.467

Note that the last three subjects were censored.
Final S(t) value is non-zero.
37
[Figure: Actuarial Curve]
38
[Figure: Kaplan-Meier Curve]
39
[Figure: Both Curves]
40
Estimating 75th percentile survival
[Figure: survival curve annotated with the values 0.75 and 45]