1 EPI 5344: Survival Analysis in Epidemiology
Quick Review and Intro to Smoothing Methods
March 4, 2014
Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa
01/2014

2 Objectives (for entire session)
Primary goal is to address two key concepts:
  – The role of hazard estimation in survival methods
  – Methods to compare two survival curves using non-parametric methods

3 Objectives (for entire session)
Review
  – Survival concepts
  – Hazard
Smoothing methods
Methods for estimation of hazard
Proportional hazards
Non-regression comparison of survival curves
  – Log-rank test
  – Variations of log-rank test
Relate Hazard/ID to person-time

4 Review Material, Session #1

5 Time ‘0’ (1)
Time is usually measured as ‘calendar time’
Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
Patient #4 enters on July 13, 2001 & dies on December 12, 2002
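
As a rough illustration of moving from calendar time to each patient's own time ‘0’, the sketch below computes follow-up time in days for patients 1, 2 and 4 from the dates listed above. Patient 3 is left out because the end date of the follow-up period is not given. The names `patients` and `follow_up_days` are illustrative only.

```python
from datetime import date

# Entry and exit dates for patients 1, 2 and 4, as listed on the slide.
# event = True means death; False means censored (lost to follow-up).
patients = {
    1: (date(2000, 2, 15), date(2000, 11, 8), True),
    2: (date(2000, 7, 2), date(2001, 4, 23), False),
    4: (date(2001, 7, 13), date(2002, 12, 12), True),
}


def follow_up_days(entry, exit_):
    """Days from a patient's own time '0' (entry) to death or censoring."""
    return (exit_ - entry).days


follow_up = {
    pid: (follow_up_days(entry, exit_), event)
    for pid, (entry, exit_, event) in patients.items()
}

for pid, (days, event) in follow_up.items():
    print(pid, days, "died" if event else "censored")
```

After this re-scaling every patient starts at time 0, whatever their calendar entry date.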

6 Study course for patients in cohort (figure not reproduced; axis labels: 2001, 2003, 2013)

8 Histogram of death time
- Skewed to right
- pdf or f(t)
- CDF or F(t)
- F(t) is the area under the ‘pdf’ from ‘0’ to ‘t’

9 Survival curves (3)
Plot % of group still alive (or % dead)
S(t) = survival curve = % still surviving at time ‘t’ = P(survive to time ‘t’)
Cumulative mortality = 1 – S(t) = F(t) = Cumulative incidence

10 Deaths CI(t) and Survival S(t) (figure not reproduced; curves of S(t) and 1-S(t) plotted against t)

11 Conditional Survival Curves
Essentially, you are re-scaling S(t) so that S*(t0) = 1.0, that is, S*(t) = S(t)/S(t0) for t ≥ t0

12 S*(t) = survival curve conditional on surviving to ‘t0’
CI*(t) = failure/death/cumulative incidence at ‘t’ conditional on surviving to ‘t0’
Hazard at t0 is defined as: ‘the slope of CI*(t) at t0’
Other names:
  – Hazard (instantaneous)
  – Force of Mortality
  – Incidence rate
  – Incidence density
Range: 0 → ∞
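
The definition of the hazard as a slope can be checked numerically. The sketch below assumes, purely for illustration, an exponential survival curve with a constant hazard `LAMBDA`. It builds CI*(t) = 1 – S(t)/S(t0) and estimates its slope at t0 by a forward difference. The model, the value 0.2 and all function names are assumptions, not part of the lecture.

```python
import math

LAMBDA = 0.2  # hypothetical constant hazard (per year), for illustration only


def survival(t):
    """S(t) under an assumed exponential model with constant hazard LAMBDA."""
    return math.exp(-LAMBDA * t)


def conditional_ci(t, t0):
    """CI*(t) = 1 - S(t)/S(t0): cumulative incidence given survival to t0."""
    return 1.0 - survival(t) / survival(t0)


def hazard(t0, dt=1e-6):
    """Numerical slope of CI*(t) at t0 (forward difference)."""
    return (conditional_ci(t0 + dt, t0) - conditional_ci(t0, t0)) / dt


h_at_3 = hazard(3.0)
print(h_at_3)  # close to LAMBDA for any t0 under this model
```

Unlike a probability, the slope is not bounded by 1, which is why the hazard ranges from 0 to ∞.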

13 Some relationships
H(t) is the cumulative hazard up to time ‘t’
If the rate of disease is small: CI(t) ≈ H(t)
If we also assume h(t) is constant (= ID), then H(t) = ID*t, so: CI(t) ≈ ID*t
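
To see when the approximation holds, the sketch below compares ID*t with the exact value from the standard relationship CI(t) = 1 – exp(–H(t)), taking H(t) = ID*t for a constant hazard. The incidence densities and the 5-year follow-up are hypothetical values chosen only to show where the approximation starts to break down.

```python
import math


def cumulative_incidence(incidence_density, t):
    """Exact CI(t) = 1 - exp(-H(t)), with H(t) = ID * t for a constant hazard."""
    return 1.0 - math.exp(-incidence_density * t)


t = 5.0  # hypothetical follow-up time in years
for incidence_density in (0.001, 0.01, 0.1, 0.5):  # events per person-year
    exact = cumulative_incidence(incidence_density, t)
    approx = incidence_density * t  # CI(t) ≈ H(t) = ID * t
    print(f"ID={incidence_density}: exact={exact:.4f}  approx={approx:.4f}")
```

For rare outcomes the two columns agree closely. For common outcomes ID*t can exceed 1, which a cumulative incidence never can.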

14 (figure not reproduced; labels: Year 0, Year 1, Year 2, Year 3; DEAD; p1 and 1-p1, p2 and 1-p2, p3 and 1-p3)

15 Actuarial Method
A = Year; B = # people under follow-up; C = # lost; D = # people dying in this year; E = Effective # at risk (B – C/2); F = Prob die in year; G = Prob survive this year; H = S(t)

A      B    C   D   E      F       G       H
0-1    10   0   0   10     0       1       1
1-2    10   1   1   9.5    0.105   0.895   0.895
2-3    8    0   1   8      0.125   0.875   0.783
3-4    7    2   1   6      0.167   0.833   0.652
4-5    4    0   0   4      0       1       0.652
5-6    4    0   1   4      0.25    0.75    0.489
6-7    3    1   0   2.5    0       1       0.489
7-8    2    1   0   1.5    0       1       0.489
8-9    1    1   0   0.5    0       1       0.489
9-10   0    0   0   0      0       1       0.489
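
The actuarial calculation can be sketched in a few lines. The code below takes the counts from the table (number under follow-up, number lost, number dying) and assumes the usual convention that subjects lost during an interval count as at risk for half of it, so the effective number at risk is the number under follow-up minus half the number lost. The function name and the handling of an empty interval (probability of death set to 0) are illustrative choices.

```python
# (interval, # under follow-up, # lost, # dying), taken from the slide's table
intervals = [
    ("0-1", 10, 0, 0), ("1-2", 10, 1, 1), ("2-3", 8, 0, 1), ("3-4", 7, 2, 1),
    ("4-5", 4, 0, 0), ("5-6", 4, 0, 1), ("6-7", 3, 1, 0), ("7-8", 2, 1, 0),
    ("8-9", 1, 1, 0), ("9-10", 0, 0, 0),
]


def actuarial_life_table(rows):
    """Return (interval, effective_n, q, p, S) tuples for the actuarial method."""
    table = []
    surv = 1.0
    for label, n_start, n_lost, n_died in rows:
        # Subjects lost during the interval count as at risk for half of it
        effective_n = n_start - n_lost / 2
        # Probability of dying in the interval (0 if nobody is at risk)
        q = n_died / effective_n if effective_n > 0 else 0.0
        p = 1.0 - q
        surv *= p  # cumulative survival to the end of the interval
        table.append((label, effective_n, q, p, surv))
    return table


life_table = actuarial_life_table(intervals)
for label, effective_n, q, p, surv in life_table:
    print(f"{label:>5} {effective_n:5.1f} {q:6.3f} {p:6.3f} {surv:6.3f}")
```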

16 Kaplan-Meier method
‘i’   time   # deaths   # in risk set   Prob die in interval   Prob survive interval   S(t_i)
0     0      -          -               -                      -                       1.0
1     22     1          9               0.111                  0.889                   0.889
2     29     1          8               0.125                  0.875                   0.778
3     46     1          5               0.200                  0.800                   0.622
4     61     1          4               0.250                  0.750                   0.467
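
A minimal sketch of the product-limit calculation, using only the death times, numbers of deaths and risk-set sizes shown in the table. Individual censoring times are not given, so the code works from the risk sets rather than from raw follow-up data. The names `death_times` and `kaplan_meier` are illustrative.

```python
# (death time, # deaths, # in risk set) at each distinct death time, from the slide
death_times = [(22, 1, 9), (29, 1, 8), (46, 1, 5), (61, 1, 4)]


def kaplan_meier(rows):
    """Product-limit estimate: S(t_i) = S(t_{i-1}) * (1 - d_i / n_i)."""
    curve = [(0, 1.0)]  # everyone is alive at time 0
    surv = 1.0
    for time, deaths, at_risk in rows:
        surv *= 1.0 - deaths / at_risk
        curve.append((time, surv))
    return curve


km_curve = kaplan_meier(death_times)
for time, surv in km_curve:
    print(f"t={time:>3}  S(t)={surv:.3f}")
```

The estimate only changes at the observed death times. Censored subjects affect it through the size of the risk set.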

17 END OF REVIEW MATERIAL

18 Smoothing methods
Naïve non-parametric regression ‘windows’
Sliding windows
Local averaging
Kernel estimation

26 Sliding windows (1)
The divisions we used created five ‘windows’ into the data.
  – Within each window, we computed the mean ‘X’ and ‘Y’ and plotted that point for the regression line
Why do we need to make the windows ‘fixed’?
  – Define the width of a window
  – Slide it from left to right
  – Compute the ‘window-specific data point’ and plot as before.
The essence of ‘smoothing’.
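
The sliding-window idea can be sketched as below. A window of fixed width is moved from left to right in steps, and the mean ‘X’ and mean ‘Y’ of the points inside it give one plotted point per window. The function name, the `step` parameter and the example data are assumptions made for illustration. Setting `step` equal to `width` gives the original fixed, non-overlapping windows.

```python
def sliding_window_means(xs, ys, width, step):
    """Slide a fixed-width window left to right; return (mean x, mean y) per window."""
    points = []
    left = min(xs)
    while left <= max(xs):
        # Points falling inside the current window [left, left + width)
        inside = [(x, y) for x, y in zip(xs, ys) if left <= x < left + width]
        if inside:  # skip windows that contain no data
            mean_x = sum(x for x, _ in inside) / len(inside)
            mean_y = sum(y for _, y in inside) / len(inside)
            points.append((mean_x, mean_y))
        left += step
    return points


# Hypothetical data on the straight line y = 2x, so window means fall on the line
xs = [float(i) for i in range(10)]
ys = [2.0 * x for x in xs]
smoothed = sliding_window_means(xs, ys, width=4.0, step=1.0)
print(smoothed)
```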

28 Sliding windows (2)
The size of the window is a ‘tuning parameter’.
  – Fixed number of neighboring data points
  – Fixed width: include all points inside
Large windows tend to ‘over-smooth’
Small windows do little smoothing and show the random noise.
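
The effect of the tuning parameter can be illustrated with the ‘fixed number of neighboring data points’ option. The sketch below averages the y-values of the k nearest neighbours of each x. The data (a trend y = x with alternating +1/-1 ‘noise’) and the function name are hypothetical, chosen so the result is easy to check by hand.

```python
def knn_window_smooth(xs, ys, k):
    """For each x, average the y-values of its k nearest neighbours (itself included)."""
    smoothed = []
    for x0 in xs:
        # Sort all points by distance from x0 and keep the k closest
        nearest = sorted(zip(xs, ys), key=lambda point: abs(point[0] - x0))[:k]
        smoothed.append(sum(y for _, y in nearest) / k)
    return smoothed


# Hypothetical data: the trend y = x plus alternating +1/-1 'noise'
xs = [float(i) for i in range(21)]
ys = [x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]

small_window = knn_window_smooth(xs, ys, k=1)  # no smoothing: reproduces the noise
large_window = knn_window_smooth(xs, ys, k=5)  # most of the noise is averaged away

print(ys[10], small_window[10], large_window[10])
```

With k = 1 the smoother simply returns the data. With k = 5 the value at x = 10 moves from 11 to within 0.2 of the trend. The trend here is linear, so this example cannot show over-smoothing, which appears when a large window flattens real curvature.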

29 Window-specific data point (1)
Many ways to compute the representative data point for the window:
  – X-value
      Mean of the x’s in window
      Median of the x’s in window
      Define window around a specific data point and use that x-value
  – Y-value
      Mean of the y’s in the window
      Median of the y’s in the window
      Do a regression (linear, quadratic or cubic) of data points in window
        – use the predicted ‘y’ for the selected ‘x’
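
The options listed above can be sketched as a single helper that turns the points in one window into a representative point. Only the linear case of the regression option is shown. The function name, the `method` argument and the example window are assumptions for illustration.

```python
from statistics import mean, median


def window_point(window, x0=None, method="mean"):
    """Summarise the (x, y) points in one window as a single representative point.

    method: 'mean', 'median', or 'linear' (least-squares line evaluated at x0,
    which must be supplied for that method).
    """
    xs = [x for x, _ in window]
    ys = [y for _, y in window]
    if method == "mean":
        return mean(xs), mean(ys)
    if method == "median":
        return median(xs), median(ys)
    if method == "linear":
        if x0 is None:
            raise ValueError("x0 is required for method='linear'")
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in window)
        slope = sxy / sxx
        # Predicted y at the selected x from the fitted line
        return x0, y_bar + slope * (x0 - x_bar)
    raise ValueError(f"unknown method: {method}")


# Hypothetical window containing one outlying y-value
window = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 40.0)]
print(window_point(window, method="mean"))
print(window_point(window, method="median"))
print(window_point(window, x0=3.0, method="linear"))
```

In this example the mean is pulled up by the outlier while the median is not, which is one reason to consider the median within a window.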

31 Window-specific data point (2)
Can ‘weight’ data points
  – Points closer to the middle should provide more information about the true (x,y) than those further away.
The weights are called a ‘kernel’.
The method is called ‘Kernel Smoothing’

32 Window-specific data point (3)
Many weight functions (kernels) can be used. A common one is the tricube weight.
Select an ‘x_i’
  – Define the window around x_i to get points inside window
  – For each point inside the window, let ‘z_ij’ measure how far the point ‘x_ij’ is from the left boundary of the window towards the right boundary
  – -1 means on the left boundary
  – +1 means on the right boundary
  – Then the weight for that point is given by: w_ij = (1 – |z_ij|^3)^3 for |z_ij| < 1, and 0 otherwise
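
A small sketch of the tricube weight, (1 – |z|^3)^3 inside the window and 0 on or outside its boundaries. It assumes a symmetric window centred on x_i, so that z is the signed distance from x_i divided by the half-width of the window. The function names and the example x-values are illustrative.

```python
def tricube(z):
    """Tricube kernel: (1 - |z|^3)^3 inside the window, 0 on or beyond the boundary."""
    return (1.0 - abs(z) ** 3) ** 3 if abs(z) < 1.0 else 0.0


def window_weights(xs, x_i, half_width):
    """Tricube weights for each x, for a symmetric window centred on x_i (assumed)."""
    weights = []
    for x in xs:
        z = (x - x_i) / half_width  # -1 = left boundary, 0 = centre, +1 = right boundary
        weights.append(tricube(z))
    return weights


xs = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical x-values
weights = window_weights(xs, x_i=2.0, half_width=2.0)
print(weights)
```

The weight is 1 at the centre, falls off slowly at first and then quickly, and reaches 0 at the boundaries, so points near x_i dominate the window-specific estimate.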

34 LOWESS
LOWESS = LOcally WEighted Scatterplot Smoothing
Use the above procedure, but compute a linear regression of ‘y’ on ‘x’ and use the regression equation to estimate ‘y_i’ for the given ‘x_i’
Implemented in SAS as PROC LOESS
  – Available through ODS Graphics and elsewhere
Can use a higher order polynomial regression instead of the linear model
  – Linear model is usually OK
‘Tuning’ is done by varying the percentage of the data set included in the window.
  – Empirical judgement/‘feel’ is usually the ‘best’ guide for choosing the tuning
  – Some statistics are available (e.g. residuals) but that is advanced material
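
The procedure can be sketched in plain Python as below. This is a simplified illustration, not PROC LOESS. It fits a tricube-weighted straight line of y on x around each x_i, with the window set by the fraction `frac` of the data set, and it omits the robustness iterations that full LOWESS implementations usually add. It assumes the x-values are distinct. The function names and example data are hypothetical.

```python
import math


def tricube(z):
    """Tricube kernel: (1 - |z|^3)^3 inside the window, 0 otherwise."""
    return (1.0 - abs(z) ** 3) ** 3 if abs(z) < 1.0 else 0.0


def lowess_sketch(xs, ys, frac=0.5):
    """Locally weighted linear regression of y on x, evaluated at each x_i.

    frac is the tuning parameter: the fraction of the data set in each window.
    Assumes distinct x-values; no robustness iterations (a simplification).
    """
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # number of points per window
    fitted = []
    for x_i in xs:
        # Half-width of the window: distance to the k-th nearest neighbour
        half_width = sorted(abs(x - x_i) for x in xs)[k - 1]
        w = [tricube((x - x_i) / half_width) for x in xs]
        sw = sum(w)
        x_bar = sum(wi * x for wi, x in zip(w, xs)) / sw
        y_bar = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Predicted y at x_i from the weighted regression line
        fitted.append(y_bar + slope * (x_i - x_bar))
    return fitted


# Hypothetical data: trend y = 0.5x plus alternating +1/-1 'noise'
xs = [float(i) for i in range(20)]
ys = [0.5 * x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]
fitted = lowess_sketch(xs, ys, frac=0.4)
print([round(f, 2) for f in fitted])
```

Raising `frac` widens the window and gives a smoother curve. Lowering it lets the fitted values follow the data more closely.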
