
EPI 5344: Survival Analysis in Epidemiology
Quick Review and Intro to Smoothing Methods
March 4, 2014
Dr. N. Birkett, Department of Epidemiology & Community Medicine, University of Ottawa

Objectives (for entire session)
Primary goal is to address two key concepts:
– Role of hazard estimation in survival methods
– Methods to compare two survival curves using non-parametric methods

Objectives (for entire session)
Review
– Survival concepts
– Hazard
Smoothing methods
Methods for estimation of hazard
Proportional hazards
Non-regression comparison of survival curves
– Log-rank test
– Variations of log-rank test
Relate Hazard/ID to person-time

Review Material, Session #1

Time ‘0’ (1)
Time is usually measured as ‘calendar time’
Patient #1 enters on Feb 15, 2000 & dies on Nov 8, 2000
Patient #2 enters on July 2, 2000 & is lost (censored) on April 23, 2001
Patient #3 enters on June 5, 2001 & is still alive (censored) at the end of the follow-up period
Patient #4 enters on July 13, 2001 & dies on December 12, 2002

[Figure: Study course for patients in cohort; calendar-time axis marked 2001, 2003, 2013]


Histogram of death time
– Skewed to right
– pdf or f(t)
– CDF or F(t)
– Area under ‘pdf’ from ‘0’ to ‘t’
[Figure: histogram of death times; axis labels t and F(t)]

Survival curves (3)
Plot % of group still alive (or % dead)
S(t) = survival curve = % still surviving at time ‘t’ = P(survive to time ‘t’)
Mortality rate = 1 – S(t) = F(t) = Cumulative incidence

[Figure: Deaths CI(t) and Survival S(t) plotted against t; curves labelled S(t) and 1 – S(t)]

Conditional Survival Curves
Essentially, you are re-scaling S(t) so that S*(t_0) = 1.0

S*(t) = survival curve conditional on surviving to ‘t_0’
CI*(t) = failure/death/cumulative incidence at ‘t’ conditional on surviving to ‘t_0’
Hazard at t_0 is defined as: ‘the slope of CI*(t) at t_0’
Hazard (instantaneous)
Force of Mortality
Incidence rate
Incidence density
Range: 0 → ∞
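The re-scaling and the slope definition above can be checked numerically. This is a minimal sketch, not part of the original slides: it assumes an exponential survival curve with a hypothetical constant rate of 0.2 per year, and the function names `conditional_survival` and `hazard_at` are made up for illustration.

```python
import math

def conditional_survival(S, t, t0):
    """S*(t): probability of surviving to t given survival to t0 (t >= t0)."""
    return S(t) / S(t0)

def hazard_at(S, t0, dt=1e-6):
    """Approximate the hazard at t0 as the slope of CI*(t) at t0.

    CI*(t) = 1 - S*(t) and CI*(t0) = 0, so a forward difference over a
    small step dt is CI*(t0 + dt) / dt.
    """
    ci_star = 1.0 - conditional_survival(S, t0 + dt, t0)
    return ci_star / dt

# Hypothetical survival curve: exponential with a constant rate of 0.2 per year.
rate = 0.2
S = lambda t: math.exp(-rate * t)

print(conditional_survival(S, 3.0, 3.0))  # re-scaled curve starts at 1.0
print(hazard_at(S, 3.0))                  # close to the assumed rate
print(hazard_at(S, 8.0))                  # same rate at a later t0 (constant hazard)
```

For this assumed curve the hazard is the same at every t_0, which is the constant-hazard case; a different S(t) would give a hazard that changes with t_0.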

Some relationships
If the rate of disease is small: CI(t) ≈ H(t)
If we assume h(t) is constant (= ID): CI(t) ≈ ID*t
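A short numerical check of these approximations, using the standard identity CI(t) = 1 – exp(–H(t)), where H(t) is the cumulative hazard. The value of ID below is a hypothetical constant incidence density chosen only for illustration.

```python
import math

def cumulative_incidence(H):
    """CI(t) = 1 - S(t) = 1 - exp(-H(t)), with H(t) the cumulative hazard."""
    return 1.0 - math.exp(-H)

ID = 0.01  # hypothetical constant incidence density, per person-year

# With a constant hazard, H(t) = ID * t.
for t in (1, 5, 10, 50, 100):
    H = ID * t
    ci = cumulative_incidence(H)
    print(f"t={t:>3}  ID*t={H:.3f}  CI(t)={ci:.4f}  difference={H - ci:.4f}")
```

The difference between ID*t and CI(t) is negligible while H(t) is small and grows as H(t) gets larger, which is why the approximation is stated only for rare disease.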

[Figure: probability tree from Year 0 to Year 3; in year i a subject dies (DEAD) with probability p_i or survives with probability 1 – p_i, for i = 1, 2, 3]

Actuarial Method
Columns: A = Year; B = # people under follow-up; C = # lost; D = # people dying in this year; E = Effective # at risk; F = Prob die in year; G = Prob survive this year; H = S(t)

A      B    C   D   E     F       G       H
0-1    10   0   0   10    0       1       1
1-2    10   1   1   9.5   0.105   0.895   0.895
2-3    8    0   1   8     0.125   0.875   0.783
3-4    7    2   1   6     0.167   0.833   0.652
4-5    4    0   0   4     0       1       0.652
5-6    4    0   1   4     0.25    0.75    0.489
6-7    3    1   0   2.5   0       1       0.489
7-8    2    1   0   1.5   0       1       0.489
8-9    1    1   0   0.5   0       1       0.489
9-10   0    0   0   0     0       1       0.489
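The table can be reproduced with a few lines of code. This is a sketch under the usual actuarial assumption that subjects lost during a year contribute half a year at risk, so Effective # at risk = B – C/2; the function name `actuarial_table` is hypothetical, and the `lost` and `died` lists are read off columns C and D.

```python
def actuarial_table(n_start, lost, died):
    """Return one row per year: (B, C, D, E, F, G, H) as in the table columns."""
    rows = []
    n = n_start      # B: people under follow-up at the start of the year
    s = 1.0          # H: cumulative survival S(t)
    for w, d in zip(lost, died):
        effective = n - w / 2                        # E: lost subjects count as half
        q = d / effective if effective > 0 else 0.0  # F: prob. of dying in the year
        p = 1.0 - q                                  # G: prob. of surviving the year
        s *= p                                       # H: product of yearly survival
        rows.append((n, w, d, effective, q, p, s))
        n -= w + d                                   # carry survivors to next year
    return rows

# Columns C and D from the table above (years 0-1 through 9-10).
lost = [0, 1, 0, 2, 0, 0, 1, 1, 1, 0]
died = [0, 1, 1, 1, 0, 1, 0, 0, 0, 0]

for year, row in enumerate(actuarial_table(10, lost, died)):
    n, w, d, eff, q, p, s = row
    print(f"{year}-{year + 1}: B={n} C={w} D={d} E={eff} F={q:.3f} G={p:.3f} H={s:.3f}")
```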

Kaplan-Meier method

‘i’   time   # deaths   # in risk set   Prob die in interval   Prob survive interval   S(t_i)
0     0      -          -               -                      -                       1.0
1     22     1          9               0.111                  0.889                   0.889
2     29     1          8               0.125                  0.875                   0.778
3     46     1          5               0.200                  0.800                   0.622
4     61     1          4               0.250                  0.750                   0.467
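The Kaplan-Meier column S(t_i) is the running product of the interval survival probabilities, 1 – (deaths / risk set). A minimal sketch that reproduces the table; the function name `kaplan_meier` and the tuple layout are illustrative choices, with the numbers taken from the table.

```python
def kaplan_meier(events):
    """events: list of (time, deaths, n_in_risk_set), ordered by time.

    Returns a list of (time, prob_die, prob_survive, S) tuples.
    """
    s = 1.0
    out = []
    for time, deaths, at_risk in events:
        q = deaths / at_risk   # prob. of dying at this event time
        p = 1.0 - q            # prob. of surviving past this event time
        s *= p                 # S(t_i) = S(t_{i-1}) * p
        out.append((time, q, p, s))
    return out

# Death times, deaths and risk-set sizes from the table above.
events = [(22, 1, 9), (29, 1, 8), (46, 1, 5), (61, 1, 4)]

for time, q, p, s in kaplan_meier(events):
    print(f"t={time}: die={q:.3f} survive={p:.3f} S(t)={s:.3f}")
```

The risk set shrinks by more than the number of deaths between some rows (8 to 5, for example) because censored subjects leave the risk set without counting as deaths.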

END OF REVIEW MATERIAL

Smoothing methods
Naïve non-parametric regression ‘windows’
Sliding windows
Local averaging
Kernel estimation


Sliding windows (1)
The divisions we used created five ‘windows’ into the data.
– Within each window, we computed the mean ‘X’ and ‘Y’ and plotted that point for the regression line
Why do we need to make the windows ‘fixed’?
– Define the width of a window
– Slide it from left to right
– Compute the ‘window-specific data point’ and plot as before.
The essence of ‘smoothing’.
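The sliding-window steps can be sketched as follows. This is an illustration only: the function name `sliding_window_means`, the `step` parameter (how far the window moves each time) and the toy data are assumptions, and the window-specific point is taken to be the mean x and mean y.

```python
def sliding_window_means(xs, ys, width, step):
    """Slide a fixed-width window from left to right over the x-axis.

    For each window position [left, left + width) that contains data,
    return the window-specific point (mean x, mean y).
    """
    points = []
    left = min(xs)
    x_max = max(xs)
    while left <= x_max:
        inside = [(x, y) for x, y in zip(xs, ys) if left <= x < left + width]
        if inside:  # skip empty windows
            mean_x = sum(x for x, _ in inside) / len(inside)
            mean_y = sum(y for _, y in inside) / len(inside)
            points.append((mean_x, mean_y))
        left += step
    return points

# Toy data: a noisy upward trend (made-up values).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.3, 1.8, 3.9, 3.1, 5.6, 4.9, 7.2, 6.8, 9.1, 8.7]

for x, y in sliding_window_means(xs, ys, width=3, step=1):
    print(f"({x:.2f}, {y:.2f})")
```

Setting `step` equal to `width` gives the non-overlapping ‘fixed’ windows; a smaller `step` gives overlapping windows and a smoother sequence of plotted points.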


Sliding windows (2)
The size of the window is a ‘tuning parameter’.
– Fixed number of neighboring data points
– Fixed width: include all points inside
Large windows tend to ‘over-smooth’
Small windows do little smoothing and show the random noise.
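The two ways of defining a window can behave differently where the data are sparse. A small sketch, with hypothetical function names and made-up x-values, that returns the indices of the points falling inside a window centred at `x0`:

```python
def window_fixed_width(xs, x0, half_width):
    """Indices of all points within half_width of x0 (fixed-width window)."""
    return [i for i, x in enumerate(xs) if abs(x - x0) <= half_width]

def window_nearest(xs, x0, k):
    """Indices of the k points closest to x0 (fixed number of neighbours)."""
    by_distance = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return sorted(by_distance[:k])

# Made-up x-values: dense on the left, one isolated point on the right.
xs = [0, 1, 2, 3, 10]

print(window_fixed_width(xs, 2, 1))   # dense region: several points
print(window_fixed_width(xs, 10, 1))  # sparse region: only the point itself
print(window_nearest(xs, 10, 3))      # always 3 points, but the window is wide
```

A fixed-width window may contain very few points in sparse regions, while a fixed-count window keeps the same number of points but stretches over a wider range of x.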

Window-specific data point (1)
Many ways to compute the representative data point for the window:
– X-value
  Mean of the x’s in window
  Median of the x’s in window
  Define window around a specific data point and use that x-value
– Y-value
  Mean of the y’s in the window
  Median of the y’s in the window
  Do a regression (linear, quadratic or cubic) of data points in window
  – use the predicted ‘y’ for the selected ‘x’
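The three Y-value options can be compared on a single window. This sketch uses an ordinary least-squares straight line for the regression option; the function name `local_linear_prediction` and the window data (including the deliberate outlier) are made up for illustration.

```python
from statistics import mean, median

def local_linear_prediction(xs, ys, x0):
    """Fit y = a + b*x by least squares to the window's points and
    return the fitted y at the selected x-value x0."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    if sxx == 0:  # all x-values identical: fall back to the mean y
        return ybar
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar + b * (x0 - xbar)

# One hypothetical window of five points; the last y is an outlier.
win_x = [1.0, 2.0, 3.0, 4.0, 5.0]
win_y = [2.1, 3.9, 6.2, 7.8, 30.0]

print("mean y:      ", mean(win_y))
print("median y:    ", median(win_y))
print("regression y:", local_linear_prediction(win_x, win_y, 3.0))
```

The median is the least affected by the outlier; the mean and the regression prediction are both pulled upward by it.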


Window-specific data point (2)
Can ‘weight’ data points
– Points closer to the middle should provide more information about the true (x,y) than those further away.
The weights are called a ‘kernel’.
The method is called ‘Kernel Smoothing’

Window-specific data point (3)
Many weight functions (kernels) can be used.
A common one is the tricube weight
Select an ‘x_i’
– Define the window around x_i to get points inside window
– For each point inside the window let ‘z_ij’ measure how far the point ‘x_ij’ is from the left boundary of the window towards the right boundary
– -1 means on the left boundary
– +1 means on the right boundary
– Then the weight for that point is given by: w_ij = (1 – |z_ij|^3)^3
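A sketch of the tricube weight, using its standard form w = (1 – |z|^3)^3 for |z| < 1 and 0 otherwise. The function names and the window boundaries below are hypothetical; z is computed so that -1 falls on the left boundary, 0 at the centre and +1 on the right boundary.

```python
def tricube(z):
    """Tricube kernel: (1 - |z|^3)^3 inside the window, 0 on or beyond the boundary."""
    az = abs(z)
    return (1 - az ** 3) ** 3 if az < 1 else 0.0

def window_weights(xs_in_window, left, right):
    """Weights for points in a window [left, right], centred at its midpoint."""
    centre = (left + right) / 2
    half_width = (right - left) / 2
    return [tricube((x - centre) / half_width) for x in xs_in_window]

# Hypothetical window from x = 2 to x = 6 (centre at x = 4).
for x, w in zip([2, 3, 4, 5, 6], window_weights([2, 3, 4, 5, 6], 2, 6)):
    print(f"x={x}  weight={w:.3f}")
```

The weight is 1 at the centre of the window and falls smoothly to 0 at both boundaries, so points near the middle dominate the window-specific estimate.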


LOWESS
LOWESS = LOcally WEighted Scatterplot Smoothing
Use above procedure but compute a linear regression of ‘y’ on ‘x’ and use the regression equation to estimate ‘y_i’ for given ‘x_i’
Implemented in SAS as a PROC (LOESS)
– Available through ODS Graphics and elsewhere
Can use a higher order polynomial regression instead of the linear model
– Linear model is usually OK
‘Tuning’ done by varying the percentage of the data set included in the window.
– Empirical judgement/feel is ‘best’ for choosing the tuning
– Some statistics are available (e.g. residuals) but that is advanced material
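A bare-bones sketch of the LOWESS idea in plain Python, not a substitute for PROC LOESS. It assumes a window made of the nearest `frac` fraction of the data, tricube weights scaled by the distance to the farthest point in the window, and a weighted straight-line fit of y on x; it omits the robustness iterations that full implementations usually add. The name `lowess_sketch` and the data are made up for illustration.

```python
import math

def tricube(z):
    az = abs(z)
    return (1 - az ** 3) ** 3 if az < 1 else 0.0

def lowess_sketch(xs, ys, frac=0.5):
    """Return a smoothed y for each x using locally weighted linear regression."""
    n = len(xs)
    k = min(n, max(2, math.ceil(frac * n)))  # number of points in each window
    fitted = []
    for x0 in xs:
        # Window: the k points nearest to x0.
        nearest = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
        h = abs(xs[nearest[-1]] - x0) or 1.0  # window half-width (avoid 0)
        w = [tricube((xs[i] - x0) / h) for i in nearest]
        wx = [xs[i] for i in nearest]
        wy = [ys[i] for i in nearest]
        # Weighted least-squares line of y on x within the window.
        sw = sum(w)
        xbar = sum(wi * x for wi, x in zip(w, wx)) / sw
        ybar = sum(wi * y for wi, y in zip(w, wy)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, wx))
        if sxx == 0:
            fitted.append(ybar)  # no spread in x: use the weighted mean
        else:
            b = sum(wi * (x - xbar) * (y - ybar)
                    for wi, x, y in zip(w, wx, wy)) / sxx
            fitted.append(ybar + b * (x0 - xbar))
    return fitted

# Made-up noisy data around an upward trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.3, 1.8, 3.9, 3.1, 5.6, 4.9, 7.2, 6.8, 9.1, 8.7]

for frac in (0.3, 0.8):
    print(frac, [round(v, 2) for v in lowess_sketch(xs, ys, frac)])
```

Comparing the two values of `frac` shows the tuning effect: the smaller fraction follows the noise more closely, while the larger fraction gives a smoother curve.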
