1 EPI 5344: Survival Analysis in Epidemiology
Testing the Proportional Hazard Assumption
April 5, 2016
Dr. N. Birkett, School of Epidemiology, Public Health & Preventive Medicine, University of Ottawa
03/2016

2 Objectives
Background
- Residuals in Cox regression
- The ‘STRATA’ statement in PHREG
Graphical approaches to PH testing
Model-based approaches to PH testing
The ‘ASSESS’ option in SAS

3 Background
Residuals in Cox regression
The ‘STRATA’ statement in PHREG

4 Residuals for Cox (1)
Residuals in linear regression measure how far the fitted model lies from the observed data points: $e_i = y_i - \hat{y}_i$
That doesn’t work for Cox because we have no ‘y’
Two alternative types of residuals are used
- Individual
- Covariate-wise

5 Residuals for Cox (2)
Individual Residuals
- Each subject has one residual
  - Computed at the time they leave the study
  - Value obtained for the time when an event happens to the subject
- At least three variants
  - Cox-Snell
  - Deviance
  - Martingale (the most useful)

6 Residuals for Cox (3)
Martingale Residuals
- Related to counting process methods
- For the Cox model, we get: $N(t) = H(t,x) + M(t)$
- N(t) is # of events at time ‘t’
- H(t,x) is cumulative hazard
- M(t) is like an error term
  - Mathematically called a Martingale

7 Residuals for Cox (4)
Martingale Residuals
- Residual is defined for a subject at the time they leave the study as: $\hat{M} = N(t) - \hat{H}(t,x)$
Generally, these are not widely used
- Will return to them when we discuss the ASSESS statement
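As a rough illustration of the definition above, the sketch below computes a martingale residual as the observed event indicator minus the expected number of events at the time the subject leaves the study. It assumes the estimated cumulative hazard factors as a baseline cumulative hazard times exp(linear predictor), as in a Cox model. The function name and all numbers are hypothetical and are not taken from the course data.

```python
import math


def martingale_residuals(events, baseline_cum_hazard, linear_predictors):
    """Martingale residual per subject: observed events minus expected events.

    events[i]              -- 1 if subject i had the event, 0 if censored
    baseline_cum_hazard[i] -- estimated H0 at subject i's exit time
    linear_predictors[i]   -- beta * x for subject i
    """
    residuals = []
    for observed, h0, lp in zip(events, baseline_cum_hazard, linear_predictors):
        expected = h0 * math.exp(lp)  # H(t, x) = H0(t) * exp(beta * x)
        residuals.append(observed - expected)
    return residuals


# Hypothetical inputs for three subjects (illustration only).
events = [1, 0, 1]
h0_at_exit = [0.2, 0.5, 0.9]
linear_predictors = [0.0, 0.0, 0.0]

res = martingale_residuals(events, h0_at_exit, linear_predictors)
print(res)
```

A residual can never exceed 1, because at most one event is observed per subject while the expected count is non-negative; censored subjects always get a residual of zero or less.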

8 Residuals for Cox (5)
Schoenfeld Residuals
- One residual for each covariate at each event time
- Schoenfeld residuals are computed at each event time
- Only subject(s) getting the outcome at that time have values for Schoenfeld residuals
  - Censored subjects have missing values for Schoenfeld residuals
- Based on the expected value for each of the covariates at the point when an event happens

9 Residuals for Cox (6)
Schoenfeld Residuals
- At each event point, every subject in the risk set has a probability that they would have had an event
- Use these probabilities to determine the expected value of each of the covariates at that point in time
- The residual for this covariate for each subject having an event at this time point is:
  - The covariate value for the subject who had the event minus this expected value

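10 Residuals for Cox (7)
Schoenfeld Residuals (cont)
- For now, assume only one event at each time point
- At event time ‘t’, let subject j* be the one who had the event
- Consider any subject ‘j’ who is still in the risk set at ‘t’. From earlier classes, we have: $p_j(t) = \dfrac{\exp(\beta' x_j)}{\sum_{k \in R(t)} \exp(\beta' x_k)}$, the probability that ‘j’ is the subject who has the event, where R(t) is the risk set at ‘t’

11 Residuals for Cox (8)
Schoenfeld Residuals (cont)
- Compute the expected value for covariate ‘i’ at time ‘t’: $\bar{x}_i(t) = \sum_{j \in R(t)} p_j(t)\, x_{ij}$
- The Schoenfeld residual for covariate ‘i’ at time ‘t’ is: $r_i(t) = x_{ij^*} - \bar{x}_i(t)$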

12 Residuals for Cox (9)
Note that there is one residual for:
- each covariate
- at the risk set time point
- for the subject(s) having the event at that time
Differs from linear regression, which has one residual for:
- each subject
If there is more than one event at the time point
- Compute one residual for each subject
Expected value is ‘0’ under PH assumption

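13 A worked example
Cox model: $h(t \mid x) = h_0(t)\exp(\beta_1 x_1 + \beta_2 x_2)$
At time ‘t’:
- 3 subjects remain in the risk set
- covariates are given in the table
- Subject #2 has event
Table (ID: x1, x2)
1: 0.3, 0.4
2: 0.2 (one value missing), Event
3: 0.5, 0.1

14 Compute probability that each subject has an event at ‘t’: $p_j = \dfrac{\exp(\beta_1 x_{1j} + \beta_2 x_{2j})}{\sum_{k=1}^{3} \exp(\beta_1 x_{1k} + \beta_2 x_{2k})}$ for j = 1, 2, 3

15 Expected value of x1 at ‘t’ is: $\bar{x}_1 = \sum_{j=1}^{3} p_j\, x_{1j}$
Schoenfeld residual for x1 at ‘t’ is: $x_{12} - \bar{x}_1$ (subject #2 had the event)

16 Expected value of x2 at ‘t’ is: $\bar{x}_2 = \sum_{j=1}^{3} p_j\, x_{2j}$
Schoenfeld residual for x2 at ‘t’ is: $x_{22} - \bar{x}_2$ (subject #2 had the event)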
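The worked example can be traced numerically with a short sketch. The model coefficients and one of subject #2's covariate values are not preserved in this transcript, so the values `betas = [1.0, 2.0]` and `x1 = 0.1` for subject #2 below are assumptions chosen only to make the arithmetic concrete; the function name is also hypothetical.

```python
import math


def schoenfeld_residuals(covariates, betas, event_index):
    """Schoenfeld residuals at one event time with a single event.

    covariates  -- one row of covariate values per subject in the risk set
    betas       -- Cox regression coefficients, one per covariate
    event_index -- row index of the subject who had the event
    """
    # Relative risk score exp(beta' x) for every subject in the risk set.
    scores = [math.exp(sum(b * x for b, x in zip(betas, row))) for row in covariates]
    total = sum(scores)
    # Probability that each subject is the one who has the event.
    probs = [s / total for s in scores]
    # Probability-weighted expected value of each covariate.
    expected = [
        sum(p * row[i] for p, row in zip(probs, covariates))
        for i in range(len(betas))
    ]
    # Residual = observed covariate of the event subject minus expected value.
    residuals = [covariates[event_index][i] - expected[i] for i in range(len(betas))]
    return probs, expected, residuals


# Risk set at time t: subjects 1, 2, 3 as rows of (x1, x2).
# ASSUMPTION: subject 2's x1 = 0.1 is made up; the slide value is not recoverable.
risk_set = [[0.3, 0.4], [0.1, 0.2], [0.5, 0.1]]
betas = [1.0, 2.0]  # ASSUMPTION: hypothetical coefficients

probs, expected, residuals = schoenfeld_residuals(risk_set, betas, event_index=1)
print("probabilities:", probs)
print("expected x1, x2:", expected)
print("Schoenfeld residuals:", residuals)
```

Setting every coefficient to zero makes all subjects equally likely to have the event, so the expected value collapses to the plain average of the covariate over the risk set. That is a convenient sanity check on any implementation.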

17 The ‘STRATA’ Statement (1)
Proc LIFETEST has a ‘strata’ statement
- Used to define two (or more) groups for the log-rank test.
- Produces one S(t) curve for each level of the stratification variable.
- Plot log(-log(S(t))) vs. ‘t’ to check PH (more later)
PHREG also has a STRATA statement
- Useful for ‘adjusting out’ variables which do not meet the PH assumption and which aren’t of interest to us.

18 The ‘STRATA’ Statement (2)
Effectively, fits a separate model in each stratum, with a different baseline hazard: $h_1(t \mid x) = h_{01}(t)\exp(\beta x)$ and $h_2(t \mid x) = h_{02}(t)\exp(\beta x)$
The ‘beta’ is constrained to be the same in the 2 models
Can combine with the BASELINE statement to give a graphic test of the PH assumption

19 End of Background

20 Testing PH (1)
We will explore 2 graphical methods and 1 modeling approach

21 Testing PH (1)
Graphical method #1
Plot: log(-log(S(t))) vs. log(t)
- Can also plot against just ‘t’
Consider two groups which satisfy the PH assumption: $h_2(t) = \lambda h_1(t)$, so $H_2(t) = \lambda H_1(t)$

22 But: $S(t) = \exp(-H(t))$, so $-\log S_2(t) = \lambda \left(-\log S_1(t)\right)$
Take another log of both sides: $\log(-\log S_2(t)) = \log(\lambda) + \log(-\log S_1(t))$
So, plotting log-log curves of the 2 groups should show curves which are parallel.
Can plot against ‘t’ or ‘ln(t)’

23 Plot log(-log(S1(t))) vs. log(t) (or ‘t’)
Plot log(-log(S2(t))) vs. log(t) (or ‘t’)
If PH is true, the vertical distance between the two curves should be ‘log(λ)’ for all time points
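The constant vertical gap can be checked numerically. The sketch below assumes a Weibull-type baseline cumulative hazard and a hazard ratio of 2.5; both are hypothetical choices made only for illustration, and any positive cumulative hazard would give the same result.

```python
import math


def loglog_survival(cum_hazard):
    """log(-log(S(t))) computed from a cumulative hazard, using S(t) = exp(-H(t))."""
    survival = math.exp(-cum_hazard)
    return math.log(-math.log(survival))


def baseline_cum_hazard(t):
    # ASSUMPTION: hypothetical Weibull-type cumulative hazard for group 1.
    return 0.1 * t ** 1.5


lam = 2.5  # ASSUMPTION: hypothetical hazard ratio of group 2 vs. group 1
times = [0.5, 1.0, 2.0, 5.0, 10.0]

gaps = []
for t in times:
    h1 = baseline_cum_hazard(t)
    h2 = lam * h1  # proportional hazards: H2(t) = lambda * H1(t)
    gaps.append(loglog_survival(h2) - loglog_survival(h1))

print("vertical gaps:", gaps)
print("log(lambda):  ", math.log(lam))
```

Every gap agrees with log(λ) up to floating-point error, whatever the time point, which is the parallel-curves property the plot is meant to reveal.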

24 Testing PH (2)
How can we generate the curves?
Method #1
Use KM method in Proc LIFETEST
- Use STRATA statement to generate different curves for each level of the predictor
- Add a ‘plots=lls’ option to the Proc LIFETEST statement

25 Testing PH (3)
Produces one set of log(-log(S(t))) values for each level of the predictor variable
Limitations
- Cannot adjust for other factors
- Hard to use with continuous predictors.

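26 ODS graphics on;
/* KM curves by stratum: s = S(t), ls = -log(S(t)), lls = log(-log(S(t))) */
proc lifetest data=njb1 plots=(s,ls,lls);
  time week*arrest(0);   /* arrest=0 marks censored observations */
  strata fin;            /* one curve per level of fin */
run;
ODS graphics off;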
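For readers who want to see the calculation behind a log-log plot outside SAS, the sketch below builds a Kaplan-Meier curve by hand for each of two groups and converts it to log(-log(S(t))). It is a minimal illustration, not a reproduction of PROC LIFETEST output; the function names and the tiny data sets are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = event, 0 = censored)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone who leaves the risk set at time t (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def loglog_curve(curve):
    """Map S(t) to log(-log(S(t))), skipping S = 0 or 1 where it is undefined."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]


# Hypothetical follow-up data for two groups (illustration only).
curve_a = kaplan_meier([2, 3, 5, 8, 10], [1, 1, 0, 1, 0])
curve_b = kaplan_meier([1, 2, 4, 6, 7], [1, 1, 1, 0, 1])

loglog_a = loglog_curve(curve_a)
loglog_b = loglog_curve(curve_b)
print(loglog_a)
print(loglog_b)
```

Plotting the two log-log series against time, or against log(time), and judging whether they stay roughly parallel is the graphical check described above.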

27 [Figure: curves for ‘No aid’ and ‘Aid’]

28 [Figure: H(t) for ‘No aid’ and ‘Aid’]

29 [Figure: curves for ‘No aid’ and ‘Aid’]

30 ODS graphics on;
proc lifetest data=njb1 plots=(s,ls,lls);
  time week*arrest(0);
  strata age (20,25);   /* cut points at 20 and 25 give three age strata */
run;
ODS graphics off;

32 [Figure: H(t) for age groups <20, 20-25, >25]

33 [Figure: curves for age groups <20, 20-25, >25]

34 Testing PH (4)
Method #2
Use Proc PHREG with the STRATA and BASELINE statements
BASELINE estimates S(t) based on a common baseline hazard.
- Can derive log(S(t)), log(-log(S(t))) etc. from this
What happens if we use BASELINE and STRATA at the same time?

35 Testing PH (5)
Method #2
Using ‘baseline’ and ‘strata’ together produces an estimate of S(t), ln(S(t)), etc. within each stratum, adjusted for the other variables in the model.
Variable being tested for PH goes into the STRATA statement and not in the model
Plot curves using ‘sgplot’ and examine

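36 PROC PHREG DATA=allison.recid;
  CLASS fin;
  MODEL week*arrest(0)=age prio / TIES=EFRON;   /* fin is left out of the model */
  STRATA fin;                                   /* variable being tested for PH */
  /* output S(t), log(S(t)) and log(-log(S(t))) for each stratum */
  BASELINE OUT=a SURVIVAL=s LOGSURV=logs LOGLOGS=loglogs;
RUN;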

37 Testing PH (6)
Generates one S(t) curve for each stratum level.
Use these to plot ‘log(-log(S(t)))’ for each group
Examples are shown in associated video

38 Testing PH (7)
You might ask, why not use a simpler method:
- Use the COVARIATES option with the BASELINE statement
- Can produce one S(t) curve for each level of the variable we want to test: Male, Female

39 Testing PH (8)
Does not work!!
- Does produce group-specific S(t) curves for each level of the variable of interest
- BUT S(t) is based on a common baseline hazard
- Means that the log(-log(S(t))) curves MUST be parallel
- See video for some examples

40 Testing PH (9)
Do we need the COVARIATES option with the STRATA/BASELINE model?
- Not needed
- We don’t care what the actual covariates used to produce the S(t) curves are
- Only question is: are the log(-log(S(t))) curves parallel?
- As long as covariates are the same for all strata, it ‘works’
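The reason the COVARIATES-only approach cannot reveal non-PH, while the stratified approach can, follows from a line of algebra. The derivation below is a sketch in generic notation that is not taken from the slides: $x$ is the variable being tested, $z$ stands for the other covariates, and the subscript $s$ indexes strata.

```latex
% Unstratified model: one common baseline, x enters through beta
\[
\hat S(t \mid x) = \hat S_0(t)^{\exp(\hat\beta x)}
\quad\Longrightarrow\quad
\log\{-\log \hat S(t \mid x)\} = \log\{-\log \hat S_0(t)\} + \hat\beta x
\]
% Two levels x_a and x_b therefore differ by a constant, by construction:
\[
\log\{-\log \hat S(t \mid x_a)\} - \log\{-\log \hat S(t \mid x_b)\}
  = \hat\beta\,(x_a - x_b) \quad \text{for every } t
\]
% Stratified model: separate baseline per stratum s, same covariates z in each
\[
\log\{-\log \hat S_s(t \mid z)\} = \log\{-\log \hat S_{0s}(t)\} + \hat\gamma z
\]
% The gap between strata 1 and 2 now depends only on the two baselines:
\[
\log\{-\log \hat S_{1}(t \mid z)\} - \log\{-\log \hat S_{2}(t \mid z)\}
  = \log\{-\log \hat S_{01}(t)\} - \log\{-\log \hat S_{02}(t)\}
\]
```

In the unstratified case the gap is forced to be constant, so the curves are parallel whether or not PH holds. In the stratified case the $\hat\gamma z$ terms cancel as long as the same $z$ is used in every stratum, and the remaining gap is free to vary with time, which is what makes the plot informative.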

41 Testing PH (10)
How do we get the graphs?
ODS Graphics can produce the S(t) and H(t) plots.
ODS Graphics cannot produce the log(-log(S(t))) plots directly
Instead, I used SAS/GRAPH
- I’ll show an example here
Could also use PROC SGPLOT (examples shown in video)

42 proc sort data=a;
  by fin week;
run;
/* join the points with lines; one symbol definition per stratum */
symbol1 interpol=j color=red width=6;
symbol2 interpol=j color=green width=6;
axis1 order=(0 to 1 by .1);
axis2 logbase=10 logstyle=expand order=(1,10,100);
proc gplot data=a;
  plot (s)*week=fin / vaxis=axis1;   /* S(t) vs. time */
  plot (loglogs)*week=fin;           /* log(-log(S(t))) vs. time */
run;
/* the log-scaled horizontal axis gives the plot against log(t) */
proc gplot data=a;
  plot (loglogs)*week=fin / haxis=axis2;
run;

43 [Figure: S(t) vs. time]

44 [Figure: Log(-log(S(t))) vs. time]

45 [Figure: Log(-log(S(t))) vs. log(time)]

46 Second example: graphs only

49 Testing PH (11)
Graphical method #2
Plot: Schoenfeld residuals
- One plot for each variable in the model
- Fit a LOESS curve to each graph
- Smoothed curve should be parallel to the x-axis
- Departures imply PH assumption is violated
Can handle continuous predictors

50 Testing PH (12)
Graphical method #2
Problems:
- Interpreting graphs is ‘tricky’
  - Hard to distinguish random fluctuation from non-PH effects
- Power is decreased
  - Only have Schoenfeld residuals for subjects who failed
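A crude numeric companion to the plot is to fit a straight line to the residuals against event time and look at its slope. This swaps the LOESS smoother for an ordinary least-squares line, so it can only pick up a steady trend and is not the method on the slide. The function name and both residual series below are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical event times and Schoenfeld residuals for one covariate.
event_times = [1, 2, 3, 4, 5, 6]
flat_residuals = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]      # scatter around zero
trending_residuals = [-0.5, -0.3, -0.1, 0.1, 0.3, 0.5]  # drifts upward with time

flat_slope = least_squares_slope(event_times, flat_residuals)
trend_slope = least_squares_slope(event_times, trending_residuals)
print("slope, flat series:    ", flat_slope)
print("slope, trending series:", trend_slope)
```

A slope near zero is what a smooth curve parallel to the x-axis would look like; a clearly non-zero slope suggests the covariate's effect changes over time. With few events the slope is noisy, which mirrors the power problem noted above.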

51 [Figure: simulated data, 2 vars. Dichotomous var: PH is OK. Continuous var: PH is OK. Panels: Dichot. var, Cont. var]

52 [Figure: simulated data, 2 vars. Dichotomous var: PH is OK. Continuous var: PH is NOT OK. Panels: Dichot. var, Cont. var]

53 [Figure: simulated data, 2 vars. Dichotomous var: PH is NOT OK. Continuous var: PH is OK. Panels: Dichot. var, Cont. var]

54 Testing PH (13)
Graphical method #3
Uses the new SAS statement: ASSESS
Easy to produce but tricky to understand
Based on Martingale residuals and a ‘counting process’ approach to survival models

55 Testing PH (14)
Graphical method #3
As mentioned earlier, the observed ‘count’ of events can be split into two parts as: $N(t) = H(t,x) + M(t)$
- H(t,x) = Cumulative hazard
- M(t) is random noise (martingale)

56 Testing PH (15)
ASSESS uses simulation methods to generate the results
Start with the cohort
- We have the H(t,x) ‘process’ from PHREG methods
- H(t,x) gives us the probability that subject has an event at time ‘t’

57 Testing PH (16)
The hazard process determines rate of getting outcome.
- Can simulate the cohort ‘history’
- Gives a simulated ‘N(t)’ process
- Compare to the observed data to produce a ‘Martingale’ or ‘noise’.
Repeat over and over again

58 Testing PH (17)
Generate 1,000 simulations of the ‘hazard’ process, which meets the PH assumption (RESAMPLE)
- For each, compute the Martingale residuals
- Plot the observed curve and simulated curves
Kolmogorov-type supremum test gives p-value that the observed curve is ‘consistent’ with PH.
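The last step, turning the observed and simulated curves into a p-value, can be sketched on its own. The code below assumes the curves have already been produced and evaluated on a common time grid; how SAS generates the simulated paths is outside this sketch. The function names and the four short paths are hypothetical.

```python
def supremum(path):
    """Largest absolute value reached by a path."""
    return max(abs(value) for value in path)


def supremum_p_value(observed_path, simulated_paths):
    """Share of simulated paths whose supremum is at least the observed supremum."""
    observed_sup = supremum(observed_path)
    exceed = sum(1 for path in simulated_paths if supremum(path) >= observed_sup)
    return exceed / len(simulated_paths)


# Hypothetical paths (a real run would use on the order of 1,000 simulations).
observed = [0.2, -0.45, 0.3]
simulated = [
    [0.1, -0.2, 0.1],
    [0.3, 0.5, -0.1],
    [0.0, 0.9, 0.2],
    [-0.4, 0.2, 0.1],
]

p_value = supremum_p_value(observed, simulated)
print("p-value:", p_value)
```

A small p-value means the observed curve wanders further from zero than most curves simulated under PH, which is evidence against the assumption. The test says nothing about the shape of the departure.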

59 ODS GRAPHICS ON;
ODS rtf;
PROC PHREG DATA=allison.recid;
  MODEL week*arrest(0)=fin age race wexp mar paro prio / TIES=EFRON;
  ASSESS PH / RESAMPLE;   /* PH check with plots and the supremum test */
RUN;
ODS rtf close;
ODS GRAPHICS OFF;

63 Testing PH (18)
Analytical method
If you have Non-PH, that means
- HR varies over time
If we knew how it varied, we could model with a time-varying covariate
Can screen for non-PH using a time-varying covariate
What covariate to use?
- x*t
- x*log(t)

64 Testing PH (19)
Analytical method (cont)
Can use either ‘t’ or ‘log(t)’
- Log(t) is usually preferred
‘time’ can get very large
- Can produce numerical problems
- Not usually an issue with modern computer software
log(t) tends to avoid numerical problems.

65 Testing PH (20)
Analytical method (cont)
PROCESS
- Define the time-varying covariate
  - Using the Proc step is easier
- Place covariate in model and run it
- Look for statistical significance of the time-varying variable
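Written out, the screening model with the x*log(t) term looks as follows. The symbols $\beta$ and $\gamma$ are generic labels introduced here for the main effect and the time interaction; they do not appear on the slides.

```latex
\[
h(t \mid x) = h_0(t)\,\exp\{\beta x + \gamma\, x \log(t)\}
\]
% Hazard ratio for a one-unit increase in x at time t:
\[
\mathrm{HR}(t) = \exp\{\beta + \gamma \log(t)\} = e^{\beta}\, t^{\gamma}
\]
% Proportional hazards corresponds to a hazard ratio that does not depend on t:
\[
H_0:\ \gamma = 0 \quad\Longrightarrow\quad \mathrm{HR}(t) = e^{\beta} \ \text{for all } t
\]
```

Testing the coefficient of the interaction term is therefore a test of PH against one specific alternative, a hazard ratio that changes as a power of time. Departures of a different shape can go undetected.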

66 ODS GRAPHICS ON;
ODS rtf;
PROC PHREG DATA=allison.recid;
  MODEL week*arrest(0)=fin age race wexp mar paro prio aget / TIES=EFRON;
  aget = age*log(week);   /* time-varying covariate defined inside the Proc step */
RUN;
ODS rtf close;
ODS GRAPHICS OFF;

68 [Figure: simulated data, 2 vars. Dichotomous var: PH is OK. Continuous var: PH is OK]

69 [Figure: simulated data, 2 vars. Dichotomous var: PH is OK. Continuous var: PH is NOT OK]

70 [Figure: simulated data, 2 vars. Dichotomous var: PH is OK. Continuous var: PH is NOT OK]

71 Testing PH (21)
Which approach do you (should you) use?
All methods are useful
Suggested Approach
- Start by looking at each variable using categories and KM models
- Run ‘ASSESS’ if desired
- Look at time interactions
- Model the multivariate non-PH effects if interesting

72 Testing PH (22)
I don’t find the Schoenfeld method useful
ASSESS method is easy to implement
- Gives a ‘p-value’ and a visual measure of PH
- But, gives no idea of how PH is violated.
Time-varying screening test is also easy
- Might miss some non-PH

73 Testing PH (23)
Univariate vs. multivariate models
- Either should work
  - Univariate is easier to implement
  - Multivariate protects against complex models
