Proportional Hazards Model Checking the adequacy of the Cox model: The functional form of a covariate The link function The validity of the proportional.


1 Proportional Hazards Model
Checking the adequacy of the Cox model:
The functional form of a covariate
The link function
The validity of the proportional hazards assumption

2 Cox-Snell Residuals
Definitions
1. Cox-Snell Residuals
$r_j = -\ln S(t_j; \hat{\theta})$, where $S(t_j; \hat{\theta})$ is the value of the estimated survivor function at time $t_j$.
They are just the estimated cumulative hazard at $t_j$.
If the model is correct, the residuals should behave like a (possibly censored) sample from an exponential distribution with mean 1.
Cox-Snell residuals are useful for assessing the fit of parametric models.
They are not very informative for Cox models estimated by partial likelihood.
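
A short derivation of the unit-exponential claim, offered as a sketch. It assumes a continuous true survivor function $S$ and treats the estimate $S(t; \hat{\theta})$ as if it equalled $S$, which holds only approximately; $T_j$ (the uncensored survival time) is notation introduced here for illustration.

```latex
% If T_j has continuous survivor function S, then S(T_j) is uniform on (0,1):
S(T_j) \sim \mathrm{Uniform}(0,1)
% Hence, for any r >= 0,
\Pr\{-\ln S(T_j) > r\} = \Pr\{S(T_j) < e^{-r}\} = e^{-r}
% which is the survivor function of an exponential distribution with mean 1.
```

For a censored subject, the residual is evaluated at the censoring time, so it is itself a censored observation from that distribution.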

3 Martingale Residuals
2. Martingale Residuals
For a censored case, the martingale residual is the negative of the Cox-Snell residual. For an uncensored case, it is one minus the Cox-Snell residual.
Martingale residuals can be plotted against a covariate, with a lowess curve (smoother) added to indicate the functional form of the relationship between the log-hazard function and that covariate.
Weaknesses: they are not symmetrically distributed about zero, even when the fitted model is correct. This skewness makes the plots difficult to interpret.
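
The two definitions above can be sketched in a few lines of Python. The function names and the fitted survival probabilities are hypothetical values chosen for illustration; they are not taken from the liver data.

```python
import math

def cox_snell_residual(surv_prob):
    """Minus the log of the estimated survivor function at the observed time."""
    return -math.log(surv_prob)

def martingale_residual(surv_prob, event):
    """Event indicator (1 = event, 0 = censored) minus the Cox-Snell residual."""
    return event - cox_snell_residual(surv_prob)

# Hypothetical (estimated survival probability, event indicator) pairs.
subjects = [(0.80, 1), (0.80, 0), (0.30, 1), (0.30, 0)]
for surv_prob, event in subjects:
    m = martingale_residual(surv_prob, event)
    print(f"S={surv_prob:.2f} event={event} martingale={m:+.3f}")
```

Because the Cox-Snell residual is never negative, a martingale residual can never exceed 1 but is unbounded below, which is the source of the skewness noted above.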

4 Deviance Residuals
Definitions
3. Deviance Residuals
They behave much like residuals from least-squares regression.
They are approximately symmetrically distributed around 0, with an approximate standard deviation of 1.0.
They are negative for observations with longer survival times than expected and positive for observations with shorter survival times than expected.
Censoring can produce striking patterns that do not necessarily imply any problem with the model itself.
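
The slide does not give a formula. A commonly used definition transforms the martingale residual m with event indicator d as sign(m) * sqrt(-2 * [m + d * ln(d - m)]). The sketch below assumes that definition; the function name and input values are hypothetical.

```python
import math

def deviance_residual(martingale, event):
    """Deviance residual from a martingale residual (assumes martingale < 1).

    event is 1 for an observed event and 0 for a censored case.
    """
    inner = martingale
    if event:
        # event - martingale equals the Cox-Snell residual, which is positive.
        inner += math.log(event - martingale)
    # inner is never positive in exact arithmetic; guard against rounding.
    magnitude = math.sqrt(max(0.0, -2.0 * inner))
    return math.copysign(magnitude, martingale)

# Hypothetical martingale residuals: large negative values are pulled in
# toward zero, and values close to 1 are stretched out.
for m, d in [(-3.0, 1), (-0.5, 0), (0.0, 1), (0.9, 1)]:
    print(f"martingale={m:+.2f} event={d} deviance={deviance_residual(m, d):+.3f}")
```

The transformation keeps the sign of the martingale residual, so the interpretation in the slide (negative means longer survival than expected) is unchanged.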

5 Liver Data Example
Data

data Liver;
   input Time Status Age Albumin Bilirubin Edema Protime @@;
   label Time="Follow-up Time in Years";
   Time = Time / 365.25;   /* convert follow-up time from days to years */
   datalines;
400 1 58.7652 2.60 14.5 1.0 12.2
4500 0 56.4463 4.14 1.1 0.0 10.6
1012 1 70.0726 3.48 1.4 0.5 12.0
1925 1 54.7406 2.54 1.8 0.5 10.3
1504 0 38.1054 3.53 3.4 0.0 10.9
2503 1 66.2587 3.98 0.8 0.0 11.0
1832 0 55.5346 4.09 1.0 0.0 9.7
2466 1 53.0568 4.00 0.3 0.0 11.0
….

6 Liver Data, Fitting PH
Fitting PH Cox Model

Total   Event   Censored   Percent Censored
418     161     257        61.48

Parameter    DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio
Bilirubin    1     0.11733             0.01298          81.7567      <.0001        1.124
logProtime   1     2.77581             0.71482          15.0794      0.0001       16.052
logAlbumin   1    -3.17195             0.62945          25.3939      <.0001        0.042
Age          1     0.03779             0.00805          22.0288      <.0001        1.039
Edema        1     0.84772             0.28125           9.0850      0.0026        2.334
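
As a quick consistency check on the table, each hazard ratio is the exponential of its parameter estimate, and the censoring percentage follows from the event and censored counts. The sketch below only re-derives those columns from the reported numbers; the variable names are illustrative.

```python
import math

# Parameter estimates as reported in the fitted Cox model table.
estimates = {
    "Bilirubin": 0.11733,
    "logProtime": 2.77581,
    "logAlbumin": -3.17195,
    "Age": 0.03779,
    "Edema": 0.84772,
}

# Hazard ratio for a one-unit increase in each covariate.
hazard_ratios = {name: math.exp(beta) for name, beta in estimates.items()}
for name, ratio in hazard_ratios.items():
    print(f"{name:<11} {ratio:8.3f}")

# Percentage of censored observations from the summary counts.
events, censored = 161, 257
percent_censored = 100.0 * censored / (events + censored)
print(f"Percent censored: {percent_censored:.2f}")
```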

7 Deviance Residual Diagnosis

10 Conventional Residuals Analysis
Issues:
Highly subjective
Difficult to interpret

11 New Method of Residual Diagnosis
Objective way
Checking model fit based on the cumulative sum of martingale residuals
Asymptotic property of the sum
Gaussian process
Bootstrapping

12 Definition of Random Process
Definitions
1. Random Process (Stochastic Process)
A random process is the counterpart of a deterministic process. A deterministic process, such as the solution of an ordinary differential equation, has only one possible path through time. A random process has some indeterminacy in its future evolution, which is described by probability distributions.
Even when the initial condition (starting point) is known, the process can follow many possible paths, some more probable than others.
Examples: Markov process, Gaussian process

13 Definition of Random Process
[Figure: sample functions of the random process X(t), including X_2(t) and X_N(t), plotted against time t]
The totality of all sample functions is called an ensemble.
For a specific time t_k, X(t_k) is a random variable.

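14 Definition of Gaussian Process
2. Gaussian Process
A random process X(t) is a Gaussian process if, for all n and for all times $t_1, \dots, t_n$, the random variables $X(t_1), \dots, X(t_n)$ have a jointly Gaussian density function, which may be expressed as
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\mathbf{C}|^{1/2}} \exp\left\{ -\tfrac{1}{2} (\mathbf{x} - \mathbf{m})^{T} \mathbf{C}^{-1} (\mathbf{x} - \mathbf{m}) \right\}$$
$\mathbf{x}$: the n random variables
$\mathbf{m}$: mean value vector
$\mathbf{C}$: n×n covariance matrix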

15 Why Gaussian Process?
Central limit theorem: the suitably standardized sum of a large number of independent and identically distributed (i.i.d.) random variables is approximately Gaussian.
Cumulative residuals will be centered at zero if the model is correct. Under the null hypothesis of a correct model fit, they can be approximated by a zero-mean Gaussian process whose covariance structure is determined by the particular type of regression model.
Realizations of the Gaussian process can be simulated by computer and compared with the observed process to assess whether the observed residual process represents anything beyond random variation.
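
The simulate-and-compare idea can be sketched as below. This is a deliberately simplified illustration, not the procedure used to produce the results in this presentation: each residual is multiplied by an independent standard normal draw, and the adjustment for having estimated the regression coefficients, which the full method includes, is ignored. The function names, residual values, and seed are hypothetical.

```python
import random

def cumulative_path(values):
    """Running sums of the residuals, taken in order of the covariate."""
    total, path = 0.0, []
    for value in values:
        total += value
        path.append(total)
    return path

def sup_abs(path):
    """Largest absolute excursion of a path from zero."""
    return max(abs(point) for point in path)

def supremum_test(residuals, n_sim=1000, seed=19):
    """Share of simulated paths whose supremum reaches the observed one."""
    rng = random.Random(seed)
    observed = sup_abs(cumulative_path(residuals))
    exceed = 0
    for _ in range(n_sim):
        # Simplified null realization: residuals times N(0, 1) multipliers.
        simulated = [r * rng.gauss(0.0, 1.0) for r in residuals]
        if sup_abs(cumulative_path(simulated)) >= observed:
            exceed += 1
    return observed, exceed / n_sim

# Hypothetical residuals, sorted by covariate, with a systematic pattern:
# positive at low covariate values and negative at high values.
patterned = [1.0] * 20 + [-1.0] * 20
observed_sup, p_value = supremum_test(patterned)
print(f"observed supremum = {observed_sup}, p-value = {p_value}")
```

A systematic pattern makes the observed cumulative sum drift far from zero, so few simulated paths reach its supremum and the p-value is small. Residuals that alternate in sign give a path that stays near zero and a large p-value.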

16 Liver Data, Residuals Diagnosis
1. Checking the Functional Form of a Covariate

17 Liver Data, Residuals Diagnosis

18 Liver Data, Residuals Diagnosis

19 Residuals Sum Diagnosis
Summary
The light dashed lines in Figure 2 are the first 20 of 10,000 simulated paths of the cumulative residual process under the null hypothesis of a correct model fit.
The simulated paths stay closer to the horizontal axis, and cross it more often, than the observed residual process does.
The fitted model overestimates the hazards at the low end of the Bilirubin values and underestimates the hazards at high Bilirubin values.
None of the 10,000 simulated paths has an absolute maximum exceeding that of the observed process, so the estimated p-value for the Kolmogorov-type supremum test is 0 (fewer than 1 in 10,000).
These results suggest that there may be a better-fitting model for the liver data. The pattern suggests a logarithmic transform.

20 Fitting Cox With logBilirubin
After fitting the Cox model to the Liver data using logBilirubin instead of Bilirubin:

Variable       DF   Parameter Estimate   Standard Error   Chi-Square   Pr > ChiSq   Hazard Ratio
logBilirubin   1     0.87072             0.08263          111.0484     <.0001        2.389
logProtime     1     2.37789             0.76674            9.6181     0.0019       10.782
logAlbumin     1    -2.53264             0.64819           15.2664     <.0001        0.079
Age            1     0.03940             0.00765           26.5306     <.0001        1.040
Edema          1     0.85934             0.27114           10.0447     0.0015        2.362
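
With a log-transformed covariate, the tabulated hazard ratio refers to a one-unit increase in the log, which is hard to read directly. The sketch below assumes that logBilirubin is the natural logarithm of Bilirubin (the slides do not say which base was used). Under that assumption, multiplying Bilirubin by a factor c multiplies the hazard by c raised to the power of the coefficient. The function name is hypothetical.

```python
import math

# Coefficient of logBilirubin from the refitted model table.
BETA_LOG_BILIRUBIN = 0.87072

def hazard_multiplier(fold_change, beta=BETA_LOG_BILIRUBIN):
    """Hazard ratio when the covariate is multiplied by fold_change.

    Assumes the model uses the natural log of the covariate, so the hazard
    is proportional to covariate ** beta.
    """
    return fold_change ** beta

doubling = hazard_multiplier(2.0)      # effect of doubling Bilirubin
per_log_unit = hazard_multiplier(math.e)  # matches the tabulated hazard ratio
print(f"Doubling Bilirubin multiplies the hazard by about {doubling:.2f}")
print(f"One-unit increase in logBilirubin: {per_log_unit:.3f}")
```

Under the same assumption, the multiplier depends only on the fold change, so a doubling from 1 to 2 has the same estimated effect as a doubling from 10 to 20.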

21 Log Transformation of Bilirubin
Residuals diagnosis after fitting logBilirubin

22 Comment
When the log transform is applied to Bilirubin, the observed process looks much more typical of the simulated processes. The p-value, based on 10,000 simulated paths, is 0.0572, indicating a much improved model.

23 Checking PH Assumptions
2. Checking Proportional Hazards Assumptions
To check the proportional hazards assumption, the score process (a transformed partial-sum process of the martingale residuals) is compared with processes simulated under the null hypothesis that the proportional hazards assumption holds.

24 Checking PH Assumption for log(Protime)

25 Comment
The observed standardized score process for log(Protime), plotted with the first 20 of 10,000 simulated null processes, reveals a violation of the proportional hazards assumption.
As Lin et al. (1993) suggest, the violation may be corrected by using time-dependent covariates or stratification.

26 The Kolmogorov-type supremum test results for all the covariates
Checking PH assumption

Variable       Maximum Absolute Value   Replications   Seed   Pr > MaxAbsVal
logBilirubin   1.0880                   1000           19     0.1480
logProtime     1.7243                   1000           19     0.0010
logAlbumin     0.8443                   1000           19     0.4390
Age            0.7387                   1000           19     0.4780
Edema          1.4350                   1000           19     0.0310

27 Comment
In addition to log(Protime), the proportional hazards assumption appears to be violated for Edema (p = 0.0310 in the supremum test).
