Proportional Hazards Model Checking the adequacy of the Cox model: The functional form of a covariate The link function The validity of the proportional.

Presentation transcript:

Proportional Hazards Model. Checking the adequacy of the Cox model: (1) the functional form of a covariate; (2) the link function; (3) the validity of the proportional hazards assumption. 1

Cox-Snell Residuals. Definitions. 1. Cox-Snell residuals: $r_j = -\ln S(t_j; \hat{\theta})$, where $S(t_j; \hat{\theta})$ is the value of the estimated survivor function at time $t_j$. They are just the estimated cumulative hazard evaluated at each subject's observed time. If the model is correct, the residuals should behave like a (possibly censored) sample from an exponential distribution with mean 1. Cox-Snell residuals are useful for assessing the fit of parametric models. They are not very informative for Cox models estimated by partial likelihood. 2
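The definition above can be illustrated with a minimal sketch. It assumes a hypothetical fitted exponential model with rate 0.5, so that S(t) = exp(-0.5 t); the names `cox_snell_residuals`, `survivor`, and `rate`, and the example times, are illustrative and do not come from the slides.

```python
import math

def cox_snell_residuals(times, survivor):
    """Return r_j = -ln S(t_j) for each observed time t_j."""
    return [-math.log(survivor(t)) for t in times]

# Hypothetical fitted model (an assumption for illustration only):
# exponential survival with rate 0.5, so S(t) = exp(-0.5 * t).
rate = 0.5

def survivor(t):
    return math.exp(-rate * t)

times = [0.4, 1.0, 2.0, 5.0]
residuals = cox_snell_residuals(times, survivor)

# For an exponential model the residual is the cumulative hazard rate * t.
for t, r in zip(times, residuals):
    print(f"t = {t:4.1f}  Cox-Snell residual = {r:.3f}")
```

In practice `survivor` would be the estimated survivor function of whatever parametric model was fitted, evaluated with each subject's own covariates.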

Martingale Residuals. 2. Martingale residuals: for a censored case, the martingale residual is the negative of the Cox-Snell residual. For an uncensored case, it is one minus the Cox-Snell residual. Martingale residuals can be plotted against each covariate, and the plots can be enhanced with Lowess curves (smoothers) to indicate the functional form of the relationship between the log-hazard function and the covariate. Weaknesses: they are not symmetrically distributed about zero, even when the fitted model is correct. This skewness makes the plots difficult to interpret. 3
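The two cases above collapse into one expression, M = delta - r, where delta is 1 for an event and 0 for a censored case and r is the Cox-Snell residual. A minimal sketch, with made-up residual values and a hypothetical function name:

```python
def martingale_residuals(cox_snell, status):
    """Martingale residual M = delta - r.

    status is 1 for an observed event and 0 for a censored case, so
    censored cases give -r and uncensored cases give 1 - r.
    """
    return [delta - r for r, delta in zip(cox_snell, status)]

# Illustrative values only (not taken from the liver data).
cox_snell = [0.2, 0.5, 1.0, 2.5]
status = [1, 0, 1, 0]
martingale = martingale_residuals(cox_snell, status)

for r, delta, m in zip(cox_snell, status, martingale):
    print(f"r = {r:.1f}  status = {delta}  martingale = {m:+.1f}")
```

Because r is non-negative, every residual is at most 1 but can be arbitrarily negative, which is the source of the skewness noted above.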

Deviance Residuals. Definitions. 3. Deviance residuals: they behave much like residuals from least-squares regression. They are roughly symmetrically distributed around 0 and have an approximate standard deviation of 1.0. They are negative for observations with longer survival times than expected and positive for observations with shorter survival times than expected. Censoring can produce striking patterns that do not necessarily imply any problem with the model itself. 4
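The slide does not give the formula. A commonly used definition, assumed here, transforms the martingale residual M with event indicator delta as d = sign(M) * sqrt(-2 * [M + delta * ln(delta - M)]). The sketch below applies it to made-up values; the function name is hypothetical.

```python
import math

def deviance_residual(m, delta):
    """Deviance residual from a martingale residual m and event indicator delta.

    Assumed formula: d = sign(m) * sqrt(-2 * (m + delta * ln(delta - m))).
    For censored cases (delta = 0) the log term is dropped.
    """
    inner = m
    if delta == 1:
        inner += math.log(delta - m)  # delta - m equals the Cox-Snell residual
    # max() guards against a tiny positive value of inner from rounding.
    return math.copysign(math.sqrt(max(0.0, -2.0 * inner)), m)

# Illustrative martingale residuals and event indicators.
examples = [(0.8, 1), (-0.5, 0), (0.0, 1), (-2.5, 0)]
deviance = [deviance_residual(m, delta) for m, delta in examples]

for (m, delta), d in zip(examples, deviance):
    print(f"martingale = {m:+.1f}  status = {delta}  deviance = {d:+.3f}")
```

The transformation stretches residuals near 1 and shrinks large negative ones, which is why the result is closer to symmetric than the martingale residual.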

Liver Data Example. Data step: data Liver; input Time Status Age Albumin Bilirubin Edema Protime label Time="Follow-up Time in Years"; Time= Time / ; datalines; …. 5

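Liver Data, Fitting PH Cox Model. [Table of parameter estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio. Rows: Bilirubin, logProtime, logAlbumin, Age, Edema. The numeric values are not preserved in this transcript.] [Summary of events and censoring. Columns: Total, Event, Censored, Percent Censored. The values are not preserved.]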

Deviance Residual Diagnosis 7

8

9

Conventional Residual Analysis. Issues: highly subjective; difficult to interpret. 10

New Method of Residual Diagnosis. An objective way of checking model fit, based on the cumulative sum of martingale residuals. Ingredients: the asymptotic properties of the sum; Gaussian processes; bootstrapping. 11
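The cumulative sum at the heart of this method can be sketched as follows: order the subjects by a covariate and accumulate their martingale residuals, giving W(x) = sum of M_i over subjects with covariate value at most x. The function name and the numbers are illustrative assumptions.

```python
def cumulative_residual_process(covariate, residuals):
    """Return (xs, ws) where ws[k] is the sum of residuals with covariate <= xs[k]."""
    xs, ws, total = [], [], 0.0
    for x, m in sorted(zip(covariate, residuals)):
        total += m
        if xs and xs[-1] == x:
            ws[-1] = total      # tied covariate values share one point
        else:
            xs.append(x)
            ws.append(total)
    return xs, ws

# Illustrative covariate values and martingale residuals (summing to zero,
# as martingale residuals from a fitted Cox model do).
covariate = [3.0, 1.0, 2.0, 4.0]
residuals = [0.5, -0.3, 0.2, -0.4]
xs, ws = cumulative_residual_process(covariate, residuals)

# Largest absolute excursion, the quantity used by a supremum-type test.
sup_abs = max(abs(w) for w in ws)
print(xs, [round(w, 3) for w in ws], round(sup_abs, 3))
```

A path that wanders far from zero over a range of covariate values suggests the covariate's functional form is misspecified in that range.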

Definition of Random Process. Definitions. 1. Random process (stochastic process): a random process is the counterpart of a deterministic process. Instead of dealing with only one possible "reality" of how the process might evolve over time (as is the case, for example, for solutions of an ordinary differential equation), in a stochastic or random process there is some indeterminacy in its future evolution, described by probability distributions. This means that even if the initial condition (or starting point) is known, there are many paths the process might follow, some more probable than others. Examples: Markov process, Gaussian process. 12

Definition of Random Process. [Figure: sample functions $X_1(t), X_2(t), \dots, X_N(t)$ of a random process $X(t)$ plotted against $t$.] The totality of all sample functions is called an ensemble. For a specific time $t_k$, $X(t_k)$ is a random variable. 13

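Definition of Gaussian Process. 2. Gaussian process: a random process $X(t)$ is a Gaussian process if, for all $n$ and for all time points $t_1, \dots, t_n$, the random variables $X(t_1), \dots, X(t_n)$ have a jointly Gaussian density function, which may be expressed as $f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\mathbf{C}|^{1/2}} \exp\left(-\tfrac{1}{2} (\mathbf{x} - \mathbf{m})^{T} \mathbf{C}^{-1} (\mathbf{x} - \mathbf{m})\right)$, where $\mathbf{x}$: the $n$ random variables; $\mathbf{m}$: mean value vector; $\mathbf{C}$: $n \times n$ covariance matrix. 14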

Why Gaussian Process? Central limit theorem: the suitably normalized sum of a large number of independent and identically distributed (i.i.d.) random variables with finite variance is approximately Gaussian. Cumulative residuals will be centered at zero if the model is correct. Under the null hypothesis of a correct model fit, they can be approximated by a zero-mean Gaussian process with a covariance structure determined by the particular type of regression model. Realizations of the Gaussian process can be simulated by computer and compared with the observed process to assess whether the observed residual process represents anything beyond random variation. 15
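As a rough illustration of simulating realizations of a zero-mean Gaussian process, the sketch below generates Brownian-motion paths on [0, 1] by summing independent Gaussian increments. Brownian motion is only a stand-in chosen for simplicity: in the method described above, the covariance structure is determined by the fitted regression model, which this sketch does not attempt to reproduce. All names are hypothetical.

```python
import random

def simulate_brownian_paths(n_paths, n_steps, seed=0):
    """Simulate zero-mean Gaussian paths (Brownian motion) on a grid over [0, 1]."""
    rng = random.Random(seed)
    step_sd = (1.0 / n_steps) ** 0.5   # each increment has variance 1 / n_steps
    paths = []
    for _ in range(n_paths):
        value, path = 0.0, []
        for _ in range(n_steps):
            value += rng.gauss(0.0, step_sd)
            path.append(value)
        paths.append(path)
    return paths

paths = simulate_brownian_paths(n_paths=2000, n_steps=50, seed=1)

# The endpoint of each path is Gaussian with mean 0 and variance 1.
endpoints = [p[-1] for p in paths]
mean_end = sum(endpoints) / len(endpoints)
var_end = sum((e - mean_end) ** 2 for e in endpoints) / (len(endpoints) - 1)
print(f"endpoint mean ~ {mean_end:.3f}, endpoint variance ~ {var_end:.3f}")
```

Each simulated path plays the role of one dashed line in a diagnostic plot: a picture of what pure random variation looks like when the model is correct.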

Liver Data, Residuals Diagnosis 1. Checking the Functional Form of a Covariate 16

Liver Data, Residuals Diagnosis 17

Liver Data, Residuals Diagnosis 18

Residuals Sum Diagnosis Summary. The light dashed lines in Figure 2 are the first 20 of 10,000 simulated paths of the cumulative residual process under the null hypothesis of a correct model fit. The simulated paths all tend to stay closer to, and to intersect, the horizontal axis, unlike the observed residual process. The fitted model overestimates the hazard at the low end of the Bilirubin values and underestimates the hazard at high Bilirubin values. None of the 10,000 simulated paths has an absolute maximum exceeding that of the observed process, so the estimated p-value for the Kolmogorov-type supremum test is 0 (that is, below 1/10,000). These results suggest that there may be a better-fitting model for the liver data. The pattern suggests a logarithmic transform of Bilirubin. 19
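The p-value computation described above amounts to counting how many simulated paths have a larger absolute maximum than the observed path. A minimal sketch with made-up paths and a hypothetical function name:

```python
def supremum_p_value(observed, simulated_paths):
    """Fraction of simulated paths whose max |value| exceeds the observed max |value|."""
    observed_max = max(abs(w) for w in observed)
    exceed = sum(
        1 for path in simulated_paths
        if max(abs(w) for w in path) > observed_max
    )
    return exceed / len(simulated_paths)

# Illustrative paths only (not taken from the liver data).
observed = [0.5, -2.0, 1.0]
simulated = [[0.1, -0.3], [2.5, 0.0], [-1.0, 1.5], [0.2, -3.0]]
p_value = supremum_p_value(observed, simulated)

# When no simulated path exceeds the observed maximum, the estimate is 0.
p_extreme = supremum_p_value([10.0, -12.0], simulated)
print(p_value, p_extreme)
```

With 10,000 simulated paths the smallest nonzero estimate is 1/10,000, so an estimate of 0 is best read as a p-value below that resolution.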

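Fitting Cox With logBilirubin. After fitting the Cox model to the liver data using logBilirubin instead of Bilirubin: [Table of parameter estimates. Columns: Variable, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio. Rows: logBilirubin, logProtime, logAlbumin, Age, Edema. The numeric values are not preserved in this transcript.]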

Log Transformation of Bilirubin Residuals Diagnosis after fitting logBilirubin 21

Comment. When the log transform is applied to Bilirubin, the observed process appears to be more typical of the simulated processes. The p-value, based on 10,000 simulated samples, is [value not preserved in this transcript], indicating a much improved model. 22

Checking PH Assumptions. 2. Checking the proportional hazards assumption: the score process (a transformed partial-sum process of the martingale residuals) is compared with processes simulated under the null hypothesis that the proportional hazards assumption holds. 23

Checking PH Assumption for log(Protime) 24

Comment. The observed standardized score process for log(Protime), plotted with the first 20 of 10,000 simulated null processes, reveals a violation of the proportional hazards assumption. As Lin et al. (1993) suggest, the violation may be corrected using time-dependent covariates or stratification. 25

Checking PH Assumption. The Kolmogorov-type supremum test results for all the covariates: [Table. Columns: Variable, Maximum Absolute Value, Replications, Seed, Pr > MaxAbsVal. Rows: logBilirubin, logProtime, logAlbumin, Age, Edema. The numeric values are not preserved in this transcript.]

Comment In addition to log(Protime), the proportional hazards assumption appears to be violated for Edema. 27

28

29