HST412: SURVIVAL MODELS Chipepa Fastel
Outline Survival functions, hazard rates, types of censoring and truncation. Life tables, Kaplan-Meier plots, log-rank tests, Cox regression models, inference for parametric regression models. Survival models and the life table; Describe the future lifetime as a random variable Define probabilities of death and survival, Define the actuarial functions tpx, tqx, n/mqx, Define the complete and curtate expectations of future lifetime, Describe the life table functions lx and dx, Describe the simple laws of mortality, Define simple assurance and annuity contracts and develop formulae for means and variances. Estimating the lifetime distribution Fx(t); Describe how lifetime data might be censored,
Outline Describe the estimation of empirical survival function, Describe the Kaplan-Meier estimate of the survival function in the presence of censoring, Describe the Nelson-Aalen estimate of the cumulative hazard rate in the presence of censoring, compute it from typical data and estimate its variance. The Cox regression model; Describe the Cox model for proportional hazards and derive the partial likelihood estimate. The two state Markov model; Describe the two state model of a single decrement and compare the assumptions with those of the random lifetime, Derive the MLE for the transition intensities in models of transfers between states with piecewise constant transition intensities, Define waiting time in a state. The general Markov model; Describe the statistical models of transfers between multiple states, State the assumptions underlying the Markov model of transfers between a finite number of states in continuous time,
Outline Binomial and Poisson Models of Mortality; Describe the Binomial model of mortality, derive a maximum likelihood estimator for the probability of death and compare the Binomial model with the multiple state models Graduation and statistical tests; Describe how to test crude estimates for consistency with standard table or a set of graduated estimates, and describe the process. Methods of graduation; Describe the process of graduation by the three common methods and state the advantages and disadvantages of each. Exposed to risk; Define initial and central exposed to risk, and the various common rate intervals, Calculate the central exposed to risk in simple cases. State the principle of correspondence.
Overview What is survival analysis? Terminology and data structure. Survival/hazard functions. Parametric versus semi-parametric regression techniques. Introduction to Kaplan-Meier methods (non-parametric). Relevant SAS Procedures (PROCS).
Early example of survival analysis, 1669 Christiaan Huygens' 1669 curve showing how many out of 100 people survive until 86 years. From: Howard Wainer STATISTICAL GRAPHICS: Mapping the Pathways of Science. Annual Review of Psychology. Vol. 52: 305-335.
Early example of survival analysis Roughly, what shape is this function? What was a person’s chance of surviving past 20? Past 36? This is survival analysis! We are trying to estimate this curve—only the outcome can be any binary event, not just death.
What is survival analysis? Statistical methods for analyzing longitudinal data on the occurrence of events. Events may include death, injury, onset of illness, recovery from illness (binary variables) or transition above or below the clinical threshold of a meaningful continuous variable (e.g. CD4 counts). Accommodates data from randomized clinical trial or cohort study design.
Randomized Clinical Trial (RCT) [Figure: flow diagram over TIME. Target population → disease-free, at-risk cohort → random assignment to Intervention or Control; each arm ends as Disease or Disease-free.]
Randomized Clinical Trial (RCT) [Figure: flow diagram over TIME. Target population → patient population → random assignment to Treatment or Control; each arm ends as Cured or Not cured.]
Randomized Clinical Trial (RCT) [Figure: flow diagram over TIME. Target population → patient population → random assignment to Treatment or Control; each arm ends as Dead or Alive.]
Cohort study (prospective/retrospective) [Figure: flow diagram over TIME. Target population → disease-free cohort → Exposed or Unexposed; each group ends as Disease or Disease-free.]
Examples of survival analysis in medicine
RCT: Women’s Health Initiative (JAMA, 2002) On hormones On placebo Cumulative incidence Women’s Health Initiative Writing Group. JAMA. 2002;288:321-333.
WHI and low-fat diet… Control Low-fat diet Prentice et al. JAMA, February 8, 2006; 295: 629 - 642.
Retrospective cohort study: From December 2003 BMJ: Aspirin, ibuprofen, and mortality after myocardial infarction: retrospective cohort study. Curtis et al. BMJ 2003;327:1322-1323.
Objectives of survival analysis To estimate time-to-event for a group of individuals, such as time until second heart attack for a group of MI patients. To compare time-to-event between two or more groups, such as treated vs. placebo MI patients in a randomized controlled trial. To assess the relationship of covariates to time-to-event, for example: do weight, insulin resistance, or cholesterol influence survival time of MI patients? Note: when the incidence rate is constant, expected time-to-event = 1/incidence rate.
Why use survival analysis? 1. Why not compare mean time-to-event between your groups using a t-test or linear regression? That ignores censoring. If there is no censoring (everyone is followed to the outcome of interest), then comparing mean or median time-to-event, for example with a t-test, is fine. 2. Why not compare the proportion of events in your groups using risk/odds ratios or logistic regression? That ignores time. If time at risk were the same for everyone, you could just use proportions.
Survival Analysis: Terms Time-to-event: The time from entry into a study until a subject has a particular outcome. Censoring: Subjects are said to be censored if they are lost to follow-up or drop out of the study, or if the study ends before they die or have an outcome of interest. They are counted as alive or disease-free for the time they were enrolled in the study. If dropout is related to both outcome and treatment, dropouts may bias the results. For example, PhD candidates who are likely to take longest may also be most likely to drop out, thereby biasing results.
Data Structure: survival analysis Two-variable outcome : Time variable: ti = time at last disease-free observation or time at event Censoring variable: ci =1 if had the event; ci =0 no event by time ti
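The two-variable outcome can be held as one (time, event) pair per subject. The sketch below is only an illustration: the class name Observation and the four follow-up records are hypothetical, not data from these slides.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    time: float  # t_i: time at event, or time at last disease-free observation
    event: int   # c_i: 1 = had the event, 0 = no event by time t_i (censored)

# Hypothetical follow-up data, in months
data = [
    Observation(5.0, 1),
    Observation(8.0, 0),
    Observation(12.0, 1),
    Observation(12.0, 0),
]

n_events = sum(o.event for o in data)      # subjects with an observed event
n_censored = len(data) - n_events          # subjects censored
person_time = sum(o.time for o in data)    # total time at risk

print(n_events, n_censored, person_time)
```

Every subject contributes follow-up time, whether or not the event was observed; that is what separates this layout from a plain binary outcome.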
CENSORING Different types Right Left Interval Each leads to a different likelihood function Most common is right censored
Right censored data “Type I censoring” Event is observed if it occurs before some prespecified time Mouse study Clock starts: at first day of treatment Clock ends: at death Always be thinking about ‘the clock’
Simple example: Type I censoring [Figure: patient timelines starting at Time 0.]
Introduce "administrative" censoring [Figure: patient timelines from Time 0 to STUDY END.]
More realistic: clinical trial "Generalized Type I censoring" [Figure: patient timelines from Time 0 to STUDY END.]
Additional issues Patient drop-out Loss to follow-up
Drop-out or LTFU (loss to follow-up) [Figure: patient timelines from Time 0 to STUDY END.]
How do we 'treat' the data? Shift everything so each patient's time represents time on study, measured from the time of enrollment.
Another type of censoring: Competing Risks Patient can have either event of interest or another event prior to it Event types ‘compete’ with one another Example of competers: Death from lung cancer Death from heart disease Common issue not commonly addressed, but gaining more recognition
Left Censoring The event has occurred prior to the start of the study, OR the true survival time is less than the person's observed survival time. We know the event occurred, but are unsure when prior to observation. In this kind of study, the exact time would be known if the event occurred after the study started. Examples: Survey question: when did you first smoke? Alzheimer's disease: onset generally hard to determine. HPV: infection time.
Interval censoring Due to discrete observation times, actual times not observed Example: progression-free survival Progression of cancer defined by change in tumor size Measure in 3-6 month intervals If increase occurs, it is known to be within interval, but not exactly when. Times are biased to longer values Challenging issue when intervals are long
Key components Event: must have clear definition of what constitutes the ‘event’ Death Disease Recurrence Response Need to know when the clock starts Age at event? Time from study initiation? Time from randomization? time since response? Can event occur more than once?
Introduction to survival distributions Ti the event time for an individual, is a random variable having a probability distribution. Different models for survival data are distinguished by different choice of distribution for Ti.
Describing Survival Distributions Parametric survival analysis is based on so-called "waiting time" distributions (e.g., the exponential probability distribution). The idea is this: Assume that times-to-event for individuals in your dataset follow a continuous probability distribution (which we may or may not be able to pin down mathematically). For all possible times Ti after baseline, there is a certain probability (strictly, a probability density, since time is continuous) that an individual will have an event at time Ti. For example, human beings have a certain probability of dying at ages 3, 25, 80, and 140: P(T=3), P(T=25), P(T=80), P(T=140). These probabilities are obviously vastly different.
Probability density function: f(t) In the case of human longevity, Ti is unlikely to follow a normal distribution, because the probability of death is not highest in the middle ages, but at the beginning and end of life. Hypothetical data: People have a high chance of dying in their 70’s and 80’s; BUT they have a smaller chance of dying in their 90’s and 100’s, because few people make it long enough to die at these ages.
Probability density function: f(t) The probability of the failure time occurring at exactly time t (out of the whole range of possible t's): $f(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t) / \Delta t$
Survival function: 1-F(t) The goal of survival analysis is to estimate and compare survival experiences of different groups. Survival experience is described by the cumulative survival function: $S(t) = P(T > t) = 1 - P(T \le t) = 1 - F(t)$. F(t) is the CDF corresponding to f(t), and is "more interesting" than f(t). Example: If t=100 years, S(t=100) = probability of surviving beyond 100 years.
Cumulative survival Recall the pdf. Same hypothetical data, plotted as a cumulative distribution rather than a density. [Figures: pdf and cumulative survival curve.]
Cumulative survival P(T>20) P(T>80)
Hazard Function: new concept AGES Hazard rate is an instantaneous incidence rate.
Hazard Function A little harder to conceptualize. Instantaneous failure rate or conditional failure rate. Interpretation: approximate probability that a person who is still event-free at time t experiences the event in the next instant. Only constraint: h(t) ≥ 0. For continuous time, $h(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t \mid T \ge t) / \Delta t = f(t) / S(t)$
Hazard Function Useful for conceptualizing how the chance of an event changes over time. That is, consider the hazard 'relative' over time. Examples: Treatment-related mortality: early on, high risk of death; later on, risk of death decreases. Aging: early on, low risk of death; later on, higher risk of death. [Figures: hazard curves for treatment-related mortality and for aging.]
Shapes of hazard functions Increasing Natural aging and wear Decreasing Early failures due to device or transplant failures Bathtub Populations followed from birth Hump-shaped Initial risk of event, followed by decreasing chance of event
Examples
Median The most common way to express the 'center' of the distribution. Rarely see another quantile expressed. Find t such that S(t) = 0.5. Complication: in some applications, the median is not reached empirically. A reported median based on a model then seems like an extrapolation. Often just state 'median not reached' and give an alternative point estimate.
X-year survival rate Many applications have ‘landmark’ times that historically used to quantify survival Examples: Breast cancer: 5 year relapse-free survival Pancreatic cancer: 6 month survival Acute myeloid leukemia (AML): 12 month relapse-free survival Solve for S(t) given t
Hazard vs. density This is subtle, but the idea is: When you are born, you have a certain probability of dying at any age; that’s the probability density (think: marginal probability) Example: a woman born today has, say, a 1% chance of dying at 80 years. However, as you survive for awhile, your probabilities keep changing (think: conditional probability) Example, a woman who is 79 today has, say, a 5% chance of dying at 80 years.
A possible set of probability density, failure, survival, and hazard functions. f(t)=density function F(t)=cumulative failure h(t)=hazard function S(t)=cumulative survival
A probability density we all know: the normal distribution What do you think the hazard looks like for a normal distribution? Think of a concrete example. Suppose that times to complete the midterm exam follow a normal curve. What’s your probability of finishing at any given time given that you’re still working on it?
f(t), F(t), S(t), and h(t) for different normal distributions:
Examples: common functions to describe survival Exponential (hazard is constant over time, simplest!) Weibull (hazard function is increasing or decreasing over time)
f(t), F(t), S(t), and h(t) for different exponential distributions:
f(t), F(t), S(t), and h(t) for different Weibull distributions: Parameters of the Weibull distribution
Exponential Constant hazard function: $h(t) = h$. Exponential density function: $f(t) = h e^{-ht}$. Survival function: $S(t) = P(T > t) = e^{-ht}$.
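With numbers… Why isn't the cumulative probability of survival just 90% (rate of .01 for 10 years = 10% loss)? Because the rate applies each year only to those still at risk, so the loss compounds. Incidence rate (constant): $h = .01$ per year. Probability of developing disease at year 10: $f(10) = .01 e^{-.01(10)} = .009$. Probability of surviving past year 10: $S(10) = e^{-.01(10)} = .905$ (cumulative risk through year 10 is 9.5%).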
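The numbers on this slide can be checked in a few lines. This is a minimal sketch assuming the constant rate of .01 per year and the 10-year horizon quoted above; the variable names are illustrative only.

```python
import math

rate = 0.01   # constant incidence (hazard) rate per year
t = 10.0      # years of follow-up

survival = math.exp(-rate * t)      # S(10): probability of surviving past year 10
cumulative_risk = 1.0 - survival    # F(10): cumulative risk through year 10
density = rate * survival           # f(10): density of developing disease at year 10
expected_time = 1.0 / rate          # mean time-to-event under a constant rate

print(round(survival, 3), round(cumulative_risk, 3), round(density, 3), expected_time)
```

Survival comes out near .905 rather than .90 because each year's 1% is taken from those still at risk, not from the original cohort.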
Example… Recall this graphic. Does it look Normal, Weibull, exponential?
Example… One way to describe the survival distribution here is: P(T>76)=.01 P(T>36) = .16 P(T>20)=.20, etc.
Example… Or, more compactly, try to describe this as an exponential probability function, since that is how it is drawn! Recall the exponential probability distribution: If T ~ exp(h), then $P(T=t) = h e^{-ht}$, where h is a constant rate. Here: Event time, T ~ exp(Rate)
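Example… To get from the instantaneous probability (density), $P(T=t) = h e^{-ht}$, to a cumulative probability of death, integrate. Area to the left: $P(T \le t) = \int_0^t h e^{-hu}\,du = 1 - e^{-ht}$. Area to the right: $P(T > t) = \int_t^{\infty} h e^{-hu}\,du = e^{-ht}$.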
Example… Solve for h:
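One way to carry out this step, assuming the exponential survivor function P(T>t) = exp(-ht) and the three survival probabilities read off the curve earlier (P(T>20)=.20, P(T>36)=.16, P(T>76)=.01), is to solve for h at each point. The helper name exponential_rate is made up for this sketch.

```python
import math

def exponential_rate(t, surv):
    """Rate h that solves exp(-h * t) = surv."""
    return -math.log(surv) / t

# (age, probability of surviving past that age), as quoted for the 1669 curve
points = [(20, 0.20), (36, 0.16), (76, 0.01)]
rates = [exponential_rate(t, s) for t, s in points]

for (t, s), h in zip(points, rates):
    print(f"P(T>{t}) = {s}: h = {h:.3f}")
```

The three points give rates of roughly .08, .05 and .06, so a single constant rate describes the curve only approximately.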
Example… This is a “parametric” survivor function, since we’ve estimated the parameter h.
Hazard rates could also change over time… Example: Hazard rate increases linearly with time.
Relating these functions (a little calculus just for fun…): If you know one, you can derive all the others. We saw special case of 2 and 3 with exponential waiting times.
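The numbered relations this slide refers to are not reproduced in the text here. For reference, the standard identities linking the four functions, written with the f, F, S and h notation used throughout, are:

```latex
\begin{align*}
F(t) &= P(T \le t) = \int_0^t f(u)\,du, & S(t) &= 1 - F(t), \\
h(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\ln S(t), & S(t) &= \exp\!\left(-\int_0^t h(u)\,du\right), \\
f(t) &= h(t)\,\exp\!\left(-\int_0^t h(u)\,du\right).
\end{align*}
```

With a constant hazard h(t) = h these reduce to S(t) = e^{-ht} and f(t) = h e^{-ht}, the exponential case seen earlier.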
Getting density from hazard… Example: Hazard rate increases linearly with time.
Getting survival from hazard…
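For the linearly increasing hazard mentioned above, suppose h(t) = kt. The slope k = 0.01 below is a hypothetical value chosen only for illustration. Then S(t) = exp(-kt²/2) and f(t) = kt exp(-kt²/2). The sketch integrates the hazard numerically and compares the result with that closed form; the function name survival_from_hazard is illustrative.

```python
import math

def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-integral of hazard from 0 to t), using the trapezoid rule."""
    dt = t / steps
    cumulative = 0.0
    for i in range(steps):
        a, b = i * dt, (i + 1) * dt
        cumulative += 0.5 * (hazard(a) + hazard(b)) * dt
    return math.exp(-cumulative)

k = 0.01  # hypothetical slope of the linear hazard

def linear_hazard(u):
    return k * u

t = 10.0
s_numeric = survival_from_hazard(linear_hazard, t)
s_closed = math.exp(-k * t ** 2 / 2)        # closed-form survival
f_closed = linear_hazard(t) * s_closed      # density = hazard * survival

print(round(s_numeric, 4), round(s_closed, 4), round(f_closed, 4))
```

The same function accepts any non-negative hazard, so it can be reused for the other hazard shapes discussed earlier.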
Methods to estimate distribution of survival times Nonparametric methods to estimate the distribution of survival times (both Kaplan-Meier and life table methods) Parametric models – Weibull model, Exponential model and Lognormal model Semi-parametric model – Cox proportional hazards model
Objectives To understand how to describe survival times To understand how to choose a survival analysis model
Survival Data (1) Example one: Four Liver Cancer Patients
Patient  Date of Diagnosis  Endpoint  Date of Death or Censoring  Survival Time (Days)  Treatment
Mike     1/2/02             Dead      9/1/02                      242                   A
Kathy    4/7/02             Dead      7/8/02                      92
Tom      3/3/02             Alive     11/4/02                     246+                  B
Susan    2/4/02             Dead      11/3/02                     272
Complete data (noncensored data): survival time = 242, 92, 272
Incomplete data (censored data): survival time = 246+ for Tom
The survival time for Tom will exceed 246 days, but we don't know the exact survival time for Tom.
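The survival times in this example are the number of days between diagnosis and death or censoring. A small sketch, assuming the dates are written month/day/year (the function name survival_days is illustrative):

```python
from datetime import datetime

def survival_days(start, end, fmt="%m/%d/%y"):
    """Days from diagnosis to death or censoring."""
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).days

# (patient, date of diagnosis, date of death or censoring, event: 1 = dead, 0 = censored)
patients = [
    ("Mike", "1/2/02", "9/1/02", 1),
    ("Kathy", "4/7/02", "7/8/02", 1),
    ("Tom", "3/3/02", "11/4/02", 0),
    ("Susan", "2/4/02", "11/3/02", 1),
]

records = [(name, survival_days(start, end), event)
           for name, start, end, event in patients]

for name, days, event in records:
    print(name, f"{days}{'' if event else '+'}")
```

Tom's record prints as 246+, the usual notation for a censored time.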
Survival Data (2) Right-Censored Data: Subjects observed to be event-free to a certain time beyond which their status is unknown 1. Subjects sometimes withdraw from a study, or die from other causes (diseases). 2. The study is completed before the endpoint is reached. Methods for survival analysis must account for both censored and noncensored data.
Survival Data (3) Survival analysis assumes censoring is random. Censoring times vary across individuals and are not under the control of the investigator. Random censoring also includes designs in which observation ends at the same time for all individuals, but begins at different times.
Survival Data (4) Example two: Researchers treated 65 multiple myeloma patients with alkylating agents. Of those patients, 48 died during the study and 17 survived. The goal of this study is to identify important prognostic factors.
TIME: survival time in months from diagnosis
STATUS: 1 = dead, 0 = alive (censored)
LOGBUN: log blood urea nitrogen (BUN) at diagnosis
HGB: hemoglobin at diagnosis
PLATELET: platelets at diagnosis: 0 = abnormal, 1 = normal
AGE: age at diagnosis in years
LOGWBC: log WBC at diagnosis
FRACTURE: fractures at diagnosis: 0 = none, 1 = present
LOGPBM: log percentage of plasma cells in bone marrow
PROTEIN: proteinuria at diagnosis
SALCIUM: serum calcium at diagnosis
Survival Data (5) – more examples Survival analysis techniques arose from the life insurance industry as a method of costing insurance premiums. The term “survival” does not limit the usefulness of the technique to issues of life and death. A “survival” analysis could be used to examine: The survival time after a heart transplant The time a kidney graft remains functional The time from marriage to divorce The time from release to first arrest The time to a job change
Nonparametric Methods 1. Kaplan-Meier method (also called product-limit method) 2. Life table method To estimate the distribution of survival times -- estimate the survival rate -- calculate the median survival time -- graphs: survival curve, log(time) against log[-log(survival rate)] -- comparison of two survival curves
How to describe survival times (1) Product-Limit (Kaplan-Meier) Survival Estimates
                                  Survival
                                  Standard   Number   Number
TIME       Survival   Failure    Error      Failed   Left
0.0000     1.0000     0          0          0        65
1.2500     .          .          .          1        64
1.2500     0.9692     0.0308     0.0214     2        63
2.0000     .          .          .          3        62
2.0000     .          .          .          4        61
2.0000     0.9231     0.0769     0.0331     5        60
3.0000     0.9077     0.0923     0.0359     6        59
4.0000*    .          .          .          6        58
4.0000*    .          .          .          6        57
5.0000     .          .          .          7        56
5.0000     0.8758     0.1242     0.0411     8        55
---------------------------------------------------------------
89.0000    0.0414     0.9586     0.0382     47       1
92.0000    0          1.0000     0          48       0
NOTE: The marked survival times are censored observations.
How to describe survival times (2) Product-Limit (Kaplan-Meier) Survival Estimates
ni: the number of surviving units just prior to ti
di: the number of units that fail at ti
q = di / ni
p = 1 - q
time   ni   di   q      p       survival rate
1.25   65   2    2/65   63/65   (63/65) = 0.9692
2      63   3    3/63   60/63   (63/65)(60/63) = 0.9231
3      60   1    1/60   59/60   (63/65)(60/63)(59/60) = 0.9077
5      57   2    2/57   55/57   (63/65)(60/63)(59/60)(55/57) = 0.8758
Applied Epidemiologic Analysis Fall 2002
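The product-limit calculation can be sketched in a few lines. The function below is an illustrative implementation, not the SAS procedure that produced the printed output, and it adds Greenwood's formula for the standard error. The input reproduces only the first rows of the table: 2 deaths at 1.25, 3 at 2, 1 at 3, 2 censored at 4, 2 deaths at 5. The remaining 55 patients are placed at a hypothetical censoring time of 100 purely so that the risk sets have the right size.

```python
import math
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t), standard error)] at each distinct event time.

    times  : follow-up time per subject
    events : 1 = event observed, 0 = censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)          # events plus censorings at each time
    at_risk = len(times)
    surv, greenwood, out = 1.0, 0.0, []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk                 # multiply by p = 1 - d/n
            if at_risk > d:
                greenwood += d / (at_risk * (at_risk - d))
            out.append((t, surv, surv * math.sqrt(greenwood)))
        at_risk -= leaving[t]                         # drop deaths and censored
    return out

# First rows of the myeloma table; the last 55 subjects are a placeholder
times = [1.25] * 2 + [2.0] * 3 + [3.0] + [4.0] * 2 + [5.0] * 2 + [100.0] * 55
events = [1] * 2 + [1] * 3 + [1] + [0] * 2 + [1] * 2 + [0] * 55

estimates = kaplan_meier(times, events)
for t, s, se in estimates:
    print(f"{t:6.2f}  {s:.4f}  {se:.4f}")
```

Subjects censored at a given time are treated as still at risk at that time and are removed afterwards, which is why the risk set at time 5 is 57.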
How to describe survival times (3) Product-Limit (Kaplan-Meier) Survival Estimates The Kaplan-Meier method uses the actual observed event and censoring times. A problem arises with the Kaplan-Meier method if there are censored times later than the last event time: the average duration will be underestimated when we use only the time until the last event occurs. In practice, the interpretation in such cases is restricted to the length of time until the last event occurs.
How to describe survival times (4) Life Table Survival Estimates
Interval            Number   Number     Effective     Conditional Probability
[Lower,   Upper)    Failed   Censored   Sample Size   of Failure                Survival
                    NF       NC         n             q                         p
0         10        16       5          62.5          0.2560                    1.0000
10        20        15       7          40.5          0.3704                    0.7440
20        30        3        1          21.5          0.1395                    0.4684
30        40        3        0          18.0          0.1667                    0.4031
40        50        2        1          14.5          0.1379                    0.3359
50        60        4        2          11.0          0.3636                    0.2896
60        70        2        0          6.0           0.3333                    0.1843
70        80        0        1          3.5           0                         0.1228
80        90        2        0          3.0           0.6667                    0.1228
90        .         1        0          1.0           1.0000                    0.0409
n = N – ½ (NC); 62.5 = 65 – 5/2, 40.5 = 44 – 7/2
q = NF / n; 0.2560 = 16/62.5, 0.3704 = 15/40.5
p = П(1-q) over the preceding intervals; 0.7440 = 1 – 0.2560, 0.4684 = (1-0.2560)(1-0.3704)
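The three formulas under the table can be applied interval by interval. The sketch below uses the failed and censored counts from the table and 65 subjects at the start; the function name life_table is illustrative.

```python
def life_table(n_start, intervals):
    """intervals: list of (failed, censored) per interval, in time order.

    Returns one row per interval:
    (effective sample size n, conditional probability of failure q,
     survival at the start of the interval).
    """
    rows, at_risk, surv = [], n_start, 1.0
    for failed, censored in intervals:
        effective_n = at_risk - censored / 2.0   # n = N - NC/2
        q = failed / effective_n                 # q = NF / n
        rows.append((effective_n, q, surv))
        surv *= 1.0 - q                          # running product of (1 - q)
        at_risk -= failed + censored
    return rows

# (NF, NC) for the ten intervals in the table
counts = [(16, 5), (15, 7), (3, 1), (3, 0), (2, 1),
          (4, 2), (2, 0), (0, 1), (2, 0), (1, 0)]
table = life_table(65, counts)

for n, q, s in table:
    print(f"{n:5.1f}  {q:.4f}  {s:.4f}")
```

Censored subjects are counted as at risk for half of the interval, which is where the ½ in the effective sample size comes from.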
How to describe survival times (5) Life Table Survival Estimates The Life Table method groups survival times into time intervals. The Life Table method is very useful for a large sample, but the estimated results depend on the chosen interval length: the wider the interval, the poorer the estimates. You should apply the Kaplan-Meier method if the sample is not very large.
How to describe survival times (6) Survival Curve
How to describe survival times (7) Summary Statistics for Time Variable
           Point       95% Confidence Interval
Percent    Estimate    [Lower       Upper)
75         52.0000     35.0000      67.0000
50         19.0000     15.0000      35.0000
25         9.0000      6.0000       14.0000
Mean       Standard Error
32.1460    4.0301
Total   Failed   Censored   Percent Censored
65      48       17         26.15
How to describe survival times (8) Median Survival Time The median survival time is defined as the value at which 50% of the individuals have longer survival times and 50% have shorter survival times. One reason for reporting the median rather than the mean survival time is that distributions of survival times tend to be skewed, sometimes with a small number of long-term 'survivors'. Another reason is that the mean survival time cannot be calculated directly when some survival times are censored.
How to describe survival times (9) How to estimate median survival time If there are no censored data, the median survival time is estimated by the middle observation of the ranked survival times. In the presence of censored data the median survival time is estimated by first calculating the Kaplan-Meier survival curve, then finding the value of survival time when survival rate=0.50 (50%)
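Reading the median off a Kaplan-Meier curve can be sketched as below. The rule used here, the first time at which the estimated survival falls to 0.50 or below, is one common convention and may differ from a given software package when the curve sits exactly at 0.50. The function name median_survival and the two step curves are hypothetical.

```python
def median_survival(curve):
    """curve: [(time, S(time))] sorted by time, e.g. from a Kaplan-Meier fit.

    Returns the first time at which S(t) <= 0.5,
    or None if the median is not reached.
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical step curves
reached = [(2.0, 0.90), (6.0, 0.70), (11.0, 0.45), (20.0, 0.20)]
not_reached = [(2.0, 0.90), (6.0, 0.70)]

print(median_survival(reached))      # 11.0
print(median_survival(not_reached))  # None
```

The None case corresponds to heavily censored data where the curve never drops to 50%, the situation usually reported as 'median not reached'.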
How to describe survival times (10) Graph of Log Negative Log SDF versus Log Time Exponential Distribution The graph is approximately a straight line, the slope is 1. Weibull Distribution The graph is approximately a straight line, but the slope is greater or less than 1.
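The straight-line behaviour follows directly from the Weibull survival function. Assuming the parametrization below, with a scale λ that is notation introduced only for this note and the shape parameter α used for the Weibull model in these slides:

```latex
\begin{align*}
S(t) &= \exp\{-(\lambda t)^{\alpha}\} \\
\log\{-\log S(t)\} &= \alpha \log \lambda + \alpha \log t
\end{align*}
```

So log(-log S(t)) is linear in log t with slope α. The exponential distribution is the case α = 1, which gives slope 1.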
How to describe survival times (11) Graph of Log Negative Log SDF versus Log Time
Comparison of Two Survival Curves (1)
Comparison of Two Survival Curves (2) Median Survival Time
Group 1: PLATELET = 0 (abnormal)
           Point       95% Confidence Interval
Percent    Estimate    [Lower       Upper)
50         13.0000     6.0000       35.0000
Group 2: PLATELET = 1 (normal)
           Point       95% Confidence Interval
Percent    Estimate    [Lower       Upper)
50         24.0000     16.0000      41.0000
Comparison of Two Survival Curves (3) Test of Equality of Two Survival Curves
Test        Chi-Square   DF   P Value
Log-Rank    3.2923       1    0.0696
Wilcoxon    2.3724       1    0.1235
-2Log(LR)   2.4065       1    0.1208
Log-Rank test: suited to Weibull-distributed data or data meeting the proportional hazards assumption. It uses weight = 1, so each failure time has equal weight, which places less emphasis on the earlier failure times than the Wilcoxon test does.
Wilcoxon test: suited to lognormally distributed data. It uses weight = the total number at risk at that time, so earlier times receive greater weight than later times, placing less emphasis on the later failure times.
-2Log(LR): likelihood ratio test for survival data that follow an exponential distribution.
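To make the log-rank statistic concrete, here is a from-scratch sketch for two groups with weight 1 at every failure time. It is an illustration rather than the SAS implementation behind the table, and the two small samples are hypothetical.

```python
def logrank_chi2(times1, events1, times2, events2):
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    data = [(t, e, 1) for t, e in zip(times1, events1)]
    data += [(t, e, 2) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n1 = sum(1 for u, _, g in data if u >= t and g == 1)     # at risk, group 1
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # events at t
        d1 = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n                         # observed - expected
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0

# Hypothetical samples: group A fails early, group B fails late (no censoring)
group_a = ([1, 2, 3], [1, 1, 1])
group_b = ([4, 5, 6], [1, 1, 1])
chi2 = logrank_chi2(*group_a, *group_b)
print(round(chi2, 4))
```

A Wilcoxon-type version would multiply each observed-minus-expected term by the number at risk n, and each variance term by n squared, which is how the earlier failure times gain weight.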
Parametric Models (1) Whenever fundamental hypotheses are to be tested or you have clear idea about the distribution of survival data, you should use a parametric model. Three most common parametric models: 1. Exponential regression model 2. Weibull regression model 3. Lognormal regression model
Parametric Models (2) Exponential Regression Model The exponential distribution is a useful form of the survival distribution when the hazard function (instantaneous failure rate) is constant and does not depend on time. The graph of log(-log(survival)) against log(time) is then approximately a straight line with slope = 1. In the biomedical field, a constant hazard function is usually unrealistic.
Parametric Models (3) Weibull Regression Model The hazard function changes with time. The graph of log(-log(survival)) against log(time) is approximately a straight line, but the slope is not 1. The hazard function always increases when the parameter α > 1. The hazard function always decreases when α < 1. It reduces to the exponential regression model when α = 1.
Parametric Models (4) Lognormal Regression Model The survival times follow a log-normal distribution. The hazard function changes with time. The hazard function first increases and then decreases (an inverted "U" shape).
Cox Model (1) Disadvantages of parametric models: 1. It is necessary to decide how the hazard function depends on time. 2. It may be difficult to find a parametric model if the hazard function is believed to be nonmonotonic. 3. Parametric models do not easily allow for explanatory variables whose values change over time; it is cumbersome to develop fully parametric models that include time-varying covariates. Time-varying covariates are very important in survival analysis: 1) continuous time-varying variable: income changes over time; 2) discrete time-varying variable: single - married - divorced - remarried.
Cox Model (2) David Cox, a British statistician, addressed these problems in 1972 in a paper entitled "Regression Models and Life-Tables (with Discussion)," Journal of the Royal Statistical Society, Series B, 34:187-220. h(t|xi) = h0(t) exp (βixi)
Cox Model (3) Why is the Cox model a semiparametric model? h(t|xi) = h0(t) exp (βixi) h0(t): nonparametric baseline hazard function. This function does not have to be specified, and the hazard may change as a function of time. exp (βixi): parametric form for the effects of the covariates. The hazard function changes as an exponential function of the covariates.
Cox Model (4) Why is the Cox model a 'proportional hazards' model? For any two individuals (or groups) i and j, at any point in time, the ratio of their hazards is a constant (a fixed proportion). For any time t, hi(t) / hj(t) = C. C may depend on explanatory variables but not on time.
Cox Model (5) What is a partial likelihood? It is easy for a statistician to write down a model: h(t|xi) = h0(t) exp (βixi). It isn't easy to devise ways to estimate this model. Cox's most important contribution was to propose an estimation method called partial likelihood, so named because it does not include the baseline hazard function h0(t). Partial likelihood depends only on the order in which events occur, not on the exact times of occurrence.
Cox Model (6) What is a partial likelihood? (cont) Partial likelihood accounts for censored survival times. Partial likelihood allows time-dependent explanatory variables. It is not fully efficient because some information is lost by ignoring the exact times of event occurrence. But the loss of efficiency is usually so small that it is not worth worrying about.
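The constant C follows from the model itself. For a single covariate with coefficient β, and writing x_i and x_j for the covariate values of the two individuals, the baseline hazard cancels in the ratio:

```latex
\begin{align*}
\frac{h_i(t)}{h_j(t)}
  = \frac{h_0(t)\exp(\beta x_i)}{h_0(t)\exp(\beta x_j)}
  = \exp\{\beta (x_i - x_j)\} = C
\end{align*}
```

The right-hand side involves the covariates but not t, which is the proportional hazards property.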
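A minimal sketch of the log partial likelihood for a single covariate follows, assuming no tied event times (ties need a correction such as Breslow's or Efron's, which is not shown). The function name and the four-subject data set are hypothetical. Each observed event contributes its own risk score divided by the sum of the risk scores of everyone still at risk at that time.

```python
import math

def cox_log_partial_likelihood(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i == 1:
            # risk set: subjects still under observation at time t_i
            risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
            denom = sum(math.exp(beta * x[j]) for j in risk_set)
            total += beta * x[i] - math.log(denom)
    return total

# Hypothetical data: follow-up time, event indicator, binary covariate
times = [2.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1]
x = [1, 0, 1, 0]

ll = cox_log_partial_likelihood(0.5, times, events, x)
# Squaring the times keeps their order, so the value does not change
ll_squared = cox_log_partial_likelihood(0.5, [t ** 2 for t in times], events, x)
print(ll == ll_squared)
```

The baseline hazard never appears, and the censored subject enters only through the risk sets, which matches the properties listed on these two slides.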
Cox Model (7) Using Cox model to fit our data (final model)
            Parameter   Standard                              Hazard   95% Hazard Ratio
Variable    Estimate    Error      Chi-Square   Pr > ChiSq    Ratio    Confidence Limits
LOGBUN      1.67440     0.61209    7.4833       0.0062        5.336    1.608    17.709
HGB         -0.11899    0.05751    4.2811       0.0385        0.888    0.793    0.994
The hazard ratio (also known as risk ratio) is the ratio of the hazard functions that correspond to a change of one unit of the given variable, conditional on fixed values of all other variables. An increase of one unit in the log of blood urea nitrogen increases the hazard of dying by 433.6% (5.336-1). An increase of one unit in hemoglobin at diagnosis decreases the hazard of dying by 11.2% (1-0.888).
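The hazard ratio columns can be recovered from the parameter estimates and standard errors. The sketch below assumes Wald-type 95% limits with z = 1.96, which reproduces the printed values to the precision shown; the function name hazard_ratio is illustrative.

```python
import math

def hazard_ratio(estimate, se, z=1.96):
    """Hazard ratio, Wald 95% limits and Wald chi-square for a Cox coefficient."""
    hr = math.exp(estimate)
    lower = math.exp(estimate - z * se)
    upper = math.exp(estimate + z * se)
    chi2 = (estimate / se) ** 2
    return hr, lower, upper, chi2

logbun = hazard_ratio(1.67440, 0.61209)
hgb = hazard_ratio(-0.11899, 0.05751)

for name, (hr, lo, hi, chi2) in [("LOGBUN", logbun), ("HGB", hgb)]:
    print(f"{name:7s} HR={hr:.3f}  95% CI=({lo:.3f}, {hi:.3f})  chi2={chi2:.4f}")
```

The interval is symmetric on the log scale, so it is not symmetric around the hazard ratio itself.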
Cox Model (8): Examine Proportional Hazards Assumption 1. Checking the assumption graphically The two plots appear 'parallel' in that there is an approximately constant vertical distance between them at any given time. The hazards for the two groups are therefore proportional: their ratio remains approximately constant with time.
Cox Model (9) Examine Proportional Hazards Assumption cont. 2. Statistical test of the assumption Test for an increasing or decreasing trend over time in the hazard ratio by investigating the interaction between time and the covariate. A significant interaction would imply that the hazard ratio changes with time, so the proportional hazards assumption is invalid.
How do you decide which model to use? (1) How does hazard function depend on time? Examples The hazard function for retirement increases with age. The hazard function for being arrested declines with age at least after age 25. The hazard function for death from any cause has “U” shape.
How do you decide which model to use? (2) 1. Use an exponential regression model if the hazard function is constant and does not depend on time. 2. Use a Weibull regression model (monotonic models) if the hazard function always increases or always decreases with time. 3. Use a lognormal regression model (nonmonotonic models) if the hazard function first increases and then decreases with time (an inverted "U" shape).
How do you decide which model to use? (3) 4. Use a Cox regression model if the hazard function first decreases and then increases, or changes dynamically (a "U" shape or other shapes). The Cox model can fit any distribution of survival data if the proportional hazards assumption is valid (in practice, most hazard ratios are roughly constant over time). This is why the Cox model is used so widely now. Note that a fitted Cox model cannot be used directly for forecasting, because it gives only exp (βixi) and not the baseline hazard function h0(t). We have to estimate h0(t) (by using the BASELINE statement in SAS) before we forecast.