Event History Analysis 7

Event History Analysis 7. Sociology 8811, Lecture 21. Copyright © 2007 by Evan Schofer. Do not copy or distribute without permission.

Announcements
Paper Assignment #2 is due April 26. Try to find a dataset soon. Class topic: parametric EHA models and diagnostics. Later (if time allows): AFT models and discrete time models.

Parametric Proportional Hazard Models
Cox models do not specify a functional form for the hazard curve, h(t). Rather, they examine effects of variables net of a baseline hazard trend (to be inferred from the data): $h(t) = h_0(t)\,e^{bX} = h_0(t)\exp(bX)$. Parametric models specify the general shape of the hazard curve. The approach is more familiar, more like regression, where we can model Y as a constant, a linear function, a logit function, a binomial or Poisson function, etc. For instance, we could assume h(t) is a linear function of time, then solve for the value of the hazard slope that best fits the data (plus the effects of other covariates on the hazard rate).

Parametric Proportional Hazard Models
Parametric models work best when you choose a curve that fits the data, just like OLS regression, which works best when the relationship between two variables is roughly linear. If the actual relationship between two variables is non-linear, coefficient estimates may be incorrect, though sometimes one can transform variables (e.g., logging them) to get a good fit. Parametric models are more efficient than Cox models: they can generate more precise estimates for a given sample size. But they can also be more wildly incorrect if you mis-specify h(t)! Note: these are proportional hazard models, like Cox. You must still check the proportional hazard assumption.

Exponential (Constant Rate) Model
Exponential models are simplest: $h(t) = e^{a}\,e^{bX} = \exp(a + bX)$. Note that there is no “t” in the equation: no coefficient specifies time dependence of the hazard rate. Rather, there are just exponentiated bXs plus a, the constant. Note 2: Box-Steffensmeier & Jones write $h(t) = e^{-(bX)}$. An exponential model solves for the constant value (a) that best fits the data, along with values of the bs, which reflect effects of the X variables. In effect, the model assumes a constant hazard rate.

Exponential (Constant Rate) Model
Another way of looking at it: an exponential model is a lot like a Cox model, but with the assumption that the baseline hazard is a constant!
Exponential: $h(t) = e^{a}\,e^{bX}$
Cox: $h(t) = h_0(t)\,e^{bX}$

Exponential (Constant Rate) Model
Basic model. The constant reflects the base rate.

. streg gdp degradation education democracy ngo ingo, dist(exponential) nohr

Exponential regression -- log relative-hazard form

No. of subjects      =          92           Number of obs   =      1938
No. of failures      =          77
Time at risk         =        1938
                                             Wald chi2(6)    =     94.29
Log pseudolikelihood =   282.11796           Prob > chi2     =    0.0000

------------------------------------------------------------------------------
             |               Robust
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   -.044568   .1842564    -0.24   0.809    -.4057039    .3165679
 degradation |  -.4766958   .1044108    -4.57   0.000    -.6813372   -.2720543
   education |   .0377531   .0130314     2.90   0.004     .0122121    .0632942
   democracy |   .2295392   .0959669     2.39   0.017     .0414475     .417631
         ngo |   .4258148   .1576803     2.70   0.007     .1167671    .7348624
        ingo |   .3114173    .365112     0.85   0.394    -.4041891    1.027024
       _cons |  -4.565513   1.864396    -2.45   0.014    -8.219663   -.9113642
------------------------------------------------------------------------------

The constant shows the base hazard rate estimated from the data: exp(-4.57) = .01

Exponential (Constant Rate) Model
Suppose we plotted the baseline hazard rate estimated from our exponential model. It would be a flat line: h(t) = .01. This is the estimated hazard if all X variables are zero. If we plotted the estimated hazard for some other values of X (ex: democracy = 10), we would get a higher value: since democracy has a positive effect, democracy = 10 yields a higher hazard than democracy = 0. But, again, the estimated hazard rate trend would be a flat line over time.
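The arithmetic behind the flat lines can be sketched in a few lines of Python. This is only an illustration: it reuses the `_cons` and `democracy` coefficients from the exponential output, and the function name `exponential_hazard` and the choice to hold all other covariates at zero are assumptions made for the example.

```python
import math

# Coefficients copied from the exponential streg output in the slides.
CONS = -4.565513          # _cons: log of the baseline hazard
B_DEMOCRACY = 0.2295392   # log relative-hazard per unit of democracy

def exponential_hazard(democracy, other_xb=0.0):
    """Constant hazard exp(a + bX).

    other_xb is the summed b*X of the remaining covariates
    (assumed to be zero here, purely for illustration).
    """
    return math.exp(CONS + B_DEMOCRACY * democracy + other_xb)

baseline = exponential_hazard(0)       # all X variables at zero
democ_10 = exponential_hazard(10)      # democracy = 10, others at zero
hazard_ratio = democ_10 / baseline     # equals exp(10 * b); no t anywhere

print(f"baseline hazard:        {baseline:.4f}")
print(f"hazard at democracy=10: {democ_10:.4f}")
print(f"hazard ratio:           {hazard_ratio:.2f}")
```

Because time never enters the function, both hazards plot as horizontal lines; only their levels differ.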

Exponential Model: Baseline Hazard
Ex: stcurve, hazard
[Figure: estimated baseline hazard from the exponential model.] See, the estimated baseline hazard really is flat!

Exponential Model: Estimated Hazard
stcurve, hazard at1(democ=1) at2(democ=10)
[Figure: estimated hazards for 2 groups, with other variables pegged at their means.]

Exponential Model: Baseline Hazard
Issue: the actual hazard is rising. Is that a problem? Is an exponential model appropriate? Answer: it can be, IF we have X variables that account for the increasing hazard. If not, fit will be poor!

Exponential (Constant Rate) Model
Cleves et al. 2004, p. 216: In the exponential model, h(t) being constant means that the failure rate is independent of time, and thus the failure process is said to lack memory. You may be tempted to view exponential regression as suitable for use only in the simplest of cases. This would be unfair. There is another sense in which the exponential model is the basis for all other models. The baseline hazard… is constant … the way in which the overall hazard varies is purely a function of bX. The overall hazard need not be constant with time; it is just that every bit of how the hazard varies must be specified in bX. If you fully understand a process, you should be able to do that. When you do not understand a process, you are forced to assign a role to time, and in that way, you hope, put to the side your ignorance and still describe the part of the process that you do understand. In addition, exponential models can be used to model the overall hazard as a function of time, if they include t or functions of t as covariates.

Exponential (Constant Rate) Model
The exponential model is extremely flexible. You specify substantive covariates (X variables) to explain failures. Variation in the hazard over time is probably not due to some inherent feature of time, but rather to some variable that you hope to control for. If you do a great job, you will fully explain why the hazard rate appears to go up (or down) over time. You can also include functions of time as independent variables to address temporal variation: independent (X) variables can include time dummies, log time, linear time, time interactions, etc. That is, if you can’t explain time variation with substantive X variables, you can add time variables to model it. But, if you mis-specify your model, results will be biased. In that case, you might be better off with a Cox model.

Piecewise Exponential Model
If you have a lot of cases, you can estimate a piecewise model: essentially a separate model for different chunks of time. The model will yield different coefficients and a different base rate (constant) for each chunk of time. Even if the hazard is not constant over time, it may be more or less constant within each period. This allows you to effectively model any hazard trend. A related approach: put in time-period dummies. This gives a single set of bX coefficient estimates, but allows you to specify changes in the hazard rate over different periods. NOTE: Don’t forget to omit one of the time dummies!
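To make the piecewise idea concrete, here is a minimal sketch for the covariate-free case, where the maximum likelihood estimate of a constant hazard within a period is simply failures divided by time at risk in that period. The function `piecewise_rates`, the toy durations, and the cut points are all hypothetical; a real analysis would add covariates and use survival software.

```python
import math

def piecewise_rates(durations, events, cuts):
    """Estimate one constant hazard per time period (no covariates).

    durations: observed time for each case
    events:    1 if the case failed at its duration, 0 if censored
    cuts:      period boundaries, e.g. [0, 5, 10, inf]
    """
    rates = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        # Time each case spends at risk inside (lo, hi].
        exposure = sum(max(0.0, min(t, hi) - lo) for t in durations)
        # Failures whose event time falls inside (lo, hi].
        failures = sum(1 for t, d in zip(durations, events) if d and lo < t <= hi)
        rates.append(failures / exposure if exposure > 0 else float("nan"))
    return rates

# Toy data (hypothetical): five cases, one censored at t = 6.
durations = [2, 4, 6, 8, 12]
events = [1, 1, 0, 1, 1]
cuts = [0, 5, 10, math.inf]

rates = piecewise_rates(durations, events, cuts)
for (lo, hi), r in zip(zip(cuts[:-1], cuts[1:]), rates):
    print(f"period ({lo}, {hi}]: hazard = {r:.3f}")
```

Each period gets its own flat hazard, so the sequence of rates traces out a step function that can approximate a rising or falling trend.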

Parametric Models
Let’s try a more complex parametric model. Example: let’s specify a linear time trend.
Exponential: $h(t) = e^{a}\,e^{bX}$
Linear: $h(t) = (a + b_0 t)\,e^{bX}$
In this case, we estimate a constant (a) and slope ($b_0$) which best summarize the time dependence of the hazard rate. Note: this isn’t common; we have better options.

Gompertz Models
Another option: an exponentiated line. Rather than a linear function of time and an exponentiated function of bX, we’ll exponentiate everything.
Exponentiated linear (Gompertz): $h(t) = e^{a + \gamma t}\,e^{bX}$
The slope coefficient is often represented by gamma ($\gamma$). Note: exponentiation alters the line; it isn’t a simple linear function anymore. It is flat if $\gamma = 0$, monotonically increasing if $\gamma > 0$, and monotonically decreasing if $\gamma < 0$.

Gompertz Models
Exponentiating a linear function generates a curve defined by the value of gamma ($\gamma$). The model estimates the value of $\gamma$ that best fits the data.
[Figure: Gompertz hazard curves for $\gamma = 0$, $\gamma < 0$, $\gamma > 0$, and $\gamma \gg 0$.]
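A short sketch can stand in for the curves. It assumes the Gompertz proportional-hazards form with hazard exp(gamma*t) * exp(a + bX); the function name `gompertz_hazard` and the values of gamma and a are hypothetical, chosen only to show the three shapes.

```python
import math

def gompertz_hazard(t, gamma, a=0.0, xb=0.0):
    """Gompertz hazard: exp(gamma * t) * exp(a + bX)."""
    return math.exp(gamma * t) * math.exp(a + xb)

times = [0, 5, 10, 20]

# Hypothetical shape parameters, one per regime described in the slides.
flat = [gompertz_hazard(t, 0.0, a=-3.0) for t in times]      # gamma = 0
rising = [gompertz_hazard(t, 0.1, a=-3.0) for t in times]    # gamma > 0
falling = [gompertz_hazard(t, -0.1, a=-3.0) for t in times]  # gamma < 0

for label, curve in [("gamma = 0", flat), ("gamma > 0", rising), ("gamma < 0", falling)]:
    print(label, [round(h, 4) for h in curve])
```

With gamma = 0 the function collapses to the exponential model's constant hazard, which is one way to see that the exponential model is a special case.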

Gompertz Model
Example:

. streg gdp degradation education democracy ngo ingo, robust nohr dist(gompertz)

Gompertz regression -- log relative-hazard form

No. of subjects      =          92           Number of obs   =      1938
No. of failures      =          77
Time at risk         =        1938
                                             Wald chi2(6)    =     46.48
Log pseudolikelihood =   307.64758           Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4633559   .2104244     2.20   0.028     .0509316    .8757802
 degradation |  -.4394712   .1434178    -3.06   0.002     -.720565   -.1583775
   education |   .0026837   .0145341     0.18   0.854    -.0258026      .03117
   democracy |   .2890106    .092612     3.12   0.002     .1074943    .4705268
         ngo |   .2522894   .1658275     1.52   0.128    -.0727265    .5773054
        ingo |   .0037688   .2275176     0.02   0.987    -.4421575    .4496952
       _cons |   -253.035   45.28363    -5.59   0.000    -341.7892   -164.2807
       gamma |    .124117   .0224506     5.53   0.000     .0801146    .1681195
------------------------------------------------------------------------------

The model estimates gamma to be positive and significant, which implies an increasing baseline hazard.

Gompertz Model: Estimated Hazard
stcurve, hazard at1(democ=1) at2(democ=10)
[Figure: estimated hazards for 2 groups, with other variables pegged at their means.] Note: the curves are actually proportional; it is hard to see because the bottom curve is nearly zero.

Weibull Models
Another option: the Weibull curve, another curve that can fit monotonic hazards.
Weibull: $h(t) = p\,t^{p-1}\,e^{a + bX}$
The model estimates p to best fit the data. The hazard is flat if p = 1, monotonically increasing if p > 1, and monotonically decreasing if p < 1.

Weibull: Visually
The Weibull family: monotonically increasing or decreasing, depending on p.
[Figure: hazard rate against time for p = .5, p = 1, p = 2, and p = 4.]
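The same shapes can be reproduced numerically. The sketch below assumes the Weibull proportional-hazards form p * t^(p-1) * exp(a + bX); `weibull_hazard` is a hypothetical helper, and the p values are the ones named on the plot.

```python
import math

def weibull_hazard(t, p, a=0.0, xb=0.0):
    """Weibull hazard: p * t**(p - 1) * exp(a + bX), for t > 0."""
    return p * t ** (p - 1) * math.exp(a + xb)

times = [0.5, 1.0, 2.0, 4.0]  # t > 0 avoids division by zero when p < 1

curves = {p: [weibull_hazard(t, p) for t in times] for p in (0.5, 1, 2, 4)}

for p, curve in curves.items():
    print(f"p = {p}: {[round(h, 3) for h in curve]}")
```

Setting p = 1 removes t from the formula entirely, leaving the exponential model's constant hazard.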

Weibull Model
Example:

. streg gdp degradation education democracy ngo ingo, robust nohr dist(weibull)

Weibull regression -- log relative-hazard form

No. of subjects =           92               Number of obs   =      1938
No. of failures =           77
Time at risk    =         1938
                                             LR chi2(6)      =     23.71
Log likelihood  =     307.6045               Prob > chi2     =    0.0006

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gdp |   .4631871   .2360589     1.96   0.050     .0005202    .9258541
 degradation |  -.4396978   .1486662    -2.96   0.003    -.7310781   -.1483175
   education |   .0027319   .0141652     0.19   0.847    -.0250314    .0304953
   democracy |    .288927   .0913855     3.16   0.002     .1098147    .4680394
         ngo |   .2522595   .1610192     1.57   0.117    -.0633324    .5678514
        ingo |    .004058   .1835743     0.02   0.982     -.355741     .363857
       _cons |  -1884.071   280.0398    -6.73   0.000    -2432.939   -1335.203
-------------+----------------------------------------------------------------
       /ln_p |   5.511481   .1486542    37.08   0.000     5.220124    5.802837
-------------+----------------------------------------------------------------
           p |   247.5173   36.79449                      184.9571    331.2381
         1/p |   .0040401   .0006006                       .003019    .0054067
------------------------------------------------------------------------------

Ancillary Parameters
Gompertz and Weibull models have parameters that determine the shape of the curve: gamma ($\gamma$) and p. Ex: bigger $\gamma$ = greater increase of h(t) over time. You can actually specify covariate effects on those parameters, effectively allowing a different curve shape across values of X variables. Ex: if you think that the hazard increases more for men than for women, you can check whether a dummy variable for male affects $\gamma$:
streg male educ, dist(gompertz) ancillary(male)
The model estimates the effect of male on the hazard AND on gamma.

Parametric: Model Fit
Parametric models use maximum likelihood estimation (MLE). Comparisons among nested models can be made using a likelihood ratio test (LR test). Just like logit, the addition of groups of variables can be tested with lrtest. Some parametric models are themselves nested. Ex: a Weibull model simplifies to an exponential model if p = 1. Thus, the exponential model is nested within the Weibull model, and LR tests can be used to see if Weibull is preferable to exponential.
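The LR arithmetic for the Weibull vs. exponential comparison can be sketched as follows. The log-likelihood values are copied from the outputs in these slides purely to show the calculation; note that the exponential figure is a pseudolikelihood from a robust fit, for which a formal LR test is not strictly valid, so treat the result as illustrative only. The helper `lr_test_1df` is hypothetical and handles only the one-restriction case (p = 1).

```python
import math

def lr_test_1df(ll_restricted, ll_full):
    """Likelihood ratio test with one restriction (e.g. Weibull p = 1).

    Returns the LR statistic and its chi-square(1) p-value.
    For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    lr = 2.0 * (ll_full - ll_restricted)
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value

# Values as reported in the slides (illustration only; see caveat above).
LL_EXPONENTIAL = 282.11796
LL_WEIBULL = 307.6045

lr_stat, p_val = lr_test_1df(LL_EXPONENTIAL, LL_WEIBULL)
print(f"LR statistic = {lr_stat:.3f}, p-value = {p_val:.2e}")
```

A large statistic with a tiny p-value would favor the Weibull model, i.e. it would reject the restriction p = 1.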

Parametric Model Fit: AIC
Non-nested parametric models can be compared via the Akaike Information Criterion: $\mathrm{AIC} = -2\ln L + 2(k + c)$, where k = # independent variables in the model and c = # shape parameters in the model (ex: p in Weibull). The exponential model has one parameter (a); Weibull has 2. AIC compares likelihoods, but corrects for the number of parameters in the model, rewarding simpler models. Low values = better model fit, even for negative values: -100 is better than -50.
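As a rough worked example, the sketch below applies the usual form AIC = -2 ln L + 2(k + c) to the three log-likelihoods reported in these slides. It assumes c = 1 for the exponential model and c = 2 for Gompertz and Weibull, and it ignores the fact that two of the fits report pseudolikelihoods; the function name `aic` is hypothetical.

```python
def aic(log_lik, k, c):
    """Akaike Information Criterion: -2 ln L + 2 (k + c)."""
    return -2.0 * log_lik + 2.0 * (k + c)

# Log-likelihoods as reported in the slides; k = 6 covariates in each model.
models = {
    "exponential": aic(282.11796, k=6, c=1),
    "gompertz": aic(307.64758, k=6, c=2),
    "weibull": aic(307.6045, k=6, c=2),
}

best = min(models, key=models.get)  # lowest AIC wins, even when negative
for name, value in sorted(models.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} AIC = {value:.3f}")
print("preferred by AIC:", best)
```

Here the Gompertz and Weibull values are nearly identical, so AIC alone gives little reason to prefer one over the other.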

Frailty
Two kinds of models:
Shared frailty: a “random effects” model. Useful for clustered data (non-independent cases). Can be used with Cox and parametric models. We’ll discuss this in detail in coming weeks.
Unshared frailty: models for “unobserved heterogeneity”. Only available for parametric models. Refers to individual-specific (unknown) characteristics that affect the likelihood of failure.

Unobserved Heterogeneity
Unobserved heterogeneity = differences among cases in the risk set that affect failure. Think of it as “omitted variable bias”. Example: the effect of a drug on mortality. Question: what if half of the patients are smokers but you didn’t know that? That is an “unobserved” attribute that makes them different. Answer: the smokers and non-smokers might have very different hazard rates, but you wouldn’t know to control for this.

Unobserved Heterogeneity
Visually: the observed hazard rate is modeled without controlling for the cause of the drop off.
[Figure: hazard rate against time (months, 0 to 70) for smokers, non-smokers, and the observed h(t).] Smokers die early, exhausting the sample. Then the observed h(t) drops off.

Unobserved Heterogeneity
Results of unobserved heterogeneity:
1. Bias in the effects of covariates, due to “uncontrolled antecedents” (Yamaguchi 1991).
2. Problems estimating duration effects, because some cases leave the risk set early, resulting in a “depressed” rate later on. Evidence of a decline in the hazard rate may be misleading.
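The misleading decline can be reproduced with a two-group mixture. In this sketch each group has a constant hazard, yet the hazard observed in the pooled risk set falls over time because the high-risk group is depleted first. The function `observed_hazard`, the 50/50 split, and the rates 0.10 and 0.01 per month are hypothetical values chosen for illustration.

```python
import math

def observed_hazard(t, share_frail, rate_frail, rate_robust):
    """Pooled hazard at time t when two groups each have a constant hazard.

    The pooled hazard is the survivor-weighted average of the group hazards.
    """
    s_frail = share_frail * math.exp(-rate_frail * t)            # frail still at risk
    s_robust = (1.0 - share_frail) * math.exp(-rate_robust * t)  # robust still at risk
    return (rate_frail * s_frail + rate_robust * s_robust) / (s_frail + s_robust)

months = [0, 10, 20, 40, 70]
# Hypothetical: half the sample are "smokers" with ten times the hazard.
trend = [observed_hazard(t, 0.5, 0.10, 0.01) for t in months]

for t, h in zip(months, trend):
    print(f"month {t:2d}: observed hazard = {h:.4f}")
```

Nobody's individual risk changes in this setup; the apparent negative duration dependence comes entirely from the changing composition of the risk set.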

Unobserved Heterogeneity
Strategies:
1. Develop fully-specified models. This is the best solution.
2. Specify the form of the heterogeneity (frailty). Approach: assume an unobserved alpha ($\alpha$), a case-specific factor that makes events more (or less) likely.
Frailty model: $h(t \mid \alpha) = \alpha\,h(t)$, where h(t) is some familiar model (ex: Weibull). This requires functional form assumptions to estimate. Ex: assume $\alpha$ is gamma (or inverse Gaussian) distributed.

PH Assumption & Outliers
The models discussed today are proportional hazard models, so they require the same assumption as Cox models. But most of the “tests” of proportionality are only available for Cox models. You can still use piecewise models and interaction terms to check the assumption. Cumulative Cox-Snell residuals can be used to identify outliers. Use “predict”:
predict ccs, ccsnell
Then, plot the residuals by case ID, time, etc.

Parametric Models: Outliers
[Figure: cumulative Cox-Snell residuals vs. case ID.] Note that Scandinavia has the highest residuals. Probably not outliers, but interesting nevertheless.

Accelerated Failure Time Models
An alternative approach: model log time, using a parametric approach like exponential or Weibull. The focus is time rather than the hazard rate, but the models are similar to hazard rate models, just in a different “metric”:
$\ln(t) = bX + e$
where the last term, e, is assumed to have a distribution that defines the model (e.g., making it Weibull). AFT models aren’t very common in sociology, but don’t be intimidated by them: they are similar to parametric proportional hazard models. Be aware that some software presents coefficient signs that are opposite!
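The sign flip can be checked directly for the Weibull case, where a proportional-hazards coefficient b converts to the AFT metric as -b / p. The sketch below uses hypothetical values (b = 0.3, p = 1.5), omits the constant for brevity, and verifies that both parameterizations give the same survival probability; the function names are illustrative, not from any library.

```python
import math

def ph_to_aft(b_ph, p):
    """Convert a Weibull PH coefficient to the AFT (log-time) metric."""
    return -b_ph / p

def surv_ph(t, x, b_ph, p):
    """Weibull survival in the hazard metric: exp(-exp(bX) * t**p)."""
    return math.exp(-math.exp(b_ph * x) * t ** p)

def surv_aft(t, x, b_aft, p):
    """Weibull survival in the log-time metric: exp(-(t * exp(-bX))**p)."""
    return math.exp(-((t * math.exp(-b_aft * x)) ** p))

B_PH, P = 0.3, 1.5             # hypothetical coefficient and shape parameter
b_aft = ph_to_aft(B_PH, P)     # negative: higher hazard means shorter times

for t in (1.0, 2.0, 5.0):
    print(t, round(surv_ph(t, 1.0, B_PH, P), 6), round(surv_aft(t, 1.0, b_aft, P), 6))
```

A positive effect on the hazard shows up as a negative effect on log time, which is why the same model can appear to have "opposite" signs depending on the metric reported.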

Discrete Time EHA Models
Another, completely different approach to EHA, described in the Yamaguchi reading. Break time into discrete chunks (ex: months, years) and model a dichotomous outcome (event vs. non-event) for all chunks of time. This allows use of a simple model, like logit. Other common discrete time models: probit and complementary log-log models (“cloglog”). The data structure is similar to what we did for time-varying covariates, but all records must cover the same length of time. Logit models don’t weight cases based on start/end time. Instead, time in the analysis is represented simply by the number of cases.
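The person-period data structure can be sketched as below. The helper `person_period` and the three toy cases are hypothetical; the point is only that each case contributes one record per period at risk, with the outcome equal to 1 in the final period if the case failed.

```python
def person_period(durations, events):
    """Expand one record per case into one record per case-period.

    durations: number of whole periods each case was observed
    events:    1 if the case failed in its last period, 0 if censored
    Returns rows of (case_id, period, event_indicator).
    """
    rows = []
    for case_id, (t, d) in enumerate(zip(durations, events), start=1):
        for period in range(1, t + 1):
            # The outcome is 1 only in the final period of a case that failed.
            rows.append((case_id, period, int(bool(d) and period == t)))
    return rows

# Toy data (hypothetical): case 2 is censored after two periods.
rows = person_period([3, 2, 4], [1, 0, 1])
for row in rows:
    print(row)
```

A logit (or cloglog) model fit to the event indicator in these rows, with period dummies or functions of period as covariates, is the discrete time hazard model.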

Choosing a Hazard Model
A Cox model is a good starting point. There are fewer problems due to accidental mis-specification of the time dependence of the hazard rate. Box-Steffensmeier & Jones cite work showing that Cox models are 95% as efficient as parametric models under many circumstances. Cox models treat time dependence as a “nuisance” and put the focus on substantive covariates, which is often desirable.

Choosing a Hazard Model
Parametric models are good when:
1. You have strong theoretical expectations about the hazard rate.
2. You are confident that you can fit the time dependence well with a parametric model.
3. You need the most efficient estimates possible.
AGAIN: substantive model specification is typically more important. Biases due to omitted variables are often greater than biases due to poor model choice (e.g., Cox vs. Weibull). Also: in small samples, outliers are likely to be more important.