EHA: More On Plots and Interpreting Hazards Sociology 229A: Event History Analysis Class 9 Copyright © 2008 by Evan Schofer Do not copy or distribute without.


1 EHA: More On Plots and Interpreting Hazards Sociology 229A: Event History Analysis Class 9 Copyright © 2008 by Evan Schofer Do not copy or distribute without permission

2 Announcements Final paper assignment due next week Questions? Class topics: More on interpreting hazard & cumulative hazard functions More multilevel models…

3 Hazard Plots: Smoothing Issue: Stata heavily smooths hazard plots. “Raw” hazard plots are very spiky… smoothing can help with interpretation. Issue: Too much smoothing obscures the detail within your data. –Simplest way to control smoothing: Set the “width” of the kernel smoother in Stata. EX: sts graph, haz width(3). Lower width = less smoothing; try different values.

4 Hazard Smoothing Environmental Law Data: Default smoothing

5 Hazard Smoothing Environmental Law Data: width (1)

6 Hazard Smoothing Environmental Law Data: width (.2)

7 Hazard Smoothing Don’t make width too small!: width (.001) Stata’s default smoother amplifies peaks in data if width is too small!

8 Hazard Smoothing: Remarks Stata default smoothing is quite aggressive; it obscures detail in your data. –Stata default smoothing “width” is ~4 in this case. Smoothing of 1-2 works much better. In addition to removing detail, smoothing also lowers the peaks: Highest peak = .1 (width 4); Highest peak = .3 (width .2). Also: REALLY narrow width exaggerates peaks. –Highest peak = 50 (width .0001)
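To see why the peak height depends so strongly on the width, here is a minimal Python sketch of a kernel-smoothed hazard. It is an illustration, not Stata's implementation: it assumes an Epanechnikov kernel, applies no boundary correction, and borrows the failure times from Example 4 later in these slides (the Environmental Law data are not reproduced in this transcript). All function names are hypothetical.

```python
# Failure times from Example 4 in the slides; every case starts at t=0 and fails.
END_TIMES = [1, 1.25, 1.5, 1.75, 2, 3, 4, 4.25, 4.5, 4.75, 5]


def hazard_increments(end_times):
    """Return (time, events / at-risk) pairs; assumes no ties and no censoring."""
    ordered = sorted(end_times)
    n = len(ordered)
    # The i-th failure (0-based) happens while n - i cases are still at risk.
    return [(t, 1.0 / (n - i)) for i, t in enumerate(ordered)]


def epanechnikov(u):
    """Epanechnikov kernel (an assumption about the smoother used)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def smoothed_hazard(t, increments, width):
    """Kernel-weighted sum of hazard increments around time t."""
    return sum(epanechnikov((t - tj) / width) * dh for tj, dh in increments) / width


def peak_height(increments, width, lo=0.0, hi=6.0, steps=601):
    """Highest smoothed hazard over an evenly spaced grid of time points."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(smoothed_hazard(t, increments, width) for t in grid)


increments = hazard_increments(END_TIMES)
for width in (4.0, 0.3, 0.01):
    print(f"width={width}: highest peak = {peak_height(increments, width):.2f}")
```

Each event's contribution is divided by the width, so shrinking the width concentrates the same amount of hazard into a narrower, taller spike. That is the mechanism behind the exaggerated peaks at very small widths.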

9 Survival Plot Problem: noorigin Issue: Stata always likes to include t=0…

10 Survival Plot Problem: noorigin Solution: sts graph, noorigin

11 Plots: Confidence Intervals Confidence intervals are a good idea, and especially useful when comparing groups. –Stata: sts graph, ci and sts graph, haz ci. –Issue: Adding CIs tends to compress the Y axis to make room for the confidence bands. This makes the hazard look less variable over time. Watch for that… –Issue: CIs can make charts “busy” / hard to read.

12 Hazard Plot with 95% CI

13 Hazard plot with 95% CI

14 Survivor plot with 95% CI

15 Other sts graph options Options to show # of lost, entered, or censored cases. Lost: puts a number above plots showing cases lost. Atrisk: shows # of cases at risk. –Actually, it shows risk per interval. –EX: if unit = nation, it shows nation-years in an interval. Censored: shows number of cases censored.

16 Sts graph: atrisk

17 Interpreting Hazard & Cumulative Haz The survivor plot has a clear interpretation: the proportion of cases that have not experienced the event. Assuming non-repeated events: –If events repeat frequently, survivor falls to 0, stays there… Assuming the risk-set stays more-or-less constant: –Survivor never goes back up, even if more cases enter the risk set… But, hazard rates & cumulative hazard rates are harder to understand intuitively… So, I made some illustrative examples.

18 Hazard Example 1 Start with 10 people. Let’s put them all in the risk set at the start: all cases start at time t=0. One case fails at each point in time.
    Start  End  Failed?
    0      1    1
    0      2    1
    0      3    1
    0      4    1
    0      5    1
    0      6    1
    0      7    1
    0      8    1
    0      9    1
    0      10   1

19 Example 1: Survivor Plot

20 Example 1: Hazard Plot Events occur at an even interval… but rate goes up because the risk set dwindles…
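The dwindling risk set can be checked by hand. The following Python sketch (not part of the original slides; the function name is hypothetical) divides the one event at each time point by the number of cases still at risk in the Example 1 data. These are raw, unsmoothed values, so they will not match a smoothed Stata plot exactly.

```python
# Example 1 from the slides: 10 cases, all start at t=0, one failure at each t=1..10.
end_times = list(range(1, 11))


def discrete_hazards(end_times):
    """Return (time, at_risk, hazard) rows; assumes no ties and no censoring."""
    ordered = sorted(end_times)
    n = len(ordered)
    # One event per time point, divided by the cases still in the risk set.
    return [(t, n - i, 1.0 / (n - i)) for i, t in enumerate(ordered)]


rows = discrete_hazards(end_times)
for t, at_risk, hazard in rows:
    print(f"t={t:2d}  at risk={at_risk:2d}  events=1  hazard={hazard:.3f}")
```

The numerator never changes (one event per time point); only the denominator shrinks, which is why the rate climbs.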

21 Example 1: Integrated Hazard

22 Example 2 Let’s figure out what’s really going on… Again, start with 10 people. Imagine each enters the risk set sequentially, and fails after 1 time unit. –So only 1 case at risk in any period of time. –And, 1 event per each point in time.
    Start  End  Failed?
    0      1    1
    1      2    1
    2      3    1
    3      4    1
    4      5    1
    5      6    1
    6      7    1
    7      8    1
    8      9    1
    9      10   1

23 Example 2: Survivor Plot Survivor drops to zero when first case fails… doesn’t go back up when additional cases enter NOT very informative…
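A small Python sketch of the product-limit (Kaplan-Meier) calculation shows why the survivor function collapses here. It is an illustration rather than Stata's code, and it assumes a case counts as at risk over the interval (start, end]; the function name is hypothetical.

```python
# Example 2 from the slides: each case enters at `start` and fails one unit later.
example2 = [(start, start + 1, 1) for start in range(10)]


def kaplan_meier(spells):
    """Product-limit survivor estimate for (start, end, failed) spells."""
    survivor = 1.0
    curve = []
    for t in sorted({end for _, end, failed in spells if failed}):
        # Assumption: a case is at risk at t if start < t <= end.
        at_risk = sum(1 for start, end, _ in spells if start < t <= end)
        events = sum(1 for _, end, failed in spells if failed and end == t)
        survivor *= 1.0 - events / at_risk
        curve.append((t, survivor))
    return curve


curve = kaplan_meier(example2)
for t, s in curve:
    print(f"t={t:2d}  survivor={s:.1f}")
```

At t=1 the only case at risk fails, so the survivor is multiplied by 1 - 1/1 = 0. Because the estimate is a running product, later entries into the risk set can never pull it back up.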

24 Example 2: Hazard Plot Hazard basically sits at 1.0. That’s because for every time unit at risk there is one event. Variations = due to smoothing issues…

25 Interpreting Hazards Let’s run an exponential model. We’ll estimate the constant only… the baseline hazard.
    streg, dist(exponential) nohr
    Exponential regression -- log relative-hazard form
    No. of subjects =           10                     Number of obs   =        10
    No. of failures =           10
    Time at risk    =           10
                                                       LR chi2(0)      =      0.00
    Log likelihood  =    5.1044126                     Prob > chi2     =         .
    ------------------------------------------------------------------------------
              _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |          0   .3162278     0.00   1.000     -.619795     .619795
    ------------------------------------------------------------------------------
Why is the base rate zero? Answer: We need to exponentiate! Exp(0) = 1. The model estimates the baseline hazard to be 1.0!
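The constant-only estimate can be reproduced by hand. This Python sketch is not from the original slides and the function name is hypothetical. It uses the standard closed-form results for a constant-only exponential model: the estimated rate is failures divided by time at risk, the coefficient is its log, and the standard error of the log rate is 1/sqrt(failures). It does not attempt to reproduce the reported log likelihood.

```python
import math


def exponential_constant_only(failures, time_at_risk, z_crit=1.959964):
    """Closed-form constant-only exponential fit on the log-hazard scale."""
    rate = failures / time_at_risk      # baseline hazard
    coef = math.log(rate)               # what the output reports as _cons
    se = 1.0 / math.sqrt(failures)      # standard error of the log rate
    return {
        "rate": rate,
        "coef": coef,
        "se": se,
        "ci": (coef - z_crit * se, coef + z_crit * se),
    }


# Example 2: 10 failures over 10 units of time at risk.
fit = exponential_constant_only(failures=10, time_at_risk=10)
print(fit)
```

With 10 failures in 10 time units the rate is 1, its log is 0, and exponentiating the coefficient returns the baseline hazard of 1.0.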

26 Example 2: Integrated Hazard Integrated Hazard reaches 10 Same number of events as previous example… but less time-at-risk… so overall cumulated risk was higher
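The two integrated hazards can be compared directly. This Python sketch assumes the cumulative hazard is the Nelson-Aalen sum of events divided by cases at risk at each failure time, with a case at risk over (start, end]. The function name is hypothetical and the sketch is not Stata's implementation.

```python
# Example 1: all 10 cases start at t=0, one fails at each t=1..10.
example1 = [(0, end, 1) for end in range(1, 11)]
# Example 2: cases enter one after another and each fails after one time unit.
example2 = [(start, start + 1, 1) for start in range(10)]


def nelson_aalen_total(spells):
    """Sum of events / at-risk over all failure times for (start, end, failed)."""
    total = 0.0
    for t in sorted({end for _, end, failed in spells if failed}):
        at_risk = sum(1 for start, end, _ in spells if start < t <= end)
        events = sum(1 for _, end, failed in spells if failed and end == t)
        total += events / at_risk
    return total


def time_at_risk(spells):
    """Total exposure: sum of spell lengths."""
    return sum(end - start for start, end, _ in spells)


for name, data in (("Example 1", example1), ("Example 2", example2)):
    print(f"{name}: time at risk = {time_at_risk(data)}, "
          f"integrated hazard = {nelson_aalen_total(data):.3f}")
```

Both examples have 10 events, but Example 1 spreads them over 55 units of exposure and Example 2 over only 10, so each event in Example 2 adds a full unit to the cumulative hazard.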

27 Example 3 Let’s keep those same cases but add 10 more, each in the risk set for 1 time-unit, all of which are censored.
    Start  End  Failed?
    0      1    1
    1      2    1
    2      3    1
    3      4    1
    4      5    1
    5      6    1
    6      7    1
    7      8    1
    8      9    1
    9      10   1
    0      1    0
    1      2    0
    2      3    0
    3      4    0
    4      5    0
    5      6    0
    6      7    0
    7      8    0
    8      9    0
    9      10   0

28 Example 3: Hazard Plot The risk set is doubled, but # events stays the same… So, hazard drops by half… to .5

29 Interpreting Hazards Let’s run an exponential model. We’ll estimate the constant only… the baseline hazard.
    streg, dist(exponential) nohr
    Exponential regression -- log relative-hazard form
    No. of subjects =           20                     Number of obs   =        20
    No. of failures =           10
    Time at risk    =           20
                                                       LR chi2(0)      =      0.00
    Log likelihood  =   -1.8270592                     Prob > chi2     =         .
    ------------------------------------------------------------------------------
              _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _cons |  -.6931472   .3162278    -2.19   0.028    -1.312942   -.0733521
    ------------------------------------------------------------------------------
Exp(-.693) = .5. The baseline hazard rate is .5…
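The z statistic and p-value in this output follow from the Example 3 data with a little arithmetic. The Python sketch below is an illustration, not Stata's algorithm. It assumes the usual Wald test of the log rate against zero, with standard error 1/sqrt(failures) and a two-sided normal p-value; the variable names are hypothetical.

```python
import math

# Example 3: the 10 failing spells from Example 2 plus 10 censored one-unit spells.
example3 = [(s, s + 1, 1) for s in range(10)] + [(s, s + 1, 0) for s in range(10)]

failures = sum(failed for _, _, failed in example3)
time_at_risk = sum(end - start for start, end, _ in example3)

rate = failures / time_at_risk          # baseline hazard
coef = math.log(rate)                   # constant on the log-hazard scale
se = 1.0 / math.sqrt(failures)          # censored cases add exposure, not information on d
z = coef / se                           # Wald statistic for H0: log rate = 0 (rate = 1)
p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value

print(f"failures={failures}, time at risk={time_at_risk}, rate={rate}")
print(f"coef={coef:.7f}, se={se:.7f}, z={z:.2f}, p={p_value:.3f}")
```

Note what the test is asking: whether the baseline hazard differs from 1 (log rate of 0), which is rarely a substantively interesting null for a constant.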

30 Example 3: Integrated Hazard Likewise, integrated hazard is only half as big…

31 Example 4 What about when events occur in clumps? Example: two dense clusters of events. –Between times 1-2 and 4-5.
    Start  End   Failed?
    0      1     1
    0      1.25  1
    0      1.5   1
    0      1.75  1
    0      2     1
    0      3     1
    0      4     1
    0      4.25  1
    0      4.5   1
    0      4.75  1
    0      5     1

32 Example 4: Survivor Plot Here we see the two “clumps” of events…

33 Example 4: Hazard Plot Second “clump” has much higher hazard because the risk set is much smaller… Default smoothing pretty much wipes out the first clump

34 Example 4: Hazard Plot, less smoothing Hazard with “width(.3)” Now both clumps of events are clearly visible…

35 Example 4: Integrated Hazard Note how events with a small risk set affect the cumulative hazard more (2nd clump)…
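The size of each clump's contribution can be worked out from the Example 4 data. This Python sketch assumes the cumulative hazard grows by events divided by cases at risk at each failure time (the Nelson-Aalen increment), with no ties and no censoring. It is an illustration, and the variable names are hypothetical.

```python
# Example 4 from the slides: all cases start at t=0; failure times come in two clumps.
end_times = [1, 1.25, 1.5, 1.75, 2, 3, 4, 4.25, 4.5, 4.75, 5]

n = len(end_times)
# The i-th failure (0-based) occurs while n - i cases are still at risk.
increments = [(t, 1.0 / (n - i)) for i, t in enumerate(sorted(end_times))]

first_clump = sum(dh for t, dh in increments if 1 <= t <= 2)
second_clump = sum(dh for t, dh in increments if 4 <= t <= 5)

for t, dh in increments:
    print(f"t={t:<5} at risk={round(1 / dh):2d}  increment={dh:.3f}")
print(f"first clump adds  {first_clump:.3f}")
print(f"second clump adds {second_clump:.3f}")
```

Both clumps contain five events, but the first is divided by risk sets of 11 down to 7 and the second by risk sets of 5 down to 1, so the second clump adds several times more cumulative hazard.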

36 Interpreting Hazards The hazard rate reflects the rate of events per unit time at risk. A constant hazard of .1 for one time-unit means that roughly 10% of at-risk cases will have events. –But, things are often more complex than that when hazards are computed in continuous time: the rate may vary within the interval depending on how the events are concentrated, and the risk set may change over the interval… esp. if cases leave the risk set.

37 Interpreting Integrated Hazards Integrated hazards represent the total amount of risk that has accumulated. If the hazard is constant at .1, the integrated hazard would reach 10 after one hundred time-units…
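The constant-hazard arithmetic can be written out in a few lines of Python. This sketch is not from the original slides and the function names are hypothetical. It assumes non-repeated events and a closed risk set, so the survivor function is exp(-H(t)).

```python
import math


def integrated_hazard(rate, duration):
    """Cumulative hazard H(t) = rate * t when the hazard is constant."""
    return rate * duration


def share_with_event(rate, duration):
    """1 - S(t) = 1 - exp(-H(t)); assumes non-repeated events, closed risk set."""
    return 1.0 - math.exp(-integrated_hazard(rate, duration))


print(f"H after 100 units at rate .1: {integrated_hazard(0.1, 100):.1f}")
print(f"share with an event after 1 unit: {share_with_event(0.1, 1):.4f}")
print(f"share with an event after 100 units: {share_with_event(0.1, 100):.6f}")
```

The integrated hazard is not a probability and is not capped at 1. It keeps growing with exposure, while the share of cases that have had the event approaches 1. Over a single time unit, the share comes out slightly below the hazard of .1 because the risk set shrinks as cases fail.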

