Download presentation
Presentation is loading. Please wait.
1
Nonparametric Estimation with Recurrent Event Data Edsel A. Pena Department of Statistics University of South Carolina E-mail: pena@stat.sc.edu Research supported by NIH and NSF Grants Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State) Talk at Mississippi State Univ, 10/19/01
2
2 A Real Recurrent Event Data (Source: Aalen and Husebye (‘91), Statistics in Medicine) Variable: Migrating motor complex (MMC) periods, in minutes, for 19 individuals in a gastroenterology study concerning small bowel motility during fasting state.
3
3 Pictorial Representation of Data for a Unit or Subject Consider unit/subject #3. K = 3 Gap Times, T j : 284, 59, 186 Censored Time, -S K : 4 Calendar Times, S j : 284, 343, 529 Limit of Obs. Period: 0 =533 S 1 =284 S 2 =343 S 3 =529 T 1 =284 T 2 =59 T 3 =186 -S 3 =4 Calendar Scale
4
4 Features of Data Set Random observation period per subject (administrative constraints). Length of period: Event of interest is recurrent. A subject may have more than one event during observation period. # of events (K) informative about MMC period distribution (F). Last MMC period right-censored by a variable informative about F. Calendar times: S 1, S 2, …, S K. Right-censoring variable: -S K.
5
5 Assumptions and Problem Aalen and Husebye: “Consecutive MMC periods for each individual appear (to be) approximate renewal processes.” Translation: The inter-event times T ij ’s are assumed stochastically independent. Problem: Under this IID assumption, and taking into account the informativeness of K and the right- censoring mechanism, to estimate the inter-event distribution, F.
6
6 General Form of Data Accrual Calendar Times of Event Occurrences S i0 =0 and S ij = T i1 + T i2 + … + T ij Number of Events in Observation Period K i = max{j: S ij < i } Upper limit of observation periods, ’s, could be fixed, or assumed to be IID with unknown distribution G.
7
7 Observables Theoretical Problem To obtain an estimator of the gap- time or inter-event time distribution, F; and to determine its properties.
8
8 Relevance and Applicability Recurrent phenomena occur in a variety of settings. –Outbreak of a disease. –Terrorist attacks. –Labor strikes. –Hospitalization of a patient. –Tumor occurrence. –Epileptic seizures. –Non-life insurance claims. –When stock index (e.g., Dow Jones) decreases by at least 6% in one day.
9
9 Limitations of Existing Estimation Methods Consider only the first, possibly right-censored, observation per unit and use the product-limit estimator (PLE). –Loss of information –Inefficient Ignore the right-censored last observation, and use empirical distribution function (EDF). –Leads to bias (“biased sampling”). –Estimator actually inconsistent.
10
10 Review: Prior Results T 1, T 2, …, T n IID F(t) = P(T < t) Empirical Survivor Function (EDF) Asymptotics of EDF where W 1 is a zero-mean Gaussian process with covariance function Single-Event Complete Data
11
11 In Hazards View Hazard rate function Cumulative hazard function Equivalences Another representation of the variance
12
12 Single-Event Right-Censored Data Failure times: T 1, T 2, …, T n IID F Censoring times: C 1, C 2, …, C n IID G Right-censored data (Z 1, 1 ), (Z 2, 2 ), …, (Z n, n ) Z i = min(T i, C i ) i = I{T i < C i } with Product-limit or Kaplan-Meier Estimator
13
13 PLE Properties Asymptotics of PLE where W 2 is a zero-mean Gaussian process with covariance function If G(w) = 0 for all w, so no censoring,
14
14 Relevant Stochastic Processes for Recurrent Event Setting Calendar-Time Processes for ith unit is a vector of square-integrable zero-mean martingales. Then,
15
15 is the length since last event at calendar time v Needed: Calendar-Gaptime Space 0 Difficulty: arises because interest is on (.) or (.), but these appear in the compensator process in form GapTime Calendar Time 533284343529 s t For Unit 3 in MMC Data
16
16 Processes in Calendar-Gaptime Space N i (s,t) = # of events in calendar time [0,s] for ith unit whose gaptimes are at most t Y i (s,t) = number of events in [0,s] for ith unit whose gaptimes are at least t: “at-risk” process
17
17 GapTime Calendar Time 533284343529 s t s=400 t=100 MMC Unit #3 N 3 (s=400,t=100) = 1 Y 3 (s=400,t=100) = 1 K 3 (s=400) = 2
18
18 “Change-of-Variable” Formulas
19
19 Estimators of and F for the Recurrent Event Setting By “change-of-variable” formula, RHS is a sq-int. zero-mean martingale, so Estimator of (t)
20
20 Estimator of F Since by substitution principle, a generalized product-limit estimator (GPLE). GPLE extends the EDF for complete data, and the PLE or KME for single- event right-censored data.
21
21 Asymptotic Properties of GPLE Special Case: If F = EXP( ) and G = EXP( )
22
22 Weak Convergence Theorem: ; Proof relied on weak convergence theorem for recurrent and renewal settings in Pena, Strawderman and Hollander (2000), which utilized ideas in Sellke (1988) and Gill (1980).
23
23 Comparison of Limiting Variance Functions PLE: GPLE (recurrent event): For large s, For large t or if in stationary state, R(t) = t/ F, so approximately, with G (w) being the mean residual life of given > w. EDF:
24
24 Wang-Chang Estimator (JASA, ‘99) Beware! Wang and Chang developed this estimator to be able to handle correlated inter- event times, so comparison with GPLE is not completely fair to their estimator!
25
25 Frailty-Induced Correlated Model Correlation induced according to a frailty model: U 1, U 2, …, U n are IID unobserved Gamma( , ) random variables, called frailties. Given U i = u, (T i1, T i2, T i3, …) are independent inter-event times with Marginal survivor function of T ij:
26
26 Frailty-Model Estimator Frailty parameter, , determines dependence among inter-event times. Small (Large) : Strong (Weak) dependence. EM algorithm needed to obtain estimator (FRMLE). Unobserved frailties viewed as missing. Parallels Nielsen, Gill, Andersen, and Sorensen (1992). GPLE needed in EM algorithm. GPLE is not consistent when frailty parameter is finite, that is, when IID model does not hold.
27
27 Monte Carlo Studies Under gamma frailty model. F = EXP( ): = 6 G = EXP( ): = 1 n = 50 # of Replications = 1000 Frailty parameter took values in {Infty (IID), 6, 2} Computer programs: S-Plus and Fortran routines.
28
28 IID Simulated Comparison of the Three Estimators for Varying Frailty Parameter Black=GPLE; Blue=WCPLE; Red=FRMLE
29
29 Effect of the Frailty Parameter ( ) for Each of the Three Estimators (Black=Infty; Blue=6; Red=2)
30
30 The Three Estimates of Inter- Event Survivor Function for the MMC Data Set IID assumption seems acceptable. Estimate of is 10.2.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.