Download presentation
Presentation is loading. Please wait.
1
Nonparametric Estimation with Recurrent Event Data Edsel A. Pena Department of Statistics University of South Carolina E-mail: pena@stat.sc.edu Research supported by NIH and NSF Grants Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State) Talk at MUSC, 11/12/01
2
2 A Real Recurrent Event Data (Source: Aalen and Husebye (‘91), Statistics in Medicine) Variable: Migrating motor complex (MMC) periods, in minutes, for 19 individuals in a gastroenterology study concerning small bowel motility during fasting state.
3
3 Pictorial Representation of Data for a Subject Consider unit/subject #3. K = 3 Period (Gap) Times, T j : 284, 59, 186 Censored Time, -S K : 4 Calendar Times, S j : 284, 343, 529 Limit of Observation Period: 0 =533 S 1 =284 S 2 =343 S 3 =529 T 1 =284 T 2 =59 T 3 =186 -S 3 =4 Calendar Scale
4
4 Features of Data Set Random observation period per subject (administrative constraints). Length of period: Event of interest is recurrent. A subject may have more than one event during observation period. # of events (K) informative about MMC period distribution (F). Last MMC period right-censored by a variable informative about F. Calendar times: S 1, S 2, …, S K. Right-censoring variable: -S K.
5
5 Assumptions and Problem Aalen and Husebye: “Consecutive MMC periods for each individual appear (to be) approximate renewal processes.” Translation: The inter-event times T ij ’s are assumed stochastically independent. Problem: Under this IID assumption, and taking into account the informativeness of K and the right- censoring mechanism, to estimate the inter-event distribution, F.
6
6 General Form of Data Accrual Calendar Times of Event Occurrences S i0 =0 and S ij = T i1 + T i2 + … + T ij Number of Events in Observation Period K i = max{j: S ij < i } Upper limit of observation periods, ’s, could be fixed, or assumed to be IID with unknown distribution G.
7
7 Observables Theoretical Problem To obtain a nonparametric estimator of the gap-time or inter-event time distribution, F; and to determine its properties. Such estimator could be used for other purposes.
8
8 Relevance and Applicability Recurrent phenomena occur in public health/medical settings. –Outbreak of a disease. –Hospitalization of a patient. –Tumor occurrence. –Epileptic seizures. In other settings. –Non-life insurance claims. –When stock index (e.g., Dow Jones) decreases by at least 6% in one day. –Labor strikes.
9
9 Existing Methods and Their Limitations Consider only the first, possibly right-censored, observation per unit and use the product-limit estimator (PLE). –Loss of information –Inefficient Ignore the right-censored last observation, and use empirical distribution function (EDF). –Leads to bias (“biased sampling”) –Estimator actually inconsistent
10
10 Review: Prior Results T 1, T 2, …, T n IID F(t) = P(T < t) Empirical Survivor Function (EDF) Asymptotics of EDF where W 1 is a zero-mean Gaussian process with covariance function Single-Event Complete Data
11
11 In Hazards View Hazard rate function Cumulative hazard function Equivalences Another representation of the variance
12
12 Single-Event Right-Censored Data Failure times: T 1, T 2, …, T n IID F Censoring times: C 1, C 2, …, C n IID G Right-censored data (Z 1, 1 ), (Z 2, 2 ), …, (Z n, n ) Z i = min(T i, C i ) i = I{T i < C i } Product-limit or Kaplan-Meier Estimator
13
13 PLE/KME Properties Asymptotics of PLE where W 2 is a zero-mean Gaussian process with covariance function If G(w) = 0 for all w, so no censoring,
14
14 Relevant Processes for Recurrent Event Setting Calendar-Time Processes for ith unit is a vector of square-integrable zero-mean martingales. Then,
15
15 is the length since last event at calendar time v Needed: Calendar-Gaptime Space 0 Difficulty: arises because interest is on (.) or (.), but these appear in the compensator process as GapTime Calendar Time 533284343529 s t For Unit 3 in MMC Data
16
16 Processes in Calendar-Gaptime Space N i (s,t) = # of events in calendar time [0,s] for the ith unit with gaptimes at most t Y i (s,t) = number of events in [0,s] for the ith unit with gaptimes at least t.
17
17 GapTime Calendar Time 533284343529 s t s=400 t=100 MMC Unit #3 N 3 (s=400,t=100) = 1 Y 3 (s=400,t=100) = 1 K 3 (s=400) = 2
18
18 “Change-of-Variable” Formulas
19
19 Estimators of and F for Recurrent Event Setting By “change-of-variable” formula, RHS is a sq-int. zero-mean martingale, so Estimator of (t)
20
20 Estimator of F Since a generalized product-limit estimator (GPLE). GPLE extends the EDF for complete data and the PLE or KME for single- event right-censored data to recurrent setting.
21
21 Asymptotic Properties of GPLE
22
22 Limiting Distribution Theorem: ;
23
23 Comparison of Variance Functions PLE: GPLE (recurrent event): For large s, If in stationary state, R(t) = t/ F, so G (w) is the mean residual life of given > w. EDF:
24
24 Wang-Chang Estimator (JASA, ‘99) Can handle correlated inter-event times. Comparison with GPLE may therefore be unfair to WC estimator!
25
25 Frailty Model U 1, U 2, …, U n are IID unobserved Gamma( , ) random variables, called frailties. Given U i = u, (T i1, T i2, T i3, …) are independent inter-event times with Marginal survivor function of T ij:
26
26 Estimator Under Frailty Model Frailty parameter, , determines dependence among inter-event times. Small (Large) : Strong (Weak) dependence. EM algorithm needed to obtain the estimator (FRMLE). Unobserved frailties viewed as missing. GPLE needed in EM algorithm. Under frailty model: GPLE not consistent when frailty parameter is finite.
27
27 Monte Carlo Comparisons Under gamma frailty model. F = EXP( ): = 6 G = EXP( ): = 1 n = 50 # of Replications = 1000 Frailty parameter took values {Infty (IID), 6, 2} Computer programs: S-Plus and Fortran routines.
28
28 IID Simulated Comparison of the Three Estimators for Varying Frailty Parameter Black=GPLE; Blue=WCPLE; Red=FRMLE
29
29 Effect of the Frailty Parameter ( ) for Each of the Three Estimators (Black=Infty; Blue=6; Red=2)
30
30 The Three Estimates of MMC Period Survivor Function IID assumption seems acceptable. Estimate of is 10.2.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.