Nonparametric Estimation with Recurrent Event Data Edsel A. Pena Department of Statistics, USC Research supported by NIH and NSF Grants Based on joint.


1 Nonparametric Estimation with Recurrent Event Data Edsel A. Pena Department of Statistics, USC Research supported by NIH and NSF Grants Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State) Talk at USC Epid/Biostat 03/04/02

2 A Real Recurrent Event Data Set (Source: Aalen and Husebye ('91), Statistics in Medicine) Variable: migrating motor complex (MMC) periods, in minutes, for 19 individuals in a gastroenterology study of small bowel motility during the fasting state.

3 Pictorial Representation of Data for a Subject Consider subject #3. K = 3. Inter-event times, T_j: 284, 59, 186. Censored time, τ − S_K: 4. Calendar times, S_j: 284, 343, 529. Limit of observation period: τ = 533. [Figure: calendar-scale timeline from 0 to τ = 533 marking S_1 = 284, S_2 = 343, S_3 = 529, with gaps T_1 = 284, T_2 = 59, T_3 = 186 and censored remainder τ − S_3 = 4.]
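The arithmetic for subject #3 can be checked with a minimal sketch. The helper name `calendar_times` is hypothetical; the numbers are the ones on the slide.

```python
from itertools import accumulate

def calendar_times(inter_event_times):
    """Cumulative sums S_j = T_1 + ... + T_j."""
    return list(accumulate(inter_event_times))

# Subject #3 from the slide
T = [284, 59, 186]   # inter-event times, in minutes
tau = 533            # end of the observation period

S = calendar_times(T)
censored = tau - S[-1]   # length of the last, incomplete MMC period
print(S, censored)       # [284, 343, 529] 4
```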

4 Features of Data Set Random observation period per subject (administrative, time, and resource constraints); length of period: τ. Event of interest is recurrent. Number of events (K) is informative about the MMC period distribution (F). Last MMC period is right-censored by τ − S_K, which is informative about F. Calendar times: S_1, S_2, …, S_K.

5 Assumptions and Problem Aalen and Husebye: “Consecutive MMC periods for each individual appear (to be) approximate renewal processes.” Translation: The inter-event times T_ij are assumed stochastically independent. Problem: Under this IID or renewal assumption, and taking into account the informativeness of K and the right-censoring mechanism, how should one estimate the inter-event distribution F and its parameters (e.g., the median)?

6 General Form of Data Accrual Calendar times of event occurrences: S_i0 = 0 and S_ij = T_i1 + T_i2 + … + T_ij. Number of events in observation period: K_i = max{j: S_ij ≤ τ_i}. Limits of observation periods, the τ_i's, may be fixed or assumed IID with unknown distribution G.
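The accrual scheme for one unit can be sketched as follows. The function name `observe` and its return layout are hypothetical, and the sketch assumes an event landing exactly at τ counts as observed (S_ij ≤ τ_i).

```python
def observe(inter_event_times, tau):
    """Return (K, complete_times, calendar_times, censored_remainder) for one unit.

    inter_event_times: successive gaps T_1, T_2, ... of the underlying process.
    tau: end of the observation period for this unit.
    """
    s = 0
    complete, calendar = [], []
    for t in inter_event_times:
        if s + t > tau:      # next event would fall after tau, so it is censored
            break
        s += t
        complete.append(t)
        calendar.append(s)
    return len(complete), complete, calendar, tau - s

# Subject #3; the fourth gap (250) is a made-up value that is never observed.
print(observe([284, 59, 186, 250], 533))
# (3, [284, 59, 186], [284, 343, 529], 4)
```

Only K, the complete gaps, and the remainder τ − S_K would be available to the analyst; the fourth gap is cut off by τ.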

7 Observables Problem: Obtain a nonparametric estimator of the inter-event time distribution F and determine its properties.

8 Relevance and Applicability Recurrent phenomena occur in public health/medical settings: –Outbreak of a disease. –Hospitalization of a patient. –Tumor occurrence. –Epileptic seizures. They also occur in other settings: –Non-life insurance claims. –Stock index (e.g., Dow Jones) decreases by at least 6% in a day. –Labor strikes. –Reliability and engineering.

9 Existing Methods and Their Limitations Method 1: Consider only the first, possibly right-censored, observation per unit and use the product-limit estimator (PLE). –Loss of information. –Inefficient. Method 2: Ignore the right-censored last observation and use the empirical distribution function (EDF). –Leads to bias (“biased sampling”). –Estimator is inconsistent.

10 Review: Complete Data T_1, T_2, …, T_n IID F, where F(t) = P(T ≤ t). Empirical Survivor Function (EDF). Asymptotics of EDF.
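The formulas on this slide did not survive in the transcript. As a hedged reconstruction from standard theory (not necessarily the slide's own notation), with \bar F = 1 - F denoting the survivor function:

```latex
\hat{\bar F}(t) = \frac{1}{n}\sum_{i=1}^{n} I\{T_i > t\},
\qquad
\sqrt{n}\,\bigl[\hat{\bar F}(t) - \bar F(t)\bigr]
\;\xrightarrow{d}\; N\bigl(0,\; F(t)\,\bar F(t)\bigr).
```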

11 In Hazards View Hazard rate function. Cumulative hazard function. Product representation.
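The three objects named on this slide have standard definitions, which may be what the lost formulas showed. Assuming F is continuous with density f and survivor function \bar F = 1 - F:

```latex
\lambda(t) = \frac{f(t)}{\bar F(t)},
\qquad
\Lambda(t) = \int_0^t \lambda(w)\,dw = -\log \bar F(t),
\qquad
\bar F(t) = \prod_{w \le t}\bigl[1 - \Lambda(dw)\bigr] = \exp\{-\Lambda(t)\}.
```

The product here is a product-integral; it is the form that carries over to the estimators on the later slides.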

12 Alternative Representations Another representation of the variance. EDF in product form.
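A hedged sketch of what the two representations probably were, assuming continuous F. The symbols N and Y below are counting and at-risk processes introduced here for illustration.

```latex
% Variance of the EDF written through the cumulative hazard:
F(t)\,\bar F(t) = \bar F(t)^2 \int_0^t \frac{d\Lambda(w)}{\bar F(w)},
\quad\text{since}\quad
\int_0^t \frac{dF(w)}{\bar F(w)^2} = \frac{1}{\bar F(t)} - 1 = \frac{F(t)}{\bar F(t)}.

% EDF in product form, with
% N(t) = \sum_i I\{T_i \le t\} and Y(t) = \sum_i I\{T_i \ge t\}:
\hat{\bar F}(t) = \prod_{w \le t}\left[1 - \frac{\Delta N(w)}{Y(w)}\right].
```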

13 Review: Right-Censored Data Failure times: T_1, T_2, …, T_n IID F. Censoring times: C_1, C_2, …, C_n IID G. Right-censored data: (Z_1, δ_1), (Z_2, δ_2), …, (Z_n, δ_n), where Z_i = min(T_i, C_i) and δ_i = I{T_i ≤ C_i}. Product-Limit Estimator.
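The product-limit (Kaplan-Meier) estimator can be sketched in a few lines. This is a minimal illustration with a hypothetical function name and toy data, not the routine used for the talk.

```python
def product_limit(z, delta):
    """Kaplan-Meier survivor estimate as a list of (time, survivor value) steps.

    z: observed times Z_i = min(T_i, C_i).
    delta: 1 if Z_i is a failure time, 0 if it is a censoring time.
    """
    surv, steps = 1.0, []
    # The estimate only jumps at distinct uncensored times.
    for w in sorted({zi for zi, di in zip(z, delta) if di == 1}):
        at_risk = sum(1 for zi in z if zi >= w)
        failures = sum(1 for zi, di in zip(z, delta) if zi == w and di == 1)
        surv *= 1 - failures / at_risk
        steps.append((w, surv))
    return steps

# Toy data: the second observation is censored at time 3.
print(product_limit([2, 3, 5, 7], [1, 0, 1, 1]))
# [(2, 0.75), (5, 0.375), (7, 0.0)]
```

With every delta equal to 1 the at-risk count drops by exactly the number of failures at each step, and the estimate coincides with the empirical survivor function.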

14 PLE Properties Asymptotics of PLE. If G(w) = 0, so that there is no censoring,
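The slide's formulas are missing from the transcript. The standard large-sample result for the PLE, stated here as a hedged reconstruction for continuous F with \bar G = 1 - G, is:

```latex
\sqrt{n}\,\bigl[\hat{\bar F}(t) - \bar F(t)\bigr]
\;\xrightarrow{d}\;
N\!\left(0,\; \bar F(t)^2 \int_0^t \frac{d\Lambda(w)}{\bar F(w)\,\bar G(w-)}\right).

% With no censoring, G(w) = 0 and hence \bar G(w-) = 1, so the variance reduces to
\bar F(t)^2 \int_0^t \frac{d\Lambda(w)}{\bar F(w)} = F(t)\,\bar F(t),
```

which is the EDF variance from the complete-data review.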

15 Recurrent Event Setting N_i(s,t) = number of events for the ith unit in calendar period [0,s] with inter-event times at most t. Y_i(s,t) = number of events for the ith unit which are known during the calendar period [0,s] to have inter-event times at least t. K_i(s) = number of events for the ith unit that occurred in [0,s].

16 For MMC Unit #3 [Figure: inter-event time t plotted against calendar time s, with events at calendar times 284, 343, 529 and τ = 533.] At s = 400, t = 100: K_3(400) = 2, N_3(400, 100) = 1, Y_3(400, 100) = 1. At s = 550, t = 50: K_3(550) = 3, N_3(550, 50) = 0, Y_3(550, 50) = 3.
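The three processes can be written out and checked against the values for unit #3. This sketch assumes that Y also counts the gap still in progress at calendar time min(s, τ) when it has already lasted at least t; that reading is an assumption, chosen because it matches the verbal definitions and reproduces the numbers above. The argument layout is hypothetical.

```python
def K(s, T, tau):
    """Number of events in the calendar period [0, min(s, tau)]."""
    count, running = 0, 0
    for Tj in T:
        running += Tj
        if running <= min(s, tau):
            count += 1
    return count

def N(s, t, T, tau):
    """Events in [0, s] whose inter-event time is at most t."""
    return sum(1 for Tj in T[:K(s, T, tau)] if Tj <= t)

def Y(s, t, T, tau):
    """Inter-event times known by calendar time s to be at least t."""
    k = K(s, T, tau)
    ongoing = min(s, tau) - sum(T[:k])   # length so far of the gap in progress
    return sum(1 for Tj in T[:k] if Tj >= t) + (1 if ongoing >= t else 0)

T3, tau3 = [284, 59, 186], 533   # MMC unit #3
print(K(400, T3, tau3), N(400, 100, T3, tau3), Y(400, 100, T3, tau3))  # 2 1 1
print(K(550, T3, tau3), N(550, 50, T3, tau3), Y(550, 50, T3, tau3))    # 3 0 3
```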

17 Aggregated Processes Limit Processes as s Increases

18 Estimator of F in Recurrent Event Setting Estimator is called the GPLE; it generalizes the EDF and the PLE discussed earlier.
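The estimator's formula is not in the transcript. A minimal sketch follows, assuming the GPLE is the product over w ≤ t of 1 − N(s, Δw)/Y(s, w) with N and Y summed over units, and evaluated at a calendar time s past every τ_i. The function name and data layout are hypothetical.

```python
def gple(units, t):
    """Survivor estimate at t from units = [(complete_times, censored_remainder), ...].

    Assumes s is at or beyond every unit's tau, so each unit contributes all
    of its complete gaps plus one right-censored remainder tau - S_K.
    """
    complete = [x for times, _ in units for x in times]
    censored = [c for _, c in units]
    surv = 1.0
    for w in sorted(set(x for x in complete if x <= t)):
        d = complete.count(w)   # aggregated jump of N at w
        y = sum(1 for x in complete if x >= w) + sum(1 for c in censored if c >= w)
        surv *= 1 - d / y       # aggregated Y at w is the denominator
    return surv

# MMC unit #3 alone: gaps 284, 59, 186 and a censored remainder of 4.
print(gple([([284, 59, 186], 4)], 100))   # about 0.667: one of three gaps is <= 100
```

Under this reading, a data set with one event time per unit behaves like the PLE, and one with no censored remainders behaves like the EDF, which is consistent with the slide's claim that the GPLE generalizes both.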

19 Properties of GPLE

20 Asymptotic Distribution of GPLE For large n and under regularity conditions, with variance function

21 Variance Functions: Comparisons PLE: GPLE (recurrent event): If in stationary state, R(t) = t/μ_F, so μ_G(w) = mean residual life of τ at w. EDF:

22 Wang-Chang Estimator (JASA, ‘99) Can handle correlated inter-event times. Comparison with GPLE may be unfair to WC estimator under the IID setting!

23 Frailty (“Random Effects”) Model U_1, U_2, …, U_n are IID unobserved Gamma(α, α) random variables, called frailties. Given U_i = u, (T_i1, T_i2, T_i3, …) are independent inter-event times with Marginal survivor function of T_ij:
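The two formulas on this slide were lost. Under the usual multiplicative gamma frailty setup, which is assumed here rather than confirmed by the transcript, they would read as follows, with Λ_0 a baseline cumulative hazard and U_i having shape α and rate α (mean 1):

```latex
P(T_{ij} > t \mid U_i = u) = \exp\{-u\,\Lambda_0(t)\} = \bigl[\bar F_0(t)\bigr]^{u},

\bar F(t) = E\bigl[\exp\{-U_i\,\Lambda_0(t)\}\bigr]
          = \left[\frac{\alpha}{\alpha + \Lambda_0(t)}\right]^{\alpha}.
```

The second line is the Laplace transform of the gamma distribution evaluated at Λ_0(t). As α grows without bound it tends to exp{−Λ_0(t)}, which recovers the IID model.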

24 Estimator under Frailty Model Frailty parameter, α: dependence parameter. Small (Large) α: Strong (Weak) dependence. EM algorithm needed to obtain the estimator (FRMLE). Unobserved frailties viewed as missing. GPLE needed in EM algorithm. Under frailty model: GPLE not consistent when frailty parameter is finite.

25 Comparisons of Estimators Under gamma frailty model. F = EXP(6). G = EXP(1). n = 50. Number of replications = 1000. Frailty parameter α took values {∞ (IID model), 6, 2}. Computer programs: S-Plus and Fortran routines.
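One replication of this design could be generated roughly as below. Several choices are assumptions, not taken from the slides: the exponential parameters are read as rates, the frailty multiplies the event rate, and the function and argument names are hypothetical. The original study used S-Plus and Fortran, not Python.

```python
import random

def simulate_unit(rng, alpha, event_rate=6.0, censor_rate=1.0):
    """Return (complete_times, censored_remainder) for one unit.

    alpha: gamma frailty parameter; float("inf") gives the IID model.
    event_rate, censor_rate: ASSUMED to be exponential rates for F and G.
    """
    tau = rng.expovariate(censor_rate)
    # Gamma with shape alpha and scale 1/alpha has mean 1.
    u = 1.0 if alpha == float("inf") else rng.gammavariate(alpha, 1.0 / alpha)
    s, complete = 0.0, []
    while True:
        t = rng.expovariate(u * event_rate)
        if s + t > tau:               # next event falls after tau: censored
            return complete, tau - s
        s += t
        complete.append(t)

rng = random.Random(0)                # fixed seed for reproducibility
data = [simulate_unit(rng, alpha=2.0) for _ in range(50)]
print(sum(len(times) for times, _ in data), "complete gaps across 50 units")
```

Each element of `data` holds a unit's complete gaps and its censored remainder, so the list can be passed to any estimator that expects pooled recurrent-event data in that layout.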

26 Simulated Comparison of the Three Estimators for Varying Frailty Parameter [Figure: one panel per frailty parameter value, including the IID case. Black = GPLE; Blue = WCPLE; Red = FRMLE.]

27 Effect of the Frailty Parameter (α) for Each of the Three Estimators (Black = ∞; Blue = 6; Red = 2)

28 The Three Estimates of MMC Period Survivor Function IID assumption seems acceptable. Estimate of frailty parameter α is 10.2.

