Survival Function Estimation from Current Status Data SWB JSM San Diego 2012
Objective: Compare baseline and endline U5, where U5 = P[Life ≤ 5 years] (under-five mortality). Added objective: Compare survivor function estimates –P[Life > t | Baseline] vs. P[Life > t | Endline], t = 1, 2, …, 60+ months –Baseline = 2007; Endline = 2010
Data Were from Surveys: IRC surveyed two Afghan refugee camps in Pakistan in 2007 and one of them again in 2010, after humanitarian aid –IRC = International Rescue Committee –Before (2007) = “baseline” and after (2010) = “endline” –2008 earthquake prevented endline survey of the other refugee camp
Data Details: Many questions: intrusive, detailed. Data included survey dates, live birth dates within the last 5 years, deaths, and current status (dead or alive). PDF files showed questionnaires. Excel spreadsheets contained data relevant to objective(s).
Data Problems: If exact dates were unknown, a month or season was given, including “monsoon.” Some ages at death were inadmissible; the deaths would have occurred after the survey dates. Baseline and endline data were in different formats. Some instructions seemed to have been misunderstood, e.g., the one for the baseline survey date.
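A coarse admissibility check of the kind described above can be sketched in Python. The function names and the example dates are hypothetical, and ages at death are assumed to be reported in whole months; the actual spreadsheet layout of the surveys is not reproduced here.

```python
from datetime import date


def months_between(start, end):
    """Whole months elapsed from start to end (end assumed not before start)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not yet complete
    return months


def is_admissible(birth, survey, age_at_death_months):
    """True if a death at the reported age could have happened by the survey date."""
    return 0 <= age_at_death_months <= months_between(birth, survey)


# Hypothetical record: born 15 Jan 2009, household surveyed 10 Mar 2010.
birth_date = date(2009, 1, 15)
survey_date = date(2010, 3, 10)
for reported_age in (12, 20):
    print(reported_age, is_admissible(birth_date, survey_date, reported_age))
```

Records that fail such a check would be excluded from the ages-at-death analysis but can still contribute their current status (dead or alive).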
Some Endline Grouped Age-at-Death Data
Sufficient Current Status Data: Not all data are shown. These are endline data.
Nonparametric Max. Likelihood from Current Status Data: $\ln L = \sum_j \left[ d_j \ln(F_j) + (s_j - d_j) \ln(1 - F_j) \right]$, where $F_j = P[\text{Life} \le j]$; $d_j$ and $s_j$ are, respectively, current status deaths from both surveys and all live births, for $j = 1, 2, \ldots, 60+$. Use Excel Solver to maximize the log likelihood as a function of $p_k \ge 0$, where $F_j = \sum_{k=1}^{j} p_k$ –Also used least squares fit to current status NPMLE
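As a cross-check on a Solver fit, the same maximization can be sketched in Python. This is not the Excel workbook used for the talk: the sketch swaps in the pool-adjacent-violators algorithm, which gives the maximizer of the log likelihood above under the constraint that F_j is nondecreasing in j (equivalent to p_k ≥ 0). The function names and the counts in the example are hypothetical, and every age is assumed to have at least one live birth.

```python
import math


def current_status_npmle(deaths, births):
    """NPMLE of F_1..F_J: weighted isotonic regression of deaths[j] / births[j]."""
    blocks = []  # each block: [pooled deaths, pooled births, number of ages pooled]
    for d, s in zip(deaths, births):
        if s <= 0 or d < 0 or d > s:
            raise ValueError("need births > 0 and 0 <= deaths <= births at every age")
        blocks.append([d, s, 1])
        # Pool adjacent violators: merge while the earlier block's death rate
        # exceeds the later block's rate (compared by cross-multiplication).
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            d2, s2, n2 = blocks.pop()
            blocks[-1][0] += d2
            blocks[-1][1] += s2
            blocks[-1][2] += n2
    cdf = []
    for d, s, n in blocks:
        cdf.extend([d / s] * n)
    return cdf


def log_likelihood(cdf, deaths, births):
    """ln L = sum_j [ d_j ln(F_j) + (s_j - d_j) ln(1 - F_j) ], skipping zero-count terms."""
    total = 0.0
    for f, d, s in zip(cdf, deaths, births):
        if d > 0:
            total += d * math.log(f)
        if s - d > 0:
            total += (s - d) * math.log(1.0 - f)
    return total


# Hypothetical counts for five ages (months), not the survey data.
example_deaths = [1, 0, 3, 2, 4]
example_births = [20, 25, 30, 20, 25]
example_cdf = current_status_npmle(example_deaths, example_births)
print(example_cdf)
print(log_likelihood(example_cdf, example_deaths, example_births))
```

The survivor function estimate is then 1 − F_j, and the log likelihood value can be compared directly with the one Solver reports.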
Kaplan-Meier Estimate from Admissible Ages at Deaths: Data arranged in form of Nevada table. Used workbook described at an.htm. Includes Greenwood’s estimate of standard deviation of survival function.
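For readers without the workbook, the Kaplan-Meier product-limit estimate and Greenwood's standard deviation can be sketched in Python. This is an illustration, not the workbook's implementation: it assumes the numbers at risk and the deaths at each successive age have already been extracted from the table, and the function name and counts are hypothetical.

```python
import math


def kaplan_meier(at_risk, deaths):
    """Return [(survival, greenwood_sd), ...] for successive ages at death."""
    survival = 1.0
    greenwood_sum = 0.0
    results = []
    for n, d in zip(at_risk, deaths):
        if n <= 0 or d < 0 or d > n:
            raise ValueError("need at_risk > 0 and 0 <= deaths <= at_risk")
        survival *= 1.0 - d / n  # product-limit step
        if d < n:
            # Greenwood term; skipped when everyone at risk dies, since the
            # survival estimate (and hence the product below) is then zero.
            greenwood_sum += d / (n * (n - d))
        results.append((survival, survival * math.sqrt(greenwood_sum)))
    return results


# Hypothetical counts for three successive ages, not the survey data.
example_at_risk = [100, 95, 90]
example_deaths = [2, 1, 3]
for s_hat, sd in kaplan_meier(example_at_risk, example_deaths):
    print(round(s_hat, 4), round(sd, 4))
```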
CDF Estimates (Current Status)
CDF Estimates from Current Status Data
Birth, Deaths, U5, and Std. Err.: Std. Error Est. is Greenwood’s standard deviation. See Banerjee and Wellner for confidence intervals from current status data.
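For reference, Greenwood's formula in its usual form is given below. The symbols are assumptions introduced here, not taken from the slides: n_i is the number at risk and m_i the number of deaths at the i-th distinct age at death t_i.

```latex
\widehat{\operatorname{Var}}\bigl[\hat{S}(t)\bigr]
  = \hat{S}(t)^{2} \sum_{t_i \le t} \frac{m_i}{n_i \,(n_i - m_i)},
\qquad
\widehat{\operatorname{sd}}\bigl[\hat{S}(t)\bigr]
  = \hat{S}(t) \sqrt{\sum_{t_i \le t} \frac{m_i}{n_i \,(n_i - m_i)}}
```

Because U5 = 1 − S(60 months), the standard error of U5 is the same as that of the survival estimate at 60 months.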
Conclusions: Estimates from current status data and from admissible ages at death agreed tolerably. U5 = ~10% baseline and ~4% endline –Pakistan U5 is 8.7% (Wikipedia). Infant mortality (1 year) is almost 4% baseline AND endline –Pakistan infant mortality is 6.7% (World Bank, World Development measures)
References
Banerjee, Moulinath and Jon A. Wellner, “Confidence Intervals for Current Status Data,” Scand. J. of Statist., vol. 32, pp. , 2005
George, L. L., “Kaplan-Meier Reliability Estimation Spreadsheet,” ASQ Reliability Review, vol. 25, no. 2, June
ibid., “What Price Required Data?” /NwsRev2.doc, pp. 3-4, 2000
Greenwood, M., “The Natural Duration of Cancer,” Reports on Public Health and Medical Subjects 33, His Majesty’s Stationery Office, pp. 1-26, 1926
Miller, Rupert, “What Price Kaplan-Meier?” Biometrics, vol. 39, pp. , 1983
ibid., Beyond ANOVA, Basics of Applied Statistics, Chapman and Hall, New York, 1997
“Survival Analysis,” Nov. 2, 2011