Published by Francine Eleanor Cain. Modified over 9 years ago.
1
NCHS July 11, 2006
2
A Semiparametric Approach to Forecasting US Mortality Age Patterns
Presenter: Rong Wei (1)
Coauthors: Guanhua Lu (2), Benjamin Kedem (2) and Paul D. Williams (1)
(1) National Center for Health Statistics (NCHS)
(2) Math Dept., University of Maryland, College Park
3
Outline
- Background
- Project tasks
- Model introduction
- New approach: semiparametric model
- Mortality forecasting: US, small states
- Comparison with Lee-Carter model
- Conclusion
4
Background
- NCHS publishes race-gender specific life tables for each of the 50 states plus DC every ten years;
- Out of 300+ tables, about 1/5 could not be published because of small numbers of deaths in a short time period;
- Mortality data have been well documented at NCHS for every year, state, and race-gender population since 1968.
5
An example of life tables
6
Mortality age patterns: data from US and large states
7
Mortality in small states: one year data vs. 30 years historical data
8
Another view of the data: time series at each age
9
The tasks
- To address the insufficient data problem, use data from 30+ years to model the age-specific death pattern for small areas;
- Select a time series model that better controls for time effects and random error across multiple time series over short prediction horizons;
- Project mortality curves (one year ahead vs. many years ahead) for small areas using historical data and robust statistical methodology.
10
Introduction to mortality forecasting models
US mortality forecasting model by Lee and Carter (1992):
$\ln(m_{x,t}) = a_x + b_x k_t + e_{x,t}$
$k_t = k_{t-1} + c + e_t$
- The LC model is based on principal components. It finds the first principal component (PC) of the n-dimensional time series data and solves for the age and time parameters by singular value decomposition.
- The LC model explains 60-93% of the total variance (Girosi and King).
- For some populations, the first PC may be insufficient to explain the variance in high-dimensional data.
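As a rough illustration of the Lee-Carter fit described above, the sketch below estimates a_x, b_x and k_t from a matrix of log death rates with a singular value decomposition, then extends k_t with the random walk with drift. The function names, the input layout (ages in rows, years in columns) and the normalisation sum(b_x) = 1 are assumptions made for this example, and the second-stage re-estimation of k_t used in some Lee-Carter implementations is left out.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit ln(m_{x,t}) = a_x + b_x * k_t from the first principal component.

    log_m: array of shape (n_ages, n_years) holding log death rates
    (this layout is an assumption of the sketch).
    """
    a = log_m.mean(axis=1)                     # a_x: average log rate at each age
    centered = log_m - a[:, None]              # remove the age effect
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0]                                # age loadings of the first PC
    k = s[0] * Vt[0]                           # time index of the first PC
    scale = b.sum()                            # normalise so that sum(b_x) = 1
    b, k = b / scale, k * scale
    explained = s[0] ** 2 / np.sum(s ** 2)     # share of variance in the first PC
    return a, b, k, explained

def forecast_k(k, steps=1):
    """Point forecast of k_t under k_t = k_{t-1} + c + e_t."""
    c = (k[-1] - k[0]) / (len(k) - 1)          # drift = mean of first differences
    return k[-1] + c * np.arange(1, steps + 1)
```

A point forecast of the log death rates one year ahead is then `a + b * forecast_k(k, 1)[0]`. The returned `explained` value is the share of variance captured by the first PC, which is the quantity the 60-93% range refers to.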
11
New Approach: Semiparametric model
- Semiparametric approach
- Short mortality time series, from 1968 to 1998, used for consistency of data collection
- Combining more information from neighboring ages
- Centered death rates
- Emphasis on predictions for upcoming years
12
Semiparametric model
13
Semiparametric model in Time Series
14
Parameter estimation from pooled sample
15
Maximum likelihood function
16
Reference sample distribution
17
Application on US mortality forecasting
Data:
- Mortality data from death certificates filed in state vital statistics offices and reported to NCHS from 1968 to 2002;
- Population data from the decennial census, interpolated between two adjacent decennial censuses;
- Age-specific mortality rates were calculated for each race-gender demographic population.
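The rate calculation described above can be sketched as follows. The slides say only that population is interpolated between adjacent censuses, so the linear interpolation used here is an assumption, and the function names and any numbers passed in are hypothetical.

```python
import numpy as np

def interpolate_population(census_years, census_counts, years):
    """Estimate population in intercensal years.

    Linear interpolation is an assumption of this sketch; the source does
    not state which interpolation method was used.
    """
    return np.interp(years, census_years, census_counts)

def death_rate(deaths, population):
    """Age-specific death rate: deaths divided by population at risk."""
    return np.asarray(deaths, dtype=float) / np.asarray(population, dtype=float)
```

For one age and one race-gender group, `death_rate(deaths_by_year, interpolate_population(census_years, census_counts, years))` gives the annual series of rates that the later steps work with.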
18
Cont'd
- 85 age-specific time series for ages 1, ..., 85, where the age category 85+ includes age 85 and above;
- For each age, the time series runs from 1970 to 2001; 2002 data are available for comparison with the prediction result;
- The 85 time series are categorized into 5-year age groups 1-5, 6-10, ..., 81-85+, a total of 17 groups;
- Death rates at each age are rescaled by centering on their averages over years;
- Residuals from the time series "in the middle" of each group are taken as the reference.
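The data preparation steps listed above might look like the following sketch. It assumes that centering means subtracting each age's average log death rate over years, and that the series "in the middle" of a group is the one at the middle age; both are interpretations rather than details confirmed by the slides, and the function names are hypothetical. Computing the residuals themselves needs the fitted time series model, which is not shown here.

```python
import numpy as np

def center_log_rates(rates):
    """Center log death rates on their age-specific averages over years.

    rates: array of shape (n_ages, n_years). Subtracting the mean of the
    log rates is an assumed reading of "centered".
    """
    log_m = np.log(rates)
    age_means = log_m.mean(axis=1, keepdims=True)
    return log_m - age_means, age_means.ravel()

def five_year_groups(n_ages=85, width=5):
    """Split ages 1..n_ages into consecutive groups: 1-5, 6-10, ..., 81-85+."""
    ages = np.arange(1, n_ages + 1)
    return [ages[i:i + width] for i in range(0, n_ages, width)]

def reference_age(group):
    """Age whose series sits in the middle of the group (assumed reference)."""
    return int(group[len(group) // 2])
```

With the defaults, `five_year_groups()` returns 17 groups, and `reference_age` picks age 3 for group 1-5 and age 83 for group 81-85+.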
19
Mortality age-patterns across four decades, 1970-2000: US National Vital Statistics
20
Age-specific time series for log-death rates
21
Log-death rates centered by rescaling from age-specific averages over years
22
Centered age-specific time series for log-death rates
23
Mortality forecasting procedure
24
Procedure cont'd
25
Fit of TS & histogram of residuals
26
Comparison for single age
27
Comparison of age groups 32-34 & 31-35
Combining more information improves the fit of the density curves
28
Empirical (solid) and estimated (dot) CDF
29
One-year-ahead predictive distribution
30
Predicted mortality curves from LC & SP models in 2002
31
Predicted mortality curves for age group 1-30
32
Predicted mortality curves for age group 31-50
33
Predicted mortality curves for age group 51-70
34
Predicted mortality curves for age group 71-85
35
Mean Square Error of prediction from Semiparametric model (SP) & Lee-Carter (LC)

MSE for total population
Age Group   1-85   1-30   31-50   51-70   71-85
SP model    .104   .050   .015    .030    .009
LC model    .297   .078   .180    .029    .013

MSE for Female
Age Group   1-85   1-30   31-50   51-70   71-85
SP model    .187   .121   .026    .032    .008
LC model    .619   .226   .341    .027    .025
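For reference, a minimal sketch of how a prediction MSE can be computed and how the ages 1-85 figures reported above compare. The slides do not say whether the errors are measured on rates or on log rates, so the `mse` helper is generic; the dictionary keys simply name the two tables in the order they appear.

```python
import numpy as np

def mse(predicted, observed):
    """Mean squared error between predicted and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((predicted - observed) ** 2))

# Ages 1-85 MSE values copied from the two tables, in order of appearance.
sp_mse = {"first_table": 0.104, "second_table": 0.187}
lc_mse = {"first_table": 0.297, "second_table": 0.619}

# Relative reduction in MSE of the SP model compared with the LC model.
reduction = {key: 1 - sp_mse[key] / lc_mse[key] for key in sp_mse}
```

On these figures the SP model's MSE for ages 1-85 is about 65% lower than the LC model's in the first table and about 70% lower in the second. The LC model is marginally better only in the 51-70 column (.029 vs. .030 and .027 vs. .032).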
36
Semiparametric Time Series Estimate: Mortalities in Small Populations
37
Semiparametric Time Series Estimate: Mortalities in Small Populations
38
Conclusion
- Fitting historical data with the semiparametric time series model can help when estimating mortality rates in small areas with insufficient observations;
- Compared with the LC model, the semiparametric method reduces the overall MSE appreciably, because it models the predictive probabilities better with conditional distributions;
- This is a non-Bayesian method. The Bayesian method results in relatively large prediction intervals, so prediction further than one year ahead could apply.
39
Alternative ways to solve the problem of estimating mortalities for small areas
In addition to borrowing strength from historical data, alternatives include:
- Borrow strength from national mortality data;
- Borrow strength from data for geographically neighboring areas;
- Borrow strength from data for other areas with similar causes of death.
40
Small area estimation by Bayesian method: borrowing strength from national data
[Figure: Mortality curve, black male, IA. x-axis: Age (0 to 90); y-axis: Log(q) (-10 to 0). Series: Estimated, State observation, National observation.]