NCHS, July 11, 2006. A Semiparametric Approach to Forecasting US Mortality Age Patterns. Presenter: Rong Wei. Coauthors: Guanhua Lu, Benjamin Kedem.

Presentation transcript:

NCHS July 11, 2006

A Semiparametric Approach to Forecasting US Mortality Age Patterns. Presenter: Rong Wei (1). Coauthors: Guanhua Lu (2), Benjamin Kedem (2), and Paul D. Williams (1). Affiliations: (1) National Center for Health Statistics (NCHS); (2) Department of Mathematics, University of Maryland, College Park.

Outline: Background; Project tasks; Model introduction; New approach: semiparametric model; Mortality forecasting: US and small states; Comparison with the Lee-Carter model; Conclusion.

Background: NCHS publishes race-gender specific life tables for each of the 50 states plus DC every ten years. Of the more than 300 tables, about one fifth could not be published because the number of deaths in a short time period was too small. NCHS has documented mortality data for every year, state, and race-gender population since 1968.

An example of life tables

Mortality age patterns: data from US and large states

Mortality in small states: one year data vs. 30 years historical data

Another view of the data: time series at each age

The tasks: To address the insufficient-data problem, use data from 30+ years to model the age-specific death pattern for small areas. Select a time series model that gives better control of the time effect and random error in multiple time series for short-term prediction. Project mortality curves (one year ahead vs. many years ahead) in small areas using historical data and robust statistical methodology.

Introduction to mortality forecasting models: the US mortality forecasting model of Lee and Carter (1992) is $\ln(m_{x,t}) = a_x + b_x k_t + e_{x,t}$, with $k_t = k_{t-1} + c + e_t$, where $m_{x,t}$ is the death rate at age $x$ in year $t$. The LC model is based on principal components. It searches for the first principal component (PC) in n-dimensional time series data and solves for the age and time parameters by singular value decomposition. The LC model explains 60 to 93% of the total variance (Girosi and King). For some populations, the first PC may be insufficient to explain the variance in high-dimensional data.
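The Lee-Carter fit described on this slide can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function names, the normalization sum(b) = 1, the drift estimate, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_lee_carter(log_m):
    """Fit ln(m[x,t]) = a[x] + b[x]*k[t] by SVD.

    log_m: array of log death rates, shape (ages, years).
    Returns a, b, k and the share of variance explained by the first PC.
    """
    a = log_m.mean(axis=1)                     # a_x: average log rate at each age
    centered = log_m - a[:, None]              # remove the age profile
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    b = U[:, 0]                                # age loadings of the first PC
    k = s[0] * Vt[0]                           # time index of the first PC
    scale = b.sum()                            # assumed normalization: sum(b) = 1
    b, k = b / scale, k * scale                # (requires sum of loadings != 0)
    explained = s[0] ** 2 / np.sum(s ** 2)     # variance share of the first PC
    return a, b, k, explained

def forecast_k(k, steps=1):
    """Random walk with drift: k[t] = k[t-1] + c + e[t], point forecast only."""
    c = (k[-1] - k[0]) / (len(k) - 1)          # drift = mean of first differences
    return k[-1] + c * np.arange(1, steps + 1)

# Synthetic data (hypothetical): 10 ages, 30 years, steadily declining mortality.
rng = np.random.default_rng(0)
ages, years = 10, 30
a_true = np.linspace(-8.0, -2.0, ages)
b_true = np.full(ages, 1.0 / ages)
k_true = np.linspace(5.0, -5.0, years)
noise = 0.01 * rng.standard_normal((ages, years))
log_m = a_true[:, None] + np.outer(b_true, k_true) + noise

a, b, k, explained = fit_lee_carter(log_m)
k_next = forecast_k(k, steps=1)
log_m_next = a + b * k_next[0]                 # one-year-ahead log death rates
```

The `explained` value corresponds to the variance share quoted on the slide; when it is low, a single principal component leaves much of the age-by-year variation unmodeled.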

New approach: semiparametric model. Short mortality time series, from 1968 to 1998, are used for consistency of data collection; more information is combined from neighboring ages; death rates are centered; the emphasis is on predictions for upcoming years.

Semiparametric model

Semiparametric model in Time Series

Parameter estimation from pooled sample

Maximum likelihood function

Reference sample distribution

Application to US mortality forecasting. Data: mortality data come from death certificates filed in state vital statistics offices and reported to NCHS for 1968 to 2002; population data come from the decennial census and are interpolated between two adjacent censuses. Age-specific mortality rates were calculated for each race-gender demographic population.
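The rate calculation described on this slide can be illustrated as follows. The slide does not say which interpolation scheme was used, so linear interpolation is an assumption here, and the census counts and death counts are made-up numbers for illustration only.

```python
import numpy as np

def interpolate_population(census_years, census_pop, years):
    """Population for intercensal years (assumed: linear interpolation)."""
    return np.interp(years, census_years, census_pop)

def death_rates(deaths, population):
    """Age-specific death rate = deaths / population at risk, year by year."""
    return np.asarray(deaths, dtype=float) / np.asarray(population, dtype=float)

# Hypothetical single age and demographic group between two censuses.
years = np.arange(1970, 1981)
pop = interpolate_population([1970, 1980], [1000.0, 1200.0], years)
deaths = np.full(years.shape, 12.0)            # made-up annual death counts
rates = death_rates(deaths, pop)
log_rates = np.log(rates)                      # log scale used in later slides
```

In practice the same calculation would be repeated for each age and each race-gender population.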

Cont’d: There are 85 age-specific time series for ages 1, ..., 85, where the age category 85+ includes age 85 and above. For each age, the time series runs from 1970 to 2001; 2002 data are available for comparison with the prediction result. The 85 time series are grouped into 5-year age groups 1-5, 6-10, ..., for a total of 17 groups. Death rates at each age are centered on their averages over years. Residuals from the time series “in the middle” of each group are taken as the reference.
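The grouping and centering steps on this slide can be sketched as below. The slide does not spell out the centering formula, so subtracting each age's mean log rate is an assumption, as are the function names and the synthetic data; fitting the time series model that produces the residuals is not shown.

```python
import numpy as np

def center_log_rates(log_m):
    """Subtract each age's average over years (rows = ages, columns = years)."""
    return log_m - log_m.mean(axis=1, keepdims=True)

def five_year_groups(n_ages=85, width=5):
    """Lists of ages (starting at age 1) in consecutive groups of `width`."""
    return [list(range(start, start + width))
            for start in range(1, n_ages + 1, width)]

def middle_age(group):
    """Age whose series sits in the middle of the group (the reference)."""
    return group[len(group) // 2]

groups = five_year_groups()
reference_ages = [middle_age(g) for g in groups]

# Synthetic log death rates (hypothetical): 85 ages by 32 years (1970 to 2001).
rng = np.random.default_rng(1)
log_m = rng.normal(-5.0, 1.0, size=(85, 32))
centered = center_log_rates(log_m)

# Centered series of the first group; row index = age - 1.
first_group = centered[[age - 1 for age in groups[0]], :]
```

With five ages per group, the reference for the first group is age 3, and the other four series in the group are compared against it.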

Mortality age-patterns across four decades: 1970 – 2000: US National Vital Statistics

Age-specific time series for log-death rates

Log-death rates centered by rescaling from age-specific averages over years

Centered age-specific time series for log-death rates

Mortality forecasting procedure

Procedure cont’d

Fit of TS & histogram of residuals

Comparison for single age

Comparison of age groups & Combining more information increases the fit of density curves

Empirical (solid) and estimated (dot) CDF

One-year-ahead predictive distribution

Predicted mortality curves from LC & SP models in 2002

Predicted mortality curves for age group 1-30

Predicted mortality curves for age group 31-50

Predicted mortality curves for age group 51-70

Predicted mortality curves for age group 71-85

Mean squared error (MSE) of prediction from the semiparametric (SP) and Lee-Carter (LC) models. [Tables: MSE for the total population and MSE for females; columns: Age Group, SP model, LC model.]
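A comparison of this kind can be computed as sketched here. The age ranges are borrowed from the predicted-curve slides and may not match the groups in the original tables; the observed and predicted values are synthetic placeholders, not results from the talk.

```python
import numpy as np

def mse_by_group(observed, predicted, bounds):
    """MSE of predictions within each inclusive age range; index 0 = age 1."""
    err2 = (np.asarray(observed) - np.asarray(predicted)) ** 2
    return {f"{lo}-{hi}": float(err2[lo - 1:hi].mean()) for lo, hi in bounds}

# Assumed age ranges (taken from the predicted-curve slides).
bounds = [(1, 30), (31, 50), (51, 70), (71, 85)]

# Placeholder values: log death rates for ages 1..85 in the hold-out year.
observed = np.linspace(-8.0, -2.0, 85)
predicted = observed + 0.1                     # a model that is off by 0.1
mse = mse_by_group(observed, predicted, bounds)
```

Running the function once per model (SP and LC) on the same hold-out year gives the two columns of such a table.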

Semiparametric Time Series Estimate: Mortalities in Small Populations

Semiparametric Time Series Estimate: Mortalities in Small Populations

Conclusion: Historical data fitted by the time series semiparametric model can help when estimating mortality rates in small areas with insufficient observations. Compared with the LC model, the semiparametric method reduces the overall MSE appreciably because it better models the predictive probabilities with conditional distributions. This is a non-Bayesian method. A Bayesian method would result in relatively large prediction intervals, so it could apply to predictions further than one year ahead.

Alternative ways to solve the problem of estimating mortality for small areas: in addition to borrowing strength from historical data, alternatives include borrowing strength from national mortality data, from geographically neighboring areas, and from other areas with similar causes of death.

Small area estimation by a Bayesian method: borrowing strength from national data. [Figure: mortality curve, black male, IA; x-axis: age; y-axis: log(q); series: estimated, state observation, national observation.]