Published by Harvey Small. Modified over 6 years ago.
1
Estimation methods for the integration of administrative sources
Specific contract n° ESTAT N° under framework contract Lot 1 n°
Estimation methods for the integration of administrative sources
Responsible person at Commission: Fabrice Gras, Eurostat – Unit B1
Authors: Istat, CBS, Univ. of Southampton
DIME ITDG Steering Group, 26 June 2017
2
Overview of the project
Project “Estimation methods for the integration of administrative sources”
Duration 1/06/ /03/2017
Part of ESS.VIP Admin Project (Work package 2 “Statistical methods”)
Experts from Istat and Statistics Netherlands, plus an external advisor from the University of Southampton
3
Aim of the project
Identification and presentation of existing statistical methods, together with the associated contextual framework, in order to enable and ease the integration of administrative sources into a statistical production system
4
The main objectives
Identification of the main possible uses of administrative sources
Identification of the different steps of the statistical production system where methods can be used for integrating administrative sources
A literature review presenting actual examples in the NSIs
Drafting of a technical summary sheet for each identified statistical method
5
Overview of the project: 7 tasks
Task 1. Specify usages of admin data
Task 2. Identification and description of possible statistical tasks where methods can be envisaged in order to integrate administrative sources
Task 3. Comprehensive identification and enumeration of possible estimation methods that could be used for cases identified in Task 2
6
Overview of the project: 7 tasks
Task 4. Literature review presenting examples in NSIs for the types of use of administrative sources and for the steps that have been previously identified
Task 5a. Provision of a template for the review of estimation methods
Task 5b. Methods description
Task 6 & 7. Final presentation and report
7
Task 1: Statistical usages
Direct
  1. Direct tabulation
  2. Substitution and supplementation
Indirect
  1. Creation and maintenance of registers
  2. Editing and imputation
  3. Indirect estimation
  4. Data validation/confrontation
8
Task 2: Statistical tasks
Statistical tasks for using integrated admin data
Data editing and imputation
Creation of joint statistical micro data
Alignment of statistical data
Multisource estimation at aggregated level
9
Data editing and imputation
Resolving micro-data inconsistencies and imputing missing data
10
Creation of joint statistical micro data
Data linkage: identification of the set of unique units residing in multiple datasets
Statistical matching: inference of the joint distribution based on marginal observations
Hashing techniques
11
Alignment of statistical data
Alignment of units: harmonisation of relevant units, creation of target statistical units
Alignment of measurements: harmonisation of relevant variables, derivation of target statistical variables
12
Multisource estimation at aggregated level
Population size estimation: multiple lists with imperfect coverage of the target population
Univalent estimation: numerically and statistically consistent estimation of common variables
Coherent estimation: aggregates that relate to each other in terms of accounting equations
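The population size estimation task above can be illustrated with the simplest two-list case. The following is a minimal sketch of a dual-system (capture-recapture) estimate, not a method taken from the project documents. It assumes that the two lists are independent, that records are linked without error, and that every unit has the same chance of appearing on a given list. The function name `dual_system_estimate` and the counts are hypothetical.

```python
def dual_system_estimate(n_a, n_b, n_both):
    """Two-list estimate of the population size.

    n_a, n_b: number of units on list A and on list B
    n_both:   number of units found on both lists after linkage
    Assumes independent lists and error-free linkage (illustrative only).
    """
    if n_both <= 0:
        raise ValueError("the lists must overlap to estimate the population size")
    return n_a * n_b / n_both


# Hypothetical counts: 800 units on list A, 600 on list B, 480 on both.
print(dual_system_estimate(800, 600, 480))
```

With more than two lists, or when the independence assumption is doubtful, richer models are needed; this sketch only conveys the basic idea of using the overlap between lists to estimate undercoverage.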
13
Relationships among usages and statistical tasks
Su1: one admin data source, Su2: >1 admin data sources, …
14
Task 5a: Template for methods description
Template: presentation of the method, the contextual framework for using it and the conditions of application, the pros and cons, an example of use, and a list of related existing software
15
Task 5: Data editing methods
Deductive editing
Selective editing
Automatic editing
Interactive editing
Macro-editing
Deductive imputation
Model-based imputation
Donor imputation
Imputation for longitudinal data
Imputation under edit constraints
Outlier detection
Reconciling conflicting micro-data: prorating, minimum adjustment methods, generalised ratio adjustments
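Of the reconciliation methods listed above, prorating is the simplest to illustrate. The following is a minimal sketch, assuming a single additive edit rule (the components must sum to a reported total) and assuming the total is the trusted value. The function name `prorate` and the figures are hypothetical and not taken from the project documents.

```python
def prorate(components, total):
    """Scale all components by a common factor so that they sum to `total`.

    Assumes the total is reliable and the error sits in the components.
    """
    current = sum(components)
    if current == 0:
        raise ValueError("cannot prorate components that sum to zero")
    factor = total / current
    return [c * factor for c in components]


# Hypothetical record: three components sum to 100 but the reported total is 120.
adjusted = prorate([50.0, 30.0, 20.0], 120.0)
print(adjusted)
```

Prorating preserves the ratios between components. Minimum adjustment methods instead choose the smallest change under a distance function, which may leave some components untouched.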
16
Task 5. Creation of joint statistical micro data methods
Matching of Object Characteristics (Unweighted & Weighted Matching)
Probabilistic Record Linkage
Data Fusion at Micro Level (relevant choice of Statistical Matching Methods)
Data hashing & anonymisation techniques
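Probabilistic record linkage, listed above, is commonly formulated in the Fellegi-Sunter framework, where each compared field contributes a log-likelihood-ratio weight. The following is a minimal sketch of that scoring step only. It assumes conditional independence between fields, and the m- and u-probabilities, the field names and the function name `match_weight` are hypothetical values chosen for illustration.

```python
import math


def match_weight(agreements, m_probs, u_probs):
    """Sum of log2 likelihood-ratio weights over the comparison fields.

    m_probs[f]: P(field f agrees | the two records are a true match)
    u_probs[f]: P(field f agrees | the two records are not a match)
    """
    weight = 0.0
    for field, agrees in agreements.items():
        m, u = m_probs[field], u_probs[field]
        if agrees:
            weight += math.log2(m / u)
        else:
            weight += math.log2((1 - m) / (1 - u))
    return weight


# Hypothetical parameters; in practice they are estimated, for example with EM.
m_probs = {"surname": 0.95, "birth_year": 0.98, "postcode": 0.90}
u_probs = {"surname": 0.01, "birth_year": 0.05, "postcode": 0.02}

# A candidate pair that agrees on surname and birth year but not on postcode.
pair = {"surname": True, "birth_year": True, "postcode": False}
w = match_weight(pair, m_probs, u_probs)
print(round(w, 2))
```

Pairs with a weight above an upper threshold would be declared links, pairs below a lower threshold non-links, and those in between sent to clerical review. The thresholds are not shown here.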
17
Task 5. Alignment of statistical data methods
Alignment of units: no general methods are available
Alignment of measurements: harmonisation based on latent variable models
18
Task 5. Multisource estimation at aggregated level methods
Generalised Regression Estimator
EBLUP Area Level for Small Area Estimation (Fay-Herriot) Method
Small Area Estimation Methods for Time Series Data
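For reference, the area-level (Fay-Herriot) EBLUP named above can be written in its standard textbook form. The notation is an assumption of this sketch and not taken from the slides: \hat{\theta}_i is the direct survey estimate for area i, \psi_i its sampling variance (treated as known), x_i a vector of area-level covariates (for example from administrative sources), \sigma_v^2 the model variance, and \hat{\beta} the weighted least squares estimate of \beta.

```latex
\begin{align}
\hat{\theta}_i &= \theta_i + e_i, & e_i &\sim N(0, \psi_i) \\
\theta_i &= x_i^{\top}\beta + v_i, & v_i &\sim N(0, \sigma_v^2) \\
\hat{\theta}_i^{\mathrm{EBLUP}} &= \hat{\gamma}_i\,\hat{\theta}_i + (1 - \hat{\gamma}_i)\,x_i^{\top}\hat{\beta}, & \hat{\gamma}_i &= \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \psi_i}
\end{align}
```

The weight \hat{\gamma}_i shifts the estimate towards the regression prediction when the direct estimate is noisy (large \psi_i) and towards the direct estimate when it is precise.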
19
Multisource estimation at aggregated level methods
IVa. Population size estimation: multiple lists with imperfect coverage of the target population
Multiple-list models for population size estimation
IVb. Univalent estimation: numerically and statistically consistent estimation of common variables
Repeated weighting
Mass imputation
Repeated imputation
Macro-integration
20
Multisource estimation at macro level methods
Univalent estimates at longitudinal level
State space models (estimation of unobserved variables and possible application to the alignment of statistical data)
Temporal disaggregation/benchmarking methods: Denton's method & Chow-Lin method
IVc. Coherent estimation: aggregates that relate to each other in terms of accounting equations
Macro-integration
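As an illustration of the benchmarking methods named above, Denton-type methods can be stated as a constrained least squares problem. The following is a commonly used (modified) first-difference form, given as a sketch and not necessarily the variant described in the project documents. The notation is an assumption: p_t is the preliminary high-frequency series, x_t the benchmarked series, and b_y the annual benchmark for year y.

```latex
\[
\min_{x_1,\dots,x_T} \; \sum_{t=2}^{T} \bigl[ (x_t - p_t) - (x_{t-1} - p_{t-1}) \bigr]^2
\quad \text{subject to} \quad \sum_{t \in y} x_t = b_y \ \text{ for every year } y
\]
\[
\text{Proportional variant:} \quad
\min_{x_1,\dots,x_T} \; \sum_{t=2}^{T} \left( \frac{x_t}{p_t} - \frac{x_{t-1}}{p_{t-1}} \right)^2
\quad \text{subject to the same constraints}
\]
```

Both forms keep the period-to-period movement of the preliminary series as far as possible while forcing the sub-annual values to add up to the annual benchmarks.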
21
Task 4: NSIs actual examples
Statistics on road accidents - Probabilistic Record Linkage
The creation of a Social Policy Simulation Database (Statistics Canada) - Statistical Matching
Modelling measurement error in admin and survey variables on turnover
Estimating classification errors under edit-restrictions in combined register-survey data
Variable harmonisation based on latent variable models: Dutch Population and Housing Census
Statistical methods for achieving univalent estimates for cross-sectional data
Macro-Integration (Repeated weighting)
22
Conclusions
The documents provide an overview of, and quick access to, methods used in the production of statistics involving integrated admin data
The methods differ in their level of maturity
Although research is ongoing for all of them, latent class models for harmonisation are the ones requiring more study
The task of unit harmonisation deserves research in terms of methods and the impact of errors
23
Conclusions
Registers are a statistical product (direct usage)
Until now they have been considered free of errors
The use of many sources implies uncertainty in the register, from now on called a statistical register
How to deal with uncertainty in a statistical register is an open problem that should be investigated
24
Thank you for your attention