Guidelines on the use of estimation methods for the integration of administrative sources DIME/ITDG meeting 2018/02/22.


Contents
- Purpose of the presentation
- Objectives of the guidelines
- Pre-requisites for integration
- Usage scenarios
- Typology of methods
- Conditions for the use of methods
- Structure of the guidelines

Purpose of the presentation
- To provide information to the DIME/ITDG about the status of the project and to receive feedback and comments

Objective of the guidelines – 1
- Nowadays statistical products regularly rely on different data sources
- Such products are often called multi-source statistics
- This project focused on estimation methods for the integration of different sources
- Distinction between two kinds of data sources:
  - Statistical data
  - Administrative data

Objective of the guidelines – 2
- The guidelines consider the integration of these two kinds of sources
- For the integration of statistical data and administrative data it is necessary:
  - To create an infrastructure which supports the integration
  - To distinguish different usage scenarios for the integration
  - To understand the conditions necessary for the application of the different methods

Objective of the guidelines – 3
For a number of usage scenarios the guidelines present:
- A short description of methods which are applicable in the usage scenario
- Decision rules for usage of the methods under more detailed specifications of the usage scenario
- A recommendation of preferred methods if the usage scenario allows the application of different methods

Prerequisites for integration of administrative data – 1
- For the integration of administrative data in the statistical production process an appropriate infrastructure must be available
- Appropriate IT infrastructure:
  - Database architecture
  - Software for data access, data manipulation, and technical data integration
  - (Statistical) software which supports the use of methods

Prerequisites for integration of administrative data – 2
- A model for assessing quality of multi-source statistics
- Administrative and organisational infrastructure:
  - Arrangements with administrative units for data exchange, including legal aspects (privacy, security)
  - Competent staff for the production of multi-source statistics

Prerequisites for integration of administrative data – 3
Useful sources for details about prerequisites:
- ESSnet on quality of multisource statistics – KOMUSO
  https://ec.europa.eu/eurostat/cros/content/essnet-quality-multisource-statistics-komuso_en
- ESSnet on the use of administrative data and accounts data in business statistics
  https://ec.europa.eu/eurostat/cros/content/use-administrative-and-accounts-data-business-statistics_en

Usage scenarios for administrative data – 1
A bird’s eye view on usage scenarios:
- Administrative data used either exclusively or in combination with survey data as source for statistical products (direct usage)
- Administrative data used as source for building and maintaining statistical registers (indirect usage)
- Administrative data used as support in the different sub-processes of the GSBPM (indirect usage):
  integrate data, edit & impute, weighting and estimation, calculate aggregates, validation, ...

Usage scenarios for administrative data – 2
- All usage scenarios use methods for data linkage (GSBPM sub-process 5.1: data integration in a narrow sense) and require:
  - The identification of units
  - The identification of duplicates
- Challenges are:
  - The harmonisation of units and measurements in different sources (alignment)
  - The presentation of the same figures for the same phenomenon in different sources (univalency)
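
The two linkage requirements named on this slide, identification of units and identification of duplicates, can be illustrated with a minimal Python sketch. It assumes that both sources share an exact identifier; the field name `unit_id`, the turnover variables and the toy records are hypothetical and do not come from the guidelines. Real linkage often has to work without a common identifier, which this sketch does not cover.

```python
from collections import Counter

def find_duplicates(records, key="unit_id"):
    """Return the identifiers that occur more than once within one source."""
    counts = Counter(rec[key] for rec in records)
    return sorted(uid for uid, n in counts.items() if n > 1)

def link_sources(survey, admin, key="unit_id"):
    """Exact-key linkage: pair each survey record with the admin record sharing its identifier."""
    # If admin still contains duplicates, the last record per identifier is used.
    admin_by_id = {rec[key]: rec for rec in admin}
    linked, unlinked = [], []
    for rec in survey:
        match = admin_by_id.get(rec[key])
        if match is None:
            unlinked.append(rec)
        else:
            # Survey values take precedence when both sources use the same field name.
            linked.append({**match, **rec})
    return linked, unlinked

# Hypothetical toy data
survey = [
    {"unit_id": "A1", "turnover_survey": 120},
    {"unit_id": "B2", "turnover_survey": 75},
]
admin = [
    {"unit_id": "A1", "turnover_admin": 118},
    {"unit_id": "C3", "turnover_admin": 40},
    {"unit_id": "C3", "turnover_admin": 41},
]

admin_duplicates = find_duplicates(admin)
linked, unlinked = link_sources(survey, admin)
```

In this toy case unit `A1` carries two turnover figures after linkage (120 and 118), which is the kind of disagreement the univalency challenge refers to.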

A typology of methods – 1
Type A: Methods for different sub-processes of the GSBPM:
- Integrate data (5.1)
- Edit & impute (5.4)
- Weighting and estimation (5.6 et al.)
- Alignment of statistical units and measurements (hidden in different sub-processes)

A typology of methods – 2
Type B: Methods which define a workflow for statistical production:
- Workflow for producing statistical products from administrative data (register-based census)
- Workflow for producing statistical products from administrative data in combination with survey data
- Workflow for creating a statistical register
- Workflow for updating statistical registers

A typology of methods – 3
Useful sources for methods (mainly type A methods):
- ESS.VIP Admin WP 2: Statistical methods
  https://ec.europa.eu/eurostat/cros/content/wp2-statistical-methods_en
- ESSnet MEMOBUST
  https://ec.europa.eu/eurostat/cros/content/memobust_en
- ESSnet Data integration
  https://ec.europa.eu/eurostat/cros/content/data-integration_en
- A. & B. Wallgren: Register-based Statistics

Conditions for the use of methods – 1
- Besides the usage scenario, the application of methods depends on some conditions for the data
- Structural conditions for the data:
  - Possible structures: micro-data, macro-data
  - Temporal structure: time series (longitudinal data)
- Because physical integration is done step by step, it is sufficient to consider the structure for the use of two datasets only
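
The argument that stepwise integration reduces the problem to two datasets at a time can be sketched as a fold over the list of sources. This is only an illustration under simplifying assumptions: `integrate_pair` is a hypothetical pairwise step that merges micro-data keyed by a unit identifier, and the records are invented.

```python
from functools import reduce

def integrate_pair(left, right):
    """Hypothetical pairwise step: merge two micro-datasets keyed by unit identifier."""
    merged = {uid: dict(rec) for uid, rec in left.items()}  # copy, so inputs stay unchanged
    for uid, rec in right.items():
        merged.setdefault(uid, {}).update(rec)
    return merged

# Three hypothetical sources; each maps unit identifier -> variables
sources = [
    {"u1": {"employees": 10}},
    {"u1": {"turnover": 200}, "u2": {"turnover": 50}},
    {"u2": {"employees": 3}},
]

# Integration of any number of sources is a sequence of two-dataset steps.
integrated = reduce(integrate_pair, sources)
```

Each call to `integrate_pair` only ever sees two datasets, so the structural conditions need to be checked for one pair at a time.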

Conditions for the use of methods – 2
- Possible combinations of the structural conditions:
  - Both datasets are micro-data
  - Both datasets are macro-data
  - A combination of micro-data and macro-data
- Systematic treatment of different temporal structures is not considered at the moment

Conditions for the use of methods – 3
- Representation of the envisaged population:
  - Complete vs. incomplete
  - Disjoint vs. overlap
- Representation of the variables of interest:
  - Unique representation (variable in only one dataset) vs. multiple representation
  - Completeness with respect to statistical units vs. incompleteness (missing values)
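
As a rough illustration, the population and variable conditions on this slide can be checked with plain set operations. The sketch below assumes that the envisaged population and both datasets are available as lists of unit identifiers; the function names, identifiers and variable names are hypothetical.

```python
def describe_coverage(target_ids, ids_a, ids_b):
    """Classify how two datasets together represent the envisaged population."""
    target, a, b = set(target_ids), set(ids_a), set(ids_b)
    covered = (a | b) & target
    return {
        "complete": covered == target,    # complete vs. incomplete
        "disjoint": a.isdisjoint(b),      # disjoint vs. overlap
        "overlap": sorted(a & b),
        "not_covered": sorted(target - covered),
    }

def describe_variables(vars_a, vars_b):
    """Split variables into those held by both datasets and those held by only one."""
    a, b = set(vars_a), set(vars_b)
    return {"multiple": sorted(a & b), "unique": sorted(a ^ b)}

# Hypothetical example
coverage = describe_coverage(
    target_ids=["u1", "u2", "u3", "u4"],
    ids_a=["u1", "u2"],
    ids_b=["u2", "u3"],
)
variables = describe_variables(["turnover", "employees"], ["turnover", "nace"])
```

Here the two datasets overlap on `u2` and leave `u4` uncovered, so the population is represented incompletely, while `turnover` has a multiple representation.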

Conditions for the use of methods – 4
- Conditions specific to Type A methods (GSBPM sub-processes):
  - Characterisation of the variables (numeric, categorical, ...)
  - Information about possible errors in the data
- Conditions specific to Type B methods:
  - The basic methods are a detailed specification of the workflow and conceptual data modelling (for statistical units)
  - Conditions for detailed specification of some of the sub-processes (e.g. mass imputation, or alignment of measurements)

Structure of the guidelines – 1
The guidelines are organised as follows:
1. Introduction
2. The use of administrative data in statistical production (basically this presentation)
3. Estimation methods for specific GSBPM sub-processes in the case of micro-data and time series (editing, imputation, weighting and estimation)
4. Alignment of units and measurements
5. The integration of statistical and administrative micro-data
6. Macro-integration
7. Using administrative data for creation and maintenance of registers
8. Direct usage of administrative data for statistical products

Structure of the guidelines – 2
- Chapters 3 – 8 are self-contained and can be read independently
- The structure of chapters 3 – 8 is as follows:
  - Description of the problem (usage scenario)
  - Conditions on the data for the possible methods
  - Short description of the possible methods
  - Evaluation of the methods:
    - A decision tree branching according to the conditions
    - If there is more than one method in a terminal node, the methods are ordered from recommended to not recommended
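
The evaluation step described above, a decision tree whose terminal nodes hold an ordered list of methods, might be represented along the following lines. This is a sketch only: the branching conditions are borrowed loosely from the earlier slides, and the tree layout, the key names and the method names are placeholders, not the trees published in the guidelines.

```python
# Hypothetical tree: inner nodes branch on one condition, terminal nodes list
# methods ordered from recommended to not recommended. All names are placeholders.
TREE = {
    "condition": "structure",
    "branches": {
        "micro-micro": {
            "condition": "population_overlap",
            "branches": {
                True: ["placeholder method A", "placeholder method B"],
                False: ["placeholder method C"],
            },
        },
        "macro-macro": ["placeholder method D"],
    },
}

def recommend(tree, conditions):
    """Walk the tree according to the given conditions and return the terminal node."""
    node = tree
    while isinstance(node, dict):
        value = conditions[node["condition"]]
        node = node["branches"][value]
    return node

methods = recommend(TREE, {"structure": "micro-micro", "population_overlap": True})
preferred = methods[0]  # first entry is the most recommended method in this node
```

Keeping the methods of a terminal node in a list preserves the ordering from recommended to not recommended without needing a separate ranking field.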