ESTP WORKSHOP ON SDMX IN NATIONAL ACCOUNTS


Challenges at Statistics Netherlands

Challenges Statistics Netherlands - 1
- SDMX implementation concurrent with the ESA 2010 revision
- Focus on the NA domain (not BoP, etc.)
- One system for all transmissions to international organisations, extendable to other domains
- Full NA domain
- SDMX 2.0 (Eurostat) and 2.1 (ECB)
- Different header requirements
- Validation on-site

Fall 2013: start of project
Two parallel implementation projects:
- SDMX converter: straightforward, easy to plan (strict deadline!)
  - January 2014: implementation of the converter + Excel templates
  - September 2014: first delivery of SDMX to IOs, using the converter
- SDMX-RI: complex, IT intensive
  - Spring-summer 2014: experimentation with RI
  - Autumn-winter 2014/2015: requirements + development of RI tooling
  - Spring-summer 2015: testing + further development
  - Fall-winter 2015/2016: deployment

Implementation problems
- Data flow definitions:
  - Some artefacts were missing
  - Many only available in a test registry
- Different requirements for SDMX header specifics made by different international organisations
  - ECB more stringent
- DSD matrix is needed for validation, but has no official status and contains some inconsistencies
- Key set constraints:
  - As yet unavailable
  - Validation tools still to be tested with those artefacts

Implementation of SDMX-RI at Statistics Netherlands

Who are we?
Statistics Netherlands, National Accounts and Government Finance Statistics
- Harm Melief (NA): statistics project leader; requirements and validation
- Vincent Ohm (GFS): requirements and testing
- Dick Windmeijer: RI expertise and methodology
- Wilma Triepels: IT development (National Accounts)
- Anne Reedijk: IT development (CBS IT department)
- Olav ten Bosch / Hans Beneker: IT project leader

SDMX at Statistics Netherlands
- Goal: controlled introduction of SDMX at Statistics Netherlands
- Duration: 2009 to present
- Strategy:
  - SDMX for external data communication, not (yet) for internal processes
  - As much as possible generic SDMX services / tools / processes
- Projects:
  - Census Hub (SDMX-RI)
  - ICT-Hub (pilot, SDMX-RI)
  - Fishery, Waste, Trade, Education, etc. (mostly SDMX converter)
  - STES / SDDS (SDMX-RI + Statistics Netherlands dissemination database StatLine)
  - National Accounts (SDMX-RI)

National Accounts project
IT project at Statistics Netherlands:
- Business case (justification):
  - Implement SDMX for the National Accounts domain
  - Using RI; converter supported until 2016
- Basic assumptions
- General requirements (Business architecture)
- General functionality (Use case model)
- Specifications (Use cases)
- Development, testing, deployment

Basic assumptions
- One system for all transmissions to international organisations
- Exports both SDMX 2.0 (Eurostat) and 2.1 (ECB)
- Flexible with respect to header requirements
- Applicable for the full NA domain
- Generic system, reusable for other domains
- Local validation
- Using the RI software for generating SDMX files

Architecture
[Diagram labels: DSDs, NA Output system, Dissemination System, Dataflows, Data files from Specialists, SDMX files, Definitions, Codelists]
Two interacting systems:
- NA output system for collecting and validating all NA dataflows (domain specific)
- Dissemination tooling (generic) for:
  - Interpreting DSDs to derive dataflow definitions and codelists
  - Importing, mapping and SDMX conversion

Why two systems?
The choice was made to separate generic from specific functionality.
- Dissemination tooling:
  - Does not use all RI components
  - IT intensive → not for statisticians
  - Extendable to other domains (generic)
- NA Output database:
  - Contains data + (staged) validations
  - Statistical expertise needed → not for IT people
  - Domain specific

General requirements for IT project
The following needs were defined:
- Delivery of SDMX files
- Technical validation of data
- Redelivery of data to correct mistakes
- Viewing data prior to SDMX conversion
- Implementing changes in DSDs
- Viewing older SDMX files (rejected)
- Administrative data, delivery times, versions, etc.
- Delivering metadata to NA output system (added later on)

Functionality
Five (four + one) use cases defined:
- Implementation of new DSD versions
- Importing data files into the dissemination system
- Generating SDMX files
- Viewing SDMX files, prior to export
- Exporting SDMX metadata to NA output system

Example of Use Case

CBS-System overview and demo

Architecture revisited
[Diagram labels: DSDs, NA Output system, Dissemination System, Dataflows, Data files from Specialists, SDMX files, Definitions, Codelists]
Two interacting systems:
- NA output system for collecting and validating all NA dataflows (domain specific)
- Dissemination tooling (generic) for:
  - Interpreting DSDs to derive dataflow definitions and codelists
  - Importing, mapping and SDMX conversion

Design choices
- Separate generic (dissemination) from domain-specific statistics (output) tooling
- NA Output system delivers dataflows already mapped to SDMX dimensions and codes
These choices allow:
- Statistical validation in the NA output system
- Automation of the dissemination system

Dissemination tool architecture

Dissemination system – RI components
- Mapping assistant:
  - Imports DSD
  - Translates DSD information into mapping store
  - Not used for (manual) mapping of data to DSD
- Mapping Store database:
  - Separate database
  - Stores DSD artefacts
  - Contains mappings for all dataflows

Dissemination system – RI components
- Webservice:
  - Imports mappings from mapping store
  - Imports dataflows from local repository
  - Converts data to SDMX
- Webclient:
  - Used for inspection of converted data
- RI components may be accessed through the SOAP or REST interface of the RI webservice
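
As an illustration of access through a REST interface, the sketch below builds a data query URL following the general SDMX 2.1 REST pattern `data/{flow}/{key}`. The base URL, dataflow identifier and dimension values are hypothetical placeholders, not taken from the CBS system.

```python
from urllib.parse import quote, urlencode


def build_data_query(base_url, flow, key_parts, start_period=None, end_period=None):
    """Build an SDMX 2.1 style REST data query URL.

    key_parts holds one value per dimension, in DSD order; an empty
    string leaves that dimension unconstrained (wildcard).
    """
    key = ".".join(key_parts)
    url = f"{base_url.rstrip('/')}/data/{quote(flow, safe=',')}/{quote(key, safe='.+')}"
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical endpoint, dataflow id and dimension values, for illustration only.
query_url = build_data_query(
    "http://localhost/ws/rest", "NA_EXAMPLE", ["A", "NL", ""], start_period="2010"
)
print(query_url)
```

The URL is only constructed here, not requested, so the sketch runs without a web service being available.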

Dissemination system – Generic I/O shell
- CBS mapping generator:
  - Imports DSD + data flow definitions
  - Generates dissemination DB from DSD artefacts
  - Generates data mapping: 1-on-1
- Dissemination DB:
  - Generated into DSD format
  - Contains dataflows imported from NA output database
- Built on SQL Server/C#.NET and the REST interface

Dissemination system – Generic I/O shell
- Post-processing:
  - Defining the header through standard RI tooling is complex
  - Easier and more flexible solution: adjust the header after conversion
  - Important for deliveries to ECB
  - Body of the SDMX file is not changed
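
The post-processing idea (adjust the header after conversion, leave the body alone) can be sketched as follows. This is a minimal illustration in Python, not the CBS implementation, which the slides describe as built on C#.NET. The sample message is heavily simplified, and the sender and receiver identifiers are hypothetical. Real SDMX-ML files use more namespaces, whose prefixes would also need registering to be preserved on output.

```python
import xml.etree.ElementTree as ET

# Namespace of the SDMX-ML 2.1 message envelope.
MESSAGE_NS = "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"

# Simplified, illustrative message; not a complete or schema-valid file.
SAMPLE = """<message:StructureSpecificData xmlns:message="%s">
  <message:Header>
    <message:ID>IDREF1</message:ID>
    <message:Test>true</message:Test>
    <message:Prepared>2015-11-01T00:00:00</message:Prepared>
    <message:Sender id="UNKNOWN"/>
    <message:Receiver id="UNKNOWN"/>
  </message:Header>
  <message:DataSet/>
</message:StructureSpecificData>""" % MESSAGE_NS


def adjust_header(xml_text, sender_id=None, receiver_id=None, test=None):
    """Return the message with selected header fields replaced.

    Only elements inside message:Header are touched; the data set is
    re-serialised as parsed.
    """
    ET.register_namespace("message", MESSAGE_NS)
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{MESSAGE_NS}}}Header")
    if header is None:
        raise ValueError("no message:Header element found")

    def child(name):
        return header.find(f"{{{MESSAGE_NS}}}{name}")

    if sender_id is not None and child("Sender") is not None:
        child("Sender").set("id", sender_id)
    if receiver_id is not None and child("Receiver") is not None:
        child("Receiver").set("id", receiver_id)
    if test is not None and child("Test") is not None:
        child("Test").text = "true" if test else "false"
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifiers for one receiving organisation.
adjusted = adjust_header(
    SAMPLE, sender_id="NL_EXAMPLE", receiver_id="ECB_EXAMPLE", test=False
)
print(adjusted)
```

Keeping the header values as function parameters means one converted file can be re-headed per receiving organisation without rerunning the conversion.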

Dissemination system – User interface
- Controls I/O shell
- Export DSD metadata (e.g. codelists) to NA output database
- Manage data flows:
  - Uploading from NA output database
  - Monitoring properties (e.g. number of records)
  - Initiating conversion and export of SDMX
- Allows specifying header information
- Based on an ASP.NET MVC web app, using a WCF service

Dissemination system – Externals
- REST interface allows only SDMX 2.1
- Command-line version of the converter is used to convert SDMX 2.1 to SDMX 2.0
- Dissemination database pulls data from, and delivers metadata to, the NA Output database, not vice versa

Demo

NA-output tool architecture
Two goals:
- Storing separate dataflows, delivered by the specialists
- Validating dataflows, using DSD information (codelist)
[Diagram labels: Specialist data, Collection, Output database, Keys validation, Code list validation, Other validations, Upload to dissemination system]

NA-output database - 2
- SQL Server database:
  - Storing unvalidated data
  - Validation scripts (stored procedures)
  - Storing validated data
- Access/VBA user interface:
  - General process management
  - Scripts for importing specialist data
  - Running validation scripts

Keys Validation - 1
- Join keylist to dataflow
- Keys with no corresponding data indicate missing records
- Data with no corresponding keys indicate superfluous records
[Diagram labels: Dataflow, Keys, Missing, Match, Superfluous]

Keys Validation - 2
Keys validation:
- Each record in an SDMX data flow can be uniquely identified by the values of its relevant dimensions: the key
- Each dataflow uses a (relevant) subset of the DSD dimensions
- Relevant dimensions are defined in the DSD matrix
- A correct dataflow consists of a set of unique keys
- Output database contains stored procedures to compare the unvalidated data against its key set
- Missing records imply an incomplete transmission
- Extra records are superfluous
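
The key comparison described on these two slides amounts to a set difference in both directions. The sketch below is an illustration in Python rather than the stored procedures used in the output database. The dimension names and code values are examples chosen for the sketch, and the duplicate check is an addition that follows from the requirement that keys be unique.

```python
def validate_keys(expected_keys, records, dimensions):
    """Compare a dataflow's records with its expected key set.

    expected_keys: set of tuples, one value per relevant dimension.
    records: list of dicts holding dimension values plus observations.
    dimensions: the relevant dimensions for this dataflow, in key order.
    """
    found = set()
    duplicates = set()
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        if key in found:
            duplicates.add(key)  # the same key delivered more than once
        found.add(key)
    expected = set(expected_keys)
    return {
        "missing": sorted(expected - found),      # keys without data
        "superfluous": sorted(found - expected),  # data without keys
        "matched": sorted(expected & found),
        "duplicates": sorted(duplicates),
    }


# Illustrative dimensions and codes, not an actual NA key list.
DIMENSIONS = ("FREQ", "REF_AREA", "STO")
expected_keys = {("A", "NL", "B1G"), ("A", "NL", "P3"), ("A", "NL", "P51G")}
records = [
    {"FREQ": "A", "REF_AREA": "NL", "STO": "B1G", "OBS_VALUE": 100.0},
    {"FREQ": "A", "REF_AREA": "NL", "STO": "P3", "OBS_VALUE": 70.0},
    {"FREQ": "A", "REF_AREA": "NL", "STO": "P7", "OBS_VALUE": 5.0},
]
result = validate_keys(expected_keys, records, DIMENSIONS)
print(result["missing"])      # keys the transmission still lacks
print(result["superfluous"])  # records that should not be in this dataflow
```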

Code list validation
Code validation:
- Fields in the NA DSDs:
  - Values (OBS_VALUE)
  - Generic format, e.g. datetime
  - Categorical (using code lists)
- DSD links dimensions to code lists (if applicable)
- Output database contains stored procedures to compare the unvalidated data against the linked codelists
- No plausibility checks; for instance, it is not checked whether the country code matches the transmitting country
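
A minimal sketch of the code list check, again in Python for illustration rather than as the stored procedures mentioned above. Field names and code lists are example values. As on the slide, the check only tests membership in the linked code list and does no plausibility checking.

```python
def validate_codes(records, codelists):
    """Return (record index, field, value) for every value not in its code list.

    codelists maps a coded field name to the set of allowed codes; fields
    without an entry (e.g. OBS_VALUE) are not checked.
    """
    errors = []
    for index, rec in enumerate(records):
        for field, allowed in codelists.items():
            value = rec.get(field)
            if value not in allowed:
                errors.append((index, field, value))
    return errors


# Illustrative code lists; real ones would come from the DSD artefacts.
codelists = {
    "FREQ": {"A", "Q"},
    "REF_AREA": {"NL", "BE"},
    "OBS_STATUS": {"A", "E", "P"},
}
code_records = [
    {"FREQ": "A", "REF_AREA": "NL", "OBS_STATUS": "A", "OBS_VALUE": 1.0},
    {"FREQ": "M", "REF_AREA": "NL", "OBS_STATUS": "X", "OBS_VALUE": 2.0},
]
code_errors = validate_codes(code_records, codelists)
print(code_errors)
```

A record that lacks a coded field altogether is reported with the value `None`, so absent dimensions surface as errors too.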

Content validation
- Currently not implemented, planned for 2016
- Monitoring progress and results of the Validation TF
  - Wish: provide all keylists in a central repository
- Some validation checks are planned:
  - Country code (dimension in key)
  - Combinations of Conf_Status and Embargo date
  - Combinations of Obs_Status and Obs_Value
- Build up the checks as more are defined by the TF

Demo

Implementation problems
- Data flow definitions:
  - Some were missing
  - Many only available in a test registry
- Different requirements on header specifics made by different international organisations
  - ECB more stringent
- DSD matrix is needed for validation, but has no official status and contains some inconsistencies
- Key constraints:
  - As yet unavailable
  - Possibly not to be used by the validation task force

Current / Future work
- Collaborate with Eurostat to fill in missing artefacts:
  - Data flow definitions
  - DSD matrix
- Monitor the work of the validation task force
  - Possibly implement validation rules
- Extend to other domains, for instance ESSPROS