SDMX at the International Labour Organization


SDMX at the International Labour Organization SDMX Global Conference 16 – 19 September, 2019 – Budapest, Hungary

Once upon a time…
Dissemination WS for ILOSTAT, 1st generation: 2013
- Limited number of artefacts and formats delivered
- «Virtual registry» approach: all artefacts generated «on-the-fly» based on the structural metadata information in ILOSTAT
Internal «consumers»
- ILO Knowledge Gateway: very easy integration of statistical DWI
- Country profiles: Desktop and Mobile applications
- Data Mapper: IMF product adapted to consume SDMX API
- WESO and YouthSTATS dashboards

Expanding the use of SDMX
SDMX Query builder
- Online «wizard» to access ILOSTAT data and metadata in SDMX
ILOSTAT Excel Add-in
- Superseded (with new functionalities) the former «KILM» Excel Add-in
- Replaced the old proprietary WS with the SDMX standard API
ILOSTAT Data Publisher
- Simple-to-use desktop tool to extract data and metadata from ILOSTAT
- Downloads information for one country, ready to upload to .Stat v7

Expanding the use of SDMX
Second generation WS: 2018
- Same architecture as the previous version (on-the-fly virtual registry)
- Based on .Net NSIWS by Eurostat
- Implements all artefacts and complies with RESTful API v. 1.4 specification
- Delivers all available formats: SDMX-ML, SDMX-csv and SDMX-json
ILO.Stat, based on SIS-CC Data Explorer
- DE «connects» to ILOSTAT by consuming the new WS
- No changes in ILOSTAT’s backend
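As an illustration of what consuming a RESTful SDMX web service looks like, the following minimal sketch builds a data query URL following the general SDMX REST pattern `{base}/data/{agency},{flow},{version}/{key}` and selects one of the three delivered formats through the `Accept` header. The base URL, dataflow identifier and codes are placeholders invented for the example; they are not ILOSTAT's actual endpoint or artefact names.

```python
from urllib.parse import urlencode

# Media types used by SDMX REST services to select the data message format.
ACCEPT = {
    "sdmx-ml": "application/vnd.sdmx.genericdata+xml;version=2.1",
    "sdmx-json": "application/vnd.sdmx.data+json;version=1.0.0",
    "sdmx-csv": "application/vnd.sdmx.data+csv;version=1.0.0",
}


def data_query(base_url, agency, flow, version, key, **params):
    """Build a data query URL: {base}/data/{agency},{flow},{version}/{key}."""
    flow_ref = ",".join([agency, flow, version])
    # The key has one dot-separated position per dimension; "+" means OR
    # and an empty position acts as a wildcard.
    key_str = ".".join("+".join(codes) for codes in key)
    url = f"{base_url.rstrip('/')}/data/{flow_ref}/{key_str}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical endpoint, dataflow and codes, for illustration only.
url = data_query(
    "https://example.org/sdmx/rest",
    "ILO", "DF_EXAMPLE", "1.0",
    key=[["FRA", "DEU"], [], ["A"]],
    startPeriod="2015", endPeriod="2018",
)
headers = {"Accept": ACCEPT["sdmx-csv"]}
print(url)
```

Any HTTP client can then send `url` with `headers`; switching the `Accept` value is enough to obtain the same data in another format.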

ILOSTAT Modular Architecture
[Diagram labels: DATA COLLECTION, VALIDATION & TRANSFORMATION, METADATA MANAGEMENT, WORKFLOW CONTROL, DISSEMINATION, SMART, .Stat DE (Reusable Components for the Web: Search | Visualise | Share)]
IT considerations
- Modular design following GSBPM
- Oracle RDBMS and development tools
- Automated procedure for xQ and SDMX uploading with structural consistency
- E-Questionnaire online data collection
- Single set of metadata
- Single interactive consistency procedure regardless of data collection means
- «False positives» handling through allowance issuing
- Full screen data editor
- Dynamic content dissemination website
- Data workflow management module

Data is stored in a relational database running on the Oracle 11g DBMS administered by ITCOM (the centralized ILO information technologies service). Two postulates have been established for the design of the new data structure: a) the data structure for the data collection database should be the same for all kinds of time series data, regardless of the periodicity, units of measure, classification breakdown and way of collection; and b) the main (atomic) unit is the “cell” of each table collected, which will be called VALUE and will keep its associated dimensions and other attributes.

Although it is a Data Compilation system (and not a proper statistical activity producing microdata), the system has a modular design following the recommendations of the GSBPM, including modules for Data Collection, Data Cleaning, Dissemination, Workflow tracking, Code lists maintenance, User Profiling and Access Control, and Source & Methods (See Figure 3: ILOSTAT Information System modular design). Not included in the diagram.

Program development is based on Oracle APEX (Oracle Application Express) for the interactive applications, complemented with some PL/SQL packages and Java classes for specific tasks. Intensive data processing tasks, like consistency checking and Excel questionnaire generation, are developed in SAS, accessing the Oracle database. The Workflow control dashboard and the dissemination tables and charts are built using Oracle BI Enterprise Edition (OBIEE).

The User Profiling and Access Control module, developed in APEX, includes a dynamic menu that lists the applications available to the user based on the user's profile. Example profiles are Statistical Assistants, Analysts, Managers and External Users.

The Data Collection module kept the automatic generation and upload of Excel questionnaires as in the former system, but it has been redesigned to make use of a single set of metadata, fully parameterized and common to both the collection and dissemination processes. The upload procedure (fully automated) performs basic consistency checking and routes the error report to the assigned Statistical Assistant (SA) for correction.

The e-Questionnaire application (under development) will be an interactive full screen editor for values and annotations on the data collected, developed in APEX and accessible through the web. It will work on the “Data Collection” work tables and will operate based on the single set of metadata for the QTables.

Electronic Data Interchange, probably using SDMX, is on the roadmap for 2012 as a way of reducing the burden on countries caused by requests for information that they already hold in their databases and that has to be transcribed into offline or online questionnaires.

The QTable Consistency process, developed in SAS, can be run as a batch process to analyze all records marked “for consistency” in the “Data Collection” database, or it can be launched on demand by the SA. This process passes the correct QTables to the “Dissemination” database and marks the erroneous ones with the respective error codes; these remain in the repository. The assigned SA is notified of the results, and the status of each QTable is updated in the data management system (See Figure 4: Workflow status diagram).

The Editor program is used by the SA to correct the errors detected in the data. This program displays the QTable being edited and the error messages related to it. When off-line data collection methods are used, the country user can include annotations in the questionnaire, which the Editor will display for the SA to code into notes associated with the data at the right level.
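The two design postulates and the consistency routing described in these notes can be sketched in a few lines. This is only an illustration under stated assumptions: the actual system is built on Oracle, APEX and SAS, and the class names, status strings, rules and error codes below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Value:
    """One cell of a collected table, with its dimensions and attributes."""
    obs_value: float
    dimensions: dict  # e.g. {"REF_AREA": "XX", "TIME_PERIOD": "2011"}
    attributes: dict = field(default_factory=dict)


@dataclass
class QTable:
    table_id: str
    values: list
    status: str = "for consistency"
    errors: list = field(default_factory=list)


def check_qtable(qtable):
    """Apply two toy rules and return the error codes found."""
    errors = []
    for v in qtable.values:
        if v.obs_value < 0:
            errors.append("E_NEGATIVE")   # hypothetical error code
        if "TIME_PERIOD" not in v.dimensions:
            errors.append("E_NO_PERIOD")  # hypothetical error code
    return errors


def run_consistency(collection):
    """Pass clean QTables to dissemination; mark the others with their errors."""
    dissemination = []
    for qt in collection:
        if qt.status != "for consistency":
            continue  # only records marked for checking are analyzed
        qt.errors = check_qtable(qt)
        if qt.errors:
            qt.status = "error"  # stays in the repository for the SA to edit
        else:
            qt.status = "disseminated"
            dissemination.append(qt)
    return dissemination


good = QTable("T1", [Value(12.5, {"REF_AREA": "XX", "TIME_PERIOD": "2011"})])
bad = QTable("T2", [Value(-3.0, {"REF_AREA": "XX"})])
published = run_consistency([good, bad])
```

Because every observation is a `Value` with free-form dimensions, the same structure holds annual, quarterly or monthly series without schema changes, which is the point of postulate (a).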

Community work
SDMX v 2.1 plug-in for .Stat v7
- Same architecture as ILOSTAT’s API
- Provides a full SDMX compliant API to .Stat v7 platforms
- Enables a smooth migration to .Stat Suite
- Data and Metadata download
- Data Explorer connected to v.7 backend
Global DSD for
- Price statistics
- Labour statistics
- SDG reporting
Definition of MSD mapping Global MCS to IHSN DDI-C template (work in progress)

Tools
SMART
- Use of SDMX structural metadata to define calculations and data recoding and reformatting
- SDMX-driven data conversion (including microdata)
- Batch utility SMARTcmd.exe allows scripting
- Data reporting without a real SDMX architecture in place
DSD Constructor
- Easy to use tool for creating/editing DSD by combining concepts
- Online connection to any SDMX Registry
- Codelists and annotations management
- Perfect SMART companion tool

SMART and DSD Constructor
[Diagram labels: SDMX Registry, ILOSTAT DSD, Structural Metadata, DSD Constructor, SMART, Microdata, Aggregated Data, Dataset, MAPPING, SMART Dataset, DATA REPORTING, DATA CONVERSION, LMIS UPLOAD, LMI ANALYSIS]

ILOSTAT-ART is a free basic statistical processor that can compute statistical tables (reported indicators) defined by DSDs, either by processing microdata sets or by transcoding aggregate input data. It relies strongly on mappings between input variables and the DSD's concepts, which can be saved and reused. This approach ensures the consistency of the output codes, since they match the structure of the DSD. It can process different file formats (e.g. Stata, SPSS, csv, SDMX) and produces output “data packages” in Excel, csv or SDMX formats, ready to fulfil the data reporting requirements or to feed a .Stat dissemination platform.

Innovation: Electronic data exchange
Non-statistical application of SDMX
[Diagram: Institution 1 and Institution 2, each with local databases & information systems, linked by data transmission. Numbered steps:]
1. Define the model of the data to be exchanged
2. Send data request
3. Receive request; authenticate requester
4. Process request: prepare data response
5. Encrypt & sign response
6. Send data response
7. Receive response
8. Authenticate response
9. Process response: insert into local system
[Decision point shown in the diagram: Is the sender authorized?]
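Steps 2 to 9 of the exchange can be sketched as two small functions, one per side. This is a toy model, not the prototype itself: an HMAC with a shared demo key stands in for the signature step, encryption is omitted entirely, and the dataflow name, field names and records are invented for the example.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"demo-key"  # assumption: stand-in for each party's real key pair


def sign(payload):
    """Serialize a payload and attach a signature (stand-in for step 5)."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "signature": tag}


def authenticate(message):
    """Check the signature before trusting a message (steps 3 and 8)."""
    expected = hmac.new(SHARED_KEY, message["body"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["signature"]):
        raise PermissionError("sender is not authorized")
    return json.loads(message["body"])


def responder(message, local_db):
    request = authenticate(message)                           # step 3
    rows = [r for r in local_db
            if r["person_id"] in request["person_ids"]]       # step 4
    return sign({"dataflow": request["dataflow"], "rows": rows})  # steps 5-6


def requester(person_ids, remote, local_db):
    request = sign({"dataflow": "DF_DEATHS",                  # step 2
                    "person_ids": person_ids})
    response = authenticate(remote(request))                  # steps 7-8
    local_db.extend(response["rows"])                         # step 9
    return response["rows"]


# Hypothetical records held by the responding institution.
responding_db = [
    {"person_id": "P1", "date_of_death": "2019-01-31"},
    {"person_id": "P2", "date_of_death": "2019-03-15"},
]
requesting_db = []
rows = requester(["P1"], lambda m: responder(m, responding_db), requesting_db)
```

Step 1, agreeing on the data model, happens before any code runs: both sides must share the same structure definition for the `rows` they exchange.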

Innovation: Electronic data exchange
Current status:
- A proof-of-concept showed the feasibility of the approach.
- Prototype of death data exchange using the existing SDMX environment.
Using the SDMX toolkit, including:
- Data Structure, Data Flows, Data Packages/Sets, Code lists, etc.
Customisation and Mapping Tools:
- Building Data Flows by selecting data fields from concept schemes.
- Connection of a Data Flow to a local database to generate Data Packages.
Additional tools:
- GPG4Win: Signature and encryption of Data Packages.
- Nextcloud (in ISSA premises): Secured communication channel based on shared folders.
- SMART: Desktop tool for converting files among different formats (XML, csv, etc.)

Innovation: microdata in SDMX PoC on microdata processing in SDMX

Innovation: CSV Structural Metadata
Four data message formats: EDIFACT, xml, json and csv
- UN/EDIFACT: SDMX-EDI, only suitable for time series data
- xml: widely used for representing documents and general data structures; base format for communications protocols and web services; requires IT knowledge
- json: «new generation» data exchange format (2000s); highly oriented to web development
- csv: very popular data exchange format, partially standardized (RFC 4180); every spreadsheet or statistical package can import csv data

Innovation: CSV Structural Metadata
- SDMX-csv format is supported for data messages only
- csv datasets are very efficient for statistical processing
- The lack of structural metadata messages in csv makes it difficult to access the valid codes and labels of categories in these packages
- Code lists can be represented in csv without effort
- A structural metadata artefact in csv format is required to link the dataflow to its DSD, conceptSchemes and codelists (work in progress)
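To show why a code list in csv is useful next to a csv dataset, here is a minimal sketch that loads a code list and uses it both to label and to validate data rows. The code list column layout (`CODELIST`, `CODE`, `LABEL_EN`) is an assumption made for this example, since the csv structural format is described above as work in progress; the dataflow identifier, codes and values are also invented.

```python
import csv
import io

# Hypothetical code list layout: one row per code.
CODELIST_CSV = """CODELIST,CODE,LABEL_EN
CL_SEX,SEX_T,Total
CL_SEX,SEX_M,Male
CL_SEX,SEX_F,Female
"""

# Data rows in the style of an SDMX-csv data message (illustrative values).
DATA_CSV = """DATAFLOW,REF_AREA,SEX,TIME_PERIOD,OBS_VALUE
ILO:DF_EXAMPLE(1.0),XX,SEX_M,2018,61.2
ILO:DF_EXAMPLE(1.0),XX,SEX_F,2018,48.7
ILO:DF_EXAMPLE(1.0),XX,SEX_X,2018,1.0
"""


def load_codelist(text, codelist_id):
    """Return {code: label} for one code list."""
    reader = csv.DictReader(io.StringIO(text))
    return {r["CODE"]: r["LABEL_EN"] for r in reader
            if r["CODELIST"] == codelist_id}


def label_and_validate(data_text, column, codes):
    """Attach labels to valid rows and collect codes missing from the list."""
    labelled, invalid = [], []
    for row in csv.DictReader(io.StringIO(data_text)):
        code = row[column]
        if code in codes:
            labelled.append({**row, column + "_LABEL": codes[code]})
        else:
            invalid.append(code)
    return labelled, invalid


sex_codes = load_codelist(CODELIST_CSV, "CL_SEX")
rows, bad = label_and_validate(DATA_CSV, "SEX", sex_codes)
```

The same two files can be opened directly in a spreadsheet or statistical package, which is the practical appeal of keeping structural metadata in csv.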

Thank you! Visit us at https://ilostat.ilo.org Edgardo Greising Head of Knowledge Management and Solutions Unit STATISTICS - ILO greising@ilo.org sdmx@ilo.org

Data Reporting
Data reporting without a real SDMX architecture in place
[Diagram label: Primary Statistical Activity]

One year ago, during the “Meeting of Experts 2016” in Aguascalientes, MEX, we discussed “How to design and build an SDMX Enterprise Architecture” in Breakout Session 2. Among the three basic scenarios presented, the so-called “Light” one, intended for data reporting only and without a real SDMX architecture in place, was recognized as a quite common situation among data producers in developing countries. Many of them lack a central repository of indicators, and the information to be reported is “spread” inside the institution in a number of different formats and media.

Some tools are available from the SDMX community to help start data reporting in SDMX, like the SDMX-RI Mapping tools to generate a “PUSH” mode flow, or the SDMX Converter if there is no database into which SDMX-RI can be plugged. Nonetheless, quite often (just to make it a bit more difficult) the structure of the indicators calculated by the data producer for its internal use differs from the specification of the information to be reported which, for example, uses a different variant of a classification breakdown. In this case, the microdata needs to be re-processed to generate the outputs with the right structure.

- Questionnaires in Excel require manual transcription of data (and metadata) → Experts won’t do this job.
- Questionnaires arrive late, when the survey has already been processed and published → Experts are likely to be engaged in another project.
- Breakdowns are different from those used at national level → Requires re-processing, including new mappings.
- Variable definitions may differ from those used at national level → Requires re-processing, including new calculations.

ILOSTAT SMART facts
- No indicators’ database is required
- Tables defined dynamically via a DSD
- Selectable classifications’ versions and variants
- Flexible mapping
  - Conditions applied on-the-fly to tally/sum/avg
  - Mapping can be saved and re-used
- Multi-language
- ILO standard routines for derived variables (*)
- Stand alone + on line access to any SDMX registry and/or Data API
- Process microdata or aggregate datasets in Stata, SPSS, SDMX and csv
- Several output formats: .xls, pdf, csv, sdmx
- Desktop and Online (*) versions
(*) Coming soon
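The idea of a reusable mapping with conditions applied on the fly can be illustrated with a short sketch. It does not reproduce SMART's internals: the mapping format, the survey column names and the DSD codes below are all hypothetical, chosen only to show how microdata records are recoded to DSD codes and then tallied with weights.

```python
from collections import Counter

# Hypothetical mapping from survey variables to DSD concepts and codes.
MAPPING = {
    "SEX": {"column": "q_sex", "codes": {1: "SEX_M", 2: "SEX_F"}},
    "AGE": {"column": "q_age",
            "rule": lambda age: "Y15-24" if age < 25 else "Y_GE25"},
}


def to_dsd_key(record, mapping):
    """Recode one microdata record into a tuple of DSD codes."""
    key = []
    for spec in mapping.values():
        raw = record[spec["column"]]
        key.append(spec["rule"](raw) if "rule" in spec else spec["codes"][raw])
    return tuple(key)


def tally(records, mapping, condition=lambda r: True, weight="weight"):
    """Sum the weights per DSD key for the records that meet the condition."""
    totals = Counter()
    for r in records:
        if condition(r):
            totals[to_dsd_key(r, mapping)] += r[weight]
    return dict(totals)


microdata = [
    {"q_sex": 1, "q_age": 22, "employed": True, "weight": 100.0},
    {"q_sex": 2, "q_age": 40, "employed": True, "weight": 150.0},
    {"q_sex": 1, "q_age": 19, "employed": False, "weight": 80.0},
    {"q_sex": 1, "q_age": 24, "employed": True, "weight": 50.0},
]
employed = tally(microdata, MAPPING, condition=lambda r: r["employed"])
```

Since the output keys are built only from codes present in the mapping, every cell produced matches the target structure, and the same `MAPPING` can be reused for the next survey round.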
