SDMX IT Tools SDMX Converter


SDMX IT Tools: SDMX Converter
Jean-Francois LEBLANC, Christian SEBASTIAN
11-13 May 2015
Eurostat, Unit B3 – IT solutions for statistical production

Table of Contents
1. Objectives
2. What is the Converter
   2.1 Design/Implementation
   2.2 Minimum requirements
   2.3 Supported formats
3. Interfaces
   3.1 GUI
   3.2 CLI
   3.3 Common API
   3.4 Web Service

Table of Contents (continued)
4. SDMX Converter vs SDMX-RI
5. Where is the Converter
6. Hands-on exercise

1. Objectives
- Illustrate how to use the Converter
- List all the supported formats
- Explain its interfaces
- Clarify when it is recommended
- State its limitations

2. What is the Converter
The SDMX Converter is a Java application that converts between all the following formats*:
- SDMX 2.0: Generic, Compact, Utility and Cross-Sectional
- SDMX 2.1: Generic Data, Generic TS, Structure-Specific, Structure-Specific TS
- GESMES: TS, 2.1, DSIS
- CSV, FLR
- MESSAGE GROUP (special SDMX 2.0 format)
- DSPL
- Excel
*Limitations may apply. Please check the User Manual.

The SDMX Converter has been developed in the context of the SODI project. It is a Java tool that converts between various dataset types based on SDMX-ML DSDs. The types between which the Converter can convert are:
- SDMX-ML formats (to convert from/to the Cross-Sectional type, the DSD must contain Cross-Sectional information and a TimeDimension)
- GESMES/TS (aka SDMX-EDI) (cannot be converted from/to the SDMX-ML Cross-Sectional format)
- GESMES/2.1 and GESMES/DSIS
- CSV, FLR, EXCEL, DSPL (a mapping mechanism is supported, as well as a configurable delimiter for CSV. Converting to CSV/FLR from other formats may result in the loss of attributes attached at a higher level than the observation.)
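
The restrictions listed above can be expressed as a small pre-conversion check. The Java sketch below is only an illustration under stated assumptions: the class `FormatRulesSketch`, its `Format` enum constants and the `check` method are hypothetical names, not identifiers of the Converter, and only the three restrictions mentioned in the text are covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: encodes the format restrictions listed above.
// The enum constants are illustrative names, not the Converter's identifiers.
class FormatRulesSketch {

    enum Format { SDMX_GENERIC, SDMX_COMPACT, SDMX_CROSS_SECTIONAL, GESMES_TS, CSV, FLR }

    static boolean isFlat(Format f) {
        return f == Format.CSV || f == Format.FLR;
    }

    // Returns human-readable notes; an empty list means none of the
    // restrictions covered by this sketch applies.
    static List<String> check(Format source, Format target, boolean dsdHasCrossSectionalInfo) {
        List<String> notes = new ArrayList<>();
        boolean crossSectional = source == Format.SDMX_CROSS_SECTIONAL
                || target == Format.SDMX_CROSS_SECTIONAL;
        boolean gesmesTs = source == Format.GESMES_TS || target == Format.GESMES_TS;

        if (crossSectional && gesmesTs) {
            notes.add("ERROR: GESMES/TS cannot be converted from/to SDMX-ML Cross-Sectional");
        }
        if (crossSectional && !dsdHasCrossSectionalInfo) {
            notes.add("ERROR: DSD needs Cross-Sectional information and a TimeDimension");
        }
        if (isFlat(target) && !isFlat(source)) {
            notes.add("WARNING: attributes attached above observation level may be lost");
        }
        return notes;
    }

    public static void main(String[] args) {
        System.out.println(check(Format.GESMES_TS, Format.SDMX_CROSS_SECTIONAL, true));
        System.out.println(check(Format.SDMX_COMPACT, Format.CSV, true));
        System.out.println(check(Format.SDMX_GENERIC, Format.SDMX_COMPACT, false));
    }
}
```

The User Manual remains the authoritative list of limitations; a check like this only catches the cases named on the slide.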

2.1 Design/Implementation
Steps to perform a conversion:
- Reading the input message
  - Parsing of the message
  - Populating the data model of the tool (based on the SDMX information model)
- Reading the DSD
  - The DSD is retrieved from the Registry in order to complete the conversion
  - The DSD can be loaded from files, so no connection is needed
- Writing the converted message
  - Uses the data model to write the output message in the target format

The conversion process comprises two main activities: reading an input data message and writing out the converted data message. There are specific modules that read and write datasets, i.e. SDMX-ML (Compact, Generic, Utility, Cross-Sectional), GESMES (TS, 2.1, DSIS) and flat files (CSV, FLR). The information of the dataset being converted is stored in classes that are based on the SDMX Information Model v2.0. These classes play the role of an intermediate format between readers and writers. The Data Structure Definition related to the converted datasets is needed to perform a conversion. If the DSD is not provided manually, the SDMX Converter can retrieve it from the Registry.
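
The read, populate, write flow described above can be sketched in Java as follows. This is a minimal illustration and not the Converter's code: the class `ConversionPipelineSketch` and its methods are hypothetical, the input and output "formats" are toy stand-ins (one delimited line per observation in, `key=value` lines out), and the DSD is reduced to a plain list of concept names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of the reader -> data model -> writer flow.
// None of these names belong to the real SDMX Converter API.
class ConversionPipelineSketch {

    // Reader: parses delimited rows into the intermediate model
    // (one concept -> value map per observation), using the concept
    // names taken from a simplified DSD.
    static List<Map<String, String>> read(List<String> rows, List<String> dsdConcepts,
                                          String delimiter) {
        List<Map<String, String>> model = new ArrayList<>();
        for (String row : rows) {
            String[] cells = row.split(Pattern.quote(delimiter), -1);
            if (cells.length != dsdConcepts.size()) {
                throw new IllegalArgumentException("Row does not match DSD: " + row);
            }
            Map<String, String> observation = new LinkedHashMap<>();
            for (int i = 0; i < cells.length; i++) {
                observation.put(dsdConcepts.get(i), cells[i].trim());
            }
            model.add(observation);
        }
        return model;
    }

    // Writer: serialises the populated model into the toy target format.
    static List<String> write(List<Map<String, String>> model) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> observation : model) {
            List<String> parts = new ArrayList<>();
            observation.forEach((concept, value) -> parts.add(concept + "=" + value));
            out.add(String.join(",", parts));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dsd = List.of("FREQ", "REF_AREA", "TIME_PERIOD", "OBS_VALUE");
        List<Map<String, String>> model = read(List.of("A;LU;2014;1.5"), dsd, ";");
        write(model).forEach(System.out::println);
    }
}
```

Because every reader targets the same intermediate model and every writer starts from it, adding a format means adding one reader and one writer rather than one converter per format pair.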

2.2 Minimum requirements
- Input file
- Output file (complete path)
- Format for input and output files
- Specify DSD:
  - DSD file
  - Reference to a DSD file in the Registry
  - Reference to a Dataflow file in the Registry
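
As a rough illustration, the minimum requirements above could be checked before launching a conversion. The Java sketch below uses a hypothetical `ConversionRequestSketch` class with invented field names; it also assumes that at least one of the three DSD sources is enough, which the slide does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the minimum conversion parameters; not a Converter class.
class ConversionRequestSketch {
    String inputFile;
    String outputFile;          // complete path
    String inputFormat;
    String outputFormat;
    String dsdFile;             // local DSD file
    String registryDsdRef;      // reference to a DSD in the Registry
    String registryDataflowRef; // reference to a Dataflow in the Registry

    private static boolean blank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Lists every missing requirement; an empty list means the request is complete.
    List<String> missing() {
        List<String> problems = new ArrayList<>();
        if (blank(inputFile)) problems.add("input file");
        if (blank(outputFile)) problems.add("output file");
        if (blank(inputFormat)) problems.add("input format");
        if (blank(outputFormat)) problems.add("output format");
        // Assumption: any one of the three DSD sources is sufficient.
        if (blank(dsdFile) && blank(registryDsdRef) && blank(registryDataflowRef)) {
            problems.add("DSD (file, Registry DSD reference or Registry Dataflow reference)");
        }
        return problems;
    }

    public static void main(String[] args) {
        ConversionRequestSketch request = new ConversionRequestSketch();
        request.inputFile = "Input_Datas.csv";
        request.outputFile = "CLI_Compact.xml";
        request.inputFormat = "CSV";
        request.outputFormat = "COMPACT_SDMX";
        System.out.println("Missing: " + request.missing());
    }
}
```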

2.3 Supported formats
- SDMX 2.0: GENERIC SDMX, COMPACT SDMX, UTILITY SDMX, CROSS-SECTIONAL SDMX, MESSAGE GROUP
- SDMX 2.1: GENERIC DATA 2.1, GENERIC TS DATA 2.1, STRUCTURE SPECIFIC DATA 2.1, STRUCTURE SPECIFIC TS DATA 2.1
- GESMES: GESMES TS, GESMES 2.1, GESMES DSIS
- OTHERS: CSV, FLR, DSPL, EXCEL

Limitations may apply. Please check the User Manual.

3. Interfaces
- User interface
- Command line
- Web service
- API

3.1 GUI (Converter 5.1.0)
1. Selection of the input/output files and their format
2.a Select the DSD on the local drive. If the local DSD includes multiple versions, the desired one can be specified
2.b Identify a DSD to download from the SDMX Registry (configuration required)
2.c Identify a dataflow linked to the DSD to download from the SDMX Registry (configuration required)
3. Excel parameter file
3. SDMX header (.prop file), only for flat and Excel files
CSV parameters:
4. Mapping
5. CSV quotation
XML parameters for SDMX output formats:
6. SDMX (output) validation

Some fields are mandatory. For example, the DSD must always be present, whether it is loaded from a file or from the Registry. A message appears if mandatory fields are missing.

3.1.1 Example
1. Input and output files
2. Format
3. DSD file or reference

3.1.2 Header

3.1.3 Change mapping

3.1.4 Transcoding

3.2 CLI
Usage: Converter [Options] InputFile OutputFile InputFormat OutputFormat
Converter:
- Windows OS: converter.bat
- Unix OS: converter.sh
Options: -reg, -dsd_file, -dsd_id, -dsd_agency, -dsd_version, -df, -df_id, -df_version, -df_agency, -header_file, -date_format, -level, -mapping_file, -ordered_input, -trans_file, -delimiter, -header_row
For further information, check the User Manual, page 84.

3.2.1 Example in Windows
converter.bat -dsd_file "C:\ProjectsSharp\myOutputs\First_NA_MAIN_DSD.xml" -header_file "C:\ProjectsSharp\myOutputs\Input_DatasHeader.prop" -header_row DISREGARD_COLUMN_HEADER -delimiter ; "C:\ProjectsSharp\myOutputs\Input_Datas.csv" "C:\ProjectsSharp\myOutputs\CLI_Compact.xml" CSV COMPACT_SDMX
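
When the CLI has to be driven from another Java program, the same command line can be assembled as an argument list. The sketch below is an assumption-laden illustration: `ConverterCommandSketch` and `buildCommand` are hypothetical names, and it assumes every option passed in takes exactly one value, which holds for the four options of the example above but is not confirmed here for the others.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that assembles the CLI call shown above.
class ConverterCommandSketch {

    // Order follows the usage line: launcher, options, then
    // InputFile OutputFile InputFormat OutputFormat.
    static List<String> buildCommand(String launcher, Map<String, String> options,
                                     String inputFile, String outputFile,
                                     String inputFormat, String outputFormat) {
        List<String> command = new ArrayList<>();
        command.add(launcher);
        for (Map.Entry<String, String> option : options.entrySet()) {
            command.add(option.getKey());   // e.g. -dsd_file
            command.add(option.getValue()); // assumption: one value per option
        }
        command.add(inputFile);
        command.add(outputFile);
        command.add(inputFormat);
        command.add(outputFormat);
        return command;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>(); // keeps insertion order
        options.put("-dsd_file", "C:\\ProjectsSharp\\myOutputs\\First_NA_MAIN_DSD.xml");
        options.put("-header_file", "C:\\ProjectsSharp\\myOutputs\\Input_DatasHeader.prop");
        options.put("-header_row", "DISREGARD_COLUMN_HEADER");
        options.put("-delimiter", ";");

        List<String> command = buildCommand("converter.bat", options,
                "C:\\ProjectsSharp\\myOutputs\\Input_Datas.csv",
                "C:\\ProjectsSharp\\myOutputs\\CLI_Compact.xml",
                "CSV", "COMPACT_SDMX");

        // Only printed here; the list could be handed to
        // new ProcessBuilder(command).inheritIO().start() to run it.
        System.out.println(String.join(" ", command));
    }
}
```

Passing the arguments as a list avoids manual quoting of paths that contain spaces.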

3.3 Common API
The conversion process comprises two main activities: reading an input data message and writing out the converted data message. In previous releases of this tool, the first activity produced a populated data model based on the SDMX v2.0 information model, and the second activity used that populated data model to write the output message in the required target format. In other words, the data model served as intermediate storage for the parsed data.

That solution had a significant advantage: the data model acted as a common ‘format’ that all readers wrote to and all writers read from. This significantly reduced the number of source and target combinations to implement; only the conversions from each format to the data model ‘format’, and from there back to each format, were needed. On the other hand, the same solution had an apparent drawback: the whole data model had to be held in system memory, which might not be enough for large datasets, even on systems with very large amounts of available memory.

To tackle that obstacle, a redesign of the conversion process was needed. In this version of the conversion tool, readers and writers are ‘plugged’ together, in the sense that all writers implement a ‘Writer’ interface and all readers are capable of making calls to that interface. The data model is still used as an intermediary, but this time only chunks of the model are stored in system memory. Each chunk accounts for only a portion of the complete data message: a populated time series (including all its observations), a sibling group (attribute data only), a dataset (attribute data only, not including its groups and time series), or a message header. As soon as one of those chunks is completely populated, it is sent to the ‘plugged’ writer by calling the appropriate method of the Writer interface, and is then removed from system memory.

The previous version follows a Data Storage Model, while the latest one follows a Streaming Model. Nevertheless, mostly for backwards compatibility, all readers and writers still provide a method implementing the previous solution of using a completely populated data model, which of course carries the aforementioned disadvantage.
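
The streaming design described above can be illustrated with a short Java sketch. It is not the Common API itself: the `ChunkWriter` interface, its method names and the toy row layout (`seriesKey;period;value`, with rows of one series arriving together) are assumptions made for the example, and only header and time series chunks are modelled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the streaming model; names are not the Converter's.
class StreamingConversionSketch {

    // Stand-in for the 'Writer' interface: one call per completed chunk.
    interface ChunkWriter {
        void writeHeader(String messageId);
        void writeSeries(String seriesKey, List<String> observations);
    }

    // Toy writer rendering each chunk as one text line. It collects lines in a
    // list only to keep the demo simple; a real writer would stream them out.
    static class LineWriter implements ChunkWriter {
        final List<String> lines = new ArrayList<>();

        public void writeHeader(String messageId) {
            lines.add("HEADER " + messageId);
        }

        public void writeSeries(String seriesKey, List<String> observations) {
            lines.add(seriesKey + " -> " + String.join(" ", observations));
        }
    }

    // Toy reader: holds only the series currently being populated and hands
    // it to the plugged writer as soon as the next series starts.
    static void read(Iterator<String> rows, String messageId, ChunkWriter writer) {
        writer.writeHeader(messageId);
        String currentKey = null;
        List<String> chunk = new ArrayList<>();
        while (rows.hasNext()) {
            String[] cells = rows.next().split(";", -1);
            if (currentKey != null && !currentKey.equals(cells[0])) {
                writer.writeSeries(currentKey, chunk); // chunk complete: pass it on
                chunk = new ArrayList<>();             // and release it
            }
            currentKey = cells[0];
            chunk.add(cells[1] + "=" + cells[2]);
        }
        if (currentKey != null) {
            writer.writeSeries(currentKey, chunk);     // flush the last series
        }
    }

    public static void main(String[] args) {
        LineWriter writer = new LineWriter();
        List<String> rows = List.of("A.LU;2013;1.0", "A.LU;2014;1.5", "A.FR;2014;2.0");
        read(rows.iterator(), "MSG1", writer);
        writer.lines.forEach(System.out::println);
    }
}
```

Memory use in the reader is bounded by the largest single series rather than by the whole message, which is the point of the redesign.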

3.4 Web Service
A web service also exists for the Converter:
- Based on Java
- Can be installed on Tomcat Server or WebLogic
- A Test Client is provided to test the Web Service conversions (only available for Windows)

3.4.1 WSDL

3.4.2 Web Service client

4. SDMX Converter vs SDMX-RI
SDMX Converter                        | SDMX-RI
Standalone application                | Needs to be installed on a server
File repository                       | Connected to dissemination DB
Generates SDMX files from input files | Generates SDMX files from customized SDMX queries

5. Where to find the SDMX Converter
You can download the latest version of the SDMX Converter on CIRCABC:
https://circabc.europa.eu/w/browse/6b2323b6-0d4e-43dc-bc7e-dd5e9ff958cc
Available packages:
- SDMX Converter Platform Independent
- SDMX Converter Web Service
- SDMX Converter installer for Windows 32-bit
- SDMX Converter Documentation

6. Hands-on exercise
Conversion using the Common API in Java.
- Eclipse Java EE (or equivalent)
- SDMXSource code (www.sdmxsource.org)
- Input file and DSD
