SDMX Reference Infrastructure Eurostat Unit B3 Section: Standardisation and advanced IT for statistics
Presentation overview
- What is SDMX Reference Infrastructure
- Why use SDMX Reference Infrastructure
- Where is the SDMX Reference Infrastructure used
- Development strategy and tentative roadmap

What is SDMX Reference Infrastructure
- Universal framework for modern data provision
- Set of pick-and-choose reusable building blocks allowing a statistical office to expose data to the external world based on access rights
- Designed to provide data and structural metadata based on mappings to each organization's dissemination data warehouse
- Uses SDMX standards, including the one for Web Services

Why use SDMX Reference Infrastructure
- Developed to simplify the exchange of data
- Provides standard software and components, allowing individual statistical organisations to interact and exchange their data using the same software and methodology
- Modular approach: organisations can use part or all of the infrastructure, extend it by adding new modules, or modify it in any other way to suit their own purposes
- Developed in both Java and .NET

Where is the SDMX Reference Infrastructure used
- As of March 2013, deployed in 22 EU countries
- Tested across EU Member States (2011 population census)
- Running in Mexico
- Expressions of interest: Latin America, the Caribbean, OECD and Russia
- Autumn 2013: expected to run in the EU27 Member States
- In the Eurostat dissemination Web Service upgrade

Development strategy

Development strategy 2012 – 2013 (1)
- Architectural changes are needed (two-step approach)
  - Intermediate solution to solve performance problems and out-of-memory errors
  - "Final" solution to provide a common API and implement SDMX 2.1
- Implement new user requests and correct defects
- Widen the scope and usage of SDMX-RI among data providers
  - ESS.VIP programme, such as the ICT project
  - Reuse for other statistical data collections
  - DSWS (Eurostat dissemination web service)

Development strategy 2012 – 2013 (2)
- Architectural changes
- Common SDMX API integration
- Alignment to SDMX 2.1
- Known defects and enhancements
- Tentative release calendar

Architectural changes

Overview
- SDMX-RI First implementation: overview, shortcomings
- SDMX-RI Intermediate solution: overview, rationale, changes, benefits, impact to users
- SDMX-RI “Final” solution
[Diagram: SDMX-RI First Implementation -> (Streaming) -> SDMX-RI Intermediate Solution -> (Common API, SDMX v2.1) -> SDMX-RI “Final” Solution]

1. SDMX-RI First implementation
- Schematic overview
- Overview of different components
- Shortcomings

SDMX-RI overview

Eurostat’s first implementation of SDMX-RI
[Architecture diagram]
- Web Service operations: getGenericData, getCompactData, getCrossSectionalData, queryStructure
- Components: Web Service Provider, Web Service, (1) Structure Retriever, (2) Query Parser, (3) Data Retriever, (4) SDMX Data Generator, (5) SDMX Model
- Data stores: PC-Axis, Mapping Store, Dissemination DB

First implementation shortcomings
- Memory issues with large messages: all data is kept in memory before the response is sent
- Performance: significant degradation for larger datasets and concurrent requests
- No support for SDMX 2.1
- Design shortcomings of the current SDMX Model (5):
  - Tied to the SDMX 2.0 XSD Schema
  - Does not provide an API
  - Requires in-depth knowledge of SDMX

2. SDMX-RI Intermediate solution
- Why an Intermediate solution
- Overview of different components
- What changed
- Benefits
- Impact on SDMX-RI installations

Why an Intermediate solution
- To solve identified problems
  - Decreased performance: increased memory allocation results in long response times
  - “Out Of Memory” errors: increased memory allocation results in the inability to serve large data requests

Eurostat’s intermediate solution of SDMX-RI
[Architecture diagram]
- Web Service operations: getGenericData, getCompactData, getCrossSectionalData, queryStructure
- Components: Web Service Provider, Web Service, (1) Structure Retriever, (3) Data Retriever, (3) Data Retriever (streaming), (5) SDMX Model/IO (revised)
- Data stores: PC-Axis, Mapping Store, Dissemination DB

Intermediate solution: What changed
- Streaming of data in the service
  - Usage of JAX-WS in Java (Axis 1.0 could not support streaming)
- SDMX Model/IO (5) revised with streaming writers
  - The Query Parser (2) is now part of the SDMX Model/IO (5) library
  - The Data Generator (4) functionality is now included in the SDMX Model/IO (5) library
- Data Retriever (3) API changed due to streaming

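The difference between the first implementation (whole response kept in memory) and the streaming approach can be illustrated with a minimal Python sketch. It is not SDMX-RI code: `fetch_observations`, `write_buffered` and `write_streaming` are hypothetical names, and the XML fragments are only loosely modelled on a compact data message.

```python
import io
from typing import Iterable, Iterator, Tuple

Observation = Tuple[str, float]  # (time period, value); illustrative only


def fetch_observations(n: int) -> Iterator[Observation]:
    """Hypothetical stand-in for a database cursor: yields one row at a time."""
    for i in range(n):
        yield (f"{2000 + i}", float(i))


def write_buffered(rows: Iterable[Observation], out: io.TextIOBase) -> int:
    """Buffered style: build every fragment first, then send the message."""
    fragments = [f'<Obs TIME_PERIOD="{t}" OBS_VALUE="{v}"/>' for t, v in rows]
    out.write("<Series>" + "".join(fragments) + "</Series>")
    return len(fragments)  # fragments held in memory at the same time


def write_streaming(rows: Iterable[Observation], out: io.TextIOBase) -> int:
    """Streaming style: each observation is written as soon as it is read."""
    out.write("<Series>")
    held = 0
    for t, v in rows:
        out.write(f'<Obs TIME_PERIOD="{t}" OBS_VALUE="{v}"/>')
        held = 1  # this function only ever holds the current row
    out.write("</Series>")
    return held
```

Both functions produce the same message; in a real service `out` would be the HTTP response stream, so the streaming variant keeps memory use roughly constant regardless of dataset size.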
Intermediate solution: Benefits
- Better performance
  - Improvement of approximately 75% in concurrent-user scenarios
- Solution to “Out Of Memory” problems for large datasets
  - No memory constraints

Intermediate solution: Impact for SDMX-RI installations
- Organizations that have installed the first implementation
  - Only need to re-install the Web Service
- Existing clients of the Web Service are not affected
  - The SDMX 2.0 SOAP interface remains
- Organizations using the SDMX-RI component APIs
  - The APIs have changed due to streaming support
  - Migration will be required
- Organizations that have already modified the source code
  - Will have to make the changes again if they want to use the intermediate solution

3. SDMX-RI “Final” solution
- Why a “Final” solution
- Overview of different components
- What changed
- Benefits
- Impact to SDMX-RI installations

Why a “Final” solution
- Eurostat’s decision for a Common SDMX API
  - Implementation of components covering all aspects of the API
- Support for SDMX 2.1
  - New messages (data representation, queries)
  - New Web Service interfaces (SOAP/REST)

Eurostat’s “Final” solution of SDMX-RI
[Architecture diagram shown next to Eurostat’s intermediate solution of SDMX-RI]
- Web Service operations: getGenericData, getCompactData, getCrossSectionalData, queryStructure
- Components in both diagrams: Web Service Provider, Web Service, (1) Structure Retriever (SR API), (3) Data Retriever (streaming) (DR API)
- Intermediate solution: (5) SDMX Model/IO (revised)
- “Final” solution: (6) Common SDMX API, implemented by (7) SDMX API Implementation
- Data stores: PC-Axis, Mapping Store, Dissemination DB

“Final” solution: What changes (1)
- All modules will be modified to use the SDMX Common API (6)
  - The SDMX Model/IO (5) will no longer be used
  - For Java, the MT API implementation (7) will be used
  - For .NET, the API implementation (7) will be developed
- SDMX-RI component APIs will be changed
  - Due to the SDMX Common API
- SDMX 2.1 messages and new query features will be supported

“Final” solution: What changes (2)
- The Web Service will be extended to support the SDMX 2.1 standardized SOAP and RESTful APIs
  - New Web Service endpoints will be added above the Controller
  - They will co-exist with the SDMX 2.0 endpoint
  - Will support SDMX 2.1 error handling

WS extension to SDMX 2.1 new interfaces
[Architecture diagram]
- Requests: SOAP Request 2.0, SOAP Request 2.1, REST Request 2.1
- Endpoints: NSI_Service_2.0, NSI_Service_2.1, NsiRestService
- Web Service Provider, Controller
- (1) Structure Retriever, (3) Data Retriever (streaming), (6) Common SDMX API
- (7) SDMX API Implementation: (7) SDMX 2.0 Implementation, (7) SDMX 2.1 Implementation

Web Service request sequence
[Sequence diagram]
- Participants: WS Client, :ServiceImpl, :Controller, :DataQueryParseManager, :DataRetriever, CompactWriter (:CompactDataWriterEngine)
- Messages, in the order shown:
  - GetCompact
  - Controller()
  - HandleRequest(request, OutputStream)
  - DataQueryParseManager()
  - buildDataQuery(request), returning a DataQueryBean
  - DataRetriever()
  - getDsdForDataQuery(DataQueryBean), returning a DataStructureBean
  - CompactWriter(OutputStream, DataStructureBean)
  - RetrieveData(DataQueryBean, CompactWriter)
  - Write* {Groups/Series/Obs}

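The call sequence above can be mimicked with a small Python sketch. The class and method names echo the labels in the diagram, but every body, signature and data value here is a hypothetical simplification, not the SDMX-RI or Common SDMX API implementation.

```python
import io
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class DataQueryBean:
    dataflow: str


@dataclass
class DataStructureBean:
    dimensions: List[str]


class DataQueryParseManager:
    def build_data_query(self, request: str) -> DataQueryBean:
        # A real parser reads an SDMX query message; here the request is
        # simply a dataflow identifier.
        return DataQueryBean(dataflow=request.strip())


class CompactWriter:
    """Writes each element to the output stream as soon as it is received."""

    def __init__(self, out: io.TextIOBase, dsd: DataStructureBean) -> None:
        self.out, self.dsd, self._open = out, dsd, False

    def write_series(self, key: Dict[str, str]) -> None:
        self.close()  # end the previous series, if any
        attrs = " ".join(f'{d}="{key[d]}"' for d in self.dsd.dimensions)
        self.out.write(f"<Series {attrs}>")
        self._open = True

    def write_obs(self, period: str, value: float) -> None:
        self.out.write(f'<Obs TIME_PERIOD="{period}" OBS_VALUE="{value}"/>')

    def close(self) -> None:
        if self._open:
            self.out.write("</Series>")
            self._open = False


class DataRetriever:
    # Toy in-memory stand-in for the dissemination database.
    ROWS: List[Tuple[str, str, float]] = [("A", "2011", 1.0), ("A", "2012", 2.0)]

    def get_dsd_for_data_query(self, query: DataQueryBean) -> DataStructureBean:
        return DataStructureBean(dimensions=["FREQ"])

    def retrieve_data(self, query: DataQueryBean, writer: CompactWriter) -> None:
        current: Optional[str] = None
        for freq, period, value in self.ROWS:
            if freq != current:  # a new series key starts a new series
                writer.write_series({"FREQ": freq})
                current = freq
            writer.write_obs(period, value)
        writer.close()


class Controller:
    def handle_request(self, request: str, out: io.TextIOBase) -> None:
        query = DataQueryParseManager().build_data_query(request)
        retriever = DataRetriever()
        dsd = retriever.get_dsd_for_data_query(query)
        writer = CompactWriter(out, dsd)
        retriever.retrieve_data(query, writer)
```

The point of the design is that the Data Retriever never returns a dataset object: it pushes series and observations into the writer, which is bound to the response stream.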
“Final” solution: Benefits
- Usage of the Common SDMX API
  - Interchangeable implementation
  - Foster component reusability
  - Support of data streaming
- Support of SDMX 2.1
  - New query capabilities
  - New message formats
  - Support of RESTful API

Impact for SDMX-RI installations (1)
- Organisations with a Mapping Store in production will have to:
  - Install the new Mapping Assistant
  - Upgrade the Mapping Store automatically within the MA
- Organisations that have a Web Service installation in place will have to:
  - Install the new Web Service package
- Existing clients of the Web Service will not be affected
  - The SDMX 2.0 SOAP interface will remain

Impact for SDMX-RI installations (2)
- Organisations using the SDMX-RI component APIs
  - Migration will be required
  - Migration guidelines will be provided
- Organisations that have already modified the source code
  - Will have to make the changes again using the “Final” solution

Common SDMX API integration

SDMX-RI Components affected
- Data Retriever (DR)
- Structure Retriever (SR)
- Web Service (WS)
- Web Client (WebC)
- Mapping Assistant (MA)
- Test Client (TC)
- Test Auth Config (TAC)
Notes:
- DR: In Java, this activity has already been performed in the context of the PoC. The source code of the PoC will be used for the migration of the DR to the Common API, which makes the task easier than the equivalent one for .NET. In .NET, the migration from the current SDMX Model/IO has to be done from scratch.
- SR:
- NSI WS: The SDMX Model/IO will be removed and the corresponding Common API classes will be used. The new APIs of the DR and SR will also need to be integrated.
- NSI Client: For both Java and .NET, the NSI Client needs to be ported to the Common API, replacing the usage of the current SDMX Model/IO.
- Mapping Assistant: The changes required for integrating the Common API relate to the MA import mechanism. The current model and reading code will be removed, and the Common API will be used in the classes that deal with the insert and update of SDMX artefacts in the Mapping Store.
- Test Client: In Java, N/A, as there is no Java implementation of the Test Client. In .NET, the Test Client needs to be ported to the Common API, replacing the usage of the current SDMX Model/IO.
- Test Auth Config: In Java, N/A, as there is no Java implementation of the Test Auth Config. In .NET, the TAC needs to be ported to the Common API, replacing the usage of the current SDMX Model/IO. The TAC will also have to be modified to use an SdmxBeanRetrievalManager to MSDB implementation instead of the StructureAccess class (common code in .NET SR/DR).

Interactions amongst SDMX-RI Components
[Dependency diagram; components: Data Retriever, Structure Retriever, Test Client, Mapping Store, Web Service, Common SDMX API, Mapping Assistant, Web Client, Test Auth Config; arrow types: Compile-time, Run-time, Usage]
- The solid (red) arrows indicate a compile-time dependency, i.e. what a component requires in order to be compiled and built as software.
- The dashed (green) arrows indicate a run-time dependency, i.e. what a component requires in order to execute and function as intended. For example, the DR needs to know which Mapping Store to use.
- The long dashed (blue) arrows indicate simple usage, i.e. what a component may use but is not essential for it to run. For example, the Test Client may run fine as a standalone software tool, but needs a WS in order to test. Similarly, the Web Client runs as a Web Application, but it will not show any content unless an SDMX Web Service is specified.

SDMX-RI Component Dependencies (1)
- The Common SDMX API is the starting point
  - The Java version is sufficient for integration
  - .NET version development is ongoing
- Adoption of the .NET 4.0 Framework
  - For the Common SDMX API
  - Before integrating into .NET components
- Integration of the Common SDMX API into DR, SR, WS and MA can be done in parallel
- MA activities require the new DR API and Mapping Store updates

SDMX-RI Component Dependencies (2)
- SR activities require an update of the Mapping Store
- Web Client activities require the new WS SOAP/REST interfaces
- Web Client depends on WS, DR and SR in the context of custom requests
- Test Client activities require the new DR API

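One way to read the dependency bullets on these two slides is as a prerequisite graph from which batches of work that can proceed in parallel are derived. The sketch below uses Python's standard `graphlib`; the `REQUIRES` mapping is an interpretation of the bullets, not an official plan, and "Mapping Store update" is treated as a work item of its own.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# work item -> prerequisites, as interpreted from the dependency slides
REQUIRES: Dict[str, Set[str]] = {
    "DR": {"Common SDMX API"},
    "SR": {"Common SDMX API", "Mapping Store update"},
    "WS": {"Common SDMX API"},
    "MA": {"Common SDMX API", "DR", "Mapping Store update"},
    "Web Client": {"WS", "DR", "SR"},
    "Test Client": {"DR"},
}


def parallel_batches(requires: Dict[str, Set[str]]) -> List[List[str]]:
    """Group work items into batches whose prerequisites are all complete."""
    sorter = TopologicalSorter(requires)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Under this reading, the Common SDMX API and the Mapping Store update come first, DR, SR and WS follow in parallel, and MA, Test Client and Web Client come last.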
Alignment to SDMX 2.1

SDMX-RI Components affected
- Data Retriever (DR)
- Structure Retriever (SR)
- Web Service (WS)
- Web Client (WebC)
- Mapping Assistant (MA)
- Test Client (TC)
* Already integrated to the Common SDMX API

Data Retriever activities for SDMX 2.1
- Query for a specific Dimension at observation level and flat format (new Dataset formats per SDMX 2.1)
- Request for Dataset with explicit Measures
- Request details of the Dataset to be returned (Full, DataOnly, SeriesKeyOnly, NoData)
- Specify in the Query the TimeFormat to be matched
- Specify in the Query the PrimaryMeasure value (new SDMX 2.1 operators)
- Number of first N / last N observations to be returned
- New time semantics in the data Query

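Several of these query options surface as query-string parameters in the SDMX 2.1 RESTful interface. The sketch below builds such a data query URL; the parameter names and lower-case `detail` values follow the SDMX 2.1 REST guidelines as understood here, while the function name, base URL and dataflow identifier are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

# REST spellings of the detail levels (Full, DataOnly, SeriesKeyOnly, NoData)
DETAIL_LEVELS = {"full", "dataonly", "serieskeysonly", "nodata"}


def data_query_url(
    base: str,
    flow_ref: str,
    key: str = "all",
    *,
    detail: str = "full",
    dimension_at_observation: Optional[str] = None,
    first_n: Optional[int] = None,
    last_n: Optional[int] = None,
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    """Build a data query URL of the form {base}/data/{flowRef}/{key}?..."""
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    params = {"detail": detail}
    if dimension_at_observation is not None:
        # "AllDimensions" requests the flat format.
        params["dimensionAtObservation"] = dimension_at_observation
    if first_n is not None:
        params["firstNObservations"] = str(first_n)
    if last_n is not None:
        params["lastNObservations"] = str(last_n)
    if start_period is not None:
        params["startPeriod"] = start_period
    if end_period is not None:
        params["endPeriod"] = end_period
    return f"{base.rstrip('/')}/data/{flow_ref}/{key}?{urlencode(params)}"
```

For example, `data_query_url("https://example.org/rest", "DEMO_FLOW", "A.B", detail="dataonly", last_n=1)` asks for only the most recent observation of each matching series, without attributes.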
Structure Retriever activities for SDMX 2.1
- New resolve references features (All, Children, Parents, ParentsAndSiblings, Descendants, SpecificObjects)
- New detail levels for returned Artefacts (Full, Stub, CompleteStub, MatchedItems, CascadeMatchedItems)
- Support returnMatchedArtefact feature
- New detail levels for referenced Artefacts (Full, Stub, CompleteStub)
- Query Structures by new terms and contained Artefacts
- Query for Categorization

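The resolve-references options can be illustrated on a toy artefact graph. The sketch below is a simplified reading of the SDMX 2.1 semantics (children are the artefacts an artefact references, parents are the artefacts that reference it); the graph contents and function names are hypothetical and SpecificObjects is left out.

```python
from typing import Dict, Set

# artefact -> artefacts it references (its children); toy data only
REFERENCES: Dict[str, Set[str]] = {
    "Dataflow:DF": {"DataStructure:DSD"},
    "DataStructure:DSD": {"Codelist:CL_FREQ", "ConceptScheme:CS"},
    "Codelist:CL_FREQ": set(),
    "ConceptScheme:CS": set(),
}


def children(artefact: str) -> Set[str]:
    return set(REFERENCES.get(artefact, set()))


def parents(artefact: str) -> Set[str]:
    return {p for p, refs in REFERENCES.items() if artefact in refs}


def descendants(artefact: str) -> Set[str]:
    """Children, their children, and so on."""
    seen: Set[str] = set()
    stack = [artefact]
    while stack:
        for child in REFERENCES.get(stack.pop(), set()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def parents_and_siblings(artefact: str) -> Set[str]:
    """Parents plus everything else those parents reference."""
    result = parents(artefact)
    for parent in parents(artefact):
        result |= children(parent)
    result.discard(artefact)
    return result


def resolve(artefact: str, mode: str) -> Set[str]:
    """Return the extra artefacts to include for a matched artefact."""
    modes = {
        "Children": children,
        "Parents": parents,
        "Descendants": descendants,
        "ParentsAndSiblings": parents_and_siblings,
        "All": lambda a: parents_and_siblings(a) | descendants(a),
    }
    return modes[mode](artefact)
```

For instance, resolving `Codelist:CL_FREQ` with `ParentsAndSiblings` returns the data structure that uses the codelist together with the concept scheme that the same data structure references.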
Web Service activities for SDMX 2.1
- Support of the SOAP interface according to the SDMX 2.1 Web Service guidelines and standard WSDL
- Support of the RESTful interface according to the SDMX 2.1 Web Service guidelines and standard WADL
- Error handling according to the SDMX 2.1 Web Service guidelines

Web Client activities for SDMX 2.1
- Support of the SOAP interface according to the SDMX 2.1 Web Service guidelines for communication with the WS
- Support of the RESTful interface according to the SDMX 2.1 Web Service guidelines for communication with the WS
- Support of new custom messages in SDMX 2.1 (if possible) or by other means
- Generate SDMX-ML 2.1 Data and Structure Query messages

Mapping Assistant activities for SDMX 2.1
- Change the Mapping Store to support additional information for SDMX 2.1
- Modify the import mechanism for loading SDMX 2.1 artefacts to the revised Mapping Store
- Change the GUI to present the additional information
- Support SDMX 2.1 time semantics
Notes:
“Support SDMX 2.1 time semantics” refers to the support of the time query semantics in the data query, i.e. a time period in the clause is translated to a concrete datetime range, and the query returns all the periods that fall completely within that range. This will mostly affect the DR (query generation for time). However, it will also affect the MA in the time transcoding screen, since all the period types that a dataset may have will need to be transcoded. Currently only one frequency can be transcoded. The time expression field will also need to change.

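The time semantics described in the note can be sketched as follows: each period is translated to concrete start and end dates, and a period matches only if it falls completely within the queried range. This is a minimal illustration, not the DR implementation; it uses dates rather than datetimes, handles only annual (`YYYY`), quarterly (`YYYY-Qn`) and monthly (`YYYY-MM`) periods, and the function names are hypothetical.

```python
import calendar
import re
from datetime import date
from typing import Iterable, List, Tuple


def period_bounds(period: str) -> Tuple[date, date]:
    """Translate a time period into its first and last calendar day."""
    if re.fullmatch(r"\d{4}", period):
        year = int(period)
        return date(year, 1, 1), date(year, 12, 31)
    m = re.fullmatch(r"(\d{4})-Q([1-4])", period)
    if m:
        year, quarter = int(m[1]), int(m[2])
        first_month = 3 * (quarter - 1) + 1
        last_month = first_month + 2
        last_day = calendar.monthrange(year, last_month)[1]
        return date(year, first_month, 1), date(year, last_month, last_day)
    m = re.fullmatch(r"(\d{4})-(0[1-9]|1[0-2])", period)
    if m:
        year, month = int(m[1]), int(m[2])
        return date(year, month, 1), date(year, month, calendar.monthrange(year, month)[1])
    raise ValueError(f"unsupported period format: {period!r}")


def periods_within(periods: Iterable[str], start_period: str, end_period: str) -> List[str]:
    """Keep the periods that fall completely within the queried range."""
    range_start = period_bounds(start_period)[0]
    range_end = period_bounds(end_period)[1]
    return [
        p
        for p in periods
        if range_start <= period_bounds(p)[0] and period_bounds(p)[1] <= range_end
    ]
```

A query from 2010-Q1 to 2010-Q2 therefore matches quarterly and monthly periods in the first half of 2010 but not the annual period 2010, which is why the transcoding has to cover every frequency a dataset may contain.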
Test Client activities for SDMX 2.1
- Support output SDMX-ML 2.1 data formats
- Enhance QueryBuilder to support new SDMX 2.1 Data Query options
- Generate SDMX 2.1 SOAP and REST Data Queries
- Implement REST WS test screen

Known defects & Enhancements

Mapping Assistant/Mapping Store
- No dependency (March 2013 release)
  - Improve the Query Editor in terms of graphical flexibility
  - Store TextFormat information
  - Handling of non-existing local codes in transcoding
  - Enhance the way the Mapping Assistant handles attaching the same Dataflow under more than one Category
  - Implement change and delete functionality for an added column and an added constraint

Mapping Assistant/Mapping Store
- Dependency on Common SDMX API & SDMX 2.1
  - Export-Import of information stored within Mapping Store
  - Support Annotations in Mapping Assistant/Mapping Store
  - Improve information displayed for artefacts

Web Service
- No dependency
  - Export-Import of information stored within Mapping Store
  - Support Annotations in Mapping Assistant/Mapping Store
- Dependency on Common SDMX API & SDMX 2.1
  - Add a configurable limit on the number of cells (observations) served

Web Client
- No dependency
  - Enhance the behavior of the tree displayed on the screen to expand when clicking on a category label
  - Add/verify support for ESTAT’s list of recommended browsers
  - Support ASP.Net 4.0
- Dependency on Common SDMX API & SDMX 2.1
  - Import and SDMX-ML Query
  - Perform stress tests

Test Client
- No dependency
  - Implement a mechanism for using encrypted user id and password
- Dependency on Common SDMX API & SDMX 2.1
  - Implement support for internationalization

Tentative Release Calendar

Milestones
- Mapping Assistant & Mapping Store
  - Updated with some enhancements: February/March 2013
  - Integrated & aligned with SDMX 2.1: June 2013
- Java SDMX-RI Web Service (incl. all components*)
  - Ready: early June 2013
  - Due to Mapping Assistant dependency: end of June 2013
- .NET SDMX-RI Web Service (incl. all components*): June 2013
* Included components:
  - Java: WS, DR, SR, Common SDMX API/Impl., WebC
  - .NET: WS, DR, SR, Common SDMX API/Impl., WebC, TC, TAC

SDMX-RI Development Timeline
[Timeline chart]
- Time axis: Feb 2013, Mar 2013, April 2013, May 2013, June 2013, July 2013, Oct 2013
- Java milestones: Java SDMX-RI integrated, Java SDMX-RI aligned, Java SDMX-RI packaged
- .NET milestones: .NET SDMX-RI integrated, .NET SDMX-RI aligned, .NET SDMX-RI packaged
- MA/MS milestones: MA/MS enhancements, MA/MS integrated, MA/MS aligned
- .NET/Java SDMX-RI enhancements