SDMX Reference Infrastructure Development strategy – work in progress

Presentation transcript:

SDMX Reference Infrastructure Development strategy – work in progress Eurostat Unit B3 – IT and standards for data and metadata exchange

Presentation overview
- What is the SDMX Reference Infrastructure
- Status report
- Development roadmap 2012
- Development strategy (work in progress)

What is the SDMX Reference Infrastructure
- A set of pick-and-choose building blocks allowing a statistical office to expose data to the external world based on access rights
- Developed in both Java and .NET, with a well-defined API
- Provides data and structural metadata based on mappings to an organization's data warehouse
- Conforms to the SDMX Web Service guidelines
- Open source package under the EUPL licence
- Supports Census Hub and similar Eurostat projects
- Allows exposure of data from existing databases using SDMX standards, and enables production of SDMX data from existing reference/dissemination databases

The SDMX-RI (Reference Infrastructure) is a set of pick-and-choose building blocks allowing a statistical office to expose data to the external world based on access rights. Because it uses SDMX standards, including the standard for Web services, it de facto creates a universal framework for modern data provision: a single exit point that acts as the counterpart to the Eurostat Single Entry Point. SDMX-RI is composed of reusable building blocks and is designed to provide data and structural metadata based on mappings to each organization's dissemination data warehouse. An organization can use the entire service infrastructure, extend it by adding new modules, modify some modules, or integrate some modules within its existing dissemination environment. To cover a wider spectrum of environments, SDMX-RI was developed in both Java and .NET. In this way, the dissemination environments of the organisations in the ESS can exchange data using a common standard, increasing the interoperability and visibility of the disseminated data and reducing the burden of distributing data.

SDMX-RI Overview
[Figure: SDMX-RI overview. Labels: Data providing organisation; SDMX-RI user interfaces; Data collecting organisation; Web Client; Mapping Assistant; Web Service; SDMX-RI "under the hood"; Non-SDMX local data; SDMX-formatted data.]

The figure shows the structure of SDMX-RI and its interaction with the provider and the collector. The area on the right (the data collector, e.g. Eurostat) contains the software that is used to "pull" data from a data provider (e.g. an NSI), represented by the area on the left, and more specifically from the provider's dissemination environment. The central part is the SDMX-RI, which exposes the provider's data through SDMX-compatible Web Services. The Mapping Assistant maps data in the provider's particular dissemination environment to the SDMX data description, allowing generic SDMX requests to be translated into extractions from that environment.

Status report as of June 2012 (1)
Countries that have installed the SDMX-RI: Austria, Belgium, Bulgaria, Cyprus, Finland, France*, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Mexico, Netherlands, Norway, Poland, Portugal, Slovenia, Spain and Sweden.
* Selected building blocks
Countries expressing interest or working on it: Denmark, Estonia, Greece, Iceland, Liechtenstein, Slovakia, Switzerland, Romania, Russia and the United Kingdom.

Development Roadmap 2012 (1)
- Analyse how "components" of other SDMX frameworks can be integrated.
- Provide a common API based on the SDMX-IM.
- Support the SDMX 2.1 Technical Specification.
- Test and improve performance to meet service level objectives and limit expensive design rework.

Documentation is essential to support the use of SDMX-RI. It is proposed to assess the current state of the documentation and suggest possible improvements and enhancements.

The rising number of organisations involved calls for features to be implemented or refactored in a way that satisfies user requirements and the business vision for SDMX, which envisages a "data sharing" model:

1) Analyse how the "Manager" and "Client/Loader" components of the ISTAT SDMX Framework used for the EGR (EuroGroups Register) can be integrated.

2) Provide a common API based on the SDMX-IM:
- Harmonisation of the MT model with ESTAT's
- The MT model implementation supports v2.1
- SR and DR components already migrated to the common API as a proof of concept

3) Migrate to SDMX 2.1. Why migrate?
Issues and limitations of SDMX v2.0:
- No explicit specification of the Web Service API
- Missing attribute for referencing Concepts in the DSD
- Limitations in querying data and structures
- Non-time-series data supported only through Cross Sectional messages
Benefits of SDMX v2.1:
- Explicit WSDL for SOAP services
- New RESTful API
- New query capabilities
- Time semantics in queries explicitly defined
- Complete definition of artefacts
- A single dataset model for time series and non-time series
Making the SRI v2.1 compliant takes more than integrating the common model:
- New features of 2.1 (e.g. dimension at observation)
- New interfaces (v2.1 WSDL, RESTful API)
- Making SDMX-RI compliant with a RESTful API

Study how to further generalise the current implementation of the SDMX-RI so that it can be used for national purposes, such as a single dissemination exit point, and in other domains, with the added advantage of using SDMX standards recognized at international level.

Development Roadmap 2012 (2)
Security guidelines for SDMX-RI modules and the Mapping Assistant:
- Authentication: the need to confirm the identity of the person using the service
- Authorization: the need to determine the access rights of the person using the service
- Confidentiality: the need to prevent disclosure to unauthorized parties
- Integrity: the need to prevent illegal modifications of data

A strong need has emerged for security guidelines covering the SDMX-RI modules and the Mapping Assistant. Important security-related questions, such as authentication and authorization, have so far been left out of SDMX-RI. Several members of the SDMX-RI community have expressed the wish to provide recommendations on how best to handle the security issues affecting the SDMX-RI modules and the Mapping Assistant in the four areas listed above.

Development Roadmap 2012 (3)
To improve quality and communication with organisations and Member States, an SDMX-RI User Group will be created:
- Identification of those interested in participating actively in the User Group, taking account of recent IT developments and experience in the different SDMX projects.
- Eurostat will provide countries with the technical specifications and guidelines needed to participate in the SDMX-RI User Group.
- Eurostat will organise technical assistance: missions and meetings.

Development strategy – work in progress
- Current solution: overview, issues
- Intermediate solution: overview, rationale, changes, benefits, impact on users
- Final solution: activities towards the final solution

Current implementation
- Schematic overview
- Components view
- Issues

Schematic overview
The Mapping Assistant, Test Client and TestAuthConfig are desktop applications that are used off-line to configure various aspects of the SRI NSI Web Service.

NSI Web Service: the Web Service that offers an interface, following the SDMX guidelines for Web Services, for querying data and structural metadata.
Mapping Assistant: the application used to create the mappings between the SDMX structures and the local storage (DDB and PC-Axis files).
Test Client: used by the administrator or developer in the organisation. It targets only the power users in the organisation and is mostly for testing the infrastructure.
TestAuthConfig: used by the administrator to grant access rights (authentication/authorisation) to the users of the Web Service. This information is stored in the Auth/Auth database.
Dissemination database: the final storage data warehouse maintained by each NSI. It stores data that can be published to potential Data Consumers.
PC-Axis files: the PC-Axis dissemination environment file format (also known as px-files). A custom driver has been implemented that loads the data from px-files into a temporary DB so that it can be queried by the Data Retriever.
Mapping Store: the artefact that keeps the mappings between the SDMX structural metadata and the native format (a file or a DB schema). In other words, it holds the mappings between an SDMX Data Structure Definition (DSD) and a DB schema (dissemination database) or a set of dissemination data files (PC-Axis files). The mappings themselves are created and edited off-line with the Mapping Assistant.
NSI Web Client: a Web application that offers users a graphical web interface for formulating queries to the Web Service and getting the data in SDMX-ML. It can also export the data in other formats (e.g. CSV, Gesmes/TS, Excel).

SDMX RI Development Roadmap 2012, unit B-3

Components view
[Figure: components of the NSI Web Service. Web Service Provider (operations getGenericData, getCompactData, getCrossSectionalData, queryStructure); (1) Structure Retriever; (2) Query Parser; (3) Data Retriever; (4) SDMX Data Generator; (5) SDMX Model/IO; data sources: PC-Axis, Mapping Store, Dissemination DB.]

Web Service Provider: this module is responsible for exposing the data using a Web Service interface that provides SDMX-ML messages. It follows the SDMX v2.0 WS Guidelines.
SDMX Query Parser: this module is responsible for getting the request from the Web Service Provider and populating the internal data model (i.e. the SDMX data model) with the query received in the request.
Data Retriever: this module is responsible for querying the dissemination database, getting the respective record set and populating the SDMX data model with the data retrieved, which is then returned.
SDMX-ML Data Generator: this module is responsible for generating an SDMX-ML Dataset message upon receiving the DSD and the SDMX model objects containing the data retrieved.
SDMX Model/IO: this library contains objects for storing data and metadata based on the SDMX information model. It also provides methods for reading and writing SDMX-ML messages. It is already used in several Eurostat components.

Issues in the current implementation
- Memory issues with large messages: all data is kept in memory before the response is sent
- Performance: significant decrease in performance for larger datasets and concurrent requests
- No support for SDMX 2.1
- Shortcomings in the current SDMX model design: tied to SDMX Schema v2.0; provides no interfaces; serves only as an SDMX information placeholder, with no utility methods

Current SDMX model design shortcomings:

Tied to SDMX Schema v2.0. The current SDMX model was designed using the SDMX Schema v2.0 and the SDMX Information Model. More specifically, it takes from the SDMX IM only the inheritance of the artefacts (e.g. the abstract classes Identifiable, Maintainable, ItemScheme), since the class hierarchy was not present in the v2.0 XSDs. The rest of the classes were based on the XSDs, so their design is close to the SDMX v2.0 XSDs. For this reason, it is difficult to make them independent of the SDMX version in the future.

No programmatic interfaces. The SDMX model contains only a set of specific classes (beans) for keeping the information. To be integrated in various applications, those classes have to be used explicitly, because there are no interfaces for them. Having interfaces for the whole SDMX model would allow client programs to base their logic on the interface and not on its implementation. This would make it possible to switch between several implementations of the SDMX model without affecting the client program logic. For instance, a different CodelistBean implementation could be provided for large Codelists that stores them in a file in order to minimize memory requirements.

Useful only as an SDMX information placeholder; no utility methods. In most cases the SDMX classes serve as a mere placeholder for the information they represent: they provide only getters and setters. The model does not provide utility methods that would be handy when a client class needs to extract more refined information from a bean. With utility methods, the client code would be shorter and clearer. For instance, a utility method for the DSD could return all the artefact references used in the DSD, i.e. one call that gets the "id, agency, version" of all the ConceptSchemes and Codelists used within the DSD. Without such a method, the developer has to browse all the components of the DSD, using the appropriate getters and gathering the information on the referenced artefacts.

Common SDMX API, getting the references of the KeyFamily bean:

    Set<CrossReferenceBean> crossReferences = keyFamilyBean.getCrossReferences();

In the current model this is not possible. The user can only instantiate the bean and use the getters and setters to browse through the components and other information. There are interfaces to plug-in implementations, but there are no domain objects and no high-level utility methods.

Current Eurostat SDMX model:

    // To get the references, iterate over the following lists and find, for each
    // component, the referenced ConceptScheme and Codelist, taking care not to
    // collect the same Codelist twice.
    List dimensions = keyFamilyBean.getDimensions();
    PrimaryMeasureBean primaryMeasure = keyFamilyBean.getPrimaryMeasure();
    List attributes = keyFamilyBean.getAttributes();
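The "one call instead of browsing every component" idea can be sketched as follows. This is only an illustration under stated assumptions, not the common SDMX API: `SketchComponent`, `SketchDataStructure` and the string form of a reference are hypothetical stand-ins, and only the method name `getCrossReferences` mirrors the call quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical DSD component (dimension, attribute or measure).
final class SketchComponent {
    final String id;
    final String conceptSchemeRef; // e.g. "AGENCY:ID(VERSION)", illustrative format
    final String codelistRef;      // null when the component is not coded

    SketchComponent(String id, String conceptSchemeRef, String codelistRef) {
        this.id = id;
        this.conceptSchemeRef = conceptSchemeRef;
        this.codelistRef = codelistRef;
    }
}

// Hypothetical stand-in for a DSD (KeyFamily) bean.
final class SketchDataStructure {
    private final List<SketchComponent> dimensions;
    private final List<SketchComponent> attributes;
    private final SketchComponent primaryMeasure;

    SketchDataStructure(List<SketchComponent> dimensions,
                        List<SketchComponent> attributes,
                        SketchComponent primaryMeasure) {
        this.dimensions = new ArrayList<>(dimensions);
        this.attributes = new ArrayList<>(attributes);
        this.primaryMeasure = primaryMeasure;
    }

    // Utility method: a single call returns every referenced artefact once.
    // The Set removes duplicates, so a Codelist shared by two components
    // is reported only one time.
    Set<String> getCrossReferences() {
        Set<String> refs = new LinkedHashSet<>();
        List<SketchComponent> all = new ArrayList<>(dimensions);
        all.addAll(attributes);
        all.add(primaryMeasure);
        for (SketchComponent c : all) {
            refs.add(c.conceptSchemeRef);
            if (c.codelistRef != null) {
                refs.add(c.codelistRef);
            }
        }
        return refs;
    }
}
```

Keeping the traversal inside the bean means every client gets the same de-duplication logic for free, which is the "less and clearer client code" argument made in the notes.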

Intermediate solution
- Why an intermediate solution
- Components view
- What changed
- Benefits
- Impact

Why an intermediate solution
- Performance problems needed to be solved.
- High memory consumption and long response times kept organizations from putting the SDMX-RI into production.
- High memory consumption could lead to Out Of Memory errors.
- It was impossible to serve large data requests.

Components view
[Figure: components of the NSI Web Service in the intermediate solution. Web Service Provider (operations getGenericData, getCompactData, getCrossSectionalData, queryStructure); (1) Structure Retriever; (3) Data Retriever (streaming); (5) SDMX Model/IO (revised); data sources: PC-Axis, Mapping Store, Dissemination DB.]

Web Service Provider: this module is responsible for exposing the data using a Web Service interface that provides SDMX-ML messages. It follows the SDMX v2.0 WS Guidelines.

Data Retriever: this module is responsible for querying the dissemination database and getting the respective record set, which is then streamed to the caller. The DR is given the query to process and a Streaming Writer that depends on the type of the message (Generic, Compact or XS). Compared with the previous version, the DR API changed because of streaming. Initially, the Dataset object from the SDMX model was used to return all the data of the request, and that object was then passed to the DG to be written into an SDMX-ML file. In the intermediate solution the DR no longer returns anything: it takes the query and a streaming writer object and uses them to stream the data to the destination and in the format specified by the caller.

SDMX Model/IO: this library contains objects for storing data and metadata based on the SDMX information model. It also provides methods for reading and writing SDMX-ML messages. Compared with the previous version, the intermediate solution contains streaming writers for the SDMX-ML Datasets (Generic, Compact, XS) that are used in the WS and the DR for streaming data to the client. The SDMX Model/IO is already used in several Eurostat components. In the context of the SRI Web Service, the following features are used:
- reading the SDMX-ML Data Query
- reading and writing the SDMX-ML RegistryInterface messages (only for the query of structures)
- streaming writers of the SDMX-ML Datasets

SDMX Query Parser: this module was removed from the SRI architecture because in the intermediate solution its functionality is included in the SDMX Model/IO. The class and API of the Query Parser did not change.

Data Generator: this module was removed from the SRI architecture because it is obsolete in the intermediate solution. The SDMX-ML is now written by the Data Streaming Writers that reside in the SDMX Model/IO.
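The change in the shape of the Data Retriever API, from "return a full dataset" to "push records into a writer", can be sketched as below. All names here (`ObservationSink`, `BufferingRetriever`, `StreamingRetriever`) are hypothetical, and an `Iterator` of string pairs stands in for the record set read from the dissemination database; the actual DR and writer interfaces are not shown in the slides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical callback that receives observations one at a time.
interface ObservationSink {
    void writeObservation(String period, String value);
    void close();
}

// Previous style: collect every record and return the whole result.
// Memory use grows with the size of the answer.
final class BufferingRetriever {
    List<String[]> retrieve(Iterator<String[]> records) {
        List<String[]> all = new ArrayList<>();
        while (records.hasNext()) {
            all.add(records.next());
        }
        return all;
    }
}

// Intermediate style: nothing is returned. Each record is handed to the
// sink as soon as it is read, so only one record is held at a time.
final class StreamingRetriever {
    void retrieve(Iterator<String[]> records, ObservationSink sink) {
        try {
            while (records.hasNext()) {
                String[] record = records.next();
                sink.writeObservation(record[0], record[1]);
            }
        } finally {
            sink.close(); // always release the output, even on failure
        }
    }
}
```

Because the caller chooses the sink, the same retrieval loop can serve Generic, Compact or Cross Sectional output without knowing which format is being produced.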

What changed?
- Streaming of data in the service (use of JAX-WS in Java)
- SDMX IO (5) revised with Streaming Writers
- DR API (3) changes due to streaming
- QP (2) and DG (4) functionality is now included in the SDMX IO (5) library

Additional technical information on the changes:

Before streaming, the Service stored the SOAP response payload in DOM elements. The Java implementation was based on Axis1 before moving to the JAX-WS API.

The SDMX IO library was revised to include streaming writers for each supported SDMX-ML dataset message (Generic, Compact, XS). The difference is that the data retrieved from the DDB is no longer stored in SDMX model objects at all. As the information is read from the DDB record by record, the appropriate writer call is used to write series and observations. The calls for writing a series with two observations are:

    writer.writeHeader(header);
    writer.startSeries();
    writer.writeSeriesKeyValue("ADJUSTMENT", "W");
    writer.writeSeriesKeyValue("FREQ", "Q");
    writer.writeSeriesKeyValue("REF_AREA", "IT");
    writer.writeSeriesKeyValue("STS_ACTIVITY", "NS0020");
    writer.writeSeriesKeyValue("STS_BASE_YEAR", "2000");
    writer.writeSeriesKeyValue("STS_INDICATOR", "PROD");
    writer.writeSeriesKeyValue("STS_INSTITUTION", "1");
    writer.writeAttributeValue("AVAILABILITY", "B");
    writer.writeAttributeValue("COLLECTION", "A");
    writer.writeAttributeValue("TIME_FORMAT", "P1M");
    writer.writeObservation("2005-01", "1.51");
    writer.writeAttributeValue("OBS_CONF", "F");
    writer.writeAttributeValue("OBS_STATUS", "A");
    writer.writeObservation("2006-03", "1.52");
    writer.close();

The DR API was changed because of streaming. Before the intermediate solution, the Dataset object from the SDMX model was used to return all the data of the request, and that object was then passed to the DG to be written into an SDMX-ML file. In the intermediate solution the DR no longer returns anything: it takes the query and a streaming writer object and uses them to stream the data to the destination and in the format specified by the caller.

The QP functionality always existed in the SDMX IO, so the QP was only a logical separation of modules. The same functionality now exists only in the SDMX IO. The DG is obsolete, since it has been replaced by the streaming writers in the SDMX IO.
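A writer with the call sequence shown above could be built on the standard StAX API (`javax.xml.stream`), which writes XML to the output as each call arrives instead of building a document in memory. The sketch below is an assumption about how such a writer might work, not the SDMX IO implementation: the class name `SketchSeriesWriter` is hypothetical, the header call is omitted, and the element names are simplified and do not form a valid SDMX-ML message.

```java
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Simplified compact-style streaming writer (illustrative element names).
final class SketchSeriesWriter {
    private final XMLStreamWriter xml;
    private boolean seriesOpen;
    private boolean obsOpen;

    SketchSeriesWriter(Writer out) throws XMLStreamException {
        xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("DataSet");
    }

    void startSeries() throws XMLStreamException {
        endOpenElements();
        xml.writeStartElement("Series");
        seriesOpen = true;
    }

    // Key values become XML attributes of the Series element, which must
    // still be the most recently opened element when this is called.
    void writeSeriesKeyValue(String concept, String value) throws XMLStreamException {
        xml.writeAttribute(concept, value);
    }

    // Attaches to whichever element was opened last: the Series before the
    // first observation, or the current Obs afterwards.
    void writeAttributeValue(String concept, String value) throws XMLStreamException {
        xml.writeAttribute(concept, value);
    }

    void writeObservation(String period, String value) throws XMLStreamException {
        if (obsOpen) {
            xml.writeEndElement(); // close the previous Obs
        }
        xml.writeStartElement("Obs");
        xml.writeAttribute("TIME_PERIOD", period);
        xml.writeAttribute("OBS_VALUE", value);
        obsOpen = true;
    }

    void close() throws XMLStreamException {
        endOpenElements();
        xml.writeEndElement(); // DataSet
        xml.flush();
        xml.close();
    }

    private void endOpenElements() throws XMLStreamException {
        if (obsOpen) {
            xml.writeEndElement();
            obsOpen = false;
        }
        if (seriesOpen) {
            xml.writeEndElement();
            seriesOpen = false;
        }
    }
}
```

The writer holds only two boolean flags as state, so its memory footprint does not depend on how many series or observations pass through it.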

Benefits
- Better performance: improvement of up to 75%
- Solution to OutOfMemory problems for large datasets
- No memory constraints: data are streamed to the client

On the .NET platform, a 74.77% improvement was observed for a query returning 200k observations (message size ~22 MB) with 5 concurrent users making the same request: the response time fell from 36804 ms to 9285 ms. On the Java platform, an 85.44% improvement was observed for a query returning 200k observations (message size ~22 MB) with 5 concurrent users making the same request: the response time fell from 288849 ms to 42051 ms.
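The percentages quoted above are consistent with the relative reduction in response time, (before - after) / before. A small check, with a hypothetical helper name:

```java
// Relative reduction in response time, as a percentage of the "before" value.
final class ResponseTimeImprovement {
    static double percent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }
}
```

With the figures from the slide, `percent(36804, 9285)` is approximately 74.77 (.NET) and `percent(288849, 42051)` is approximately 85.44 (Java).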

Impact
- Organizations that have installed the current solution: only need to re-install the Web Service.
- Existing users of the Web Service: not affected; the same v2.0 interface will remain.
- Users of the components API: the API has changed due to streaming support, so migration will be required.
- Users who have already modified the source code: will have to make their changes again if they want to use the new version.

Final solution
- Why a final solution
- Components view
- What will be changed
- Benefits
- Impact
- Activities required

Why a final solution?
- Use of the common SDMX API: fosters reusability of components between organisations
- Support for SDMX 2.1: new messages (data representation, queries) and new interfaces (SOAP/REST)

The common SDMX API is intended to be used in a wider scope, i.e. in all organizations that use SDMX and not only within ESTAT's modules. Benefits of a common SDMX API:
- It allows interchangeable API implementations.
- It ensures reusability of common building blocks, such as the reading and writing of SDMX-ML messages. New SDMX-ML building blocks will accept beans from the new API, which can be integrated automatically into other systems.
- It is independent of the SDMX version, because it has been designed solely on the Information Model and not on the schemas. For the same reason, it will be easier to move to a new version if there is one in the future. When a new message is introduced, it will be supported without any impact on user programs, because they depend only on the API. A new message will only require a new implementation of the Reader and Writer interfaces.
- It ensures clearer code in the client programs. It will be easier for developers to use because it hides the complexity of the SDMX messages.

Components view

[Diagram: NSI Web Service Provider with the Web Service operations getGenericData, getCompactData, getCrossSectionalData and queryStructure; (1) Structure Retriever; (3) Data Retriever (Streaming); (6) Common SDMX API; (7) SDMX API Implementation; Mapping Store; PC-Axis; Dissemination DB]

Mapping Store
This artefact keeps the mappings between the SDMX structural metadata and the native format (a file or a DB schema). The mappings are created and edited off-line with the Mapping Assistant. In other words, the Mapping Store holds the mappings between an SDMX Data Structure Definition (DSD) and a DB schema (dissemination database) or a set of dissemination data files (PC-Axis files). It maps the DB schema of the database to the SDMX DSD.

Dissemination database
This is the final storage data warehouse maintained by each NSI. It stores data that can be published to potential Data Consumers.

PC-Axis files
This is the file format of the PC-Axis dissemination environment (also known as px-files). A custom driver has been implemented that loads the data from px-files into a temporary DB so that it can be queried by the Data Retriever.

Web Service Provider
This module exposes the data through a Web Service interface that provides SDMX-ML messages. It follows the SDMX v2.0 WS Guidelines.

Data Retriever
This module queries the dissemination database and gets the respective “record set”, which is then streamed to the caller. The DR is given the query to process and a Streaming Writer that depends on the type of the message (i.e. Generic, Compact, XS). Compared with the intermediate solution, the DR now uses the common SDMX API instead of Eurostat’s SDMX Model/IO. The features and the design are similar; however, the DR API changes because of the use of the Common SDMX API.

Common SDMX API
This is an API that provides interfaces of objects for storing data and metadata based on the SDMX information model. It also provides interfaces of methods for reading from and writing to SDMX-ML messages. It is the common SDMX API that is intended to be used across organisations in order to foster reusability of components. In the context of the SRI Web Service, the following features are used:
- reading the SDMX-ML Data Query
- reading and writing the SDMX-ML RegistryInterface messages (only for the query of structures)
- streaming writers of the SDMX-ML Datasets

It has replaced Eurostat’s Model/IO from the Intermediate solution. It provides interfaces for the beans, so the logic and the interfaces of a module are based on the interface and not on the implementation. The bean implementations can be plugged in to a module (e.g. using Spring), therefore implementations of the common API are interchangeable. Moreover, it offers domain beans: beans whose information can only be read, not modified (i.e. they have only getter methods). In this way, the domain beans can be passed to other modules and the external modules will not be able to change their state. The beans that allow their state to be changed are called mutable, i.e. they provide setters in their interface. Below are examples of domain and mutable beans for the KeyFamily (DSD).

Domain (immutable) beans:
    interface: com.metadatatechnology.sdmx.api.model.beans.datastructure.KeyFamilyBean
    implementing class: com.metadatatechnology.sdmx.sdmxbeans.model.beans.datastructure.KeyFamilyBeanImpl

Mutable beans:
    interface: com.metadatatechnology.sdmx.api.model.mutable.datastructure.KeyFamilyMutableBean
    implementing class: com.metadatatechnology.sdmx.sdmxbeans.model.mutable.datastructure.KeyFamilyMutableBeanImpl

Getting a domain bean from a mutable one, to be given to another module:
    keyFamilyBean = kfMutableBean.getImmutableInstance();

The common API offers utility methods (as already indicated in the previous slide) that are not offered in the current model. For example, getting the references of the KeyFamily bean:
    Set<CrossReferenceBean> crossReferences = keyFamilyBean.getCrossReferences();

In the current model, none of the above examples is possible. The user can only instantiate the bean and use getters and setters to browse through the components and other information. There are interfaces for plugging in implementations, but there are no domain objects and no high-level utility methods.
    KeyFamilyBean keyFamilyBean = new KeyFamilyBean();
    // To get the references, iterate the following lists and look up the
    // ConceptScheme and Codelist referenced by each component, taking care
    // not to collect the same Codelist twice.
    List dimensions = keyFamilyBean.getDimensions();
    PrimaryMeasureBean primaryMeasure = keyFamilyBean.getPrimaryMeasure();
    List attributes = keyFamilyBean.getAttributes();

SDMX API implementation
The common SDMX API (6) provides the interfaces of the functionality to be used by other modules but not the actual functionality. Therefore an implementation of the SDMX API is needed, i.e. the “SDMX API implementation” (7) in the diagram. The SRI will use the MT implementation of the common API, developed in Java. An implementation for the .NET technology is pending. The SDMX API implementation is used to construct the actual objects that follow the API interfaces. In a more advanced setup, the implementation classes can be injected into the modules' code by using “inversion of control”, as in the Spring framework, making it easier to change the implementations.

The term interfaces used above refers to the programming language concept of interfaces. The interfaces are only empty method signatures that constitute the specifications of the classes and methods. In other words, they are the contract that lets the modules integrate with each other.

The term functionality used above refers to the functionality offered by the common SDMX API, i.e. beans that keep the SDMX information and ensure reading/writing of SDMX-ML messages.

The term implementation used above refers to the classes that implement the interfaces of the SDMX Common API, that is, the classes that provide the actual functionality specified by the interfaces. Put more simply, it is the module that fills in the empty methods of the interfaces with code.

The term modules used above refers to the SRI modules that use the common API, i.e. SR, WS, DR. It also covers any modules in other applications from Eurostat or other organisations that will be able to take advantage of the common API.

One of the advantages of using interfaces is that different implementations can be used. For instance, if ESTAT makes its own implementation in the future, it can take the place of the MT implementation. The reasons for doing that may be technical, e.g. implementing without Spring because of constraints in the organisation's environment, or possibly improving performance. However, it might not be necessary to replace the whole MT implementation. ESTAT could extend it by adding alternative implementations of specific parts: for instance, an additional implementation of the CodelistBean that could handle huge codelists without storing them in memory, or additional implementations of the reader and writer interfaces to support a new format, e.g. FLR, Google DSPL etc.

Eurostat Unit B3 – IT and standards for data and metadata exchange
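The split between domain (read-only) and mutable beans described in the notes above can be illustrated with a minimal Java sketch. All type names here (DsdBean, DsdMutableBean and their implementations) are hypothetical, simplified stand-ins and are not the interfaces of the common SDMX API; only the method name getImmutableInstance is borrowed from the notes. The sketch assumes the immutable instance is a snapshot that later edits to the mutable bean do not affect.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only "domain" view: getters only.
interface DsdBean {
    String getId();
    List<String> getDimensionIds();
}

// Hypothetical editable view: getters and setters.
interface DsdMutableBean {
    String getId();
    void setId(String id);
    void addDimensionId(String dimensionId);
    DsdBean getImmutableInstance(); // snapshot to hand to other modules
}

final class DsdBeanImpl implements DsdBean {
    private final String id;
    private final List<String> dimensionIds;

    DsdBeanImpl(String id, List<String> dimensionIds) {
        this.id = id;
        // Defensive copy, wrapped so that callers cannot modify it.
        this.dimensionIds = Collections.unmodifiableList(new ArrayList<>(dimensionIds));
    }

    public String getId() { return id; }
    public List<String> getDimensionIds() { return dimensionIds; }
}

final class DsdMutableBeanImpl implements DsdMutableBean {
    private String id;
    private final List<String> dimensionIds = new ArrayList<>();

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public void addDimensionId(String dimensionId) { dimensionIds.add(dimensionId); }
    public DsdBean getImmutableInstance() { return new DsdBeanImpl(id, dimensionIds); }
}

class BeanSketch {
    public static void main(String[] args) {
        DsdMutableBean mutable = new DsdMutableBeanImpl();
        mutable.setId("DSD_DEMO");
        mutable.addDimensionId("FREQ");

        // Other modules receive only the read-only interface.
        DsdBean domain = mutable.getImmutableInstance();
        mutable.addDimensionId("REF_AREA"); // the snapshot is not affected

        System.out.println(domain.getId() + " " + domain.getDimensionIds());
    }
}
```

Because callers depend only on the DsdBean and DsdMutableBean interfaces, a different implementation class could be substituted (by hand or through a dependency injection container) without changing the calling code.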

What will be changed?

- All modules migrated to the common SDMX API (6)
  - Replaces SDMX Model/IO (5)
  - MT implementation (7) for SDMX API
- SRI Components APIs will be changed
  - Due to common SDMX API
- Support of SDMX 2.1 messages and new query features
- Extend to support v2.1 standardised SOAP and RESTful APIs
  - Add new Web Service end point above Controller
  - Co-existence with v2.0 service
  - Support of v2.1 error codes

SDMX RI Development Roadmap 2012, unit B-3

WS extension to SDMX v2.1: new interfaces

[Diagram: SOAP Request v2.0, SOAP Request v2.1 and REST Request v2.1; Web Service Provider containing NSI_Service_2.0, NSI_Service_2.1, NsiRestService and the Controller; (1) Structure Retriever; (3) Data Retriever (streaming); (6) Common SDMX API; (7) SDMX API Implementation]

Web Service Provider
This module exposes the data through a Web Service interface that provides SDMX-ML messages. It offers 3 Web Service interfaces: SOAP SDMX v2.0, SOAP v2.1, REST v2.1.

NSI_Service_2.0
It is a module of the Web Service Provider component. It implements the Web Service SOAP interface according to the SDMX v2.0 Web Service guidelines. It serves such requests by passing them to the Controller.

NSI_Service_2.1
It is a module of the Web Service Provider component. It implements the Web Service SOAP interface according to the SDMX v2.1 Web Service guidelines (SDMX v2.1 provides a standardised WSDL). It serves such requests by passing them to the Controller.

NsiRestService
It is a module of the Web Service Provider component. It implements the Web Service RESTful API according to the SDMX v2.1 Web Service guidelines. It serves such requests by passing them to the Controller.

Controller
It is a module of the Web Service Provider component that holds all the logic of the Web Service Provider. It coordinates the calls to the rest of the modules (SR, DR, common SDMX API readers/writers) in order to carry out the request, so that the result is streamed back to the interface that was called (the v2.0, v2.1 or REST service).

Data Retriever
This module queries the dissemination database and gets the respective record set, which is then streamed to the caller. The DR is given the query to process and a Streaming Writer that depends on the type of the message (i.e. Generic, Compact, XS).

Common SDMX API
This is an API that provides interfaces of objects for storing data and metadata based on the SDMX information model. It also provides interfaces of methods for reading from and writing to SDMX-ML messages. It is the common SDMX API that is intended to be used across organisations in order to foster reusability of components. From this API the SRI Web Service uses the reading of the SDMX-ML Data Query, the reading and writing of the SDMX-ML RegistryInterface messages for structure query requests/responses, and the streaming writers of SDMX-ML Datasets.

SDMX API implementation
This is the implementation of the Common API used in the context of the SRI. The modules depend on the API, which provides the interfaces; however, they must use an implementation of the API that provides the actual functionality. The MT implementation for Java will be used in the SRI. An implementation for .NET is still pending.
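The arrangement in the notes above, where several Web Service interfaces delegate to one Controller that hands the Data Retriever a message-type-specific streaming writer, could be sketched as follows. This is only an illustration under stated assumptions: every class, method and output format below is hypothetical and not taken from the SRI code base, the in-memory retriever stands in for the dissemination database, and the choice of the Generic format for the REST call is made up for the example.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical message types matching the three dataset formats in the notes.
enum MessageType { GENERIC, COMPACT, CROSS_SECTIONAL }

// Hypothetical writer: receives one observation at a time, so nothing is buffered.
interface StreamingWriter {
    void writeObservation(String seriesKey, String period, String value) throws IOException;
}

// Hypothetical retriever: pushes each record to the writer as soon as it is read.
interface DataRetriever {
    void retrieve(String query, StreamingWriter writer) throws IOException;
}

// Single place for the logic shared by all Web Service interfaces.
final class Controller {
    private final DataRetriever dataRetriever;

    Controller(DataRetriever dataRetriever) { this.dataRetriever = dataRetriever; }

    void handleDataRequest(String query, MessageType type, Writer out) throws IOException {
        // Stand-in for a real SDMX-ML writer: one text line per observation.
        StreamingWriter writer = (key, period, value) ->
                out.write(type + ";" + key + ";" + period + ";" + value + "\n");
        dataRetriever.retrieve(query, writer);
        out.flush();
    }
}

// Thin endpoints: they only translate the call and delegate to the Controller.
final class SoapV20Endpoint {
    private final Controller controller;
    SoapV20Endpoint(Controller controller) { this.controller = controller; }

    void getCompactData(String query, Writer out) throws IOException {
        controller.handleDataRequest(query, MessageType.COMPACT, out);
    }
}

final class RestV21Endpoint {
    private final Controller controller;
    RestV21Endpoint(Controller controller) { this.controller = controller; }

    void get(String path, Writer out) throws IOException {
        // Assumption for this sketch: REST data requests are answered in Generic format.
        controller.handleDataRequest(path, MessageType.GENERIC, out);
    }
}

class ControllerSketch {
    static String run() {
        // In-memory retriever used in place of a database query.
        DataRetriever inMemory = (query, writer) -> {
            writer.writeObservation("A.LU", "2011", "1.5");
            writer.writeObservation("A.LU", "2012", "1.7");
        };
        Controller controller = new Controller(inMemory);
        StringWriter out = new StringWriter();
        try {
            new SoapV20Endpoint(controller).getCompactData("<query/>", out);
            new RestV21Endpoint(controller).get("data/DEMO/A.LU", out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

Keeping the endpoints this thin means a further interface can be added next to the existing ones without touching the retrieval logic, which is one way to read the co-existence of the v2.0 and v2.1 services.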

Benefits

- Use of common SDMX API
  - Interchangeable implementation
  - Foster component reusability
  - Support of data streaming
- Support of SDMX 2.1
  - New query capabilities
  - New message formats
  - Support of RESTful API

Impact

- MA users will be able to reuse their Mapping Stores
  - Mapping Store upgrade will be supported
  - Only a re-installation of the MA will be required
- Organisations that have installed the current solution
  - Only need to re-install the Web Service
- Existing users of the Web Service will not be affected
  - The same v2.0 interface will remain
- Users of the SRI components API
  - The API has changed due to streaming support
  - Migration will be required
  - Migration guidelines will be provided
- Users who have already modified the source code
  - Will have to make the changes again if they want to use the new version

The Mapping Assistant and the Mapping Store will be affected in the final solution since they are going to support SDMX v2.1. SDMX v2.1 includes some more relations in the information model of the data structures, and these will require changes in the current Mapping Store. More specifically:
- Adding the id for each component
- Changing the representation of the Measure dimension (in SDMX v2.1 a Concept Scheme is used instead of a Codelist)
- Including the new way of relating attributes to series/groups and groups of dimensions

The MA GUI will be changed so that it presents the additional information of the SDMX v2.1 structures.

Activities

- Migration is dependent on the SDMX common API
  - Java: ongoing (API to be released as OSS in June 2012)
  - .NET: not started yet (API should be ported to C# and then implemented)
- Two steps for SRI migration
  - The “Intermediate release” will be migrated to use the Common SDMX API
    - Support of existing functionality for v2.0
  - Add additional v2.1 functionality incrementally
    - WS new interfaces (SOAP and REST)
    - Query
    - Time semantics
    - Dimension at observation
    - Support in MA 2.1 artefacts

Contact

Bengt-Åke Lindblad
Bengt-Ake.Lindblad@ec.europa.eu
Eurostat Unit B3 – IT and standards for data and metadata exchange