X-DIS project: final report

Presentation on theme: "X-DIS project: final report"— Presentation transcript:

1 X-DIS project: final report
Item 5.2 of the Agenda
X-DIS project: final report
Marco Pellegrino
Unit B5, Statistical Information Technologies
21-22 October 2009

2 X-DIS project
XDIS = XML for Data Interoperability in Statistics
« Project of common interest » in the framework of IDABC
Born in 2005 (1st Global Implementation Plan: ) : new production phase
Last projects running until end-2010
Time for a pre-final assessment
21-22 October 2009 IT Directors Group

3 Project report: achievements (1)
Work area A.1: SDMX
OSS applications supporting SDMX standards (Registry, Data Structure Wizard, Converter, …)
SDMX implementation strategy (registry operational, development of Data Structure Definitions and technical artefacts, capacity-building, training)
Re-usable components for implementing SDMX in Member States ( )

4 Project report: achievements (2)
Work area A.2: SODI (SDMX Open Data Interchange)
Generic applications, data sharing model
IT infrastructure working at Eurostat, supporting SDMX-based data sharing (re-usable for different domains, e.g. for the Census Hub)
New SODI with a different focus (pull mode, SDMX-ML, data and metadata, new list of domains)

5 Project report: achievements (3)
Work area A.3: Sectoral networks
XBRL pilot project task-force completed: tested feasibility of exploiting XBRL reporting for statistics; potential to reduce response burden. Open-source tools on OSOR.
Review of other sectoral XML standards: other than XBRL, no candidates for further work with real potential to lower response burden.
MEETS programme in 2010.

6 Project report: achievements (4)
B.1: Visualisation: “Business Cycle Clock” for PEEIs (OSS from CIRCA) plus tools for tables and graphics.
B.2: SDMX web services for the download of Eurostat’s data; toolkit for reference metadata; study on e-services.
B.3: Large Datasets: analysis, proof-of-concept tool for the dissemination of large datasets; Census Hub pilot to be integrated within the dissemination environment.
C.1: OSS: “OSS and Statistics” on CIRCA, then OSOR; Guidelines on OSS.

7 Plans 2009–2010
SDMX: concentrating on the implementation of SDMX to support Member States and on the improvement of the registry-based SDMX architecture
SODI: improvement of tools
Large Datasets: Census Hub, Euro Groups Register (largely re-using X-DIS tools and SODI infrastructure)
Sectoral standards: MEETS programme
eServices to retrieve SDMX data from Eurostat’s web site
OSS: further developments taking place on OSOR

Among these plans, I would like to stress one point: yesterday, several countries highlighted the need to use common principles and standards (SDMX was mentioned by several of them) to design shared tools and services. This has been the core agenda of X-DIS throughout these years.
SDMX standards and IT architecture, together with statistical guidelines, are being used for a series of implementations involving ESS countries: the European Census Hub (later), SODI and the Euro Groups Register (EGR).
These experiences have highlighted some important points:
Building the SDMX architecture, from a data producer's point of view, requires the analysis of several factors and the development of complex software modules.
The exchange of know-how and software between NSIs, encouraged by Eurostat (see Census Hub), has in some cases allowed a much quicker development of the IT infrastructure.
National data are stored in Member States' repositories and are described differently from how they would be in the SDMX DSDs defined at international level.
The start-up phase is crucial, because expert knowledge of SDMX standards, XML and related technologies (e.g. Web Services) is not easily available.

8 SDMX benefits: the NSI perspective
SDMX can reduce the burden of reporting to national, European and international institutions
The use of SDMX can improve harmonisation, standardisation and integration processes inside an NSI
International “community” to share experiences and software; Open Source culture
Eurostat, upon request, provides technical advice to NSIs interested in starting SDMX projects (missions, training)
Eurostat designed a reference architecture for NSIs and is developing building blocks through its implementations

SDMX can reduce the reporting burden, but standards are not enough without tools, technical advice, exchange of experiences and software, and clear architectures.
A new, on-going action on “SDMX implementation and support in MS” includes the analysis of some national architectures, with particular attention to solutions which are already shared in the statistical community (e.g. PC-AXIS), and an inventory of existing software developed at national level, together with proposals on how to integrate it into the SDMX reference architecture and with the components already developed by Member States which participate in SDMX projects.
This action was presented successfully at a technical workshop held in Madrid (22-23/9): documentation is available from CIRCA. This project is also going to provide useful material for the SDMX ESSnet.
Within this framework, the specifications for the so-called “SDMX reference architecture”, together with the definition of single components and the interfaces between building blocks, are being made available on CIRCA.
As you know, there are different possible architectures for data exchange...

9 This is a “simplified” UML picture of the SDMX reference architecture for Member States. It represents the synthesis of several experiences worldwide and should be considered not a strict specification but rather a guide or “best practice”. The objective is to provide a description of a generalised architecture to be used, partially or as a whole, by Member States interested in starting SDMX projects.
Dissemination database: the final data warehouse kept by each NSI for data that can be published to potential Data Consumers.
Mapping Assistant: module responsible for creating the mappings between an SDMX Data Structure Definition (DSD) and a DB schema (dissemination database) or a set of dissemination data files. It maps the DB schema of the database to the SDMX DSD (“SDMX Structure File” artefact).
Mapping Store: module responsible for keeping the mappings between the SDMX format and the native format (a file or a DB schema).
SDMX Structure File: the SDMX-ML DSD required by the “Mapping Assistant” module in order to map its components (i.e. Dimensions, Attributes, Measures) to the columns and tables of the dissemination database.
Data Retriever: module responsible for querying the dissemination database and getting the respective recordset.
Data Loader: module responsible for loading new data from the NSI’s production environment/database to the dissemination environment/database and updating the “RSS Generator” module.
SDMX-ML Data Generator: module responsible, upon receiving the recordset and the respective mappings from the “Data Retriever”, for generating an SDMX-ML Dataset message.
Web Service Provider: module responsible for exposing the Dataset using a Web Service interface that provides SDMX-ML messages.
RSS Generator: module responsible for generating a feed entry when new data arrive from the “Data Loader”.
SDMX Query Parser: module responsible for getting the request from the “Web Service Provider” and populating the internal data model, i.e. the SDMX data model.
Eurostat Unit B5 – Statistical Information Technologies, STNE 24th Meeting – June 2009
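The interplay between the Mapping Store, the Data Retriever and the SDMX-ML Data Generator can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table and column names, the `MAPPING` dictionary and the function names are hypothetical, and the XML produced is a simplified stand-in, not schema-valid SDMX-ML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping of the kind a Mapping Store might hold:
# DSD component id -> column name in the dissemination database.
MAPPING = {
    "FREQ": "freq",
    "REF_AREA": "country",
    "TIME_PERIOD": "period",
    "OBS_VALUE": "val",
}


def retrieve(conn, table, mapping):
    """Data Retriever: query the table and key each row by DSD component id."""
    columns = ", ".join(mapping.values())
    cursor = conn.execute(f"SELECT {columns} FROM {table} ORDER BY rowid")
    return [dict(zip(mapping.keys(), row)) for row in cursor]


def generate(records):
    """Data Generator: serialise records as simplified XML (not real SDMX-ML)."""
    root = ET.Element("DataSet")
    for record in records:
        ET.SubElement(root, "Obs", {key: str(value) for key, value in record.items()})
    return ET.tostring(root, encoding="unicode")


def demo():
    """Build a tiny in-memory dissemination database and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pop (freq TEXT, country TEXT, period TEXT, val REAL)")
    conn.execute("INSERT INTO pop VALUES ('A', 'IT', '2008', 59.6)")
    return generate(retrieve(conn, "pop", MAPPING))
```

Calling `demo()` returns a single `DataSet` element with one `Obs` child whose attributes carry the DSD component ids rather than the native column names, which is the point of keeping the mapping separate from the database.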

10 Data Repository (Warehousing) Architecture
[Diagram labels: NSI; Eurostat; Pull Requestor; eDAMIS; Data Input; SDMX Registry; Intermediate storage; Verification / Conversion to SDMX; Received data in SDMX-ML; Loader; register; Warehouse; Eurobase; query; Dissemination; XSL for; PUSH; PULL]

In SDMX, we can have an architecture based on data warehousing, for which we can distinguish a « push » and a « pull » mode.
In the push mode, the data provider takes action to send the data to the organisation collecting the data. This can take place using different means, such as [...] or file transfer. These are the traditional modes of data collection, used by international organisations for many years. Once the file is received, an application in the recipient's systems processes it and uploads the data into a database. The chain in the receiving organisation can be fully automatic, ensuring the best quality of data exchange.
In the pull approach, the data provider makes the data available to users: for download as an SDMX-conformant file, or as the result of a query to a web service linked to a database on the provider's side. More precisely, the organisation that consumes statistical data can, for example, subscribe to an RSS feed and receive in real time the latest links to available data. In another scenario, the consumer organisation's system sends an SDMX-ML query file to the data provider's web service and gets the requested data file. Finally, the data provider can also deliver statistical data files to a shared location from which authorised organisations can download them.
Note that in both cases, the data are made available to any organisation requiring them, in formats which ensure that data are consistently described by appropriate metadata, whose meaning is common to all parties in the exchange.
The Single Entry Point allows both push and pull methods: eDAMIS is now able to recognise and deliver SDMX-ML files. An SDMX-ML module for representing validation rules was developed and is used by the eDAMIS validation engine.
The pull approach concerns the following steps:
Step 1: when new data are available, the NSI should:
  create an SDMX-ML file containing the new data, or
  do nothing if the NSI WS builds SDMX-ML messages upon request.
Step 2: the NSI should add a new feed entry, including an SDMX-ML Query message describing the new Dataset, to the NSI feed.
The Pull Requestor reads the new feed entry and:
  retrieves the SDMX-ML file from the specified URL, if it resides at a URL, or
  uses the Query message included in the feed to query the NSI WS, if the data are prepared by the WS.
The Pull Requestor forwards the SDMX-ML dataset to the rest of the modules within the Eurostat production environment.
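The decision taken by the Pull Requestor for each new feed entry can be sketched as follows. This is an illustration under stated assumptions, not the Eurostat implementation: the `FeedEntry` fields and the `fetch_url`, `query_ws` and `forward` callables are hypothetical names, and network access is injected so that the logic can be read and tested in isolation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set


@dataclass
class FeedEntry:
    """Hypothetical feed entry announcing a new dataset."""
    dataset_id: str
    file_url: Optional[str] = None       # set when the NSI pre-generates a file
    query_message: Optional[str] = None  # SDMX-ML Query message for the NSI WS


def pull(entry: FeedEntry,
         fetch_url: Callable[[str], str],
         query_ws: Callable[[str], str]) -> str:
    """Return the dataset for one entry, by file download or by WS query."""
    if entry.file_url is not None:
        # The data reside at a URL: download the ready-made file.
        return fetch_url(entry.file_url)
    if entry.query_message is not None:
        # The data are prepared on request: send the query to the NSI WS.
        return query_ws(entry.query_message)
    raise ValueError(f"feed entry {entry.dataset_id!r} has neither URL nor query")


def process_feed(entries: Iterable[FeedEntry],
                 seen: Set[str],
                 fetch_url: Callable[[str], str],
                 query_ws: Callable[[str], str],
                 forward: Callable[[str], None]) -> None:
    """Pull every entry not handled before and forward it downstream."""
    for entry in entries:
        if entry.dataset_id in seen:
            continue  # already pulled on a previous reading of the feed
        forward(pull(entry, fetch_url, query_ws))
        seen.add(entry.dataset_id)
```

Keeping the set of already-seen identifiers outside the function lets the feed be re-read periodically without forwarding the same dataset twice.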

11 SDMX Architecture (Hub mode)
SDMX also supports the “Data Hub” concept/architecture, where users obtain data from a central hub which itself automatically assembles the required dataset by querying other data sources. Data providers can notify the hub of new sets of data and the corresponding structural metadata (measures, dimensions, code lists, etc.) and make data available directly from their systems through queries. Data users can browse the hub to define a dataset of interest via the above structural metadata and retrieve the desired dataset.
From the data management point of view, the hub is also based on specific datasets, but these, contrary to the database-driven architecture, are not kept locally at the central hub system. The agreed hypercubes are fetched directly from the data producers' databases when a user requests them. SDMX formats and architecture are used.
In the data sharing context, both architectures reduce the burden of transferring data to multiple counterparties, if a group of partners agree on providing access to their data according to standard processes, formats and technologies.
The following process operates:
A user identifies a dataset through the web interface of the central hub using the structural metadata, and requests it;
The central hub translates the user request into one or more queries and sends them to the related data providers’ systems;
Data providers’ systems process the queries and send the results to the central hub in a standard format;
The central hub puts together all the results returned by the data providers’ systems concerned and presents them in a human-readable format.
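The four-step hub process can be sketched in a few lines. The sketch assumes, purely for illustration, that a user request names a dataset and a list of areas, that each provider is a callable keyed by area code, and that results are lists of dictionaries; none of these names or shapes come from the Census Hub itself.

```python
from concurrent.futures import ThreadPoolExecutor


def hub_query(request, providers):
    """Fan a user request out to the providers concerned and merge the results.

    request:   {"dataset": str, "areas": [str, ...]}   (hypothetical shape)
    providers: {area code: callable(query) -> list of observation dicts}
    """
    # Step 2: translate the request into one query per provider concerned.
    queries = {
        area: {"dataset": request["dataset"], "area": area}
        for area in request["areas"]
        if area in providers
    }
    # Step 3: each provider's system processes its query; run them in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {area: pool.submit(providers[area], query)
                   for area, query in queries.items()}
        results = {area: future.result() for area, future in futures.items()}
    # Step 4: put all results together in a stable order for presentation.
    return [obs for area in sorted(results) for obs in results[area]]
```

Nothing is stored at the hub between calls: every request goes back to the providers, which mirrors the point that the agreed hypercubes are fetched from the data producers only when a user asks for them.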

12 Issues
Costs and benefits of the implementation
Comparison of achievements with the expected results
Sustainability for Eurostat and Member States
Good progress made in creating and fine-tuning SDMX standards, tools and reference architecture
Emphasis on the integration of information systems: SDMX at the core of the harmonisation of the statistical business process.

13 For more information (all addresses @ec.europa.eu):
Bengt-Åke.Lindblad
Francesco.Rizzo
Krassimir.Ivanov
Marco.Pellegrino
Michel.Henrard
Unit B5, Section « Standardisation and advanced IT for statistics »

