Sharing data validation activities in the ESS
Recent developments and plans for 2012
Item 4.1 of the agenda
Georges Pongas – B3
19-20 October 2010 IT Directors’ Group Meeting

Background
ITDG 2010: Sharing of Validation tools
- Eurostat presented a document on sharing selected validation tools at ESS level
Sharing of Validation tools
- Is a crucial aspect of the future infrastructure of the ESS, reinforcing common standards in terms of rules and common data and metadata use.
- It can result in considerable economies both for Member States and Eurostat.
- Sharing of software services or software and (meta)data distribution can be envisaged.
- Challenges: IT standards, interdependence of actors, operational support
18-19 October 2011 IT Directors’ Group Meeting

Data validation in the ESS
Data validation takes place:
- In Member States – before transmission
- In Eurostat – before further dissemination and processing
Several steps in validation:
- Format validation
- Codes validation
- Data validation
  - 1st level: basic checks – existence of mandatory fields, range checks, consistency of information inside the file
  - 2nd and 3rd level: consistency with historical data / data from other sources (other countries, other statistics)
  - 4th level: expert validation / in-depth analysis

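As an illustration of the codes validation and 1st-level checks listed above, here is a minimal sketch in Python. It is not taken from eVE or EBB; the field names, the code list and the range limit are all assumptions made for the example.

```python
# Hypothetical first-level checks; field names, code list and range are assumptions.
MANDATORY_FIELDS = ("country", "year", "value")
VALID_COUNTRIES = {"BE", "DE", "FR"}  # stand-in for an agreed code list


def first_level_checks(record):
    """Return a list of error messages for one record (a dict of strings)."""
    errors = []
    # Existence of mandatory fields
    for field in MANDATORY_FIELDS:
        if not record.get(field):
            errors.append(f"missing mandatory field: {field}")
    # Codes validation against the code list
    country = record.get("country")
    if country and country not in VALID_COUNTRIES:
        errors.append(f"unknown country code: {country}")
    # Range check on a numeric field
    value = record.get("value")
    if value:
        try:
            if float(value) < 0:
                errors.append("value out of range: must be >= 0")
        except ValueError:
            errors.append(f"value is not numeric: {value}")
    return errors
```

A record that passes every check yields an empty list, so the same function can serve both a sender-side check before transmission and a receiver-side check before further processing.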
Data validation tools developed by Eurostat
eVE = eDAMIS Validation Engine
- Allows for a final check before transmitting data to Eurostat
- Covers format, codes and basic checks
- For files in SDMX-ML format; linked to DSD
EBB = Editing Building Block
- Allows importing of external reference files and databases
- Can be configured to use many files simultaneously
- For files with an agreed format applied by all data senders (csv, flr, sdmx-ml, sdmx-edi, databases)
- Not only validation but also correction and computation
WEDES = Web application for the validation of Energy statistics
- Applicable by Member States and Eurostat (an extension of EBB)

EBB = Editing Building Block
Main Functionalities:
- Acceptance of various file formats and number of variables (limited by the DBMS column number capacity)
- Validation programs are parametric
- Not only validation but also variable creation
- Possibility to manipulate incoming datasets
- Information is persistent (data + metadata) and reusable

Functionality in detail
File management:
- Fixed length records
- Variable length records (delimited)
- SDMX-ML
- GESMES files
Scripting and web services
Web version and stand-alone version

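To make the difference between fixed length and delimited records concrete, the following Python sketch reads both into the same structure. The record layout, the field names and the `;` delimiter are assumptions for illustration only and do not describe how EBB itself reads files.

```python
import csv
import io

# Hypothetical layout: (field name, start, end) as 0-based slice positions.
FIXED_LAYOUT = [("country", 0, 2), ("year", 2, 6), ("value", 6, 14)]


def parse_fixed(line, layout=FIXED_LAYOUT):
    """Cut one fixed-length record into named, whitespace-stripped fields."""
    return {name: line[start:end].strip() for name, start, end in layout}


def parse_delimited(text, delimiter=";"):
    """Read delimited records whose first line holds the field names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
```

Both functions return plain dictionaries keyed by field name, so later validation steps do not need to know which file format the data arrived in.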
Validation Rules, Computations
Rules are logical expressions followed by:
- The rule name
- The rule severity
- The rule warning message
- A possible modification or creation of data depending on the rule result
Rules can be horizontal or vertical (inter-record)
Special computations (outliers)
Output statistics (summary) and details for errors (what error, where in the dataset)

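The rule structure described above (logical expression, name, severity, message, optional data modification) can be sketched in Python as follows. This is not EBB's rule syntax; the `Rule` class, the `apply_rules` and `totals_match` helpers and the example field names are hypothetical and only mirror the elements listed on the slide.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    name: str
    severity: str                              # e.g. "error" or "warning"
    message: str
    check: Callable[[dict], bool]              # logical expression; True = record passes
    fix: Optional[Callable[[dict], None]] = None  # optional modification on failure


def apply_rules(records, rules):
    """Run horizontal (within-record) rules; return (summary, details)."""
    summary = {rule.name: 0 for rule in rules}   # failures per rule
    details = []                                  # what error, where in the dataset
    for line_no, record in enumerate(records, start=1):
        for rule in rules:
            if not rule.check(record):
                summary[rule.name] += 1
                details.append((line_no, rule.name, rule.severity, rule.message))
                if rule.fix is not None:
                    rule.fix(record)
    return summary, details


def totals_match(records, item_field="item", value_field="value", total_code="TOTAL"):
    """Vertical (inter-record) check: detail lines must add up to the single total line."""
    parts = sum(r[value_field] for r in records if r[item_field] != total_code)
    totals = [r[value_field] for r in records if r[item_field] == total_code]
    return len(totals) == 1 and totals[0] == parts


# Example rules on assumed fields "exports", "imports" and "balance".
EXAMPLE_RULES = [
    Rule("R1", "error", "exports must not be negative",
         check=lambda r: r["exports"] >= 0,
         fix=lambda r: r.update(exports=0)),
    Rule("R2", "warning", "balance must equal exports minus imports",
         check=lambda r: r["balance"] == r["exports"] - r["imports"]),
]
```

Rules run in list order and a fix is applied immediately, so a later rule sees the corrected value; whether that ordering is desirable depends on the domain and is a design choice of this sketch.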
Dataset operations
- Copy file, select part of file
- Split file
- Aggregate
- Rename
- Merge
- Append
- Reorder lines or columns

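A few of the dataset operations listed above can be sketched on in-memory records in Python. The function names and the choice of an inner join for the merge are assumptions for illustration; they do not describe EBB's own implementation.

```python
from collections import defaultdict


def select(rows, predicate):
    """Select part of a dataset: keep the rows for which predicate is true."""
    return [row for row in rows if predicate(row)]


def split(rows, key):
    """Split a dataset into one list of rows per distinct value of key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def aggregate(rows, key, field):
    """Sum a numeric field per distinct value of key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)


def merge(left, right, key):
    """Inner join on key; assumes key is unique in right, whose fields win on clashes."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]
```

With rows held as lists of dictionaries, appending is plain list concatenation and reordering lines is a call to `sorted` with a suitable key.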
EBB New Functionalities (2012)
- Secure access, using the same solution as the Commission email (ECAS, SMS)
- Running EBB in unattended mode, linked to eDAMIS
- Integration of EBB with eVE and the SDMX Registry

Currently applied in the domains
- Foreign Trade (Estat, enhanced application)
- ESSPROS (Estat)
- CVTS (MS)
- ITS, FDI (MS)
- SBS (in preparation)
- From inside SAS (Estat)

Proposed Organisation
- The successful deployment of EBB requires a knowledge and support network at local level.
- This network already exists: it is called the Local Coordinators and is the contact point for transmissions and for support on eDAMIS, eVE, STATEL and SDMX.
- It is proposed that this network also include EBB in its support mandate.
- Eurostat will create the conditions for this inclusion (courses, material, ...)