Data validation rules. Item 3b. Eurostat Task Force on Annual Financial Accounts, Frankfurt, 4 March 2016.
Data transmission/validation procedure: SDMX template or DB extraction; XML files; eDamis; notification; production database (FINA2010); structural and data check reports; contact countries; validation; dissemination.
Transmission errors: errors leading to file rejection (no data check report produced); quality issues with the data (presented in the data check reports).
Coding: embargo dates. A country may forget to delete or update an old embargo date. Embargo dates are sometimes used with flags other than "N" (e.g. "F" or "C"): dates are taken into account only with flag "N".
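The two embargo pitfalls above can be caught before transmission with a small check. This is a minimal sketch, not the check Eurostat runs: the attribute names `CONF_STATUS` and `EMBARGO_DATE`, the ISO date format, the dict-based observation layout and the function name `check_embargo` are all assumptions made for illustration.

```python
from datetime import date


def check_embargo(obs, today):
    """Return a list of warnings for one observation given as a plain dict.

    Assumed layout (hypothetical): obs["EMBARGO_DATE"] is an ISO date string
    or absent, obs["CONF_STATUS"] is the one-letter flag.
    """
    warnings = []
    embargo = obs.get("EMBARGO_DATE")
    flag = obs.get("CONF_STATUS")
    if embargo is None:
        return warnings  # no embargo date, nothing to check
    if flag != "N":
        # Per the slide, dates are taken into account only with flag "N".
        warnings.append(f"embargo date set with flag {flag!r}; only 'N' is honoured")
    if date.fromisoformat(embargo) < today:
        # An old date was probably left over from a previous transmission.
        warnings.append(f"embargo date {embargo} is in the past; delete or update it")
    return warnings
```

Passing the transmission date as `today` keeps the function deterministic, so the same file always yields the same warnings.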
Coding: flags (1). "NaN" values with OBS_STATUS "A" or "E" are wrong: "NaN" can only be used with flags such as "L", "M", "J" etc. Numerical values (including "0") with flags "L", "M" or "J" are wrong: numerical values can only be used with flags "A", "E", "P", etc.
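The value/flag rule above lends itself to a simple pre-validation check. The sketch below is an illustration only: it covers just the flags named on the slide (the "etc." means the real code lists are longer), and the function name `check_value_flag` and the string-based inputs are assumptions.

```python
# Flags named on the slide; the official code lists contain more values.
MISSING_FLAGS = {"L", "M", "J"}   # may only accompany "NaN"
VALUE_FLAGS = {"A", "E", "P"}     # may only accompany a numerical value


def check_value_flag(obs_value, obs_status):
    """Return an error message, or None if the pair is consistent.

    obs_value is the raw string from the file; flags not listed above
    are not checked by this sketch.
    """
    is_nan = obs_value == "NaN"
    if is_nan and obs_status in VALUE_FLAGS:
        return f'"NaN" cannot carry OBS_STATUS "{obs_status}"'
    if not is_nan and obs_status in MISSING_FLAGS:
        return f'numerical value {obs_value} cannot carry OBS_STATUS "{obs_status}"'
    return None
```

Note that "0" is treated like any other number, so `check_value_flag("0", "L")` reports an error, in line with the slide.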
Coding: flags (2). Examples of wrong coding: value "0" used as OBS_STATUS and CONF_STATUS; "J" or "M" used as a CONF_STATUS; "v" used as OBS_STATUS; value "L" used instead of "NaN"; a value with two dots inside; "Nan" instead of "NaN" (it is case-sensitive).
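Several of these errors are purely lexical and can be detected on the raw observation string. A minimal sketch, assuming that a valid observation value is either the exact string "NaN" or a plain decimal number (no exponent, no thousands separator); the function name `is_valid_obs_value` is hypothetical.

```python
import re

# Optional minus sign, digits, and at most one decimal point followed by digits.
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def is_valid_obs_value(text):
    """True if text is exactly "NaN" (case-sensitive) or a plain decimal number."""
    return text == "NaN" or NUMBER.fullmatch(text) is not None
```

With this definition "Nan", "L" and "12.3.4" are all rejected, while "0" and "-457.00" are accepted.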
Coding: flags (3). Flag "L": data should exist but are not collected (not available). Flag "M": data do not exist by definition; not applicable. A zero value is not a substitute for either "L" or "M".
Coding errors: reference area. Use of "LU" as Reference Area is wrong (unless you are Luxembourg…). This happens because "LU" is used in the sample questionnaires and countries forget to change the field.
Coding errors: sector S2. Balancing items (B9f, BF90) are to be transmitted with opposite sign: S1 vs. S2. Correct example: Total economy (SECTOR ► S1, CPAREA ► W0): B9F 457.00; Rest of the World - total (4) (SECTOR ► S1, CPAREA ► W1): B9F -457.00.
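The opposite-sign rule can be verified arithmetically: the two balancing items should sum to zero. A minimal sketch, in which the function name `is_mirrored` and the rounding tolerance `tol` are assumptions rather than documented validation parameters:

```python
def is_mirrored(s1_value, s2_value, tol=0.005):
    """True if the S2 balancing item is the S1 item with opposite sign.

    tol is a hypothetical allowance for rounding to two decimals.
    """
    return abs(s1_value + s2_value) <= tol
```

For the figures on the slide, `is_mirrored(457.00, -457.00)` holds, whereas transmitting 457.00 for both sectors would fail the check.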
Other coding errors: wrongly coded series, probably linked to an old SDMX template; dummy figures of the SDMX template left in the XML file; some numbers transmitted with space(s) inside, probably due to conversion issues; obligatory parameters missing from the XML file, such as UNIT_MULT, TABLE_ID, Valuation, Maturity etc.
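Missing obligatory parameters are easy to spot mechanically before sending a file. The sketch below works on a deliberately simplified XML layout, not on real SDMX-ML: it assumes un-namespaced `Series` elements that carry the parameters as XML attributes, and the upper-case identifiers `VALUATION` and `MATURITY` are an assumed spelling of the "Valuation" and "Maturity" parameters named above. The function name `missing_attributes` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Parameters named on the slide; the real list of mandatory attributes is longer.
REQUIRED = ("UNIT_MULT", "TABLE_ID", "VALUATION", "MATURITY")


def missing_attributes(xml_text):
    """Map the position of each Series element to its missing or empty attributes."""
    root = ET.fromstring(xml_text)
    problems = {}
    for index, series in enumerate(root.iter("Series")):
        absent = [name for name in REQUIRED if not series.get(name)]
        if absent:
            problems[index] = absent
    return problems
```

An empty dictionary means every series carries all four parameters; a production file would additionally need namespace handling, which this sketch omits.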
Validation checks (see FINA2010 document). Structural report: checks the validity of the contents of the transmission (correct coding, series transmitted). Data check report: checks the internal coherence of the data (mainly arithmetic checks). (Revision report: checks the size of revisions, comparing the new transmission with existing validated data.)
Eurostat task force on data validation. The TF Validation met in March and October 2015. It includes NSIs and NCBs, ECB and OECD. Objectives: to review the validation checks performed in Eurostat upon reception of ESA 2010 TP tables; to clarify methodological or practical aspects underlying issues common to ESA data transmissions; to propose validation rules for an internal or external pre-validation tool; to investigate possibilities of a common process for the collection and dissemination of metadata.
TF Validation: current work. Drafting a validation handbook; introduction of automatic validation reports; set-up in 2016 of a structural validation service, which allows checking a transmitted file for SDMX compliance. Invalid files will be rejected and a report sent to the sender by eDamis. Note: the pre-validation service is part of a 3-pillar strategy for quality assurance of ESA 2010 datasets, along with the quality reports and compliance monitoring.
ESA 2010 quality assessment framework. Regulation (EU) No 549/2013, Articles 4 and 12: MS shall provide to Eurostat a report on the quality of transmitted data [Art. 4(2)]; Eurostat shall assess the quality of MS' data transmissions [Art. 4(4)]; a report on the application of the regulation is to be provided to the Council and the European Parliament by 1 July 2018 and every five years thereafter [Art. 12].