Practical use case of SDMX (1): Short-term Statistics (STS)


1 Practical use case of SDMX (1): Short-term Statistics (STS)
Cristina Blanaru Eurostat Unit B5: “Data and metadata services and standards” SDMX Basics Course – 1-2 March 2017

2 Why SDMX?
o Eurostat's tools for data exchange (EDAMIS) and validation services (EDIT) are based on SDMX – Genedi is no longer supported
o International organisations (ILO, IMF, OECD etc.) and the ECB use SDMX-ML
o New SDMX-compliant validation services will be available in the near future:
- Structured Validation Service (STRUVAL): validates format and codes against a valid and active DSD for the data flow concerned. Status: in test.
- Content Validation Service (CONVAL): validates the content of an incoming file that is compliant with the SDMX information model. Status: in development.

3 SDMX PROCESS PHASES - STS

4 Phase 1: Preparation
o Legal basis for the STS indicators: Council Regulation No 1165/98 of 19 May 1998 concerning short-term statistics, and subsequent amending regulations
o Parties involved: ESTAT and the NSIs, with the ECB to follow in the future
o Data providers: NSIs
o Data transmission: monthly or quarterly
o File format: XML, sent via eDAMIS
o Production system: FAME (internal database)

5 Dataset names in EDAMIS
o For production files: STS… (e.g. STSIND_PROD_M)
o For test files: VSTS… (e.g. VSTSIND_PROD_M)
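The naming convention above can be sketched as follows. This is a minimal illustration that assumes, from the two examples only, that a test flow name is the production flow name prefixed with "V"; the function names are hypothetical.

```python
def to_test_flow(production_flow: str) -> str:
    """Return the test-flow name for a production flow.

    Assumption (from the slide's examples only): test name = "V" + production name.
    """
    if not production_flow.startswith("STS"):
        raise ValueError(f"not an STS production flow: {production_flow!r}")
    return "V" + production_flow


def is_test_flow(flow: str) -> bool:
    """True for test flows (VSTS...), False otherwise."""
    return flow.startswith("VSTS")


print(to_test_flow("STSIND_PROD_M"))
```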

6 Phase 2: Compliance
o 2015: settle on the matrix and the associated code lists
o Re-use existing and cross-domain code lists: CL_FREQ, CL_OBS_STATUS, CL_UNIT_MULT, CL_DECIMALS
o Create specific code lists: CL_STS_INDICATOR, CL_ACTIVITY_STS

7 Explanation of STS Matrix
Overview sheet: summarises all concepts and code lists used

8 Explanation of STS Matrix - Matrix Sheet (1/3)

9 Explanation of STS Matrix - Matrix Sheet (2/3)
o Shows the relationship between the dataset(s) and the concepts
o Each concept has a hyperlink pointing to the corresponding code list sheet
o The cells link a dataset (row) to a concept (column)

10 Explanation of STS Matrix - Matrix Sheet (3/3)
The cells contain:
o A # sign if the code list is fully used in the dataset
o A % sign if the code list is partially used in the dataset
o (blank) if the concept is not used in the dataset
o (code) for a single fixed code from the code list
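The cell conventions above lend themselves to a small lookup. The sketch below is only an illustration of how a matrix cell could be interpreted once the Excel sheet has been read into strings; the function name and the returned wording are hypothetical.

```python
def describe_cell(cell: str) -> str:
    """Interpret one cell of the STS matrix sheet (dataset row x concept column)."""
    value = cell.strip()
    if value == "":
        return "concept not used"
    if value == "#":
        return "code list fully used"
    if value == "%":
        return "code list partially used"
    # Anything else is read as a single fixed code from the code list.
    return f"fixed code {value}"


for cell in ["#", "%", "", "M"]:
    print(repr(cell), "->", describe_cell(cell))
```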

11 Explanation of STS Matrix - Code list Sheets
Showing the contents of each of the code lists used:
o CL_FREQ sheet
o CL_STS_INDICATOR sheet
o CL_ACTIVITY_STS sheet
o CL_OBS_STATUS sheet
o …

12 Phase 2: Compliance
o Draw up the guidelines
o Review the guidelines
o Online survey of the Member States and the ECB, to take stakeholders' views into account
o The Data Structure Definition was prepared, starting from the GESMES/TS "Key Family"
o Develop the DSD and the constraints
o Prepare/configure the process for IS4STAT/STRUVAL
o Provide the matrix and the DSD to the IT production team (for their preparatory work)

13 Phase 3: Implementation
o Finalize the guidelines
o Prepare test files and test the DSD
o The DSD and the dataflow processing constraints are uploaded into the Euro SDMX Registry
o Send the SDMX package to the countries for testing

14 DSD: Code lists and constraints
o Existing lists from the Global Registry and the Euro SDMX Registry are used, except for Indicator and Activity
o Values are restricted according to the EDAMIS flow ("data flow" or "dataset", e.g. STSIND_PROD_M)
o Details in:
- The DSD in the Euro SDMX Registry
- The "human readable" SDMX-STS_DSD-Matrix (MS Excel format)

15 Phase 3: Implementation
Transmitting data using SDMX
o Transmission files must be structured according to the DSD
o The transmission format is SDMX-ML (COMPACT)
o Eurostat makes a variety of tools available to Member States to implement SDMX for data transmission

16 Phase 3: Implementation
o What is the current DSD for STS? ESTAT+STSALL+2.1
o Where to find the STS DSD and the constraints? In the Euro SDMX Registry
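The DSD reference "ESTAT+STSALL+2.1" reads as agency, identifier and version joined by "+". A minimal sketch of splitting such a reference, assuming exactly that three-part shape (the class and function names are hypothetical):

```python
from typing import NamedTuple


class StructureRef(NamedTuple):
    agency: str
    structure_id: str
    version: str


def parse_structure_ref(ref: str) -> StructureRef:
    """Split an 'AGENCY+ID+VERSION' reference such as 'ESTAT+STSALL+2.1'."""
    parts = ref.split("+")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected AGENCY+ID+VERSION, got {ref!r}")
    return StructureRef(*parts)


ref = parse_structure_ref("ESTAT+STSALL+2.1")
print(ref.agency, ref.structure_id, ref.version)
```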

17-25 [Screenshot slides: click-through steps with no transcribed text other than the labels "Detailed view", "Quick view" and "Download"]

26 How to create SDMX-ML files?
One goal – different possibilities
o Get the DSD from the SDMX Registry
o If you have a DDB: use SDMX-RI
o If data are stored as files: use the SDMX Converter
o Then expose the data to be pulled (WS, HUB) or push via eDAMIS

27 SDMX-RI components
o Mapping Assistant: graphical tool to create the mapping between the DSD and the dissemination database
o Test Client: used to test your dataflow locally
o NSI Web Service: allows you to share your dataflow
o NSI Client: web interface to interact with the Web Service

28 SDMX Converter
The SDMX Converter is a Java application, developed in the context of the SODI project, that converts datasets between the formats listed below*, based on SDMX-ML DSDs. To do so, it needs at least the data file and the data structure definition (DSD) file.
o SDMX-ML formats (to convert from/to the Cross-Sectional type, the DSD should contain Cross-Sectional information and a TimeDimension)
o GESMES/TS, also known as SDMX-EDI (cannot be converted from/to the SDMX-ML Cross-Sectional format)
o GESMES/2.1 and GESMES/DSIS
o CSV, FLR, EXCEL, DSPL (a mapping mechanism is supported, as is a parametric delimiter for CSV; converting to CSV/FLR from other formats may lose attributes attached at a higher level than the observation)
*Limitations may apply. Please check the User Manual.

29 Interfaces (modes of use)
o Locally installed: user interface, command line
o Web service: web service API

30 Mandatory for any conversion
1. Input file
2. Output file (complete path)
3. Format for input and output files
4. Specify DSD
4.1. DSD file
4.2. Reference to a DSD file in the Registry
4.3. Reference to a Dataflow file in the Registry

31 GUI (Converter 4.5.0)
1. Selection of the input/output files and their format
2. DSD, specified in one of three ways:
2.a Select the DSD in the local drive (if the local DSD includes multiple versions, we can specify the one desired)
2.b Identify a DSD to download from the SDMX Registry (configuration required)
2.c Identify a dataflow linked to the DSD to download from the SDMX Registry (configuration required)
3. SDMX header (.prop file)
4. Mapping and Transcoding
5. CSV parameters
6. SDMX (output) validation
Other labels on the screenshot: Excel parameter file; "Only for flat and excel files"; XML parameters for SDMX output formats
There are some mandatory fields. For example, the DSD is mandatory, so it must be present whether it is loaded from a file or from the Registry. A message will appear if mandatory fields are missing.

32 GUI Information marked * is mandatory

33 Where to find the SDMX Converter
You can download the latest version of the SDMX Converter on CIRCABC 85f2-4f6f30b4d8eb
Available packages:
o SDMX Converter Documentation
o SDMX Converter Platform Independent
o SDMX Converter Web Service
o SDMX Converter installer for Windows 32-bit

34 Validation of the SDMX-ML files
o From CNA to Eurostat
o EDAMIS flow and file format
o STRUVAL
o Feedback in EDAMIS
o File loading
o What's new with SDMX-ML?

35 Validation of the SDMX-ML files

36 EDAMIS flow and file format

37 STRUVAL
o The structure of SDMX-ML files (file extension .xml) is validated before files are forwarded to the STS team:
- against the data structure definition (DSD): ESTAT+STSALL+2.1
- against the dataflow-specific constraints, e.g. CR_STSIND_TURN_M (see "Cube Regions" in the Euro SDMX Registry for acceptable values)
o All other files, including GESMES/TS (.ges) and NOTES (.docx), bypass STRUVAL
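To make the idea of a dataflow-specific constraint concrete, the sketch below checks a series key against a set of allowed values per dimension. It is not STRUVAL's implementation; the cube region contents and the function name are hypothetical, and the real acceptable values are those published in the Euro SDMX Registry.

```python
# Hypothetical cube region: dimension names and allowed codes are illustrative
# only; the real values are published in the Euro SDMX Registry.
CR_EXAMPLE = {
    "FREQ": {"M"},
    "UNIT_MEASURE": {"IX"},
}


def constraint_errors(series_key: dict, cube_region: dict) -> list:
    """Return one message per dimension that is missing or outside the region."""
    errors = []
    for dimension, allowed in cube_region.items():
        value = series_key.get(dimension)
        if value is None:
            errors.append(f"{dimension}: missing")
        elif value not in allowed:
            errors.append(f"{dimension}: {value!r} is not an allowed value")
    return errors


print(constraint_errors({"FREQ": "Q", "UNIT_MEASURE": "IX"}, CR_EXAMPLE))
```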

38 Feedback in EDAMIS (1/2) A validation report is made available over the EDAMIS feedback mechanism to the sender of the SDMX-ML file.

39 Feedback in EDAMIS (2/2)

40 Validation of the SDMX-ML files - Conclusion
o The STS SDMX-ML files are automatically validated against the DSD and constraints – with feedback over EDAMIS – and pass the same manual validations as the GESMES/TS files
o SDMX-ML is richer than GESMES/TS, although not all new fields are checked
o A solution for recording embargo date and time has been implemented

41 Recent issues in SDMX-ML test files
How to get it correct?
o File type
o EDAMIS flow
o Data Structure Definition
o Header
o Time format
o Unit multiplier
o Decimals
o Absolute values
o Activity

42 File type
o The file name is derived from the EDAMIS flow, but the extension is taken from the original file
o SDMX-ML is a schema of the eXtensible Markup Language (XML)
o Consequently, the file type is XML and the file extension should be .xml
- The .sdmx files are not automatically validated
o The file extension of the GESMES/TS files should be .ges
o Not recommended: .txt, .gms
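The extension rules above can be summarised in a small pre-transmission check. This is only a sketch of the rules as stated on the slide; the function name and the returned labels are hypothetical.

```python
from pathlib import Path


def extension_status(filename: str) -> str:
    """Classify a transmission file by its extension, following the slide's rules."""
    ext = Path(filename).suffix.lower()
    if ext == ".xml":
        return "SDMX-ML: automatically validated"
    if ext == ".sdmx":
        return "not automatically validated: rename to .xml"
    if ext == ".ges":
        return "GESMES/TS: expected extension"
    if ext in {".txt", ".gms"}:
        return "not recommended"
    return "unknown extension"


for name in ["sts_prod.xml", "sts_prod.sdmx", "sts_prod.ges", "sts_prod.txt"]:
    print(name, "->", extension_status(name))
```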

43 EDAMIS flow
For production files: STS… (e.g. "STSIND_PROD_M")
o Data are released ("free" or under embargo)
o EDAMIS records are used for assessing the compliance of the reporting countries
o The "test" flag in the header should not be ticked
For test files: VSTS… (e.g. "VSTSIND_PROD_M")
o Data are not released
o EDAMIS records are not used
o The "test" flag in the header should be ticked

44 Data Structure Definition (1/3)
o Only use ESTAT+STSALL+2.1
o In the SDMX Converter, validation of SDMX-ML files works only for "COMPACT_SDMX", which is an SDMX 2.0 format
o The converter writes the version of the DSD and Concept Scheme in the pre-header of the file:
<CompactData xmlns="
  xmlns:sts="urn:sdmx:org.sdmx.infomodel.keyfamily.KeyFamily=ESTAT:STSALL:2.1:compact"
  xmlns:xsi="
  xsi:schemaLocation=" SDMXMessage.xsd urn:sdmx:org.sdmx.infomodel.keyfamily.KeyFamily=ESTAT:STSALL:2.1:compact ESTAT_STSALL_Compact.xsd">

45 Data Structure Definition (2/3)
o In ESTAT+STSALL+2.1, "ABS0" [a-b-s-"zero"] and "0000" are both accepted for absolute values
o Concept names differ from the v.2.0 DSD:
- "SEASONAL_ADJUST" (not "ADJUSTMENT")
- "BASE_PER" (not "BASE_YEAR")
- "UNIT_MEASURE" (not "UNIT")
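For data still keyed by the v.2.0 concept names, the three renames listed above amount to a simple key mapping. The sketch below assumes a series is held as a plain dictionary of concept name to code, which is an illustrative representation rather than anything prescribed by the DSD; the function name is hypothetical.

```python
# Old (v.2.0 DSD) concept name -> name used in ESTAT+STSALL+2.1, as listed above.
RENAMED_CONCEPTS = {
    "ADJUSTMENT": "SEASONAL_ADJUST",
    "BASE_YEAR": "BASE_PER",
    "UNIT": "UNIT_MEASURE",
}


def upgrade_concept_names(record: dict) -> dict:
    """Return a copy of the record with the old concept names replaced."""
    return {RENAMED_CONCEPTS.get(name, name): code for name, code in record.items()}


print(upgrade_concept_names({"FREQ": "M", "UNIT": "IX"}))
```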

46 Header
o Header is needed in the SDMX Converter
o Mandatory (*) fields need to be filled in
o Do not tick "Test" for production files
o If you fill in other fields, be consistent, for example, with the data flow

47 Time format
o In SDMX-ML, only 2 infra-annual formats are accepted for the reference period (no GESMES/TS codes):
- "P1M", monthly data, yyyy-"M"mm, for example "2017-M01" for January 2017 (currently also accepted without "M")
- "P3M", quarterly data, yyyy-"Q"q, for example "2017-Q1" for the 1st quarter 2017
o Embargo time: Luxembourg time, without a time zone (no "Z" or "+01:00")
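The two accepted reference-period formats can be checked with regular expressions before a file is sent. This is a minimal sketch of the patterns described above; it deliberately does not cover the variant without "M", since the transcript does not show its exact form, and the function name is hypothetical.

```python
import re
from typing import Optional

# yyyy-"M"mm with months 01..12, and yyyy-"Q"q with quarters 1..4
MONTHLY = re.compile(r"\d{4}-M(0[1-9]|1[0-2])")
QUARTERLY = re.compile(r"\d{4}-Q[1-4]")


def time_format_of(period: str) -> Optional[str]:
    """Return "P1M" or "P3M" for a valid reference period, otherwise None."""
    if MONTHLY.fullmatch(period):
        return "P1M"
    if QUARTERLY.fullmatch(period):
        return "P3M"
    return None


for period in ["2017-M01", "2017-Q1", "2017-M13", "2017Q1"]:
    print(period, "->", time_format_of(period))
```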

48 Unit multiplier
o It is the power of 10 by which the reported value is multiplied
o For indices (UNIT_MEASURE="IX"), only "0" is accepted as UNIT_MULT, e.g. 105*10^0 = 105*1 = 105
o For absolute values, the closest UNIT_MULT in the code list should be used ("3" means 1 000x; "6" means 1 000 000x)
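As a worked illustration of the rule above, the real value is the reported value times 10 raised to UNIT_MULT. The sketch below also encodes the index rule (UNIT_MEASURE "IX" requires UNIT_MULT "0"); the function names are hypothetical, and other unit combinations are simply not checked here.

```python
def actual_value(reported: float, unit_mult: int) -> float:
    """Scale a reported observation by 10 to the power of UNIT_MULT."""
    return reported * 10 ** unit_mult


def unit_mult_allowed(unit_measure: str, unit_mult: str) -> bool:
    """Indices (UNIT_MEASURE == "IX") accept only UNIT_MULT "0".

    Other unit measures are not checked in this sketch.
    """
    if unit_measure == "IX":
        return unit_mult == "0"
    return True


print(actual_value(105, 0))   # index: 105 * 10**0
print(actual_value(2.5, 3))   # absolute value reported in thousands
```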

49 Decimals
o In SDMX-ML files, only the decimal point ("period" or "dot") is valid – "decimal comma" is rejected in STRUVAL
o The number of decimals is ignored in the incoming data after STRUVAL
- You can use any value in the code list (0-7)
- There is no check between the number of decimals actually sent and the value of "DECIMALS"
- 1 decimal is enough for STS
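A data provider could catch decimal commas before transmission with a check along these lines. This is a minimal sketch, not the STRUVAL logic; the function name is hypothetical.

```python
def parse_obs_value(text: str) -> float:
    """Parse an observation value, rejecting the decimal comma explicitly."""
    if "," in text:
        raise ValueError(f"decimal comma in {text!r}: use a decimal point instead")
    return float(text)


print(parse_obs_value("105.3"))
```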

50 Activity
o With SDMX-ML, the coding of 27 STS activity (or product) aggregates changes
o A transcoding file can be used in the SDMX Converter

51 Phase 4: Production
o SDMX-compliant data exchange used in production
o Send the package to all countries

52 Conclusions: Current STS data transmissions
o Formats: GESMES/TS and SDMX-ML
o Environments: test and production, files sent via EDAMIS
GESMES/TS
o Also called SDMX-EDI
o Used since the beginning of the STS Regulation
o Based on UN EDIFACT
o Conversion from csv with Genedi
SDMX-ML
o Can be used from January 2017
o Based on XML
o Conversion from csv with SDMX Converter, or created directly with SDMX Reference Infrastructure tools

53 Support from Eurostat and NSIs
For STS requests contact: For SDMX Support contact:

