Download presentation
Presentation is loading. Please wait.
Published byRisto Heikkilä Modified over 6 years ago
1
5. Detail: Main SDMX objects for. metadata exchange. (What is SDMX
5. Detail: Main SDMX objects for metadata exchange (What is SDMX? Part I) Bogdan ZDRENTU Eurostat Unit B5: “Central data and metadata services” SDMX Basics course, March 2016
2
Types of metadata Structural metadata
acting as identifiers and descriptors of the data, such as: dimensions of statistical cubes variables titles of tables Nomenclatures (code lists) always be associated with the data to allow their identification, retrieval and browsing.
3
Example for structural metadata
4
Types of metadata Reference metadata
acting only as descriptors of the data, they don’t help to actually identify the data. They can be of different kinds: conceptual metadata methodological metadata quality metadata (process and output) can be exchanged independently from the data they are related to, but are however often linked to them. The part concerning Metadata assisting interpretation of statistical information is strongly content-oriented. The requirements for metadata depend very much on the subject area and target user groups. A driving force for the minimum metadata set formulated in this group is that as much information as necessary should be made available to ensure a correct interpretation and to avoid misuse. As a result, the list of recommended metadata in this case is quite long covering both the metadata required for the correct interpretation of statistics and the metadata for better assessing the quality and comparability of statistics.
5
Example for reference metadata
6
ESS standardisation of reference metadata based on SDMX
The SDMX Content Oriented Guidelines (2009), i.e. the Cross- Domain Concepts and the MCV; - The SDMX Glossary (2016) The SDMX Technical Standards: The information model for creation of the Metadata Structure Definitions (MSDs); The SDMX-ML for documenting the XML format; The Euro SDMX Registry for storing the MSDs etc. The SDMX Glossary - It is recommended as a single entry point to a common SDMX terminology to be 7 used in order to facilitate communication and understanding of the standard. A Metadata Structure Definition describes how metadata sets, containing reference metadata are organised.
7
Standardisation of structural metadata
Code lists describe dimensions in data tables, giving a meaning to the data. Code lists are based on: official statistical classifications such as NACE, NUTS, ISCO… the SDMX Content Oriented Guidelines Domain specific codifications A standard code list is a code list already harmonised Standard code lists should be used all along the statistical business process: data design, collection, aggregation, dissemination, archiving…
8
Example of a harmonised code list (NACE Rev. 1.1)
Old version (before harmonisation) New version (after harmonisation) Domains Old codes Old label_en New codes New label_en hrst, htec MA_TOTAL Manufacturing sector D Manufacturing fats MAN Manufacturing industries theme3 RD Manufacturing industry theme4 B0200 theme8 SE0_4 theme9 TOT_MANUF ds, hrst, htec MA_LOW_TEC Low technology manufacturing sector D_LTC Low-technology manufacturing fats / inn LOT Low Technology (incl. following NACE codes: 15-22; 36, 37) inn I_LOW_TEC Low tech industries: NACE Rev.1 codes 15 to 22, 36 and 37 SE_TOTAL Services: NACE Rev. 1.1 sections G to Q = 50 to 99 G-Q Services SER Services sector
9
Impact on the statistical business processes
Better comparability: same codes for the same concepts Increase efficiency: less transcoding; less code lists; clean lists Improve accuracy: facilitate data management and exchange and reduce the number of errors Re-usability and integration of the data: data warehouse are only possible if codes corresponding to the same concept are the same SDMX implementation: it is essential for the implementation of a SDMX data/metadata exchange process. The ESS standard code lists will also be made available in the Euro SDMX Registry (currently RAMON).
10
RAMON
11
Standard Code Lists in RAMON
12
The ESS Reference Metadata Standards
ESMS Euro SDMX Metadata Structure ESQRS ESS Standard for Quality Reports Structure EPMS Eurostat Process Metadata Structure The Statistical Data and Metadata eXchange (SDMX) compliant reporting template for reference metadata, i.e. the Euro SDMX Metadata Structure (ESMS) was introduced at Eurostat in 2008. It contains the description and representation of statistical metadata concepts to be used for documenting statistical data and for providing summary information useful for assessing data quality, documenting methodologies and the production process in general.
13
The Euro SDMX Metadata Structure (ESMS)
It uses 21 high-level concepts, with a limited breakdown of sub-items, strictly derived from the list of cross domain concepts in the SDMX Content Oriented Guidelines. The ESMS is aligned with the European Statistics Code of Practice and its coverage of institutional environment, statistical processes and statistical outputs. The data quality management aspects are consistent with the ESS quality criteria as they are defined in the new Regulation on European Statistics that entered into force on 1 April 2009. The main quality criteria (relevance, accuracy, timeliness and punctuality, accessibility and clarity, comparability and coherence) including cost and burden are covered by the ESMS metadata concepts The structure and content of the criteria and the sub-criteria are according to the new ESS guidelines for quality reports. In order to explain the content of the metadata concepts on quality, short "Descriptions" and "ESS Guidelines" on what is to be reported by the Eurostat domain manager in a first step and by countries in a second step, have been included in ESMS. The main reference for further information on what to report for each aspect of quality is further explained in the ESS Handbook for Quality Reports [Eurostat, 2009b and the forthcoming revision in 2014]. The information to be entered is normally free text and it is recommended to provide summarised information on the main strengths and weaknesses of the statistical outputs and to provide links for more detailed information about the data (comprehensive quality reports, methodological reports, compliance monitoring reports). In general, quantitative information on quality – through Quality and performance indicators - should only be provided for a few key statistics (EU-aggregates and Member States totals). However, if considered important, quantitative information should also be provided indicating the quality at a lower breakdown.
14
The ESS Standard for Quality Reports Structure (ESQRS)
15
The Euro Process Metadata Structure (EPMS)
Concept name 1 Contact 1.1 Contact organisation 1.2 Contact organisation unit 1.3 Contact name 1.4 Contact person function 1.5 Contact mail address 1.6 Contact address 1.7 Contact phone number 1.8 Contact fax number 2 Summary process description 3 Workflow 4 Statistical processing 4.1 Data collection 4.2 Source data 4.2.1 Source data - integration 4.2.2 Source data - coding 4.3 Data validation 4.3.1 Data validation in Member States 4.3.2 Validation rules agreed with Member States 4.3.3 Data validation - detection (Eurostat) 4.3.4 Data validation - correction (Eurostat) Concept name 4.4 Data compilation 4.4.1 Data compilation - variables 4.4.2 Data compilation - weights 4.4.3 Data compilation - aggregates 4.4.4 Data compilation - finalisation 4.4.5 Data compilation - draftoutput 4.5 Data validation - final 4.5.1 Data validation final - output 4.5.2 Data validation final - explanation 5 Confidentiality 5.1 Confidentiality - data treatment 6 Release policy 6.1 User access 7 Dissemination format 7.1 Publications 7.2 On-line database 7.3 Micro-data access 7.4 Other 8 IT applications 8.1 IT applications for data reception/collection 8.2 IT applications for data processing 8.3 IT applications for data validation 8.4 IT applications for data confidentiality 8.5 IT applications for metadata 8.6 Other IT applications
16
Dissemination of reference metadata
17
Dissemination of national reference metadata
18
The ESS Metadata Handler
The business process Common user Interface Output produced for the Eurostat Web Other output for Eurostat or external users ESS-MH IT application RAMON CODED ESS – Metadata Handler Euro SDMX Registry Input from national metadata Metadata from the Eurostat Domain manager Eurostat as main administrator
19
What is the ESS Metadata Handler?
The ESS Metadata Handler is a web based application for reference metadata production, exchange and dissemination in the ESS; It implements the ESS metadata standards (ESMS, ESQRS and EPMS, etc.) It replaces EMIS (used in Eurostat) and NRME (for countries); It contains many improvements based on users' feedback (in terms of business process and functionalities); It is in production since 31 January 2014.
20
National Statistical Institute EUROSTAT
EDAMIS National Statistical Institute EUROSTAT ESS Metadata Handler (ESS MH) ESS MH Database Eurostat Website PRODUCTION TREATMENT & ANALYSIS DISSEMINATION National Metadata File
21
The business process for using the ESS MH for national metadata
Mapping of the existing national reference metadata files to the ESMS and/or ESQRS formats; Conversion of existing national reference metadata files into standard structure; Insertion of these files into the ESS MH application; The NSI’s are asked to complete, enhance their converted files, directly in the ESS MH; The responsible Domain Managers in Eurostat are asked to validate these ESMS / ESQRS files; The national metadata are finally disseminated on Eurostat Web site (if decided so). Time line is approximately 6 months.
22
ESS standards for metadata - Implementation
Waste (end of life vehicles, packaging, electronic waste) WINE FARM STRUCTURE MIP STATISTICS HICP/ Compliance monitoring EHIS (Education, health and social protection) R&D (CIS 2012) Annual crops PRAG ESAW AES (Education, Science and Culture) LCI (Labour Cost Index) INFOSOC (Information Society) BUSINESS REGISTER HICP LFS-Q, LFS-A EU-SILC FATS STS (Short Term Statistics) WASTE AEI (Pesticides) EDUCAT JVC (Job Vacancy Stats) PRODCOM EXTERNAL TRADE (3rd countries) COSAEA URBANREG R&D TOURISM PERMANENT CROPS CENSUS HOUSING PRICES HPS Cross-domain concepts in the SDMX framework describe metadata concepts relevant to several statistical domains. SDMX recommends use of the concepts outlined below whenever feasible to promote re-usability and exchange of statistical information and their related metadata between national and international organizations. Whenever used, these concepts should conform to the specified names, roles, and representations defined in the SDMX Content-Oriented Guidelines. CDC promote: reusability exchange of data and related metadata Dynamic asset: initial growth, then relative stabilisation expected continuos maintenance 22
23
Practical Example reusability exchange of data and related metadata
Cross-domain concepts in the SDMX framework describe metadata concepts relevant to several statistical domains. SDMX recommends use of the concepts outlined below whenever feasible to promote re-usability and exchange of statistical information and their related metadata between national and international organizations. Whenever used, these concepts should conform to the specified names, roles, and representations defined in the SDMX Content-Oriented Guidelines. CDC promote: reusability exchange of data and related metadata Dynamic asset: initial growth, then relative stabilisation expected continuous maintenance 23
24
Example for reference metadata
That source contains metadata about the Tourism datasets When exchanging statistical data, other useful information are also exchanged: reference metadata. In the SDMX model, objects can have explanatory texts explaining the main features of data. This is typically described as “reference” metadata. Reference metadata can be stored and exchanged without being embedded in the data message. In other words, those metadata are normally linked to the object by a simple “reference” to the object. Another important point is that very often these metadata are associated not with specific observations or series of data, but with entire collections of data or even with the institutions providing the data. 24
25
Let's select a specific topic from which we'll create a metadata report
25
26
Content of the selected topic
27
Content of the selected topic
27
28
Defining a Metadata Structure Definition
The Tasks Analysis of the entire set of metadata in order to identify and document the “Concepts” for which metadata are to be reported or disseminated. Determine the structure of the “Metadata Report” in terms of the concepts used, the hierarchy of the concepts when used in the report, and their “representation” (e.g. is a code list used, is the format free text?). Specify the “object type” to which the metadata are to be attached, and how this object type is identified: knowledge of the SDMX Information model is useful here (as the metadata can only be attached to object types that can be identified in terms of the object types that exist in the information model).
29
Metadata Report Structure – Content Metadata
BASIC_METH_ISSUES POS_ACC_TOUR The following attributes can be derived from these examples. STAT_UNIT
30
SCOPE_OBS TOUR_ACC_ESTAB NACE_55_1
The following attributes can be derived from these examples. TOUR_ACC_ESTAB NACE_55_1 30
31
Metadata Report Structure – Concept Scheme
The following concepts are derived from this example: CONTACT CONTACT_ORG CONTACT_ORG_UNIT The following attributes can be derived from these examples. CONTACT_MAIL_ADDRESS BASIC_METH_ISSUES POS_ACC_TOUR STAT_UNIT SCOPE_OBS TOUR_ACC_ESTAB NACE_55_1
32
Metadata Report Structure – Bringing it Together
Report Structure - Contact Report CONTACT CONTACT_ORG ESTAT_MSD CONTACT_ORG_UNIT CONTACT_MAIL_ADDRESS For the Contact report the following schematic shows the definition. ESTAT_METADATA_CS CATEGORY_REPORT
33
Metadata Report Structure – Bringing it Together
Report Structure - Quality Report BASIC_METH_ISSUES POS_ACC_TOUR STAT_UNIT ESTAT_MSD SCOPE_OBS TOUR_ACC_ESTAB NACE_55_1 CATEGORY_REPORT ESTAT_METADATA_CS
34
Metadata Set: Structure
References to : a Metadata Structure Definition (MSD) a Report Structure a Target Identifier Defines: The actual values of the target objects Comprises: The Reported Attributes and their corresponding Values These Attributes may be: coded text date/time number etc. Full Target Identifier: DATA_PROVIDER; TIME_PERIOD; DATAFLOW
35
Metadata Set – General Schematic
CATEGORY_REPORT CONTACT BASIC_METH_ISSUES CONTACT_ORG POS_ACC_TOUR Unit G3 Short-term statistics; tourism CONTACT_ORG_UNIT Eurostat, Statistical Office of the European Communities CONTACT_MAIL_ADDRESS
36
Metadata Set – Metadata file
The SDMX-ML for this, reported in the generic format for a Metadata Set is displayed here.
37
Metadata Set – ESMS example
Metadata reports are structured using a set of main Concepts. In this example, taken from the Eurostat standard metadata format, these concepts and sub-concepts represent the metadata elements for which a text needs to be reported. In the example, “Contact organisation” has the value “ Statistical Office of the European Communities (Eurostat)” 37 37
38
Metadata Set – ESMS Example
In our example, the concept used in the metadata report can be easily retrieved into the metadataset: Attribute “Contact” with its three sub-concepts: Organisation, organisation unit and address Attribute “Metadata update” with its three sub-concepts: last certified, last posted, and last updated. 38 38
39
Questions?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.