1
2. An overview of SDMX (What is SDMX? Part I)
Edward Cook
Eurostat Unit B5: “Central data and metadata services”
SDMX Basics course, 1-2 March 2017
2
A typical production chain
Data collection is little different from the production of other goods
3
What are the key features?
Diagram labels: PRODUCER, MANUFACTURER, CONTRACT
SPECIFICATIONS:
- Type of fruit (oranges)
- Dimensions of the box
- Number of fruits per box
4
What are the key features?
All the details of the contract are stored in the company offices to be checked by both parties.
Diagram labels: PRODUCER, MANUFACTURER, SPECIFICATIONS, CONTRACT, COMPANY OFFICE
5
In SDMX …
Diagram labels: DATA PRODUCER, DATA CONSUMER, PROVISION AGREEMENT, DATAFLOW, DATA STRUCTURE DEFINITION, SDMX REGISTRY
6
What is SDMX in more technical terms?
7
What is SDMX?
- A model to describe statistical data and metadata: statisticians agree to use common descriptions and guidelines.
- A standard for automated communication from machine to machine: driven by these common descriptors.
- A technology supporting standardised IT tools: developed as wide-ranging open source software for all to reuse.
8
Presentation of SDMX
- The SDMX Information Model: what is the information model underlying the data and metadata exchange between the partners?
- Content-oriented guidelines: how to increase interoperability and statistical harmonisation?
- IT architecture for data exchange: how to exchange the data?
9
The information model
10
The Information Model:
… is a representation of concepts, relationships, constraints, rules and operations.
… is a formal way to:
- express and design information needs
- communicate with IT people
- give specifications to reporting agents
- document the system
- drive the software
How many of you have built kit models of airplanes or cars? The model was the example to follow: it showed where to glue the wheels, put in the windows and attach the wings. The Information Model is similar: it is a schema of SDMX objects and their relationships. It is an abstract, formal representation that specifies relations between kinds of things as well as individual things.
11
What things does SDMX need to model?
SDMX needs to model three things:
- Statistical data, through descriptor concepts. These concepts can be further classified into dimensions, attributes and measures.
- Metadata: structural metadata (such as concept names) and reference (or explanatory) metadata.
- Data exchange processes.
Reference metadata, which are generally in a textual format, describe the contents and quality of the data from a semantic point of view. They include explanatory texts on the context of the statistical data, methodologies for data collection and data aggregation, as well as quality and dissemination characteristics.
Data exchange processes cover how to take data and metadata under a source schema and transform them into data and metadata structured under a target schema.
12
Modelling statistical data
13
Modelling structural metadata
Data Structure Definition (DSD):
- Identification of dimensions, attributes and measures
- Use of common code lists
- Integration into concept schemes
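The role of a DSD can be illustrated with a minimal Python sketch. It is not the SDMX information model itself: the class names, the `series_key` helper and the identifier `DEMO_DSD` are assumptions made for illustration, and the code lists are small illustrative subsets.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CodeList:
    """A named set of allowed codes (hypothetical helper class)."""
    id: str
    codes: Dict[str, str]  # code -> human-readable label


@dataclass(frozen=True)
class Component:
    """A concept used in the structure, optionally tied to a code list."""
    concept_id: str
    codelist: Optional[CodeList] = None  # None means the value is uncoded


@dataclass(frozen=True)
class DataStructureDefinition:
    id: str
    dimensions: Tuple[Component, ...]  # identify an observation
    attributes: Tuple[Component, ...]  # qualify an observation
    measure: Component                 # the observed value

    def series_key(self, values: Dict[str, str]) -> str:
        """Check each dimension value and join the codes in DSD order."""
        parts = []
        for dim in self.dimensions:
            code = values[dim.concept_id]
            if dim.codelist is not None and code not in dim.codelist.codes:
                raise ValueError(f"{code!r} is not in code list {dim.codelist.id}")
            parts.append(code)
        return ".".join(parts)


# Illustrative subsets only, not the full code lists.
cl_freq = CodeList("CL_FREQ", {"A": "Annual", "Q": "Quarterly", "M": "Monthly"})
cl_area = CodeList("CL_AREA", {"DE": "Germany", "FR": "France"})

dsd = DataStructureDefinition(
    id="DEMO_DSD",  # hypothetical identifier
    dimensions=(Component("FREQ", cl_freq), Component("REF_AREA", cl_area)),
    attributes=(Component("OBS_STATUS"),),
    measure=Component("OBS_VALUE"),
)

print(dsd.series_key({"FREQ": "A", "REF_AREA": "FR"}))  # prints A.FR
```

Because both parties hold the same structure, a value outside the agreed code list (for example `REF_AREA` = `XX`) is rejected before any data are exchanged.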
14
Modelling reference metadata
- Quality descriptions
- Process descriptions
- Methodological descriptions
- Administrative descriptions
So much descriptive information. It needs to be expressed in a common, standard way.
15
The standard way is the Metadata Structure Definition (MSD)
A Metadata Structure Definition describes how metadata sets containing reference metadata are organised. In particular, it defines:
- which metadata concepts are being exchanged;
- how these concepts relate to each other;
- how they are represented (free text or coded values);
- with which object types (agencies, data flows, data providers, subsets of data flows, or others) they are associated.
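The four points above can be sketched in Python. This is a simplified illustration under stated assumptions, not the SDMX metadata model: the class names, the `check_report` helper, the identifier `DEMO_MSD` and the choice of concepts are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class MetadataAttribute:
    concept_id: str                            # which concept is exchanged
    parent: Optional[str] = None               # how concepts relate (simple hierarchy)
    codelist: Optional[FrozenSet[str]] = None  # None means free text


@dataclass(frozen=True)
class MetadataStructureDefinition:
    id: str
    target_types: Tuple[str, ...]  # object types a report may be attached to
    attributes: Tuple[MetadataAttribute, ...]

    def check_report(self, target_type, report):
        """Return a list of problems found in a report (empty if none)."""
        problems = []
        if target_type not in self.target_types:
            problems.append(f"cannot attach metadata to {target_type!r}")
        known = {a.concept_id: a for a in self.attributes}
        for concept_id, value in report.items():
            attr = known.get(concept_id)
            if attr is None:
                problems.append(f"unknown concept {concept_id}")
            elif attr.codelist is not None and value not in attr.codelist:
                problems.append(f"{value!r} is not an allowed code for {concept_id}")
        return problems


msd = MetadataStructureDefinition(
    id="DEMO_MSD",  # hypothetical identifier
    target_types=("dataflow", "data provider"),
    attributes=(
        MetadataAttribute("CONTACT"),
        MetadataAttribute("CONTACT_ORGANISATION", parent="CONTACT"),
        MetadataAttribute("CONF_STATUS", codelist=frozenset({"F", "C"})),
    ),
)

report = {"CONTACT_ORGANISATION": "Eurostat", "CONF_STATUS": "X"}
for problem in msd.check_report("dataflow", report):
    print(problem)
```

Here the free-text concept accepts any value, while the coded concept reports `'X'` as a problem because it is not in the (illustrative) set of allowed codes.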
16
Modelling reference metadata in SDMX
17
THE CONTENT-ORIENTED GUIDELINES
18
Content-oriented guidelines
The content-oriented guidelines are a set of recommendations within the scope of the SDMX standard, intended to maximise interoperability.
The SDMX standards:
- provide essential support to statisticians;
- maximise the amount of information passed through to users;
- allow the process to be automated;
- allow web-service queries.
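To give an idea of what a web-service query can look like, the sketch below builds a data query URL. It assumes the path pattern of the SDMX RESTful API (`/data/{flow}/{key}` with `startPeriod` and `endPeriod` parameters, and a key made of dimension values separated by dots); the base URL, the dataflow `DEMO_FLOW` and the function name are hypothetical, and the exact syntax should be checked against the specification of the service you use.

```python
from urllib.parse import urlencode


def build_data_query(base_url, flow_ref, key_parts, **params):
    """Assemble a data query URL; an empty key part leaves that dimension open."""
    key = ".".join(key_parts)
    url = f"{base_url.rstrip('/')}/data/{flow_ref}/{key}"
    if params:
        url += "?" + urlencode(params)
    return url


url = build_data_query(
    "https://example.org/sdmx/rest",  # hypothetical endpoint
    "DEMO_FLOW",                      # hypothetical dataflow
    ["A", "FR", ""],                  # annual, France, any value for the third dimension
    startPeriod="2015",
    endPeriod="2016",
)
print(url)
# prints https://example.org/sdmx/rest/data/DEMO_FLOW/A.FR.?startPeriod=2015&endPeriod=2016
```

Because the key is built from agreed dimension codes, the same query shape works against any service that shares the data structure.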
19
There are three main areas in the content-oriented guidelines:
- Statistical subject-matter domains.
- Cross-domain concepts (and code lists).
- A Metadata Common Vocabulary.
20
Statistical subject-matter domains
Statistical subject-matter domains are a high-level classification of statistical areas. They refer to statistical activities that share common characteristics with respect to variables, concepts and methodologies for data collection. Examples: price statistics, national accounts, environment statistics and education statistics. The classification is intended to cover the universe of official statistics.
21
Cross-domain concepts
Cross-domain concepts are a list of statistical concepts related to statistical processes and data quality. The list is based on the concepts used by the contributing international organisations. The concepts can be used on the data side as well as on the metadata side.
22
Example of cross-domain concept
23
A cross-domain concept may have a code list as its representation.
This means that the concept might take a limited set of possible values enumerated in its corresponding code list. The code lists associated with cross-domain concepts are called cross-domain code lists.
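The difference between a coded and an uncoded concept can be shown with a short Python sketch. The codes listed are an illustrative subset, and the `REPRESENTATIONS` table and `describe` function are hypothetical helpers, not part of any SDMX tool.

```python
# Illustrative subset of a code list for observation status.
CL_OBS_STATUS = {
    "A": "Normal value",
    "E": "Estimated value",
    "P": "Provisional value",
}

# Concept id -> code list; None marks a concept represented as free text.
REPRESENTATIONS = {
    "OBS_STATUS": CL_OBS_STATUS,
    "COMMENT": None,
}


def describe(concept_id, value):
    """Return the label of a coded value, or the value itself if uncoded."""
    codelist = REPRESENTATIONS[concept_id]
    if codelist is None:
        return value
    if value not in codelist:
        raise ValueError(f"{value!r} is not a valid code for {concept_id}")
    return codelist[value]


print(describe("OBS_STATUS", "P"))         # prints Provisional value
print(describe("COMMENT", "Break in series"))  # prints Break in series
```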
25
Metadata Common Vocabulary
The Metadata Common Vocabulary (MCV) is a vocabulary that recommends a common terminology to be used in order to facilitate communication and understanding.
The MCV is closely linked to the cross-domain concepts, as it also contains all these concepts, stating their definitions and context descriptions.
26
Example of Metadata Common Vocabulary
27
IT Architecture for data exchange
28
Standard formats for the exchange of data and metadata.
SDMX-ML
Architectures for data exchange:
- Push
- Pull
- Data hub
SDMX tools
29
Producer can push them to the manufacturer…
Diagram labels: GET SPECIFICATIONS, PREPARE, PUSH
30
Push mode
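A minimal in-memory Python sketch of push mode follows. It only illustrates who initiates the transfer; the class and method names are assumptions, and no real transport or SDMX-ML message is involved.

```python
class Consumer:
    """Passive receiver: it only accepts what providers send."""

    def __init__(self):
        self.inbox = []

    def receive(self, message):
        self.inbox.append(message)


class Provider:
    def __init__(self, agreed_fields):
        # Get specifications: the record layout agreed with the consumer.
        self.agreed_fields = tuple(agreed_fields)

    def prepare(self, rows):
        # Prepare: check every record against the agreed layout.
        for row in rows:
            missing = [f for f in self.agreed_fields if f not in row]
            if missing:
                raise ValueError(f"record {row} lacks fields {missing}")
        return {
            "fields": self.agreed_fields,
            "rows": [tuple(row[f] for f in self.agreed_fields) for row in rows],
        }

    def push(self, consumer, rows):
        # Push: the provider initiates the transfer.
        consumer.receive(self.prepare(rows))


consumer = Consumer()
provider = Provider(["REF_AREA", "TIME_PERIOD", "OBS_VALUE"])
provider.push(consumer, [{"REF_AREA": "FR", "TIME_PERIOD": "2016", "OBS_VALUE": 1.5}])
print(len(consumer.inbox))  # prints 1
```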
31
Manufacturer can go and collect the oranges…
Diagram labels: PREPARE, SEND NOTIFICATION, GOODS ARE READY, PULL
32
Pull mode
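In pull mode the roles are reversed: the provider makes the data available and announces it, and the consumer fetches it. The Python sketch below simulates this in memory; the class names, the callback mechanism and the dataflow `DEMO_FLOW` are assumptions for illustration only.

```python
class Provider:
    def __init__(self):
        self._published = {}    # dataflow id -> list of records
        self._subscribers = []  # callbacks to notify

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, dataflow_id, rows):
        # Prepare the data at the provider's own end...
        self._published[dataflow_id] = list(rows)
        # ...then send a notification that the goods are ready.
        for notify in self._subscribers:
            notify(dataflow_id)

    def query(self, dataflow_id):
        return list(self._published[dataflow_id])


class Consumer:
    def __init__(self, provider):
        self.provider = provider
        self.store = {}
        provider.subscribe(self.on_notification)

    def on_notification(self, dataflow_id):
        # Pull: the consumer initiates the transfer.
        self.store[dataflow_id] = self.provider.query(dataflow_id)


provider = Provider()
consumer = Consumer(provider)
provider.publish("DEMO_FLOW", [{"TIME_PERIOD": "2016", "OBS_VALUE": 1.5}])
print(consumer.store["DEMO_FLOW"])
```

The notification carries only the identifier of what is ready; the data themselves travel when the consumer asks for them.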
33
In some cases, the final client can get the oranges directly from the producer…
Diagram steps: 1 REQUEST, 2 REQUEST, 3 PREPARE, 4 SEND
34
Data Hub
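The data hub idea can be sketched in Python as a router that stores no data of its own and forwards each client request to the providers registered for a dataflow. The class names, the `source` field and the dataflow `DEMO_FLOW` are hypothetical; a real hub would translate and dispatch SDMX queries over the network.

```python
class Provider:
    def __init__(self, name, datasets):
        self.name = name
        self._datasets = datasets  # dataflow id -> list of records

    def query(self, dataflow_id):
        # Prepare and send: data are extracted only when asked for.
        rows = self._datasets.get(dataflow_id, [])
        return [dict(row, source=self.name) for row in rows]


class DataHub:
    """Holds routing information only; the data stay with the providers."""

    def __init__(self):
        self._routes = {}  # dataflow id -> list of providers

    def register(self, dataflow_id, provider):
        self._routes.setdefault(dataflow_id, []).append(provider)

    def request(self, dataflow_id):
        # The client's request is passed on to each registered provider.
        results = []
        for provider in self._routes.get(dataflow_id, []):
            results.extend(provider.query(dataflow_id))
        return results


hub = DataHub()
fr = Provider("FR", {"DEMO_FLOW": [{"TIME_PERIOD": "2016", "OBS_VALUE": 1.5}]})
de = Provider("DE", {"DEMO_FLOW": [{"TIME_PERIOD": "2016", "OBS_VALUE": 2.0}]})
hub.register("DEMO_FLOW", fr)
hub.register("DEMO_FLOW", de)

rows = hub.request("DEMO_FLOW")
print([row["source"] for row in rows])  # prints ['FR', 'DE']
```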
35
SDMX tools
Eurostat tools at our SDMX Info Space:
- SDMX Data Structure Wizard (used to create, edit and test SDMX artefacts).
- SDMX Converter (converts data files between SDMX formats and other file formats).
- ESS Metadata Handler.
- SDMX Reference Infrastructure (SDMX-RI) (a set of tools for connecting your IT systems to the SDMX world).
- SDMX Mapping Assistant (maps and transcodes the contents of an existing database to SDMX data structures).
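As an illustration of what mapping and transcoding mean, the Python sketch below renames local columns to concept identifiers and translates local values into codes. It does not show how any of the tools above work internally; the mapping tables, the local column names and the function name are all hypothetical.

```python
# Mapping: local column name -> concept identifier.
COLUMN_MAP = {
    "country": "REF_AREA",
    "periodicity": "FREQ",
    "year": "TIME_PERIOD",
    "value": "OBS_VALUE",
}

# Transcoding: local value -> code, for coded concepts only.
CODE_MAP = {
    "REF_AREA": {"Germany": "DE", "France": "FR"},
    "FREQ": {"yearly": "A", "monthly": "M"},
}


def to_sdmx_record(local_row):
    """Rename columns and transcode coded values of one local record."""
    record = {}
    for column, value in local_row.items():
        concept = COLUMN_MAP[column]
        codes = CODE_MAP.get(concept)
        if codes is None:
            record[concept] = value          # uncoded: pass through
        elif value in codes:
            record[concept] = codes[value]   # coded: translate
        else:
            raise ValueError(f"no code for {value!r} in {concept}")
    return record


local = {"country": "France", "periodicity": "yearly", "year": "2016", "value": 1.5}
print(to_sdmx_record(local))
# prints {'REF_AREA': 'FR', 'FREQ': 'A', 'TIME_PERIOD': '2016', 'OBS_VALUE': 1.5}
```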