2. An overview of SDMX (What is SDMX? Part I)


1 2. An overview of SDMX (What is SDMX? Part I)
Edward Cook Eurostat Unit B5: “Central data and metadata services” SDMX Basics course, March 2016

2 What is SDMX?
- A model to describe statistical data and metadata, for which statisticians agree to use common descriptions and guidelines.
- A standard for automated communication from machine to machine, driven by these common descriptors.
- A technology supporting standardised IT tools, for all to reuse, developed as wide-ranging open source software.

3 Presentation of SDMX
- The SDMX information model: what is the information model underlying the data and metadata exchange between the partners?
- Content-oriented guidelines: how to increase interoperability and statistical harmonisation?
- IT architecture for data exchange: how to exchange the data?

4 THE INFORMATION MODEL

5 The Information Model:
… is a representation of concepts, relationships, constraints, rules and operations.
… is a formal way to:
- express and design information needs
- communicate with IT people
- give specifications to reporting agents
- document the system
- drive the software

6 What things does SDMX need to model?
- Statistical data: modelled through descriptor concepts. These concepts can be further classified into dimensions, attributes and measures.
- Metadata:
  - structural metadata (such as concept names)
  - reference (or explanatory) metadata
- Data exchange processes

7 Modelling statistical data in SDMX

8

9 Key SDMX object concerning data:
Data Structure Definition (DSD)
- Identification of dimensions, attributes and measures
- Use of common code lists
- Integration into concept schemes

10 DSD example

11 Data Structure Definition (DSD)
[Diagram: structural metadata identifies and describes the data. The DSD defines the dataset structure through dimensions (e.g. country, variable/topic, year), attributes (e.g. unit of measure; metadata about an individual value, a time series or a group of time series) and code lists.]
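The role of a DSD can be illustrated with a minimal Python sketch. It is not the SDMX Information Model itself: the class names, the identifiers (COUNTRY, TOPIC, YEAR, UNIT_MEASURE, DEMO_DSD) and the short code lists are assumptions made for illustration, loosely following the examples on the slide (country, variable/topic, year, unit of measure).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A dimension or attribute: a concept plus an optional code list."""
    concept_id: str
    codelist: tuple = ()  # allowed codes; empty tuple means "uncoded"


@dataclass
class DataStructureDefinition:
    """Hypothetical, simplified stand-in for an SDMX DSD."""
    dsd_id: str
    dimensions: list
    attributes: list
    measure: str = "OBS_VALUE"

    def validate(self, observation: dict) -> list:
        """Return the problems found in one observation (empty if valid)."""
        problems = []
        # Every dimension must be present and, if coded, use a listed code.
        for dim in self.dimensions:
            value = observation.get(dim.concept_id)
            if value is None:
                problems.append(f"missing dimension {dim.concept_id}")
            elif dim.codelist and value not in dim.codelist:
                problems.append(f"{dim.concept_id}={value!r} not in code list")
        # Attributes are treated as optional here, but coded ones are checked.
        for att in self.attributes:
            value = observation.get(att.concept_id)
            if value is not None and att.codelist and value not in att.codelist:
                problems.append(f"{att.concept_id}={value!r} not in code list")
        if self.measure not in observation:
            problems.append(f"missing measure {self.measure}")
        return problems


# Illustrative DSD: all identifiers and codes below are made up.
dsd = DataStructureDefinition(
    dsd_id="DEMO_DSD",
    dimensions=[
        Component("COUNTRY", ("DE", "FR", "IT")),
        Component("TOPIC", ("POP", "GDP")),
        Component("YEAR"),
    ],
    attributes=[Component("UNIT_MEASURE", ("UNITS", "MILLION"))],
)

good = {"COUNTRY": "DE", "TOPIC": "POP", "YEAR": "2015",
        "OBS_VALUE": 81.2, "UNIT_MEASURE": "MILLION"}
bad = {"COUNTRY": "XX", "TOPIC": "POP", "OBS_VALUE": 1}

print(dsd.validate(good))
print(dsd.validate(bad))
```

The point of the sketch is that the structure is defined once and every dataset is checked against it, which is what makes automated exchange between partners possible.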

12 Modelling reference metadata in SDMX
- Quality descriptions
- Process descriptions
- Methodological descriptions
- Administrative descriptions
So much descriptive information. It needs to be expressed in a common, standard way.

13 The standard way is the Metadata Structure Definition (MSD)
A Metadata Structure Definition describes how metadata sets containing reference metadata are organised. In particular, it defines:
- which metadata concepts are being exchanged;
- how these concepts relate to each other;
- how they are represented (free text or coded values);
- with which object types (agencies, data flows, data providers, subsets of data flows, or others) they are associated.
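The four things an MSD defines can be sketched in a few lines of Python. This is a simplified illustration, not the SDMX schema: the class names, the concept identifiers (QUALITY, TIMELINESS, CONF_STATUS), the codes and the target type names are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataAttribute:
    """One reference-metadata concept in a hypothetical MSD."""
    concept_id: str
    codes: Optional[frozenset] = None  # None means free text
    parent: Optional[str] = None       # how concepts relate (hierarchy)


@dataclass
class MetadataStructureDefinition:
    msd_id: str
    target_types: frozenset  # object types the metadata may attach to
    attributes: tuple        # which concepts are exchanged

    def check_report(self, target_type: str, report: dict) -> list:
        """Return the problems found in one metadata report (empty if valid)."""
        problems = []
        if target_type not in self.target_types:
            problems.append(f"target type {target_type!r} not allowed")
        known = {a.concept_id: a for a in self.attributes}
        for concept_id, value in report.items():
            attribute = known.get(concept_id)
            if attribute is None:
                problems.append(f"unknown concept {concept_id}")
            elif attribute.codes is not None and value not in attribute.codes:
                problems.append(f"{concept_id}={value!r} not in code list")
        return problems


# Illustrative MSD: identifiers and codes are made up for this sketch.
msd = MetadataStructureDefinition(
    msd_id="DEMO_MSD",
    target_types=frozenset({"Dataflow", "DataProvider"}),
    attributes=(
        MetadataAttribute("QUALITY"),                       # free text
        MetadataAttribute("TIMELINESS", parent="QUALITY"),  # child concept
        MetadataAttribute("CONF_STATUS", codes=frozenset({"F", "C"})),
    ),
)

print(msd.check_report("Dataflow", {"QUALITY": "See report.", "CONF_STATUS": "F"}))
print(msd.check_report("Codelist", {"CONF_STATUS": "Z"}))
```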

14 Modelling reference metadata in SDMX

15

16 THE CONTENT-ORIENTED GUIDELINES

17 Content-oriented guidelines
The content-oriented guidelines are a set of recommendations within the scope of the SDMX standard, intended to maximise interoperability. The SDMX standards:
- provide essential support to statisticians;
- maximise the amount of information passed through to users;
- allow the process to be automated;
- allow web-service queries.

18 There are three main areas in the content-oriented guidelines:
- Statistical subject-matter domains.
- Cross-domain concepts (and code lists).
- A Metadata Common Vocabulary.

19 Statistical subject-matter domains
The statistical subject-matter domains form a high-level classification of statistical areas. They refer to statistical activities that have common characteristics with respect to variables, concepts and methodologies for data collection. Examples: price statistics, national accounts, environment statistics or education statistics. The classification is intended to cover the universe of official statistics.

20 Functions of the classification of statistical domains.
- A standard against which the domain lists of national and international organisations can be mapped, to facilitate the exchange of data and metadata.
- An identifier for registering and searching statistical data on SDMX registries. When artefacts are stored, there has to be a clear way to retrieve them; the classification acts as a kind of catalogue index.
- A navigation aid for the identification and organisation of corresponding domain groups. These groups comprise organisations, working parties, expert groups, task forces, inter-secretariat working groups, UN city groups, etc., that are responsible for developing statistical guidelines and recommendations and for identifying best practice for statistics falling within the scope of a particular statistical domain.
Working with the UNECE framework should facilitate identifying current or potential participants in various subject-matter domain groups. In particular, one of the objectives of the UNECE framework is the promotion of close co-ordination of statistical activities among international organisations active in the UNECE region. It achieves this close co-ordination by providing an extensive list of the domain groups,

21

22 Cross-domain concepts
Cross-domain concepts are a list of statistical concepts related to statistical processes and data quality. The list is based on the concepts used by the contributing international organisations. The concepts can be used on the data side as well as on the metadata side.

23

24 Examples of cross-domain concept

25 Examples of cross-domain concept

26 A cross-domain concept may have a code list as its representation.
This means that the concept might take a limited set of possible values enumerated in its corresponding code list. The code lists associated with cross-domain concepts are called cross-domain code lists.

27 Code lists have a general description, a list of codes, their descriptions, and annotations that provide additional information on the codes. Examples of cross-domain concepts and code lists:
- FREQ and its associated code list CL_FREQ.
- SEX and its associated code list CL_SEX.
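The link between a concept and its code list can be sketched as a simple lookup. The codes below are a small illustrative subset and are an assumption of this sketch; the official cross-domain code lists should be consulted for the full and current content. The helper name decode is hypothetical.

```python
# Illustrative subsets only; not the complete official code lists.
CL_FREQ = {"A": "Annual", "Q": "Quarterly", "M": "Monthly"}
CL_SEX = {"F": "Female", "M": "Male", "T": "Total"}

# Each concept is represented by exactly one code list.
CODELIST_FOR_CONCEPT = {"FREQ": CL_FREQ, "SEX": CL_SEX}


def decode(concept: str, code: str) -> str:
    """Return the description of a code, checked against the concept's code list."""
    codelist = CODELIST_FOR_CONCEPT[concept]
    if code not in codelist:
        raise ValueError(f"{code!r} is not a valid code for {concept}")
    return codelist[code]


# The same code can mean different things under different concepts,
# which is why a value is always read together with its concept.
print(decode("FREQ", "M"))
print(decode("SEX", "M"))
```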

28

29

30 Metadata Common Vocabulary
The Metadata Common Vocabulary (MCV) is a vocabulary that recommends a common terminology to be used in order to facilitate communication and understanding. The MCV is closely linked to the cross-domain concepts, as it also contains all of these concepts, stating their definitions and context descriptions.

31 Metadata common vocabulary
The MCV covers a selected range of metadata concepts:
- General metadata concepts.
- Metadata terms describing statistical methodologies and data quality.
- Terms referring specifically to data and metadata exchange.

32 Examples of Metadata Common Vocabulary

33 Examples of Metadata Common Vocabulary

34 IT Architecture for data exchange

35 IT architecture for data exchange: overview
- Standard formats for the exchange of data and metadata: SDMX-EDI, SDMX-ML
- Architectures for data exchange: Push, Pull, Data hub
- SDMX tools

36 Push mode

37 Pull mode
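In pull mode the data collector fetches data from a web service that the provider exposes. As a hedged sketch, the snippet below builds a data query URL following the path convention of the SDMX RESTful API (data/{dataflow}/{key}, with a dot-separated key in dimension order, "+" for alternatives and an empty position as a wildcard). The base URL, the dataflow identifier DEMO_FLOW, the dimension order and the helper names are assumptions for illustration; no request is actually sent.

```python
from urllib.parse import urlencode


def series_key(dimension_order, selection):
    """Build a dot-separated key; unselected dimensions are left empty (wildcard)."""
    return ".".join("+".join(selection.get(dim, [])) for dim in dimension_order)


def data_query_url(base_url, dataflow, key, **params):
    """Assemble a data query URL; params become the query string."""
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical dimension order, as it would be taken from the DSD.
dimension_order = ["FREQ", "COUNTRY", "SEX"]
key = series_key(dimension_order, {"FREQ": ["A"], "COUNTRY": ["DE", "FR"]})

url = data_query_url("https://example.org/sdmx/rest/", "DEMO_FLOW", key,
                     startPeriod="2010")
print(key)
print(url)
```

Because the key is derived from the dimension order of the DSD, the collector can construct queries mechanically once the structure is known.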

38 Data Hub

39 SDMX tools
Eurostat tools at our SDMX Info Space:
- SDMX Registry: a central repository for storing and sharing SDMX artefacts.
- SDMX Data Structure Wizard: used to create, edit and test SDMX artefacts.
- SDMX Converter: converts data files between SDMX formats and other file formats.
- SDMX Reference Infrastructure (SDMX-RI): a set of tools for connecting your IT systems to the SDMX world.
- SDMX Mapping Assistant: maps and transcodes the contents of an existing database to SDMX data structures.

40 SDMX tools Other tools available in the community (www.sdmxtools.org)

41 Questions?

