Eurostat Unit B3 – IT and standards for data and metadata exchange. SDMX for IT Experts: Mapping Assistant. Jorge Nunes, Raynald Palmieri. 18-20 February 2014.
Table of Contents: Mapping Assistant; Introduction; Mapping Store DB; Importing SDMX structure files; Configuring mappings; Dissemination Databases; Databases in practice; Storage Schemes: A, B, C, D; Hands-on use of Mapping Assistant.
Dissemination Databases: What does a DDB contain? Data: observations, attribute values, dimension values. Structural metadata: codelists, dataflows. Reference metadata: creation metadata, processing metadata.
Dissemination Databases: Extracting data in SDMX. [Figure: diagram with the labels SDMX Queries, SDMX Infrastructure, SQL, SDMX Output; Metadata, Headers, DSD Components, Codelists, Annotations, DataFlows; Local Tables, Local Columns; SSTSRTD_PROD_M, SSTSRTD_IND_M, SSTSRTD_TOVT_Q.]
Databases in practice: Relational DDB issues. Structural: many inherently different storage schemes. Content: transcoding; date and time formats (text, datetime); fragmented information in multiple columns; other data in the same tables not related to the requested transmission.
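The date/time and fragmented-column issues can be illustrated with a minimal Python sketch. It assumes two hypothetical local layouts that are not taken from the slides: a text column holding dates as DD/MM/YYYY, and a pair of separate year and quarter columns. Both are normalised to SDMX-style time periods (YYYY-MM and YYYY-Qn). The function names are illustrative only.

```python
from datetime import datetime


def month_period_from_text(date_text: str) -> str:
    """Convert a text date stored as DD/MM/YYYY into a monthly period YYYY-MM."""
    parsed = datetime.strptime(date_text, "%d/%m/%Y")
    return f"{parsed.year:04d}-{parsed.month:02d}"


def quarter_period_from_columns(year: int, quarter: int) -> str:
    """Join separate year and quarter columns into a quarterly period YYYY-Qn."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1..4, got {quarter}")
    return f"{year:04d}-Q{quarter}"


# Hypothetical values as they might be stored in a DDB.
monthly = month_period_from_text("18/02/2014")
quarterly = quarter_period_from_columns(2013, 4)
print(monthly, quarterly)
```

A mapping tool has to apply this kind of conversion in both directions: when writing SDMX output and when translating the time constraints of an SDMX query into local SQL.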
Databases in practice: Requirements. Advanced mapping, capable of handling many relational DDB storage schemes. Data retrieval using the performed mappings. SDMX-ML generation.
Storage Scheme A (simple). DSD Id: CENSUS_HUB; Version: 1.1; Agency: Eurostat; Dimensions: AGE, CAS, GEO, SEX; measure: OBS_VALUE. Single table with one column per component: AGE, CAS, GEO, SEX, OBS_VALUE. Values shown: 001 GR 100 002 200 TOT 300 IT 150 250 400.
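Storage Scheme A is the case where a plain SELECT is enough. The following is a minimal sketch using an in-memory SQLite database. The table name CENSUS_A and the AGE value "TOT" are assumptions made for illustration, and the complete rows are reconstructed from the partial values on the slide.

```python
import sqlite3

# Hypothetical DDB table laid out as in Storage Scheme A: one column per
# DSD component. Table name and the AGE value "TOT" are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CENSUS_A "
    "(AGE TEXT, CAS TEXT, GEO TEXT, SEX TEXT, OBS_VALUE INTEGER)"
)
conn.executemany(
    "INSERT INTO CENSUS_A VALUES (?, ?, ?, ?, ?)",
    [
        ("TOT", "001", "GR", "001", 100),
        ("TOT", "001", "GR", "002", 200),
        ("TOT", "001", "GR", "TOT", 300),
        ("TOT", "001", "IT", "001", 150),
        ("TOT", "001", "IT", "002", 250),
        ("TOT", "001", "IT", "TOT", 400),
    ],
)


def fetch_observations(connection, geo):
    """Return the observations for one GEO code as a list of dicts."""
    cursor = connection.execute(
        "SELECT AGE, CAS, GEO, SEX, OBS_VALUE FROM CENSUS_A "
        "WHERE GEO = ? ORDER BY SEX",
        (geo,),
    )
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


greek_rows = fetch_observations(conn, "GR")
for row in greek_rows:
    print(row)
```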
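Storage Scheme B. DSD Id: CENSUS_HUB; Version: 1.1; Agency: Eurostat; Dimensions: AGE, CAS, GEO, SEX; measure: OBS_VALUE. Table columns: AGE, CAS, GEO, 001, 002, TOT (one value column per SEX code). Rows shown (GEO, then the three value columns): GR 100 200 300; IT 150 250 400.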
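In Storage Scheme B the SEX codes appear as column headers, so the table has to be unpivoted before it matches the DSD. One possible way, sketched here with SQLite, is a UNION ALL with one branch per value column. The table name CENSUS_B and the AGE and CAS values are assumptions, not taken from the slide.

```python
import sqlite3

SEX_CODES = ["001", "002", "TOT"]  # value columns of the cross-tab

conn = sqlite3.connect(":memory:")
# Column names that start with a digit must be quoted as identifiers.
conn.execute(
    "CREATE TABLE CENSUS_B (AGE TEXT, CAS TEXT, GEO TEXT, "
    '"001" INTEGER, "002" INTEGER, "TOT" INTEGER)'
)
conn.executemany(
    "INSERT INTO CENSUS_B VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("TOT", "001", "GR", 100, 200, 300),
        ("TOT", "001", "IT", 150, 250, 400),
    ],
)


def build_unpivot_query(sex_codes):
    """Build one SELECT per value column and glue them with UNION ALL."""
    selects = [
        f"SELECT AGE, CAS, GEO, '{code}' AS SEX, \"{code}\" AS OBS_VALUE "
        "FROM CENSUS_B"
        for code in sex_codes
    ]
    return " UNION ALL ".join(selects) + " ORDER BY GEO, SEX"


unpivoted = conn.execute(build_unpivot_query(SEX_CODES)).fetchall()
for row in unpivoted:
    print(row)
```

The codes are interpolated into the SQL text only because they come from a fixed list defined in the script; codes from an untrusted source would need validation first.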
Storage Scheme C. DSD Id: CENSUS_HUB; Version: 1.1; Agency: Eurostat; Dimensions: AGE, CAS, GEO, SEX; measure: OBS_VALUE. “Master” table, columns CAS, GEO, TABLE; rows shown: 001 IT IT_001_TABLE; GR GR_001_TABLE; … “Slave” tables (FR_006_TABLE, GB_002_TABLE, IT_001_TABLE, GR_001_TABLE), each with columns AGE, SEX, OBS_VALUE, …; values shown for two of them: 001 100, 002 200, TOT 300 and 001 150, 002 250, TOT 400.
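Retrieval from Storage Scheme C needs two steps: read the “Master” table, then query each “Slave” table it names. The sketch below uses SQLite and keeps the column name TABLE from the slide, quoted because it is a reserved word. The master table name MASTER, the AGE value "TOT", and the assignment of the observation values to the GR and IT tables are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE MASTER (CAS TEXT, GEO TEXT, "TABLE" TEXT)')
conn.executemany(
    "INSERT INTO MASTER VALUES (?, ?, ?)",
    [("001", "IT", "IT_001_TABLE"), ("001", "GR", "GR_001_TABLE")],
)

# Illustrative slave-table contents (AGE, SEX, OBS_VALUE).
slave_data = {
    "GR_001_TABLE": [("TOT", "001", 100), ("TOT", "002", 200), ("TOT", "TOT", 300)],
    "IT_001_TABLE": [("TOT", "001", 150), ("TOT", "002", 250), ("TOT", "TOT", 400)],
}
for table_name, rows in slave_data.items():
    conn.execute(
        f'CREATE TABLE "{table_name}" (AGE TEXT, SEX TEXT, OBS_VALUE INTEGER)'
    )
    conn.executemany(f'INSERT INTO "{table_name}" VALUES (?, ?, ?)', rows)


def fetch_observations(connection):
    """Follow each master row to its slave table and rebuild full records."""
    existing = {
        name
        for (name,) in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    master_rows = connection.execute(
        'SELECT CAS, GEO, "TABLE" FROM MASTER ORDER BY GEO'
    ).fetchall()
    observations = []
    for cas, geo, table_name in master_rows:
        # Only query tables that really exist; the name comes from data.
        if table_name not in existing:
            raise ValueError(f"Master table references unknown table {table_name!r}")
        query = f'SELECT AGE, SEX, OBS_VALUE FROM "{table_name}" ORDER BY SEX'
        for age, sex, value in connection.execute(query):
            observations.append((age, cas, geo, sex, value))
    return observations


observations = fetch_observations(conn)
for record in observations:
    print(record)
```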
Storage Scheme D. DSD Id: CENSUS_HUB; Version: 1.1; Agency: Eurostat; Dimensions: AGE, CAS, GEO, SEX; measure: OBS_VALUE. Two tables linked by Id, labelled “Primary” and “Secondary”. One has columns AGE, CAS, GEO, Id; rows shown: 001 GR 00001; IT 00002; … The other has columns Id, SEX, OBS_VALUE; rows shown: 00001: 001 100, 002 200, TOT 300; 00002: 150, 250, 400; …
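Storage Scheme D splits each record over two tables that share an Id, so a join restores the flat form the DSD expects. A minimal SQLite sketch follows. The table names SERIES_KEYS and OBSERVATIONS and the AGE value "TOT" are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SERIES_KEYS (AGE TEXT, CAS TEXT, GEO TEXT, ID TEXT PRIMARY KEY)"
)
conn.execute("CREATE TABLE OBSERVATIONS (ID TEXT, SEX TEXT, OBS_VALUE INTEGER)")
conn.executemany(
    "INSERT INTO SERIES_KEYS VALUES (?, ?, ?, ?)",
    [("TOT", "001", "GR", "00001"), ("TOT", "001", "IT", "00002")],
)
conn.executemany(
    "INSERT INTO OBSERVATIONS VALUES (?, ?, ?)",
    [
        ("00001", "001", 100),
        ("00001", "002", 200),
        ("00001", "TOT", 300),
        ("00002", "001", 150),
        ("00002", "002", 250),
        ("00002", "TOT", 400),
    ],
)

# Join the two tables on the shared Id to get one row per observation.
JOIN_QUERY = """
SELECT k.AGE, k.CAS, k.GEO, o.SEX, o.OBS_VALUE
FROM SERIES_KEYS AS k
JOIN OBSERVATIONS AS o ON o.ID = k.ID
ORDER BY k.GEO, o.SEX
"""

joined = conn.execute(JOIN_QUERY).fetchall()
for row in joined:
    print(row)
```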
Storage Scheme UNION. DSD Id: CENSUS_HUB; Version: 1.1; Agency: Eurostat; Dimensions: AGE, CAS, GEO, SEX; measure: OBS_VALUE. Two tables, “Data for Italy” and “Data for Spain”. Column headers and values shown: CAS GEO SEX OBS_VALUE 001 GR 100 002 200 TOT 300 IT 150 250 400; AGE CAS GEO SEX OBS_VALUE 001 GR 100 002 200 TOT 300 IT 150 250 400. The area dimension is missing in both tables.
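When each country has its own table and the area dimension is not stored at all, the missing GEO value can be supplied as a constant in each branch of a UNION ALL. The sketch below uses SQLite. The table names DATA_IT and DATA_ES and all data values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Neither table has a GEO column: the country is implied by the table.
for table_name in ("DATA_IT", "DATA_ES"):
    conn.execute(
        f"CREATE TABLE {table_name} "
        "(AGE TEXT, CAS TEXT, SEX TEXT, OBS_VALUE INTEGER)"
    )
conn.executemany(
    "INSERT INTO DATA_IT VALUES (?, ?, ?, ?)",
    [("TOT", "001", "001", 150), ("TOT", "001", "002", 250), ("TOT", "001", "TOT", 400)],
)
conn.executemany(
    "INSERT INTO DATA_ES VALUES (?, ?, ?, ?)",
    [("TOT", "001", "001", 120), ("TOT", "001", "002", 220), ("TOT", "001", "TOT", 340)],
)

# Each branch adds the GEO code that its table stands for.
UNION_QUERY = """
SELECT AGE, CAS, 'IT' AS GEO, SEX, OBS_VALUE FROM DATA_IT
UNION ALL
SELECT AGE, CAS, 'ES' AS GEO, SEX, OBS_VALUE FROM DATA_ES
ORDER BY GEO, SEX
"""

combined = conn.execute(UNION_QUERY).fetchall()
for row in combined:
    print(row)
```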
Combination of Storage Schemes. “Primary” table, columns AGE, Id; rows shown: 001 00001; 002 00002; … “Secondary”, “Master” table, columns Id, CAS, GEO, TABLE; rows shown: 00001 001 GR GR_001_001_TABLE; 00002 IT IT_001_001_TABLE; … “Slave” tables (FR_001_003_TABLE, GB_002_001_TABLE, IT_001_001_TABLE, GR_001_001_TABLE), each with columns SEX, OBS_VALUE, …; values shown for two of them: 001 100, 002 200, TOT 300 and 001 150, 002 250, TOT 400.
Problems: Unlimited variety of storage schemes (chaotic), requiring custom queries. Transcoding issues. Date-time issues. Level of attributes: replication required. Performance issues. External programs: limited permissions on DDBs.
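Transcoding means translating between the codes stored locally and the codes of the SDMX codelists, in both directions. A minimal Python sketch, assuming hypothetical local codes ("M", "F", "ALL", "Greece", "Italy") that do not come from the slides:

```python
# Hypothetical transcoding rules: local DDB code -> SDMX codelist code.
TRANSCODING = {
    "SEX": {"M": "001", "F": "002", "ALL": "TOT"},
    "GEO": {"Greece": "GR", "Italy": "IT"},
}


def transcode(component, local_code):
    """Translate a local code into the SDMX code for the given component."""
    rules = TRANSCODING.get(component)
    if rules is None:
        # No rules defined: the DDB is assumed to store SDMX codes already.
        return local_code
    if local_code not in rules:
        raise ValueError(f"No transcoding rule for {component}={local_code!r}")
    return rules[local_code]


def reverse_transcode(component, sdmx_code):
    """Translate an SDMX code from a query into the matching local codes."""
    if component not in TRANSCODING:
        return [sdmx_code]
    return [
        local for local, sdmx in TRANSCODING[component].items() if sdmx == sdmx_code
    ]


print(transcode("SEX", "M"))
print(reverse_transcode("GEO", "IT"))
```

The reverse direction returns a list because several local codes may map to the same SDMX code, which matters when an SDMX query is turned into a SQL WHERE clause.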
Mapping Assistant application: Facilitates the mapping between the structural metadata of the DSD and that of the Dissemination Database. The MSDB is used by other tools: NSI Web Service; Test Client; i.e. Data Retriever API. The Mapping Assistant is meant to facilitate the mapping between the structural metadata provided by the DSD and the corresponding metadata residing in the Dissemination Database of the NSI operational environment. The Mapping Store DB created and managed by the Mapping Assistant can then be used by other tools, such as the Data Retriever.
SDMX-RI Components. [Figure: deployment diagram. Components: Mapping Assistant, NSI-WS, Test Client, NSI-Client, External Client, MSDB Store, DDB. Hosts: Workstation, App Server, DB Servers. Network zones, separated by firewalls: Internal Network, Secure DMZ, DMZ, Internet.]
Mapping Store DB: Accessed and managed only by the SDMX-RI software. Contains: SDMX artifacts; data flow mappings and transcodings.
MSDB initial setup: Creation of the Mapping Store DB objects (tables, sequences, stored procedures). Performed by the Mapping Assistant. Performed only once.
MSDB Initial Setup: Connection Parameters. One database has to be selected to hold the Mapping Store. The first step is to create the database connection to this database.
MSDB Initial Setup Success
MSDB Initial Setup Tables Created
Import SDMX structure files: SDMX artifacts are imported from SDMX-ML files: category schemes and categories, code lists, DSDs, and data flows. Information on the SDMX artifacts is stored in the MSDB.
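To give an idea of what such an import reads, the following sketch lists the code lists found in an SDMX-ML structure message using only the Python standard library. It assumes the SDMX 2.1 namespaces. The embedded sample (trimmed, without a header) and its codelist CL_SEX with its code names are hypothetical, and this is not how the Mapping Assistant itself performs the import.

```python
import xml.etree.ElementTree as ET

# SDMX-ML 2.1 namespaces (assumed version; SDMX 2.0 files use different ones).
NS = {
    "mes": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message",
    "str": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure",
    "com": "http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common",
}

# Hypothetical, trimmed structure message used only for this example.
SAMPLE = """<mes:Structure
    xmlns:mes="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/message"
    xmlns:str="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/structure"
    xmlns:com="http://www.sdmx.org/resources/sdmxml/schemas/v2_1/common">
  <mes:Structures>
    <str:Codelists>
      <str:Codelist id="CL_SEX" agencyID="ESTAT" version="1.0">
        <com:Name xml:lang="en">Sex</com:Name>
        <str:Code id="001"><com:Name xml:lang="en">Males</com:Name></str:Code>
        <str:Code id="002"><com:Name xml:lang="en">Females</com:Name></str:Code>
        <str:Code id="TOT"><com:Name xml:lang="en">Total</com:Name></str:Code>
      </str:Codelist>
    </str:Codelists>
  </mes:Structures>
</mes:Structure>"""


def list_codelists(xml_text):
    """Map (agency, id, version) of each codelist to its list of code ids."""
    root = ET.fromstring(xml_text)
    codelists = {}
    for codelist in root.findall(".//str:Codelist", NS):
        key = (codelist.get("agencyID"), codelist.get("id"), codelist.get("version"))
        codelists[key] = [code.get("id") for code in codelist.findall("str:Code", NS)]
    return codelists


codelists = list_codelists(SAMPLE)
print(codelists)
```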
Import SDMX Structure Files
After Importing SDMX Structure Files
Configuring mappings: Create a DDB connection. One DDB connection can be used for multiple data sets and mappings. Create a data set. A data set is similar to a “view” on one or more DDB tables. Create a mapping set. It maps the fields of a data set to the components of a DSD and acts as the bridge between the relational DB world and the DSDs of the SDMX world.
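The idea of a mapping set can be sketched as a small data structure: data set columns are mapped to DSD components, and components with no local column receive a fixed value. This is only a conceptual illustration in Python, not the Mapping Assistant's data model. The class MappingSet and the local column names AGE_GRP, REGION, GENDER and POP are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Components of the CENSUS_HUB DSD used on the storage scheme slides.
DSD_COMPONENTS = ["AGE", "CAS", "GEO", "SEX", "OBS_VALUE"]


@dataclass
class MappingSet:
    """Maps data set columns to DSD components, plus fixed-value components."""

    column_to_component: Dict[str, str]
    fixed_values: Dict[str, str] = field(default_factory=dict)

    def unmapped_components(self, dsd_components: List[str]) -> List[str]:
        """Return the DSD components that no column or fixed value covers."""
        covered = set(self.column_to_component.values()) | set(self.fixed_values)
        return [c for c in dsd_components if c not in covered]

    def apply(self, dataset_row: Dict[str, object]) -> Dict[str, object]:
        """Turn one data set row into a record keyed by DSD component."""
        mapped = {
            component: dataset_row[column]
            for column, component in self.column_to_component.items()
        }
        mapped.update(self.fixed_values)
        return mapped


# Hypothetical local column names; CAS has no column and gets a fixed value.
mapping = MappingSet(
    column_to_component={
        "AGE_GRP": "AGE",
        "REGION": "GEO",
        "GENDER": "SEX",
        "POP": "OBS_VALUE",
    },
    fixed_values={"CAS": "001"},
)

row = {"AGE_GRP": "TOT", "REGION": "GR", "GENDER": "001", "POP": 100}
mapped_row = mapping.apply(row)
print(mapped_row)
print(mapping.unmapped_components(DSD_COMPONENTS))
```

Checking for unmapped components before any data is retrieved mirrors the point of the mapping step: every component of the DSD must be covered by either a column or a fixed value.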
DDB Connections
Creating a Data Set (1)
Creating a Data Set (2)
Creating a Data Set (3) Preview Query
Creating a Mapping Set (1)
Creating a Mapping Set (2)
Creating a Mapping Set (3) Component with Fixed Value
Creating a Mapping Set (4) Check Mapped Data
Creating a Mapping Set (6) Check Mapped Data again