SDMX DATA STRUCTURE DEFINITION SDMX Training BANK INDONESIA SEPTEMBER 2015 YOGYAKARTA, INDONESIA
Session Objectives
At the end of this morning you will:
- Know the SDMX model of a data structure definition
- Understand the techniques to identify the structure of data
- Identify the concepts in a simple data set
- Be able to develop a simple data structure definition
Data Set
Extract from a spreadsheet
What’s stopping us processing this data outside of a spreadsheet processor?
- Not easy to process: text comparison, language
- What is the text, e.g. where is the date, country, unit of measure?
Data Set
Web site What is on here that is not on the spreadsheet?
What’s stopping us processing this data outside of a spreadsheet processor?
- Not easy to process: text comparison, language
- What is the text, e.g. where is the date, country, unit of measure?
- Have we lost any information? Metadata, hierarchy
What are we missing?
- The structure of the data
- What is this? Key
Data Structure
- Key: what is it and what does it mean?
- These are values for what part of the structure?
The Key of the Data
- Dimensions
- Identify some dimensions: Country, Frequency, Adjustment + others
Dimensions
[Diagram: the NAC Data Structure Definition and the BOP Data Structure Definition each have their own Key Dimensions: Country, Frequency, Adjustment + others.]
What’s wrong here?
The Key of the Data
- Dimensions
- Are “Country”, “Frequency”, “Adjustment” relevant to other structures?
- How do we enable this? Are we missing something?
- We therefore need Concepts that are independent of use in data structures (and metadata structures)
Concepts
[Diagram: the ESA Data Structure Definition and the BOP Data Structure Definition each have Key Dimensions; each dimension uses a shared Concept: Country, Frequency, Adjustment + others.]
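The idea of concepts that exist independently of any one data structure can be sketched in a few lines of Python. This is only an illustration: the class names and the identifiers COUNTRY, FREQUENCY and ADJUSTMENT are assumptions made for the example, not names taken from the SDMX standard or from any SDMX library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Concept:
    """A concept defined once, independently of any data structure."""
    id: str
    name: str


@dataclass
class DataStructureDefinition:
    """Hypothetical, simplified DSD: an id plus an ordered list of dimension concepts."""
    id: str
    dimensions: List[Concept] = field(default_factory=list)


# Concepts are declared once, outside any single DSD.
COUNTRY = Concept("COUNTRY", "Country")
FREQUENCY = Concept("FREQUENCY", "Frequency")
ADJUSTMENT = Concept("ADJUSTMENT", "Adjustment")

# Both structures reference the same concept objects instead of redefining them.
esa = DataStructureDefinition("ESA", [FREQUENCY, COUNTRY, ADJUSTMENT])
bop = DataStructureDefinition("BOP", [FREQUENCY, COUNTRY, ADJUSTMENT])

# Concepts used by both structures, in ESA order.
shared = [c.id for c in esa.dimensions if c in bop.dimensions]
print(shared)
```

Because both structures point at the same Concept, a change to the definition of "Country" is made in one place and is seen by every structure that uses it.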
The Key of the Data
- Dimensions
- Are “Country”, “Frequency”, “Adjustment” relevant to other structures?
- We therefore need Concepts that are independent of use in data structures (and metadata structures)
- What else does a Dimension need?
- Specification of valid content: Code Lists or non-coded format (e.g. integer)
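A minimal sketch of "specification of valid content" follows, assuming a dimension is given either a code list or a non-coded format check. The Dimension class, the is_integer helper and the BASE_YEAR dimension are hypothetical and exist only for this example; the Adjustment codes are the ones used in this training.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


def is_integer(text: str) -> bool:
    """Non-coded format check: True if the text parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


@dataclass
class Dimension:
    """Hypothetical dimension: a concept id plus one kind of representation."""
    concept_id: str
    code_list: Optional[Dict[str, str]] = None            # coded representation
    format_check: Optional[Callable[[str], bool]] = None  # non-coded representation

    def is_valid(self, value: str) -> bool:
        if self.code_list is not None:
            return value in self.code_list
        if self.format_check is not None:
            return self.format_check(value)
        return False  # no representation declared, so nothing is accepted


# Coded dimension using the Adjustment codes from this training.
adjustment = Dimension("ADJUSTMENT", code_list={
    "N": "Neither seasonally nor working day adjusted",
    "S": "Seasonally adjusted, not working day adjusted",
    "T": "Trend",
})

# Non-coded dimension; BASE_YEAR is an invented example, not from the slides.
base_year = Dimension("BASE_YEAR", format_check=is_integer)

print(adjustment.is_valid("S"), adjustment.is_valid("X"))
print(base_year.is_valid("2005"), base_year.is_valid("20O5"))
```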
Data Set Structure: Concepts and Code Lists
Concepts
- Country
- GDP Indicator
- Adjustment
Code Lists
GDP Indicator
  B1QG00  Gross domestic product at market prices
  F33200  Long-term securities other than shares
  TOTEMP  Total employment
Country
  I6  EU 17
  BE  Belgium
  DE  Germany
Adjustment
  N  Neither seasonally nor working day adjusted
  S  Seasonally adjusted, not working day adjusted
  T  Trend
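The code lists on this slide can be written down as plain lookup tables. The sketch below is an illustration only: the dictionary keys (GDP_INDICATOR, COUNTRY, ADJUSTMENT) and the decode function are assumed names, while the codes and their descriptions are the ones shown on the slide.

```python
# Code lists from the slide, keyed by an assumed concept id.
CODE_LISTS = {
    "GDP_INDICATOR": {
        "B1QG00": "Gross domestic product at market prices",
        "F33200": "Long-term securities other than shares",
        "TOTEMP": "Total employment",
    },
    "COUNTRY": {
        "I6": "EU 17",
        "BE": "Belgium",
        "DE": "Germany",
    },
    "ADJUSTMENT": {
        "N": "Neither seasonally nor working day adjusted",
        "S": "Seasonally adjusted, not working day adjusted",
        "T": "Trend",
    },
}


def decode(record):
    """Replace each code in a record with its description, rejecting unknown codes."""
    labels = {}
    for concept_id, code in record.items():
        code_list = CODE_LISTS.get(concept_id)
        if code_list is None or code not in code_list:
            raise ValueError(f"{code!r} is not a valid code for {concept_id}")
        labels[concept_id] = code_list[code]
    return labels


decoded = decode({"COUNTRY": "BE", "GDP_INDICATOR": "B1QG00", "ADJUSTMENT": "N"})
for concept_id, label in decoded.items():
    print(concept_id, "=", label)
```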
Representation
[Diagram: a Data Structure Definition has Key Dimensions, the concepts that identify the observation. Each Dimension takes its semantic from a Concept and has a Representation, which is either coded (has a Code List) or non-coded (has a format).]
What else is required to define a Data Structure?
- Additional metadata
Attributes
[Diagram: a Data Structure Definition has Key Dimensions (concepts that identify the observation) and Attributes (concepts that add metadata). Each Dimension and each Attribute takes its semantic from a Concept and has a Representation, which is either coded (has a Code List) or non-coded (has a format). Each Attribute also has an Attribute Relationship.]
Anything else?
- Observations
Measures
[Diagram: a Data Structure Definition has Key Dimensions (concepts that identify the observation), Attributes (concepts that add metadata) and Measure(s) (concepts that are the observed phenomenon). Each of these takes its semantic from a Concept and has a Representation, which is either coded (has a Code List) or non-coded (has a format). Each Attribute also has an Attribute Relationship.]
What do we need in order to be able to process this in a computer system?
Data Set Structure
Computers need to know the structure of data in terms of:
- Dimensionality
- Additional metadata (Attributes)
- Measures (Observation)
- Concepts
- Valid content
  - Code Lists
  - Non-coded format (integer, date, text)
Concepts play roles in a Data Structure
A Data Structure comprises:
- Concepts that identify the observation value (Dimensions)
- Concepts that add additional metadata about the observation value, as a value or the context of the value (Attributes)
- The concept that is the observation value (Measure)
- Any of these may be coded, text, date/time, number, etc. (Representation)
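The roles listed above can be sketched as a small Python model in which every component of a structure names a concept, a role and a representation. All class names and concept identifiers here (TIME_PERIOD, UNIT_OF_MEASURE, OBS_VALUE and so on) are assumptions chosen for the illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    DIMENSION = "identifies the observation value"
    ATTRIBUTE = "adds metadata about the observation value"
    MEASURE = "is the observation value"


@dataclass(frozen=True)
class Component:
    """Hypothetical DSD component: a concept playing one role, with a representation."""
    concept_id: str
    role: Role
    representation: str  # "coded", or the name of a non-coded format


# An invented, simplified structure for a data set like the one in this session.
components = [
    Component("FREQUENCY", Role.DIMENSION, "coded"),
    Component("COUNTRY", Role.DIMENSION, "coded"),
    Component("ADJUSTMENT", Role.DIMENSION, "coded"),
    Component("TIME_PERIOD", Role.DIMENSION, "date/time"),
    Component("UNIT_OF_MEASURE", Role.ATTRIBUTE, "coded"),
    Component("OBS_VALUE", Role.MEASURE, "number"),
]


def by_role(role):
    """Return the concept ids that play the given role, in declaration order."""
    return [c.concept_id for c in components if c.role is role]


for role in Role:
    print(role.name, by_role(role))
```

The same concept could play a different role in another structure; the role belongs to the component, not to the concept.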
Data Makes Sense
ESA.Q.BE.Y.0000.B1QG TTTT.L.N.P.2000Q4 = 1.0
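Once the order of the dimensions is known, a dot-separated key can be split into named values. The sketch below deliberately uses a shortened, invented key with four dimensions, because the slides do not name every position of the ESA key above; DIMENSION_ORDER and parse_key are assumptions made for the example.

```python
# Assumed dimension order for a simplified, hypothetical structure.
DIMENSION_ORDER = ["FREQUENCY", "COUNTRY", "ADJUSTMENT", "GDP_INDICATOR"]


def parse_key(key, order=DIMENSION_ORDER):
    """Split a dot-separated key and pair each value with its dimension."""
    values = key.split(".")
    if len(values) != len(order):
        raise ValueError(f"expected {len(order)} key parts, got {len(values)}")
    return dict(zip(order, values))


# Shortened illustrative key, not the full ESA key shown on the slide.
parsed = parse_key("Q.BE.N.B1QG00")
for dimension, value in parsed.items():
    print(dimension, "=", value)
```

Without the structure definition the key is only a string of codes; with it, each position has a name and a list of valid values.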
Data Makes Sense – what are we missing?
Attributes Attribute Relationship
Attribute Relationship
Q. What is required to do this?
A. A referencing mechanism
Attribute Relationship
ESA.Q.BE.Y.0000.B1QG TTTT.L.N.P.2000Q4 = 1.0
Do we have a referencing mechanism?
Q. What is the referenced “object”?
A. A specific Dimension Value
Attribute Relationship ESA.Q.I6.Y.0000.P TTTT.L.N.P.2000Q1 = 0.7
Attribute Relationship
[Diagram: a Data Structure Definition has Key Dimensions (concepts that identify the observation), a Group Key (concepts that identify a partial key), Attributes (concepts that add metadata) and Measure(s) (concepts that are the observed phenomenon). Each component takes its semantic from a Concept and has a Representation, which is either coded (has a Code List) or non-coded (has a format). The Attribute Relationship of an Attribute points to one of: Group Key, Dimension(s), Data Set, Observation.]
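One way to picture the attribute relationship is as a target expressed in dimension values: an empty target means the whole data set, a single dimension value or a partial key means a group of series, and a full key with a time period means one observation. The Python sketch below follows that idea; the attribute names, their values and the attributes_for function are all invented for the illustration.

```python
# Each attachment is (target, attributes). The target is a dict of dimension
# values that an observation must match. All attribute ids and values below
# are hypothetical.
ATTACHMENTS = [
    # Data set level: the empty target matches every observation.
    ({}, {"TITLE": "Hypothetical data set title"}),
    # A specific dimension value.
    ({"COUNTRY": "BE"}, {"COUNTRY_NOTE": "Hypothetical note about Belgium"}),
    # A group key (partial key).
    ({"COUNTRY": "I6", "ADJUSTMENT": "N"}, {"GROUP_NOTE": "Hypothetical group note"}),
    # A single observation (full key including the time period).
    ({"FREQUENCY": "Q", "COUNTRY": "BE", "ADJUSTMENT": "N", "TIME_PERIOD": "2000Q4"},
     {"OBS_STATUS": "Hypothetical status"}),
]


def attributes_for(obs_key, attachments=ATTACHMENTS):
    """Collect every attribute whose target is matched by the observation key."""
    result = {}
    for target, attrs in attachments:
        if all(obs_key.get(dim) == value for dim, value in target.items()):
            result.update(attrs)
    return result


observation = {"FREQUENCY": "Q", "COUNTRY": "BE", "ADJUSTMENT": "N", "TIME_PERIOD": "2000Q4"}
found = attributes_for(observation)
print(sorted(found))
```

Attaching a note once to the value BE, rather than repeating it on every Belgian observation, is the practical reason for having a referencing mechanism.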
Where Are We?
- Data Structure Definition: valid content in terms of structure (dimensions, attributes, measures)
- Dataflow: specification of a cube sub-set in terms of a sub-set of valid content
- Linked to the Dataflow: data discovery, data providers
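The difference between "valid content" and "a sub-set of valid content" can be sketched as two checks: the structure says which codes exist, and the dataflow narrows them. In the Python below, DSD_CODES, DATAFLOW_CONSTRAINT and allowed_in_dataflow are assumed names, and the restriction to BE and DE is an invented example.

```python
# Valid content according to the structure (codes taken from this training).
DSD_CODES = {
    "COUNTRY": {"I6", "BE", "DE"},
    "ADJUSTMENT": {"N", "S", "T"},
}

# Hypothetical dataflow restriction: only a sub-set of the countries is reported.
DATAFLOW_CONSTRAINT = {
    "COUNTRY": {"BE", "DE"},
}


def allowed_in_dataflow(key):
    """True if every value is valid in the structure and within the dataflow sub-set."""
    for dimension, value in key.items():
        if value not in DSD_CODES.get(dimension, set()):
            return False  # not valid content for the structure at all
        allowed = DATAFLOW_CONSTRAINT.get(dimension)
        if allowed is not None and value not in allowed:
            return False  # valid for the structure, but outside this dataflow
    return True


print(allowed_in_dataflow({"COUNTRY": "BE", "ADJUSTMENT": "S"}))
print(allowed_in_dataflow({"COUNTRY": "I6", "ADJUSTMENT": "S"}))
```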
Where Are We?
[Diagram: a Data Structure Definition with its Dimensions, Attributes and Measures, and its references to Concepts and Code Lists.]
Design a DSD: What do we need to do first? Identify the Concepts
Questions?