Mogens Grosen Nielsen, Statistics Denmark (mgn@dst.dk). Implementation of GSBPM, DDI and SDMX reference metadata at Statistics Denmark. UNECE workshop on International Cooperation for Standard-based Systems, Geneva, 5-7 May 2015.
Agenda: History, strategy and principles on quality and metadata; Definitions, models and solution on quality reporting; Towards more integrated metadata; Information models; Lessons learned and opportunities.
Short history of metadata initiatives. Before 2010: separate and partially standardised metadata systems. January 2010: task force on integration of variables, classifications, statistical concepts and quality. October 2011: test of DDI (focus on integration). January 2012: EU grant; SDMX quality concepts, DDI and GSBPM; Colectica as tool. January 2015: quality declarations for all statistics (300+) in Colectica. February 2015: strategy on quality and metadata approved.
Strategy: Vision. Metadata about content and quality must guide internal and external users in their knowledge processes and give precise information about the products. Internal efficiency gains. International standards: GSBPM, SDMX, GSIM, DDI and others.
Strategy: Principles. Business Process Management (end-to-end processes); stepwise implementation; Code of Practice and Quality Assurance Framework. Principles on metadata: metadata must fulfil user needs; metadata and metadata flow integrated into GSBPM; as much reuse as possible; active use of metadata in IT systems (incl. metadata-driven production).
Metadata definition #1. How to communicate this term to statisticians? Use the SDMX definition as the short and easy-to-understand definition. Reference metadata: conceptual metadata; methodological and processing metadata; quality metadata. Structural metadata: metadata that act as identifiers and descriptors of the data.
Metadata definition #2. How to communicate this term to statisticians? Use the Generic Statistical Information Model (GSIM) to tell the full story.
"Classical" metadata using the Data Documentation Initiative (DDI), SDMX and Colectica: "The Diamond". Implemented in 2012-2015. [Diagram: StatBank; Methods/"Survey" (methods papers); Quality declaration; Concept (concepts database, "Hvad betyder"); Variable/dataset (variable database); Classifications (classification database)]
SDMX standard for quality: Single Integrated Metadata Structure (SIMS). Content (population, concepts, reference time etc.); statistical processing (sources and methods); information on 5 quality dimensions: relevance; accuracy and reliability; timeliness and punctuality; comparability; accessibility and clarity.
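The three parts of SIMS listed above (content, statistical processing, quality dimensions) can be pictured as a simple record. The sketch below is only an illustration: the class name `QualityDeclaration`, its field names and the example values are hypothetical and are not taken from SIMS, SDMX or Colectica. It shows one way to check which of the five quality dimensions still lack text.

```python
from dataclasses import dataclass, field

# The five quality dimensions named on the slide (identifiers are illustrative).
QUALITY_DIMENSIONS = (
    "relevance",
    "accuracy_and_reliability",
    "timeliness_and_punctuality",
    "comparability",
    "accessibility_and_clarity",
)


@dataclass
class QualityDeclaration:
    """Hypothetical container for one SIMS-style quality declaration."""

    statistic: str
    content: dict = field(default_factory=dict)     # population, concepts, reference time etc.
    processing: dict = field(default_factory=dict)  # sources and methods
    quality: dict = field(default_factory=dict)     # one text per quality dimension

    def missing_dimensions(self):
        """Return the quality dimensions that have no non-empty text yet."""
        return [d for d in QUALITY_DIMENSIONS if not self.quality.get(d, "").strip()]


# Example values are invented for illustration only.
declaration = QualityDeclaration(
    statistic="Example statistic",
    content={"population": "Illustrative population", "reference_time": "2014"},
    processing={"sources": "Illustrative register"},
    quality={"relevance": "Illustrative text", "comparability": "Illustrative text"},
)

print(declaration.missing_dimensions())
```

A check of this kind could support more uniform declarations, since every statistic is measured against the same list of dimensions.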
SIMS and reporting formats: Euro-SDMX Metadata Structure (ESMS) and ESS Standard for Quality Reports Structure (ESQRS).
GSBPM and work processes with focus on quality declarations. Needs: prepare user needs etc. Analyse: fill in accuracy etc.
The solution. [Diagram: Enter quality information; Existing metadata; Metadata in Colectica; Publish at dst.dk; Publish on the intranet; Quality reports to Eurostat]
300 surveys implemented, 10 January 2015. [Diagram: Enter quality information; Existing metadata; Metadata in Colectica; Publish at dst.dk; Publish on the intranet; Quality reports to Eurostat]
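The diagram describes quality information that is entered once and then published in several places. As a minimal sketch of that single-source idea, assuming a plain dictionary as the stored declaration: the function names, the keys and the JSON output are hypothetical stand-ins and do not represent Colectica's API or a real ESMS/ESQRS message.

```python
import json

# One stored declaration (keys and texts are invented for illustration).
declaration = {
    "statistic": "Example statistic",
    "relevance": "Illustrative text on relevance.",
    "accuracy_and_reliability": "Illustrative text on accuracy.",
}


def render_web(decl):
    """Plain-text version, standing in for a public web page."""
    lines = [decl["statistic"]]
    for key, text in decl.items():
        if key != "statistic":
            lines.append(f"{key.replace('_', ' ').capitalize()}: {text}")
    return "\n".join(lines)


def render_intranet(decl):
    """Same content with an internal marker, standing in for the intranet."""
    return "[internal] " + render_web(decl)


def render_report(decl):
    """JSON stand-in for a structured quality report (not a real SDMX message)."""
    return json.dumps(decl, sort_keys=True)


# Every output is derived from the same stored declaration.
outputs = {
    "web": render_web(declaration),
    "intranet": render_intranet(declaration),
    "report": render_report(declaration),
}

for name, text in outputs.items():
    print(f"--- {name} ---")
    print(text)
```

Because all three renderings read from one record, a correction to the declaration reaches every channel without re-entering text.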
Towards more integration of metadata. Business perspective: Business Process Management (BPM) and metadata-driven approach; GSIM-compliant model in DDI and Colectica (concepts, variables, classifications etc.); harmonisation of statistical concepts; integration of metadata in publishing systems; metadata management.
Business Process Management. [Diagram: General environment: political/legal context, technology/standards; Resources: staff, IT systems etc.; ?; Respondents/registers etc.; Users; User needs/orders]
Management, core and support processes. [Diagram: General environment: political/legal context, technology/standards; Resources: staff, IT systems etc.; Management processes; Respondents/registers etc.; Users; Support processes: quality, metadata, methods and IT; User needs/orders]
Business processes and metadata-driven approach
Reference metadata recorded: needs, purpose etc.
Structural metadata: DDI on questionnaire, variables, cubes etc. Reference metadata: concepts, population etc.
Metadata used to create survey system, databases, output systems etc.
Auto-generated survey system used
Reference metadata on quality etc.
Structural and reference metadata used for dissemination (e.g. quality reporting). Auto-generated code used in dissemination systems
All metadata used for evaluation
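The step where metadata is used to create the survey system and databases can be illustrated with a small sketch. It assumes a hypothetical, simplified variable list (not DDI and not Colectica output) and derives both a database table and questionnaire text from it; the variable names, the type mapping and the table name are invented for illustration.

```python
import sqlite3

# Hypothetical structural metadata: one entry per variable.
VARIABLES = [
    {"name": "respondent_id", "type": "integer", "question": None},
    {"name": "employees", "type": "integer",
     "question": "How many employees did the enterprise have?"},
    {"name": "industry", "type": "text",
     "question": "What is the main industry of the enterprise?"},
]

# Assumed mapping from metadata types to SQL column types.
SQL_TYPES = {"integer": "INTEGER", "text": "TEXT"}


def create_table_sql(table, variables):
    """Build a CREATE TABLE statement from the variable metadata."""
    columns = ",\n".join(f"    {v['name']} {SQL_TYPES[v['type']]}" for v in variables)
    return f"CREATE TABLE {table} (\n{columns}\n);"


def questionnaire(variables):
    """Build numbered question texts for variables that carry a question."""
    questions = [v for v in variables if v["question"]]
    return [f"Q{i}. {v['question']}" for i, v in enumerate(questions, start=1)]


# Create the table in an in-memory database to show the generated SQL is usable.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("survey_answers", VARIABLES))
columns = [row[1] for row in conn.execute("PRAGMA table_info(survey_answers)")]

print(create_table_sql("survey_answers", VARIABLES))
print(columns)
print(questionnaire(VARIABLES))
```

Adding a variable to the metadata list changes both the table definition and the questionnaire, which is the point of a metadata-driven approach: the metadata is the single place where the change is made.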
Models: conceptual, logical and physical. What we are doing at Statistics Denmark, by model level. Conceptual: selection of variable, concept etc. from GSIM (the concept corner). Logical: GSIM-compliant DDI model (3.2). Physical: GSIM-compliant DDI model implemented in Colectica.
Lessons learned. Need for improving content: declarations should be more uniform and compliant with common guidelines. Using the same quality concepts for many purposes: reporting to Eurostat, publishing on the national website and reporting to other international organisations. Need for more analysis and improved dissemination at www.dst.dk. Improved focus on change management and communication. Clear organisational roles and a dedicated cross-cutting group to introduce and implement standards.
Opportunities for cooperation. Work on models, from abstract to concrete level (BPM, GSIM, GSBPM, DDI and SDMX). GAMSO and CSPA need more attention. Sharing of solutions for input, processing and output systems, e.g. Colectica add-ins. How to handle metadata management integrated into GSBPM. Common metadata.
Thanks for your attention. Remember: DDI conference in Copenhagen, 2-3 December 2015.
The End!