The Generic Statistical Information Model (GSIM) and the Sistema Unitario dei Metadati (SUM): state of application of the standard Cecilia Casagrande –

Slides:



Advertisements
Similar presentations
Input Data Warehousing Canada’s Experience with Establishment Level Information Presentation to the Third International Conference on Establishment Statistics.
Advertisements

Enhancing Data Quality of Distributive Trade Statistics Workshop for African countries on the Implementation of International Recommendations for Distributive.
Overview of key concepts and features
1 Business Exchange Structures Concepts.
United Nations Economic Commission for Europe Statistical Division Applying the GSBPM to Business Register Management Steven Vale UNECE
Environment Change Information Request Change Definition has subtype of Business Case based upon ConceptPopulation Gives context for Statistical Program.
Background Data validation, a critical issue for the E.S.S.
WP.5 - DDI-SDMX Integration
WP.5 - DDI-SDMX Integration E.S.S. cross-cutting project on Information Models and Standards Marco Pellegrino, Denis Grofils Eurostat METIS Work Session6-8.
NSI 1 Collect Process AnalyseDisseminate Survey A Survey B Historically statistical organisations have produced specialised business processes and IT.
Case Studies: Statistics Canada (WP 11) Alice Born Statistics UNECE Workshop on Statistical Metadata.
Survey Data Management and Combined use of DDI and SDMX DDI and SDMX use case Labor Force Statistics.
Development of metadata in the National Statistical Institute of Spain Work Session on Statistical Metadata Genève, 6-8 May-2013 Ana Isabel Sánchez-Luengo.
CountryData Technologies for Data Exchange SDMX Information Model: An Introduction.
GSIM implementation in the Istat Metadata System: focus on structural metadata and on the joint use of GSIM and SDMX Mauro Scanu
Statistics Portugal/ Metadata Unit Monica Isfan « Joint UNECE/ EUROSTAT/ OECD Work Session on Statistical Metadata.
BAIGORRI Antonio – Eurostat, Unit B1: Quality; Classifications Q2010 EUROPEAN CONFERENCE ON QUALITY IN STATISTICS Terminology relating to the Implementation.
11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)
CountrySTAT Regional Basic Administrator Training for ECO Member States Friday, October 23, 2015 EVENT Foundations of CountrySTAT E-learning.
Towards a more efficient system of administrative data management and quality evaluation to support statistics production in Istat Grazia Di Bella, Simone.
Environment Change Information Request Change Definition has subtype of Business Case based upon ConceptPopulation Gives context for Statistical Program.
Supporting Researchers and Institutions in Exploiting Administrative Databases for Statistical Purposes: Istat’s Strategy G. D’Angiolini, P. De Salvo,
Statistical Metadata Strategy and GSIM Implementation in Canada Statistics Canada.
Eurostat SDMX and Global Standardisation Marco Pellegrino Eurostat, Statistical Office of the European Union Bangkok,
Eurostat 4. SDMX: Main objects for data exchange 1 Raynald Palmieri Eurostat Unit B5: “Central data and metadata services” SDMX Basics course, October.
SDMX IT Tools Introduction
Generic Statistical Information Model (GSIM) Jenny Linnerud
Joint UNECE/Eurostat/OECD work session on statistical metadata (METIS) APRIL 2006Mar Blanco Frías STATISTICAL METADATA MODEL DEVELOPED IN SPAIN:CURRENT.
7b. SDMX practical use case: Census Hub
A strategy on structural metadata management based on SDMX and the GSIM models Stefania Bergamasco, Alessio Cardacino, Francesco Rizzo, Mauro Scanu, Laura.
Copyright © 2007, Oracle. All rights reserved. Managing Items and Item Catalogs.
National Bureau of Statistics of the Republic of Moldova 1 High Level Seminar for Eastern Europe, Caucasus and Central Asia Countries (EECCA) on 'Quality.
METADATA MANAGEMENT AT ISTAT: CONCEPTUAL FOUNDATIONS AND TOOLS Istituto Nazionale di Statistica ITALY.
The Role of service Granularity in Successful CSPA Realization Zvone Klun, Tomaž Špeh Geneve, 22 June 2016.
Supporting the use of administrative data in official statistics.
MANAGEMENT OF STATISTICAL PRODUCTION PROCESS METADATA IN ISIS
Topic 2 (ii) Metadata concepts, standards, models and registries
Towards connecting geospatial information and statistical standards in statistical production: two cases from Statistics Finland Workshop on Integrating.
Contents Introducing the GSBPM Links to other standards
THE BNSI EXPERIENCE IN METADATA COLLECTION AND ORGANIZATION
CV-1: Vision The overall vision for transformational endeavors, which provides a strategic context for the capabilities described and a high-level scope.
GSIM Implementation at Statistics Finland Session 1: ModernStats World - Where to begin with standards based modernisation? UNECE ModernStats World Workshop.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
WORKSHOP GROUP ON QUALITY IN STATISTICS
A generic production environment - use of GSIM in Statistics Sweden
Census Hub in practice Working Group "European Statistical Data Support" Luxembourg, 29 April 2015.
Generic Statistical Business Process Model (GSBPM)
Tomaž Špeh, Rudi Seljak Statistical Office of the Republic of Slovenia
Metadata in the modernization of statistical production at Statistics Canada Carmen Greenough June 2, 2014.
Working on coherence and consistency of an output database
A handbook on validation methodology Marco Di Zio Istat
Towards common metadata using GSIM and DDI 3
SDMX Information Model: An Introduction
The problem we are trying to solve
SDMX in the S-DWH Layered Architecture
GSBPM and Data Life Cycle
Chapter 2 Database Environment Pearson Education © 2014.
A review of the 2011 census round in the EU, including the successful implementation of a detailed European legal base First meeting of the Technical Coordination.
SISAI STATISTICAL INFORMATION SYSTEMS ARCHITECTURE AND INTEGRATION
Metadata used throughout statistics production
Statistical process as a structured chain of successive actions and intermediate products, supported by the coherent use of metadata  Focused on energy.
Parallel Session: BR maintenance Quality in maintenance of a BR:
Presentation to SISAI Luxembourg, 12 June 2012
Generic Statistical Information Model (GSIM)
Joint UNECE/Eurostat/OECD
Introduction to reference metadata and quality reporting
The Role of Metadata in Census Data Dissemination
Hands-on GSIM Mauro Scanu ISTAT
SDMX training Francesco Rizzo June 2018
GSIM overview Mauro Scanu ISTAT
Presentation transcript:

The Generic Statistical Information Model (GSIM) and the Sistema Unitario dei Metadati (SUM): state of application of the standard Cecilia Casagrande – casagran@istat.it 21 settembre 2016

The Istat metadata system SUM SUMMARY Generic Statistical Information Model (GSIM) and Istat metadata system (SUM) The Istat metadata system SUM -- Population, variable and other concepts -- Classification -- Data content -- Data structures Functionalities, developments and benefits Search functionalities Implementation of GSIM Benefits and more

GSIM and Istat metadata system (SUM) GSIM describes the “interfaces” in terms of subprocesses input and output, leaving to GSBPM the description of the subprocesses In order to trace the data production process, Istat metadata system considered GSIM as a primary source of information. Our current objective is to develop the part of Istat metadata system (SUM) related to data (structural metadata). Hence, attention has been given mostly to a portion of GSIM, the part related to the “structure” and “concepts” 4.Istat metadata system SUM is coherent with GSIM for definitions and nomenclatures

The Istat metadata system SUM The Istat metadata system (SUM) is a centralized system which contains structural metadata. SUM aims to describe data and data connections able to trace the data production process from data collection (raw data) up to data dissemination.

The system should aim at easing: The Istat metadata system SUM The system should aim at easing: Harmonization: our Institute should speak using the same terms according to a common language structure, independently on the theme Reuse: new processes should draw terms among those already existing Search functionalities: in order to foster harmonization and reuse, there should be adequate search functionalities Traceability: every step of the data production process is operationally defined so to allow reconstruction of the same data from another party (transparency)

The Istat metadata system SUM

The Istat metadata system SUM: population, variable and other concepts Hierarchical relationship

The Istat metadata system SUM: population, variable and other concepts Identification variable Statistical variable Indicator Operative concept Time concept Frequency concept Other concept

The Istat metadata system SUM: classification

The Istat metadata system SUM: classification New component in the SUM’s classification system: the simultaneous levels

Data content The Istat metadata system SUM: data content In SUM, we introduced the “data content” concept. It is defined according to the GSIM (specification) lines 47-50: Each data is a result of a Process step through the application of a Process method on the necessary Inputs. Hence it is modelled by specifying: Example: Monthly average household expenditures -Statistical Program and Statistical Program Cycle -Process Step (phase) -Process Method -Inputs Labour force survey + Time dimension and Freq dimension Dissemination Average Population (Households) + Numeric variable (monthly expenditures)

The Istat metadata system SUM: data content

Statistical program and Statistical program cycle The Istat metadata system SUM: data content Input Process Method Statistical program and Statistical program cycle Process Step

The Istat metadata system SUM: data content Data Content feeds the GSIM concept Measure in a Data Structure. SUM maintains a special code list “Data Content” where each item contains all the previous details. In this way a user can find the meaning of the data in a hypercube in a unique place. Furthermore a data producer has a form to complete for describing any new data content (data content structure). Any data producer should describe the “Data Content” according to the same “structure”. If the “Data Content” is not well defined, the meaning of data is not easy to understand and massive use of mappers should be foreseen for data and metadata exchange.

The Istat metadata system SUM: data structures Dimensional Data Structures (macro) Unit Data Structure (micro) GSIM states that “A Data Structure describes the structure of a Data Set by means of Data Structure Components (Identifier Components, Measure Components and Attribute Components).”

Dimensional Data Structures (macro) Functionalities, developments and benefits: data structures Dimensional Data Structures (macro) Metadata that identify a data set Statistical program Reference time Phase Data content Categorical variables Time dimensions Other dimensions Attributes (unit measure, unit multiplier, obs. status,..) Metadata that specify the meaning of each datum Data input Method of transformation Metadata on the relationship between datasets

Unit Data Structure (micro) Functionalities, developments and benefits: data structures Unit Data Structure (micro) Statistical program Reference time Phase Reference population Metadata that identify a data set Unit identifier Numerical/quantitative variables Categorical variables Textual variables Macrodata requested at the microdata level Attributes (maximum number of answers, rules, structural zeros,…) Paradata Metadata that specify the meaning of each datum Relationship between the investigated populations Data input Method of transformation Metadata on the relationship between datasets

Search functionalities Functionalities, developments and benefits: search functionalities Search functionalities Dimensional Data Structures (macro)

Unit Data Structure (micro) Functionalities, developments and benefits: search functionalities Unit Data Structure (micro)

Comparisons and implementation of Gsim Functionalities, developments and benefits: implementation of GSIM Comparisons and implementation of Gsim GSIM concept SUM presence Population Yes Unit type Not yet Conceptual variable Represented variable Instance variable No Classification Yes (slight different definition) Code list Category set Data set Yes (fixing roles for dimensions) Furthermore: Data content (for macro data structures)

Output = transformation (input) Functionalities, developments and benefits: benefits Benefits and more The use of GSIM concepts helps in harmonizing the description of a dataset between each phases. Among the concepts already available in GSIM, an additional concept (the “Data Content”) could be useful in order to feed in a standard and complete way a Measure of a Data Structure (of macrodata). This is what we have done in Istat. The corporate DWH (I.Stat) has more than 3300 “data contents”. In SUM it is possible to search data through different facets: Statistical program Reference population of the data Numerical variables used for the production of a data content Categories of a categorical variable used in data structures Output = transformation (input)

Thanks for your attention