Survey Data Management and Combined Use of DDI and SDMX
DDI and SDMX Use Case: Labor Force Statistics
PROCESS SCENARIO
Eurostat Web Site
The two output tables are the focus of the processes described below.
GSBPM Stages
Process Scenario
Survey/Register → Raw Data Set → (anonymization, cleaning, recoding, etc.) → Micro-Data Set / Public Use Files → (tabulation, processing, case selection, etc.) → Aggregate Data Set (Lower Level) → (aggregation, harmonization) → Aggregate Data Set (Higher Level) → (aggregation, harmonization) → Indicators
DDI describes the micro-data side of the flow, SDMX the aggregate side; the dimensional structure is described by a DDI NCube and an SDMX DSD.
PROCESS STAGES
Stage 1: Input Data Received
Survey and Unit Record Conceptual Model
– The survey is targeted at a specific population and comprises questions
– A question may be linked to a Variable
– A Variable has a conceptual meaning (Concept)
– Valid responses are Categories
– The survey output is a Unit Record Data Set
Stage 2: Data Processing and Cleaning
Editing Process
– Can involve a variety of functions: validation, outlier trimming, recoding, editing for non-response
– Comprises a description of the process and the program code used
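As a concrete illustration, the editing step might look like the sketch below. The file name, variable names, and recoding rules are assumptions made for this example, not part of the actual LFS use case.

```python
import pandas as pd

# Hypothetical raw unit-record file and variable names, for illustration only.
raw = pd.read_csv("lfs_raw.csv")

# Validation: keep only records with a plausible age.
clean = raw[(raw["AGE"] >= 15) & (raw["AGE"] <= 89)].copy()

# Edit for non-response: flag missing occupation codes instead of dropping the record.
clean["OCC"] = clean["OCC"].fillna("NO_RESPONSE")

# Recoding: collapse a detailed employment status into a simpler classification.
status_recode = {1: "EMP", 2: "EMP", 3: "UNEMP", 4: "INACTIVE"}
clean["EMPSTAT_RECODED"] = clean["STATUS"].map(status_recode)

clean.to_csv("lfs_clean.csv", index=False)
```

In DDI terms, both the textual description of this step and the code itself could be attached to the corresponding Processing Event.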
Stage 3: Data Derivation
Survey and Unit Record Conceptual Model
– New Variables are created from existing Variables, or from Concepts
– May require new Classifications (codes, categories)
– Both a description and the program code that derives the new Variables are needed
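A derivation of this kind might be expressed as in the sketch below; the AGE_GROUP classification and its codes are invented purely for illustration.

```python
import pandas as pd

micro = pd.read_csv("lfs_clean.csv")

# Derive a new variable AGE_GROUP from the existing AGE variable,
# using a new (hypothetical) classification of age bands.
bins = [15, 25, 35, 45, 55, 65, 90]
labels = ["Y15-24", "Y25-34", "Y35-44", "Y45-54", "Y55-64", "Y65-89"]
micro["AGE_GROUP"] = pd.cut(micro["AGE"], bins=bins, labels=labels, right=False)

micro.to_csv("lfs_derived.csv", index=False)
```

This is the kind of code a Derivation Instruction would carry or reference, together with links to the input and output Variables.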
Stage 4: Tabulation
Dimensional Structure
– Maps to a DDI NCube and an SDMX DSD
DDI
– The NCube describes the structure and the provenance back to the Variables, etc.
SDMX
– Data is published as an SDMX Data Set
– The DSD describes the Dissemination Structure and can also describe the NCube structure
– A Structure Map can describe the mapping between the two
– Applications can link back from SDMX structures to DDI structures
– SDMX data can link back to Variables, data collection, etc.
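To make the tabulation step concrete, the sketch below aggregates the (hypothetical) derived unit records into a two-dimensional table whose dimensions and measure would be described by the NCube and the DSD.

```python
import pandas as pd

micro = pd.read_csv("lfs_derived.csv")

# Tabulate unit records into a dimensional data set: dimensions SEX and
# AGE_GROUP, measure = number of employed persons (unweighted, for simplicity).
employed = micro[micro["EMPSTAT_RECODED"] == "EMP"]
cube = (
    employed.groupby(["SEX", "AGE_GROUP"], observed=True)
    .size()
    .reset_index(name="OBS_VALUE")
)
cube.to_csv("lfs_aggregate.csv", index=False)
```

A real LFS tabulation would apply survey weights rather than a simple count; that detail is omitted here.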
Stage 5: Dissemination (SDMX)
– The Data Set references a Dataflow, DSD, or Provision Agreement; this identifies the structure (DSD)
– A Provision Agreement also identifies the Data Provider
– A Category Scheme supports "drill down" data discovery
– A Constraint contains the actual keys and Dimension values present in the data source
– The application now has all of the metadata required to query for and process (e.g. visualise) the data
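For example, an application might retrieve the Dataflow and everything it references through the standard SDMX 2.1 REST structure API, roughly as sketched below. The registry base URL is an assumption; the agency and Dataflow Ids are taken from the structural metadata shown later in this deck.

```python
import requests

REGISTRY = "https://registry.example.org/sdmx"  # assumed endpoint, for illustration

# Retrieve the Dataflow plus everything it references (DSD, Provision
# Agreement, Constraint) in one call, so the application has all the
# metadata it needs to build and interpret a data query.
url = f"{REGISTRY}/dataflow/ESTAT/EMPLY_SEX_AGE_NATION/latest"
response = requests.get(
    url,
    params={"references": "all"},
    headers={"Accept": "application/vnd.sdmx.structure+xml;version=2.1"},
)
response.raise_for_status()
structure_xml = response.text  # SDMX-ML structure message, to be parsed by the application
```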
DDI PROCESSING AND STRUCTURES
Describing Unit-Record Data Sets in DDI [DEMO]
Describing Processes in DDI
In our example we have several types of processing:
– Recoding
– Validation and editing
– Derivation of new variables
In DDI, these are described as "Processing Events"
Describing Processes in DDI (Continued)
The Collection Event element is part of the "Data Collection" module, but is also used for describing processing later in the data lifecycle
A Processing Event can be:
– Control operation
– Cleaning operation
– Weighting
– Coding
Describing Processes in DDI (Continued)
These elements allow for a description of the event and a link to, or the direct expression of, the processing "code" (SAS, SPSS, Java, etc.) used to perform the process
The Coding element is divided into:
– General Instruction – a generic process description
– Derivation Instruction – for deriving new variables
– These link to the variables used in the process
Tabulation in DDI
DDI describes dimensionalized data sets as "NCubes"
This is very similar to an SDMX DSD except:
– The values are addressed using references to variables in a unit-record data set
– Calculations of measures can be described in detail (dependent and independent variables, computation, etc.)
This means that the actual process of tabulation can be described
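As an illustration of a measure computation that an NCube could document, the sketch below derives the share of employed persons within the labour force per cell, reusing the hypothetical variable names from the earlier stage sketches.

```python
import pandas as pd

micro = pd.read_csv("lfs_derived.csv")

# A measure computed from unit-record variables: the share of employed
# persons within the labour force, by SEX and AGE_GROUP (illustrative only).
labour_force = micro[micro["EMPSTAT_RECODED"].isin(["EMP", "UNEMP"])]
share = (
    labour_force.groupby(["SEX", "AGE_GROUP"], observed=True)["EMPSTAT_RECODED"]
    .apply(lambda s: (s == "EMP").mean())
    .reset_index(name="EMPLOYED_SHARE")
)
```

The NCube could record both the independent variables (SEX, AGE_GROUP, EMPSTAT_RECODED) and the computation itself, so the tabulation remains reproducible.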
DDI NCUBE MAP TO SDMX DSD
DDI NCube Model
SDMX DSD Model
DDI NCube to SDMX DSD Model Map
DDI Representation to SDMX Representation Model Map
DDI Data (CSV Describable by DDI NCube Format)
Note that the column names are not used (they are just for viewing). They are mapped to the Variable Id in the NCube and to the Component (Dimension, Data Attribute, Primary Measure) Id in SDMX.
DDI NCube Data Set Model
Fundamentally, the Physical Location describes the CSV format. The CSV file can either be converted to SDMX-ML using data readers and data writers, or loaded directly into a database using an appropriate data reader. In both cases, the map of the Dimension and Attribute Ids to the CSV columns, together with the Id of the Dataflow, needs to be passed to the Data Reader so that it can verify the data content against the relevant DSD.
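A minimal reader of that kind might look like the sketch below; the column-to-component map and the component Ids are assumptions for this example, not the actual LFS_STRUCTURE1 definition.

```python
import csv

COLUMN_MAP = {            # CSV column -> DSD component Id (assumed)
    "sex": "SEX",
    "age_group": "AGE",
    "value": "OBS_VALUE",
}
DSD_COMPONENTS = {"SEX", "AGE", "OBS_VALUE"}   # component Ids read from the DSD
DATAFLOW_ID = "EMPLY_SEX_AGE_NATION"           # passed to the reader with the map

def read_observations(path):
    """Yield observations keyed by DSD component Id, after verifying the map."""
    unmapped = set(COLUMN_MAP.values()) - DSD_COMPONENTS
    if unmapped:
        raise ValueError(f"Mapped components not present in DSD: {unmapped}")
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield {COLUMN_MAP[col]: row[col] for col in COLUMN_MAP}

# Each observation can then be handed to a data writer that produces SDMX-ML,
# or to a loader that inserts it into a database table for the Dataflow.
```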
Data Writers and Readers
SDMX STRUCTURES AND DATA DISCOVERY AND VISUALISATION
SDMX Structural Metadata
– DSD: LFS_STRUCTURE1
– Dataflows: EMPLY_SEX_OCC_EDUC, EMPLY_SEX_AGE_NATION
– Constraints: one per Dataflow (e.g. EMPLY_SEX_OCC_EDUC)
– Provision Agreements: ES_EMPLY_SEX_AGE_NATION, ES_EMPLY_SEX_OCC_EDUC
– Data Provider: ESTAT
– Category Scheme: ESTAT_TOPICS with Categories LABOR, POPULATION, NAC
– Categorisation: LAB_SEX_OCCC
Data Discovery: Registry Structures and the Data Discovery GUI
User Data Selection: the user selection is turned into a generated SDMX REST query
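The generated query could look roughly like the sketch below. The dimension names, the selected values, and the web service base URL are assumptions for illustration; the Dataflow Id comes from the structural metadata above.

```python
import requests

# Hypothetical user selection, keyed by dimension Id in DSD order; an empty
# string means "all values" (wildcard), a list means multiple selected values.
selection = {"SEX": ["M", "F"], "AGE": "", "NATION": "ES"}

# SDMX REST key syntax: dimension positions separated by '.', multiple values
# joined with '+', empty position for a wildcard.
key = ".".join("+".join(v) if isinstance(v, list) else v for v in selection.values())

BASE = "https://ws.example.org/sdmx"  # assumed endpoint
url = f"{BASE}/data/EMPLY_SEX_AGE_NATION/{key}"
response = requests.get(
    url, headers={"Accept": "application/vnd.sdmx.data+csv;version=1.0.0"}
)
response.raise_for_status()
```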
Pivot Table Built from Query Result
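Continuing the sketch above, and assuming the query returned SDMX-CSV, the result could be pivoted for display roughly as follows; the column names are assumptions.

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO(response.text))

# Pivot the flat observation list into a table with AGE on the rows and
# SEX across the columns, summing the observation values.
pivot = df.pivot_table(index="AGE", columns="SEX", values="OBS_VALUE", aggfunc="sum")
print(pivot)
```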