Documentation of statistics Metadata

Why metadata?
"I work in dissemination. Metadata sounds boring and/or like a job for librarians."
  - Necessary to explain the origin and meaning of data
  - Supports "findability": navigation, search engines
We and our users get lost without metadata.

Metadata is everywhere in the Generic Statistical Business Process Model
Source: UNECE Secretariat, April 2009

A never-ending demand
  - Annual Danish user surveys since 2001
  - Every year users have placed more / better documentation as their number 1 priority
  - A number of improvements have been made
  - No effect whatsoever
  - Documentation is mainly about metadata

Why we can’t live without metadata

Many (different) ways of looking at metadata
Let's focus on those relevant to dissemination.
Purpose:
  - Descriptive: to explain the meaning of data
  - 'Findability': navigation and search engines
Data related
Variable related
Publication related

Data related / reference metadata
  - Description of source of data
  - Methodology used to produce data
  - Status of data (provisional / revised / etc.)
Implemented as:
  - Quality declarations
  - Footnotes attached to cells / tables
Based on: Edwin de Jonge (CBS)

Quality declarations – reference metadata
  - Administrative info
  - Contents
  - Time
  - Accuracy
  - Comparability
  - Accessibility
www.dst.dk/declarations

Quality declarations – reference metadata
Source: http://epp.eurostat.ec.europa.eu/portal/page/portal/population/data/main_tables

Footnote attached to table

Variable related metadata
  - Name and description of variable
  - Aggregation method used
  - Unit (1,000, euro, kg, etc.)
  - Name and description of classification
  - Name and description of classification items (categories)
Variable related metadata is partly descriptive, but names are also important for 'findability'.
Based on: Edwin de Jonge (CBS)

OECD example
© Statistics Denmark

Eurostat example

Metadata is readily available and usable in the context of the client's information need
  - What is a projection?
  - What is the difference between immigrants and descendants?
  - Which countries are Western?

Presenting metadata – selective needs
Ancestry: click!

Metadata on variable: marital status (civilstatus)

Publication metadata
Metadata related to publishing:
  - Release calendars
  - Also for search engines
Dublin Core:
  - Standard for document metadata on the Internet
  - Hidden metadata information supporting search engines

Publication metadata – release calendars

Publication metadata – release calendars
  - Contact information
  - Links to metadata
  - Other publications

Publication metadata – release calendars

Publication metadata
Many publication metadata are Dublin Core (dc) related and support search engines:
  - Title (dc)
  - Author (dc)
  - Created (dc)
  - Modified (dc)
  - Source (dc)
  - Description (dc)
  - Summary (dc)
  - Published (dc)
  - Spatial (dc)
  - Temporal (dc): reporting period
  - Subject (dc)
  - Frequency
  - Language
  - Subject area
  - Statistical theme
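
As a rough illustration of how such publication metadata can be exposed to search engines, the sketch below renders a few fields as hidden HTML `<meta>` tags using the widely used `DC.`-prefixed naming convention. The function name, the example values, and the mapping of the slide's "Author" to the Dublin Core element `creator` are assumptions made for this example; they are not taken from the presentation.

```python
from html import escape


def dublin_core_meta_tags(metadata):
    """Render {element: value} pairs as HTML <meta> tags, one per line.

    Element names are prefixed with "DC." and values are HTML-escaped
    so that quotes and ampersands cannot break the attribute.
    """
    tags = []
    for element, value in metadata.items():
        content = escape(str(value), quote=True)
        tags.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(tags)


# Hypothetical record for a single statistical release (illustrative values).
release = {
    "title": "Population projections",
    "creator": "Statistics Denmark",  # assumed mapping of the slide's "Author"
    "subject": "Population",
    "language": "en",
}

print(dublin_core_meta_tags(release))
```

The generated tags would sit in the `<head>` of the published page, invisible to readers but available to crawlers.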

Dublin Core supporting search

Metadata challenges
  - Terminology / linguistics
  - Coherence
  - Output databases / changes over time
  - Audience / target groups

Terminology – What are our users talking about?
Statistical terms: CPI, Employed, Salary, Income, Household, Family
Layman terms: Inflation, Working, Income/Salary, Family

Dissemination metadata issues – Linguistics
'Findability': users search with synonyms or hyponyms of our terms and find nothing.
  - Synonym: job <> occupation, business <> enterprises
  - Hypernym / hyponym: vehicles <> car <> SUV
  - "Musical instrument" is a hypernym of "guitar" because musical instruments include guitars
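
One way to picture a remedy is query expansion: before searching, the user's term is widened with its synonyms and with the broader terms (hypernyms) under which the data may actually be labelled. The sketch below is a minimal illustration; the two lookup tables reuse the examples from this slide, and the function and table names are hypothetical.

```python
# Symmetric synonym pairs, taken from the slide's examples.
SYNONYMS = {
    "job": {"occupation"},
    "occupation": {"job"},
    "business": {"enterprises"},
    "enterprises": {"business"},
}

# Narrower term -> broader term (hyponym -> hypernym).
BROADER = {
    "suv": "car",
    "car": "vehicles",
    "guitar": "musical instrument",
}


def expand_query(term):
    """Return the search term plus its synonyms and all broader terms."""
    term = term.strip().lower()
    expanded = {term} | SYNONYMS.get(term, set())
    current = term
    # Walk up the hierarchy; stop at the top or if a loop is detected.
    while current in BROADER and BROADER[current] not in expanded:
        current = BROADER[current]
        expanded.add(current)
    return expanded


print(sorted(expand_query("SUV")))
print(sorted(expand_query("job")))
```

With this expansion a search for "SUV" would also match tables labelled "car" or "vehicles", which is where a statistical office is more likely to file the data.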

Metadata should ensure coherence in contents
  - The same definitions, aggregations and classifications must be used across all subject areas and media
  - They should build on internationally recognized nomenclatures
  - Data sources must be technically coordinated
=> Statisticians <> Dissemination

Inconsistent tables
  Motorbike owner      Car owner
  18 – 25   A          18 – 29   D
  26 – 45   B          30 – 41   E
  > 46      C          41+       F
When creating tables / compiling statistics, detailed attention should be given to the harmonization of variable values, even across different subject areas. Otherwise you will end up being inconsistent both across time and across subject areas. In the example above we have two different variables, Car owner and Motorbike owner, distributed by age. Even if the data come from different surveys, so that it is not possible to cross-tabulate Motorbike owner and Car owner, it is still much better dissemination to use the same age groupings across all tables compiled by your organization. In real life this is, of course, a nearly impossible task.
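
A simple automated check can at least make such inconsistencies visible. The sketch below groups tables by the exact age bands they use; more than one group means the organization is publishing incompatible groupings. The table contents are copied from the example above, while the function name and data layout are assumptions for illustration.

```python
def group_tables_by_age_bands(tables):
    """Map each distinct sequence of age bands to the tables that use it."""
    groups = {}
    for table_name, bands in tables.items():
        groups.setdefault(tuple(bands), []).append(table_name)
    return groups


# Age bands per table, as in the example above.
tables = {
    "Motorbike owner": ["18-25", "26-45", ">46"],
    "Car owner": ["18-29", "30-41", "41+"],
}

groups = group_tables_by_age_bands(tables)
if len(groups) > 1:
    print("Inconsistent age groupings found:")
    for bands, names in groups.items():
        print("  " + ", ".join(names) + ": " + " | ".join(bands))
```

Storing the age bands once, centrally, and letting every table reference them avoids the problem at the source, so a check like this is only a safety net.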

Consistent structural metadata in the Danish model
Centralized:
  - variables
  - values
  - unit
  - "by" / "and"
  - time
  - template for quality declaration
Decentralized:
  - contents
  - footnote
  - contact person
  - quality declaration
Decentralized metadata is highly standardized through templates, guides and editorial overview.

Time dependency – output databases
  - Definitions of variables may change
  - Almost all cubes have a time dimension
  - If a measure changes: a new measure is added
  - If a dimension changes: new categories are added
  - A change in a dimension depends on the selection in the time dimension -> many empty cells (region)

Metadata ... for what?
Metadata play a role when the users:
  - browse
  - search
  - select
  - comprehend
  - compare

Metadata – for whom?
Staff:
  - database administrators
  - statisticians
  - developers
  - managers
Users, internal/external:
  - news media
  - international organisations
  - researchers
  - occasional users

Documentation – metadata principles (*)
  - ensure customers are identified for all metadata processes
  - make metadata 'active' to the greatest extent possible, also to Google
  - single authoritative source: 'registration authority'
  - reuse metadata
  - metadata is readily available and usable in the context of the client's information need
(*) www.statistics.gov.uk/events/q2006/downloads/W02_Penlington.pdf

And now back to work ... card sorting