Modernization of Statistical data processes


Modernization of Statistical data processes African Centre for Statistics, Economic Commission for Africa

Outline
• Background
• What is modernization
• Quality assurance
• Metadata
• Technology and innovation
• Standards

Background
• Increased demands to produce statistics in both quantity and quality
• New development agendas
• Data revolution: new data sources for statistics
• Use of, and advancement in, new technology
• Emphasis on data analytics
• Often with fewer resources
At the same time, there is strong pressure to increase understanding among stakeholders of how we deliver our output in an efficient and structured manner.

Modernization
“…the process of adapting something to modern needs or habits.”
• Transforming the way we manage statistical processes
• Re-engineering the processes
• Increased flexibility to adapt
• “Doing more with less”: common generic processes, common tools, common methodologies
• Recognizing that all statistics are produced in a similar way

Quality assurance Data quality refers to the condition of a set of values of qualitative or quantitative variables. There are many definitions of data quality but data is generally considered high quality if it is "fit for its intended uses in operations, decision making and planning". Wikipedia

Quality assurance
Data quality dimensions:
• Relevance: the degree to which statistics meet users’ needs
• Timeliness: the time elapsed between the release of the data and the reference period
• Accessibility: the ease with which statistical data can be obtained by users
• Interpretability: the ease with which the user may understand and properly use and analyse the data or information
• Accuracy: the distance between the estimated value and the true value
• Coherence: logically connected and complete
• Punctuality: the distance between actual and planned release dates

Accuracy of data or statistical information is the degree to which those data correctly estimate or describe the quantities or characteristics that the statistical activity was designed to measure. Accuracy has many attributes, and in practical terms there is no single aggregate or overall measure of it. Of necessity, these attributes are typically measured or described in terms of the error, or the potential significance of error, introduced through the individual major sources of error, e.g. coverage, sampling, non-response, response, processing and dissemination.

Timeliness of information reflects the length of time between its availability and the event or phenomenon it describes, considered in the context of the time period that permits the information to be of value and still acted upon. It is typically involved in a trade-off with accuracy.

Accessibility reflects the availability of information from the holdings of the agency, taking into account the suitability of the form in which the information is available, the media of dissemination, the availability of metadata, and whether the user has a reasonable opportunity to know that it is available and how to access it. The affordability of that information to users in relation to its value to them is also an aspect of this characteristic.

Interpretability of data and information reflects the ease with which the user may understand and properly use and analyse the data or information. The adequacy of the definitions of concepts, target populations, variables and terminology underlying the data, and of the information on any limitations of the data, largely determines their degree of interpretability.

Coherence of data and information reflects the degree to which the data and information from a single statistical program, and data brought together across data sets or statistical programs, are logically connected and complete. Fully coherent data are logically consistent: internally, over time, and across products and programs. Where applicable, the concepts and target populations used or presented are logically distinguishable from similar, but not identical, concepts and target populations of other statistical programs, or from commonly used notions or terminology.
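Two of the quality dimensions above, timeliness and punctuality, are simple date arithmetic. A minimal sketch in Python (the dates are illustrative, not from the source):

```python
from datetime import date

def timeliness_days(release: date, reference_period_end: date) -> int:
    """Days elapsed between the end of the reference period and the release."""
    return (release - reference_period_end).days

def punctuality_days(actual_release: date, planned_release: date) -> int:
    """Days between actual and planned release (positive means late)."""
    return (actual_release - planned_release).days

# Example: a monthly indicator for March, planned for 15 April, released 18 April
print(timeliness_days(date(2016, 4, 18), date(2016, 3, 31)))   # 18
print(punctuality_days(date(2016, 4, 18), date(2016, 4, 15)))  # 3
```

Real quality reports would express these against targets (e.g. "release within 45 days of the reference period"), but the underlying measures are this simple.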

Quality assurance
• At the agency level
• At the programme design stage
• At the implementation stage
• At the post-collection evaluation stage

Metadata
Metadata is “data that defines and describes other data.” (OECD)
Metadata is “data [information] that provides information about other data.” (Wikipedia)

Metadata
Structural metadata are needed to identify, use, and process statistical data. Reference metadata describe the content and the quality of the statistical data. (Source: “Management of Statistical Metadata at the OECD”, OECD, 2006)

Structural metadata must be present together with the statistical data; otherwise it becomes impossible to identify, retrieve and navigate the data. Structural metadata should preferably include all of the following:
• Variable name(s) and acronym(s), which should be unique (e.g. Financial Intermediation Services Indirectly Measured, FISIM). It is an advantage if these names and acronyms correspond as far as possible to entries in the OECD Glossary of Statistical Terms; terms from the Glossary will be clickable from MetaStore/OECD.Stat.
• Discovery metadata, allowing users to search for statistics corresponding to their needs. Such metadata must be easily searchable and are typically at a high conceptual level, allowing users unfamiliar with OECD data structures and terminology to learn whether the Organisation holds statistics that might suit their needs (e.g. users searching for statistics related to “inflation” should be given some indication of where to go for a closer look).
• Technical metadata, making it possible to retrieve the data once users have found that they exist. An example is the coordinates (combinations of members of the dimensions in the data cube), as kept in MetaStore.

Reference metadata, on the other hand, describe the content and the quality of the statistical data. Preferably, reference metadata should include all of the following:
• Conceptual metadata, describing the concepts used and their practical implementation, allowing users to understand what the statistics are measuring and, thus, their fitness for use;
• Methodological metadata, describing the methods used for the generation of the data (e.g. sampling, collection methods, editing processes, transformations);
• Quality metadata, describing the different quality dimensions of the resulting statistics (e.g. timeliness, accuracy).
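The structural/reference split above can be made concrete as two record types. This is an illustrative sketch, not the OECD's actual MetaStore schema; all field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class StructuralMetadata:
    """Travels with the data; needed to identify, retrieve and navigate it."""
    variable_name: str   # unique name, e.g. "Financial Intermediation Services Indirectly Measured"
    acronym: str         # e.g. "FISIM"
    keywords: list       # discovery metadata: high-level search terms
    dimensions: list     # technical metadata: dimensions of the data cube

@dataclass
class ReferenceMetadata:
    """Describes the content and quality of the data."""
    concepts: str        # conceptual: what the statistics measure
    methods: str         # methodological: sampling, collection, editing, transformations
    quality: dict        # quality dimensions, e.g. {"timeliness": "...", "accuracy": "..."}

fisim = StructuralMetadata(
    variable_name="Financial Intermediation Services Indirectly Measured",
    acronym="FISIM",
    keywords=["banking", "national accounts"],
    dimensions=["FREQ", "REF_AREA", "TIME_PERIOD"],
)
print(fisim.acronym)  # FISIM
```

The point of the split is operational: structural metadata must be machine-readable and stored alongside every observation, while reference metadata can live in a separate report linked to the dataset.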

Standards and architecture
GSBPM, GSIM, CSPA, SDMX, DDI
• To promote harmonization of the business and information systems architectures
• To support collaboration on the development of statistical software
The Generic Statistical Business Process Model (GSBPM) is a means to describe statistics production in a general and process-oriented way. GSIM is a reference framework of internationally agreed definitions, attributes and relationships that describe the pieces of information used in the production of official statistics (information objects). The Common Statistical Production Architecture (CSPA) is a reference architecture for the official statistics industry, providing the blueprint for designing and building statistical services in a way that facilitates sharing and easy integration into statistical production processes within or between statistical organisations.

Business architecture
Generic Statistical Business Process Model (GSBPM):
• To define and describe statistical processes in a coherent way
• To standardize process terminology
• To compare and benchmark processes within and between organisations
• To identify synergies between processes
• To inform decisions on systems architectures and organisation of resources
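The GSBPM (v5.0) names eight phases of statistical production. Treating them as shared vocabulary is what enables the comparison and benchmarking mentioned above; a toy sketch (the helper function and survey names are illustrative, not part of the standard):

```python
# The eight GSBPM v5.0 phases, in order.
GSBPM_PHASES = [
    "Specify Needs", "Design", "Build", "Collect",
    "Process", "Analyse", "Disseminate", "Evaluate",
]

def shared_phases(process_a: set, process_b: set) -> set:
    """Phases two statistical processes have in common -- a toy synergy check.

    Once both processes are described in GSBPM terms, overlap (and hence
    candidates for shared tools and methods) is a simple set intersection.
    """
    return process_a & process_b

survey_a = {"Collect", "Process", "Analyse", "Disseminate"}
survey_b = {"Design", "Collect", "Process", "Disseminate"}
print(sorted(shared_phases(survey_a, survey_b)))
# ['Collect', 'Disseminate', 'Process']
```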

Business architecture

Information architecture
The Generic Statistical Information Model (GSIM) describes the information objects and flows within the statistical business process. GSIM is not a software tool: it is a new way of thinking! Common Statistical Production Architecture: a statistical industry architecture will make it easier for each organization to standardize and combine the components of statistical production, regardless of where the statistical services are built.

Application architecture
Statistical Data and Metadata eXchange (SDMX): a common transmission format for statistical data and metadata.
Data Documentation Initiative (DDI): a standard dedicated to microdata documentation; enables documentation of complex microdata files.
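In SDMX, a data structure definition (DSD) fixes an ordered list of dimensions, and each time series is identified by a dot-separated "series key" of dimension values in that order. A minimal sketch (the dimension names and codes here are hypothetical, not from a real DSD):

```python
def series_key(dsd_dimensions: list, observation: dict) -> str:
    """Build an SDMX-style series key: dimension values joined by '.'
    in the order the data structure definition declares them."""
    return ".".join(observation[dim] for dim in dsd_dimensions)

# Hypothetical DSD for a consumer price index dataset
dims = ["FREQ", "REF_AREA", "INDICATOR"]
obs = {"INDICATOR": "CPI", "FREQ": "M", "REF_AREA": "TN"}

print(series_key(dims, obs))  # M.TN.CPI
```

Because the key order comes from the DSD rather than the sender, any two organisations using the same DSD produce identical keys, which is what makes machine-to-machine exchange of statistics possible.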

Technology and innovation
• The Internet
• Web services/APIs
• Web 2.0
• Open data/standards/sources
• New data sources
• Data analytics