UNECE Data Integration Project

Data Integration in Official Statistics – Collaborating for Success
2016 Results
UNECE Data Integration Project

Data Integration Project
Proposed by the Modernisation Committee on Products and Sources.
Main objectives:
- Gain experience by collaborating on joint practical activities (experiments)
- Translate experience into general recommendations
- Provide initial guidance for a quality framework
12 countries involved so far: Australia, Brazil, Canada, Colombia, Hungary, Italy, Mexico, New Zealand, Netherlands, Poland, Serbia, Slovenia

Data Integration Project: Motivation
Integrating different types of data can:
- Provide more timely and more detailed statistics
- Provide new official statistics
- Meet new and unmet data needs
- Lower response burden
- Overcome the effects of declining response rates
- Address quality and bias issues in surveys
“We must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources” (Connie Citro)

Many challenges:
- Partnerships with data providers and others
- Moving from research to production
- Set-up and ongoing costs
- Speed of adoption
- Governance
- Public perceptions and communication
- Duplication of efforts
- Different forms of integration
- Producing stable output with unstable inputs
- New skills, methods, IT approaches
- Different concepts
- Minimum and ideal metadata required
- Quality measurement and management

Project approach
- Work packages
- Template for proposing experiments
- 11 experiments proposed
- Group reviewed and expressed interest in experiments
- Both single-country and multi-country experiments
- Face-to-face sprint in August 2016, hosted by Hungary
- Synthesize learnings for Work Package A

Face-to-Face Sprint Workshop
- Progressed understanding of individual work packages and experiments
- Developed structure and initial content of a guide to data integration (Work Package A – Synthesize lessons learnt from new working methods)
- Discussed and developed proposals for further work
- Used wiki and WebEx to:
  - Allow virtual participants
  - Automatically document results

Structure of a practical guide
- Opportunities
- Challenges
- Risk mitigation
- Standard Processes
- Recommended methods
- ICT considerations
- Quality Standards
- Related work in other projects/organisations
- Skills
- Resources
- Partnerships
- Governance
- Promotion and advocacy
- Recommendations
Expected outputs: 2016 and 2017.

Work Packages – Structure
Reference: Citro, Constance F. (2014). From multiple modes for surveys to multiple data sources for estimates. Survey Methodology, Vol. 40, No. 2, December 2014, pp. 137-161. Statistics Canada.

WP0 Data Sets for Common Approaches
Participants:
- Frances Krsinich, Statistics New Zealand (Coordinator)
- Gergely Bagó, HCSO Hungary
- Zoltán Csányi, HCSO Hungary
- Tiziana Tuoto, ISTAT Italy
- Jon Wylie, Statistics Canada

WP0 Data Sets for Common Approaches
Activities:
- Identify and obtain data sets that can be used for testing in a collaborative environment
- Prepare the data sets and make them available, in the Sandbox environment if possible
Experiments:
- Online and scanner data for integrated price measurement
Results to date:
- Confirmed with providers that they would provide experimental data
- Started familiarization with the Sandbox
- Provided initial guidance for approaches

WP1 Integrating survey and administrative sources
Participants:
- Kaja Malesic, SURS Slovenia (Coordinator)
- Zvone Klun, SURS Slovenia
- Ton de Waal, CBS Netherlands
- Zoltán Vereczkei, HCSO Hungary
- Gergely Bagó, HCSO Hungary
- Zoltán Csányi, HCSO Hungary
- Julieth Solano, DANE Colombia
- Miodrag Cerovina, DANE Colombia
- Sinisa Cimbaljevic, DANE Colombia
- Mara Brigitte Bravo, DANE Colombia
- Melinda Tokai, SORS Serbia
- Katarina Marjanovic, SORS Serbia

WP1 Integrating survey and administrative sources
Activities:
- Literature review on data integration methods
- Design the expected outputs of the integration
- Design and test the transformation of the administrative dataset into the statistical dataset
- Design and test the integration of administrative data and survey data
- Evaluate the statistical outputs obtained from integrating administrative and survey datasets
- Produce recommendations (initial guideline)

WP1 Integrating survey and administrative sources
Experiments:
- Integrating potential information sources for the statistical data production on job vacancies
- Linking the Statistical Register of Employment and the Labour Force Survey
- System of Consultation & Geographic Location of Schools (also WP3)

WP1 Integrating survey and administrative sources
Results to date:
- Three countries identified the benefit of working together on common approaches for JVS
- A tentative list of the steps for involving secondary sources has been prepared
- A test dataset has been accessed (Hungary)
- Evaluation is in progress

WP2 New data sources (e.g. big data) and traditional sources
Participants:
- Tiziana Tuoto, ISTAT Italy (Coordinator)
- Zoltán Vereczkei, HCSO Hungary
- Gergely Bagó, HCSO Hungary
- Julieth Solano, DANE Colombia
- Tijana Comic, SORS Serbia
- Anapapa Mulitalo, Statistics New Zealand
- Frances Krsinich, Statistics New Zealand
- Ricardo Lujan, INEGI Mexico
- Juan Munoz-Lopez, INEGI Mexico
- Jon Wylie, Statistics Canada

WP2 New data sources (e.g. big data) and traditional sources
Activities:
- Design and test integration between Internet-scraped data and traditional statistical datasets, including entity extraction and recognition and object matching
- Explore new techniques: moving from record linkage to object matching (i.e. where an object can have a looser structure than a record)
Experiments:
- Integrating potential information sources for the statistical data production on job vacancies
- Linking the Statistical Employment Register and the Labour Force Survey
- A System of Consultation and Geographic Location of Schools
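The move from record linkage to object matching can be illustrated with a minimal Python sketch. It is not the method used in the WP2 experiments: the field names, the example data and the 0.8 threshold are all assumptions made for illustration, and string similarity from the standard library stands in for whatever matching technique a real experiment would choose. The point it shows is that a scraped object only needs one usable attribute (here a name) to be compared with a fully structured register record.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def match_objects(scraped, register, key="name", threshold=0.8):
    """Pair each scraped object with its most similar register record.

    Scraped objects may lack fields that register records have; only `key`
    is required. Objects whose best score is below `threshold` are returned
    with None as the matched record.
    """
    matches = []
    for obj in scraped:
        best, best_score = None, 0.0
        for rec in register:
            score = similarity(obj[key], rec[key])
            if score > best_score:
                best, best_score = rec, score
        if best_score < threshold:
            best = None
        matches.append((obj, best, best_score))
    return matches


# Hypothetical data: a structured register and loosely structured scraped ads.
register = [
    {"id": 1, "name": "Acme Logistics Ltd", "region": "North"},
    {"id": 2, "name": "Blue River Bakery", "region": "South"},
]
scraped = [
    {"name": "ACME  Logistics Ltd.", "title": "Driver"},
    {"name": "Unknown Startup"},
]

matches = match_objects(scraped, register)
for obj, rec, score in matches:
    print(obj["name"], "->", rec["id"] if rec else None, round(score, 2))
```

In practice the threshold would need tuning against a clerically reviewed sample, since both false matches and missed matches feed into the quality of the integrated output.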

WP2 New data sources (e.g. big data) and traditional sources
Results to date:
- Progress depends on sourcing the appropriate data (in progress)
- The big data lab was not used this year, but participants are keen to use it in the next 12 months
- Created an initial list of the issues for the guide
- Initial consideration of the methodological changes required (e.g. if item attributes are missing from new sources) (Statistics Canada paper)
- Recognised that changes are required anyway because of changes in shopping habits (e.g. towards online sales)

WP3 Integrating Geospatial and Statistical Information
Activities:
- Review other efforts (such as Eurostat’s GEOSTAT, GISCO and the UN’s GGIM)
- Compile an inventory of sources and projects
- Determine what valuable information could be produced by integrating statistical and geographical information
- Research methodologies to produce new information products
- Conduct some experiments and pilots to assess the value and adjust the scope
Not expected to cover all the activities in just one year.

WP3 Integrating Geospatial and Statistical Information
Participants:
- Monika Sekular, CSO Poland (Coordinator)
- Julieth Solano, DANE Colombia
- Angel Alberto Arellano Rincon, DANE Colombia
- Sandra Patricia Rincón Méndez, DANE Colombia
- Mara Brigitte Bravo, DANE Colombia
- Ricardo Lujan, INEGI Mexico
- Juan Munoz-Lopez, INEGI Mexico

WP3 Integrating Geospatial and Statistical Information
Experiments:
- Inventory the level of integration of spatial objects used in statistics and geodesy: the 10 Level Model for harmonization of the statistical and geodesy reference frameworks
- Analysis of a scheme for integrating geospatial and statistical information
Aim: “To establish one model or schema that allows countries to compare, analyze and monitor socio-economic phenomena, not only at national level but global level. The project proposes the study of the existing models so that it can be determined which one is of better qualities to be applied globally, or which schemes or mixtures of them could be more adequate.”

WP3: The proposal submitted by Colombia fits within the Polish proposal, because the “10 Level Model” relates to the three lower components of the Statistical Spatial Framework (SSF).

WP3 Integrating Geospatial and Statistical Information
Results to date:
The project proposed by Poland set out to synchronize the cadastral and statistical divisions, which would in turn unify the two systems. The work concentrated on examining whether a unified model is universal and acceptable to other countries, and to what extent it could be adopted in spatial work at the national level. A coherent model would increase the consistency of statistical data based on the integration of geospatial information.

WP3 Integrating Geospatial and Statistical Information
Results to date:
This work was preceded by collecting the necessary information about the status of digitization through a questionnaire similar to the one carried out in Europe in the GEOSTAT 2 project. In the first step, the project’s participants answer the survey questions. These countries will then coordinate the same survey among the countries of South America, and possibly North and Central America.

WP4 Micro-Macro integration
Activities:
- Identify potential sources and statistics that can benefit from micro-macro integration
- Road test existing guidelines in particular national settings, then review, refine and further develop them
Experiments:
- Nil in 2016

WP5 Validating Official Statistics
Participants:
- Felipa Zabala, Statistics New Zealand (Coordinator)
- Zvone Klun, SURS Slovenia
- Kaja Malesic, SURS Slovenia

WP5 Validating Official Statistics
Activities:
- Identify issues related to systematically using data from other sources in the validation of official statistics
- Recommend potential approaches and modelling techniques
Experiments:
- A comparative analysis of income data from the New Zealand Income Survey with administrative data
- Linking the Statistical Register of Employment and the Labour Force Survey

WP5 Validating Official Statistics
Results to date:
Use of the Total Survey Error framework as implemented in Statistics NZ’s quality assessment framework, based on Li-Chun Zhang’s two-phase life-cycle model for integrated statistical microdata:
- Phase 1 error framework
- Phase 2 error framework

WP5 Validating Official Statistics
Results to date:
- Identified some different applications and methods for validating official statistics
- Documented issues with the use of administrative data to validate official statistics, and lessons learnt from experiments and other validation projects, in order to:
  - Provide initial guidelines on the use of administrative data to validate official statistics
  - Recommend approaches and modelling techniques to resolve issues
Some examples of issues:
- Need for a weight to adjust for missed links
- Better method to estimate linkage errors
- Methods to adjust for linkage errors
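One generic way to think about "a weight to adjust for missed links" is to treat missed links like nonresponse and scale up the weights of linked units within adjustment classes. The Python sketch below illustrates that idea only. It is not taken from the WP5 experiments or from Statistics NZ's framework; it assumes links are missed at random within each stratum, and the field names and example data are hypothetical.

```python
from collections import defaultdict


def missed_link_weights(units):
    """Return adjusted weights for linked units, keyed by unit id.

    Each unit is a dict with 'id', 'stratum', 'weight' and 'linked' (bool).
    Within a stratum, the weight of every linked unit is multiplied by
    (total weight of all units) / (total weight of linked units), so the
    linked units also represent the units whose link was missed.
    """
    total = defaultdict(float)
    linked = defaultdict(float)
    for u in units:
        total[u["stratum"]] += u["weight"]
        if u["linked"]:
            linked[u["stratum"]] += u["weight"]

    adjusted = {}
    for u in units:
        if not u["linked"]:
            continue  # unlinked units carry no weight in the linked file
        s = u["stratum"]
        adjusted[u["id"]] = u["weight"] * total[s] / linked[s]
    return adjusted


# Hypothetical survey units: half of the "north" links were missed.
units = [
    {"id": "a", "stratum": "north", "weight": 10.0, "linked": True},
    {"id": "b", "stratum": "north", "weight": 10.0, "linked": False},
    {"id": "c", "stratum": "south", "weight": 5.0, "linked": True},
    {"id": "d", "stratum": "south", "weight": 5.0, "linked": True},
]

adjusted = missed_link_weights(units)
print(adjusted)
```

The adjusted weights sum to the original weighted total in each stratum. The adjustment removes bias only to the extent that the missed-at-random assumption holds within strata, which is one reason better methods to estimate linkage errors appear in the same list of issues.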

WPA Synthesize lessons learned from new working methods
Activities:
- Synthesize in a clear and standard structure (guideline)
- Seek input from others with experience
Results to date:
- Group agreed on a structure for a Guide to Data Integration for Official Statistics
- Initial content provided

WPA Synthesize lessons learned from new working methods
- Produce an online, adaptive, practical guide
- Give a starting point
- Simple structure if possible
- Cover all relevant issues, risks etc.
- Promote the Modernstats standards
- Promote the international work on integrating geospatial and statistical information
- Reuse, not redevelop!
- Case studies

Structure of Guide

Definition of Success
Future:
- Common approaches for particular areas of statistics
- Sources secured and risks managed
- New sources easily incorporated into our statistics
- Guidance easy and effective to use
High:
- One or more concrete statistics produced
- Methods developed
- Modernstats standards used
- Guidance and quality frameworks created
Minimum:
- Countries share their experience
- Collaborate on approaches in WP1-5

Concluding Messages
- Enormous range of potential data sources
- Need new methods and approaches to quality
- Many countries are working on integrating multiple data sources
- Working together can speed up modernisation
- Opportunity to leverage the research and approaches in all of our countries
- There may be additional types of data integration that should be considered
Do you:
- Face similar issues?
- Work on similar experiments?
- Have knowledge to share?
Please provide input for the guide!