Saskia Ossen and Piet Daas: Introduction to the Source and Metadata hyperdimension

Presentation transcript:

Saskia Ossen and Piet Daas: Introduction to the Source and Metadata hyperdimension

Content of this module
- Introduction to the Source and Metadata hyperdimensions
- Introduction of the quality checklist
- Theory and practical examples
- Group exercise in which groups apply the checklist to an "imaginary" source

Quality checklist
The quality checklist:
- Can be used to evaluate the Source and Metadata hyperdimensions
- Contains 34 indicators and 51 questions (measurement methods)
- Takes around 2 hours per data source
- Findings are expressed at the dimension level: 5 dimensions for Source, 4 for Metadata
- Can be found at: ...NL/menu/methoden/onderzoek-methoden/discussionpapers/archief/2009/...x10-pub.htm (in the Statistics Netherlands discussion paper archive)
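To make the structure of the checklist concrete, here is a minimal Python sketch of what this slide describes: a four-value score scale, indicators with measurement questions, and dimensions whose indicators are aggregated into a dimension-level finding. The class names and the worst-score aggregation rule are illustrative assumptions; the actual CBS checklist defines its own 34 indicators and 51 questions.

# Minimal sketch of the checklist structure described above. All names and the
# worst-score aggregation rule are illustrative assumptions; the real checklist
# contains 34 indicators and 51 measurement questions.
from dataclasses import dataclass, field
from enum import Enum


class Score(Enum):
    GOOD = "+"
    REASONABLE = "o"
    POOR = "-"
    UNCLEAR = "?"


# Ordering used only for the illustrative "worst score wins" aggregation below.
SEVERITY = {Score.GOOD: 0, Score.REASONABLE: 1, Score.UNCLEAR: 2, Score.POOR: 3}


@dataclass
class Indicator:
    name: str
    questions: list[str]          # the measurement methods
    score: Score = Score.UNCLEAR


@dataclass
class Dimension:
    name: str                     # e.g. "Delivery" or "Clarity"
    indicators: list[Indicator] = field(default_factory=list)

    def finding(self) -> Score:
        # Dimension-level finding; here simply the worst indicator score.
        if not self.indicators:
            return Score.UNCLEAR
        return max((i.score for i in self.indicators), key=SEVERITY.get)


if __name__ == "__main__":
    delivery = Dimension("Delivery", [
        Indicator("Punctuality", ["Is the source delivered on time?"], Score.POOR),
        Indicator("Format", ["Is the delivery format agreed and stable?"], Score.GOOD),
    ])
    print(delivery.name, delivery.finding().value)   # prints: Delivery -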

Source hyperdimension
SOURCE:
- Focus on the data source as a whole
- Aspects related to contact information
- Aspects related to delivery
- and more

Evaluation of the Source hyperdimension
- Here the data source is viewed as a file delivered by the data source holder to the NSI (National Statistical Institute)
- Dimensions (5):
  - Supplier: contact information, purpose of use
  - Relevance: NSI use, need, effect on response burden
  - Privacy and security: legal basis, confidentiality, security
  - Delivery: costs, arrangements, format, selection
  - Procedures: collection, changes, feedback, fall-back scenario

Practical example, Source hyperdimension
Scores: +, good; o, reasonable; -, poor; ?, unclear
IPA: Insurance Policy records Administration; SFR: Student Finance Register; CWI: register of the Centre for Work and Income; ERR: Exam Results Register; NCP: National Car Pass register; PR: Persons Register; VAT: Value Added Tax data; ICP: Intra-Community Product transactions (EU countries); NHR: New Housing Register

Practical example, Source hyperdimension
- CWI scores 'poor' in Delivery
  - The result of delivery issues (delays)
  - These need to be solved (and have been solved)
- VAT scores low in Procedures
  - Related to the back-up scenario: what to do when no data, or only part of the data, is delivered?
  - The first research efforts focused purely on direct use; the back-up options are currently being studied
- Other data sources
  - Quite OK (there are always some things that can be improved)

Metadata hyperdimension
METADATA: Focuses on the (availability of the) information required to understand and use the data in the data source

Evaluation of the Metadata hyperdimension
- Focuses on the conceptual metadata quality components of the data source
- Dimensions (4):
  - Clarity: of units, variables, time definitions and changes
  - Comparability: of units, variables, and time with those of the NSI
  - Unique keys: presence, similarity to the NSI's keys, and alternatives
  - Data treatment: familiarity with checks and modifications (by the data source holder)
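Taken together with slide 5, this gives all nine dimensions across the two hyperdimensions. Purely as an illustration, the grouping can be written down as a plain dictionary; the keys and descriptions are copied from the slides, while the data structure itself is an assumption, not part of the checklist.

# The dimensions of the Source and Metadata hyperdimensions, as enumerated on
# the evaluation slides. The dictionary layout is an illustrative choice.
CHECKLIST_DIMENSIONS = {
    "Source": {
        "Supplier": "contact information, purpose of use",
        "Relevance": "NSI use, need, effect on response burden",
        "Privacy and security": "legal basis, confidentiality, security",
        "Delivery": "costs, arrangements, format, selection",
        "Procedures": "collection, changes, feedback, fall-back scenario",
    },
    "Metadata": {
        "Clarity": "units, variables, time definitions and changes",
        "Comparability": "units, variables, and time versus those of the NSI",
        "Unique keys": "presence, similarity to the NSI's keys, and alternatives",
        "Data treatment": "checks and modifications by the data source holder",
    },
}

if __name__ == "__main__":
    for hyper, dims in CHECKLIST_DIMENSIONS.items():
        print(f"{hyper} hyperdimension ({len(dims)} dimensions): {', '.join(dims)}")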

Practical example, Metadata hyperdimension
Must have a specific use in mind!
Scores: +, good; o, reasonable; -, poor; ?, unclear
IPA: Insurance Policy records Administration; SFR: Student Finance Register; CWI: register of the Centre for Work and Income; ERR: Exam Results Register; NCP: National Car Pass register; PR: Persons Register; VAT: Value Added Tax data; ICP: Intra-Community Product transactions (EU countries); NHR: New Housing Register

Practical example, Metadata hyperdimension
- CWI scores 'poor' in Clarity
  - The definitions used by the data source holder are difficult to understand
- CWI scores 'poor' in Comparability
  - Because of definitions that are incomparable (and inconvertible) to the ones used by Statistics Netherlands
- Other data sources
  - A '?' for Data treatment indicates that processing by the data source keeper needs more attention (this has improved)
  - Others are quite OK (there are always some things that can be improved)
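To connect the two example slides, the findings they explicitly mention can be written down in code. This is a hedged sketch that encodes only those stated scores (CWI 'poor' in Delivery, Clarity and Comparability; VAT 'poor' in Procedures) and leaves the rest of the original score table out rather than guessing it; the helper function is hypothetical, not part of the checklist.

# Only the scores explicitly mentioned on the example slides are encoded here;
# the remaining cells of the original score table are not reproduced or guessed.
EXAMPLE_FINDINGS = {
    "CWI": {"Delivery": "-", "Clarity": "-", "Comparability": "-"},
    "VAT": {"Procedures": "-"},
    # Some sources scored '?' on Data treatment, but the slides do not say
    # which ones, so they are left out.
}


def blocking_dimensions(findings: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    # Per data source, list the dimensions that score 'poor' ('-').
    return {
        source: [dim for dim, score in scores.items() if score == "-"]
        for source, scores in findings.items()
    }


if __name__ == "__main__":
    for source, dims in blocking_dimensions(EXAMPLE_FINDINGS).items():
        print(f"{source}: solve {', '.join(dims)} before using the data")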

Conclusions about the checklist
- The checklist as a tool:
  - A good way to assist the user, and quite fast
    - Provides quality information on a basic but essential (meta-)level
    - Prevents users from missing important quality components
  - Independent of the actual delivery of the data!
    - A nice feature that adds flexibility
    - A way to pre-evaluate data sources

General remarks / tips
- Use the checklist to identify problems in the Source and Metadata hyperdimensions; do not immediately dive into the data!
- Problems in negatively scoring dimensions need to be solved before studying the quality of the data
- Other, less problematic issues can be solved later (at less hectic times)
- Considering the limited time needed to evaluate the Source and Metadata hyperdimensions, it is recommended to always start with these; repeat when needed
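The workflow recommended on this slide (run the checklist first, solve the negatively scoring dimensions before touching the data itself, postpone the rest) can be summarised as a small triage helper. This is a sketch under the assumption that dimension findings are recorded on the +/o/-/? scale used earlier; in the example, all CWI scores other than the 'poor' on Delivery are hypothetical.

# Illustrative triage of dimension-level findings, following the tips above:
# '-' must be resolved before the data itself is studied, 'o' and '?' can be
# picked up later (at less hectic times), '+' needs no action.
def triage(findings: dict[str, str]) -> dict[str, list[str]]:
    buckets = {"solve_first": [], "solve_later": [], "ok": []}
    for dimension, score in findings.items():
        if score == "-":
            buckets["solve_first"].append(dimension)
        elif score in ("o", "?"):
            buckets["solve_later"].append(dimension)
        else:  # '+'
            buckets["ok"].append(dimension)
    return buckets


if __name__ == "__main__":
    # Hypothetical Source-hyperdimension scores for CWI; only the 'poor' score
    # on Delivery is taken from the slides.
    cwi = {"Supplier": "+", "Relevance": "+", "Privacy and security": "+",
           "Delivery": "-", "Procedures": "o"}
    print(triage(cwi))
    # {'solve_first': ['Delivery'], 'solve_later': ['Procedures'],
    #  'ok': ['Supplier', 'Relevance', 'Privacy and security']}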

Questions? Any questions or comments?

Introduction to the exercise
- Let's try to interpret the findings of a Dutch 'checklist'

Introduction to the exercise
- Participants will be split into groups, and each group is provided with:
  - The Source and Metadata evaluation results for an administrative data source
  - An intended use
- Each group will be asked to discuss whether the source could be used for the intended purpose:
  - If yes, why is everything OK?
  - If not, what problem(s) prevent its use, and how can they be solved?