A handbook on validation methodology Marco Di Zio Istat

A handbook on validation methodology Marco Di Zio Istat Workshop ValiDat Foundation – Wiesbaden, 10-11 November 2015

Underlying idea of the handbook (HB)
Why a handbook on methodology for data validation (DV)? To standardize the language and the elements of validation and to provide common measures for evaluation, that is, to establish a common reference framework and develop metrics for evaluating DV.
The HB is composed of two main parts:
1. A generic framework for data validation.
2. A discussion of metrics to evaluate a validation procedure (tuning, evaluating the procedure, ...).

Generic framework for data validation
The objective of this first section is to clarify the what, the why and the how of data validation, and …

Generic framework for data validation
Clearly establish the relation with the other phases of the statistical production process and with international standards such as GSBPM, GSDEMs and GSIM.
Describe the data validation life cycle, which is useful for managing the data validation process.

What is data validation?
Definition: data validation is an activity verifying whether or not a combination of values is a member of a set of acceptable combinations.
This is not far from the UNECE definition, "an activity aimed at verifying whether the value of a data item comes from the given (finite or infinite) set of acceptable values", but it is essentially different: it refers to combinations of values rather than to the value of a single data item.

What…
It is a decisional procedure ending with the acceptance or rejection of the data.
The decisional procedure is generally based on rules expressing the acceptable combinations of values.
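As a minimal sketch of this decisional procedure, assume a hypothetical record with two variables (`age_class`, `marital_status`) and an invented set of acceptable combinations. Validation then reduces to a membership test followed by an accept/reject decision. None of the names or values below come from the handbook.

```python
# Hypothetical set of acceptable (age_class, marital_status) combinations.
ACCEPTABLE = {
    ("child", "single"),
    ("adult", "single"),
    ("adult", "married"),
    ("adult", "widowed"),
}


def validate(record: dict) -> bool:
    """Return True if the record's combination of values is acceptable."""
    combination = (record["age_class"], record["marital_status"])
    return combination in ACCEPTABLE


def decide(records: list) -> str:
    """Accept the data only if every record passes the membership test."""
    return "accept" if all(validate(r) for r in records) else "reject"
```

In practice the acceptable set is rarely enumerated like this; it is usually described implicitly by rules, as the slide notes.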

Why do we perform data validation?
The purpose of data validation is to ensure a certain level of quality of the final data, but quality has several aspects.
The HB clarifies which aspects are related to DV: essentially those related to the 'structure of the data', namely accuracy, comparability and coherence.
Other aspects are connected as well; for example, timeliness can be seen as a constraining factor.

How to perform DV
Two main elements:
Validation levels: the extent to which a data set has been validated.
Validation rules: rules are applied to the data; the failure of a rule implies that the corresponding validation level is not attained by the data at hand (decisional process: accept / not accept).
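The link between rules and levels can be sketched as follows. The level numbers, the rules and the variable names (`turnover`, `profit`) are illustrative assumptions, and so is the choice to treat levels as cumulative (a level counts as attained only if all lower levels are attained too); the slide itself only states that a failed rule means the corresponding level is not attained.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]

# Hypothetical rules grouped by an illustrative level number.
RULES_BY_LEVEL: Dict[int, List[Rule]] = {
    0: [lambda r: isinstance(r.get("turnover"), (int, float))],
    1: [
        lambda r: r["turnover"] >= 0,
        lambda r: r["turnover"] >= r["profit"],
    ],
}


def attained_level(record: dict) -> int:
    """Return the highest level whose rules, and those of all lower levels,
    pass for this record. Returns -1 if the lowest level already fails."""
    attained = -1
    for level in sorted(RULES_BY_LEVEL):
        if not all(rule(record) for rule in RULES_BY_LEVEL[level]):
            break  # a failed rule means this level is not attained
        attained = level
    return attained
```

For example, a record with `turnover=10` and `profit=20` attains level 0 but not level 1 under these invented rules.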

Validation levels
They are related to the perspective of the 'validator'. In the HB:
Business perspective: starting from the elements that usually characterise the DV process (increasing information).
Formal approach: looking at the elements that characterise a data point in a statistical setting.

Validation levels: business perspective

Validation levels: formal approach
Metadata aspects that are necessary to identify a data point:
The universe U from which a statistical object originates (e.g., household, company).
The time t of selecting an element u from the current population p(t).
The selected element u. This determines the values of the variables X that may be observed over time.
The variable X selected for measurement.
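A data point identified by these four aspects can be sketched as a small key type. The field names and example values are hypothetical. The helper reports which aspects vary among the data points a rule compares; reading the formal levels in terms of such variation is an assumption made here for illustration, not something stated on the slide.

```python
from typing import List, NamedTuple, Set


class DataPointKey(NamedTuple):
    universe: str  # U: where the statistical object originates
    time: str      # t: time of selecting the element
    unit: str      # u: the selected element
    variable: str  # X: the variable selected for measurement


def varying_aspects(keys: List[DataPointKey]) -> Set[str]:
    """Names of the key fields taking more than one value across the points."""
    return {
        field
        for field in DataPointKey._fields
        if len({getattr(key, field) for key in keys}) > 1
    }
```

A rule comparing turnover and profit of the same company in the same year would, in this sketch, vary only in `variable`.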

Data validation - GSDEMs
In the Generic Statistical Data Editing Models (GSDEMs), statistical data editing is composed of three function types: Review, Selection and Amendment.
The review functions are defined as: "Functions that examine the data to identify potential problems. This may be by evaluating formally specified quality measures or edit rules or by assessing the plausibility of the data in a less formal sense, for instance by using graphical displays."

Data validation - GSDEMs
Among the function categories of the GSDEMs there is 'Review of data validity', defined as: "Functions that check the validity of data values against a specified range or a set of values and also the validity of specified combinations of values. Each check leads to a binary value (TRUE, FALSE)."
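The three kinds of check named in this definition (range, set of values, combination of values) can be sketched as follows. The variables, bounds, code list and the combination rule are invented for illustration; only the idea that each check yields a binary outcome is taken from the definition.

```python
def in_range(value: float, low: float, high: float) -> bool:
    """Range check: is the value within [low, high]?"""
    return low <= value <= high


def in_code_list(value, allowed) -> bool:
    """Set check: is the value one of the allowed codes?"""
    return value in allowed


def valid_combination(record: dict) -> bool:
    """Combination check (hypothetical rule): a person under 16
    must not be recorded as employed."""
    return not (record["age"] < 16 and record["employment"] == "employed")


def review(record: dict) -> dict:
    """Run all checks; each one leads to a binary value (True/False)."""
    return {
        "age_range": in_range(record["age"], 0, 120),
        "sex_codes": in_code_list(record["sex"], {"F", "M"}),
        "age_employment": valid_combination(record),
    }
```

Returning one flag per check, rather than a single overall verdict, keeps the failed rule identifiable for later steps.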

Data Validation - GSBPM

Data validation life cycle

Second part of the document: Metrics
Evaluating a validation procedure … next presentation …

Thanks for your attention