Tool for Assessing Impact of Changing Editing Rules On Cost & Quality Alaa Al-Hamad, Begoña Martín, Gary Brown Processing, Editing & Imputation Branch.


Tool for Assessing Impact of Changing Editing Rules on Cost & Quality
Alaa Al-Hamad, Begoña Martín, Gary Brown
Processing, Editing & Imputation Branch
Business Surveys

1. Overview
- Data Editing in the ONS
- Error Detection Rules Problems
- Survey Managers' Dilemma
- Proposed Tool
- Tool illustration & output
- Conclusion and Further Work

2. Editing in the ONS
Data editing is a costly component of the data cleaning process in the ONS.
Data editing is defined as "an activity aimed at detecting and correcting errors in data" (ONS Glossary).
In practice this involves:
- detecting error-suspect data using editing rules, e.g. fail if A + B > (estimated parameter)
- verifying or correcting error-suspect data with the source
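The example rule above can be written as a simple predicate. This is a minimal sketch only: the `Record` type, the field names `a` and `b`, and the threshold value of 100 are hypothetical and are not taken from any ONS survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    ref: str   # hypothetical respondent reference
    a: float   # hypothetical survey variable A
    b: float   # hypothetical survey variable B


def fails_rule(record: Record, threshold: float) -> bool:
    """Return True when the record is error-suspect, i.e. A + B > threshold."""
    return record.a + record.b > threshold


# Illustrative data and parameter (assumptions, not real survey values).
records = [
    Record("r1", 40.0, 50.0),
    Record("r2", 70.0, 60.0),
    Record("r3", 10.0, 5.0),
]
THRESHOLD = 100.0

# Records that would be sent for verification with the source.
suspects = [r.ref for r in records if fails_rule(r, THRESHOLD)]
print(suspects)
```

Raising `THRESHOLD` loosens the rule, so fewer records are flagged. Lowering it makes the rule more conservative.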

3. Detection Rules Problems
If rule parameters are too conservative:
- increased response burden (unnecessary recontacts)
- reduced data quality (over-validation errors and biases)
- costly in terms of staff & resources
If rule parameters are too liberal:
- uncorrected errors are allowed through
- reduced data quality
- costly in terms of reputation
- less costly in terms of staff & resources

4. Survey Managers' Dilemma
When managers are asked to achieve savings, they face a trade-off: savings vs quality impact.
An easy way to make quick savings is to loosen the rule parameters so that less data is edited.
The challenge is knowing where to stop: what impact will such action have on the estimates?
Remember: quality loss is defined not solely by the number of errors missed but also by the size of each error.

5. Proposed Tool
Ideally, what is required is a dynamic routine for editing rule parameters that is applicable to all business surveys and:
- offers a choice of different quality measurement criteria
- considers all editing rules simultaneously
- outputs proposed changes to parameters
- outputs savings and quality loss per changed rule and in total
A dynamic routine has not yet been developed, so we have pursued a pragmatic solution with the same criteria.

6. Suitable Measurements
A measure of savings:
- Savings = number of records that no longer require editing
A measure of impact:
- The exact impact on final estimates is difficult, time-consuming and costly to calculate
- Instead, use a relative change computed from X and w, where X = a response before and after the parameter change and w = a calibration weight
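The transcript does not preserve the relative change formula itself. As a sketch only, one plausible form is the absolute percentage change in the weighted total, 100 × |Σ w·X_after − Σ w·X_before| / |Σ w·X_before|. That form, the function name `relative_change_pct`, and the numbers below are assumptions made for illustration, not the authors' definition.

```python
def relative_change_pct(weights, x_before, x_after):
    """Assumed form: percentage change in the calibration-weighted total.

    x_before: responses used in the estimate before the parameter change
    x_after:  responses used in the estimate after the parameter change
    """
    total_before = sum(w * x for w, x in zip(weights, x_before))
    total_after = sum(w * x for w, x in zip(weights, x_after))
    if total_before == 0:
        raise ValueError("weighted total before the change is zero")
    return 100.0 * abs(total_after - total_before) / abs(total_before)


# Illustrative values: only the second record changes, because its
# uncorrected value (230) replaces the edited value (200).
weights = [2.0, 1.0, 4.0]
x_before = [100.0, 200.0, 50.0]
x_after = [100.0, 230.0, 50.0]

change = relative_change_pct(weights, x_before, x_after)
print(f"relative change: {change:.2f}%")
```

With this form, a single large uncorrected error on a heavily weighted record moves the measure more than several small ones. That matches the point that quality loss depends on the size of errors as well as their number.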

7. Routine illustration
[Figure: records classified as Fail or Pass under the existing rules and under the loosened rules, split by No error and Error, with groups labelled A, B and B*.]
Savings = #(A + B)
Errors missed = #(B)
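The counting in this illustration can be sketched as follows. The reading of the groups is an assumption inferred from the two formulas: A is taken to be records that failed the existing rule, pass the loosened rule and contain no error, and B the same but containing an error. The `Rec` type, the single "fail if value > threshold" rule and the data are hypothetical.

```python
from typing import NamedTuple


class Rec(NamedTuple):
    ref: str
    reported: float   # value as returned by the respondent
    corrected: float  # value after verification; equals reported if no error


def assess_loosening(records, old_threshold, new_threshold):
    """Count savings and missed errors when a 'fail if value > threshold' rule is loosened."""
    # Records that failed the existing rule but pass the loosened one.
    released = [r for r in records if old_threshold < r.reported <= new_threshold]
    group_a = [r.ref for r in released if r.reported == r.corrected]  # no error
    group_b = [r.ref for r in released if r.reported != r.corrected]  # error
    return {
        "A": group_a,
        "B": group_b,
        "savings": len(released),        # #(A + B)
        "errors_missed": len(group_b),   # #(B)
    }


# Illustrative records (assumed values).
records = [
    Rec("r1", 90.0, 90.0),    # passes both rules
    Rec("r2", 120.0, 120.0),  # released, no error -> A
    Rec("r3", 140.0, 14.0),   # released, error    -> B
    Rec("r4", 500.0, 50.0),   # still fails the loosened rule
]

result = assess_loosening(records, old_threshold=100.0, new_threshold=150.0)
print(result)
```

This relies on historical data in which every record that failed the existing rules has already been verified, so the corrected value is known.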

8. Example of Rules Changes
- Rule 1: alter Gate 1
- Rule 2: alter Gate 2
- Rule 3: alter Gate 3

9. Routine Results
[Table: columns are Rules (Gate 1, Gate 2, Gate 3) and Routine Output (Savings, Errors Missed, Relative Change (%)); the data rows are not preserved.]

10. Conclusions
Changes to validation rules to achieve savings are often made in isolation, without considering their impact on the quality of the survey output.
In this work we offer a simple but effective decision support tool to:
- quantify the savings and loss in quality resulting from changing editing rules
- help managers identify the editing rules that have the most impact on quality
- identify the parameters that minimise quality loss for a given level of savings, and vice versa
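The last point can be sketched as a small search over candidate parameters. This is an illustration under stated assumptions, not the tool itself: it uses a single hypothetical "fail if value > threshold" rule, records given as `(weight, reported, corrected)` tuples, and an assumed quality measure (percentage change in the weighted total). All function names and numbers are made up for the example.

```python
def estimate(records, threshold):
    """Weighted total when records above the threshold are edited
    and all other records are used as reported."""
    return sum(
        w * (corrected if reported > threshold else reported)
        for w, reported, corrected in records
    )


def evaluate(records, old_threshold, new_threshold):
    """Return (savings, relative change in %) for one candidate threshold."""
    savings = sum(
        1 for _, reported, _ in records
        if old_threshold < reported <= new_threshold
    )
    before = estimate(records, old_threshold)
    after = estimate(records, new_threshold)
    if before == 0:
        raise ValueError("estimate under the existing rule is zero")
    return savings, 100.0 * abs(after - before) / abs(before)


def best_threshold(records, old_threshold, candidates, min_savings):
    """Pick the candidate with the smallest relative change among those
    that save at least min_savings records; ties go to higher savings."""
    options = []
    for t in candidates:
        savings, change = evaluate(records, old_threshold, t)
        if savings >= min_savings:
            options.append((change, -savings, t))
    if not options:
        return None
    change, neg_savings, t = min(options)
    return t, -neg_savings, change


# Illustrative data: (weight, reported, corrected).
records = [
    (1.0, 90.0, 90.0),
    (2.0, 110.0, 110.0),
    (1.0, 130.0, 100.0),
    (1.0, 160.0, 160.0),
    (1.0, 180.0, 80.0),
    (1.0, 500.0, 50.0),
]

best = best_threshold(records, 100.0, [120.0, 150.0, 200.0], min_savings=2)
print(best)
```

Swapping the roles of the two measures (maximise savings subject to a cap on relative change) gives the "vice versa" case.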

11. Further Work
Other elements of further work:
- Make the routine more dynamic
- Enhance the impact measure
- Investigate varying the parameters by domain (e.g. Standard Industrial Classification (SIC), employment sizeband)
- Apply the routine to other surveys

12. Questions
Over to you!