Improving data quality in business surveys for National Statistics

Presentation transcript:

Effect of cross validation in online questionnaires - on subsequent data editing Improving data quality in business surveys for National Statistics Hanne-Pernille Stax & Peter Tibert Stoltze, Statistics Denmark

Outline Development of online questionnaires for business surveys at Statistics Denmark: 2008-2014 Online validation – why and how? Case 1: Transportation of goods by lorry Case 2: Vacant positions Conclusion and perspectives

Online questionnaires for business surveys in Statistics Denmark 110 business surveys per year (yearly, quarterly, monthly) 450,000 forms submitted per year 85+ % digital submission Implementation of online validation 2008 > Wave 1: Digital copy of paper questionnaire Wave 2: Digital questionnaires with internal edit checks Wave 3: Digital questionnaires with cross validation: Online comparison between keyed data and pre-known data about the individual unit

GSBPM

Online edit checks – why? Conventional process: Q1 R1 Q2 R2 Submit Edit Re-contact Integrated edit checks: Q1 R1 Check, Edit/Confirm Q2 R2 Check, Edit/Confirm Submit Valid data Instant feedback - if data violate edit rules R can review, edit, confirm or explain - before submission Reduce risk of error and subsequent error-upon-error Reduce need for data editing and re-contact Improve data quality Reduce respondent burden More effective process

Online edit checks – how? Simple edit checks on single values Missing *, type (number), scope (0-100), pattern... Complex edit checks Auto calculation, routing, cross validation Hard stops or responsive assisting guidance Form & level NOT guided by documented effect, but: Technological capability Respondent expectation Methodological presumptions 11/22/2018
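The simple single-value checks listed on this slide (missing, type, scope, pattern) can be illustrated with a minimal Python sketch. The function name `check_value`, its parameters and the returned labels are hypothetical and chosen only for illustration; they are not taken from Statistics Denmark's system.

```python
import re
from typing import List, Optional


def check_value(raw: Optional[str], required: bool = True,
                lo: float = 0, hi: float = 100,
                pattern: Optional[str] = None) -> List[str]:
    """Return the names of the simple edit checks that a keyed value violates.

    Hypothetical helper: the check names mirror the slide
    (missing, type, scope, pattern); everything else is an assumption.
    """
    problems: List[str] = []
    text = (raw or "").strip()

    # Missing: a required field was left empty.
    if not text:
        if required:
            problems.append("missing")
        return problems

    # Pattern: the value must match an optional regular expression.
    if pattern is not None and not re.fullmatch(pattern, text):
        problems.append("pattern")

    # Type: the value must be a number.
    try:
        number = float(text)
    except ValueError:
        problems.append("type")
        return problems

    # Scope: the number must lie inside the allowed range (0-100 by default).
    if not (lo <= number <= hi):
        problems.append("scope")
    return problems
```

For example, `check_value("150")` reports a scope violation, while `check_value("42")` returns an empty list. Whether a reported violation becomes a hard stop or only a soft warning is a separate design decision that the sketch leaves to the caller.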

Case 1: Transportation of goods by lorry Data: Report all trips for specific truck in specific week: Length of each trip + goods type and weight For control (post collection editing) Km driven in total Start and end point of each trip - area code

Case 1: Issues (Goods by Lorry) Data quality is poor: Trips are not linked > Empty trips are missing Reported length of trips is unreliable Sum of trips ≈ 2 x km driven in total Trip 1: From Copenhagen To Odense Trip 2: From Hamburg To Copenhagen

Case 1: Online validation (Goods by Lorry) Responsive soft assisting functionality Facilitate internal cross comparison Auto-transfer of values (km in total) Auto-sum of trips (running tally) Auto link of trips

Case 1: Control question (Goods by Lorry) In total how many kilometers in week? Calculated from km counter values – at start Total

Case 1: Cross validation (Goods by Lorry) Sum of trips Transfer & display Total km - for reference Colour format if Sum exceeds Total. Sum Total
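As a rough sketch of the cross comparison described here, the running tally of trip lengths can be computed and compared with the reported total for the week. The function names and example figures are hypothetical; only the rule "highlight if Sum exceeds Total" comes from the slide.

```python
from itertools import accumulate
from typing import List


def running_tally(trip_km: List[float]) -> List[float]:
    """Cumulative sum of trip lengths, one entry per keyed trip."""
    return list(accumulate(trip_km))


def sum_exceeds_total(trip_km: List[float], total_km: float) -> bool:
    """True if the summed trips exceed the reported total km for the week.

    In the questionnaire this condition would trigger the colour
    formatting; here it is returned as a plain flag.
    """
    return sum(trip_km) > total_km


# Invented example: three trips against a reported weekly total of 240 km.
trips = [120, 80, 45]
tally = running_tally(trips)            # tally after each keyed trip
flag = sum_exceeds_total(trips, 240)    # the sum is above the total
```

The check is deliberately soft: it returns a flag for the respondent to review rather than blocking submission.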

Case 1: Auto fill (Goods by Lorry) Auto-link of trips: Auto-transfer of end place of preceding trip to starting place of following trip.
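The auto-link idea can be sketched as follows, assuming each trip is a record with a start and an end place. The `Trip` class and `auto_link` function are hypothetical names. The sketch only fills a start place that the respondent has left empty, so a keyed value is never overwritten; whether the real questionnaire behaves this way is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trip:
    start: Optional[str]  # None means the respondent has not keyed a start place
    end: str


def auto_link(trips: List[Trip]) -> List[Trip]:
    """Prefill each missing start place with the end place of the preceding trip."""
    linked: List[Trip] = []
    previous_end: Optional[str] = None
    for trip in trips:
        # Keep a keyed start place; otherwise transfer the previous end place.
        start = trip.start if trip.start is not None else previous_end
        linked.append(Trip(start=start, end=trip.end))
        previous_end = trip.end
    return linked
```

With `[Trip("Copenhagen", "Odense"), Trip(None, "Hamburg")]` as input, the second trip is prefilled to start in Odense, which prompts the respondent to account for the leg between the two reported trips.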

Case 1: Effect (Goods by Lorry) Trips are linked Empty trips included Low span between Sum of trips and Total km No series break in Total km per week. Re-design.

Case 2: Vacant positions Data: Report for specific unit at specific date Number of vacant positions at unit Number of employees at unit Issues: Edit checks AFTER data collection indicate that the report is frequently NOT for the selected work unit, but for the - larger - legal unit.

Case 2: Cross validation (Vacant positions) Known number of employees for unit is prefilled to questionnaire for each unit: Source 1: Reported in survey 1 year back Source 2: Business register Note: The two values may differ > NOT displayed (hidden prefill) Warning is shown if entered value differs too much from both prefilled values (> double / + 50) Note: Wide margin copied from post collection editing
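The hidden-prefill warning can be sketched in Python under stated assumptions. The slide's shorthand "> double / + 50" does not spell out the exact rule, so the sketch assumes that an entered value is too high relative to a reference when it is both more than double the reference and more than 50 above it. That reading, the function names and the example figures are assumptions, not the documented rule.

```python
def too_high(entered: int, reference: int,
             factor: float = 2, margin: int = 50) -> bool:
    """Assumed reading of '> double / + 50': more than `factor` times the
    reference AND more than `margin` above it."""
    return entered > factor * reference and entered > reference + margin


def needs_warning(entered: int, previous_survey: int,
                  business_register: int) -> bool:
    """Warn only if the entered value is too high against BOTH hidden
    prefill values, as the slide states."""
    references = (previous_survey, business_register)
    return all(too_high(entered, ref) for ref in references)


# Invented example: a work unit known to have about 40 employees.
warn_legal_unit = needs_warning(500, previous_survey=40, business_register=45)
warn_small_change = needs_warning(60, previous_survey=40, business_register=45)
```

Requiring a violation against both sources keeps the margin wide, in line with the slide's remark that the two prefilled values may differ.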

Case 2: Cross validation (Vacant positions) Number of employees at work unit (control variable) Number of vacant positions at work unit (core variable) Hidden unit prefill: Number of employees - from previous survey - from business register Warning if entered value differs too much from unit prefill values (wrong unit??) Number of employees at work unit “xyz” seems high. Please correct or explain and confirm.

Case 2: Effect (Vacant positions) Number of errors generally decreases over time. No grand effect of cross validation. Too coarse?

Case 2: Effect (Vacant positions) Some errors are less frequent in data from web questionnaires than in data entered via telephone

Challenges and perspectives Form and level of online validation is largely guided by technical capability & methodological presumption. Respondents expect online edit checks Need to balance interruption and assistance NSI Statisticians think data editing AFTER data collection (GSBPM) Need to rethink process and generate qualified input - early Optimized edit checks require follow up analysis Need to document errors and effect on data quality

Thank you Hanne-Pernille Stax, hps@dst.dk Peter Tibert Stoltze, psl@dst.dk