Work Package 5: Integrating data from different sources in the production of business statistics
Daniel Lewis
Office for National Statistics (UK)



Overview
- Issues with integrating admin and survey data
- Plans for Work Package 5
- Work stream B
- Improving predicted values

Issues with integrating admin data
- Admin data may have different unique identifiers to survey data
- One-to-one matches, and matches of many admin units to one survey unit, are fine
- Many-to-many matches, and matches of one admin unit to many survey units, are more difficult
- Matching can result in duplicate and missing units
- Common variables from admin and survey sources often have different definitions
- Admin data are often of different periodicity to survey data (dealt with in WP4)
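The match types listed on this slide can be illustrated with a small sketch. It is not taken from the presentation: the function name `classify_links` and the identifiers in the example are hypothetical, and each link is classified only by how many partners its two endpoints have. A fuller treatment would examine whole clusters of linked units.

```python
from collections import defaultdict


def classify_links(links):
    """Label each (admin_id, survey_id) link by its match type.

    Hypothetical helper for illustration only. A link is classified by
    counting how many partners each of its two endpoints has.
    """
    links = list(links)
    survey_per_admin = defaultdict(set)
    admin_per_survey = defaultdict(set)
    for admin_id, survey_id in links:
        survey_per_admin[admin_id].add(survey_id)
        admin_per_survey[survey_id].add(admin_id)

    result = {}
    for admin_id, survey_id in links:
        # Several admin units point at this survey unit
        many_admin = len(admin_per_survey[survey_id]) > 1
        # This admin unit points at several survey units
        many_survey = len(survey_per_admin[admin_id]) > 1
        if many_admin and many_survey:
            kind = "many-to-many"
        elif many_admin:
            kind = "many-to-one"   # many admin units to one survey unit
        elif many_survey:
            kind = "one-to-many"   # one admin unit to many survey units
        else:
            kind = "one-to-one"
        result[(admin_id, survey_id)] = kind
    return result


# Made-up identifiers: V* are admin units, S* are survey units
example_links = [("V1", "S1"), ("V2", "S2"), ("V3", "S2"), ("V4", "S3"), ("V4", "S4")]
link_types = classify_links(example_links)
```

In this example the links for `V4` are the harder "one admin unit to many survey units" case, because the admin value would have to be apportioned between `S3` and `S4`.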

Plans for WP5
- Split into two work streams to focus on specific topics relating to integrating data from multiple sources:
- Work stream A: methods for editing integrated data to ensure they are error-free and consistent
- Work stream B: combining admin data with survey data to improve editing and imputation

Work stream B (1/3)
- Collaboration between UK, Belgium and Italy
- Improving editing and imputation by using admin data integrated with survey data
- Initially concentrating on Structural Business Statistics
- Hope to extend to Short Term Statistics following progress by WP4 on periodicity issues

Work stream B (2/3)
- For some countries, admin data offers the possibility of imputing rather than re-weighting to account for non-response
- In other cases, admin data can improve the accuracy of predicted values used in both editing and imputation
- Begin by analysing integrated admin and survey data available in UK, Belgium and Italy
- Test whether admin data offers benefits over other available predictors

Work stream B (3/3)
- Ultimately aiming for ESS-wide recommendations
- Research availability of admin data and editing and imputation methods used in other ESS countries for SBS surveys
- Use information to undertake further analysis which will be applicable to other European countries

Use of predicted values
- Predicted values are often used when editing and imputing survey data:
- Traditional edit rules often compare survey responses with predicted values; large deviations are deemed suspicious
- Selective editing relies on having predicted values for each business in order to estimate the importance of potential errors on survey estimates
- Item non-response can be dealt with by modelling the relationship between survey and other (related) data
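The first two uses of predicted values can be sketched as follows. This is only an illustration under stated assumptions, not the ONS implementation: the function names, the ratio bounds of 0.5 and 2.0, and the score formula (weighted absolute difference divided by an estimated total, a commonly used form) are all assumptions introduced here.

```python
def fails_ratio_edit(response, predicted, lower=0.5, upper=2.0):
    """Traditional edit rule: flag a response whose ratio to the
    predicted value falls outside [lower, upper] (assumed bounds)."""
    if predicted <= 0:
        # Assumption: without a usable predictor the response is sent for review
        return True
    ratio = response / predicted
    return ratio < lower or ratio > upper


def selective_editing_score(response, predicted, weight, estimated_total):
    """Assumed score form: the weighted deviation from the predicted value,
    relative to the estimated total, approximates the effect a potential
    error in this response could have on the survey estimate."""
    return weight * abs(response - predicted) / estimated_total


# Made-up figures: a business reports 1200 against a predicted value of 1000
flagged = fails_ratio_edit(1200, 1000)
score = selective_editing_score(1200, 1000, weight=10, estimated_total=100_000)
```

Under selective editing, only responses whose score exceeds a chosen threshold would be followed up, so a better predictor directly changes which businesses are recontacted.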

Improving predicted values
- More accurate predictors can improve the ability of editing methods to identify erroneous responses
- Can improve quality of data
- Or keep quality the same whilst reducing survey costs and burden on businesses
- Also possible to improve outputs by better imputation
- Using predictors directly as imputations or modelling with survey data
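As one possible reading of "modelling with survey data", the sketch below uses ratio imputation: the ratio of survey values to predictor values among respondents is applied to the predictor for each non-respondent. Ratio imputation is swapped in here as a simple, well-known method; the slides do not say which model was used, and the function name and figures are hypothetical.

```python
def ratio_impute(survey_values, predictors):
    """Fill missing survey values (None) with ratio * predictor, where the
    ratio is estimated from units that responded. Illustrative only."""
    responded = [(y, x) for y, x in zip(survey_values, predictors) if y is not None]
    ratio = sum(y for y, _ in responded) / sum(x for _, x in responded)
    return [y if y is not None else ratio * x
            for y, x in zip(survey_values, predictors)]


# Made-up data: the second business did not respond
survey_turnover = [150, None, 300]
admin_turnover = [100, 50, 200]   # e.g. an admin-based predictor
completed = ratio_impute(survey_turnover, admin_turnover)
```

Using the predictor directly as the imputation corresponds to fixing the ratio at 1 instead of estimating it from respondents.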

Evaluating predicted values (1/2)
- Estimated error for each predictor (formula not preserved in the transcript)
- Estimate savings to editing by comparing edit failures using edit rules with current and new predictors
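A comparison of this kind might look like the sketch below. The error formula from the slide is not available, so the mean absolute relative error used here is an assumed stand-in, not the measure the author used. The edit rule bounds, function names and data are likewise hypothetical.

```python
def mean_abs_relative_error(responses, predictions):
    """Assumed error measure: mean of |prediction - response| / |response|,
    skipping zero responses. Not the formula from the slide."""
    errors = [abs(p - y) / abs(y) for y, p in zip(responses, predictions) if y != 0]
    return sum(errors) / len(errors)


def count_edit_failures(responses, predictions, lower=0.5, upper=2.0):
    """Count responses whose ratio to the prediction is outside the
    assumed bounds [lower, upper], or that have no usable prediction."""
    failures = 0
    for y, p in zip(responses, predictions):
        if p <= 0 or not (lower <= y / p <= upper):
            failures += 1
    return failures


# Made-up data: returned values and two candidate predictors
responses = [100, 200, 400]
current_predictor = [50, 500, 400]   # e.g. register values
new_predictor = [90, 220, 400]       # e.g. admin-based values

error_current = mean_abs_relative_error(responses, current_predictor)
error_new = mean_abs_relative_error(responses, new_predictor)
failures_current = count_edit_failures(responses, current_predictor)
failures_new = count_edit_failures(responses, new_predictor)
```

If the new predictor produces fewer edit failures among responses that turn out to be correct, the difference gives a rough indication of the editing effort that could be saved.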

Evaluating predicted values (2/2)
- Use imputation study to estimate imputation bias for methods based on each predictor
- Check distribution of imputed data sets
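An imputation study of this sort can be sketched as a small simulation: start from complete responses, delete a random subset, impute the deleted values from the predictor, and compare the imputed total with the true total over many replicates. The design below (random deletion, ratio imputation, relative bias of the total) is an assumption for illustration; the slides do not describe the study design.

```python
import random


def imputation_relative_bias(true_values, predictors, nonresponse_rate=0.3,
                             replicates=200, seed=1):
    """Estimate the relative bias of the total under ratio imputation.

    Illustrative assumptions: non-response is simulated completely at
    random, and missing values are imputed as ratio * predictor.
    """
    rng = random.Random(seed)
    n = len(true_values)
    n_missing = max(1, int(round(nonresponse_rate * n)))
    true_total = sum(true_values)

    differences = []
    for _ in range(replicates):
        missing = set(rng.sample(range(n), n_missing))
        respondents = [i for i in range(n) if i not in missing]
        ratio = (sum(true_values[i] for i in respondents)
                 / sum(predictors[i] for i in respondents))
        imputed_total = sum(ratio * predictors[i] if i in missing else true_values[i]
                            for i in range(n))
        differences.append(imputed_total - true_total)

    return (sum(differences) / len(differences)) / true_total


# Made-up data: a predictor exactly proportional to the true values
true_values = [20, 40, 60, 80, 100]
proportional_predictor = [10, 20, 30, 40, 50]
bias = imputation_relative_bias(true_values, proportional_predictor)
```

Running the same function with each candidate predictor gives comparable bias estimates. The imputed values from each replicate could also be kept to check their distribution against the true values.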

UK example of improved predicted values
- Used modelled and unmodelled VAT turnover and expenditure to predict 5 key SBS variables (Turnover, Purchases, Employment Costs, Net Capital Expenditure, Gross Value Added)
- Compared with existing predictors (previous values where available, register values)
- Restricted study to one-to-one matches between sources

UK example of improved predicted values
- Previous period values are generally the best predictors, but are only available for a third of the sample
- For Turnover, Purchases, Employment Costs and Gross Value Added, VAT predictors were better than register values, often significantly so
- Illustrates potential benefits of using admin data integrated with survey data
- Also highlights some of the problems that need to be addressed when integrating data from multiple sources

Summary
- WP5 focuses on integrating data from multiple sources in production of business statistics
- Two work streams looking at different aspects of this:
- Editing integrated data
- Use of admin data to improve editing and imputation of survey data
- Previous research suggests that improvements to survey outputs and reduction in costs and burden are possible
- Will ultimately produce guidelines for ESS