Session topic (i) – Editing Administrative and Census data
Discussants: Orietta Luzi and Heather Wagstaff
UNECE Work Session on Statistical Data Editing, Ljubljana, 9-11 May 2001
Introduction
Covered issues: methodological advances in two specific application contexts:
1) Editing admin/external data used in statistical production processes. Many National Statistical Institutes (NSIs) are now taking this direction in order to reduce response burden and costs. However, using admin data for statistical purposes implies a deep revision of the statistical production process. Statistical Agencies should be able to analyze the benefits and drawbacks of the data, and develop methodology to deal with the additional E&I problems.
2) Editing Census data: many NSIs have finalized their testing activities for the next Census round, and are currently implementing the E&I strategies for their economic and household Censuses. The complexity of Census operations requires the development of complex integrated E&I systems, and the use of fast and efficient algorithms to solve E&I problems. A number of NSIs implement register-based Censuses or incorporate administrative data, with remarkable reductions in costs and statistical burden and significant effects on E&I strategies.
Editing Census data
– Italy (G. Ruocco et al.) – Key invited paper
– UK (H. Wagstaff)
– Slovenia (R. Seljak et al.) – Register-based
– Switzerland (D. Kilchmann) – Register-based
Editing Administrative data – Business area
– France (E. Gros) – Key invited paper
– Canada (F. Brisebois et al.)
– UK (D. Lewis)
– Italy (O. Luzi et al.)
Summary
Incorporating admin data into the statistical production process raises significant challenges and opportunities for NSIs. This is especially true for E&I, where the primary objectives are to maintain the quality of the statistical outputs, reduce costs and minimize respondent burden.
Admin data can be used either directly or indirectly, i.e. either to replace statistical survey data (totally or partially), or to improve the efficiency of survey data processing (including E&I).
The benefits of using admin data directly are especially evident in register-based Censuses, where their massive use allows for remarkable reductions in costs and statistical burden, and for increased timeliness compared with "traditional" Censuses.
Admin data are collected for purposes which typically differ from statistical ones. In addition, the reasons for errors and missing data in admin data differ from those in statistical data (e.g. under-coverage and incompleteness of the source, inconsistencies between statistical and admin definitions, etc.).
As a consequence, in most cases the editing activities performed by external Agencies must be supplemented by additional E&I processing at the Statistical Agency, based on statistical methods and theory, in order to fulfill statistical quality requirements and allow for the final statistical uses.
The type and the amount of editing needed depend on the intended usage of the external data, on their quality, and on the type of editing activities performed by the data owners.
Points for discussion - Editing Administrative Data
Accessibility and stability of admin sources (over time and with respect to information content) are critical aspects to consider when evaluating the direct or indirect usability of external information.
Formal agreements and cooperation between data owners and NSIs would benefit both parties in terms of data quality. One way to persuade data owners to supply the Statistical Agency with external data is to offer to carry out E&I on the data, thereby improving their quality.
– Countries' experiences and adopted solutions (type of agreements, benefits and drawbacks deriving from the established agreements, etc.)
The type and the amount of editing needed at the Statistical Agency depend on the intended usage of the external data, on their quality, and on the type of editing activities performed by the data owners.
– Is the documentation on the data editing/validation activities performed by data providers usually available and accessible?
– In the Countries' experience, how has the trade-off between the benefits and the additional workload deriving from the direct use of admin data for statistical purposes been determined?
– To what extent can traditional E&I methods be considered appropriate for admin data? As an example, are selective editing and macro-editing appropriate for large databases? To what extent can probabilistic corrections/imputations be considered appropriate for fiscal data and, more generally, for data collected for non-statistical purposes?
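As an illustration of the selective editing question above, score-based prioritisation of records in a large admin file can be sketched as follows. This is a minimal sketch and not a method taken from any of the contributed papers: the score function (weight times absolute deviation from an anticipated value, as a share of the anticipated total), the field names and the threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    unit_id: str
    reported: float      # value found in the admin source (hypothetical field)
    expected: float      # anticipated value, e.g. previous period (hypothetical field)
    weight: float = 1.0  # 1.0 for a full-coverage admin file (assumption)


def selective_editing_scores(records):
    """Map unit_id to a score: the weighted absolute deviation from the
    anticipated value, as a share of the anticipated weighted total."""
    total = sum(r.weight * r.expected for r in records)
    if total == 0:
        raise ValueError("anticipated total is zero; scores are undefined")
    return {
        r.unit_id: r.weight * abs(r.reported - r.expected) / total
        for r in records
    }


def units_to_review(records, threshold):
    """Return the ids of units whose score exceeds the threshold,
    ordered from highest to lowest score."""
    scores = selective_editing_scores(records)
    flagged = [unit for unit, score in scores.items() if score > threshold]
    return sorted(flagged, key=lambda unit: scores[unit], reverse=True)


# Illustrative data only: unit A deviates strongly from its anticipated value.
records = [
    Record("A", reported=1000.0, expected=100.0),
    Record("B", reported=52.0, expected=50.0),
    Record("C", reported=240.0, expected=250.0),
]
print(units_to_review(records, threshold=0.05))
```

In this sketch, units below the threshold would be left to automatic treatment. Whether such scores are meaningful for fiscal data, where no anticipated value may be available, is part of the open question raised above.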
Assessing the statistical usability of admin data requires the development of specific quality criteria/dimensions, possibly at International level. In the context of the European Statistical System, the quality of admin data has been formally defined (*), and further developments are ongoing, also through European project activities.
– Have specific quality frameworks for admin data been developed in Countries, or have European/International quality frameworks been referred to?
Standard quality reports and documentation need to be established when admin data are used for statistical purposes (in conjunction with statistical data), in both Censuses and other statistical processes.
– To what extent do the existing indicators and metadata for quality reporting need to be revised when using admin data in statistical processes?
(*) Eurostat (2003a), Definition of Quality in Statistics. Eurostat Working Group on Assessment of Quality in Statistics, Luxembourg, 2-3 October.
Eurostat (2003b), Item 6 - Quality Assessment of Administrative Data for Statistical Purposes, Luxembourg.
Points for discussion - Editing Census Data
The complexity of Censuses, in terms of the amount of information collected and the structure of the target population, requires the development of complex integrated E&I systems, and the use of fast and efficient algorithms to solve E&I problems.
Generalized tools are usually integrated with specifically developed algorithms to optimize the treatment of specific variables, error types or sub-populations.
Thanks to the large amount of available resources, Censuses are key occasions for developing new methodologies and tools for data E&I, which can subsequently be adopted in other statistical surveys.
– Trade-off between the costs of developing specific algorithms for common problems and the costs of adapting solutions/tools already developed in other Countries
– Benefits from International cooperation: how can the International exchange of tools and approaches be improved? To what extent, and how, can Country-specific solutions be extended to other National contexts?
– Other National surveys usually benefit from Census investments in methodological and technological innovation. Countries' experiences
Web/electronic questionnaires are a powerful means of reducing respondent errors, especially with respect to some key variables within the E&I process.
Resources should be spent on lessening the statistical burden on respondents that results from the complexity and the amount of the required information, through efficient electronic questionnaire design and adequate on-line assistance to respondents.
– Given the level of web penetration within target populations (especially for household Censuses), mixed-mode data collection strategies need to be planned. Experiences in terms of the trade-off between the costs and benefits of adopting this solution in Censuses
– Experiences in terms of response rates, by type of population (households/businesses)