Rudi Seljak, Aleš Krajnc

Presentation transcript:

EDITING OF MULTIPLE SOURCE DATA IN THE CASE OF SLOVENIAN AGRICULTURAL CENSUS 2010 Rudi Seljak, Aleš Krajnc Statistical Office of the Republic of Slovenia

Overview of the presentation: general information about the Agricultural Census (AC 2010); database organization; statistical data processing; main problems and challenges; conclusions.

General information about the AC 2010: Collection of exhaustive information on all agricultural holdings (AH) that fulfil certain criteria stated in the EU regulation. In accordance with the EU regulation, it is conducted every 10 years. In 2010 it was conducted in most EU Member States (in a few, in 2009). The aim of the obligatory regulation is to obtain, for the first time, comparable data on agricultural indicators based on the same methodology.

Slovenian AC 2010: Carried out by the Statistical Office of the Republic of Slovenia (SURS) in June-July 2010. Part of the data was collected with the field survey (CAPI) and a (large) part was obtained from different administrative sources. In the field, 94,686 AH were visited, of which 74,646 satisfied the ECA criteria. The fieldwork and the data entry program were carried out by an outsourced company, but all the instructions and rules were provided by SURS staff. About 600 interviewers finished the fieldwork in approximately 75 days.

Micro-data database: Field data were separated into different tables according to the sets of related questions. Each of the administrative sources was put in a separate table. Each table was "accompanied" by the statuses of its variables. A status "flagged" the collection mode and also each change made in the process. Each table has one associated table into which all the changed records are inserted. Views of the different versions of the data were created. Altogether, 199 tables and views and 9,583 variables had to be processed.

Database – schematic presentation: Tables: TabX (data) and TabX_S (statuses), each with an associated table for changed records (TabX_edi and TabX_S_edi). Views: for both the data and the statuses, one view with all versions of the record and one view with the last version of the record.
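The table-and-view layout above can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the `version` column, the view names `TabX_all` and `TabX_last`, and the variables `ah_id` and `area` are hypothetical and do not come from the presentation, and the status tables (TabX_S, TabX_S_edi) are omitted for brevity.

```python
import sqlite3


def build_demo_db():
    """Create a data table, its edit table, and the two views (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        -- Original records as collected (version 0 by assumption).
        CREATE TABLE TabX (ah_id INTEGER, version INTEGER, area REAL);
        -- Changed records are inserted here; the original row is never overwritten.
        CREATE TABLE TabX_edi (ah_id INTEGER, version INTEGER, area REAL);

        -- View with all versions of each record.
        CREATE VIEW TabX_all AS
            SELECT ah_id, version, area FROM TabX
            UNION ALL
            SELECT ah_id, version, area FROM TabX_edi;

        -- View with only the last version of each record.
        CREATE VIEW TabX_last AS
            SELECT a.ah_id, a.version, a.area
            FROM TabX_all AS a
            WHERE a.version = (
                SELECT MAX(b.version) FROM TabX_all AS b WHERE b.ah_id = a.ah_id
            );
        """
    )
    # Two collected records; the second one has an implausible negative area.
    con.executemany("INSERT INTO TabX VALUES (?, 0, ?)", [(1, 5.0), (2, -3.0)])
    # The correction is stored as a new version in the edit table.
    con.execute("INSERT INTO TabX_edi VALUES (2, 1, 3.0)")
    return con
```

Querying `TabX_last` returns one row per holding with the corrected value, while `TabX_all` keeps the full history, which is what makes the editing process traceable.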

Statistical data processing: A combination of a general application and custom-made computer programs was used for data processing. Custom-made programs: insertion of new units (units that were not AH according to the field data, although the admin data indicated the opposite); replacement of the whole set of data where the field data were of bad quality; calculation of the derived variables. General application: logical controls; individual and systematic corrections; imputation.
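A minimal sketch of how logical controls and an imputation step with status flagging might look. Everything named here is an assumption made for illustration: the rule names, the variable `area`, and the status codes are hypothetical, and mean imputation is used only as a stand-in, since the presentation does not say which imputation methods were applied.

```python
from statistics import mean

# Hypothetical status code; the real status codes are not given in the source.
STATUS_IMPUTED = "imputed"

# Logical controls stored as data (rule name + check), not hard-coded in the process.
RULES = [
    {"name": "area_reported", "check": lambda r: r["area"] is not None},
    {"name": "area_non_negative", "check": lambda r: r["area"] is None or r["area"] >= 0},
]


def failed_controls(record, rules=RULES):
    """Return the names of all logical controls that the record fails."""
    return [rule["name"] for rule in rules if not rule["check"](record)]


def impute_mean(records, variable):
    """Fill missing values with the mean of the reported values and flag the change."""
    observed = [r[variable] for r in records if r[variable] is not None]
    if not observed:
        return records  # no reported values to impute from
    fill = mean(observed)
    for r in records:
        if r[variable] is None:
            r[variable] = fill
            r[variable + "_status"] = STATUS_IMPUTED  # the status records the change
    return records
```

Keeping the rules in a list rather than in program logic mirrors the idea that subject-matter staff can add or change controls without new programming work.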

General application: The metadata-driven application for data editing, which is used in several other surveys (also in the population census). Due to the requirements of the AC 2010 data processing, some additional functionalities were added: a general metadata-driven process for the linkage of an arbitrary number of tables; a general metadata-driven process for the calculation of the "aggregated derived variables" (data at the level of persons who work at the AH are aggregated to the level of the AH); several new imputation methods.
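The calculation of "aggregated derived variables" can be illustrated with a small sketch in which the aggregation rules are supplied as metadata. The rule format (`source`, `func`, `target`) and the variable names are hypothetical and serve only to show the idea; they are not the application's actual metadata model.

```python
from collections import defaultdict

# Aggregation functions that a metadata rule may refer to by name (assumed set).
AGG_FUNCS = {"sum": sum, "count": len, "max": max}


def aggregate_to_holding(persons, rules):
    """Aggregate person-level records to the level of the agricultural holding."""
    grouped = defaultdict(list)
    for person in persons:
        grouped[person["ah_id"]].append(person)

    result = {}
    for ah_id, members in grouped.items():
        row = {}
        for rule in rules:
            values = [m[rule["source"]] for m in members]
            row[rule["target"]] = AGG_FUNCS[rule["func"]](values)
        result[ah_id] = row
    return result


# Hypothetical metadata: each rule defines one derived variable at the AH level.
EXAMPLE_RULES = [
    {"source": "hours", "func": "sum", "target": "total_hours"},
    {"source": "hours", "func": "count", "target": "n_persons"},
]
```

Adding a new derived variable then means adding one more rule to the metadata rather than changing the program.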

General application – Pros and Cons: Pros: greater independence from IT staff; IT (programming) work decreased significantly; traceability and repeatability are ensured; the process is documented through the metadata database. Cons: more skilled subject-matter personnel are needed; a lot of metadata is produced → sometimes difficult to manage and control.

AC 2010 – main challenges: AC 2010 is by its nature already a very demanding survey: a large number of units and variables; data at different levels (the holding + the persons working at the holding). The combination of different data sources makes the job even more complicated. The creation of rules (process metadata) was spread among several subject-matter specialists, each covering one of the areas → overall coordination was quite a demanding task. In the first phase, many syntax errors were produced.

AC 2010 – main challenges cont'd: The large number of variables required a large number of process steps (e.g. 16 steps in the imputation part of the process) → sometimes difficult to follow the process and ensure consistency in the corrected data. Integration of the data from two different sources was a special challenge: priorities had to be set where the sources "overlapped"; large differences between the data from different sources had to be resolved → very time consuming.
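Priority setting for "overlapping" sources can be sketched as follows. The priority order (admin before field) and the 20% tolerance for flagging large differences are assumptions chosen for illustration; the presentation does not state the actual rules.

```python
def integrate_sources(field, admin, priority=("admin", "field"), tolerance=0.2):
    """Pick one value from two sources and flag large disagreements.

    Returns (value, chosen_source, needs_review). The priority order and
    the tolerance are hypothetical defaults, not the rules used by SURS.
    """
    sources = {"field": field, "admin": admin}

    # Take the first available value in priority order.
    value, chosen = None, None
    for name in priority:
        if sources[name] is not None:
            value, chosen = sources[name], name
            break

    # When both sources report a value, compare them on a relative scale.
    needs_review = False
    if field is not None and admin is not None:
        larger = max(abs(field), abs(admin))
        if larger > 0 and abs(field - admin) / larger > tolerance:
            needs_review = True

    return value, chosen, needs_review
```

Records returned with `needs_review` set would still need to be resolved case by case, which is the time-consuming part noted above.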

Conclusions – points for discussion: What is the influence of outsourcing the data collection on the quality of the incoming data? Active cooperation of the SURS staff in testing the questionnaire and training the interviewers is important. Usage of combined data sources: a large advantage in decreasing the reporting burden; no large effect on cost reduction; increased workload at the data editing stage; the usage of different sources increased the quality of the final micro-data. The challenge is to find the balance between these factors.

Conclusions – points for discussion cont'd: Complexity of data processing: a balance is needed between the usage (and, if needed, the upgrade) of general IT solutions and the creation of custom-made programs. Micro-data are provided to Eurostat and placed at the disposal of researchers: can we still afford selective data editing?

Thank you for your attention