Improving imputation methodology in the Hungarian Central Statistical Office (HCSO)
NTTS 2009 seminar, Bruxelles, February 2009
Zoltán Csereháti, HCSO Methodological Department
1. Introduction
2. The IDPS (Creating Integrated Data Processing System) project
3. Documentation scheme for imputation
4. Training course on imputation
5. Future work: Handbook on imputation
Introduction (1)
The work of the HCSO Methodological Department
Processing phases in our scope (the list is gradually widening):
- Sampling
- Estimation
- Imputation
- Seasonal adjustment
- Data confidentiality
Introduction (2)
Tools offered:
- quality guidelines for the elements of the value chain
- methodological documentation schemes
- good practices
- quality indicators
- methodological support
- training course materials
- quality assessment tools
- handbooks for several phases
General issues related to non-response and imputation (1)
- Item / unit non-response
- Non-response bias (selecting larger samples is not a solution: a larger sample reduces variance, not bias)
- Alternatives: reweighting, imputation
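The point about sample size can be illustrated with a small calculation. The sketch below is not taken from the HCSO material: the two classes, their shares, means and response rates are invented numbers, and the function names are hypothetical. It only shows that the expected respondent mean contains no sample size term, and that weighting respondents by the inverse of their class response rate removes the bias in this simplified setting.

```python
# Illustrative two-class population; every number here is an assumption.
CLASSES = {
    "small": {"share": 0.5, "mean": 10.0, "response_rate": 0.8},
    "large": {"share": 0.5, "mean": 100.0, "response_rate": 0.4},
}


def true_mean(classes):
    """Population mean of the target variable."""
    return sum(c["share"] * c["mean"] for c in classes.values())


def expected_respondent_mean(classes):
    """Expected mean among respondents; no sample size appears anywhere."""
    weight = {k: c["share"] * c["response_rate"] for k, c in classes.items()}
    total = sum(weight.values())
    return sum(weight[k] * classes[k]["mean"] for k in classes) / total


def expected_reweighted_mean(classes):
    """Expected mean when each respondent is weighted by 1 / response rate."""
    weight = {
        k: c["share"] * c["response_rate"] / c["response_rate"]
        for k, c in classes.items()
    }
    total = sum(weight.values())
    return sum(weight[k] * classes[k]["mean"] for k in classes) / total


bias_unweighted = expected_respondent_mean(CLASSES) - true_mean(CLASSES)
bias_reweighted = expected_reweighted_mean(CLASSES) - true_mean(CLASSES)
print(round(bias_unweighted, 6), round(bias_reweighted, 6))
```

With these invented figures the unweighted respondent mean is about 40 against a true mean of 55, whatever the sample size, while the reweighted mean equals 55. In practice response rates are unknown and have to be estimated within suitably chosen classes.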
General issues related to non-response and imputation (2)
What is special about imputation:
- There is a huge variety of imputation methods.
- Many of them are quite simple and easy to implement.
- Unlike other methodological areas, imputation is a processing phase that is often conducted by subject matter statisticians without the supervision of methodologists.
- Presumably, many of these methods could be improved.
The "IDPS" (Creating Integrated Data Processing System) project
Objectives (1):
- To develop a user-friendly integrated data processing system, based on standard logic, covering the widest possible range of surveys.
- To make it accessible via a standard user interface, providing a clear and efficient tool for statisticians.
- To include the data quality requirements and data processing procedures documented in the meta-database.
- To integrate it with other general-purpose systems such as data entry and dissemination.
Objectives (2):
- To develop applications or frame systems allowing coordination and quality management in the control of processing.
- To provide direct access to data for the purpose of verification and analysis.
- To restructure the division of labour, with the IT staff focusing on innovation and development, and to produce quality data faster through direct data processing.
We anticipate having a (partially) working system by the end of 2010.
The organization of the IDPS project
- An IT company chosen by a public procurement procedure
- On behalf of the HCSO:
  - IT Department
  - Methodological Department
  - Selected subject matter statisticians from all the relevant fields
- Project leadership:
  - Selected members of the HCSO IT Department
  - IT company project leaders
IDPS (2)
Benefits:
- Common, integrated platform for all the surveys
- Less redundancy
- More transparent system
- Processes documented in a standard way
- Better overview of the process plans
- System functionalities in the hands of the user
- New data process flows can be built more easily
IDPS (3)
Main steps already done:
- Documenting the elements of the data process flow
- Designing a general scheme for a universal data processing flow
- Identifying process stages such as editing, imputation, outlier filtering, consistency checking, etc.
- Identifying the basic methods currently in use in the different stages
- Identifying the process steps from which the individual implementations of the methods are built
IDPS (4)
Standard processes
We do not want to set strict methodological standards. The so-called standards of the IDPS system will be optimally designed software components implementing different algorithms and procedures, which serve as building blocks for compiling the IT version of different methods.
What does an ideal standard process look like?
- Small and specific enough to serve as a building block
- Flexible and general enough
- Having a number of parameters for fine-tuning
As a consequence, we will face difficult trade-off situations.
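The building-block idea can be sketched in a few lines. This is only an illustration, not the IDPS design: the names `mean_step` and `compose`, the record layout and the `imputed` flag are all assumptions made for the example. One small parameterised step (mean imputation, optionally within a class, with a minimum number of donors) is reused twice to build a composite method with a fallback.

```python
from statistics import mean


def mean_step(target, by=None, min_donors=1):
    """Hypothetical building block: impute `target` with the mean of the
    non-imputed respondents, optionally within the class given by `by`."""
    def step(records):
        donors = {}
        for r in records:
            # Only genuinely reported values may act as donors.
            if r[target] is not None and not r.get("imputed", False):
                donors.setdefault(r[by] if by else None, []).append(r[target])
        result = []
        for r in records:
            r = dict(r)  # work on a copy, leave the input untouched
            pool = donors.get(r[by] if by else None, [])
            if r[target] is None and len(pool) >= min_donors:
                r[target] = mean(pool)
                r["imputed"] = True
            result.append(r)
        return result
    return step


def compose(*steps):
    """Chain building blocks into one method, applied in the given order."""
    def method(records):
        for step in steps:
            records = step(records)
        return records
    return method


# Invented example data.
records = [
    {"region": "A", "turnover": 10.0},
    {"region": "A", "turnover": 20.0},
    {"region": "A", "turnover": None},
    {"region": "B", "turnover": 40.0},
    {"region": "B", "turnover": None},
]

# Class mean where at least two donors exist, overall mean as a fallback.
method = compose(
    mean_step("turnover", by="region", min_donors=2),
    mean_step("turnover"),
)
completed = method(records)
```

The trade-off mentioned above is visible even here: each extra parameter (`by`, `min_donors`) makes the block more widely usable but also harder to document and test.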
Documentation schemes
Affected methodological areas:
- Sampling
- Imputation
- Estimation and standard error calculation
- Seasonal adjustment and confidentiality
Aims:
- to build a uniform structure for assessment
- to gain a better overview of the methods used by various surveys
- to improve process quality
A documentation scheme for imputation
General information:
- Treatment of item/unit non-response
- Imputation method applied
- Is there any guideline?
- Is the procedure documented?
- Place in the processing chain
- Software solution used
- Auxiliary data sources used
- Simple or composite method
- Indicate the applied method(s)
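One possible way to hold such a scheme in machine-readable form is sketched below. The class name, the field names and the `survey` identifier are hypothetical; they merely mirror the items listed above and are not the actual HCSO scheme.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ImputationRecord:
    """Hypothetical record mirroring the documentation items listed above."""
    survey: str                                  # assumed identifier
    nonresponse_treated: str                     # "item", "unit" or "both"
    guideline_exists: Optional[bool] = None
    procedure_documented: Optional[bool] = None
    place_in_chain: Optional[str] = None
    software: Optional[str] = None
    auxiliary_sources: List[str] = field(default_factory=list)
    composite: Optional[bool] = None             # False means a simple method
    methods: List[str] = field(default_factory=list)

    def unanswered(self):
        """Names of the items still left empty (None or an empty list)."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) in (None, [])]


# Invented example entry.
entry = ImputationRecord(
    survey="hypothetical survey",
    nonresponse_treated="item",
    guideline_exists=False,
    composite=False,
    methods=["ratio imputation"],
)
print(entry.unanswered())
```

A uniform structure of this kind makes it straightforward to compare surveys and to see which questions a survey has not yet answered.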
Internal training course on imputation (1)
- Concept of imputation
- Why impute at all?
- Drawbacks and benefits of different methods
- How to reduce non-response bias?
- Basic weighting techniques
- Benefits of complete datasets
- How to organize a method building process?
- Use of auxiliary information
Internal training course on imputation (2)
- Editing and imputation
- Basic imputation methods / examples
- Documentation: flow charts, algorithmic descriptions
- Flagging the imputed values
- The place of imputation in the whole data processing flow
- Imputation and outlier filtering
- How to plan and assess an imputation method?
- Simulation studies
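A simulation study for assessing an imputation method can be as simple as the following sketch: start from complete data, delete values at random, impute, and compare the imputed values with the deleted true ones. The data, the missingness mechanism (completely at random), the two candidate methods and all function names are assumptions chosen for illustration, not the course material itself.

```python
import random
from statistics import mean


def mean_impute(x, y):
    """Replace missing y values with the mean of the observed y values."""
    m = mean(v for v in y if v is not None)
    return [m if v is None else v for v in y]


def ratio_impute(x, y):
    """Replace missing y values with r * x, where r = sum(y) / sum(x)
    computed over the units with observed y."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    r = sum(yi for _, yi in pairs) / sum(xi for xi, _ in pairs)
    return [r * xi if yi is None else yi for xi, yi in zip(x, y)]


def simulate(x, y_true, method, missing_rate=0.3, replications=200, seed=1):
    """Average absolute error of the imputed values over many replications."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replications):
        mask = [rng.random() < missing_rate for _ in y_true]
        if all(mask) or not any(mask):
            continue  # nothing to impute, or nothing to impute from
        y_obs = [None if m else v for m, v in zip(mask, y_true)]
        y_imp = method(x, y_obs)
        errors.append(mean(abs(a - b)
                           for a, b, m in zip(y_imp, y_true, mask) if m))
    return mean(errors)


# Invented complete data: y is roughly proportional to the auxiliary x.
data_rng = random.Random(0)
x = [float(i) for i in range(1, 41)]
y_true = [2.0 * xi + data_rng.uniform(-1.0, 1.0) for xi in x]

mean_err = simulate(x, y_true, mean_impute)
ratio_err = simulate(x, y_true, ratio_impute)
print(round(mean_err, 2), round(ratio_err, 2))
```

Because the same seed is passed to both runs, the two methods are compared on identical missingness patterns. On data like these, where the auxiliary variable is strongly related to the target, ratio imputation should show a much smaller error than mean imputation; with real survey data the ranking has to be checked case by case.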
Internal training course on imputation (3)
Teamwork session:
- Select a practical problem and try to solve it together in teams
- Share experiences and ideas
Conclusion, future work (1)
Compiling a handbook on imputation (for internal use in the HCSO):
- Recommended methods with application areas
- Detailed guidelines: how to build an imputation method
- Highlighting current best practices
- Practical advice, focusing on issues specific to Hungary
Using the experience gained from:
- The work on the IDPS system
- The feedback from the training course
- The information collected by the documentation scheme
Conclusion, future work (2)
International background material, including:
- ONS paper: Report on the Task Force on Imputation
- Statistics Canada Quality Guidelines
- The results of the EUREDIT project
- EDIMBUS project
Adapting these to the special needs of the HCSO
(Similar work has already been completed in the area of seasonal adjustment.)
References
- The results of the EUREDIT project:
- The results of the EDIMBUS project:
- The ONS paper: Report on the Task Force on Imputation (June 1996), GSS Methodology Series
- Statistics Canada Quality Guidelines (Fourth Edition, 2003)
- Quality Guidelines of the HCSO (Legal Act 2007)
- Hungarian Central Statistical Office: Strategy , pages
- Csereháti, Z. (2006) Multiple Donor Imputation Techniques. Paper for the European Conference on Quality in Survey Statistics, Cardiff, April 2006