Published by Alyson Stewart. Modified over 8 years ago.
1
Mannheim Research Institute for the Economics of Aging
www.mea.uni-mannheim.de
SHARE Data Cleaning: General rules and procedures
Stephanie Stuck, MEA
Antwerp, February 6th/7th, 2008
2
General philosophy
Respondents are experts on their own lives; in general we (still) take their answers very seriously.
Only change data if you are sure it is wrong. If answers seem implausible but you are not sure what to do, indicate this via a flag variable.
3
General rules
Please use the data files with sampid for data cleaning (don’t use the data version with sampid2).
Always write programs to correct data (STATA do files or SPSS sps files).
Please never change data directly (e.g. no changes in data editors).
4
Program files (do or sps) should always start with:
Name of author & date of program
Data version (date) and modules
Short description of program
Sequence of programs
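A header along these lines would cover the four items listed above. This is only a sketch of one possible layout in a STATA do file: the field names and everything in angle brackets are placeholders, not a format prescribed by MEA, and reading "Sequence" as the position of the file in the run order is an assumption.

```stata
* ------------------------------------------------------------------
* Author:        <name of author>
* Date:          <date this program was written>
* Data version:  <date of the data version the program was written for>
* Modules:       <modules the program corrects>
* Description:   <short description of what is corrected and why>
* Sequence:      <position of this file in the run order, e.g. 1 of 3>
* ------------------------------------------------------------------
```

In SPSS syntax files the same header can be written as comment lines that start with `*` and end with a period.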
5
In programs, always:
Keep original variables ("varname_original")
  STATA: generate dn003_original = dn003_
  SPSS: compute dn003_original = dn003_.
Do not change the variables called "varname_original"; change the variables called "varname" instead
  STATA: replace dn003_ = 1919 if sampid == "1206211111100" & respid == 1
  SPSS: if (sampid = "1206211111100" & respid = 1) dn003_ = 1919.
6
In programs, always:
Add flag variables to indicate changes ("varname_flag")
  STATA:
    generate dn003_flag = 0
    replace dn003_flag = 1 if dn003_original ~= dn003_
  SPSS:
    compute dn003_flag = 0.
    if (dn003_original ~= dn003_) dn003_flag = 1.
Please label flag variables
"0" should always be used for "no changes/ok"
Other values can be used as needed, e.g. "1: year of birth changed", "2: implausible"
7
Always
Save corrected data files with a new name
  save "filename_corrected_1"
  save "filename_corrected_2"
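Put together, the rules on keeping originals, flagging changes and saving under a new name could look like the STATA sketch below. The two-row toy data set and its birth years are made up so that the example runs on its own, and the label name `dn003_flag_lbl` and the variable label text are illustrative choices; in a real cleaning program you would load the module file instead of typing in data.

```stata
* Toy data (made up) so the sketch runs on its own;
* in practice: use "filename", clear
clear
input str13 sampid byte respid int dn003_
"1206211111100" 1 1991
"1206211111100" 2 1923
end

* 1. Keep the original values; never edit this copy
generate dn003_original = dn003_

* 2. Correct the working variable only
replace dn003_ = 1919 if sampid == "1206211111100" & respid == 1

* 3. Flag every change (0 = no changes/ok)
generate dn003_flag = 0
replace dn003_flag = 1 if dn003_original ~= dn003_

* 4. Label the flag variable and its values
label variable dn003_flag "Flag for changes to dn003_"
label define dn003_flag_lbl 0 "no changes/ok" 1 "year of birth changed" 2 "implausible"
label values dn003_flag dn003_flag_lbl

* 5. Save under a new name; without the replace option STATA
*    refuses to overwrite a file that already exists
save "filename_corrected_1"
```

Running `tabulate dn003_flag` afterwards gives a quick count of how many records were changed.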
8
General procedures
Country teams send program files to MEA
MEA runs the files and creates new data versions
MEA uploads the files to the new internal SHARE web site
New data versions will be named with a number at the end: share_w2_`module'_1
Country teams download the files and can go on checking and cleaning data
9
Wave 1 data
Please don’t take wave 1 information for granted; it can be wrong, too
Sometimes we will have to change wave 1 data, too
CentERdata and MEA are currently preparing a version of wave 1 data that includes:
  Respid for all eligibles (right now respid is only included for respondents)
  Flags for changes made during cleaning of wave 1 data
We will have another release of wave 1 data together with the public release of wave 2
10
What I learned
You need more ‘step by step’ guidelines and clear instructions:
Where to start: priority list
What exactly to do: programs, examples
When to do it: schedule
11
Very next steps
Send the programs you have written to MEA
Send drop-offs and vignette forms (paper versions) to MEA; also check them for country-specific deviations
Imputations group and MEA send around a priority list and more instructions
MEA and CentERdata prepare updated wave 1 and wave 2 files incl. sampid, respid for all eligibles & a new merging variable