Mannheim Research Institute for the Economics of Aging www.mea.uni-mannheim.de SHARE Data Cleaning General rules and procedures Stephanie Stuck MEA Antwerp.


1 SHARE Data Cleaning: General rules and procedures
Stephanie Stuck, MEA
Antwerp, February 6th/7th, 2008
Mannheim Research Institute for the Economics of Aging, www.mea.uni-mannheim.de

2 General philosophy
• Respondents are experts on their own lives; in general we (still) take their answers very seriously.
• Only change data if you are sure they are wrong.
• If answers seem implausible but you are not sure what to do, indicate this via a flag variable.

3 General rules
• Please use the data files with sampid for data cleaning (do not use the data version with sampid2).
• Always write programs to correct data (STATA do files or SPSS sps files). Please never change data directly (e.g. no changes in data editors).

4 Program files (do or sps) should always start with:
• Name of author and date of program
• Data version (date) and modules
• Short description of program
• Sequence of programs
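The header items listed on this slide could be laid out at the top of a do file roughly as follows. This is only a sketch: every value in angle brackets is a placeholder to be filled in by the author of the program, and the `version 9` and `set more off` lines are assumptions about a typical setup, not something the slides prescribe.

```stata
* ------------------------------------------------------------
* Author:        <name of author>                    (placeholder)
* Date:          <date of program>                   (placeholder)
* Data version:  <date of data file>                 (placeholder)
* Modules:       <modules used, e.g. dn>             (placeholder)
* Description:   <short description of the program>  (placeholder)
* Sequence:      <position of this program among the
*                 programs to be run>                (placeholder)
* ------------------------------------------------------------

* Assumed housekeeping, not required by the slides
version 9
set more off
```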

5 In programs always
• Keep original variables ("varname_original")
  STATA: generate dn003_original = dn003_
  SPSS:  compute dn003_original = dn003_.
• Do not change the variables called "varname_original"
• But change the variables called "varname"
  STATA: replace dn003_ = 1919 if sampid == "1206211111100" & respid == 1
  SPSS:  if (sampid = "1206211111100" & respid = 1) dn003_ = 1919.
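A minimal, self-contained Stata sketch of the "keep the original, change the working variable" rule. The three data rows are invented for illustration only (the year 1991 stands in for a suspected typing error). Only the variable names and the correction to 1919 come from the slide.

```stata
* Toy data, invented for illustration; real work starts from the SHARE file
clear
input str13 sampid byte respid int dn003_
"1206211111100" 1 1991
"1206211111100" 2 1923
"1206211111101" 1 1940
end

* Keep an untouched copy of the variable before correcting it
generate dn003_original = dn003_

* Correct only the working variable, and only for the identified case
replace dn003_ = 1919 if sampid == "1206211111100" & respid == 1

* Original and corrected values side by side
list sampid respid dn003_original dn003_, noobs
```

Because the correction is written as a command with an explicit `if` condition, it can be rerun on every new data version and reviewed by someone else, which is not possible for a change typed into the data editor.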

6 In programs always
• Add flag variables to indicate changes ("varname_flag")
  STATA: generate dn003_flag = 0
         replace dn003_flag = 1 if dn003_original ~= dn003_
  SPSS:  compute dn003_flag = 0.
         if (dn003_original ~= dn003_) dn003_flag = 1.
• Please label flag variables
• "0" should always be used for "no changes/ok"
• Other values can be used as needed, e.g. "1: year of birth changed", "2: implausible"
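The request to label flag variables can be sketched in Stata as below. The toy data, the label name `dn003_flag_lbl`, and the rule used for flag value 2 (a year of birth after 2008) are assumptions made for this example. The flag values and their meanings follow the slide.

```stata
* Toy data, invented for illustration
clear
input str13 sampid byte respid int dn003_
"1206211111100" 1 1991
"1206211111100" 2 1923
"1206211111101" 1 2050
end

generate dn003_original = dn003_
replace dn003_ = 1919 if sampid == "1206211111100" & respid == 1

* 0 = no changes/ok, 1 = value was changed
generate byte dn003_flag = 0
replace dn003_flag = 1 if dn003_original != dn003_

* 2 = implausible but left unchanged (assumed rule: born after 2008);
* "< ." keeps missing values out, since Stata sorts missing above all numbers
replace dn003_flag = 2 if dn003_ > 2008 & dn003_ < .

* Attach value labels and a variable label to the flag
label define dn003_flag_lbl 0 "no changes/ok" 1 "year of birth changed" 2 "implausible"
label values dn003_flag dn003_flag_lbl
label variable dn003_flag "Cleaning flag for dn003_"

tabulate dn003_flag
```

Flag value 2 marks a case without touching `dn003_`, in line with the general philosophy of changing data only when it is certain they are wrong.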

7 Always
• Save corrected data files with a new name, e.g.
  save "filename_corrected_1"
  save "filename_corrected_2"

8 General procedures
• Country teams send program files to MEA
• MEA runs the files and creates new data versions
• MEA uploads the files to the new internal SHARE website
• New data versions will be named with a number at the end: share_w2_`module'_1
• Country teams download the files and can go on checking and cleaning the data
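The file name pattern above uses Stata's local macro syntax: `module' is replaced by the module name when the command runs. A small sketch of how the names expand, assuming three module names (dn appears in the earlier slides; ph and ep are listed here only as examples) and a hypothetical local `release` holding the version number:

```stata
* Version number at the end of the file name (hypothetical local)
local release 1

* Module list is an assumption for illustration
foreach module in dn ph ep {
    local name "share_w2_`module'_`release'"
    display "`name'"
}
```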

9 Wave 1 data
• Please don't take wave 1 information for granted; it can be wrong, too. Sometimes we will have to change wave 1 data, too.
• CentERdata and MEA are currently preparing a version of the wave 1 data that includes
  - respid for all eligibles (right now respid is only included for respondents)
  - flags for changes made while cleaning the wave 1 data
• We will have another release of wave 1 data together with the public release of wave 2

10 What I learned
• You need more 'step by step' guidelines and clear instructions:
  - Where to start: priority list
  - What exactly to do: programs, examples
  - When to do it: schedule

11 Very next steps
• Send the programs you have written to MEA
• Send drop-offs and vignette forms (paper versions) to MEA; also check them for country-specific deviations
• The imputations group and MEA will send around a priority list and more instructions
• MEA and CentERdata prepare updated wave 1 and wave 2 files incl. sampid, respid for all eligibles, and a new merging variable

