SHARE data versions & IDs
Stephanie Stuck, MEA
Antwerpen, February 2008
Mannheim Research Institute for the Economics of Aging, www.mea.uni-mannheim.de

2 Data versions and ID-variables

The internal version is used for data cleaning, the public version for publications.

Household ID
- who gets the id: all households
- internal version: sampid
- public version: sampid2 (scrambled version of sampid)

Person ids
- cvid: given to all household members (in CV); that means non-eligible persons get a cvid, too, e.g. children and other people living in the household. cvid should be used to merge modules within waves.
- respid: given to all eligible household members (in CV); that means all household members that should be interviewed, e.g. respondents and partners, even if partners are younger than 50 years. respid should be used to merge waves.

3 Data versions and ID-variables

Internal version (data cleaning), country specific versions
- Household ID: sampid
- Download site:
  - Raw version (updates during fieldwork and ‘shortly’ after fieldwork): CentERdata site http://cdata8.uvt.nl/share/version2.7/
  - Corrected versions during cleaning process: new internal SHARE site (not yet available)
- Available for: respective country team, CentERdata, MEA

Public version (publications), all countries
- Household ID: sampid2 (scrambled version of sampid)
- Download site: public website www.share-project.org & internal website (data for working groups)
- Available for: working groups, external users

4 sampid rules (old)
- Digits 1-2: country code (e.g. 23 for Belgium French speaking)
- Digits 3-5: wave indicator (042 for wave 1 and 062 for wave 2 main survey)
- Digits 6-11: household ID
- Digits 12-13: longitudinal household split indicator; 00 by default, changed based on respid if a respondent moves out, e.g. if the ‘moving out respondent’ has respid 01 it is changed to 01

Examples
- 1104200010000: Austria, starting in wave 1 (longitudinal sample)
- 2306214010300: Belgium (French), starting in wave 2 (refresher)

Consequences
- One needs to combine sampid with the respondent ID (respid) to identify and merge cases on the respondent level
- Merging problems, esp. for split households / ‘moving’ respondents across waves
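The digit layout above can be sketched as a small parser. This is only an illustration of the old sampid rules as listed on the slide; the names `OldSampid` and `parse_sampid` are hypothetical and not part of any SHARE tooling, and treating sampid as a 13-character string is an assumption made here so that leading zeros are not lost.

```python
from typing import NamedTuple


class OldSampid(NamedTuple):
    """Parts of an old-style sampid (hypothetical helper type)."""
    country_code: str     # digits 1-2, e.g. "23"
    wave_indicator: str   # digits 3-5, e.g. "042" or "062"
    household_id: str     # digits 6-11
    split_indicator: str  # digits 12-13, "00" by default


def parse_sampid(sampid: str) -> OldSampid:
    """Split a 13-digit sampid string into its four parts."""
    if len(sampid) != 13 or not sampid.isdigit():
        raise ValueError(f"expected 13 digits, got {sampid!r}")
    return OldSampid(
        country_code=sampid[0:2],
        wave_indicator=sampid[2:5],
        household_id=sampid[5:11],
        split_indicator=sampid[11:13],
    )


# The two examples from the slide.
for example in ("1104200010000", "2306214010300"):
    print(example, parse_sampid(example))
```

Keeping every part as a string rather than an integer avoids dropping leading zeros such as those in the wave indicator 042.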

5 Therefore...
- We will change the system and
  - have unique person ids that can be used to merge modules and waves
  - the person id will not change across waves, even if a household splits
  - have string country codes instead of numeric ones
- We will divide sampid into different parts:
  - household id (fixed part and split indicator if needed)
  - new wave indicator variable ‘wi’, which indicates when a household first entered the sample

6 New household identifier hhid1 (internal) & hhid (public)
- Digits 1-2: country code in letters, e.g. AT for Austria, Bf for Belgium French speaking (internal)
- Digits 3-8: fixed household ID. This part will not change across waves if a household splits off.
- Digit 9: one character added to the fixed household id to identify whether it is an ‘additional’ household that resulted from a split
  - A for all ‘original’ households (all in wave 1, refresher in wave 2)
  - B is used only if a household has split. A is then still used for the ‘first’ part of the household and B for the ‘splitting part’ (the one that is interviewed second, normally the one that moved out).
  - C is used for the very rare case of a further split-off household, for example when the original household in wave 1 consisted of 3 eligible sisters and split into 3 parts.

Examples for new household id
- AT100100A: Austria, ‘original’ household
- AT100100B: Austria, split off household
- Bf140103A: Belgium French speaking household (internal)
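A minimal sketch of how the new household id could be checked and taken apart, following the layout above. The names `HouseholdId`, `parse_hhid1` and `is_split_off` are hypothetical, and restricting the last character to A, B or C is an assumption based only on the letters mentioned on the slide.

```python
import re
from typing import NamedTuple

# Two letters, six digits, then a split letter. Allowing only A-C is an
# assumption taken from the slide, which mentions no other letters.
HHID1_PATTERN = re.compile(r"([A-Za-z]{2})(\d{6})([A-C])")


class HouseholdId(NamedTuple):
    """Parts of a new-style hhid1 (hypothetical helper type)."""
    country: str       # digits 1-2, e.g. "AT" or "Bf"
    fixed_id: str      # digits 3-8, stable across waves
    split_letter: str  # digit 9, "A" for the original household


def parse_hhid1(hhid1: str) -> HouseholdId:
    """Split an hhid1 string into country, fixed id and split letter."""
    match = HHID1_PATTERN.fullmatch(hhid1)
    if match is None:
        raise ValueError(f"not a valid hhid1: {hhid1!r}")
    return HouseholdId(*match.groups())


def is_split_off(hhid1: str) -> bool:
    """True if the household resulted from a split (letter B or C)."""
    return parse_hhid1(hhid1).split_letter != "A"


for example in ("AT100100A", "AT100100B", "Bf140103A"):
    print(example, parse_hhid1(example), is_split_off(example))
```

Because the first eight characters stay fixed, two households that came from the same original household can be recognised by comparing `country` and `fixed_id` only.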

7 New person identifier: person1
- Digits 1-2: country code (CC) in letters, e.g. AT for Austria, Bf for Belgium French speaking
- Digits 3-8: fixed household ID. This part will not change across waves.
- Digits 9-10: respondent id, e.g. if respid is 1 it will be 01

Respondent identifier
  old (sampid & respid)    new (person1)
  11100100001              AT10010001
  11100100002              AT10010002
  23140103001              Bf14010301
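The construction of person1 can be sketched as follows. This is not the project's actual conversion routine: `make_person1` and `COUNTRY_LETTERS` are hypothetical names, the mapping contains only the two country codes that appear on the slides (11 and 23), and it is an assumption that the fixed household ID equals digits 6-11 of the old 13-digit sampid.

```python
# Only the two numeric codes that appear on the slides; any further
# entries would have to come from the official code list.
COUNTRY_LETTERS = {"11": "AT", "23": "Bf"}


def make_person1(sampid: str, respid: int) -> str:
    """Build a person1 id from an old 13-digit sampid and a respid.

    Assumes the fixed household ID is digits 6-11 of the old sampid.
    """
    if len(sampid) != 13 or not sampid.isdigit():
        raise ValueError(f"expected 13 digits, got {sampid!r}")
    if not 0 < respid < 100:
        raise ValueError(f"respid must fit in two digits, got {respid}")
    country = COUNTRY_LETTERS[sampid[0:2]]  # KeyError for unknown codes
    household = sampid[5:11]
    return f"{country}{household}{respid:02d}"


print(make_person1("2306214010300", 1))
```

Neither the wave indicator nor the split indicator enters the result, which is what lets the person id stay the same when a household splits or is re-interviewed in a later wave.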

8 Old and new ids

                                 internal version              public version (scrambled)
                                 old              new          old              new
  Household ID & wave indicator  sampid           hhid1 & wi   sampid2          hhid & wi
  Person id                      sampid & respid  person1      sampid & respid  personid

9 In addition:
- A dataset will be generated that shows to which households a respondent belonged during her or his ‘SHARE history’. Its columns are person1, hhid1_w1, hhid1_w2 and hhid1_w3, e.g.:
  - AT10010001: AT100100A
  - AT10010002: AT100100A, AT100100B
  - Bf14010301: Bf140103A
- A compatibility file will be made for internal use to merge the old sampid respid files with the new ids
- We will have an additional person id (uuid) to ensure uniqueness, but it will be used in the background only, for technical reasons
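One way such a household history dataset could be assembled is sketched below, assuming the input is a list of (person1, wave, hhid1) records. The function name `build_history` is hypothetical, and the wave numbers in the sample records are made up for illustration, since the slide does not say in which wave each household id was observed.

```python
from collections import defaultdict


def build_history(records):
    """Turn (person1, wave, hhid1) records into one row per person.

    Each row maps column names of the form hhid1_w<wave> to the
    household the person belonged to in that wave.
    """
    history = defaultdict(dict)
    for person1, wave, hhid1 in records:
        history[person1][f"hhid1_w{wave}"] = hhid1
    return dict(history)


# Illustrative records only: the wave assignments are assumptions.
records = [
    ("AT10010001", 1, "AT100100A"),
    ("AT10010002", 1, "AT100100A"),
    ("AT10010002", 2, "AT100100B"),
]

for person1, row in build_history(records).items():
    print(person1, row)
```

A person who never changes household simply carries the same hhid1 in every wave column, while a mover shows a different split letter from the wave of the move onwards.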

10 Data cleaning
- always use the unscrambled version that includes sampid for data cleaning
- use sampid and respid to identify respondents
- generate/compute sampid_original, respid_original and cvid_original before you change ids
