Mannheim Research Institute for the Economics of Aging
SHARE IDs
Stephanie Stuck, MEA
Frankfurt, December 6th
2 Data versions and ID variables

Household ID
  internal version: sampid
  public version: sampid2 (scrambled version of sampid)
  who gets the ID: all households

Person IDs
  cvid: should be used to merge modules within waves
    who gets the ID: all household members (in CV); that means non-eligible persons, e.g. children or other people living in the household, get a cvid too
  respid: should be used to merge waves
    who gets the ID: all eligible household members (in CV); that means all household members who should be interviewed, e.g. respondents and partners, even if the partners are younger than 50 years

Download site
  internal version: internal website, version 2.7 / country-specific versions
  public version: public website (www.share-project.org) and internal website, all countries

Available for
  internal version: respective country team, CentERdata, MEA
  public version: working groups, external users
3 sampid rules (old)

Digits 1-2: country code, for example 11 for Austria, 23 for Belgium French speaking
Digits 3-5: wave indicator; indicates the wave in which the household participated for the first time (042 for wave 1 main survey and 062 for wave 2 main survey)
Digits 6-11: household ID
Digits 12-13: longitudinal household split indicator; 00 by default. If a household splits and a respondent moves out, the indicator is set based on respid, e.g. if the ‘moving out’ respondent has respid 01, it is changed to 01

Examples:
  …: Austria, starting in wave 1 (longitudinal sample)
  …: Austria, starting in wave 1, split-off household in wave …
  …: Belgium (French speaking), starting in wave 2 (refresher)

One needs to combine sampid with the respondent ID (respid) to identify and merge cases on the respondent level.
This causes merging problems, especially for split households / ‘moving’ respondents across waves.
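The digit layout of the old sampid can be illustrated with a small parser. This is a minimal sketch, not the project's own tooling: it assumes the sampid is held as a 13-character string of digits, and the names `OldSampid` and `parse_sampid` as well as the example value are hypothetical, constructed only to match the layout described on this slide.

```python
from typing import NamedTuple


class OldSampid(NamedTuple):
    country_code: str     # digits 1-2, e.g. "11" for Austria
    wave_indicator: str   # digits 3-5, e.g. "042" for wave 1 main survey
    household_id: str     # digits 6-11
    split_indicator: str  # digits 12-13, "00" by default


def parse_sampid(sampid: str) -> OldSampid:
    """Split a 13-digit old-style sampid into its four documented parts."""
    if len(sampid) != 13 or not sampid.isdigit():
        raise ValueError(f"expected 13 digits, got {sampid!r}")
    return OldSampid(sampid[0:2], sampid[2:5], sampid[5:11], sampid[11:13])


# Hypothetical sampid (not taken from the data): Austria, first seen in wave 1,
# household 100100, no split.
parts = parse_sampid("1104210010000")
print(parts.country_code, parts.wave_indicator, parts.household_id, parts.split_indicator)
```

Because the split indicator is part of the sampid, the same person can appear under two different sampid values across waves, which is the merging problem noted on this slide.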
4 Therefore...

We will change the system and
- have unique person IDs that can be used to merge modules and waves
- keep the person ID unchanged across waves, even if a household splits
- have string country codes instead of numeric ones

We will divide sampid into different parts:
- household ID (fixed part, plus split indicator if needed)
- new wave indicator variable ‘wi’, which indicates when a household first entered the sample
5 Old and new country codes (first two digits of household IDs and person IDs)

Country        (old) numeric code   internal country code   public country code
Austria        11                   AT                      AT
Germany        12                   DE                      DE
Sweden         13                   SE                      SE
Netherlands    14                   NL                      NL
Spain          15                   ES                      ES
Italy          16                   IT                      IT
France         17                   FR                      FR
Denmark        18                   DK                      DK
Greece         19                   GR                      GR
6 Old and new country codes (first two digits of household IDs and person IDs)

Country               (old) numeric code   internal country code   public country code
Switzerland German    20                   Cg                      CH
Switzerland French    21                   Cf                      CH
Switzerland Italian   22                   Ci                      CH
Belgium French        23                   Bf                      BE
Belgium Flemish       24                   Bn                      BE
Israel Hebrew         25                   Ih                      IL
Israel Arabic         26                   Ia                      IL
Israel Russian        27                   Ir                      IL
Czechia               28                   CZ                      CZ
Poland                29                   PL                      PL
Ireland               30                   IE                      IE
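The recoding from old numeric codes to the new letter codes amounts to a lookup table. The sketch below transcribes the two country-code slides into a dictionary; it assumes that countries listed with a single letter code use that code in both the internal and the public version. The names `COUNTRY_CODES` and `recode_country` are illustrative only.

```python
from typing import Dict, Tuple

# old numeric code -> (internal country code, public country code)
COUNTRY_CODES: Dict[int, Tuple[str, str]] = {
    11: ("AT", "AT"), 12: ("DE", "DE"), 13: ("SE", "SE"),
    14: ("NL", "NL"), 15: ("ES", "ES"), 16: ("IT", "IT"),
    17: ("FR", "FR"), 18: ("DK", "DK"), 19: ("GR", "GR"),
    20: ("Cg", "CH"), 21: ("Cf", "CH"), 22: ("Ci", "CH"),
    23: ("Bf", "BE"), 24: ("Bn", "BE"),
    25: ("Ih", "IL"), 26: ("Ia", "IL"), 27: ("Ir", "IL"),
    28: ("CZ", "CZ"), 29: ("PL", "PL"), 30: ("IE", "IE"),
}


def recode_country(numeric_code: int, public: bool = False) -> str:
    """Return the internal (default) or public letter code for an old numeric code."""
    internal_code, public_code = COUNTRY_CODES[numeric_code]
    return public_code if public else internal_code


print(recode_country(23))               # internal code for Belgium French
print(recode_country(23, public=True))  # public code for Belgium French
```

Note that the public codes merge the language regions, so the mapping from internal to public codes is many-to-one and cannot be reversed.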
7 New household identifier: hhidcom (internal) & hhid (public)

Digits 1-2: country code in letters, e.g. AT for Austria, Bf for Belgium French speaking (internal)
Digits 3-8: fixed household ID; this part will not change across waves, even if the household splits
Digit 9: one character added to the fixed household ID to identify whether it is an ‘additional’ household that resulted from a split:
  A is used for all ‘original’ households (all in wave 1, refreshers in wave 2)
  B is used only if a household has split. A is then still used for the ‘first’ part of the household and B for the ‘splitting’ part (the one that is interviewed second, normally the one that moved out)
  C is used for the very rare case of a further split-off household, for example when the original household in wave 1 consisted of 3 eligible sisters and split into 3 parts

Examples of the new household ID:
  AT100100A: Austria, ‘original’ household
  AT100100B: Austria, split-off household
  Bf140103A: Belgium French speaking household (internal)
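The structure of the new household ID lends itself to a simple pattern check. The following is a minimal sketch under stated assumptions: the split character is taken to be any capital letter (the slide only names A, B and C), and the names `HouseholdId`, `parse_hhid` and `same_original_household` are hypothetical helpers, not part of any SHARE software.

```python
import re
from typing import NamedTuple

# Two letters (country), six digits (fixed household ID), one letter (split part).
HHID_PATTERN = re.compile(r"([A-Za-z]{2})(\d{6})([A-Z])")


class HouseholdId(NamedTuple):
    country: str   # digits 1-2
    fixed_id: str  # digits 3-8, stable across waves
    split: str     # digit 9: A = original household, B/C = split-off parts


def parse_hhid(hhid: str) -> HouseholdId:
    """Split a new-style household ID into country, fixed ID and split part."""
    match = HHID_PATTERN.fullmatch(hhid)
    if match is None:
        raise ValueError(f"not a valid household ID: {hhid!r}")
    return HouseholdId(*match.groups())


def same_original_household(a: str, b: str) -> bool:
    """True if both IDs share country and fixed part, i.e. differ at most by split."""
    return parse_hhid(a)[:2] == parse_hhid(b)[:2]


print(parse_hhid("AT100100B"))
print(same_original_household("AT100100A", "AT100100B"))
```

Comparing only the first eight characters is what makes it possible to trace a split-off household back to the household it came from.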
8 New person identifier: pidcom

Digits 1-2: country code (CC) in letters, e.g. AT for Austria, Bf for Belgium French speaking
Digits 3-8: fixed household ID; this part will not change across waves
Digits 9-10: respondent ID, e.g. if respid is 1 it will be 01

Respondent identifier
  old: sampid & respid
  new: pidcom
  Examples: AT…, AT…, Bf…
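Composing a pidcom from its three parts is a matter of zero-padding the respondent ID and concatenating. The sketch below follows the layout on this slide; the function names `build_pidcom` and `split_pidcom` are hypothetical, and the example value combines the household ID AT100100 from the household slide with an assumed respid of 1.

```python
from typing import Tuple


def build_pidcom(country: str, fixed_household_id: str, respid: int) -> str:
    """Concatenate country code, fixed household ID and zero-padded respid."""
    if len(country) != 2 or not country.isalpha():
        raise ValueError(f"country code must be two letters, got {country!r}")
    if len(fixed_household_id) != 6 or not fixed_household_id.isdigit():
        raise ValueError(f"fixed household ID must be six digits, got {fixed_household_id!r}")
    if not 0 <= respid <= 99:
        raise ValueError(f"respid must fit in two digits, got {respid!r}")
    return f"{country}{fixed_household_id}{respid:02d}"


def split_pidcom(pidcom: str) -> Tuple[str, str, str]:
    """Return (country code, fixed household ID, two-digit respondent ID)."""
    if len(pidcom) != 10:
        raise ValueError(f"expected 10 characters, got {pidcom!r}")
    return pidcom[0:2], pidcom[2:8], pidcom[8:10]


# Illustrative value only: household AT100100, respid 1.
pid = build_pidcom("AT", "100100", 1)
print(pid)                # AT10010001
print(split_pidcom(pid))  # ('AT', '100100', '01')
```

Since the split character of the household ID is not part of pidcom, the identifier stays the same when a respondent moves to a split-off household.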
9 Old and new IDs

                                internal version                public version (scrambled)
                                old               new            old               new
Household ID & wave indicator   sampid            hhidcom & wi   sampid2           hhid & wi
Person ID                       sampid & respid   pidcom         sampid & respid   mergeId
10 In addition:

A dataset will be generated that shows to which households a respondent belonged during her or his ‘SHARE history’, e.g.:

pidcom   hhidcom_w1   hhidcom_w2   hhidcom_w3
AT…      AT100100A
AT…      AT100100A    AT100100B
Bf…      Bf140103A

A compatibility file will be made for internal use to merge the old sampid/respid files with the new IDs.

We will have an additional person ID (uuid) to ensure uniqueness, but it will be used in the background only, for technical reasons.
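The household-history dataset described on this slide can be thought of as a pivot of per-wave records keyed by pidcom. The following is a rough sketch in plain Python, assuming each wave is available as a list of records with `pidcom` and `hhidcom` fields; the function name `build_history` and the two pidcom values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def build_history(waves: Dict[str, List[Dict[str, str]]]) -> Dict[str, Dict[str, str]]:
    """Map each pidcom to the household it belonged to in every wave it appears in."""
    history: Dict[str, Dict[str, str]] = {}
    for wave_name, rows in waves.items():
        for row in rows:
            person = history.setdefault(row["pidcom"], {})
            person[f"hhidcom_{wave_name}"] = row["hhidcom"]
    return history


# Illustrative records: two respondents share a household in wave 1,
# and the second one moves to the split-off household in wave 2.
waves = {
    "w1": [
        {"pidcom": "AT10010001", "hhidcom": "AT100100A"},
        {"pidcom": "AT10010002", "hhidcom": "AT100100A"},
    ],
    "w2": [
        {"pidcom": "AT10010001", "hhidcom": "AT100100A"},
        {"pidcom": "AT10010002", "hhidcom": "AT100100B"},
    ],
}

history = build_history(waves)
for pidcom, households in sorted(history.items()):
    print(pidcom, households)
```

The key point is that pidcom alone is enough to link a respondent across waves, while the household ID is free to change from wave to wave.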
11 Right now

- we still have to use the old system for data cleaning
- but we will soon have the pidcom to merge across waves
- mergeid will already be included in release 0
- as soon as the new system is available and checked, we will inform you how to go on, probably at the next SHARE data cleaning meeting (February 6, Antwerp)