Collecting the household data as a sub-sample. Rome May 2014 Jonas Kylov Gielfeldt.


1 Collecting the household data as a sub-sample. Rome May 2014 Jonas Kylov Gielfeldt

2 The broader frame: why are we collecting household data? Are households the “natural” unit for collecting LFS variables? Not very often! (jobless households and…) Is the household unit better for data collection? Sometimes yes! With CAPI mode the household is a very sensible unit, but not with CAWI and CATI mode.

3 The economy of it all… All NSIs work in an environment where resources are scarce (and getting scarcer). Is it justifiable to use a lot of resources on collecting household data if a) there is no gain in terms of collection mode and b) there is no strict substantive reason for collecting the variables on households instead of on individuals?

4 The Danish case In Denmark we collect the core LFS through CATI mode; this mode is better suited to individuals as the unit. We are obliged to collect the household data, which is done through a combination of CAWI and CATI. Since collecting by household does not fit our collection mode, and we do not see a substantive reason for collecting LFS variables on households, we use a sub-sample to minimize costs.

5 The core-sample and the sub-sample
-The core-LFS gross sample: 40.000 persons per quarter
-The number of respondents per quarter: 22.000 persons
-The gross sub-sample: 11.000 persons (not including Core-LFS respondents)
-The number of respondents: 6.000 persons
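The slide gives gross and net sample sizes but no response rates. As a rough illustration, the implied rates can be derived from the four figures above; the variable and function names below are made up for this sketch, and the slide's numbers are probably rounded.

```python
# Sample sizes per quarter, taken from the slide (likely rounded figures).
core_gross = 40_000       # core-LFS gross sample, persons
core_respondents = 22_000
sub_gross = 11_000        # household sub-sample, excluding core-LFS respondents
sub_respondents = 6_000


def response_rate(respondents: int, gross: int) -> float:
    """Share of the gross sample that ends up responding."""
    return respondents / gross


core_rate = response_rate(core_respondents, core_gross)
sub_rate = response_rate(sub_respondents, sub_gross)

print(f"core-LFS response rate:   {core_rate:.1%}")
print(f"sub-sample response rate: {sub_rate:.1%}")
```

On these figures both samples respond at a little over half of the gross sample, so the sub-sample is smaller, not markedly worse in response.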

6 Why use a sub-sample If the NSI primarily uses CATI, collecting the whole household through this mode will increase costs significantly. Collecting households as the core sample quadruples the costs! The alternative is to reduce the sample size, risking increased bias and cluster effects (household members are often alike).
Costs saved by sub-sampling (1 Euro = 7,45 DKR)
-Number of respondents, 6000 quadrupled: 24.000
-Current average price for ca. 6000 respondents on HH: 29.600 Euro / 220.520 DKR
-Price quadrupled: 118.400 Euro / 882.080 DKR
-Difference (saved costs): 88.800 Euro / 661.560 DKR
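The arithmetic behind the cost table can be retraced in a few lines. This is only a sketch of the calculation on the slide: the factor of four and the exchange rate of 7,45 are the slide's own figures, while the variable names and the per-respondent price are additions for illustration.

```python
EUR_TO_DKR = 7.45             # exchange rate stated on the slide
SUB_RESPONDENTS = 6_000       # approximate number of household sub-sample respondents
current_cost_eur = 29_600     # current average price for the sub-sample
COST_FACTOR = 4               # "quadruples the costs"

quadrupled_cost_eur = current_cost_eur * COST_FACTOR
saved_eur = quadrupled_cost_eur - current_cost_eur
saved_dkr = round(saved_eur * EUR_TO_DKR)

# Derived figure, not on the slide: average price per respondent.
cost_per_respondent_eur = current_cost_eur / SUB_RESPONDENTS

print(f"respondents if quadrupled: {SUB_RESPONDENTS * COST_FACTOR}")
print(f"price quadrupled: {quadrupled_cost_eur} Euro")
print(f"saved: {saved_eur} Euro = {saved_dkr} DKR")
print(f"price per respondent: {cost_per_respondent_eur:.2f} Euro")
```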

7 Different sample sizes mean different weighting models
The weighting model of the Core-LFS (variables and groupings)
-age11: 11 grp (information is crossed)
-sex: 2 grp
-region: 5 grp (information is crossed)
-age6: 6 grp
-education: 3 grp
-socio-economic status: 8 grp
-number of children in the household: 4 grp
-citizenship: 4 grp
-registered as unemployed: 12 grp
-gross income: 4 grp
-moved: 2 grp

8 On the core-LFS weighting model There is quite substantial non-response in the Danish LFS, but there are many high-quality registers. These are used as auxiliary information in a rather complex weighting model. The weighting model is optimized for the number of individuals in the population and is aimed especially at controlling bias in e.g. labour market status, education etc.

9 The weighting model of the household-sample (variables and groupings)
-age: 3 grp (information is crossed)
-sex: 2 grp
-family type: 6 grp
-size of household: 4 grp
-a person from the household has moved: 2 grp
-only Danes in household or mixed household: 2 grp
-average age of the household: 3 grp
-gross household income: 4 grp
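To get a feel for how quickly such a model grows relative to about 6.000 respondents, one can count the groupings listed above. The sketch below compares two extremes: using every variable only as a separate margin, and the purely hypothetical case where all eight variables were fully crossed. The slide only states that age and sex are crossed, so the fully crossed case is an assumed upper bound, not the actual Danish model.

```python
from math import prod

# Groupings per auxiliary variable, as listed on the slide.
household_model = {
    "age": 3,
    "sex": 2,
    "family type": 6,
    "size of household": 4,
    "a person from the household has moved": 2,
    "only Danes or mixed household": 2,
    "average age of the household": 3,
    "gross household income": 4,
}
respondents = 6_000  # approximate size of the household sub-sample

# Lower extreme: each variable enters as its own margin.
margin_groups = sum(household_model.values())

# Upper extreme (hypothetical): every variable crossed with every other.
fully_crossed_cells = prod(household_model.values())

print(f"groups as separate margins: {margin_groups}")
print(f"cells if fully crossed:     {fully_crossed_cells}")
print(f"respondents per cell:       {respondents / fully_crossed_cells:.2f}")
```

In the fully crossed case there would be fewer respondents than cells, which is one way to see why a small sample can only carry a limited amount of crossed auxiliary information.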

10 On the household weighting model This weighting model is optimized not only for the number of individuals in the population but also for the total number of households. This means that new variables must be added as auxiliary information (family type, size of household etc.). At the same time, the smaller sample size limits the amount of auxiliary information that can be used.
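The slides do not say which estimation method the weighting models use. As a generic illustration of how auxiliary margins (some about persons, some about households) enter a weighting model, here is a minimal raking (iterative proportional fitting) sketch. Raking is a stand-in technique chosen for brevity, and the sample records, the variable names and the population totals are all invented. A real integrated household model would typically also need all members of a household to share one weight, which this sketch does not attempt.

```python
def rake(records, margins, n_iter=100):
    """Adjust record weights 'w' until weighted totals match each margin.

    records: list of dicts holding category values and a weight under "w".
    margins: {variable: {category: population_total}} (hypothetical totals).
    """
    for _ in range(n_iter):
        for var, targets in margins.items():
            # Current weighted total per category of this variable.
            totals = {}
            for rec in records:
                totals[rec[var]] = totals.get(rec[var], 0.0) + rec["w"]
            # Scale each weight so the category totals hit their targets.
            for rec in records:
                rec["w"] *= targets[rec[var]] / totals[rec[var]]
    return records


def weighted_total(records, var, category):
    return sum(rec["w"] for rec in records if rec[var] == category)


# Invented respondents: one person-level and one household-level variable.
sample = [
    {"sex": "F", "hh_size": "1", "w": 1.0},
    {"sex": "F", "hh_size": "2+", "w": 1.0},
    {"sex": "M", "hh_size": "1", "w": 1.0},
    {"sex": "M", "hh_size": "2+", "w": 1.0},
    {"sex": "M", "hh_size": "2+", "w": 1.0},
]
# Invented register totals; both margins sum to the same population size.
margins = {
    "sex": {"F": 50.0, "M": 50.0},
    "hh_size": {"1": 30.0, "2+": 70.0},
}

rake(sample, margins)
for var, targets in margins.items():
    for category in targets:
        print(var, category, round(weighted_total(sample, var, category), 3))
```

Each added margin is one more set of totals the weights have to reproduce, and every category needs enough respondents to be scaled sensibly, which is where a small sub-sample starts to bind.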

11 Differences in estimates: the example of education Education is added as auxiliary information in the core-LFS but not in the household LFS. This means differences in estimates.
Highest level of education completed (25-64 years), %
                                  2011 Core-LFS  2011 HH-LFS  2012 Core-LFS  2012 HH-LFS  2013 Core-LFS  2013 HH-LFS
-At most lower secondary level    23,1           19,4         22,1           18,3         21,7           19,1
-Upper secondary level            43,2           41,6         43,1           41           42,8           40,6
-Third level                      33,7           39           34,8           40,6         35,4           40,2
Other indicators, %
                                  2011 Core-LFS  2011 HH-LFS  2012 Core-LFS  2012 HH-LFS  2013 Core-LFS  2013 HH-LFS
-min. ISCED3c long / upper secondary level (20-24 years)
                                  70,0           74,9         72,0           74,9         71,8           76,1
-Early leavers from education and training (18-24 years)
                                  9,7            7,9          9,1            8,0          8,1            6,9
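The size of the gap between the two estimates is easier to see as a difference in percentage points. The sketch below does this for the "Third level" row only, using the figures from the table (written with decimal points); the dictionary and variable names are made up for the example.

```python
# "Third level" (25-64 years), % of population, from the slide.
third_level_core = {2011: 33.7, 2012: 34.8, 2013: 35.4}
third_level_hh = {2011: 39.0, 2012: 40.6, 2013: 40.2}

# HH-LFS estimate minus Core-LFS estimate, in percentage points.
gap_pp = {
    year: round(third_level_hh[year] - third_level_core[year], 1)
    for year in third_level_core
}

for year, gap in gap_pp.items():
    print(f"{year}: HH-LFS is {gap} percentage points above Core-LFS")
```

In all three years the household LFS puts the share with third-level education roughly five to six percentage points above the core LFS.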

12 The auxiliary information on education The difference between the Core-LFS and the household LFS shows that the auxiliary information helps to deal with the overrepresentation of the highly educated. But it is not possible to use this information in the household model, since it would make the model too complex. The household model does not handle this bias at all.

13 Conclusion There are significant economic gains from collecting the household part of the LFS as a sub-sample. But when constructing the weighting model, the smaller size of a sub-sample can limit the level of complexity.

