
Task Force on Victimization Eurostat, October 2011 Guillaume Osier


1 Sampling one or all persons per household in the EU Safety Survey: pros and cons
Task Force on Victimization, Eurostat, October 2011
Guillaume Osier, Institut National de la Statistique et des Etudes Economiques (STATEC), Social Statistics Division

2 Aim of the presentation
Compare two possible approaches for sampling individuals within a household:
- Select one individual at random among all the household members aged 16 or more (as in the ICVS)
- Interview all the household members aged 16 or more (as in the US National Crime Victimization Survey, NCVS)
The choice has implications for data quality, mainly sampling variance, non-response rate and measurement errors, and for the overall cost of the survey.

3 Sampling variance (1/4)
Four assumptions:
- A simple random sample of m households is drawn from a population of M households
- A simple random sample of nbar individuals aged 16 or more is selected from every sampled household
- All the households in the population have the same number Nbar of individuals aged 16+
- The total sample size n of individuals (n = m × nbar) is fixed

4 Sampling variance (2/4)
Parameter of interest: the share of individuals aged 16 or more who have been victims of a certain type of crime (victimization rate), $P = \frac{1}{N}\sum_{i=1}^{N} Y_i$, where:
* Yi = 1 if individual i has been a victim of the crime, 0 otherwise
* N = total number of individuals aged 16+ in the population
Under the four previous assumptions, we obtain:

5 Sampling variance (3/4)
* n = total sample size of individuals (assumed fixed, equal to the minimum SASU sample size in the Regulation)
* rho = intra-household correlation coefficient (“cluster effect”)

6 Sampling variance (4/4)
Formula annotation: victimization rate in household h
Basically, rho measures the degree of homogeneity within households: the higher rho, the more homogeneous the households are with regard to Y. In the limit, Yi = Yj for all individuals i and j in household h. When rho is equal to 1, either all the members of a household have been victims of the crime or none have.
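The variance expression referred to on slides 4 and 5 can be sketched as follows. This is an inference from the tabulated margins of error (Nbar = 5, 95% confidence level with a normal quantile of 1.96), not the slide's own formula. The symbols \hat{P} (sample estimate of P) and \rho_c (conventional intra-class correlation) are notation introduced here.

```latex
\operatorname{Var}(\hat{P}) \approx \frac{P(1-P)}{n}\,\bigl[\,1 + (\bar{n}-1)\,\rho_c\,\bigr],
\qquad
\rho_c = \frac{\bar{N}\rho - 1}{\bar{N}-1}
```

For example, with n = 8000, P = 10%, Nbar = 5, rho = 0.5 and nbar = 2, this gives rho_c = 0.375 and a bracket of 1.375, hence a relative margin of error 1.96 × sqrt(Var) / P of about 7.7%, which matches the corresponding cell of the n=8000 table.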

7 Main results (1/4)
When rho is high enough, sampling two or more individuals per household reduces sampling accuracy. When the members of a household tend to be similar to each other with regard to the variable of interest, interviewing all of them yields largely redundant information, while accuracy is lost by sampling fewer households (assuming the total sample size n of individuals is fixed).

8 Main results (2/4)
When rho = 1:
- If we interview more persons per household, the level of accuracy goes down (“we divide the sample size by nbar”)
- For household-level crimes (household burglary, vehicle theft), rho = 1 by construction
- For individual-level crimes (robbery, theft, violence…), rho < 1, although it should remain high
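As a sketch of the rho = 1 case, assume a design effect of the form 1 + (nbar - 1) × rho_c, where rho_c equals 1 when rho = 1. This form is an assumption consistent with the tabulated margins, not the slide's own formula, and \hat{P} and \rho_c are notation introduced here. The variance then depends only on the number of households m = n / nbar:

```latex
\operatorname{Var}(\hat{P}) \approx \bar{n}\,\frac{P(1-P)}{n} = \frac{P(1-P)}{m},
\qquad m = \frac{n}{\bar{n}}
```

Under this reading, interviewing nbar persons per household is as accurate as a simple random sample of only n / nbar individuals, which is what “we divide the sample size by nbar” expresses.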

9 n=8000, P=10%, Nbar=5
Relative margin of error (%) by number of individuals interviewed per household (nbar). In the tables on slides 9 to 13, the columns correspond to intra-household correlation coefficient (rho) = 0.1, 0.2, 0.5, 0.7, 0.8, 0.9, 1.0; some cells are missing from the transcript, so the values listed in a row do not map one-to-one onto these columns.
nbar = 1: 6.6
nbar = 2: 6.1 7.7 8.4 8.7 9.0 9.3
nbar = 3: 5.7 9.9 10.4 10.9 11.4
nbar = 4: 5.2 9.6 11.1 11.9 12.5 13.1
nbar = 5: 4.6 12.3 13.9 14.7
Absolute margin of error (% points): 0.6 1.1 1.2 1.3 1.4 1.5

10 n=7000, P=10%, Nbar=5
Relative margin of error (%) by number of individuals interviewed per household (nbar):
nbar = 1: 7.0
nbar = 2: 6.6 8.2 9.0 9.3 9.6 9.9
nbar = 3: 6.1 10.5 11.1 11.7 12.2
nbar = 4: 5.6 10.2 11.9 12.7 13.4 14.1
nbar = 5: 5.0 13.1 14.9 15.7
Absolute margin of error (% points): 0.6 1.1 1.2 1.3 1.4 1.5 1.6

11 n=6000, P=10%, Nbar=5
Relative margin of error (%) by number of individuals interviewed per household (nbar):
nbar = 1: 7.6
nbar = 2: 7.1 8.9 9.7 10.0 10.4 10.7
nbar = 3: 6.6 11.4 12.0 12.6 13.1
nbar = 4: 6.0 11.1 12.9 13.7 14.5 15.2
nbar = 5: 5.4 14.2 16.1 17.0
Absolute margin of error (% points): 1.1 1.2 1.3 0.6 1.4 1.5 1.6 1.7

12 n=5000, P=10%, Nbar=5
Relative margin of error (%) by number of individuals interviewed per household (nbar):
nbar = 1: 8.3
nbar = 2: 7.8 9.8 10.6 11.0 11.4 11.8
nbar = 3: 7.2 12.5 13.1 13.8 14.4
nbar = 4: 6.6 12.1 14.1 15.0 15.8 16.6
nbar = 5: 5.9 15.6 17.6 18.6
Absolute margin of error (% points): 1.1 1.2 1.3 1.4 1.5 1.6 1.7 0.6 1.8 1.9

13 n=3000, P=10%, Nbar=5
Relative margin of error (%) by number of individuals interviewed per household (nbar):
nbar = 1: 10.7
nbar = 2: 10.0 12.6 13.7 14.2 14.7 15.2
nbar = 3: 9.3 16.1 17.0 17.8 18.6
nbar = 4: 8.5 15.6 18.2 19.4 20.4 21.5
nbar = 5: 7.6 20.1 22.8 24.0
Absolute margin of error (% points): 1.1 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.3 2.4
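The tabulated relative margins of error can be recomputed with a short script. This is a minimal sketch under stated assumptions: the design effect below is inferred from the table values rather than taken from the slides, the 95% confidence level (quantile 1.96) is assumed, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumed normal quantile for a 95% confidence interval


def design_effect(rho: float, nbar: int, Nbar: int) -> float:
    """Design effect inferred from the slide tables (an assumption)."""
    # Map the slides' rho onto the conventional intra-class correlation.
    rho_conv = (Nbar * rho - 1) / (Nbar - 1)
    return 1 + (nbar - 1) * rho_conv


def relative_margin(n: int, P: float, rho: float, nbar: int, Nbar: int = 5) -> float:
    """Relative margin of error in percent: CI half-length divided by P."""
    variance = P * (1 - P) / n * design_effect(rho, nbar, Nbar)
    return 100 * Z_95 * math.sqrt(variance) / P
```

Calling `relative_margin(8000, 0.10, 0.1, 2)` gives about 6.1 and `relative_margin(8000, 0.10, 1.0, 5)` gives about 14.7, in line with the first and last cells of the nbar = 2 and nbar = 5 rows of the n=8000 table.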

14 Main results (3/4)
The sampling variance is also influenced by the victimization rate P:
- With regard to the absolute margin of error (half-length of the confidence interval), the closer P is to 0.5, the larger the increase or decrease in the absolute margin of error with nbar (number of persons interviewed per household)
- With regard to the relative margin of error (= absolute margin / P), the closer P is to 0, the larger the increase or decrease in the relative margin of error with nbar

15 n=8000, rho=0.6, Nbar=5
Relative margin of error (%) by victimization rate P (%) and number of individuals interviewed per household (nbar):
nbar   P=1.0   P=2.0   P=5.0  P=10.0  P=15.0  P=20.0  P=50.0
1       21.8    15.3     9.6     6.6     5.2     4.4     2.2
2       26.7    18.8    11.7     8.1     6.4     5.4     2.7
3       30.8    21.7    13.5     9.3     7.4     6.2     3.1
4       34.5    24.3    15.1    10.4     8.2     6.9     3.5
5       37.8    26.6    16.5    11.4     9.0     7.6     3.8
Absolute margin of error (% points): 0.2 0.3 0.5 0.7 0.8 0.9 1.1 0.4 0.6 1.3 1.2 1.5 1.4 1.7 1.9

16 n=7000, rho=0.6, Nbar=5
Relative margin of error (%) by victimization rate P (%) and number of individuals interviewed per household (nbar):
nbar   P=1.0   P=2.0   P=5.0  P=10.0  P=15.0  P=20.0  P=50.0
1       23.3    16.4    10.2     7.0     5.6     4.7     2.3
2       28.5    20.1    12.5     8.6     6.8     5.7     2.9
3       33.0    23.2    14.4     9.9     7.9     6.6     3.3
4       36.9    25.9    16.1    11.1     8.8     7.4     3.7
5       40.4    28.4    17.7    12.2     9.7     8.1     4.1
Absolute margin of error (% points): 0.2 0.3 0.5 0.7 0.8 0.9 1.2 0.4 0.6 1.1 1.4 1.3 1.7 1.5 1.9 1.6

17 n=6000, rho=0.6, Nbar=5
Relative margin of error (%) by victimization rate P (%) and number of individuals interviewed per household (nbar):
nbar   P=1.0   P=2.0   P=5.0  P=10.0  P=15.0  P=20.0  P=50.0
1       25.2    17.7    11.0     7.6     6.0     5.1     2.5
2       30.8    21.7    13.5     9.3     7.4     6.2     3.1
3       35.6    25.0    15.6    10.7     8.5     7.2     3.6
4       39.8    28.0    17.4    12.0     9.5     8.0     4.0
5       43.6    30.7    19.1    13.1    10.4     8.8     4.4
Absolute margin of error (% points): 0.3 0.4 0.6 0.8 0.9 1.3 0.7 1.1 1.2 1.5 0.5 1.4 1.8 1.6 2.2

18 n=5000, rho=0.6, Nbar=5
Relative margin of error (%) by victimization rate P (%) and number of individuals interviewed per household (nbar):
nbar   P=1.0   P=2.0   P=5.0  P=10.0  P=15.0  P=20.0  P=50.0
1       27.6    19.4    12.1     8.3     6.6     5.5     2.8
2       33.8    23.8    14.8    10.2     8.1     6.8     3.4
3       39.0    27.4    17.1    11.8     9.3     7.8     3.9
4       43.6    30.7    19.1    13.1    10.4     8.8     4.4
5       47.8    33.6    20.9    14.4    11.4     9.6     4.8
Absolute margin of error (% points): 0.3 0.4 0.6 0.8 1.1 1.4 0.5 0.7 1.2 1.7 0.9 1.6 1.3 1.8 2.2 1.9 2.4

19 n=3000, rho=0.6, Nbar=5
Relative margin of error (%) by victimization rate P (%) and number of individuals interviewed per household (nbar):
nbar   P=1.0   P=2.0   P=5.0  P=10.0  P=15.0  P=20.0  P=50.0
1       35.6    25.0    15.6    10.7     8.5     7.2     3.6
2       43.6    30.7    19.1    13.1    10.4     8.8     4.4
3       50.4    35.4    22.1    15.2    12.0    10.1     5.1
4       56.3    39.6    24.7    17.0    13.5    11.3     5.7
5       61.7    43.4    27.0    18.6    14.8    12.4     6.2
Absolute margin of error (% points): 0.4 0.5 0.8 1.1 1.3 1.4 1.8 0.6 1.6 2.2 0.7 1.5 2.5 1.2 1.7 2.3 2.8 0.9 1.9 3.1
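The contrast between absolute and relative margins as P varies can be illustrated numerically. The sketch below assumes the same inferred design effect and a 1.96 quantile (both assumptions, not stated on the slides), and the function names are hypothetical. It measures how much each margin grows when nbar goes from 1 to 5.

```python
import math
from typing import Tuple


def margins(n: int, P: float, rho: float, nbar: int, Nbar: int = 5) -> Tuple[float, float]:
    """Return (absolute margin in % points, relative margin in %)."""
    # Design effect form inferred from the tables; an assumption.
    deff = 1 + (nbar - 1) * (Nbar * rho - 1) / (Nbar - 1)
    half_width = 1.96 * math.sqrt(P * (1 - P) / n * deff)
    return 100 * half_width, 100 * half_width / P


def change_with_nbar(n: int, P: float, rho: float = 0.6) -> Tuple[float, float]:
    """Increase in (absolute, relative) margin when nbar goes from 1 to 5."""
    abs_1, rel_1 = margins(n, P, rho, nbar=1)
    abs_5, rel_5 = margins(n, P, rho, nbar=5)
    return abs_5 - abs_1, rel_5 - rel_1
```

With n = 8000, `change_with_nbar(8000, 0.50)` yields the larger absolute increase, while `change_with_nbar(8000, 0.01)` yields the larger relative increase, which is the pattern described for P close to 0.5 and P close to 0.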

20 Main results (4/4)
The sampling variance is also influenced by the sample size of individuals n: the smaller n, the larger the increase or decrease in the margin of error with nbar. For instance, with rho=0.6 and P=10%, moving from nbar=1 to nbar=5 raises the relative margin of error from 6.6% to 11.4% when n=8000, and from 10.7% to 18.6% when n=3000.
Finally, it must be kept in mind that we assumed simple random sampling of households. If not, the design effect must be taken into account, and will make any change in accuracy stronger.

21 Other aspects (1/4)
Although interviewing all household members is generally bad for sampling accuracy, this option offers advantages.
Survey costs are reduced: in order to achieve a target sample size of individuals, surveying all the members aged 16+ in a household implies contacting far fewer households than interviewing one person per household.
The total cost is made up of a fixed cost, the cost of contacting a household (c1) and the cost of an interview (c2).

22 Other aspects (2/4) If we interview one person per household:
If we interview all the members of a household:
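The two cost expressions on this slide can be sketched as follows, assuming a linear cost model: fixed cost, plus c1 per household contacted, plus c2 per interview. The name c0 for the fixed cost and the function names are hypothetical; c1 and c2 follow the slides' usage. Non-response is ignored and every household is assumed to have Nbar eligible members.

```python
import math


def cost_one_per_household(n: int, c0: float, c1: float, c2: float) -> float:
    """One interview per household: n households must be contacted."""
    return c0 + c1 * n + c2 * n


def cost_all_members(n: int, c0: float, c1: float, c2: float, Nbar: int) -> float:
    """All Nbar members interviewed: only about n / Nbar households are contacted."""
    households = math.ceil(n / Nbar)
    return c0 + c1 * households + c2 * n
```

With purely illustrative figures (c0 = 10000, c1 = 20, c2 = 10, n = 8000, Nbar = 5), the first design costs 250000 and the second 122000. The gap is 4/5 of the contact cost c1 × n, so it grows with c1.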

23 Other aspects (3/4)
The gain in cost should be particularly important with face-to-face surveys (c1 high), while it should be limited with telephone or web-based surveys (c1 low).
Non-response should be reduced: household respondents may help interviewers by providing contact information for the other household members as well as times when they are likely to be available. Further, if their experience was positive, household respondents help to locate and motivate other household members to respond, a burden which would otherwise fall on interviewers.

24 Other aspects (4/4)
Measurement errors might be reduced for certain questions: having all the members of a household interviewed may produce more accurate results, especially for household-level questions.
On the other hand, data for multi-respondent households may be subject to certain biases for intra-household crimes such as domestic violence. More generally, for questions asking about personal attitudes, it is best not to have anybody else present, so that the respondent will feel free to give his or her true opinion. This measurement bias could be reduced if only one person per household was interviewed.

25 Tradeoff cost-accuracy (1/3)
We seek m (sample size of households) and nbar (sample size of individuals per household) which minimize the variance under a cost constraint made up of a fixed cost, the cost of contacting a household and the cost of an interview (data collection, processing…).

26 Tradeoff cost-accuracy (2/3)
From the solution, we should interview more individuals per household when:
- c1/c2 is high (it is costly to contact households)
- rho is low (the households are heterogeneous)
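The direction of these two effects can be checked with the textbook optimum for two-stage sampling. This is a minimal sketch, not the slide's own expression: it assumes a design effect 1 + (nbar - 1) × rho_conv, where rho_conv is the conventional intra-class correlation (which may be scaled differently from the slides' rho), and a variable cost of c1 per household plus c2 per interview. The function names are hypothetical.

```python
import math


def variance_factor(nbar: float, rho_conv: float, c1: float, c2: float) -> float:
    """Variance up to a constant, for a fixed variable budget.

    With m = budget / (c1 + c2 * nbar) households, the variance is
    proportional to deff(nbar) * (c1 + c2 * nbar) / nbar.
    """
    deff = 1 + (nbar - 1) * rho_conv
    return deff * (c1 + c2 * nbar) / nbar


def optimal_nbar(rho_conv: float, c1: float, c2: float, Nbar: int) -> float:
    """Unrounded optimum, clipped to the feasible range [1, Nbar]."""
    if rho_conv <= 0:
        return float(Nbar)  # no clustering penalty: interview everyone
    unconstrained = math.sqrt((c1 / c2) * (1 - rho_conv) / rho_conv)
    return min(max(unconstrained, 1.0), float(Nbar))
```

For instance, `optimal_nbar(0.5, 4, 1, 5)` returns 2.0; raising c1/c2 to 9 moves the optimum to 3.0, and raising rho_conv to 0.8 brings it down to 1.0.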

27 Tradeoff cost-accuracy (3/3)

28 Conclusions 1. The option of interviewing all household members instead of one often leads to less accurate results in terms of sampling variance, mainly because the members of a household tend to be homogeneous with regard to victimization experiences. The loss of accuracy should be particularly important with household-level crimes like vehicle theft or home burglary. It should be important with person-level crimes (robbery, violence…) as well since the within-household correlation should remain high for these types of crimes.

29 2. From the cost perspective, interviewing all household members helps us to save money, particularly when the survey is carried out face-to-face. There are other advantages with regard to non-response rate and measurement errors, provided that one’s response is not influenced by the other members of the household (as may happen with intra-household violence). The loss of sampling efficiency caused by interviewing more than one person per household should be weighed against all these aspects.
3. In order to reconcile statistical accuracy with survey cost, a tradeoff has to be made. However, the compromise solution in terms of both statistical accuracy and cost efficiency is variable-specific.

30 Given the impact on accuracy of interviewing more than one household member, the item dealing with minimum sample sizes should be clarified:
- In order to attain the minimum sample sizes as set out in the Regulation, the sample sizes at country level need to be inflated in order to take into account unit non-response and design effects (stratification, clustering, unequal weighting…)
- In particular, the impact on accuracy of interviewing more than one household member must be taken into account in the design effect
=> A clear definition is needed of what the design effect refers to and which components are to be included (SASU manual?)

31 Questions to the TF
* What about national surveys on victimization? Have you interviewed all household members or only one? What were the pros and cons?
* What about intra-household correlation (“cluster effect”)? Have you already observed high values for the intra-household correlation coefficient with your data?
* What about cost? Given the available budget, can you afford to interview one member per household? If not, do you think that the loss of accuracy caused by interviewing more than one member per household is “acceptable” given the objectives of the survey?

