Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.

Presentation transcript:

Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, February 2010 Guillaume Osier Service Central de la Statistique et des Etudes Economiques (STATEC) Social Statistics Division

Outline
I. Some theory
  1. Definitions and concepts
  2. How to over-sample?
  3. Why over-sample?
  4. Impact on national accuracy
II. Over-sampling the capital cities in the EU-SASU
  1. Is this proposal (statistically) relevant?
  2. How to determine the over-sampling rates?
  3. Impact on the national accuracy
III. Specific issues in relation to over-sampling

Definitions and concepts
(i) A sub-group (d) of the population is said to be over-sampled (or over-represented) when the proportion of units from the sub-group is, on average, higher in the sample than in the reference population: E(n_d / n) > N_d / N.
(ii) Conversely, a sub-group is said to be under-sampled (or under-represented) when the proportion of units from the sub-group is, on average, lower in the sample than in the reference population: E(n_d / n) < N_d / N.
(iii) When a sub-group is neither over-sampled nor under-sampled, it is said to be well-sampled (or well-represented).
Here N_d / N is the proportion of units from (d) in the population and E(n_d / n) is the average proportion of units from (d) in the sample.

How to over-sample? To be implemented, over-sampling requires the units in the sub-group to be identified in advance of sampling (an issue with telephone surveys). There are two main techniques to over-sample: (1) stratification using unequal sampling fractions in the strata; (2) more general "proportional-to-size" sampling (πps, pps…). Over-sampling rate for (d) = (expected sample size in (d)) / (expected sample size in (d) under no over-sampling, i.e. under Simple Random Sampling).
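The over-sampling rate defined above can be sketched in a few lines. The function name, parameter names and the figures in the example are hypothetical and only illustrate the ratio; under Simple Random Sampling the expected sample size in (d) is taken to be the total sample size times the population share of (d).

```python
def oversampling_rate(n_d, n, N_d, N):
    """Over-sampling rate for sub-group d.

    n_d : expected sample size in d under the actual design
    n   : total sample size
    N_d : population size of d
    N   : total population size
    """
    # Expected sample size in d if every unit had the same chance of selection
    expected_under_srs = n * N_d / N
    return n_d / expected_under_srs


# Hypothetical example: d holds 10% of the population but 20% of the sample
rate = oversampling_rate(n_d=800, n=4000, N_d=1_000_000, N=10_000_000)
print(rate)
```

A rate above 1 means (d) is over-sampled, below 1 under-sampled, and exactly 1 well-sampled.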

Why over-sample? 1/2 By selecting more people from certain groups than would be selected if everyone in the population had an equal chance of selection, over-sampling leads to more accurate estimates for those groups. The technique has proven particularly suitable for: small sub-populations; sub-populations with severe non-response problems; sub-populations with large internal variability on the key variables (e.g., household wealth).

Why over-sample? 2/2 More generally, one can resort to over-sampling whenever the sample size does not allow specified precision targets to be reached for certain sub-populations. Besides, in cross-national surveys (like the EU-SASU), over-sampling is essential for precision and hypothesis testing in cross-country comparisons. The choice of the sub-groups to over-sample is policy-driven (a political matter).

Impact on national accuracy 1/3 Optimal (Neyman) allocation: in order to maximize the precision of the national estimates under stratified simple random sampling, the sample size in stratum h depends both on the stratum population size N_h and on the standard deviation S_h of the study variable. Diagram: the total population aged 16+ is divided into strata 1, 2, …, H, with sizes N_1, N_2, …, N_H and standard deviations S_1, S_2, …, S_H.
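As an illustration, the textbook form of the Neyman allocation gives stratum h a sample size proportional to the product of its population size and its standard deviation: n_h = n × N_h S_h / (N_1 S_1 + … + N_H S_H). The sketch below assumes that form; the function name and the stratum figures are hypothetical.

```python
def neyman_allocation(n, sizes, std_devs):
    """Split a total sample size n across strata proportionally to N_h * S_h.

    sizes    : stratum population sizes N_h
    std_devs : stratum standard deviations S_h of the study variable
    Returns unrounded stratum sample sizes.
    """
    products = [N_h * S_h for N_h, S_h in zip(sizes, std_devs)]
    total = sum(products)
    return [n * p / total for p in products]


# Hypothetical three-stratum population
allocation = neyman_allocation(1000, sizes=[5000, 3000, 2000], std_devs=[0.3, 0.4, 0.5])
print([round(n_h) for n_h in allocation])
```

When all strata have the same standard deviation, the allocation reduces to proportional allocation.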

Impact on national accuracy 2/3 According to the previous formula, a larger sample should be taken from a stratum if: * the stratum is larger; * the stratum is more variable internally. These national considerations may conflict with more "local" considerations: as said, from a local point of view, over-sampling often focuses on small sub-populations, while national considerations lead to taking larger samples from the largest strata. Nevertheless, the loss in national accuracy is often limited:

Impact on national accuracy 3/3 Thus, if g=20%, we have σ/σ(opt) ≈ 1.02, which amounts to an increase in the standard error (that is, a loss of accuracy) of 2%. Similarly, if g=30%, we have σ/σ(opt) ≈ 1.04, an increase of 4%. In this sense the optimum can be described as flat. As a result, the impact of over-sampling on national accuracy should be limited, provided the sample sizes are not "extremely" different from the optimal ones. The impact is all the more limited given that the national sample sizes are generally large (thousands of units). Besides, by using powerful auxiliary information at national level, one may hope to increase sample precision a posteriori.
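The figures quoted on this slide are consistent with a classical result on the flatness of the optimum (found, for example, in Cochran's Sampling Techniques): if every stratum sample size departs from its optimal value by at most a proportion g, the variance is inflated by a factor of at most 1 + g², so the standard error by at most the square root of that. The sketch below assumes this is the bound the slide relies on; the function name is hypothetical.

```python
import math


def se_inflation_bound(g):
    """Upper bound on (standard error) / (optimal standard error)
    when stratum sample sizes deviate from the optimum by at most a proportion g.
    Assumes the bound V / V_opt <= 1 + g**2."""
    return math.sqrt(1 + g ** 2)


for g in (0.2, 0.3):
    print(g, round(se_inflation_bound(g), 2))
```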

Over-sampling the capital cities in the EU-SASU: is this proposal relevant? Capital city = most populated city of the country; this is always the same as the political capital (except for Switzerland). Is the proposal (statistically) relevant? Sample size of individuals in the capital cities: is it enough to draw reliable conclusions? Victimization rates in the capital cities: are they generally higher than those for the rest of the country? Is non-response higher in the capital cities? (Often the case.)

Minimum sample sizes for the capital cities

Victimization rates in capital cities. Source: International Crime Victims Survey (ICVS), 2005. → Victimization rates are higher in the capital cities than in the rest of the countries.

How to determine the over-sampling rates? 1/4 Step 1: set a precision target for every capital city. Step 2: determine the minimum sample size needed to achieve the level of precision specified at Step 1. Precision target (1): under simple random sampling, a relative margin of error of ε% in each capital city for any victimization rate higher than P%.
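Step 2 can be sketched for precision target (1) as follows. Under simple random sampling, and ignoring the finite population correction, the relative margin of error of an estimated rate P is z × sqrt((1 − P) / (n P)). It decreases as P grows, so meeting the target at the threshold P guarantees it for any higher rate. The 95% confidence level (z = 1.96), the function name and the parameter names are assumptions, not taken from the slides.

```python
import math


def min_sample_size_relative(rel_margin, p, z=1.96):
    """Smallest n such that the relative margin of error of a rate >= p
    does not exceed rel_margin under simple random sampling.

    rel_margin : target relative margin of error (0.10 for 10%)
    p          : lowest victimization rate the target must hold for
    z          : normal quantile (1.96 assumes a 95% confidence level)
    """
    n = z ** 2 * (1 - p) / (rel_margin ** 2 * p)
    return math.ceil(n)


# Hypothetical target: 10% relative margin for any rate of 20% or more
print(min_sample_size_relative(0.10, 0.20))
```

Lower thresholds P are much more demanding, since the required n grows roughly like 1/P.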

 = 10% How to determine the over-sampling rates? 2/4

How to determine the over-sampling rates? 3/4 P = 20%

How to determine the over-sampling rates? 4/4 Precision target (2): under simple random sampling, an absolute margin of error of ε percentage points in each capital city for any victimization rate higher than P%.
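For precision target (2), a comparable sketch uses the absolute margin of error z × sqrt(P (1 − P) / n), again under simple random sampling without finite population correction. The 95% confidence level (z = 1.96) and all names below are assumptions. Unlike the relative margin, the absolute margin is largest at P = 0.5, so the rate used for planning matters.

```python
import math


def min_sample_size_absolute(abs_margin, p, z=1.96):
    """Smallest n such that the absolute margin of error of a rate p
    does not exceed abs_margin under simple random sampling.

    abs_margin : target absolute margin (0.03 for 3 percentage points)
    p          : victimization rate used for planning
    z          : normal quantile (1.96 assumes a 95% confidence level)
    """
    n = z ** 2 * p * (1 - p) / abs_margin ** 2
    return math.ceil(n)


# Hypothetical targets: 3 percentage points at a planning rate of 20%, then 50%
print(min_sample_size_absolute(0.03, 0.20))
print(min_sample_size_absolute(0.03, 0.50))
```

Planning at P = 0.5 is the conservative choice when the rates of interest may approach 50%.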

Impact on the national accuracy 1/8 Consider the national victimization rate for the 10 main crimes as used in the International Crime Victims Survey (ICVS), expressed as a combination of the victimization rate in the capital city and the victimization rate in the rest of the country.

Impact on the national accuracy 2/8 Formulas for the variance, the relative margin of error and the absolute margin of error of the national victimization rate.
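One plausible form of these three quantities, treating the capital city and the rest of the country as two strata sampled independently by simple random sampling, is sketched below. The notation is an assumption and does not come from the slides.

```latex
% Assumed notation (not in the transcript):
%   W_c       population share of the capital city
%   P_c, P_r  victimization rates in the capital city and in the rest of the country
%   n_c, n_r  sample sizes in the capital city and in the rest of the country
%   z         normal quantile (e.g. 1.96 for a 95% confidence level)
% Finite population corrections are ignored.
\begin{align*}
P &= W_c P_c + (1 - W_c) P_r \\
V(\hat{P}) &= W_c^2 \, \frac{P_c (1 - P_c)}{n_c} + (1 - W_c)^2 \, \frac{P_r (1 - P_r)}{n_r} \\
\text{absolute margin of error} &= z \sqrt{V(\hat{P})} \\
\text{relative margin of error} &= \frac{z \sqrt{V(\hat{P})}}{P}
\end{align*}
```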

Impact on the national accuracy 3/8 Case 1: fixed national sample size

Impact on the national accuracy 4/8 Table 3: Relative margin of error (%) for the national victimization rate – fixed sample size at national level (Case 1). Columns: Country; Over-sampling (P=0.1, P=0.2, P=0.3, P=0.4, P=0.5); No over-sampling. Rows: France, Germany, Switzerland, Italy, Poland, Netherlands, Portugal, Denmark, Greece, Spain, Sweden, Finland, Norway, Ireland, Belgium, United Kingdom, Hungary, Austria, Estonia.

Impact on the national accuracy 5/8 Table 4: Absolute margin of error (% points) for the national victimization rate – fixed sample size at national level (Case 1). Columns: Country; Over-sampling (P=0.1, P=0.2, P=0.3, P=0.4, P=0.5); No over-sampling. Rows: France, Germany, Switzerland, Italy, Poland, Netherlands, Portugal, Denmark, Greece, Spain, Sweden, Finland, Norway, Ireland, Belgium, United Kingdom, Hungary, Austria, Estonia.

Impact on the national accuracy 6/8 Case 2: national sample size not fixed
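The two cases can be compared numerically with a small sketch. It treats the capital city and the rest of the country as two strata under simple random sampling, ignores finite population corrections, and reads Case 1 as "the capital sample is enlarged at the expense of the rest of the country" and Case 2 as "the capital sample is enlarged while the rest-of-country sample keeps its proportional size". That reading, the 95% confidence level and every figure below are assumptions for illustration only.

```python
import math


def national_margins(w_cap, p_cap, p_rest, n_cap, n_rest, z=1.96):
    """Absolute and relative margins of error of the national rate,
    with the capital city and the rest of the country as two strata.

    w_cap         : population share of the capital city
    p_cap, p_rest : victimization rates in the two strata
    n_cap, n_rest : sample sizes in the two strata
    """
    p_nat = w_cap * p_cap + (1 - w_cap) * p_rest
    variance = (w_cap ** 2 * p_cap * (1 - p_cap) / n_cap
                + (1 - w_cap) ** 2 * p_rest * (1 - p_rest) / n_rest)
    abs_margin = z * math.sqrt(variance)
    return abs_margin, abs_margin / p_nat


# Hypothetical country: capital holds 10% of the population
w_cap, p_cap, p_rest = 0.10, 0.30, 0.20
n_national, n_cap_target = 4000, 1500

# No over-sampling: proportional allocation of the national sample
baseline = national_margins(w_cap, p_cap, p_rest,
                            n_national * w_cap, n_national * (1 - w_cap))
# Case 1: national size fixed, so the rest of the country loses sample
case1 = national_margins(w_cap, p_cap, p_rest,
                         n_cap_target, n_national - n_cap_target)
# Case 2: national size not fixed, the capital sample is simply topped up
case2 = national_margins(w_cap, p_cap, p_rest,
                         n_cap_target, n_national * (1 - w_cap))

for label, (abs_m, rel_m) in [("baseline", baseline), ("case 1", case1), ("case 2", case2)]:
    print(f"{label}: absolute {100 * abs_m:.2f} points, relative {100 * rel_m:.1f}%")
```

With these figures the national margin widens under Case 1 and narrows slightly under Case 2, since Case 2 only adds sample.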

Impact on the national accuracy 7/8 Table 5: Relative margin of error (%) for the national victimization rate – national sample size not fixed (Case 2). Columns: Country; Over-sampling (P=0.1, P=0.2, P=0.3, P=0.4, P=0.5); No over-sampling. Rows: France, Germany, Switzerland, Italy, Poland, Netherlands, Portugal, Denmark, Greece, Spain, Sweden, Finland, Norway, Ireland, Belgium, United Kingdom, Hungary, Austria, Estonia.

Impact on the national accuracy 8/8 Table 6: Absolute margin of error (% points) for the national victimization rate – national sample size not fixed (Case 2). Columns: Country; Over-sampling (P=0.1, P=0.2, P=0.3, P=0.4, P=0.5); No over-sampling. Values as transcribed: France 0.7; Germany 0.7; Switzerland 0.9; Italy 0.7; Poland 0.8; Netherlands; Portugal; Denmark; Greece 0.7; Spain 0.6; Sweden 0.8; Finland; Norway; Ireland 1.0; Belgium 0.8; United Kingdom 0.8; Hungary 0.6; Austria; Estonia 0.8, 1.0.

Specific issues The first difficulty is obtaining a sampling frame suitable for over-sampling the inhabitants of the capital cities. For the countries conducting a face-to-face survey, this should not be a serious issue. On the other hand, the countries that plan to conduct the survey by telephone might be unable to do so, unless specific phone numbers are allocated to the households in the capital city (e.g., when the first digits of a phone number represent the city code). Since individuals in capital cities are in general more difficult to contact, over-sampling them will require more contact attempts, which will likely mean higher costs and more time to reach the minimum sample size required for the survey. Finally, over-sampling might make the problem of anonymisation of the data more acute.

Questions for the TF
1. Is over-sampling the inhabitants of the capital cities policy-relevant? Which geographical areas might be over-sampled instead? NUTS2 or NUTS3 regions; groups of cities (like in Eurostat's Urban Audit); densely populated areas (based on degree of urbanisation); city areas…
2. What level of accuracy is needed for the capital cities/other geographical areas?
3. What about higher non-response?
4. What about telephone surveys?