Published by Dwayne Wilkerson. Modified over 8 years ago.
1
Copyright © 2015 Inter-American Development Bank. This work is licensed under a Creative Commons IGO 3.0 Attribution-Non Commercial-No Derivatives (CC-IGO BY-NC-ND 3.0 IGO) license (http://creativecommons.org/licenses/by-nc-nd/3.0/igo/legalcode) and may be reproduced with attribution to the IDB and for any non-commercial purpose. No derivative work is allowed.
Any dispute related to the use of the works of the IDB that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the IDB’s name for any purpose other than for attribution, and the use of IDB’s logo shall be subject to a separate written license agreement between the IDB and the user and is not authorized as part of this CC-IGO license.
Note that the link provided above includes additional terms and conditions of the license.
The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the Inter-American Development Bank, its Board of Directors, or the countries they represent.
2
Sampling Topics Relevant to Impact Evaluations
Juan Muñoz
Sistemas Integrales
3
Household samples
–Selection of primary sample units
–Selection of households
–Selection of individuals
–Survey plan
Facilities sampling
–Selection of administrative areas
–Selection of facilities
–Selection of service providers
–Selection of users (exit polls)
–Selection of households (catchment areas/coverage)
Nonresponse
Estimating differences
–Advantages and disadvantages of panel surveys
–Introduction to the concept of power
4
Household Samples: Selecting Primary Sample Units
Primary Sample Units (PSUs) are census area units generated by the latest census.
PSUs usually have between 50 and 200 households.
The sampling frame is a relatively small file. It is convenient to use Excel.
PSUs for the sample are chosen with Probability Proportional to Size (PPS).
Chosen PSUs must be recognizable in the field.
Notes:
–Need to collaborate with the national institute of statistics.
–Presence of smaller PSUs might require some prior work.
–We will see how to do this next class.
–A data file is not enough. A map is also required.
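The slide names PPS but does not say which variant is used. As a minimal sketch, the block below assumes systematic PPS over a cumulated list of household counts. The function name `pps_systematic`, the PSU sizes, and the seed are hypothetical and chosen only for illustration.

```python
import random

def pps_systematic(sizes, n, seed=0):
    """Pick n PSUs with probability proportional to size (systematic PPS).

    sizes: household counts per PSU, in frame order.
    Returns the frame positions (0-based) of the chosen PSUs.
    A PSU larger than the sampling step can be hit more than once.
    """
    rng = random.Random(seed)
    total = sum(sizes)
    step = total / n                      # sampling interval in households
    start = rng.random() * step           # random start inside the first interval
    targets = [start + k * step for k in range(n)]

    chosen, cumulative, t = [], 0, 0
    for position, size in enumerate(sizes):
        cumulative += size
        # Every target that falls inside this PSU's cumulated range selects it.
        while t < n and targets[t] < cumulative:
            chosen.append(position)
            t += 1
    return chosen

# Hypothetical frame: 10 PSUs with 50 to 200 households each.
psu_sizes = [120, 80, 150, 60, 200, 90, 110, 70, 180, 140]
sample = pps_systematic(psu_sizes, 4, seed=1)
```

With these sizes the step is 300 households, larger than any PSU, so the four selected PSUs are distinct. Fixing the seed lets a supervisor reproduce the draw.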
5
Household Samples: Selecting Households
The most appropriate sampling frame is a list of all the households in each of the chosen PSUs.
Building this list is a fieldwork operation that requires time and money. That time and money are:
–Marginal in relation to the study’s budget and schedule, if they are planned properly.
–Large enough to be a serious problem if neglected.
PSUs that are too big might need to be split up.
Information to have in the list of households:
–As a minimum, name of the head of household and address.
–Additional information required by the impact evaluation (presence of children or pregnant women, for example).
Selection of households based on the listing should not be done by the enumerators themselves.
Notes:
–We’ll see how to do this next class.
–Don’t collect more information than necessary.
–Do not trust “random walks”.
–The census is not a good alternative.
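One way to keep the selection out of the enumerators' hands is to draw it centrally from the listing with a recorded seed. The sketch below assumes simple random sampling within a PSU; the slides do not specify the draw. The household IDs, the function name `select_households`, and the seed are hypothetical.

```python
import random

def select_households(listing, k, seed):
    """Draw k households from a PSU listing, reproducibly.

    listing: household identifiers from the field listing.
    seed: recorded per PSU so the office can replay the draw.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(listing, k))

# Hypothetical listing of 85 households in one PSU.
listing = [f"HH-{i:03d}" for i in range(1, 86)]
selected = select_households(listing, 12, seed=2015)
```

Because the seed is stored with the PSU, the same call returns the same 12 households, which makes the draw auditable.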
6
Household Samples: Selecting Individuals
Some impact evaluations require the selection of one individual out of many eligible individuals in each household. For example:
–One of the children
–One of the women of fertile age
Enumerators carry out the selection, which should be
–Random, but also
–Monitorable
This is why you may not use dice or other seemingly random alternatives.
Methods
–Kish grids
–Random tags
–Birthday method
We will look at an example next session.
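The slide lists the birthday method without detail. The sketch below assumes the "next birthday" variant: the eligible person whose birthday comes soonest after the interview date is chosen, so a supervisor can recompute the pick from the roster. The names, the dates, and the rule of treating 29 February as 28 February in non-leap years are assumptions made for illustration.

```python
from datetime import date

def days_until_birthday(month, day, today):
    """Days from `today` to the next occurrence of the birthday (0 if today)."""
    def birthday_in(year):
        try:
            return date(year, month, day)
        except ValueError:
            # Assumption: 29 February is observed on 28 February in non-leap years.
            return date(year, 2, 28)
    upcoming = birthday_in(today.year)
    if upcoming < today:
        upcoming = birthday_in(today.year + 1)
    return (upcoming - today).days

def next_birthday_pick(eligible, today):
    """eligible: list of (name, birth_month, birth_day). Ties broken by name."""
    return min(eligible, key=lambda p: (days_until_birthday(p[1], p[2], today), p[0]))

# Hypothetical household roster of eligible children.
roster = [("Ana", 1, 20), ("Luis", 3, 25), ("Marta", 11, 2)]
picked = next_birthday_pick(roster, today=date(2015, 3, 10))
```

On 10 March 2015 the nearest upcoming birthday is 25 March, so the rule selects Luis. The result depends only on the roster and the visit date, which is what makes it monitorable.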
7
Household Samples: Survey Plan
The duties of the person selecting the sample do not end with the selection of the PSUs.
The sample design should also specify
–The assignment of PSUs to fieldwork teams
–When each PSU will be visited
If this is not explicitly specified, the firm performing the survey may resort to undesirable alternatives.
–The worst is “systematic sweeping” of the territory, which can confound time and space in the analytical phase.
The best is to plan the survey so that each region is observed throughout the entire data collection period.
It is better to have a long data collection period with few enumerators than a short period with many enumerators.
Also, several enumerators can be randomly assigned to the same region (interpenetrating samples).
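As a rough sketch of a plan in which every region is observed throughout the data collection period, the block below shuffles the PSUs within each region and deals them round-robin across survey weeks. The function name `schedule_psus`, the region names, the PSU codes, and the number of weeks are hypothetical, and this is only one of several possible scheduling rules.

```python
import random
from collections import defaultdict

def schedule_psus(psus_by_region, n_weeks, seed=0):
    """Spread each region's PSUs over all weeks instead of sweeping the territory.

    Returns {week: [(region, psu), ...]} with weeks numbered from 0.
    """
    rng = random.Random(seed)
    schedule = defaultdict(list)
    for region in sorted(psus_by_region):
        psus = list(psus_by_region[region])
        rng.shuffle(psus)                  # random visiting order within the region
        offset = rng.randrange(n_weeks)    # random starting week for the region
        for i, psu in enumerate(psus):
            schedule[(offset + i) % n_weeks].append((region, psu))
    return dict(schedule)

# Hypothetical sample: two regions, four weeks of fieldwork.
sample_psus = {
    "North": [f"N{i}" for i in range(1, 9)],   # 8 PSUs
    "South": [f"S{i}" for i in range(1, 5)],   # 4 PSUs
}
plan = schedule_psus(sample_psus, n_weeks=4, seed=7)
```

When a region has at least as many PSUs as there are weeks, it appears in every week of the plan, so calendar effects are not confused with regional effects.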
8
Facility Samples
Many impact evaluation studies need to observe different types of elements
–Large administrative areas (e.g., regions)
–Intermediate administrative units (e.g., municipalities)
–Facilities (schools, hospitals, clinics…)
–Service providers (teachers, doctors, …)
–Clients (students, patients, …)
–Households
Analysis must reflect hierarchical links.
How are the samples chosen?
9
Sampling Facilities
?
10
Two possibilities
Top-down (multistage sampling)
–First, choose regions
–Then, districts in the chosen regions
–Then, hospitals in the chosen districts
Disadvantages
–High design effects
–Poor understanding of higher levels
–Vulnerable to deliberate selections
Bottom-up
–First, choose a facility sample
–This implicitly defines a district sample
–This, in turn, defines the region sample
Disadvantages
–Slightly costlier
11
In each facility: choosing providers (teachers, doctors, etc.)
Samples have to be random (they may be stratified).
The selection has to be entrusted to enumerators…
…but it must be replicable (by supervision).
This can be done using
–Kish grids
–Random tags, or
–The birthday method
12
In each health facility: choosing patients (exit polls)
Selection must be random and is entrusted to enumerators.
It cannot be replicated and the Hawthorne effect is inevitable, but the procedures should avoid the following biases:
–Selection bias on the part of the enumerator
–Selection bias on the part of the patient
–Bias related to the day of the week
–Bias related to the time of day
To do this properly you need two people:
–An enumerator
–A person in charge of managing patients
We will see how to do this next session.
13
In each facility: choosing households in the coverage area
There is nothing better than the method we just saw: two-stage sampling, with PSUs in the first stage and households in the second.
However, it is necessary to match the territorial partition defined by the ministry with the census maps.
Using a list of households is still the best method for selecting households.
There are alternative methods. Despite their appealing names, they are not advisable:
–Expert selection
–“Random walks”
–“Snowballing”
–Etc.
14
Nonresponse
Possible solutions…
x Replace those who do not respond with similar individuals
x Increase the sample size to compensate
x Use correction formulas
x Use imputation techniques (hot-deck, cold-deck, warm-deck, etc.) to simulate the responses of those who do not respond
✔ None of the above
The main problem caused by nonresponse is not the reduction in the sample size, but the bias it induces.
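A short worked identity shows why a larger sample does not help. It assumes a simplified model, not stated on the slide, in which the population splits into a fraction $R$ of respondents with mean $\bar{Y}_r$ and a fraction $1-R$ of nonrespondents with mean $\bar{Y}_{nr}$.

```latex
\bar{Y} = R\,\bar{Y}_r + (1-R)\,\bar{Y}_{nr}
\quad\Longrightarrow\quad
\bar{Y}_r - \bar{Y} = (1-R)\,\bigl(\bar{Y}_r - \bar{Y}_{nr}\bigr)
```

The bias of a respondents-only estimate depends on the nonresponse rate and on how different the nonrespondents are. The sample size does not appear anywhere in the expression, so increasing it leaves the bias unchanged.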
15
The best way to approach nonresponse is to prevent it.
Lohr, Sharon L. Sampling: Design & Analysis (1999)
16
[Diagram: factors affecting total nonresponse, grouped under Enumerator, Type of Survey, and Respondents. Labels shown: Training, Workload, Motivation, Qualification; Data Collection Method, Demographic, Socio-economic, Biological samples; Fatigue, Motivation, Proxy, Availability]
Source: “Some factors affecting Non-Response.” by R. Platek. 1977. Survey Methodology. 3. 191-214
17
Panel surveys measure change better
[Chart: $Y_{2010}$ and $Y_{2012}$ plotted for 2010 and 2012]
It seems that $Y_{2010} > Y_{2012}$, but…
…both measurements suffer from sampling errors ($e_{2010}$ and $e_{2012}$).
The error in the difference $Y_{2012} - Y_{2010}$ is…
…$\sqrt{e_{2010}^2 + e_{2012}^2}$ if the samples are independent
…only $\sqrt{e_{2010}^2 + e_{2012}^2 - 2\rho[Y_{2010}, Y_{2012}]\, e_{2010}\, e_{2012}}$ if the sample is the same
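The two formulas can be compared numerically. In the sketch below the function name `error_of_difference`, the sampling errors (2.0 in each year), and the correlation (0.5) are made-up values used only to illustrate the arithmetic.

```python
import math

def error_of_difference(e_first, e_second, rho=0.0):
    """Sampling error of the difference between two survey estimates.

    rho is the correlation between the two measurements:
    0 for independent samples, positive for a panel.
    """
    return math.sqrt(e_first**2 + e_second**2 - 2 * rho * e_first * e_second)

# Hypothetical sampling errors for 2010 and 2012.
independent = error_of_difference(2.0, 2.0)          # separate samples each year
panel = error_of_difference(2.0, 2.0, rho=0.5)       # same households both years
```

With these values the independent design gives an error of about 2.83 and the panel design gives 2.0, so the same change is measured more precisely without increasing the sample.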
18
Advantages & disadvantages of panel surveys
Analytical advantages
–Measure change better
–Better explain the reasons for the change
–Correlate past and present behavior
Analytical disadvantages
–Aging: the sample becomes progressively less representative of the population
Practical advantages
–Can reduce the sample size without losing power
–Do not require additional design efforts beyond the initial effort
Practical disadvantages
–Attrition
–Harder to manage
–Vulnerable to manipulation
–Must be designed prospectively
19
Was there an impact? The notion of power
Null hypothesis H0: there was no impact (H0: y2 – y1 = 0)
Alternative hypothesis HA: there was an impact (HA: y2 – y1 = Δ), where Δ is the effect size
What we observe: ŷ1 and ŷ2
What we decide
–Accept H0 if ŷ2 – ŷ1 ≤ D
–Reject H0 if ŷ2 – ŷ1 > D
What happened in the world versus what we decide
–H0 is true and we accept H0: OK. P(correctly accepting H0) = 1 – α
–H0 is true and we reject H0: Type I Error. P(Type I Error) = α (significance level)
–HA is true and we accept H0: Type II Error. P(Type II Error) = β
–HA is true and we reject H0: OK. P(correctly rejecting H0) = 1 – β (power)
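The decision rule can be turned into a small power calculation. The sketch below assumes, beyond what the slide states, that the estimated difference ŷ2 – ŷ1 is approximately normal with a known sampling error. The function name `power_one_sided` and all numeric values are hypothetical.

```python
from statistics import NormalDist

def power_one_sided(delta, se_diff, alpha=0.05):
    """Power of the rule 'reject H0 if the estimated difference exceeds D'.

    delta: true effect size under HA.
    se_diff: sampling error of the estimated difference.
    alpha: significance level, P(Type I Error).
    """
    z = NormalDist()
    threshold = z.inv_cdf(1 - alpha) * se_diff     # D, set so that P(reject | H0) = alpha
    # Under HA the estimate is centred on delta, so power = P(estimate > D).
    return 1 - z.cdf((threshold - delta) / se_diff)

# Hypothetical effect of 5 units, measured with two different sampling errors.
power_independent = power_one_sided(delta=5.0, se_diff=2.83)
power_panel = power_one_sided(delta=5.0, se_diff=2.0)
power_no_effect = power_one_sided(delta=0.0, se_diff=2.0)
```

When the true effect is zero, the function returns the significance level itself. A smaller sampling error, such as the one a panel design can deliver, raises the power for the same effect size.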