Jon Pedersen: Validation of CRVS through surveys Some considerations


- What are the critical design and implementation issues to be considered for carrying out a validation study in Jordan, given that part of the refugee population lives outside of camps and that there are useful data from the 2015 population census?
- Can particular use be made of the 2015 Jordanian census and available UNHCR registration data? If so, what coverage limitations of the Jordanian census and the UNHCR registration data need to be considered?
- What are the critical design and implementation issues to be considered for a validation survey in Lebanon, given that the last population census was in 1932?

Main issues in a validation survey
- What should be estimated and with what estimators?
- How can the target population be reached?

Possible estimators
- A CRVS reports events, which are supposed to be totals derived from a finite population.
- Vital events of type t (deaths, deaths at age x, births, etc.) observed in the CRVS, $v_{t,c}$, vs vital events in the population, $v_{t,p}$.
- Thus, we would like $v_{t,p} - v_{t,c} = d_{t,c}$, and $d_{t,c}$ should be 0.
- We would expect the difference to vary between t (e.g. births could be OK, but not early neonatal deaths). We are sensitive to underreporting to varying degrees.
- If a survey is used for validation, then what we are computing is $\hat{v}_{t,p} - v_{t,c} = \hat{d}_{t,c}$.
- $\hat{v}_{t,p}$ can be estimated in various ways:
  - Simply as a Horvitz-Thompson estimator, $\hat{v}_{t,p} = \sum_{i=1}^{n} v_i / p_i$
  - As a capture-recapture type estimator (Petersen and so on)
  - As an estimator based on some adaptive sampling scheme
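As an illustration of the first option, here is a minimal sketch of the Horvitz-Thompson total and the resulting difference from the CRVS count. The function name, the household data, and the CRVS count of 150 are hypothetical values chosen for the example; variance estimation is not shown.

```python
from typing import Sequence


def horvitz_thompson_total(values: Sequence[float], probs: Sequence[float]) -> float:
    """Estimate a population total as sum(v_i / p_i) over sampled units."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if any(p <= 0 or p > 1 for p in probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(v / p for v, p in zip(values, probs))


# Hypothetical sample: deaths reported by five sampled households,
# each with its own inclusion probability p_i.
deaths_reported = [0, 1, 0, 2, 0]
inclusion_probs = [0.01, 0.01, 0.02, 0.02, 0.01]

v_hat = horvitz_thompson_total(deaths_reported, inclusion_probs)

# Hypothetical CRVS count of the same event type for the same period.
crvs_count = 150
d_hat = v_hat - crvs_count  # estimated number of events missing from the CRVS
print(v_hat, d_hat)
```

A positive `d_hat` suggests under-registration, but only if it is large relative to its sampling error.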

Short aside on capture recapture
- Typical CRVS-survey table: a (large) survey is carried out after CRVS enumeration. The two are then matched.
- In demography the table is constructed differently than in typical capture-recapture, but it is actually the same:

                  In CRVS   Not in CRVS   Total
  In survey       N1        D             N1+D
  Not in survey   C         N2            C+N2
  Total           N1+C      D+N2

  corresponds to (i.e. Petersen estimator)

                  In CRVS   Not in CRVS   Total
  In survey       R                       C
  Not in PES
  Total           M

- Assumes a lot:
  - The population is closed: there is no immigration or emigration, no deaths or births
  - All have the same chance of being observed in the first sample
  - Marking individuals does not affect their chance of being reobserved
  - Individuals are reliably identified as having been observed before or not
- But it is possible to relax the assumptions with more complex estimators (with rather dramatic effects)
- See:
- Sometimes more complex modelling of each cell
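The correspondence between the two tables can be sketched in code. With R = N1, C = N1+D and M = N1+C, the Petersen estimate of the total number of events is M·C/R. The function name and the cell counts below are hypothetical, and the sketch relies on all the assumptions listed above (closed population, equal catchability, perfect matching).

```python
def petersen_estimate(n1: float, d: float, c: float) -> dict:
    """Dual-system (Petersen) estimate from a matched CRVS-survey table.

    n1: events found in both CRVS and survey
    d:  events found in the survey only
    c:  events found in the CRVS only
    """
    if n1 <= 0:
        raise ValueError("need at least one matched event")
    in_survey = n1 + d          # C in capture-recapture notation
    in_crvs = n1 + c            # M in capture-recapture notation
    total = in_survey * in_crvs / n1   # M * C / R
    missed_by_both = d * c / n1        # estimate of the N2 cell
    return {
        "total": total,
        "missed_by_both": missed_by_both,
        "crvs_completeness": in_crvs / total,  # equals n1 / (n1 + d)
    }


# Hypothetical matched counts of deaths.
result = petersen_estimate(n1=80, d=20, c=10)
print(result)
```

Note that the estimated CRVS completeness reduces to N1/(N1+D), the share of survey-detected events that were also registered.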

Possible estimators: derived rates or ratios
- Alternative to estimators of events
- Benefit: intuitively tells if the data make sense
- Drawbacks:
  - Even if the rates make sense, there may be under-reporting of events
  - May be difficult to interpret
  - Survey-derived rates are typically calculated differently than CRVS-derived ones (especially 1q0 and 5q0; not a problem for births)

Possible estimators: secondary derived
- Calculation of diagnostic estimators from the rates (e.g. early neonatal mortality as a proportion of neonatal mortality)
- Benefit: may be quite revealing
- Drawbacks:
  - Standards may be changing (and are only partly known)
  - Variance

Different situations for validation surveys
- Target population is «elusive» (H2R):
  - CRVS far from complete: main challenge lies in surveying the elusive population
  - CRVS nearly complete: both an elusive-population and a sample size challenge
- Target population is standard:
  - CRVS far from complete: degree of non-completeness relatively easy to estimate
  - CRVS nearly complete: determining completeness requires large samples
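Why near-completeness drives up the sample size can be sketched with a rough calculation. The assumptions here are not from the slides: simple random sampling of events, a normal approximation, and a target expressed as a relative margin of error r on the unregistered share q. Setting z·sqrt(q(1-q)/n) = r·q gives n = z²(1-q)/(q·r²), optionally inflated by a design effect. The function name and the input values are hypothetical.

```python
import math


def events_needed(q: float, rel_margin: float, z: float = 1.96, deff: float = 1.0) -> int:
    """Number of sampled events needed to estimate the unregistered share q
    with relative margin of error rel_margin (normal approximation, SRS,
    inflated by a design effect deff)."""
    if not 0 < q < 1:
        raise ValueError("q must lie strictly between 0 and 1")
    if rel_margin <= 0:
        raise ValueError("rel_margin must be positive")
    n = deff * z ** 2 * (1 - q) / (q * rel_margin ** 2)
    return math.ceil(n)


# CRVS far from complete: 40% of events unregistered.
n_far = events_needed(q=0.40, rel_margin=0.25)
# CRVS nearly complete: 2% of events unregistered.
n_near = events_needed(q=0.02, rel_margin=0.25)
print(n_far, n_near)
```

These are counts of vital events, not households. Since births and deaths are rare per household, the household sample needed to observe that many events would be considerably larger.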

Coverage of CRVS
- Vital events of type t (births, deaths, deaths at age x) observed in the CRVS, $v_{t,c}$, vs vital events in the population, $v_{t,p}$.
- Thus, we would like $v_{t,p} - v_{t,c} = d_{t,c}$, and $d_{t,c}$ should be 0.
- We would expect the difference to vary between t.
- Alternatively we may express the differences as differences between rates or ratios, but:
  - since CRVS rates are constructed differently from survey-based ones, this is difficult to do when the CRVS is close to completeness (because the difference in calculation method matters)
  - it is simpler to focus on totals, i.e. the number of events themselves rather than estimators derived from them
- The problem is estimating vital events in the population from a survey with:
  - Sufficient precision (sampling and measurement uncertainty)
  - Lack of bias

Reaching the population: the examples of Jordan and Lebanon (Jordan)
- «Easy» part and difficult part: camps vs displaced outside camps
- 2015 census:
  - Surprising population size
  - Good delineation of enumeration area cartography
  - DoS traditionally not so good on actual listing within EAs
  - Unclear why a large number of migrants was reported in the census, as they are usually not covered well in surveys
  - Traditional weak spot is work sites
- For the study of Iraqis in Jordan, the 2004 census was less informative in 2008 than envisaged, because of substantial movements of refugees
- Definition issue: who are refugees

Reaching the population: the examples of Jordan and Lebanon (Lebanon)
- No census since 1932
- CAS has prepared a delineation of EAs based on satellite imagery, but it is getting old
- Several polling firms have prepared their own (relatively) small area population estimates
- Overall proportion of migrants high (> 10%)
- Some geographic clustering of migrants
- Likely migration
- Some areas have security challenges
- Definition issue: who are refugees

Use of a census for sampling 101 a
- In principle, a census covers everyone within the borders of a country
- It defines small areas (typically containing 100 households or so) for the whole geographic extent
- It provides (recent) population figures for the small areas
- Therefore, it is nice to use for sampling, because one can exploit the advantages of two-stage cluster sampling with PPS in the first stage and a fixed sample take in the second stage:
  - $p_{h,c} = \frac{m_h N_{h,c}}{N_h}$, inclusion probability of cluster c within stratum h
  - $p_{h,c,f} = \frac{n_{h,c}}{N_{h,c}}$, inclusion probability of household f in cluster c in stratum h
  - $p_{h,c} \, p_{h,c,f} = \frac{m_h N_{h,c}}{N_h} \cdot \frac{n_{h,c}}{N_{h,c}} = \frac{m_h n_{h,c}}{N_h} = \frac{n_h}{N_h}$
  - Here $N_{h,c}$ is the frame count of households in cluster c, $N_h$ the frame count for stratum h, $m_h$ the number of clusters selected in stratum h, $n_{h,c}$ the fixed number of households taken per cluster, and $n_h = m_h n_{h,c}$
- That is, equal probabilities within strata. Thus there is no variance contribution from inclusion probabilities, and the sample size is fixed by design
- But reality is not quite like this...
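A small numeric check of the cancellation, as a sketch under assumptions: the stratum, its cluster sizes, and the function name are hypothetical, and no cluster is so large that its first-stage probability would exceed 1.

```python
from typing import List


def two_stage_probabilities(frame_sizes: List[int], m: int, take: int) -> List[float]:
    """Overall household inclusion probability per cluster for PPS selection
    of m clusters followed by a fixed take of households in each cluster."""
    n_stratum = sum(frame_sizes)  # N_h: households in the stratum per the frame
    probs = []
    for n_cluster in frame_sizes:  # N_{h,c}
        p_cluster = m * n_cluster / n_stratum  # first stage (PPS)
        p_household = take / n_cluster         # second stage (fixed take)
        probs.append(p_cluster * p_household)
    return probs


# Hypothetical stratum with four clusters of unequal size.
probs = two_stage_probabilities(frame_sizes=[80, 100, 120, 100], m=2, take=10)
print(probs)  # every entry is close to n_h / N_h = 20 / 400
```

The cluster size cancels between the two stages, which is why every household in the stratum ends up with the same probability regardless of which cluster it belongs to.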

Use of a census for sampling 101 b
- Reality intervenes because household numbers in the sampled cluster are not the same as they were in the frame (census), thus:
  - $p_{h,c} = \frac{m_h N_{h,c}}{N_h}$, inclusion probability of cluster c within stratum h (same as before)
  - $p_{h,c,f} = \frac{n_{h,c}}{N^{l}_{h,c}}$, inclusion probability of household f in listed cluster c in stratum h, where $N^{l}_{h,c}$ is the number of households found at listing
  - $p_{h,c} \, p_{h,c,f} = \frac{m_h N_{h,c}}{N_h} \cdot \frac{n_{h,c}}{N^{l}_{h,c}} = \frac{m_h n_{h,c}}{N_h} \cdot \frac{N_{h,c}}{N^{l}_{h,c}}$
- That is, unequal probabilities within strata. Thus there is a variance contribution from inclusion probabilities, $1 + CV(1/p)^2$, but the sample size is still fixed by design, and the sample is still unbiased
- Note that accurate household numbers in the frame are not necessary for an unbiased sample, but the less accurate they are, the more variance
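The variance penalty can be illustrated with a short sketch. It computes design weights 1/p for sampled clusters whose listed household counts differ from the frame, and then the factor 1 + CV² of those weights. All counts and function names are hypothetical, and the factor is computed across clusters, which equals the household-level value here because every cluster contributes the same number of households.

```python
from typing import List


def design_weights(n_stratum: int, m: int, take: int,
                   frame_sizes: List[int], listed_sizes: List[int]) -> List[float]:
    """Weights 1/p for households in each sampled cluster, with PPS on frame
    counts in stage one and a fixed take from the listed households in stage two."""
    weights = []
    for frame, listed in zip(frame_sizes, listed_sizes):
        p_cluster = m * frame / n_stratum
        p_household = take / listed
        weights.append(1.0 / (p_cluster * p_household))
    return weights


def weight_variation_factor(weights: List[float]) -> float:
    """1 + CV(w)^2, computed as n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical stratum of 4000 frame households, 4 sampled clusters, take of 10.
frame = [80, 100, 120, 100]
listed = [100, 100, 90, 150]   # counts found at listing differ from the frame
w = design_weights(n_stratum=4000, m=4, take=10, frame_sizes=frame, listed_sizes=listed)
factor = weight_variation_factor(w)
print(w, factor)
```

With a perfectly accurate frame all weights would be equal and the factor would be exactly 1.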

Use of a census for sampling 101 c
- Other aspects of reality are more important for lack of bias:
  - Whether the delineation of enumeration areas exhausts all areas in the country, and whether there is a way to update the actual frame with enumeration areas that have become populated
  - Whether there are procedures in place that ensure that everyone who actually resides in an enumeration area can actually be counted (i.e. how non-residential space is treated, informal housing, etc.)

Dealing with H2R («Elusiveness»)
- Various methods:
  - Double sampling (screening)
  - Disproportionate allocation (not so good)
  - Adaptive sampling
  - Indirect sampling

Adaptive cluster sampling
- Used for populations that are rare, but clustered
- Some form of sample frame exists
- Procedure:
  - Select an ordinary cluster sample
  - If a cluster contains more than z target respondents, select all neighbours of that cluster
  - Continue selecting neighbours until the newly added clusters contain fewer than z target respondents
- For:
  - Easy (both procedure and estimation)
  - Works well when the clustered-population assumption is fulfilled
- Against:
  - Very rare populations -> no respondents
  - Not so rare populations -> all clusters selected
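The selection procedure can be sketched as follows. The assumptions are not from the slides: clusters sit on a rectangular grid, "neighbours" means the four cells sharing an edge, and a cluster triggers expansion only when its count is strictly greater than z. The grid, the counts, and the function names are hypothetical, and estimation (which needs adjusted inclusion probabilities) is not covered.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Cell = Tuple[int, int]


def neighbours(cell: Cell) -> List[Cell]:
    """Four edge-sharing neighbours of a grid cell (an assumed definition)."""
    r, c = cell
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def adaptive_cluster_sample(counts: Dict[Cell, int], initial: Iterable[Cell], z: int) -> Set[Cell]:
    """Expand an initial cluster sample: whenever a selected cluster holds more
    than z target respondents, add all its neighbours, and repeat."""
    selected: Set[Cell] = set()
    queue = deque(initial)
    while queue:
        cell = queue.popleft()
        if cell in selected or cell not in counts:
            continue  # already handled, or outside the frame
        selected.add(cell)
        if counts[cell] > z:
            queue.extend(n for n in neighbours(cell) if n in counts and n not in selected)
    return selected


# Hypothetical 4x4 frame with one small pocket of the target population.
counts = {(r, c): 0 for r in range(4) for c in range(4)}
counts[(1, 1)] = 5
counts[(1, 2)] = 4

sample = adaptive_cluster_sample(counts, initial=[(1, 1), (3, 3)], z=2)
print(sorted(sample))
```

In this example the initial cluster (1, 1) triggers expansion, which picks up the adjacent pocket at (1, 2), while the empty cluster (3, 3) stays a single cluster.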

Actually: Jordan and Lebanon are not so different
- Cartography
  - Jordan has more accurate cartography, but the CAS cartography in Lebanon is not bad (though a bit old)
  - Satellite images in Lebanon are better for the internal structure of EAs
- Population counts
  - Jordan: a question remains about the counts, which are probably not very good for migrants
  - Lebanon: well, migration is likely to have messed up the counts for the population of interest
  - (Note that we lose much of the benefit of our 101 description if we are interested in a subgroup)