Published by Justin James. Modified over 5 years ago.
1
Hybrid Estimates for Rare Populations: Probability Surveys Augmented with Targeted Nonprobability Samples
Jill A. Dever, PhD
2019 Joint Statistical Meetings, July 30, Denver, Colorado
2
Rare Populations
Three important characteristics:
- Low prevalence
- Low likelihood of locating
- Low likelihood of participating
Interaction of the 3 is especially problematic; see, e.g., Tourangeau (2014)
<5% of US adults are years of age
No US registry
3
National Marijuana Beliefs and Behaviors Study (NMBS)
In 2016, RTI-funded national study of adults 18+ years
Probability survey (n = 1,867)
- Address-based sample (ABS) with mail/web data collection
- Strata (4) = recreational, medical-liberal, medical-restrictive, other
- Recruitment via mail for single adult household respondent
- 14.5% weighted response rate (RR2)
Nonprobability survey (n = 4,943)
- Adult participants from social media (e.g., Facebook)
- Recruitment via advertisement / referral for web interview
4
NMBS Rare Population (example)
Age (yrs.)      ABS    Social Media    Total (Hybrid)
18 – 20          20             222               242
21 – 24          54             382               436
25 +          1,793           4,339             6,132
Total         1,867           4,943             6,810
5
Hybrid Estimation
6
Two Theories for Estimation with Nonprobability Data
Superpopulation Approach
- Participants predict values for non-participants
- Pro: lower variance because models tailored to each y
- Con: models tailored to each y
Quasi-randomization Approach
- Participants weighted to account for non-participants with pseudo-inclusion probabilities
- Pro/Con: one set of weights for all analyses
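The contrast between the two approaches can be sketched on toy data. This is a minimal illustration, not the models used in the study: the six-unit population, the ratio model y = beta * x, and the pseudo-inclusion probabilities of 0.5 are all hypothetical.

```python
# Hypothetical toy data: a population of six units with a known covariate x.
# y is observed only for the three participants (indices 0, 2, 5).
population_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed_y = {0: 2.1, 2: 6.2, 5: 11.9}


def superpopulation_total(xs, observed):
    """Superpopulation approach: participants predict values for non-participants.

    Assumes a simple ratio model y = beta * x, fit on the participants only.
    """
    beta = sum(observed.values()) / sum(xs[i] for i in observed)
    predicted = sum(beta * x for i, x in enumerate(xs) if i not in observed)
    return sum(observed.values()) + predicted


def quasi_randomization_total(observed, pseudo_probs):
    """Quasi-randomization approach: weight each participant by the inverse
    of an estimated pseudo-inclusion probability."""
    return sum(y / pseudo_probs[i] for i, y in observed.items())


# Assumed pseudo-inclusion probabilities (illustrative only).
pseudo_probs = {0: 0.5, 2: 0.5, 5: 0.5}

model_total = superpopulation_total(population_x, observed_y)
weighted_total = quasi_randomization_total(observed_y, pseudo_probs)
print(round(model_total, 2), round(weighted_total, 2))
```

The model-based total needs a new model for each y, whereas the weights in the second function can be reused for any y, which is the trade-off the slide lists.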
7
Pseudo-inclusion probabilities with a reference survey
Sample matching
- Assigned weight from reference survey match
- Definition of closeness required
Inverse estimated propensity score
- Direct estimation
- Average propensity within subclass
Calibration
- Alone
- In combination with other methods
With or without quota sampling
Commonality assumptions
Inconsistent results for all methods
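As a rough sketch of the "average propensity within subclass" idea, the function below treats the estimated participation propensity in each subclass as the nonprobability count divided by the population size estimated from the reference survey, then inverts it. The cell labels, weights, and function name are hypothetical, and a real application would typically form subclasses from a fitted propensity model rather than from fixed cells.

```python
from collections import defaultdict


def subclass_pseudo_weights(nonprob_cells, reference):
    """Return one pseudo-weight per nonprobability respondent.

    nonprob_cells: subclass label for each nonprobability respondent.
    reference: (subclass label, survey weight) pairs from the reference survey.
    """
    # Estimated population size per subclass from the reference survey weights.
    pop_est = defaultdict(float)
    for cell, weight in reference:
        pop_est[cell] += weight

    # Number of nonprobability respondents per subclass.
    counts = defaultdict(int)
    for cell in nonprob_cells:
        counts[cell] += 1

    weights = []
    for cell in nonprob_cells:
        if pop_est[cell] <= 0:
            # No reference cases in this subclass: the samples do not overlap here.
            raise ValueError(f"no reference-survey support for subclass {cell!r}")
        # Estimated propensity = count / estimated population; weight is its inverse.
        weights.append(pop_est[cell] / counts[cell])
    return weights


weights = subclass_pseudo_weights(
    ["a", "a", "b"],
    [("a", 100.0), ("a", 50.0), ("b", 30.0)],
)
print(weights)
```

The pseudo-weights sum to the reference survey's estimated population total, and the error raised for an unsupported subclass is a simple signal that the two samples do not cover the same units.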
8
Evaluate Common Support Assumption
[Figure: overlap between the probability sample and the nonprobability sample; labels: 100% Valid, Partially Valid]
9
Evaluate Common Support Assumption
No standard methodology:
- “Hope for the best” method
- Demographic comparisons
- Compare variables of interest
- MatchIt in R (Ho et al. 2007)
- Evaluate overlap of response propensities
- Model-based evaluations (e.g., imputation)
Inconsistent results
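One simple way to quantify the overlap of response propensities is the share of one sample's estimated propensities that fall inside the range observed in the other sample. The sketch below assumes that range-based definition, which the slides do not specify, and uses hypothetical propensity values.

```python
def overlap_share(target, comparison):
    """Share of `target` propensities inside the [min, max] range of `comparison`."""
    if not target or not comparison:
        raise ValueError("both samples need at least one estimated propensity")
    low, high = min(comparison), max(comparison)
    return sum(low <= p <= high for p in target) / len(target)


# Hypothetical estimated propensities for each sample.
nonprob_propensities = [0.10, 0.20, 0.50, 0.90]
prob_propensities = [0.15, 0.30, 0.60]

share = overlap_share(nonprob_propensities, prob_propensities)
print(f"{share:.0%} of nonprobability cases fall in the probability sample's range")
```

A range check is sensitive to single extreme values, so trimmed ranges or density comparisons are common alternatives.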
10
NMBS Common Support Evaluation
Demographics:
- ABS = Male; BS degree +; private insurance
- SM = Hispanic; smoked/vaped in last 30 days
Overlap in response propensities:
- Complete overlap overall
- ~93% overlap for U.S. adults, years
11
Hybrid Estimation 3 methods
12
Hybrid Weights – Three Methods
(1) Composite estimation
“A” = probability survey
“B” = nonprobability survey
$\hat{t}_y = \lambda \hat{t}_{Ay} + (1 - \lambda) \hat{t}_{By}$, with $\lambda < 1$
+ uncommon support components
Hybrid weight (λ) options:
- Relative sample size
- Unequal weighting effect (UWE)
- Mean square error
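The composite estimator and two of the λ options can be sketched as follows. The formulas for λ are assumptions (λ proportional to nominal sample size, or to effective sample size as a stand-in for the UWE option); the slides do not give them. The sizes 20 and 222, and the rounded effective sizes 12 and 85, are the values the deck reports for ages 18 to 20; the two input estimates are hypothetical, and the uncommon support components are left out, so this is not expected to reproduce the deck's results.

```python
def lambda_from_sizes(size_a, size_b):
    """Weight on sample A proportional to its (nominal or effective) size."""
    if size_a <= 0 or size_b <= 0:
        raise ValueError("sizes must be positive")
    return size_a / (size_a + size_b)


def composite_estimate(t_a, t_b, lam):
    """Composite of the probability (A) and nonprobability (B) estimates."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return lam * t_a + (1.0 - lam) * t_b


lam_n = lambda_from_sizes(20, 222)    # relative sample size
lam_eff = lambda_from_sizes(12, 85)   # relative effective sample size

# Hypothetical estimates from A and B, used only to show the mechanics.
t_a, t_b = 40.0, 30.0
print(round(composite_estimate(t_a, t_b, lam_n), 2))
print(round(composite_estimate(t_a, t_b, lam_eff), 2))
```

Because the probability sample loses less to unequal weighting, the effective-size option gives it more influence than the nominal-size option does.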
13
Hybrid Weights – Three Methods (continued)
(2) Robbins et al. (2018)
- Probability of being in combined sample A ∪ B
- Bayes method
- Adjust input weights by a function of the propensities
- Nonprobability input weight = simple random sample from the population
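One reading of "probability of being in combined sample A ∪ B" is the standard union rule under the assumption that selection into A and selection into B are independent. The sketch below shows only that rule and the resulting inverse-probability weight; it is an illustration with hypothetical inputs, not necessarily the formulation used by Robbins et al.

```python
def union_probability(p_a, p_b):
    """P(unit in A or B), assuming independent selection into A and B."""
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_a + p_b - p_a * p_b


def blended_weight(p_a, p_b):
    """Inverse of the union probability, used as a weight in the combined sample."""
    p_union = union_probability(p_a, p_b)
    if p_union == 0.0:
        raise ValueError("unit has zero chance of entering either sample")
    return 1.0 / p_union


# Hypothetical inclusion probability (A) and estimated propensity (B).
print(round(blended_weight(0.001, 0.004), 1))
```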
14
Hybrid Weights – Three Methods (continued)
(3) Kott (2019)
WTADJX in SUDAAN®
Selection model (simultaneously):
- Calibration to population controls
- Response propensity model
- Zero controls to force alignment of A and B estimates
15
NMBS Point Estimates for U.S. Adults, 18-20 Years
Methodology                        Ever Used    Support Medical Use
NSDUH ( , average)*                     37.5                     na
ABS only (n = 20)                       39.6                   74.1
Difference from ABS:
  Social Media only (n = 222)           -9.1                    8.5
  (1) Composite, UWE                    -4.9                    6.6
  (2) Robbins et al. (2018)             -7.9                   11.3
  (3) Kott (2019)                       17.9                    3.5
* Glasheen et al. (2017)
16
NMBS Effective Sample Size for U.S. Adults, 18-20 Years
Methodology                   Sample size (n)    Effective n    Percent of n
ABS only                                   20             12            59.1
Social Media only                         222             85            38.3
(1) Composite, UWE                        242             78            32.1
(2) Robbins et al. (2018)                 242             88            36.4
(3) Kott (2019)                           242             86            35.7
Introduction of the social media sample increased the sample size 12-fold. Effective n is increased 6.6-fold.
“Generic” effective n = sample size / unequal weighting effect
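The "generic" effective sample size can be sketched with the common Kish definition of the unequal weighting effect, UWE = n Σw² / (Σw)². That definition is an assumption here, since the slide does not state which form was used, and the example weights are hypothetical. The last lines redo the fold-increase arithmetic from the table, recovering unrounded effective sizes from the "Percent of n" column.

```python
def unequal_weighting_effect(weights):
    """Kish unequal weighting effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_n(weights):
    """Generic effective n = sample size / unequal weighting effect."""
    return len(weights) / unequal_weighting_effect(weights)


# Hypothetical weights: equal weights lose nothing, unequal weights do.
print(effective_n([1.0, 1.0, 1.0, 1.0]))  # 4.0
print(effective_n([1.0, 3.0]))            # 1.6

# Fold increases implied by the table (ABS only vs. composite, UWE).
abs_n, hybrid_n = 20, 242
abs_eff = abs_n * 0.591        # 59.1% of n
hybrid_eff = hybrid_n * 0.321  # 32.1% of n
print(round(hybrid_n / abs_n, 1), round(hybrid_eff / abs_eff, 1))  # 12.1 6.6
```

Dividing the rounded effective sizes (78 / 12) gives 6.5, so the 6.6-fold figure on the slide is consistent with unrounded values.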
17
Summary and Next Steps
The story so far …
- Nonprobability samples may provide underrepresented sample units
- Need to evaluate multiple methods for hybrid estimation
Next steps
- Common support assumption
- Simulation study for further evaluation
- Revisit construction of propensity scores
- Superpopulation approach
18
Select References
Dever, J. A. (2019). Discussion of ‘How Errors Cumulate: Two Examples’ by Roger Tourangeau. Journal of Survey Statistics and Methodology, in print.
Dever, J. A. (2018). Combining probability and nonprobability samples to form efficient hybrid estimates: an evaluation of the common support assumption. Proceedings of the Federal Committee on Statistical Methodology Conference, Washington, DC.
Glasheen, C., Forman-Hoffman, V. L., & Williams, J. (2017). Residential Mobility, Transience, Depression, and Marijuana Use Initiation Among Adolescents and Young Adults. Substance Abuse: Research and Treatment, 11.
Ho, D., Imai, K., King, G., & Stuart, E. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15: 199–236.
Kott, P. S. (2019). A Partially Successful Attempt to Integrate a Web-Recruited Cohort into an Address-Based Sample. Survey Research Methods, 13: 95–101.
Robbins, M. W., Ghosh-Dastidar, B., & Ramchand, R. (2017). Blending of Probability and Convenience Samples as Applied to a Survey of Military Caregivers. Presented at the 2017 Joint Statistical Meetings in Baltimore, Maryland, /meetings/jsm/2017/onlineprogram/AbstractDetails.cfm?abstractid=
Tourangeau, R. (2014). Defining hard-to-survey populations. Hard-to-Survey Populations, Chapter 1 (Tourangeau, R., Edwards, B., Johnson, T. P., Wolter, K. M., & Bates, N., eds.). United Kingdom: Cambridge University Press.
Valliant, R., & Dever, J. A. (2018). Survey weights: A step-by-step guide to calculation (First ed.). College Station, TX: Stata Press.
19
Hybrid Estimates for Rare Populations: Probability Surveys Augmented with Targeted Nonprobability Samples
Joint Statistical Meetings, July 30, Denver, Colorado
Jill A. Dever, PhD
Senior Research Statistician
RTI International
Washington, DC