NIMH Collaborative Psychiatric Epidemiology Surveys

Presentation transcript:

NIMH Collaborative Psychiatric Epidemiology Surveys
CPES Sample Designs and Analysis Weights
Steve Heeringa

Sample Design and Weighting Documentation
www.icpsr.umich.edu/CPES
Table of contents: Sample design, Weighting. Detailed description of sample design and weighting procedures.
Also, Heeringa et al. (2004). “Sample Designs and Sampling Methods for the Collaborative Psychiatric Epidemiology Studies (CPES).” International Journal of Methods in Psychiatric Research, 13(4), 221-239.
4/16/2017 NIMH Collaborative Psychiatric Epidemiology Surveys

CPES Estimation and Inference
CPES is based on the integration of three nationally representative multi-stage area probability samples.
Estimation will require weighted analysis to compensate for disproportionate sampling.
Design-based inference will require the use of variance estimation procedures for complex sample designs.

Complex Sample Design
Probability sample design:
Each population element has a known, non-zero selection probability.
Properly weighted, sample estimates are unbiased or nearly unbiased for the corresponding population statistic.
The variance of sample statistics can be estimated from the sample data (measurability).

Complex Sample Design
“Complex sample”: a probability sample developed using sampling procedures such as stratification, clustering and weighting, designed to improve statistical efficiency, reduce costs or improve precision for subgroup analyses relative to simple random sampling (SRS).
Unbiased estimates with measurable sampling error are still possible.
Independence of observations and equal probabilities of selection may no longer hold.

Where are Complex Sample Designs Used?
Complex sample designs are the rule and not the exception in sample-based studies in the Social Sciences, Epidemiology, Public Health, Agriculture, Natural Resources and many other scientific fields.
Multi-stage area probability sample designs for household surveys: NCS-R, NSAL, NLAAS, NHIS, NHANES, CPS, etc.

Multi-Stage Sample of U.S. Households

Collaborative Psychiatric Epidemiology Surveys (CPES)
Sampled persons are interviewed about mental health and mental health-related matters.
Separate estimates are required for domains defined by age-sex groups and by race/ethnicity (African Americans, Latinos, Asians and All Others), which is why some domains are oversampled.

CPES SAMPLE DESIGN (Primary Stage)
The primary sampling units (PSUs) are mostly single counties or metropolitan statistical areas (MSAs).
Stratify PSUs by region and MSA status.
Select PSUs by probability proportionate to estimated size (PPES) sampling (a few are selected with certainty).
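
To illustrate how PPES selection interacts with certainty units, the sketch below computes approximate inclusion probabilities for a handful of PSUs. The size measures, the sample size `n` and the function name are hypothetical, and the rule used here (a unit is taken with certainty once its proportional share would reach 1) is a common textbook convention, not necessarily the exact CPES procedure.

```python
def pps_inclusion_probabilities(sizes, n):
    """Approximate inclusion probabilities for selecting n units with
    probability proportionate to size, treating very large units as
    certainty selections (probability 1)."""
    if n <= 0 or n > len(sizes):
        raise ValueError("n must be between 1 and the number of units")
    if any(s <= 0 for s in sizes):
        raise ValueError("size measures must be positive")

    probs = [0.0] * len(sizes)
    remaining = list(range(len(sizes)))
    n_left = n
    while remaining and n_left > 0:
        total = sum(sizes[i] for i in remaining)
        # Units whose proportional share would be >= 1 become certainty units.
        certain = [i for i in remaining if n_left * sizes[i] >= total]
        if not certain:
            for i in remaining:
                probs[i] = n_left * sizes[i] / total
            break
        for i in certain:
            probs[i] = 1.0
        remaining = [i for i in remaining if i not in certain]
        n_left -= len(certain)
    return probs


# Hypothetical size measures (e.g. estimated household counts) for four PSUs.
example_sizes = [600, 200, 100, 100]
example_probs = pps_inclusion_probabilities(example_sizes, n=2)
print(example_probs)
```

In this toy case the first PSU is large enough to be a certainty selection, and the remaining selection is spread over the other three in proportion to size.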

CPES SAMPLE DESIGN (Second Stage)
The secondary sampling units (SSUs) are area segments.
Area segments are Census blocks or combinations of Census blocks with a minimum number of households, e.g. 50 for NCS-R.
Area segments are selected with PPES, typically 6-20 area segments per PSU.

CPES SAMPLE DESIGN (Third Stage)
Before selection can occur, field staff list the housing units in the selected area segments.
For each segment, a subsample of housing units is selected from the prepared list.

CPES SAMPLE DESIGN (Fourth Stage)
Conduct a screening interview to classify persons by domain.
Subsample persons to give the required sampling rates by domain. Survey weights will reflect this subsampling step.
Conduct interviews with sample persons (SPs).

Basic Weighting Approach
Suppose sample element i was selected with probability π_i.
Then sample element i represents 1/π_i elements in the population.
That is, count element i in the analysis by giving it a weight of w_i = 1/π_i.
For example, a sample element selected with probability 1/10 represents 10 elements in the population.
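
The inverse-probability rule can be sketched in a few lines of Python. The function name and the example probabilities are illustrative only.

```python
def base_weight(selection_probability: float) -> float:
    """Return the base weight w_i = 1 / pi_i for one sample element."""
    if not 0.0 < selection_probability <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


# Illustrative selection probabilities for three sample elements.
example_probabilities = [0.1, 0.25, 0.5]
example_weights = [base_weight(p) for p in example_probabilities]
print(example_weights)  # each value is the number of population elements represented
```

The sum of the base weights estimates the size of the population covered by the sample.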

Weighting
Weighting is used:
To compensate for unequal probabilities of selection.
To compensate for nonresponse (typically unit nonresponse, where a sampled unit fails to respond).
In poststratification, to adjust weighted sample distributions for certain variables (e.g., age and sex) so that they conform to the known population distribution.

Overall Weight
CPES weights simultaneously incorporate all three components: unequal probabilities of selection, nonresponse, and poststratification.
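
The formula on this slide is not preserved in the transcript. A common way to combine the three components, assumed here rather than taken from the slide, is as a product of adjustment factors. The subscripts sel, nr and ps are labels chosen for this sketch.

```latex
w_i = w_{i,\mathrm{sel}} \times w_{i,\mathrm{nr}} \times w_{i,\mathrm{ps}},
\qquad w_{i,\mathrm{sel}} = \frac{1}{\pi_i}
```

Here w_{i,sel} is the inverse of the selection probability, w_{i,nr} is a nonresponse adjustment factor, and w_{i,ps} is a poststratification factor.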

Weighted Estimation
In probability samples in which the sample elements have been selected with different probabilities and each case has been assigned a weight w_i = 1/π_i, the unbiased (or nearly unbiased) estimator of the population mean is the weighted mean: $\bar{y}_w = \sum_i w_i y_i / \sum_i w_i$

Weighted Estimation
The simple unweighted sample mean is a special case of the weighted estimator of the population mean with w_i = 1 for all cases: $\bar{y} = \sum_i y_i / n$
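
A minimal sketch of the weighted mean, with the unweighted mean recovered by setting every weight to 1. The data values and weights are made up for illustration.

```python
def weighted_mean(y, w):
    """Weighted estimator of the population mean: sum(w_i * y_i) / sum(w_i)."""
    if len(y) != len(w) or len(y) == 0:
        raise ValueError("y and w must be non-empty and of equal length")
    total_weight = sum(w)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(wi * yi for wi, yi in zip(w, y)) / total_weight


# Illustrative 0/1 outcome (e.g. disorder present) and base weights.
example_y = [1, 0, 0, 1]
example_w = [10.0, 4.0, 4.0, 2.0]

weighted_estimate = weighted_mean(example_y, example_w)
unweighted_estimate = weighted_mean(example_y, [1.0] * len(example_y))
print(weighted_estimate, unweighted_estimate)
```

The two estimates differ whenever the outcome is related to the weights, which is why weighted and unweighted prevalence can diverge in disproportionate samples.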

Weighted Estimates of Population Statistics

Weighted Estimation

Effect of Weights on Variances of Estimates of Means
L_W = relative increase in the sampling variance of estimates due to weighting.
Generally, when subgroups are oversampled, the weighting loss within a subgroup is smaller than for the full sample: L_W,SUB < L_W.
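
The slide's formula for L_W is not preserved in the transcript. A widely used approximation, attributed to Kish and assumed here rather than confirmed by the slide, sets the loss equal to the squared coefficient of variation of the weights, L = n Σw² / (Σw)² − 1. The sketch below computes it for made-up weights.

```python
def weighting_loss(w):
    """Kish-style approximation of the relative variance increase due to
    unequal weights: n * sum(w^2) / (sum(w))^2 - 1, i.e. cv^2 of the weights."""
    n = len(w)
    if n == 0:
        raise ValueError("weights must be non-empty")
    sum_w = sum(w)
    sum_w2 = sum(x * x for x in w)
    return n * sum_w2 / (sum_w ** 2) - 1.0


equal_weights_loss = weighting_loss([2.0, 2.0, 2.0, 2.0])   # no variation in weights
unequal_weights_loss = weighting_loss([1.0, 3.0])            # cv^2 = 0.25
print(equal_weights_loss, unequal_weights_loss)
```

Under this approximation, equal weights give no loss, and the loss grows with the spread of the weights, not with their overall scale.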

Example of Weighting Loss in Disproportionate Sampling

Stratum  % Hispanic Population  Oversampling Rate  Weight  % of Hispanic Sample
1        19.2%                  1:1                4       7%
2        22.8%                  2:1
3        24.1%                  3:1                1.33    26%
4        33.9%                  4:1                        50%
Total    100%

Example of Weighting Loss in Disproportionate Sampling

Scaling of Weights
Analyses based on differently scaled versions of the same weights will yield the same results (provided the software program treats the weights correctly).

Normalizing Weights
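
The normalization formula is not preserved in the transcript. A common convention, assumed here, rescales the weights so that they sum to the sample size n and therefore have mean 1. The sketch below applies that rescaling to made-up weights and checks that the weighted mean is unchanged.

```python
def normalize_weights(w):
    """Rescale weights so that they sum to the sample size (mean weight of 1)."""
    n = len(w)
    total = sum(w)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")
    return [wi * n / total for wi in w]


def weighted_mean(y, w):
    """Weighted mean: sum(w_i * y_i) / sum(w_i)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


# Illustrative population-scale weights and outcome values.
raw_weights = [1000.0, 3000.0, 4000.0]
values = [2.0, 4.0, 6.0]

normalized = normalize_weights(raw_weights)
mean_raw = weighted_mean(values, raw_weights)
mean_normalized = weighted_mean(values, normalized)
print(normalized, mean_raw, mean_normalized)
```

Point estimates such as means are invariant to this rescaling. Estimated totals are not, so population-scale weights are needed when totals are of interest.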

Example: Distribution of NCS Part 2 Weights (Normalized)

Moments
N                5877          Sum Weights
Mean             1.00000007    Sum Observations  5877.00039
Std Deviation    1.16010775    Variance          1.34584999
Skewness         2.8311973     Kurtosis          7.9882
Uncorrected SS   13785.2153    Corrected SS      7908.21452
Coeff Variation  116.010767    Std Error Mean    0.01513284

CPES Weighting Approach
Separately, NCS-R, NSAL and NLAAS cases are assigned to 1 of 12 distinct race/ancestry groups:
Vietnamese, Filipino, Chinese, All other Asian, Cuban, Puerto Rican, Mexican, All other Hispanic, Afro-Caribbean, African-American, White, All Other.
Persons with multiple ancestry are assigned to the first applicable discrete category in this list.
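
The "first applicable category" rule can be sketched as an ordered lookup. The category order follows the list on the slide. How ancestries are actually coded in the CPES files is not described here, so the input format (a list of reported category labels) and the function name are assumptions for illustration.

```python
# Category order as listed on the slide; earlier entries take priority.
CATEGORY_ORDER = [
    "Vietnamese", "Filipino", "Chinese", "All other Asian",
    "Cuban", "Puerto Rican", "Mexican", "All other Hispanic",
    "Afro-Caribbean", "African-American", "White", "All Other",
]


def assign_race_ancestry_group(reported):
    """Return the first category in CATEGORY_ORDER that the person reported.

    `reported` is a hypothetical list of category labels for one respondent.
    Respondents with no listed category fall into "All Other".
    """
    reported_set = set(reported)
    for category in CATEGORY_ORDER:
        if category in reported_set:
            return category
    return "All Other"


print(assign_race_ancestry_group(["White", "Cuban"]))
print(assign_race_ancestry_group(["Mexican", "Chinese"]))
```

Because the rule is order dependent, each person lands in exactly one of the 12 groups even when several categories apply.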

CPES Weighting Approach
Separately, NCS-R, NSAL and NLAAS cases are assigned to 1 of 11 distinct geographic domains defined based on the race/ancestry composition of the Census tract in which they resided at the time of interview.
Hawaii is only represented in NLAAS.

CPES Weighting Approach
The March 2002 Current Population Survey (CPS) was used to estimate the total size of the adult population in each race/ancestry stratum.
Within each race/ancestry stratum, the distribution of the population across the 11 geographic domains was estimated from the CPES component study that could provide the most precise estimates, e.g. NSAL for African Americans.

CPES Weighting Approach
Separately, NCS-R, NSAL and NLAAS study-specific weights (see chart) were post-stratified to the 12 x 11 grid of population totals for race/ancestry by geographic domain.
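
Post-stratification to a grid of control totals can be sketched as a per-cell ratio adjustment: within each cell, every weight is multiplied by the control total divided by the current sum of weights in that cell. The cell labels, weights and control totals below are invented for illustration and are not CPES values.

```python
from collections import defaultdict


def poststratify(weights, cells, control_totals):
    """Scale weights so that, within each cell, they sum to the control total."""
    cell_sums = defaultdict(float)
    for w, cell in zip(weights, cells):
        cell_sums[cell] += w
    return [w * control_totals[cell] / cell_sums[cell]
            for w, cell in zip(weights, cells)]


# Hypothetical cells keyed by (race/ancestry group, geographic domain).
input_weights = [1.0, 3.0, 2.0]
input_cells = [("White", "D1"), ("White", "D1"), ("Mexican", "D2")]
controls = {("White", "D1"): 400.0, ("Mexican", "D2"): 100.0}

adjusted = poststratify(input_weights, input_cells, controls)
print(adjusted)
```

After the adjustment, relative weights within a cell are unchanged, and only the cell totals move to match the controls.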

CPES Weighting Approach
CPES weight values for the pooled data were constructed by scaling each individual study weight by a factor equal to the proportion of the total sample in a grid cell that originated with the component study.
The result is a CPES weight that sums to the CPS 2002 post-stratification controls in the 12 x 11 grid.
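
The pooling step described above can be sketched as follows, assuming each study's weights have already been post-stratified so that they sum to the control total within each cell. The record layout, cell label and numbers are hypothetical.

```python
from collections import Counter


def pooled_weights(records):
    """Scale each study weight by the share of the cell's pooled sample
    that comes from that study.

    `records` is a list of (study, cell, study_weight) tuples.
    """
    n_in_cell = Counter(cell for _, cell, _ in records)
    n_in_study_cell = Counter((study, cell) for study, cell, _ in records)
    return [w * n_in_study_cell[(study, cell)] / n_in_cell[cell]
            for study, cell, w in records]


# One hypothetical cell with a control total of 1200:
# one NCS-R case and three NSAL cases, each study's weights summing to 1200.
example_records = [
    ("NCS-R", "cell_A", 1200.0),
    ("NSAL", "cell_A", 400.0),
    ("NSAL", "cell_A", 400.0),
    ("NSAL", "cell_A", 400.0),
]
example_pooled = pooled_weights(example_records)
print(example_pooled, sum(example_pooled))
```

Because the scaling factors across studies sum to 1 within a cell, the pooled weights again sum to the cell's control total.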

CPES Weighting Approach
Long and short form weight variables: NCS-R uses a two-phase sampling approach in which only a subsample of cases progresses from Part 1 to Part 2. Therefore, there are two CPES weights:
CPESWTLG: used when one or more variables of interest are measured only in NCS-R Part 2.
CPESWTSH: used when all variables of interest are measured in NCS-R Part 1.
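
The choice between the two weights can be expressed as a small helper. The weight names CPESWTLG and CPESWTSH come from the slide. The analysis variable names and the set of Part 2-only variables below are hypothetical placeholders, since the transcript does not list which variables belong to which part.

```python
def choose_cpes_weight(analysis_variables, part2_only_variables):
    """Return the CPES weight variable name for a pooled analysis.

    Use the long-form weight if any analysis variable is measured only in
    NCS-R Part 2; otherwise use the short-form weight.
    """
    if set(analysis_variables) & set(part2_only_variables):
        return "CPESWTLG"
    return "CPESWTSH"


# Hypothetical variable names, for illustration only.
part2_only = {"ptsd_dx"}
print(choose_cpes_weight(["age", "mdd_dx"], part2_only))
print(choose_cpes_weight(["age", "ptsd_dx"], part2_only))
```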

CPES Weighting Approach
Weights for pooled analysis of data from only two of the three CPES data sets:
NCNLWTLG, NCNLWTSH: NLAAS and NCS-R
NCNSWTLG, NCNSWTSH: NSAL and NCS-R
NSNLWT: NLAAS and NSAL
Derived as CPES weights, but with the final scaling step limited to the proportion of the sample in each cell from the two studies.

Table 1: Descriptive Statistics for CPES Study Weights

Sample             Weight    Mean     S.E.    CV
NCSR        Short  NCSRWTSH  1.00     0.0054  52.28
            Long   NCSRWTLG           0.0127  95.82
NLAAS       ---    NLAASWGT  6333.6   86.77   93.42
NSAL               NSALWTPN  8022.2   165.84  161.22
CPES               CPESWTSH  10468.2  75.25   101.69
                   CPESWTLG  12693.4  152.03  153.49
NCSR-NLAAS         ncnlwtsh  15061.7  100.10  78.44
                   ncnlwtlg  20290.6  261.12  130.87
NCSR-NSAL          ncnswtsh  13588.6  106.91  97.52
                   ncnswtlg  17731.9  248.46  152.04
NLAAS-NSAL         nsnlwt    19553.1  608.91  322.59

Table 2: Un-weighted and Weighted Prevalence of Major Depressive Disorder

Group               Sample  Sample size  Un-weighted Prevalence % (s.e.)  Weighted Prevalence % (s.e.)
White-Non-Hispanic  NCSRa   6696         17.95 (0.47)                     17.75 (0.55)
                    CPESa   7567b        18.00 (0.44)                     17.92 (0.54)
Black                       1176         11.99 (0.95)                     11.29 (1.04)
                    NSAL    3433         10.63 (0.53)                     10.31 (0.58)
                            4609b        10.98 (0.46)                     10.39 (0.47)
Hispanic            NLAAS   2554         15.66 (0.72)                     13.79 (0.68)
                            3615         15.60 (0.60)                     13.71 (0.63)

a = short sample, b = excluding missing cases, s.e. = standard error

Table 3: Un-weighted and Weighted Prevalence of Generalized Anxiety Disorder

Group               Sample  Sample size  Un-weighted Prevalence % (s.e.)  Weighted Prevalence % (s.e.)
White-Non-Hispanic  NCSRa   6696         6.45 (0.30)                      6.16 (0.31)
                    CPESa   7566b        6.25 (0.28)                      6.08 (0.32)
Black                       1176         4.34 (0.59)                      4.14 (0.74)
                    NSAL    3430         3.53 (0.31)                      3.26 (0.36)
                            4606b        3.73 (0.28)                      3.48 (0.34)
Hispanic            NLAAS   2554         3.48 (0.36)                      2.82 (0.40)
                            3615         3.76 (0.32)                      3.03 (0.34)

a = short sample, b = excluding missing cases, s.e. = standard error

Table 4: Un-weighted and Weighted Prevalence of Post Traumatic Stress Disorder

Group               Sample  Sample size  Un-weighted Prevalence % (s.e.)  Weighted Prevalence % (s.e.)
White-Non-Hispanic  NCSRa   4180         10.26 (0.47)                     6.84 (0.54)
                    CPESb   4180b                                         6.92 (0.53)
Black                       679          12.37 (1.26)                     7.4 (0.86)
                    NSAL    3419c        9.15 (0.49)                      9.10 (0.54)
                    CPESa   4098c        9.69 (0.46)                      8.76 (0.49)
Hispanic            NLAAS   2554         5.29 (0.44)                      4.42 (0.45)
                            3259c        6.17 (0.42)                      4.74 (0.39)

a = long sample, b = only assessed in NCS-R, c = excluding missing cases, s.e. = standard error