SUB GROUP BIAS IN PUBLIC HEALTH RESEARCH Applying survey over coverage methodology to health disparities research Naomi Zewde, MPH and Rhonda Belue, PhD
Presenter Disclosures The following personal financial relationships with commercial interests relevant to this presentation existed during the past 12 months: No relevant relationships to disclose
Increasing immigration challenges racial classification, e.g., classifying Hispanic ethnicity when the individual is also either black or white. The ACA charged the DHHS with revising standards of race/ethnicity data collection (1): “While data alone will not reduce disparities, it can be foundational in our efforts to understand the causes, design effective responses and evaluate our progress.”
Obesity Rates among Hispanic and Non-Hispanic White American Adults, 2012
Sub-group bias. Documented health outcome variation by (2-4): socio-economic status; country of origin; immigrant status; combined effects. For example, African immigrants report better health than US-born whites, while US-, West Indian- and European-born blacks do not.
Obesity Rates among Hispanic and Non-Hispanic White American Adults, 2012
Sub-group inefficiency. Large sample sizes enable precise estimates. Racial groups are necessarily larger than their sub-groups: there are necessarily more Black Americans than there are middle-income Black Americans or African-born Black Americans.
Precision-bias tradeoff. Sub-group analysis is statistically inefficient if the original results are unbiased, that is, if there is no practical difference in outcomes, or if the sub-group is a majority and thus drives the results. Our paper suggests a method of quantifying the tradeoff between precision and bias.
Objectives. This paper draws a conceptual and methodological parallel between survey over coverage bias and sub-group bias in health disparities research to: 1. Demonstrate a method of quantifying sub-group bias. 2. Demonstrate a method of identifying the relative statistical efficiency of using sub-group data.
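Over coverage bias: sampled persons are not part of the target population. [Diagram: one-to-one correspondence, where each frame element (F) maps to a target element (T), contrasted with over coverage, where some frame elements (F) have no corresponding target element (T).]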
Sub-group : over coverage parallel. Sub-group members serve as the target population (example: Puerto Rican Americans). Non-sub-group members are overrepresented in the data (example: non-Puerto Rican Hispanic Americans).
Sub-group : over coverage parallel. Over coverage and sub-group bias each occur when unintended observations contribute to sample statistics. Survey methodology identifies two drivers of over coverage bias (5): the difference in outcome between foreign and targeted units, and the proportion of foreign vs. targeted elements.
Applied example: obesity prevalence among Hispanic Americans. Obesity is a growing public health concern, and its risk factors are correlated with cultural variation. Data source: 2012 Medical Expenditure Panel Survey (MEPS), which identifies Hispanic ethnicity across six countries of origin.
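Mean bias (5): $\bar{y} - \bar{Y}_T = \frac{N_F}{N}\left(\bar{Y}_F - \bar{Y}_T\right)$, where $\bar{y}$ is the full sample mean, $N_F$ is the number of foreign elements, $N$ is the full sample size, $\bar{Y}_F$ is the mean of foreign elements, and $\bar{Y}_T$ is the mean of the target population.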
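The mean-bias expression can be checked numerically with a minimal sketch. The function name `overcoverage_bias`, the sample sizes, and the obesity rates below are hypothetical values chosen only for illustration; they are not the MEPS 2012 estimates. Relative bias is assumed here to mean bias divided by the target-population mean, which the slides do not define explicitly.

```python
def overcoverage_bias(n_foreign, n_total, mean_foreign, mean_target):
    """Bias of the full-sample mean relative to the target-population mean.

    Implements (N_F / N) * (mean of foreign elements - mean of target).
    """
    return (n_foreign / n_total) * (mean_foreign - mean_target)


# Hypothetical illustration only (NOT MEPS 2012 values):
# a target sub-group of 400 adults inside a pooled sample of 2,000.
n_target, n_foreign = 400, 1600
mean_target, mean_foreign = 0.40, 0.30  # assumed obesity rates

n_total = n_target + n_foreign
# Pooled (full-sample) mean is the size-weighted average of the two groups.
full_mean = (n_target * mean_target + n_foreign * mean_foreign) / n_total

bias = overcoverage_bias(n_foreign, n_total, mean_foreign, mean_target)
# Assumed definition: bias expressed as a share of the target mean.
relative_bias = bias / mean_target

# The formula should agree with the direct difference of means.
assert abs(bias - (full_mean - mean_target)) < 1e-12

print(f"full-sample mean: {full_mean:.3f}")
print(f"bias: {bias:.3f}, relative bias: {relative_bias:.1%}")
```

In this made-up case the pooled rate understates the sub-group rate because the foreign elements are both numerous and lower on the outcome, which are the two drivers named in the survey methodology literature.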
Mean bias in Hispanic ethnic sub-groups
Columns: Ethnic Group; Sample Size; Obesity Rate; Obesity Ratio to N.H. White; Bias*; Relative Bias
Hispanic: 7,
N.H. White: 11,
Central/S. American: 1,
Dominican
Puerto Rican
Other Hisp/Latino
Cuban
Mexican: 4,
Reported obesity statistics from MEPS 2012 represent non-institutionalized American adults.
* Demonstration of bias from using an obesity statistic calculated on Hispanic ethnicity (Szameitat and Schaffer, 1963)
Statistical efficiency. Efficiency can be measured by relative mean squared error, which rewards sample size and penalizes unexplained variation and bias. The relative efficiency of sub-group analysis is ambiguous a priori.
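Relative efficiency. Relative mean squared error: $\text{Relative MSE} = \mathrm{MSE}_{\text{sub-group}} / \mathrm{MSE}_{\text{all Hispanic}}$, the ratio of the sub-group MSE to the full sampling frame MSE, where $\mathrm{MSE} = \text{variance} + \text{bias}^2$.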
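A minimal sketch of the relative MSE comparison follows, under stated assumptions: the variance of each obesity rate is approximated by the simple-random-sampling formula p(1 - p)/n, the sub-group estimate is treated as unbiased for its own population, and the full-sample bias is taken as the difference between the pooled rate and the sub-group rate. MEPS has a complex sample design, so a real analysis would use design-based variances. The function names and all numbers are hypothetical.

```python
def mse_of_proportion(p, n, bias=0.0):
    """MSE = variance + bias^2, using the SRS variance p(1 - p) / n (assumption)."""
    return p * (1.0 - p) / n + bias ** 2


def relative_mse(p_sub, n_sub, p_full, n_full):
    """Ratio of sub-group MSE to full-sample MSE.

    Values below 1 favor the sub-group estimate; values above 1 favor
    the pooled estimate despite its bias.
    """
    bias_full = p_full - p_sub  # pooled estimate's bias for the sub-group
    mse_sub = mse_of_proportion(p_sub, n_sub)  # assumed unbiased
    mse_full = mse_of_proportion(p_full, n_full, bias_full)
    return mse_sub / mse_full


# Hypothetical illustration only (NOT MEPS 2012 values).
large_gap = relative_mse(p_sub=0.40, n_sub=400, p_full=0.32, n_full=2000)
small_gap = relative_mse(p_sub=0.40, n_sub=400, p_full=0.39, n_full=2000)

print(f"relative MSE, large outcome gap: {large_gap:.2f}")
print(f"relative MSE, small outcome gap: {small_gap:.2f}")
```

With a large outcome gap the squared bias dominates and the smaller sub-group sample is still the more efficient choice; with a small gap the larger pooled sample wins. This is one way to see why relative efficiency cannot be settled without data.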
Relative efficiency of Hispanic ethnic sub-groups
Columns: Ethnic Group; Sample Size; Relative Bias; MSE; Relative MSE*
Hispanic: 7, E-04 --
Central/S. American: 1, E
Puerto Rican: E
Mexican: 4, E
Dominican: E
Other Hisp/Latino: E
Cuban: E
Reported obesity statistics from MEPS 2012 represent non-institutionalized American adults.
* Ratio of sub-group MSE to full sampling frame MSE (all Hispanic)
Discussion. Over coverage methodology provides a concrete tool for assessing the tradeoff between precision and bias when presenting racial and ethnic minority findings. Mean bias has been demonstrated in survey over coverage methodology; future research is needed to identify bias in other statistics, including regression coefficients.
References
1. U.S. Department of Health and Human Services. (2011, October). Implementation guidance on data collection standards for race, ethnicity, sex, primary language and disability status. Retrieved from:
2. Read, J. G., Emerson, M. O., & Tarlov, A. (2005). Implications of black immigrant health for U.S. racial disparities in health. Journal of Immigrant Health.
3. National Research Council. (2004). Eliminating health disparities: Measurement and data needs. Panel on DHHS Collection of Race and Ethnicity Data, Committee on National Statistics. Washington, DC: National Academies Press.
4. Liang, J., Van Tran, T., Krause, N., & Markides, K. S. (1989). Generational differences in the structure of the CES-D Scale in Mexican Americans. Journal of Gerontology: Social Sciences, 44.
5. Szameitat, K., & Schaffer, K. A. (1963). Imperfect frames in statistics and the consequences for their use in sampling. Bulletin of the International Statistical Institute, 40, pp