Exploring Administrative Records Use for Race and Hispanic Origin Item Non-Response
Sonya Rastogi, Leticia Fernandez, James Noon, Ellen Zapata and Renuka Bhaskar
United States Census Bureau
UNITED NATIONS - ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS
Work Session on Statistical Data Editing - Paris, April 2014
Disclaimer: This paper is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the U.S. Census Bureau.
Motivation
- Census Bureau data on race and Hispanic origin influence U.S. policy and research
  - Voting Rights Act, Civil Rights Act, Fair Housing Act
  - Congressional redistricting
  - Research on a broad range of issues associated with race/ethnicity
- Item non-response could affect population estimates
- Although widely used to impute missing items, hot deck procedures may become less accurate with increasing diversity and changes in neighborhood patterns
- Can administrative records (AR) provide more accurate data than hot deck imputation?
Research Objectives
- Explore the feasibility of using AR to minimize imputation of race and Hispanic origin
- Compare agreement between:
  - AR race and Hispanic origin and hot deck imputations
  - AR race and Hispanic origin and ‘as reported, no proxy’ (unedited) 2010 Census responses
- Examine the characteristics of individuals with imputed race or Hispanic origin in the 2010 Census
  - In general, as well as for those with a corresponding response in administrative records
Data and Methods
- Data from the 2010 Census and AR from various federal and commercial sources
- For AR, a Hispanic origin variable and a race variable were created by combining data from the various sources
- Matching rates between hot deck imputations and AR, and between Census responses and AR
- Logistic regressions modeling the association of demographic, household and geographic characteristics with:
  (a) having race or Hispanic origin imputed by hot deck, and
  (b) having a response in AR, among those who received a hot deck imputation
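The matching rates described above can be illustrated with a minimal sketch. It assumes each person record has been reduced to a pair of values (the Census value, whether imputed or as reported, and the AR value, or None when the person has no AR response) and that rates are computed within each Census category. The function name, the record layout, the choice of denominator and the toy records are all hypothetical and are not taken from the study.

```python
from collections import defaultdict

def match_rates(records):
    """Share of records whose AR value agrees with the Census value,
    computed separately for each Census category.

    records: iterable of (census_value, ar_value) pairs; ar_value is None
    when the person has no AR response (such records are skipped).
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for census_value, ar_value in records:
        if ar_value is None:
            continue  # no AR response, so no comparison is possible
        totals[census_value] += 1
        if census_value == ar_value:
            matches[census_value] += 1
    return {group: matches[group] / totals[group] for group in totals}

# Hypothetical toy records, for illustration only (not study data).
toy_records = [
    ("White", "White"),
    ("White", "Black"),
    ("Black", "Black"),
    ("Asian", None),
    ("Asian", "Asian"),
]
rates = match_rates(toy_records)
```

The same function could be applied once to the hot deck imputation universe and once to the 'as reported, no proxy' universe, so that the two sets of rates can be compared.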
Findings: 1. Race Match between AR and 2010 Census Hot Deck Imputation
Source: 2010 Census and Administrative Records.
Note: A dash "-" indicates cell is suppressed for disclosure avoidance purposes.
* AIAN = American Indian or Alaska Native; NHPI = Native Hawaiian or Other Pacific Islander; SOR = Some Other Race.
Findings: 2. Hispanic Origin Match between AR and 2010 Census Hot Deck Imputation
Source: 2010 Census and Administrative Records.
Findings: 3. Race Match between AR and 2010 Census ‘As Reported, No Proxy’
Source: 2010 Census and Administrative Records.
* AIAN = American Indian or Alaska Native; NHPI = Native Hawaiian or Other Pacific Islander; SOR = Some Other Race.
Findings: 4. Hispanic Origin Match between AR and 2010 Census ‘As Reported, No Proxy’
Source: 2010 Census and Administrative Records.
Findings: 5. Selected Factors Associated with Hot Deck Imputation and with Coverage in AR
The findings from multinomial analyses show that the odds of race or Hispanic origin imputation, and the odds of being in AR for individuals in the imputation universe, vary by race/ethnicity and by other individual, household and regional characteristics. Some highlights are:
Race/ethnicity:
- Hispanics of any race, non-Hispanic NHPIs and non-Hispanic Asians were more likely than non-Hispanic Whites to have race imputed.
- Hispanics were less likely than non-Hispanic Whites to have a Hispanic origin response imputed. However, all non-Hispanic non-White groups (Black, AIAN, Asian, NHPI, SOR and Two or More Races) were more likely than non-Hispanic Whites to have Hispanic origin imputed.
- Among those with imputed responses, non-Hispanic Blacks were the most likely to be in AR.
(Continued) Selected Factors Associated with Hot Deck Imputation and with Coverage in AR
Household Composition:
- Imputation of race and Hispanic origin is less likely for larger households and married couple households than for smaller households, households headed by single parents, or households with other family compositions.
- Individuals in the imputation universe are more likely to be in AR if they live alone or in households headed by single mothers, compared to other types of households.
Home-ownership:
- Home-owners are less likely than renters to have race or Hispanic origin imputed, and they are also more likely than renters to be in AR.
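Statements such as "less likely to have race imputed" correspond to odds ratios below 1 relative to a reference group. As a hedged illustration, the sketch below computes an unadjusted odds ratio from a 2x2 table, which is what a logistic regression with a single binary predictor would estimate. The counts are invented for illustration and are not results from the study, whose models also adjust for other characteristics.

```python
def odds_ratio(group_yes, group_no, reference_yes, reference_no):
    """Unadjusted odds ratio of an outcome for a group versus a reference group."""
    group_odds = group_yes / group_no
    reference_odds = reference_yes / reference_no
    return group_odds / reference_odds

# Hypothetical counts, for illustration only (not study data):
# home-owners with and without an imputed race response, then renters.
owners_imputed, owners_not_imputed = 20, 980
renters_imputed, renters_not_imputed = 50, 950

owner_vs_renter = odds_ratio(
    owners_imputed, owners_not_imputed,
    renters_imputed, renters_not_imputed,
)
# A value below 1 means lower odds of imputation for owners than for renters.
```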
Conclusions
- AR are a promising avenue to complement and improve the quality of imputations for missing race and Hispanic origin data
- Since AR do not cover all individuals, hot deck and other methods will remain important for addressing missing race and Hispanic origin data
- Understanding the characteristics of individuals with imputed race and Hispanic origin, and the variables associated with their inclusion in administrative records, may contribute to developing strategies to address non-response
Thank You!
We welcome your questions and comments