Treasure Trove of Data: Conducting Research Using Federal Statistical Surveys.


So many unanswered research questions…

Census Publications

The World of Printed Reports: Statistical Abstract, 1902, 580 pages

Cost of Living Measurement

… but seriously folks… There is a Hierarchy of Federal Data
Published aggregates: dating back over a century, and mostly also available electronically
  • Some predetermined geography and categories
  • The thinner the data “slice,” the more confidentiality protection is applied, i.e., the data may not be there anymore
Public Use file
  • A sub-sample of the data, only feasible for large samples
  • …but also with confidentiality protection (see above)
Synthetic Data (new approach)
Restricted Use Micro Data
  • Proposals for research required
  • Special access arrangements, terms of use, etc.
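To illustrate why thin slices of published aggregates can end up empty, here is a minimal sketch of small-cell suppression. It is only an illustration: the threshold of 5, the field names, and the function `publish_counts` are assumptions made for this example, not actual Census Bureau rules or tools.

```python
from collections import Counter

# Hypothetical threshold: cells with fewer respondents than this are withheld.
MIN_CELL_SIZE = 5


def publish_counts(records, keys, min_cell_size=MIN_CELL_SIZE):
    """Tabulate records by the given keys; small cells are published as None."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }


# Made-up respondents: 12 in a large county, 2 in a small one.
records = (
    [{"state": "CO", "county": "Denver"}] * 12
    + [{"state": "CO", "county": "Mineral"}] * 2
)

by_state = publish_counts(records, ["state"])             # coarse slice
by_county = publish_counts(records, ["state", "county"])  # thinner slice
```

At the state level every respondent is counted, while at the county level the small county's cell is suppressed, which is the sense in which the data is "not there anymore."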

Public use data

Census Research Data Centers

Demographic Data
  • 1970, 1980, 1990 and 2000 Decennial Long Form (back to 1940 soon)
  • American Community Survey (effectively replacing the long form)
  • March CPS Earnings Supplements
  • Survey of Income and Program Participation
  • American Housing Survey

Economic Data Sets
  • Annual Survey of Manufactures
  • Census of Construction
  • Census of Finance and Insurance
  • Census of Manufactures
  • Census of Mining
  • Census of Real Estate
  • Census of Retail
  • Census of Services
  • Census of Transportation
  • Census of Wholesale
  • Characteristics of Business Owners Survey
  • Commodity Flow Survey
  • Auxiliary Establishment Survey
  • Longitudinal Business Database
  • Longitudinal Research Database
  • Manufacturing Energy Consumption Survey
  • Medical Expenditure Panel Survey, Insurance Component
  • National Employer Survey
  • Pollution Abatement Costs and Expenditures
  • Quarterly Financial Reports
  • Research and Development Survey
  • Survey of Manufacturing Technology
  • Worker Establishment Characteristics Database
  • R&D and Innovation Survey

Read the Forms!

Linked Household / Business data
Longitudinal Employer Household Dynamics (LEHD)
  • Links households to place of employment
  • Based on unemployment insurance administrative records
  • Covers most states
  • Quarterly starting in 1990
  • “Tracks” a person based on their place of employment
  • Establishment (i.e. the place of work) is exact for single plant companies
  • Establishment is assigned for all others (using geography and industry to improve matches)
  • Google “LEHD on the map”…

How to Apply
Preliminary Proposal Must Meet Basic Requirements
  • Need for Non-Public data
  • Maintains Confidentiality
  • Feasibility
  • Describes Census Benefits (LEGAL REQUIREMENT)
  • Scientific Merit
Work with Census Administrator to Craft Final Proposal

Restricted use Health data

Why is there health data at the Census RDCs?
This data is collected by:
  • National Center for Health Statistics (NCHS)
  • Agency for Healthcare Research and Quality (AHRQ)
Dual mission: to provide broad access to health data and statistics while protecting the privacy of respondents
Most research uses the public use files
NCHS and AHRQ RDCs were created to provide access to restricted use files
Restricted use files are now available at all Census RDCs

What type of data is it? NCHS Data
National Health Status Surveys
  • National Health and Nutrition Examination Survey (NHANES) I, II, and III
  • National Health Interview Survey (NHIS)
  • Longitudinal Study on Aging I and II (LSOA)
  • National Survey of Family Growth
  • National Survey of Children's Health
  • National Survey of Early Childhood Health
  • National Survey of Children with Special Health Care Needs
  • National Asthma Survey
National Health Care Surveys
  • National Ambulatory Medical Care Survey
  • National Hospital Ambulatory Medical Care Survey
  • National Survey of Ambulatory Surgery
  • National Hospital Discharge Survey
  • National Nursing Home Survey (NNHS)
  • National Home and Hospice Care Survey
  • National Employer Health Insurance Survey
  • National Health Provider Inventory
  • National Immunization Survey
Vital Statistics
  • Mortality and Multiple Mortality
  • Birth
  • Fetal Death
  • National Death Index
  • Marriage and Divorce
Linked Data Sets
  • Linked mortality data: NHIS, NHANES, LSOA II, NNHS
  • Linked Medicare Enrollment and Claims data: NHIS, NHANES, LSOA II
  • Linked Social Security Administration Data: NHIS, NHANES, LSOA II, NNHS
  • Linked EPA data

What is restricted in the public use files but available in the RDC?
Every survey has at least some data that is restricted for confidentiality
Data can be restricted in a number of ways:
  • Individual variables:
      ◦ Removed
      ◦ Top-coded, bottom-coded, coarsened or masked
      ◦ Artificial information is substituted
  • Pieces of datasets are restricted
  • Whole datasets are unavailable (particularly linked files)
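The variable-level protections listed above can be sketched in a few lines. This is a hedged illustration only: the caps, bin width, and field names are assumptions chosen for the example and do not reflect the actual masking rules of any NCHS or AHRQ survey.

```python
def top_code(value, cap):
    """Replace any value above the cap with the cap itself."""
    return min(value, cap)


def bottom_code(value, floor):
    """Replace any value below the floor with the floor itself."""
    return max(value, floor)


def coarsen(value, width):
    """Report only the lower bound of the bin the value falls in."""
    return (value // width) * width


def mask_record(record):
    """Build a public use version of one (made-up) restricted record."""
    return {
        "age": top_code(record["age"], 85),                  # hypothetical cap
        "birth_year": bottom_code(record["birth_year"], 1930),
        "income": coarsen(top_code(record["income"], 150_000), 10_000),
        # "state" is removed entirely: it never reaches the public file.
    }


restricted = {"age": 93, "birth_year": 1921, "income": 187_250, "state": "MI"}
public = mask_record(restricted)
```

The public record keeps the general shape of the data while the extreme values and the geographic identifier are no longer recoverable.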

What’s restricted? Variables
Examples of restricted variables:
  • Geographic variables (state, county, or metropolitan area)
  • Most dates (date of interview, date of death, date of birth)
  • Income and employment data (industry codes)
  • Specific diagnoses (ICD-9 codes are generally coarsened)
  • Details about facilities (accreditation, payments, number of employees)
  • Some information about children and adolescents (e.g. height and weight, depression, behavior problems, and drug use)
  • Some information about race, ethnicity, and country of origin
  • Contextual data (nearest hospital, % of population with diploma)
  • Sample design variables (necessary for estimating variances)
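To show why the sample design variables matter, here is a minimal sketch of a design-based variance estimate for a weighted total. It assumes a stratified design with primary sampling units (PSUs) treated as sampled with replacement, which is a common textbook approximation and not necessarily the method prescribed for any particular survey. The rows and the function name are made up for illustration.

```python
from collections import defaultdict


def weighted_total_variance(rows):
    """Variance of a weighted total from (stratum, psu, weight, y) rows.

    Uses the between-PSU estimator within each stratum:
    n / (n - 1) * sum((psu_total - stratum_mean) ** 2).
    """
    # Weighted total of y within each PSU.
    psu_totals = defaultdict(float)
    for stratum, psu, weight, y in rows:
        psu_totals[(stratum, psu)] += weight * y

    # Group the PSU totals by stratum.
    by_stratum = defaultdict(list)
    for (stratum, _psu), total in psu_totals.items():
        by_stratum[stratum].append(total)

    variance = 0.0
    for totals in by_stratum.values():
        n = len(totals)
        if n < 2:
            raise ValueError("each stratum needs at least two PSUs")
        mean = sum(totals) / n
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in totals)
    return variance


# Made-up data: (stratum, psu, weight, y)
rows = [
    ("A", 1, 10.0, 1), ("A", 1, 10.0, 2), ("A", 2, 10.0, 5),
    ("B", 1, 5.0, 4), ("B", 2, 20.0, 1),
]
variance = weighted_total_variance(rows)
```

Without the stratum and PSU identifiers, only a simple-random-sample formula is available, which generally misstates the variance for clustered surveys.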

What’s restricted? Pieces of datasets
Examples
  • Contextual data: data can be linked to information about area (e.g., number of hospitals, education in county, MEPS Area Resource File)
  • Medical Expenditure Panel Survey: Provider, Insurance, and Nursing Home Component
  • NHANES III: Youth Conduct Disorder Datasets, Los Angeles Demographic Dataset, Diagnostic Interview Schedule for Children
  • National Survey of Family Growth: self-report data and interviewer comments

What’s restricted? Datasets
Linked data sets:
  • Mortality files linked to NHANES, NHIS, LSOA
  • EPA emissions data linked to NHDS, NHIS, NHANES
  • Social Security linked to NHANES, NHIS, LSOA
  • Medicare files linked to NHANES, NHIS, LSOA
Other datasets unavailable:
  • National Employer Health Insurance Survey
  • National Death Index

How can I access it?
  • Submit a proposal to NCHS or AHRQ
  • NCHS/AHRQ evaluates the proposal for feasibility, availability of computing resources, and likelihood of disclosure of confidential information (NOT for scientific merit)
  • If approved, the researcher sends public use data and code
  • NCHS/AHRQ staff merge the public use data with restricted data to create a file for use by the researcher
  • Files are only created by NCHS/AHRQ staff
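The merge step performed by RDC staff can be pictured as a join on a shared respondent identifier. The sketch below is only an illustration under that assumption: the field names, the identifier `respondent_id`, and the function `merge_on_id` are hypothetical, and the actual process is carried out by NCHS/AHRQ staff with their own tools.

```python
def merge_on_id(public_rows, restricted_rows, id_field="respondent_id"):
    """Attach restricted variables to public use rows that share an ID.

    Rows without a match in the restricted file are dropped (an inner join).
    """
    restricted_by_id = {row[id_field]: row for row in restricted_rows}
    merged = []
    for row in public_rows:
        extra = restricted_by_id.get(row[id_field])
        if extra is None:
            continue  # no restricted record for this respondent
        merged.append({**row, **extra})
    return merged


# Made-up files: the researcher supplies the public rows,
# staff hold the restricted rows.
public_rows = [
    {"respondent_id": 1, "age_group": "65+"},
    {"respondent_id": 2, "age_group": "45-64"},
]
restricted_rows = [
    {"respondent_id": 1, "county": "Washtenaw"},
]

analysis_file = merge_on_id(public_rows, restricted_rows)
```

The researcher then works only with the resulting analysis file inside the RDC, not with the restricted source files themselves.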

How can I access it?
Proposal must include
  • Full research proposal
  • Explanation of why public-use files are insufficient
  • Data dictionary, which must identify files and years, target sample, and variables
  • Sample code, examples of desired output, and software requirements
  • Resumes of researchers, sources of funding, and proposed dates when analysis will take place

How can I access it? (Working through NCHS/AHRQ)
Working at NCHS or AHRQ RDCs (both in Hyattsville, MD)
  • RDC analyst prepares data prior to researcher’s arrival
  • Researchers cannot merge their own data sets or work with more than one data set at a time
  • All output and notes must be reviewed before removal; data files cannot be removed
  • Support is available from RDC staff
Working with NCHS remotely
  • Researchers send code via and receive output back via
  • Only certain SAS/SUDAAN procedures permitted; no access to micro data
Working with AHRQ remotely
  • AHRQ has no remote server
  • Possibility of writing task order for AHRQ