1
The Rocky Mountain Research Data Center
Advancing the Frontiers of Social Science: Opportunities and Challenges
The Rocky Mountain Federal Statistical Research Data Center (RMRDC)
Jani Little, Executive Director
Katie Genadek, Expected Administrator
2
What is a Federal Statistical Research Data Center (FSRDC)?
--A secure computing lab where restricted data, collected by federal agencies, can be accessed FOR STATISTICAL PURPOSES ONLY
--Made possible by a contractual agreement between a leading research institution and the U.S. Census Bureau
--The Census Bureau’s Center for Economic Studies (CES) directs all FSRDCs and the FSRDC Program
--FSRDCs are managed by an on-site Census employee (the administrator) who guides researchers on proposal development, enforces security guidelines, and serves as liaison with the research community.
3
University of Colorado
Katie Genadek, PhD RMRDC Administrator University of Colorado IBS Room 423
5
9 in 2009 and 24 in 2016
6
The RMRDC Consortium Partner Members: Supporting Members:
UC Colorado Springs Colorado State Government Colorado School of Mines National Center for Atmospheric Research National Renewable Energy Laboratory
7
Partner Consortium Members Faculty, Grad Students, and Affiliated Researchers:
Free access to RMRDC services and secure laboratory
Researchers with continued use are expected to write grant proposals and include lab fees
8
Advantages to Researchers and Institutions:
--Greatly expands the policy and basic questions that can be addressed
--Builds on past research findings with richer data
--Improves competitive edge for grants and publications
--Improves graduate education (big data/statistical techniques) and placement
--Attracts and retains data-intensive faculty
9
Advantages Provided to Research:
--Microdata not available publicly
    firms and establishments
    individuals and households (especially longitudinal studies)
    children
--Variables not available in public versions of data sets (e.g., low level geography)
--Full population counts or larger samples (Decennial Census, ACS, CPS)
--Full range of response items (e.g., industry codes, occupational codes, detailed race answers, income is not top-coded, etc.)
--Ability to make linkages
    with external data (e.g., via geocodes, establishment ID, etc.)
    between multiple internal data sets via non-public link keys
10
FSRDCs Used to Address Many Research Topics
Business, Trade, Finance, and Management
Crime and Crime Victimization
Demography, Population Distributions and Trends, Migration, and Immigration
Economics, Labor Markets, Entrepreneurship, Employment and Industry
Education and Education Policy
Hazard Mitigation, Environmental Impact Assessment, Pollution Abatement
Health and Well-Being, Health Insurance, Health Policy
Housing, Housing Markets, and Residential Patterns
Poverty, Social Welfare Policy, and Social Mobility
Transportation Analysis and Planning
Urban and Regional Economics and Planning
Energy Efficiency and Greenhouse Gas Emissions in Manufacturing
11
Requirements for Any FSRDC Project:
--Research projects must undergo a formal approval process with the agency that owns the data, e.g., Census, NCHS, AHRQ, BLS
--Researchers must go through a background investigation that qualifies them for “Special Sworn Status (SSS)”, which makes them unpaid Census Bureau employees.
--Results must be formally reviewed for disclosure violations before they leave the secure facility.
Currently 260 active projects; 50% are Census
12
RMRDC: The Physical Facility
Projected Opening: May 2017
Location: IBS Building on CU Boulder Campus
--10 thin client workstations to access FSRDC servers
--Secure communications that tunnel over campus internet
--Contains the Administrator’s office
--Badge Reader at Entrance
--24/7 Security System with camera
--no electronic devices allowed
--NOTHING leaves the secure lab without approval
13
FSRDC Server Software
GeoDa
Tomlab
Knitro
Madd
QGIS
StatTransfer
Python - Anaconda
Fortran
Perl
TeX/LaTeX
Gauss
Stata
Matlab & toolboxes
PBS Pro
Intel Composer XE
NX Enterprise
R
SAS
SAS (Dataflux)
SUDAAN
14
Components of Proposals:
--Personnel and Time frame
--Project Description (scientific merit, methods, feasibility, why requires restricted data)
--Dataset(s), Variables, Geography
--Results Expected and Disclosure Avoidance Strategies
15
Proposal Differences by Agency:
                         Census                    NCHS and AHRQ
Time to Approval         3 months on average       1-3 months on average
Benefit to Agency PPS    Required                  Not Required
Fee                      None                      $1200 min extract fee NCHS; $300 AHRQ*
Scope                    Broad (max of 30 pages)   Precise
16
Major Partners in the FSRDC System
U.S. Census Bureau
    Economic Data
    Demographic Data
    Longitudinal Employer-Household Dynamics (LEHD) Data
Bureau of Labor Statistics (BLS)
National Center for Health Statistics (NCHS)
Agency for Healthcare Research and Quality (AHRQ)
Other Federal Partners
17
Economic data available in RDCs
Microdata not available elsewhere
Detailed geographies and industries
Data linked over time
Employee and employer linked data
Full business register for the US
Can link own data to individual businesses
18
Examples of Economic Microdata
Columns: Data Set, Frequency, Unit of Enumeration, Availability
Standard Statistical Establishment List/Business Register (SSEL): Annually, Establishment, 1974–2014
Longitudinal Business Database (LBD): 1976–2014
19
Examples of Economic Microdata
Columns: Data Set, Frequency, Unit of Enumeration, Availability
Census of Auxiliary Establishments (AUX): Every 5 Years, Establishment, 1977–2012
Census of Construction Industries (CCN): 1972–2012
Census of Finance, Insurance, and Real Estate (CFI): 1992–2012
Census of Manufactures (CMF): 1963, 1967–2012
Census of Mining (CMI): 1987–2012
Census of Retail Trade (CRT)
Census of Services (CSR)
Census of Transportation, Communications, and Utilities (CUT)
Census of Wholesale Trade (CWH)
20
Census of Services
Includes Health Care and Social Assistance Enterprises (NAICS code 62)
2012
    Number of Establishments in U.S.: 831,303
    Receipts/Revenues ($1,000): 2,040,441,203
Summary table: ew.xhtml?src=bkmk
NAICS: North American Industry Classification System
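As a quick check on units, the receipts figure above is reported in thousands of dollars. The short Python sketch below uses only the two numbers from this slide to convert receipts to dollars and derive average receipts per establishment; the variable names are illustrative and not part of any Census product.

```python
# Figures as reported on the slide for NAICS 62, 2012.
establishments = 831_303
receipts_thousands = 2_040_441_203  # reported in units of $1,000

# Convert to dollars, then average over establishments.
receipts_dollars = receipts_thousands * 1_000
avg_per_establishment = receipts_dollars / establishments

print(f"Total receipts: ${receipts_dollars / 1e12:.2f} trillion")
print(f"Average per establishment: ${avg_per_establishment / 1e6:.2f} million")
```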
21
Longitudinal Employer-Household Dynamics (LEHD)
LEHD data combine administrative data from states’ Unemployment Insurance systems with Census Bureau data.
Workers: Employer history and quarterly wages; individual characteristics (sex, age, race); point-in-time residence and place of birth
Employers: Industry, employment, total payroll, location
Linkages between workers and employers
Links to other Census data
The key is that you can link the worker and employer information to other data sets, economic or demographic.
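To make the worker-employer linkage concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the records are invented, and the field names (`worker_id`, `employer_id`, `industry`, and so on) are hypothetical stand-ins for the protected link keys that exist only inside an FSRDC.

```python
# Hypothetical toy records; real LEHD files use protected identifiers
# and are only accessible inside an FSRDC.
workers = [
    {"worker_id": "W1", "employer_id": "E10", "quarter": "2014Q1", "wages": 9500},
    {"worker_id": "W2", "employer_id": "E10", "quarter": "2014Q1", "wages": 12000},
    {"worker_id": "W3", "employer_id": "E20", "quarter": "2014Q1", "wages": 7000},
]
employers = {
    "E10": {"industry": "62", "state": "CO"},
    "E20": {"industry": "31", "state": "CO"},
}


def link_workers_to_employers(workers, employers):
    """Attach employer attributes to each worker record via employer_id."""
    linked = []
    for w in workers:
        emp = employers.get(w["employer_id"])
        if emp is None:
            continue  # unmatched worker records are dropped in this sketch
        linked.append({**w, **emp})
    return linked


linked = link_workers_to_employers(workers, employers)

# Example use: total quarterly wages by employer industry.
wages_by_industry = {}
for row in linked:
    key = row["industry"]
    wages_by_industry[key] = wages_by_industry.get(key, 0) + row["wages"]
print(wages_by_industry)
```

Once worker rows carry employer attributes, the same key-based join can attach other economic or demographic files, which is the point the slide makes.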
22
Census Data: Demographic data available in RDCs
More geographic detail (usually block group or tract)
Additional variables
More observations
Variables not censored (income)
Additional detail within variables
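"Variables not censored (income)" refers to top-coding, where public-use files cap high values at a threshold. The sketch below illustrates the general idea in Python; the incomes and the threshold are hypothetical, and actual public-use files may apply different rules.

```python
from statistics import mean

# Hypothetical incomes and a hypothetical top-code threshold.
incomes = [28_000, 45_000, 61_000, 150_000, 900_000]
TOP_CODE = 200_000


def top_code(values, cap):
    """Replace every value above the cap with the cap itself."""
    return [min(v, cap) for v in values]


public_use = top_code(incomes, TOP_CODE)

print("Uncensored mean:", mean(incomes))
print("Top-coded mean: ", mean(public_use))
```

Because the largest value is capped, the top-coded mean understates the uncensored mean, which is one reason restricted files matter for income research.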
23
Data Available
Decennial Censuses
Yearly ACS (American Community Survey)
Current Population Survey Supplements
American Housing Survey
Survey of Income and Program Participation
National Crime Victimization Survey
National Longitudinal Mortality Study
National Longitudinal Surveys (NLS)
24
Yearly ACS (American Community Survey)
Decennial Censuses
    full count short form and 17% long form
    Long form: household- and individual-level demographic, socio-economic, program participation, education, household characteristics, etc.
    2010 short form only
Yearly ACS (American Community Survey)
    Annual full samples % of US population
    Replaced long form from 2000 decennial + a few extra questions
25
Current Population Survey Supplements
ASEC (Annual Social and Economic Supplement) or March
Fertility Supplement ( ), Food Security ( ), School enrollment ( ), Tobacco Use ( ), Unbanked ( ), Volunteer ( ), Voter Reg ( )
American Housing Survey
    Some years from ; ~50,000 households per year
    Core questions: home condition, occupant characteristics, home improvements, housing costs, home values, characteristics of recent movers, etc.
    Topical questions vary by year
26
Survey of Income and Program Participation
2-4 year household panels; interviews ~every 4 months; ; 14,000 to 52,000 households each wave
Core: labor force, income dynamics, government transfers
Topical modules vary
National Crime Victimization Survey
    Yearly ; ~90,000 households
    Non-fatal and property crimes, reported and unreported; demographic information for respondent; demographic information of perpetrator
27
National Longitudinal Mortality Study
CPS-ASEC data linked to National Death Index
CPS cohorts
National Longitudinal Survey (NLS)
    Original cohorts (1966, 1968)
    Labor market, demographic, and other data collected over 35 years
    ~5,000 respondents per cohort
28
Health Restricted Data:
More geographic detail
Additional variables
Child data (under 18 years)
Additional detail within variables
29
Restricted Health Data and Variables
Geographic Codes for all NCHS Surveys
National Health and Nutrition Examination Survey (NHANES)
National Health Care Surveys
    National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS)
    National Hospital Discharge Survey (NHDS)
    National Nursing Home Survey (NNHS) and National Nursing Assistant Survey (NNAS)
    National Home and Hospice Care Survey (NHHCS) and National Home Health Aide Survey (NHHAS)
    National Survey of Residential Care Facilities (NSRCF)
    National Study of Long-Term Care Providers (NSLTCP)
    National Hospital Care Survey (NHCS)
30
National Health Interview Survey (NHIS)
National Survey of Family Growth (NSFG)
State and Local Area Integrated Telephone Survey (SLAITS)
National Survey of Children's Health (NSCH)
National Survey of Children with Special Health Care Needs (CSHCN)
31
NCHS Data Linkage Activities
Linked Mortality Data Products
Linked Medicare Enrollment and Claims Files Data
Linked Medicaid Enrollment and Claims Data
Linked Social Security Benefit History Data
National Vital Statistics System (NVSS) Data Release and Access Policy
National Maternal and Infant Health Survey
32
Some major health data sources:
Survey data: NHANES, NHIS, NSCH
AHRQ Survey data: MEPS-HC, MEPS-IC
Health Care Survey data: NAMCS, NHDS
Administrative data: Vital Records
Linked Data: Mortality Data Products, Medicare Enrollment and Claims Data, Medicaid Enrollment and Claims Data, Social Security Benefit History Data
33
National Health and Nutrition Examination Survey (NHANES)
Provides prevalence data on selected diseases and risk factors of U.S. population
Monitors trends in diseases, behaviors, and environmental exposures
Identifies emerging public health concerns
Provides national baseline information on health and nutrition
34
National Health and Nutrition Examination Survey (NHANES), 1999-2014
National probability sample, approx. 10,000
Data collection from mobile unit
Interview: acculturation, air quality, allergies, demographics, diet, cognitive functioning, physical activity, sleep disorder, smoking, social support, weight history, family background, food security, alcohol use, bowel health, overall health, depression screening, pesticide exposure, reproductive health, exposure to chemicals, drug use, sexual behavior, etc.
Physical exam: hearing, body measurements, balance, blood pressure, vision, heart, etc.
Lab testing: blood, urine, oral rinse, etc.
35
National Health and Nutrition Examination Survey (NHANES) Restricted Data
Identifies geography below national level down to Census block
Youth: Alcohol and Drug Use, ADHD, STDs, Mental Health Disorders, Depression, Sexual Behavior
Studies that compare ADHD kids to non-ADHD in diet, activity, family structure.
Racial disparities in toxic heavy metals show non-Hispanic blacks significantly more likely across all age groups.
36
National Health Interview Survey, 1993-2015
Annual Sample that is Nationally and Regionally Representative
Family, Household and Person Self-Report Data
Extensive Health and Social Psychological Measures including
    Depression, anxiety
    Other Mental Health Conditions
    Other Emotional or Behavioral Problems
37
National Health Interview Survey, Restricted Data
Country of Birth and Related Immigration Variables (Person File)
State and Year of Birth (Person File)
Industry and Occupation Codes
Detailed Race and Hispanic Origin (Person File)
Exact Dates (e.g., date of birth in Person File)
Low levels of geography from state down to tract
Researchers at Columbia University: Neeraj Kaushal; Julia Wang
    Local and state public deportation enforcement data
    Linked with Restricted National Health Interview Survey (NHIS), which includes detailed health behaviors, health and mental health outcomes for Mexican immigrants
    Includes state and local geographic identifiers
Researchers at UC Boulder: Early Life Mortality
    NDI linked with NHIS, ,464 records, 734 deaths
    Early life deaths (ages 0-17)
    Exact dates (rather than quarters)
    Age, day, month of birth, interview, and death
    Detailed cause of death (beyond 10 categories)
38
Exposures to Fine Particulate Air Pollution and Respiratory Outcomes in Adults Using Two National Datasets: A Cross-sectional Study
Researchers: Keeve Nachman and Jennifer Parker
Datasets: NHIS, EPA Air Data System (external, linked using geocode)
--Evaluates the relationship between air pollution and asthma across race/ethnicity…
--Revealed significant associations for non-Hispanic blacks but not for Hispanics and non-Hispanic whites
39
National Survey of Children’s Health
National telephone survey of households with at least 1 child, N = 91,642
Demographics, Health and Functioning, Home Environment, Early Childhood Care, Developmental Screening, Adolescent School, Exercise, Emotional Difficulties
Family Functioning and Parental Health
Neighborhood and Community
All variables restricted
County and zip code geography available
Established differences in child health between married and non-married families: married; cohabiting step families; single parent; extended kin families
Linked with public data on state-level family and welfare policies
40
Medical Expenditure Panel Survey--Insurance Component (AHRQ and Census)
, Public (govt) and private sector employers
~40,000 each year
Asks about insurance plans offered
Asks about contributions provided by employers and employees
Can be linked with Census business data
Used to document changes in employer-provided insurance before and after ACA
41
Medical Expenditure Panel Surveys—Household Component (AHRQ)
Annual sample of households from prior year NHIS
30,000 persons, 14,000 households
Health services used, frequency, charges and source of payments
Access to care and quality of care
Panel design over 2 years
Medical Provider Component supplements Household Component
    Detailed charge and payment data
    Hospitals, physicians, home health care providers, and pharmacies
The Household Component provides data from individual households and their members, which is supplemented by data from their medical providers.
A set of large-scale surveys of families and individuals, their medical providers (doctors, hospitals, pharmacies, etc.), and employers across the United States.
MEPS collects data on the specific health services that Americans use, how frequently they use them, the cost of these services, and how they are paid for, as well as data on the cost, scope, and breadth of health insurance held by and available to U.S. workers.
42
National Ambulatory Medical Care Surveys
Sample of physicians, 1 week of visits, randomly sampled
Patient demographics, symptoms, diagnoses and medications ordered, number of visits in past year
Physician demographics, type and size of practice, specialty
Zip code
Misuse and Abuse of Prescription Opioids
    Evaluates effectiveness of State Prescription Drug Monitoring Programs (PDMP) in reducing physician-prescribed opioids
    NAMCS identifies patient visits for pain-related reasons resulting in opioid prescription
    Is trend reduced after adoption of PDMP?
43
Useful Websites
Restricted NCHS Data
Restricted AHRQ Data