The Rocky Mountain Research Data Center

1 The Rocky Mountain Research Data Center
Jani Little, Executive Director
Katie Genadek, Administrator

2 What is a Federal Statistical Research Data Center (FSRDC)?
--A secure computing lab where restricted data, collected by federal agencies, can be accessed FOR STATISTICAL PURPOSES ONLY
--Made possible by a contractual agreement between a leading research institution and the U.S. Census Bureau
--The Census Bureau’s Center for Economic Studies (CES) directs all FSRDCs and the FSRDC Program
--FSRDCs are managed by an on-site Census employee, the administrator, who guides researchers on proposal development, enforces security guidelines, and serves as liaison with the research community.


4 FSRDC Locations: 9 in 2009 and 24 in 2016

5 The RMRDC Consortium Partner Members: Supporting Members:
--UC Colorado Springs
--Colorado State Government
--Colorado School of Mines
--National Center for Atmospheric Research
--National Renewable Energy Laboratory

6 Partner Consortium Members Faculty, Grad Students, and Affiliated Researchers:
--Free access to RMRDC services and secure laboratory
--Researchers with continued use are expected to write grant proposals and include lab fees

7 UCCS Cost to become a Partner Member?
$15,000 annual contribution for 3 years ($15,000 lab fee for 1 external project per year)

8 Advantages to Researchers and Institutions:
--Greatly expands the policy and basic questions that can be addressed
--Builds on past research findings with richer data
--Improves competitive edge for grants and publications
--Improves graduate education (big data/statistical techniques) and placement
--Attracts and retains data-intensive faculty

9 Advantages Provided to Research:
--Microdata not available publicly
  --firms and establishments
  --individuals and households (especially longitudinal studies)
--Variables not available in public versions of data sets (e.g., low-level geography)
--Full population counts or larger samples (Decennial Census, ACS, CPS)
--Full range of response items (e.g., industry codes, occupational codes, detailed race answers, income that is not top-coded, etc.)
--Ability to make linkages
  --with external data (e.g., via geocodes, establishment ID, etc.)
  --between multiple internal data sets via non-public link keys

10 FSRDCs Used to Address Many Research Topics
--Business, Trade, Finance, and Management
--Crime and Crime Victimization
--Demography, Population Distributions and Trends, Migration, and Immigration
--Economics, Labor Markets, Entrepreneurship, Employment and Industry
--Education and Education Policy
--Hazard Mitigation, Environmental Impact Assessment, Pollution Abatement
--Health and Well-Being, Health Insurance, Health Policy
--Housing, Housing Markets, and Residential Patterns
--Poverty, Social Welfare Policy, and Social Mobility
--Transportation Analysis and Planning
--Urban and Regional Economics and Planning
--Energy Efficiency and Greenhouse Gas Emissions in Manufacturing

11 RMRDC: The Physical Facility
Projected Opening: Early May 2017
Location: IBS Building on CU Boulder Campus
--10 thin-client workstations to access FSRDC servers
--Secure communications that tunnel over campus internet
--Contains the Administrator’s office
--Badge reader at entrance
--24/7 security system with camera
--No electronic devices allowed
--NOTHING leaves the secure lab without approval

12 Major Partners in the FSRDC System
--U.S. Census Bureau
  --Economic Data
  --Demographic Data
  --Longitudinal Employer-Household Dynamics (LEHD) Data
--Bureau of Labor Statistics (BLS)
--National Center for Health Statistics (NCHS)
--Agency for Healthcare Research and Quality (AHRQ)
--Other Federal Partners

13 Examples of Restricted Health Data
--Geographic Codes for all NCHS Surveys
--National Health and Nutrition Examination Survey (NHANES)
--National Health Care Surveys
  --National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS)
  --National Hospital Discharge Survey (NHDS)
  --National Nursing Home Survey (NNHS) and National Nursing Assistant Survey (NNAS)
  --National Home and Hospice Care Survey (NHHCS) and National Home Health Aide Survey (NHHAS)
  --National Survey of Residential Care Facilities (NSRCF)
  --National Study of Long-Term Care Providers (NSLTCP)
  --National Hospital Care Survey (NHCS)

14 National Health Interview Survey (NHIS)
--National Survey of Family Growth (NSFG)
--State and Local Area Integrated Telephone Survey (SLAITS)
  --National Survey of Children's Health (NSCH)
  --National Survey of Children with Special Health Care Needs (CSHCN)

15 NCHS Data Linkage Activities
--Linked Mortality Data Products
--Linked Medicare Enrollment and Claims Files Data
--Linked Medicaid Enrollment and Claims Data
--Linked Social Security Benefit History Data
--National Vital Statistics System (NVSS) Data Release and Access Policy
--National Maternal and Infant Health Survey

16 Finding Example Projects, online sources
• NCHS Publications: The NCHS RDC website provides a list, by dataset, of all publications that have come out of NCHS projects conducted in the RDC
• Abstracts of Approved NCHS Projects: These have been extracted from Annual Reports of CES and RDC activities (center/proposals)
Good search terms: “pollution”, “oil”

17 Requirements for Any FSRDC Project:
--Research projects must undergo a formal approval process with the agency that owns the data, e.g., Census, NCHS, BLS
--Researchers must go through a background investigation that qualifies them for “Special Sworn Status (SSS),” which makes them an unpaid Census Bureau employee.
--Results must be formally reviewed for disclosure violations before they leave the secure facility.
Currently 260 active projects; 50% are Census

18 Components of Proposals:
--Personnel and Time Frame
--Project Description (scientific merit, methods, feasibility, why it requires restricted data)
--Dataset(s), Variables, Geography
--Results Expected and Disclosure Avoidance Strategies

19 Accessing NCHS Restricted-Use Data
--Submit proposal to NCHS for review
--Once approved, go through background check (SSS)
--Work with NCHS analyst to request data extraction
  --$1,200 minimum, $2,500 fee for more extensive extraction
--Pass online training to access data

20 FSRDC Server Software
--GeoDa
--Tomlab
--Knitro
--Madd
--QGIS
--Stat/Transfer
--Python - Anaconda
--Fortran
--Perl
--TeX/LaTeX
--Gauss
--Stata
--Matlab & toolboxes
--PBS Pro
--Intel Composer XE
--NX Enterprise
--R
--SAS
--SAS (Dataflux)
--SUDAAN

21 Comparing Diet Quality, Physical Activity, and Sedentary Behavior in Youth with and without ADHD
Researcher: Carol Curtin - University of Massachusetts Medical School
--Restricted National Health and Nutrition Examination Surveys (NHANES)
--Includes youth, ages 8-15, with and without ADHD
--Anthropometric, dietary and activity measures include detailed family structure categories
--ADHD and other behavioral health conditions assessed using gold standard diagnostic

22 National Health and Nutrition Examination Survey (NHANES)
--Provides prevalence data on selected diseases and risk factors of the U.S. population
--Monitors trends in diseases, behaviors, and environmental exposures
--Identifies emerging public health concerns
--Provides national baseline information on health and nutrition

23 National Health and Nutrition Examination Survey (NHANES), 1999-2014
--National probability sample, approx. 10,000
--Data collection from mobile unit
--Interview: acculturation, air quality, allergies, demographics, diet, cognitive functioning, physical activity, sleep disorder, smoking, social support, weight history, family background, food security, alcohol use, bowel health, overall health, depression screening, pesticide exposure, reproductive health, exposure to chemicals, drug use, sexual behavior, etc.
--Physical exam: hearing, body measurements, balance, blood pressure, vision, heart, etc.
--Lab testing: blood, urine, oral rinse, etc.

24 National Health and Nutrition Examination Survey (NHANES) Restricted Data
--Identifies geography below national level, down to Census block
--Youth: Alcohol and Drug Use, ADHD, STDs, Mental Health Disorders, Depression, Sexual Behavior

25 Complex Families, State Family Policies, and Child Health Disparities
Researchers: Justin Denney, Rachel Kimbro, Christine Percheski, Maria Perez-Patron
--Restricted National Survey of Children’s Health (NSCH), includes detailed family structure categories
--Detailed child health assessments
--Linked with public data of state-level family and welfare policy variables

26 National Survey of Children’s Health
--National telephone survey of households with at least 1 child, N = 91,642
--Demographics, Health and Functioning, Home Environment, Early Childhood Care, Developmental Screening, Adolescent School, Exercise, Emotional Difficulties
--Family Functioning and Parental Health
--Neighborhood and Community
--All variables restricted County and zip code geography available

27 Immigration Policy Enforcement and the Mental Health of Mexican Immigrant Families
Researchers at Columbia University: Neeraj Kaushal; Julia Wang
--Local and state public deportation enforcement data
--Linked with Restricted National Health Interview Survey (NHIS), includes detailed health behaviors, health and mental health outcomes for Mexican immigrants
--Includes state and local geographic identifiers

28 National Health Interview Survey, 1993-2015
--Annual sample that is nationally and regionally representative
--Family, household and person self-report data
--Extensive health and social psychological measures, including:
  --Depression, anxiety
  --Other mental conditions
  --Other emotional or behavioral problems

29 National Health Interview Survey, Restricted Data
--Country of Birth and Related Immigration Variables (Person File)
--State and Year of Birth (Person File)
--Industry and Occupation Codes
--Detailed Race and Hispanic Origin (Person File)
--Exact Dates (e.g., date of birth in Person File)
--Low levels of geography, from state down to tract

30 Early Life Mortality in the U.S.
Elizabeth Lawrence, University of N. Carolina at Chapel Hill
Rick Rogers, University of Colorado Boulder
--NHIS-LMFs: 246,464 records, 734 deaths
--Restricted-use variables
  --Early life deaths (ages 0-17)
  --Exact dates (rather than quarters): age, day, month of birth, interview, and death
  --Detailed cause of death (beyond 10 categories)
  --New mortality linked files, when available
  --Geographic detail: 50 states rather than 4 regions
--Age top coding usually at ages 85+
--Quarter of birth, interview, and death may be too broad for younger ages.
--Some causes of death are perturbed
--New matches, available this fall, will first be accessible only in an RDC
--South generally experiences higher risk of death

31 University of Colorado
Katie Genadek, PhD
RMRDC Administrator
University of Colorado
IBS Room 423

32 Census Data: Demographic data available in RDCs
--More geographic detail
--Additional variables
--More observations
--Variables not censored (income)
--Additional detail within variables
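To see why uncensored income matters, here is a minimal sketch of how top-coding can distort a simple statistic. The incomes and the $150,000 top-code threshold are made-up values for illustration only; they are not taken from any Census product.

```python
# Hypothetical incomes and a hypothetical top-code threshold (illustration only).
true_incomes = [30_000, 55_000, 80_000, 250_000, 900_000]
TOP_CODE = 150_000  # assumed threshold, not an actual public-use rule

# A public-use file would replace any value above the threshold with the threshold.
public_incomes = [min(income, TOP_CODE) for income in true_incomes]

# Compare the mean computed from uncensored and top-coded values.
true_mean = sum(true_incomes) / len(true_incomes)
public_mean = sum(public_incomes) / len(public_incomes)

print(f"Mean with uncensored incomes: {true_mean:,.0f}")
print(f"Mean with top-coded incomes:  {public_mean:,.0f}")
```

In this toy example the top-coded mean understates the uncensored mean because the two highest incomes are capped.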

33 Data Available
--Decennial Censuses
--Yearly ACS (American Community Survey)
--Current Population Survey Supplements
--American Housing Survey
--Survey of Income and Program Participation
--National Crime Victimization Survey
--National Longitudinal Mortality Study
--National Longitudinal Surveys (NLS)

34 Linking to Demographic Data
--Link data by geographic area to demographic and survey data
--Local data
--Survey data
--Proprietary data
--Linked data within the RDC

35 Detailed Geography Data Set Geography Decennial Census Block
American Community Survey (ACS) Survey of Income and Program Participation (SIPP) Tract Current Population Survey* (CPS) – ASEC Supplement & Food Supplement American Housing Survey (AHS) National Longitudinal Survey (NLS) – Young/Mature Women Lat/Lon Block Group National Longitudinal Survey (NLS) – Young/Old Men County National Longitudinal Mortality Study (NLMS)

36 More information a.html Public data and metadata

37 Protected Identification Keys (PIKs)
--The Person Identification Validation System (PVS) assigns 9-digit unique identifiers called Protected Identification Keys (PIKs) to survey and decennial data via probabilistic matching techniques
--PIKs are used to facilitate record linkage
--Once “PIKed,” data can be linked to any other data processed through PVS
--Some data are PIKed and linked already
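Once two files both carry PIKs, linking them is an ordinary key join. The sketch below is a minimal illustration with made-up records and hypothetical field names (`pik`, `reported_income`, `admin_earnings`); it does not reproduce the PVS probabilistic matching step, only the join that follows it.

```python
# Made-up records with hypothetical field names; real PIKs never leave the secure lab.
survey = [
    {"pik": "000000001", "reported_income": 42000},
    {"pik": "000000002", "reported_income": 58000},
    {"pik": None, "reported_income": 31000},  # a record that could not be assigned a PIK
]
admin = [
    {"pik": "000000001", "admin_earnings": 41500},
    {"pik": "000000003", "admin_earnings": 77000},
]


def link_on_pik(left, right):
    """Inner-join two lists of records on their 'pik' field, skipping records without a PIK."""
    right_by_pik = {rec["pik"]: rec for rec in right if rec["pik"] is not None}
    linked = []
    for rec in left:
        if rec["pik"] is None:
            continue  # unmatched records cannot be linked
        match = right_by_pik.get(rec["pik"])
        if match is not None:
            linked.append({**rec, **match})
    return linked


linked = link_on_pik(survey, admin)
print(linked)
```

Only the record whose PIK appears in both files survives the join, which is why the share of records that receive a PIK affects the size of any linked sample.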

38 Current RDC projects of interest – Census Data
--Interracial couples and children’s race: how are they reported, and how does that change over time?
--The effects of 1960 individual housing conditions on urban renewal
--Linking 2010 and 2000 full-count census data to assess race response changes and how racial identity is influenced by neighborhood characteristics
--Migration and duration of international student graduates (NSCG merged to census data)

39 Census Data: Economic data available in RDCs
--Microdata that is not available elsewhere
--Detailed geographies and industries
--Data linked over time
--Employee and employer linked data
--Full business register for the US

40 Types of Economic data available in RDCs
--Business Register
--Firm Surveys
--Establishment Surveys
--Economic Censuses
--Transaction or Trade data

41 Business Register Data
Data Set:
--Compustat-SSEL Bridge (CSB)
--Form 5500 Bridge File
--Integrated Longitudinal Business Database (ILBD)
--Longitudinal Business Database (LBD)
--Ownership Change Database (OCD)
--Standard Statistical Establishment List / Business Register (SSEL)

42 Firm Surveys
Data Set:
--Annual Capital Expenditures Survey (ACES)
--Annual Retail Trade Survey (ARTS)
--Business Expenditures Survey (BES)
--Business Research & Development and Innovation Survey (BRDIS)
--Enterprise Summary Report (ESR)
--Exporter Database (EDB)
--Quarterly Financial Report (QFR)
--Service Annual Survey (SAS)
--Survey of Business Owners (SBO)
--Survey of Industrial Research and Development (SIRD)

43 Establishment Surveys
Data Set:
--Annual Survey of Manufactures (ASM)
--Current Industrial Reports (CIR)
--Manufacturing Energy Consumption Survey (MECS)
--Medical Expenditure Panel Survey - Insurance Component (MEPS-IC)
--National Employer Survey (NES)
--Quarterly Survey of Plant Capacity Utilization (QPC)
--Survey of Manufacturing Technology (SMT)
--Survey of Plant Capacity Utilization (PCU)
--Survey of Pollution Abatement Costs and Expenditures (PACE)

44 Economic Censuses
Data Set:
--Census of Auxiliaries (AUX)
--Census of Construction Industries (CCN)
--Census of Finance, Insurance, Real Estate (CFI)
--Census of Manufactures (CMF)
--Census of Mining (CMI)
--Census of Retail Trade (CRT)
--Census of Services (CSR)
--Census of Transportation, Communications, Utilities (CUT)
--Census of Wholesale Trade (CWH)

45 Transactions Data
Data Set:
--Commodity Flow Survey (CFS)
--Foreign Trade Data - Export (EXP)
--Foreign Trade Data - Import (IMP)
--Longitudinal Foreign Trade Transactions Data (LFTTD)

46 Longitudinal Employer-Household Dynamics (LEHD)
LEHD data combine administrative data from states’ Unemployment Insurance systems with Census Bureau data.
--Workers: employer history and quarterly wages, individual characteristics (sex, age, race), point-in-time residence and place of birth
--Employers: industry, employment, total payroll, location
--Linkages between workers and employers
--Links to other Census data: virtually any RDC data on businesses; SIPP; CPS March supplement; ACS
The key is that you can link the worker and employer information to other data sets, economic or demographic.
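As a rough illustration of what a worker-employer linkage makes possible, the sketch below joins made-up quarterly job records to made-up employer records and totals wages by industry. All identifiers and field names (`worker_id`, `employer_id`, `quarter`, `wages`, `industry`) are hypothetical and do not reflect the actual LEHD file layout.

```python
from collections import defaultdict

# Made-up job records: one row per worker, employer, and quarter (hypothetical layout).
jobs = [
    {"worker_id": "w1", "employer_id": "e1", "quarter": "2015Q1", "wages": 12000},
    {"worker_id": "w1", "employer_id": "e2", "quarter": "2015Q2", "wages": 13500},
    {"worker_id": "w2", "employer_id": "e1", "quarter": "2015Q1", "wages": 9000},
]

# Made-up employer characteristics keyed by employer ID.
employers = {
    "e1": {"industry": "Manufacturing"},
    "e2": {"industry": "Retail Trade"},
}


def wages_by_industry(job_records, employer_records):
    """Attach each job to its employer's industry and sum wages per industry."""
    totals = defaultdict(int)
    for job in job_records:
        employer = employer_records.get(job["employer_id"])
        if employer is None:
            continue  # skip jobs whose employer is not in the employer file
        totals[employer["industry"]] += job["wages"]
    return dict(totals)


totals = wages_by_industry(jobs, employers)
print(totals)
```

The same pattern extends to other files: once worker or employer identifiers are shared, characteristics from a survey or census can be attached to each job record in the same way.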

47 Mining, Oil and Gas Data in the RDC
--Census of Mineral Industries
--Every 5 years, 2012
--Information:
--Forms: 21_form_default.html

48 Useful Websites
--Census Bureau Data: Center for Economic Studies
--NCHS Research Data Center
--AHRQ

49 Examples of research using business data
--Environmental regulation and productivity: evidence from oil refineries. E Berman, LTM Bui. Review of Economics and Statistics, 2001
--Industrial Investments in Energy Efficiency: A Good Idea? Mary Jialin Li. CES Working Paper, 2017
--State Taxation and the Reallocation of Business Activity: Evidence from Establishment-Level Data. Xavier Giroud & Joshua Rauh. CES Working Paper, 2017

50 Census (and other) Data Proposals

51 Process:
--Contact RDC Administrator with plan, get starter packet
--Write first draft of proposal
--Work with Administrator on refining proposal description
--Work on Predominant Purpose Statement (PPS) and Abstract
--Submit to Administrator, who submits to Census
--Once approved: SSS

52 RDC Proposal Contents
--Personnel and Time Frame
--Project Description (scientific merit, methods, feasibility, why it requires restricted data)
--Dataset(s), Variables, Geography
--Results Expected and Disclosure Avoidance Strategies

53 Approval Process
Step 1: Approval from RDC Administrator
Step 2: Census approval
Step 3: Sponsoring or other agency approval
Step 4: Background check and SSS

54 Background Check
--Off-line paperwork and documentation
--Online trainings and certifications
--Background check, submitted online and followed by an interview
  --Residential history
  --Foreign travel
  --Education and employment history
  --References
--Fingerprinting

55 Special Sworn Status
SSS is authorized by Title 13 U.S.C. 23(c) “to assist the Bureau of the Census in performing the work authorized by this title.”
The Census Bureau may provide SSS to an individual:
--When an individual has expertise or specialized knowledge that can contribute to the accomplishment of Census Bureau projects or activities, or engages in a joint project with the Census Bureau;
--When an individual is employed by an agency/organization performing a service for the Census Bureau under contract or providing information to the Census Bureau for statistical purposes;
--When Federal law requires an individual to audit, inspect, or investigate Census Bureau activities.

56 How and When Do I Get Started?
--See materials at the RMRDC website, the CES website, and the NCHS website
--Contact the RMRDC Director and Administrator for:
  --data availability
  --project budget and timeline
  --contact information
--For Census projects, the Administrator will:
  --give invaluable guidance on the proposal development process and the benefits to Census (PPS)
  --help navigate the project approval process

57 Contact Information: Katie Genadek: Jani Little:

