New Data in the Federal Statistical Research Data Centers Melissa Ruby Banzhaf, PhD Administrator, ARDC Center for Economic Studies U.S. Census Bureau.


1 New Data in the Federal Statistical Research Data Centers Melissa Ruby Banzhaf, PhD Administrator, ARDC Center for Economic Studies U.S. Census Bureau October 9, 2015

2 Overview  Background on Federal Statistical RDCs  Types of Data Available in the RDCs (Emphasis on New Data)  How to Obtain Access to these New Data (and other data) in the RDCs

3 What are Federal Statistical Research Data Centers (RDCs)?  Secure computing labs where qualified researchers conduct approved statistical analysis on non-public data.  These data are collected by various government agencies (Census Bureau, NCHS, AHRQ, SSA, and more to come).  Established through an agreement between federal statistical agencies and a local research community.  Managed by the Census Bureau.

4 Federal Statistical Research Data Center Locations

5 The Atlanta Research Data Center  Located in the Federal Reserve Bank of Atlanta  corner of 10th & Peachtree  Consortium Members  Emory University  University of Georgia  Georgia State University  Clemson University  Federal Reserve Bank of Atlanta  University of Alabama at Birmingham  University of Tennessee – Knoxville  Florida State University  Georgia Institute of Technology

6 Types of Restricted Data Available  Economic Data  Microdata on firms and establishments  Business Register data  Demographic Data  Survey data on individuals and households  Administrative data on individuals  Linked survey and administrative datasets  Employer-Employee Jobs Data (LEHD)  Data on employees linked with data on employers  Health Data  National Center for Health Statistics  Agency for Healthcare Research & Quality

7 Advantages of Restricted Data  Vast number of business datasets that are not publicly available at the micro level  Census datasets can be linked together  Census datasets can be linked to external data  More detailed level of geographic identifiers  Very little top or bottom-coding

8 Economic Datasets  Annual Survey of Manufactures  Census of Construction  Census of Finance and Insurance  Census of Manufactures  Census of Mining  Census of Real Estate  Census of Retail  Census of Services  Census of Transportation  Census of Wholesale  Survey of Business Owners  Commodity Flow Survey  Import and Export Transactions  Annual Capital Expenditures Survey  Business Register (SSEL)  Longitudinal Business Database  Manufacturing Energy Consumption Survey  Medical Expenditure Panel Survey, Insurance Component  National Employer Survey  Pollution Abatement Costs and Expenditures  Quarterly Financial Reports  Research and Development Survey  Survey of Manufacturing Technology  Annual Retail/Wholesale Trade Surveys  Kauffman Firm Survey

9 New Data – Management and Organizational Practices Survey  Supplement to the 2010 Annual Survey of Manufactures  Goal: Collect information on establishment’s use of structured management practices  36 questions:  16 Management (monitoring, targets, and incentives)  13 Organization (who makes decisions, data in decision-making)  7 background (number of managers/non-managers, union status)  Permits analysis of relationship between management practices and key economic outcomes (e.g., productivity)

10 Demographic Datasets - Survey  Decennial Surveys (1950-2010)  American Community Survey  Current Population Survey  Survey of Income and Program Participation  American Housing Survey  National Survey of College Graduates  National Crime Victimization Survey

11 New Data - Decennial  1950 – 1% PUMS sample  Geography: Census tract but lowest level is enumeration district (roughly 600 people)  1960 – 25% sample (densest ever)  Geography: Census tract and other sub-county geographies (Census place) but lowest level is enumeration district (roughly 600 people)  Harmonized coding across 1950 and 1960

12 New Data – Current Population Survey  CPS Basic Monthly Data (2000-2014)  CPS Food Security Supplement (2001-2012)  CPS Voting and Registration Supplement (2006, 2008, 2010, 2012)  CPS Fertility Supplement (1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012)

13 New Data – Current Population Survey  Characteristics of Internal Files:  Geography: Census Tract  March CPS is only file that has PIKs  Has CPS identification key so may be able to link across CPS surveys.  Some limitations on types of analysis permitted by BLS.

14 New Data – National Crime Victimization Survey  National survey of households (2006-2012)  Collects information on frequency, characteristics, and consequences of criminal victimization (sexual assault, robbery, burglary, motor vehicle theft, etc.)  New: Police-Public Contact Survey (2011) – Collects information on perceptions of police behavior and response during encounters.

15 New Data – National Survey of College Graduates  Biennial survey collects information (such as occupation, work activities, salary, relationship between degree field and occupation) on college-educated individuals with particular emphasis on those in science and engineering fields.  2010 currently available  Geography at state level  Currently no PIKs

16 Demographic Datasets - Administrative  Census Numident File (SSA)  Housing Datasets (HUD):  Public and Indian Housing Information Center Dataset  Tenant Rental Assistance Certification Systems dataset  Computerized Homes Underwriting Management System

17 Demographic - Administrative Continued  Medicare/Medicaid Datasets (CMS):  Medicare Enrollment Database  Medicaid Statistical Information System

18 Administrative – Census Numident  Data derived from applications for Social Security Numbers  Contains data on:  Birthdate  Town or county of birth  Gender  Race  Citizenship  Date of death  PIKs

19 Administrative - Housing  Public and Indian Housing Information Dataset  Contains information on all members of HH with a participant in a covered program:  Housing Choice Voucher  Public Housing  Indian Housing  Includes age, race, sex, rent, household income, PIK  Geography: block level

20 Administrative - Housing  Tenant Rental Assistance Certification Systems (TRACS) dataset  Contains information on all members of HH with a participant in a covered program.  These programs provide rental assistance for participants living in privately-owned, subsidized housing.  Includes age, race, sex, rent, household income, PIK  Geography: block level

21 Administrative - Housing  Computerized Homes Underwriting Management System (CHUMS)  Contains records on approved mortgage applications insured by Federal Housing Administration (FHA)  Contains information on borrowers and co-borrowers including income, housing value, mortgage, demographic characteristics, PIKs  Geography: block level

22 Administrative - CMS  Medicare Enrollment Database (1999-2014)  Information on all Medicare beneficiaries  Limited to information on people not claims: eligibility dates and statuses, residence change dates, basic demographic information, PIKs  Geography: block level

23 Administrative - CMS  Medicaid Statistical Information System (2000-2013)  Information on all Medicaid and CHIP enrollees in each month  Limited to information on people, not claims: eligibility dates and statuses, basic demographic information, PIKs  Geography: zip code level

24 Demographic Datasets: Linked Survey-Administrative  Current Population Survey - SSA Earnings Files  Survey of Income and Program Participation – SSA Earnings Files  National Longitudinal Mortality Study

25 Linked: SSA Files with CPS and SIPP  CPS and SIPP Survey Data matched to SSA earnings files by PIK  SSA records include:  Detailed Earnings Record – earnings from FICA, non-FICA, and self-employment income (1978+) from Master File  Summary Earnings Record – all earnings for each year from 1951 to present  Master Beneficiary Record – contains information (entitlement and payment data) on Social Security Recipients (including Disability).  831 Disability File – determines medical eligibility for Disability Insurance, and SSI benefits.
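The PIK-based match described on slide 25 can be pictured as a keyed join between survey respondents and administrative earnings rows. The sketch below is only an illustration under stated assumptions: all records are synthetic, and the field names (`pik`, `reported_earnings`, `earnings`) and the helper `link_by_pik` are hypothetical, not the layout of any actual CPS, SIPP, or SSA file. It assumes that a respondent without a PIK (Protected Identification Key) simply goes unmatched.

```python
# Synthetic records only; field names are hypothetical, not real file layouts.
survey = [
    {"pik": "A001", "age": 34, "reported_earnings": 41000},
    {"pik": "A002", "age": 51, "reported_earnings": 58000},
    {"pik": None, "age": 27, "reported_earnings": 23000},  # no PIK assigned
]
admin_earnings = [
    {"pik": "A001", "year": 2010, "earnings": 39500},
    {"pik": "A001", "year": 2011, "earnings": 42250},
    {"pik": "A003", "year": 2011, "earnings": 61000},  # person not in survey
]


def link_by_pik(survey_rows, admin_rows):
    """Left-join survey respondents to administrative rows on PIK."""
    # Index administrative rows by PIK so each lookup is a dictionary access.
    by_pik = {}
    for row in admin_rows:
        by_pik.setdefault(row["pik"], []).append(row)

    linked = []
    for person in survey_rows:
        # Respondents without a PIK cannot be matched and keep an empty list.
        matches = by_pik.get(person["pik"], []) if person["pik"] else []
        linked.append({**person, "admin_records": matches})
    return linked


linked = link_by_pik(survey, admin_earnings)

# Share of survey respondents with at least one administrative record.
match_rate = sum(1 for r in linked if r["admin_records"]) / len(linked)
```

Keeping unmatched respondents in the output (a left join) makes the match rate easy to compute, which matters because analyses of linked files usually need to account for who could not be linked.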

26 Linked: National Longitudinal Mortality Study  Purpose of database: to study the effects of demographic and socio-economic characteristics on mortality  Survey data: March CPS, 1980 Decennial Census (sample)  Administrative data: Death Certificate information from National Death Index (through 2011)  Geography: county level

27 LEHD  “Tracks” a person based on their place of employment; essentially links employees with employers  Based on unemployment insurance administrative records  Available on a state-by-state basis  Quarterly data starting in 1990 – currently through 2011  Can link employer to employer data in other Census datasets  Can link employee to data on individuals in other Census datasets  New Variables: Firm age and size, Firm ID that matches Business Register

28 New Data – Innovation Measurement Initiative  Goal: Improve measurement of innovation resulting from research grants, a small but important sector of the economy.  How: Integrate university data on federally funded research grants with Census Bureau data on people and businesses.  Specifically link:  Employee, vendor, sub-award transactions to the Census Business Register and LEHD (employee-employer database).  Innovation outcomes: Job placements, start-up activity and business dynamics, vendor characteristics

29 New Data – Innovation Measurement Initiative  Partnership between Census and Institute on Research in Innovation and Science (IRIS) at the University of Michigan  Member institutions of IRIS provide data to Census and in turn receive:  Individual and collective reports  Underlying tables and graphics for institution’s use  Access to aggregate data for researchers  Input on new product design

30 New Data – IMI Opportunity  Census is asking for nominations of teams of 2-5 researchers (at least one member with SSS) to assist in enhancing and documenting data for the IMI project.  What is in it for you?  Opportunity to do research on new data.  $25K in funding support for 1 graduate student.  Initial deadline for nominations: October 16

31 Health Data in the ARDC  These data are collected by:  National Center for Health Statistics (NCHS)  Agency for Healthcare Research and Quality (AHRQ)

32 What types of NCHS data?  National Health Status Surveys  National Health and Nutrition Examination Survey (NHANES) I, II, and III  National Health Interview Survey (NHIS)  Longitudinal Study on Aging I and II (LSOA)  National Survey of Family Growth  National Survey of Children's Health  National Survey of Early Childhood Health  National Survey of Children with Special Health Care Needs  National Asthma Survey  National Health Care Surveys  National Ambulatory Medical Care Survey  National Hospital Ambulatory Medical Care Survey  National Survey of Ambulatory Surgery  National Hospital Discharge Survey  National Nursing Home Survey (NNHS)  National Home and Hospice Care Survey  National Employer Health Insurance Survey  National Health Provider Inventory  National Immunization Survey  Vital Statistics  Mortality and Multiple Mortality  Birth  Fetal Death  National Death Index  Marriage and Divorce

33 What types of NCHS data? Linked Data Sets  Linked mortality data: NHIS, NHANES, LSOA II, NNHS  Linked Medicare Enrollment and Claims data: NHIS, NHANES, LSOA II  Linked Social Security Administration Data: NHIS, NHANES, LSOA II, NNHS  Linked EPA data

34 What types of AHRQ Data?  Medical Expenditure Panel Survey (MEPS) files include:  Household Component  Provider Component  Insurance/Employer Component  Nursing Home Component (1996 only)  Area Resource File  Two-year two panel file  MEPS-NHIS linked data  Only Household Component and portions of Provider Component are publicly available

35 How to Access the RDC  Develop proposal  Different guidelines for Census data vs. NCHS/AHRQ data  Submit proposal for agency review  Census (and agency sponsors)  NCHS/AHRQ  Obtain Special Sworn Status (SSS)  Pay one-time fee for NCHS/AHRQ data

36 Timeframe – “Patience is a Virtue”  Census Data  Plan on 6 to 9 months before working in lab  Census approval/ Other Agency Approval  NCHS/AHRQ Data  Timeframe dependent on agency approval process  Census approval NOT required  Special Sworn Status  3 to 4 months for your security clearance

37 Working in the ARDC lab  All analysis conducted in the ARDC lab  Data located on server in Maryland  Access data via thin client terminals  No internet access or personal computers allowed in lab  Statistical software available: SAS, Stata, R, Matlab, GIS, Sudaan, etc.  Agency reviews output before releasing  Penalty for disclosure (inadvertent or otherwise) is $250,000 and/or 5 years in prison

38 Upcoming RDC-Related Events  Cornell University Course – INFO 7470 – Understanding Social and Economic Data  Can be connected via distance learning (and get course credit)  Intended for Ph.D. students and faculty who use large-scale restricted-access data from government suppliers  Emphasis on data accessible through the RDC network  Interested? Contact us for more information.

39 Contact Information  People:  Melissa Ruby Banzhaf, ARDC Administrator, melissa.r.banzhaf@census.gov, 404-498-7538  Julie L. Hotchkiss, ARDC Executive Director, Julie.l.hotchkiss@atl.frb.org, 404-498-8198  Resources:  ARDC website: atlantardc.org  Quarterly ARDC Newsletter (email us to get on list)

