Non-Public Data in the California Census Research Data Center
The Basics
What is an RDC?
What data are available in the RDC?
What kinds of research can be done with RDC resources?
What is the process for getting access to RDC data?
CCRDC: California Census Research Data Center, Berkeley
The CCRDC is a joint project of the U.S. Bureau of the Census and the University of California, Berkeley (and UCLA) to enable qualified researchers with approved projects to access confidential, unpublished Census Bureau data.
CCRDC on the web:
Purpose of Census Research Data Centers
Access to non-public use data
Secure facility
Presence of Census Bureau employee
Benefits to Census Bureau
–Necessary for access to Title 13 and Title 26 data
–Not required for NCHS, AHRQ data if not linked to Title 13 data
Where are the RDCs?
Washington, DC (1983)
–Center for Economic Studies, U.S. Census
Boston, Mass. (1994)
UCLA and Berkeley (1999) / Stanford (2010)
Research Triangle, NC (Duke) (2000)
Chicago, Illinois (2002)
Ann Arbor, Michigan (2002)
Baruch (NYC, 2006) and Ithaca (Cornell, 2004)
Minnesota (2010)
Why do we need RDCs? (Why is access to microdata restricted?)
Perceptions of improper use could
–Reduce response rates
–Induce Congress to cut funding/programs
Title 13 U.S.C. protects confidentiality
–Identifying microdata cannot be released
–Only Census employees/temporary staff can look at individually identifiable data
–Access must provide legitimate benefits to Census Bureau programs
Why use CCRDC data?
Not available elsewhere
–Establishment-level business data
–Linked household-firm (LEHD) data
More detail than anywhere else
–Detailed geo-spatial variables
–Virtually no top or bottom coding
–Possible to link to other non-Census data
High-quality sampling frames
Extensibility
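To illustrate why "virtually no top or bottom coding" matters, here is a minimal sketch of what top coding does to a summary statistic. The income values and the cap are invented for illustration and do not come from any Census file.

```python
def top_code(values, cap):
    """Replace every value above the cap with the cap itself (top coding)."""
    return [min(v, cap) for v in values]

# Hypothetical incomes and a hypothetical public-use cap.
incomes = [20_000, 45_000, 80_000, 250_000, 1_200_000]
cap = 150_000

coded = top_code(incomes, cap)

# The top-coded mean understates the true mean because the upper tail is censored.
mean_raw = sum(incomes) / len(incomes)
mean_coded = sum(coded) / len(coded)

print(f"mean of raw values:       {mean_raw:,.0f}")
print(f"mean of top-coded values: {mean_coded:,.0f}")
```

Any analysis that depends on the tails of a distribution (inequality measures, high earners, very large establishments) is affected in the same way, which is one reason researchers seek uncensored microdata.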
Access and Disclosure Issues
All researchers must be Census Bureau employees or have Special Sworn Status
–Fingerprints, security forms, penalties
Projects must show
–Benefits to Bureau
–Scientific merit
–Feasibility
–Need for non-public use data
–Minimal risk of disclosure
All output goes through disclosure review process (interim and final outputs)
–Statistical output: Yes
–Tabular output: No
Key Demographic Surveys & Censuses
Decennial Census of Population and Housing
American Community Survey
Current Population Survey
Survey of Income and Program Participation
American Housing Survey
National Longitudinal Survey
Economic datasets
Annual Capital Expenditures Survey (ACE) / Information and Communication Technology (ICT)
Annual Survey of Manufactures
Assets and Expenditures Survey
Auxiliary Establishment
Business Expenditures Survey
Census of Construction Industries
Census of Finance, Insurance, and Real Estate
Census of Manufactures
Census of Mining
Census of Retail Trade
Census of Services
Census of Transportation, Communications, and Utilities
Census of Wholesale Trade
Commodity Flow Survey
Compustat-SSEL Bridge
Enterprise Summary Report
Economic datasets
Exporter Database
Foreign Trade Data - Export
Foreign Trade Data - Import
Large Company Survey
Longitudinal Business Database
Manufacturing Energy Consumption Survey
Medical Expenditure Panel Survey - Insurance Component
National Employer Survey
Owner Change Database
Quarterly Financial Report
Standard Statistical Establishment List
Survey of Industrial Research and Development
Survey of Manufacturing Technology
Survey of Plant Capacity Utilization
Survey of Pollution Abatement Costs and Expenditures
Longitudinal Business Database
Longitudinally linked Business Censuses
–All non-farm establishments with paid employees in (almost) all industries
24 million unique establishments
Excludes airlines, agriculture, RR
–Every five years from
Manufacturing Census available from
Annual Survey of Manufactures includes all large firms
Longitudinal Business Database
LBD includes
–Payroll
–Employment
–Ownership
–Detailed geographic information
–Industry at 6-digit NAICS (more detail in some cases)
–Other variables available (e.g. sales), but coverage varies across sectors
LBD draws on economic censuses
Census of Manufactures
Census of Services
Census of Mining
Census of Retail Trade
Census of Wholesale Trade
Census of Transportation, Communications and Utilities
–All of these Censuses are available in full, and can be linked over time using the LBD
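As a rough illustration of what "linked over time" means in practice, the sketch below groups records from different census years by a persistent establishment identifier and computes employment changes between consecutive observations. The identifier values, years, and employment counts are all hypothetical; the actual LBD file layout and variable names are not described in these slides.

```python
from collections import defaultdict

# Hypothetical records: (establishment id, census year, employment).
records = [
    ("E001", 1997, 40),
    ("E001", 2002, 55),
    ("E002", 1997, 12),
    ("E003", 2002, 8),
    ("E001", 2007, 61),
]

def build_panel(rows):
    """Group rows by establishment id and sort each history by year."""
    panel = defaultdict(list)
    for est_id, year, employment in rows:
        panel[est_id].append((year, employment))
    return {est_id: sorted(history) for est_id, history in panel.items()}

def employment_changes(history):
    """Return (start_year, end_year, change) for consecutive observations."""
    return [
        (y0, y1, e1 - e0)
        for (y0, e0), (y1, e1) in zip(history, history[1:])
    ]

panel = build_panel(records)
changes = {est_id: employment_changes(h) for est_id, h in panel.items()}

for est_id in sorted(changes):
    print(est_id, panel[est_id], changes[est_id])
```

Establishments observed in only one year (E002 and E003 here) produce an empty list of changes, which is how entry and exit would show up in a panel built this way.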
Employer-Employee Linked Datasets
LEHD: Longitudinal Employer-Household Dynamics
–4 million persons linked to 1 million establishments
–Quarterly data on employment and wages from state unemployment insurance agencies
Contains basic demographic data for all employees
Establishments linked to the LBD
49/50 states are currently participating
Other Firm-level Datasets
Survey of Manufacturing Technology
Quarterly Financial Report
–US mining, manufacturing and transportation businesses
Survey of Plant Capacity Utilization
Capital Expenditure Survey
Compustat-LBD Bridge
National Employer Survey
Survey of Pollution Abatement Costs and Expenditures
Manufacturing Energy Consumption Survey
National Center for Health Statistics
We are now hosting research using confidential NCHS and AHRQ data in the CCRDC
Rules for access and disclosure are the same as those in their enclaves
–No requirement to demonstrate Census benefit
–Long list of datasets, including NHIS, NHANES, NSFG, LSOA…
New Data: National Center for Health Statistics
National Health and Nutrition Examination Survey (NHANES): NHANES combines interviews and physical examinations to assess the health and nutritional status of adults and children in the United States.
National Health Care Surveys (NHCS): A family of provider-based surveys that provide reliable information about health care providers, services, and patients.
National Health Interview Survey (NHIS): The NHIS collects data on a broad range of health topics through personal health interviews conducted in the home.
National Vital Statistics System (NVSS): NVSS works with state vital registration systems to compile data on births, deaths, marriages, divorces, and fetal deaths.
RDC Research Environment
“Thin Client” computing
–Servers in Maryland, accessed via remote terminals
–Standard statistical software (SAS, Stata, Gauss, Matlab, etc.)
–Standard datasets kept on servers
–Other software/data coordinated by Administrator/CES staff
Secure environment
–Restricted and monitored keycard access
–No visitors
–No laptops, internet
–Printing limited, RDC Administrator
Virtual RDC at Cornell (synthetic data, zero-obs files)
Contact Information
RDC web site:
RDC phone: (510)
RDC administrator: Angela Andrus
RDC executive director: Jon Stiles
CES: