Secure Data Laboratories: The U.S. Census Bureau Model

Steven Ruggles, University of Minnesota

Why are secure data laboratories needed?
- Greater geographic detail is needed for multi-level modeling, spatial analysis, and studies of spatial segregation
- Very large samples (over 10% coverage) and complete-count microdata offer new research opportunities
- Adding geographic detail and increasing sample sizes raise new confidentiality concerns

Existing Models:
- German Research Data Centres
- Statistics Canada Research Data Centers
- Census Bureau Research Data Centers
- Key limitation: each holds data for only one country, making comparative research impossible

Emerging standards:
- Data Sharing for Demographic Research Project, Inter-university Consortium for Political and Social Research
- Eurostat initiative: all statistical agencies are mandated to develop secure data laboratories

Census Bureau Research Data Centers
- The U.S. Census Bureau made census microdata available to researchers in 1964 through the anonymized Public Use Samples
- It was impossible to anonymize the census of business
- The original RDC was established in 1982 by the Census Bureau Center for Economic Studies to provide access to microdata on firms

The RDC Concept
- An office with multiple computers
- Staffed by a Census Bureau employee
- Computer-driven remote data access
- Meets physical and computer security requirements for restricted access
- Researchers must undergo a background check and obtain Special Sworn Status to use restricted data
- Researchers are not permitted to remove anything from the RDC before it passes a disclosure avoidance review

Census RDC Remote Branches
- Boston (NBER) 1994
- Carnegie-Mellon 1996-2004
- UC Berkeley 1999
- UCLA 1999
- Research Triangle (Duke, North Carolina) 2000
- Michigan 2002
- Chicago 2002
- New York Cornell 2004
- New York Baruch 2006
- Minnesota 2009

Census RDCs
- Coming soon: Minneapolis

Census Bureau and RDC partners:
- Establish physically secure offices and secure computer systems
- Choose projects that use the data appropriately, benefit Census Bureau programs, and present low disclosure risks
- Impart to researchers at the RDC the Census Bureau “culture of confidentiality”
- Establish policies and procedures that protect confidentiality in the RDC office
- Release only research output that does not reveal confidential information

Each RDC has a security plan.
- Locked office with badges, key cards, keypads, etc.
- Access limited to researchers with Special Sworn Status (SSS) carrying out active, approved projects at the RDC. These researchers must:
  - Sign written active project agreements
  - Obtain security clearance
  - Sign the Census Bureau’s standard sworn agreement to preserve the confidentiality of the data
  - Receive awareness training
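The access conditions on this slide can be read as a simple checklist. The sketch below is purely illustrative: the `Researcher` class, its field names, and the `missing_requirements` helper are hypothetical and do not describe any actual Census Bureau system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Researcher:
    """Hypothetical record of the conditions listed on the slide."""
    name: str
    has_special_sworn_status: bool
    has_active_approved_project: bool
    signed_project_agreement: bool
    has_security_clearance: bool
    signed_confidentiality_agreement: bool
    completed_awareness_training: bool


def missing_requirements(r: Researcher) -> List[str]:
    """Return the labels of every condition the researcher has not yet met."""
    checks = {
        "Special Sworn Status": r.has_special_sworn_status,
        "active, approved project": r.has_active_approved_project,
        "written project agreement": r.signed_project_agreement,
        "security clearance": r.has_security_clearance,
        "sworn confidentiality agreement": r.signed_confidentiality_agreement,
        "awareness training": r.completed_awareness_training,
    }
    return [label for label, ok in checks.items() if not ok]


def may_enter(r: Researcher) -> bool:
    """Access is granted only when no condition is outstanding."""
    return not missing_requirements(r)
```

Returning the list of unmet conditions, rather than a bare yes or no, lets an administrator tell a researcher exactly what is still outstanding.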

Census employee (the RDC administrator) stationed at each RDC.
- Instills the Census Bureau's “culture of confidentiality” in the researchers
- Trains the researchers regarding the security and confidentiality restrictions
- Carries out disclosure analysis on any research output a researcher wishes to remove from the secure facilities
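One common ingredient of disclosure analysis is a minimum cell-size check on tabular output. The sketch below illustrates that general idea only; it is not the Census Bureau's actual procedure. The threshold `MIN_CELL_COUNT` and all function names are assumptions chosen for the example.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional

# Hypothetical threshold chosen for illustration; not an official rule.
MIN_CELL_COUNT = 10


def tabulate(records: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Count how many records fall into each cell of a frequency table."""
    return dict(Counter(records))


def passes_review(table: Dict[Hashable, int],
                  min_count: int = MIN_CELL_COUNT) -> bool:
    """True only if every cell meets the minimum count."""
    return all(n >= min_count for n in table.values())


def suppress_small_cells(table: Dict[Hashable, int],
                         min_count: int = MIN_CELL_COUNT
                         ) -> Dict[Hashable, Optional[int]]:
    """Replace counts below the threshold with None (primary suppression)."""
    return {cell: (n if n >= min_count else None)
            for cell, n in table.items()}


if __name__ == "__main__":
    # Made-up cell labels: 12 records in one tract, 3 in another.
    table = tabulate(["tract A"] * 12 + ["tract B"] * 3)
    print(passes_review(table))
    print(suppress_small_cells(table))
```

This covers primary suppression only. A real review would also have to consider whether suppressed values can be recovered from row or column totals, or by comparing several released tables.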

Thin client computing environment
- Data stored on secure Unix servers at Census Bureau headquarters (Bowie, MD)
- No confidential data stored at the RDCs
- RDCs connected to servers via dedicated T-1 lines
- Researchers use X-terminals (“thin clients” with no local data storage) to access the data authorized for their projects
- Researchers are accountable for their computer use through passwords and system logs

The rules: researchers
- May not upload or download anything to or from the thin client servers (there is no physical way to do it)
- Have no access to any non-Census Bureau network (including the Internet) from within the RDC facility
- May not bring laptop computers or other portable mass storage devices into the RDC facility

Demographic and Health Data in the RDCs
- Historical focus on “economic” data
- Requests for “demographic” data
- Higher geographical resolution
- Denser samples and complete-count microdata
- Obtained permission to provide access to demographic data in RDCs in 1997
- IPUMS is working with Census to reconstruct complete (100%) census microdata from 1960-2000+ for RDCs
- RDCs will soon include major collections of U.S. health data as well

The importance of high-density census microdata with fine geographic detail
- This is a completely new source with the potential to provide unprecedented insight into residential segregation and the influence of local conditions on behavior.
- Analysts of small areas have never had access to microdata, and have been forced to use crude aggregate tabulations that are often incompatible across time and across national boundaries.
- As a new kind of data, complete-count microdata will stimulate entirely new methods of analysis.

Limitations of the Data Laboratory Model
- Access is highly restricted, cumbersome, and expensive
- The U.S. experience: just a dozen research projects have used censuses in RDCs, while more than 10,000 projects have used public-use census microdata, the most widely used data source in the social sciences
- Analysis across national boundaries is essential, and RDCs currently operated by the Census Bureau and the statistical agencies of Germany and Canada cannot meet this need
- The Data Sharing for Demographic Research (DSDR) program at the ICPSR has been charged with developing a set of standards for data enclaves

Conclusion
- Restricted data enclaves cannot replace public use data, since they prevent access for most researchers.
- This strategy, however, does provide the possibility for researchers with compelling needs to gain access to highly confidential data with virtually no risk of disclosure.
- To allow analyses that cross national boundaries, we must develop secure data laboratories that are not tied to specific national statistical agencies, but which allow access to data from many countries.
- Existing RDCs provide a valuable model.