Secure Data Laboratories: The U.S. Census Bureau Model

1 Secure Data Laboratories: The U.S. Census Bureau Model
Steven Ruggles, University of Minnesota

2 Why are secure data laboratories needed?
Greater geographic detail is needed for multi-level modeling, spatial analysis, and studies of spatial segregation
Very large samples (over 10% coverage) and complete-count microdata offer new research opportunities
Adding geographic detail and increasing sample sizes raise new confidentiality concerns

3 Existing Models
German Research Data Centres
Statistics Canada Research Data Centers
Census Bureau Research Data Centers
Key limitation: each holds data for only one country, making comparative research impossible

4 Emerging standards
Data Sharing for Demographic Research Project, Inter-university Consortium for Political and Social Research
Eurostat initiative: all statistical agencies are mandated to develop secure data laboratories

5 Census Bureau Research Data Centers
The U.S. Census Bureau made census microdata available to researchers in 1964 through the anonymized Public Use Samples
It was impossible to anonymize the census of business
The original RDC was established in 1982 by the Census Bureau Center for Economic Studies to provide access to microdata on firms

6 The RDC Concept
An office with multiple computers
Staffed by a Census Bureau employee
Computer-driven remote data access
Meets physical and computer security requirements for restricted access
Researchers must undergo a background check and obtain Special Sworn Status to use restricted data
Researchers are not permitted to remove anything from the RDC before it passes a disclosure avoidance review

7 Census RDC Remote Branches
Boston (NBER), 1994
Carnegie-Mellon
UC Berkeley, 1999
UCLA, 1999
Research Triangle (Duke, North Carolina), 2000
Michigan, 2002
Chicago, 2002
New York Cornell, 2004
New York Baruch, 2006
Minnesota, 2009

8 Census RDCs
Coming soon: Minneapolis

9 Census Bureau and RDC partners:
Establish physically secure offices and secure computer systems;
Choose projects that use the data appropriately, benefit Census Bureau programs, and present low disclosure risks;
Impart to researchers at the RDC the Census Bureau “culture of confidentiality”;
Establish policies and procedures that protect confidentiality in the RDC office;
Release only research output that does not reveal confidential information.

10 Each RDC has a security plan.
Locked office with badges, key cards, keypads, etc.
Access limited to researchers with Special Sworn Status (SSS) carrying out active, approved projects at the RDC. These researchers must:
  Sign written active project agreements
  Obtain security clearance
  Sign the Census Bureau’s standard sworn agreement to preserve the confidentiality of the data
  Receive awareness training

11 Census employee (the RDC administrator) stationed at each RDC.
Instills the Census Bureau's “culture of confidentiality” in the researchers
Trains the researchers on the security and confidentiality restrictions
Carries out disclosure analysis on any research output a researcher wishes to remove from the secure facilities
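
The slides do not say how the disclosure analysis is performed. As an illustration only, one common family of checks flags table cells with very small counts before output is released. The sketch below assumes a hypothetical minimum cell count of 10 and hypothetical function names; it is not the Census Bureau's actual procedure.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[str, str]  # e.g. (geographic area, category)

# Hypothetical threshold chosen for illustration; the slides state no actual rule.
MIN_CELL_COUNT = 10


def flag_small_cells(table: Dict[Cell, int],
                     min_count: int = MIN_CELL_COUNT) -> List[Cell]:
    """Return, in sorted order, cells whose count is positive but below min_count."""
    return sorted(cell for cell, n in table.items() if 0 < n < min_count)


def suppress_small_cells(table: Dict[Cell, int],
                         min_count: int = MIN_CELL_COUNT) -> Dict[Cell, Optional[int]]:
    """Return a copy of the table with small positive counts replaced by None.

    Zero counts are left unchanged here; whether zeros need protection
    is a policy decision this sketch does not model.
    """
    return {cell: (None if 0 < n < min_count else n)
            for cell, n in table.items()}


if __name__ == "__main__":
    # Made-up counts, not real data.
    counts = {
        ("tract A", "owner"): 120,
        ("tract A", "renter"): 4,
        ("tract B", "owner"): 0,
        ("tract B", "renter"): 37,
    }
    print(flag_small_cells(counts))
    print(suppress_small_cells(counts))
```

A reviewer could run a check like this on each requested table and hold back any output with flagged cells. A real review would also need to consider complementary disclosure across related tables, which this sketch ignores.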

12 Thin client computing environment
Data stored on secure Unix servers at Census Bureau headquarters (Bowie, MD).
No confidential data stored at the RDCs.
RDCs connected to servers via dedicated T-1 lines.
Researchers use X-terminals (“thin clients” with no local data storage) to access the data authorized for their projects.
Researchers are accountable for their computer use, through the use of passwords and system logs.

13 The rules
Researchers may not upload or download anything to thin client servers (there is no physical way to do it).
Researchers have no access to any non-Census Bureau network (including the Internet) from within the RDC facility.
Researchers may not bring laptop computers or other portable mass storage devices into the RDC facility.

14 Demographic and Health Data In the RDCs
Historical focus on “economic” data
Requests for “demographic” data
Higher geographical resolution
Denser samples and complete-count microdata
Obtained permission to provide access to demographic data in RDCs in 1997
IPUMS is working with Census to reconstruct complete (100%) census microdata from for RDCs
RDCs will soon include major collections of U.S. health data as well

15 The importance of high-density census microdata with fine geographic detail
This is a completely new source with the potential to provide unprecedented insight into residential segregation and the influence of local conditions on behavior. Analysts of small areas have never had access to microdata, and have been forced to use crude aggregate tabulations that are often incompatible across time and across national boundaries. As a new kind of data, complete-count microdata will stimulate entirely new methods of analysis.

16 Limitations of the Data Laboratory Model
Access is highly restricted, cumbersome, and expensive
The U.S. experience: just a dozen research projects have used censuses in RDCs, while more than 10,000 projects have used public-use census microdata, the most widely used data source in the social sciences
Analysis across national boundaries is essential, and RDCs currently operated by the Census Bureau and the statistical agencies of Germany and Canada cannot meet this need
The Data Sharing for Demographic Research (DSDR) program at the ICPSR has been charged with developing a set of standards for data enclaves

17 Conclusion
Restricted data enclaves cannot replace public use data, since they prevent access for most researchers.
This strategy, however, does provide the possibility for researchers with compelling needs to gain access to highly confidential data with virtually no risk of disclosure.
To allow analyses that cross national boundaries, we must develop secure data laboratories that are not tied to specific national statistical agencies, but which allow access to data from many countries.
Existing RDCs provide a valuable model.
