Are Public Use (Micro) Data a Thing of the Past? John M. Abowd Cornell University US Census Bureau Prepared for IASSIST 2002.

Presentation transcript:

Yes …
If “public use” means distributed without any restrictions on the user
If “micro data” means the actual responses of the sampled entities

Is this Heresy?
No. Protecting the confidentiality of the respondent’s identity and data has become harder for any data provider at the same rate as computation and data access have become easier.
This is just “Moore’s Law” as applied to the provision of data.

We Saw This Coming
Data providers have almost never provided public use micro data for samples of businesses.
The edits imposed on public use micro data from households have become increasingly severe.
Data providers with legal alternatives to public use releases have increasingly opted for licensing, restricted access and other access protocols.

Can Scientific Inquiry Survive?
Yes, provided the researchers and the archivists participate in the evolution of social data publication.
A more nuanced understanding of what constitutes “public use (micro) data” can protect both the confidentiality of the respondents’ information and the integrity of the research analysis.

Example: American FactFinder
The public use product is an interface between the micro data (and the detailed summary data) and the user.
The confidentiality protection is provided as a part of the interface.
Advantage: the researcher can design the analysis (so, this is a public use micro data product).
Disadvantage: only analyses that can be handled by the confidentiality protection system are allowed.
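
The interface idea can be illustrated with a minimal sketch. It assumes a simple minimum-cell-size rule as the confidentiality protection; that rule, the threshold, and the names `protected_tabulate` and `MIN_CELL_SIZE` are hypothetical choices made for illustration and are not taken from American FactFinder. The user chooses the tabulation, but never sees the records, and cells the protection rule cannot handle are withheld.

```python
from collections import Counter

MIN_CELL_SIZE = 3  # hypothetical disclosure threshold, chosen for illustration


def protected_tabulate(records, by, min_cell=MIN_CELL_SIZE):
    """Count records per value of `by`, withholding cells below `min_cell`.

    Withheld cells are returned as None, so the user learns that the cell
    exists but not how many respondents fall into it.
    """
    counts = Counter(record[by] for record in records)
    return {key: (n if n >= min_cell else None) for key, n in counts.items()}


# Illustrative micro data; in a real system this stays behind the interface.
records = [
    {"county": "A", "industry": "retail"},
    {"county": "A", "industry": "retail"},
    {"county": "A", "industry": "mining"},
    {"county": "B", "industry": "retail"},
]

by_county = protected_tabulate(records, by="county")
by_industry = protected_tabulate(records, by="industry")
```

County B has a single record, so its count is withheld. This is the disadvantage named on the slide: the analysis is only answered where the protection system allows it.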

Example: New Census Employment Dynamics Estimates
Estimates created by integrating data from employers and employees over time.
Public use products based on a confidentiality protection system that fuzzes all of the underlying micro data.
Advantage: analysis can be performed at levels of geographic or industry detail that would be suppressed by traditional systems.
Disadvantage: some analyses are significantly distorted to protect the confidentiality of the micro data.
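
One generic way to "fuzz" micro data is multiplicative noise infusion, sketched below. This is an assumption made for illustration, not a description of the Census Bureau's actual procedure: the noise range (5% to 15%), the function names, and the sample data are all hypothetical. Every employer's value is distorted before aggregation, so even a cell with one employer can be published.

```python
import random


def draw_fuzz_factor(rng, low=0.05, high=0.15):
    """Return a multiplier that is between `low` and `high` away from 1."""
    delta = rng.uniform(low, high)
    return 1 + delta if rng.random() < 0.5 else 1 - delta


def fuzzed_totals(employers, seed=0):
    """Sum fuzzed employment by county; no true value enters the total."""
    rng = random.Random(seed)
    totals = {}
    for employer in employers:
        fuzzed = draw_fuzz_factor(rng) * employer["employment"]
        totals[employer["county"]] = totals.get(employer["county"], 0.0) + fuzzed
    return totals


# Illustrative confidential micro data (hypothetical values).
employers = [
    {"county": "A", "employment": 120},
    {"county": "A", "employment": 80},
    {"county": "A", "employment": 40},
    {"county": "B", "employment": 100},  # single-employer cell
]

totals = fuzzed_totals(employers)
distortion_b = abs(totals["B"] - 100) / 100
```

County B is published even though it contains one employer, but its value is off by 5% to 15%, which is the trade-off the slide describes. In larger cells the distortions tend to partly offset one another.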

Example: INSEE Researcher Restricted Access
INSEE allows confidential micro data to be placed on secured facilities controlled by the researcher.
Advantage: analysis is performed on the unaltered micro data.
Disadvantage: other researchers must apply for access and create their own secure facility.
In the US, the NCES uses a similar system.

General Principles
Layers of confidentiality protection
“Gold standard” micro data
– Housed in a secure facility with restricted access
Restricted micro data
– Created by statistical manipulation of the confidential micro data
– Suitable for licensed distribution
Public use products
– Confidentiality protection integrated with an analysis engine allowing general research