Characterization and Management of Multiple Components of Cost and Risk in Disclosure Protection for Establishment Surveys Discussion of Advances in Disclosure.

Presentation transcript:

Characterization and Management of Multiple Components of Cost and Risk in Disclosure Protection for Establishment Surveys
Discussion of Advances in Disclosure Protection: Releasing More Business and Farm Data to the Public
John L. Eltinge, Bureau of Labor Statistics
ICES III Session 54 - June 21, 2007

2 Disclaimer and Acknowledgements: The views expressed in this paper are those of the author and do not necessarily reflect the policies of the Bureau of Labor Statistics. The author thanks the authors for the opportunity to review their papers and slides, and Steve Cohen and Larry Ernst for helpful comments on some of the issues considered here.

3 Each paper covers fascinating and important work
Special importance for establishment surveys: Dominating units; unequal selection probabilities; plethora of subpopulations, cells
Many potential topics for discussion
I. Background: Manage Costs and Risks
   A. Disclosure Protection as a Form of Technology
   B. Multiple Stakeholders, Multiple Utility Functions
II. Comments on Individual Papers

4 I. Background: Management of Costs and Risks in Disclosure Protection
   A. View Disclosure Protection as a Form of Technology
      Practical implication: Need to examine
      1. Tangible costs to data users, producers
      2. Risks to data providers, users, producers and other intermediaries
         a. Funding, primary dissemination
         b. Fieldwork, other direct contacts with data providers
         c. Secondary dissemination (U.S. example: states under fed-state programs)

5 B. Multiple Stakeholders and Multiple Utility Functions
      1. Even within a given class, stakeholders may have very different utility functions and risk profiles
         a. Respondents: Publicity-phobic and publicity-philic
         b. Data users:
            - Some consider only published point estimates
            - Some want highly sophisticated inference methods
      2. For disclosure: Utility function of data intruder?
         - May involve very high tolerance for error

6 II. Comments on Individual Papers
   A. Morehart and Towe
      1. Note emphasis on system development
         - Impact of the underlying statistical science depends on the technology for implementation
      2. Sec 1: expand access to farm survey data as a public good
         Costs incurred by users for partial data access:
         - Effort (more sophisticated statistical analysis)
         - Inconvenience (travel to data center)
         - Loss of some data quality, efficiency (added noise, cell suppression)

7     3. Relative weight assigned by NASS to:
         a. Privacy of respondents
         b. Production of standard aggregate reports
         c. Specific tasks in economic research
      4. Low demand for advanced statistical analysis component (fewer than 30 requests in 30 months)
         - Limited general interest, or
         - Reflects high threshold requirement for a large suite of analytic tools?

8  B. Massell and Funk
      1. Added-noise method of Evans, Zayatz and Slanta (1998) has a very strong appeal
         a. Conceptual simplicity
         b. Coherence of resulting point estimates across, e.g., multiple levels of aggregation
      2. Tuning of the added-noise process to specific high-priority sets of problem cells?
      3. Possible extension: Weighted influence function associated with a given observation for a cell-level estimator
         - Impact of unit-level added noise?
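The coherence point in item 1.b can be illustrated with a minimal sketch of unit-level multiplicative noise. This is not the authors' implementation: the function names, the toy records, and the noise range (a magnitude drawn uniformly between 5% and 15%, with a random direction) are all assumptions chosen for illustration. Because each unit receives one multiplier that is applied before any tabulation, the noisy detailed cells add up to the noisy marginal totals.

```python
import math
import random


def assign_multipliers(unit_ids, low=0.05, high=0.15, seed=0):
    """Give each unit one multiplier near 1 (hypothetical noise range)."""
    rng = random.Random(seed)
    multipliers = {}
    for uid in unit_ids:
        direction = rng.choice((-1.0, 1.0))  # perturb up or down
        multipliers[uid] = 1.0 + direction * rng.uniform(low, high)
    return multipliers


def cell_totals(records, multipliers, key):
    """Tabulate noisy values; noise is applied at the unit level."""
    totals = {}
    for rec in records:
        cell = key(rec)
        noisy = rec["value"] * multipliers[rec["id"]]
        totals[cell] = totals.get(cell, 0.0) + noisy
    return totals


# Toy microdata (invented for illustration only).
records = [
    {"id": "A", "state": "IA", "industry": "crops", "value": 900.0},
    {"id": "B", "state": "IA", "industry": "crops", "value": 50.0},
    {"id": "C", "state": "IA", "industry": "dairy", "value": 120.0},
    {"id": "D", "state": "NE", "industry": "crops", "value": 300.0},
    {"id": "E", "state": "NE", "industry": "dairy", "value": 80.0},
]

multipliers = assign_multipliers(sorted(r["id"] for r in records))
by_state_industry = cell_totals(
    records, multipliers, key=lambda r: (r["state"], r["industry"])
)
by_state = cell_totals(records, multipliers, key=lambda r: r["state"])

# Detailed cells sum to the state margins, up to floating-point rounding.
for state, total in by_state.items():
    detail = sum(v for (s, _), v in by_state_industry.items() if s == state)
    assert math.isclose(detail, total)
```

Tuning the noise toward specific problem cells (item 2) would amount to making `low` and `high` depend on the unit, which this sketch does not attempt.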

9     4. Methods in All Three Papers (e.g., Weight Trimming, Added Noise and Controlled Tabular Adjustment) Lead to Examination of:
         a. Realistic inferential goals for disclosure-protected data?
            - Original true distribution (and parameters thereof)?
            - Trimmed version of the original distribution?
            - Unspecified distribution in a neighborhood of the original distribution? (cf. similar comments on outliers by Pat Cantwell)
         b. Resulting magnitude of inferential error induced by disclosure-protection tool, relative to natural sources of error (sampling, nonresponse, measurement)?
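One way to make question b concrete is to compare the variance that added noise contributes to a cell total with the variance from sampling. The sketch below is an assumption-laden illustration, not a method from any of the three papers. It assumes independent unit multipliers m = 1 + d*u, where d is +1 or -1 with equal probability and u is uniform on [low, high], so that Var(m) = (low^2 + low*high + high^2) / 3. The sampling variance is a hypothetical input supplied by the caller, and all function names are invented here.

```python
def multiplier_variance(low, high):
    """Variance of m = 1 + d*u, d = +/-1 equally likely, u ~ Uniform(low, high)."""
    return (low * low + low * high + high * high) / 3.0


def noise_variance(values, low=0.05, high=0.15):
    """Noise-induced variance of an unweighted cell total, given the unit values."""
    return multiplier_variance(low, high) * sum(v * v for v in values)


def noise_to_sampling_ratio(values, sampling_variance, low=0.05, high=0.15):
    """Ratio of noise-induced variance to a caller-supplied sampling variance."""
    if sampling_variance <= 0:
        raise ValueError("sampling_variance must be positive")
    return noise_variance(values, low, high) / sampling_variance


# Two toy cells with the same total: one dominated by a single unit, one balanced.
dominated = [900.0, 50.0, 50.0]
balanced = [1000.0 / 3.0] * 3

# The dominated cell absorbs more noise variance for the same total.
assert noise_variance(dominated) > noise_variance(balanced)
```

Under these assumptions the noise variance grows with the sum of squared unit values, so cells with dominating units, which are common in establishment surveys, carry the largest noise-induced error for a given total.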

10 C. Cox
      1. Careful attention to performance criteria
         a. Link with utility functions of key stakeholders?
         b. Constrained optimization or satisficing: Degree of domination by constraints, optimality criteria?
      2. One stated objective: Analyses on original vs. adjusted data yield comparable results
         a. Emphasize: Many possible analyses
         b. Possible baselines (preservation and sensitivity):
            - Original data
            - Results from fit of loglinear model with moderate number of interactions, possible extramultinomial noise

11    3. Disclosure protection methods often involve complex constrained optimization approaches, but:
         a. Coherence of implied utility function with primary stakeholders' concerns?
         b. Uncertainty in utility functions, constraints?
         c. True uncertainty vs. lack of a priori articulation of utility functions?
      4. Characterization of cells: Sensitive/Not sensitive
         - Consistent with largely deterministic language in legislation and regulations

12 III. Closing Remarks
   A. Disclosure protection as a form of technology
   B. Risk is a fact of life for any technology, including disclosure protection technology
   C. Critical importance of a high level of sophistication regarding
      1. Multiple stakeholders, utility functions and constraints
      2. Mathematical solutions
      3. Technological implementation
   D. Also consider operational risk
      1. Will a given disclosure protection method be carried out as specified?
      2. Distribution of the resulting impact on tangible costs and risks to stakeholders?