The Application for Statistical Processing at SURS Andreja Smukavec, SURS Rudi Seljak, SURS UNECE Statistical Data Confidentiality Work Session Helsinki,


The Application for Statistical Processing at SURS Andreja Smukavec, SURS Rudi Seljak, SURS UNECE Statistical Data Confidentiality Work Session Helsinki, 5 – 7 October 2015

Old system
Stove-pipe oriented production
– Ad hoc solutions were developed for each particular survey
Survey methodologists' drive for improvement was crucial
– "Our data are not confidential"
Process metadata were not organized
– Difficulties when a survey methodologist resigns

Renovation
An internal project started in 2012
– IT, General Methodology and subject-matter specialists
– Build a global solution appropriate for most of the surveys
– Solution which covers most parts of statistical production:
    – Data validation
    – Data editing and imputation
    – Aggregation and standard error estimation
    – Statistical disclosure control for tabular data
    – Tabulation

Renewed system
Generalised metadata-driven application
– Database of process metadata
    – MS Access -> ORACLE
    – For each survey instance
– General SAS code
– GUI for process metadata
– Different microdata environments allowed, just some basic rules for the structure of microdata databases
    – Ad hoc SAS program for preparation of microdata
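The idea of a metadata-driven application can be illustrated with a minimal sketch: the rules live in a table of process metadata, and one general program applies whatever rules it finds there. The sketch below is in Python rather than SAS, and everything in it (the rule layout, the variable names `turnover` and `employees`, the function `validate`) is a hypothetical illustration, not the structure of the SURS application or its ORACLE database.

```python
import operator

# Comparison operators that a rule may refer to by name.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}

# Hypothetical process metadata for one survey instance.
# In a real system these rows would come from a metadata database.
VALIDATION_RULES = [
    {"rule_id": "R1", "variable": "turnover", "op": ">=", "value": 0},
    {"rule_id": "R2", "variable": "employees", "op": "<=", "value": 100000},
]


def validate(records, rules):
    """Apply every metadata rule to every record.

    Returns a list of (record_index, rule_id) pairs for failed checks.
    """
    failures = []
    for index, record in enumerate(records):
        for rule in rules:
            check = OPERATORS[rule["op"]]
            if not check(record[rule["variable"]], rule["value"]):
                failures.append((index, rule["rule_id"]))
    return failures


if __name__ == "__main__":
    microdata = [
        {"turnover": 10, "employees": 5},
        {"turnover": -1, "employees": 5},
    ]
    print(validate(microdata, VALIDATION_RULES))
```

Changing the behaviour for another survey then means editing the rule rows, not the general code, which is the property the slides describe as flexibility and transparency.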

Schematic presentation of the renewed system
[Diagram: different microdata databases, ad hoc programs, general SAS program, database of process metadata, application for management, metadata repository (data on tables and variables), different kinds of output]

Tabular data protection
1. Calculation of primary sensitivity for seven types of statistics: number, total, share, ratio, average…
– Threshold, p%-rule, (n,k)-dominance rule
– "Holding rule" + sampling weights
– Zeroes unsafe
2. Secondary suppression applied in case of sensitive statistics (number and total)
– SAS-Tool (Excel file with metadata, Tau-Argus, SAS macros)
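The three primary sensitivity rules named above can be sketched in their common textbook form. This is a minimal Python illustration, not the SURS SAS implementation: the default parameter values (`min_contributors=3`, `p=10`, `n=1`, `k=75`) are assumptions chosen for the example, contributions are assumed non-negative, and the holding rule, sampling weights and the "zeroes unsafe" convention are not modelled.

```python
def threshold_sensitive(contributions, min_contributors=3):
    """Threshold rule: too few contributors in the cell."""
    return len(contributions) < min_contributors


def p_percent_sensitive(contributions, p=10.0):
    """p%-rule: the second-largest contributor could estimate the
    largest one to within p percent, i.e. the cell total minus the two
    largest contributions is less than p percent of the largest."""
    ordered = sorted(contributions, reverse=True)
    if not ordered:
        return False
    largest = ordered[0]
    second = ordered[1] if len(ordered) > 1 else 0.0
    remainder = sum(ordered) - largest - second
    return remainder < (p / 100.0) * largest


def dominance_sensitive(contributions, n=1, k=75.0):
    """(n,k)-dominance rule: the n largest contributors account for
    more than k percent of the cell total."""
    ordered = sorted(contributions, reverse=True)
    total = sum(ordered)
    if total <= 0:
        return False
    return sum(ordered[:n]) > (k / 100.0) * total


if __name__ == "__main__":
    cell = [100.0, 5.0, 3.0]  # one dominant contributor
    print(threshold_sensitive(cell))
    print(p_percent_sensitive(cell))
    print(dominance_sensitive(cell))
```

A cell flagged by any of the rules in use would be treated as a primary suppression; which rules and parameters apply to which statistic is, per the slides, part of the SDC metadata.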

Tabular data protection
Results for each survey instance saved in the database with statistics (ORACLE)
– Statuses for lower precision
– Confidentiality flags for the type of primary and secondary suppression
3 types of tabulation (codelists)
– Excel format (the most user-friendly)
– plain text format (.tab, .hrc) for Tau-Argus
– plain text format (.csv) for PX-Edit (SURS's publication tool)

Tabulation & Tabular Data Protection

Parameters for SDC in MetaSOP

Tabulation in MetaSOP

Processing in MetaSOP

Example of 3-dimensional table
After aggregation: CC_SI / Dim_2 by Dim_3 (TOT, F, O), with cell flags E and z [cell values lost in extraction]
After use of SAS-Tool: same layout, with additional cell flags zz and zzz [cell values lost in extraction]

New organization
Old system:
– Every survey had its own programmer and its own general methodologist
Renewed system:
– General methodologist and IT expert ("support team") help the subject-matter specialist to
    – insert and edit the process metadata (except for SDC) into the application
    – run particular parts of the statistical process

Advantages
– The subject-matter personnel's skills improve (higher quality of data)
– The process metadata can be changed easily and the procedure can be repeated in a short time (flexibility)
– The rules for data processing are gathered in one place (transparency)

Drawbacks
– High risk of syntax errors when inserting metadata expressions
– Subject-matter personnel have to learn some new skills (SAS expressions)
– An error during execution can cause a problem if the support team is busy or not available

Challenges for the future
Introduce the application successfully into production
– Adjusting to changes by the subject-matter specialists
– Building a qualified support team
Adding new functionalities
– Indices
– Secondary suppression for other types of statistics
– GUI instead of the Excel file for the SAS-Tool

Thank you for your attention.