Development of UK Virtual Microdata Laboratory Felix Ritchie Shanghai, March 2010.


Plan of presentation
Starting principles
What we did, and the impact
New things we had to develop
  – security model, researcher management, SDC
What we’ve learnt
  – what matters, what doesn’t, what we’d do differently
Future directions

Starting principles
Designed by researchers for research
  – maximum access, limited by law
Expandable
Secure at reasonable cost
Manageable at reasonable cost
Distribute access, not data

Distributed access
Why is this good?
  – Data always under ONS control
  – Live monitoring
  – Simpler, but safer, disclosure control
How does this work in practice?
  – VML accessible from all ONS computers
  – Access points in govt. offices in Glasgow and Belfast
  – Plan to roll out to more govt. offices in 2010
  – VML duplicate set up on academic network
VML set to become exception rather than default data store

What we did
Central data repository and processors
Access via secured thin clients
Work space partitioned by dataset, not usage
  – researchers get access to dataset, not variables
No access to internet or rest of network
Same system for internal and external users
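The dataset-level partitioning described on this slide can be sketched as below. This is only an illustration, not the VML's actual access-control code: the `GRANTS` table, the user names, and the dataset names are all hypothetical. The point it shows is that the check is made on the dataset alone, so any variable inside a granted dataset is readable.

```python
# Hypothetical grants table (not from the slides): user -> datasets they may open.
GRANTS = {
    "researcher_a": {"business_survey"},
    "researcher_b": {"business_survey", "census_sample"},
}


def can_read(user: str, dataset: str, variable: str) -> bool:
    """Grant or deny at dataset level; the variable name is deliberately ignored."""
    return dataset in GRANTS.get(user, set())


# Any variable in a granted dataset is readable; other datasets are not.
print(can_read("researcher_a", "business_survey", "turnover"))
print(can_read("researcher_a", "census_sample", "age"))
```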

What we did - outcomes
30%-50% growth every year
Massive increase in microeconomic analysis
  – From almost no firm-level studies to European leaders
Keystone of ONS Administrative Data Project
Total cost ~£350,000 per year
  – strategy 17%, fixed ops 65%, variable ops 18%
  – income ~£50,000
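As a rough worked example, the percentage split can be turned into pound figures. The inputs are the approximate values quoted on the slide; the per-category amounts and the net cost are derived arithmetic, not figures the presentation states.

```python
# Approximate figures quoted on the slide (GBP per year).
TOTAL_COST = 350_000
INCOME = 50_000
SHARES_PERCENT = {"strategy": 17, "fixed ops": 65, "variable ops": 18}


def cost_breakdown(total: int, shares_percent: dict) -> dict:
    """Split a total into categories using whole-number percentage shares."""
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


breakdown = cost_breakdown(TOTAL_COST, SHARES_PERCENT)
net_cost = TOTAL_COST - INCOME  # cost not covered by income

for name, amount in breakdown.items():
    print(f"{name}: ~£{amount:,}")
print(f"net of income: ~£{net_cost:,}")
```

On these figures, strategy comes to about £59,500, fixed operations to about £227,500, and variable operations to about £63,000, leaving roughly £300,000 per year not covered by income.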

New things developed (1) The VML Security Model
valid statistical purpose
trusted researchers
anonymisation of data
technical controls around data
disclosure control of results
safe projects + safe people + safe data + safe setting + safe outputs ⇒ safe use
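One way to read the security model is as a conjunction: use counts as safe only when every control holds. The sketch below assumes that reading. The class and function names are hypothetical, and the pairing of each control with a "safe ..." label in the comments is an interpretation of the slide, not something it spells out.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AccessRequest:
    # Field names follow the five controls on the slide; the "safe ..." pairings
    # in the comments are an assumed reading of the slide layout.
    valid_statistical_purpose: bool       # safe projects
    trusted_researcher: bool              # safe people
    data_anonymised: bool                 # safe data
    technical_controls_in_place: bool     # safe setting
    results_disclosure_checked: bool      # safe outputs


def failed_controls(request: AccessRequest) -> list:
    """Return the names of controls that do not hold, in field order."""
    return [name for name, ok in asdict(request).items() if not ok]


def is_safe_use(request: AccessRequest) -> bool:
    """Safe use requires all five controls together, not any one alone."""
    return not failed_controls(request)


request = AccessRequest(True, True, True, True, False)
print(is_safe_use(request), failed_controls(request))
```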

New things developed (2) Output statistical disclosure control
‘Standard’ SDC not appropriate
  – traditional rules not appropriate for research environments
  – SDC on data or methods pointless
Principles-based output SDC
  – SDC at the point of release
  – trained researchers
  – trained staff
  – agreement on principles and purpose
  – safe vs unsafe outputs, based on functional form
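The last point, classifying outputs as safe or unsafe by functional form, might look something like the sketch below. The slide does not say which forms fall on which side, so the entries in `FUNCTIONAL_FORM` are illustrative assumptions, as are all the names. Treating unknown forms as unsafe is a conservative choice made here for the example.

```python
from enum import Enum


class Verdict(Enum):
    SAFE = "release unless there is a specific reason not to"
    UNSAFE = "hold until shown to be non-disclosive"


# Illustrative assumptions only: the presentation does not list these.
FUNCTIONAL_FORM = {
    "regression_coefficients": Verdict.SAFE,
    "frequency_table": Verdict.UNSAFE,
    "maximum": Verdict.UNSAFE,
}


def default_verdict(output_type: str) -> Verdict:
    """Starting position for an output checker, decided by the type of output.

    Forms not in the table are treated as unsafe.
    """
    return FUNCTIONAL_FORM.get(output_type, Verdict.UNSAFE)


print(default_verdict("regression_coefficients").name)
print(default_verdict("frequency_table").name)
```

The verdict is a default for the checker at the point of release, not a final decision; under a principles-based approach a trained checker can still override it.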

New things developed (3) Active researcher management
Need to develop shared objectives with researchers
  – Principles-based SDC needs buy-in from researchers
  – Reduced management costs
Compulsory training
  – SDC
  – VML objectives and constraints
  – legal and procedural background

What we’ve learnt (1) Things that matter
attitude to researchers
model of SDC
broad scale of operations
  – including future plans
scale of coherent networks (for remote access)
  – e.g. ONS internal network, Government Secure Intranet, University Intranet, VPN?

What we’ve learnt (2) Things that don’t matter
Location of servers and users
Type of users
Type of data
IT
Metadata
Specific legal/procedural framework?

What we’ve learnt (3) Things we would do differently
Prepare ONS for expansion
  – senior buy-in
  – IT planning
better data management
better user management
better metadata

Future directions
Expansion across the government network
Supporting academic equivalent
  – VML facing massive internal increase in use
Developing international standards
Better communication
  – wikis, FAQs, common metadata system
  – metadata
Not being considered
  – remote job systems
  – synthetic data

Questions? Felix Ritchie Microdata Analysis and User Support

Old stuff – if necessary

The data model (1)
‘Spectrum’ of access points balancing
  – value of data
  – ease of use
  – disclosure risk
for a given level of confidentiality, maximise data use and convenience
no ‘one-size-fits-all’ solution
  – no absolute prohibitions
  – trade-off is made explicit
  – users determine appropriate level of access

Use of confidential data: the access spectrum
Type of access: None; VML ONS sites; VML Govt sites; Secure data service; Special licences; Licensed data archive; Internet
Anonymisation: Little (left) to Complete (right)
SDC of inputs: None (left) to Complete (right)
Restrictions on users: Many (left) to None (right)
SDC of outputs: Complete (left) to None (right)
Examples, Census data: Original data; Data for ONS linking; ONS contractor; Anon. CD-ROM; Web tables
Examples, Enterprise data: Original data; Identified data for ONS linking; Identifiable data for analysis; Govt. users only; Web tables; RDCs