Administrative Data Centre (ADC): a GDPR-ready data hub?
Public Sector Data Analytics Network Workshop, 18th May 2017
John.Dunne@cso.ie
Legal environment: Data Protection; Freedom of Information; Official Statistics. Key: 3 legislative pillars.
Irish Statistical System, a register-based statistical system: Enterprises/Firms; Persons; Buildings/Location. Key: permanent official identification.
ADC (Administrative Data Centre): set up November 2009. Objectives twofold: a clearing house for administrative data; a catalyst for the development of the Irish Statistical System.
Streamline interface with public authorities. [Diagram: the ADC sits between the CSO survey systems (Survey system 1, Survey system 2, ..., Survey system n) and the public authorities (DSP, Revenue, ..., Organisation n), each supplying one or more admin data sources.]
Data organised as data flows: administrative data flow; instance (month, year, quarter); version; datasets.
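The slide lists four levels without spelling out how they relate. A minimal sketch, assuming they nest (a flow holds instances, an instance holds versions, a version holds datasets), might look like the following. The class names, the field names and the "2017-Q1" period format are illustrative assumptions, not the ADC's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Version:
    number: int                                   # 1 for the first delivery, 2 for a resupply, ...
    datasets: List[str] = field(default_factory=list)


@dataclass
class Instance:
    period: str                                   # hypothetical label: "2017-04", "2017-Q1", "2016"
    versions: List[Version] = field(default_factory=list)

    def latest(self) -> Version:
        """Return the most recent version received for this period."""
        return max(self.versions, key=lambda v: v.number)


@dataclass
class DataFlow:
    name: str
    instances: Dict[str, Instance] = field(default_factory=dict)

    def add_version(self, period: str, datasets: List[str]) -> Version:
        """Register a new delivery for a period, creating the instance if needed."""
        instance = self.instances.setdefault(period, Instance(period))
        version = Version(number=len(instance.versions) + 1, datasets=list(datasets))
        instance.versions.append(version)
        return version


# Hypothetical flow with one quarterly instance that was supplied twice.
flow = DataFlow("example_admin_flow")
flow.add_version("2017-Q1", ["records.csv"])
flow.add_version("2017-Q1", ["records.csv", "corrections.csv"])
```

Keeping versions under an instance means a resupplied period does not overwrite the earlier delivery, so either can still be referenced.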
Inside the warehouse. [Diagram: tiers linked by Extract Transform Load (ETL) steps. Sources tier (labels shown: PREM, DES); Analysis tier (Data 1, Data 2); Business Interface; Business systems tier (Sys 1, Sys ..., Sys n).]
Data Governance Framework: manage/know what we have (metadata); manage who has access (access control); minimise privacy risk; be open and transparent. Supported and driven by an in-house, browser-based application.
Role of metadata: find it, understand it, evaluate it.
Find it: dataset collections need to be registered (keywords?).
Understand it: data needs to be properly labelled, with appropriate codebooks and other relevant information (data summaries).
Evaluate it: contextual background information, usually in the form of documents, that provides further understanding of the data.
If you don't have the necessary metadata, data is worthless at the very least and potentially disastrous if used for decision making.
Access control
Register of access rights: person based, at flow level, based on userid.
Business reason captured (DG must sign off on linking projects).
System jobs snapshot the register on a regular basis.
System jobs inform administrators of inconsistencies (unregistered access rights).
A new tool is also to record actual accesses.
Tiered access: Source = identifiable; Analysis = pseudonymised (PIK, Protected Identifier Key).
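The inconsistency check run by the system jobs can be pictured as a set comparison between the register and the rights actually granted on the system. The sketch below is illustrative only. The data shapes, function names, userids and flow names are assumptions, and the in-house application itself is not described in the slides.

```python
from typing import Dict, List, Set, Tuple

AccessRight = Tuple[str, str]  # (userid, data flow name)


def unregistered_rights(register: Dict[AccessRight, str],
                        granted: Set[AccessRight]) -> List[AccessRight]:
    """Rights present on the system but missing from the register."""
    return sorted(granted - set(register))


def unused_registrations(register: Dict[AccessRight, str],
                         granted: Set[AccessRight]) -> List[AccessRight]:
    """Rights recorded in the register but not granted on the system."""
    return sorted(set(register) - granted)


# Hypothetical register: each access right is stored with its business reason.
register = {
    ("user_a", "flow_1"): "Quarterly earnings statistics",
    ("user_b", "flow_2"): "Linking project signed off by DG",
}

# Hypothetical snapshot of the rights actually set on the system.
granted = {("user_a", "flow_1"), ("user_c", "flow_2")}

to_report = unregistered_rights(register, granted)
```

Here `to_report` holds the single pair `("user_c", "flow_2")`, which is the kind of entry an administrator would be told about.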
Analysis tier: general access for statistical purposes; contains no directly identifiable information. The PIK enables safe linkage across data sources and over time. A PIK can be a randomly generated number or derived by salting and hashing an identifier (PPSN, EIRCODE, other identifiers). PIK generation is a closely guarded secret.
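The two PIK options named on the slide can be sketched as follows. This is not the CSO's method, which the slide says is kept secret. The hashed variant here uses HMAC-SHA256 with a secret key standing in for the "salt", which is one common way to do a salted hash. The identifier normalisation, the function and class names and the sample values are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Dict


def hashed_pik(identifier: str, secret_salt: bytes) -> str:
    """Derive a PIK from an identifier with a keyed hash (HMAC-SHA256).

    The same identifier and salt always give the same PIK, so records
    can be linked across sources without exposing the identifier.
    """
    normalised = identifier.strip().upper().encode("utf-8")
    return hmac.new(secret_salt, normalised, hashlib.sha256).hexdigest()


class RandomPikRegistry:
    """Assign each identifier a random PIK and remember the mapping.

    The lookup table is itself identifiable data, so under the tiering
    described above it would belong on the restricted side.
    """

    def __init__(self) -> None:
        self._lookup: Dict[str, str] = {}

    def pik_for(self, identifier: str) -> str:
        key = identifier.strip().upper()
        if key not in self._lookup:
            self._lookup[key] = secrets.token_hex(16)
        return self._lookup[key]


# Hypothetical usage: the salt would be generated once and stored securely.
salt = secrets.token_bytes(32)
pik_1 = hashed_pik("1234567AB", salt)     # made-up identifier value
pik_2 = hashed_pik(" 1234567ab ", salt)   # same identifier, different formatting
```

In this sketch `pik_1` and `pik_2` are equal because the identifier is normalised before hashing. The hashed variant needs no lookup table but depends entirely on the salt staying secret, while the random variant reveals nothing about the identifier but requires the mapping to be stored.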
Source tier: restricted access; Management Board level sign-off; Data Office sign-off; underpinned by a Privacy Impact Assessment (PIA).
Role of PIA: ensure due consideration has been given to identifying and mitigating potential risks; provide documentary evidence.
Open and Transparent = Trust
Metadata is open: all staff can see all data descriptions.
Access control information is open: all staff can see who has access to what.
All process documentation is open: PIAs etc.
Data is not open.
Organisational governance developments: Data Office; Confidentiality and Data Security Committee (CDSC), which reports to the Management Board; ADC policy no longer set by the ADC.
Thank you John.Dunne@cso.ie