Microdata access in practice
Felix Ritchie

Overview
Conceptual and practical concerns
International practice
UK experience
Key lessons

Conceptual concerns
Flexibility
Convenience
Confidentiality
Practicality
Scalability
Cost

Practical considerations
Location
  – on-site laboratories
  – distributed centres
  – local access
Data management
  – distributed vs centralised
Processing facility
  – fat vs thin clients
Remote job submission

International practice – social data
Characteristics
  – easy to anonymise usefully
  – unlinkable
  – dominate microdata research
Accessed through
  – anonymised files with almost unrestricted release
  – scientific use, CURF, etc.
identifiable data with limited release (e.g. special license, on-site lab, remote access, remote job submission)
  – released with identifying variables for NSI-work only
  – easily identified observations typically not useful statistically

International practice – business data
almost always restricted/zero access
  – identifying characteristics often useful ones
  – data typically identifiable, even in scientific use files
  – no access is the international norm
where access is provided:
  – on-site labs and special licenses dominate
  – moves towards centralised thin-client systems (UK, Denmark, Sweden, Netherlands, Slovenia, US)
local access in Scandinavia
Four main areas of development
  – making useful anonymous files (Canada, Germany)
  – synthetic data (US)
  – remote job submission (Australia, NZ, US)
  – remote access (non-NSI sites) through thin client systems

International practice – health and Census data
share characteristics with business and social data
  – identifying characteristics often useful ones
  – Census presents special problems because of inclusion probability
large variations on confidentiality within and across countries
often not collected by NSI
in general treated like business data

UK experience – the strategy
[Diagram: access routes ordered from "More confidential, more secure" to "Less confidential, easier access"]
Access routes
  – No release
  – Virtual microdata laboratory
  – Special licence
  – WebUKDA
  – [Remote job submission]
Data shown
  – Business data, Census data
  – Not anonymised Census, health data, OGD access to business data
  – GHS, LFS
  – Aggregate data

UK experience – the VML
Limited lab experience
Thin clients used to simulate on-site laboratory
  – cost
  – security
  – flexibility
  – ease of management
Strict technical regime to ensure confidentiality
Practicality of servicing researchers through
  – training
  – shifting of responsibility
  – limited support

Lessons learned
Use the law intelligently
  – challenge unhelpful interpretations
  – use laws actively to support procedures
Demonstrate benefits soon, clearly, continuously
Running a lab:
  – Practising researchers design and manage lab
  – Sort out rules in advance, especially confidentiality
  – actively involve users
Continual development in operations and principles