1
Anonymisation: Theory and Practice
Mark Elliot, University of Manchester
2
Outline
UKAN – who we are
What anonymisation is
The Anonymisation Decision Making Framework
Risk assessment techniques
A look to the future
3
The UK Anonymisation Network
4
Motivation
Transparent Government, Not Transparent Citizens (O’Hara 2011)
Anonymisation is better understood theoretically than empirically
Need a centre of excellence to study it
Lack of capacity within the UK
Lack of connectedness between different perspectives
5
Aims
1. To establish a mutual understanding of differences in perspective on anonymisation across sectors, disciplines and components
2. To synthesise key concepts into a common framework
3. To agree best practice principles
4. To create a strategy for network sustainability
6
4 Tier Structure
Hub: Operations
Partners: Management Group
Core Network: Strategy
Extended Network: The Community
7
The Partners
8
The Core Network
30 representatives from:
Academia: Computer Science, Law, Statistics
Government
Commercial sector
Third sector
NHS
9
From UKAN to UKAS
Delivery of full service
Website:
Clinics
Consultancy
Engagement
Accreditation
Dissemination of best practice via case studies
Anonymisation Decision Making Framework
Secured new funding from Access to Data Fund
Move to sustainable full-cost recovery
10
What is Anonymisation?
11
Starting with the legal
Anonymisation is a process by which personal data are rendered non-personal.
Avoid using success terms: “anonymised” or, worse, “truly anonymised”.
And “really truly anonymised” is right out.
15
Anonymisation and de-identification
Deal with different parts of the DPD's definition of personal data
De-identification tackles: “Directly from those data”
Anonymisation tackles: “Indirectly from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller…”
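To make the distinction on this slide concrete, here is a minimal sketch of de-identification by pseudonymisation. The field names, the choice of which fields count as direct identifiers, and the key are all hypothetical and not taken from the presentation. The point is that the indirect route (sex, date of birth, income) is left untouched, which is the part anonymisation has to deal with.

```python
import hashlib
import hmac

# Hypothetical field names: which fields count as direct identifiers is an assumption.
DIRECT_IDENTIFIERS = ("name", "address")


def de_identify(record, secret_key):
    """Drop direct identifiers and attach a keyed pseudonym (pseudonymisation)."""
    identity = "|".join(str(record[field]) for field in DIRECT_IDENTIFIERS)
    digest = hmac.new(secret_key, identity.encode("utf-8"), hashlib.sha256).hexdigest()
    kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Only the first 12 hex characters are kept, purely for readability.
    return {"pseudonym": digest[:12], **kept}


# Made-up record for illustration only.
record = {
    "name": "Person A",
    "address": "1 Example Street",
    "sex": "F",
    "dob": "1990-01-01",
    "income": 41000,
}
released = de_identify(record, secret_key=b"illustrative-key-only")
print(released)
```

The released record no longer identifies anyone directly, but sex and date of birth remain available for matching against other information, so this step alone does not settle whether the data are still personal.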
16
Some tenets of our approach
Anonymisation is not about the data but about data situations
Data situations arise from data interacting with data environments
17
Some tenets
Data environments are: the set of formal and informal structures, processes, mechanisms and agents that:
act on data;
provide interpretable context for those data and/or
define, control and/or interact with those data.
Elliot and Mackey (2014)
18
Anonymisation types
Absolute Anonymisation: zero possibility of re-identification under any circumstances
Formal Anonymisation: de-identification (including pseudonymisation)
Statistical Anonymisation: statistical disclosure control
Functional Anonymisation
19
Unintended Disclosure
Consists of two processes:
Identification: the (correct) association of a population unit and a data unit
Attribution: the (correct) association or disassociation of an item of data with a population unit
Can occur independently
Without the latter there is no disclosure
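The independence of the two processes can be illustrated with a small sketch. The records, variable names and the helper `attribute` are hypothetical, and the sketch assumes the intruder already knows the person is in the file. When every record consistent with what the intruder knows shares one value, that value can be attributed to the person even though no single record has been identified.

```python
def attribute(records, known, target):
    """Return (number of matching records, set of target values among them)."""
    matches = [r for r in records if all(r[k] == v for k, v in known.items())]
    return len(matches), {r[target] for r in matches}


# Made-up records for illustration only.
records = [
    {"sex": "F", "age_band": "30-39", "condition": "asthma"},
    {"sex": "F", "age_band": "30-39", "condition": "asthma"},
    {"sex": "M", "age_band": "30-39", "condition": "none"},
    {"sex": "M", "age_band": "40-49", "condition": "diabetes"},
    {"sex": "M", "age_band": "40-49", "condition": "none"},
]

# Two candidate records, one shared value: attribution without identification.
print(attribute(records, {"sex": "F", "age_band": "30-39"}, "condition"))

# One candidate record: identification, and attribution follows from it.
print(attribute(records, {"sex": "M", "age_band": "30-39"}, "condition"))

# Two candidate records with different values: neither process succeeds.
print(attribute(records, {"sex": "M", "age_band": "40-49"}, "condition"))
```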
20
The Anonymisation Decision Making Framework
21
What is the ADMF?
A system for developing anonymisation policy
A practical tool for understanding your data situation
Not a checklist
22
The data controller’s responsibility
Understand how a privacy breach might occur
Understand the possible consequences of the breach
Reduce the risk of a breach occurring to a negligible level
23
10 step process for functional anonymisation
1. Know your data
2. Understand the use case
3. Understand the legal issues and governance
4. Understand the issue of consent and your ethical obligations
5. Know the processes you will need to go through to assess the risk of re-identification/disclosure
6. Know the processes you will need to go through to anonymise your data
7. Understand the environment into which you share or release the data
8. Know your audience and how you will communicate
9. Know what to do if things go wrong
10. What happens next once you have shared and/or released data
25
Know the processes you will need to go through to assess the risk of re-identification
Data environment analysis
Scenario analysis
Statistical disclosure risk assessment
Penetration tests
Comparative data situation analysis
26
Elliot and Dale Scenario Framework
INPUTS
Motivation
Means
Opportunity
OUTPUTS
Target Variables
Attack Type
Effect of Data Divergence
Likelihood of Success
Goals achievable by other means?
Consequences of attempt
Likelihood of attempt
Key/Matching Variables
30
Understand the data environment
Global: always important, but especially for open data.
32
Understand the data environment
Global Local Data Agents Governance Security Infrastructure
33
The Disclosure Risk Problem: Identification
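[Figure: linking an identification file to a target file through shared key variables]
The identification file: Name, Address (ID variables); Sex, DOB, .. (key variables)
The target file: Sex, Age, .. (key variables); Income, .. (target variables)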
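The identification problem on this slide can be sketched as a simple linkage on the key variables. This is a minimal illustration only. The data are made up, age is assumed to have been derived from DOB already, and the function names are hypothetical. A link is claimed only where the key combination is unique in both files.

```python
from collections import Counter, defaultdict

# Assumed key variables shared by both files (age taken as already derived from DOB).
KEY_VARIABLES = ("sex", "age")


def key_of(row, keys=KEY_VARIABLES):
    """Tuple of key-variable values for one record."""
    return tuple(row[k] for k in keys)


def link_unique_matches(identification_file, target_file, keys=KEY_VARIABLES):
    """Link a named person to a target record when the key is unique in both files."""
    id_counts = Counter(key_of(p, keys) for p in identification_file)
    index = defaultdict(list)
    for row in target_file:
        index[key_of(row, keys)].append(row)
    links = {}
    for person in identification_file:
        k = key_of(person, keys)
        candidates = index.get(k, [])
        if id_counts[k] == 1 and len(candidates) == 1:
            links[person["name"]] = candidates[0]
    return links


# Made-up files for illustration only.
identification_file = [
    {"name": "Person A", "address": "1 Example Street", "sex": "F", "age": 34},
    {"name": "Person B", "address": "2 Example Street", "sex": "M", "age": 51},
    {"name": "Person C", "address": "3 Example Street", "sex": "M", "age": 51},
]
target_file = [
    {"sex": "F", "age": 34, "income": 41000},
    {"sex": "M", "age": 51, "income": 28000},
    {"sex": "M", "age": 62, "income": 19000},
]

links = link_unique_matches(identification_file, target_file)
print(links)
```

A unique match is not necessarily a correct one: the person may be absent from the target file, or the two files may record the same person differently (data divergence), so a claimed link still needs to be treated as a hypothesis.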
34
Risk is present
At the variable level
  Power
  Differentiation
  Skew
  Quality: susceptibility to divergence
  Availability
At the case level
  Outliers
35
Risk is present
At the case level
  Outliers
  Vulnerable persons
36
Risk is present
At the attribute value level
  Unusual characteristics
  Sensitive attribute values
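One simple way to look at risk at the variable and case levels is to count records that are unique on a chosen set of key variables. The sketch below is illustrative only, with made-up records and hypothetical function names. It shows how adding variables with more differentiating power increases the number of unique cases, and how an outlier stands out early.

```python
from collections import Counter


def class_sizes(records, keys):
    """Number of records sharing each combination of key-variable values."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def sample_uniques(records, keys):
    """Records whose key-variable combination occurs exactly once in the file."""
    sizes = class_sizes(records, keys)
    return [r for r in records if sizes[tuple(r[k] for k in keys)] == 1]


# Made-up records for illustration only.
records = [
    {"sex": "F", "age_band": "30-39", "occupation": "teacher"},
    {"sex": "F", "age_band": "30-39", "occupation": "teacher"},
    {"sex": "M", "age_band": "30-39", "occupation": "teacher"},
    {"sex": "M", "age_band": "30-39", "occupation": "nurse"},
    {"sex": "M", "age_band": "80-89", "occupation": "judge"},
]

for keys in (("sex",), ("sex", "age_band"), ("sex", "age_band", "occupation")):
    print(keys, len(sample_uniques(records, keys)))
```

Uniqueness in a file is only a starting point: a record that is unique in a sample need not be unique in the population, so counts like these indicate where to look rather than measure the risk itself.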
37
The Disclosure Risk Problem II: Attribution
38
The Disclosure Risk Problem III: Subtraction
39
The Disclosure Risk Problem III: After Subtraction
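The figures for the subtraction slides are not preserved in this transcript, so the following is only a sketch of one common reading of a subtraction attack on a table of counts. The table, the area and status labels, and the function names are all hypothetical. The intruder removes the units they already know about from the published counts. If a row of the residual table has only one occupied cell, that value can be attributed to everyone else in the row.

```python
import copy


def subtract_known(table, known_units):
    """Remove units the intruder already knows about from a table of counts."""
    residual = copy.deepcopy(table)
    for row, column in known_units:
        if residual[row][column] <= 0:
            raise ValueError("known unit is inconsistent with the published table")
        residual[row][column] -= 1
    return residual


def attributable_rows(table):
    """Rows in which every remaining unit falls into a single column."""
    found = {}
    for row, cells in table.items():
        occupied = [column for column, count in cells.items() if count > 0]
        if len(occupied) == 1:
            found[row] = occupied[0]
    return found


# Made-up published table: counts by area and employment status.
published = {
    "area_1": {"employed": 3, "unemployed": 1},
    "area_2": {"employed": 2, "unemployed": 2},
}
# The intruder knows one unemployed person in area_1 (for example, themselves).
known_units = [("area_1", "unemployed")]

residual = subtract_known(published, known_units)
print(attributable_rows(published))
print(attributable_rows(residual))
```

Before subtraction no row allows attribution. Afterwards the unemployed cell for area_1 is empty, so the intruder can infer that everyone else in area_1 is employed.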
40
SAP: Example output
41
Many other attack forms
Table linkage
Stream linkage
Mash attacks
Cross dataset linkage enhancement
Match and search
Repetitive queries
Data hiding
Data manipulation
Response knowledge
Etc.
42
How much risk is negligible?
Policy decision based on risk appetite
Mature risk management triangulates across the data situation:
Disclosiveness
Sensitivity
Environment
Agents
Data
Governance
Security
43
Some Futurology
Given the current trends and likely future technological change, anonymisation will become a mostly meaningless concept within years.
If we care about privacy (and democracy) then we need a radically different solution.
Personal data stores (and no central databases)
Economic models
44
Summary
Anonymisation done correctly is a functional process which turns personal data into non-personal data.
There are a variety of techniques and tools which can be used to assess the likelihood of an attack occurring and the likelihood of it succeeding once it has occurred.
Functional anonymisation requires an evaluation of the totality of a data situation, not just the data in question.
Anonymisation will have a lifespan.
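The two likelihoods mentioned in the summary combine by ordinary conditional probability: the chance of a successful attack is the chance of an attempt multiplied by the chance of success given an attempt. The sketch below only illustrates that arithmetic. The scenario names and all numbers are invented for illustration and are not estimates from the presentation.

```python
def successful_attack_probability(p_attempt, p_success_given_attempt):
    """P(successful attack) = P(attempt) * P(success | attempt)."""
    for p in (p_attempt, p_success_given_attempt):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_attempt * p_success_given_attempt


# Hypothetical scenarios with invented numbers: same data, different environments.
scenarios = {
    "open release": (0.5, 0.25),
    "secure setting": (0.125, 0.25),
}
for name, (p_attempt, p_success) in scenarios.items():
    print(name, successful_attack_probability(p_attempt, p_success))
```

Holding the data constant and changing only the likelihood of an attempt changes the overall figure, which is one way to see why the whole data situation matters and not just the data.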