Anonymisation: Theory and Practice


1 Anonymisation: Theory and Practice
Mark Elliot University of Manchester

2 Outline UKAN – who we are What anonymisation is
The Anonymisation decision making framework Risk assessment techniques A look to the future

3 The UK Anonymisation Network

4 Motivation Transparent Government, Not Transparent Citizens (O’Hara 2011) Anonymisation is better understood theoretically than empirically Need a centre of excellence to study it Lack of capacity within the UK Lack of connectedness between different perspectives

5 Aims 1. To establish a mutual understanding of differences in perspective on anonymisation across sectors, disciplines and components 2. To synthesise key concepts into a common framework 3. To agree best practice principles 4. To create a strategy for network sustainability

6 4 Tier Structure Hub: Operations Partners: Management Group
Core Network: Strategy Extended Network: The Community

7 The Partners

8 The Core Network 30 representatives from
Academia: Computer Science, Law, Statistics Government Commercial sector Third sector NHS

9 From UKAN to UKAS Delivery of full service
Website: Clinics Consultancy Engagement Accreditation Dissemination of best practice via case studies Anonymisation Decision Making Framework Secured new funding from Access to Data Fund Move to sustainable full-cost recovery

10 What is Anonymisation?

11 Starting with the legal
Anonymisation is the process by which personal data are rendered non-personal.

12 Starting with the legal
Anonymisation is the process by which personal data are rendered non-personal. Avoid using success terms: “anonymised”

13 Starting with the legal
Anonymisation is the process by which personal data are rendered non-personal. Avoid using success terms: “anonymised” or, worse, “truly anonymised”

14 Starting with the legal
Anonymisation is the process by which personal data are rendered non-personal. Avoid using success terms: “anonymised” or, worse, “truly anonymised”. And “really truly anonymised” is right out

15 Anonymisation and de-identification
Deal with different parts of the DPD's definition of personal data. De-identification tackles: “Directly from those data”. Anonymisation tackles: “Indirectly from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller…”

16 Some tenets of our approach
Anonymisation is not about the data but about data situations. Data situations arise from data interacting with data environments.

17 Some tenets Data environments are:
the set of formal and informal structures, processes, mechanisms and agents that: act on data; provide interpretable context for those data and/or define, control and/or interact with those data. Elliot and Mackey (2014)

18 Anonymisation types
Absolute Anonymisation: Zero possibility of re-identification under any circumstances
Formal Anonymisation: De-identification (including pseudonymisation)
Statistical Anonymisation: Statistical Disclosure Control
Functional Anonymisation

19 Unintended Disclosure
Consists of two processes: Identification: the (correct) association of a population unit and a data unit Attribution: the (correct) association or disassociation of an item of data with a population unit Can occur independently Without the latter there is no disclosure

20 The Anonymisation Decision Making Framework

21 What is the ADMF? A system for developing anonymisation policy
A practical tool for understanding your data situation Not a checklist

22 The data controller’s responsibility
Understand how a privacy breach might occur Understand the possible consequences of the breach Reduce the risk of a breach occurring to a negligible level

23 10 step process for functional anonymisation
1. Know your data
2. Understand the use case
3. Understand the legal issues and governance
4. Understand the issue of consent and your ethical obligations
5. Know the processes you will need to go through to assess the risk of re-identification/disclosure
6. Know the processes you will need to go through to anonymise your data
7. Understand the environment into which you share or release the data
8. Know your audience and how you will communicate
9. Know what to do if things go wrong
10. What happens next once you have shared and/or released data

25 Know the processes you will need to go through to assess the risk of re-identification
Data environment analysis Scenario analysis Statistical disclosure risk assessment Penetration tests Comparative data situation analysis

26 Elliot and Dale Scenario Framework
INPUTS OUTPUTS Motivation Means Opportunity Target Variables Attack Type Effect of Data Divergence Likelihood of Success Goals achievable by other means? Consequences of attempt Likelihood of attempt Key/Matching Variables

30 Understand the data environment
Global Always important but especially for open data.

31

32 Understand the data environment
Global Local Data Agents Governance Security Infrastructure

33 The Disclosure Risk Problem: Identification
The identification file: Name, Address, Sex, DOB, ..
The target file: Sex, Age, .., Income, .., ..
Variable roles: ID variables, key variables, target variables
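The identification problem on this slide can be sketched as a linkage between the two files on their shared key variables. The following is a minimal illustration, not the method used in the talk: the records, the names and the function `link_on_keys` are hypothetical, and age is used in both files for simplicity where the slide shows DOB in one and Age in the other.

```python
from collections import defaultdict

# Hypothetical identification file: an ID variable (name) plus key variables.
identification_file = [
    {"name": "A. Smith", "sex": "F", "age": 34},
    {"name": "B. Jones", "sex": "M", "age": 34},
    {"name": "C. Patel", "sex": "F", "age": 71},
]

# Hypothetical target file: key variables plus a target variable (income).
target_file = [
    {"sex": "F", "age": 34, "income": 41000},
    {"sex": "M", "age": 34, "income": 28000},
    {"sex": "M", "age": 34, "income": 52000},
    {"sex": "F", "age": 71, "income": 19000},
]

KEY_VARIABLES = ("sex", "age")


def link_on_keys(id_file, tgt_file, keys=KEY_VARIABLES):
    """For each named person, list the target records sharing their key values."""
    index = defaultdict(list)
    for record in tgt_file:
        index[tuple(record[k] for k in keys)].append(record)
    return {
        person["name"]: index.get(tuple(person[k] for k in keys), [])
        for person in id_file
    }


matches = link_on_keys(identification_file, target_file)

# Only a one-to-one match gives the intruder a candidate identification.
unique_matches = {
    name: recs[0] for name, recs in matches.items() if len(recs) == 1
}
```

A unique match in this sketch is only a candidate: whether it is a correct identification depends on factors the sketch ignores, such as whether the person is also unique in the population and whether the two files have diverged.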

34 Risk is present
At the variable level: Power; Differentiation; Skew; Quality: susceptibility to divergence; Availability
At the case level: Outliers

35 Risk is present At the case level Outliers Vulnerable persons

36 Risk is present At the attribute value level Unusual characteristics
Sensitive attribute values
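One common way to look for case-level risk is to count how many records share each combination of key-variable values and flag the records that are unique on those keys. The sketch below assumes this reading of "outliers" and "unusual characteristics"; the records, variable names and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical microdata with three key variables.
records = [
    {"sex": "F", "age_band": "30-39", "occupation": "nurse"},
    {"sex": "F", "age_band": "30-39", "occupation": "nurse"},
    {"sex": "M", "age_band": "30-39", "occupation": "teacher"},
    {"sex": "M", "age_band": "30-39", "occupation": "teacher"},
    {"sex": "F", "age_band": "90+", "occupation": "judge"},
]

KEYS = ("sex", "age_band", "occupation")


def class_sizes(rows, keys=KEYS):
    """Count the records sharing each combination of key-variable values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def sample_uniques(rows, keys=KEYS):
    """Return the records that are the only ones with their key combination."""
    sizes = class_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


sizes = class_sizes(records)
uniques = sample_uniques(records)
```

Uniqueness within the file is only a starting point: a record that is unique in a sample may share its key values with many people in the population, so these flags mark cases for further assessment rather than measuring risk directly.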

37 The Disclosure Risk Problem II: Attribution

38 The Disclosure Risk Problem III: Subtraction

39 The Disclosure Risk Problem III:
After Subtraction
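The slides on attribution and subtraction survive only as titles, so the following is a hedged illustration of one common reading of a subtraction attack, not a reconstruction of the original figures. It assumes an intruder who removes the units they already know about (for example, themselves) from a published table of counts and then looks for groups in which everyone remaining shares a single value. The table, the group labels and both functions are hypothetical.

```python
from collections import Counter

# Hypothetical published table: counts by (occupation, income band).
published = Counter({
    ("lawyer", "high"): 3,
    ("lawyer", "low"): 1,
    ("teacher", "high"): 2,
    ("teacher", "low"): 4,
})

# Units whose values the intruder already knows (assumption: just themselves).
known_units = [("lawyer", "low")]


def subtract_known(table, known):
    """Remove the intruder's known units from the published counts."""
    residual = Counter(table)
    residual.subtract(Counter(known))
    return residual


def attributable_groups(table):
    """Return groups whose remaining members all share one value."""
    by_group = {}
    for (group, value), count in table.items():
        if count > 0:
            by_group.setdefault(group, {})[value] = count
    return {g: next(iter(v)) for g, v in by_group.items() if len(v) == 1}


before = attributable_groups(published)
after = attributable_groups(subtract_known(published, known_units))
```

In this sketch no group is attributable in the published table, but after subtraction every remaining lawyer is known to be in the high income band. That is attribution without any individual record being identified, which is consistent with the earlier point that the two processes can occur independently.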

40 SAP: Example output

41 Many other attack forms
Table linkage Stream linkage Mash attacks Cross dataset linkage enhancement Match and search Repetitive queries Data hiding Data manipulation Response knowledge Etc etc.

42 How much risk is negligible?
Policy decision based on risk appetite Mature Risk management triangulates across the data situation: Disclosiveness Sensitivity Environment Agents Data Governance Security

43 Some Futurology Given the current trends and likely future technological change, anonymisation will become a mostly meaningless concept within years. If we care about privacy (and democracy) then we need a radically different solution. Personal data stores (and no central databases). Economic models.

44 Summary Anonymisation done correctly is a functional process
which turns personal data into non-personal data. There are a variety of techniques and tools which can be used to assess the likelihood of an attack occurring and the likelihood of it succeeding once it has occurred. Functional anonymisation requires an evaluation of the totality of a data situation, not just the data in question. Anonymisation will have a lifespan.

