Anonymisation techniques and other measures to enable using and sharing research data Managing and Sharing Research Data workshop London, 2 December 2009.



Using and sharing confidential research data
...obtained from people as participants
Requires a combination of:
- discussing consent and confidentiality with participants / respondents (dialogue)
- anonymisation of data
- user access regulations: researchers only; use licence with confidentiality agreement; data unavailable for certain time period
What is required is study specific and depends on:
- nature of research
- planned data uses

Identity disclosure
A person's identity can be disclosed through:
- direct identifiers: name, address, postcode, telephone number, voice, picture; usually NOT essential research information (administrative)
- indirect identifiers: possible disclosure in combination with other information; occupation, geography, unique or exceptional values (outliers) or characteristics
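The risk from indirect identifiers "in combination" can be checked roughly by counting how many records share each combination of values. The following is a minimal Python sketch under stated assumptions, not part of the workshop material: the function name `risky_records`, the `threshold` parameter and the example records are all invented for illustration.

```python
from collections import Counter


def risky_records(records, indirect_identifiers, threshold=2):
    """Return records whose combination of indirect identifiers is shared
    by fewer than `threshold` records (hypothetical helper)."""
    def key(record):
        return tuple(record[name] for name in indirect_identifiers)

    # Count how often each combination of values occurs in the dataset.
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < threshold]


# Invented example data: the third respondent has a unique combination.
respondents = [
    {"id": 1, "occupation": "teacher", "district": "AB1"},
    {"id": 2, "occupation": "teacher", "district": "AB1"},
    {"id": 3, "occupation": "lighthouse keeper", "district": "AB1"},
]
flagged = risky_records(respondents, ["occupation", "district"])
```

Records flagged this way are candidates for aggregation or restricted access rather than automatic removal.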

Why anonymise data?
- Ethical reasons: protect people's identity (sensitive, illegal, confidential info); disguise research location
- Legal reasons: not disclose personal data (Data Protection Act, DPA)
- Commercial reasons

Essential points
- Never disclose personal data (unless specific consent)
- Reasonable / appropriate level of anonymity
- Maintain maximum meaningful info
- Where possible replace rather than remove
- Identifying info may provide context, do not over-anonymise
- Re-users of data have the same legal and ethical obligation to NOT disclose confidential info as primary users

Anonymising quantitative data
- Remove direct identifiers: names, address, institution
- Reduce the precision / detail of a variable through aggregation: postcode sector vs full postcode, birth year vs date of birth, occupational categories
- Generalise meaning of detailed text variable: occupational expertise
- Restrict upper / lower ranges of a variable to hide outliers: income, age
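The quantitative techniques listed above can be sketched in a few lines of Python. This is an illustration only: the field names, the income cap of 100,000 and the example record are assumptions, and the postcode handling assumes a well-formed UK postcode whose inward code is its last three characters.

```python
def postcode_sector(postcode):
    """Reduce a full UK postcode to its sector (outward code plus the
    first character of the inward code)."""
    compact = postcode.replace(" ", "").upper()
    outward, inward = compact[:-3], compact[-3:]
    return f"{outward} {inward[0]}"


def birth_year(date_of_birth):
    """Keep only the year from an ISO date string 'YYYY-MM-DD'."""
    return int(date_of_birth[:4])


def top_code(value, upper):
    """Restrict the upper range of a variable to hide outliers."""
    return min(value, upper)


def anonymise(record, income_cap=100_000):
    """Build a new record from reduced-precision fields only; direct
    identifiers such as the name are dropped by not being copied."""
    return {
        "postcode_sector": postcode_sector(record["postcode"]),
        "birth_year": birth_year(record["date_of_birth"]),
        "income": top_code(record["income"], income_cap),
    }


# Invented example record.
raw = {"name": "Jane Example", "postcode": "AB1 2CD",
       "date_of_birth": "1975-06-14", "income": 250_000}
safe = anonymise(raw)
```

Building a new record from an explicit list of kept fields, instead of deleting fields from a copy, means a newly added identifying column is excluded by default.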

Relational data
Extra care needed: combinations of related datasets, or a dataset in combination with publicly available info, can disclose information, e.g. businesses studied are mapped in publication

Geo-referenced data
- Spatial references (point coordinates, small areas) may disclose position of individuals, organisations, businesses, etc.
- Removing spatial references prevents disclosure, but all geographical, locational and related information is lost
Alternatives:
- Keep spatial references intact and impose access restrictions on data instead
- Reduce precision: replace point coordinates with larger, non-disclosing geographical areas (km² area, postcode district, ward, road)
- Reduce precision: replace point coordinates with meaningful variable typifying the geographical position (catchment area, poverty index, population density)
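Reducing the precision of point coordinates can be sketched as snapping each point to the grid cell that contains it. The sketch below assumes projected coordinates in whole metres (for example eastings and northings); the function name `to_grid_cell` and the 1 km default cell size are illustrative choices, not taken from the slides.

```python
def to_grid_cell(easting, northing, cell_size_m=1000):
    """Replace a point with the south-west corner of the grid cell
    that contains it (1 km x 1 km cells by default)."""
    return (
        easting // cell_size_m * cell_size_m,
        northing // cell_size_m * cell_size_m,
    )


# Invented coordinates: every point inside the same square kilometre
# maps to the same cell, so individual positions are no longer visible.
cell = to_grid_cell(599812, 224377)
```

Whether a given cell size is non-disclosing depends on how many individuals or organisations fall within each cell, so the size would need checking against the actual data.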

Anonymising qualitative data
- Plan or apply editing at time of transcription or initial write-up
  - Except: longitudinal studies, anonymise when data collection complete (linkages)
- Avoid blanking out information
- Use pseudonyms or replacements
- Removing or aggregating identifiers in text can distort data, make them unusable, unreliable or misleading; avoid over-anonymising
- Consistency within research team and throughout project
- Identify replacements in a meaningful way, e.g. with [brackets]
- Create anonymisation log of all replacements, aggregations or removals made; store this log separately from anonymised data files
- XML mark-up can be used for anonymisation (TEI tag): word to be anonymised
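Bracketed replacements and an anonymisation log can be produced together. The following Python sketch is one possible approach, assuming a hand-made mapping of identifying terms to replacements; the function name `pseudonymise`, the log format and the example transcript are all invented for illustration.

```python
import re


def pseudonymise(text, replacements):
    """Replace each identifying term with a bracketed replacement and
    return the edited text plus a log of the changes made."""
    log = []
    for original, replacement in replacements.items():
        # Match whole words only, so a short name does not alter longer words.
        pattern = re.compile(r"\b" + re.escape(original) + r"\b")
        bracketed = f"[{replacement}]"
        text, count = pattern.subn(lambda match: bracketed, text)
        if count:
            log.append({"original": original,
                        "replacement": replacement,
                        "occurrences": count})
    return text, log


# Invented example transcript and replacement table.
transcript = "I met Margaret in Penrith. Margaret had worked on the farm for years."
replacements = {"Margaret": "Susan", "Penrith": "market town"}
anonymised, log = pseudonymise(transcript, replacements)
# The log contains the original identifiers, so it would be stored
# separately from the anonymised file.
```

Using one shared replacement table across a project is a simple way to keep pseudonyms consistent within the research team.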

Audio-visual data
- Digital manipulation of audio and image files can remove personal identifiers: voice alteration, image blurring (e.g. of faces)
- Labour intensive, expensive, may damage research potential of data
Alternatives:
- Obtain consent to use and share data unaltered for research purposes
- Avoid mentioning disclosing information during audio recordings

Tips
- Always consider anonymisation of research data together with consent agreements and access restrictions
- Regulating / restricting user access may offer a better solution than anonymising
- Avoid collecting data that need anonymisation: do not ask for full names if they later need to be removed from data
- Remove, mask, change identifiers
- Maintain maximum information
- Retain unedited versions of data for use within the research team and for preservation
- Plan at start of research, not at the end

Sources
- Clark, A. Anonymising research data. NCRM Working Paper Series 7/06. ESRC National Centre for Research Methods. [ 6/0706_anonymising_research_data.pdf]
- Inter-University Consortium for Political and Social Research (ICPSR). Guide to Social Science Data Preparation and Archiving: Best Practice Throughout the Data Life Cycle. 3rd Edition. ICPSR, Ann Arbor.
- UK Data Archive. Manage and Share Data - Anonymising research data.
- UK Data Archive. Managing and Sharing Data - a best practice guide for researchers.
- Timescapes meetings & discussions

Exercises / scenarios
Anonymising qualitative data:
- Foot and mouth study Cumbria (5407)
- Conflicts and violence in prison (4596)
Anonymising quantitative data:
- Labour Force Survey
Confidential relational and geo-referenced data:
- British Household Panel Survey