Meeting The Technical Security Needs Primary and Secondary use of EHR systems Filip De Meyer
2 Content: Custodix company introduction; Concepts & terminology; From concept to technical solutions; Example: the Custodix Anonymisation Tool (“CAT”) (screen shots)
3 About Custodix. In a few words… –Established in 2000 as a spin-off company of the University of Ghent, Belgium –Providing privacy protection services, mainly in healthcare: Trusted Third Party services; customized privacy-enhanced data collection solutions; secure storage; privacy consultancy; … “One stop shop” for privacy/data protection. Involved in European research since the start. Operating in Europe, Australia and Asia.
4 Commercial & Research Activities: Commercial; Research Programs
5 Scope of Activities: countries involved (sources of data) in Custodix protected data flows.
6 Background/History of Activities. Data protection legislation examples: Europe: –European Directive 95/46/EC (accepted as one of the world’s highest privacy standards) –Member state implementation. Other: –Health Insurance Portability and Accountability Act (HIPAA) –Ontario Freedom of Information and Protection of Privacy Act in Canada –…
7 Custodix Services
8 Trusted Third Party. [Diagram: various EHR sources (care/diagnostic purposes), personal health records (e.g. personal diaries), other sources and additionally collected data (for research purposes); the Trusted Third Party links the data and protects privacy between the EHR sources and the research data repositories (research use).]
9 Reduction of Identifying Information. [Diagram: risk analysis; personal data becomes de-identified data by reducing the identifying information content. Operations shown: delete identifier, transform date, produce nym, delete data items, encrypt data items, …]
10 Starting Point: Definition of Personal Data “ 'personal data' shall mean any information relating to an identified or identifiable natural person ('data subject'); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity. ” (Directive 95/46/EC, the “DPD”)
11 Concept of Identification. A data subject is identified (within a set of data subjects) if it can be singled out among other data subjects. Some associations between characteristics and data subjects are more persistent in time (e.g. a national security number, date of birth) than others (e.g. an address). [Diagram: a set of characteristics (a to h) associated with one member of the set of data subjects.]
12 The Concept of Anonymisation. Anonymisation is the process that removes the association between the identifying data set and the data subject. This can be done in two different ways: –by removing or transforming characteristics in the associated set of characteristics, so that the association is no longer unique and relates to more than one data subject; –by increasing the population in the set of data subjects, so that the association between the data set and the data subject is no longer unique. [Diagram: a set of characteristics (a to h) linked to a data subject.]
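The two routes to anonymisation on this slide can be illustrated with a minimal sketch. The toy population and the field names (zip, birth_year, sex) are assumptions made for illustration only; they do not come from the presentation.

```python
def matches(population, characteristics):
    """Count the data subjects whose values agree with every given characteristic."""
    return sum(
        all(person.get(key) == value for key, value in characteristics.items())
        for person in population
    )

# Hypothetical set of data subjects.
population = [
    {"zip": "9820", "birth_year": 1970, "sex": "F"},
    {"zip": "9820", "birth_year": 1971, "sex": "F"},
    {"zip": "9000", "birth_year": 1970, "sex": "M"},
]

# The full set of characteristics singles out exactly one data subject.
full = {"zip": "9820", "birth_year": 1970, "sex": "F"}
unique_before = matches(population, full)

# Route 1: remove a characteristic so the association is no longer unique.
reduced = {"zip": "9820", "sex": "F"}
after_removal = matches(population, reduced)

# Route 2: enlarge the population so the same characteristics fit more than one subject.
larger_population = population + [{"zip": "9820", "birth_year": 1970, "sex": "F"}]
after_growth = matches(larger_population, full)

print(unique_before, after_removal, after_growth)
```

In both routes the count rises above one, which is the sense in which the association "relates to more than one data subject".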
13 Terminology: Pseudonymisation. Pseudonymisation is a particular type of anonymisation that, after removal of the association with a data subject, adds an association between a particular set of characteristics relating to a data subject and one or more pseudonyms. The pseudonym may be unique in a domain. In irreversible pseudonymisation, the conceptual model does not contain a method to derive the association between the data subject and the set of characteristics from the pseudonym. Note that “pseudonymisation” and “anonymisation” terminology is not universal. [Diagram: set of characteristics a, c, d, h linked to a pseudonym; b, e, f, g shown separately with a “?”.]
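A minimal sketch of irreversible pseudonymisation in this sense follows. It assumes a keyed hash (HMAC-SHA256) as the pseudonym function; the presentation does not say which technique Custodix uses, and the function name, field names and key below are hypothetical.

```python
import hashlib
import hmac

def pseudonymise(record, id_field, identifying_fields, key):
    """Drop identifying fields and attach a pseudonym derived from the identifier.

    No mapping from pseudonym back to identifier is stored, so the model
    contains no method to reverse the association.
    """
    identifier = str(record[id_field]).encode("utf-8")
    nym = hmac.new(key, identifier, hashlib.sha256).hexdigest()
    kept = {
        name: value
        for name, value in record.items()
        if name != id_field and name not in identifying_fields
    }
    kept["pseudonym"] = nym
    return kept

# Hypothetical key; in practice it would be generated and guarded by a trusted party.
KEY = b"demo-key-not-for-production"

visit_1 = {"patient_id": "P-001", "name": "Jane Doe", "diagnosis": "C50", "year": 2005}
visit_2 = {"patient_id": "P-001", "name": "Jane Doe", "diagnosis": "C50", "year": 2008}

out_1 = pseudonymise(visit_1, "patient_id", {"name"}, KEY)
out_2 = pseudonymise(visit_2, "patient_id", {"name"}, KEY)
print(out_1["pseudonym"] == out_2["pseudonym"])
```

Both visits receive the same pseudonym, so they remain linkable. Note that anyone holding the key can recompute pseudonyms for candidate identifiers, so "irreversible" here depends on the key staying protected.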
14 The Conceptual vs. Real Life Model “To determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible”. (Recital 26 of the DPD) In practice: refine the concept of identifiability/anonymity; take into account the “means likely reasonably to be used” and “any other person” through re-identification risk analysis.
15 Privacy Risk Analysis
16 Levels of De-identification (ISO/IEC DTS 25237). Level 1: removal of clearly identifying data (“rules of thumb”). Level 2: static, model based re-identification risk analysis. Level 3: continuous re-identification risk analysis of live databases. Targets for de-identification can be set and liabilities better defined in risk analysis and policies.
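As a rough illustration of what a static risk analysis with a de-identification target might compute, the sketch below uses a simple equivalence-class measure: the risk of a record is taken as one divided by the number of records sharing its quasi-identifier values. This measure, the field names and the target value are assumptions; the slide does not describe the model used in the specification or by Custodix.

```python
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Return (maximum, average) per-record risk, with risk = 1 / equivalence class size."""
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    class_sizes = Counter(keys)
    risks = [1 / class_sizes[key] for key in keys]
    return max(risks), sum(risks) / len(risks)

# Hypothetical de-identified records.
records = [
    {"birth_year": 1970, "zip": "9820"},
    {"birth_year": 1970, "zip": "9820"},
    {"birth_year": 1971, "zip": "9820"},
    {"birth_year": 1985, "zip": "9000"},
]

max_risk, avg_risk = reidentification_risk(records, ["birth_year", "zip"])

# Hypothetical target set by policy: no record may have a risk above 0.5.
TARGET_MAX_RISK = 0.5
meets_target = max_risk <= TARGET_MAX_RISK
print(max_risk, avg_risk, meets_target)
```

Re-running the same function whenever records are added would be one simple way to approximate the continuous analysis of Level 3.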
17 ISO TC 215 / WG 4: ISO/IEC DTS 25237, Health Informatics: Pseudonymisation. Result of work in ISO/TC 215/WG 4. Based on the conceptual model as explained in this presentation. Lists a number of healthcare scenarios: –clinical trials –clinical research –public health monitoring –patient safety reporting (adverse drug events). Current status: Approved Technical Specification
18 Common Healthcare Requirements. Disease management, clinical trials, … requirements: –Dynamic data collection of individual line data… (longitudinal studies; processing data of individual patients) –Protection of data subjects towards the data collector (data must be stored in protected form; different from disclosure control). Requires: –De-identified individual line data (pseudonymisation / anonymisation; no protection through aggregation, data swapping, …) –A priori estimation of privacy risks and required data protection measures (privacy risk based on statistical models, cf. re-identification theory) –Protection of the “context” in which data is considered anonymous
19 Pseudonymisation. Goal: –Protection of identity and privacy of individuals or organizations –Allowing linkage of data associated with pseudo-IDs irrespective of the collection time (cf. longitudinal studies) and collection place (cf. multi-center studies). Simplified: –Translating a given identifier into a pseudo-identifier by using secure, dynamic and (preferably ir-)reversible cryptographic techniques. Tricky part: –Making sure that data is truly de-identified (within a predefined context) –Removing “indirectly identifying” content
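The linkage goal and the reversible case can be sketched as follows. Note the swap in technique: instead of a cryptographic transformation, this sketch issues random pseudo-identifiers and keeps a lookup table at a trusted third party. The class and method names are hypothetical and do not describe the Custodix services.

```python
import secrets

class TrustedThirdParty:
    """Issues stable pseudo-identifiers and can reverse them when authorised."""

    def __init__(self):
        self._id_to_nym = {}
        self._nym_to_id = {}

    def pseudonym_for(self, identifier):
        """Return the same pseudo-identifier for the same identifier on every call."""
        nym = self._id_to_nym.get(identifier)
        if nym is None:
            nym = secrets.token_hex(8)
            while nym in self._nym_to_id:  # avoid the (unlikely) collision
                nym = secrets.token_hex(8)
            self._id_to_nym[identifier] = nym
            self._nym_to_id[nym] = identifier
        return nym

    def reidentify(self, nym):
        """Reverse lookup; only the party holding the table can do this."""
        return self._nym_to_id[nym]

ttp = TrustedThirdParty()

# Same patient, different collection place and time.
nym_centre_a_2005 = ttp.pseudonym_for("P-001")
nym_centre_b_2008 = ttp.pseudonym_for("P-001")
nym_other_patient = ttp.pseudonym_for("P-002")

print(nym_centre_a_2005 == nym_centre_b_2008)
```

Because the pseudo-identifier is stable, records from several centres and years can be linked without the data collector seeing the original identifier. Dropping the reverse table would make this variant irreversible.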
20 Batch Data Collection. [Diagram: sources, Trusted Third Party, data collection site.] Build custom solutions using standard components. Integrate security & privacy components into existing and new projects.
21 Interactive Pseudonymisation. The “interactive pseudonymisation system”: reconciling the concept of a “central anonymous database” with “nominative access”. [Diagram: Privacy Protection Gateway.]
22 Web Enabled Implementation of Privacy Enhanced Storage Framework. Data Protection Service (acting as reverse proxy): non-intrusive to the application (transparent). Key Management Service. Secured Search Service. Provides authentication and user management to the application. PESF Service available as FLASH or Java/JavaScript toolkit. [Diagram: sources, browser API, data collection site.]
23 Case: Combined Trust Services. Secure communication; anonymous data collection; secured repository. State-of-the-art implementation based on innovative security technology. Secure Information eXchange
24 Core Activities Integration … of clinical history, medical imaging and genetic data. Knowledge Grid … distributed mining for knowledge extraction. Clinical Trials … breast cancer & pediatric nephroblastoma Developing a Biomedical GRID infrastructure for sharing Clinical and Genomic expertise
25 Pseudonymisation Tool
26 Center for Data Protection Act as "data controller" or assist "data controllers" in the sense of the European Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data; Be a think-tank for everyone professionally involved or interested in practical data protection; Promote the application of novel technology in the context of data protection (ePrivacy, eSecurity), and act as a dissemination point for practical solutions; Get involved with the development and promotion of standards and certification related to privacy protection; Provide assistance in dealing with complex data protection issues on an international level by offering access to a multidisciplinary pool of expertise.
27 Generate privacy protection profiles that can be run on heterogeneous data. Create (profile) once, run many times....
28 CAT: Overview
29 CAT: Variable Mappings Editor, XML. Variable mappings (DICOM, XML, CSV, custom). Define a privacy type per variable: –Identifier –Free text –Undefined –...
30 CAT: Transformation Editor Operands –named variable (e.g. patientID) –privacy type Flexible and detailed configuration –simple nym transformation –secure vaults (single or multiple argument) –random –replace with value –clear –make date relative –...
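The idea of configuring transformations per named variable can be sketched as a small profile that is defined once and applied to any number of records. This is not the CAT profile format; the rule layout, transformation names and field names are assumptions made for illustration.

```python
import secrets
from datetime import date

def clear(value, **_):
    """Replace the value with an empty string."""
    return ""

def replace_with_value(value, *, replacement, **_):
    """Replace the value with a fixed, configured value."""
    return replacement

def random_value(value, **_):
    """Replace the value with random hex characters."""
    return secrets.token_hex(4)

def make_date_relative(value, *, reference, **_):
    """Express an ISO date as the number of days since a reference date."""
    return (date.fromisoformat(value) - date.fromisoformat(reference)).days

TRANSFORMATIONS = {
    "clear": clear,
    "replace": replace_with_value,
    "random": random_value,
    "relative_date": make_date_relative,
}

# Hypothetical profile: one rule per named variable.
profile = [
    {"variable": "last_name", "transformation": "clear"},
    {"variable": "hospital", "transformation": "replace", "args": {"replacement": "SITE"}},
    {"variable": "visit_date", "transformation": "relative_date", "args": {"reference": "2008-01-01"}},
]

def apply_profile(record, rules):
    """Apply every rule whose variable is present; other fields pass through unchanged."""
    out = dict(record)
    for rule in rules:
        name = rule["variable"]
        if name in out:
            transform = TRANSFORMATIONS[rule["transformation"]]
            out[name] = transform(out[name], **rule.get("args", {}))
    return out

record = {"last_name": "Doe", "hospital": "UZ Gent", "visit_date": "2008-03-01", "diagnosis": "C50"}
result = apply_profile(record, profile)
print(result)
```

Keeping the rules as data rather than code is what allows one profile to be reused across many records. A rule keyed on privacy type instead of variable name could be added by first resolving the type to the variables mapped to it.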
31 CAT: Transformation Editor, XML
32 CAT XML Example: Result. “firstname” replaced by calculated nym; “last name” cleared. [Screen shots: before, after.]
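The effect described on this slide can be reproduced on a toy document with the Python standard library. The XML layout, element names and key below are hypothetical, and the keyed hash stands in for whatever nym calculation CAT performs.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical key and input document.
KEY = b"demo-key-not-for-production"
before = (
    "<patient>"
    "<firstname>Jane</firstname>"
    "<lastname>Doe</lastname>"
    "<diagnosis>C50</diagnosis>"
    "</patient>"
)

root = ET.fromstring(before)

# Replace the first name by a calculated nym (truncated keyed hash).
first = root.find("firstname")
first.text = hmac.new(KEY, first.text.encode("utf-8"), hashlib.sha256).hexdigest()[:12]

# Clear the last name.
root.find("lastname").text = ""

after = ET.tostring(root, encoding="unicode")
print(after)
```

Elements that are not named in the transformation, such as the diagnosis here, pass through unchanged.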
33 CAT: Key Handling generate keys store keys import/export ...
34 CAT, DICOM Example
35 CAT: Variable Mappings Editor, DICOM
36 CAT: Transformation Editor, DICOM
37 CAT: DICOM Examples. [Screen shots: original examples; values replaced by nym; values cleared.]
38 Thank you for your attention! Any Questions? Custodix NV, Verlorenbroodstr. 120, B-9820 Merelbeke, Belgium