De-Identification Jules J. Berman, Ph.D., M.D. Panel #: 1, March 8.


2 De-Identification
Reasons for De-Identification
– Exchange and combine large, complex data sets containing human subject info, from multiple sources.
– Conduct human subject research without harming patients.
    – Avoid the impossible task of getting individual informed consent on thousands/millions of records.
    – Exempt from HIPAA regulations.
    – Potentially exempt from Common Rule regulations for human subject research.
– Make vital contributions to medical science, provide better health care, protect the nation's health (as per HITECH).

3 De-Identification
Historical development of de-identification under Common Rule and HIPAA
– In the Common Rule (45 CFR 46, Protection of Human Subjects, 1991), the general concept was that if you remove all links to the person, the data becomes disembodied, and thus safe.
    – Not focused on combining data from multiple sources, accruing data over time, or revisiting source data.
    – Not always safe.
– De-identification (HIPAA Privacy Rule, 2003) permits re-identification and provides a narrow opportunity to bind a record to a unique object without harming patients.

4 De-Identification
Processes in dataset de-identification
– Removing identifiers from individual records (most attention, but least difficult part of the job).
– Making sure there are no records with unique sets of data that can identify an individual (ambiguation).
– *Providing a unique code to each data record that will be the same for every record belonging to an individual (most difficult part of the job).
– *Data scrubbing (removing private information from text).
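The ambiguation step can be illustrated with a small check for records whose combination of values occurs only once in the dataset. This is a minimal sketch, not a method taken from the slides; the function name `unique_records` and the field names `age` and `zip3` are assumptions made up for the example.

```python
from collections import Counter

def unique_records(records, fields):
    """Return the records whose combination of values in `fields` occurs only once."""
    def key(record):
        return tuple(record[f] for f in fields)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]

# Hypothetical records that have already had direct identifiers removed.
records = [
    {"age": 34, "zip3": "212", "diagnosis": "diabetes"},
    {"age": 34, "zip3": "212", "diagnosis": "asthma"},
    {"age": 97, "zip3": "212", "diagnosis": "diabetes"},
]

# The 97-year-old is the only record with that age/zip3 combination,
# so it would need to be generalized or withheld before release.
risky = unique_records(records, ["age", "zip3"])
```

Which fields to treat as potentially identifying in combination is a judgment call that depends on the dataset.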

5 De-Identification
Providing a unique code to each data record that will be the same for every record belonging to an individual and will not be used for other individuals
– Sometimes confusingly referred to as providing a unique identifier for the record.
– John Q. Public “Glucose 85” “9/20/03”
– Replace the name with a one-way hash performed on John Q. Public (the name cannot be computed from the hash value).
– [hash value] “Glucose 85” “9/20/03”

6 De-Identification
– [hash value] “Glucose 85” “9/20/03”
– John Q. Public comes back about a year later and has another glucose test.
– John Q. Public “Glucose 93” “10/15/04”
– Perform the one-way hash on John Q. Public.
– [hash value] “Glucose 93” “10/15/04”
– Combine your data:
    – [hash value] “Glucose 85” “9/20/03”
    – [hash value] “Glucose 93” “10/15/04”
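The glucose example can be sketched in a few lines. The slides do not say which hash function was used; SHA-256 from the Python standard library is an assumption here, as are the name `one_way_code` and the whitespace/case normalization. As the presentation goes on to warn, a plain hash of a name is not by itself adequate protection.

```python
import hashlib

def one_way_code(identifier: str) -> str:
    """Hash an identifier; the same input always yields the same code."""
    # Normalizing case and spacing is an assumption, so that trivially
    # different spellings of the same name map to the same code.
    normalized = " ".join(identifier.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two visits by the same patient, about a year apart.
visit_2003 = (one_way_code("John Q. Public"), "Glucose 85", "9/20/03")
visit_2004 = (one_way_code("John Q. Public"), "Glucose 93", "10/15/04")

# The records can be combined because the codes match,
# even though neither record carries the name.
same_person = visit_2003[0] == visit_2004[0]
```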

7 De-Identification
– Problem: Vulnerable to dictionary attack.
– Explicitly forbidden under HIPAA to achieve de-identification of a dataset by using a one-way hash on an identifier.
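To see why a hashed identifier is vulnerable, consider an attacker who holds a list of candidate names and simply hashes each one. The sketch below assumes an unsalted SHA-256 hash; the function names and the three-name "phone book" are hypothetical.

```python
import hashlib

def one_way_code(identifier: str) -> str:
    """Unsalted one-way hash of an identifier (assumed scheme)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()

def dictionary_attack(hashed_codes, candidate_names):
    """Map released hash codes back to names by hashing every candidate."""
    lookup = {one_way_code(name): name for name in candidate_names}
    return {code: lookup[code] for code in hashed_codes if code in lookup}

# Codes as they might appear in a "de-identified" release.
released = [one_way_code("John Q. Public")]

# The attacker never inverts the hash; a list of plausible names is enough.
phone_book = ["Jane Doe", "John Q. Public", "Richard Roe"]
recovered = dictionary_attack(released, phone_book)
```

The attack costs one hash per candidate name, which is why the size of the identifier space matters more than the strength of the hash function.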

8 De-Identification
– Solution: Use a zero-knowledge protocol to determine if two records belong to the same person.
    – A zero-knowledge protocol is a way of resolving a question without learning anything about the subjects in question, other than the answer to your question.
    – For the two records being compared, each patient's identifier is added to a random number, producing a new random number; if the new random number is the same for both, the patient in both records is the same.
    – If so, assign both records the same unique random code (e.g., a UUID).
    – HIPAA and Common Rule exempt.
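The arithmetic described on this slide can be illustrated as follows. This is only a toy rendering of the one-sentence description, not the published Zero-Check protocol, and it is not secure as written: anyone who learns the offset can recover the identifier. The names `blind`, `assign_codes`, and `shared_offset` are hypothetical.

```python
import secrets
import uuid

def blind(identifier: str, offset: int) -> int:
    """Add a secret random offset to the identifier, read as a big integer."""
    return int.from_bytes(identifier.encode("utf-8"), "big") + offset

def assign_codes(blinded_values):
    """Give equal blinded values the same random UUID, and others a different one."""
    codes = {}
    return [codes.setdefault(value, str(uuid.uuid4())) for value in blinded_values]

# Assumption: the offset is generated once and used for every record compared.
shared_offset = secrets.randbits(128)

names = ["John Q. Public", "John Q. Public", "Jane Doe"]
blinded = [blind(name, shared_offset) for name in names]

# Only the blinded numbers are compared; matching ones share a code
# that carries no information about the patient.
codes = assign_codes(blinded)
```

The resulting codes are random, so unlike a hash of the name they cannot be reproduced by someone holding a list of candidate names.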

9 De-Identification
A big problem in assigning a unique code to each record is the absence of a national patient identifier system in the U.S.
– Without a national patient identifier, you've got to settle for flawed alternate methods (name, social security number, social security number plus birthdate).
– The weakness in many EHR systems is poor patient identification (one patient with multiple identifiers, one identifier with multiple patients).

10 De-Identification
Data scrubbing: removing private information, including identifiers, incriminating and embarrassing remarks, and information unrelated to the necessary use of the data
– Applies to free-text data.

11 De-Identification
Data scrubbing: Two methods
– One way: Remove everything from the data that is found on a list of forbidden words and phrases.
– Another way: Remove everything from the data that is absent from a list of acceptable phrases.

12 De-Identification
Remove everything from the data that is found on a list of forbidden words and phrases
– Produces a readable output, but slowly.
– Requires an up-to-date list of bad words and phrases (patient names, staff names, etc.).
– Reduces identifiers, but does not eliminate all identifiers.
– To the best of my knowledge, has never been used to prepare de-identified data for release to the public.
    – Used in “Data Use Agreements”: no scientific value, because the data is not publicly reviewed or shared.
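A forbidden-list scrubber might look like the sketch below. The function name `scrub_forbidden`, the asterisk mask, and the sample note are assumptions. The example also shows the weakness noted above: a name missing from the list passes through untouched.

```python
import re

def scrub_forbidden(text, forbidden_phrases, mask="*"):
    """Replace every occurrence of a forbidden phrase with a mask."""
    # Longest phrases first, so a full name is masked before any shorter
    # phrase contained in it.
    for phrase in sorted(forbidden_phrases, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        text = re.sub(pattern, mask, text, flags=re.IGNORECASE)
    return text

forbidden = ["John Q. Public", "Dr. Smith"]
note = "John Q. Public was seen by Dr. Smith and Dr. Jones for glucose testing."

scrubbed = scrub_forbidden(note, forbidden)
# "Dr. Jones" survives because it is not on the forbidden list.
```

Each phrase on the list requires its own pass over the text, so the running time grows with the length of the list.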

13 De-Identification
Remove everything from the data that is absent from a list of acceptable phrases
– Removes all identifiers if the list is clean.
– Can be used to make public release data.
– Very fast (thousands of times faster than the alternate method).
– Simple code, in the public domain. (Can be implemented in 16 lines.)
– Chief drawback: Provides an inferior output with respect to readability.
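The acceptable-list approach can be sketched with word pairs (doublets) drawn from approved phrases: a word is kept only if it forms an approved pair with a neighbor, and everything else is masked. This is an illustration in the spirit of the doublet method mentioned in the references, not the author's own code; the function names, the two approved phrases, and the sample sentence are assumptions.

```python
import re

def build_doublets(acceptable_phrases):
    """Collect every adjacent word pair that occurs in the approved phrases."""
    doublets = set()
    for phrase in acceptable_phrases:
        words = re.findall(r"[a-z0-9]+", phrase.lower())
        doublets.update(zip(words, words[1:]))
    return doublets

def scrub_by_whitelist(text, doublets, mask="*"):
    """Keep only words that take part in an approved doublet; mask the rest."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    keep = [False] * len(words)
    for i in range(len(words) - 1):
        if (words[i], words[i + 1]) in doublets:
            keep[i] = keep[i + 1] = True
    return " ".join(w if k else mask for w, k in zip(words, keep))

# Hypothetical approved phrases, e.g. taken from a medical nomenclature.
acceptable = ["adenocarcinoma of the colon", "margins are free of tumor"]
doublets = build_doublets(acceptable)

report = "John Public has adenocarcinoma of the colon; margins are free of tumor per Dr. Smith."
scrubbed = scrub_by_whitelist(report, doublets)
```

Because each word pair is checked with a single set lookup, the cost depends on the length of the text rather than on the size of the approved list. The masked output also shows the readability drawback named above.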

14 De-Identification
Advice regarding de-identification under HITECH
– In the absence of a national patient identifier system, the most important task of information systems is to provide a unique identifier to each patient. There should be certified public protocols to accomplish this, and EHR certification should focus on this task.
– Certify public methods for comparing patient records across institutions to determine when two records belong to the same patient (as per the zero-knowledge protocol).

15 De-Identification
Advice regarding de-identification under HITECH
– Certify public methods that bind a unique number to a patient record (to aggregate records across institutions and across time).
– Certify public protocols, algorithms, and software routines that scrub free-text data.

16 De-Identification
Advice regarding de-identification under HITECH
– Medical informatics is a serious field of study, and can't be mastered by attending a few meetings. Guidelines for curricula that include in-depth discussions of the issues covered in these panels should be written and distributed to educational facilities.

17 De-Identification
References to some of my works
Berman JJ. Confidentiality for Medical Data Miners. Artificial Intelligence in Medicine. 26(1-2):25-36.
Berman JJ. Threshold protocol for the exchange of confidential medical data. BMC Medical Research Methodology. 2:12.
Berman JJ. Concept-Match Medical Data Scrubbing: How pathology datasets can be used in research. Arch Pathol Lab Med. 127. (Concept-Match has been replaced by the doublet method; see the Ruby Programming book.)
Berman JJ. Zero-Check: A Zero-Knowledge Protocol for Reconciling Patient Identities Across Institutions. Archives of Pathology and Laboratory Medicine. 128.
Berman JJ. Nomenclature-based data retrieval without prior annotation: facilitating biomedical data integration with fast doublet matching. In Silico Biology 5, 0029.
Berman JJ. Biomedical Informatics. Jones and Bartlett, Sudbury, MA.
Berman JJ. Perl Programming for Medicine and Biology. Jones and Bartlett, Sudbury, MA.
Berman JJ. Ruby Programming for Medicine and Biology. Jones and Bartlett, Sudbury, MA.
Berman JJ. Methods in Medical Informatics: Fundamentals of Healthcare Programming in Perl, Python, and Ruby. CRC Press, Chapman & Hall/CRC Mathematical & Computational Biology, 2010.
Web site with links to programs and text of papers: blog site: