Open Data Sharing and its Statistical Limitations


Pooja Iyer, Barbara Do
RTI International

Outline
I. Data utility vs. data risk
II. Introduction of the standard disclosure techniques
III. Reason for cell suppression
IV. Additional limitation techniques
   a. Limitation of detail
   b. Top/bottom coding
   c. Additive noise

R vs. U³
1. High data utility U: the released data stay faithful in critical ways to the original (analytically valid data)
2. Low disclosure risk R: confidentiality is protected (safe data)

Standard Disclosure Techniques
De-identification of PHI in accordance with HIPAA¹:
   Medical record numbers
   ‘Geographic subdivisions smaller than state’
   Site ID
Other items to consider removing/randomizing:
   All proper nouns, i.e. names, initials, specific geographic locations
   Specific dates: mask to a new category that shows ‘days from randomization’
   Telephone numbers, email addresses, Social Security numbers, all biometric identifiers
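The removal and date-masking steps on this slide can be illustrated with a minimal Python sketch. The field names, the example record, and the helper `deidentify` are hypothetical, chosen only for illustration; this is not a complete HIPAA de-identification procedure.

```python
from datetime import date

# Hypothetical field names; the slides do not define a record layout.
DIRECT_IDENTIFIERS = {
    "name", "initials", "medical_record_number", "site_id",
    "phone", "email", "ssn",
}

def deidentify(record, randomization_date):
    """Drop direct identifiers and mask the visit date as days from randomization."""
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    visit = clean.pop("visit_date")
    # Replace the calendar date with a relative offset in days.
    clean["days_from_randomization"] = (visit - randomization_date).days
    return clean

record = {
    "name": "Jane Doe",                    # made-up example values
    "medical_record_number": "MRN-0001",
    "site_id": "S12",
    "visit_date": date(2018, 3, 15),
    "systolic_bp": 128,
}
result = deidentify(record, randomization_date=date(2018, 3, 1))
print(result)
```

Only the analytic variable and the relative day offset remain in `result`; free-text fields would need separate handling, since proper nouns can also appear inside notes.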

Cell Suppression “In a contingency table, cells with too few observations cannot be released to the public, as it may be easy to infer the identity of these individuals.” ²
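A minimal Python sketch of the idea, assuming a threshold of 5 (the slide does not state a cutoff) and a made-up two-way table; the function name `suppress_small_cells` is hypothetical.

```python
def suppress_small_cells(table, threshold=5):
    """Return a copy of the table with small nonzero counts replaced by None.

    threshold=5 is an assumed cutoff, not a value taken from the slides.
    """
    return {
        row: {
            col: (None if 0 < n < threshold else n)
            for col, n in cols.items()
        }
        for row, cols in table.items()
    }

# Made-up counts for illustration only.
counts = {
    "18-39": {"Condition": 3, "No condition": 41},
    "40-64": {"Condition": 12, "No condition": 37},
}
released = suppress_small_cells(counts)
for row, cols in released.items():
    print(row, cols)
```

This sketch covers primary suppression only. If row or column totals are published as well, a single suppressed cell can be recovered by subtraction, so additional (complementary) cells would also have to be withheld.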

Additional techniques²
Limitation of detail: collapsing categories
Top/bottom coding: adding categories
Additive noise

Additive Noise
Z = X + ε
Z = transformed point
X = original data point
ε = random variable with distribution ε ~ N(0, σ²)

Summary
Limitation of detail
Top/Bottom Coding
Noise Addition

Be sure to explain what MUAC reading is, and why it is potentially identifiable
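The three techniques in this part of the deck can be sketched in a few lines of Python. The age bands, the coding bounds, and the value of σ are assumptions chosen for illustration, and the function names are hypothetical.

```python
import random

def collapse_age(age):
    """Limitation of detail: report a coarse band instead of the exact age."""
    if age < 18:
        return "0-17"
    if age < 65:
        return "18-64"
    return "65+"

def top_bottom_code(x, lower, upper):
    """Top/bottom coding: values beyond the bounds are reported as the bound."""
    return max(lower, min(x, upper))

def add_noise(x, sigma, rng):
    """Additive noise: Z = X + eps, with eps drawn from N(0, sigma^2)."""
    return x + rng.gauss(0.0, sigma)

rng = random.Random(0)  # seeded so the example is reproducible
print(collapse_age(42))
print(top_bottom_code(97, lower=18, upper=90))
print(add_noise(50.0, sigma=2.0, rng=rng))
```

Each function trades utility for protection in a different way: collapsing and coding remove detail deterministically, while noise keeps the scale of the variable but perturbs individual values, with a larger σ giving more protection and less analytic precision.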

References
¹ HHS Office of the Secretary, Office for Civil Rights, & OCR. (2015, November 06). Methods for De-identification of PHI. Retrieved April 05, 2018.
² Matthews, G. J., & Harel, O. (2011). Data confidentiality: A review of methods for statistical disclosure limitation and methods for assessing privacy. Statistics Surveys, 5, 1-29. doi:10.1214/11-SS074
³ Duncan, G. T., Keller-McNulty, S. A., & Stokes, S. L. (2001). Disclosure Risk vs. Data Utility: The R-U Confidentiality Map. National Institute of Statistical Sciences, 5-7. Retrieved April 5, 2018.

Thank you
Pooja Iyer
piyer@rti.org
RTI International