Published by Shanon Montgomery. Modified over 8 years ago.
1
De-identification using Harvard Scrubber
Umit Topaloglu, Ph.D.
2
Agenda
Introduction
Harvard Scrubber at UAMS
Methodology
De-identification Sample
References
Questions
3
Introduction
[Architecture diagram. Labels: EPF; JCAPS; HL7 Msg. Folder; Original Messages; Research Portal; WEB-GUI; Transform Messages; Backup Messages; Pulling Concepts; Indexing Concepts; Storing De-id Reports; Harvard Scrubber; caTissue; De-id Reports]
4
DEIDENTIFICATION at UAMS
Harvard Scrubber (SPIN scrubber) [1]
Open-source de-identification software package.
Efficiency: An evaluation showed that it removed more than 98% of HIPAA identifiers [2].
Customizable through the regular expression list.
Specifications:
Operating System: Platform independent.
Programming Language: Java.
License: GNU GPL.
Any restrictions to use by non-academics: None.
Project Home Page: http://spin.nci.nih.gov
5
Harvard Scrubber at UAMS
The Cancer Text Information Extraction System (caTIES) [3]
Harvard Scrubber is the default de-identification package, bundled with caTIES v3.6.
Customization:
Mirth Connect [4] is used to extract and mark the Protected Health Information (PHI) in the message based on the HL7 header (e.g. [[ name ]]).
About 15 regular expression rules were added to the built-in expressions.
A pathologist name list was added; it is continually updated by our Tissue Bank.
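The header-based marking step can be illustrated with a minimal Python sketch. This is not the Mirth Connect channel used at UAMS; it is a stand-in that assumes the patient name sits in PID-5 and the MRN in the first component of PID-3, and that marking means wrapping each occurrence in the report text with the [[ ... ]] notation shown above. The function names and the sample message are hypothetical.

```python
import re

def pid_fields(hl7_message):
    """Return (name_parts, mrn) read from the PID segment of an HL7 v2 message."""
    for segment in re.split(r"[\r\n]+", hl7_message):
        fields = segment.split("|")
        if fields[0] == "PID":
            # Assumption: PID-3 holds the MRN, PID-5 the name as FAMILY^GIVEN^...
            mrn = fields[3].split("^")[0] if len(fields) > 3 else ""
            name = fields[5] if len(fields) > 5 else ""
            return [part for part in name.split("^") if part], mrn
    return [], ""

def mark_phi(report_text, hl7_message):
    """Wrap header-derived PHI found in the report text in [[ ... ]] markers."""
    name_parts, mrn = pid_fields(hl7_message)
    values = name_parts + ([mrn] if mrn else [])
    if not values:
        return report_text
    # One combined pattern, longest value first, so each occurrence is marked once.
    alternatives = "|".join(re.escape(v) for v in sorted(values, key=len, reverse=True))
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "[[ " + m.group(0) + " ]]", report_text)

# Hypothetical message and report text, for illustration only.
message = "MSH|^~\\&|LAB|UAMS\rPID|1||123456^^^UAMS||DOE^JANE||19700101|F\r"
print(mark_phi("Specimen labeled Jane Doe, medical record 123456.", message))
```

Marking rather than deleting at this stage keeps the original text recoverable until the scrubber replaces the marked spans.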
6
Methodology
[Pipeline diagram. Labels: HL7 Pipeline; De-id Pipeline (Harvard Scrubber); Tie Pipeline; Remove PHI based on HL7 header; Remove PHI pattern; Remove Pathologists' Name; Regular Expression List; Pathologist Name List; caTIES]
De-identification Processes:
1. Remove PHI known to be associated with the patient based on the HL7 header, e.g. name, MRN, accession number.
2. Remove predictable PHI patterns using a series of regular expression clauses, e.g. SSN=[^a-z^A-Z^0-9]+[0-9]{3}-[0-9]{2}-[0-9]{4}
3. Remove pathologist names that exist in the pathologist name list.
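The three processes can be sketched in Python as follows. This is an illustration only, not the Harvard Scrubber's Java implementation: the function name, the replacement tokens, and the sample data are hypothetical, and the only rule taken from the slide is the SSN expression. A capture group is added around its leading delimiter so that the punctuation before the number survives the substitution.

```python
import re

# SSN rule from the slide; the capture group around the delimiter is an addition.
SSN_RULE = re.compile(r"([^a-z^A-Z^0-9]+)[0-9]{3}-[0-9]{2}-[0-9]{4}")

def deidentify(text, known_phi, pathologist_names):
    """Apply the three removal steps in order and return the scrubbed text."""
    # Step 1: remove PHI already known from the HL7 header (label -> value).
    for label, value in known_phi.items():
        if value:
            text = re.sub(r"\b" + re.escape(value) + r"\b",
                          "xxxx " + label + " xxxx", text, flags=re.IGNORECASE)
    # Step 2: remove predictable patterns (only the SSN rule is shown here).
    text = SSN_RULE.sub(r"\1xxxx SSN xxxx", text)
    # Step 3: remove names that appear in the pathologist name list.
    for name in pathologist_names:
        text = re.sub(r"\b" + re.escape(name) + r"\b",
                      "xxxx PATHOLOGIST xxxx", text, flags=re.IGNORECASE)
    return text

# Hypothetical report text and lists, for illustration only.
report = "Patient Jane Doe (record 123456), SSN: 123-45-6789. Signed by Dr. Smith."
header_phi = {"PATIENT_NAME": "Jane Doe", "MRN": "123456"}
print(deidentify(report, header_phi, ["Smith"]))
```

Running the header-based step first means the pattern and name-list steps only have to catch identifiers that the HL7 header does not already supply.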
7
De-identification Sample
[CLINICAL DATA]
CLINICAL DATA: The patient is a female; abnormal uterine bleeding. Date of operation: xxxDATE7xxx (yr:2009) Name of operation: Endometrium biopsy. Preoperative diagnosis: Not given. xxxx PROVIDER_NAME xxxx DOCTOR_NAME xxxx
[DIAGNOSIS]
DIAGNOSIS: A) Endometrium, biopsy: Disordered proliferative pattern endometrium; no hyperplasia or carcinoma identified.
[SPECIMENS SUBMITTED]
SPECIMENS SUBMITTED: ENDOMETRIUM BIOPSY
[GROSS DESCRIPTION]
GROSS DESCRIPTION: Specimen A is received in formalin labeled with the patient's name [ xxxx PATIENT_NAME4 xxxx ] [ xxxx PATIENT_NAME4 xxxx ] medical record xxxx MRN xxxx and ENDOMETRIAL BIOPSY. The specimen consists of multiple fragments of red-brown, soft tissue cores measuring 2.0 x 1.8 x 0.4 cm in aggregate. The specimen is submitted entirely in 1 yellow cassette labeled A1.
xxxx PATHOLOGYISTL1 xxxx xxxx PATHOLOGYISTF1 xxxx xxxDATE7xxx (yr:2009)
Electronically signed on xxxDATE7xxx (yr:2009) xxxx TIME1 xxxx
* I have reviewed the pertinent gross findings, any and all microscopic slides and the Resident's/Fellow's interpretations. I have made appropriate editorial changes and have rendered the final diagnosis.
8
References
1. Beckwith BA, Mahaadevan R, Balis UJ, Kuo F. Development and evaluation of an open source software tool for deidentification of pathology reports. BMC Med Inform Decis Mak 2006;6:12.
2. Drake TA, Braun J, Marchevsky A, Kohane IS, Fletcher C, Chueh H, et al. A system for sharing routine surgical pathology specimens across institutions: the Shared Pathology Informatics Network. 2007;38(8):1212.
3. http://caties.cabig.upmc.edu/Wiki.jsp?page=Home
4. http://www.mirthcorp.com/community/mirth-connect
9
Thanks for listening
Questions?
Contact information:
Umit Topaloglu, Ph.D.
utopaloglu@uams.edu
501-686-7238