
De-identification using Harvard Scrubber Umit Topaloglu, Ph.D.


1 De-identification using Harvard Scrubber Umit Topaloglu, Ph.D.

2 Agenda: Introduction; Harvard Scrubber at UAMS; Methodology; De-identification Sample; References; Questions

3 Introduction [Architecture diagram; labels: EPF, JCAPS, HL7 Msg. Folder, Original Messages, Research Portal, WEB-GUI, Transform Messages, Backup Messages, Pulling Concepts, Indexing Concepts, Storing De-id Reports, Harvard Scrubber, caTissue, De-id Reports]

4 De-identification at UAMS Harvard Scrubber (SPIN scrubber) [1]: open-source de-identification software package. Efficiency: an evaluation showed that its efficiency at removing HIPAA identifiers was greater than 98% [2]. Customizable through the regular expression list. Specifications: Operating system: platform independent. Programming language: Java. License: GNU GPL. Restrictions on use by non-academics: none. Project home page: http://spin.nci.nih.gov

5 Harvard Scrubber at UAMS The Cancer Text Information Extraction System (caTIES) [3]: Harvard Scrubber is the default de-identification package, bundled with caTIES v3.6. Customization: Mirth Connect [4] is used to extract and mark the Protected Health Information (PHI) in the message based on the HL7 header (e.g. [[ name ]]). About 15 regular expression rules were added to the built-in expressions. A pathologist name list was added, which is constantly updated by our Tissue Bank.
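The header-based marking step can be illustrated with a minimal Python sketch. This is not the Mirth Connect channel used at UAMS: the function names `parse_pid` and `mark_phi`, the `[[ mrn ]]` marker, and the sample message are all hypothetical, and only the `[[ name ]]` marker comes from the slide. The sketch assumes an HL7 v2 message in which PID-3 carries the MRN and PID-5 carries the patient name.

```python
import re

def parse_pid(hl7_message):
    """Pull the MRN (PID-3) and name parts (PID-5) from an HL7 v2 message."""
    for segment in re.split(r"[\r\n]+", hl7_message.strip()):
        fields = segment.split("|")
        if fields[0] == "PID":
            # Components inside a field are separated by '^'.
            mrn = fields[3].split("^")[0] if len(fields) > 3 else ""
            raw_name = fields[5] if len(fields) > 5 else ""
            # Skip single-letter parts (e.g. a middle initial) to avoid
            # marking every stray letter in the report text.
            name_parts = [p for p in raw_name.split("^") if len(p) > 1]
            return {"mrn": mrn, "name_parts": name_parts}
    return {"mrn": "", "name_parts": []}

def mark_phi(text, phi):
    """Wrap known patient identifiers in [[ ... ]] markers (hypothetical format)."""
    for part in phi["name_parts"]:
        text = re.sub(r"\b" + re.escape(part) + r"\b", "[[ name ]]",
                      text, flags=re.IGNORECASE)
    if phi["mrn"]:
        text = re.sub(r"\b" + re.escape(phi["mrn"]) + r"\b", "[[ mrn ]]", text)
    return text

# Made-up message and report text, for illustration only.
message = "MSH|^~\\&|LAB|UAMS|||200901011200||ORU^R01|1|P|2.3\rPID|1||123456^^^UAMS||DOE^JANE^Q"
phi = parse_pid(message)
print(mark_phi("Specimen labeled Jane Doe, MRN 123456.", phi))
```

Marking the known identifiers before the scrubber runs means the later regular expression pass only has to deal with PHI that is not already listed in the header.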

6 Methodology [Pipeline diagram; labels: HL7 Pipeline, De-id Pipeline (Harvard Scrubber), Tie Pipeline, Remove PHI based on HL7 header, Remove PHI pattern, Remove Pathologists' Name, Regular Expression List, Pathologist Name List, caTIES]
De-identification processes:
1. Remove PHI known to be associated with the patient, based on the HL7 header, e.g. name, MRN, accession number.
2. Remove predictable PHI patterns using a series of regular expression clauses, e.g. SSN=[^a-z^A-Z^0-9]+[0-9]{3}-[0-9]{2}-[0-9]{4}
3. Remove pathologist names that exist in the pathologist name list.
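The three passes can be sketched in Python as follows. This is an illustration under stated assumptions, not the Harvard Scrubber's Java implementation: the `scrub` function, the `xxxx LABEL xxxx` token helper (modeled loosely on the tokens in the sample report), and the test data are hypothetical. The SSN rule is the one from the slide, with two capture groups added here so the delimiter in front of the number survives the replacement.

```python
import re

# Rule from the slide; the capture groups are an addition for this sketch.
REGEX_RULES = {
    "SSN": re.compile(r"([^a-z^A-Z^0-9]+)([0-9]{3}-[0-9]{2}-[0-9]{4})"),
}

def token(label):
    """Build a placeholder token (hypothetical format)."""
    return f"xxxx {label} xxxx"

def scrub(text, header_phi, pathologist_names):
    # Pass 1: remove PHI values already known from the HL7 header.
    for label, value in header_phi.items():
        if value:
            text = re.sub(re.escape(value),
                          lambda m, label=label: token(label),
                          text, flags=re.IGNORECASE)
    # Pass 2: remove predictable patterns via the regular expression list,
    # keeping the delimiter captured in group 1.
    for label, rule in REGEX_RULES.items():
        text = rule.sub(lambda m, label=label: m.group(1) + token(label), text)
    # Pass 3: remove names found in the pathologist name list.
    for name in pathologist_names:
        text = re.sub(r"\b" + re.escape(name) + r"\b",
                      lambda m: token("PATHOLOGIST"),
                      text, flags=re.IGNORECASE)
    return text

# Made-up report text, for illustration only.
report = "Patient Jane Doe, MRN 123456, SSN: 123-45-6789. Signed by Dr. Smith."
print(scrub(report, {"PATIENT_NAME": "Jane Doe", "MRN": "123456"}, ["Smith"]))
```

Running the header pass first keeps the most reliable matches (exact values from the message) ahead of the pattern-based rules, which can only guess at what looks like PHI.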

7 De-identification Sample [CLINICAL DATA] CLINICAL DATA: The patient is a female; abnormal uterine bleeding. Date of operation: xxxDATE7xxx (yr:2009) Name of operation: Endometrium biopsy. Preoperative diagnosis: Not given. xxxx PROVIDER_NAME xxxx DOCTOR_NAME xxxx [DIAGNOSIS] DIAGNOSIS: A) Endometrium, biopsy: Disordered proliferative pattern endometrium; no hyperplasia or carcinoma identified. [SPECIMENS SUBMITTED] SPECIMENS SUBMITTED: ENDOMETRIUM BIOPSY [GROSS DESCRIPTION] GROSS DESCRIPTION: Specimen A is received in formalin labeled with the patient's name [ xxxx PATIENT_NAME4 xxxx ] [ xxxx PATIENT_NAME4 xxxx ] medical record xxxx MRN xxxx and ENDOMETRIAL BIOPSY. The specimen consists of multiple fragments of red-brown, soft tissue cores measuring 2.0 x 1.8 x 0.4 cm in aggregate. The specimen is submitted entirely in 1 yellow cassette labeled A1. xxxx PATHOLOGYISTL1 xxxx xxxx PATHOLOGYISTF1 xxxx xxxDATE7xxx (yr:2009) Electronically signed on xxxDATE7xxx (yr:2009) xxxx TIME1 xxxx * I have reviewed the pertinent gross findings, any and all microscopic slides and the Resident's/ Fellow's interpretations. I have made appropriate editorial changes and have rendered the final diagnosis.

8 References
1. Beckwith BA, Mahaadevan R, Balis UJ, Kuo F. Development and evaluation of an open source software tool for deidentification of pathology reports. BMC Med Inform Decis Mak 2006;6:12.
2. Drake TA, Braun J, Marchevsky A, Kohane IS, Fletcher C, Chueh H, et al. A system for sharing routine surgical pathology specimens across institutions: the Shared Pathology Informatics Network. 2007;38(8):1212.
3. http://caties.cabig.upmc.edu/Wiki.jsp?page=Home
4. http://www.mirthcorp.com/community/mirth-connect

9 Thanks for listening. Questions? Contact information: Umit Topaloglu, Ph.D., utopaloglu@uams.edu, 501-686-7238

