Information Extraction Group Health David Carrell, PhD Group Health Research Institute June 29, 2010.

1 Information Extraction Group Health David Carrell, PhD Group Health Research Institute June 29, 2010

2 BA Phil. & Relig. NNC MA Pol. Science SUNY U of Washington PhD Pol. Science Pew Health Policy UCSF U of Wash Research, IT Group Health IT, Research David’s background...

3 Group Health Research Institute (GHRI)
● Group Health (www.ghc.org)
  ● Founded 1947, Seattle, WA
  ● Integrated delivery system (“HMO”)
  ● ~600K patients in WA (some OR, ID)
  ● Comprehensive EMR & patient portal (2004+)
● GHRI (www.grouphealthresearch.org)
  ● Founded 1983
  ● 300 staff (50 investigators)
  ● 2009: >250 active grants ($39M)

4 Group Health Research Institute (GHRI)
● Applied research
  ● Epidemiology, health systems, clinical trials, economics...
  ● Limited bio-informatics expertise
● Collaborative
  ● HMO-Research Network, Cancer-RN,... MH-RN
  ● Federated data systems
● NLP vision
  ● NLP expertise through collaboration
  ● Bring NLP to the text: locally... other network sites

5 HMO Research Network Large data repositories Common EMR platforms GHRI & Research Consortia Virtual Data Warehouse (VDW)

6 GHRI & Virtual Data Warehouse (VDW)
● Structured data (legacy + Epic/EMR)
● Minimum 1990+
● Integrated care delivery (some claims)
● Diagnoses, procedures, pharmacy, tumor, vitals, census/geocode, etc.

7 HMO Research Network GHRI & Virtual Data Warehouse (VDW)

8 GHRI & NLP Adoption

9 HMO Research Network GHRI & NLP Adoption

10 GHRI & NLP Adoption
● caBIG TBPT adoption proposal, Jun 2006
● caTIES for pathology & radiology text, ~2007
● Chart note text, May 2007
● GWAS (eMERGE) proposal, Aug 2007
● GATE experimentation, Feb 2008
● Strategic planning conference, Dec 2008
● ARRA Challenge Grant, Apr 2009
● UIMA/cTAKES adoption, Aug 2009
● Proposals... e.g., HMORN multi-site, Jan 2010

11 ● How to bring NLP capacity to clinical text?
● “Cookbooks” (SAS → Java programmers)
● “Parachuted” hardware
● Parachuted virtual machine (?)
● Cloud-based processing
● Security issues
● Other?

12 GHRI & NLP Adoption

13 GHRI & NLP Adoption
Challenges of Cloud-based Solutions:
● Unfamiliar technologies
● Responsibility sharing (e.g., security)
● Patient privacy
● Institutional risk
● De-identification
● Graduated adoption?

14 SHARP -- Exploring deployment strategies
SHARP Cloud Security Workshop, Spring 2011
● Educational focus
● Challenges of processing clinical text in a novel security space (virtual firewall?)
● Security best practices
● IRB engagement
● Graduated adoption strategies

15 NLP Challenge Grant
Natural Language Processing for Cancer Research Network Surveillance Studies
Aim 1:
● Deploy open-source NLP software
● Develop ETL connective tissue
● Build “human capital” (Java, NLP)
Aim 2:
● NLP algorithm boot camp: recurrent breast cancer diagnoses
● >3000 existing gold standard cases (human reviewed)
Approach:
● Local deployment/programming support
● High-level NLP/bioinformatics expertise via external collaboration
Participants: GHRI (Carrell, Buist, Chubak), Mayo Clinic/Harvard (Savova), Pittsburgh (Chapman), Vanderbilt (Xu).

16 NLP Challenge Grant – Aim 1
Epic/Clarity: Chart Notes, Radiology Reports, Pathology Reports
UIMA/cTAKES NLP
Raw Rich Document Manager
Normalized NLP SQL Server Database
Document_Identifier Concept_Code
Radiology_Report_0000012877143
Radiology_Report_0000018600231
Radiology_Report_0000013134988
Radiology_Report_0000015287109
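As a rough illustration of the normalized output on slide 16, the sketch below stores one row per (document, concept) pair and looks up documents by concept code. It is only a sketch under stated assumptions: Python's built-in sqlite3 stands in for SQL Server, and the table name `nlp_concept` and both helper functions are hypothetical. Only the two column names come from the slide.

```python
import sqlite3


def build_concept_table(rows):
    """Create an in-memory table with one row per (document, concept) pair.

    sqlite3 is an assumption standing in for the SQL Server database on the
    slide; the table name `nlp_concept` is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nlp_concept ("
        "Document_Identifier TEXT NOT NULL, "
        "Concept_Code TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO nlp_concept VALUES (?, ?)", rows)
    conn.commit()
    return conn


def documents_with_concept(conn, concept_code):
    """Return the sorted, de-duplicated document IDs coded with a concept."""
    cur = conn.execute(
        "SELECT DISTINCT Document_Identifier FROM nlp_concept "
        "WHERE Concept_Code = ? ORDER BY Document_Identifier",
        (concept_code,),
    )
    return [row[0] for row in cur.fetchall()]
```

A long, narrow table like this lets cohort queries join concept codes to structured data by document identifier without re-running the NLP pipeline.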

17 NLP Challenge Grant – Aim 1
Document Type    Available Documents    Percent NLP Concept-Coded
Chart Notes      20M                    25%
Radiology        4M                     33%
Pathology        1.2M                   2%
[Chart: Chart Notes, Radiology, Path]

18 NLP Challenge Grant – Aim 2

19

20 Rec Br Ca? AE1 AE2 AE3 Progress Notes AE1 AE2 Oncology Notes AE1 AE2 AE3 Radiology Reports AE1 Pathology Reports NLP Challenge Grant – Aim 2

21 eMERGE consortium
● Vanderbilt, Mayo, Northwestern, Marshfield, Group Health
● Can EMRs from multiple institutions provide comparable phenotype data for GWAS?
● 14 phenotypes
● Group Health structured data
● Adoption of NLP algorithms developed by others
● “Low-tech” NLP: Text explorer, Assisted chart abstraction

22 Clinical Text Explorer
● Select text source (chart notes, radiology, pathology, etc.)
● Search: recurrent NEAR breast NEAR cancer
● Date range
● Sample specs
● N documents, N patients found
● Search terms highlighted
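The search behavior listed on slide 22 (a NEAR query, counts of documents and patients found, highlighted search terms) can be sketched roughly as follows. This is not the Clinical Text Explorer's implementation. The function names, the `max_span` token window, the bracket-style highlighting, and the note dictionaries with `patient_id` and `text` keys are all assumptions made for illustration.

```python
import itertools
import re

TOKEN = re.compile(r"[a-z0-9]+")


def near_match(text, terms, max_span=10):
    """True if every term occurs and some set of occurrences fits in max_span tokens.

    The window size and the unordered interpretation of NEAR are assumptions.
    """
    tokens = TOKEN.findall(text.lower())
    positions = [
        [i for i, tok in enumerate(tokens) if tok == term.lower()]
        for term in terms
    ]
    if any(not p for p in positions):
        return False
    # Brute force over one occurrence per term; fine for short notes.
    return any(
        max(combo) - min(combo) <= max_span
        for combo in itertools.product(*positions)
    )


def highlight(text, terms):
    """Wrap each whole-word search term in square brackets (hypothetical markup)."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)


def search(notes, terms, max_span=10):
    """Return matching notes plus the document and distinct-patient counts."""
    hits = [n for n in notes if near_match(n["text"], terms, max_span)]
    patients = {n["patient_id"] for n in hits}
    return hits, len(hits), len(patients)
```

For example, `search(notes, ["recurrent", "breast", "cancer"])` would return the matching notes together with the two counts the slide describes.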

23 Assisted Chart Abstraction

24 Assisted Chart Abstraction
[Diagram labels]
● Chart notes: 550K pts, 17M notes, 0.8B lines (SQL Server)
● Pre-processed A-Z Full-text Indexes: A-Z ID, A-Z Date, A-Z Etc.
● Cohort Lists
● Data Warehouse
● NLP Concept Codes
● Assisted Chart Abstraction GUI: point-and-click, outside EMR
● Text capture
● Data

25 Assisted Chart Abstraction
● Identify Cohort
● Selection criteria applied to the patient: Pt Dx/Px/Rx, Pt Visits, Pt Demog
● Selection criteria applied to the notes: Note Date, Note By, Note Type, Note Text
● Assign note priority
● Traditional chart abstraction / Assisted chart abstraction
● Data

26 Assisted Chart Abstraction
Stage                                          Patients       Chart Notes
Initial cohort identification                  2903 (100%)    137,019 (100%)
Inclusion criteria (demog., dx, px, etc.)      671 (23%)      70,119 (51%)
Pre-processed text                             122 (4%)       284 (0.2%)
Electronic text                                228 (8%)       28,186 (21%)
Text: “CATARACT” Note: Op/Ophthal exam Near: Cataract procedure
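The percentages on slide 26 appear to be each stage's count divided by the initial cohort count. Under that assumption, the sketch below recomputes them from the slide's figures. The `FUNNEL` list and the function name are hypothetical, and the stage order follows the transcript rather than any confirmed workflow order.

```python
# Counts copied from slide 26: (stage, patients, chart notes).
FUNNEL = [
    ("Initial cohort identification", 2903, 137019),
    ("Inclusion criteria (demog., dx, px, etc.)", 671, 70119),
    ("Pre-processed text", 122, 284),
    ("Electronic text", 228, 28186),
]


def funnel_percentages(funnel):
    """Express each stage as a percentage of the first (initial cohort) stage."""
    _, base_patients, base_notes = funnel[0]
    return [
        (stage, 100.0 * patients / base_patients, 100.0 * notes / base_notes)
        for stage, patients, notes in funnel
    ]


if __name__ == "__main__":
    for stage, pct_patients, pct_notes in funnel_percentages(FUNNEL):
        print(f"{stage}: {pct_patients:.1f}% of patients, {pct_notes:.1f}% of notes")
```

Run this way, each stage is measured against the initial cohort rather than the preceding stage, which is the reading that reproduces the slide's rounded percentages.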

27 Potential SHARP synergy...
National Cancer Institute FOA: Tools for Electronic Data Extraction
● Funding: NCI Contract for software development
● Aim: Enhance/automate existing SEER cancer case identification (largely manual abstraction of EHR/paper charts)
● Approach: Assess, propose, test, modify, develop, deploy technologies that leverage NLP to automate some aspects of SEER workflow
● Participants: IMS, Inc., SEER sites (4), Group Health, Harvard

28 SHARP – NLP research lab

29 Questions – Discussion

