Published by Moses Mosley. Modified over 9 years ago.
1
Information Extraction Group Health David Carrell, PhD Group Health Research Institute June 29, 2010
2
BA Phil. & Relig. NNC MA Pol. Science SUNY U of Washington PhD Pol. Science Pew Health Policy UCSF U of Wash Research, IT Group Health IT, Research David’s background...
3
Group Health Research Institute (GHRI)
● Group Health (www.ghc.org)
● Founded 1947, Seattle, WA
● Integrated delivery system (“HMO”)
● ~600K patients in WA (some OR, ID)
● Comprehensive EMR & patient portal (2004+)
● GHRI (www.grouphealthresearch.org)
● Founded 1983
● 300 staff (50 investigators)
● 2009: >250 active grants ($39M)
4
Group Health Research Institute (GHRI)
● Applied research
● Epidemiology, health systems, clinical trials, economics...
● Limited bio-informatics expertise
● Collaborative
● HMO-Research Network, Cancer-RN,... MH-RN
● Federated data systems
● NLP vision
● NLP expertise through collaboration
● Bring NLP to the text: locally... other network sites
5
HMO Research Network Large data repositories Common EMR platforms GHRI & Research Consortia Virtual Data Warehouse (VDW)
6
GHRI & Virtual Data Warehouse (VDW) Structured data (legacy + Epic/EMR) Minimum 1990+ Integrated care delivery (some claims) Diagnoses, procedures, pharmacy, tumor, vitals, census/geocode, etc.
7
HMO Research Network GHRI & Virtual Data Warehouse (VDW)
8
GHRI & NLP Adoption
9
HMO Research Network GHRI & NLP Adoption
10
GHRI & NLP Adoption
● caBIG TBPT adoption proposal, Jun 2006
● caTIES for pathology & radiology text, ~2007
● Chart note text, May 2007
● GWAS (eMERGE) proposal, Aug 2007
● GATE experimentation, Feb 2008
● Strategic planning conference, Dec 2008
● ARRA Challenge Grant, Apr 2009
● UIMA/cTAKES adoption, Aug 2009
● Proposals... e.g., HMORN multi-site, Jan 2010
11
● How to bring NLP capacity to clinical text?
● “Cookbooks” (SAS Java programmers)
● “Parachuted” hardware
● Parachuted virtual machine (?)
● Cloud-based processing
● Security issues
● Other?
12
GHRI & NLP Adoption
13
GHRI & NLP Adoption
Challenges of Cloud-based Solutions:
● Unfamiliar technologies
● Responsibility sharing (e.g., security)
● Patient privacy
● Institutional risk
● De-identification
● Graduated adoption?
14
SHARP -- Exploring deployment strategies
SHARP Cloud Security Workshop, Spring 2011
● Educational focus
● Challenges of processing clinical text in a novel security space (virtual firewall?)
● Security best practices
● IRB engagement
● Graduated adoption strategies
15
NLP Challenge Grant
Natural Language Processing for Cancer Research Network Surveillance Studies
Aim 1:
● Deploy open-source NLP software
● Develop ETL connective tissue
● Build “human capital” (Java, NLP)
Aim 2:
● NLP algorithm boot camp: recurrent breast cancer diagnoses
● >3000 existing gold standard cases (human reviewed)
Approach:
● Local deployment/programming support
● High-level NLP/bioinformatics expertise via external collaboration
Participants: GHRI (Carrell, Buist, Chubak), Mayo Clinic/Harvard (Savova), Pittsburgh (Chapman), Vanderbilt (Xu).
16
NLP Challenge Grant – Aim 1
Diagram labels: Epic/Clarity; Chart Notes; Radiology Reports; Pathology Reports; UIMA/cTAKES NLP; Raw; Rich; Document Manager; Normalized NLP SQL Server Database
Document_Identifier, Concept_Code: Radiology_Report_0000012877143 Radiology_Report_0000018600231 Radiology_Report_0000013134988 Radiology_Report_0000015287109
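The slide describes NLP output landing in a normalized table keyed by Document_Identifier and Concept_Code. A minimal sketch of that layout follows, under stated assumptions: Python's built-in `sqlite3` stands in for SQL Server, and the table name `nlp_concept`, the helper names, and the sample identifiers and codes are hypothetical placeholders rather than anything taken from the GHRI system.

```python
import sqlite3


def build_concept_table(annotations):
    """Load {document_identifier: [concept_code, ...]} into a normalized table.

    One row per (document, concept) pair; repeated codes within a document
    are collapsed by the primary key.
    """
    conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server database
    conn.execute(
        """
        CREATE TABLE nlp_concept (
            document_identifier TEXT NOT NULL,
            concept_code        TEXT NOT NULL,
            PRIMARY KEY (document_identifier, concept_code)
        )
        """
    )
    rows = [
        (doc_id, code)
        for doc_id, codes in annotations.items()
        for code in codes
    ]
    conn.executemany("INSERT OR IGNORE INTO nlp_concept VALUES (?, ?)", rows)
    conn.commit()
    return conn


def documents_with_concept(conn, concept_code):
    """Return the sorted identifiers of documents tagged with concept_code."""
    cur = conn.execute(
        "SELECT document_identifier FROM nlp_concept "
        "WHERE concept_code = ? ORDER BY document_identifier",
        (concept_code,),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    # Placeholder identifiers and codes, invented for illustration only.
    sample = {
        "doc_1": ["CODE_A", "CODE_B", "CODE_A"],
        "doc_2": ["CODE_B"],
    }
    conn = build_concept_table(sample)
    print(documents_with_concept(conn, "CODE_B"))
```

Keeping one row per document and concept pair lets cohort queries join concept codes against structured data with plain SQL, which appears to be the point of the "normalized" label on the slide.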
17
NLP Challenge Grant – Aim 1
Document Type    Available Documents    Percent NLP Concept-Coded
Chart Notes      20M                    25%
Radiology        4M                     33%
Pathology        1.2M                   2%
18
NLP Challenge Grant – Aim 2
20
NLP Challenge Grant – Aim 2
Rec Br Ca? AE1 AE2 AE3 Progress Notes AE1 AE2 Oncology Notes AE1 AE2 AE3 Radiology Reports AE1 Pathology Reports
21
eMERGE consortium
● Vanderbilt, Mayo, Northwestern, Marshfield, Group Health
● Can EMRs from multiple institutions provide comparable phenotype data for GWAS?
● 14 phenotypes
● Group Health structured data
● Adoption of NLP algorithms developed by others
● “Low-tech” NLP: Text explorer, Assisted chart abstraction
22
Clinical Text Explorer
● Select text source (chart notes, radiology, pathology, etc.)
● Search: recurrent NEAR breast NEAR cancer
● Date range
● Sample spec’s
● N documents, N patients found
● Search terms highlighted
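The search "recurrent NEAR breast NEAR cancer" is a proximity query. The slide does not say how the Clinical Text Explorer evaluates NEAR, so the sketch below is only one plausible reading: all terms must fall within a sliding window of `window` tokens, in any order. The function names, the window size, and the bracket-style highlighting are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near_match(text, terms, window=6):
    """Return True if every term occurs inside some run of `window` tokens."""
    tokens = tokenize(text)
    wanted = {term.lower() for term in terms}
    for start in range(len(tokens)):
        # Set inclusion: are all wanted terms present in this window?
        if wanted <= set(tokens[start:start + window]):
            return True
    return False


def highlight(text, terms):
    """Mark each whole-word occurrence of a search term as [TERM]."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "[" + m.group(0).upper() + "]", text)


if __name__ == "__main__":
    terms = ["recurrent", "breast", "cancer"]
    note = "Findings consistent with recurrent cancer of the left breast."
    if near_match(note, terms):
        print(highlight(note, terms))
```

A window over tokens is order-independent, so it matches "recurrent cancer of the left breast" as well as "recurrent breast cancer". A production full-text index would typically answer the same query from stored token positions instead of rescanning each note.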
23
Assisted Chart Abstraction
24
Assisted Chart Abstraction
Diagram labels: Chart notes (550K pts, 17M notes, 0.8B lines); SQL Server; Pre-processed; A-Z Full-text Indexes (ID, Date, Etc.); Cohort Lists; Data Warehouse; NLP Concept Codes; Point-and-click; Outside EMR; Assisted Chart Abstraction GUI; Text capture; Data
25
Assisted Chart Abstraction
● Identify cohort
● Selection criteria applied to the patient: Pt Dx/Px/Rx, Pt Visits, Pt Demog
● Selection criteria applied to the notes: Note Date, Note By, Note Type, Note Text
● Assign note priority
● Traditional chart abstraction; Assisted chart abstraction; Data
26
Assisted Chart Abstraction
Stage                                        Patients       Chart Notes
Initial cohort identification                2903 (100%)    137,019 (100%)
Inclusion criteria (demog., dx, px, etc.)    671 (23%)      70,119 (51%)
Pre-processed text                           122 (4%)       284 (0.2%)
Electronic text                              228 (8%)       28,186 (21%)
Text: “CATARACT”; Note: Op/Ophthal exam; Near: Cataract procedure
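The percentages on this slide can be reproduced by dividing each stage's count by the initial cohort count. The sketch below does that arithmetic; it assumes the smaller number at each stage counts patients and the larger counts chart notes, which is inferred from the column labels and not stated outright. The function and variable names are hypothetical.

```python
# (stage, patients, chart notes), in the order the stages appear on the slide.
STAGES = [
    ("Initial cohort identification", 2903, 137019),
    ("Inclusion criteria (demog., dx, px, etc.)", 671, 70119),
    ("Pre-processed text", 122, 284),
    ("Electronic text", 228, 28186),
]


def percent_of_initial(count, initial):
    """Express count as a percentage of the initial cohort count."""
    return 100.0 * count / initial


def funnel_rows(stages):
    """Return (stage, patients, patient %, notes, note %) for each stage."""
    _, initial_patients, initial_notes = stages[0]
    return [
        (
            name,
            patients,
            percent_of_initial(patients, initial_patients),
            notes,
            percent_of_initial(notes, initial_notes),
        )
        for name, patients, notes in stages
    ]


if __name__ == "__main__":
    for name, patients, pt_pct, notes, note_pct in funnel_rows(STAGES):
        print(f"{name}: {patients} pts ({pt_pct:.1f}%), {notes} notes ({note_pct:.1f}%)")
```

Under this reading, every rounded percentage matches the figure printed on the slide, which supports the patient and chart-note column assignment.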
27
Potential SHARP synergy...
National Cancer Institute FOA: Tools for Electronic Data Extraction
● Funding: NCI Contract for software development
● Aim: Enhance/automate existing SEER cancer case identification (largely manual abstraction of EHR/paper charts)
● Approach: Assess, propose, test, modify, develop, deploy technologies that leverage NLP to automate some aspects of SEER workflow
● Participants: IMS, Inc., SEER sites (4), Group Health, Harvard
28
SHARP – NLP research lab
29
Questions – Discussion