Information Extraction Group Health David Carrell, PhD Group Health Research Institute June 29, 2010.

Presentation transcript:

David’s background...
BA Phil. & Relig. NNC
MA Pol. Science SUNY
U of Washington PhD Pol. Science
Pew Health Policy UCSF
U of Wash Research, IT
Group Health IT, Research

Group Health Research Institute (GHRI)
● Group Health
● Founded 1947, Seattle, WA
● Integrated delivery system (“HMO”)
● ~600K patients in WA (some OR, ID)
● Comprehensive EMR & patient portal (2004+)
● GHRI
● Founded 1983
● 300 staff (50 investigators)
● 2009: >250 active grants ($39M)

Group Health Research Institute (GHRI)
● Applied research
● Epidemiology, health systems, clinical trials, economics...
● Limited bio-informatics expertise
● Collaborative
● HMO-Research Network, Cancer-RN,... MH-RN
● Federated data systems
● NLP vision
● NLP expertise through collaboration
● Bring NLP to the text, locally... other network sites

GHRI & Research Consortia
● HMO Research Network
● Large data repositories
● Common EMR platforms
● Virtual Data Warehouse (VDW)

GHRI & Virtual Data Warehouse (VDW)
● Structured data (legacy + Epic/EMR)
● Minimum
● Integrated care delivery (some claims)
● Diagnoses, procedures, pharmacy, tumor, vitals, census/geocode, etc.

HMO Research Network GHRI & Virtual Data Warehouse (VDW)

GHRI & NLP Adoption

HMO Research Network GHRI & NLP Adoption

GHRI & NLP Adoption
● caBIG TBPT adoption proposal, Jun 2006
● caTIES for pathology & radiology text, ~2007
● Chart note text, May 2007
● GWAS (eMERGE) proposal, Aug 2007
● GATE experimentation, Feb 2008
● Strategic planning conference, Dec 2008
● ARRA Challenge Grant, Apr 2009
● UIMA/cTAKES adoption, Aug 2009
● Proposals... e.g., HMORN multi-site, Jan 2010

● How to bring NLP capacity to clinical text?
● “Cookbooks” (SAS → Java programmers)
● “Parachuted” hardware
● Parachuted virtual machine (?)
● Cloud-based processing
● Security issues
● Other?

GHRI & NLP Adoption
Challenges of Cloud-based Solutions:
● Unfamiliar technologies
● Responsibility sharing (e.g., security)
● Patient privacy
● Institutional risk
● De-identification
● Graduated adoption?

SHARP -- Exploring deployment strategies
SHARP Cloud Security Workshop, Spring 2011
● Educational focus
● Challenges of processing clinical text in a novel security space (virtual firewall?)
● Security best practices
● IRB engagement
● Graduated adoption strategies

NLP Challenge Grant
Natural Language Processing for Cancer Research Network Surveillance Studies
Aim 1:
● Deploy open-source NLP software
● Develop ETL connective tissue
● Build “human capital” (Java, NLP)
Aim 2:
● NLP algorithm boot camp: Recurrent breast cancer diagnoses
● >3000 existing gold standard cases (human reviewed)
Approach:
● Local deployment/programming support
● High-level NLP/bioinformatics expertise via external collaboration
Participants: GHRI (Carrell, Buist, Chubak), Mayo Clinic/Harvard (Savova), Pittsburgh (Chapman), Vanderbilt (Xu).

NLP Challenge Grant – Aim 1
[Diagram. Elements shown: Epic/Clarity sources (Chart Notes, Radiology Reports, Pathology Reports); UIMA/cTAKES NLP (Raw, Rich); Document Manager; Normalized NLP SQL Server Database with columns Document_Identifier and Concept_Code, with example rows for radiology reports]
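The normalized output table named on this slide can be sketched roughly as follows. This is only an illustration, not the GHRI implementation: it uses Python's built-in sqlite3 as a stand-in for SQL Server, keeps the slide's two column names (Document_Identifier, Concept_Code), and everything else (the table name nlp_concepts, the document ids, the concept codes, the helper functions) is hypothetical.

```python
import sqlite3

def build_concept_table(rows):
    """Create an in-memory table with one row per (document, concept) pair.

    sqlite3 is a stand-in here; the slide names SQL Server.
    The table name nlp_concepts is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nlp_concepts ("
        " Document_Identifier TEXT NOT NULL,"
        " Concept_Code TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO nlp_concepts VALUES (?, ?)", rows)
    conn.commit()
    return conn

def documents_with_concept(conn, code):
    """Return the sorted ids of documents in which the given concept code was found."""
    cur = conn.execute(
        "SELECT DISTINCT Document_Identifier FROM nlp_concepts"
        " WHERE Concept_Code = ? ORDER BY Document_Identifier",
        (code,),
    )
    return [row[0] for row in cur.fetchall()]

# Hypothetical document ids and placeholder concept codes, for illustration only.
rows = [
    ("doc-001", "CODE_A"),
    ("doc-001", "CODE_B"),
    ("doc-002", "CODE_A"),
    ("doc-003", "CODE_C"),
]
conn = build_concept_table(rows)
print(documents_with_concept(conn, "CODE_A"))
```

A long, narrow table like this lets cohort queries join concept hits to structured data by document id without re-running the NLP pipeline.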

NLP Challenge Grant – Aim 1

Document Type    Available Documents    Percent NLP Concept-Coded
Chart Notes      20M                    25%
Radiology        4M                     33%
Pathology        1.2M                   2%

NLP Challenge Grant – Aim 2

NLP Challenge Grant – Aim 2
[Diagram: Rec Br Ca?]
● Progress Notes: AE1, AE2, AE3
● Oncology Notes: AE1, AE2
● Radiology Reports: AE1, AE2, AE3
● Pathology Reports: AE1

eMERGE consortium
● Vanderbilt, Mayo, Northwestern, Marshfield, Group Health
● Can EMRs from multiple institutions provide comparable phenotype data for GWAS?
● 14 phenotypes
● Group Health structured data
● Adoption of NLP algorithms developed by others
● “Low-tech” NLP: Text explorer, Assisted chart abstraction

Clinical Text Explorer
● Select text source (chart notes, radiology, pathology, etc.)
● Search: recurrent NEAR breast NEAR cancer
● Date range
● Sample spec’s
● N documents, N patients found
● Search terms highlighted
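A proximity query such as "recurrent NEAR breast NEAR cancer" could be approximated as in the sketch below. The slide does not say how the Text Explorer defines NEAR, so the rule used here (every term occurs, and some set of occurrences fits within a fixed token window) is an assumption, as are the function name near_match and the default window of 5 tokens.

```python
import re
from itertools import product

def near_match(text, terms, window=5):
    """Return True if every term occurs and some choice of one occurrence
    per term spans at most `window` token positions.

    This NEAR rule is an assumed approximation, not the tool's definition.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    positions = [
        [i for i, tok in enumerate(tokens) if tok == term.lower()]
        for term in terms
    ]
    if any(not p for p in positions):
        return False  # at least one term is missing entirely
    # Brute force over occurrence combinations; fine for short notes.
    return any(max(combo) - min(combo) <= window for combo in product(*positions))

terms = ["recurrent", "breast", "cancer"]
hit = near_match("History of recurrent left breast cancer.", terms)
miss = near_match(
    "Recurrent cough for several weeks now. Family history of colon cancer. "
    "Breast exam normal.",
    terms,
)
print(hit, miss)
```

The second note contains all three words but far apart, which is the case a NEAR operator is meant to screen out compared with a plain AND search.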

Assisted Chart Abstraction

Assisted Chart Abstraction
[Diagram. Elements shown: Data Warehouse; SQL Server chart notes (550K pts, 17M notes, 0.8B lines); pre-processed A-Z full-text indexes (ID, Date, Cohort Lists, etc.); NLP Concept Codes; point-and-click Assisted Chart Abstraction GUI outside the EMR; Text capture; Data]

Assisted Chart Abstraction
● Identify cohort
● Selection criteria applied to the patient: Pt Dx/Px/Rx, Pt Visits, Pt Demog
● Selection criteria applied to the notes: Note Date, Note By, Note Type, Note Text
● Assign note priority
● Traditional chart abstraction vs. assisted chart abstraction
● Data
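The "assign note priority" step might look something like the following sketch. The slide does not describe the scoring rule, so the weights, the Note fields, and the function names are all hypothetical; the idea shown is only that notes matching the note-level criteria (type, text) are surfaced to the abstractor first.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Note:
    note_id: str
    note_date: date
    note_type: str
    text: str

def note_priority(note, keywords, preferred_types):
    """Hypothetical score: +2 for a preferred note type, +1 for any keyword hit."""
    text = note.text.lower()
    score = 0
    if note.note_type in preferred_types:
        score += 2
    if any(k.lower() in text for k in keywords):
        score += 1
    return score

def prioritize(notes, keywords, preferred_types):
    """Order notes from highest to lowest score, earlier notes first on ties."""
    return sorted(
        notes,
        key=lambda n: (-note_priority(n, keywords, preferred_types), n.note_date),
    )

# Illustrative notes only; not real patient data.
notes = [
    Note("n1", date(2009, 1, 5), "Office visit", "Patient reports blurry vision"),
    Note("n2", date(2009, 2, 1), "Ophthalmology exam", "Cataract noted in right eye"),
    Note("n3", date(2009, 1, 20), "Office visit", "Referred for cataract evaluation"),
]
ordered = [n.note_id for n in prioritize(notes, ["cataract"], {"Ophthalmology exam"})]
print(ordered)
```

With a ranking like this, an abstractor reads the most promising notes first and can often stop early, instead of paging through the whole chart in date order.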

Assisted Chart Abstraction
Text: “CATARACT”; Note: Op/Ophthal exam; Near: Cataract procedure

Stage                                        Chart Notes       Patients
Initial cohort identification                137,019 (100%)    2903 (100%)
Inclusion criteria (demog., dx, px, etc.)    70,119 (51%)      671 (23%)
Pre-processed text                           284 (0.2%)        122 (4%)
Electronic text                              28,186 (21%)      228 (8%)
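The percentages on this slide can be checked by dividing each stage's count by the initial cohort count. The sketch below does that, reading the larger figures as chart notes and the smaller figures as patients (an interpretation of the flattened slide, not something it states outright); the function name funnel_percentages is hypothetical.

```python
def funnel_percentages(counts):
    """Given (stage, count) pairs with the initial cohort first,
    return (stage, count, percent of initial cohort) triples."""
    base = counts[0][1]
    return [(stage, n, 100.0 * n / base) for stage, n in counts]

# Counts taken from the slide; the notes/patients assignment is assumed.
notes = [
    ("Initial cohort identification", 137019),
    ("Inclusion criteria", 70119),
    ("Pre-processed text", 284),
    ("Electronic text", 28186),
]
patients = [
    ("Initial cohort identification", 2903),
    ("Inclusion criteria", 671),
    ("Pre-processed text", 122),
    ("Electronic text", 228),
]

note_pct = [round(p, 1) for _, _, p in funnel_percentages(notes)]
patient_pct = [round(p) for _, _, p in funnel_percentages(patients)]
print(note_pct)
print(patient_pct)
```

Under that reading, the notes an abstractor must consider fall from 137,019 to 284, about 0.2% of the starting volume.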

Potential SHARP synergy...
National Cancer Institute FOA: Tools for Electronic Data Extraction
● Funding: NCI Contract for software development
● Aim: Enhance/automate existing SEER cancer case identification (largely manual abstraction of EHR/paper charts)
● Approach: Assess, propose, test, modify, develop, deploy technologies that leverage NLP to automate some aspects of SEER workflow
● Participants: IMS, Inc., SEER sites (4), Group Health, Harvard

SHARP – NLP research lab

Questions – Discussion