Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use Dr. Friedman on-site visit, Mayo Clinic 3 September 2010.


1 Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use Dr. Friedman on-site visit, Mayo Clinic 3 September 2010

2 SHARP: Area 4: Secondary Use of EHR Data
- 14 academic and industry partners
- Develop tools and resources that influence and extend secondary uses of clinical data
- Cross-integrated suite of projects and products:
  - Clinical Data Normalization
  - Natural Language Processing (NLP)
  - Phenotyping (cohorts and eligibility)
  - Common pipeline tooling (UIMA) and scaling
  - Data Quality (metrics, missing value management)
  - Evaluation Framework (population networks)
© 2009 Mayo Clinic

3 Collaborations
- Agilex Technologies
- CDISC (Clinical Data Interchange Standards Consortium)
- Centerphase Solutions
- Deloitte
- Group Health, Seattle
- IBM Watson Research Labs
- University of Utah
- Harvard Univ. & i2b2
- Intermountain Healthcare
- Mayo Clinic
- Minnesota HIE (MNHIE)
- MIT and i2b2
- SUNY and i2b2
- University of Pittsburgh
- University of Colorado

4 Themes & Projects

5 Major Achievements
- Foster social connections across projects
- Recognition by team members that not all problems must be solved within their team
- NLP and phenotypes
- Phenotypes and CEM normalization
- Shared responsibility for overlapping dependencies

6 The bookends - Projects 1 & 6: Data Normalization & Evaluation. Christopher G. Chute, Stan Huff (Peter Haug)

7 Overview
- Build generalizable data normalization pipeline
- Establish a globally available resource for health terminologies and value sets
- Establish and expand modular library of normalization algorithms
- Iteratively test normalization pipelines, including NLP where appropriate, against normalized forms, and tabulate discordance
- Use cohort identification algorithms in both EMR data and EDW data (normalize against CEMs)
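The "test against normalized forms, and tabulate discordance" step above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the local codes, the canonical names, and the helpers `normalize` and `tabulate_discordance` are hypothetical and are not taken from the SHARP tooling, CEMs, or any real terminology.

```python
from collections import Counter

# Hypothetical local-code -> canonical-name map (illustrative only,
# not a real terminology or value set).
LOCAL_TO_CANONICAL = {
    "GLU": "glucose",
    "Gluc": "glucose",
    "HGB": "hemoglobin",
}


def normalize(record):
    """Map one raw lab record to a canonical (name, value) pair, or None if unmapped."""
    name = LOCAL_TO_CANONICAL.get(record["code"])
    if name is None:
        return None
    return (name, float(record["value"]))


def tabulate_discordance(raw_records, reference):
    """Count concordant, discordant and unmapped results against reference forms."""
    counts = Counter()
    for raw, expected in zip(raw_records, reference):
        got = normalize(raw)
        if got is None:
            counts["unmapped"] += 1
        elif got == expected:
            counts["concordant"] += 1
        else:
            counts["discordant"] += 1
    return counts


# Made-up raw instances and the hand-normalized forms they are checked against.
raw = [
    {"code": "GLU", "value": "5.4"},
    {"code": "Gluc", "value": "6.1"},
    {"code": "K", "value": "4.0"},
    {"code": "HGB", "value": "13.5"},
]
reference = [
    ("glucose", 5.4),
    ("glucose", 6.1),
    ("potassium", 4.0),
    ("hematocrit", 13.5),
]

result = tabulate_discordance(raw, reference)
print(dict(result))
```

Separating "unmapped" from "discordant" keeps gaps in the mapping table distinct from mappings that are present but wrong, which are different problems to fix.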

8 Progress
- Designation of Clinical Element Models (CEMs) as canonical form
- Utilizing use case scenarios (PAD, CPNA, etc.) for CEM normalization
- Exploration into generalizable CEM models: diagnosis, medications, labs
- Development of processes/tools to identify relevant existing CEM models within CEM libraries
- Development of processes to identify missing CEMs for data (and classes of data) in use cases
- Preliminary population of phenotype use cases

9 Planned
- Adopt eMERGE EleMap tooling for CEMs to populate canonical model
- Formalize Meaningful Use vocabularies into LexGrid server
- Design other components of Data Normalization framework (Terminology Services, NHIN connections)
- Model end-to-end flow needed to produce normalized data from structured data and unstructured (natural language) data:
  - High-level description of process for taking “wild-type” data instances to canonical CEM instances
  - Applicability to use-case data as well as to general classes of data
- Adopt UIMA data flows for normalization services
- Examine Regenstrief and SHARP 3 modules

10 Project 2: Clinical Natural Language Processing (cNLP). Dr. Guergana Savova

11 Overview
- Overarching goal: high-throughput phenotype extraction from clinical free text based on standards and the principle of interoperability
- Focus:
  - Information extraction (IE): transformation of unstructured text into structured representations (CEMs)
  - Merging clinical data extracted from free text with structured data
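As a toy illustration of information extraction in the sense above (unstructured text in, structured records out), the sketch below pulls medication mentions out of a sentence with a single regular expression. It is only a pattern-matching stand-in, not how cTAKES or the project's modules work; the pattern, the field names, and the sample sentence are all assumptions made for this example.

```python
import re

# Illustrative pattern: "<drug> <dose> <unit> [<frequency>]".
# The unit and frequency vocabularies are deliberately tiny and hypothetical.
MED_PATTERN = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|g)\b"
    r"(?:\s+(?P<freq>daily|twice daily|weekly))?",
    re.IGNORECASE,
)


def extract_medications(text):
    """Return one structured dict per medication mention found in the text."""
    mentions = []
    for match in MED_PATTERN.finditer(text):
        mentions.append(
            {
                "drug": match.group("drug").lower(),
                "dose": float(match.group("dose")),
                "unit": match.group("unit").lower(),
                "frequency": match.group("freq"),  # None when not stated
            }
        )
    return mentions


note = "Patient takes aspirin 81 mg daily and metformin 500 mg twice daily."
meds = extract_medications(note)
for med in meds:
    print(med)
```

Records in this shape could then be merged with structured data keyed on the same drug names, which is the second focus item above. Free clinical text is far less regular than this sentence, which is why the project plans dedicated NLP modules rather than patterns.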

12 Progress
- Detailed 4-year project plan
- Tasks in execution:
  - Investigative tasks: (1) defining CEMs and attributes as normalization targets for NLP, (2) defining set of clinical named entities and their attributes, (3) methods for cNE
  - Engineering tasks: (1) defining users, (2) incorporating site NLP tools into cTAKES and UIMA, (3) common conventions and requirements, (4) de-identification flow and data sharing
- Forging cross-SHARP collaborations (SHARP 3, PI Kohane and Mandl)

13 Planned Y1
- Gold standard for cNEs, relations and CEMs
- Focus on methods for cNE discovery and populating relevant CEMs (many subtasks)
- Projected module releases:
  - Medication extraction (Nov ’10)
  - CEM OrderMedAmb population (Mar ’11)
  - Deep parser for cTAKES (Nov ’10)
  - Dependency parser for cTAKES (Jan ’11)
- Collaboration with SHARP 3 by providing medication extraction capabilities for the medication SMaRT app

14 Project 3: High-Throughput Phenotyping (HTP). Dr. Jyoti Pathak

15 Overview
- Overarching goal: to develop techniques and algorithms that operate on normalized EMR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings
- Focus:
  - Portability of phenotyping algorithms
  - Representation of phenotyping logic
  - Measure goodness of EMR data
06/21/10 © 2010 Mayo Clinic
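One way to picture "representation of phenotyping logic" is to hold the criteria as data and evaluate them against normalized patient records, so the same definition can be carried between sites. The sketch below is a minimal example under that assumption. The criteria format, the codes `DX_A` and `DX_B`, the lab name `lab_x`, and the threshold are all hypothetical; this is not an eMERGE algorithm or the project's representation.

```python
# A phenotype expressed as data: every "require" criterion must match and
# no "exclude" criterion may match. Codes and thresholds are made up.
PHENOTYPE = {
    "require": [
        {"field": "diagnoses", "any_of": {"DX_A"}},
        {"field": "labs", "name": "lab_x", "below": 0.9},
    ],
    "exclude": [
        {"field": "diagnoses", "any_of": {"DX_B"}},
    ],
}


def matches(criterion, patient):
    """Evaluate a single criterion against one normalized patient record."""
    if criterion["field"] == "diagnoses":
        return bool(criterion["any_of"] & set(patient["diagnoses"]))
    if criterion["field"] == "labs":
        value = patient["labs"].get(criterion["name"])
        # A missing lab value never satisfies a threshold criterion.
        return value is not None and value < criterion["below"]
    raise ValueError(f"unknown criterion field: {criterion['field']}")


def select_cohort(phenotype, patients):
    """Return the ids of patients meeting all required and no excluded criteria."""
    return [
        p["id"]
        for p in patients
        if all(matches(c, p) for c in phenotype["require"])
        and not any(matches(c, p) for c in phenotype["exclude"])
    ]


patients = [
    {"id": "p1", "diagnoses": ["DX_A"], "labs": {"lab_x": 0.7}},
    {"id": "p2", "diagnoses": ["DX_A", "DX_B"], "labs": {"lab_x": 0.6}},
    {"id": "p3", "diagnoses": ["DX_A"], "labs": {}},
    {"id": "p4", "diagnoses": ["DX_C"], "labs": {"lab_x": 0.5}},
]

cohort = select_cohort(PHENOTYPE, patients)
print(cohort)
```

Patient `p3` shows why data quality matters to phenotyping: the record is excluded only because a value is missing, not because the criterion was evaluated and failed.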

16 Progress
- Explored use case phenotypes from eMERGE network for HTP process validation
- Representation of phenotype descriptions and data elements using Clinical Element Models
- Preliminary execution of phenotyping algorithms (Peripheral Arterial Disease) to compare aggregate data

17 Planned
- Interaction and collaboration with Data Normalization and NLP teams to develop “data collection widgets”
- Representation of phenotyping execution logic in a machine-processable format/language
- Development of machine learning methods for semi-automatic cohort identification

18 Project 4: Infrastructure & Scalability. Jeff Ferraro, Marshal Schor, Calvin Beebe

19 UIMA exploitation
- Some initial discussions on UIMA were held in a meeting at MIT attended by Peter Szolovits (MIT) and Guergana Savova (Harvard) and some of their team members.
- A plan is underway for a UIMA "deep dive" for other members from Intermountain Healthcare and Mayo.
- A discussion is pending to understand how UIMA might fit with RPE (in particular, BPEL).
- RPE = Retrieve Process for Execution: an IHE (Integrating the Healthcare Enterprise) profile to automate collaborative workflow between healthcare and secondary use domains.

20 Infrastructure Progress
- Code repository: reviewed requirements (e.g. SVN); pre-release work areas are needed for project teams; the bulk of materials will be in a public repository.
- Licensing compatibility discussion: initial discussions on Open Source licensing that is consistent with UIMA and other project teams' tooling. Will need to survey teams.
- Initial platform discussions: still working on Sandbox (“Shared”) environment; need to consider Cloud in later phases of project.

21 Planned
- Review repository options with: ONC, SourceForge, Open Health Tools
- Need to establish straw man proposal for Sandbox configuration
- Conduct cross-project discussions:
  - Inventory tools that can be shared
  - Inventory data that can be shared
  - Identify shared environment site location
- Initiate high-level requirements gathering

22 Project 5: Data Quality. Dr. Kent Bailey (Kim Lemmerman)

23 Overview
- Support data quality and ascertain data quality issues across projects
- Deploy and enhance methods for missing or conflicting data resolution
- Integrate methods into UIMA pipelines

24 Progress & Planned
- Integrate across projects and gather requirements and standards to establish data quality plan and metrics
- Compare expected quality of data to actual data quality
- Provide recommendations and methods to improve data quality and/or possible outcomes
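The comparison of expected to actual data quality can be sketched with one simple metric, per-field completeness. This is an illustrative assumption only: the slides do not say which metrics the data quality plan will use, and the field names, expected rates, and helpers `completeness` and `quality_gaps` are hypothetical.

```python
def completeness(records, fields):
    """Fraction of records with a non-missing value, computed per field."""
    if not records:
        raise ValueError("at least one record is required")
    total = len(records)
    return {
        field: sum(1 for r in records if r.get(field) is not None) / total
        for field in fields
    }


def quality_gaps(actual, expected):
    """Fields whose actual completeness falls short of the expected rate."""
    return {
        field: round(expected[field] - actual[field], 3)
        for field in expected
        if actual[field] < expected[field]
    }


# Made-up records; None marks a missing value.
records = [
    {"age": 54, "blood_pressure": "120/80", "smoking": "never"},
    {"age": 61, "blood_pressure": None, "smoking": "former"},
    {"age": 47, "blood_pressure": "135/85", "smoking": None},
    {"age": 70, "blood_pressure": None, "smoking": "current"},
]
expected = {"age": 1.0, "blood_pressure": 0.75, "smoking": 0.75}

actual = completeness(records, list(expected))
gaps = quality_gaps(actual, expected)
print(actual)
print(gaps)
```

Here only `blood_pressure` falls below its expected rate, so it is the single field reported as a gap.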

25 Cross-Area 4 Program Efforts. Lacey Hart

26 Progress
- Started early with face-to-face collaboration; cross-knowledge pollination
- Individual project efforts synergized, with timelines in sync; use cases vetted and determined for the first six months of focus
- IRB and data sharing issues have been raised, best practices shared, and an inventory of existing agreements between institutions reviewed

27 Planned
- Best practices for IRB submissions and template protocol material will be made available with applicable state implications
- Data use agreements will be completed across sites where needed in the short term; effort for a ‘consortium’ agreement will commence for long-term data sharing needs

28 Cross-ONC Efforts. Dr. Christopher Chute

29 SHARP Area Synergies
1. Security: ensure the integrity of pipelined data cannot be compromised
2. Cognitive: explore how normalized data and phenotypes can contribute to decisions
3. Applications: potential for shared architectural strategies

30 Beacon Synergies
- High-throughput data normalization and phenotyping (SHARP)
- Applied to population laboratory (Beacon)
- Validate on consented sub-samples
- Potential to include ALL patients in population area, regardless of provider

31 SHARP Area 4: More information… http://sharpn.org

32 SE MN Beacon: More information… http://informatics.mayo.edu/beacon

