EleMAP: An Online Tool for Harmonizing Data Elements using Standardized Metadata Registries and Biomedical Vocabularies Jyotishman Pathak, PhD 1 Janey.

Slides:



Advertisements
Similar presentations
Introduction The cancerGrid metadata registry (cgMDR) has proved effective as a lightweight, desktop solution, interoperable with caDSR, targeted at the.
Advertisements

BioPortal Status and Plans September 2011 Ray Fergerson NCBO Project Director Stanford University 1.
Biomedical Informatics Reference Ontologies in Biomedicine Christopher G. Chute, MD DrPH Professor and Chair, Biomedical Informatics Mayo Clinic College.
BioPortal as (the only functional) OOR SandBox (so far) Natasha Noy, Michael Dorf Stanford University.
27 June 2005caBIG an initiative of the National Cancer Institute, NIH, DHHS caBIG the cancer Biomedical Informatics Grid Arumani Manisundaram caBIG - Project.
NCBO-I2B2 Collaboration Overview and Use Cases Nigam Shah
Consistent and standardized common model to support large-scale vocabulary use and adoption Robust, scalable, and common API to reduce variation in clinical.
Copyright © 2001 College of American Pathologists SNOMED Quick Start Guide A Quick Guide to Installing and Working with SNOMED ® CT Help Exit.
Session II: Standardization of phenotyping algorithms and data Jyotishman Pathak, PhD Associate Professor of Biomedical Informatics Department of Health.
Amy Sheide Clinical Informaticist 3M Health Information Systems USA Achieving Data Standardization in Health Information Exchange and Quality Measurement.
©2013 MFMER | slide-1 Building A Knowledge Base of Severe Adverse Drug Events Based On AERS Reporting Data Using Semantic Web Technologies Guoqian Jiang,
The work proposed in this study is an attempt to use Semantic Web technologies for integrating patient clinical data derived from Electronic Health Records.
1 Knowledge Management for Disease Coding (KMDC): Background & Introduction Timothy Hays, Ph.D. Project Manager, Knowledge Management for Disease Coding.
Overview of Biomedical Informatics Rakesh Nagarajan.
The Role of Standard Terminologies in Facilitating Integration James J. Cimino, M.D. Departments of Biomedical Informatics and Medicine Columbia University.
Harmonization of SHARPn Clinical Element Models with CDISC SHARE Clinical Study Data Standards Guoqian Jiang, MD, PhD Mayo Clinic On behalf of CDISC CEMs.
Development Principles PHIN advances the use of standard vocabularies by working with Standards Development Organizations to ensure that public health.
OpenMDR: Generating Semantically Annotated Grid Services Rakesh Dhaval Shannon Hastings.
1 Betsy L. Humphreys, MLS Betsy L. Humphreys, MLS National Library of Medicine National Library of Medicine National Institutes of Health National Institutes.
OpenMDR: Alternative Methods for Generating Semantically Annotated Grid Services Rakesh Dhaval Shannon Hastings.
Linking Diseases and Genes through Informatics Knowledge Bases and Ontologies Joyce A. Mitchell, Ph.D. National Library of Medicine University of Missouri.
© 2015 Mayo Foundation for Medical Education and Research Guoqian Jiang, MD, PhD 1, Richard Kiefer 1, Luke V. Rasmussen 2 ; Huan Mo, MD, MS 3, Jennifer.
Future Use of Stored Samples & Data and the NIH Policy on GWAS and dbGaP NIAID/DAIDS Dione Washington, M.S. -- ProPEP Sudha Srinivasan, Ph.D.-- TRP Tanisha.
Health Ontology Mapper A project initiated within the CTSA (Clinical Translation Science Awards) program Goal: create a semantic interoperability layer.
CaBIG Semantic Infrastructure 2.0: Supporting TBPT Needs Dave Hau, M.D., M.S. Acting Director, Semantic Infrastructure NCI Center for Biomedical Informatics.
LexEVS Overview Mayo Clinic Rochester, Minnesota June 2009.
Using the Open Metadata Registry (openMDR) to create Data Sharing Interfaces October 14 th, 2010 David Ervin & Rakesh Dhaval, Center for IT Innovations.
Public Health Vocabulary Services (a) Gautam Kesarinath – CDC NCPHI Associate Director of Technology, (b) Nikolay Lipskiy – CDC SDO & Interoperability.
Michael F. Huerta, Ph.D. Associate Director for Program Development National Library of Medicine, NIH BD2K CDE Webinar – September 8, 2015 Common Data.
SHARPn High-Throughput Phenotyping (HTP) November 18, 2013.
Open Health Natural Language Processing Consortium (OHNLP)
Survey of Medical Informatics CS 493 – Fall 2004 September 27, 2004.
Query Health Concept-to-Codes (C2C) SWG Meeting #12 March 6,
H Using the Open Metadata Registry (OpenMDR) to generate semantically annotated grid services Rakesh Dhaval, MS, Calixto Melean,
EPA’s Environmental Terminology System and Services (ETSS) Michael Pendleton Data Standards Branch, EPA/OEI Ecoiformatics Technical Collaborative Indicators.
Value Set Resolution: Build generalizable data normalization pipeline using LexEVS infrastructure resources Explore UIMA framework for implementing semantic.
Open Terminology Portal (TOP) Frank Hartel, Ph.D. Associate Director, Enterprise Vocabulary Services National Cancer Institute, Center for Biomedical Informatics.
Cancer MetaData Standards Peter A. Covitz, Ph.D. HL7 RCRIM October 1, 2002.
Sharing Ontologies in the Biomedical Domain Alexa T. McCray National Library of Medicine National Institutes of Health Department of Health & Human Services.
Data Integration and Management A PDB Perspective.
AMIA 2007 Monday, Nov :30-1:30 National Library of Medicine National Institutes of Health U.S. Dept. of Health & Human Services UMLS ® Users Meeting.
Query Health Concept-to-Codes (C2C) SWG Meeting #11 February 28,
A Semantic-Web Representation of Clinical Element Models
This material was developed by Duke University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information.
A LexWiki-based Representation and Harmonization Framework for caDSR Common Data Elements Guoqian Jiang, Ph.D. Robert Freimuth, Ph.D. Harold Solbrig Mayo.
ISO/IEC JTC 1/SC 32 Plenary and WGs Meetings Jeju, Korea, June 25, 2009 Jeong-Dong Kim, Doo-Kwon Baik, Dongwon Jeong {kjd4u,
Metadata-Centered Development for a Community Registry System Andrew Waters, BS 1, Julie Frund, BS 1, Michelle Smerek, BS 1, Anita Walden, BA 1, Guilherme.
Semantics and the EPA System of Registries Gail Hodge IIa/ Consultant to the U.S. Environmental Protection Agency 18 April 2007.
Methods We employ the UMLS Metathesaurus to annotate ICD-9 codes to MedDRA preferred terms (PTs) using the three-step process below. The mapping was applied.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Uses of the NIH Collaboratory Distributed Research Network Jeffrey Brown, PhD for the DRN Team Harvard Pilgrim Health Care Institute and Harvard Medical.
The NINDS Common Data Elements Project February 20, 2014 Wendy R. Galpern, MD, PhD NINDS / NIH American Society for Experimental NeuroTherapeutics | 16.
Implementation of National Standards (LOINC, SNOMED) for Electronic Reporting of Laboratory Results: BioSense Experience Nikolay Lipskiy 1, DrPH, MS, MBA;
Challenges and issues with information sharing: The four pillars of semantic interoperability Douglas B. Fridsma, MD, PhD, FACP University of Pittsburgh.
CaCORE In Action: An Introduction to caDSR and EVS Browsers for End Users A Tool Demonstration from caBIG™ caCORE (Common Ontologic Representation Environment)
National Cancer Institute caDSR Briefing for Small Scale Harmonication Project Denise Warzel Associate Director, Core Infrastructure caCORE Product Line.
Essential SNOMED. Simplifying S-CT. Supporting integration with health
Oncology in SNOMED CT NCI Workshop The Role of Ontology in Big Cancer Data Session 3: Cancer big data and the Ontology of Disease Bethesda, Maryland May.
Semantic Interoperability: caCORE and the Cancer Data Standards Repository (caDSR)  Jennifer Brush.
Semantic Web - caBIG Abstract: 21st century biomedical research is driven by massive amounts of data: automated technologies generate hundreds of.
The UMLS and the Semantic Web
Pharmacogenomics Data Standardization using Clinical Element Models
Co-Champions: Ram D. Sriram (NIST) Leo Obrst (MITRE)
Achieving Semantic Interoperability of Cancer Registries
Leveraging Value Sets from the Value Set Authority Center (VSAC) in a Standards-Based Clinical Data Repository Richard Kiefer1, Luke V. Rasmussen2, Jennifer.
Networking and Health Information Exchange
LexRDF: An Approach for Representing Biomedical Ontologies in RDF
Undergraduate Courses
Carolina Mendoza-Puccini, MD
Interoperability with SNOW SHRINE
Presentation transcript:

eleMAP: An Online Tool for Harmonizing Data Elements using Standardized Metadata Registries and Biomedical Vocabularies Jyotishman Pathak, PhD 1 Janey Wang, MS 2 Sudha Kashyap 2 Rongling Li, MD, PhD 3 Daniel R. Masys, MD 2 Christopher G. Chute, MD, DrPH 1 1 Mayo Clinic, Rochester, MN; 2 Vanderbilt University, Nashville, TN; 3 National Human Genome Research Institute, Bethesda, MD Abstract. A key aspect in enabling high-throughput phenotyping studies requires standardized representation of the phenotype data using Common Data Elements (CDEs) and controlled biomedical vocabularies. In this abstract, we introduce eleMAP—an online tool that allows researchers to harmonize their local phenotype data dictionaries to existing metadata and terminology standards such as the caDSR (Cancer Data Standards Registry and Repository) and SNOMED-CT (Systematized Nomenclature of Medicine-Clinical Terms). Introduction and Background. With recent advances in genotyping technologies, to increase our ability to fully understand the genetic basis of common diseases, the NIH in 2007 funded a multi-site consortia called eMERGE (Electronic Medical Records and Genomics; for high-throughput phenotyping. In particular, the crux of eMERGE is the development of tools and algorithms for extracting phenotypic data, representing actual healthcare events, from the EMR systems at each institution in a consistent and comparable fashion. However, due to lack of common EMR systems or standardization of EMR data across the institutions, one of the goals of eMERGE is develop tools and methods to facilitate harmonization of phenotype data dictionaries and CDEs to terminological and metadata healthcare standards for interoperable representation of phenotype data. To address this requirement, we developed eleMAP—an online tool that allows researchers to harmonize their local phenotype data dictionaries to existing metadata and terminology standards such as the caDSR (Cancer Data Standards Registry and Repository [1]) and SNOMED-CT (Systematized Nomenclature of Medicine [2]). Methods and Results. Our approach to mapping CDEs to pre-coordinated terms and concepts from standardized biomedical terminologies and metadata resources is as conservative as possible. We first try to find an exact string match for the CDE variable provided in the data dictionaries of several eMERGE studies (e.g., cataract, type 2 diabetes). If no match is found, we do an approximate search by normalizing the original search string (e.g., eliminating underscores, hyphen variations) as well as adding a wildcard (*) to the beginning and end of the string. The entire process is automated, and the search stops as soon as a match is found. Furthermore, if CDE has an enumerated list of permissible values (in the data dictionary), we repeat the above procedure to find corresponding terms for the CDE value set contents. We developed an online tool called eleMAP for mapping CDEs from several eMERGE studies to the caDSR and various biomedical vocabularies in the NCBO For querying the caDSR, we use the caDSR HTTP API which provides BioPortal [3]. various forms of functions for querying the CDEs. For querying biomedical vocabularies, we used RESTful Web services from BioPortal. While BioPortal contains approximately 200 biomedical terminologies and ontologies, our searches were restricted to SNOMED-CT and NCI Thesaurus. More information about eleMAP is available at: Figure 1: caDSR and ISO/IEC Model for Metadata Registries Figure 3a: eMERGE data elements mapped to caDSR (November 19th, 2009 caDSR release) Figure 3b: eMERGE data elements mapped to caDSR (April 20th, 2010 caDSR release) Figure 4a: eMERGE data elements mapped to pre-coordinated SNOMED-CT concepts (January 31st; 2010 SNOMED International Release) Figure 4b: eMERGE data elements mapped to post-coordinated SNOMED- CT concepts (January 31st; 2010 SNOMED International Release) References 1.caDSR: Cancer Data Standards Registry and Repository. Last accessed: March 6th, SNOMED-CT: Systematized Nomeclature of Medicine-Clinical Terms. Last accessed: March 6th, Noy, N., et al., BioPortal: ontologies and integrated data resources at the click of a mouse. Figure 2: Preliminary results from eMERGE network CDE harmonization using eleMAP Future Direction and Next Steps Incorporation of additional terminology and metadata resources such as RxNorm and LOINC Collaboration with PhenX and incorporation of standardized measures for use in GWAS Improved visualization and search capabilities Investigate using Clinical Element Models, and ISO Archetypes Release the eleMAP software code, package and database to the open-source community Application in projects beyond eMERGE