Enhancing the Quality of ImmPort Data Barry Smith ImmPort Science Meeting, February 27, 2014 With thanks to Anna Maria Masci.

Slides:



Advertisements
Similar presentations
Semantic Interoperability in Health Informatics: Lessons Learned 10 January 2008Semantic Interoperability in Health Informatics: Lessons Learned 1 Medical.
Advertisements

1 Summary Slides from Enhancing Organism Based Disease Knowledge Using Biological Taxonomy, and Environmental Ontologies Ken Baclawski Northeastern University.
The ORCHID project Dr Ian Gaywood, NUH Dr Ira Pande, NUH Professor John Chelsom, City University London.
WHO Agenda: Classifications – Terminologies - Standards
Consistent and standardized common model to support large-scale vocabulary use and adoption Robust, scalable, and common API to reduce variation in clinical.
PubMed Central Mahyar Ahmadpour-B. Kowsar Publicatin Corp. Kowsar Editorial Meeting 1 September 19th, 2013 Tehran, Iran.
VSAC Users’ Forum Thursday, January 15 2:00 – 3:00 PM, EST.
On the Future of the NeuroBehavior Ontology and Its Relation to the Mental Functioning Ontology Barry Smith
1 Ontology in 15 Minutes Barry Smith. 2 Main obstacle to integrating genetic and EHR data No facility for dealing with time and instances (particulars)
Developing a Biomedical Ethics Ontology (BMEO) Robert Arp, Ph.D. Ontology Research Group (ORG) National Center for Biomedical Ontology.
The Core Infectious Disease Ontology. Purpose: To make infectious disease-relevant data deriving from different sources comparable and computable Across.
1 Definitions Value set: A list of specific values, which may – or may not – contain subsets of one or more standard vocabularies, that define or identify:
Development Principles PHIN advances the use of standard vocabularies by working with Standards Development Organizations to ensure that public health.
The IHTSDO Workbench A Terminology Management Tool John Gutai, IHTSDO May 2011 For OHT.
1 Betsy L. Humphreys, MLS Betsy L. Humphreys, MLS National Library of Medicine National Library of Medicine National Institutes of Health National Institutes.
Enriching the Ontology for Biomedical Investigations (OBI) to Improve Its Suitability for Web Service Annotations Chaitanya Guttula, Alok Dhamanaskar,
Veterinary Terminology Services Lab Va-Md Regional College of Veterinary Medicine The Animal Names Terminology Subset of SNOMED CT ® Suzanne L. Santamaria,
Update on Newborn Screening Use Case Advisory Committee on Heritable Diseases in Newborns and Children - Advisory Committee on Heritable Diseases in Newborns.
A Case Study of ICD-11 Anatomy Value Set Extraction from SNOMED CT Guoqian Jiang, PhD ©2011 MFMER | slide-1 Division of Biomedical Statistics & Informatics,
Michael F. Huerta, Ph.D. Associate Director for Program Development National Library of Medicine, NIH BD2K CDE Webinar – September 8, 2015 Common Data.
SNOMED CT – Distributed Content Management Stefan Schulz Content Committee April 2, 2009.
Community Ontology Development Lessons from the Gene Ontology.
Lisa A. Lang, MPP Assistant Director for Health Services Research Information Head, National Information Center on Health Services Research and Health.
Survey of Medical Informatics CS 493 – Fall 2004 September 27, 2004.
Terminology and HL7 Dr Colin Price HL7 UK 11 th December 2003.
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
Leveraging Ontologies for Human Immunology Research Barry Smith, Alexander Diehl, Anna- Maria Masci Presented at Leveraging Standards and Ontologies to.
This material was developed by Duke University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information.
Immunological Images and the ImmPort Database and Analysis Portal Anna Maria Masci Department of Immunology Duke University Buffalo, 24 June 2014.
Component 3-Terminology in Healthcare and Public Health Settings Unit 17-Clinical Vocabularies This material was developed by The University of Alabama.
Copyright OpenHelix. No use or reproduction without express written consent1.
AMIA 2007 Monday, Nov :30-1:30 National Library of Medicine National Institutes of Health U.S. Dept. of Health & Human Services UMLS ® Users Meeting.
Query Health Concept-to-Codes (C2C) SWG Meeting #11 February 28,
Anatomy Ontology Community Melissa Haendel. The OBO Foundry More than just a website, it’s a community of ontology developers.
NLM Value Set Authority Center Curation and delivery of value sets for eMeasures eMeasures Issues Group (eMIG) May 24, 2012 NLM.
HIT Standards Committee Vocabulary Task Force Task Force Report and Recommendation Jamie Ferguson Kaiser Permanente Betsy Humphreys National Library of.
PRO and the NIF / ImmPort Antibody Registries Alexander Diehl Protein Ontology Workshop 6/18/14.
Lab and Messaging CoP 11/29/2012. Agenda Agenda review Introduction to the Specimen Cross-Mapping Table (Specimen CMT) – Background – Expected uses –
1 Ontolog OOR-BioPortal Comparative Analysis Todd Schneider 15 October 2009.
IT Infrastructure Planning Committee Use Case Enhanced SVS Nikolay Lipskiy, MD, DrPH, Centers for Disease Control (CDC), USA Sundak Ganesan, MD, Northrop.
1 An Introduction to Ontology for Scientists Barry Smith University at Buffalo
Converting an Existing Taxonomic Data Resource to Employ an Ontology and LSIDS Jessie Kennedy Rob Gales, Robert Kukla.
Ontologies for Terminologies, Knowledge Representation & Software: Benefits & Gaps (“Don’t make the tea”) (Only a part of Knowledge Representation) Alan.
SNOMED CT A Technologist’s Perspective Gaur Sunder Principal Technical Officer & Incharge, National Release Center VC&BA, C-DAC, Pune.
SAGE Nick Beard Vice President, IDX Systems Corp..
SNOMED CT Vendor Introduction 27 th October :30 (CET) Implementation Special Interest Group Tom Seabury IHTSDO.
May 2007 CTMS / Imaging Interoperability Scenarios March 2009.
FROM ONE NOMENCLATURES TO ANOTHER… Drs. Sven Van Laere.
Ontology in MBSE How ontologies fit into MBSE The benefits and challenges.
Informatics for Scientific Data Bio-informatics and Medical Informatics Week 9 Lecture notes INF 380E: Perspectives on Information.
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Quality Assurance, Ontology Engineering, and Semantic Interoperability.
NAACCR CDA Pilot Project - Overview, Status, and Findings 2009 NAACCR Conference Ken Gerlach, Co-Chair, NAACCR Clinical Data Work Group; Health Scientist,
Canadian SNOMED CT® Extensions Challenges & Lessons learned Presentation to Implementation SIG October 2012 Presented by Linda Parisien and Beverly Knight.
Frequent Criticisms too ambitious recreation of a Textbook of Medicine competing with SNOMED-CT replicates the work done elsewhere: DSM, ICPC, too academic.
1 Standards and Ontology Barry Smith
Public Health - Clinical LOINC Meeting Sundak Ganesan, M.D. Health Scientist, Surveillance Operations Team National Notifiable Disease Surveillance (NNDSS)
Randi Vita, M.D. Better living through ontologies at the Immune Epitope Database La Jolla Institute for Allergy & Immunology Division of Vaccine Discovery.
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
The UMLS and the Semantic Web
SNOMED CT E-Learning Status & Planning September Update (for ELRG)
MedDRA and Ontology Discussion of Strategy
Development of the Amphibian Anatomical Ontology
Ontology in 15 Minutes Barry Smith.
Terminology and HL7 Dr Colin Price
Health Information Exchange Interoperability
OBO Foundry Principles
Carolina Mendoza-Puccini, MD
Ontology in 15 Minutes Barry Smith.
Interoperability and data for open science
Presentation transcript:

Enhancing the Quality of ImmPort Data Barry Smith ImmPort Science Meeting, February 27, 2014 With thanks to Anna Maria Masci

Example of a data submission template to

What kind of artifact is this list?

Alan Rector, Representing Specified Values in OWL: "value partitions" and "value sets“ (2005),

Value set =def. a list of subtypes partitioning a given type

Value set (according to the VSAC) =def. a list of specific values (terms and their codes) derived from standard vocabularies or code systems used to define clinical concepts (e.g. patients with diabetes, clinical visit, reportable diseases).

For VSAC each value set involves: natural language noun phrases from a controlled vocabulary (can in principle vary between communities / disciplines) official name for code system / ontology / version alphanumeric IDs URLs

Basic assumptions Step by step, ImmPort should move away from use of free text fields Use of common value sets increases discoverability and comparability by third-party users Ideally these value sets should be used also by clinicians, researchers and literature curators in neighboring fields ImmPort and its users will benefit if value sets are well-maintained in light of scientific advance

Which controlled vocabularies + code systems should we use when composing value sets for use as ImmPort templates? VSAC Coding Systems FDA mandated terminologies (CDICSC, MedDRA) OBO Foundry ontologies

Figure 3: Typical examples for code lists with multiple names

The existence of duplicate and highly-similar value sets suggests the need for an authoritative repository of value sets and related tooling in order to support the development of quality measures. Invalid codes affect a large proportion of the value sets (19%).

The existence of duplicate and highly-similar value sets suggests the need for an authoritative repository of value sets and related tooling in order to support the development of quality measures. Invalid codes affect a large proportion of the value sets (19%).

Figure 3: Typical examples for code lists with multiple names

Should (can) ImmPort adopt SNOMED CT as a source for value sets?

SNOMED CT (Clinical Terms): Pro > 311,000 concepts; 1,360,000 relationships DHHS / NLM mandated use as core exchange terminology for all US electronic health records international standard; multi-language perpetual guaranteed funding from > 15 countries agreement with LOINC to coordinate on mappings (albeit after 10 years of negotiation) similar agreement with ICD

SDY 30 Comprehensive study of germinal center development and antibody response (Kepler) - Masci

Hypothesis SDY30: Comprehensive study of germinal center development and antibody response inguinal lymph node

Hypothesis inguinal lymph node right inguinal lymph node from mouse Kepler_011_206 left inguinal lymph node from mouse Kepler_011_206

Search Results (16) Structure of deep inguinal lymph node Structure of superficial inguinal lymph node Structure of inferior inguinal lymph node Inguinal lymph node structure Entire inguinal lymph node Entire femoral lymph node Entire inferior inguinal lymph node Inguinal lymph node group Deep inguinal lymph node group Superficial inguinal lymph node group Superior medial inguinal lymph node Superior lateral inguinal lymph node Entire superficial inguinal lymph node Superolateral superficial inguinal lymph node group Inferior superficial inguinal lymph node group Superomedial superficial inguinal lymph node group

Search Results (16) Inguinal lymph node structure Entire inguinal lymph node Inguinal lymph node group Why no term: ‘inguinal lymph node’ ? Well-intended but mistaken ontological design Major fix to address an inference problem, now hard to undo

Problems with SNOMED Massive committee structure Means: slow reaction time for needed changes Causes problems for information-driven translational science Not quite open (license problems) Problems for treatment of non-human subjects

Animal subjects chicken duck fly macaque mouse pig rat

Do we want a different value set here for each different species?

Whatever answer we give to questions like this, we should never abandon considerations of feasibility 1.What can NG cope with? 2.What can the data processing software cope with? Note: only small selections from big lists will be needed 3.What can data providers cope with? Action item under 3.: Explore potential of LIMS system-generated mappings

Brenda Tissue Ontology

disease-specific cell types NOT: cell type is_a tissue

mollusk terms in Brenda

anatomical structures mixed with cell types

Brenda Tissue and Enzyme Source Ontology Confuses type of tissue vs. source of tissue Too little structure Primary hierarchy is a partonomy No clear treatment of species-specificity ‘Source of enzyme’ is not a coherent way to specify an ontology domain Confused definitions (for example the definition provided for “Alzheimer-specific cell” is in fact a definition for Alzheimer’s Disease Too little attention to developments in ontology in last 5 years

Action item Explore alternatives to Brenda for Tissue Subtypes, including Foundational Model of Anatomy Ontology Uberon broadly compatible with OBO Foundry principles

why go for OBO Foundry ontologies (GO, PRO, CL, …) discoverability (open) allows different value sets (and values) to be compared, logically composed versioning, update, scientific basis huge established annotation resources clearly determined domains ensure consistent annotation and division of labor management, trackers quick response (vs. multi-year timelines for some VSAC systems)

Principal reason for using OBO Foundry ontologies quick response provides an opportunity to use the ImmPort workflow to do real science

What is proposed: one column with options Better: HIPC data providers should include entries (ideally with URIs) under more than one of B, C, D, E (primarily B and E) The results can then be used e.g. to identify issues identified through CL definitions, and thereby advance the quality of CL over time and also advance consistency of immunology terminology

More action items Evaluate SNOMED CT as a potential source for the Subject Phenotype template Evaluate the VSAC Value Set Authoring Tool (released in October 2013) Explore developing facility such as to scoop up the terms used in free text fields for review regarding submission to value sets: replace ‘Other’ in existing templates with a ‘request for new term’ submission field

Yet more action items Create catalog of all templates we have Allow template-focused search across all studies Prioritize templates that need to be created Prioritize existing templates that need work Explore LIMS collaborations to allow automatic input into templates