All Hands Meeting 2003 BIRN ONTOLOGIES Session Jeffrey Grethe Amarnath Gupta Bertram Ludäscher Maryann E. Martone
Overview
First half:
- Ontologies (brief; not a “total recall”) (Bertram)
- UMLS & “Bonfire” extensions (Jeff)
- Disease maps (Maryann, Amarnath)
- Ontology-enhanced tools (Maryann)
Second half:
- Discussion on policy issues, BIRN ontology curation, etc.
Kinds of “Ontologies” (simplified cheat sheet)
1. Controlled Vocabularies
   - agreed upon range of values (“enumeration type”)
   - e.g. “standard names” for materials, diseases, …
2. Simple Taxonomies, Classification hierarchies (isa)
   - controlled concept vocabulary + subconcept (specialization) relationship
   - e.g. biological taxonomies
3. Graph-like Ontologies
   - isa, has-a, and other relationships (contained_in, causes, activates, …), the latter usually w/o formalized semantics (but agreed upon)
   - e.g. semantic nets, RDF, …
4. Full-fledged Ontologies
   - usually logic-based ontologies; relationships (including isa) are logic consequences of formal concept definitions
   - (partially) defined semantics
You knew this already: e.g.
5. Conceptual Models
[Figure labels: DSM-IV (Diagnostic and Statistical Manual of Mental Disorders): isa, isa, 290; concept, relationship, concept, isa, concept; Parent, Person, offspring.Person; class, relation, part-of]
DSM IV Taxonomy
290
- Dementia of the Alzheimer's Type, With Late Onset, Uncomplicated
- Dementia Due to Creutzfeldt-Jakob Disease
- Dementia Due to Pick's Disease
- Dementia of the Alzheimer's Type, With Early Onset, Uncomplicated
- Dementia of the Alzheimer's Type, With Early Onset, With Delirium
- Dementia of the Alzheimer's Type, With Early Onset, With Delusions
- Dementia of the Alzheimer's Type, With Early Onset, With Depressed Mood
- Dementia of the Alzheimer's Type, With Late Onset, With Delusions
- Dementia of the Alzheimer's Type, With Late Onset, With Depressed Mood
- Dementia of the Alzheimer's Type, With Late Onset, With Delirium
- Vascular Dementia, Uncomplicated
- Vascular Dementia, With Delirium
- Vascular Dementia, With Delusions
- Vascular Dementia, With Depressed Mood
291
- Alcohol Intoxication Delirium
Uses of Ontologies in Data Integration
- “Smart” (conceptual-level) data discovery, browsing, querying
  - looking for “C”, finding “D” (which is C-related)
  - terminological and semantic “glue” between different “data worlds”
- Conceptual / Semantic Modeling of a domain
  - for terminologies: “domain maps” (in description logic)
  - for processes: “process maps”, “disease maps” (open issue) …
- Need to provide …
  - Ontology exchange syntax
  - Ontology extension mechanisms (BONFIRE)
  - Inter-ontology mapping mechanisms (yours vs. mine)
  - Data-to-ontology registration mechanisms (data to concepts)
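A minimal sketch of the “looking for C, finding D” idea, assuming a simple isa taxonomy stored as a child-to-parent dictionary. The concept names echo the DSM-IV slide, but the parent concepts (`Dementia`, `Mental Disorder`), the data set names and the function names are all hypothetical.

```python
# Hypothetical isa taxonomy (child -> parent); names echo the DSM-IV slide.
ISA = {
    "Vascular Dementia": "Dementia",
    "Dementia of the Alzheimer's Type": "Dementia",
    "Dementia Due to Pick's Disease": "Dementia",
    "Dementia": "Mental Disorder",
}

# Hypothetical data sets, each registered to exactly one concept.
REGISTERED = {
    "scan_017": "Vascular Dementia",
    "scan_042": "Dementia Due to Pick's Disease",
    "scan_099": "Mental Disorder",
}


def ancestors(concept):
    """Return every concept reachable from `concept` by following isa links."""
    found = []
    while concept in ISA:
        concept = ISA[concept]
        found.append(concept)
    return found


def discover(query):
    """Find data registered to `query` itself or to any of its subconcepts."""
    return sorted(
        name
        for name, concept in REGISTERED.items()
        if concept == query or query in ancestors(concept)
    )
```

A query for `Dementia` then also finds data registered to the more specific concepts, although none of the data sets mentions `Dementia` directly.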
Generic Standards
- RDF, RDFS: for graph-based ontologies (= explicit statements)
- OWL (Web Ontology Language)
  - Three levels: OWL Lite, OWL DL, OWL Full
  - Some features:
    - Ontology O2 uses (refers to) O1 (over the web!)
    - formalism to exchange, extend, and map between ontologies
[Figure labels: concept, relationship; Parent, Person, offspring.Person]
Generic Tools
- Ontology authoring:
  - Protégé-2000 ontology tool (Stanford)
  - OWL Plug-In (evolving)
- Developers’ corner:
  - Jena-2 Semantic Web Framework (HP), for dealing with OWL ontologies
  - Logic programming extensions (SWI)
Example: Ontology-enhanced Map Integration (OMI)
1. Upload ontologies O1, O2, … (O2, O3 use O1, …)
2. Upload ontology mapping Om :: Oa → Ob
3. Register data sets D1, D2, … to ontology Oa
4. Query data sets through Ob interface!
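The registration and query steps can be sketched as follows. This is an illustration only, not the OMI implementation: it reads the mapping Om as going from concepts of Oa to concepts of Ob (an assumption), and every concept name, data set label and function name below is hypothetical.

```python
# Hypothetical ontology mapping Om, read as Oa -> Ob (concept to concept).
OM = {
    "oa:PurkinjeCell": "ob:Neuron",
    "oa:MediumSpinyCell": "ob:Neuron",
    "oa:Astrocyte": "ob:Glia",
}

# Step 3: data sets registered to concepts of Oa.
REGISTRY = {
    "D1": "oa:PurkinjeCell",
    "D2": "oa:Astrocyte",
    "D3": "oa:MediumSpinyCell",
}


def query_through_ob(ob_concept):
    """Step 4: return data sets whose Oa concept maps to `ob_concept`."""
    return sorted(
        data_set
        for data_set, oa_concept in REGISTRY.items()
        if OM.get(oa_concept) == ob_concept
    )
```

The data sets never mention Ob; the mapping alone makes them reachable through the Ob vocabulary.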
UMLS & BONFIRE (Community Ontology Building)
What is UMLS?
- UMLS (Unified Medical Language System) is a long-term research project begun in 1986 by the National Library of Medicine (NLM)
- UMLS is a collection of knowledge sources designed to facilitate the retrieval and integration of information from multiple machine-readable biomedical information sources
Knowledge Resources: Metathesaurus
- The Metathesaurus is organized by concept or meaning. It provides a uniform, integrated distribution format from about 60 biomedical vocabularies and classifications, and links many different names for the same concepts
- The 2000 edition of the Metathesaurus includes more than 730,000 concepts and 1.5 million concept names from over 50 different biomedical vocabularies, some in multiple languages
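“Organized by concept” can be pictured as an index from many names to one concept identifier. The sketch below is not the Metathesaurus distribution format; the concept IDs, vocabulary names and the `synonyms` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (concept ID, source vocabulary, name). IDs are made up.
ROWS = [
    ("X001", "VocabA", "Parkinson Disease"),
    ("X001", "VocabB", "Paralysis Agitans"),
    ("X001", "VocabC", "Parkinson's disease"),
    ("X002", "VocabA", "Tremor"),
]

names_by_concept = defaultdict(set)  # concept ID -> all of its names
concept_by_name = {}                 # lower-cased name -> concept ID
for concept_id, _source, name in ROWS:
    names_by_concept[concept_id].add(name)
    concept_by_name[name.lower()] = concept_id


def synonyms(name):
    """Return all other names recorded for the concept that `name` denotes."""
    concept_id = concept_by_name.get(name.lower())
    if concept_id is None:
        return set()
    return {n for n in names_by_concept[concept_id] if n.lower() != name.lower()}
```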
Knowledge Resources: Semantic Network
- The Semantic Network contains information about the types or categories (e.g., "Disease or Syndrome," "Virus") to which all concepts have been assigned and the permissible relationships among these types (e.g., "Virus" causes "Disease or Syndrome")
- The semantic types are the nodes in the Network, and the relationships between them are the links. The Network has 132 semantic types and 53 links between them.
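A minimal sketch of how type-level “permissible relationships” can constrain concept-level statements. Only the triple ("Virus", "causes", "Disease or Syndrome") comes from the slide; the second triple, the concept-to-type assignments and the `is_permitted` helper are assumptions for illustration.

```python
# Permissible (subject type, relation, object type) triples.
# Only the first triple comes from the slide; the second is a made-up example.
PERMITTED = {
    ("Virus", "causes", "Disease or Syndrome"),
    ("Bacterium", "causes", "Disease or Syndrome"),
}

# Hypothetical assignment of concepts to semantic types.
SEMANTIC_TYPE = {
    "Influenza virus": "Virus",
    "Influenza": "Disease or Syndrome",
}


def is_permitted(subject, relation, obj):
    """Check a concept-level statement against the type-level network."""
    triple = (SEMANTIC_TYPE.get(subject), relation, SEMANTIC_TYPE.get(obj))
    return triple in PERMITTED
```

Concepts without an assigned semantic type are rejected, because their type lookup yields `None` and no permitted triple contains `None`.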
Knowledge Resources: Semantic Network
- Semantic types: organisms, anatomical structures, biologic function, chemicals, events, physical objects, concepts or ideas, etc.
- Relations: “isa”, “physically related to”, “spatially related to”, “temporally related to”, “functionally related to”, “conceptually related to”, etc.
Hierarchical relation types:
Associative (non-isa) Relationships
Knowledge Resources: Information Sources Map
- The information sources are varied and include bibliographic databases, diagnostic expert systems, and factual databases
- The Information Sources Map or directory contains both human-readable and machine-"processable" information about the scope, location, vocabulary, syntax rules, and access conditions of biomedical databases of all kinds
Related Sites Further Information:
BONFIRE
- BONFIRE will allow BIRN users to accommodate concepts not present in the available pre-defined source ontologies
- Whenever possible, users should employ the relationship terms provided within the UMLS or other source ontologies provided by BIRN
- Once new terms are defined, when will they become part of the BIRN Ontology (BONFIRE)? After appropriate curation?
Ontology Refinement
An Example Data Set
- species = rat (UMLS: C003493)
- region = neostriatum (UMLS: C )
- cell type = medium spiny cell (No Concept Available)
- structure = spiny dendrite (No Concept Available)
- segmented object = dendritic spine (UMLS: C )
- segmented object = dendritic shaft (No Concept Available)
BONFIRE Example
For this data set, no ontology IDs exist for medium spiny cell, spiny dendrite or dendritic shaft.
medium spiny cell (BONFIRE: BID006)
- medium spiny cell “is a” neuron (UMLS: C )
- medium spiny cell “has location” neostriatum (UMLS: C )
- medium spiny cell “is a” neuron AND “has property” dendritic spine (UMLS: C )
spiny dendrite (BONFIRE: BID007)
- spiny dendrite “is a” dendrite (UMLS: C )
- spiny dendrite “contains” dendritic spine (UMLS: C )
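One possible way to hold such user-defined terms in code, sketched under assumptions: the `BonfireTerm` class and `targets` helper are hypothetical and not part of BONFIRE. The BONFIRE IDs are those shown on the slide; the UMLS identifiers are left as `None` because the slide does not give them in full.

```python
from dataclasses import dataclass, field


@dataclass
class BonfireTerm:
    """Hypothetical record for a term added through BONFIRE."""
    bid: str   # BONFIRE ID, as shown on the slide
    name: str
    # Each entry: (relationship, target name, target UMLS ID or None if not given)
    relations: list = field(default_factory=list)


medium_spiny_cell = BonfireTerm("BID006", "medium spiny cell", [
    ("is a", "neuron", None),
    ("has location", "neostriatum", None),
    ("has property", "dendritic spine", None),
])
spiny_dendrite = BonfireTerm("BID007", "spiny dendrite", [
    ("is a", "dendrite", None),
    ("contains", "dendritic spine", None),
])


def targets(term, relationship):
    """Names of the concepts that `term` points to via `relationship`."""
    return [name for rel, name, _cui in term.relations if rel == relationship]
```

Anchoring each new term to existing concepts through relationship terms from the source ontologies is what lets a query for `neuron` or `dendrite` still reach data annotated with the new terms.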
DISEASE MAPS
Glue Knowledge for Mouse BIRN
- Navigating through multi-resolution information
- Linking animal and human imaging data
- Link database concepts to UMLS/Neuronames
- Utilize the neurohomology ontology: M. Bota at USC
- Develop disease and animal model knowledge maps
[Figure labels: brain, cerebellum, cerebellar cortex, Purkinje cell, dendritic spine; Entopeduncular nucleus, Globus pallidus (internal segment); Animal Model, Disease Process]
Knowledge Maps: Parkinson’s Disease
[Figure: knowledge map. Node labels: Pathological feature; Alpha synuclein; Abnormal filaments; Substantia nigra; ubiquitin; symptom; tremor; akinesia; rigidity; Motor deficit; neurons; Lewy Body; Cell inclusion; glia; neuronal degeneration; Dopamine neuron; cortex; Basal forebrain; Filamentous inclusion]
Knowledge Map: Animal Model
[Figure: knowledge map. Node labels: a-synuclein mouse; transgenic animal; Cellular phenotype; Cellular inclusion; Behavioral phenotype; Alpha synuclein; nuclear inclusion; Cytoplasmic inclusion; Motor deficit; ubiquitin; neurons; glia]
Knowledge Maps
[Figure: the two knowledge maps side by side. Parkinson’s Disease labels: Pathological feature; Alpha synuclein; neurons; Lewy Body; Cytoplasmic inclusion; glia; Filamentous inclusion. a-synuclein mouse labels: Cellular phenotype; Cellular inclusion; nuclear inclusion; ubiquitin; Alpha synuclein; neurons; Cytoplasmic inclusion; ubiquitin]
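One simple way to link a disease map to an animal-model map is to look for the concepts they share. The sketch below uses a subset of the labels from the Parkinson’s Disease and animal-model slides; treating each map as a flat set of lower-cased labels is a simplifying assumption, and the function name is hypothetical.

```python
# Subset of concept labels from the two knowledge-map slides (lower-cased).
PARKINSONS_DISEASE = {
    "alpha synuclein", "lewy body", "ubiquitin", "tremor",
    "motor deficit", "neurons", "glia", "filamentous inclusion",
}
A_SYNUCLEIN_MOUSE = {
    "alpha synuclein", "nuclear inclusion", "cytoplasmic inclusion",
    "motor deficit", "ubiquitin", "neurons", "glia",
}


def shared_concepts(map_a, map_b):
    """Concepts that appear in both knowledge maps, in sorted order."""
    return sorted(map_a & map_b)
```

In practice the comparison would be made on concept identifiers rather than on label strings, so that different names for the same concept still match.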
Parkinson’s disease map
[Figure labels: disease; course of illness; disease phase; pathology; disease characteristic; symptoms; pathological process; sign/symptom; proneness/risk; severities; epidemiology; disease classification; prevention, intervention and treatment]
Parkinson’s disease features
[Table columns: Concept A | relationship | Concept B]
Parkinson’s disease processes
[Table columns: Concept A | relationship | Concept B]
Object-Oriented Modeling
TOOLS Custom: Know-ME, OMI, … Generic: Protégé-2000, …
Discussion
Getting Organized … Join the mailing list: Mail to with “subscribe birn-ontologies”