Developing an OWL-DL Ontology for Research and Care of Intracranial Aneurysms – Challenges and Limitations

Holger Stenzhorn, Martin Boeker, Stefan Schulz, Susanne Hanser
Institute for Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany

ABCC5 gene encodes the protein MRP5
Naïve Approach:
    ABCC5 ⊑ ∃encodes.MRP5
Critique: There may be gene instances (sensu nucleotide chain) that never encode any Gene Product Y instance.
Solution 1 (preferred): Use value restrictions
    ABCC5 ⊑ ∀encodes.GeneProductY
Solution 2: Consider ABCC5 and MRP5 not as classes but as information entity instances: ABCC5 and MRP5 are each an instance of Information Entity and are linked by the relation encodes.

Headache is associated with intracranial aneurysm
Naïve Approach:
    Headache ⊑ ∃associated_with.Intracranial Aneurysm
    Intracranial Aneurysm ⊑ ∃associated_with.Headache
Critique: Not every headache is a symptom of an intracranial aneurysm, and not every intracranial aneurysm produces headache.
Solution: Introduce (anonymous) dispositions
    Intracranial Aneurysm ⊑ ∃has-disposition.(∀has-realization.Headache)

BBS2 is a protein with unknown function
Naïve Approach:
    BBS2 ⊑ Protein With Unknown Function ⊑ Protein
Critique: Whether a function is known or unknown is ontologically irrelevant.
Objection: Such classes are important for ontology navigation and housekeeping.
Solution 1: Eliminate the irrelevant class
    BBS2 ⊑ Protein
Solution 2: Mark the irrelevant class as a housekeeping or navigational class
    BBS2 ⊑ NAVProtein With Unknown Function ⊑ Protein

Smoking is a risk factor for aneurysm rupture
Naïve Approach:
    Tobacco Smoking ⊑ Aneurysm Rupture Risk Factor
    Tobacco Smoking ⊑ Process
Critique: What is represented is a context-dependent statement on smoking, not a generic property of smoking.
Objection: Such classes are important for ontology navigation and housekeeping.
Solution: Clearly separate the ontology proper (with the top node Particular)
    Tobacco Smoking ⊑ Process ⊑ Endurant ⊑ Particular
from the context ontology (with the top node Particular in Context)
    Tobacco Smoking ⊑ Aneurysm Rupture Risk Factor ⊑ Risk Factor ⊑ Particular in Context

Is a disease an event or a state?
Naïve Approach:
    Brain Neoplasm ⊑ Event
    or
    Brain Neoplasm ⊑ Stative
Critique: Ontologically, there are two entities: the disease and the course of the disease. In this context there is no need for the distinction.
Solution: Introduce the disjunction class State or Process
    State or Process ≡ Event ⊔ Stative
    Brain Neoplasm ⊑ State or Process

Pregnancy is a suspected risk factor for aneurysm rupture
Naïve Approach:
    Pregnancy ⊑ Suspected Aneurysm Rupture Risk Factor
Critique: The context-dependent statement on pregnancy as a risk factor is modified by a modal expression. This is not an issue of ontology.
Objection: The distinction between suspected and proven risks is crucial for the use of the ontology (retrieval).
Solution: Maintain the naïve approach, but encode it within the context ontology.
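The separation of the ontology proper from the context ontology can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two dictionaries are a hypothetical, simplified encoding of the subsumption chains given above as child-to-parent maps, the helper name superclasses is invented for illustration, and the parent of Suspected Aneurysm Rupture Risk Factor is an assumption, since the text does not state it.

```python
# Hypothetical child -> parent maps; the chains are copied from the text above.
ONTOLOGY_PROPER = {
    "Tobacco Smoking": "Process",
    "Process": "Endurant",
    "Endurant": "Particular",
}

CONTEXT_ONTOLOGY = {
    "Tobacco Smoking": "Aneurysm Rupture Risk Factor",
    "Aneurysm Rupture Risk Factor": "Risk Factor",
    "Pregnancy": "Suspected Aneurysm Rupture Risk Factor",
    # Assumption: the text does not say where this class is attached.
    "Suspected Aneurysm Rupture Risk Factor": "Risk Factor",
    "Risk Factor": "Particular in Context",
}


def superclasses(cls, hierarchy):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in hierarchy:
        cls = hierarchy[cls]
        chain.append(cls)
    return chain


# The same class name is classified independently in each ontology.
generic_view = superclasses("Tobacco Smoking", ONTOLOGY_PROPER)
context_view = superclasses("Tobacco Smoking", CONTEXT_ONTOLOGY)
```

Keeping the two maps apart means that a query against the ontology proper never returns a risk-factor class, while retrieval by risk factor remains possible through the context ontology.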
BioTop - Characteristics
    257 classes
    42 relation types
    193 logical restrictions on classes
    57 sufficient criteria for full class definitions
    OWL-DL as representation language
    BFO as upper ontology
    OBO RO relations
    Example of a formal class definition (biotop:Tissue): (figure not included)

ChemTop
According to the modularization principles, the more detailed biochemistry descriptions were separated into an optional add-on called ChemTop. Integration of ChemTop and ChEBI is planned.

Modularization
BioTop supports the following modularization principles:
    - as subdomain boundaries are fuzzy, a limited degree of overlap must be accounted for
    - modules should be dimensioned in a way that they classify rapidly
    - modules follow the same principles of upper-level arrangement
    - bridging files link modules between themselves and with upper-level ontologies

    Upper Level Ontology:         DOLCE, BFO-RO
    Domain Top Level Ontology:    BioTop, ChemTop
    Domain Ontology:              GO, CL, ChEBI

Alignment and mapping
Completed:
    - UMLS Semantic Network mapping
    - GENIA Ontology mapping
    - GO: mapping of the three GO ontologies
    - CO: mapping of the cell ontology
    - Taxdemo: mapping of a sample biological taxonomy
Ongoing:
    - BioTop – DOLCE mapping (requires redesign of the BioTop Quality branch)

Background
The need for semantic standardization in the Life Sciences has been addressed by a dynamic evolution of domain ontologies (e.g. OBO, SNOMED CT), which tend to adhere to foundational principles of ontology design.
Current biomedical ontologies are characterized by:
    - large fragmentation and overlap
    - missing cross-ontology links
    - lack of clear and unambiguous formal definitions of basic terms
    - purpose-specific architecture and design decisions

BioTop - Rationale
    - To consolidate and integrate domain ontologies, bridging the gap to a common upper-level ontology
    - To enforce unambiguous logic-based descriptions of basic entities of biology and medicine using a standardized description language
    - To maintain neutrality with regard to granularity and observer-biased views

Bridging mechanism
Bridge files:
    - ChemTop BioTop Bridge
    - BioTop GO Bridge
    - ChemTop ChEBI Bridge
    - BioTop CL Bridge
    - BioTop BFO-RO Bridge
    - ChemTop BFO-RO Bridge
Schema: an Onto1 Onto2 Bridge links classes of Onto1 and Onto2 by equivalence (≡) and subsumption (⊑) axioms.

Availability
    Sources
    Publications
    Discussion List
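The bridging mechanism can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BioTop bridge file format: BridgeAxiom, direct_superclasses and is_subsumed_by are hypothetical names, and the class identifiers (onto1:Cell, onto2:Cell, upper:MaterialEntity) are invented placeholders rather than classes taken from any of the ontologies listed above. The sketch only shows how ≡ and ⊑ axioms kept in a separate bridge let subsumption be followed across module boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeAxiom:
    left: str      # class in the first ontology
    relation: str  # "equivalent" (≡) or "subclass" (⊑)
    right: str     # class in the second ontology


def direct_superclasses(axioms):
    """Map each class to the classes that directly subsume it."""
    supers = {}
    for ax in axioms:
        supers.setdefault(ax.left, set()).add(ax.right)
        if ax.relation == "equivalent":
            # Equivalence is subsumption in both directions.
            supers.setdefault(ax.right, set()).add(ax.left)
    return supers


def is_subsumed_by(cls, ancestor, supers):
    """Follow subsumption links transitively from cls up to ancestor."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(supers.get(current, ()))
    return False


# Hypothetical bridge content; all class names are placeholders.
bridge = [
    BridgeAxiom("onto1:Cell", "equivalent", "onto2:Cell"),
    BridgeAxiom("onto2:Cell", "subclass", "upper:MaterialEntity"),
]
supers = direct_superclasses(bridge)
linked = is_subsumed_by("onto1:Cell", "upper:MaterialEntity", supers)
```

Because the axioms live in the bridge rather than in either ontology, each module can be loaded and classified on its own, and the bridge is added only when cross-ontology inference is needed.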