The BIRNLex: Principles and practices of community ontology development
Maryann Martone

The Ontology Task Force: Cross Test Beds
Carol Bean (co-chair), NIH-NCRR
Maryann Martone (co-chair), BIRN CC
Amarnath Gupta, BIRN CC
Bill Bug, Mouse BIRN
Christine Fennema-Notestine, Morph BIRN
Jessica Turner, FBIRN
Jeff Grethe, BIRN CC
Daniel Rubin, NCBO
David Kennedy, Morph BIRN
Provide a dynamic knowledge infrastructure to support integration and analysis of BIRN federated data sets, one that readily accepts novel data from researchers for inclusion in this analysis
Identify and assess existing ontologies and terminologies for summarizing, comparing, merging, and mining datasets. Relevant subject domains include clinical assessments, assays, demographics, cognitive task descriptions, neuroanatomy, imaging parameters/data provenance in general, and derived (fMRI) data
Identify the resources needed to achieve the ontological objectives of individual test beds and of BIRN overall. This may include finding other funding sources, making connections with industry and other consortia facing similar issues, and planning a strategy to acquire the necessary resources

Concept Based User Interface
Developed based on feedback from the community at the Ontology boot camp and test bed AHMs
Provides access to BIRN ontological sources
Allows for the construction of queries based on familiar concepts; the architecture handles the generation of integrated views
Currently, over 2000 tables registered from BIRN databases and from internal and external knowledge sources

BONFIRE: BIRN Knowledge Sources
Bonfire Ontology Browser and Extension Tool

BIRNLex
Grew out of BIRN Ontology Workshops
– UMLS difficult to work with
  » Duplicate terms
  » No definitions
  » Inconsistent and sometimes incomprehensible relationships
– Meant to cover all domains of interest to BIRN: imaging, neuroanatomy, experimental techniques, behavior
– Presented at this year’s SFN meeting; version 1.0 to be released very soon
– Draft version posted on the web (see OTF Wiki)
– Current domain areas: neuroanatomy, behavioral paradigms, mouse strain nomenclature, experimental procedures
– Developed in Protégé using OWL

BIRNLex - General Principles
OTF has adopted and refined best practices for ontology development being promoted by NCBO/OBO Foundry
– Re-use existing community ontologies covering BIRN-required domains, e.g., OBI, CARO, BFO, GO Cellular Component, NCBI taxonomy
– Novel domains (behavioral paradigms, imaging protocols, etc.): submit to OBO Foundry or contribute to the relevant community effort (e.g., imaging experiments and processing going into OBI)
– All BIRNLex entities must have Aristotelian definitions (genus and differentia)
OTF and other BIRN members are holding regular curation sessions
– Heavy use of curatorial metadata to support automated evaluation, analysis, and maintenance of the ontology
Use OWL and other supporting technologies, enabling us to leverage a variety of mature and emerging tools to support ontology curation, ontology-centric annotation, and ontology-driven semantic querying

Core Ontologies
Imported into Protégé
– BFO: Basic Formal Ontology
– SKOS (Simple Knowledge Organization System)
  » Preferred labels
  » Alternative labels
– OBI: Ontology for Biomedical Investigations
Manually imported: NeuroNames brain anatomy, paradigm classes from Peter Fox
– Each term is identified by its source and its source unique identifier
Included cross-references to UMLS identifiers
– Utilize synonyms
– Maps to other efforts using UMLS
End user doesn’t have to worry about these categories

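A minimal sketch of how a term record along these lines might look: each term carries its source and source-unique identifier, a preferred label, alternative labels, and an optional UMLS cross-reference. The class name `Term`, the helper `build_label_index`, the identifier `EXAMPLE-ID-1`, and the synonym shown are illustrative assumptions, not BIRNLex's actual schema or data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Term:
    """Hypothetical term record; field names are assumptions for illustration."""
    source: str                         # ontology or terminology the term came from
    source_id: str                      # identifier that is unique within that source
    pref_label: str                     # preferred label (in the spirit of skos:prefLabel)
    alt_labels: Tuple[str, ...] = ()    # alternative labels (in the spirit of skos:altLabel)
    umls_cui: Optional[str] = None      # UMLS cross-reference, if one is known

    @property
    def key(self) -> Tuple[str, str]:
        """Source plus source-unique identifier together identify the term."""
        return (self.source, self.source_id)


def build_label_index(terms: Iterable[Term]) -> Dict[str, Term]:
    """Map every preferred and alternative label (lower-cased) to its term."""
    index: Dict[str, Term] = {}
    for term in terms:
        for label in (term.pref_label, *term.alt_labels):
            index[label.lower()] = term
    return index


amygdala = Term(
    source="NeuroNames",
    source_id="EXAMPLE-ID-1",           # placeholder, not a real NeuroNames identifier
    pref_label="Amygdala",
    alt_labels=("Amygdaloid body",),    # illustrative synonym
    umls_cui=None,                      # left unset rather than guessed
)
index = build_label_index([amygdala])
```

With an index like this, a user can search by whichever synonym they know (`index["amygdaloid body"]`) and still land on the same source-identified term.
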
Use of Foundational Ontologies
UBO - Upper Bio Ontology
BFO - Basic Formal Ontology
Facilitates alignment with other ontologies across scales and modalities
Adopted framework proposed by Barry Smith and colleagues for biological ontologies (Rosse et al., 2005, AMIA proceedings)
Based anatomical work on the FMA
Don’t want to concern ourselves with the upper-level ontologies; want to focus on our domain
Using as a rough guide for now while these ontologies are being built

BIRNLex is a Lexicon, not a terminology
A is a B which has C
– Defines class structure
– Defines properties
Electron microscope is a type of microscope which uses electrons to form an image
– Microscope
  » Electron microscope
– Has property
  » Image formation

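The "A is a B which has C" pattern can be sketched as a small record with a genus and a differentia. This is only an illustration of the definition template from the slide; the class name `AristotelianDefinition` and its methods are assumptions, not part of BIRNLex or its tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AristotelianDefinition:
    """Hypothetical container for a genus-differentia definition."""
    term: str          # A: the class being defined
    genus: str         # B: the parent class
    differentia: str   # C: what distinguishes A from other kinds of B

    def render(self) -> str:
        """Render the definition in the 'A is a type of B which C' form."""
        return f"{self.term} is a type of {self.genus} which {self.differentia}"

    def is_complete(self) -> bool:
        """A definition is usable only if all three parts are filled in."""
        return all(part.strip() for part in (self.term, self.genus, self.differentia))


electron_microscope = AristotelianDefinition(
    term="Electron microscope",
    genus="microscope",
    differentia="uses electrons to form an image",
)
print(electron_microscope.render())
```

A completeness check like `is_complete()` is one simple way a curation script could flag entities whose definitions lack a genus or a differentia.
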
BIRNLex Curation
Meet on a semi-regular basis (many interruptions)
Identify domains and strategies
– Not mixing structure and function is a big help in moving forward
  » We slip up quite a bit
Revise, revise, revise
Tools for biologists are inadequate; better if you’re a computer scientist (I handed off BIRNLex, reluctantly, several months ago)
Divide up the work
Assign curation status
– We don’t argue too long
– Curated, graph position temporary, uncurated, raw import from source

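As a rough sketch, the four curation statuses named on this slide could be recorded as an enumeration so that scripts can summarize how much of the lexicon still needs work. The enum, the helper functions, and the status assigned to each term below are hypothetical; they are not the actual BIRNLex curatorial metadata.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class CurationStatus(Enum):
    """The four statuses listed on the slide."""
    CURATED = "curated"
    GRAPH_POSITION_TEMPORARY = "graph position temporary"
    UNCURATED = "uncurated"
    RAW_IMPORT = "raw import from source"


# Hypothetical assignments, for illustration only.
statuses: Dict[str, CurationStatus] = {
    "Amygdala": CurationStatus.CURATED,
    "Oddball paradigm": CurationStatus.GRAPH_POSITION_TEMPORARY,
    "Radial maze": CurationStatus.UNCURATED,
    "Alveus": CurationStatus.RAW_IMPORT,
}


def status_report(term_statuses: Dict[str, CurationStatus]) -> Counter:
    """Count how many terms carry each curation status."""
    return Counter(term_statuses.values())


def needs_attention(term_statuses: Dict[str, CurationStatus]) -> List[str]:
    """List, alphabetically, every term that is not yet fully curated."""
    return sorted(
        term for term, status in term_statuses.items()
        if status is not CurationStatus.CURATED
    )
```
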
Strict rules for developing taxonomies
Behavioral Paradigm
– Oddball paradigm
  » Auditory oddball paradigm
  » Visual oddball paradigm
Working memory paradigm
– Serial item recognition task
– Radial maze
Forebrain:
– Has part: Amygdala
Limbic system
– Has part: Amygdala

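One way to read this slide is that the is-a taxonomy and the has-part relation are kept in separate structures. The sketch below assumes each class has a single is-a parent, while a part such as the amygdala may belong to more than one whole. That single-parent assumption and the function names are illustrative, not stated in the source.

```python
from typing import Dict, List, Set

# is-a edges (child -> parent), taken from the examples on the slide.
IS_A: Dict[str, str] = {
    "Oddball paradigm": "Behavioral Paradigm",
    "Auditory oddball paradigm": "Oddball paradigm",
    "Visual oddball paradigm": "Oddball paradigm",
    "Serial item recognition task": "Working memory paradigm",
    "Radial maze": "Working memory paradigm",
}

# has-part edges (whole -> parts), kept apart from the taxonomy.
HAS_PART: Dict[str, Set[str]] = {
    "Forebrain": {"Amygdala"},
    "Limbic system": {"Amygdala"},
}


def ancestors(term: str) -> List[str]:
    """Walk the is-a chain upward, nearest parent first."""
    chain: List[str] = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def wholes_containing(part: str) -> List[str]:
    """Return every whole that lists `part` under has-part, alphabetically."""
    return sorted(whole for whole, parts in HAS_PART.items() if part in parts)
```

Here `ancestors("Auditory oddball paradigm")` follows only is-a edges, while `wholes_containing("Amygdala")` answers the separate part-whole question without touching the taxonomy.
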
The state of Neuroanatomy in BIRN
Assessed the usage of anatomical terms in each atlas used by BIRN
– Inconsistency in application of terms
– Resolution of technique was not considered
Create standard “atomic” definitions for core brain parts
Create a volumetric hierarchy
– Provides a basis for accounting for resolution
– Goal: which structures give rise to signals measured by a technique
Structure, not function
– No arguments about whether the amygdala exists functionally
– No arguments about whether the fornix is functionally part of the hypothalamus
Imported NeuroNames hierarchy for volumetric relations among brain parts
– e.g., hippocampal formation has part
  » Mostly gray matter = dentate gyrus, hippocampus
  » Mostly white matter = alveus
Develop consistent application rules: “My hippocampus” = dentate gyrus + hippocampus
Need descriptors for topological relationships and spatial overlap

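The application-rule idea can be sketched by expanding each label into the set of "atomic" parts it covers, so that two usages of the same word can be compared part by part. The "My hippocampus" rule and the hippocampal formation parts come from the slide; the second label, "Atlas B hippocampus", and the function names are hypothetical and included only to show a comparison.

```python
from typing import Dict, FrozenSet

# Parts of the hippocampal formation and their tissue class, as listed on the slide.
HIPPOCAMPAL_FORMATION_PARTS: Dict[str, str] = {
    "dentate gyrus": "mostly gray matter",
    "hippocampus": "mostly gray matter",
    "alveus": "mostly white matter",
}

# Application rules: label -> the atomic parts the label is taken to cover.
APPLICATION_RULES: Dict[str, FrozenSet[str]] = {
    "My hippocampus": frozenset({"dentate gyrus", "hippocampus"}),
    # Hypothetical second usage, for comparison only.
    "Atlas B hippocampus": frozenset({"hippocampus"}),
}


def expand(label: str) -> FrozenSet[str]:
    """Return the atomic parts a label stands for."""
    return APPLICATION_RULES[label]


def shared_parts(label_a: str, label_b: str) -> FrozenSet[str]:
    """Atomic parts covered by both labels."""
    return expand(label_a) & expand(label_b)


def only_in(label_a: str, label_b: str) -> FrozenSet[str]:
    """Atomic parts covered by label_a but not by label_b."""
    return expand(label_a) - expand(label_b)
```

Comparing `only_in("My hippocampus", "Atlas B hippocampus")` makes the mismatch explicit: the two labels differ by the dentate gyrus, which is the kind of inconsistency the atomic definitions are meant to expose.
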
Subcellular Anatomy Ontology: Extending anatomy to subcellular dimension; based on FMA
[Diagram. Node labels: Neuron, Neuroepithelial cell, Glia (Microglia, Macroglia); Dendrite, Axon, Cell body, Spine; Dendritic Spine with Component (Post synaptic Component, PSD, SER, Actin Filament, Ribosome) and Properties (Orientation, Distribution, Morphometrics, Shape); Shaft with Component (Actin Filament, SER, RER, Ribosome, Lysosome, Microtubule) and Properties (Orientation, Distribution, Morphometrics, Shape); Compartment; Macromolecule. Relation labels: “has regional part”, “has constitutional part”. Linked ontologies: Gene Ontology, Cell type Ontology.]

Next Steps
Community extension and curation
– Import into Bonfire
– BIRNLex “Wikipedia”
Integration with BIRN imaging, workflow and analysis tools
Work to evaluate and extend PATO for imaging data
– Spatio-temporal relationships
Better web interface
Begin transition into fully structured ontology:
– MIND Ontology: Multiscale Investigation of Neurological Disease

Relationships in complex scenes
Vlad Mitsner, Masako Terada, Stephen Larson
Incorporation of ontologies into segmentation tools for electron tomography
Describe each “scene” as an instance of the ontology
Capture not only entities but relationships among entities
Electron microscopic data are sparse
Discover “rules” for subcellular anatomy

Images as Instances
[Diagram. Labels: Data, Technique, Analysis, Annotator, Biological Entity. Ontologies shown: FUGO, OBI, PATO.]

PATO: Phenotype and Trait Ontology

Genotype     Entity                        Attribute            Value
npo          gut                           structure            dysplastic
             gut                           relative size        small
r210         retina                        pattern              irregular
             brain                         structure            fused
tm84         d/v pattern formation         qualitative          abnormal
             blood islands                 relative number      number increased
Bsb[2]       elongation of arista lateral  process              arrested
C-alpha[1D]  adult behaviour               behavioral activity  uncoordinated

2003 trial data: FB & ZFIN
Way of expressing complex phenotypes that is more scientifically “sound”
BIRN provides valuable test cases for PATO
BIRN data immediately becomes interoperable with data from the zebrafish and fly communities
Suzanna Lewis, Chris Mungall et al.

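The genotype / entity / attribute / value layout of the table on this slide can be sketched as a simple record type with a generic filter. The rows below are the ones whose genotype is legible in the table; the type `PhenotypeAnnotation` and the helper `find` are illustrative assumptions, not PATO's actual data model or any tool's API.

```python
from typing import List, NamedTuple


class PhenotypeAnnotation(NamedTuple):
    """One row of the table: a genotype and an entity-attribute-value triple."""
    genotype: str
    entity: str
    attribute: str
    value: str


# Rows as read from the slide's table (only those with a genotype shown).
ANNOTATIONS: List[PhenotypeAnnotation] = [
    PhenotypeAnnotation("npo", "gut", "structure", "dysplastic"),
    PhenotypeAnnotation("r210", "retina", "pattern", "irregular"),
    PhenotypeAnnotation("tm84", "d/v pattern formation", "qualitative", "abnormal"),
    PhenotypeAnnotation("C-alpha[1D]", "adult behaviour", "behavioral activity", "uncoordinated"),
]


def find(annotations: List[PhenotypeAnnotation], **criteria: str) -> List[PhenotypeAnnotation]:
    """Return the annotations whose fields equal every given criterion."""
    return [
        annotation for annotation in annotations
        if all(getattr(annotation, name) == wanted for name, wanted in criteria.items())
    ]
```

Because every community fills in the same four slots, a query such as `find(ANNOTATIONS, attribute="pattern")` works the same way whether the row came from fly, zebrafish, or imaging data.
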
Lessons learned
BIRN has made a good-faith effort to evaluate and employ existing ontologies; we are patient but we’ve got work to do
Ontology building is not for people with thin skins
– We are not attempting to build formal ontologies for everything
– Provide a formal and consistent structure for describing data
A man who consults one ontologist knows what to do; a man who consults two ontologists is never sure
Don’t want to be victims in the ontology wars
NCBO/MGI have been very helpful
The principles suggested to us so far have been useful; they make the process easier, not harder
Reference ontologies are useful because they take care of the categories, e.g., dependent enduring entity, that tend to drive domain scientists a little nuts
– Challenge to develop tools on top of shifting infrastructure
– Expect that we’ll have to redo annotation periodically