Creating a community-based knowledge management framework for integrating neuroscience data via ontologies. A. Gupta 1, C. Bean 2, W. Bug 3, *C. Fennema-Notestine.


Creating a community-based knowledge management framework for integrating neuroscience data via ontologies

A. Gupta 1, C. Bean 2, W. Bug 3, *C. Fennema-Notestine 4, M. E. Martone 5, J. A. Turner 6, J. S. Grethe 5

1 SDSC, UC San Diego, La Jolla, CA; 2 NCRR, NIH, Bethesda, MD; 3 Neurobio & Anat, Drexel Univ Coll of Med, Philadelphia, PA; 4 Dept Psychiatry, Univ California, San Diego, San Diego, CA; 5 Neurosci, UC San Diego, La Jolla, CA; 6 Brain Imaging Ctr, UC Irvine, Irvine, CA

The Biomedical Informatics Research Network (BIRN) project is creating a large-capacity, high-bandwidth network for linking disease-related databases created at multiple centers. Current test beds focus on using neuroimaging across scales to carry out multiscale investigations of neurological disease. Linking and navigating among distributed data sources is a key component of BIRN’s architecture. A controlled terminology is a fundamental requirement for assuring the interoperability of the unique and diverse knowledge sources that make up BIRN's shared environment. To enhance collaboration across BIRN's open, fluid, and diverse infrastructure, we have adopted structured procedures to support enhanced interoperability.

Building the BIRNLex

BIRNLex is being constructed and maintained by the BIRN Ontology Task Force (OTF). The OTF consists of members of each of the three imaging test beds (Function BIRN, Morphometry BIRN and Mouse BIRN) and the BIRN coordinating center. Other members of BIRN and the greater scientific community are recruited as necessary for input in specific areas. The BIRN OTF also includes representatives from the National Center for Biomedical Ontology (NCBO). NCBO is a consortium of leading biologists, clinicians, informaticians, and ontologists who develop innovative technology and methods that allow scientists to create, disseminate, and manage biomedical information and knowledge in machine-processable form.
The OTF attended a workshop held at NCBO where we were briefed on ontology “best practices”. By following these practices, we ensure that the logical structure of BIRNLex is consistent and clear and can be easily integrated into other resources. BIRNLex is built using the OWL version of Protégé, an open source tool for construction and maintenance of ontologies. OWL (Web Ontology Language) is the W3C standard for ontologies.

The OTF assessed several existing ontologies and terminologies to serve as foundations for building the BIRNLex. Based on these considerations and following the recommendations of NCBO, BIRNLex has adopted a set of existing community resources as its foundation.

[Figure: Multiscale Investigation of Neurological Disease. Panel labels: Navigating through multi-resolution information (brain, cerebellum, cerebellar cortex, Purkinje cell, dendritic spine); Linking animal and human imaging data (entopeduncular nucleus; globus pallidus, internal segment); Interpreting results (animal model, disease, microarray, immunolabeling, technique, phenotype)]

The BIRN has created BIRNLex, a controlled lexicon for annotating BIRN data sources. BIRN’s data sources include structural and functional magnetic resonance imaging (MRI) databases from human subjects involved in studies of Alzheimer's disease and schizophrenia, and multiscale image databases from mouse models of human neurological disease using MRI, light, and electron microscopic imaging.

BIRNLex provides terms, used by BIRN scientists in the context of their research, covering neuroanatomy, molecular species, behavioral and cognitive processes, subject information, experimental practice and design, and associated elements of primary data provenance required for large-scale data integration across disparate experimental studies.

BIRNLex integrates existing terminologies and ontologies and extends them as needed to cover BIRN-related domains. In this way, efforts are not duplicated across domains.
BIRNLex has developed a set of ontology “best practices” that will enhance the utility of BIRNLex and facilitate its integration into the larger domain of the life sciences.

Structure of the BIRNLex

BIRNLex adopted as its foundation the Ontology of Biomedical Reality (OBR; Rosse et al., 2005). OBR categorizes entities along a fundamental division into continuants vs occurrents.

Continuants are those entities that endure through time, e.g., mammal, brain.
Occurrents are those entities that unfold through time, e.g., mitosis, degeneration.

Each entity that is added to the BIRNLex must be accompanied by a standard set of annotation properties. Some examples include:

Preferred label: Preferred name for an entity
Contributor: Person who contributed the entity
External source: Identifies source ontology or vocabulary
Curation status: Indicates the level of curation for the entity (see below)

Each of these resources was imported into Protégé-OWL using the import function. Additional entities specifically related to domains of relevance to BIRN were added to these structures.

[Figure: Protégé]

These domains include neuroimaging, gross and cellular neuroanatomy, cognitive and behavioral processes, molecular entities, and taxonomic species and strain. For the cognitive and behavioral terms, the OTF collaborated with the BrainMap group in San Antonio. Entities were added by assembling a list of entities relating to data currently in the BIRN databases. Each term is assigned a unique identifier from Bonfire, a tool created by BIRN for browsing and extending ontologies. Where possible, terms were cross-referenced to the UMLS.

Curation of the BIRNLex is performed on a regular basis by the OTF and other experts from the BIRN and wider scientific communities.

Each entity is given a human-readable definition by the contributor that conforms to the structure: “A is a B which has C”.
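As a rough illustration of the annotation properties described above, the sketch below models one lexicon entry as a small record and checks which required fields are still empty. It is not the BIRNLex or Protégé data model: the `Entity` class, the `missing_annotations` helper, the choice of which fields count as required, and the contributor value are all assumptions made for this example. The status names are taken from the curation list given elsewhere in this poster.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CurationStatus(Enum):
    """Curation levels, named after the poster's curation status list."""
    UNCURATED = "Uncurated"
    RAW_IMPORT = "Raw import"
    DEFINITION_INCOMPLETE = "Definition incomplete"
    GRAPH_POSITION_TEMPORARY = "Graph position temporary"
    PENDING_FINAL_VETTING = "Pending final vetting"
    CURATED = "Curated"


@dataclass
class Entity:
    """Hypothetical record holding the annotation properties of one entry."""
    preferred_label: str
    contributor: str
    definition: str = ""
    external_source: str = ""
    curation_status: CurationStatus = CurationStatus.UNCURATED


def missing_annotations(entity: Entity) -> List[str]:
    """Return the names of required annotation properties that are empty.

    Treating these three fields as required is an assumption of this sketch.
    """
    required = ("preferred_label", "contributor", "definition")
    return [name for name in required if not getattr(entity, name).strip()]


entry = Entity(
    preferred_label="Electron microscope",
    contributor="example contributor",  # hypothetical value
    definition="A microscope utilizing electrons and magnetic lenses to form an image.",
)
print(missing_annotations(entry))
```

A new entry starts as `UNCURATED`; a reviewer-facing tool could use a check like `missing_annotations` to decide whether "Definition incomplete" applies.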
Example definitions:

Basal forebrain: A region of the brain consisting of ventral and rostral subcortical sub-regions of the telencephalon, including the basal ganglia, septal nuclei, amygdala, ventral pallidum, substantia innominata, and basal nucleus of Meynert.
Electron microscope: A microscope utilizing electrons and magnetic lenses to form an image.

Defining terms this way, as qualified cases of higher-order classes, makes it easier to place each term in its proper place in the hierarchy and to derive the complete set of attributes for each term when constructing formal ontologies from the BIRNLex. Definitions may include an attribution of the source from which the definition was derived, e.g., Wikipedia.

For example, from the above definition of electron microscope, the following class hierarchy can be inferred:

Microscope
    Electron microscope

The following attributes for microscopes and electron microscopes can also be defined:

Microscope has attribute lens type
Electron microscope has attribute lens type = magnetic
Microscope has attribute electromagnetic radiation (EMR) type
Electron microscope has attribute EMR type = electrons

Next Steps and Challenges Ahead

The BIRNLex lays the foundation for the creation of a fully structured ontology for multiscale investigation of neurological disease and for establishing consistency between the different forms of information contained in databases, XML data exchange standards, and ontologies. Considerable effort was expended to ensure that these ontologies are constructed based upon fundamental principles of ontology design and interoperate with community-developed ontologies. Because these fundamental ontologies are still under development, constant alignment of our efforts with these broader community efforts is necessary.

BIRN is building a user-friendly interface for browsing BIRNLex and is working with NCBO to develop annotation tools that facilitate annotation of imaging data.
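The electron microscope example above can be sketched as a tiny class hierarchy in which a subclass inherits its parent's attributes and fills in their values. This is only an illustration of the idea; `OntologyClass` and its methods are hypothetical names, not part of Protégé, OWL, or BIRNLex.

```python
class OntologyClass:
    """Hypothetical node in a class hierarchy with inheritable attributes."""

    def __init__(self, name, parent=None, attributes=None):
        self.name = name
        self.parent = parent
        # Attribute name -> value; None means "declared but not yet specified".
        self.attributes = dict(attributes or {})

    def ancestors(self):
        """Return the names of all superclasses, nearest first."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain

    def all_attributes(self):
        """Merge inherited attributes with local ones; local values win."""
        merged = self.parent.all_attributes() if self.parent is not None else {}
        merged.update(self.attributes)
        return merged


# "An electron microscope is a microscope which has electrons and magnetic lenses."
microscope = OntologyClass(
    "Microscope",
    attributes={"lens type": None, "electromagnetic radiation type": None},
)
electron_microscope = OntologyClass(
    "Electron microscope",
    parent=microscope,
    attributes={"lens type": "magnetic", "electromagnetic radiation type": "electrons"},
)

print(electron_microscope.ancestors())
print(electron_microscope.all_attributes())
```

The "A is a B which has C" pattern maps directly onto this structure: B becomes the parent and C becomes the attribute values that distinguish A from its siblings.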
Tools for community editing and extending of BIRNLex will be released in early 2007.

We gratefully acknowledge the participation of Dr. Daniel Rubin from NCBO on the OTF and our many interactions with NCBO staff and affiliates.

Curating the BIRNLex

Entities are added to the BIRNLex by the BIRN community and the BIRN OTF. Currently, contributions are made by filling out a spreadsheet. The OTF meets on a regular basis to review entities that are contributed, ensure that they are defined clearly and in the correct format, and build the necessary hierarchies. Each entity is assigned a curation status:

Uncurated = the entity has not yet been reviewed by the OTF
Raw import = the entity was imported “as is” from an existing ontology
Definition incomplete = the entity has been reviewed but is missing an acceptable definition or a complete set of annotation properties
Graph position temporary = the entity has been reviewed but its position in a hierarchy is not clear at this time
Pending final vetting = the entity was revised according to OTF recommendations and is awaiting final approval by the OTF
Curated = the entity is complete and placed in an appropriate hierarchy

Resources adopted by BIRNLex:

The Ontology of Biomedical Investigation (OBI), for describing and organizing terms related to experimental practices employed across all BIRN test beds.
Functional Genomics Ontology (FuGO), for describing entities related to experimental findings, e.g., data analysis.
The Foundational Model of Anatomy (FMA), for the core organizational structure for neuroanatomical terms.
NeuroNames, for neuroanatomical entities, e.g., cerebellum.
SKOS and (something else), for standard annotation properties, e.g., lexical variants, synonyms.

Core Domain: Neuroanatomy

The BIRN project currently involves two test beds (Morphometry and Mouse) that acquire imaging information on normal and diseased brain. Morphometry BIRN is using an automated segmentation tool, FreeSurfer (ref), to extract quantitative information on ~35 brain regions from human brain MRI data.
The Mouse BIRN is constructing a multiscale and multimodal atlas of the mouse brain, combining information from different imaging modalities, including MRI and light and electron microscopy, and is also mapping data from microarray experiments onto the atlas. Thus, neuroanatomy is one of the core areas addressed by the BIRNLex. Several rich vocabulary resources exist for neuroanatomy, including NeuroNames (Bowden et al., 2002) and the Brain Architecture Management System (BAMS; Bota and Swanson, 2004).

Examination of the use of anatomical entities within the various imaging efforts in BIRN revealed that basic anatomical entities were not defined consistently across atlases or annotation efforts. Some examples:

Entity: Hippocampus
    Mouse Atlas 1: Dentate gyrus + Ammon’s horn
    Mouse Atlas 2: Dentate gyrus + Ammon’s horn + part of subiculum
    Morph BIRN: Dentate gyrus + Ammon’s horn + alveus + part of subiculum
    NeuroNames: Ammon’s horn

Entity: Cerebral ventricle (two differing definitions)
    Lateral + third + fourth ventricle + cerebral aqueduct
    Lateral + third + fourth ventricle

STRATEGY

As a first pass, BIRNLex elected to create a purely structural hierarchy, i.e., entities are classified according to their location rather than any functional properties. This strategy enables us to translate more easily between techniques with inherently different resolution. NeuroNames has already created a very detailed structural hierarchy. NeuroNames was imported into OWL and the entities were organized following the structure of the FMA. We are in the process of supplying a human-readable definition for each entity that includes the regional location, the general bounding structures, and whether the structure is predominantly gray matter or white matter.

Part of the difficulty was related to the resolution of the imaging technique and the type of contrast agent used.
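One way to make the hippocampus discrepancies above explicit is to treat each source's definition as a set of component structures and compare the sets. The sketch below does this in Python. It assumes the four hippocampus entries map to the four sources in the order the table lists them, and it collapses "part of subiculum" into a single label; the function names are invented for this example.

```python
# Each source's "hippocampus", expressed as a set of component structures.
# The column-to-source mapping is an assumption (entries read in table order).
HIPPOCAMPUS_DEFINITIONS = {
    "Mouse Atlas 1": {"dentate gyrus", "Ammon's horn"},
    "Mouse Atlas 2": {"dentate gyrus", "Ammon's horn", "subiculum (part)"},
    "Morph BIRN": {"dentate gyrus", "Ammon's horn", "alveus", "subiculum (part)"},
    "NeuroNames": {"Ammon's horn"},
}


def shared_components(definitions):
    """Return the components that every source includes."""
    return set.intersection(*(set(parts) for parts in definitions.values()))


def extra_components(definitions):
    """For each source, return the components beyond the shared core."""
    core = shared_components(definitions)
    return {source: set(parts) - core for source, parts in definitions.items()}


print(sorted(shared_components(HIPPOCAMPUS_DEFINITIONS)))
for source, extras in extra_components(HIPPOCAMPUS_DEFINITIONS).items():
    print(source, sorted(extras))
```

Under these assumptions, only Ammon's horn is common to all four definitions, which is the kind of mismatch that a set of well-defined core structures is meant to expose.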
Mouse atlases constructed by different techniques, e.g., MRI vs histology vs dissection for microarray, have different levels of granularity.

BIRNLex is working to provide a set of core, well-defined, and non-overlapping neuroanatomical structures that can be used for annotating imaging data. These entities can be used to create composite entities, depending upon the usage in a given atlas or test bed, e.g., “My hippocampus = dentate gyrus + Ammon’s horn + subiculum.”

BIRNLex does not currently distinguish between mouse and human neuroanatomy, except in cases where a structure exists only in one or the other. Because NeuroNames was developed primarily around primates, much of the current hierarchy in BIRNLex reflects primate neuroanatomy.

[Figure: Entry for “Hippocampal Formation”]

Core Domain: Mouse Strain Names

An examination of the list of mouse strains contributed by different Mouse BIRN participants revealed that the identification of mouse strains was poorly controlled among the sites. Different spellings (e.g., C57BL6, C57/BL6, C57BL/6) and incomplete specifications (e.g., C57, alpha synuclein overexpressor) were the most common errors. Thus, a second core area addressed by the OTF was the naming of mouse strains.

STRATEGY

Mouse BIRN will adopt the MGI standards for naming of rodent strains. This standard is in accordance with the International Committee on Standardized Genetic Nomenclature for Mice. An online tutorial on naming mouse strains is available from The Jackson Laboratory.

New data: Standard mouse strains will be entered into BIRNLex and given a unique BIRN identifier. Each strain will also have a preferred spelling entered. Data providers should use this spelling in their databases and map it to the unique identifier. BIRNLex will maintain a cross-reference to the accession number at MGI, currently the stock number.
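The mapping from site-specific spellings to a preferred spelling and identifier could look roughly like the sketch below. It is an assumption-laden illustration: the identifier `birnlex_strain_0001` is invented, treating C57BL/6 as the preferred spelling is chosen only for the example, and the punctuation-insensitive matching rule is one possible strategy rather than anything BIRN or MGI specifies.

```python
import re

# Hypothetical lexicon: preferred spelling -> BIRN identifier (invented value).
STRAIN_IDS = {
    "C57BL/6": "birnlex_strain_0001",
}


def _match_key(name):
    """Reduce a strain name to uppercase letters and digits for matching."""
    return re.sub(r"[^A-Za-z0-9]", "", name).upper()


# Lookup table from normalized key to preferred spelling.
_PREFERRED_BY_KEY = {_match_key(label): label for label in STRAIN_IDS}


def normalize_strain(name):
    """Return (preferred spelling, identifier), or None if unresolved.

    Incomplete names such as "C57" deliberately do not match, so they can
    be flagged for manual review instead of being guessed at.
    """
    label = _PREFERRED_BY_KEY.get(_match_key(name))
    if label is None:
        return None
    return label, STRAIN_IDS[label]


for raw in ["C57BL6", "C57/BL6", "C57BL/6", "C57"]:
    print(raw, "->", normalize_strain(raw))
```

Returning `None` for incomplete specifications keeps the two error classes named above separate: spelling variants are resolved automatically, while underspecified names go back to the data provider.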
BIRNLex: When the MGI genealogy is made available in electronic form, the BIRNLex will use its hierarchy. In the meantime, BIRNLex will recapitulate parts of this hierarchy for strains used in BIRN. The hierarchy will allow users, for example, to retrieve all C57-derived strains.
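Retrieving all strains derived from a given ancestor amounts to collecting the descendants of a node in the strain hierarchy. The sketch below shows one way to do that with a child-to-parent table. The table is a made-up fragment for illustration only and is not MGI genealogy data; in particular, using "C57" as a grouping node is an assumption.

```python
from collections import defaultdict

# Hypothetical child -> parent edges; None marks a strain with no listed parent.
STRAIN_PARENT = {
    "C57BL/6": "C57",
    "C57BL/10": "C57",
    "C57BL/6J": "C57BL/6",
    "BALB/c": None,
}


def descendants(root, parent_of):
    """Return all strains below `root` in the hierarchy, sorted by name."""
    # Invert the child -> parent table into parent -> children.
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)

    # Walk the tree iteratively from the root.
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            found.append(child)
            stack.append(child)
    return sorted(found)


print(descendants("C57", STRAIN_PARENT))
```

A query for "C57" then returns every strain recorded beneath it, including indirect descendants such as substrains, without the user having to know each name.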

