1
Use of Ontologies in the Life Sciences: BioPAX
Graciela Gonzalez, PhD
(some slides adapted from presentations available at www.biopax.org)
2
Definition of an Ontology
- Conceptualization of a domain of interest
  Concepts, relations, attributes, constraints, objects, values …
- An ontology is a specification of a conceptualization
  Formal notation
  Documentation
- A variety of forms, but includes:
  A vocabulary of terms
  Some specification of the meaning of the terms
3
Ontologies – Key Aspects
Focus on semantics!
- Accurately model a complex domain
- Capture semantic nuances
- Rigorously define what each field means
- Adhere to those definitions!
4
Ontologies – Key Aspects
Ontologies are for people and computers:
- People browse the ontology to learn it
- It encodes the definition of a concept so that the computer “understands” it
- “understands” = automated reasoning with concept definitions
  Is concept A more general than concept B?
  Is X an instance of concept A?
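The two reasoning questions above can be sketched in a few lines of Python. This is a minimal illustration over a toy is-a hierarchy, not how an OWL reasoner works; the class names loosely follow the slides, and the dictionaries and function names are assumptions made for the example.

```python
# Toy is-a hierarchy: child concept -> parent concept (illustrative names only).
PARENT = {
    "PhysicalInteraction": "Interaction",
    "Interaction": "Entity",
    "Pathway": "Entity",
    "PhysicalEntity": "Entity",
    "Protein": "PhysicalEntity",
}

# Hypothetical instance data: individual -> the class it is asserted to belong to.
INSTANCE_OF = {"MEK1": "Protein", "Glycolysis": "Pathway"}


def ancestors(concept):
    """Yield the concept itself, then every more general concept above it."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)


def is_at_least_as_general(a, b):
    """True if concept a is b itself or one of b's ancestors."""
    return a in ancestors(b)


def is_instance(x, concept):
    """True if individual x belongs to concept, directly or via the hierarchy."""
    return is_at_least_as_general(concept, INSTANCE_OF[x])
```

With this data, `is_at_least_as_general("Interaction", "PhysicalInteraction")` holds, and `is_instance("MEK1", "PhysicalEntity")` holds because Protein is a PhysicalEntity.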
5
Components of an Ontology
- Concepts (Class, Set, Type, Predicate)
  ex: Gene, Reaction, Macromolecule
- Taxonomy of concepts (generalization/specialization hierarchies)
  ex: a physical interaction is an interaction
- Relations and Attributes
- Domains (values allowed for an attribute)
  ex: a feature location consists of a sequence location
- Constraints and other meta-information about relations
  ex: a pathway has at least one interaction
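As a rough illustration of a constraint, the sketch below enforces "a pathway has at least one interaction" at construction time. The class and field names are hypothetical and are not taken from the BioPAX specification; an ontology language would state this as a cardinality restriction rather than as runtime code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interaction:
    name: str


@dataclass
class Pathway:
    name: str
    interactions: List[Interaction] = field(default_factory=list)

    def __post_init__(self):
        # Constraint from the slide: a pathway has at least one interaction.
        if len(self.interactions) < 1:
            raise ValueError(f"pathway {self.name!r} needs at least one interaction")


# A valid pathway: one (illustrative) interaction is enough to satisfy the constraint.
ok = Pathway("toy pathway", [Interaction("step 1")])
```

Constructing `Pathway("empty")` raises `ValueError`, which is the kind of data-entry error a well-implemented ontology is meant to catch.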
6
Ontologies in Bioinformatics
- Biological DBs need both a good ontology and a good mapping (implementation) of it; this prevents errors in data entry and interpretation
- Provide a common framework for multidatabase queries
- Provide a controlled vocabulary, such as for genome annotation
- Support information extraction
7
BioPAX
- Biological PAthway eXchange
- A data exchange ontology and format for biological pathway integration, aggregation and inference
- Open source, ongoing
8
BioPAX Goals
Include support for these pathway types:
- Metabolic pathways
- Signaling pathways
- Protein-protein interactions
- Genetic regulatory pathways
Note: representing pathways is nothing new
9
The problem
- 200+ pathway databases of different kinds (http://www.pathguide.org/)
- Rich data, different ontologies
- Nightmare for integration and data exchange
10
Biological pathways
- Metabolic Pathways
- Molecular Interaction Networks
- Signaling Pathways
11
Ontologies reflect “real life”
A typical pathway would be decomposed into:
- A single pathway instance, which would contain
- several pathway steps, which would each contain
- one or more interactions occurring between
- physical entity participants, which each point to
- one physical entity.
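The containment chain described above can be sketched as nested Python data classes. This is only an illustration of the decomposition; the class and field names are assumptions and do not reproduce the actual BioPAX class or property names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PhysicalEntity:
    name: str


@dataclass
class Participant:
    entity: PhysicalEntity  # each participant points to exactly one physical entity


@dataclass
class Interaction:
    name: str
    participants: List[Participant] = field(default_factory=list)


@dataclass
class PathwayStep:
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    steps: List[PathwayStep] = field(default_factory=list)


def entities_in(pathway):
    """Walk pathway -> steps -> interactions -> participants -> entity,
    returning each distinct entity once, in order of first appearance."""
    seen = []
    for step in pathway.steps:
        for interaction in step.interactions:
            for participant in interaction.participants:
                if participant.entity not in seen:
                    seen.append(participant.entity)
    return seen


# Toy data with placeholder entity names A, B, C.
a, b, c = PhysicalEntity("A"), PhysicalEntity("B"), PhysicalEntity("C")
toy = Pathway("toy", [
    PathwayStep([Interaction("A-B", [Participant(a), Participant(b)])]),
    PathwayStep([Interaction("B-C", [Participant(b), Participant(c)])]),
])
```

Note that B takes part in two interactions through two separate participant objects that both point to the same physical entity, which is why the participant layer exists.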
12
BioPAX vs other ontologies
- Conceptual framework based upon existing DB schemas, allowing wide range of detail, multiple levels of abstraction
- Uses (refers to) existing ontologies to provide supplemental annotations where appropriate
  Cellular location: GO Component
  Cell type: Cell.obo
  Organism: NCBI taxon DB
- Incorporates other standards where appropriate
- Interoperates with existing standards
13
BioPAX & other Exchange Formats
[Figure: coverage of BioPAX (Level 1), PSI-MI 2, and SBML/CellML across three kinds of data, each on a low-detail to high-detail axis: Metabolic Pathways (biochemical reactions, small molecules, rate formulas), Interaction Networks (molecular and non-molecular interactions, Pro:Pro, All:All, genetic interactions), and Genetic Regulatory Pathways (Pro:Pro, TF:Gene). BioPAX and PSI-MI 2 are labeled DB exchange formats; SBML and CellML are labeled simulation model exchange formats.]
14
Capturing data at different resolutions
- Metabolic pathway data has a high level of detail
- Molecular interaction data has less
  ex: no causal or temporal aspects of interactions
- BioPAX Level 2 captures molecular binding interactions at a relatively high level in the ontology class hierarchy
- This reflects the fact that any given binding interaction may be a low-resolution (or more abstract) view of a more specific type of interaction
15
Example
- A signaling database would likely capture the interaction between MEK1 and ERK1 as a catalysis event (MEK1 catalyzes the phosphorylation of ERK1).
- A molecular interaction database would likely store the interaction using a simpler abstraction, such as a protein-protein interaction.
- BioPAX Level 2 supports both of these representations.
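The two views of the MEK1/ERK1 example can be sketched as follows. The record types, field names, and the `low_resolution_view` helper are hypothetical simplifications for illustration; they are not the BioPAX Level 2 classes, and BioPAX does not prescribe this conversion.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Catalysis:
    """High-resolution view, as a signaling database might store it."""
    controller: str  # the catalyst
    substrate: str   # the entity being modified
    reaction: str    # what the controlled reaction does


@dataclass(frozen=True)
class PhysicalInteraction:
    """Low-resolution view: an undirected set of interacting entities."""
    participants: FrozenSet[str]


def low_resolution_view(event):
    """Collapse a catalysis event to an undirected interaction,
    dropping the causal direction and the reaction type."""
    return PhysicalInteraction(frozenset({event.controller, event.substrate}))


signaling_record = Catalysis("MEK1", "ERK1", "phosphorylation")
interaction_record = PhysicalInteraction(frozenset({"ERK1", "MEK1"}))
```

Here `low_resolution_view(signaling_record)` equals `interaction_record`. The reverse direction is not possible: the simpler record does not say which protein acts on which.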
16
Aggregation, Integration, Inference with BioPAX
1. Aggregation: represent multiple kinds of pathway databases
   - metabolic
   - molecular interactions
   - signal transduction
   - gene regulatory
2. Integration: special constructs designed for integration
   - DB References
   - XRefs (Publication, Unification, Relationship)
   - Synonyms
3. OWL DL – to enable reasoning
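One way to picture how unification xrefs and synonyms help integration: records from different databases that carry the same unification xref are taken to describe the same entity and can be merged. The sketch below assumes a made-up record layout, made-up database names (`db_a`, `db_b`, `ProteinDB`), and made-up identifiers; only the MEK1/MAP2K1 and ERK1/MAPK3 synonym pairs are real.

```python
from collections import defaultdict

# Hypothetical records from two pathway databases (IDs and DB names are invented).
records = [
    {"source": "db_a", "name": "MEK1", "synonyms": ["MAP2K1"],
     "unification_xref": ("ProteinDB", "X0001")},
    {"source": "db_b", "name": "MAP2K1", "synonyms": [],
     "unification_xref": ("ProteinDB", "X0001")},
    {"source": "db_b", "name": "ERK1", "synonyms": ["MAPK3"],
     "unification_xref": ("ProteinDB", "X0002")},
]


def merge_by_unification(recs):
    """Group records that share a unification xref and pool their names."""
    names = defaultdict(set)
    sources = defaultdict(set)
    for rec in recs:
        key = rec["unification_xref"]
        names[key].update([rec["name"], *rec["synonyms"]])
        sources[key].add(rec["source"])
    return {
        key: {"names": sorted(names[key]), "sources": sorted(sources[key])}
        for key in names
    }


merged = merge_by_unification(records)
```

After merging, the entry keyed by `("ProteinDB", "X0001")` lists both names and both source databases, so the two MEK1 records are recognized as one entity.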
17
BioPAX Ontology: Top Level
- Pathway
  A set of interactions
  E.g. Glycolysis, MAPK, Apoptosis
- Interaction
  A set of entities and some relationship between them
  E.g. Reaction, Molecular Association, Catalysis
- Physical Entity
  A building block of simple interactions
  E.g. Small molecule, Protein, DNA, RNA
[Diagram: Entity at the top, with Pathway, Interaction, and Physical Entity below it. Legend: Subclass (is a), Contains (has a).]
18
BioPAX Ontology: Physical Entities
[Diagram: PhysicalEntity with subclasses Complex, RNA, Protein, Small Molecule.]
PhysicalEntity: This class serves as the super-class for all physical entities, although its current set of subclasses is limited to molecules. This list may be expanded to include photon, environment, cell and cellular component in later levels of BioPAX.
19
Interaction Class Structure
20
Relational implementation