Published by Leona Knight. Modified over 9 years ago.
1
An Introduction to Ontologies
Contributors: Melissa Haendel, Chris Mungall, David Osumi-Sutherland
2
Setting the stage
1. What is an ontology?
2. Understanding ontologies
3. Anatomy of an ontology class
4. Upper ontologies
5. How are ontologies used?
3
Common controlled vocabularies indicate the same meaning under different annotation circumstances
[Figure labels: MouseEcotope, GlyProt, DiabetInGene, GluChem; shared term: sphingolipid transporter activity]
4
What is a controlled vocabulary?
Any closed, prescribed list of terms used for classifying data.
Key features:
  Terms are not usually defined.
  Relationships between the terms are not usually defined.
  Can be a list.
Here is a CV of wines: Pinot noir, red, chardonnay, Chianti, Bordeaux, Riesling… These terms are of different types (color, location, varietal) and sit together in one flat list. Another example is the list of map locations at the end of a gazetteer.
5
What is a taxonomy?
Any controlled vocabulary that is arranged in a hierarchy.
Key features:
  Terms are not usually defined.
  Relationships between the terms are not usually defined.
  Terms are arranged in a hierarchy.
Here is a wine taxonomy:
  Wine
    Red
      merlot
      zinfandel
      cabernet
      pinot noir
    White
      chardonnay
      pinot gris
      Riesling
6
What is a thesaurus?
A taxonomy that contains additional information about use of the terms.
Key features:
  Terms are not usually defined.
  Relationships between the terms are not usually defined.
  Terms are arranged in a hierarchy.
  Statements about the terms are included, such as scope notes or instructions for use.
Some well-known thesauri are: WordNet, NCI cancer thesaurus, MeSH
7
What is an ontology?
A formal conceptualization of a specified domain of interest.
Key features:
  Terms are defined.
  Relationships between the terms are defined, allowing logical inference.
  Terms are arranged in a hierarchy.
  Expressed in a knowledge representation language such as RDFS, OBO, or OWL.
Some well-known ontologies are: SNOMED, Foundational Model of Anatomy, Gene Ontology, Linnean taxonomy of species
Image reproduced with permission, Jason Freeny, http://web.mac.com/moistproduction/flash/index.html
8
The ontology spectrum
Source: http://www.mkbergman.com/?m=20070516
Bottom line: you get what you pay for.
[Figure label: OBO]
9
Ontology Languages
Web Ontology Language (OWL)
  Standard set of logical constructs for building an ontology
  Many syntaxes: OWL-RDF/XML, OWL-XML, Manchester
  Many reasoners
OBO-Format
  Currently formalized by mapping to a subset of OWL
  Can be treated as another OWL syntax
10
A common misconception
Are ontologies about terms or things?
When you are arguing about including something in your ontology:
1. Are you arguing about what a term means?
2. Or are you arguing about what term should be adopted in your ontology language to represent a well-characterized entity or concept?
These are terminological questions, not ontological questions. Question 1 is a purely linguistic dispute; question 2 is primarily a practical question.
The ontological questions are:
  What kinds of things should we recognize in our ontology? (Never mind, for the moment, what we might choose to call them.)
  What are their relations to one another? (Not: what are the relations of their terms/names to one another?)
Adapted from Gary Merrill
11
What matters are things
Data annotated to the right schema will be more consistent.
Labels are handles, not things.
[Figure: a dessert hierarchy. Terms: Dessert, Ice cream, gelato, sno-cone, soft-serve, custard cream, Cake, torte, double-layer, galette, angel food cake. Identifiers: DES:000038, DES:000034, DES:000456, DES:005566, DES:000564, DES:000744, DES:0003221, DES:000667, DES:000975, DES:000544, DES:000765]
DES:000034 - Frozen milk and/or cream with various sugars, and flavoring such as fresh fruit and nut purees.
12
Why build an ontology? A simple example
Number of genes annotated to each of the following brain parts in an ontology:
  brain: 20
  hindbrain (part_of brain): 15
  rhombomere (part_of hindbrain): 10
Query "brain" without ontology: 20
Query "brain" with ontology: 45 (20 + 15 + 10)
Ontologies can facilitate grouping and retrieval of data.
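The two query results on this slide can be reproduced with a short sketch. The dictionary and function names are hypothetical, and the sum assumes that no gene is annotated to more than one of the three terms, which the slide does not state.

```python
# Direct annotation counts from the slide (genes annotated to each term).
direct_counts = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# part_of edges from the slide, stored as child -> parent.
part_of = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def parts_of(term):
    """Return the term plus everything that is transitively part_of it."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in part_of.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def count_without_ontology(term):
    """Count only the genes annotated directly to the term."""
    return direct_counts.get(term, 0)


def count_with_ontology(term):
    """Count genes annotated to the term or to any of its parts."""
    return sum(direct_counts.get(t, 0) for t in parts_of(term))


print(count_without_ontology("brain"))  # 20
print(count_with_ontology("brain"))     # 45
```

The only difference between the two queries is whether the part_of edges are followed before counting.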
13
Querying for genes in similar structures across species
Distal-less orthologs participate in distal-proximal pattern formation and appendage morphogenesis: mouse limb, sea urchin tube feet, ascidian ampulla, polychaete parapodia.
[Figure, panels A-E. Taxa: Vertebrata, Ascidians, Arthropoda, Annelida, Mollusca, Echinodermata. Structures: tetrapod limbs, ampullae, tube feet, parapodia.]
Panganiban et al., PNAS, 1997
14
Types, subtypes, and instances
Subtyping relation: is_a = SubClassOf
[Figure: an is_a hierarchy over entity, organism, animal, mammal, cat, and human, with the individuals Peanut and Chris Shaffer attached by instance_of.]
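A minimal sketch of how subtypes and instances interact. The exact shape of the hierarchy, and the assignment of Peanut to cat and Chris Shaffer to human, are assumptions read off the flattened figure labels; the function names are hypothetical.

```python
# is_a edges (subtype -> supertype). The ordering is an assumed reading
# of the slide's figure, not something the text states explicitly.
is_a = {
    "cat": "mammal",
    "human": "mammal",
    "mammal": "animal",
    "animal": "organism",
    "organism": "entity",
}

# instance_of edges (individual -> class). Also an assumed reading.
instance_of = {"Peanut": "cat", "Chris Shaffer": "human"}


def superclasses(cls):
    """Walk is_a upward and return every supertype, nearest first."""
    chain = []
    while cls in is_a:
        cls = is_a[cls]
        chain.append(cls)
    return chain


def is_instance_of(individual, cls):
    """An individual instantiates its asserted class and all supertypes."""
    asserted = instance_of[individual]
    return cls == asserted or cls in superclasses(asserted)


print(superclasses("cat"))
print(is_instance_of("Peanut", "mammal"))
print(is_instance_of("Peanut", "human"))
```

An individual is asserted to instantiate one class and inherits membership in every supertype, but not in sibling classes.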
15
How do you tell if it is an instance or a class?
Is there more than one in existence?
Is the entity referencing a group of things with common properties?
Class or instance?
  There is only one Snoopy. There is a class of things labeled "Snoopy toys".
  There is only one Alaska. There is a class of things labeled "States".
  There is only one blue morpho in my specimen collection. There is a class of things labeled "Blue morpho butterflies".
16
General Principle for Logical Definitions
Definitions are of the following genus-differentia form:
  X = a Y which has one or more differentiating characteristics,
where Y is the is_a parent of X.
Definition: cylinder = Surface formed by the set of lines perpendicular to a plane, which pass through a given circle in that plane.
Definition: Blue cylinder = Cylinder that has color blue. (Blue cylinder is_a cylinder.)
Definition: Red cylinder = Cylinder that has color red. (Red cylinder is_a cylinder.)
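The genus-differentia pattern can be written down as a small data structure. This is only an illustrative sketch: the class name, field names, and the has_color relation name are hypothetical, not taken from any ontology format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDefinition:
    term: str           # X, the class being defined
    genus: str          # Y, the is_a parent of X
    differentia: tuple  # (relation, value) pairs that distinguish X within Y


blue_cylinder = LogicalDefinition(
    term="blue cylinder",
    genus="cylinder",
    differentia=(("has_color", "blue"),),
)
red_cylinder = LogicalDefinition(
    term="red cylinder",
    genus="cylinder",
    differentia=(("has_color", "red"),),
)


def satisfies(definition, thing_type, properties):
    """True if a thing of the given type and properties meets the definition."""
    return thing_type == definition.genus and all(
        properties.get(relation) == value
        for relation, value in definition.differentia
    )


print(satisfies(blue_cylinder, "cylinder", {"has_color": "blue"}))
print(satisfies(red_cylinder, "cylinder", {"has_color": "blue"}))
```

Both definitions share the genus cylinder and differ only in their differentia, which is what places them as siblings under the same is_a parent.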
17
The True Path Rule
The pathway from a subClass all the way up to its top level parent(s) must be universally true.
GO before:
  cuticle synthesis
  --[i] chitin metabolism
  cell wall biosynthesis
  --[i] chitin metabolism
  ----[i] chitin biosynthesis
  ----[i] chitin catabolism
BUT: A fly chitin synthase gene could be annotated to chitin biosynthesis, and appear in a query for genes annotated to cell wall biosynthesis (and its children), which makes no sense because flies don't have cell walls.
GO after:
  chitin metabolism
  --[i] chitin biosynthesis
  --[i] chitin catabolism
  --[i] cuticle chitin metabolism
  ----[i] cuticle chitin biosynthesis
  ----[i] cuticle chitin catabolism
  --[i] cell wall chitin metabolism
  ----[i] cell wall chitin biosynthesis
  ----[i] cell wall chitin catabolism
NOW: all the subClass terms can be followed up to chitin metabolism, but cuticle chitin metabolism terms do not trace back to cell wall terms, so all the paths are true.
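The difference between the two hierarchies can be checked by listing each term's ancestors. The sketch below encodes only the [i] (is_a) edges shown on the slide; the dictionary and function names are hypothetical.

```python
# Child -> list of is_a parents, as drawn in the "GO before" hierarchy.
go_before = {
    "chitin metabolism": ["cuticle synthesis", "cell wall biosynthesis"],
    "chitin biosynthesis": ["chitin metabolism"],
    "chitin catabolism": ["chitin metabolism"],
}

# Child -> list of is_a parents, as drawn in the "GO after" hierarchy.
go_after = {
    "chitin biosynthesis": ["chitin metabolism"],
    "chitin catabolism": ["chitin metabolism"],
    "cuticle chitin metabolism": ["chitin metabolism"],
    "cuticle chitin biosynthesis": ["cuticle chitin metabolism"],
    "cuticle chitin catabolism": ["cuticle chitin metabolism"],
    "cell wall chitin metabolism": ["chitin metabolism"],
    "cell wall chitin biosynthesis": ["cell wall chitin metabolism"],
    "cell wall chitin catabolism": ["cell wall chitin metabolism"],
}


def ancestors(term, parents):
    """Return every term reachable by following is_a edges upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


# Before: a fly gene annotated to chitin biosynthesis reaches a cell wall term.
print("cell wall biosynthesis" in ancestors("chitin biosynthesis", go_before))

# After: the cuticle-specific term never reaches any cell wall term.
print(sorted(ancestors("cuticle chitin biosynthesis", go_after)))
```

A query that groups annotations by ancestor returns exactly these sets, which is why a false path upward produces false query results.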
18
Where does the True Path Rule come from? Transitivity.
Some relations are transitive and apply across all levels of the hierarchy. For example, a cat is_a mammal, and a mammal is_a vertebrate, SO a cat is_a vertebrate.
=> The true path rule holds because the is_a relation is transitive.
Some relations are not transitive and do not carry across levels. For example, head has_quality round, and head part_of organism. So is the organism round? Of course not!
BUT eye part_of head, and head part_of organism, SO eye part_of organism is true, because part_of is a transitive relation.
Relations are logically defined in a common relation ontology or within each ontology that uses them.
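A small sketch of what treating a relation as transitive means in practice. The closure function is a naive fixed-point loop written for clarity, not how a production reasoner works, and all names are hypothetical.

```python
def transitive_closure(edges):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Assertions taken from the slide.
is_a = {("cat", "mammal"), ("mammal", "vertebrate")}
part_of = {("eye", "head"), ("head", "organism")}
has_quality = {("head", "round")}

# is_a and part_of are declared transitive, so their closures are computed.
inferred_is_a = transitive_closure(is_a)
inferred_part_of = transitive_closure(part_of)

print(("cat", "vertebrate") in inferred_is_a)
print(("eye", "organism") in inferred_part_of)

# has_quality is left exactly as asserted: nothing licenses carrying
# "round" from the head to the organism it is part of.
print(("organism", "round") in has_quality)
```

Only the relations declared transitive are closed; has_quality stays as asserted.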
19
Relationships and definitions
A relationship from one class to another is a formalized part of its definition (expressed with an object property in OWL).
A subtype relation (is_a in OBO, SubClassOf in OWL) specifies necessary conditions for membership of a class.
For example, finger part_of hand (every finger is part_of some hand) states that a necessary condition of being in the class finger is to be part of some hand.
So: if a finger exists, it is part of some hand.
But: this does not mean that if a hand exists, it has a finger as a part.
20
More about using ontology classes and relations
Directionality
  Often, the order of the terms in an assertion matters. We can assert
    adult transformation_of child
  but not
    child transformation_of adult
Universality
  Entities should have the same meanings on every occasion of use (= they should refer to the same universal types).
  Basic ontological relations such as SubClassOf and part_of should be used in the same way by all ontologies.
  For example, if you need a non-transitive part relation, then define a new relation for this purpose.
21
Symmetric
Some relations are symmetric: if A is connected_to B, then B is connected_to A.
For example: retinal ganglion cell connected_to lateral geniculate nucleus AND lateral geniculate nucleus connected_to retinal ganglion cell.
Figure: Meissirel C et al. PNAS 1997;94:5900-5905. ©1997 by National Academy of Sciences
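As a sketch, declaring a relation symmetric amounts to adding the reverse of every asserted pair. The function name is hypothetical.

```python
def symmetric_closure(pairs):
    """Return the pairs together with their reversed counterparts."""
    return set(pairs) | {(b, a) for a, b in pairs}


# Only one direction is asserted by hand.
asserted = {("retinal ganglion cell", "lateral geniculate nucleus")}
connected_to = symmetric_closure(asserted)

print(("lateral geniculate nucleus", "retinal ganglion cell") in connected_to)
```

A relation such as part_of would not be treated this way: eye part_of head does not imply head part_of eye.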
22
Some classes are declared to never share any instances in common
OWL: DisjointWith
OBO: disjoint_from
[Figure: muscle is an anatomical structure; lumen is an anatomical space. Placing lumen of gut under both anatomical space and anatomical structure is marked ✗ (NO!).]
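A minimal sketch of the check that a disjointness declaration makes possible. The is_a assertions are an assumed reading of the figure, in which lumen of gut has been placed under both disjoint classes; the names are hypothetical.

```python
# Class -> asserted is_a parents (assumed reading of the slide's figure).
is_a = {
    "muscle": ["anatomical structure"],
    "lumen": ["anatomical space"],
    "lumen of gut": ["anatomical space", "anatomical structure"],
}

# Pairs of classes declared disjoint (DisjointWith / disjoint_from).
disjoint_pairs = [("anatomical structure", "anatomical space")]


def ancestors_and_self(cls):
    """Return the class together with everything reachable via is_a."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def disjointness_violations():
    """List classes that fall under both members of a disjoint pair."""
    bad = []
    for cls in is_a:
        types = ancestors_and_self(cls)
        for a, b in disjoint_pairs:
            if a in types and b in types:
                bad.append(cls)
    return bad


print(disjointness_violations())
```

Any instance of such a class would have to belong to two classes that can share no instances, so no instance can exist; an OWL reasoner reports a class like this as unsatisfiable.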
23
About reasoners
A reasoner is a piece of software able to infer logical consequences from a set of asserted facts or axioms. Reasoners are used to check the logical consistency of ontologies and to extend them with "inferred" facts or axioms.
For example, a reasoner would infer:
  Major premise: All mortals die.
  Minor premise: Some men are mortals.
  Conclusion: Some men die.
Different reasoners can perform slightly differently. There are a number of reasoners to be aware of: ELK, HermiT, Pellet, etc. We'll use these later in the Protégé tutorial.
24
Different kinds of definitions
An ontology is a collection of axioms.
An axiom is simply a sentence or a statement.
Axioms can be:
  non-logical (aka "annotations" or "text definitions")
    e.g. GO_0000262 has synonym 'mtDNA'
    opaque to reasoners
  logical
    well-defined semantics
    understood by reasoners
    Example: SubClassOf axioms
    Arguments can be classes or class expressions (for example, the class of things that are parts of the leg)
25
Anatomy ontologies: Exemplar use case
1. What makes anatomy ontologies different?
2. What kinds of anatomy ontologies exist?
3. Anatomy of an anatomy ontology class
4. Upper ontologies
5. How are anatomy ontologies used?
26
There are many useful ways to classify parts of organisms:
  its parts and their arrangement
  its relation to other structures (what is it part of, connected to, adjacent to, overlapping?)
  its shape
  its function
  its developmental origins
  its species or clade
  its evolutionary history
Cajal 1915: "Accept the view that nothing in nature is useless, even from the human point of view."
27
Not all classification is useful
Be practical: build ontologies for what you need and for what can be reused.
"About thirty years ago there was much talk that geologists ought only to observe and not theorise; and I well remember some one saying that at this rate a man might as well go into a gravel-pit and count the pebbles and describe the colours." C. Darwin
28
Classifying anatomy
[Figure labels: appendage, antenna, fore wing, hind wing]
29
Relationships record classifications too
'leg' SubClassOf part_of some 'thoracic segment'
[Figure labels: leg, wing, part_of some 'thoracic segment']
30
The knowledge in an ontology can make the reasons for classification explicit
Any sense organ that functions in the detection of smell is an olfactory sense organ.
olfactory sense organ = sense organ and capable_of some detection of smell
31
Classifying
Asserted: nose is_a sense organ; nose capable_of some detection of smell
Definition: olfactory sense organ = sense organ and capable_of some detection of smell
=> nose is classified as an olfactory sense organ
These are necessary and sufficient conditions, also called an equivalent class axiom.
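The classification step can be sketched as a check of each asserted class against the necessary and sufficient conditions. This is a toy matcher, not a description logic reasoner: it only compares asserted facts literally. The eye entry and all names are hypothetical additions for contrast.

```python
# Equivalent-class axiom from the slide:
#   olfactory sense organ = sense organ and capable_of some detection of smell
definitions = {
    "olfactory sense organ": {
        "genus": "sense organ",
        "restrictions": {("capable_of", "detection of smell")},
    },
}

# Asserted facts. "nose" comes from the slide; "eye" is a hypothetical
# counterexample added for illustration.
asserted = {
    "nose": {
        "is_a": {"sense organ"},
        "restrictions": {("capable_of", "detection of smell")},
    },
    "eye": {
        "is_a": {"sense organ"},
        "restrictions": {("capable_of", "detection of light")},
    },
}


def infer_superclasses(name):
    """Return defined classes whose conditions the named class satisfies."""
    facts = asserted[name]
    return {
        defined
        for defined, spec in definitions.items()
        if spec["genus"] in facts["is_a"]
        and spec["restrictions"] <= facts["restrictions"]
    }


print(infer_superclasses("nose"))
print(infer_superclasses("eye"))
```

Nobody asserted that nose is_a olfactory sense organ; the placement follows from the definition, which is what lets a reasoner maintain this kind of classification automatically.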
32
Why create equivalent classes and class restrictions?
It is difficult to keep track of multiple classification chains and:
  ensure completeness
  avoid redundancy
  avoid true path rule violation
33
Compositionality and avoiding asserted multiple inheritance
We can logically define composed classes and create complex definitions from simpler ones (aka building blocks, cross-products, logical definitions).
Descriptions can be composed at any time:
  Ontology construction time (pre-composition)
  Annotation time (post-composition)
Formal necessary and sufficient definitions + a reasoner => automatic (and therefore manageable) classification
Requires subtype classification, so apart from the root term(s), no term should lack a superclass parent.
Let the reasoner do the work!
34
Post-composed versus pre-composed anatomical entities are logically equivalent
Pre-composed: Plasma membrane of spermatocyte
Post-composed: plasma membrane and part_of some spermatocyte
  Genus: Plasma membrane [GO CC] (Gene Ontology)
  Differentia: part_of (Relation Ontology) some Spermatocyte [Cell Ontology]
35
What kinds of anatomy ontologies exist?
Species-centric and multi-species ontologies
  Mouse: MA (adult), EMAP / EMAPA (embryonic)
  Human: FMA (adult), EHDAA2 (CS1-CS20)
  Amphibian: AAO, XAO
  Fish: ZFA (zebrafish), MFO (medaka), TAO (teleosts)
  Nematode: WBbt (C. elegans)
  Arthropod: FBbt (Drosophila), TGMA (mosquito), HAO (Hymenoptera), Arthropod anatomy ontology
  Plant ontology
Species-neutral ontologies
  CARO (common anatomy reference ontology)
  Uberon (cross-species anatomy)
  vHOG (vertebrate homologous organs)
  CL (cell ontology)
  GO (gene ontology)
Phenotype ontologies
  MP (mammalian phenotype)
  HP (human phenotype)
  WB (worm phenotype)
36
Many perspectives, many ontologies that overlap in content
[Figure labels: chemical entities, gross anatomy, tissues, cells, cell anatomy, proteins, phenotypes, clinical disorders, processes, physiological processes, development, reactions, cellular processes, behavior, evolutionary characters, nervous system, neural crest]
37
Domain ontologies are organized according to upper ontologies (Basic Formal Ontology) that specify the general types of things that exist
Table axes: RELATION TO TIME (CONTINUANT, split into INDEPENDENT and DEPENDENT; OCCURRENT) by GRANULARITY
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)
=> Classification according to these higher level types helps ensure the True Path Rule holds
38
BFO high level classes
Continuant: An entity that exists in full at any time in which it exists at all, persists through time while maintaining its identity and has no temporal parts.
  Independent continuant: A continuant that is a bearer of quality and realizable entities, in which other entities inhere and which itself cannot inhere in anything.
  Dependent continuant: A continuant that is either dependent on one or other independent continuant bearers or inheres in or is borne by other entities.
Occurrent: An entity that has temporal parts and that happens, unfolds or develops through time. Occurrents are sometimes also called perdurants.
39
The Common Anatomy Reference Ontology
CARO is a structural classification based on granularity.
From the bottom up: cell component, cell, portion of tissue, multi-tissue structure
From the top down: organism subdivision, anatomical system
Acellular structures
CARO is an upper reference ontology that can be used to structure new anatomy ontologies.
40
Using CARO as a template
Zebrafish classes are asserted to be subclasses of CARO classes (zebrafish AO is_a CARO).
Cell and cell component are cross-referenced to GO.
41
Example of complexity arising from multiple species-contexts
[Figure labels: erythrocyte, cell, nucleate cell, enucleate cell; note: not applicable in all contexts]
42
Example of complexity arising from multiple species-contexts
[Figure labels: cell, nucleate cell, enucleate cell, erythrocyte, nucleate erythrocyte, enucleate erythrocyte, zebrafish nucleate erythrocyte, human erythrocyte. Identifiers: ZFA:0009256 … … CL:0000562 CL:0000232 CL:0000592 FMA:81100]
Species ontologies are attached at the appropriate level.
43
Using reasoners to detect errors
[Figure labels: Fruit fly FBbt 'tibia'; Human FMA 'tibia'; UBERON: tibia; UBERON: bone; Vertebrata; Drosophila melanogaster; Homo sapiens. Relations shown: is_a, part_of, only_in_taxon, disjoint_with. One link is marked ✗.]
Figure source: Developmental Biology, Scott Gilbert, 6th ed.
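The kind of error a reasoner can catch here can be sketched with taxon constraints. The edges below are an assumed reading of the flattened figure (in particular, that the fly tibia was wrongly placed under the vertebrate bone hierarchy); the identifiers are shortened labels and all function names are hypothetical.

```python
# Anatomy class -> is_a parent. The first edge is the suspect assertion.
is_a = {
    "FBbt:tibia": "UBERON:tibia",
    "FMA:tibia": "UBERON:tibia",
    "UBERON:tibia": "UBERON:bone",
}

# Taxon constraint: bone is found only in vertebrates.
only_in_taxon = {"UBERON:bone": "Vertebrata"}

# Species each species-specific class belongs to.
species_of = {
    "FBbt:tibia": "Drosophila melanogaster",
    "FMA:tibia": "Homo sapiens",
}

# Taxonomy: taxon -> parent taxon, plus declared disjoint taxa.
taxon_is_a = {"Homo sapiens": "Vertebrata"}
taxon_disjoint = {frozenset({"Drosophila melanogaster", "Vertebrata"})}


def lineage(start, parents):
    """Return start followed by its chain of parents."""
    chain = [start]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


def taxon_errors(cls):
    """Report inherited only_in_taxon constraints the class's species breaks."""
    errors = []
    taxa = lineage(species_of[cls], taxon_is_a)
    for ancestor in lineage(cls, is_a):
        required = only_in_taxon.get(ancestor)
        if required is None:
            continue
        for taxon in taxa:
            if frozenset({taxon, required}) in taxon_disjoint:
                errors.append(
                    f"{cls} is_a {ancestor}, which is only_in_taxon "
                    f"{required}, but {taxon} is disjoint_with {required}"
                )
    return errors


print(taxon_errors("FBbt:tibia"))
print(taxon_errors("FMA:tibia"))
```

The human class passes because Homo sapiens is_a Vertebrata; the fly class fails because the constraint inherited from bone conflicts with its taxon.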
44
Representing different levels of granularity
GO processes: cilium development, hair cell development, neuromast development, lateral line development
[Figure: the links between these development processes are marked "?"]
GO:cilium part_of CL:hair cell
CL:hair cell part_of CL:neuromast
CL:neuromast part_of ZFA:lateral line
45
Ontology considerations
An ontology is a classification.
There are many useful ways to classify anatomy.
Maintaining multiple classification schemes by hand is impractical, so you should automate it.
Everybody makes mistakes, so you should let the computer find errors for you.
Re-use other people's work where possible:
  Import class hierarchies
  Use common patterns
Cautionary note: formal languages have limitations. Don't expect to be able to express everything!