Phenotype annotation using ontologies
Chris Mungall (+ BS)
Berkeley Bioinformatics and Ontologies Project (BBOP)
National Center for Biomedical Ontology (NCBO)
A biological ontology is:
- A precise representation of some aspect of biological reality
  - what kinds of things exist?
  - what are the relationships between these things?
[Figure: example graph with nodes eye, ommatidium, sense organ, eye disc and edges labeled is_a, part_of, develops from]
OBO Foundry definition of a phenotype
- A collection of qualities inhering in one or more entities
- Examples:
  - the quality of having reduced mass inhering in bone (i.e. osteoporosis)
  - the quality of being hypoplastic inhering in a midface
  - the quality of lacking asters inhering in spermatocytes
- Qualities are real and have spatial or spatiotemporal extent
PATO: qualities
- Purpose:
  - originally: annotation of mutant phenotypes
  - now: OBO Foundry candidate reference ontology for biological qualities
- Both spatial and spatiotemporal attributes
  - shape (inheres in a 3D object)
  - rate (inheres in a process)
- Multiple levels of granularity
Old EAV representation
Entity                  | Attribute         | Value
eye                     | size              | small
heart                   | structure         | edematous
ventral mandibular arch | thickness         | thick
swim bladder inflation  | process attribute | arrested
Term sources: Entity from AO, GO, CLO, …; Attribute and Value from PATO
Problems with EAV
Entity   | Attribute | Value
organism | eats      | peanut butter and jelly sandwiches
skin     | red       | dark
skin     | color     | red
heart    | shape     | heart-shaped
- What is attribute, what is value?
- Every attribute is turned into a relation?
Towards a Single Hierarchy of Monadic Qualities
- long length is_a length
- hot temperature is_a temperature
- cylindrical shape is_a shape
- extended cylindrical shape is_a cylindrical shape
- ontologies are about types; "values" reflect a confusion between types and instances
New Quality Ontology
Bearer                  | Quality
eye                     | small [size]
heart                   | edematous [structure]
ventral mandibular arch | thick [thickness]
swim bladder inflation  | arrested [process]
Term sources: Bearer from CARO, GO, CL, …; Quality from PATO
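The bearer + quality layout above can be sketched as a small data structure. This is only an illustration: the `Phenotype` class, the `QUALITY_IS_A` table, and the use of plain labels instead of real ontology identifiers are assumptions made for the example, not part of PATO or any OBO tool. The point it tries to show is that the old "attribute" column is recoverable as an is_a ancestor of the quality.

```python
from dataclasses import dataclass

# Hypothetical toy fragment of a quality hierarchy (child -> is_a parent).
# Labels are taken from the table above; real PATO terms have identifiers.
QUALITY_IS_A = {
    "small": "size",
    "edematous": "structure",
    "thick": "thickness",
    "arrested": "process",
}


@dataclass(frozen=True)
class Phenotype:
    bearer: str   # term from an entity ontology (e.g. CARO, GO, CL)
    quality: str  # term from the quality ontology


def ancestors(quality):
    """Return the is_a ancestors of a quality, nearest first."""
    found = []
    while quality in QUALITY_IS_A:
        quality = QUALITY_IS_A[quality]
        found.append(quality)
    return found


annotations = [
    Phenotype("eye", "small"),
    Phenotype("heart", "edematous"),
    Phenotype("ventral mandibular arch", "thick"),
    Phenotype("swim bladder inflation", "arrested"),
]

for p in annotations:
    # The old "attribute" is not stored; it is looked up in the hierarchy.
    print(p.bearer, "|", p.quality, "|", ancestors(p.quality))
```

Storing only the quality avoids having to decide what counts as an attribute and what counts as a value, which is the problem the EAV layout ran into.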
Extensions
- E+Q is not enough in itself
  - Relational attributes
  - Relative attributes
- Measurements
  - the measurement is not the phenotype
Relational attributes
- Most qualities are monadic
  - they inhere in a single self-connected entity
  - e.g. shape, color
- Some qualities are relational (instance-to-type)
  - they inhere in more than one entity
  - e.g. sensitivity, tolerance
- Some qualities are relative (instance-to-instance)
  - e.g. taller_than
- Soon PATO will indicate which qualities are relational
E+Q extended with secondary relata
Bearer | Quality             | Secondary relatum
eye    | high [sensitivity]  | red light
brain  | fused [structure]   | eye
leg1   | longer_than         | leg2
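One way to picture the extension is an E+Q record with an optional third slot. The sketch below is a minimal illustration under stated assumptions: the class name `RelationalPhenotype` and the field name `towards` are invented here and are not taken from PATO or from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationalPhenotype:
    bearer: str                    # the entity the quality inheres in
    quality: str                   # the quality term
    towards: Optional[str] = None  # hypothetical name for the secondary relatum

    def describe(self):
        """Render the annotation as a short readable string."""
        if self.towards is None:
            return f"{self.bearer}: {self.quality}"
        return f"{self.bearer}: {self.quality} with respect to {self.towards}"


rows = [
    RelationalPhenotype("eye", "high sensitivity", "red light"),
    RelationalPhenotype("brain", "fused", "eye"),
    RelationalPhenotype("leg1", "longer_than", "leg2"),
    RelationalPhenotype("eye", "small"),  # a monadic quality needs no relatum
]

for row in rows:
    print(row.describe())
```

Keeping the third slot optional lets monadic and relational annotations share one record shape.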
Problem of Lacks
- Common attributes for systematics
  - spermatocyte devoid of asters
- Example: wingless
  - E=wing, A=presence, V=absent
  - but there is no wing for the quality of absence to inhere in
- Consider instead:
  - The thorax (or whole fly) has the quality of being wingless
  - =def. A normal thorax of this type has_a wing, but not (this instance has_a wing)
- lacks is like instantiation: it relates instances to types
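The definition of lacks above can be read as a two-part test: the part is expected for normal instances of the type, and it is missing from this instance. A minimal sketch follows, assuming a hypothetical `NORMAL_HAS` table of type-level expectations and representing an instance simply by the set of part types it has.

```python
# Hypothetical type-level knowledge: which part types a normal instance has.
NORMAL_HAS = {
    "thorax": {"wing", "leg"},
}


def lacks(instance_type, instance_parts, part_type):
    """True if normal instances of instance_type have part_type
    but this particular instance (described by instance_parts) does not."""
    expected = part_type in NORMAL_HAS.get(instance_type, set())
    present = part_type in instance_parts
    return expected and not present


# A wingless fly's thorax: legs present, no wing.
print(lacks("thorax", {"leg"}, "wing"))
# A normal thorax does not lack a wing.
print(lacks("thorax", {"leg", "wing"}, "wing"))
# Something never expected on a thorax is not "lacked" in this sense.
print(lacks("thorax", {"leg"}, "antenna"))
```

The function takes an instance on one side and a type on the other, which mirrors the remark that lacks relates instances to types.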
Ontological relations for anatomy and phylogeny
- Basic set (from OBO relations ontology)
  - is_a
  - part_of
  - ontogenic/developmental
    - derives_from
    - transformation_of
- New relations
  - has [has_quality]
  - lacks
  - in_organism
  - not_in_organism
  - evolved_from
  - homologous_to
Instances and types
- Dictionary definitions leave lots of room for ambiguity
  - especially for relations
- We must be careful to distinguish between instances and types when defining relations
  - We directly perceive and interact with instances
  - As scientists we are primarily interested in types
- Type-level relations are defined in terms of instances
is_a
- also known as: subtype_of
- X is_a Y
  - given any x that instantiates X at time t, x also instantiates Y at time t
  - informally: all Xs are always Ys
- is_a is a transitive relation
  - if X is_a Y and Y is_a Z then X is_a Z
- Examples:
  - left eyeball is_a eyeball
  - eye development is_a organ development
  - monotreme is_a mammal
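Transitivity means that asserted is_a links imply further links that nobody wrote down. The sketch below computes those implied links with a naive fixed-point loop. The function name `transitive_closure` is chosen for this example, and the "cavitated organ" links are assumed for illustration in addition to the "left eyeball is_a eyeball" example above.

```python
# Asserted (direct) is_a links as (subtype, supertype) pairs.
IS_A = {
    ("left eyeball", "eyeball"),
    ("eyeball", "cavitated organ"),
    ("cavitated organ", "organ"),
}


def transitive_closure(pairs):
    """Return all links implied by transitivity, including the asserted ones."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # if a R b and b R d, then a R d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


inferred = transitive_closure(IS_A)
print(("left eyeball", "organ") in inferred)
print(sorted(inferred - IS_A))
```

The loop is quadratic per pass and is meant only to make the rule concrete; ontology reasoners use far more efficient methods.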
[Figure: types eyeball, cavitated organ, and organ linked by is_a; instances linked to their types by instance_of]
part_of
- X part_of Y (where X and Y are types)
  - given any x that instantiates X at time t, there exists some y at time t such that y instantiates Y and x part_of y
  - informally: all Xs are part of some Y at all times
- part_of is a transitive relation
  - if X part_of Y and Y part_of Z then X part_of Z
- Examples:
  - ommatidium part_of compound eye
  - male genital system part_of body
  - placenta part_of female reproductive system
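The type-level definition has an "all-some" shape: every instance of X must be part of some instance of Y. The sketch below checks that shape against a small set of instances. It simplifies in ways the definition does not: time is ignored (a single snapshot), only direct instance-level part_of links are consulted, and the instance names and the `type_part_of` function are hypothetical. A finite check like this can refute a type-level claim but cannot prove it.

```python
# Hypothetical instance data at a single point in time.
INSTANCE_OF = {
    "om1": "ommatidium",
    "om2": "ommatidium",
    "eyeA": "compound eye",
    "headA": "head",
}
PART_OF = {("om1", "eyeA"), ("om2", "eyeA"), ("eyeA", "headA")}


def type_part_of(x_type, y_type, instance_of, part_of):
    """All-some check: every instance of x_type is part of
    at least one instance of y_type (vacuously true with no instances)."""
    xs = [i for i, t in instance_of.items() if t == x_type]
    return all(
        any(
            instance_of.get(whole) == y_type
            for (part, whole) in part_of
            if part == x
        )
        for x in xs
    )


print(type_part_of("ommatidium", "compound eye", INSTANCE_OF, PART_OF))
print(type_part_of("compound eye", "ommatidium", INSTANCE_OF, PART_OF))
```

Note the asymmetry: the check quantifies over all Xs but only asks for some Y, so "ommatidium part_of compound eye" says nothing about whether every compound eye has an ommatidium.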
Ontogenic relations
- OBO relations ontology defines:
  - transformation_of (single entity, identity preserving)
  - derives_from (fusion and fission)
- Most OBO ontologies currently use develops_from
  - can be considered the union of transformation_of and derives_from
  - transitive
- Examples:
  - T cell develops_from T lymphoblast
  - male reproductive organ develops_from gonad primordium
  - female reproductive organ develops_from gonad primordium
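Reading develops_from as the transitive union of the two finer relations can be sketched as follows. The type names "a", "b", "c" are deliberately abstract placeholders, since the slides do not say which sub-relation each example falls under, and the `develops_from` function is an illustration rather than how any OBO tool computes it.

```python
# Placeholder types: "c" is a transformation of "b", and "b" derives from "a".
TRANSFORMATION_OF = {("c", "b")}
DERIVES_FROM = {("b", "a")}


def develops_from(transformation_of, derives_from):
    """Union of the two relations, closed under transitivity."""
    closure = set(transformation_of) | set(derives_from)
    while True:
        # Compose every pair of links that share a middle term.
        new = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        } - closure
        if not new:
            return closure
        closure |= new


links = develops_from(TRANSFORMATION_OF, DERIVES_FROM)
print(sorted(links))
```

The link from "c" to "a" appears in neither input relation; it exists only in the combined, transitive develops_from.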
in_organism
- X in_organism Y (where X and Y are types)
  - given any x that instantiates X at time t, there exists some y at time t such that y instantiates Y and x part_of y
  - informally: all Xs are in some organism of type Y
- Examples:
  - vertebrate eye in_organism vertebrate
  - placenta in_organism mammal [discuss!]
not_in_organism
- Features are often lost
- X not_in_organism Y (where X and Y are types)
  - given any x that instantiates X at time t, there exists no y at time t such that y instantiates Y and x part_of y
  - informally: no Xs are in any organism of type Y
- Examples:
  - ceratobranchial 5 tooth not_in_organism Gonorhynchiformes
- To be added to OBO relations ontology
  - Do we need organism_lacks?
- Warning: exceptions are bad for ontologies
  - how to deal with return of atavistic features?
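The warning about exceptions can be made concrete: a not_in_organism assertion is a universal negative, so a single counter-instance falsifies it. The sketch below lists such counter-instances. All instance names, the "some other taxon" label, and the `violations` function are hypothetical, and time is again ignored.

```python
def violations(x_type, y_type, instance_of, part_of):
    """Return instances of x_type that are part of some instance of y_type.
    An empty list means the data are consistent with X not_in_organism Y."""
    return sorted(
        part
        for (part, whole) in part_of
        if instance_of.get(part) == x_type
        and instance_of.get(whole) == y_type
    )


# Hypothetical observations.
instance_of = {
    "tooth1": "ceratobranchial 5 tooth",
    "fish1": "Gonorhynchiformes",
    "fish2": "some other taxon",
}
part_of = {("tooth1", "fish2")}

print(violations("ceratobranchial 5 tooth", "Gonorhynchiformes",
                 instance_of, part_of))

# An atavistic tooth in a Gonorhynchiformes specimen breaks the assertion.
instance_of["tooth2"] = "ceratobranchial 5 tooth"
part_of.add(("tooth2", "fish1"))
print(violations("ceratobranchial 5 tooth", "Gonorhynchiformes",
                 instance_of, part_of))
```

Whether such a specimen should invalidate the type-level statement, or be handled as an annotated exception, is the open question raised above.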
[Figure: graph with nodes eye, vertebrate eye, compound eye, ommatidium, vertebrate, arthropod, coelomata and edges labeled is_a, part_of, in_organism]