
The Relation Ontology. Barry Smith.


1 The Relation Ontology. Barry Smith

2 Concepts, Types and Frames. Concepts, Frames. Types, Relational Structures.

3 Concepts, Types and Frames. Concepts, Frames: Linguistic Approach. Types, Relational Structures: Scientific Approach.

4 [Diagram: the TLR-2 signalling pathway drawn as a graph. Nodes: TLR2, MyD88, TIR domain, TLR2:MyD88 complex, LTA binding, TLR2-TLR2 ligand binding, TIR-TIR binding process, TLR2-MyD88 binding. Edge labels: has_lower_level_granularity, has_participant, has_disposition, has_part, preceded_by, regulated_by, has_output.]

5 How to define relations such as this?

6 Uses of ‘ontology’ in PubMed abstracts

7 By far the most successful: The Gene Ontology

8 [Slide shows a long amino-acid sequence.] How to do biology across the genome?

9 [Slide filled with the continuation of the amino-acid sequence.]

10 What cellular component? What molecular function? What biological process?

11 GO used in curation of literature: what cellular component? What molecular function? What biological process?

12 and in integration of databases (MouseEcotope, GlyProt, DiabetInGene, GluChem). Example term: sphingolipid transporter activity

13 The GO Idea (MouseEcotope, GlyProt, DiabetInGene, GluChem). Example term: Holliday junction helicase complex

14 The GO Idea (MouseEcotope, GlyProt, DiabetInGene, GluChem). Example term: sphingolipid transporter activity

15 GO used in reasoning: part_of, is_a (Clark et al., 2005)

16 How does the Gene Ontology work? With thanks to Jane Lomax, Gene Ontology Consortium

17 The methodology of annotations: tagging scientific literature with terms from GO. Model organism databases (MODs) employ scientific curators, who use experimental observations reported in the biomedical literature to link gene products (such as proteins) with GO terms in annotations. International Society of Biocurators: http://www.biocurator.org/

18 GO provides a controlled system of representations for use in annotating data: multi-species; multi-disciplinary; multi-granularity, from molecules to population

19 Gene products involved in cardiac muscle development in humans

20 $100 million invested in literature curation using GO; over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO

21 GO allows a new kind of biological research, based on analysis and comparison of the massive quantities of annotations linking GO terms to the gene products described in scientific literature and in scientific databases

22 GO is amazingly successful in overcoming data silo problems, but it covers only: cellular components; molecular functions; biological processes

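23 The Open Biomedical Ontologies (OBO) Foundry. Ontologies arranged by RELATION TO TIME (CONTINUANT, divided into INDEPENDENT and DEPENDENT, versus OCCURRENT) and by GRANULARITY:
ORGAN AND ORGANISM. Independent continuant: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO). Dependent continuant: Organ Function (FMP, CPRO), Phenotypic Quality (PaTO). Occurrent: Biological Process (GO).
CELL AND CELLULAR COMPONENT. Independent continuant: Cell (CL), Cellular Component (FMA, GO). Dependent continuant: Cellular Function (GO).
MOLECULE. Independent continuant: Molecule (ChEBI, SO, RnaO, PrO). Dependent continuant: Molecular Function (GO). Occurrent: Molecular Process (GO).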

24 The OBO Foundry: to extend the GO to enable intelligent integration of gigantic bodies of heterogeneous data across the entire domain of the life sciences, including clinical medicine; to create an evolving, map-like, computable representation of the entire domain of biological and medical reality

25 The OBO Foundry. Initial Candidate Members: GO (Gene Ontology); CL (Cell Ontology); SO (Sequence Ontology); ChEBI (Chemical Entities of Biological Interest); PATO (Phenotype (Quality) Ontology); FMA (Foundational Model of Anatomy); CARO (Common Anatomy Reference Ontology); PRO (Protein Ontology)

26 The OBO Foundry. Under development: Disease Ontology; Infectious Disease Ontology; Mammalian Phenotype Ontology; Plant Trait Ontology; Environment Ontology; Ontology for Biomedical Investigations; Behavior Ontology; RNA Ontology; RO (Relation Ontology)

27 A success story in top-down information integration. Ontologies configured as extensions of a single upper-level ontology (BFO). Used by hundreds of researchers to promote interoperability of experimental data in scores of high-throughput domains of biology and medicine via semantic annotation.

28 The linguistic approach: bottom-up, focused on linguistic properties manifested by the contents of a large corpus, viewed from a cognitive perspective (mapping/modeling meanings or concepts rather than entities in reality)

29 Automatic mining of “associations” from MEDLINE. FACTA: Finding Associated Concepts with Text Analysis. What diseases are related to a particular chemical? What proteins are related to a particular disease? http://text0.mib.man.ac.uk/software/facta/

30 For the linguistic approach: fiction may be no less important than fact; English has no privileged status (the larger the corpus, the better); consistency (and thus additivity) of annotations is not important, because cognitive perspectives differ; the goal is automatic generation of semantic annotations via pattern-matching

31 For the scientific approach: factual discourse alone is important; English is the lingua franca; regimentation is allowed; the goal is truth: to create a single computer-processable map of reality via painstaking manual work (Handarbeit); truth is one, so we strive for consistency of annotations

32 The linguistic approach is concerned with knowledge representation. The scientific approach is concerned with reality representation.

33 OBO Relation Ontology (RO 1.0). Foundational: is_a, part_of. Spatial: located_in, contained_in, adjacent_to. Temporal: transformation_of, derives_from, preceded_by. Participation: has_participant, has_agent.

34 The Relation Ontology supports consistent linkage of OBO Foundry ontologies through a common system of formally defined relations, to enable reasoning both within and across ontologies, and thus also within and between the literature annotated in its terms

35 Relation Ontology: instance_of; is_a (= is a subtype of); depends_on; part_of; inheres_in; has_input; has_participant; … http://obofoundry.org/ro/

36 Basic Formal Ontology (BFO). Continuant: Independent Continuant, Dependent Continuant. Occurrent (Process, Event). http://ifomis.uni-saarland.de/bfo/

37 Fundamental Dichotomy. Continuants preserve their identity through change. Occurrents (aka processes): have temporal parts; unfold themselves in successive phases; exist only in their phases; have all their parts of necessity.

38 instance_of. [Diagram: the BFO hierarchy of Continuant (Independent Continuant: thing; Dependent Continuant: quality) and Occurrent (process, event), with two levels labelled "types" and "instances".]

39 Types vs. instances. Compare OWL: T-box vs. A-box (terminology vs. assertions).

40 3 kinds of (binary) relations. Between types: human is_a mammal; human heart part_of human. Between an instance and a type: this human instance_of the type human; this human allergic_to the type tamiflu. Between instances: Mary’s heart part_of Mary; Mary’s aorta connected_to Mary’s heart.

41 depends_on. [Diagram: the BFO hierarchy of Continuant (Independent Continuant: thing; Dependent Continuant: quality) and Occurrent (process, event).] Quality depends on bearer.

42 Dependent continuants: the whiteness quality of this cheese; your role as lecturer; the disposition of this peach to ripen

43 depends_on. [Diagram: the BFO hierarchy of Continuant (Independent Continuant: thing; Dependent Continuant: quality) and Occurrent (process).] Temperature depends on bearer.

44 depends_on. [Diagram: the BFO hierarchy of Continuant (Independent Continuant: thing; Dependent Continuant: quality, …) and Occurrent (process, event).] Event depends on participant.

45 Type-level relations presuppose the underlying instance-level relations. A is_a B =def. A and B are types and all instances of A are instances of B. A part_of B =def. all instances of A are instance-level parts of some instance of B.
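The two definitions on this slide can be read as checks over a finite body of instance data. The following is a minimal sketch under that assumption; the instance names (mary, rex, marys_heart) and the helper functions are hypothetical illustrations, not part of RO.

```python
# Hypothetical instance data: each instance is mapped to the types it instantiates.
instance_types = {
    "mary": {"human", "mammal"},
    "rex": {"mammal"},
    "marys_heart": {"human heart"},
}
# Hypothetical instance-level parthood facts, stored as (part, whole).
instance_part_of = {("marys_heart", "mary")}


def instances(type_name):
    """Return the set of recorded instances of a type."""
    return {x for x, types in instance_types.items() if type_name in types}


def type_is_a(a, b):
    """A is_a B: all instances of A are instances of B."""
    return instances(a) <= instances(b)


def type_part_of(a, b):
    """A part_of B: every instance of A is an instance-level part of some instance of B."""
    return all(
        any((x, y) in instance_part_of for y in instances(b))
        for x in instances(a)
    )


print(type_is_a("human", "mammal"))             # True
print(type_is_a("mammal", "human"))             # False: rex is a mammal but not a human
print(type_part_of("human heart", "human"))     # True
print(type_part_of("human", "human heart"))     # False
```

Note that both checks are vacuously true for a type with no recorded instances, so in practice they only say something when instance data for A is actually available.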

46 Rule for including relations in RO: in every case we need to check, before we add a relation [A] R [B] to RO, that for some set of A and B terms we have data about the As and data about the Bs such that all the instances of A stand in R to some instance of B; e.g. all the instances of cell membrane stand in part_of to some cell

47 The assertions linking terms in ontologies must hold universally. Hence all type-level relations in RO are provided with All-Some definitions. (For linguists, Some-Some relations are equally important.)

48 Rule for including relations in RO: before including a new relation in RO, ask yourself whether the relation can be easily defined in terms of existing relations; e.g. define ‘synaptically_connected_to’ in terms of is_connected_to, is_a and synapse

49 Including only All-Some relations means: all relations can be evaluated as (1) transitive, (2) symmetric, (3) reflexive, (4) anti-symmetric. All relations support logical reasoning, as contrasted with: is_related_to, is_associated_with, is_narrower_in_meaning_than, …
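For a relation given as a finite set of pairs, the four properties listed on this slide can be evaluated mechanically. The sketch below assumes such a finite representation; the three-element domain and the parthood facts are hypothetical sample data.

```python
def is_transitive(pairs):
    """If (a, b) and (b, d) hold, then (a, d) must hold."""
    return all((a, d) in pairs for (a, b) in pairs for (c, d) in pairs if b == c)


def is_symmetric(pairs):
    """If (a, b) holds, then (b, a) must hold."""
    return all((b, a) in pairs for (a, b) in pairs)


def is_reflexive(pairs, domain):
    """Every element of the domain is related to itself."""
    return all((x, x) in pairs for x in domain)


def is_antisymmetric(pairs):
    """If (a, b) and (b, a) both hold, then a and b are identical."""
    return all(a == b for (a, b) in pairs if (b, a) in pairs)


# Hypothetical sample: a parthood relation over three entities, reflexive pairs included.
domain = {"nucleus", "cell", "organism"}
part_of = {(x, x) for x in domain} | {
    ("nucleus", "cell"),
    ("cell", "organism"),
    ("nucleus", "organism"),
}

print(is_transitive(part_of))         # True
print(is_symmetric(part_of))          # False
print(is_reflexive(part_of, domain))  # True
print(is_antisymmetric(part_of))      # True
```

A loose relation such as is_related_to carries no such guarantees, which is why it cannot be used to license inferences.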

50 Reasoning should be able to cascade from one relational assertion (A R1 B) to the next (B R2 C). “Find all DNA binding proteins” should also find all transcription factor proteins, because transcription factor is_a DNA binding protein. Only the All-Some structure guarantees such cascading of relational assertions.
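The query example on this slide can be sketched as a search that follows is_a links upward from each annotated term. The term hierarchy beyond "transcription factor is_a DNA binding protein" and the gene-product names below are hypothetical, chosen only to make the example runnable.

```python
# is_a links between terms, child -> list of parents.
# Only the first link comes from the slide; the others are illustrative assumptions.
is_a_parents = {
    "transcription factor": ["DNA binding protein"],
    "DNA binding protein": ["protein"],
    "transporter": ["protein"],
}

# Hypothetical annotations: gene product -> the term it is annotated with.
annotations = {
    "protein_X": "transcription factor",
    "protein_Y": "DNA binding protein",
    "protein_Z": "transporter",
}


def ancestors(term):
    """All terms reachable from `term` via is_a, including the term itself."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(is_a_parents.get(current, []))
    return seen


def find_all(term):
    """Gene products annotated with `term` or with any of its is_a descendants."""
    return sorted(p for p, t in annotations.items() if term in ancestors(t))


print(find_all("DNA binding protein"))  # ['protein_X', 'protein_Y']
```

The transcription factor protein_X is returned even though it was never annotated with "DNA binding protein" directly, which is the cascading the slide describes.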

51 Why All-Some? If you know A part_of B and B part_of C, then whichever A you choose, the instance of B of which it is a part will be included in some C, which will include as part also the A with which you began

52 A part_of B, B part_of C, ... The All-Some structure of the definitions in the RO allows cascading of inferences (i) within ontologies, (ii) between ontologies, (iii) between ontologies and repositories of instance data
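The chaining argument can be traced on instance data: start from any instance of A, step to a B that contains it, then to a C that contains that B. This is a minimal sketch with hypothetical instance identifiers (a1, b1, c1, ...); it assumes instance-level parthood is transitive, as the argument on the slide does.

```python
# Hypothetical instance data for three types A, B and C.
instance_types = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
# Directly recorded instance-level parthood facts, stored as (part, whole).
part_of = {("a1", "b1"), ("a2", "b2"), ("b1", "c1"), ("b2", "c1")}


def instances(type_name):
    """Sorted list of recorded instances of a type."""
    return sorted(x for x, t in instance_types.items() if t == type_name)


def wholes(x, whole_type):
    """Instances of whole_type that x is directly recorded as part of."""
    return [y for y in instances(whole_type) if (x, y) in part_of]


def derived_wholes(x, via_type, whole_type):
    """Instances of whole_type reached by chaining two parthood facts through via_type."""
    return sorted({c for b in wholes(x, via_type) for c in wholes(b, whole_type)})


for a in instances("A"):
    print(a, "part_of", derived_wholes(a, "B", "C"))
# a1 part_of ['c1']
# a2 part_of ['c1']
```

Because every A has some B and every B has some C, the derived list is never empty, so A part_of C holds at the type level as well.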

53 Organisms are continuants: they are entities which endure through time through gain and loss of parts. Processes are occurrents: they are entities which unfold through time, and have all their parts as a matter of necessity.

54 human testis part_of adult human being, but not: human being has_part human testis, and not even: male human being has_part human testis

55 part_of for continuant types. A part_of B =def. for all x, t: if x instance_of A at t, then there is some y such that y instance_of B at t and x instance_level_part_of y at t. Example: cell membrane part_of cell.
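The time-indexed definition for continuants can be sketched over facts that carry an explicit time stamp. The identifiers m1 and c1, the integer times and the function name are assumptions made for illustration only.

```python
def continuant_part_of(a, b, instance_of_at, part_of_at):
    """A part_of B for continuants: whenever x is an A at t, x is part of some y that is a B at t."""
    for (x, x_type, t) in instance_of_at:
        if x_type != a:
            continue
        has_whole = any(
            y_type == b and t_y == t and (x, y, t) in part_of_at
            for (y, y_type, t_y) in instance_of_at
        )
        if not has_whole:
            return False
    return True


# Hypothetical facts: (instance, type, time) and (part, whole, time).
instance_of_at = {
    ("m1", "cell membrane", 1),
    ("m1", "cell membrane", 2),
    ("c1", "cell", 1),
    ("c1", "cell", 2),
}
part_of_at = {("m1", "c1", 1), ("m1", "c1", 2)}

print(continuant_part_of("cell membrane", "cell", instance_of_at, part_of_at))  # True

# Dropping the parthood fact at time 2 breaks the universal claim.
print(continuant_part_of("cell membrane", "cell", instance_of_at, {("m1", "c1", 1)}))  # False
```

The second call shows why the time index matters: a continuant can gain and lose parts, so the check has to succeed at every time at which the part exists.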

56 part_of for occurrent types. A part_of B =def. for all x: if x instance_of A, then there is some y such that y instance_of B and x instance_level_part_of y. EVERY A IS PART OF SOME B.

57 is_a (for continuants). A is_a B =def. for all x, t: if x instance_of A at t, then x instance_of B at t. Examples: abnormal cell is_a cell; adult human is_a human; but not: adult is_a child.

58 Lacks. Instance-type level: p lacks U with respect to r at time t =def. there is no instance u of U such that p stands in r to u at t. Type-type level: C1 lacks C2 with respect to r =def. for all c, t: if c instance_of C1 at t, then c lacks C2 with respect to r at time t. We still need a way to state, on top of this, that C1s normally stand in r to some C2.
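The two levels of the lacks definition can be sketched as follows. The sketch makes two assumptions that the slide does not state: that absence of a fact from the data means the relation does not hold (a closed-world reading), and that u is required to be an instance of U at the same time t. The fly and wing data are hypothetical.

```python
def lacks_instance(p, u_type, r, t, instance_of_at, relations_at):
    """p lacks U with respect to r at t: no u that is a U at t has (p, r, u, t) recorded."""
    return not any(
        (p, r, u, t) in relations_at
        for (u, ty, t_u) in instance_of_at
        if ty == u_type and t_u == t
    )


def lacks_type(c1, c2, r, instance_of_at, relations_at):
    """C1 lacks C2 with respect to r: every c that is a C1 at t lacks C2 with respect to r at t."""
    return all(
        lacks_instance(c, c2, r, t, instance_of_at, relations_at)
        for (c, ty, t) in instance_of_at
        if ty == c1
    )


# Hypothetical facts: (instance, type, time) and (subject, relation, object, time).
instance_of_at = {
    ("fly_1", "wingless fly", 1),
    ("fly_2", "fly", 1),
    ("wing_1", "wing", 1),
}
relations_at = {("fly_2", "has_part", "wing_1", 1)}

print(lacks_type("wingless fly", "wing", "has_part", instance_of_at, relations_at))  # True
print(lacks_type("fly", "wing", "has_part", instance_of_at, relations_at))           # False
```

Nothing in this check expresses the further requirement mentioned on the slide, that C1s normally stand in r to some C2; that would need additional machinery.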

59 transformation_of. A transformation_of B =def. every instance of A was at some earlier time an instance of B. Example: adult transformation_of child.
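Read over time-stamped instance data, the definition says that each instance of A must have an earlier record as an instance of B. A minimal sketch, assuming integer time stamps and hypothetical individuals mary and john:

```python
def transformation_of(a, b, instance_of_at):
    """A transformation_of B: every x that is an A at t was a B at some earlier time."""
    return all(
        any(y == x and y_type == b and t_earlier < t
            for (y, y_type, t_earlier) in instance_of_at)
        for (x, x_type, t) in instance_of_at
        if x_type == a
    )


# Hypothetical facts: (instance, type, time). The same individual changes type over time.
instance_of_at = {
    ("mary", "child", 1),
    ("mary", "adult", 2),
    ("john", "child", 1),
    ("john", "adult", 3),
}

print(transformation_of("adult", "child", instance_of_at))  # True
print(transformation_of("child", "adult", instance_of_at))  # False
```

The key point is that the identity of the instance is preserved: the check compares records for the same individual, not for two different ones.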

60 transformation_of. [Diagram along a time axis: the same instance c instantiates one type at t1 and another type at t. Diagram labels: C, c at t; C1, c at t1; "same instance". Example pairs: pre-RNA and mature RNA; adult and child.]

61 derives_from. [Diagram along a time axis, at the level of instances. Diagram labels: C, c at t; C1, c1 at t1; C', c' at t.] Example: zygote derives_from ovum; zygote derives_from sperm. Correction to the original Genome Biology paper: derivation is never one-to-one.

62 Fusion (derives_from): two continuants fuse to form a new continuant. [Diagram labels: C, c at t; C1, c1 at t1; C', c' at t.]

63 Fission (derives_from): one initial continuant is replaced by two successor continuants. [Diagram labels: C, c at t; C1, c1 at t1; C2, c1 at t1.]

64 Budding (derives_from combined with transformation_of): one continuant detaches itself from an initial continuant, which itself continues to exist. [Diagram labels: C, c at t; c at t1; C1, c1 at t.]

65 Capture (derives_from combined with transformation_of): one continuant absorbs a second continuant while itself continuing to exist. [Diagram labels: C, c at t; c at t1; C', c' at t.]

66 ISO “Concept logic” for mereology. Toronto part_of Ontario; brain part_of central nervous system. ISO, “Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies” (ANSI/NISO Z39.19-2005), sees these as examples of the same part_of relation.

67 Instances vs. types. Instance-level relations and type-level relations have logically distinct properties. Type relations are liftings of instance relations.

68 What is symmetric on the level of instances need not be symmetric on the level of types. Adjacency on the instance level is always symmetric.

69 Not, however, on the level of types: seminal vesicle adjacent_to urinary bladder. Not: urinary bladder adjacent_to seminal vesicle.

70 Similarly, on the level of types, while nucleus adjacent_to cytoplasm, it is not the case that cytoplasm adjacent_to nucleus.

71 continuous_with on the instance level is always symmetric. a continuous_with b on the instance level means: there is a fiat boundary between a and b. If a continuous_with b, then b continuous_with a.

72 [No text on this slide.]

73 continuous_with as a relation between types. A continuous_with B =def. for all x: if x instance_of A, then there is some y such that y instance_of B and x continuous_with y.

74 continuous_with is not symmetric on the level of types. Consider lymph node and lymphatic vessel: each lymph node is continuous with some lymphatic vessel, but there are lymphatic vessels (e.g. lymphs and lymphatic trunks) which are not continuous with any lymph nodes.
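The lymph node example can be reproduced on a small data set: the instance-level relation is stored symmetrically, yet the All-Some lifting to types holds in one direction only. The instance names below are hypothetical.

```python
# Hypothetical instances: one lymph node and two lymphatic vessels,
# one of which (a lymphatic trunk) touches no lymph node.
instance_types = {
    "node_1": "lymph node",
    "vessel_1": "lymphatic vessel",
    "trunk_1": "lymphatic vessel",
}
# Instance-level continuity stored as unordered pairs, so it is symmetric by construction.
continuous = {frozenset({"node_1", "vessel_1"})}


def continuous_with(a, b):
    """Instance-level continuous_with; the order of the arguments does not matter."""
    return frozenset({a, b}) in continuous


def instances(type_name):
    """Set of recorded instances of a type."""
    return {x for x, t in instance_types.items() if t == type_name}


def type_continuous_with(a, b):
    """A continuous_with B: every instance of A is continuous with some instance of B."""
    return all(
        any(continuous_with(x, y) for y in instances(b))
        for x in instances(a)
    )


print(continuous_with("node_1", "vessel_1") == continuous_with("vessel_1", "node_1"))  # True
print(type_continuous_with("lymph node", "lymphatic vessel"))  # True
print(type_continuous_with("lymphatic vessel", "lymph node"))  # False: trunk_1 has no node
```

The asymmetry comes entirely from the "all ... some" quantifier pattern, not from the instance-level relation.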

75 3 kinds of binary relations. Between types: human is_a mammal; cell nucleus part_of cell. Between an instance and a type: this human instance_of the type human; this human allergic_to the type penicillin. Between instances: Mary’s heart part_of Mary; Mary’s aorta connected_to Mary’s heart.

76 Linguistic vs. scientific approach to semantic annotation. Semantic annotation can provide support for logical reasoning across the content of scientific literature only if the distinctions between relations at the type level and relations at the instance level are taken into account. (Many?) linguistic accounts of relations do not take account of this distinction.

77 Why not? Because linguistic accounts (like dictionaries) focus on relations between meanings, not on instances in reality. Because linguistic accounts focus on what is meaningfully combinable, rather than on what is logically inferable. Because linguistic accounts focus on relations captured grammatically, not on relations observed experimentally and captured in scientific theories.

78 The Relation Ontology. Barry Smith. Sophia Ananiadou, UK National Centre for Text Mining.

79 Do linguistics and biology truly ever meet?

