1 “Penuria nominum” – shortage of words
Knowledge beyond the capacity of language?
by György Surján, ESKI, Hungary
Commentary on Judith Blake, "Beyond Data Integration: Data Management for Knowledge Discovery"
Ontology and Biomedical Informatics, Rome, 29 April – 2 May 2005
2 Overview (a commentary by an outsider):
1. Modern science is analytical
2. Problem of identity
3. Capacity of language is limited
4. Genomics and proteomics deal with extremely large databases
5. Ontologies are bound to reality by language tags
3 1. Modern science is analytical
4 About 90% of mouse and human genes agree. Mouse and human bodies are built from rather similar building blocks. The difference lies not in the building elements but in the way the elements are integrated. The difference is not only phenotypic but also functional: humans are able to create a sculpture such as this one, demonstrating the beauty of the human body.
5 Q1. Is an analytical approach sufficient to explain the differences between living organisms? Would changing 10% of its genes enable a mouse to create sculptures like Michelangelo's?
6 2. The problem of identity
Importance of identity in ontology: entities that have different identity criteria cannot belong to the same class.
7 We have a strong feeling of self-identity throughout our whole life, despite all the changes that happen to us.
8 Identity is independent of similarity and recognition.
9 Identity of genes or proteins
Entities may gain or lose parts without losing their identity. Are genes that lose some nucleotides still identical?
Q2. What are the identity criteria for genes and proteins?
10 Elementary particles have no identity. Humans and developed animals obviously do.
Q3. At which level of organisation does identity emerge? (Do biological macromolecules have identity?)
11 3. Capacity of language is limited
Shepherd in the 19th century: ~300-400 words
Anatomy (intermediate language certificate): ~4000 terms
SNOMED 3.1, Encyclopaedia Britannica: ~120 000 terms
WordNet: ~150 000 strings
UMLS Metathesaurus: > 1 500 000 terms
Our language capacity is huge, but nevertheless finite.
12 Limiting factors:
1. Capacity of the human brain
2. Number of terms shared by a community
13 Example of numbers
English has different names for the first 13 numbers (zero to twelve); after that we use combinations.
hundred = 10²
thousand = 10³
million = 10⁶
billion = 10⁹
…
? = 10⁸⁰
We have a linguistic solution for expressing extremely large numbers, at the price of lost precision:
94869313860999624578839454223454292345623754278394542323452456598564789345634987 ≈ 9.487 × 10⁷⁹ (an 80-digit number)
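The precision loss on this slide can be made concrete with a short sketch. It assumes the long digit string above was transcribed completely; the variable names are illustrative only. An integer written with 80 digits has a leading power of 10⁷⁹, and keeping four significant digits discards the remaining 76.

```python
# Digit string copied from the slide (assumed to be transcribed in full).
DIGITS = "94869313860999624578839454223454292345623754278394542323452456598564789345634987"

value = int(DIGITS)

# Compact "spoken" form: four significant digits in scientific notation.
approx = f"{value:.3e}"

# Number of digits the compact form no longer carries.
digits_dropped = len(DIGITS) - 4

print(f"{len(DIGITS)} digits, written compactly as {approx}")
print(f"digits discarded by the compact form: {digits_dropped}")
```

The compact form is easy to say and compare, but it no longer distinguishes this number from any other number sharing its first four digits and its magnitude.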
14 Up to now, mankind has not met any situation that exhausted the capacity of human language. This is not because the number of things to be expressed was smaller than this capacity, but because we could always find some acceptable compromise. We do not know where the limits of our language capacity lie, but the feeling of this limitation was well known centuries ago (penuria nominum): in the 17th century Harsdörffer proposed a machine with 5 wheels containing 256 syllables, prefixes and suffixes, able to generate about 97 million (mostly nonsense) German words, in order to find the real name of God and also to be able to use different names for all particulars in the world instead of referring to them by the names of their classes (U. Eco: Between La Mancha and Babel).
15 Size of genomics databases:
GO: ~18 000 terms
Human genome: ~30 000 (?) genes
GenBank: ~42 000 000 sequences
16 Are we able to use 42 million names? Q4. Is it possible to describe molecular biology using human language? Is there any other representation tool to be used for that purpose?
17 5. Ontologies are bound to reality by language tags
Formal languages are used to describe structures.
18 [Diagram: IDs and language tags linking Reality and Language]
What is the meaning?
19 Q5. If language fails in genomics and proteomics, is there a need and a possibility for alternative methods of ontology engineering that do not require language?
If ontologies are bound to reality by language, then it is hard to create (or use) an ontology where the problem field exceeds the capacity of language.
20 Summary of questions
Q1. Is an analytical approach sufficient to explain the differences between living organisms?
Q2. What are the identity criteria for genes?
Q3. At which level of organisation does identity emerge? (Do biological macromolecules have identity?)
Q4. Is it possible to describe molecular biology using human language? Is there any other representation tool to be used for that purpose?
Q5. If language fails in genomics and proteomics, is there a need and a possibility for alternative methods of ontology engineering that do not require language?