Extending the GO model: OBO, the Open Biology Ontologies, aka "extended GO" (EGO)
OBO: obo.sf.net
The aims of SO
1. Develop a shared set of terms and concepts to annotate biological sequences.
2. Apply these in our separate projects to provide consistent query capabilities between them.
3. Provide a software resource to assist in the application and distribution of SO.
What is a pseudogene?
Human
– Sequence similar to a known protein but containing frameshift(s) and/or stop codons that disrupt the ORF.
Neisseria
– A gene that is inactive, but may be activated by translocation (e.g. by gene conversion) to a new chromosome site.
– Note: such a gene would be called a "cassette" in yeast.
Give me all the dicistronic genes
Define a dicistronic gene in terms of the cardinality of the transcript to open-reading-frame relationship and the spatial arrangement of open-reading frames.
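A minimal sketch of how such a query might look, assuming a gene counts as dicistronic when at least one of its transcripts carries exactly two open-reading frames (the cardinality condition) that do not overlap (the spatial condition). The class names, field names, coordinates and example genes are all hypothetical and are not taken from SO.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transcript:
    name: str
    # ORFs as (start, end) half-open coordinates on the transcript (assumed layout)
    orfs: List[Tuple[int, int]]


@dataclass
class Gene:
    name: str
    transcripts: List[Transcript]


def is_dicistronic(transcript: Transcript) -> bool:
    """Cardinality: exactly two ORFs. Spatial arrangement: they do not overlap."""
    if len(transcript.orfs) != 2:
        return False
    (start1, end1), (start2, end2) = sorted(transcript.orfs)
    return end1 <= start2


def dicistronic_genes(genes: List[Gene]) -> List[str]:
    """Names of genes with at least one dicistronic transcript."""
    return [g.name for g in genes if any(is_dicistronic(t) for t in g.transcripts)]


# Hypothetical example data
genes = [
    Gene("geneA", [Transcript("A-RA", [(10, 310), (400, 700)])]),  # two separate ORFs
    Gene("geneB", [Transcript("B-RA", [(5, 605)])]),               # one ORF
    Gene("geneC", [Transcript("C-RA", [(10, 310), (200, 500)])]),  # two overlapping ORFs
]

print(dicistronic_genes(genes))  # ['geneA']
```

The point of the sketch is that the query needs no free-text label: the answer falls out of the relationships themselves, provided they are recorded consistently.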
ISA: 927 relationships
PARTOF: 186 relationships (holonym / meronym)
Classical Extensional Mereology
The formal properties of parts:
1. If A is a proper part of B then B is not a part of A (nothing is a proper part of itself).
2. If A is a part of B and B is a part of C then A is a part of C.
Because of these rules, we can apply some functions to parts…
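The two rules can be checked mechanically on a set of part-of assertions. The following is a rough sketch, assuming the assertions are held as (part, whole) pairs; the example pairs are illustrative only and the function names are invented for this sketch.

```python
def transitive_closure(pairs):
    """Rule 2: if A is part of B and B is part of C, then A is part of C."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_asymmetric(pairs):
    """Rule 1: no pair may appear reversed, so nothing is a proper part of itself."""
    return all((b, a) not in pairs for a, b in pairs)


# Illustrative (part, whole) assertions
part_of = {("exon", "transcript"), ("transcript", "gene")}

closed = transitive_closure(part_of)
print(("exon", "gene") in closed)  # True: inferred by transitivity
print(is_asymmetric(closed))       # True: no cycles in the part-of graph
```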
Extensional Mereology (EM): a formal theory of parts
EM operation: Definition
Overlap (x○y): x and y overlap if they have a part in common.
Disjoint (xιy): x and y are disjoint if they share no parts in common.
Binary Product (x.y): The parts that x and y share in common.
Difference (x–y): The largest portion of x which has no part in common with y.
Binary Sum (x+y): The set consisting of individuals x and y.
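For sequence features, these operations map naturally onto set operations. The sketch below assumes each feature is modelled as the set of base positions it covers, so that a "part" is any subset of positions; that modelling choice, the function names and the coordinates are assumptions made for illustration.

```python
def overlap(x, y):
    """x and y overlap if they have a part in common."""
    return bool(x & y)


def disjoint(x, y):
    """x and y are disjoint if they share no parts."""
    return not (x & y)


def product(x, y):
    """The parts that x and y share."""
    return x & y


def difference(x, y):
    """The largest portion of x with no part in common with y."""
    return x - y


def binary_sum(x, y):
    """Everything covered by x or by y."""
    return x | y


# Hypothetical features, as sets of base positions
exon1 = frozenset(range(100, 200))
exon2 = frozenset(range(150, 250))
exon3 = frozenset(range(300, 400))

print(overlap(exon1, exon2))          # True
print(disjoint(exon1, exon3))         # True
print(len(product(exon1, exon2)))     # 50 shared positions
print(len(difference(exon1, exon2)))  # 50 positions unique to exon1
print(len(binary_sum(exon1, exon2)))  # 150 positions in total
```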
Exon distribution to transcripts, Drosophila chromosome 4
Exon part of single transcript: 285
Exon in all transcripts: 243 (52%)
Exon in one transcript: 148 (32%)
Exon in > 1 but < all: 74 (16%)
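The percentages above are consistent with a denominator of 465 (243 + 148 + 74). The sketch below checks that arithmetic and shows one way the categories could be assigned. It assumes that the first row counts exons of genes with only one transcript and that the other three rows cover genes with several transcripts; that reading, the function name and the category strings are assumptions, not something the slide states.

```python
def classify_exon(n_transcripts_with_exon, n_transcripts_in_gene):
    """Assign an exon to one of the four categories (assumed interpretation)."""
    if n_transcripts_in_gene == 1:
        return "part of single transcript"
    if n_transcripts_with_exon == n_transcripts_in_gene:
        return "in all transcripts"
    if n_transcripts_with_exon == 1:
        return "in one transcript"
    return "in > 1 but < all"


# Counts taken from the slide
counts = {
    "in all transcripts": 243,
    "in one transcript": 148,
    "in > 1 but < all": 74,
}
total = sum(counts.values())  # 465
percentages = {name: round(100 * n / total) for name, n in counts.items()}

print(total)        # 465
print(percentages)  # {'in all transcripts': 52, 'in one transcript': 32, 'in > 1 but < all': 16}
print(classify_exon(2, 3))  # in > 1 but < all
```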
Anatomy Ontologies
For the representation of phenotypic and expression data.
Now available for: Drosophila, Mus, C. elegans, Arabidopsis, Oryza, …
The need for a (bio)chemical ontology.
CAS - commercial & expensive.
LIGAND - no internal structure.
MESH - semantically weak, very biased towards pharmaceutical agents.
ChEBI - in development at EBI - 1st release was June 2004.
Tissue, cell & pathology ontologies.
Medical ontologies, e.g. SNOMED, are (a) commercial and (b) designed not for research, but for billing.
The next challenge: a syntax and semantics for the description of phenotypic data.
[Diagram: entity, attribute, value; relation labels: has, describes]
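One plausible reading of the entity / attribute / value labels is that an entity has an attribute and a value describes that attribute. The sketch below records a phenotype statement on that assumption; the class name, field names and example terms are hypothetical and do not represent an agreed syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeStatement:
    entity: str     # what is being described, e.g. an anatomy ontology term
    attribute: str  # the attribute the entity has
    value: str      # the value describing that attribute

    def as_text(self) -> str:
        return f"{self.entity} has {self.attribute}; {self.attribute} is described by {self.value}"


# Hypothetical example statement
statement = PhenotypeStatement(entity="eye", attribute="color", value="white")
print(statement.as_text())  # eye has color; color is described by white
```

Keeping the three parts as separate fields means each can be drawn from its own ontology and queried independently.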
Thank yous
Berkeley
– Chris Mungall, John Richter, Brad Marshall
Insightful biologists
– Midori Harris, David Hill, Bernard de Bono
My co-founders
– Suzanna Lewis, Judith Blake, and Mike Cherry
The GO Editorial Team at the EBI
– Midori Harris, Jane Lomax, Amelia Ireland & Jennifer Clark
SO
– Karen Eilbeck, Mark Yandell
And many, many more…
Gene Ontology Consortium
The Pathogen Group
Schizosaccharomyces pombe Genome Sequencing Project
DictyBase