
1 Insight into GO and GOA
Angelica Tulipano, INFN Bari CNR
Giulia De Sario, ITB Bari CNR
Andreas Gisel, ITB Bari CNR
EMBRACE Workshop on ‘Applied Gene Ontology’, Bari, Italy, 7-9 November 2007

2 GODB
The GO Database, which comprises both ontology and annotation data, is built from the flat files available on the GO website and can be downloaded in MySQL or RDF/XML format.
termdb: ontologies, definitions and mappings to other databases
assocdb: the above, plus associations to gene products
seqdb: the above, plus protein sequences for some of the gene products
seqdblite: the above, with IEA associations stripped out (this is the version that drives AmiGO)
GO_200709:
3.3 million gene products
more than 100000 organisms
25000 GO terms
14.5 million associations

3 GO tree - path
Total terms: 24955
Number of terms without child (60%)
Number of terms with children (40%)

4 GO tree - path
Total terms: 24955
Number of terms without child: 14974
Number of terms with children: 9981
Number of different paths: …
Average number of paths per end term: 17
Max number of paths per end term: 851
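The slides do not say how the path statistics were computed. As a minimal sketch, assuming the ontology is available as a child-to-parents map, terms without child and the number of distinct paths from each term up to the root could be derived as follows. The toy ontology and all function names are hypothetical, not taken from the GO database schema.

```python
from functools import lru_cache

# Hypothetical toy ontology: term -> list of direct parents.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],
    "d": ["c"],
    "e": ["a"],
}


def end_terms(parents):
    """Terms that never appear as a parent, i.e. terms without child."""
    with_children = {p for ps in parents.values() for p in ps}
    return sorted(t for t in parents if t not in with_children)


def count_paths_to_root(parents):
    """Number of distinct term-to-root paths for every term in the DAG."""

    @lru_cache(maxsize=None)
    def count(term):
        ps = parents[term]
        if not ps:
            return 1  # the root has exactly one (trivial) path
        # Every path through a parent extends to one path for this term.
        return sum(count(p) for p in ps)

    return {t: count(t) for t in parents}


leaves = end_terms(PARENTS)
paths = count_paths_to_root(PARENTS)
paths_per_end_term = [paths[t] for t in leaves]

print("end terms:", leaves)
print("average paths per end term:", sum(paths_per_end_term) / len(leaves))
print("max paths per end term:", max(paths_per_end_term))
```

Memoizing the count avoids enumerating every path explicitly, which matters when a single end term can have hundreds of paths.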

5 GO tree - path
Total terms: 24955
Number of terms without child: 14974
Number of terms with children: 9981
Average length of path: 10
Max length of path: 18
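A path-length summary like the one above can be sketched by enumerating every path from an end term to the root. This is an illustration only: the toy ontology is hypothetical, and path length is assumed to be counted in terms (so the root alone has length 1), which the slides do not state.

```python
# Hypothetical toy ontology: term -> list of direct parents.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],
    "d": ["c"],
    "e": ["a"],
}


def paths_to_root(term, parents):
    """Yield every path from `term` up to the root as a list of terms."""
    ps = parents[term]
    if not ps:
        yield [term]
        return
    for p in ps:
        for rest in paths_to_root(p, parents):
            yield [term] + rest


# End terms are the terms that are nobody's parent.
with_children = {p for ps in PARENTS.values() for p in ps}
end_terms = sorted(t for t in PARENTS if t not in with_children)

# Length of a path = number of terms on it (assumption, see above).
lengths = sorted(
    len(path) for t in end_terms for path in paths_to_root(t, PARENTS)
)

print("path lengths:", lengths)
print("average length:", sum(lengths) / len(lengths))
print("max length:", max(lengths))
```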

6 GO tree - path
GO is very wide and holds a large body of knowledge to associate with gene products; however, its paths are quite short.

7 Gene product description
BCL2_HUMAN
3 million gene products (UniProt) are described by … descriptions

8 Gene product description
BCL2_HUMAN
GODB version go_09_07
Gene products per description (columns: Descriptions | Gene products)
3 million gene products (UniProt) are described by … descriptions

9 GOA Evidence
Description | Code | Category
Inferred by Curator | IC | experimental
Inferred from Direct Assay | IDA | experimental
Inferred from Electronic Annotation | IEA | computational
Inferred from Expression Pattern | IEP | experimental
Inferred from Genetic Interaction | IGI | experimental
Inferred from Mutant Phenotype | IMP | experimental
Inferred from Physical Interaction | IPI | experimental
Inferred from Sequence or Structural Similarity | ISS | computational
Non-traceable Author Statement | NAS | experimental
No biological Data available | ND | --
Inferred from Reviewed Computational Analysis | RCA | computational
Traceable Author Statement | TAS | experimental
Not Recorded | NR | --
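The grouping above can be applied to annotation data with a simple lookup. The sketch below assumes annotations arrive as tab-separated rows in the GO annotation file layout, where the evidence code is the seventh column. The sample rows, identifiers and function names are hypothetical placeholders, and the category map simply mirrors the table on this slide.

```python
from collections import Counter

# Category per evidence code, exactly as grouped on the slide.
EVIDENCE_CATEGORY = {
    "IC": "experimental", "IDA": "experimental", "IEP": "experimental",
    "IGI": "experimental", "IMP": "experimental", "IPI": "experimental",
    "NAS": "experimental", "TAS": "experimental",
    "IEA": "computational", "ISS": "computational", "RCA": "computational",
    "ND": None, "NR": None,  # no category given on the slide
}

# Hypothetical annotation rows (placeholder IDs, not real annotations).
SAMPLE_ROWS = [
    "DB\tGP1\tsym1\t\tGO:A\tREF:1\tIDA\t\tP",
    "DB\tGP2\tsym2\t\tGO:B\tREF:2\tIEA\t\tF",
    "DB\tGP3\tsym3\t\tGO:C\tREF:3\tISS\t\tC",
    "DB\tGP4\tsym4\t\tGO:D\tREF:4\tTAS\t\tP",
]


def evidence_code(row):
    """Evidence code is the seventh tab-separated column (index 6)."""
    return row.split("\t")[6]


def tally_by_category(rows):
    """Count associations per slide category."""
    return Counter(EVIDENCE_CATEGORY.get(evidence_code(r)) for r in rows)


def strip_iea(rows):
    """Drop electronically inferred associations, as seqdblite does."""
    return [r for r in rows if evidence_code(r) != "IEA"]


tally = tally_by_category(SAMPLE_ROWS)
curated = strip_iea(SAMPLE_ROWS)
print(dict(tally))
print(len(curated), "rows left after removing IEA")
```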

10 GOA Evidence
Description | Code | Category | Associations
Inferred by Curator | IC | exp | 388
Inferred from Direct Assay | IDA | exp | 10888
Inferred from Electronic Annotation | IEA | comp | … (…,5%)
Inferred from Expression Pattern | IEP | exp | 392
Inferred from Genetic Interaction | IGI | exp | 145
Inferred from Mutant Phenotype | IMP | exp | 1646
Inferred from Physical Interaction | IPI | exp | 7517
Inferred from Sequence or Structural Similarity | ISS | comp | 16759
Non-traceable Author Statement | NAS | exp | 10811
No biological Data available | ND | -- | …
Inferred from Reviewed Computational Analysis | RCA | comp | 107
Traceable Author Statement | TAS | exp | 19463
Not Recorded | NR | -- | …
Total associations | | | … (…%)

11 GOA p-value
p(term): the number of gene products associated with a term or any of its children, divided by the total number of associations between GO terms and gene products. The smaller p(term) is, the higher the information content and the more detailed the description.
name | level | term-type | count | p(term)
molecular_function | 1 | molecular_function | … | …
biological_process | 1 | biological_process | … | …
cellular_component | 1 | cellular_component | … | …
hormone activity | 4 | molecular_function | … | …
gliogenesis | 4 | biological_process | … | …e-05
cell fate specification | 5 | biological_process | … | …e-05
angiogenesis | 7 | biological_process | … | …e-05
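The definition of p(term) can be sketched in a few lines. This is a minimal illustration under two assumptions not spelled out on the slide: "children" is read as all descendants of a term, and the numerator counts distinct gene products. The toy ontology, the associations and the function names are hypothetical.

```python
# Hypothetical toy ontology: term -> direct children.
CHILDREN = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

# Hypothetical associations: (gene product, GO term) pairs.
ASSOCIATIONS = [
    ("gp1", "D"),
    ("gp2", "B"),
    ("gp3", "C"),
    ("gp1", "C"),
]


def descendants(term, children):
    """Return the term itself plus every term below it in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(children[t])
    return seen


def p_term(term, children, associations):
    """Distinct gene products at or below `term` / total associations."""
    covered = descendants(term, children)
    products = {gp for gp, t in associations if t in covered}
    return len(products) / len(associations)


P = {t: p_term(t, CHILDREN, ASSOCIATIONS) for t in CHILDREN}
for term, p in P.items():
    print(term, p)
```

In this toy example the most specific term, D, gets the smallest p(term), matching the statement that a smaller value means a more detailed description.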

12 GOA delta p-value

13 GOA p-value
One would expect a linear increase of the information content along a path
Re-evaluate annotations and GO term choice according to such studies
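The slides do not give a formula for information content or for the delta between terms. A minimal sketch, assuming the common convention IC = -log10 p(term) and taking the delta as the difference between consecutive terms on a path, shows how the expected linear increase could be checked. The p(term) values, the tolerance and the function names are hypothetical.

```python
import math


def information_content(p):
    """Assumed convention: IC = -log10 p(term)."""
    return -math.log10(p)


def ic_steps(p_values):
    """IC increase between consecutive terms, ordered root to end term."""
    ic = [information_content(p) for p in p_values]
    return [b - a for a, b in zip(ic, ic[1:])]


def is_roughly_linear(steps, tolerance=0.25):
    """True when all steps along the path are about the same size."""
    return max(steps) - min(steps) <= tolerance


# Hypothetical p(term) values along two paths (not taken from GOA).
EVEN_PATH = [0.5, 0.05, 0.005, 0.0005]
UNEVEN_PATH = [0.5, 0.4, 0.0004, 0.0003]

even_steps = ic_steps(EVEN_PATH)
uneven_steps = ic_steps(UNEVEN_PATH)

print("even path steps:", [round(s, 3) for s in even_steps])
print("uneven path steps:", [round(s, 3) for s in uneven_steps])
print("even path linear:", is_roughly_linear(even_steps))
print("uneven path linear:", is_roughly_linear(uneven_steps))
```

A path flagged as uneven has one term that adds far more (or far less) information than its neighbours, which is the kind of case the slide suggests re-evaluating.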

14 GO - GOA
Important knowledge for understanding biological data better
Urgent need to collect and incorporate existing information, especially from non-model organisms
THANKS!

