Using The Gene Ontology: Gene Product Annotation
GO Project Goals
  Compile structured vocabularies describing aspects of molecular biology
  Describe gene products using vocabulary terms (annotation)
  Develop tools:
    to query and modify the vocabularies and annotations
    annotation tools for curators
GO Data
  GO provides two bodies of data:
    Terms with definitions and cross-references
    Gene product annotations with supporting data
The Three Ontologies
  Molecular Function: elemental activity or task
    e.g. nuclease, DNA binding, transcription factor
  Biological Process: broad objective or goal
    e.g. mitosis, signal transduction, metabolism
  Cellular Component: location or complex
    e.g. nucleus, ribosome, origin recognition complex
DAG Structure
  Directed acyclic graph: each child may have one or more parents
The True Path Rule
  Every path from a node back to the root must be biologically accurate
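As a rough illustration of the DAG structure and the true path rule, the sketch below stores a toy graph as a child-to-parents mapping and collects every ancestor of a term. The term names and parent links are illustrative assumptions, not the actual GO graph, and `ancestors` is a hypothetical helper.

```python
# Toy child -> parents mapping. The links are illustrative assumptions,
# not a copy of the real GO graph.
PARENTS = {
    "mitotic chromosome condensation": ["mitosis", "chromosome condensation"],
    "mitosis": ["cell cycle"],
    "chromosome condensation": ["chromosome organization"],
    "cell cycle": ["biological process"],
    "chromosome organization": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            # A node may have several parents, so every one is followed.
            stack.extend(parents.get(node, []))
    return seen
```

Under the true path rule, every term returned by `ancestors` must also be a biologically accurate description of anything annotated to the starting term.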
GO Annotation
  Association between gene product and applicable GO terms
  Provided by member databases
  Made by manual or automated methods
DAG Structure
  Annotate to any level within DAG
    mitosis: S.c. NNF1
      mitotic chromosome condensation: S.c. BRN1, D.m. barren
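Because annotations can sit at any level, a query for a term usually needs to include gene products annotated to its descendants. The sketch below shows one way to do that with the three gene products named on the slide. Only the single mitosis link is modeled, and `products_for` is a hypothetical helper rather than part of any GO tool.

```python
# Child -> parents links; only the one link shown on the slide is modeled.
PARENTS = {"mitotic chromosome condensation": ["mitosis"], "mitosis": []}

# Direct annotations as shown on the slide (species prefix kept in the name).
DIRECT = {
    "mitosis": {"S.c. NNF1"},
    "mitotic chromosome condensation": {"S.c. BRN1", "D.m. barren"},
}


def products_for(term):
    """Gene products annotated to `term` directly or via any descendant term."""
    found = set()
    for annotated_term, products in DIRECT.items():
        # Walk upward from each annotated term; if the walk reaches `term`,
        # those products count as annotated to `term` as well.
        stack, seen = [annotated_term], set()
        while stack:
            node = stack.pop()
            if node == term:
                found |= products
                break
            if node not in seen:
                seen.add(node)
                stack.extend(PARENTS.get(node, []))
    return found
```

Here a query for mitosis would return all three gene products, while a query for mitotic chromosome condensation would return only BRN1 and barren.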
GO Annotation: Data
  Database object: gene or gene product
  GO term ID
  Reference: publication or computational method
  Evidence supporting annotation
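The four fields above could be held in a small record type, sketched below. The class name, field names, and example values are all hypothetical placeholders; the only format check is that a GO term ID is written as "GO:" followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO term IDs are written as "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class Annotation:
    """One gene product annotation (hypothetical record layout)."""

    db_object: str  # gene or gene product identifier
    go_id: str      # GO term ID
    reference: str  # publication or computational method
    evidence: str   # evidence code supporting the annotation, e.g. "IDA"

    def __post_init__(self):
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"malformed GO term ID: {self.go_id!r}")


# Every value here is a placeholder, not a real annotation.
example = Annotation(
    db_object="EXAMPLE_GENE",
    go_id="GO:0000000",
    reference="placeholder reference",
    evidence="IDA",
)
```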
GO Evidence Codes
  From primary literature:
    IDA - Inferred from Direct Assay
    IMP - Inferred from Mutant Phenotype
    IGI - Inferred from Genetic Interaction
    IPI - Inferred from Physical Interaction
    IEP - Inferred from Expression Pattern
  From reviews or introductions:
    TAS - Traceable Author Statement
    NAS - Non-traceable Author Statement
  IC - Inferred by Curator
  ISS - Inferred from Sequence or structural Similarity
  IEA - Inferred from Electronic Annotation (automated)
  ND - Not Determined
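A common use of evidence codes is filtering annotations by how they were derived. The sketch below groups the codes using the slide's three labels. The "other" bucket for codes the slide leaves unlabeled is an assumption, as are the function names and the tuple layout.

```python
# Grouping follows the slide's labels; "other" is an assumed bucket for
# the codes the slide does not label.
EVIDENCE_GROUPS = {
    "primary literature": {"IDA", "IMP", "IGI", "IPI", "IEP"},
    "reviews or introductions": {"TAS", "NAS"},
    "automated": {"IEA"},
    "other": {"IC", "ISS", "ND"},
}


def group_of(code):
    """Return the name of the group that contains an evidence code."""
    for group, codes in EVIDENCE_GROUPS.items():
        if code in codes:
            return group
    raise KeyError(f"unknown evidence code: {code}")


def from_primary_literature(annotations):
    """Keep (gene_product, term, code) tuples backed by primary literature."""
    keep = EVIDENCE_GROUPS["primary literature"]
    return [a for a in annotations if a[2] in keep]
```

For example, a list holding one IDA and one IEA annotation would be reduced to the IDA entry alone.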
GO Annotation: Methods
  Manual
  Automated
    sequence similarity
    transitive annotation
    nomenclature, other text matching
Literature-Based Manual Annotation: Experimental Evidence Codes
  Lecoq, K., et al. (2001) YLR209C encodes Saccharomyces cerevisiae purine nucleoside phosphorylase. J. Bacteriol. 183(16): 4910–4913.
  Experiment 1 - Purification and enzyme assay
    Purified His-tagged Ylr209cp; can convert various nucleoside substrates to bases + Pi; inosine and guanosine are substrates
    FUNCTION: purine nucleoside phosphorylase (IDA)
  Experiment 2 - Knockout of YLR209C
    Null mutant excretes inosine and guanosine into medium (compounds in medium separated by chromatography and identified by HPLC separation profiles)
    PROCESS: purine nucleoside catabolism (IMP)
  This paper has no data for cellular component.
Automated Annotation: InterPro Example
  Diagram nodes: YFP, InterPro entry, GO entry
  InterPro2go links InterPro entries and GO terms
  Run InterProScan to link YFP and InterPro entry
  Infer GO term from the other two links
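The inference step amounts to composing two mappings, which the sketch below illustrates. All accessions and term IDs are made-up placeholders, the dictionaries stand in for InterProScan output and an InterPro2go-style mapping file, and `infer_go_terms` is a hypothetical helper. The IEA code marks the results as electronic annotations.

```python
# Placeholder data: none of these accessions or term IDs are real.
# protein -> InterPro entries, standing in for InterProScan results.
protein_to_interpro = {"YFP": ["IPR_EXAMPLE_1", "IPR_EXAMPLE_2"]}

# InterPro entry -> GO terms, standing in for an InterPro2go-style mapping.
interpro_to_go = {
    "IPR_EXAMPLE_1": ["GO:EXAMPLE_A"],
    "IPR_EXAMPLE_2": ["GO:EXAMPLE_A", "GO:EXAMPLE_B"],
}


def infer_go_terms(protein):
    """Compose protein -> InterPro and InterPro -> GO links into annotations."""
    supporting = {}
    for entry in protein_to_interpro.get(protein, []):
        for go_term in interpro_to_go.get(entry, []):
            # Record which InterPro entries support each inferred term.
            supporting.setdefault(go_term, []).append(entry)
    return [
        (protein, go_term, "IEA", entries)
        for go_term, entries in sorted(supporting.items())
    ]
```

Keeping the supporting InterPro entries with each inferred term preserves the trail back to the computational method, which is what the reference field of an annotation records.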
AmiGO Browser
  [Screenshots: detailed view of term; gene products annotated to term]
GO Annotation: Contributors
  FlyBase
  WormBase
  Saccharomyces Genome Database
  DictyBase
  Mouse Genome Informatics
  Gramene
  The Arabidopsis Information Resource
  Compugen, Inc.
  Swiss-Prot/TrEMBL/InterPro
  Pathogen Sequencing Unit (Sanger Institute)
  PomBase (Sanger Institute)
  Rat Genome Database
  The Institute for Genomic Research
GO Annotation: Organisms
  Fruit fly (Drosophila melanogaster)
  Budding yeast (Saccharomyces cerevisiae)
  Fission yeast (Schizosaccharomyces pombe)
  Human (Homo sapiens)
  Mouse (Mus musculus)
  Rice (Oryza sativa)
  Rat (Rattus norvegicus)
  Tsetse fly (G. morsitans)
  Caenorhabditis elegans
  Arabidopsis thaliana
  Vibrio cholerae
  Dictyostelium discoideum
Current GO Annotations
FlyBase & Berkeley Drosophila Genome Project
WormBase
Saccharomyces Genome Database
DictyBase
Mouse Genome Informatics
Gramene
The Arabidopsis Information Resource
Compugen, Inc.
Swiss-Prot/TrEMBL/InterPro
Pathogen Sequencing Unit (Sanger Institute)
PomBase (Sanger Institute)
Rat Genome Database
Genome Knowledge Base (CSHL)
The Institute for Genomic Research

The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01).
The Gene Ontology project thanks AstraZeneca for financial support.
The Stanford group acknowledges a gift from Incyte Genomics.
Conference: Standards and Ontologies for Functional Genomics (SOFG)
  Towards unified ontologies for describing biology and biomedicine
  17–20 November 2002
  Hinxton Hall Conference Centre, Hinxton, Cambridge, UK