Introduction to GO Annotation Eurie Hong (SGD), Michelle Gwinn (TIGR), Tanya Berardini (TAIR), Karen Pilcher (DictyBase), Russell Collins (FlyBase), Carol Bastiani (WormBase), Doug Howe (ZFIN), Stacia Engel (SGD)

What is a GO annotation?
Gene (protein-coding gene, functional RNA)
GO term
Evidence code: IMP, IGI, IPI, ISS, IDA, IEP, TAS, NAS, ND, RCA, IC
Qualifiers: NOT, contributes_to, colocalizes_with
With/From: supporting evidence for certain evidence codes
References
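The parts of an annotation listed above map naturally onto a small record type. The following Python sketch is only an illustration: the class name `GOAnnotation` and its field names are assumptions made for this example, not a GO Consortium file format. The evidence codes and qualifiers are the ones listed on the slide, and the sample record uses the YakA paper discussed later in this presentation.

```python
from dataclasses import dataclass

# Evidence codes and qualifiers exactly as listed on the slide.
EVIDENCE_CODES = {"IMP", "IGI", "IPI", "ISS", "IDA", "IEP",
                  "TAS", "NAS", "ND", "RCA", "IC"}
QUALIFIERS = {"NOT", "contributes_to", "colocalizes_with"}


@dataclass(frozen=True)
class GOAnnotation:
    """Hypothetical record; the field names are illustrative only."""
    gene: str                 # protein-coding gene or functional RNA
    go_term: str              # GO term name
    evidence_code: str        # one of EVIDENCE_CODES
    reference: str            # paper that supports the annotation
    qualifiers: tuple = ()    # zero or more of QUALIFIERS
    with_from: str = ""       # supporting evidence for certain codes

    def __post_init__(self):
        # Reject codes and qualifiers that are not on the slide's lists.
        if self.evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence_code}")
        unknown = set(self.qualifiers) - QUALIFIERS
        if unknown:
            raise ValueError(f"unknown qualifiers: {sorted(unknown)}")


# The YakA example from this presentation, expressed as a record.
yaka = GOAnnotation(
    gene="YakA",
    go_term="protein kinase activity",
    evidence_code="IDA",
    reference="Souza et al. (1998)",
)
```

Validating the evidence code and qualifiers at construction time is a design choice for this sketch; a real curation database may enforce these constraints elsewhere.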

What is an annotation?
Strategies for identifying literature to annotate
Identifying the correct annotation
– Molecular Function
– Biological Process
– Cellular Component
Extent of annotation for a single gene product
Strategies for annotating a genome

Which type of literature is appropriate for annotation?
Papers with experimental evidence for GO process, function or component annotation
– Mutant phenotype descriptions
– Enzymatic activity assays
– Localization studies
Papers describing phylogenetic studies for GO function annotation (ISS)
Reviews
(Textbooks)
(Meeting abstracts)

Strategies for reading a paper for annotation: Abstract, Results/Figures, Materials and Methods, Discussion

Which granularity of GO term is appropriate for annotation? Molecular Function: Souza et al. (1998), “YakA, a protein kinase required for the transition from growth to development in Dictyostelium.” PMID:

Background: YakA was identified as a developmental mutant. YakA is an ortholog of the yeast Yak1p. The protein kinase domain of YakA is similar to both serine/threonine kinases and tyrosine kinases. PMID:

YakA belongs to the DYRK family: YakA is a member of the DYRK (dual-specificity tyrosine-phosphorylation-regulated kinase) family of protein kinases.

The Experiment: Assay for YakA protein kinase activity. YakA + [γ-32P]ATP + MBP (substrate). Look for the presence of 32P in the substrate in the presence of YakA. PMID:

The Result PMID:

GO Term for Annotation
protein kinase activity ; GO:
MBP (myelin basic protein) is a generic substrate.
Kinase specificity was not determined; for example, no phosphotyrosine antibodies were used.
Definition: Catalysis of the transfer of a phosphate group, usually from ATP, to a protein substrate.

Searching for Terms in DAG-Edit
Search term name that contains:
– kinase: 359 results
– protein kinase: 60 results
– protein kinase activity: 20 results
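The narrowing effect of a more specific search string can be reproduced with a plain substring match. This is a minimal sketch, not DAG-Edit's actual search code: the function name `search_term_names` and the five-term sample list are assumptions for illustration, so the counts will not match the slide's.

```python
def search_term_names(term_names, query):
    """Return the term names that contain `query`, ignoring case."""
    q = query.lower()
    return [name for name in term_names if q in name.lower()]


# Tiny hand-picked sample of term names mentioned in this presentation;
# a real ontology file would hold many thousands of terms.
SAMPLE_TERMS = [
    "protein kinase activity",
    "protein serine/threonine kinase activity",
    "protein tyrosine kinase activity",
    "GTPase activity",
    "germ cell migration",
]

# Each longer query string can only match the same terms or fewer.
for query in ("kinase", "protein kinase", "protein kinase activity"):
    hits = search_term_names(SAMPLE_TERMS, query)
    print(f"{query!r}: {len(hits)} result(s)")
```

Note that a contiguous substring match for "protein kinase" does not find "protein tyrosine kinase activity"; browsing sibling and child terms, as on the following slides, covers such cases.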

Search Output in DAG-Edit

Sibling Terms in DAG-Edit

Child Terms in DAG-Edit

Parent Terms in AmiGO

Evidence Code: The evidence code for the protein kinase activity term is IDA (Inferred from Direct Assay). Although endogenous substrates were not tested, the authors clearly showed kinase activity with a direct assay.

Granular Terms Using ISS (Inferred from Sequence or structural Similarity)
protein serine/threonine kinase activity ; GO:
protein tyrosine kinase activity ; GO:

How is Biological Process different from Molecular Function?
Molecular Function: “Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.”
– Is about the protein.
– Molecular functions are the activities that a protein specifically and directly does.
Biological Process: “A phenomenon marked by changes that lead to a particular result, mediated by one or more gene products.”
– Is about the organism.
– Biological processes are what the organism uses those activities for.
For example: Rho1 has GTPase activity, and the organism uses that activity for gastrulation, axon guidance, germ cell migration, etc. By analogy, a hammer hammers nails (function) and builds houses (process).

Important points: The process is a migration of germ (pole) cells. It is the movement of cells from one side of the epithelium to the other. It is one step in a three-step process.

Is a new term needed?

A new term might be appropriate because it would describe a discrete, separable process, thus providing additional useful information to the user. Also, new terms would permit linking two similar processes that are currently separate in GO but are connected in the literature.
cell migration
– (is a) transepithelial cell migration
– (is a) pole cell transepithelial migration
– (is a) cellular extravasation
cell migration
– (is a) germ cell migration
– (is a) pole cell migration
– (part of) pole cell transepithelial migration
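How a single new term can link two branches is easiest to see by walking the graph upward. The sketch below encodes the proposed hierarchy as (child, relationship, parent) edges and collects all ancestors of a term. The edge list is one reading of the flattened slide (in particular, the placement of cellular extravasation is assumed), and the names `EDGES`, `PARENTS` and `ancestors` are hypothetical.

```python
from collections import defaultdict

# (child, relationship, parent) edges; an assumed reading of the slide.
EDGES = [
    ("transepithelial cell migration", "is_a", "cell migration"),
    ("pole cell transepithelial migration", "is_a",
     "transepithelial cell migration"),
    # Placement of this term under transepithelial migration is assumed.
    ("cellular extravasation", "is_a", "transepithelial cell migration"),
    ("germ cell migration", "is_a", "cell migration"),
    ("pole cell migration", "is_a", "germ cell migration"),
    ("pole cell transepithelial migration", "part_of",
     "pole cell migration"),
]

# Map each term to its direct parents, whatever the relationship type.
PARENTS = defaultdict(set)
for child, _relationship, parent in EDGES:
    PARENTS[child].add(parent)


def ancestors(term):
    """Return every term reachable by following edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


linked = ancestors("pole cell transepithelial migration")
```

Under these assumed edges, `linked` contains terms from both the transepithelial branch and the germ cell branch, which is the connection the slide argues for.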

Annotating to the Cellular Component Ontology Carol Bastiani, Caltech

Experiment: Immunolocalization of LIN-10 with a LIN-10 antibody.

Localization of LIN-10 by immunofluorescence: Vulval epithelial cells can be distinguished from ventral cord neurons by their larger size and the presence of stained cell junctions (red).

Figure 7. LIN-10 is expressed in neurons. (A-C) Wild-type, late L3 hermaphrodite stained with anti-LIN-10 antibodies (green). LIN-10 is present in ventral cord processes (A, *), lateral neural cell bodies and processes (A and B, arrowheads), and dorsal cord processes

Search MGI GO Browser for neuron:

Choosing the evidence code:

Further subcellular localization of LIN-10: In neural cell bodies, a small amount of LIN-10 appears diffusely throughout the cytoplasm, whereas the majority of LIN-10 is concentrated in discrete perinuclear structures (Figure 7, D and E), similar to perinuclear structures observed in vulval epithelial cells. To determine whether these perinuclear structures correspond to Golgi, we used ST-GFP as a marker for the trans-cisterna of the Golgi (Jamora et al., 1997). We expressed ST-GFP in transgenic worms using a heat shock promoter and examined the subcellular localization of LIN-10 and ST-GFP using anti-LIN-10 and anti-GFP antibodies. In single neurons expressing both endogenous LIN-10 and transgenic ST-GFP, the subcellular pattern of LIN-10 staining is similar to that of ST-GFP staining. Deconvolution of images obtained in double-staining experiments revealed that LIN-10 staining is closely associated with ST-GFP staining (Figure 7, F-I), but LIN-10 staining is consistently offset (by µm) from ST-GFP staining. These results indicate that LIN-10 is localized in the trans-cisterna of the Golgi or is localized in a compartment closely associated with the trans-cisterna, such as the trans-Golgi network.

LIN-10 is localized to: 1) Cytoplasm 2) Within or in association with a part of the Golgi apparatus/ in close association with the trans-cisterna or trans-Golgi network

1) Annotate to cytoplasm:

LIN-10 is localized to: 1) Cytoplasm 2) Within or in association with a part of the Golgi apparatus/ in close association with the trans-cisterna or trans-Golgi network

2) Annotate to Golgi apparatus, evidence code IDA:

Qualifier to use “when the resolution of the assay is not accurate enough to say that the gene product is a bona fide component member”:

Strategies for annotation of a genome
1. How to get a complete set of GO annotations
2. Updating GO annotations
3. Representative approaches

Strategies for annotation of a genome: How to get a complete set of GO annotations
Complete a first pass
– For all 3 aspects (MF, BP, CC)
– For all genes that get GO annotations
  Proteins, RNAs, pseudogenes
  NOT centromeres, telomeres, LTRs, retrotransposons, ARSs
– Unknowns are allowed
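The first-pass criteria above amount to a coverage check: every annotatable feature should have at least one annotation in each of the three aspects, with "unknown" counting as an annotation. A minimal sketch follows, assuming a hypothetical in-memory layout (a dict of feature name to feature type, and a list of (name, aspect, term) tuples); the feature names are invented placeholders.

```python
ASPECTS = ("MF", "BP", "CC")

# Feature types that receive GO annotations, per the slide.
# Centromeres, telomeres, LTRs, retrotransposons and ARSs are excluded.
ANNOTATABLE_TYPES = {"protein", "RNA", "pseudogene"}


def first_pass_gaps(features, annotations):
    """List (feature, aspect) pairs that still lack any annotation.

    features:    dict mapping feature name -> feature type
    annotations: iterable of (feature name, aspect, term) tuples;
                 an "unknown" term counts as covered.
    """
    covered = {(name, aspect) for name, aspect, _term in annotations}
    gaps = []
    for name, feature_type in features.items():
        if feature_type not in ANNOTATABLE_TYPES:
            continue  # not a feature that gets GO annotations
        for aspect in ASPECTS:
            if (name, aspect) not in covered:
                gaps.append((name, aspect))
    return gaps


# Invented placeholder data, not taken from any real database.
features = {"geneA": "protein", "CEN1": "centromere"}
annotations = [
    ("geneA", "MF", "unknown"),
    ("geneA", "BP", "germ cell migration"),
]
gaps = first_pass_gaps(features, annotations)
```

Here `gaps` reports only the missing Cellular Component annotation for `geneA`; the centromere is skipped because it is not an annotatable feature type.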

Strategies for annotation of a genome: Updating the complete set of GO annotations
Second pass
– Replace unknowns
– Update where IEA was used
  Info with “better” evidence code, if available
– Update where other databases are referenced
  Primary literature is preferred
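The second-pass rules can be read as a filter that flags annotations worth revisiting. The sketch below makes two assumptions that are not stated on the slide: that "unknown" annotations carry the ND evidence code, and that a reference beginning with `PMID:` indicates primary literature while anything else points to another database. The function name `second_pass_reasons` is hypothetical.

```python
def second_pass_reasons(evidence_code, reference):
    """Return the reasons, if any, to revisit an annotation."""
    reasons = []
    # Assumption: unknowns are recorded with the ND evidence code.
    if evidence_code == "ND":
        reasons.append("replace unknown")
    if evidence_code == "IEA":
        reasons.append("look for a better evidence code")
    # Assumption: a PMID-style reference means primary literature.
    if not reference.startswith("PMID:"):
        reasons.append("cite primary literature")
    return reasons


# An electronically inferred annotation imported from elsewhere
# is flagged on two counts.
example = second_pass_reasons("IEA", "imported from another database")
```

An empty list means the annotation already satisfies these three checks; it says nothing about whether a more specific term has since become available.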

Strategies for annotation of a genome: Updating GO annotations - ongoing
GO annotations will never be “done”
Part of normal curation process
– More specific information
– Better evidence code
Replace obsolete terms
“Last reviewed” date

Strategies for annotation of a genome: Updating GO annotations - ongoing

Strategies for annotation of a genome: Representative approaches