Introduction to the Gene Ontology GO Workshop 3-6 August 2010.

Introduction to GO
- GO and the GO Consortium (GOC)
- What the GOC does (and doesn’t do)
- GO Groups
- Working groups
- GO Wiki
- Dilemma: annotation strategies
- Sources for GO

The GO Consortium began in 1998 as a collaboration between FlyBase (Drosophila), the Saccharomyces Genome Database (SGD) and the Mouse Genome Database (MGD).
GO Consortium groups are actively involved in developing the GO, providing annotations and supporting use of the GO.

The GO Consortium provides:
- central repository for ontology updates and annotations
- central mechanism for changing GO terms (adding, editing, deleting)
- quality checking for annotations
- consistency checks for how annotations are made by different groups
- central source of information for users
- co-ordination of annotation effort

GO Consortium and GO Groups:
- groups decide gene product set to annotate
- biocurator training
- tool development mostly by groups
- many non-consortium groups
- education and training by groups
- outreach to biocurators/databases by GOC

GO Working Groups:

Information about:
- Development projects
- Meetings
- Annotation projects
- Changes to the GO

The Annotation Dilemma
- Exponential increase in biological data
- More important than ever to provide annotation for this data
- How to keep up?

Annotation Strategy
Experimental data:
- Many species have a body of published, experimental data
- Detailed, species-specific annotation: ‘depth’
- Requires manual annotation of literature → slow
Computational analysis:
- Can be automated → faster
- Gives ‘breadth’ of coverage across the genome
- Annotations are general
- Relatively few annotation pipelines

- GO & PO: literature annotation for rice; computational annotation for rice, maize, sorghum, Brachypodium.
- 1. Literature annotation for Agrobacterium tumefaciens, Dickeya dadantii, Magnaporthe grisea, Oomycetes. 2. Computational annotation for Pseudomonas syringae pv tomato, Phytophthora spp and the nematode Meloidogyne hapla.
- Literature annotation for chicken, cow, maize, cotton; computational annotation for agricultural species & pathogens.
- Literature annotation for human; computational annotation for UniProtKB entries (237,201 taxa).

Community Annotation
- Researchers are the domain experts – but relatively few contribute to annotation
  - time
  - 'reward' & 'employer/funding agency recognition'
  - training – easy to use tools, clear instructions
- Required submission
- Community annotation
  - Groups with special interest do focused annotation or ontology development
  - As part of a meeting/conference or distributed (eg. wikis)
- Students!

Releasing GO Annotations
- GO annotations are stored at individual databases
- Sanity checks as data is entered – is all the required data filled in?
- Databases do quality control (QC) checks and submit to GO
- GO Consortium runs additional QC and collates annotations
- Checked annotations are picked up by GO users, eg. public databases, genome browsers, array vendors, GO expression analysis tools

AgBase Quality Checks & Releases
[Flow diagram. Labels: AgBase biocurators; AgBase biocuration interface; AgBase database; ‘sanity’ check & GOC QC; EBI GOA Project; GO Consortium database; UniProt db; QuickGO browser; public databases; AmiGO browser; GO analysis tools; microarray developers.]
‘sanity’ check: checks to ensure all appropriate information is captured, no obsolete GO:IDs are used, etc.
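
A ‘sanity’ check of the kind described above could look roughly like the following Python sketch. It is an illustration only, not the AgBase or GOC implementation: the record layout, the field names in REQUIRED_FIELDS and the function name sanity_check are all hypothetical, and the caller is assumed to supply the set of obsolete GO IDs from a current ontology release.

```python
import re

# Hypothetical required fields; each database defines its own record layout.
REQUIRED_FIELDS = ("gene_product", "go_id", "evidence_code", "reference")

# GO identifiers have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


def sanity_check(record, obsolete_ids):
    """Return a list of problems found in one annotation record (empty if none)."""
    problems = []

    # Is all the required data filled in?
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing {field}")

    # Is the GO ID well formed, and is it still a current (non-obsolete) term?
    go_id = record.get("go_id") or ""
    if go_id and not GO_ID_PATTERN.match(go_id):
        problems.append(f"malformed GO ID: {go_id}")
    elif go_id in obsolete_ids:
        problems.append(f"obsolete GO ID: {go_id}")

    return problems


# Illustrative record with made-up values.
example = {
    "gene_product": "P12345",
    "go_id": "GO:0005634",
    "evidence_code": "IDA",
    "reference": "",
}
print(sanity_check(example, obsolete_ids=set()))
```

Returning a list of problems, rather than stopping at the first one, lets a biocurator fix everything in a record in a single pass.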

Sources of GO
1. Primary sources of GO: from the GO Consortium (GOC) & GOC members
- most up to date
- most comprehensive
2. Secondary sources: other resources that use GO provided by GOC members
- public databases (eg. NCBI, UniProtKB)
- genome browsers (eg. Ensembl)
- array vendors (eg. Affymetrix)
- GO expression analysis tools

Sources of GO annotation
- Different tools and databases display the GO annotations differently.
- Since GO terms are continually changing and GO annotations are continually added, you need to know when GO annotations were last updated.

Secondary Sources of GO annotation
EXAMPLES:
- public databases (eg. NCBI, UniProtKB)
- genome browsers (eg. Ensembl)
- array vendors (eg. Affymetrix)
CONSIDERATIONS:
- What is the original source?
- When was it last updated?
- Are evidence codes displayed?
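
When annotations are available as a gene association file, the three considerations above (original source, last update, evidence codes) can be checked directly. The Python sketch below assumes a tab-delimited file in the GO Annotation File (GAF) layout, where column 7 is the evidence code, column 14 the date (YYYYMMDD) and column 15 the assigning group; the function name summarize_gaf is hypothetical, and files from secondary sources may not follow this layout.

```python
from collections import Counter


def summarize_gaf(lines):
    """Summarize source, most recent date and evidence codes for GAF-style lines."""
    sources = Counter()
    evidence = Counter()
    latest = ""

    for line in lines:
        # Lines starting with "!" are header or comment lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            continue  # not a complete annotation row under the assumed layout

        evidence[cols[6]] += 1            # column 7: evidence code
        sources[cols[14]] += 1            # column 15: assigned by
        latest = max(latest, cols[13])    # column 14: YYYYMMDD sorts as text

    return {
        "assigned_by": dict(sources),
        "evidence_codes": dict(evidence),
        "latest_date": latest,
    }
```

Called with an open file handle, for example summarize_gaf(open(path)) with a path of your own, it shows at a glance who assigned the annotations, how recent the newest one is, and whether they are mostly electronic (IEA) or supported by experimental evidence codes.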

Differences in displaying GO annotations: secondary/tertiary sources.

Learning more about the GO
At the GO Consortium website:
- FAQs
- Mailing groups
- Tools that use GO
- News about changes and updates
- Publications