Lecture Four: GO: The Gene Ontology. Infrastructure for Systems Biology



S. cerevisiae

D. melanogaster

C. elegans
Cells that normally survive: CED-9 ON; CED-3 and CED-4 OFF
Cells that normally die: CED-9 OFF; CED-3 and CED-4 ON

M. musculus

Comparison of sequences from 4 organisms
MCM3, MCM2, CDC46/MCM5, CDC47/MCM7, CDC54/MCM4, MCM6
These proteins form a hexamer in the species that have been examined.

A Common Language for Annotation of Genes from Yeast, Flies and Mice The Gene Ontologies …and Plants and Worms …and Humans …and anything else!

Gene Ontology
- FlyBase: Drosophila (Cambridge, EBI, Harvard, Berkeley & Bloomington)
- SGD: Saccharomyces (Stanford)
- MGI: Mus (Jackson Labs., Bar Harbor)

Gene Ontology: now
- Fruitfly: FlyBase
- Budding yeast: Saccharomyces Genome Database (SGD)
- Mouse: Mouse Genome Database (MGD & GXD)
- Rat: Rat Genome Database (RGD)
- Weed: The Arabidopsis Information Resource (TAIR)
- Worm: WormBase
- Dictyostelium discoideum: dictyBase
- InterPro/UniProt at EBI: InterPro
- Fission yeast: PomBase
- Human: UniProt, Ensembl, NCBI, Incyte, Celera, Compugen
- Parasites (Plasmodium, Trypanosoma, Leishmania): GeneDB, Sanger
- Microbes (Vibrio, Shewanella, B. anthracis, …): TIGR
- Grasses (rice & maize): Gramene database
- Zebrafish: ZFIN

To provide structured controlled vocabularies for the representation of biological knowledge in biological databases.

- Be open source
- Use open standards
- Make data & code available without constraint
- Involve your community

Gene Ontology Objectives
- GO represents concepts used to classify specific parts of our biological knowledge:
  - Biological Process
  - Molecular Function
  - Cellular Component
- GO develops a common language applicable to any organism
- GO terms can be used to annotate gene products from any species, allowing comparison of information across species

GO: Three ontologies
For a gene product:
- What does it do? Molecular Function
- Where does it act? Cellular Component
- What processes is it involved in? Biological Process

Content of GO
- Molecular Function: 7,309 terms
- Biological Process: 10,041 terms
- Cellular Component: 1,629 terms
- Total: 18,975 terms
- Definitions: 94.9%
- Obsolete terms: 992

What's in a GO term?
term: gluconeogenesis
id: GO:
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.

Annotation of gene products with GO terms
Mitochondrial P450

Cellular component: mitochondrial inner membrane GO:
Biological process: electron transport GO:
Molecular function: monooxygenase activity GO:
Diagram labels: substrate + O2 = CO2 + H2O; product

Other gene products annotated to monooxygenase activity (GO: )
- monooxygenase, DBH-like 1 (mouse)
- prostaglandin I2 (prostacyclin) synthase (mouse)
- flavin-containing monooxygenase (yeast)
- ferulate-5-hydroxylase 1 (Arabidopsis)

What's in a name?
- Glucose synthesis
- Glucose biosynthesis
- Glucose formation
- Glucose anabolism
- Gluconeogenesis
All refer to the process of making glucose from simpler components.
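The synonym problem on this slide can be sketched in Python as a lookup table that resolves every name to one canonical term. The names are the ones listed on the slide; the function names `build_lookup` and `canonical_term` and the case-insensitive matching are illustrative assumptions, not part of GO itself.

```python
# Names taken from the slide; all of them describe the same process.
SYNONYMS = {
    "gluconeogenesis": [
        "glucose synthesis",
        "glucose biosynthesis",
        "glucose formation",
        "glucose anabolism",
    ],
}


def build_lookup(synonyms):
    """Map every name (canonical or synonym), lowercased, to its canonical term."""
    lookup = {}
    for canonical, names in synonyms.items():
        lookup[canonical.lower()] = canonical
        for name in names:
            lookup[name.lower()] = canonical
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def canonical_term(name):
    """Return the canonical term for a name, or None if the name is unknown."""
    return LOOKUP.get(name.strip().lower())


if __name__ == "__main__":
    for query in ("Glucose anabolism", "Glucose formation", "glycolysis"):
        print(query, "->", canonical_term(query))
```

With a table like this, two databases that use different names for the same process can still be searched with a single query term.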

Figure labels: tree; directed acyclic graph

Parent-Child Relationships
Nucleus
- Nucleoplasm
- Nuclear envelope
- Chromosome
- Perinuclear space
- Nucleolus
A child is a subset of a parent's elements.
The cell component term Nucleus has 5 children.

Ontology Relationships Directed Acyclic Graph
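The difference between a tree and a directed acyclic graph can be made concrete with a small Python sketch. Each graph is stored as a mapping from a term to its list of parents. The node names (`root`, `a`, `b`, `c`) and the function names are hypothetical and exist only for illustration.

```python
def is_tree(parents):
    """A tree allows at most one parent per node."""
    return all(len(p) <= 1 for p in parents.values())


def is_acyclic(parents):
    """Follow child -> parent edges depth-first.

    Reaching a node that is still being visited means the edges form a cycle.
    """
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False
        state[node] = "visiting"
        ok = all(visit(p) for p in parents.get(node, ()))
        state[node] = "done"
        return ok

    return all(visit(node) for node in parents)


# Hypothetical graphs: in DAG, node "c" has two parents, which a tree forbids.
TREE = {"root": [], "a": ["root"], "b": ["root"], "c": ["a"]}
DAG = {"root": [], "a": ["root"], "b": ["root"], "c": ["a", "b"]}
CYCLE = {"a": ["b"], "b": ["a"]}

if __name__ == "__main__":
    for name, graph in (("TREE", TREE), ("DAG", DAG), ("CYCLE", CYCLE)):
        print(name, "tree:", is_tree(graph), "acyclic:", is_acyclic(graph))
```

A term with several parents is what lets one GO term sit under more than one broader concept, while the acyclic constraint keeps a term from becoming its own ancestor.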

Evidence Codes for GO Annotations

- IEA: Inferred from Electronic Annotation
- ISS: Inferred from Sequence Similarity
- IEP: Inferred from Expression Pattern
- IMP: Inferred from Mutant Phenotype
- IGI: Inferred from Genetic Interaction
- IPI: Inferred from Physical Interaction
- IDA: Inferred from Direct Assay
- RCA: Inferred from Reviewed Computational Analysis
- TAS: Traceable Author Statement
- NAS: Non-traceable Author Statement
- IC: Inferred by Curator
- ND: No biological Data available
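The evidence codes listed here can be held in a simple Python dictionary and used to filter annotations. The sketch below assumes that IEA is the only code assigned without curator review, so every other code is treated as manual; that split, the function name `is_manual`, and the sample annotation pairs are assumptions of this example.

```python
# Evidence codes and their expansions, as listed on the slide.
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "ISS": "Inferred from Sequence Similarity",
    "IEP": "Inferred from Expression Pattern",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IDA": "Inferred from Direct Assay",
    "RCA": "Inferred from Reviewed Computational Analysis",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No biological Data available",
}


def is_manual(code):
    """Assumption of this sketch: every code except IEA is curator-assigned."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code!r}")
    return code != "IEA"


if __name__ == "__main__":
    # Hypothetical (gene product, evidence code) pairs, for illustration only.
    sample = [("geneA", "IDA"), ("geneB", "IEA"), ("geneC", "TAS")]
    manual_only = [gene for gene, code in sample if is_manual(code)]
    print(manual_only)
```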

Annotation summaries
Meloidogyne incognita: McCarter et al

Two types of GO Annotations:
- Electronic Annotation
- Manual Annotation
All annotations must:
- be attributed to a source
- indicate what evidence was found to support the GO term-gene/protein association
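The two requirements on this slide (a source and supporting evidence for every annotation) can be expressed as a small record type that refuses to be built without them. This is a minimal Python sketch; the class name `Annotation`, its field names, and the sample values are hypothetical and do not reflect any real GO file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One gene product/GO term association with its evidence and source."""

    gene_product: str
    go_term: str
    evidence_code: str  # what evidence supports the association
    reference: str      # the source the annotation is attributed to

    def __post_init__(self):
        # Reject records that leave any of the four fields blank.
        for field_name in ("gene_product", "go_term", "evidence_code", "reference"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    ok = Annotation("geneX", "monooxygenase activity", "IDA", "REF:hypothetical")
    print(ok)
    try:
        Annotation("geneX", "monooxygenase activity", "IDA", "")
    except ValueError as err:
        print("rejected:", err)
```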

Manual Annotations
High-quality, specific gene/gene product associations made using:
- Peer-reviewed papers
- Evidence codes to grade evidence
BUT: very time consuming, and requires trained biologists

Manual Annotations: Methods
1. Extract information from published literature
2. Curators perform manual sequence similarity analyses to transfer annotations between highly similar gene products (BLAST, protein domain analysis)

Finding GO terms
"In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…"
- "serine/threonine kinase activity": Function: protein serine/threonine kinase activity GO:
- "integral membrane protein": Component: integral to plasma membrane GO:
- "wound response": Process: response to wounding GO:
PubMed ID:

Electronic Annotations
- Provides large coverage
- High-quality
BUT: annotations tend to use high-level GO terms and provide little detail.

Electronic Annotations: Methods
1. Database entries
  - Manual mapping of GO terms to concepts external to GO ('translation tables')
  - Proteins then electronically annotated with the relevant GO term(s)
2. Automatic sequence similarity analyses to transfer annotations between highly similar gene products

Electronic Annotations
- Fatty acid biosynthesis (Swiss-Prot Keyword) maps to GO:Fatty acid biosynthesis ( GO: )
- EC: (EC number) maps to GO:acetyl-CoA carboxylase activity ( GO: )
- IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit (InterPro entry) maps to GO:acetyl-CoA carboxylase activity ( GO: )

Mappings of external concepts to GO
- EC: > GO:alcohol dehydrogenase activity ; GO:
- EC: > GO:L-xylulose reductase activity ; GO:
- EC: > GO:4-oxoproline reductase activity ; GO:
- EC: > GO:retinol dehydrogenase activity ; GO:
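A translation table in the `external ID > GO:term name ; GO:ID` layout shown on this slide can be read with a few lines of Python. The sketch below is one possible reading of that layout. The identifiers `EC:0.0.0.0` and `GO:0000000` are placeholders invented for the example, since the slide's own numbers did not survive, and `parse_mapping_line` is a hypothetical helper name.

```python
def parse_mapping_line(line):
    """Split 'EXTERNAL_ID > GO:term name ; GO:ID' into its three parts."""
    external, _, go_part = line.partition(" > ")
    name_part, _, go_id = go_part.rpartition(" ; ")
    if not external.strip() or not name_part.strip() or not go_id.strip():
        raise ValueError(f"not a mapping line: {line!r}")
    name = name_part.strip()
    if name.startswith("GO:"):
        name = name[len("GO:"):]  # drop the prefix in front of the term name
    return external.strip(), name.strip(), go_id.strip()


def build_translation_table(lines):
    """Map each external ID to the list of (term name, GO ID) pairs it implies."""
    table = {}
    for line in lines:
        external, name, go_id = parse_mapping_line(line)
        table.setdefault(external, []).append((name, go_id))
    return table


if __name__ == "__main__":
    # Placeholder identifiers; the real numbers are not given in the slide text.
    sample = ["EC:0.0.0.0 > GO:alcohol dehydrogenase activity ; GO:0000000"]
    print(build_translation_table(sample))
```

Once the table is built, any protein carrying a given external identifier can be annotated electronically with the GO terms listed for it.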

Annotate to finest granularity
Annotating to GO: automatically annotates to all of its parents; thus a product is annotated to both protein modification AND cytoskeleton organization.
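The propagation rule on this slide (an annotation to a term also counts for all of its parents) can be sketched in Python by walking up the graph. The child term's name is a placeholder because its ID is missing from the slide text, and the edges up to a single root are a deliberate simplification; the function names are hypothetical.

```python
# Child -> parents. The child term and the simplified root are assumptions.
PARENTS = {
    "hypothetical child term": ["protein modification", "cytoskeleton organization"],
    "protein modification": ["biological process"],
    "cytoskeleton organization": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def implied_terms(term, parents):
    """An annotation to a term implies the term itself plus all its ancestors."""
    return {term} | ancestors(term, parents)


if __name__ == "__main__":
    for t in sorted(implied_terms("hypothetical child term", PARENTS)):
        print(t)
```

Because of this rule, annotating to the most specific term that the evidence supports loses nothing: queries on the broader parent terms still find the gene product.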

Additional points
- A gene product can have several functions and cellular locations, and can be involved in many processes
- Annotation of a gene product to one ontology is independent of its annotation to other ontologies
- Annotations are only to terms reflecting a normal activity or location
- Usage of 'unknown' GO terms

Unknown vs. Unannotated
- "Unknown" is used when the curator has determined that there is no existing literature to support an annotation.
  - Biological process unknown GO:
  - Molecular function unknown GO:
  - Cellular component unknown GO:
- This is NOT the same as having no annotation at all.
  - No annotation means that no one has looked yet.

Annotation of a genome
- GO annotations are always work in progress
- Part of normal curation process
  - More specific information
  - Better evidence code
- Replace obsolete terms
- "Last reviewed" date

How to access the Gene Ontology and its annotations
1. Downloads
  - Ontologies
  - Annotations: gene association files
  - Ontologies and annotations
2. Web-based access
  - AmiGO
  - QuickGO
  - among others…
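A downloaded gene association file can be read line by line in Python. The sketch below assumes the tab-delimited GAF layout, in which comment lines start with `!`, column 3 is the gene symbol, column 5 the GO ID, column 6 the reference, column 7 the evidence code, and column 9 the aspect (P, F or C). The sample record is entirely made up, and the exact column count should be checked against the file version in hand.

```python
ASPECTS = {
    "P": "Biological Process",
    "F": "Molecular Function",
    "C": "Cellular Component",
}


def parse_gaf_lines(lines):
    """Yield one dict per annotation line, skipping comments and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15:
            raise ValueError(f"expected at least 15 columns, got {len(cols)}")
        yield {
            "symbol": cols[2],
            "go_id": cols[4],
            "reference": cols[5],
            "evidence": cols[6],
            "aspect": ASPECTS.get(cols[8], cols[8]),
        }


# A made-up record with placeholder identifiers, for illustration only.
SAMPLE = [
    "!this is a comment line",
    "\t".join([
        "ExampleDB", "ID0001", "geneX", "", "GO:0000000", "REF:hypothetical",
        "IDA", "", "F", "", "", "protein", "taxon:0", "20000101", "ExampleDB",
    ]),
]

if __name__ == "__main__":
    for record in parse_gaf_lines(SAMPLE):
        print(record)
```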

Groups for the Lecture Four paper discussion (in-class discussion, about 5 minutes): A, C, D, E, H, M, S