25th June 2007 Jane Lomax Using the Gene Ontology (GO) for analysis of expression data Jane Lomax EMBL-EBI.

What is the Gene Ontology?
A set of standard biological phrases (terms) which are applied to genes/proteins:
– protein kinase
– apoptosis
– membrane

What is the Gene Ontology?
Genes are linked, or associated, with GO terms by trained curators at genome databases
– known as ‘gene associations’ or GO annotations
Some GO annotations are created automatically

[Diagram: gene -> GO term; associated genes; GO annotations; GO database; genome and protein databases]

What is the Gene Ontology?
Allows biologists to make queries across large numbers of genes without researching each one individually

Copyright ©1998 by the National Academy of Sciences Eisen, Michael B. et al. (1998) Proc. Natl. Acad. Sci. USA 95,

GO structure
GO isn’t just a flat list of biological terms
Terms are related within a hierarchy

[Diagram: GO structure, gene A]

GO structure
This means genes can be grouped according to user-defined levels
Allows a broad overview of a gene set or genome

How does GO work?
GO is species independent
– some terms, especially lower-level, detailed terms, may be specific to a certain group, e.g. photosynthesis
– but when collapsed up to the higher levels, terms are not dependent on species

How does GO work?
What information might we want to capture about a gene product?
– What does the gene product do?
– Where and when does it act?
– Why does it perform these activities?

GO structure
GO terms are divided into three parts:
– cellular component
– molecular function
– biological process

Cellular Component
Where a gene product acts
Enzyme complexes in the component ontology refer to places, not activities.

Molecular Function
Activities or “jobs” of a gene product. Examples:
– glucose-6-phosphate isomerase activity
– insulin binding
– insulin receptor activity
– drug transporter activity
A gene product may have several functions
Sets of functions make up a biological process.

Biological Process
A commonly recognized series of events. Examples:
– cell division
– transcription
– regulation of gluconeogenesis
– limb development
– courtship behavior

Ontology Structure
Terms are linked by two relationships:
– is-a
– part-of

[Diagram: cell, membrane, chloroplast, mitochondrial, chloroplast membrane; links labelled is-a and part-of]

Ontology Structure
Ontologies are structured as a hierarchical directed acyclic graph (DAG)
Terms can have more than one parent and zero, one or more children

[Diagram: the same terms drawn as a Directed Acyclic Graph (DAG): multiple parentage allowed]
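The DAG idea can be sketched in a few lines of Python. This is only an illustration: the term names are taken from the slide's example, but the individual edges (and the name "mitochondrial membrane") are assumptions, since the diagram itself did not survive. The sketch shows a term with two parents and how all of its ancestors can be collected by walking is-a and part-of links upward.

```python
from collections import defaultdict

# Edges are (child, relationship, parent). Term names follow the slide's
# example; the specific edges are ASSUMED for illustration only.
EDGES = [
    ("membrane", "part-of", "cell"),
    ("chloroplast", "part-of", "cell"),
    ("mitochondrial membrane", "is-a", "membrane"),
    ("chloroplast membrane", "is-a", "membrane"),
    ("chloroplast membrane", "part-of", "chloroplast"),
]


def build_parents(edges):
    """Index the edge list as child -> set of (relationship, parent)."""
    parents = defaultdict(set)
    for child, rel, parent in edges:
        parents[child].add((rel, parent))
    return parents


def ancestors(term, parents):
    """Return every term reachable by following links upward from `term`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _rel, parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


PARENTS = build_parents(EDGES)
print(sorted(ancestors("chloroplast membrane", PARENTS)))
```

Because "chloroplast membrane" has two parents here, a gene annotated to it would also be counted under membrane, chloroplast and cell when annotations are grouped at a higher level.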

Anatomy of a GO term
id: GO:   (unique GO ID)
name: gluconeogenesis   (term name)
namespace: process   (ontology)
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [   (definition)
exact_synonym: glucose biosynthesis   (synonym)
xref_analog: MetaCyc:GLUCONEO-PWY   (database ref)
is_a: GO:   (parentage)
is_a: GO:
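A term record of this "tag: value" shape is easy to read programmatically. The following is a minimal sketch, not a full parser for the GO flat file: the identifiers written as `GO:XXXXXXX`, `GO:YYYYYYY` and `GO:ZZZZZZZ` are placeholders (the real IDs are missing from the transcript), and the function name `parse_stanza` is made up for this example.

```python
from collections import defaultdict

# Stanza modelled on the slide. The GO:... values are PLACEHOLDERS, not real
# identifiers; the slide's IDs were lost when the transcript was extracted.
STANZA = """\
id: GO:XXXXXXX
name: gluconeogenesis
namespace: process
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:YYYYYYY
is_a: GO:ZZZZZZZ
"""


def parse_stanza(text):
    """Collect 'tag: value' lines into a dict of lists (tags may repeat)."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first ': ' only, so values may contain colons.
        tag, _, value = line.partition(": ")
        fields[tag].append(value)
    return dict(fields)


term = parse_stanza(STANZA)
print(term["name"][0], "has", len(term["is_a"]), "parents")
```

Storing every tag as a list keeps repeated tags such as `is_a`, which is what allows a term to have more than one parent.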

GO terms
Where do GO terms come from?
– GO terms are added by editors at EBI and annotating databases
– new terms are usually only added when they are asked for by annotators
– GO editors work with experts to make major ontology developments: metabolism, pathogenesis, cell cycle

GO stats
Over 23,000 GO terms:
– 13593 biological_process
– 1980 cellular_component
– 7700 molecular_function

GO annotations
Where do the links between genes and GO terms come from?

GO annotations
Contributing databases:
– Berkeley Drosophila Genome Project (BDGP)
– dictyBase (Dictyostelium discoideum)
– FlyBase (Drosophila melanogaster)
– GeneDB (Schizosaccharomyces pombe, Plasmodium falciparum, Leishmania major and Trypanosoma brucei)
– UniProt Knowledgebase (Swiss-Prot/TrEMBL/PIR-PSD) and InterPro databases
– Gramene (grains, including rice, Oryza)
– Mouse Genome Database (MGD) and Gene Expression Database (GXD) (Mus musculus)
– Rat Genome Database (RGD) (Rattus norvegicus)
– Reactome
– Saccharomyces Genome Database (SGD) (Saccharomyces cerevisiae)
– The Arabidopsis Information Resource (TAIR) (Arabidopsis thaliana)
– The Institute for Genomic Research (TIGR): databases on several bacterial species
– WormBase (Caenorhabditis elegans)
– Zebrafish Information Network (ZFIN) (Danio rerio)

Species coverage
All major eukaryotic model organism species
Human via GOA group at UniProt
Several bacterial and parasite species through TIGR and GeneDB at Sanger
– many more in pipeline

[Chart: Annotation coverage]

Anatomy of a GO annotation
Three key parts:
– gene name/id
– GO term(s)
– evidence for association
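The three parts of an annotation map naturally onto a small record type. The sketch below is illustrative only: the class name `Annotation`, the helper `terms_for` and the gene names are invented for the example, and term names stand in for GO IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    gene: str      # gene name or database id
    go_term: str   # GO term (a name here; real data would normally carry an ID)
    evidence: str  # evidence code supporting the association


# Hypothetical records; geneA and geneB are made-up names.
ANNOTATIONS = [
    Annotation("geneA", "protein kinase activity", "IDA"),
    Annotation("geneA", "membrane", "IEA"),
    Annotation("geneB", "apoptosis", "TAS"),
]


def terms_for(gene, annotations):
    """Return the sorted, de-duplicated GO terms attached to one gene."""
    return sorted({a.go_term for a in annotations if a.gene == gene})


print(terms_for("geneA", ANNOTATIONS))
```

Keeping the evidence code on every record is what later makes it possible to separate curated annotations from automatically derived ones.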

Example annotation
Breast cancer type 1 susceptibility protein gene in humans

Types of GO annotation:
– Electronic Annotation
– Manual Annotation

Manual annotation
Created by scientific curators
– high quality
– small number

Manual annotation (example source text)
“In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein … these kinases have been implicated in early stages of wound response …”

Electronic Annotation
Annotation derived without human validation
– mappings files, e.g. interpro2go, ec2go
– BLAST search ‘hits’
Lower ‘quality’ than manual codes

Mappings files
Swiss-Prot keyword: Fatty acid biosynthesis -> GO:Fatty acid biosynthesis (GO: )
EC number: EC: -> GO:acetyl-CoA carboxylase activity (GO: )
InterPro entry: IPR000438: Acetyl-CoA carboxylase carboxyl transferase beta subunit -> GO:acetyl-CoA carboxylase activity (GO: )
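How a mappings file turns into electronic annotations can be sketched as a simple lookup. Everything below is a simplified assumption rather than the actual interpro2go or ec2go file format: the mapping is held in a dict, the `SP_KW:` prefix and the protein names are invented, `IPR999999` is a made-up entry, and term names replace the GO IDs that are missing from the transcript. The pairing of IPR000438 with acetyl-CoA carboxylase activity is inferred from the slide's layout.

```python
# Hypothetical mapping table in the spirit of a mappings file:
# external identifier -> GO term names.
MAPPING = {
    "InterPro:IPR000438": ["acetyl-CoA carboxylase activity"],
    "SP_KW:Fatty acid biosynthesis": ["fatty acid biosynthesis"],
}

# Hypothetical proteins and the external identifiers attached to them.
PROTEIN_XREFS = {
    "protein1": ["InterPro:IPR000438", "SP_KW:Fatty acid biosynthesis"],
    "protein2": ["InterPro:IPR999999"],  # made-up entry with no mapping
}


def electronic_annotations(xrefs_by_protein, mapping):
    """Return (protein, GO term, 'IEA') for every cross-reference that maps."""
    out = []
    for protein, xrefs in xrefs_by_protein.items():
        for xref in xrefs:
            for term in mapping.get(xref, []):
                out.append((protein, term, "IEA"))
    return out


for row in electronic_annotations(PROTEIN_XREFS, MAPPING):
    print(row)
```

No curator looks at the individual results, which is why annotations produced this way carry the IEA evidence code.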

Evidence types
ISS: Inferred from Sequence/Structural Similarity
IDA: Inferred from Direct Assay
IPI: Inferred from Physical Interaction
IMP: Inferred from Mutant Phenotype
IGI: Inferred from Genetic Interaction
IEP: Inferred from Expression Pattern
TAS: Traceable Author Statement
NAS: Non-traceable Author Statement
IC: Inferred by Curator
ND: No Data available
IEA: Inferred from Electronic Annotation
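Evidence codes are often used as a filter, for example to leave out electronic annotations before an analysis. A minimal sketch, assuming annotations are held as (gene, term, evidence code) tuples; the records and the function name `split_by_curation` are hypothetical, while the code table is copied from the list above.

```python
# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}

# IEA is the only code in the list assigned without a curator.
ELECTRONIC = {"IEA"}


def split_by_curation(annotations):
    """Split (gene, term, code) tuples into (manual, electronic) lists."""
    for _gene, _term, code in annotations:
        if code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {code}")
    manual = [a for a in annotations if a[2] not in ELECTRONIC]
    electronic = [a for a in annotations if a[2] in ELECTRONIC]
    return manual, electronic


# Hypothetical records for illustration.
RECORDS = [
    ("geneA", "protein kinase activity", "IDA"),
    ("geneA", "membrane", "IEA"),
    ("geneB", "apoptosis", "TAS"),
]
manual, electronic = split_by_curation(RECORDS)
print(len(manual), "manual,", len(electronic), "electronic")
```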

GO tools
GO resources are freely available to anyone to use without restriction
– includes the ontologies, gene associations and tools developed by GO
Other groups have used GO to create tools for many purposes.

GO tools
Affymetrix also provide a Gene Ontology Mining Tool as part of their NetAffx™ Analysis Center, which returns GO terms for probe sets

GO tools
Many tools exist that use GO to find common biological functions from a list of genes.

GO tools
Most of these tools work in a similar way:
– input a gene list and a subset of ‘interesting’ genes
– tool shows which GO categories have most interesting genes associated with them, i.e. which categories are ‘enriched’ for interesting genes
– tool provides a statistical measure to determine whether enrichment is significant

Microarray process
Treat samples -> Collect mRNA -> Label -> Hybridize -> Scan -> Normalize -> Select differentially regulated genes -> Understand the biological phenomena involved

Traditional analysis
Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis, …
Gene 2: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 4: Nervous system, Pregnancy, Oncogenesis, Mitosis, …
Gene 100: Positive ctrl. of cell prolif, Mitosis, Oncogenesis, Glucose transport, …

Traditional analysis
– gene by gene basis
– requires literature searching
– time-consuming

Using GO annotations
But by using GO annotations, this work has already been done for you!
GO: : apoptosis

Grouping by process
Apoptosis: Gene 1, Gene 53
Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35, …
Positive ctrl. of cell prolif.: Gene 7, Gene 3, Gene 12, …
Growth: Gene 5, Gene 2, Gene 6, …
Glucose transport: Gene 7, Gene 3, Gene 6, …
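Grouping by process amounts to inverting a gene -> terms table into a term -> genes table. A minimal sketch using the first four genes from the "Traditional analysis" slide; the slide's lists end in "…", so these term lists are incomplete, and the function name `group_by_term` is made up for the example.

```python
from collections import defaultdict

# Gene -> process labels, copied from the slide (truncated where it shows "…").
GENE_TERMS = {
    "Gene 1": ["Apoptosis", "Cell-cell signaling", "Protein phosphorylation", "Mitosis"],
    "Gene 2": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 3": ["Growth control", "Mitosis", "Oncogenesis", "Protein phosphorylation"],
    "Gene 4": ["Nervous system", "Pregnancy", "Oncogenesis", "Mitosis"],
}


def group_by_term(gene_terms):
    """Invert gene -> terms into term -> genes, keeping input order."""
    groups = defaultdict(list)
    for gene, terms in gene_terms.items():
        for term in terms:
            groups[term].append(gene)
    return dict(groups)


GROUPS = group_by_term(GENE_TERMS)
for term, genes in GROUPS.items():
    print(f"{term}: {', '.join(genes)}")
```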

GO for microarray analysis
Annotations give a ‘function’ label to genes
Ask meaningful questions of microarray data, e.g.
– do genes involved in the same process have the same or different expression patterns?

Using GO in practice
Statistical measure
– how likely it is that your differentially regulated genes fall into that category by chance
Microarray: 1000 genes
Experiment: 100 genes differentially regulated
– mitosis: 80/100
– apoptosis: 40/100
– p. ctrl. cell prol.: 30/100
– glucose transp.: 20/100

Using GO in practice
However, when you look at the distribution of all genes on the microarray:
Process               Genes on array   # genes expected in 100 random genes   occurred
mitosis               800/
apoptosis             400/
p. ctrl. cell prol.   100/
glucose transp.       50/
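The comparison can be worked through in code. This sketch rests on two assumptions: that the truncated "Genes on array" figures are out of the 1000 genes stated on the previous slide, and that a one-sided hypergeometric test is an acceptable statistical measure. The hypergeometric test is one common choice for this kind of question, not necessarily what any particular tool uses. The function names are invented for the example.

```python
from math import comb

ARRAY_SIZE = 1000  # genes on the microarray (from the slides)
SELECTED = 100     # differentially regulated genes (from the slides)

# process -> (genes on array in this category, of which differentially regulated)
# The on-array counts are ASSUMED to be out of ARRAY_SIZE.
COUNTS = {
    "mitosis": (800, 80),
    "apoptosis": (400, 40),
    "p. ctrl. cell prol.": (100, 30),
    "glucose transp.": (50, 20),
}


def expected_hits(on_array, array_size=ARRAY_SIZE, selected=SELECTED):
    """Number of category genes expected in a random draw of `selected` genes."""
    return selected * on_array / array_size


def enrichment_p(on_array, observed, array_size=ARRAY_SIZE, selected=SELECTED):
    """One-sided hypergeometric probability of seeing >= `observed` hits."""
    upper = min(on_array, selected)
    favourable = sum(
        comb(on_array, k) * comb(array_size - on_array, selected - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(array_size, selected)


for process, (on_array, observed) in COUNTS.items():
    print(
        f"{process}: expected {expected_hits(on_array):g}, "
        f"observed {observed}, p = {enrichment_p(on_array, observed):.3g}"
    )
```

Under these assumptions mitosis and apoptosis occur exactly as often as expected by chance (80 and 40 genes), while positive control of cell proliferation and glucose transport occur well above their expected 10 and 5, so only the latter two look enriched even though mitosis has the largest raw count.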

Enrichment tools
GO is developing its own enrichment tool as part of the GO browser AmiGO
Currently in testing phase, should be released next month

Onto-Express walkthrough