Gene Ontology Part 1: what is the GO? Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse Genome Informatics.

Presentation transcript:

Gene Ontology Part 1: what is the GO? Harold J Drabkin, Senior Scientific Curator, The Jackson Laboratory, Mouse Genome Informatics, Bar Harbor, ME

What is the GO; the scope of the GO; the GO relationships; using the GO for annotation; anatomy of an annotation; evidence codes; qualifiers; gene association files

What IS the GO? The Gene Ontology is a dictionary of concepts used to describe the normal properties of a gene product. It has concepts describing molecular functions. It has concepts describing biological processes. It has concepts describing the cellular locations in which gene products are found.

Gene Ontology: Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases”. Built to be applicable to any organism. Formed to develop a shared language adequate for the annotation of molecular characteristics across organisms; a common language to share knowledge.

The GO is NOT a list of genes or proteins, although you might find a gene or protein name as a synonym. It does NOT track diseases, although certain disease phenotypes might suggest the function of a gene product or a process that it may participate in; you will not find “tumor suppressor activity/tumor suppression” as GO terms.

The Gene Ontology Consortium Started Small. Original GO created in 2000. Three databases involved: FlyBase (Drosophila), MGI (Mouse), SGD (S. cerevisiae). Used immediately.

More quickly joined... Later databases: TAIR (Arabidopsis), TIGR (microbes including prokaryotes), SWISS-PROT (several thousand species inc. human), PSU (P. falciparum), ZFIN (zebrafish), PAMGO (plant pathogens).

Gene Ontology widely adopted (e.g., AgBase)

Why do we need this?

Often the same concept is referred to differently: “tactition”, “tactile sense” and “taction” all mean perception of touch ; GO:

Bud initiation? Often the same term is used by different communities to mean different things...

More specifically: the GO is not just a flat list of terms (transcription factor activity, DNA binding, transcription regulator activity, membrane, mitochondrial membrane, glycolysis, nucleus, cytoplasm, ion transport, ...).

There are also relationships between them. DNA binding is a type of (is_a) nucleic acid binding. Nucleic acid binding is a type of (is_a) binding. And the terms can have more than one parent!

Ontology Structure: The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG). Terms can have more than one parent and zero, one or more children. Terms are linked by three relationships: is_a; part_of; regulates (new), which includes negatively regulates and positively regulates.
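The DAG idea can be sketched in a few lines of Python. This is only an illustration, not how any GO tool stores the ontology: the `EDGES` table and the `ancestors` helper are hypothetical names, and the handful of edges is a toy subset chosen from the terms mentioned on these slides.

```python
from collections import deque

# Toy fragment of the graph: child -> list of (relationship, parent).
# The edges are illustrative only; they are not taken from a GO release.
EDGES = {
    "DNA binding": [("is_a", "nucleic acid binding")],
    "nucleic acid binding": [("is_a", "binding")],
    # A term with two parents, reached through two different relationships.
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
    "mitochondrial membrane": [("is_a", "membrane")],
    "chloroplast": [("part_of", "cell")],
    "membrane": [("part_of", "cell")],
}


def ancestors(term, edges=EDGES):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relationship, parent in edges.get(current, []):
            if parent not in seen:  # a shared ancestor is visited only once
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(ancestors("chloroplast membrane")))
```

Because a term may have several parents, the traversal keeps a `seen` set so that an ancestor reached by two paths (here, "cell") is only counted once.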

Ontology Structure [Diagram: cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, linked by is-a and part-of relationships]

It gets complicated quickly

The 3 Gene Ontologies. Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity. Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions. Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme.

Cellular Component where a gene product acts

Molecular Function: activities or “jobs” of a gene product, e.g. glucose-6-phosphate isomerase activity, insulin binding, insulin receptor activity. A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product. Sets of functions make up a biological process.

Biological Process: a commonly recognized series of events, e.g. gluconeogenesis, cell division, limb development.

An example… Mitochondrial P450 (CC24 PR01238; MITP450CC24)

Anatomy of a GO term: A GO term OBO-format stanza begins with [Term] and minimally has id:, name:, namespace:, def:, and one or more relationships.
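As a rough illustration of that layout, the Python sketch below reads one stanza into a dictionary and reports which of the minimal tags are missing. Everything here is an assumption made for the example: the helper names, the `EX:` identifiers, and the definition text are placeholders, not real GO content, and the parser handles only simple `tag: value ! comment` lines rather than the full OBO format.

```python
REQUIRED_TAGS = ("id", "name", "namespace", "def")

# Hypothetical stanza; the EX: ids and the definition are placeholders.
EXAMPLE_STANZA = """
[Term]
id: EX:0000002
name: DNA binding
namespace: molecular_function
def: "Illustrative definition text." [EX:curators]
is_a: EX:0000001 ! nucleic acid binding
"""


def parse_term_stanza(text):
    """Parse one [Term] stanza into a dict mapping tag -> list of values."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "[Term]":
        raise ValueError("stanza must begin with [Term]")
    term = {}
    for line in lines[1:]:
        if not line or line.startswith("!"):
            continue  # skip blank lines and whole-line comments
        tag, _, value = line.partition(":")  # split at the first colon only
        value = value.split(" ! ")[0].strip()  # drop a trailing "! comment"
        term.setdefault(tag.strip(), []).append(value)
    return term


def missing_required_tags(term):
    """List the minimal tags that the stanza does not provide."""
    return [tag for tag in REQUIRED_TAGS if tag not in term]


term = parse_term_stanza(EXAMPLE_STANZA)
print(term["name"], term["is_a"], missing_required_tags(term))
```

Values are kept in lists because tags such as is_a, synonym and relationship can appear more than once in a stanza.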

More GO Term Stanzas

The Regulates Relationship

In the Beginning There Were Two Relationships. Is_a: denotes a subtype of its parent. Part_of: denotes a portion of a parent. Is_part: if it exists, it is always a part of its parent (this is the relationship we use). Has_part: if there is a parent, then it has this as a part of it.

We made the regulation of something a part_of the something. But it’s not really part_of.

So, what’s the issue with regulates? Regulation is not always an inherent part of the process that it regulates. A speed-bump regulates the velocity of my car (50 mph → 5 mph).

We needed a better way to express ‘regulates’. We defined regulation as “any process that modulates the frequency, rate or extent of something”. Something can be: a Biological Process, a Molecular Function, or a Biological Quality.

A ‘decomposed’ Term

[Term]
id: GO:
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
synonym: "regulation of recombination within rDNA repeats" NARROW []
is_a: GO: ! regulation of DNA recombination
intersection_of: GO: ! biological regulation
intersection_of: regulates GO: ! mitotic recombination
relationship: regulates GO: ! mitotic recombination

The intersection tags make up the logical definition. This places the ‘regulation’ term in the context of mitotic recombination.

The context of mitotic recombination

Old: ‘regulation of mitotic recombination’ is part of the graph, on top of ‘mitotic recombination’

Now: regulates

What does this buy us? The new relationship portrays the biology more accurately than part_of: regulates, positively regulates, negatively regulates. The new logical definitions allow automated consistency checks as the ontology is developed. The first implementation of cross-products in GO. Sets the stage for: Molecular function -> biological process; Cell type -> biological process; ChEBI -> biological process.
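One kind of automated check that a logical definition makes possible can be sketched as follows. This is a hypothetical illustration, not the consortium's actual checking software: the function names and the `EX:` identifiers are invented for the example. The idea is simply that if the intersection_of tags say a term regulates some target, the asserted relationship tags should say so too.

```python
def regulates_targets(values):
    """Collect target ids from tag values shaped like 'regulates EX:0000010'."""
    targets = set()
    for value in values:
        parts = value.split()
        if len(parts) == 2 and parts[0] == "regulates":
            targets.add(parts[1])
    return targets


def unasserted_regulates(term):
    """Return targets named in the logical definition but not asserted."""
    logical = regulates_targets(term.get("intersection_of", []))
    asserted = regulates_targets(term.get("relationship", []))
    return sorted(logical - asserted)


# Placeholder ids: EX:0000001 stands for 'biological regulation',
# EX:0000010 for 'mitotic recombination'.
NEW_STYLE = {
    "id": ["EX:0000020"],
    "intersection_of": ["EX:0000001", "regulates EX:0000010"],
    "relationship": ["regulates EX:0000010"],
}
OLD_STYLE = {
    "id": ["EX:0000020"],
    "intersection_of": ["EX:0000001", "regulates EX:0000010"],
    "relationship": ["part_of EX:0000010"],
}

print(unasserted_regulates(NEW_STYLE))  # consistent term
print(unasserted_regulates(OLD_STYLE))  # regulates link is missing
```

A term that still carried the old part_of link would be flagged, which is the sort of inconsistency a curator would want surfaced while editing.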

On March 18th, 2008

[Term]
id: GO:
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
narrow_synonym: "regulation of recombination within rDNA repeats" []
is_a: GO: ! regulation of DNA recombination
relationship: part_of GO: ! mitotic recombination

[Term]
id: GO:
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
synonym: "regulation of recombination within rDNA repeats" NARROW []
is_a: GO: ! regulation of DNA recombination
intersection_of: GO: ! biological regulation
intersection_of: regulates GO: ! mitotic recombination
relationship: regulates GO: ! mitotic recombination

Evolution of GO: GO term development was annotation-driven. Development directed by use: terms added as new species annotated; terms added on an as-needed basis. Developed by an international consortium of biologists and computer scientists: members from individual databases; central office at EBI. Development involves collaboration with domain experts from different biological fields, and also formal ontologists.

Important Consideration for Users: The GO changes daily: new terms added; additional relationships added; terms removed (obsolete terms).

GO Slims

What is a GO Slim? A GO Slim is a smaller slice of the GO that can be used to “bin” data into categories relevant to the user's experiment. Why use this? You want to group several sections of the GO into a single broader category; you want to remove sections that are totally irrelevant for your assay (e.g., photosynthetic processes are irrelevant for birds).
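The binning idea can be sketched in Python as below. This is a simplified illustration under stated assumptions: the parent table, the slim set, the gene names, and the helper functions are all hypothetical, and each annotation is assigned to every slim term found among its ancestors (real slimming tools may apply more refined rules, such as keeping only the most specific slim term).

```python
# Toy parent table (child -> parents); illustrative only.
PARENTS = {
    "gluconeogenesis": ["carbohydrate metabolic process"],
    "translational initiation": ["translation"],
    "limb development": ["developmental process"],
}

# The slim: the broad categories we care about in this made-up experiment.
SLIM = {"carbohydrate metabolic process", "translation", "developmental process"}

# Hypothetical gene products and the terms they are annotated to.
ANNOTATIONS = {
    "geneA": ["gluconeogenesis"],
    "geneB": ["translational initiation", "limb development"],
    "geneC": ["photosynthesis"],  # has no slim ancestor, so it is left out
}


def ancestors_and_self(term, parents):
    """Return the term together with everything above it in the graph."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def bin_by_slim(annotations, parents, slim):
    """Group gene products under each slim term they map up to."""
    bins = {slim_term: set() for slim_term in slim}
    for gene, terms in annotations.items():
        for term in terms:
            for hit in ancestors_and_self(term, parents) & slim:
                bins[hit].add(gene)
    return bins


for slim_term, genes in sorted(bin_by_slim(ANNOTATIONS, PARENTS, SLIM).items()):
    print(slim_term, sorted(genes))
```

Annotations with no ancestor in the slim simply drop out of the result, which is how sections irrelevant to the assay get removed.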

Several GO Slims are referenced in the gene_ontology.obo file. [Screenshot: section of OBO-Edit showing GO slims built into the ontology]

But you can build your own. In OBO-Edit, select the Category Manager (under Metadata). Use “add” to add a new one; I am adding one for translation.

Now I browse through the GO, selecting terms and checking them in the categories box. Make sure you “commit” (save) each selected term. Note that the children of a term are not automatically selected; you need to decide. After saving in the Category Manager, the new slim appears in the category list.

Checking the “filter terms” box during save will allow you to save just your slim to a new file

Now you can use THIS obo file, rather than the entire GO, in various binning tools such as GO Term Finder, VLAD, and GO Slimmer

GO Slimmer tool is part of AmiGO

You can specify your genes in a number of ways. You can filter on species and evidence code. You can input or choose a GO slim.

You can also select various output options. The gene product counts and a tab-delimited file are great for making pie or bar charts in Excel!
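For readers who bin their data outside GO Slimmer, a table of the same general shape can be produced with Python's standard csv module. This is a minimal sketch, not the tool's actual output format: the column names, the `counts_table` helper, and the example bins are all assumptions made for illustration.

```python
import csv
import io


def counts_table(bins):
    """Render slim term -> gene products as tab-delimited text with counts."""
    # Largest bins first; ties are broken alphabetically for stable output.
    rows = sorted(
        ((slim_term, len(genes)) for slim_term, genes in bins.items()),
        key=lambda row: (-row[1], row[0]),
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["slim_term", "gene_product_count"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical bins, e.g. the result of an earlier slimming step.
EXAMPLE_BINS = {
    "translation": {"geneA", "geneB"},
    "transport": {"geneC"},
}

print(counts_table(EXAMPLE_BINS), end="")
```

Writing the returned text to a file with a .tsv or .txt extension gives something a spreadsheet program can open directly for charting.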

Visit and for more GO Slim help