The Gene Ontology Project


1 The Gene Ontology Project

2 Genes produce proteins
Diagram: DNA carrying many genes; gene -> mRNA -> protein.

3 Proteins do the work in an organism
Malfunctioning, missing, or over-abundant proteins cause problems. They are likely to be responsible for many health problems that are currently difficult to treat, such as Parkinson’s disease, depression, Alzheimer’s disease, schizophrenia and cancer.
html/fnmol /images/article/image_n/fnmol g001.gif

4 What is the bigger picture?
substrate + O2 = CO2 + H2O product
Gene product: What does it do? Where does it do it? What is the bigger picture?
GO categorizes gene products (e.g. proteins) according to what they do, where they do it, and what general process is achieved as a result.
mitochondrion/krebpic.html

5 Clark et al., 2005 is_a part_of
The graphs have multiple inheritance: a term can have more than one parent. Gene products can be categorized at any level, and to as many categories as are needed to capture the information.
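A minimal sketch of such a graph, assuming a toy ontology stored as (child, relationship, parent) triples. The term names, gene names and helper functions are illustrative and are not taken from the slides or from any GO tool. It shows a term with two parents and how an annotation at a specific level can be propagated to more general terms.

```python
from collections import defaultdict

# Hypothetical toy ontology: (child, relationship, parent) triples.
EDGES = [
    ("mitochondrial membrane", "part_of", "mitochondrion"),
    ("mitochondrial membrane", "is_a", "organelle membrane"),
    ("mitochondrion", "is_a", "organelle"),
    ("organelle membrane", "is_a", "membrane"),
    ("organelle membrane", "part_of", "organelle"),
]


def build_parents(edges):
    """Map each term to the set of its direct (relationship, parent) pairs."""
    parents = defaultdict(set)
    for child, rel, parent in edges:
        parents[child].add((rel, parent))
    return parents


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _rel, parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_terms(gene, annotations, parents):
    """Direct annotations plus all ancestors of those terms.

    Assumption for this sketch: an annotation to a term also counts
    for every is_a/part_of ancestor of that term.
    """
    implied = set(annotations[gene])
    for term in annotations[gene]:
        implied |= ancestors(term, parents)
    return implied


PARENTS = build_parents(EDGES)
# Hypothetical gene products annotated at different levels of the graph.
ANNOTATIONS = {"geneA": {"mitochondrial membrane"}, "geneB": {"organelle"}}
```

Here "mitochondrial membrane" has two direct parents, which is what multiple inheritance means in this context, and `implied_terms("geneA", ANNOTATIONS, PARENTS)` collects the term itself plus all four more general terms.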

6 Clark et al., 2005 is_a part_of
The graph can be slimmed to find groups of gene products involved in more general processes.
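Slimming can be sketched as mapping each annotated term up to the nearest chosen high-level terms and counting gene products per high-level term. The graph, the slim set and the gene names below are simplified, hypothetical examples; real slims are curated term lists.

```python
from collections import Counter

# Hypothetical, simplified graph: child -> list of parents (edge types merged).
PARENTS = {
    "glucose metabolic process": ["carbohydrate metabolic process"],
    "carbohydrate metabolic process": ["metabolic process"],
    "lysine biosynthetic process": ["biosynthetic process"],
    "biosynthetic process": ["metabolic process"],
    "immune response": ["response to stimulus"],
}
# Hypothetical slim: the general terms we want to report on.
SLIM = {"metabolic process", "response to stimulus"}


def slim_terms(term, parents, slim):
    """Return the slim terms that `term` is, or descends from."""
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            found.add(current)
        stack.extend(parents.get(current, []))
    return found


def slim_counts(annotations, parents, slim):
    """Count gene products per slim term, counting each gene once per term."""
    counts = Counter()
    for _gene, terms in annotations.items():
        gene_slims = set()
        for term in terms:
            gene_slims |= slim_terms(term, parents, slim)
        counts.update(gene_slims)
    return counts


ANNOTATIONS = {
    "gene1": ["glucose metabolic process"],
    "gene2": ["lysine biosynthetic process", "glucose metabolic process"],
    "gene3": ["immune response"],
}
COUNTS = slim_counts(ANNOTATIONS, PARENTS, SLIM)
```

A gene annotated to two terms under the same slim term is counted once for it, so the counts describe gene products rather than annotations.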

7 Whole genome analysis (J. D. Munkvold et al., 2004)
This enables us to get a rough idea of what kind of genes are contained in a newly sequenced genome.

8 See which processes are upregulated or downregulated.
GO annotations also show what effect a treatment or disease state has on the range of processes taking place in an organism. The GO is also helpful for natural language processing. This opens up opportunities for processing the research literature to spot patterns or questions that have not been found by individual scientists.
Figure labels (expression over time, attacked vs. control): Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Puparial adhesion; Molting cycle; hemocyanin; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism.
Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
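One common way to ask whether a process is over-represented among up- or downregulated genes is a hypergeometric test. The slide does not say which method was used for this figure, so the test itself, the function names and the gene identifiers below are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment(term_genes, changed_genes, all_genes):
    """Return (overlap size, p-value) for one term and one set of changed genes."""
    N = len(all_genes)
    K = len(term_genes & all_genes)
    n = len(changed_genes & all_genes)
    k = len(term_genes & changed_genes & all_genes)
    return k, hypergeom_sf(k, N, K, n)


# Hypothetical data: 20 genes, 5 annotated to a term, 4 upregulated.
ALL_GENES = {f"g{i}" for i in range(20)}
TERM_GENES = {"g0", "g1", "g2", "g3", "g4"}
UPREGULATED = {"g0", "g1", "g2", "g10"}

OVERLAP, P_VALUE = enrichment(TERM_GENES, UPREGULATED, ALL_GENES)
```

In practice the test would be repeated for every term and corrected for multiple testing, which this sketch leaves out.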

9 Current metrics
3 ontologies: Biological process, Molecular function, Cellular component
26,933 terms
5 relationship types
47,000+ relationships
40,000,000+ annotations
88,193 species
This is the current state of the GO.

10 Future plans
Cross products
Function-process links
These are some of the big developments coming along soon.

11 Cross products: Identify ontologies within ontologies – make computable
Although the process ontology is notionally a single ontology, it actually contains multiple ontologies. For example, in the antennal disc terms at the bottom centre of this slide, there are really two ontologies. One contains the terms ‘development’ and ‘morphogenesis’ and the other contains ‘imaginal disc’, ‘eye-antennal disc’ and ‘antenna’. These two ontologies work together to form compound terms that capture more complex processes.

12 Intersection tags / cross products
[Term]
id: GO: ! embryonic shoot morphogenesis
intersection_of: GO: ! shoot morphogenesis
intersection_of: part_of GO: ! embryonic development

[Term]
id: GO: ! post-embryonic root morphogenesis
intersection_of: GO: ! root morphogenesis
intersection_of: part_of GO: ! post-embryonic development

We are making compound terms in GO computable by using intersection tags.
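To illustrate what "computable" means here, the following sketch reads intersection_of tags from a stanza and returns each compound term as a general term plus its qualifying relationships. The identifiers in the transcript are elided, so the `EX:` identifiers below are hypothetical placeholders, and the parser is a simplified assumption rather than a full OBO reader (for example, it ignores escaped characters).

```python
def parse_intersections(obo_text):
    """Return {term_id: [(relation_or_None, target_id), ...]} for [Term] stanzas."""
    result = {}
    current = None
    for raw in obo_text.splitlines():
        # Drop the trailing "! human-readable name" comment.
        line = raw.split("!", 1)[0].strip()
        if line == "[Term]":
            current = None
        elif line.startswith("id:"):
            current = line[len("id:"):].strip()
            result[current] = []
        elif line.startswith("intersection_of:") and current is not None:
            parts = line[len("intersection_of:"):].split()
            if len(parts) == 1:
                # Bare target: the general term being specialised.
                result[current].append((None, parts[0]))
            elif len(parts) == 2:
                # Relation plus target: the qualifying relationship.
                result[current].append((parts[0], parts[1]))
    return result


# Hypothetical identifiers; the names mirror the first stanza on the slide.
EXAMPLE = """
[Term]
id: EX:0000003 ! embryonic shoot morphogenesis
intersection_of: EX:0000001 ! shoot morphogenesis
intersection_of: part_of EX:0000002 ! embryonic development
"""

DEFINITIONS = parse_intersections(EXAMPLE)
```

With the definition in this form, a program can read the compound term as "a shoot morphogenesis that is part_of embryonic development" instead of treating the name as an opaque string.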

13 Cross products (xp) in progress
XPs being set up amongst GO ontologies.
Also with: cell type ontology, Uberon (anatomy) ontology, ChEBI (chemical) ontology.
170+ new relationship types planned.
We are currently making intersection links within and between the GO ontologies and also out to three other ontologies. We plan to have at least 170 new relationship types.

14 Link out to other ontologies:
Open Biomedical Ontologies
54 compatible ontologies
26 other nearly compatible ontologies
In the future we plan to use intersection tags to link the GO to many more ontologies. The tools that we use to edit the ontologies provide some reasoner support for the editors. However, the editing program currently struggles to process the data fast enough to support them properly. We are looking into ways to fix this.

15 Function-process links
Another plan for the future is to link function to process in situations where the single step reactions in a process are well known. However, this presents some challenges.

16 Function and process (hp = has_part)
Function: UTP:glucose-1-phosphate uridylyltransferase activity (α-D-glucose 1-phosphate + UTP -> UDP-D-glucose + diphosphate)
Processes linked to this function by has_part: biosynthetic process; glucose metabolic process; UDP-glucose metabolic process; colanic acid biosynthetic process; galactose metabolic process; carbohydrate catabolic process; response to desiccation
Many very different processes are dependent on the same single step reactions.
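The sharing described on this slide can be sketched by storing has_part links from process to function and inverting them. The process and function names marked as coming from the slide are copied from it; the second function, the data layout and the helper names are hypothetical.

```python
from collections import defaultdict

UGP = "UTP:glucose-1-phosphate uridylyltransferase activity"  # from the slide

# process -> set of functions it has_part. The first three processes and
# their link to UGP come from the slide; "hypothetical function X" does not.
HAS_PART = {
    "glucose metabolic process": {UGP},
    "colanic acid biosynthetic process": {UGP, "hypothetical function X"},
    "response to desiccation": {UGP},
}


def processes_by_function(has_part):
    """Invert process -> functions into function -> processes."""
    index = defaultdict(set)
    for process, functions in has_part.items():
        for function in functions:
            index[function].add(process)
    return dict(index)


def shared_functions(has_part):
    """Return only the functions that are part of more than one process."""
    return {
        function: processes
        for function, processes in processes_by_function(has_part).items()
        if len(processes) > 1
    }


SHARED = shared_functions(HAS_PART)
```

Each shared function adds one has_part relationship per process that depends on it, which is why the number of links grows quickly as more processes are described.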

17 Lysine biosynthesis pathways
Many processes that have the same input and outcome happen in different ways in different organisms.

18 lysine biosynthesis
Process: lysine biosynthesis, with is_a children lysine biosynthesis 1, lysine biosynthesis 2, lysine biosynthesis 3, lysine biosynthesis 4, lysine biosynthesis 5, lysine biosynthesis 6 and lysine biosynthesis 7?
Function
Each different variant of a process would need a term of its own.

19 Shared function?
Diagram: the processes Lysine Biosynthesis, Process B and Process C linked by has_part to functions.
Legend: Shared function?; new GO term; Non-shared function; existing GO term.
We will have many relationships to manage once the information has all been gathered. Much of the information can be mined from a variety of other sources.

20 Relationship explosion
Diagram: Process (Lysine Biosynthesis, Process B, Process C) and Function.
There will be a relationship explosion! (or Editorial office explosion)
(Image courtesy of
This means that we need very fast computation so that the tools we use for editing and processing the data can provide proper support for the people who edit and maintain the terms and relationships.

21 Tools

22 OBO-Edit: Ontology Editor Tool
OBO-Edit is a standalone Java application developed in-house for ontology development.

23 Reasoner finds missing and redundant relationships.
We are developing a reasoner to spot when relationships are missing or redundant. The reasoner currently runs very slowly, so editors can only run it occasionally to check their work. They cannot have it running constantly to support their editing. We hope to speed the reasoner up so that the editors can have it running all the time to aid their work. We also have a tool called OBOL, which autogenerates modular sets of terms based on general rules that have been figured out by the ontology editor.
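As a rough illustration of what "redundant" means, the sketch below flags a direct parent link when the same parent can still be reached through another path. It is a plain graph check over a single relationship type with hypothetical, simplified term links; it is not the OBO-Edit reasoner or its algorithm.

```python
def reachable(start, parents, skip_edge=None):
    """Terms reachable from `start` via parent links, ignoring `skip_edge`."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if (current, parent) == skip_edge or parent in seen:
                continue
            seen.add(parent)
            stack.append(parent)
    return seen


def redundant_edges(parents):
    """Return direct (child, parent) links that are implied by a longer path."""
    redundant = []
    for child, direct in parents.items():
        for parent in direct:
            if parent in reachable(child, parents, skip_edge=(child, parent)):
                redundant.append((child, parent))
    return sorted(redundant)


# Hypothetical, simplified is_a links: child -> list of direct parents.
PARENTS = {
    "lysine biosynthetic process": [
        "amino acid biosynthetic process",
        "biosynthetic process",  # already implied via the parent above
    ],
    "amino acid biosynthetic process": ["biosynthetic process"],
}

REDUNDANT = redundant_edges(PARENTS)
```

This check repeats a graph traversal for every link, so its cost grows quickly with the number of relationships, which gives a feel for why reasoning speed matters as the ontology grows.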

24 U.S. Virgin Islands, March 30 - April 3, 2006
2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006

25 E. coli hub Reactome

26 Particular thanks to:
Chris Mungall - OBOL, Rule Based Reasoner
Cross product system
Function process link plan – The electron transport working group
OBO-Edit: John Day-Richter, Nomi Harris, Amina Abdulla

