Gene Ontology Part 1: what is the GO? Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse Genome Informatics.

1 Gene Ontology Part 1: what is the GO? http://www.geneontology.org Harold J Drabkin Senior Scientific Curator The Jackson Laboratory Mouse Genome Informatics Bar Harbor, ME

2 What is the GO; The scope of the GO; The GO Relationships; Using the GO for annotation: anatomy of an annotation, evidence codes, qualifiers, gene association files

3 What IS the GO The Gene Ontology is a dictionary of concepts used to describe the normal properties of a gene product It has concepts describing molecular functions It has concepts describing biological processes It has concepts describing cellular locations that the gene products are found in

4 Gene Ontology Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases” Built to be applicable to any organism Formed to develop a shared language adequate for the annotation of molecular characteristics across organisms; a common language to share knowledge.

5 The GO is NOT a list of genes or proteins, although you might find a synonym as a gene or protein name. It does NOT track diseases, although certain disease phenotypes might suggest the function of a gene product or a process that it may participate in; you will not find “tumor suppressor activity/tumor suppression” as GO terms

6 The Gene Ontology Consortium Started Small Original GO created in 2000 Three databases involved: FlyBase (Drosophila), MGI (Mouse), SGD (S. cerevisiae). Used immediately

7 More quickly joined... Later databases: TAIR (Arabidopsis), TIGR (microbes including prokaryotes), SWISS-PROT (several thousand species inc. human), PSU (P. falciparum), ZFIN (zebrafish), PAMGO (plant pathogens)

8 Gene Ontology widely adopted AgBase

9 Why do we need this?

10 Often the same term is referred to differently: Tactition, Tactile sense, Taction = perception of touch ; GO:0050975

11 Bud initiation? Often the same term is used by different communities to mean different things...

12 More specifically The GO is not just a flat list of terms: transcription factor activity, DNA binding, transcription regulator activity, membrane, mitochondrial membrane, glycolysis, nucleus, cytoplasm, ion transport.....

13 There are also relationships between them. is_a: DNA binding is a type of nucleic acid binding. Nucleic acid binding is a type of binding. And the terms can have more than one parent!

14 Ontology Structure The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG). Terms can have more than one parent and zero, one or more children. Terms are linked by three relationships: is-a; part-of; regulates (new), including negatively regulates and positively regulates

15 Ontology Structure [Diagram: cell, membrane, chloroplast, mitochondrial membrane and chloroplast membrane, connected by is-a and part-of links]
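The DAG idea from the Ontology Structure slides can be sketched in a few lines of Python. This is only an illustration: the edges below are a toy graph loosely based on the terms named in the diagram, not edges copied from the real ontology, and the names `PARENTS` and `ancestors` are made up for this sketch.

```python
# Toy graph: child -> list of (relationship, parent).
# Assumption: these edges are illustrative only, not the real GO edges.
PARENTS = {
    "chloroplast membrane": [("is_a", "membrane"), ("part_of", "chloroplast")],
    "mitochondrial membrane": [("is_a", "membrane"), ("part_of", "mitochondrion")],
    "membrane": [("part_of", "cell")],
    "chloroplast": [("part_of", "cell")],
    "mitochondrion": [("part_of", "cell")],
    "cell": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# A term with two parents reaches the root along more than one path.
print(sorted(ancestors("chloroplast membrane")))
```

Because a term can have several parents, the traversal keeps a `seen` set so that a shared ancestor such as "cell" is visited only once.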

16 http://www.ebi.ac.uk/ego It gets complicated quickly

17 The 3 Gene Ontologies Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity. Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions. Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

18 Cellular Component where a gene product acts

19 Molecular Function activities or “jobs” of a gene product: glucose-6-phosphate isomerase activity, insulin binding, insulin receptor activity. A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product. Sets of functions make up a biological process.

20 Biological Process a commonly recognized series of events: gluconeogenesis, cell division, limb development

21 An example… Mitochondrial P450 (CC24 PR01238; MITP450CC24)

22 Anatomy of a GO term A GO term obo format stanza begins with [Term] and minimally has id:, name:, namespace:, def:, and one or more relationships
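As a rough illustration of the stanza layout described on this slide, here is a minimal Python sketch that reads one [Term] stanza into a dictionary. The stanza text is the GO:0000019 example shown later in this deck; `parse_stanza` is a hypothetical helper written for this sketch, not part of any GO tool, and it ignores many details of the OBO format (escapes, trailing modifiers, multiple stanzas).

```python
STANZA = """\
[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
is_a: GO:0000018 ! regulation of DNA recombination
relationship: regulates GO:0006312 ! mitotic recombination
"""


def parse_stanza(text):
    """Parse one [Term] stanza into a dict mapping tag -> list of values."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "[Term]":
            continue
        # Split "tag: value" at the first ": " only.
        tag, _, value = line.partition(": ")
        # Drop the trailing "! human-readable name" comment, if present.
        value = value.split(" ! ")[0].strip()
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
print(term["id"], term["name"], term["is_a"], term["relationship"])
```

Values are kept in lists because tags such as is_a and relationship may appear more than once in a stanza.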

23 More GO Term Stanzas

24 The Regulates Relationship

25 In the Beginning There Were Two Relationships Is_a: denotes a subtype of its parent. Part_of: denotes a portion of a parent Is_part: If it exists, it is always a part of its parent (this is the relationship we use). Has_part: If there is a parent, then it has this as a part of it.

26 We made the regulation of something a part_of the something But it’s not really part_of

27 So, what’s the issue with regulates? Regulation is not always an inherent part of the process that it regulates. A speed-bump regulates the velocity of my car (50 mph → 5 mph)

28 We needed a better way to express ‘regulates’ We defined regulation as “any process that modulates the frequency, rate or extent of something”. Something can be: a Biological Process, a Molecular Function, a Biological Quality

29 A ‘decomposed’ Term
[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
synonym: "regulation of recombination within rDNA repeats" NARROW []
is_a: GO:0000018 ! regulation of DNA recombination
intersection_of: GO:0065007 ! biological regulation
intersection_of: regulates GO:0006312 ! mitotic recombination
relationship: regulates GO:0006312 ! mitotic recombination
The intersection tags make up the logical definition. This places the ‘regulation’ term in the context of mitotic recombination.

30 The context of mitotic recombination

31 Old: ‘regulation of mitotic recombination’ part of the graph on top of ‘mitotic recombination’

32 Now regulates

33 What does this buy us? The new relationship portrays the biology more accurately than part_of: regulates, positively regulates, negatively regulates. The new logical definitions allow automated consistency checks as the ontology is developed. The first implementation of cross-products in GO. Sets the stage for: Molecular function -> biological process; Cell type -> biological process; ChEBI -> biological process
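One very simple kind of automated check that logical definitions make possible can be sketched as follows. The assumption here, made only for illustration, is that a term's "intersection_of: regulates X" line should agree with its "relationship: regulates X" line; the real consistency checks used by the GO editors are not described in these slides and are certainly more thorough. The helper names are hypothetical.

```python
def regulates_targets(lines, tag):
    """Collect the GO ids that follow '<tag>: regulates' in stanza lines."""
    prefix = tag + ": regulates "
    targets = set()
    for line in lines:
        if line.startswith(prefix):
            # Keep only the id; drop the "! name" comment.
            targets.add(line[len(prefix):].split(" ! ")[0].strip())
    return targets


def is_consistent(lines):
    """True when the logical definition and the asserted relationship agree."""
    return (regulates_targets(lines, "intersection_of")
            == regulates_targets(lines, "relationship"))


# Lines taken from the GO:0000019 stanza shown in this deck.
GOOD = [
    "id: GO:0000019",
    "intersection_of: GO:0065007 ! biological regulation",
    "intersection_of: regulates GO:0006312 ! mitotic recombination",
    "relationship: regulates GO:0006312 ! mitotic recombination",
]

# Deliberately broken variant (made up for this sketch): the targets differ.
BAD = GOOD[:3] + ["relationship: regulates GO:0000018 ! regulation of DNA recombination"]

print(is_consistent(GOOD), is_consistent(BAD))
```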

34

35 On March 18th, 2008
[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
narrow_synonym: "regulation of recombination within rDNA repeats" []
is_a: GO:0000018 ! regulation of DNA recombination
relationship: part_of GO:0006312 ! mitotic recombination
[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process
def: "Any process that modulates the frequency, rate or extent of DNA recombination during mitosis." [GOC:go_curators]
synonym: "regulation of recombination within rDNA repeats" NARROW []
is_a: GO:0000018 ! regulation of DNA recombination
intersection_of: GO:0065007 ! biological regulation
intersection_of: regulates GO:0006312 ! mitotic recombination
relationship: regulates GO:0006312 ! mitotic recombination

36 Evolution of GO GO term development was annotation-driven. Development directed by use: terms added as new species annotated; terms added on an as-needed basis. Developed by an international consortium of biologists and computer scientists: members from individual databases; central office at EBI. Development involves collaboration with domain experts from different biological fields, and also formal ontologists

37 Important Consideration for Users The GO changes daily: new terms added; additional relationships added; terms removed (obsolete terms)
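Since terms can become obsolete between releases, a user may want to check stored annotations against the current ontology file. The sketch below assumes obsolete terms are flagged with the OBO tag "is_obsolete: true". The second term in the sample text, including its id GO:9999999, is made up purely for illustration, and `obsolete_ids` is a hypothetical helper.

```python
OBO_TEXT = """\
[Term]
id: GO:0000019
name: regulation of mitotic recombination
namespace: biological_process

[Term]
id: GO:9999999
name: hypothetical retired term (made up for this sketch)
namespace: biological_process
is_obsolete: true
"""


def obsolete_ids(obo_text):
    """Return the ids of [Term] stanzas tagged 'is_obsolete: true'."""
    obsolete = set()
    for stanza in obo_text.split("[Term]"):
        lines = [line.strip() for line in stanza.splitlines() if line.strip()]
        ids = [line[len("id: "):] for line in lines if line.startswith("id: ")]
        if ids and "is_obsolete: true" in lines:
            obsolete.add(ids[0])
    return obsolete


# Flag any stored annotation that points at an obsolete term.
my_annotations = ["GO:0000019", "GO:9999999"]
stale = [go_id for go_id in my_annotations if go_id in obsolete_ids(OBO_TEXT)]
print(stale)
```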

38 GO Slims

39 What is a GO Slim A GO Slim is a smaller slice of the GO that can be used to “bin” data into categories relevant to the user's experiment. Why use this? You want to group several sections of the GO into a single broader category; you want to remove sections that are totally irrelevant for your assay (e.g., photosynthetic processes are irrelevant for birds).
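The binning idea can be sketched in Python: each annotated term is walked up to whichever slim terms lie among its ancestors, and gene products are counted per slim category. Everything here is a simplified assumption: the parent links are a toy hierarchy built from term names that appear in this deck (the real ontology has more intermediate terms), the gene names are placeholders, and the function names are hypothetical.

```python
from collections import Counter

# Toy hierarchy: term -> parents. Assumption: simplified, not the real GO edges.
PARENTS = {
    "DNA binding": ["nucleic acid binding"],
    "nucleic acid binding": ["binding"],
    "insulin binding": ["binding"],
    "glucose-6-phosphate isomerase activity": ["catalytic activity"],
    "binding": [],
    "catalytic activity": [],
}

# The slim: the broad categories we want to bin into.
SLIM = {"binding", "catalytic activity"}


def slim_terms_for(term):
    """Return the slim terms found at or above `term` in the toy hierarchy."""
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in SLIM:
            found.add(current)
        stack.extend(PARENTS.get(current, []))
    return found


def count_gene_products(annotations):
    """Count gene products per slim category (each gene counted once per bin)."""
    counts = Counter()
    for _gene, terms in annotations.items():
        bins = set()
        for term in terms:
            bins |= slim_terms_for(term)
        counts.update(bins)
    return counts


# Placeholder gene names and annotations, for illustration only.
ANNOTATIONS = {
    "geneA": ["DNA binding"],
    "geneB": ["insulin binding", "glucose-6-phosphate isomerase activity"],
    "geneC": ["nucleic acid binding"],
}
print(dict(count_gene_products(ANNOTATIONS)))
```

Counts like these are the kind of per-category totals that can be charted; note that bins are not mutually exclusive, so a gene product can be counted in more than one category.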

40 Several GO Slims are referenced in the gene_ontology.obo file Section of OboEdit showing GO slims built into the ontology

41 But you can build your own In OboEdit, select the Category Manager (under Metadata). Use “add” to add a new one; I am adding one for translation

42 Now I browse through the GO, selecting terms and checking them in the categories box. Make sure you “commit” (save) each selected term. Note, the children of a term are not automatically selected. You need to decide. After saving in the category manager, the new slim appears in the category list

43 Checking the “filter terms” box during save will allow you to save just your slim to a new file

44 Now you can use THIS obo in various binning tools such as GO term finder, Vlad, GO Slimmer, rather than the entire GO

45 GO Slimmer tool is part of AmiGO

46 You can specify your genes in a number of ways. You can filter on species and evidence code. You can input or choose a GO slim

47 You can also select various output options The gene product counts and a tab-delimited file are great for making pie or bar charts in Excel!

48 Visit http://www.geneontology.org and http://www.godatabase.org for more GO Slim help

