Part II GO-Vocabulary of Genome. S. cerevisiae D. melanogaster.

Presentation on theme: "Part II GO-Vocabulary of Genome. S. cerevisiae D. melanogaster."— Presentation transcript:

1 Part II GO-Vocabulary of Genome

2 S. cerevisiae

3 D. melanogaster

4 C. elegans. Cells that normally survive: CED-9 ON; CED-3 and CED-4 OFF. Cells that normally die: CED-9 OFF; CED-3 and CED-4 ON.

5 M. musculus

6 Comparison of sequences from 4 organisms: MCM3, MCM2, CDC46/MCM5, CDC47/MCM7, CDC54/MCM4, MCM6. These proteins form a hexamer in the species that have been examined.

7 The Gene Ontologies: A Common Language for Annotation of Genes from Yeast, Flies and Mice …and Plants and Worms …and Humans …and anything else!

8 Gene Ontology - 1998. FlyBase (Drosophila): Cambridge, EBI, Harvard, Berkeley & Bloomington. SGD (Saccharomyces): Stanford. MGI (Mus): Jackson Labs., Bar Harbor.

9 Gene Ontology - now. Fruit fly - FlyBase; Budding yeast - Saccharomyces Genome Database (SGD); Mouse - Mouse Genome Database (MGD & GXD); Rat - Rat Genome Database (RGD); Weed - The Arabidopsis Information Resource (TAIR); Worm - WormBase; Dictyostelium discoideum - dictyBase; InterPro/UniProt at EBI - InterPro; Fission yeast - PomBase; Human - UniProt, Ensembl, NCBI, Incyte, Celera, Compugen; Parasites (Plasmodium, Trypanosoma, Leishmania) - GeneDB, Sanger; Microbes (Vibrio, Shewanella, B. anthracis, …) - TIGR; Grasses (rice & maize) - Gramene database; Zebrafish - ZFIN; …

10 To provide structured controlled vocabularies for the representation of biological knowledge in biological databases.

11 Be open source. Use open standards. Make data & code available without constraint. Involve your community.

12 Outline: Introduction to the Gene Ontologies (GO); Annotations to GO terms; GO Tools; Applications of GO.

13 Gene Ontology Objectives. GO represents concepts used to classify specific parts of our biological knowledge: Biological Process, Molecular Function, Cellular Component. GO develops a common language applicable to any organism. GO terms can be used to annotate gene products from any species, allowing comparison of information across species.

14 GO: Three ontologies (for a gene product). Molecular Function: What does it do? Biological Process: What processes is it involved in? Cellular Component: Where does it act?

15 Example: Gene Product = hammer. Function (what) -> Process (why): Drive nail (into wood) -> Carpentry; Drive stake (into soil) -> Gardening; Smash roach -> Pest Control; Clown’s juggling object -> Entertainment.

16 Biological Examples Molecular Function Biological Process Cellular Component

17 The 3 Gene Ontologies. Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity. Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions. Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme.

18 Molecular Function: a single reaction or activity, not a gene product. A gene product may have several functions. Sets of functions make up a biological process.

19 Molecular Function

20 Carbonate dehydratase activity

21 Biological Process

22 Gluconeogenesis

23 Cellular Component: where a gene product acts

24 Mitochondrial membrane

25 What’s in a GO term? term: gluconeogenesis; id: GO:0006094; definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
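The three fields on this slide (term, id, definition) map naturally onto a small record type. The sketch below is only an illustration: the class name `GOTerm` and the ID check (the prefix `GO:` followed by seven digits, matching the IDs shown in these slides) are assumptions for the example, not part of any official GO software.

```python
import re
from dataclasses import dataclass

# Assumed ID shape, based on the IDs in these slides: "GO:" plus seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical record holding the three fields shown on the slide."""
    go_id: str
    name: str
    definition: str

    def __post_init__(self):
        # Reject IDs that do not look like the ones used in the slides.
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")


gluconeogenesis = GOTerm(
    go_id="GO:0006094",
    name="gluconeogenesis",
    definition=(
        "The formation of glucose from noncarbohydrate precursors, "
        "such as pyruvate, amino acids and glycerol."
    ),
)
print(gluconeogenesis.go_id, gluconeogenesis.name)
```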

26 What’s in a name?

27 Content of GO (as of October 2005). Molecular Function: 7,309 terms. Biological Process: 10,041 terms. Cellular Component: 1,629 terms. Total: 18,975 terms. Definitions: 94.9%. Obsolete terms: 992.

28

29

30 What’s in a name? Glucose synthesis; Glucose biosynthesis; Glucose formation; Glucose anabolism; Gluconeogenesis. All refer to the process of making glucose from simpler components.
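One way to picture the point of this slide is a lookup table that sends every alternative name to a single identifier. The sketch below assumes, for illustration only, that all five names resolve to GO:0006094 (the gluconeogenesis ID from slide 25); the table and the `resolve` helper are hypothetical.

```python
from typing import Optional

# Illustrative assumption: all five names from the slide point at one term ID.
SYNONYMS = {
    "glucose synthesis": "GO:0006094",
    "glucose biosynthesis": "GO:0006094",
    "glucose formation": "GO:0006094",
    "glucose anabolism": "GO:0006094",
    "gluconeogenesis": "GO:0006094",
}


def resolve(label: str) -> Optional[str]:
    """Return the term ID for a name, ignoring case and extra whitespace."""
    key = " ".join(label.lower().split())
    return SYNONYMS.get(key)


print(resolve("Glucose Synthesis"))
print(resolve("glycolysis"))
```

With a controlled vocabulary, two databases that use different names for the same process still end up with the same identifier.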

31 tree directed acyclic graph

32 Parent-Child Relationships. The cell component term Nucleus has 5 children: Nucleoplasm, Nuclear envelope, Chromosome, Perinuclear space, Nucleolus. A child is a subset of a parent’s elements.

33 Ontology Relationships Directed Acyclic Graph
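A directed acyclic graph can be stored as a mapping from each term to the set of its parents. Unlike a tree, a term may then list more than one parent, but following parent links must never lead back to the starting term. The sketch below uses the Nucleus example from slide 32; the helper names are hypothetical, and `graphlib` (Python 3.9+) is used only as a convenient acyclicity check.

```python
from graphlib import CycleError, TopologicalSorter

# child -> set of parents, following the Nucleus example on slide 32.
PARENTS = {
    "nucleus": set(),
    "nucleoplasm": {"nucleus"},
    "nuclear envelope": {"nucleus"},
    "chromosome": {"nucleus"},
    "perinuclear space": {"nucleus"},
    "nucleolus": {"nucleus"},
}


def children_of(term, parents):
    """List every term that names `term` as one of its parents."""
    return sorted(child for child, ps in parents.items() if term in ps)


def is_acyclic(parents):
    """True if no term can reach itself by following parent links."""
    try:
        tuple(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


print(children_of("nucleus", PARENTS))
print(is_acyclic(PARENTS))
```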

34

35 Evidence Codes for GO Annotations http://www.geneontology.org/doc/GO.evidence.html

36 IEA: Inferred from Electronic Annotation. ISS: Inferred from Sequence or Structural Similarity. IEP: Inferred from Expression Pattern. IMP: Inferred from Mutant Phenotype. IGI: Inferred from Genetic Interaction. IPI: Inferred from Physical Interaction. IDA: Inferred from Direct Assay. RCA: Inferred from Reviewed Computational Analysis. TAS: Traceable Author Statement. NAS: Non-traceable Author Statement. IC: Inferred by Curator. ND: No biological Data available.
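The twelve codes on this slide can be held in a simple lookup table, which also makes it easy to reject a code that is not on the list. This is a minimal sketch; the `describe` helper is hypothetical, and the ISS wording follows the fuller form given on slide 38.

```python
# Evidence codes as listed on the slide (ISS wording taken from slide 38).
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "ISS": "Inferred from Sequence or Structural Similarity",
    "IEP": "Inferred from Expression Pattern",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IDA": "Inferred from Direct Assay",
    "RCA": "Inferred from Reviewed Computational Analysis",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No biological Data available",
}


def describe(code: str) -> str:
    """Expand an evidence code, raising KeyError for codes not on the slide."""
    key = code.strip().upper()
    if key not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code!r}")
    return f"{key}: {EVIDENCE_CODES[key]}"


print(describe("ida"))
```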

37 IEA: Inferred from Electronic Annotation. Sequence similarity (BLAST); automatic transfer from mappings (InterPro2GO, EC2GO etc.). -> Not manually reviewed

38 ISS: Inferred from Sequence or Structural Similarity. Sequence similarity; recognized domains; structural similarity. -> Use of ‘with’ column recommended

39 IEP: Inferred from Expression Pattern. Transcript levels (Northerns, microarrays); protein levels (Western blots). -> Timing or localization of expression. -> Biological process annotations

40 IMP: Inferred from Mutant Phenotype. Gene mutation/knockout; overexpression/ectopic expression; anti-sense experiments; RNAi experiments; specific protein inhibitors.

41 IGI: Inferred from Genetic Interaction. Suppressors, synthetic lethals…; functional complementation; rescue experiments. -> Use of ‘with’ column recommended

42 IPI: Inferred from Physical Interaction. 2-hybrid interactions; co-purification; co-immunoprecipitation; ion/complex/protein binding experiments. -> Use of ‘with’ column recommended

43 IDA: Inferred from Direct Assay. Enzyme assays; in vitro reconstitution (e.g. transcription); immunofluorescence (for cell. comp.); cell fractionation (for cell. comp.); physical interaction/binding assay.

44 RCA: Inferred from Reviewed Computational Analysis. Non-sequence-based computational methods; genome-wide analyses (e.g. 2-hybrid); combinations of large-scale experiments.

45 TAS: Traceable Author Statement. Support from review article; textbook ‘common knowledge’. -> Data that can be ‘traced’ back

46 NAS: Non-traceable Author Statement. Database entries that don't cite a paper. -> Data that cannot be ‘traced’ back

47 IC: Inferred by Curator. Not supported by any direct evidence; inferred from other GO annotations. -> GO term in ‘with/from’ column required

48 ND: No biological Data available. Molecular function unknown GO:0005554; biological process unknown GO:0000004; cellular component unknown GO:0008372. Curator found no information supporting any annotation.
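The ‘with’ column notes on slides 38, 41, 42 and 47 amount to a small validation rule: IC requires an entry, while ISS, IGI and IPI only recommend one. A minimal sketch of such a check follows; the function name and the "ok" / "warning" / "error" return values are hypothetical choices for this example.

```python
# Rules taken from the slides: IC requires a 'with/from' entry;
# ISS, IGI and IPI recommend one. Everything else is unconstrained here.
WITH_REQUIRED = {"IC"}
WITH_RECOMMENDED = {"ISS", "IGI", "IPI"}


def check_with_column(evidence_code: str, with_value: str) -> str:
    """Return 'ok', 'warning' or 'error' for one annotation row."""
    code = evidence_code.strip().upper()
    has_with = bool(with_value.strip())
    if code in WITH_REQUIRED and not has_with:
        return "error"
    if code in WITH_RECOMMENDED and not has_with:
        return "warning"
    return "ok"


print(check_with_column("IC", ""))
print(check_with_column("IPI", ""))
print(check_with_column("IDA", ""))
```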

49 Term Hierarchy: TAS/IDA; IMP/IGI/IPI; ISS/IEP; NAS; IEA
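If the order on this slide is read as a ranking from strongest to weakest evidence (an interpretation, since the slide gives only the order), then picking the best-supported annotation for a gene product reduces to a tier lookup. The sketch below makes that assumption explicit; codes that do not appear on this slide (RCA, IC, ND) are simply left unranked.

```python
from typing import Iterable, Optional

# Tiers in slide order; treating earlier tiers as stronger is an assumption.
TIERS = [
    ("TAS", "IDA"),
    ("IMP", "IGI", "IPI"),
    ("ISS", "IEP"),
    ("NAS",),
    ("IEA",),
]
RANK = {code: tier for tier, codes in enumerate(TIERS) for code in codes}


def best_evidence(codes: Iterable[str]) -> Optional[str]:
    """Return the highest-tier code, or None if no code is ranked."""
    ranked = [code for code in codes if code in RANK]
    if not ranked:
        return None
    # min() keeps the first code seen when two codes share a tier.
    return min(ranked, key=RANK.__getitem__)


print(best_evidence(["IEA", "ISS", "IMP"]))
```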

50 Annotation summaries. Meloidogyne incognita: McCarter et al. 2003

51

52 Annotation of gene products with GO terms: Mitochondrial P450

53 Cellular component: mitochondrial inner membrane GO:0005743. Biological process: electron transport GO:0006118. Molecular function: monooxygenase activity GO:0004497. substrate + O2 = CO2 + H2O product

54 Other gene products annotated to monooxygenase activity (GO:0004497) - monooxygenase, DBH-like 1 (mouse) - prostaglandin I2 (prostacyclin) synthase (mouse) - flavin-containing monooxygenase (yeast) - ferulate-5-hydroxylase 1 (Arabidopsis)

55 Annotate to finest granularity. Annotating to GO:0030047 automatically annotates to all of its parents; thus a product is annotated to both protein modification AND cytoskeleton organization.
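The "automatically annotates to all of its parents" rule can be sketched as a walk up the parent links. In the example below the two parent terms are keyed by name because the slide gives no IDs for them, and they are treated as direct parents of GO:0030047 purely for illustration; the real ontology may place intermediate terms in between.

```python
# child -> set of parents. Treating both named terms as direct parents of
# GO:0030047 is an assumption made for this illustration.
PARENTS = {
    "GO:0030047": {"protein modification", "cytoskeleton organization"},
    "protein modification": set(),
    "cytoskeleton organization": set(),
}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_annotations(term, parents):
    """A direct annotation implies the term itself plus all its ancestors."""
    return {term} | ancestors(term, parents)


print(sorted(implied_annotations("GO:0030047", PARENTS)))
```

This is why annotating to the most specific term loses nothing: the broader terms follow from the graph.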

56 Unknown vs. Unannotated. “Unknown” is used when the curator has determined that there is no existing literature to support an annotation. –Biological process unknown GO:0000004 –Molecular function unknown GO:0005554 –Cellular component unknown GO:0008372. NOT the same as having no annotation at all. –No annotation means that no one has looked yet
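The distinction on this slide can be expressed as a three-way classification of a gene product's GO IDs. The sketch below uses the three "unknown" IDs from the slide; the function name and the status labels are hypothetical.

```python
# The three "unknown" terms listed on the slide.
UNKNOWN_TERMS = {
    "GO:0000004",  # biological process unknown
    "GO:0005554",  # molecular function unknown
    "GO:0008372",  # cellular component unknown
}


def annotation_status(go_ids):
    """Classify a gene product by the GO IDs it has been annotated with."""
    ids = set(go_ids)
    if not ids:
        return "unannotated"  # no one has looked yet
    if ids <= UNKNOWN_TERMS:
        return "unknown"  # a curator looked and found nothing
    return "annotated"


print(annotation_status([]))
print(annotation_status(["GO:0005554"]))
print(annotation_status(["GO:0004497", "GO:0000004"]))
```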

57 Annotation of a genome. GO annotations are always a work in progress. Part of normal curation process: –More specific information –Better evidence code. Replace obsolete terms. “Last reviewed” date.

58 How to access the Gene Ontology and its annotations. 1. Downloads: Ontologies; Annotations (gene association files); Ontologies and Annotations. 2. Web-based access: AmiGO (http://www.godatabase.org), QuickGO (http://www.ebi.ac.uk/ego), among others…
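Gene association files are tab-separated text, with comment lines starting with "!". A minimal reader might look like the sketch below. It assumes the first nine columns of the GAF layout (database, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from, aspect) and ignores the rest; the sample row is invented for illustration and does not come from any real database.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    """First nine columns of a gene association row (assumed GAF layout)."""
    db: str
    object_id: str
    symbol: str
    qualifier: str
    go_id: str
    reference: str
    evidence: str
    with_from: str
    aspect: str  # F = function, P = process, C = component


def parse_gene_associations(text: str) -> List[Annotation]:
    """Parse tab-separated rows, skipping comments and short lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # malformed row; a stricter reader might raise instead
        rows.append(Annotation(*fields[:9]))
    return rows


# Invented sample row, for illustration only.
SAMPLE = "!gaf-version: 2.0\n" + "\t".join(
    ["ExampleDB", "EX0001", "gene1", "", "GO:0004497",
     "EXAMPLE_REF:1", "IDA", "", "F"]
) + "\n"

for row in parse_gene_associations(SAMPLE):
    print(row.symbol, row.go_id, row.evidence, row.aspect)
```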

59 Gene Ontology :

60

61 MicroArray data analysis: …analysis of high-throughput data according to GO. [Figure labels: attacked, time, control; puparial adhesion; molting cycle; hemocyanin; defense response; immune response; response to stimulus; Toll regulated genes; JAK-STAT regulated genes; amino acid catabolism; lipid metabolism; peptidase activity; protein catabolism.] Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
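The slide does not say which statistic was used for this analysis. One common way to ask whether a GO term is over-represented in a list of differentially expressed genes is a one-sided hypergeometric test, sketched below as a swapped-in illustration rather than the authors' method. All parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       sample: int, hits: int) -> float:
    """P(at least `hits` annotated genes in a random sample).

    population: genes on the array
    annotated:  genes on the array annotated to the GO term
    sample:     genes in the differentially expressed list
    hits:       genes in that list annotated to the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities from `hits` up to the maximum.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(max(hits, 0), upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 3 annotated, 3 sampled, all 3 hits annotated.
print(enrichment_p_value(10, 3, 3, 3))
```

A small p-value suggests the term appears in the gene list more often than chance would predict; with many terms tested at once, a multiple-testing correction would also be needed.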

62 Ontologies: Anatomy, Physiology, Phenotype, Pathway, Disease, Molecular, Metabolic, Developmental Stage

