Annotation. Traditional genome annotation BLAST Similarities.


1 Annotation

2 Traditional genome annotation

3 BLAST Similarities

4-15 Traditional genome annotation BLAST Similarities

16 Protein Families

20 Gene Ontology
   Ontology
     - A “hierarchy” of functions
     - Does not need to be linear: Directed Acyclic Graph
   Controlled Vocabulary
     - Decides which words or phrases to use

21 GO
   Gene ontology
     - A eukaryotic focus: Drosophila, Mus, Saccharomyces, Homo

22 GO
   Cellular component
     - The parts of a cell
   Molecular function
     - e.g. ligand binding
   Biological processes
     - What things do

24 GO Terms
   [GO ID, function], e.g.:
     - GO:0004743
     - Ontology: molecular function
     - Name: pyruvate kinase activity
   Mainly assigned by BLAST, HMMER, etc.

25 Directed Acyclic Graph
   Molecular function
     Catalytic activity
       Transferase activity
         Transferase activity, transferring phosphorus-containing groups
           Kinase activity
           Phosphotransferase activity, alcohol group as acceptor
             Pyruvate kinase activity (a child of both terms above)
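
Because the ontology is a directed acyclic graph rather than a tree, a term can have more than one parent, and collecting all of its ancestors means following every parent link. The sketch below illustrates this with the terms named on the slide. The `PARENTS` table and the `ancestors` helper are illustrative names, and the edge layout is an assumption reconstructed from the slide's term list, not an export from the GO database.

```python
# Child -> list of parents for the terms named on the slide.
# Assumption: the edges are reconstructed from the slide's term list;
# the "phosphorus-containing groups" term name is spelled out in full here.
PARENTS = {
    "molecular function": [],
    "catalytic activity": ["molecular function"],
    "transferase activity": ["catalytic activity"],
    "transferase activity, transferring phosphorus-containing groups": [
        "transferase activity",
    ],
    "kinase activity": [
        "transferase activity, transferring phosphorus-containing groups",
    ],
    "phosphotransferase activity, alcohol group as acceptor": [
        "transferase activity, transferring phosphorus-containing groups",
    ],
    # Two parents: this is what makes the structure a DAG and not a tree.
    "pyruvate kinase activity": [
        "kinase activity",
        "phosphotransferase activity, alcohol group as acceptor",
    ],
}


def ancestors(term, parents=PARENTS):
    """Return the set of all terms reachable by following parent links."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen
```

A gene annotated with "pyruvate kinase activity" is implicitly annotated with every term in `ancestors("pyruvate kinase activity")`, including both of its direct parents.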

26 Problems
   Annotation by committee
   Eukaryotic focus
     - Some efforts to counter that: Owen White, Ariane Toussaint
   Not very deep
   Strict controlled vocabulary

27 Alternatives

28 Basic biology
   lacZ, lacI, lacY, lacA (Jacob & Monod, 1961)

29 Basic biology
   lacZ, lacI, lacY, lacA

30-31 Different types of clustering (< 80 % identity)

32 Purine metabolism

33 Different types of clustering (< 80 % identity)

34 Heme / chlorophyll metabolism is conserved
   They are both porphyrins

35 Occurrence of clustering in different genomes
   [Figure: per-phylum chart for Actinobacteria, Aquificae, Bacteroidetes, Chlamydiae, Chloroflexi, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Spirochaetes, Thermotogae, Proteobacteria, plus an Average.
    Axes: fraction of genes in clusters (0 to 1); number of genomes (0 to 120).
    Series: clusters of genes w/ maximum 80% identity; genes in subsystems in clusters; total number of genomes in group.]

36 The Subsystems Approach to Annotation
   Subsystem is a generalization of “pathway”
     - collection of functional roles jointly involved in a biological process or complex
   Functional Role is the abstract biological function of a gene product
     - atomic, or user-defined, examples:
         6-phosphofructokinase (EC 2.7.1.11)
         LSU ribosomal protein L31p
         Streptococcal virulence factors
     - Should not contain “putative”, “thermostable”, etc.
   Populated subsystem is complete spreadsheet of functions and roles
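
The rule that a functional role should not carry qualifiers such as “putative” or “thermostable” can be checked mechanically. The following is a minimal sketch: the `DISCOURAGED` tuple holds only the two words the slide names (its list ends in "etc", so the tuple is not complete), and `discouraged_words` is a hypothetical helper name, not part of any annotation tool.

```python
import re

# Qualifiers the slide says a functional role should not contain.
# Assumption: only the two words named on the slide are listed here.
DISCOURAGED = ("putative", "thermostable")


def discouraged_words(role_name, words=DISCOURAGED):
    """Return the discouraged qualifiers found in a role name (case-insensitive)."""
    lowered = role_name.lower()
    return [
        word
        for word in words
        if re.search(r"\b" + re.escape(word) + r"\b", lowered)
    ]
```

A role name taken from the slide, such as "6-phosphofructokinase (EC 2.7.1.11)", yields an empty list, while a made-up name like "Putative thermostable lipase" is flagged for both words.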

37 Histidine Degradation
   Conversion of histidine to glutamate
   Functional roles defined in table
   Inclusion in subsystem is only by functional role
   Controlled vocabulary …

38 Subsystem Spreadsheet
   Column headers taken from table of functional roles
   Rows are selected genomes or organisms
   Cells are populated with specific, annotated genes
   Functional variants defined by the annotated roles
   Variant code -1 indicates subsystem is not functional
   Clustering shown by color

   Organism | Variant | Genes (columns: HutH, HutU, HutI, GluF, HutG, NfoD, ForI)
   Bacteroides thetaiotaomicron | 1 | Q8A4B3, Q8A4A9, Q8A4B1, Q8A4B0
   Desulfotalea psychrophila | 1 | gi51246205, gi51246204, gi51246203, gi51246202
   Halobacterium sp. | 2 | Q9HQD5, Q9HQD8, Q9HQD6, Q9HQD7
   Deinococcus radiodurans | 2 | Q9RZ06, Q9RZ02, Q9RZ05, Q9RZ04
   Bacillus subtilis | 2 | P10944, P25503, P42084, P42068
   Caulobacter crescentus | 3 | P58082, Q9A9MI, P58079, Q9A9M0, Q9A9L9
   Pseudomonas putida | 3 | Q88CZ7, Q88CZ6, Q88CZ9, Q88D00, Q88CZ3
   Xanthomonas campestris | 3 | Q8PAA7, P58988, Q8PAA6, Q8PAA8, Q8PAA5
   Listeria monocytogenes | |

39 Subsystem Spreadsheet: “The Populated Subsystem”
   (same spreadsheet as slide 38)
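
A populated subsystem can be thought of as a mapping from genome to functional role to annotated genes, with the variant code derived from which roles are filled. The sketch below shows that idea only. The role names, genome names, gene IDs and variant definitions are all hypothetical placeholders, since the slide does not state which roles make up each variant of Histidine Degradation.

```python
# Hypothetical variant definitions: variant code -> roles that must all be present.
# Assumption: these placeholders do not reproduce the real subsystem's variants.
VARIANTS = {
    "1": {"RoleA", "RoleB", "RoleC"},
    "2": {"RoleA", "RoleB", "RoleD"},
}

# Hypothetical populated subsystem: genome -> {functional role -> annotated gene IDs}.
SPREADSHEET = {
    "genome_1": {"RoleA": ["gene_11"], "RoleB": ["gene_12"], "RoleC": ["gene_13"]},
    "genome_2": {"RoleA": ["gene_21"], "RoleB": ["gene_22"], "RoleD": ["gene_23"]},
    "genome_3": {"RoleA": ["gene_31"]},
}


def variant_code(row, variants=VARIANTS):
    """Return the first variant whose required roles are all filled, else "-1"."""
    present = {role for role, genes in row.items() if genes}
    for code, required in variants.items():
        if required <= present:
            return code
    return "-1"  # subsystem is not functional in this genome


def missing_roles(row, required):
    """List the required roles that have no annotated gene in this row."""
    return sorted(role for role in required if not row.get(role))
```

Here `variant_code(SPREADSHEET["genome_3"])` gives "-1", and `missing_roles` points at the empty cells, which is where a curator would look for unannotated or misannotated genes.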

40 Subsystems developed based on
   Wet lab
   Chromosomal context
   Metabolic context
   Phylogenetic context
   Microarray data
   Proteomics data
   …

41 About 2,500 Subsystems
   Three level “hierarchy”
     Amino Acids and Derivatives
       - Alanine, serine, and glycine
           Serine Biosynthesis
     Amino Acids and Derivatives
       - Lysine, threonine, methionine, and cysteine
           Methionine Biosynthesis
   Make your own subsystems!

42 Growth in Subsystems Over Time

43 Classification | # SS
   Experimental Subsystems | 498
   Clustering-based subsystems | 352
   Carbohydrates | 160
   Cofactors, Vitamins, Prosthetic Groups, Pigments | 123
   Amino Acids and Derivatives | 96
   Protein Metabolism | 95
   Virulence, Disease, Defense | 70
   Miscellaneous | 70
   RNA Metabolism | 65
   Membrane Transport | 65
   Respiration | 62
   Cell Wall and Capsule | 62
   Fatty Acids, Lipids, and Isoprenoids | 60
   Regulation and Cell signaling | 51
   Virulence | 49
   Stress Response | 43
   DNA Metabolism | 41
   Aromatic Compounds | 38
   Phages | 36
   Secondary Metabolism | 34
   Iron acquisition and metabolism | 31
   Nucleosides and Nucleotides | 24
   Sulfur Metabolism | 20
   Dormancy and Sporulation | 17
   Plant-prokaryote | 12
   Nitrogen Metabolism | 12
   Motility and Chemotaxis | 11
   Plant cell walls and outer surfaces | 10
   Phages | 10
   Cell Division and Cell Cycle | 10
   Photosynthesis | 9
   Metabolite damage | 8
   Phosphorus Metabolism | 7
   Potassium metabolism | 4
   Transcriptional regulation | 2
   Plasmids | 2
   Central metabolism | 2
   Autotrophy | 2
   Arabinose Transport | 1

44 RAST usage grows...

45 RAST coverage....

46 RASTtk
   RAST 2.0
   Customizable choice of pipelines to run
   Same behind-the-scenes infrastructure

47 RASTtk

