Published by Emerald Carmella Harris. Modified over 8 years ago.
1
Annotation
2
Traditional genome annotation
3
BLAST Similarities
4
Traditional genome annotation BLAST Similarities
16
Protein Families
20
Gene Ontology
  Ontology: a “hierarchy” of functions
    Does not need to be linear
    Directed Acyclic Graph
  Controlled vocabulary: decides which words or phrases to use
21
GO: Gene Ontology
  A eukaryotic focus: Drosophila, Mus, Saccharomyces, Homo
22
GO
  Cellular component: the parts of a cell
  Molecular function: e.g. ligand binding
  Biological processes: what things do
23
GO Terms: [GO ID, function]
  e.g. GO:0004743
    Ontology: molecular function
    Name: pyruvate kinase activity
24
GO Terms: [GO ID, function]
  e.g. GO:0004743
    Ontology: molecular function
    Name: pyruvate kinase activity
  Mainly assigned by BLAST, HMMER, etc.
25
Directed Acyclic Graph
  Molecular function
  Catalytic activity
  Transferase activity
  Transferase activity, transferring phosphorus-containing groups
  Kinase activity
  Phosphotransferase activity, alcohol group as acceptor
  Pyruvate kinase activity
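The chain of terms on this slide can be sketched as a small directed acyclic graph in Python. This is only an illustration: the `PARENTS` dictionary and the `ancestors` helper are hypothetical names, and giving "pyruvate kinase activity" two parents is an assumption about how the slide's figure was drawn, not data loaded from the ontology itself.

```python
# Hypothetical in-memory encoding of the slide's example terms.
# Each term maps to its parent terms; a term may have more than one parent,
# which is what makes this a DAG rather than a strict tree.
TRANSFERRING_P = "transferase activity, transferring phosphorus-containing groups"
ALCOHOL_ACCEPTOR = "phosphotransferase activity, alcohol group as acceptor"

PARENTS = {
    "pyruvate kinase activity": ["kinase activity", ALCOHOL_ACCEPTOR],  # assumed
    "kinase activity": [TRANSFERRING_P],
    ALCOHOL_ACCEPTOR: [TRANSFERRING_P],
    TRANSFERRING_P: ["transferase activity"],
    "transferase activity": ["catalytic activity"],
    "catalytic activity": ["molecular function"],
    "molecular function": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


if __name__ == "__main__":
    for name in sorted(ancestors("pyruvate kinase activity")):
        print(name)
```

A gene annotated with the most specific term is implicitly annotated with all of its ancestors, which is why the traversal collects every term up to the root.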
26
Problems
  Annotation by committee
  Eukaryotic focus
    Some efforts to counter that: Owen White, Ariane Toussaint
  Not very deep
  Strict controlled vocabulary
27
Alternatives
28
Basic biology: lacZ, lacI, lacY, lacA (Jacob & Monod, 1961)
29
Basic biology: lacZ, lacI, lacY, lacA
30
Different types of clustering (< 80% identity)
32
Purine metabolism
33
Different types of clustering (< 80% identity)
34
Heme / chlorophyll metabolism is conserved: they are both porphyrins
35
[Figure: Occurrence of clustering in different genomes. Groups: Actinobacteria, Aquificae, Bacteroidetes, Chlamydiae, Chloroflexi, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Spirochaetes, Thermotogae, Proteobacteria, Average. Axes: fraction of genes in clusters (0 to 1); number of genomes (0 to 120). Legend: clusters of genes with maximum 80% identity; genes in subsystems in clusters; total number of genomes in group.]
36
The Subsystems Approach to Annotation
  Subsystem is a generalization of “pathway”
    Collection of functional roles jointly involved in a biological process or complex
  Functional Role is the abstract biological function of a gene product
    Atomic, or user-defined. Examples:
      6-phosphofructokinase (EC 2.7.1.11)
      LSU ribosomal protein L31p
      Streptococcal virulence factors
    Should not contain “putative”, “thermostable”, etc.
  Populated subsystem is the complete spreadsheet of functions and roles
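The naming rule for functional roles can be sketched as a simple check in Python. The `DISCOURAGED` tuple and the `discouraged_words` helper are hypothetical names, and the word list holds only the two qualifiers the slide names; the slide's "etc" implies more that are not listed here.

```python
import re

# Assumption: only the two qualifiers named on the slide are checked.
DISCOURAGED = ("putative", "thermostable")


def discouraged_words(role):
    """Return the discouraged qualifiers that appear as whole words in a role name."""
    words = re.findall(r"[a-z]+", role.lower())
    return [word for word in DISCOURAGED if word in words]


if __name__ == "__main__":
    for role in (
        "6-phosphofructokinase (EC 2.7.1.11)",
        "LSU ribosomal protein L31p",
        "Putative thermostable lipase",  # made-up name, for illustration only
    ):
        print(role, "->", discouraged_words(role) or "ok")
```

Matching on whole words keeps the check from firing on role names that merely contain one of the qualifiers as a substring.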
37
Histidine Degradation
  Conversion of histidine to glutamate
  Functional roles defined in table
  Inclusion in subsystem is only by functional role
  Controlled vocabulary
  …
38
Subsystem Spreadsheet
  Column headers taken from table of functional roles
  Rows are selected genomes or organisms
  Cells are populated with specific, annotated genes
  Functional variants defined by the annotated roles
  Variant code -1 indicates subsystem is not functional
  Clustering shown by color

  Role columns: HutH, HutU, HutI, GluF, HutG, NfoD, ForI

  Organism                      Variant  Genes (in slide order)
  Bacteroides thetaiotaomicron  1        Q8A4B3 Q8A4A9 Q8A4B1 Q8A4B0
  Desulfotalea psychrophila     1        gi51246205 gi51246204 gi51246203 gi51246202
  Halobacterium sp.             2        Q9HQD5 Q9HQD8 Q9HQD6 Q9HQD7
  Deinococcus radiodurans       2        Q9RZ06 Q9RZ02 Q9RZ05 Q9RZ04
  Bacillus subtilis             2        P10944 P25503 P42084 P42068
  Caulobacter crescentus        3        P58082 Q9A9M1 P58079 Q9A9M0 Q9A9L9
  Pseudomonas putida            3        Q88CZ7 Q88CZ6 Q88CZ9 Q88D00 Q88CZ3
  Xanthomonas campestris        3        Q8PAA7 P58988 Q8PAA6 Q8PAA8 Q8PAA5
  Listeria monocytogenes
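A populated subsystem of this kind can be sketched as a list of rows in Python. The names `ROWS`, `organisms_by_variant`, and `is_functional` are hypothetical, and variant codes are stored as strings by assumption. The slide text does not preserve which role column each gene ID sits in, so the IDs are kept as plain lists rather than mapped to roles.

```python
from collections import defaultdict

# Three rows copied from the slide: (organism, variant code, gene IDs in slide order).
# The role column of each gene ID is not recoverable, so no role mapping is attempted.
ROWS = [
    ("Bacteroides thetaiotaomicron", "1", ["Q8A4B3", "Q8A4A9", "Q8A4B1", "Q8A4B0"]),
    ("Halobacterium sp.", "2", ["Q9HQD5", "Q9HQD8", "Q9HQD6", "Q9HQD7"]),
    ("Pseudomonas putida", "3", ["Q88CZ7", "Q88CZ6", "Q88CZ9", "Q88D00", "Q88CZ3"]),
]


def organisms_by_variant(rows):
    """Group organism names by their variant code."""
    groups = defaultdict(list)
    for organism, variant, _genes in rows:
        groups[variant].append(organism)
    return dict(groups)


def is_functional(variant):
    """Per the slide, variant code -1 marks a subsystem that is not functional."""
    return variant != "-1"


if __name__ == "__main__":
    for variant, organisms in sorted(organisms_by_variant(ROWS).items()):
        print(variant, organisms, "functional" if is_functional(variant) else "not functional")
```

Grouping by variant code mirrors how the spreadsheet is read: organisms that share a code carry the same set of annotated roles.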
39
Subsystem Spreadsheet: “The Populated Subsystem”
40
Subsystems developed based on
  Wet lab
  Chromosomal context
  Metabolic context
  Phylogenetic context
  Microarray data
  Proteomics data
  …
41
About 2,500 Subsystems
  Three-level “hierarchy”
    Amino Acids and Derivatives
      Alanine, serine, and glycine
        Serine Biosynthesis
    Amino Acids and Derivatives
      Lysine, threonine, methionine, and cysteine
        Methionine Biosynthesis
  Make your own subsystems!
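The three levels (category, subcategory, subsystem) can be sketched as a nested dictionary in Python. `HIERARCHY` and `find_subsystem` are hypothetical names, and only the two examples from the slide are included.

```python
# Category -> subcategory -> list of subsystems, using the slide's two examples.
HIERARCHY = {
    "Amino Acids and Derivatives": {
        "Alanine, serine, and glycine": ["Serine Biosynthesis"],
        "Lysine, threonine, methionine, and cysteine": ["Methionine Biosynthesis"],
    },
}


def find_subsystem(name, hierarchy=HIERARCHY):
    """Return (category, subcategory) for a subsystem, or None if it is absent."""
    for category, subcategories in hierarchy.items():
        for subcategory, subsystems in subcategories.items():
            if name in subsystems:
                return category, subcategory
    return None


if __name__ == "__main__":
    print(find_subsystem("Methionine Biosynthesis"))
```

Adding a new subsystem is then a matter of appending its name under the right category and subcategory.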
42
Growth in Subsystems Over Time
43
Classification                                      # SS
Experimental Subsystems                             498
Clustering-based subsystems                         352
Carbohydrates                                       160
Cofactors, Vitamins, Prosthetic Groups, Pigments    123
Amino Acids and Derivatives                         96
Protein Metabolism                                  95
Virulence, Disease, Defense                         70
Miscellaneous                                       70
RNA Metabolism                                      65
Membrane Transport                                  65
Respiration                                         62
Cell Wall and Capsule                               62
Fatty Acids, Lipids, and Isoprenoids                60
Regulation and Cell signaling                       51
Virulence                                           49
Stress Response                                     43
DNA Metabolism                                      41
Aromatic Compounds                                  38
Phages                                              36
Secondary Metabolism                                34
Iron acquisition and metabolism                     31
Nucleosides and Nucleotides                         24
Sulfur Metabolism                                   20
Dormancy and Sporulation                            17
Plant-prokaryote                                    12
Nitrogen Metabolism                                 12
Motility and Chemotaxis                             11
Plant cell walls and outer surfaces                 10
Phages                                              10
Cell Division and Cell Cycle                        10
Photosynthesis                                      9
Metabolite damage                                   8
Phosphorus Metabolism                               7
Potassium metabolism                                4
Transcriptional regulation                          2
Plasmids                                            2
Central metabolism                                  2
Autotrophy                                          2
Arabinose Transport                                 1
44
RAST usage grows...
45
RAST coverage...
46
RASTtk
  RAST 2.0
  Customizable choice of pipelines to run
  Same behind-the-scenes infrastructure
47
RASTtk