1
Annotating Metagenomes Using the NMPDR
Rob Edwards
Department of Computer Sciences, San Diego State University
Mathematics and Computer Sciences Division, Argonne National Laboratory
ASM General Meeting, Boston
www.nmpdr.org www.theseed.org
See also poster: B-179 (126B) Aziz et al.
2
How much has been sequenced?
[Figure: number of known sequences by year, marking the first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, and environmental sequencing]
3
How much will be sequenced?
[Figure labels: 100 people; everybody in Boston; everybody in USA; all cultured Bacteria; one genome from every species; most major microbial environments]
4
The Problem
How do you generate consistent and accurate annotations for metagenomes?
5
The SEED Family
6
Annotations using subsystems
FIG has developed the notion of a subsystem: a generalization of “pathway” as a collection of functional roles jointly involved in a biological process or complex.
Subsystems were extended into FIGfams: protein families whose members perform the same function.
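As a minimal sketch of the idea (not the SEED's actual data model), a subsystem can be pictured as a named list of functional roles plus a table recording which genes in each genome fill each role. The class name, method names, and all identifiers below (`Subsystem`, `assign`, `missing_roles`, `genome_1`, `gene_0001`, `role A`) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subsystem:
    """A named collection of functional roles (hypothetical model)."""

    name: str
    roles: list[str]
    # genome id -> functional role -> gene ids assigned to that role
    cells: dict[str, dict[str, list[str]]] = field(default_factory=dict)

    def assign(self, genome: str, role: str, gene: str) -> None:
        """Record that `gene` in `genome` performs `role`."""
        if role not in self.roles:
            raise ValueError(f"{role!r} is not a role of {self.name!r}")
        self.cells.setdefault(genome, {}).setdefault(role, []).append(gene)

    def missing_roles(self, genome: str) -> list[str]:
        """Roles of the subsystem with no gene assigned in `genome`."""
        present = self.cells.get(genome, {})
        return [role for role in self.roles if role not in present]


# Illustrative, made-up data only.
example = Subsystem("Example process", ["role A", "role B"])
example.assign("genome_1", "role A", "gene_0001")
print(example.missing_roles("genome_1"))
```

Listing the roles that are still empty for a genome is one simple way such a table could support consistent annotation: the same role names are reused across every genome in the subsystem.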
7
Subsystems make up metabolism
[Figure: Metabolism, from Wikipedia, http://en.wikipedia.org/wiki/Portal:Metabolism]
8
SEED Viewer
9
Populated Subsystem
10
Subsystems Are Not Just Pathways
predicted or measured co-regulation
genome context (virulence islands, prophages, conserved gene clusters)
virulence mechanism
cellular localization
enzymatic activity
common phenotype
combinations of criteria
11
Automated Annotations of Complete Genomes
Automated, user-originated processing
Takes 1-7 hours, depending on the size and complexity of the genome
~1,500 external submissions, including 150 genomes not yet publicly released
Reannotation of >500 genomes complete
789 users, 160 organizations, 25 countries
http://rast.nmpdr.org/
12
Automated Annotations of Complete Metagenomes
MG-RAST Server
Accurate and consistent annotations in a few days
Automatic metabolic reconstruction
Freely available after registration
http://metagenomics.theseed.org/
13
Metagenome Annotation
Automated pipeline
–upload sequences in FASTA, with or without Q-scores
–removes exact duplicates (454 artefact)
–renumbers sequences (mapping provided)
–BLAST against SEED nr, 16S rDNA
–Annotations and metabolic reenactment
–Taxonomic summary
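The duplicate-removal and renumbering steps of the pipeline can be sketched as follows. This is an illustration under stated assumptions, not the MG-RAST implementation: the function names (`read_fasta`, `dedupe_and_renumber`), the case-insensitive comparison, and the plain integer numbering scheme are all hypothetical choices.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_and_renumber(records):
    """Drop exact duplicate sequences and renumber the survivors.

    Returns (renumbered, mapping), where `renumbered` is a list of
    (new_id, sequence) pairs and `mapping` maps each new id back to the
    original header. Comparison ignores letter case (an assumption).
    """
    seen = set()
    renumbered = []
    mapping = {}
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            continue  # exact duplicate of an earlier read
        seen.add(key)
        new_id = str(len(renumbered) + 1)
        renumbered.append((new_id, seq))
        mapping[new_id] = header
    return renumbered, mapping


# Made-up reads: readC duplicates readA once line wrapping is removed.
fasta_text = ">readA\nACGT\nAC\n>readB\nGGGG\n>readC\nACGTAC\n"
reads, id_map = dedupe_and_renumber(read_fasta(fasta_text.splitlines()))
print(reads)
print(id_map)
```

Keeping the first occurrence of each sequence and returning the id mapping alongside the renumbered reads lets a user trace any downstream annotation back to the read name they uploaded.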
14
Metagenome Metabolic Reenactment
15
Phylogenomics
16
Comparing Metagenomes to Genomes (or other metagenomes!)
17
Metabolic potential in environments
18
MG-RAST computation
[Figure: hours of compute time vs. input size (MB)]
~19 hours of compute per input megabyte
19
How much so far
676 metagenomes
10,012,793,995 bp (10 Gbp)
Average: ~15 Mbp per metagenome
Compute time (on a single CPU): 190,243 hours = 7,926 days = 21 years
~200 GS20, ~200 FLX, ~200 Sanger
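The totals on this slide can be reproduced from the rate of ~19 hours of compute per input megabyte. The sketch below assumes one base per byte and 1 MB = 10^6 bytes, which is an assumption rather than something stated in the talk; the variable names are illustrative.

```python
total_bp = 10_012_793_995   # total bases across all metagenomes (from the slide)
n_metagenomes = 676         # number of metagenomes (from the slide)
hours_per_mb = 19           # approximate compute rate (from the previous slide)

# Average size per metagenome, in megabases.
avg_mbp = total_bp / n_metagenomes / 1e6

# Assumption: 1 base is about 1 byte, and 1 MB = 10**6 bytes.
input_mb = total_bp / 1e6
cpu_hours = input_mb * hours_per_mb
cpu_days = cpu_hours / 24
cpu_years = cpu_days / 365

print(f"average size: {avg_mbp:.1f} Mbp per metagenome")
print(f"compute: {cpu_hours:,.0f} hours = {cpu_days:,.1f} days = {cpu_years:.1f} years")
```

Under these assumptions the computed hours match the slide's 190,243, and the slide's day and year figures correspond to the computed values with the fraction dropped.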
20
Lots of sequences
All pyrosequencing
21
From Sequences To Environments
[Figure: CDA plot (axes: CDA 60.2%, CDA 21.7%). Functional categories: Sulfur, Respiration, Capsule, Motility, Membrane transport, Stress, Signaling, Phosphorus, RNA. Environments: Mine, Saltern, Marine, Microbialites, Coral, Fish, Animals, Freshwater.]
Dinsdale et al., Nature 2008
22
Upcoming Features
More user options (removing sequences, E-values, percent identities, etc.)
More databases (ACLAME, human, etc.)
More user-generated content (mash-ups) via web services and a published API
23
Accessing Data via Web Services
Thanks: Bahador Nosrat, SDSU
24
Workshops
Free workshops on NMPDR, RAST, MG-RAST, SEED
Upcoming workshops: Greece, Argonne, Urbana-Champaign, San Diego
Contact Leslie McNeil (lkmcneil@ncsa.uiuc.edu) or visit http://www.nmpdr.org/
25
Acknowledgements
Environmental Genomics: Forest Rohwer and the labs that provided sequence
Metagenomics Annotation Server: Rick Stevens, Daniel Paarman, Folker Meyer, Bob Olsen, Mark D'Souza
Statistics & Web services: Liz Dinsdale, Dana Hall, Beltran Rodriguez-Brito, Bahador Nosrat
FIG: Ross Overbeek, Veronika Vonstein, Annotators