1
High Throughput Computational Sequence Analysis
Rob Edwards
redwards@salmonella.org
Argonne National Laboratory
San Diego State University
2
How much has been sequenced
[Figure: number of known sequences by year. Marked points: first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, environmental sequencing.]
3
How much will be sequenced
[Figure labels: 100 people, everybody in San Diego, everybody in USA, all cultured Bacteria, one genome from every species, most major microbial environments.]
4
High Performance Computing
5
TeraGrid
6
The TeraGrid National Resource
7
Life Sciences Gateway to TeraGrid
8
Subsystems
9
Subsystems make up metabolism
[Image: Wikipedia Metabolism portal, http://en.wikipedia.org/wiki/Portal:Metabolism]
10
Subsystems are not just metabolism
[Image captions and sources, in slide order:]
http://aig.cs.man.ac.uk/gallery/Utopia/ Enzyme complex
http://webdeptos.uma.es/ Cell Machinery
http://www.brown.edu/ Cell Processes
11
http://www.theseed.org
13
Growth in generation of subsystems
14
Microbial Genomics Annotation Platform
Goal 1: Automate the generation of high quality annotations by leveraging the information contained in SubSystems and FIGfams.
Goal 2: Minimize turnaround time. Initial target: 48 hours.
15
Freely available annotation service
http://www.nmpdr.org/anno-server/index48.cgi
Automated process consisting of:
–Gene calling
–Initial annotation of function
–Initial metabolic reconstruction
Process takes 1-7 hours depending on size and complexity of the genome
~20 genomes per day
Password protected, secure, private
Release to public databases if required
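The two figures on this slide (1-7 hours per genome, ~20 genomes per day) together imply how many genomes are in flight at once. A rough sketch using Little's law (average jobs in progress = throughput × time per job). The assumption that the server runs at a steady rate around the clock is ours, not the slide's, and the function name is illustrative only.

```python
def concurrent_jobs(genomes_per_day: float, hours_per_genome: float) -> float:
    """Little's law: average jobs in progress = throughput * time per job.

    Assumes a steady 24-hour workload (an assumption, not stated on the slide).
    """
    genomes_per_hour = genomes_per_day / 24.0
    return genomes_per_hour * hours_per_genome


# Slide figures: ~20 genomes per day, 1-7 hours per genome.
low = concurrent_jobs(20, 1)
high = concurrent_jobs(20, 7)
print(f"Roughly {low:.2f} to {high:.2f} genomes being annotated at any moment")
```

Under these assumptions the stated throughput is consistent with a handful of annotation jobs running in parallel.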
16
Some estimate of annotation quality
17
Evaluation / Viewing
18
Download results
We provide a number of export formats:
–Genbank, Fasta, GFF3, Excel
–Can easily be extended to all formats supported by BioPerl
Genomes can be deleted by the user at any time (we keep them for a maximum of 120 days)
Genomes can be imported directly into the SEED if the user wishes
All genomes are password protected
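To make two of the export formats named above concrete, here is a minimal sketch of what writing gene calls as FASTA and GFF3 might look like. This is not the server's code (the slide says its exports build on BioPerl); the `Gene` record, its fields, the `source` label, and the sample data are all hypothetical. Only the format rules are standard: a `>` header line per FASTA record, and nine tab-separated columns per GFF3 feature after a `##gff-version 3` header.

```python
from typing import List, NamedTuple
from urllib.parse import quote


class Gene(NamedTuple):
    """Hypothetical record for one called gene (not the server's data model)."""
    gene_id: str
    contig: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    strand: str     # "+" or "-"
    function: str   # annotated function
    protein: str    # translated sequence


def to_fasta(genes: List[Gene], width: int = 60) -> str:
    """Write protein sequences as FASTA, wrapping sequence lines at `width`."""
    lines = []
    for g in genes:
        lines.append(f">{g.gene_id} {g.function}")
        for i in range(0, len(g.protein), width):
            lines.append(g.protein[i:i + width])
    return "\n".join(lines) + "\n"


def to_gff3(genes: List[Gene], source: str = "example") -> str:
    """Write genes as GFF3 CDS features (nine tab-separated columns)."""
    lines = ["##gff-version 3"]
    for g in genes:
        # GFF3 reserves ';', '=', '&' and ',' in column 9, so percent-encode
        # the free-text function (spaces are left as-is).
        attrs = f"ID={g.gene_id};product={quote(g.function, safe=' ')}"
        lines.append("\t".join([
            g.contig, source, "CDS",
            str(g.start), str(g.end),
            ".",        # score: not available
            g.strand,
            "0",        # phase: CDS starts on a codon boundary
            attrs,
        ]))
    return "\n".join(lines) + "\n"


# Made-up sample data for illustration only.
sample = [Gene("gene1", "contig1", 1, 90, "+", "DNA polymerase; subunit A", "MKVLAAGIT")]
print(to_fasta(sample))
print(to_gff3(sample))
```

Keeping each writer as a small function over the same record type mirrors the slide's point that further formats can be added without changing the annotation data itself.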
19
Metagenomics SEED
20
http://metagenomics.theseed.org
21
Metagenome Metabolic Reconstruction
22
Starch utilization in cow rumens
23
Metabolic potential in environments
24
Too much will be sequenced
[Figure labels: 100 people, everybody in San Diego, everybody in USA, all cultured Bacteria, one genome from every species, most major microbial environments.]
25
Acknowledgements
Argonne National Laboratory: Rick Stevens, Bob Olson, Folker Meyer
San Diego State University: Forest Rohwer
Fellowship for Interpretation of Genomes: Ross Overbeek, Veronika Vonstein
The Annotators