1
SGM Meeting, Warwick, April 2006
Challenges for metagenomic data analysis and lessons from viral metagenomes
[What would you do if sequencing were free?]
Rob Edwards
San Diego State University
Fellowship for Interpretation of Genomes
2
Outline
The envy is not mine
A tour around the world, thanks to phage
People suck
What is the most successful gene in evolution?
Is there a Future?
3
This is all 454 sequence data
21 libraries: 10 microbial, 11 phage
597,340,328 bp total
20% of the human genome
50% of all complete and partial microbial genomes
5,769,035 sequences
Average 274,716 per library
Average read length bp
Av. read length has not increased in 7 months
Cost 0.04¢ per bp
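The per-library average and the total cost follow directly from the totals on this slide. A minimal sketch of that arithmetic is below; the function name and dictionary keys are illustrative, and the mean read length it derives is simply total bp divided by total reads, which is not necessarily the figure shown on the original slide.

```python
# Totals copied from the slide; everything else is derived arithmetic.
TOTAL_BP = 597_340_328
TOTAL_READS = 5_769_035
LIBRARIES = 21
COST_CENTS_PER_BP = 0.04  # cost quoted on the slide


def summarize(total_bp, total_reads, libraries, cents_per_bp):
    """Derive simple per-library and per-read statistics (illustrative helper)."""
    return {
        "reads_per_library": total_reads / libraries,
        "mean_read_length_bp": total_bp / total_reads,
        "total_cost_usd": total_bp * cents_per_bp / 100,  # cents to dollars
    }


stats = summarize(TOTAL_BP, TOTAL_READS, LIBRARIES, COST_CENTS_PER_BP)
for name, value in stats.items():
    print(f"{name}: {value:,.1f}")
```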
4
Sequencing is cheap and easy. Bioinformatics is neither.
5
The Soudan Mine, Minnesota
Red Stuff: oxidized
Black Stuff: reduced
6
Red and Black Samples Are Different
Cloned and 454-sequenced 16S are indistinguishable
[Figure labels: Black stuff, Red, Cloned]
7
There are different amounts of metabolism in each environment
8
There are different amounts of substrates in each environment
[Figure labels: Red Stuff, Black Stuff]
9
But are the differences significant?
Sample 10,000 proteins from site 1
Count frequency of each “subsystem”
Repeat 20,000 times
Repeat for sample 2
Combine both samples
Sample 10,000 proteins 20,000 times
Build 95% CI
Compare medians from sites 1 and 2 with 95% CI
Rodriguez-Brito (2006). BMC Bioinformatics
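The resampling steps on this slide can be sketched in a few lines. This is one reading of the slide, not the published implementation: it assumes sampling with replacement, uses a simple percentile interval for the 95% CI, and calls a subsystem significant when either site's median count falls outside the interval built from the pooled samples. All function names and the toy data are hypothetical, and the defaults are much smaller than the slide's 10,000 proteins and 20,000 repeats so the example runs quickly.

```python
import random
from statistics import median


def resampled_counts(annotations, subsystem, sample_size, repeats, rng):
    """Count `subsystem` in each of `repeats` random samples (with replacement)."""
    return [
        rng.choices(annotations, k=sample_size).count(subsystem)
        for _ in range(repeats)
    ]


def percentile_interval(values, level=0.95):
    """Return the central `level` interval of `values` by sorted position."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int((1 - tail) * last)]


def compare_sites(site1, site2, subsystem, sample_size=1000, repeats=500, seed=0):
    """Compare per-site median counts with a 95% interval from the pooled data."""
    rng = random.Random(seed)
    med1 = median(resampled_counts(site1, subsystem, sample_size, repeats, rng))
    med2 = median(resampled_counts(site2, subsystem, sample_size, repeats, rng))
    pooled = site1 + site2
    low, high = percentile_interval(
        resampled_counts(pooled, subsystem, sample_size, repeats, rng)
    )
    significant = not (low <= med1 <= high) or not (low <= med2 <= high)
    return {"median_site1": med1, "median_site2": med2,
            "ci": (low, high), "significant": significant}


# Toy data: each entry is the subsystem assigned to one protein (hypothetical).
red = ["iron acquisition"] * 300 + ["other"] * 700
black = ["iron acquisition"] * 100 + ["other"] * 900
result = compare_sites(red, black, "iron acquisition")
print(result)
```

Pooling the two sites gives the counts expected if the samples did not differ, which is why the per-site medians are checked against the pooled interval.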
10
Subsystem differences & metabolism Iron acquisition
Black Stuff:
Siderophore enterobactin biosynthesis, ferric enterobactin transport
ABC transporter ferrichrome
ABC transporter heme
Black stuff: ferrous iron (Fe2+, ferroan [(Mg,Fe)6(Si,Al)4O10(OH)8])
Red stuff: ferric iron (goethite [FeO(OH)])
11
Nitrification differentiates the samples
Edwards (2006) BMC Genomics
12
The challenge is explaining the differences between samples
Red Sample: Arg, Trp, His; ubiquinone; FA oxidation; chemotaxis, flagella; methylglyoxal metabolism
Black Sample: Ile, Leu, Val; siderophores; glycerolipids; NiFe hydrogenase; phenylpropionate degradation
13
We can cheaply compare the important biochemistry happening in different environments
We don’t care which organisms are doing the metabolism, but we know what organisms are there
14
Outline
The envy is not mine
A tour around the world, thanks to phage
People suck
What is the most successful gene in evolution?
Is there a Future?
15
Why Phages? Phages are viruses that infect bacteria
10:1 ratio of phages:bacteria
10^31 phages on the planet
Specific interactions: (probably) one virus : one host
Small genome size
Higher coverage
Horizontal gene transfer
bp DNA per year in the oceans
Can’t do fosmids
16
Phages in the World's Oceans
GOM: 41 samples, 13 sites, 5 years
SAR: 1 sample, 1 site, 1 year
BBC: 85 samples, 38 sites, 8 years
ARC: 56 samples, 16 sites
LI: 4 sites
17
Most Marine Phage Sequences are Novel
18
Phages are specific to environments
Phage Proteomic Tree v. 5 (Edwards, Rohwer)
[Tree labels: ssDNA -like, T4-like, T7-like]
Thanks: Mya Breitbart
19
Marine Single-Stranded DNA Viruses
6% of SAR sequences are ssDNA phage (Chlamydia-like Microviridae)
40% of viral particles in SAR are ssDNA phage
Several full-genome sequences were recovered via de novo assembly of these fragments
Confirmed by PCR and sequencing
20
SAR Aligned Against the Chlamydia phi 4
12,297 sequence fragments hit the Chlamydia phi 4 genome using TBLASTX over a ~4.5 kb genome
[Figure labels: individual sequence reads, coverage, concatenated hits]
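A coverage track like the one on this slide can be computed from the start and end coordinates of each hit on the reference genome. The sketch below is a generic way to do that, not the method used for the slide; the function name, the 0-based end-exclusive coordinate convention, and the toy intervals are all assumptions.

```python
def coverage_depth(genome_length, hits):
    """Per-base depth from (start, end) hit intervals (0-based, end exclusive)."""
    # Difference array: +1 where a hit begins, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in hits:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    # A running sum of the differences gives the depth at each position.
    depth, running = [], 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


# Toy example: three hits on a 10 bp reference (hypothetical coordinates).
depth = coverage_depth(10, [(0, 4), (2, 6), (5, 10)])
print(depth)                    # [1, 1, 2, 2, 1, 2, 1, 1, 1, 1]
print(sum(depth) / len(depth))  # mean depth: 1.3
```

The difference-array approach visits each hit once and each base once, so it stays fast even with thousands of fragments against a short genome.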
21
Outline
The envy is not mine
A tour around the world, thanks to phage
People suck
What is the most successful gene in evolution?
Is there a Future?
22
Phages, Reefs, and Human Disturbance
23
Phages, Reefs, and Human Disturbance
The Northern Line Islands Expedition, 2005
[Map labels: Kingman, Palmyra, Washington, Fanning, Christmas]
24
Christmas to Kingman Bias in No. Phage Hosts
Negative numbers mean relatively more phage hosts at Kingman.
More pathogens at Christmas. More people at Christmas.
More photosynthesis at Kingman. No people at Kingman.
25
Outline
The envy is not mine
A tour around the world, thanks to phage
People suck
What is the most successful gene in evolution?
Is there a Future?
26
Phages enrich for important genes
Rios Mesquites stromatolites: no photosynthesis genes in phages
Pozas Azules stromatolites: 5 different photosynthesis genes in phages
27
RNR is the most successful reaction in evolution
28
Outline
The envy is not mine
A tour around the world, thanks to phage
People suck
What is the most successful gene in evolution?
Is there a Future?
29
Computational Challenges
Sequence annotations and analysis: What is there? What is it doing? How is it doing it?
Gene predictions in unknowns: Lutz Krause (Bielefeld)
Sequence comparisons: BLAST; other ways to rapidly compare short sequences
What happens when everyone is using 454 sequencing?
30
Sequence data from 21 libraries
600 million bp
6 million sequences
Each BLASTX search takes 1,000 CPU hours
21 libraries = 21,000 CPU hours, or 2.4 CPU years
Users want repeat runs, TBLASTX, more analysis, more data, more, more, more, more
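The CPU-year figure is straightforward arithmetic on the slide's numbers, and a short sketch makes it easy to rerun for other library counts. It assumes a 365-day year and reads the 1,000 CPU hours as the cost of one BLASTX run per library; the wall-clock helper additionally assumes perfectly parallel work, which is an idealization. Function names and the 100-CPU example are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # assumes a 365-day year


def cpu_years(libraries, cpu_hours_per_library):
    """Total compute for one search per library, expressed in CPU years."""
    return libraries * cpu_hours_per_library / HOURS_PER_YEAR


def wall_clock_days(libraries, cpu_hours_per_library, cpus):
    """Elapsed days if the work splits evenly across `cpus` (idealized)."""
    return libraries * cpu_hours_per_library / cpus / 24


total_years = cpu_years(21, 1_000)
print(f"{total_years:.1f} CPU years")  # 2.4 CPU years
print(f"{wall_clock_days(21, 1_000, cpus=100):.2f} days on 100 CPUs")  # 8.75 days
```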
31
SDSU Forest Rohwer USF Mya Breitbart Rohwer Lab Stromatolites ANL
Beltran Rodriguez-Brito; USF: Mya Breitbart; Rohwer Lab: Linda Wegley, Florent Angly, Matt Haynes; Stromatolites: Janet Seifert (Rice University), Valeria Souza (UNAM, Mexico); ANL: Rick Stevens, Bob Olsen; CI Support; FIG: Veronika Vonstein, Ross Overbeek, Annotators; Also at SDSU: Anca Segall, Stanley Maloy; Math: Peter Salamon, Joe Mahaffy, James Nulton, Ben Felts, David Bangor, Steve Rayhawk, Jennifer Mueller; UBC: Curtis Suttle, Amy Chan; MIT: Ed DeLong