Environmental Genome Shotgun Sequencing of the Sargasso Sea
  J. Craig Venter, Karin Remington, John F. Heidelberg, Aaron L. Halpern, Doug Rusch, Jonathan A. Eisen, Dongying Wu, Ian Paulsen, Karen E. Nelson, William Nelson, Derrick E. Fouts, Samuel Levy, Anthony H. Knap, Michael W. Lomas, Ken Nealson, Owen White, Jeremy Peterson, Jeff Hoffman, Rachel Parsons, Holly Baden-Tillson, Cynthia Pfannkoch, Yu-Hui Rogers, Hamilton O. Smith
  Bianca Sanchez Mora, Meghana Munagala
  D145: Genomics
  January 26, 2017
Introduction
  Goal: use environmental shotgun sequencing to study microbes that are difficult to study
  To find new solutions for energy problems; the ocean may have answers
  Looking for unique genes to create alternative energy or clean up the environment
  To catalogue poorly understood microorganisms in a database for researchers
  To stimulate more research
Why the Sargasso Sea?
  Well studied
  Defined boundaries
  Low nutrient
  Cyanobacteria (Synechococcus, Prochlorococcus) = chlorophyll rich
Whole Genome Shotgun Sequencing
  Created by Frederick Sanger
  A method used to sequence whole genomes by randomly fragmenting DNA, then reassembling the overlapping fragments
  Initially used to determine the sequence of a single organism
  Here it is applied to find representative sequences from a diverse population at once
  Thermocline: http://oceanservice.noaa.gov/facts/thermocline.html
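The fragment-then-reassemble idea can be sketched in a few lines of Python. This is a toy illustration only, not the assembler used in the paper: the function names, the greedy merge strategy, and the tiny example sequence are all assumptions made for the sketch, and real assemblers also handle sequencing errors, repeats, and both DNA strands.

```python
import random


def shotgun_fragments(genome, read_len, n_reads, seed=0):
    """Cut n_reads random fragments of length read_len out of genome."""
    rng = random.Random(seed)
    starts = [rng.randrange(0, len(genome) - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap (toy greedy assembly)."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return reads  # no overlaps left: these are the final contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)


# Hypothetical 14-base "genome" and three overlapping reads taken from it.
example_reads = ["ATGCGTAC", "GTACGTTA", "CGTTAGC"]
print(greedy_assemble(example_reads))
```

With many genomes mixed in one sample, the same overlap logic applies, but reads from closely related organisms can overlap each other, which is one reason environmental assembly is harder than single-genome assembly.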
WGS Procedure
Environmental Sequencing
  The sequencing of naturally occurring populations and communities
  Circumvents the need for culturing
  Overcomes selection/culture bias
  Aims to capture the full measure of diversity through the sequencing of genomes
Comparing Techniques
  PCR-based Methods
    Previous method: used PCR to assess microbial diversity
    Limitations:
      Undersampling of genotypes
      Small subsample of genomes
  Whole-Genome Shotgun Sequencing
    Pros:
      Fast
      Able to sequence large genomes
      Diverse phylogenetic markers
    Challenges:
      Assembly of millions of DNA fragments
Gene Conservation among Prochlorococcus
  Figure key (symbols lost): start of genome; end of genome
  Figure labels: Completed Genomic Sequence; Environmentally Sequenced Fragments
  *Similarities between sequences: the method works on environmental samples and is able to detect variation
  *Differences between the environmental sequence and the known sequence: diversity
Sargasso Sea v. Crenarchaeal clone 4B7 Scaffolds
  tBLASTx = translated BLAST: compares the six-frame translations of a nucleotide query against the six-frame translations of nucleotide sequences
  Purpose is to find distant relationships between nucleotide sequences
  Crenarchaeal clone 4B7 = archaea
  Sim = similarity
  4B7 = the known clone
  Numbers = predicted proteins
  Line = alignment = similarity
  Q: What are they comparing?
  Nothing below 25%; mostly similar
  Look up tBLASTx
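To make the "translated" part of tBLASTx concrete, here is a minimal sketch of six-frame translation, the step that turns each nucleotide sequence into six candidate protein sequences before they are compared. It is not BLAST itself and does no alignment; the function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Translate seq in three forward (+1..+3) and three reverse (-1..-3) frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Comparing at the protein level is what lets the search pick up distant relationships: two DNA sequences can differ at many positions and still encode similar proteins.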
Megaplasmids found in Scaffold Set
  More diversity in this set than seen previously
  9 megaplasmids
  Unable to separate other megaplasmids, maybe because they diverged too far, or possibly because of shorter assemblies
  *Main point: able to recover plasmids, including large ones
Prochlorococcus-related Scaffold M
  Examined the multiple sequence alignments of contigs
  Identified 2 distinct classes
  Shows a homogeneous blend of haplotypes
  Figure key (symbols lost): contigs; fragments; stages of assembly of fragments into resulting contigs
Multiple Sequence Alignment Sample
  Prochlorococcus scaffolds show differences, meaning there is a diverse population
  A comparison of scaffolds shows variations
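One simple way to see "variation" in a multiple sequence alignment is to list the columns where the aligned sequences disagree. The sketch below assumes the sequences are already aligned to equal length with "-" for gaps; the function name and the four short sequences are made up for illustration and are not data from the paper.

```python
from collections import Counter


def variable_columns(alignment):
    """Return (column index, base counts) for every column with more than one base.

    alignment: list of equal-length aligned sequences; '-' marks a gap and is ignored.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        if len(counts) > 1:
            result.append((col, dict(counts)))
    return result


# Hypothetical aligned fragments covering the same region.
toy_alignment = [
    "ACGTTGCA",
    "ACGTTGCA",
    "ACATTGCA",
    "ACATTGTA",
]
for col, counts in variable_columns(toy_alignment):
    print(col, counts)
```

Columns where the fragments split into consistent groups, rather than differing at random, are the kind of signal that points to distinct haplotypes in the population.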
Phylogenetic Markers of Sargasso Sea Sequences
  Diversity of Sargasso Sea sequences using multiple phylogenetic markers
  Relative contribution of organisms from different major phylogenetic groups (phylotypes)
Phylogenetic tree of rhodopsin-like genes
  Proteorhodopsin: a light-driven proton pump first discovered in a BAC (bacterial artificial chromosome) clone
Challenges/Improvements
  Problems in the method of sample collection
  Possible bias in diversity estimation
  Difficult to distinguish between the DNA sequences of closely related organisms
  Manual curation
  Rare organisms are less likely to be sequenced: low sequence coverage
  The real limitation was cost
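Why rare organisms end up with low coverage can be shown with a back-of-the-envelope calculation. The sketch below uses the standard Lander-Waterman approximation (coverage c = reads x read length / genome size, and the chance a given base is missed is about e^-c). This is a simplified model, not the paper's analysis: the function names and every number in the example are hypothetical, and "abundance" here means the fraction of all reads that come from the organism.

```python
import math


def expected_coverage(total_reads, read_len, genome_size, abundance):
    """Average number of reads covering each base of one organism's genome.

    abundance: fraction of all reads in the sample that come from this organism.
    """
    return abundance * total_reads * read_len / genome_size


def fraction_covered(coverage):
    """Expected fraction of the genome hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-coverage)


# Hypothetical sample: 1,000,000 reads of 800 bp, organisms with 2 Mbp genomes.
for abundance in (0.1, 0.001, 0.0001):
    cov = expected_coverage(1_000_000, 800, 2_000_000, abundance)
    print(f"abundance {abundance}: coverage {cov:.2f}x, "
          f"genome seen {fraction_covered(cov):.1%}")
```

Since coverage scales linearly with the number of reads, reaching rare members of the community means sequencing much more, which ties back to cost being the real limitation.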
Conclusion/Significance
  A new use for shotgun sequencing: studying environmental genomes
  Microbes were added to an online public database for research use
  An estimated 1,800 genomic species, including 148 previously unknown bacterial phylotypes, and over 1.2 million previously unknown genes were found
  Genes found in the samples indicate that microbial life in the ocean is more abundant and diverse than expected
Additional Reading
  For another perspective:
    Falkowski, P.G., de Vargas, C., Science 304, 58 (2004); published online 4 March 2004 (10.1126/science.1097146).
  The following expedition:
    Rusch DB, Halpern AL, Sutton G, Heidelberg KB, Williamson S, Yooseph S, et al. (2007) The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific. PLoS Biol 5(3): e77. doi:10.1371/journal.pbio.0050077
References
  Gross L (2007) Untapped Bounty: Sampling the Seas to Survey Microbial Biodiversity. PLoS Biol 5(3): e85. doi:10.1371/journal.pbio.0050085
  Falkowski, P.G., de Vargas, C., Science 304, 58 (2004); published online 4 March 2004 (10.1126/science.1097146).
  Tringe, S.G., Rubin, E.M., Nature 6, 805- 81; published online