Environmental Genome Shotgun Sequencing of the Sargasso Sea. Venter et al. (2004). Presented by Ken Vittayarukskul and Steven S. White.

Context of the Problem
- Evolutionary history is directly tied to microbial genetics, yet little is known about it.
- Until recently, microbial diversity was measured by PCR amplification and sequencing of ribosomal genes only.
- This approach identified 20 major phyla in Bacteria and Archaea.
- Two critical limitations:
  - undersampling of total genes
  - undersampling of the rest of the genome
- Larger microbes cannot be cultured ex situ and are thus ignored.

Objective of Venter et al.: What They Did
- Sought to obtain a more representative study of the "gene content, genetic diversity, and relative abundance of the microbial population found in the oceans."
- How did they go about doing this?
  - Sampled four sites in the Sargasso Sea (RV Weatherbird II and SV Sorcerer II)
  - Extracted DNA from 0.1 to 3.0 µm-filtered seawater
  - Applied whole-genome shotgun sequencing to the DNA samples for the above purpose

Why the Sargasso Sea?
- Inhospitable for most forms of life:
  - weak currents and light winds block nutrient-rich water
  - high salinity
- Eddies pump nutrient-rich water from the ocean floor, which sustains a rich microbial population
(from oceanservice.noaa.gov)

Whole-Genome Shotgun Sequencing
(from MMG Genetics and Genomics Wiki)

Bias in Sequencing a High Volume of Genomes
- Computational assembly algorithms are set according to depth of coverage, which in turn depends on variation in genome size and on the relative abundance of specific genomes.
- Less abundant species are expected to be represented by only a few sequences.
- Thus, when coverage depth is set to identify unique regions for backbone assembly of all these genomes, the more abundant genomes are labeled as repetitive.
- As a result, more abundant genomes are assembled more poorly (from Genomenewsnetwork.org).
- Alleviated by manual assembly of large, nonrepetitive contigs; expected coverage was then based on these assemblies.
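To make the link between relative abundance and coverage depth concrete, here is a minimal sketch in Python. It is not the authors' method: it swaps in the standard Lander-Waterman approximation (depth = reads x read length / genome size, fraction covered = 1 - e^(-depth)). The function name, genome sizes, abundances, read count, and read length are all hypothetical values chosen for illustration.

```python
import math

def expected_coverage(total_reads, read_len, genomes):
    """Estimate per-genome coverage in a mixed sample.

    genomes maps a name to (genome_size_bp, relative_abundance).
    All inputs here are illustrative assumptions, not values from the paper.
    """
    # Each genome's share of the DNA pool scales with abundance * genome size.
    dna_share = {name: size * abund for name, (size, abund) in genomes.items()}
    pool = sum(dna_share.values())

    result = {}
    for name, (size, _abund) in genomes.items():
        reads = total_reads * dna_share[name] / pool   # reads drawn from this genome
        depth = reads * read_len / size                # Lander-Waterman depth
        covered = 1.0 - math.exp(-depth)               # expected fraction covered
        result[name] = (depth, covered)
    return result

if __name__ == "__main__":
    # Hypothetical community: two equally sized genomes, 9:1 abundance.
    community = {
        "abundant": (2_000_000, 0.9),
        "rare": (2_000_000, 0.1),
    }
    for name, (depth, covered) in expected_coverage(100_000, 800, community).items():
        print(f"{name}: depth {depth:.1f}x, fraction covered {covered:.4f}")
```

With these made-up inputs the abundant genome is sampled nine times as deeply as the rare one, which is the kind of spread that makes a single coverage threshold misclassify abundant genomes as repeats.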

Figure 1
- Satellite image of ocean chlorophyll around their collection site.
- Station 3 experienced elevated chlorophyll levels relative to 1, 11, and 13.

Figure 2
- Scaffold sets representing a conglomerate of Prochlorococcus strains
- Prochlorococcus MED4 genome (outer ring)
- Chromosome map depicting conserved gene order
- Color = position: red = start; blue = end; black = non-conserved gene
- Mismatching colors/positions are likely the result of chromosomal rearrangements.

Figure 3: Uncultured Marine Archaeon
- Compares scaffolds to crenarchaeal clone 4B7.
- Predicted 4B7 proteins and scaffolds show significant homology and are arrayed in positional order.
- BLASTp matches scored at least 25% similarity.
- Lines delineate scaffold borders.

Figure 4: Megaplasmids
- Circular diagrams of 9 complete megaplasmids
- Depths ranging from 4 to 36
- Inner circles = reverse coding genes
- Genes colored according to category [Table 1]

Table 1
- Breakdown of predicted genes by category.
- 28,023 genes sorted into multiple categories.
- 1,214,207 genes vs. 137,885 sequences currently archived.
- Additional hypothetical genes identified via conserved open reading frames.
- 69,901 novel genes identified.

Figure 5A: Prochlorococcus-related scaffold
- Sample of multiple sequence alignment
- Blue = contigs
- Green = fragments
- Yellow = assembly stages used to create contigs
- Several fragments were collapsed to form the final contig.
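The idea of collapsing overlapping fragments into a contig can be sketched with a toy greedy overlap merge in Python. This is an illustration only, assuming error-free reads on a single strand; it is not the assembler used in the paper, and the function names and sequences are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Toy model: ignores sequencing errors, reverse complements, and repeats.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags

if __name__ == "__main__":
    # Hypothetical fragments that tile a short made-up sequence.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Fragments without sufficient overlap stay as separate contigs, which loosely mirrors why low-coverage genomes end up fragmented.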

Figure 5B
- Global structure of the previously mentioned scaffold, with respect to assembly.

Figure 6
- Depiction of the phylogenetic diversity observed
- Phylogenetic markers: 16S rRNA, EF-G, EF-Tu, HSP70, RecA, and RpoB
- Identified via HMM and BLAST library comparison/search

Figure 7
- Detection of proteorhodopsin and its homologs
- Detections labeled according to where samples were gathered
- Proteorhodopsin encodes a light-driven proton pump.
- Allows light-driven energy capture without chlorophyll

Shortcomings in Research Methodology
- Their BAC libraries were created in bacteria from either terrestrial or nutrient-rich ocean environments.
- Sampling sites may have biased their estimates of biodiversity.
- Despite their effort, they compiled only two nearly complete genomes, and those with the help of fully sequenced templates from a microbe database.
- All of this work was done on relatively small organisms.
- Employing this method on more complex organisms may entail orders of magnitude more work.

Further reading
- Falkowski, P. G., Vargas, C. Shotgun Sequencing in the Sea: A Blast from the Past? Science. 304, (2004).
- Selvaraj, S., Dixon, J. R., Bansal, V., Ren, B. (2013). Whole-genome haplotype reconstruction using proximity ligation and shotgun sequencing. Nature Biotechnology. 31,
- Ruder, K. (2004). Exploring the Sargasso Sea. Retrieved from

End Summary: The Takeaway
- Whole-genome shotgun sequencing was applied to the genomes of microbial populations in the Sargasso Sea.
- Gene content, genetic diversity, and relative abundance of species were elucidated from over a billion bp of nonredundant DNA sequence.
- 1800 known microbial species identified; 148 new microbial phylotypes detected.
- 1.2 million new genes identified, including more than 782 new rhodopsin-like photoreceptors.
- "Tip of the iceberg": the data here alone suggest massive microbial diversity.