Jonathan B. Puritz, Christopher M. Hollenbeck, and John R. Gold Fishing for selection, but only catching bias: library effects in double-digest RAD data.


Fishing for selection, but only catching bias: library effects in double-digest RAD data in a non-model marine species
Jonathan B. Puritz, Christopher M. Hollenbeck, and John R. Gold
Marine Genomics Laboratory, Harte Research Institute, Texas A&M University-Corpus Christi

Introduction
Double-digest restriction-site associated DNA sequencing (ddRAD) has become a powerful approach for population genomics, especially for non-model organisms (Peterson et al. 2012). However, once a population-genomics study extends beyond a single library, imprecise agarose gel size selection during library preparation can introduce bias in two ways: the same set of homologous genomic fragments may not be sequenced across all samples, or SNP allelic identity may be linked to RAD fragment length. In high gene flow species, such as many marine organisms, small amounts of systematic bias could overwhelm a weak signal of population structure and lead to spurious outlier SNPs being misinterpreted as candidates for adaptive loci.

Results
[Figure panel not recovered; surviving labels: 13/64 and 227/5246.]

Methods
A total of 532 individual red snapper (Lutjanus campechanus) from 15 localities were sequenced across four ddRAD libraries: two prepared with agarose gel-based size selection and two prepared with automated size selection on a Pippin Prep. Sequence data were processed with the dDocent pipeline (Puritz et al. 2014). Individuals from two geographic localities were sequenced in multiple libraries spanning both size-selection techniques. Each of these two localities was separated into library 'populations', and FST-based outlier detection (LOSITAN, Antao et al. 2008) was used to identify 1,751 SNPs with potential library bias (high FST between different library subsets from the same locality).
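The idea behind the library-bias screen above (high FST between library subsets drawn from the same locality) can be illustrated with a minimal sketch. This is not the LOSITAN procedure, which compares each locus against a simulated neutral distribution; it assumes a simple Nei-style FST computed from allele frequencies and a fixed cutoff. The function names, the genotype encoding (0/1/2 alternate-allele counts, `None` for missing), and the threshold value are all hypothetical.

```python
def allele_freq(genotypes):
    """Alternate-allele frequency from diploid genotypes coded 0/1/2 (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(called) / (2.0 * len(called))


def fst_two_groups(p1, p2):
    """Nei-style F_ST for one biallelic SNP, two groups weighted equally."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, groups pooled
    if h_t == 0.0:
        return 0.0  # monomorphic in both groups: no differentiation to measure
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-group
    return (h_t - h_s) / h_t


def flag_library_biased(snps, threshold):
    """Return IDs of SNPs whose F_ST between two library subsets exceeds threshold.

    snps maps a SNP ID to (genotypes_in_library_a, genotypes_in_library_b),
    where both genotype lists come from the SAME locality.
    """
    flagged = []
    for snp_id, (lib_a, lib_b) in snps.items():
        p_a, p_b = allele_freq(lib_a), allele_freq(lib_b)
        if p_a is None or p_b is None:
            continue  # SNP not genotyped in one of the libraries
        if fst_two_groups(p_a, p_b) > threshold:
            flagged.append(snp_id)
    return flagged


# Toy data: snp1 is fixed for different alleles in the two libraries,
# snp2 has the same allele frequency in both.
toy = {
    "snp1": ([0, 0, 0, 0], [2, 2, 2, 2]),
    "snp2": ([0, 1, 1, 2], [1, 1, 0, 2]),
}
biased = flag_library_biased(toy, threshold=0.2)  # threshold is an arbitrary assumption
```

Because both subsets come from one locality, any SNP flagged this way reflects a difference between libraries rather than between populations, which is the reasoning the screen relies on.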
The total SNP data set was then filtered on low-coverage individuals, minor allele frequency, call rate, minimum and maximum mean site depth, quality versus depth, allele balance, paired status, uneven balance of forward and reverse reads, SNP clusters, and Hardy-Weinberg equilibrium (HWE). The final SNP call set was separated into two subsets: one containing all loci, and one filtered to exclude the SNPs identified as having potential library bias. BayeScan (Foll and Gaggiotti 2008) and Discriminant Analysis of Principal Components (Jombart et al. 2010) were used to examine the potential effects of library-biased loci.

Conclusions
There is significant library bias across multiple ddRAD libraries, and biased loci are disproportionately found among outlier loci relative to neutral loci.
Stringent bioinformatic filtering does not remove all biased loci (~20% remain in inferred outliers), and even a few biased loci can overwhelm weak signatures of selection or population structure in a data set.
Library bias in ddRAD data could be mitigated by (i) randomizing individuals across libraries, and (ii) repeating localities or subsets of localities across libraries.

Acknowledgements
Field Collection (for data presented): North Carolina Department of Environmental and Natural Resources, Division of Marine Fisheries; Florida Fish and Wildlife Research Institute, Florida Fish and Wildlife Conservation Commission; Panama City Laboratory and Pascagoula Laboratory, Southeast Fisheries Science Center, National Marine Fisheries Service; and Crew of the Oregon II, National Marine Fisheries Service.
Funding: MARFIN Program of the National Marine Fisheries Service.

Literature Cited
Antao, T., Lopes, A., Lopes, R.J., Beja-Pereira, A., Luikart, G. BMC Bioinformatics.
Foll, M., and Gaggiotti, O.E. Genetics 180.
Jombart, T., Devillard, S., and Balloux, F. BMC Genetics.
Peterson, B.K., Weber, J.N., Kay, E.H., Fisher, H.S., Hoekstra, H.E. PLoS One.
Puritz, J.B., Hollenbeck, C.M., Gold, J.R. PeerJ.
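The first mitigation named in the Conclusions, randomizing individuals across libraries, might be planned along the following lines. This is only a sketch under stated assumptions: it spreads each locality's individuals as evenly as possible over the libraries (which also gives the repeated localities of the second mitigation), and the function name, input layout, and seed handling are hypothetical rather than taken from the poster.

```python
import random
from collections import defaultdict


def assign_libraries(samples, n_libraries, seed=None):
    """Randomly assign samples to libraries, balanced within each locality.

    samples: list of (sample_id, locality) tuples.
    Returns a dict mapping sample_id -> library index in range(n_libraries).
    """
    rng = random.Random(seed)
    by_locality = defaultdict(list)
    for sample_id, locality in samples:
        by_locality[locality].append(sample_id)

    assignment = {}
    for locality in sorted(by_locality):
        ids = by_locality[locality]
        rng.shuffle(ids)  # random order within the locality
        offset = rng.randrange(n_libraries)  # random starting library
        for i, sample_id in enumerate(ids):
            # Deal samples round-robin so every library sees every locality.
            assignment[sample_id] = (offset + i) % n_libraries
    return assignment


# Toy design: 3 localities x 8 individuals, spread over 4 libraries.
toy_samples = [(f"{loc}_{i}", loc) for loc in ("A", "B", "C") for i in range(8)]
plan = assign_libraries(toy_samples, n_libraries=4, seed=1)
```

With a design like this, library and locality are no longer confounded, so a library effect at a locus would appear within every locality instead of mimicking differentiation between them.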