A pseudomolecule of 774 Mb: the 3B experience

Presentation transcript:

A pseudomolecule of 774 Mb: the 3B experience Frédéric CHOULET INRA GDEC – Clermont-Ferrand, France

Sequenced physical map: 3B MTP-BAC sequencing
- MTP BACs: 8,452
- BAC pools: 922
- Roche 8 kb MP lib.: 922
- bp coverage (Roche/454): 36x
- BAC-ends (Sanger): 42,551
- Whole Genome Prof. tags: 327,282
- Whole 3B shotgun (Illumina): 82x

3B physical map
- BACs: 132,000 (19x)
- BAC-contigs: 1,282
- MTP-BACs: 8,452
- Size: 900 Mb

Assembly and scaffolding
- 3B-v1: 16,136 scaffolds, 1,040 Mb

Assembly and scaffolding
- 3B-v1: 16,136 scaffolds, 1,040 Mb, 18% Ns
- Curation of the scaffolding (V. Barbe, S. Mangenot, Genoscope): integration of BAC-end match positions; parsing of MP read positions
- 3B-v3: 4,999 scaffolds, 992 Mb, 13% Ns
[Figure: scaffolds scaff00001, scaff00013, scaff00024, scaff00008, scaff00011, scaff00007, scaff00005]

Assembly and scaffolding
- 3B-v1: 16,136 scaffolds, 1,040 Mb, 18% Ns
- Curation of the scaffolding (V. Barbe, S. Mangenot, Genoscope)
- 3B-v3: 4,999 scaffolds, 992 Mb, 13% Ns
- Gap filling and sequence error correction (JM. Aury, A. Couloux, Genoscope) with Illumina reads from the whole 3B shotgun: 109,914 gaps filled; 126,290 bases corrected (error rate: 0.1%)
- 3B-v4: 8% Ns
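The "% Ns" figures quoted for each assembly version are the share of unknown bases in the scaffolds. A minimal sketch of how such a figure can be computed is given here; the function name `n_fraction` and the toy sequences are illustrative assumptions, not part of the 3B pipeline.

```python
def n_fraction(sequences):
    """Return the fraction of N (unknown) bases across an iterable of sequences."""
    total = 0
    n_count = 0
    for seq in sequences:
        upper = seq.upper()
        total += len(upper)
        n_count += upper.count("N")
    if total == 0:
        raise ValueError("no bases supplied")
    return n_count / total


# Toy scaffolds, made up for illustration (not 3B data).
toy_scaffolds = ["ACGTNNNNAC", "NNACGTACGT"]
print(f"{n_fraction(toy_scaffolds):.0%} Ns")
```

Applied to a real scaffold set, the same count would be taken over every sequence in the FASTA file, so that gap filling shows up as a drop in this fraction.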

Assembly and scaffolding
- 3B-v1: 16,136 scaffolds, 1,040 Mb
- Curation of the scaffolding (V. Barbe, S. Mangenot, Genoscope)
- Gap filling and sequence error correction (JM. Aury, A. Couloux, Genoscope)
- 3B-v4: 4,999 scaffolds, 992 Mb
- Redundancy removal and scaffold merging (S. Theil, INRA GDEC) with scaffAssembler.pl: 160 Mb of redundancy between sequences from different pools (Pool_A ctg1, Pool_B ctg2)
- 3B-v443: 2,808 scaffolds, 833 Mb
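To illustrate why merging overlapping sequences from neighbouring pools removes redundancy, here is a toy sketch based on exact suffix-prefix overlap. It is not a description of scaffAssembler.pl, whose method is not given in the slides; the function names, the `min_len` threshold, and the example sequences are assumptions for illustration only.

```python
def longest_overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    min_len = max(min_len, 1)
    max_len = min(len(left), len(right))
    for size in range(max_len, min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_if_overlapping(left, right, min_len=3):
    """Merge two sequences on their overlap, or return None if they do not overlap."""
    size = longest_overlap(left, right, min_len)
    if size == 0:
        return None
    return left + right[size:]


# Toy contigs from two hypothetical pools; they share the 6 bases "ACTACA".
ctg1 = "ACGTAGACTACA"
ctg2 = "ACTACAGGTT"
merged = merge_if_overlapping(ctg1, ctg2)
print(merged, len(ctg1) + len(ctg2) - len(merged), "redundant bases removed")
```

Real BAC-pool scaffolds would need an alignment tolerant of sequencing errors and repeats rather than exact string matching; the sketch only shows the bookkeeping, where the merged length is the sum of both inputs minus the shared region.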

Search for shared TE-junctions

Ordering scaffolds: SNP discovery
- Input: 3B-v443, 2,808 scaffolds, 833 Mb
- SureSelect® sequence capture (E. Paux, N. Cubizolles, E. Rey): 52,265 baits designed with isbpProbeDesign.pl
- DNA captured from 10 genotypes
- 39,077 SNPs
[Figure labels: bait, TE, gene]

Ordering scaffolds: SNP discovery, then genotyping of mapping populations
- 3,075 SNPs genotyped
- Genetic mapping (P. Sourdille): anchor map, 384 individuals, Cs x Renan; plus neighbor map, 3,865 markers
- LD mapping (F. Balfourier): 367 lines from a core collection

Ordering scaffolds
- 3B genetic map: 366 bins
- LD map: 554 bins, 19 LD blocks

Linkage disequilibrium: 64 markers at the same genetic position
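Markers that share one genetic position cannot be separated by recombination in the mapping population, which is where LD measured across a panel of diverse lines becomes useful. Pairwise LD is commonly summarized by r². The sketch below computes it for two biallelic markers, under the assumption that every line is homozygous and coded 0/1; the function name and the toy genotypes are illustrative and not taken from the 3B data.

```python
def r_squared(alleles_a, alleles_b):
    """r^2 between two biallelic markers coded 0/1 across the same set of lines."""
    if len(alleles_a) != len(alleles_b) or not alleles_a:
        raise ValueError("markers must be scored on the same non-empty set of lines")
    n = len(alleles_a)
    p_a = sum(alleles_a) / n  # frequency of allele 1 at marker A
    p_b = sum(alleles_b) / n  # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return d * d / denom


# Toy genotypes for three hypothetical markers scored on eight lines.
m1 = [0, 0, 1, 1, 0, 1, 1, 0]
m2 = [0, 0, 1, 1, 0, 1, 1, 0]  # identical pattern to m1
m3 = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of m1
print(r_squared(m1, m2), r_squared(m1, m3))
```

Markers with high pairwise r² would fall in the same LD block, while lower values between blocks give a relative order that linkage mapping alone cannot provide.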

Ordering scaffolds: SNP discovery, genotyping of mapping populations, integration of physical map information
- pseudomolBuilder.pl
- Pseudomolecule: 1,358 scaffolds, 774 Mb (93%)
- Unlocalized: 1,450 scaffolds, 59 Mb (7%)
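The general idea of building a pseudomolecule from anchored scaffolds can be sketched as follows: scaffolds with a genetic position are sorted and joined with runs of N, and the rest are set aside as unlocalized. This is a simplified illustration, not the logic of pseudomolBuilder.pl; the function name, the `spacer_len` parameter, and the toy inputs are assumptions. The last lines check the slide's own totals.

```python
def build_pseudomolecule(scaffolds, positions_cm, spacer_len=100):
    """Join anchored scaffolds in genetic order, separated by runs of N.

    scaffolds:    dict mapping scaffold name to sequence
    positions_cm: dict mapping scaffold name to genetic position (cM);
                  scaffolds missing from this dict are left unlocalized
    """
    anchored = sorted(
        (name for name in scaffolds if name in positions_cm),
        key=lambda name: (positions_cm[name], name),
    )
    unlocalized = sorted(name for name in scaffolds if name not in positions_cm)
    spacer = "N" * spacer_len
    pseudomolecule = spacer.join(scaffolds[name] for name in anchored)
    return pseudomolecule, anchored, unlocalized


# Toy input: three hypothetical scaffolds, two of which carry mapped markers.
toy = {"scafA": "AAAA", "scafB": "CCCC", "scafC": "GGGG"}
toy_positions = {"scafB": 1.5, "scafA": 12.0}
print(build_pseudomolecule(toy, toy_positions, spacer_len=2))

# Bookkeeping with the figures from the slide.
anchored_mb, unlocalized_mb = 774, 59
total_mb = anchored_mb + unlocalized_mb
print(1358 + 1450, "scaffolds,", total_mb, "Mb")
print(f"{anchored_mb / total_mb:.0%} anchored, {unlocalized_mb / total_mb:.0%} unlocalized")
```

The totals agree with the previous step: 1,358 + 1,450 = 2,808 scaffolds and 774 + 59 = 833 Mb, which is the size of 3B-v443.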

- Orientation unknown: 48% of the sequence
- Micro-order unknown: 554 bins / 1,358 scaffolds
Future improvements
- RH map
- Optical map
- Long reads
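A scaffold can only be oriented when it carries markers at two or more distinct genetic positions; with 1,358 scaffolds spread over 554 bins, many scaffolds share a bin, and a scaffold whose markers all sit in one bin gives no directional signal. The sketch below shows that reasoning under simplifying assumptions: the function name and the `(offset_bp, position_cm)` input format are hypothetical, and only the outermost markers are compared.

```python
def infer_orientation(markers):
    """Infer scaffold orientation from its mapped markers.

    markers: list of (offset_bp, position_cm) tuples for one scaffold.
    Returns "+" if genetic position increases along the scaffold,
    "-" if it decreases, and "?" if the markers cannot decide.
    """
    if len(markers) < 2:
        return "?"
    ordered = sorted(markers)      # sort by physical offset on the scaffold
    first_cm = ordered[0][1]
    last_cm = ordered[-1][1]
    if last_cm > first_cm:
        return "+"
    if last_cm < first_cm:
        return "-"
    return "?"                     # all evidence falls in the same bin


print(infer_orientation([(100, 3.0), (9000, 4.2)]))  # markers in two bins
print(infer_orientation([(100, 3.0), (9000, 3.0)]))  # markers in one bin
```

Denser positional information, such as the RH map, optical map, or long reads listed as future improvements, would reduce the number of scaffolds that fall into the undecided case.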

Annotation of the 774 Mb pseudomolecule
- TRIANNOT: 7,264 protein-coding genes
- CLARI-TE: 234,606 TEs

Bioinformatics
- Assembly: Newbler, gapCloser, ssrFinishing
- Scaffolding/pseudomolecule construction: isbpProbeDesign.pl, scaffAssembler.pl, pseudomolBuilder.pl
- Annotation: triAnnot (new modules: filtering, pseudogenes, transfer annotation), clari-TE & clari-TE-lib
- Data management: gowDB (Bio::DB::SeqFeature::Store), GBrowse @ URGI

Pseudomolecule vs GenomeZipper 3B-CSS 3B-MTP

Acknowledgments
Catherine Feuillet, Sébastien Theil, Natasha Glover, Josquin Daron, Lise Pingault, Hélène Rimbert, Nelly Cubizolles, Etienne Paux, Pierre Sourdille, François Balfourier, Jacques Le Gouis, Nicolas Guilhot, Philippe Leroy, Aurélien Bernard
- URGI: M. Alaux, L. Couderc, V. Jamilloux, H. Quesneville
- Genoscope: A. Alberti, V. Barbe, J. Poulain, C. Durand, S. Mangenot, JM. Aury, A. Couloux, P. Wincker
- CNRGV: H. Berges, A. Bellec
- IEB: J. Dolezel, J. Safar
- TGAC: J. Rogers, M. Caccamo et al.
- BIA: C. Gaspin
- VIB: K. Vandepoele
- MIPS: K. Mayer et al.
- SAB: P. Schnable, S. Rounsley, D. Ware, J. Rogers, K. Eversole