Finishing tomato chromosomes #6 and #12 using a Next Generation whole genome shotgun approach

Roeland van Ham, CBSG, NL
René Klein Lankhorst, EUSOL
Giovanni Giuliano, ENEA, IT
Giorgio Valle, Univ. Padua, IT
Michiel van Eijk, KeyGene, NL
Satoshi Tabata, Kazusa, JP

Status International project ( )

►Overall progress is slow
►Several chromosomes have large gaps
    - efforts to identify novel seed BACs within several of these gaps have remained unsuccessful
►Higher gene density in heterochromatin than expected
►Combined strategies using NGS technologies now enable de novo sequencing of large, complex genomes

1. Status sequencing chr
►Chr 6: 155 BACs sequenced (12.6 Mb non-redundant)
    - 66 seed, 89 extension BACs
    - 118 HTGS1, 37 HTGS3
    - 28 BAC contigs, 9 singletons
►Chr 12: 55 BACs sequenced (5.1 Mb non-redundant)
    - 34 seed, 31 extension BACs
    - 21 HTGS1, 11 HTGS2, 23 HTGS3
    - 14 BAC contigs, 20 singletons
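The counts on this slide can be cross-checked with a short tally. This is a minimal sketch: the dictionary layout and the `tally` helper are illustrative and not part of the presentation. For chr 6 both breakdowns add up to 155. For chr 12 the HTGS phases add up to 55, while seed plus extension BACs, as transcribed, add up to 65; the slide does not indicate which figure is off.

```python
# BAC counts as transcribed from the status slide.
status = {
    "chr6": {"total": 155, "seed": 66, "extension": 89,
             "htgs": {1: 118, 3: 37}},
    "chr12": {"total": 55, "seed": 34, "extension": 31,
              "htgs": {1: 21, 2: 11, 3: 23}},
}


def tally(entry):
    """Return (seed + extension, sum over HTGS phases) for one chromosome."""
    by_origin = entry["seed"] + entry["extension"]
    by_phase = sum(entry["htgs"].values())
    return by_origin, by_phase


for name, entry in status.items():
    by_origin, by_phase = tally(entry)
    print(name, "reported:", entry["total"],
          "seed+extension:", by_origin, "HTGS phases:", by_phase)
```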

2. Example: What is required to finish chr. 6?
    12.6 Mb    155 BACs
    20.4 Mb    250 BACs
    32.0 Mb    381 BACs
estimated no. of gaps:
    - 26 small (< 4 BACs)
    - 13 large (4-15 BACs)
estimated no. of BACs to sequence: ~160 BACs, ~ +100, ~ +230
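The arithmetic behind these figures can be sketched as follows. The reading that "+100" and "+230" are rounded differences between the 250 and 381 BAC scenarios and the 155 BACs already sequenced is an assumption, as are the helper names `kb_per_bac` and `extra_bacs`.

```python
# Figures transcribed from the slide: (non-redundant Mb, number of BACs).
current = (12.6, 155)
scenarios = [(20.4, 250), (32.0, 381)]


def kb_per_bac(mb, bacs):
    """Average non-redundant sequence contributed per BAC, in kb."""
    return mb * 1000 / bacs


def extra_bacs(target_bacs, sequenced_bacs=current[1]):
    """BACs still to be sequenced to reach a target BAC count."""
    return target_bacs - sequenced_bacs


print("current yield per BAC (kb):", round(kb_per_bac(*current)))
for mb, bacs in scenarios:
    print(mb, "Mb:", round(kb_per_bac(mb, bacs)), "kb per BAC,",
          extra_bacs(bacs), "additional BACs")
```

Under that reading, the two scenarios need 95 and 226 additional BACs, which round to the ~ +100 and ~ +230 on the slide, at roughly 81 to 84 kb of non-redundant sequence per BAC.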

3. Options to finish chr. 6 and 12
A. Continue classical sequencing by BAC walking
    - time-consuming, expensive, no seed BACs anchored in large gaps
B. Purify and shotgun sequence chr
    - combination of flow cytometry and chromosome amplification
    - ~ one year to develop technology
C. Sequence chr by shotgun sequencing whole genome
    - exploit next generation sequencing, sequence 99.9% of genome

4a. The initiative
Together with our partners we will produce:
►A whole genome physical map based on 10X Genome Analyzer (Solexa) generated AFLP sequence tags on BACs
►A 20X genome coverage in 454 reads using the upcoming Titanium upgrade
    - read length ~400 bp, ~500 Mb per run
    - combination of shotgun and paired-end runs (short and long-jump inserts, 3 and ~20 kb)
►A 30X genome coverage in SOLiD reads
    - reads ~30 bp, paired ends ~3 kb
►~3 million Sanger reads from Selected BAC Mixture (SBM data, Kazusa)
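To give a feel for what these coverage targets mean in practice, here is a minimal sketch using the standard relation coverage = total sequenced bases / genome size. The genome size of ~950 Mb is an assumption (the slides do not state it), and the function names are illustrative.

```python
import math

# Assumption: ~950 Mb tomato genome; this figure is not given on the slides.
GENOME_SIZE_BP = 950_000_000


def bases_needed(coverage, genome_size_bp=GENOME_SIZE_BP):
    """Total bases required to reach the given fold coverage."""
    return coverage * genome_size_bp


def runs_needed(coverage, yield_per_run_bp, genome_size_bp=GENOME_SIZE_BP):
    """Number of instrument runs, rounded up, for a given per-run yield."""
    return math.ceil(bases_needed(coverage, genome_size_bp) / yield_per_run_bp)


def reads_needed(coverage, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Number of reads, rounded up, for a given read length."""
    return math.ceil(bases_needed(coverage, genome_size_bp) / read_length_bp)


# 20X in 454 Titanium reads at ~500 Mb per run and ~400 bp per read.
print("454 runs:", runs_needed(20, 500_000_000))
print("454 reads:", reads_needed(20, 400))
# 30X in SOLiD reads of ~30 bp.
print("SOLiD reads:", reads_needed(30, 30))
```

Under the assumed genome size, 20X in 454 reads corresponds to 38 runs, and 30X in 30 bp SOLiD reads to about 950 million reads. A different genome size estimate scales both numbers linearly.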

4b. The initiative
We will assemble these data together with all currently available data:
    - 300,000 BAC ends (120,000 pairs)
    - 180,000 fosmid ends (90,000 pairs)
    - ~30% euchromatic sequence (66 Mb)
►Anchor the contigs to the new physical map using AFLP sequence tags

5. The challenge: assemble the genome
►Use 66 Mb of available sequence to benchmark procedure
    - approximately the strategy used for the Vitis genome
1. Match all vs. all reads, 100% identity
2. Cluster reads and divide into repeat and low copy (unique) clusters
3. Separately assemble low copy clusters
4. Merge assembled clusters, lowering stringency step-wise
5. Use BAC-end, fosmid-end and SOLiD/454 paired ends to scaffold and build supercontigs
6. Anchor clusters / supercontigs to novel physical map (KeyGene)
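Steps 1 and 2 can be illustrated with a toy sketch. It is not the pipeline used by the project or for the Vitis genome: exact shared k-mers stand in for all-vs-all matching at 100% identity, a union-find structure groups reads into clusters, and a simple cluster-size cutoff stands in for repeat detection. The function names, the value of `k`, and the cutoff are all hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_reads(reads, k):
    """Group reads that share at least one exact k-mer (transitively)."""
    parent = list(range(len(reads)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> index of the first read containing it
    for idx, read in enumerate(reads):
        for km in kmers(read, k):
            if km in first_seen:
                root_a, root_b = find(first_seen[km]), find(idx)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_seen[km] = idx

    clusters = defaultdict(list)
    for idx in range(len(reads)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


def split_clusters(clusters, max_low_copy_size):
    """Clusters larger than the cutoff are treated as repeat clusters."""
    low_copy = [c for c in clusters if len(c) <= max_low_copy_size]
    repeat = [c for c in clusters if len(c) > max_low_copy_size]
    return low_copy, repeat


reads = ["AAACCCGGG", "CCGGGTTTA", "GATTACAGA", "TACAGATTT", "ACAGATTAC"]
clusters = cluster_reads(reads, k=4)
low_copy, repeat = split_clusters(clusters, max_low_copy_size=2)
print("low copy:", low_copy)
print("repeat:", repeat)
```

In a real data set the cutoff would be derived from the expected read depth rather than fixed by hand, and the low copy clusters would then be passed to an assembler (step 3).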

6. Funding
►10X Solexa BAC-based physical map, KeyGene/BSP: Secured
    - Data production Q1 & Q
►15X SOLiD coverage, The Netherlands: Secured
    - Data production started October 2008
►10X 454 coverage, The Netherlands: Application (CBSG 2012)
    - Data production expected to start December 2008
►15X SOLiD coverage, Italy: Secured
    - Data production expected to start November 2008
►10X 454 coverage, Italy: Secured
    - Data production expected to start November 2008
►SBM data set (Kazusa): Data available
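The funded coverage can be summed per platform to compare it with the 30X SOLiD and 20X 454 targets named under "4a. The initiative". This is a minimal sketch; the tuple layout and the `coverage_by_platform` helper are illustrative.

```python
from collections import defaultdict

# (platform, country, coverage in X, funding status) as listed on the slide.
funding = [
    ("SOLiD", "The Netherlands", 15, "secured"),
    ("454", "The Netherlands", 10, "application"),
    ("SOLiD", "Italy", 15, "secured"),
    ("454", "Italy", 10, "secured"),
]


def coverage_by_platform(rows, only_secured=False):
    """Sum fold coverage per platform, optionally counting secured funding only."""
    totals = defaultdict(int)
    for platform, _country, coverage, status in rows:
        if only_secured and status != "secured":
            continue
        totals[platform] += coverage
    return dict(totals)


print("planned:", coverage_by_platform(funding))
print("secured:", coverage_by_platform(funding, only_secured=True))
```

The planned totals match the targets. With secured funding only, the 454 coverage stands at 10X until the Dutch application is granted.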

7. Data release
►The data will consist of an assembly of next gen data, with contigs anchored to the new physical map as far as possible
►All data will be released to the SOL Consortium for the purpose of finishing the Heinz 1706 genome
►Data release within the Consortium will follow the newly proposed international standards: “ENCODE Consortia data release policy” (draft 11/09/2008). In a nutshell:
    - Data will be released by the data producers as soon as possible after verification of the data
    - Users of the data are not allowed to publish the data without consent of the data producers for a moratorium period of 9 months
    - In case of consent, proper reference to the data producers should be made
    - After expiration of the moratorium period, data users may only publish the data when making proper reference to the data producers

8. Time line (estimate)
►Production of SOLiD and 454 data: October 2008 to April 2009
►Production of the physical map: January to July 2009
►Assembly of all data sets: May 2009 to August 2009
►Release of assembly to SOL Consortium: September 2009

9. Invitation
►Other SOL members are welcome to join the “seed consortium” for Next Gen Tomato Sequencing, provided that:
    - Significant novel expertise and/or data sets are brought in (sequence coverage, assembly resources, etc.)
    - Own funding is secured
    - The time line can be adhered to
    - The data release policy is subscribed to