The New Zealand Institute for Plant & Food Research Limited
Potato Genome Sequencing Consortium, notes from the edge
Dr Susan Thomson, Dr Mark Fiers, Dr Jeanne Jacobs

Potato Genome Sequencing – why?
- Solanaceae: an important family (tomato, eggplant, petunia, tobacco, and capsicum)
- Potato is now the 3rd largest global food crop

Potato Genome Sequencing – the beginning
- The Potato Genome Sequencing Consortium (PGSC) is an initiative of Wageningen University & Research Center.
- PGSC brings together a global community to complete the project.
- Individual partners were assigned different chromosomes.

PGSC – member countries

PGSC – the beginning
1995 – Genetic map of potato; diploid mapping population SH (SH ) x RH (RH )

PGSC – the beginning
2001 – Ultra-high-density (UHD) genetic map generated: ~10,000 AFLP markers (genome ~840 Mb, ~12 markers/Mb)

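The marker density quoted above follows directly from the two figures on the slide. A minimal Python sketch of that arithmetic is given here; the variable names are illustrative, and the mean spacing assumes markers are spread evenly, which real AFLP markers are not, so it is only a rough guide.

```python
# Figures taken from the slide; both are approximate.
GENOME_SIZE_MB = 840   # estimated potato genome size, in Mb
N_MARKERS = 10_000     # AFLP markers on the UHD genetic map

# Average marker density, in markers per Mb.
markers_per_mb = N_MARKERS / GENOME_SIZE_MB

# Mean distance between neighbouring markers, in kb.
# Assumption: markers are evenly spread along the genome.
mean_spacing_kb = GENOME_SIZE_MB * 1000 / N_MARKERS

print(f"{markers_per_mb:.1f} markers/Mb")
print(f"{mean_spacing_kb:.0f} kb between markers on average")
```
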
PGSC – the beginning
2002 – BAC library, using RH ,000 BACs, average insert of 120 kb. 73,000 fingerprinted by AFLP.

PGSC – the beginning
2006 – Physical map: AFLP analysis of BACs used to build up contigs of overlapping BACs. Selective AFLPs used to anchor certain BACs (and contigs) to the physical map.

PGSC – the beginning
2005/6 – Initiate genome sequencing: BAC-by-BAC Sanger sequencing, starting with anchored seed BACs. 6x coverage, BACs/chromosome.

PGSC – the beginning
- 1995: Genetic map
- 2001: Ultra high-density genetic map
- 2002: BAC library
- 2005/6: Sequencing start
- 2006: Physical map
- Dec 2009: end date for full annotated potato genome sequence

PGSC – the beginning
- Sequencing start: 2005/6
- Annotation and sequence: Dec 2009
- Early 2008 – BAC sequencing status: chromosome 7 not started; very few BACs done for the others.

PGSC – the worries
- Sanger BAC-by-BAC sequencing is slow.
- Despite the UHD map of 10,000 markers, large gaps remain in the physical map, reducing the number of seed BACs.
- Made more problematic by 'stops' caused by repeat elements and a lack of overlapping BACs.

PGSC – the solutions
- Bigger and better machines! Next Generation Sequencing (NGS) technologies are making Whole Genome Shotgun (WGS) sequencing more financially feasible (data/$).
- RH is highly heterozygous, leading to assembly issues.
- Continue RH sequencing using mainly NGS methods.

PGSC – the solutions
- Introducing a new line, DM: DM R44, a doubled monohaploid, homozygous line. (Ref: Lightbourn GJ, Jelesko JG, Veilleux RE. Genome 50 (5):492–501.)
- DM flowers well.
- Can be used as the female parent in crosses with most diploid potato germplasm.

PGSC – mapping
- No genetic knowledge for DM R44.
- Diploid mapping population: DM x DI (China Runtush), then F1 x DI.
- Mapping population: 2 x 96-well plates with DNA of the mapping population, along with the parents. Generated by the International Potato Center (CIP), Peru.

PGSC – mapping
Preliminary scaffold assembly of DM derived from Illumina data (generated by Beijing Genome Institute, BGI). As at 20 August 2009:

Statistic                  Value
No. of sequences           57681
max scaffold length
min scaffold length        100
total assembly length
average scaffold length    12180
median scaffold length     179
n50*

* n50: sort the scaffolds largest first and lay them end to end along the length of the genome; n50 is the size of the scaffold reached at 50% of the genome.

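The n50 definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the consortium's pipeline: the function name `n50` and the example lengths are made up, and it assumes the 50% point is measured against the total assembly length (the usual convention) rather than an independent genome-size estimate.

```python
def n50(scaffold_lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted largest first and summed; N50 is the length of
    the scaffold at which the running total first reaches half of the
    total assembly length (assumed convention).
    """
    lengths = sorted(scaffold_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, total length 1500:
# 500 + 400 = 900 is the first running total to reach 750.
example = [100, 200, 300, 400, 500]
print(n50(example))  # 400
```

A few very long scaffolds raise n50 sharply even when most scaffolds are tiny, which is why it is reported alongside the average and median lengths.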
PGSC – mapping
550 newly generated SSR markers*:

SSRs  Institute                                                       Country
100   Plant and Food Research                                         New Zealand
100   Universidad Nacional Agraria La Molina                          Peru
100   International Potato Centre                                     Peru
100   Scottish Crop Research Institute                                Scotland
50    Instituto Nacional de Tecnologia Agropecuaria                   Argentina
50    The Irish Agriculture and Food Development Authority, Teagasc   Eire
50    Institute of Bioengineering                                     Russian Federation

* SSRs generated by BGI, China

Preliminary results: 14/44 were monomorphic, 15/44 tested show polymorphism in DI, 15/44 show polymorphism between DM/DI.

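As a quick consistency check on the figures above, the per-institute allocations can be totalled and the preliminary screening results expressed as fractions. This is a small Python sketch; the dictionary keys are shorthand labels, and it assumes the three reported outcomes are mutually exclusive classes of the 44 markers tested.

```python
# SSR allocations per institute, copied from the slide.
ssr_allocation = {
    "Plant and Food Research (New Zealand)": 100,
    "Universidad Nacional Agraria La Molina (Peru)": 100,
    "International Potato Centre (Peru)": 100,
    "Scottish Crop Research Institute (Scotland)": 100,
    "Instituto Nacional de Tecnologia Agropecuaria (Argentina)": 50,
    "Teagasc (Eire)": 50,
    "Institute of Bioengineering (Russian Federation)": 50,
}
total_ssrs = sum(ssr_allocation.values())
print(f"Total SSRs allocated: {total_ssrs}")

# Preliminary screening outcomes (assumed to be non-overlapping classes).
screening = {
    "monomorphic": 14,
    "polymorphic in DI": 15,
    "polymorphic between DM/DI": 15,
}
tested = sum(screening.values())
for outcome, count in screening.items():
    print(f"{outcome}: {count}/{tested} ({count / tested:.0%})")
```
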
PGSC – mapping
- Sequence Tagged Markers (STM), known to map to regions spanning all 12 chromosomes.
- ~60 Ste markers, currently being mapped in an SHxRH population. Generated by large-scale in silico design of SSRs from ESTs in public databases. (Ref: Tang J, Baldwin SJ, Jacobs JM, Linden CG, Voorrips RE, Leunissen JA, van Eck H, Vosman B. BMC Bioinformatics Sep 15;9:374)

PGSC – mapping
Mapping data will be combined with results from:
- SNP data – EST data aligned to the DM scaffold (Robin Buell, courtesy of the SolCAP USDA project). Design ~2000 markers for use with BeadXpress (Illumina) (Glenn Bryan, Scottish Crop Research Institute). Aiming for > 1000 mapped.
- DArT data* – Two discovery arrays with over 30,000 probes to begin. Discovered 3000 candidate markers. It is hoped that 1000 to 1500 unique DM markers will segregate in the mapping population. Sequencing of 7000 DArT markers will also be carried out.

* Diversity Arrays

PGSC – assembly
Plans for an in silico* pipeline to improve the scaffold, bringing together data from:
- SOL Genomics Network
- Tomato genome
- Markers: SSR, SNP and DArT
- RH UHD/physical map information

* Dan Bolser, University of Dundee, Scotland

PGSC – the present & future
In progress, by line (Sanger sequencing, Illumina runs, Roche/454 runs):

RH
- WGS + Long Jump libraries: 10 X coverage
- WGS: 60 X coverage
- BAC library: 150,000 BAC end sequences + 2,000 BAC clones
- Random sheared BAC library (~100kb): 120,000 BAC end sequences

DM
- WGS + Long jump libraries: 10 X coverage
- WGS + 500bp to 10kb libraries: 65 X coverage
- Fosmid library (~35kb): 100,000 end sequences
- BAC library: 200,000 BAC end sequences

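To give a feel for what these coverage figures mean in raw data, fold coverage can be converted to total sequence yield. This is a rough Python sketch under two assumptions: the genome size is the ~840 Mb estimate quoted earlier in the talk, and coverage is simply total sequenced bases divided by genome size. The function name `bases_needed_gb` is illustrative.

```python
GENOME_SIZE_MB = 840  # approximate genome size quoted earlier in the talk


def bases_needed_gb(coverage, genome_size_mb=GENOME_SIZE_MB):
    """Total sequence (in Gb) needed to reach the given fold coverage.

    Assumes coverage = total sequenced bases / genome size.
    """
    return coverage * genome_size_mb / 1000


# Fold coverages listed on the slide.
for fold in (10, 60, 65):
    print(f"{fold} X coverage is roughly {bases_needed_gb(fold):.1f} Gb of sequence")
```
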
Add data from transcriptome sequencing into the assembly pipeline: 16 runs, a combination of different tissues and conditions, for DM and also RH.

Acknowledgements
Plant & Food Research is part of the international Potato Genome Sequencing Consortium (PGSC). For more information, visit
Website going live as of 1st September.
PFR – Lincoln: Jeanne Jacobs, Mark Fiers, Samantha Baldwin