“Blueprints for Blueberry”: The current status of assembly and annotation of the blueberry genome

Robert W. Reid¹, Ying-Chen Lin⁴, Raad Gharaibeh¹, Cory Brouwer¹, Jeannie Rowland², Dory Maine³, Rachel Walstead¹, Mary Ann Lila⁴, Allan Brown⁵

¹ Dept. of Bioinformatics and Genomics, University of North Carolina Charlotte, Charlotte, NC
Dept. of Horticulture, Washington State University, WA
USDA-ARS, MD
Plants for Human Health Institute, North Carolina State University, Kannapolis, NC
International Institute of Tropical Agriculture (IITA), CGIAR, Arusha, Tanzania

Abstract

Here we report on the current state of the diploid blueberry genome and describe some of the resources currently available. While many sequencing efforts are underway across the plant kingdom, the genus Vaccinium is relatively underrepresented. Blueberry is unique in that it has a long development period (juvenile period = 3 years) and is unable to self-propagate. Our goal is to generate a genome reference that acts as a resource for more efficient breeding. Marker-guided breeding benefits from having a genome resource to associate gene regulation with desirable plant traits. These efforts will ultimately lead to improvements in growing, berry collecting and processing, and to improved human nutrition.

Summary of Current Assembly

Table 1: Assembly improves with additional sequencing and BAC end sequences. To aid in scaffolding, Sanger BAC ends were sequenced and incorporated into the assembly using a modified version of SSPACE.

Repeat Analysis: Summary of repeat content in genome

Annotations

Augustus gene prediction produced 113,003 gene predictions. All predictions are available and viewable in a genome browser at vaccinium.org and within the IGB browser. Of these gene predictions, 79,096 protein predictions were greater than 100 residues in length. These proteins were spread across 6,707 scaffolds.
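The length cutoff described above (keeping protein predictions longer than 100 residues) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used for the poster: the function names `read_fasta` and `longer_than` and the toy records are hypothetical, and the input is assumed to be plain FASTA text.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted text lines."""
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def longer_than(records: Iterable[Tuple[str, str]], min_len: int = 100) -> Dict[str, str]:
    """Keep records whose sequence is strictly longer than min_len residues."""
    return {name: seq for name, seq in records if len(seq) > min_len}


# Hypothetical toy input; a real run would pass an open file handle instead.
demo = [">g1 example", "M" * 60, "K" * 60, ">g2", "M" * 100, ">g3", "M" * 101]
kept = longer_than(read_fasta(demo))
print(sorted(kept))  # ['g1', 'g3']
```

The comparison is strict (`>`), matching the wording "greater than 100 residues"; a record of exactly 100 residues is dropped.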
InterProScan annotated 38,217 proteins (48%), and of these 19,506 proteins were assigned GO terms across 1,483 different GO identifiers. Figure 3 summarizes the detected blueberry gene ontologies from Blast2GO for both biological processes and molecular functions. The top processes detected were DNA metabolic processes (GO: ), cellular protein modification process (GO: ) and RNA metabolic processes (GO: ). Repeat analysis was produced via RepeatModeler, MISA, RepeatScout and RepeatMasker and is summarized in the table above (right). Transcription factor analysis discovered 1,889 transcription factors using a reciprocal best-hit BLAST search against the Plant Transcription Factor Database v3.0.

Figure 1: Assembly numbers for the latest blueberry genome assembly. This includes all scaffolds and contigs with no cutoff. There are a total of 104,711 assembled sequences, with 13,860 contigs/scaffolds longer than 1,000 nucleotides (nt).

[Poster panels: Core genes that align to genome (77%; Figure 5) · Improved blueberry genetic linkage map (Figure 4)]

Materials & methods

Figure 2: Overview of the blueberry variety used for linkage mapping and genome assembly. Diploid W85-20 is a wild selection from New Jersey that was selected for its cold hardiness. Actively growing leaves were used as the source material for sequencing. For marker linkage analysis (paper in preparation & Rowland 2014, see Fig. 4), a screening population was produced using W as a grandparent. All tissue was provided by Jeannie Rowland (USDA-ARS).

Sequencing was completed using Illumina and Sanger technologies and 454 GS FLX pyrosequencing. For paired-end library construction, genomic DNA was sheared to 3, 8 and 20 kb using Hydroshear™. Illumina GA2 and HiSeq mate-pair sequences were produced, and one additional lane of Illumina HiSeq was sequenced using the Nextera mate-pair preparation kit with an average insert size of 7,000 base pairs.
Contigs were assembled via MaSuRCA and Newbler, followed by assembly merging using GARM.

Figure 5: Assessing genome completeness. We aligned 458 core genes from Arabidopsis (described in CEGMA) to our assembly using exonerate and found that 354 aligned (77%). Blasting these same core genes against the available blueberry transcriptome produced 450 hits (98%). Future plans are to incorporate BUSCO (busco.ezlab.org) for further assessments. The assembly size is 484 Mb, whereas flow cytometry has estimated the genome size to be 600 Mb.

Annotation pipeline for gene ontology

Figure 3: Annotation highlights. Left panel: pipeline summary of GO annotations generated from automated gene predictions.

Figure 4: Depiction of the latest diploid blueberry map. The current diploid map contains 318 markers, with 92 markers added (in red) since previously reported (Rowland, Molecular Breeding, December 2014, Volume 34, Issue 4, pp ). Of the new SSR markers added, more than 95% align to the assembly. There is an average of 26.5 markers per linkage group.

Future directions of development: Like so many other plant genomes, the blueberry genome requires longer read lengths to resolve repeat regions and anchor the numerous small contigs generated so far. The blueberry consortium has begun PacBio sequencing and will be employing Dovetail Genomics' Chicago sequencing strategy. We plan to add optical mapping to integrate linkage maps and scaffolds in the future.

Funds for the blueberry genome project have been provided by the North Carolina General Assembly, NC State University, USDA-ARS, University of Florida, UNC Charlotte and the P2EP (p2ep.org).

To learn more about our future efforts or about joining the blueberry consortium, please contact us.

Available online resources for blueberry

Figure 6: Online resources for blueberry.
(Upper left) Vaccinium.org hosts an online BLAST server. (Lower left) An annotated browser generated using GenSAS 2, which is currently in development, will soon be added to vaccinium.org. (Right) The Integrated Genome Browser (IGB) is freely available at bioviz.org and includes an annotated browser with automated gene predictions, repeat regions and berry transcriptome alignments at various stages of berry development. Improved linkage maps are also in development.

Poster P1131
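The headline percentages quoted on this poster can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the helper `percent` and the dictionary labels are illustrative names, and the linkage-group count is a derived figure (markers divided by the reported mean per group), not one stated on the poster.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# All inputs below are numbers quoted on the poster.
checks = {
    "core genes aligned to assembly": percent(354, 458),
    "core genes hit in transcriptome": percent(450, 458),
    "proteins annotated by InterProScan": percent(38_217, 79_096),
    "assembly size vs flow cytometry estimate": percent(484, 600),
}

# Derived, not stated on the poster: total markers / mean markers per group.
linkage_groups = 318 / 26.5

for label, value in checks.items():
    print(f"{label}: {value}%")
print(f"implied linkage groups: {linkage_groups:g}")
```

The first three values round to the 77%, 98% and 48% reported above. The last suggests the current assembly covers roughly four fifths of the flow-cytometry estimate.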