Cassava Genome from Ancestor to Cultivar


GCP21-II S3: Cassava Genome from Ancestor to Cultivar
Wenquan Wang, Ph.D.
Chinese Cassava Genomics Consortium
Institute of Tropical Biosciences & Biotechnology, CATAS
Uganda, June 19, 2012

Biological characteristics of cassava
  High photosynthesis
  High starch accumulation
  Extreme tolerance to drought and barren soil
  Heterozygosity and somatic propagation

Genetic bottlenecks for developing the cassava industry
  Little is known about genetic diversity in evolution
  Lack of knowledge of the mechanisms of high photosynthesis and starch metabolism
  Functions underlying drought and barren-soil tolerance remain to be uncovered
  Limited understanding of the cassava plant's adaptation to different diseases and pests
  Lack of genotyping tools for cassava breeding

Genotypes used for whole genome sequencing
  W14 (Manihot esculenta ssp. flabellifolia): semi-wild species
  KU50 (Manihot esculenta Crantz): cultivar (starchy)
  S1.600 (Manihot esculenta Crantz): cultivar (sugary)

Characteristics of the three genotypes for sequencing
  Trait              W14            KU50               S1.600
  Regeneration       Seeds mainly   Stems
  Tuber root         small          large              very large
  Photosynthesis     middle         high
  Fresh root yield   low            10 folds           5-10 folds
  Starch content     4-5%           30%, 5-6 folds     5% added 12-15% sugar, 2-3 folds

Net photosynthesis rate difference of W14 and KU50 in developing stages

Genome assembly of W14 and KU50
("all" = all contigs/scaffolds; ">10 kb" = contigs/scaffolds longer than 10 kb)
  Metric                        W14 all    W14 >10 kb   KU50 all   KU50 >10 kb
  Fold genome coverage (Gb)     97.88                   45.31
  Number of contigs/scaffolds   54,426     15,234       62,763     7,441
  Total span                    475 Mb     302 Mb       416 Mb     167 Mb
  N50                           14 kb      21 kb        12 kb      23 kb
  Largest contig/scaffold       183 kb                  123 kb
  Average scaffold length       9 kb       20 kb        7 kb       22 kb
  GC (%)                        34.63%     34.47%       36.02%     36.07%
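
The N50 values in the assembly statistics follow the usual definition: the length L such that contigs or scaffolds of length at least L cover half of the total span. A minimal sketch of that calculation is given here; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the W14 or KU50 assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly span.
    """
    ordered = sorted(lengths, reverse=True)
    half_span = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_span:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths in kb (not real assembly data).
toy_lengths_kb = [2, 3, 4, 5, 6, 10]
toy_n50 = n50(toy_lengths_kb)
print(toy_n50)
```

With real data, the input would be the full list of contig or scaffold lengths from the assembly FASTA, optionally filtered to those longer than 10 kb to reproduce the second column of each genome.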

Repeat content and divergence rate in W14 and KU50
  AM560 40.2%, W14 36.8%, KU50 25.7%
  12%, 22%, 17%

LTR in situ hybridization in all the chromosomes

Genome coverage in gene regions
  Transcript coverage: KU50 80.5%, W14 97.1%
  EST coverage: KU50 73.4%, W14 91.5%

Evaluation of the W14 assembly
  Mismatch rate: 4.9/10,000
  Mismatch and gap rate: 3.9/1,000

Gene prediction in genomes of W14 and KU50
(rows with a single value on the slide are shown with that value only)
  Metric                         W14        KU50
  Gene number                    43,986     31,480
  Gene length                    70 Mb      39 Mb
  Coding region length           46 Mb      28 Mb
  Gene density (%)               9.94%      9.95%
  Mean intergenic length         4 kb
  Maximum intergenic length      52 kb      44 kb
  Exon number                    181,158    124,694
  Exons per gene                 4.13       3.97
  Exon length
  Mean exon length               252.73     225.44
  Maximum exon length            9 kb       6 kb
  GC (%) of exons                44.09%     42.65%
  Intron number                  137,266    93,287
  Introns per gene               3.13       2.97
  Intron length                  31 Mb
  Mean intron length             336.12     327.71
  Maximum intron length          11 kb      14 kb
  GC (%) of introns              32.81%     33.40%
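
The gene prediction figures can be cross-checked by multiplying each feature count by its mean length. The sketch below does this under two assumptions that the slide does not state: that mean exon and intron lengths are in bp, and that total length is simply count times mean. The names `features` and `total_mb` are illustrative.

```python
# Counts and mean lengths as reported on the slide
# (mean lengths assumed to be in bp).
features = {
    ("W14", "exon"): (181_158, 252.73),
    ("KU50", "exon"): (124_694, 225.44),
    ("W14", "intron"): (137_266, 336.12),
    ("KU50", "intron"): (93_287, 327.71),
}


def total_mb(count, mean_bp):
    """Total feature length in Mb, assuming total = count * mean length."""
    return count * mean_bp / 1e6


totals_mb = {key: total_mb(*value) for key, value in features.items()}
for (genome, feature), mb in totals_mb.items():
    print(f"{genome} {feature}: {mb:.1f} Mb")
```

Under these assumptions the exon totals come out near 45.8 Mb (W14) and 28.1 Mb (KU50), close to the reported coding region lengths of 46 Mb and 28 Mb, and the KU50 intron total of about 30.6 Mb is close to the single 31 Mb intron length given on the slide.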

Annotation of predicted genes
  Database           W14 number   W14 %     KU50 number   KU50 %
  Predicted genes    43,892                 31,407
  Swissprot          28,808       65.63%    19,240        61.26%
  TrEMBL             38,784       88.36%    26,723        85.09%
  InterPro/GO        39,918       90.95%    24,344        77.51%
  KEGG               35,451       80.77%    24,247        77.20%
  COG                18,205       41.48%    12,162        38.72%
  NR/NT              38,802       88.40%    26,739        85.14%
  Total annotated    41,934       95.54%    27,549        87.72%
  Un-annotated       1,958        4.46%     3,858         12.28%
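
The percentages in the annotation table are each database's hit count divided by the number of predicted genes for that genome. A short sketch recomputing a few rows from the slide's counts follows; the dictionary layout and the helper name `percent` are illustrative choices.

```python
# Counts taken from the annotation table on the slide.
predicted = {"W14": 43_892, "KU50": 31_407}
hits = {
    "Swissprot": {"W14": 28_808, "KU50": 19_240},
    "Total annotated": {"W14": 41_934, "KU50": 27_549},
    "Un-annotated": {"W14": 1_958, "KU50": 3_858},
}


def percent(count, total):
    """Share of predicted genes, as a percentage."""
    return 100 * count / total


shares = {
    (row, genome): percent(count, predicted[genome])
    for row, per_genome in hits.items()
    for genome, count in per_genome.items()
}
for (row, genome), value in shares.items():
    print(f"{row:16s} {genome:5s} {value:6.2f}%")
```

The recomputed shares agree with the table to two decimal places, which also confirms that the denominators are 43,892 and 31,407 rather than the gene numbers given on the gene prediction slide.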

BAC library and physical map constructed in W14
  Description                                                Index
  BAC library coverage                                       93,000 clones, >10x
  BAC library insert size                                    130 kb
  Number of BAC clones fingerprinted                         30,000
  Number of high-quality fingerprints used for assembly      ?
  Number of contigs                                          2,484
  Number of singletons                                       984
  Total length of the contigs                                675.93 Mb
  N50 contig length                                          336.38 kb
  Longest contig                                             1,981.98 kb
  Average number of clones per contig                        2.16

Genome diversity decreases during evolution
Heterozygosity of the W14, KU50 and AM560 genomes (densities given as 1 SNP per n bp)
  Sample       # SNPs      SNP density   # gene SNPs   Gene SNP density   # exon SNPs   Exon SNP density
  W14          1,377,370   1/257         295,358       1/270              220,600       1/272
  KU50         806,271     1/286         109,701       1/336              43,610        1/422
  AM560 (S3)   506,746     1/693         73,628        1/6170             46,524        1/5583

SNP divergence in the genomes of wild ancestor W14 and cultivar KU50 (densities in SNPs/bp)
  Sample   # SNPs      SNP density   # gene SNPs   Gene density   # exon SNPs   Exon density   # intergenic SNPs   Intergenic density
  W14      4,812,287   6.94/1000     1,574,460     1/294          563,588       1/676          3,237,827           1/160
  KU50     3,620,860   4.57/         516,278       1/894          187,122       1/1947         3,104,582           1/229
  S1.600   2,977,198   4.10/         517,321       1/893          186,413       1/1935         2,459,877           1/255
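
Two quick checks help when reading the SNP divergence counts: gene and intergenic SNPs should add up to the total, and a per-1000-bp density can be converted to the "1 SNP per n bp" form used in other columns. The sketch below assumes that "6.94/1000" means 6.94 SNPs per 1,000 bp; the function names are illustrative.

```python
# SNP counts as reported on the slide.
snp_counts = {
    "W14": {"total": 4_812_287, "gene": 1_574_460, "intergenic": 3_237_827},
    "KU50": {"total": 3_620_860, "gene": 516_278, "intergenic": 3_104_582},
    "S1.600": {"total": 2_977_198, "gene": 517_321, "intergenic": 2_459_877},
}


def partition_gap(counts):
    """Total minus (gene + intergenic); zero if the two classes partition the total."""
    return counts["total"] - (counts["gene"] + counts["intergenic"])


def bp_per_snp(snps_per_kb):
    """Convert a density in SNPs per 1,000 bp to average bp per SNP."""
    return 1000 / snps_per_kb


gaps = {sample: partition_gap(counts) for sample, counts in snp_counts.items()}
w14_spacing = bp_per_snp(6.94)  # assumed reading of "6.94/1000"
print(gaps)
print(round(w14_spacing, 1))
```

All three gaps are zero, so gene and intergenic SNPs partition the totals exactly, and the W14 genome-wide density corresponds to roughly one SNP every 144 bp.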

Shared SNPs and their distribution
  Samples              # SNPs      # unique    # in gene   # in exon   # intergenic   # in repeat regions
  W14                  4,812,287   4,065,298   1,574,460   563,588     3,237,827      1,751,276
  KU50                 3,620,860   1,976,538   516,278     187,122     3,104,582      2,142,290
  S1.600               2,977,198   1,375,917   517,321     186,413     2,459,877      1,737,544
  W14-KU50             570,695     219,335     200,908     75,356      369,787        184,454
  W14-S1.600           527,654     176,294     205,509     76,873      322,145        162,873
  KU50-S1.600          1,424,987   1,073,627   281,464     101,783     1,143,523      770,687
  W14-KU50-S1.600      351,360                 143,721     53,735      207,639        98,970
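
The "unique" column of the shared-SNP table is consistent with inclusion-exclusion over the three samples. The sketch below assumes that each pairwise row counts every SNP shared by the two samples, including those shared by all three; the function names `unique_to` and `pair_only` are illustrative.

```python
# Totals, pairwise shared counts and three-way shared count from the slide.
totals = {"W14": 4_812_287, "KU50": 3_620_860, "S1.600": 2_977_198}
pairs = {
    frozenset(("W14", "KU50")): 570_695,
    frozenset(("W14", "S1.600")): 527_654,
    frozenset(("KU50", "S1.600")): 1_424_987,
}
shared_by_all = 351_360


def unique_to(sample):
    """SNPs found only in `sample`, by inclusion-exclusion."""
    pair_sum = sum(n for pair, n in pairs.items() if sample in pair)
    return totals[sample] - pair_sum + shared_by_all


def pair_only(a, b):
    """SNPs shared by exactly the two samples `a` and `b`."""
    return pairs[frozenset((a, b))] - shared_by_all


unique = {sample: unique_to(sample) for sample in totals}
print(unique)
print(pair_only("W14", "KU50"), pair_only("W14", "S1.600"), pair_only("KU50", "S1.600"))
```

Under this reading the computed values reproduce the slide's unique column for all six single-sample and pairwise rows, for example 4,065,298 SNPs unique to W14 and 1,073,627 shared only by the two cultivars.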

Indel divergence in the genomes of wild ancestor W14 and cultivar KU50
  Sample   # indels   Indel density   # insertions   # deletions   Average length
  W14      390,652    0.80/1000       159,467        231,080       3.59
  KU50     275,639    0.79/1000       132,396        143,200       3.65
  S1.600   217,226    0.64/1000       103,964        113,207       4.07

SNP/Indels among four cassava genomes

Transcriptome for photosynthesis and starch metabolism in cultivar Arg7 and wild ancestor W14
Transcriptome sequenced samples:
  C1  Arg7  Early root
  C2  Arg7  Middle root
  C3  Arg7  Later root
  C4  W14   Middle root
  C5  Arg7  Developing stem
  C6  Arg7  Functional leaf
  C7  W14   Functional leaf
  C8  W14   Developing stem

Expression profiling of genes for starch and photosynthesis pathways

Comparative expression fold changes of photosynthesis genes: Arg7/W14

Cell wall metabolism, Arg7/W14 (red: high expression in root of W14)

Sucrose glycolysis in the root of Arg7 is weaker than in W14

Comparative expression fold changes of starch metabolism genes in leaf and storage root: Arg7/W14

Expression fold changes of starch accumulation genes in the tuber root of KU50 relative to W14

Phylogenetic tree of SuSy and INV

An efficient starch biosynthesis model in tuber root of cassava

miRNAs and drought tolerance in cassava
  A set of 148 miRNAs in cassava was predicted by sequencing 14 small RNA samples and referencing them to the genome; 41 are novel and 107 are conserved.
  miRNAs and targets related to drought and to the development of leaf and tuber root were found.

Eleven drought- and cold-inducible miRNAs with interesting targets were revealed and confirmed by qPCR
  miRNA        Targets
  miR1045125   Protein binding, zinc ion binding; transcription factor, DNA binding
  miR1230481   Zinc ion binding, protein serine/threonine kinase, ATP binding
  miR3747522   Enzyme inhibitor activity, pectinesterase activity; DNA-binding protein-related; functions in transcription factor activity
  miR5178028   Encodes a H3/H4 histone acetyltransferase; encodes eukaryotic translation initiation factor
  miR3615546   Oxidative phosphorylation uncoupler activity, binding; oxidoreductase activity, iron ion binding
  miR1229496   Protein kinase family (kinase activity), small molecular G-protein
  miR4806982   Kinase activity; nuclear protein required for early embryogenesis
  miR5815094   Auxin-induced gene (IAA1) encoding a short-lived nuclear-localized transcriptional regulator protein; acetylglucosaminyltransferase; transferase activity

ABA biosynthesis pathway in tuber root of cassava

Expression of genes in carotene and ABA synthesis pathways

Comparative genomics among cassava, Jatropha and castor bean
  Unique gene families: cassava 2,043; Jatropha 532; castor bean 826
  Shared gene families: 12,041

Coherence in biological processes among cassava, Jatropha and castor bean of Euphorbiaceae
  Gene families in virion part and reproduction were found only in cassava

Database: Cassava-genome.cn

Ongoing work
  Fine mapping of the genome: assembly integrated with the physical map, BAC-end sequences and BAC-pooling sequences
  Chromosome localization of assembled BACs and scaffolds based on in situ hybridization
  Functional verification of genes for important pathways
  Development of SNP markers and molecular design breeding

Summary
  Genome drafts of an ancestor and a cultivar of cassava were assembled and annotated.
  Genome diversity decreased from wild ancestor to cultivar during domestication.
  Millions of SNPs were discovered and are recommended for genotyping in cassava.
  An efficient starch biosynthesis pathway in the tuber root of cassava was proposed.

Acknowledgement
  CATAS: BX Feng, Z Xia, XC Zhou, KM Li, PH Li, M Peng, WQ Wang
  BIG-CAS: JF Xiao, JX Liu, SN Hu
  SCBG-CAS: Gong Xiao, Chi Song, Ying Wang
  EMBRAPA, Brazil: Luiz C
  UC Davis: Mingcheng Luo
  XJIEG-CAS: Bin Liu, Binxiao Feng
  SIS-CAS: Jun Yang, Peng Zhang
  Fudan U: Zhicheng Wu, Ruiqi Liao, Shuigen Zhou
  Copenhagen U, Denmark: Rubini, Birger Muller
  Nanjing Agricultural U: Qunfeng Lu