Analysis of the bread wheat genome using whole-genome shotgun sequencing
Manuel Spannagl, MIPS, Helmholtz Center Munich



Wheat - why bother?
① Many varieties, incl. bread wheat, durum („pasta“) wheat…
② Third most-produced cereal with 651 million tons (2010), cultivated worldwide in different climates
③ Leading source of vegetable protein in human food

The Challenge

Wheat – a WGS approach: Aims and Goals

Wheat – a WGS approach
① 5x 454 WGS sequencing => 85 Gb of sequence, 220 million reads
② ~79% of reads are repeat-related
③ Direct low-copy-number genome assembly (LCG, Newbler) => collapses many homologous gene sequences
④ To prevent collapsing of homologous gene sequences and to reduce complexity => orthologous group assembly at high stringency
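The sequencing figures on this slide imply a few derived quantities. The following is a minimal sketch of that arithmetic, assuming the quoted 5x coverage, 85 Gb and 220 million reads are exact; the variable names are illustrative and do not come from the presentation.

```python
# Back-of-the-envelope arithmetic from the slide's figures.
# All inputs are taken from the slide; treating them as exact is an assumption.
total_bases = 85e9        # 85 Gb of 454 sequence
read_count = 220e6        # 220 million reads
coverage = 5              # 5x whole-genome shotgun coverage
repeat_fraction = 0.79    # ~79% of reads are repeat-related

# coverage = total_bases / genome_size, so genome_size = total_bases / coverage
implied_genome_size_gb = total_bases / coverage / 1e9

# Mean read length implied by the total yield and the read count
mean_read_length = total_bases / read_count

# Reads left over once repeat-related reads are set aside
non_repeat_reads = read_count * (1 - repeat_fraction)

print(f"implied genome size: {implied_genome_size_gb:.0f} Gb")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"reads not repeat-related: {non_repeat_reads / 1e6:.0f} million")
```

Under these assumptions only about a fifth of the reads fall outside repeats, which is why the approach concentrates on the gene space rather than on a full genome assembly.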

WGS assembly using „in silico exon capture“
① Use fully sequenced and analysed reference genomes (rice, Brachypodium, sorghum)
② Group genes into families (Orthologous Groups)
③ Use the orthologous group representatives as sequence baits to capture the corresponding sequence reads
④ Do a sub-assembly for each „orthologous bin“ separately
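The bait-and-bin idea in steps ③ and ④ can be sketched in a few lines. This is a minimal illustration, not the presenter's pipeline: exact k-mer sharing stands in for whatever similarity search was actually used, baits are assumed to be nucleotide sequences, and the function names (`build_bait_index`, `bin_reads`), parameters and toy sequences are all hypothetical.

```python
from collections import defaultdict

def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_bait_index(baits, k):
    """Map each k-mer to the orthologous groups whose bait contains it."""
    index = defaultdict(set)
    for group_id, bait_seq in baits.items():
        for kmer in kmers(bait_seq, k):
            index[kmer].add(group_id)
    return index

def bin_reads(reads, index, k, min_shared):
    """Assign each read to the group sharing the most k-mers with it.

    Reads sharing fewer than min_shared k-mers with every bait stay unbinned.
    """
    bins = defaultdict(list)
    for read in reads:
        hits = defaultdict(int)
        for kmer in kmers(read, k):
            for group_id in index.get(kmer, ()):
                hits[group_id] += 1
        if hits:
            best_group, shared = max(hits.items(), key=lambda item: item[1])
            if shared >= min_shared:
                bins[best_group].append(read)
    return dict(bins)

# Toy data, invented for illustration only
baits = {
    "OG1": "ATGGCGTACGTTAGCCTAGGATCC",
    "OG2": "ATGTTTCCCGGGAAATTTCCCGGG",
}
reads = [
    "GCGTACGTTAGCC",   # overlaps the OG1 bait
    "CCCGGGAAATTTC",   # overlaps the OG2 bait
    "TTTTTTTTTTTTT",   # matches neither bait
]
index = build_bait_index(baits, k=8)
bins = bin_reads(reads, index, k=8, min_shared=3)
print(bins)
```

Each resulting bin would then be passed to an assembler on its own, which keeps reads from unrelated gene families out of the same assembly.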

Bread Wheat Genealogy

Ortholome-directed assembly circumvents limitations faced by WGS assembly

The ortholome-directed assembly delivers ordered segments

The ortholome-directed assembly delivers ordered segments II

Coverage of Orthologous Group

Gene Copy Retention after Polyploidization - Calibration of the method
[Figure: values 97%, 99%, 100%; labels Maize, Hexaploid Rice, „TRice“]

Gene Copy Retention after Polyploidization

Gene fragments are abundant in wheat

Gene fragments are abundant in the wheat genome

Expanded Wheat Gene Families

The Three Nephews: the A, B and D's of wheat
① Shotgun data: Illumina 80x (T. monococcum) and 454 3x (Ae. tauschii)
② cDNA sequences from the Ae. speltoides group (B)
Can A and D genome shotgun data be used to dissect the ABD of wheat?

The Three Nephews: Similarity on a Sequence Basis

Wheat A, B and D Assignment using Machine Learning (SVM)

Particular Gene Categories are preferentially retained

Summary
① Almost full gene complement detected and structured
② Tens of thousands of pseudogenes detected
③ Separation of A, B and D using machine learning with > 75% accuracy
④ Complementary to chromosome sorting approaches
⑤ Applicable to polyploids in general to get a genome overview
⑥ Rapid and economic approach to pragmatically cope with limitations in sequencing technology
[Image: Franz Marc, „Hocken im Schnee“]

„In Silico Exon Capture“ Statistics

The composition of A, B and D are similar

Acknowledgements
MIPS: Matthias Pfeifer, Klaus Mayer, all other group members
The UK Wheat Consortium: Mike Bevan, Neil Hall, Anthony Hall, Keith Edwards, Rachel Brenchley
CSHL: Dick McCombie
UC Davis & USDA Albany: Jan Dvorak, Mincheng Luo, Olin Anderson
Kansas State University: Bikram Gill, Sunish Segal
EBI: Paul Kersey, Dan Bolser