CARS at ICOPA XII, August 2010 Next-gen. Haemonchus contortus genomics.

The life cycle of H. contortus
Refs.: Gasser, R.H., unpub. review; Nikolaou and Gasser (2006), Int. J. Parasitol. 36, ; Prichard and Geary (2008), Nature 452,

H. contortus' evolutionary context
[Figure: phylogenetic tree. Clade labels: Elegans, Drosophilae, Caenorhabditis. Taxa shown: C. elegans, C. briggsae, C. japonica, C. remanei, C. brenneri, C. sp. 3 PS1010, C. drosophilae, C. sp. 1 SB341, Brugia malayi, Pristionchus pacificus, Meloidogyne hapla, Meloidogyne incognita, Heterorhabditis bacteriophora, Haemonchus contortus.]

Protein and DNA functions are conserved
Refs.: Rufener et al. (2009), PLoS Pathog. 5, e ; Hu et al. (2010), Biotechnol. Adv. 28,
[Figure labels: * = Monepantel R mutation; Ce-ant-1.1; Hc-ant-1.1]

Next-generation sequencing in a nutshell
Ref.: Miller et al. (2010), Genomics 95,

Assembly strategy used for C. sp. 3 PS1010
1. Assemble DNA reads (Velvet)
2. Map RNA reads onto assembly (ERANGE: Bowtie + BLAT)
3. Scaffold contigs with RNA (RNAPATH)
4. Final assembly
Flowchart annotations: Repeat once; Filter out E. coli contigs

RNA-mediated scaffolding of Velvet genomic supercontigs (RNAPATH)
[Figure labels: Velvet genomic supercontigs; RNA-seq exons; Within-supercontig RNA-seq reads; Cross-supercontig RNA-seq reads; RNA scaffolding; Velvet+RNAPATH supercontigs]
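The slide does not describe how RNAPATH works internally, so the following is only a minimal sketch of the general idea of using cross-supercontig RNA-seq reads as links. The function name `group_contigs`, the `min_support` threshold, and the input format (one pair of contig names per bridging read) are all assumptions made for illustration. The sketch only groups linked contigs; it does not order or orient them, which a real scaffolder must do.

```python
from collections import Counter

def group_contigs(links, min_support=2):
    """Group contigs joined by enough cross-contig RNA-seq reads.

    links: iterable of (contig_a, contig_b) pairs, one per bridging read
           (hypothetical input format).
    Returns a sorted list of sorted contig-name lists, one per group.
    """
    # Count support per unordered contig pair, ignoring within-contig reads.
    support = Counter(tuple(sorted(p)) for p in links if p[0] != p[1])

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge contigs whose link is supported by at least min_support reads.
    for (a, b), n in support.items():
        if n >= min_support:
            parent[find(a)] = find(b)

    groups = {}
    for contig in list(parent):
        groups.setdefault(find(contig), set()).add(contig)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Requiring more than one supporting read per link is one simple way to avoid joining contigs on the basis of a single mismapped read.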

PS1010 assembly statistics
Columns: Assembly; Total (Mb); Supercontigs; Max. sc. size (kb); N50 (kb); Pred. genes
Genomic, Velvet (k = 47): K ,741
Genomic, Velvet + RNAPATH x: K ,851
cDNA, Velvet: K n/a
Genome size: 100 Mb [?]. Est. genomic coverage: ~170x. An additional 4.6 Mb of Velvet supercontigs matched E. coli.
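For readers unfamiliar with the N50 column, here is a small sketch of the standard definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. The function name `n50` is an illustrative choice, not something from the talk.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")
```

A higher N50 for the same total assembly size means the bases sit in fewer, longer supercontigs, which is what scaffolding is meant to achieve.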

H. contortus (McMaster) assembly statistics
Columns: Assembly; Total (Mb); Supercontigs; Max. sc. size (kb); N50 (kb); Pred. genes/peptides
Genomic, Velvet (k = 37): K n/a
Genomic, Velvet + RNAPATH [L1 cDNA]: K n/a
Genomic, Velvet + RNAPATH [L2 cDNA]: K n/a
cDNA, Oases: K ,671
Genome size: ~290 to 340 Mb. Est. genomic coverage: ~35x.
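The coverage estimate is simple arithmetic: total sequenced bases divided by genome size. As a rough sketch, the helper below (names are illustrative assumptions) computes how many additional bases would be needed to move from the ~35x quoted here to the 50x target mentioned later in the talk, for both ends of the estimated genome size range.

```python
def coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases per genome base."""
    return total_bases / genome_size

def extra_bases_needed(current_cov, target_cov, genome_size):
    """Additional bases required to raise coverage to target_cov."""
    return max(0, (target_cov - current_cov) * genome_size)

# Genome size range from the slide: ~290 to 340 Mb.
for size in (290_000_000, 340_000_000):
    extra = extra_bases_needed(35, 50, size)
    print(f"{size / 1e6:.0f} Mb genome: {extra / 1e9:.2f} Gb more sequence")
```

Under these figures the gap from 35x to 50x is roughly 4.35 to 5.1 Gb of additional sequence, depending on the true genome size.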

Only 40% of reads map to the assembly
Reads were mapped with Bowtie by Titus Brown. Of the reads that did map, 50% went to contigs of ≥1.8 kb.
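The two numbers on this slide can be derived from a table of read-to-contig assignments. The sketch below assumes a hypothetical input format (a list holding the contig name each read mapped to, or `None` for unmapped reads, plus a dictionary of contig lengths); it is not the analysis actually run for the talk.

```python
def mapping_summary(read_hits, contig_lengths):
    """Summarize read mapping against an assembly.

    read_hits: list of contig names, or None for unmapped reads.
    contig_lengths: dict of contig name -> length in bp.
    Returns (fraction_mapped, L), where at least half of the mapped
    reads hit contigs of length >= L (L is None if nothing mapped).
    """
    if not read_hits:
        raise ValueError("no reads given")
    mapped = sorted(
        (contig_lengths[c] for c in read_hits if c is not None),
        reverse=True,
    )
    fraction = len(mapped) / len(read_hits)
    if not mapped:
        return fraction, None
    # Index of the read that completes the top half, longest contigs first.
    half_index = (len(mapped) + 1) // 2 - 1
    return fraction, mapped[half_index]
```

A low mapped fraction combined with a small L suggests that much of the read data falls outside the assembly or into short, fragmented contigs.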

What is to be done?
1. Brute force: push coverage up to at least 50x. Latest round of sequencing just raised it to ~47x.
2. Larger insert sizes: current libraries are ≤325 nt. ~500 nt goal; also, try jumping libraries (after PS1010 test).
3. Work smarter: remove erroneous or highly repetitive reads. New method for removing low-freq. 32-mers from Brown et al.:
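The slide does not say how the Brown et al. method works, so the following is only a minimal sketch of the general idea behind low-frequency k-mer filtering: count every k-mer across all reads, then drop reads containing any k-mer seen fewer than a threshold number of times, since rare k-mers often stem from sequencing errors. The function names, the `min_count` threshold, and the choice to discard whole reads are assumptions. The sketch uses exact in-memory counts and does not merge reverse-complement k-mers, so it would not scale to a real Illumina dataset.

```python
from collections import Counter

def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))

def filter_reads(reads, k=32, min_count=2):
    """Keep reads whose k-mers all occur at least min_count times overall."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return [
        read for read in reads
        # Reads shorter than k have no k-mers and are dropped.
        if len(read) >= k
        and all(counts[km] >= min_count for km in kmers(read, k))
    ]
```

The default `k=32` matches the 32-mers named on the slide; a small `k` is convenient for testing on toy sequences.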

Thanks:
Robin Gasser: Isolated genomic DNA
Bronwyn Campbell: Isolated stage- and sex-specific RNA
Neil Young: Aided DNA and RNA work
Ali Mortazavi: Devised RNAPATH; earlier H. contortus assembly
Brian Williams: cDNA library construction
Lorian Schaeffer: Illumina sequencing
Igor Antoshechkin: Optimized sequencing protocols
Titus Brown: k-mer filtering
Jason Pell, Adina Chuang
Jacobs Genome Center: Infrastructure and funding
ARC, NIH, and HHMI