CARS at ICOPA XII, August 2010 Next-gen. Haemonchus contortus genomics
The life cycle of H. contortus Refs.: Gasser, R.H., unpub. review; Nikolaou and Gasser (2006), Int. J. Parasitol. 36, ; Prichard and Geary (2008), Nature 452,
H. contortus' evolutionary context [Figure: phylogenetic tree. Taxa: C. elegans, C. briggsae, C. japonica, C. remanei, C. brenneri, C. sp. 3 PS1010, C. drosophilae, C. sp. 1 SB341, Brugia malayi, Pristionchus pacificus, Meloidogyne hapla, Meloidogyne incognita, Heterorhabditis bacteriophora, Haemonchus contortus. Clade labels: Elegans, Drosophilae, Caenorhabditis.]
Protein and DNA functions are conserved Refs.: Rufener et al. (2009), PLoS Pathog. 5, e ; Hu et al. (2010), Biotechnol. Adv. 28, [Figure labels: * = monepantel R mutation; Ce-ant-1.1; Hc-ant-1.1]
Next-generation sequencing in a nutshell Ref.: Miller et al. (2010), Genomics 95,
Assembly strategy used for C. sp. 3 PS1010 [Flowchart labels: assemble DNA reads (Velvet); filter out E. coli contigs; map RNA reads onto assembly (ERANGE: Bowtie + BLAT); scaffold contigs with RNA (RNAPATH); repeat once; final assembly]
RNA-mediated scaffolding of Velvet genomic supercontigs (RNAPATH) [Figure labels: Velvet genomic supercontigs; RNA-seq exons; within-supercontig RNA-seq reads; cross-supercontig RNA-seq reads; RNA scaffolding; Velvet+RNAPATH supercontigs]
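The idea behind the figure can be illustrated with a toy example: RNA-seq reads whose parts map to two different supercontigs are evidence that those supercontigs belong together. The sketch below is not RNAPATH itself. It only groups supercontigs linked by enough cross-supercontig reads and does not order or orient them. The function name, the input format, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter


def link_supercontigs(cross_reads, min_support=2):
    """Group supercontigs joined by enough cross-supercontig RNA-seq reads.

    cross_reads: iterable of (supercontig_a, supercontig_b) pairs, one pair
        per RNA-seq read that maps across two different supercontigs.
    min_support: hypothetical minimum number of reads needed to accept a link.
    Returns a sorted list of sorted groups of supercontig names.
    """
    # Count the reads supporting each unordered pair of supercontigs.
    support = Counter(
        tuple(sorted(pair)) for pair in cross_reads if pair[0] != pair[1]
    )

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n_reads in support.items():
        root_a, root_b = find(a), find(b)  # registers both supercontigs
        if n_reads >= min_support and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Made-up read links, not data from the talk.
example_reads = [
    ("sc1", "sc2"), ("sc2", "sc1"),   # two reads link sc1 and sc2
    ("sc2", "sc3"),                   # a single read: below the threshold
    ("sc4", "sc5"), ("sc4", "sc5"),
]
print(link_supercontigs(example_reads))
```

A real scaffolder also has to decide the order and strand of each joined supercontig, which this grouping step leaves out.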
PS1010 assembly statistics
  Columns: Assembly; Total (Mb); Supercontigs; Max. sc. size (kb); N50 (kb); Pred. genes
  Genomic, Velvet (k = 47): K ,741
  Genomic, Velvet + RNAPATH: x K ,851
  cDNA, Velvet: K n/a
  Genome size: 100 Mb [?]. Est. genomic coverage: ~170x. An additional 4.6 Mb of Velvet supercontigs matched E. coli.
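For readers unfamiliar with the N50 column: sort the supercontigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch, assuming that common definition (the slide does not say which variant was used). The lengths in the usage line are made up and are not PS1010 data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (same unit as input)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy lengths in kb: total is 290, half is 145, reached at the second contig.
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```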
H. contortus (McMaster) assembly statistics
  Columns: Assembly; Total (Mb); Supercontigs; Max. sc. size (kb); N50 (kb); Pred. genes/peptides
  Genomic, Velvet (k = 37): K n/a
  Genomic, Velvet + RNAPATH [L1 cDNA]: K n/a
  Genomic, Velvet + RNAPATH [L2 cDNA]: K n/a
  cDNA, Oases: K ,671
  Genome size: ~290 to 340 Mb. Est. genomic coverage: ~35x.
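As a rough cross-check on the coverage figure, depth is total sequenced bases divided by genome size. The sketch below uses only the genome size range and the ~35x estimate from this slide. The function name is hypothetical, and the slide does not say which genome size the estimate assumed.

```python
def bases_for_coverage(genome_size_mb, coverage):
    """Total sequenced bases, in Gb, implied by a given coverage depth."""
    return genome_size_mb * coverage / 1000.0


for size_mb in (290, 340):
    print(f"{size_mb} Mb at 35x: {bases_for_coverage(size_mb, 35):.2f} Gb")
# 290 Mb at 35x: 10.15 Gb
# 340 Mb at 35x: 11.90 Gb
```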
Only 40% of reads map to the assembly Reads mapped with Bowtie by Titus Brown. Of the reads that did map, 50% went to contigs of ≥1.8 kb.
What is to be done?
  1. Brute force: push coverage up to at least 50x. Latest round of sequencing just raised it to ~47x.
  2. Larger insert sizes: current libraries are ≤325 nt. ~500 nt goal; also, try jumping libraries (after PS1010 test).
  3. Work smarter: remove erroneous or highly repetitive reads. New method for removing low-freq. 32-mers from Brown et al.:
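The slide does not describe the Brown et al. method itself, so the following is only a generic illustration of the idea in point 3. K-mers that occur very rarely across all reads are likely to come from sequencing errors, so reads containing them can be dropped before assembly. This sketch uses exact in-memory counting and ignores reverse complements. The function names and the `min_count` threshold are assumptions, and the reads are made up.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def filter_low_abundance_reads(reads, k=32, min_count=2):
    """Keep reads whose k-mers all occur at least min_count times overall.

    reads: list of read sequences (iterated twice, so not a one-shot iterator).
    Reads shorter than k are dropped because they contain no k-mer to check.
    """
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))

    kept = []
    for read in reads:
        if len(read) < k:
            continue
        if all(counts[kmer] >= min_count for kmer in kmers(read, k)):
            kept.append(read)
    return kept


# Toy example with k = 4 instead of 32 so the reads stay short.
toy_reads = ["ACGTACGT", "ACGTACGT", "ACGTTTTT"]
print(filter_low_abundance_reads(toy_reads, k=4, min_count=2))
```

In the toy input the third read contains 4-mers seen only once, so only the two identical reads are kept. Exact counting like this would need a great deal of memory at the read volumes discussed here.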
Thanks:
  Robin Gasser: Isolated genomic DNA
  Bronwyn Campbell: Isolated stage- and sex-specific RNA
  Neil Young: Aided DNA and RNA work
  Ali Mortazavi: Devised RNAPATH; earlier H. contortus assembly
  Brian Williams: cDNA library construction
  Lorian Schaeffer: Illumina sequencing
  Igor Antoshechkin: Optimized sequencing protocols
  Titus Brown, Jason Pell, Adina Chuang: k-mer filtering
  Jacobs Genome Center; ARC, NIH, and HHMI: Infrastructure and funding