Published by Alexander Moody. Modified over 6 years ago.
1
A Hybrid Assembly System in Zebrafish Pooled Clones
Zemin Ning, The Wellcome Trust Sanger Institute
2
Workflow diagram. Labels: Solexa reads (30-75 bp, insert ~300 bp); Phusion; FuzzyPath; extended long reads of 1-2 Kb; Phusion or Phrap; Solexa assembly; fishing WGS reads; WGS reads (5X); combined reads; genome/chromosome assembly.
3
Read Coverage or Kmer Coverage
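As a rough illustration of how the two quantities on this slide relate: a read of length L contains L - k + 1 kmers, so for error-free reads the expected kmer coverage is the read coverage scaled by (L - k + 1) / L. The function name and the value k = 31 below are assumptions chosen for the example, not taken from the slides.

```python
def kmer_coverage(read_coverage: float, read_length: int, k: int) -> float:
    """Expected kmer coverage for a given read coverage, assuming error-free reads."""
    if not 0 < k <= read_length:
        raise ValueError("k must be between 1 and the read length")
    # Each read of length L contributes L - k + 1 kmers instead of L bases.
    return read_coverage * (read_length - k + 1) / read_length


# Illustrative values only: 36 bp reads at 180X read coverage, k = 31 (assumed).
print(kmer_coverage(180.0, 36, 31))  # 30.0
```

With short reads and a large k, the kmer coverage is much lower than the read coverage, which is why the two are worth distinguishing.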
4
Minimum Kmer Coverage is 2
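A minimal sketch of what a minimum kmer coverage of 2 means in practice: count every kmer in the reads and keep only those seen at least twice, on the assumption that kmers seen once are more likely to contain sequencing errors. This is not the Phusion or FuzzyPath code; the function names are hypothetical, and reverse complements are not merged here for brevity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every kmer of length k across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_coverage=2):
    """Keep kmers whose count reaches the minimum kmer coverage."""
    return {kmer for kmer, n in counts.items() if n >= min_coverage}


# Toy reads, made up for the example.
reads = ["ACGTAC", "CGTACG", "ACGTTC"]
kept = solid_kmers(count_kmers(reads, 4), min_coverage=2)
print(sorted(kept))  # ['ACGT', 'CGTA', 'GTAC']
```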
5
Kmer Extension & Repeat Junctions
Pileup of other reads like 454, Sanger etc. at a repeat junction; consensus.
Means to handle repeats:
- Base quality
- Read pair
- Fuzzy kmers
- Closely related reference
- 454 or Sanger reads
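The idea of kmer extension stopping at a repeat junction can be sketched as follows. This is a simplified greedy walk under stated assumptions (exact kmers, forward strand only, k of at least 2), not the FuzzyPath algorithm itself; the function name, the return convention, and the toy kmer set are all hypothetical. Resolving the junction would need the extra evidence listed above, such as read pairs or longer reads.

```python
def extend_right(seed, kmers, max_len=2000):
    """Extend a seed kmer to the right while exactly one next kmer exists.

    Returns (contig, reason), where reason is 'dead_end', 'junction',
    'cycle' or 'max_len'.
    """
    k = len(seed)
    contig = seed
    seen = {seed}
    while len(contig) < max_len:
        suffix = contig[-(k - 1):]
        # Candidate next bases: those whose kmer is present in the kmer set.
        candidates = [b for b in "ACGT" if suffix + b in kmers]
        if not candidates:
            return contig, "dead_end"
        if len(candidates) > 1:
            return contig, "junction"  # more than one way forward: possible repeat
        next_kmer = suffix + candidates[0]
        if next_kmer in seen:
            return contig, "cycle"
        seen.add(next_kmer)
        contig += candidates[0]
    return contig, "max_len"


# Toy kmer set: a unique path ACGTTGCA that branches into GCAA and GCAT.
kmers = {"ACGT", "CGTT", "GTTG", "TTGC", "TGCA", "GCAA", "GCAT"}
print(extend_right("ACGT", kmers))  # ('ACGTTGCA', 'junction')
```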
6
Pooled Clones: Zfish 9, Pig 3
Clone Name  Length (bp)  Finished  Cloning Vector    Species    Capillary Data Pathway
zH117H1     129221       Yes       pTARBAC2.1        D. rerio   /nfs/repository/d0012/zH117H1
zH141B18    119622                                              /nfs/repository/d0012/zH141B18
zH151M17    122622                                              /nfs/repository/d0014/zH151M17
zH117E7     139449                                              /nfs/repository/d0015/zH117E7
zH137D22    122615                                              /nfs/repository/d0023/zH137D22
zH97A24     113538                                              /nfs/repository/d0027/zH97A24
zH146D21    109862                                              /nfs/repository/d0040/zH146D21
zH140N19    118794                                              /nfs/repository/d0013/zH140N19
zH147D24    111470                                              /nfs/repository/d0011/zH147D24
bE2F11      170585                 pTARBAC1.3_BamHI  S. scrofa  /nfs/repository/d0027/bE2F11
bE156J20    210831                                              /nfs/repository/d0041/bE156J20
bE240L11    216560*      No                                     /nfs/repository/d0012/bE240L11
* Finished length may be shorter or longer once complete
7
Boundary of Solexa Contigs
WGS DH reads and contigs
8
Mapping of Solexa Reads On the Reference
9
Zfish and “Pig” Clone Assemblies
Solexa reads:
- Number of reads: … million
- Estimated size of covered region: … Mbp
- Read length: 2x36 bp
- Estimated read coverage: ~180X
- Insert size: …/… bp
Zfish DH reads: 12,539
Assembly features (contig stats):
            Solexa    Hybrid_Ctg  Hybrid_Super
N contigs   …         …           …
Bases       … Mbp     1.68 Mbp    1.69 Mbp
N50 size    …         …,817       74,598
Largest     23,…      …           …,808
Averaged    …         …,072       17,815
Coverage    ~72.6%    ~73%        ~73%
Errors      ?         ?           ?
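For readers unfamiliar with the N50 size reported in the contig stats, here is a minimal sketch of the usual definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembled bases. The function name and the toy lengths are assumptions for illustration, not data from these assemblies.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running total to half the bases.
        if 2 * running >= total:
            return length
    return 0


# Toy contig lengths, made up for the example (total 290, half is 145).
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

A larger N50 for a similar number of assembled bases indicates that the bases sit in longer contigs or supercontigs.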
10
Second Set with 50 Zfish Clones
Solexa reads:
- Number of reads: … million
- Estimated size of covered region: ~9.0 Mbp
- Read length: 2x54 bp
- Estimated read coverage: ~190X
- Insert size: …/… bp
Zfish DH capillary reads: 112,583
Assembly features (contig stats):
            Solexa    Hybrid_Ctg  Hybrid_Super
N contigs   3,…       …           …
Bases       … Mbp     8.39 Mbp    8.43 Mbp
N50 size    …         …,448       70,703
Largest     23,…      …           …,224
Averaged    …         …,194       23,493
Coverage    ~50%      ~93%        ~94%
Errors      ?         ?           ?
13
maq ssaha2
14
maq ssaha2
15
Contig of hybrid assembly
Contig of Zv8 Contig of hybrid assembly
16
Acknowledgements: Yong Gu, James Bonfield, Hannes Ponstingl,
Helen Beasley, Siobhan Whitehead, Michael Quail, Tony Cox