Published by Maurice Mason. Modified over 8 years ago.
1
Meet the ants
Camponotus floridanus (Carpenter ant)
Harpegnathos saltator (Jumping ant)
Solenopsis invicta (Red imported fire ant)
Pogonomyrmex barbatus (Harvester ant)
Linepithema humile (Argentine ant)
Atta cephalotes and Acromyrmex echinatior (Leafcutter ants)
2
Now meet their genomes…
3
Species | Common name | Citation | Platform (coverage) | Assembly program(s) | Scaffold N50 (total length)
Harpegnathos saltator | Jumping ant | Bonasio et al. 2010, Science | Illumina (104x) | SOAPdenovo; 6 libraries (3 paired end, 3 mate pair) | 598 Kb (297 Mb)
Camponotus floridanus | Carpenter ant | Bonasio et al. 2010, Science | Illumina (102x) | SOAPdenovo; 3 paired end, 3 mate pair | 603 Kb (238 Mb)
Acromyrmex echinatior | Leafcutter ant | Nygaard et al. 2011, Genome Research | Illumina (123x) | SOAPdenovo; 5 libraries (2 paired end, 3 mate pair) | 1.1 Mb (300 Mb)
Atta cephalotes | Leafcutter ant | Suen et al. 2011, PNAS | 454 (18-20x) | Roche GS Assembler | 5.1 Mb (317 Mb)
Solenopsis invicta | Fire ant | Wurm et al. 2011, PNAS | 454 + Illumina (~55x) | SOAPdenovo + Roche GS Assembler | 720 Kb (353 Mb)
Linepithema humile | Argentine ant | Smith et al. 2011, PNAS | 454 + Illumina (23x) | Roche GS Assembler + Celera CABOG | 1.3 Mb (43 Mb)
Pogonomyrmex barbatus | Harvester ant | Smith et al. 2011, PNAS | 454 (10-12x) | Celera CABOG | 793 Kb (235 Mb)
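The scaffold N50 reported for each assembly is the length L such that scaffolds of length L or longer together hold at least half of the assembled bases. A minimal sketch of that calculation follows. The function name `n50` and the example lengths are made up for illustration and are not taken from any of the assemblies listed.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sort lengths from longest to shortest and walk down the list until
    the running sum reaches half of the total; the length at which that
    happens is the N50.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

A higher N50 at a similar total length means the same genome is covered by fewer, longer scaffolds.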
4
Generic assembly procedure
Assemble fragments into contigs
Scaffolding: connecting contigs using mate-pair information
5
Steps involved in Illumina Assembly
1) Download data (qseq file: sequences with quality scores)
2) Filter data
   A) Filter low quality reads
   B) Trim adapter sequences
3) SOAPdenovo steps
   A) Preassembly error correction (identify pairs of reads sharing a common sequence (k-mer, e.g. k = 17-20), estimate k-mer frequency, and remove erroneous k-mers)
   B) Construct contigs based on short insert libraries (200-800 bp)
   C) Join contigs into scaffolds using information from large insert mate pair libraries (1 Kb-10 Kb)
   D) Do local reassembly of unresolved gap regions using GapCloser for SOAPdenovo
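The idea behind step 3A can be sketched in a few lines: count every k-mer across the reads, then treat k-mers seen fewer than some threshold number of times as likely sequencing errors. This is only an illustration of the k-mer frequency idea, not SOAPdenovo's actual corrector. The function names, the tiny k and the threshold are all assumptions chosen to keep the example readable.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def flag_reads_with_rare_kmers(reads, k, min_count):
    """Return indices of reads containing a k-mer seen fewer than min_count times.

    At high coverage a true genomic k-mer should appear many times, so
    k-mers below the threshold are treated here as probable errors.
    """
    counts = count_kmers(reads, k)
    rare = {kmer for kmer, n in counts.items() if n < min_count}
    flagged = []
    for idx, read in enumerate(reads):
        if any(read[i:i + k] in rare for i in range(len(read) - k + 1)):
            flagged.append(idx)
    return flagged


# Toy reads: the last one carries a single-base difference (A -> T).
toy_reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
# k = 4 is far smaller than the 17-20 used in practice; chosen for readability.
print(flag_reads_with_rare_kmers(toy_reads, k=4, min_count=2))  # prints [3]
```

A real corrector would go on to repair or trim the flagged reads, and would pick the threshold from the observed k-mer frequency distribution.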
6
2) Filtering data (specifics)
A) Remove low quality reads
   - Remove reads that do not pass the GA analysis Failed_Chastity filter (these have an N in the last column of the GA export file)
   - Can use the R Bioconductor ShortRead package (may have to convert files from qseq to fastq format)
B) Remove adapter sequences
   - Need adapter sequence information from the person who did the sequencing
   - Can use vectorstrip in EMBOSS
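For readers not using ShortRead or vectorstrip, both filtering steps can be sketched in plain Python. The code rests on several assumptions that should be checked against the actual data: each qseq line has 11 tab-separated columns ending in sequence, quality and a filter flag; a flag of 1 (qseq) or Y (export) means the read passed; and unknown bases are written as "." in qseq. The adapter string is a made-up placeholder, and the trimming only handles an exact, full-length adapter match.

```python
def qseq_to_records(lines):
    """Yield (name, sequence, quality) for reads that pass the filter.

    Assumed column order: machine, run, lane, tile, x, y, index,
    read number, sequence, quality, filter flag.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip malformed lines
        machine, run, lane, tile, x, y, index, read_no, seq, qual, flag = fields[:11]
        if flag not in ("1", "Y"):
            continue  # read failed the chastity filter
        name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}#{index}/{read_no}"
        # Quality string is passed through unchanged (no offset conversion).
        yield name, seq.replace(".", "N"), qual


def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def to_fastq(name, seq, qual):
    """Format one read as a four-line FASTQ record."""
    return f"@{name}\n{seq}\n+\n{qual}\n"


# Two made-up qseq lines: the first passes the filter, the second fails.
demo_lines = [
    "M1\t1\t1\t1\t10\t20\t0\t1\tAACC.TTTTGGGGAC\thhhhhhhhhhhhhhh\t1",
    "M1\t1\t1\t1\t11\t21\t0\t1\tACGTACGTACGTACG\thhhhhhhhhhhhhhh\t0",
]
HYPOTHETICAL_ADAPTER = "TTTTGGGG"  # placeholder, not a real adapter sequence

for name, seq, qual in qseq_to_records(demo_lines):
    seq, qual = trim_adapter(seq, qual, HYPOTHETICAL_ADAPTER)
    print(to_fastq(name, seq, qual), end="")
```

Real adapter contamination is often partial or contains mismatches, which is why a dedicated trimming tool is usually the better choice for production data.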
7
Computational power and time required for SOAPdenovo? (Li et al. 2010, Genome Research)
8
And compared to other programs (Lin et al. 2011, Genomics)
9
Acromyrmex echinatior genome raw data
NCBI: SRA Acromyrmex genome
Mate pair libraries (more redundant, to build scaffolds)
Shotgun libraries (broader coverage, to build contigs)
11
Paired end sequencing (<1 Kb)
Mate pair library, paired end sequencing (>1 Kb)