M. roreri de novo genome assembly using abyss/1.9.0-maxk96


1 M. roreri de novo genome assembly using abyss/1.9.0-maxk96
ABySS 1.9.0 introduces Sealer, a new tool for closing scaffold gaps. It also includes Konnector, a fast and memory-efficient tool that fills the gap between paired-end reads.
GROUP 5: Hyeim Jung, Pedro Pablo Parra, Diana Vanessa Sarria Zuniga, Jacob Shoemake

2 HOW ABySS WORKS…
Assembly algorithm: two major steps
Step 1: Construction of contigs without using the paired-end information
Step 2: Solving ambiguities and merging contigs using the paired-end information

Required:
Select an ABySS compiled version depending on the maximum k-mer size
K-mer size: estimated with KmerGenie
Input library files: paired-end, unpaired (single-end), mate pair

The assembly is performed in two major steps. First, without using the paired-end information, contigs are extended until they either cannot be extended unambiguously or come to a blunt end due to a lack of coverage. In the second step, the paired-end information is used to resolve ambiguities and merge contigs: it identifies contigs that can be linked together. Two contigs are considered linked if at least p pairs (by default p = 5) join them.

ABySS 1.9.0 contains:
Konnector: to fill the gap between paired-end reads
Sealer: for closing scaffold gaps
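The linking rule in the second step (two contigs are linked when at least p read pairs join them) can be illustrated with a small Python sketch. This is not ABySS code: the function name `linked_contigs` and the input format (one tuple of contig names per read pair) are assumptions made for illustration, and the sketch only counts pairs, whereas a real assembler also has to consider read orientation and the distance implied by the library.

```python
from collections import Counter


def linked_contigs(pair_hits, p=5):
    """Return contig pairs joined by at least p read pairs.

    pair_hits: iterable of (contig_a, contig_b) tuples, one per read pair,
    naming the contig each mate mapped to (hypothetical input format).
    p: minimum number of joining pairs, p = 5 as stated on the slide.
    """
    counts = Counter()
    for contig_a, contig_b in pair_hits:
        if contig_a == contig_b:
            # Both mates fall inside one contig: no link information.
            continue
        # Sort so that (c1, c2) and (c2, c1) count as the same link.
        counts[tuple(sorted((contig_a, contig_b)))] += 1
    return {pair: n for pair, n in counts.items() if n >= p}


# Toy data: c1-c2 is supported by 7 pairs, c2-c3 by only 4.
pair_hits = (
    [("c1", "c2")] * 6
    + [("c2", "c1")]
    + [("c2", "c3")] * 4
    + [("c3", "c3")] * 10
)
links = linked_contigs(pair_hits)
print(links)
```

With the default threshold only the c1-c2 link is kept; lowering p to 4 would also accept c2-c3, which shows how the threshold trades sensitivity against spurious joins.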

3 OUR ASSEMBLY STRATEGIES…
Two assembly types

Assembly 5 (k=87): Paired PE and Unpaired PE-MP
abyss-pe k=87 name=assembly5 lib='pe1' mp='mp1' pe1='paired PE.1.fq paired PE2.fq' se='unpaired PE-MP' mp1='paired MP.1.fq paired MP.2.fq'

Assembly 3 (k=81): Paired PE-MP and Unpaired PE-MP
abyss-pe k=81 name=assembly3 lib='pe1 pe2' mp='mp1' pe1='paired PE.1.fq paired PE2.fq' pe2='paired MP.1.fq paired MP.2.fq' se='unpaired PE-MP' mp1='paired MP.1.fq paired MP.2.fq'

Note: mp1 is used only for scaffolding. It does not contribute to the consensus sequence.

4 [Diagram: input reads for each stage. Assembly 3: contigs from Paired PE, Paired MP and Unpaired PE&MP; scaffolds from Paired MP. Assembly 5: contigs from Paired PE and Unpaired PE&MP; scaffolds from Paired MP.]

5 Evaluation of best assemblies
QUAST report without a reference genome, and Bowtie2 mapping of PE reads. Values missing from this transcript are shown as "…". In the scaffold entries, paired values are given in slide order; the first of each pair matches the "Abyss" comparison on slide 6 and the second the "Broken" comparison.

assembly_5, contigs.fa
# contigs: 4328 (total, --min-contig 500bp); 9711 (>= 0 bp); 3544 (>= 1000 bp); 1887 (>= 5000 bp); 1181 (>= … bp); 604 (>= … bp); 268 (>= … bp)
Largest: 553,471 bp
Total length: 57.68 Mb (total, --min-contig 500bp); 58.59 Mb (>= 0 bp); 57.12 Mb (>= 1000 bp); 52.99 Mb (>= 5000 bp); 47.96 Mb (>= … bp); 38.75 Mb (>= … bp); 27.02 Mb (>= … bp)
N50: 45,432 bp
# N's: 46,124
Predicted genes: 17734 (unique); … (>= 0 bp); 21553 (>= 300 bp); 1189 (>= 1500 bp); 6 (>= 3000 bp)
Mapped PE reads: 60.40% aligned concordantly exactly 1 time; 22.51% aligned concordantly >1 times; total 82.91%

assembly_5, scaffolds.fa
# contigs: …
Largest: 1,036,… ; …,564
Total length: 57.37 Mb (>= 1000 bp); other values …
N50: 99,290 bp; 51,001 bp
# N's: 568,877; 945
Predicted genes: … (unique); 66; 66 (>= 3000 bp)
Mapped PE reads: 60.41% aligned concordantly exactly 1 time; 22.54% aligned concordantly >1 times; total 82.95%

assembly_3, contigs
# contigs: 4816 (total, --min-contig 500bp); … (>= 0 bp); 3514 (>= 1000 bp); 1642 (>= 5000 bp); 1078 (>= … bp); 567 (>= … bp); 256 (>= … bp)
Largest: 1,035,772 bp
Total length: 56.36 Mb (total, --min-contig 500bp); 61.10 Mb (>= 0 bp); 55.45 Mb (>= 1000 bp); 50.87 Mb (>= 5000 bp); 46.79 Mb (>= … bp); 38.70 Mb (>= … bp); 27.77 Mb (>= … bp)
N50: 48,947 bp
# N's: 247,454
Predicted genes: 17570 (unique); … (>= 0 bp); 21274 (>= 300 bp); 1171 (>= 1500 bp); 63 (>= 3000 bp)
Mapped PE reads: 58.95% aligned concordantly exactly 1 time; 22.55% aligned concordantly >1 times; total 81.5%

assembly_3, scaffolds
# contigs: …
Largest: 1,771,… ; …,868
Total length: …
N50: 102,079 bp; 51,480 bp
# N's: 1,600,849; 806
Predicted genes: … (unique); 63; 63 (>= 3000 bp)
Mapped PE reads: 58.97% aligned concordantly exactly 1 time; 22.62% aligned concordantly >1 times; total 81.59%

PE: …, peak 301. MP: …, peak 1700.
QUAST options: quast/3.2 --gene-finding --eukaryote
Bowtie2 options: bowtie2/… --very-sensitive-local --no-unal --phred33 -p 20
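For readers unfamiliar with the N50 column, the following Python sketch shows the usual definition: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly. It is an illustration only, not QUAST's code; the function name `n50`, the `min_len` parameter (mimicking the effect of `--min-contig`) and the toy lengths are assumptions.

```python
def n50(lengths, min_len=0):
    """Return the N50 of a collection of contig or scaffold lengths.

    lengths: iterable of sequence lengths in bp.
    min_len: ignore sequences shorter than this (hypothetical parameter,
             comparable in spirit to a minimum-contig filter).
    """
    ordered = sorted((n for n in lengths if n >= min_len), reverse=True)
    if not ordered:
        raise ValueError("no sequences left after filtering")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that brings the running total to half the
        # assembly size is the N50.
        if running >= half:
            return length


# Toy lengths (not taken from the slides): total is 290 bp, half is 145 bp.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))              # 80 + 70 = 150 >= 145, so N50 is 70
print(n50(toy_lengths, min_len=45))  # only 80, 70, 50 are kept
```

Because N50 depends on the total length, filtering out short sequences or breaking scaffolds at runs of N's changes the value, which is why the report lists separate figures for the different sequence sets.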

6 Conclusions
Each entry gives the metric, the assembly named on the slide, and the comment for the ABySS and broken scaffolds.

Total length of assembly: (~) Assembly 5. Assemblies: same. Broken: Assembly 3 has 1.1 Mb less.
# Scaffolds: Assembly 3 has many scaffolds <500 bp compared with Assembly 5.
Largest scaffold: Assembly 3.
N50: Assembly 3 (~). ABySS: Assembly 3 has 2,789 bp more. Broken: Assembly 3 has 479 bp more.
# N's: ABySS: Assembly 3 has 1 Mb more N's. Broken: Assembly 5 has 139 more N's.
# Unique predicted genes: Assembly 5 (~). ABySS: Assembly 5 has 67 more genes. Broken: Assembly 3 has 71 more genes.
Mapped paired-end reads: Assembly 5 has 1.36 percentage points more (82.95% vs 81.59%).
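The differences quoted in these conclusions can be rechecked against the figures on slide 5 with a few lines of Python. The assignment of each number to the "ABySS" or "broken" scaffolds is an assumption inferred from the quoted differences, and the dictionary layout and function name are purely illustrative.

```python
# (assembly_5, assembly_3) values as they appear on slide 5.
# Which value belongs to the ABySS scaffolds and which to the broken
# scaffolds is inferred from the differences quoted on slide 6.
scaffold_stats = {
    "N50, ABySS scaffolds": (99_290, 102_079),
    "N50, broken scaffolds": (51_001, 51_480),
    "N's, ABySS scaffolds": (568_877, 1_600_849),
    "N's, broken scaffolds": (945, 806),
}


def assembly3_minus_assembly5(stats):
    """Return assembly_3 minus assembly_5 for every metric."""
    return {name: a3 - a5 for name, (a5, a3) in stats.items()}


diffs = assembly3_minus_assembly5(scaffold_stats)
for name, diff in diffs.items():
    print(f"{name}: {diff:+,}")

# Mapped paired-end reads, in percentage points (assembly_5 minus assembly_3).
mapped_gap = round(82.95 - 81.59, 2)
print(f"mapped PE reads: {mapped_gap:+.2f} percentage points")
```

A positive difference means Assembly 3 has the larger value, so the negative result for N's in the broken scaffolds corresponds to "Assembly 5 has 139 more N's".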


8 Bowtie2 alignment summary (read counts are missing from this transcript)
reads; of these:
  (100.00%) were paired; of these:
    (17.09%) aligned concordantly 0 times
    (60.40%) aligned concordantly exactly 1 time
    (22.51%) aligned concordantly >1 times
    ----
    pairs aligned concordantly 0 times; of these:
      (61.27%) aligned discordantly 1 time
    ----
    pairs aligned 0 times concordantly or discordantly; of these:
      mates make up the pairs; of these:
        37310 (1.11%) aligned 0 times
        (21.66%) aligned exactly 1 time
        (77.23%) aligned >1 times
99.93% overall alignment rate
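The "Mapped PE reads" total reported for each assembly is the sum of the two concordant categories in a summary like this one. The Python sketch below pulls the percentages out of the text and adds them up; the regular expression and function name are assumptions made for illustration, and only the percentages are parsed because the read counts are not available here.

```python
import re

# Concordant-alignment lines as they appear in the summary above.
SUMMARY = """\
(17.09%) aligned concordantly 0 times
(60.40%) aligned concordantly exactly 1 time
(22.51%) aligned concordantly >1 times
"""

# Captures the percentage and the category label of each concordant line.
PATTERN = re.compile(
    r"\(([\d.]+)%\) aligned concordantly (0 times|exactly 1 time|>1 times)"
)


def concordant_percentages(text):
    """Map each concordant category label to its percentage."""
    return {label: float(pct) for pct, label in PATTERN.findall(text)}


pcts = concordant_percentages(SUMMARY)
mapped = round(pcts["exactly 1 time"] + pcts[">1 times"], 2)
print(f"concordantly mapped pairs: {mapped}%")
```

The result agrees with the 82.91% total listed for the assembly_5 contigs, and the three categories together account for all pairs.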

