1 SC'03, Nov. 15–21, 2003
A Million-Fold Speed Improvement in Genomic Repeats Detection
John W. Romein, Jaap Heringa, Henri E. Bal
Vrije Universiteit, Amsterdam
Faculty of Sciences, Department of Computer Science
Bio-Informatics Group & Computer Systems Group
Amsterdam, the Netherlands
2 repeats in bio sequences
- important to detect: essential for
  - evolution
  - protein structure & function
  - diseases
- hard to detect
  - any length
  - mutations
  - insertions/deletions
  - different fragment sizes
  - tandem and distant
3 repro
- delineates repeats
- ☺ sensitive
- two phases
  1. find top alignments (slow)
  2. find repeats
- replaced phase 1
  - old algorithm: ☹ O(n^4), n < 2,000
  - new algorithm: ☺ O(n^3), n < 60,000
  - ☺ 3-level parallel: SIMD, SMP, cluster
4 sidestep: sequence alignment
- superpose two sequences (TATGCAG, TCTGAG)
- match symbols vertically (good: +2, bad: -1)
- allow gaps (-2 - 1*length)
- maximize score
- compute matrix using dynamic programming
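The scoring rules on this slide can be sketched as a small dynamic program. This is a minimal illustration, not the authors' code: it assumes the standard three-state (Gotoh-style) recurrence for gaps that cost 2 + 1*length, and the names `global_score`, `gap_open`, and `gap_extend` are made up for the example.

```python
NEG_INF = float("-inf")

def global_score(a, b, match=2, mismatch=-1, gap_open=2, gap_extend=1):
    """Best global alignment score; a gap of length k costs gap_open + gap_extend * k."""
    n, m = len(a), len(b)
    # M: a[i-1] is aligned to b[j-1]; X: a[i-1] faces a gap; Y: b[j-1] faces a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)
    first = gap_open + gap_extend  # cost of the first symbol of a new gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(M[i - 1][j] - first, Y[i - 1][j] - first,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - first, X[i][j - 1] - first,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])

print(global_score("TATGCAG", "TCTGAG"))  # 6
```

For the slide's example pair, the best superposition has five matches (+10), one mismatch (-1), and one gap of length 1 (-3), which gives 6.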
5 sidestep: local alignment
- Find sub-sequences that match well
- Ignores non-matching values before and after the subsequence (by disallowing negative values)
- Construct actual alignment: O(n^3) time
- Computing only the scores: O(n^2) time (see paper)
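The score-only O(n^2) variant can be sketched as follows. This is an assumption-laden illustration rather than the paper's implementation: it uses the textbook local-alignment recurrence with the scoring from the previous slide (+2, -1, gap cost 2 + 1*length), keeps only two matrix rows, and clamps every cell at zero, which is the "disallowing negative values" step. The name `local_score` is hypothetical.

```python
def local_score(a, b, match=2, mismatch=-1, gap_open=2, gap_extend=1):
    """Best local alignment score of a and b, computed row by row."""
    neg_inf = float("-inf")
    m = len(b)
    first = gap_open + gap_extend      # cost of the first symbol of a new gap
    h_prev = [0] * (m + 1)             # best score ending in each cell of the previous row
    f_prev = [neg_inf] * (m + 1)       # same, but ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        h = [0] * (m + 1)
        f = [neg_inf] * (m + 1)
        e = neg_inf                    # best score ending in a horizontal gap
        for j in range(1, m + 1):
            e = max(h[j - 1] - first, e - gap_extend)
            f[j] = max(h_prev[j] - first, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Clamping at 0 lets an alignment start anywhere.
            h[j] = max(0, h_prev[j - 1] + s, e, f[j])
            best = max(best, h[j])
        h_prev, f_prev = h, f
    return best

print(local_score("TATGCAG", "TCTGAG"))  # 6
```

Because only the maximum is tracked and no traceback is stored, this returns the score but not the alignment itself.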
6 summary
- (TATGCAG, TCTGAG) => 6 (score only): takes O(n^2) time
- (TATGCAG, TCTGAG) => the alignment itself: takes O(n^3) time
- Matching TATGCAG with TCTGAG gives the same result as matching only their well-matching substrings
7 finding top alignments
- red lines: top alignments
- split sequence every possible way
- align subsequence-pair
- best is first top alignment
- trick: find next best (top) alignment using O(n^2) algorithm n times; construct top alignment using O(n^3) algorithm
- repeat while avoiding found top alignments
- user typically wants 5-30 top alignments
- ordered list, do most promising alignments first
- realign 3-10%
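The first three bullets (split every possible way, align each prefix/suffix pair, keep the best) can be sketched naively as below. This covers only the search for the first top alignment by score. It does not implement the ordered list, the realignment of 3-10% of the splits, or the avoidance of already found top alignments. The helper names `local_score` and `first_top_alignment` are hypothetical, and the scoring values are carried over from the alignment slides.

```python
def local_score(a, b, match=2, mismatch=-1, gap_open=2, gap_extend=1):
    """Score-only local alignment (gap of length k costs gap_open + gap_extend * k)."""
    neg_inf = float("-inf")
    m = len(b)
    first = gap_open + gap_extend
    h_prev, f_prev, best = [0] * (m + 1), [neg_inf] * (m + 1), 0
    for i in range(1, len(a) + 1):
        h, f, e = [0] * (m + 1), [neg_inf] * (m + 1), neg_inf
        for j in range(1, m + 1):
            e = max(h[j - 1] - first, e - gap_extend)
            f[j] = max(h_prev[j] - first, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[j] = max(0, h_prev[j - 1] + s, e, f[j])
            best = max(best, h[j])
        h_prev, f_prev = h, f
    return best

def first_top_alignment(seq):
    """Try every split point; return (score, split) of the best prefix/suffix pair."""
    # n - 1 splits, each scored in quadratic time: cubic overall.
    return max((local_score(seq[:s], seq[s:]), s) for s in range(1, len(seq)))

print(first_top_alignment("ACGTACGT"))  # (8, 4)
```

For the toy sequence `ACGTACGT`, splitting in the middle aligns `ACGT` with its exact copy, so that split scores highest.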
8 performance old vs. new
- sequence: longest known protein (titin)
- speed improvement increases with sequence length
9 parallel alignment
- parallelism within alignment: ☹ loop-carried dependency
- concurrent alignments: ☹ speculative parallelism, ☺ good performance
- three-level parallelism
  - SSE/SSE2 multimedia extensions (SIMD)
  - shared memory MIMD
  - distributed memory MIMD
10 SIMD parallelism
- multimedia extensions
  - 4 (SSE) or 8 (SSE2) parallel operations on consecutive 2-byte words
  - compiler intrinsics
- compute 4 (or 8) neighboring matrices concurrently
  - ☹ interleaved memory layout
- use fine-grained hardware for coarse-grained computation
- applicable to any program that does many alignments
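The idea of computing several matrices in lockstep with an interleaved layout can be emulated in plain Python. This is only an illustration of the data layout, under loud assumptions: there are no SSE intrinsics here, the inner loop over lanes stands in for one vector operation, and all alignments are assumed to have the same shape (the slides instead use neighboring matrices of one sequence). The name `lockstep_local_scores` is hypothetical.

```python
def lockstep_local_scores(pairs, match=2, mismatch=-1, gap_open=2, gap_extend=1):
    """Score len(pairs) equally shaped local alignments in lockstep, one per lane."""
    lanes = len(pairs)
    n, m = len(pairs[0][0]), len(pairs[0][1])
    assert all(len(a) == n and len(b) == m for a, b in pairs)
    neg_inf = float("-inf")
    first = gap_open + gap_extend
    # Interleaved rows: column j of lane k lives at index j * lanes + k,
    # so the values of all lanes for one cell are consecutive in memory.
    h_prev = [0] * ((m + 1) * lanes)
    f_prev = [neg_inf] * ((m + 1) * lanes)
    best = [0] * lanes
    for i in range(1, n + 1):
        h = [0] * ((m + 1) * lanes)
        f = [neg_inf] * ((m + 1) * lanes)
        e = [neg_inf] * lanes
        for j in range(1, m + 1):
            base, left = j * lanes, (j - 1) * lanes
            for k in range(lanes):  # stands in for one SIMD instruction sequence
                a, b = pairs[k]
                e[k] = max(h[left + k] - first, e[k] - gap_extend)
                f[base + k] = max(h_prev[base + k] - first,
                                  f_prev[base + k] - gap_extend)
                s = match if a[i - 1] == b[j - 1] else mismatch
                h[base + k] = max(0, h_prev[left + k] + s, e[k], f[base + k])
                best[k] = max(best[k], h[base + k])
        h_prev, f_prev = h, f
    return best

pairs = [("TATGCAG", "TCTGAG"), ("AAAAAAA", "AAAAAA"),
         ("AAAAAAA", "CCCCCC"), ("TATGCAG", "TATGCA")]
print(lockstep_local_scores(pairs))  # [6, 12, 0, 12]
```

Each lane's loop-carried dependency stays intact. The parallelism comes from running independent alignments side by side, which is the "fine-grained hardware for coarse-grained computation" point.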
11 SSE/SSE2 performance
- speedups w.r.t. new algorithm
- superlinear speedups
  - MAX operator
  - 8 extra MMX/XMM registers
  - scheduling
  - cache-aware
- alignment: 4 – 6.5 times faster
12 MIMD parallelism
- SIMD (SSE) parallelism is speculative
  - If a matrix (alignment) is ‘promising’, its neighbors probably also are promising
- MIMD parallelism: use dynamic task scheduling, selecting most promising tasks from a job queue
- Shared memory (SMP): easy
- Distributed memory: MPI, master/worker
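Dynamic task scheduling from a job queue ordered by "promise" can be sketched for the shared-memory case with Python's standard `queue.PriorityQueue`. This is a hypothetical illustration, not the authors' scheduler: `run_workers`, the `(promise, task_id)` task shape, and the `work` callback are invented for the example, and CPython threads would not actually speed up CPU-bound alignment work.

```python
import queue
import threading

def run_workers(tasks, work, num_workers=2):
    """Run work(task_id) for every task, most promising first.

    tasks: iterable of (promise, task_id); a higher promise is served earlier.
    """
    jobs = queue.PriorityQueue()
    for promise, task_id in tasks:
        jobs.put((-promise, task_id))  # negate: PriorityQueue pops the smallest item
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                _, task_id = jobs.get_nowait()
            except queue.Empty:
                return                 # no jobs left: this worker is done
            value = work(task_id)
            with lock:
                results[task_id] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

order = []
run_workers([(5, "a"), (9, "b"), (1, "c")], lambda t: order.append(t), num_workers=1)
print(order)  # ['b', 'a', 'c']
```

In a distributed-memory setting, the same queue would sit in the master process and hand tasks to workers by message passing instead of shared data structures.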
13 total parallel performance
- SMP: 2 CPUs, 2 times faster
- cluster: 64*2 CPUs, 548 – 889-fold speedup
- Up to 125x faster than SSE version on 1 CPU
14 conclusions
- new algorithm
  - >> 100 times faster
  - much more for longer sequences
- parallel: SSE(2), SMP, cluster
  - SSE(2) parallelism yields superlinear speedups
  - 128 CPUs: 548 – 889-fold speedup
- 1,000,000-fold speed improvement