1 Bio-Sequence Analysis with Cradle’s 3SoC™ Software Scalable System on Chip
Xiandong Meng, Vipin Chaudhary
Parallel and Distributed Computing Lab, Wayne State University, Detroit, MI, USA
2 Outline
- Motivation
- Smith-Waterman Algorithm
- Related Works
- Parallel Architecture of the 3SoC Chip
- Implementation Strategy on the 3SoC Chip
- Performance Evaluation
- Future Work
- References
3 Motivation
- Genetic sequence databases are growing exponentially.
- Newly discovered sequences are analyzed by comparing them with these databases.
- The cost of sequence comparison is proportional to the product of query size and database size.
- Analysis is too slow on sequential computers.
- Two possible approaches:
  - Heuristic, e.g. BLAST, FastA
  - Exhaustive, e.g. Smith-Waterman
- Implementing the most sensitive algorithm, Smith-Waterman, on the 3SoC parallel chip multiprocessor architecture could provide high quality and performance at low cost.
4 Sequence Alignment Comparison
[Figure: BLAST, FastA and Smith-Waterman placed on a search-speed axis (slower to faster) and a data-quality axis (lower to higher)]
- Heuristic, e.g. BLAST, FastA: the more efficient the heuristic, the worse the quality of the results.
- Exhaustive, e.g. Smith-Waterman: high-quality results but long computation time.
Example alignment (column spacing not preserved):
CCA – CGAAGCTTGGCTGGAACAGGACTTCTG - GG
: : : : : : : : : : : : : : : : : : : : : : :
CCAGCC AAGCTTCGTGGGCA -AGGAGGCCAGCGG
5 Smith-Waterman Algorithm
- Optimal local alignment of two sequences
- Performs an exhaustive search for the optimal local alignment
  - Complexity O(nm) for sequence lengths n and m
- Based on the 'dynamic programming' (DP) algorithm
  - Fill the DP matrix
  - Find the maximal value (score) in the matrix
  - Trace back from the maximal score until a 0 value is reached
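The three DP steps listed on this slide can be sketched as follows. This is a minimal illustration, not the authors' 3SoC implementation: the function name `smith_waterman`, the scoring values (`match=2`, `mismatch=-1`) and the linear gap penalty (`gap=-1`) are assumptions chosen for the example, since the slides do not state a scoring scheme.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring values and the linear gap penalty are illustrative assumptions.
    """
    n, m = len(a), len(b)

    # Step 1: fill the DP matrix. Row 0 and column 0 stay at 0.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            # Step 2: remember the maximal value and where it occurs.
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Step 3: trace back from the maximal score until a 0 is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Both loops visit every cell once, which is where the O(nm) cost comes from. The full matrix is kept only because the traceback needs it; a score-only database scan can keep just two rows.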
6 Smith-Waterman Algorithm (cont.)
- Optimal local alignment of two sequences
- Performs an exhaustive search for the optimal local alignment
- Based on the 'dynamic programming' (DP) algorithm
7 Smith-Waterman Algorithm (Example)
Align S1 = ATCTCGTATGATG and S2 = GTCTATCAC
[Figure: DP matrix for S1 and S2; the scoring parameters appear as "=1, =1", with the parameter symbols not preserved]
8 Data Dependencies in the S-W Algorithm
Computational dependencies in the Smith-Waterman alignment matrix.
[Figure: alignment matrix with sequence elements a1 to a4 on one axis and database sequence elements b1 to b6 on the other]
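In the standard Smith-Waterman recurrence each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal are independent of each other and can be computed in parallel. The sketch below makes that ordering explicit; it runs sequentially and only illustrates the dependency pattern. The name `sw_score_wavefront` and the scoring values are assumptions for illustration, and the slides do not say whether the 3SoC implementation uses this wavefront order.

```python
def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, computed one anti-diagonal at a time.

    Scoring values and the linear gap penalty are illustrative assumptions.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # d = i + j identifies an anti-diagonal. Cells on diagonal d read only
    # from diagonals d-1 (left, up) and d-2 (upper-left).
    for d in range(2, n + m + 1):
        lo, hi = max(1, d - m), min(n, d - 1)
        # Every iteration of this inner loop is independent of the others,
        # so on parallel hardware the cells could be split across processors.
        for i in range(lo, hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + s,
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            best = max(best, H[i][j])
    return best
```

The result is identical to a row-by-row fill; only the visiting order changes.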
9 Related Works
- Kestrel parallel processor: 512-processing-element (PE) board at UC Santa Cruz
- Field Programmable Gate Array (FPGA): PCI board with one 144-PE FPGA at the University of Tsukuba
- Fuzion: PE on a single chip at Nanyang Technological University
10 Architecture of the 3SoC Chip
Characteristics of 3SoC:
- Quads: the Quad is the primary unit of replication for 3SoC. A 3SoC chip has one or more Quads.
- PEs: each PE is a 32-bit processor with 16-bit instructions and thirty-two 32-bit registers. The PE is rated at approximately 90 MIPS.
- DSEs: each DSE is a 32-bit processor with 128 registers. The DSE is the primary compute engine of the UMS and is rated at approximately 350 MIPS for integer or floating-point performance.
11 Protein Sequence Search on 3SoC Chips
[Figure: processing flow]
- Input: DNA/protein query sequence
- On the 3SoC chips: compare the query sequence with the well-known sequences in the sequence database by performing the S-W sequence alignment
- Output: results
12 Implementation Strategy on 3SoC
[Figure labels: query sequences; 12 PEs; DSE1 to DSE24, each paired with DB1 to DB24; output results]
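The pairing of DSE1 to DSE24 with DB1 to DB24 suggests that the database is split into 24 parts and each part is scanned against the query independently. The sketch below models that idea on a host machine, with threads standing in for DSEs. The round-robin split, the function names (`sw_score`, `partition`, `search`) and the scoring values are assumptions for illustration; the slides do not describe how the database is actually partitioned or how results are merged.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_DSES = 24  # the slide shows DSE1..DSE24, each with its own DB part


def sw_score(a, b, match=2, mismatch=-1, gap=-1):
    """Score-only Smith-Waterman using two rows (assumed scoring values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            s = match if ch == bj else mismatch
            v = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            cur.append(v)
            best = max(best, v)
        prev = cur
    return best


def partition(database, parts):
    """Round-robin split of the database (an assumed policy)."""
    return [database[k::parts] for k in range(parts)]


def search(query, database, workers=NUM_DSES):
    """Score the query against every database sequence, one chunk per worker."""
    chunks = partition(database, workers)

    def scan(chunk):
        return [(sw_score(query, seq), seq) for seq in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(scan, chunks))

    # Merge the per-worker hit lists and rank by score, best first.
    hits = [hit for part in partial for hit in part]
    return sorted(hits, key=lambda h: -h[0])
```

Because each database sequence is scored independently of the others, this coarse-grained split needs no communication between workers until the final merge.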
13 Implementation Details of the S-W Algorithm
[Figure labels: PE, MTE, Buffer 1, Buffer 2, DSE 1, DSE 2, Query Seq., DB Seq1, DB Seq2, DRAM; regions marked "Inside chip" and "Outside chip"]
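The two buffers in this diagram are consistent with a double-buffering scheme, in which the next database sequence is transferred into one buffer while the compute engine works on the other. That reading is an assumption; the slide does not spell out the data flow. The sketch below models it with a loader thread standing in for the transfer engine; `stream_database` and its parameters are hypothetical names.

```python
import queue
import threading


def stream_database(db_sequences, process, num_buffers=2):
    """Overlap loading and processing with a fixed pool of buffers.

    A loader thread fills free buffers while the caller's thread processes
    filled ones. This is an assumed model of the Buffer 1 / Buffer 2 layout.
    """
    buffers = [None] * num_buffers
    free, filled = queue.Queue(), queue.Queue()
    for idx in range(num_buffers):
        free.put(idx)

    def loader():
        for seq in db_sequences:
            idx = free.get()    # wait until the consumer releases a buffer
            buffers[idx] = seq  # stands in for an off-chip to on-chip transfer
            filled.put(idx)
        filled.put(None)        # sentinel: no more data

    thread = threading.Thread(target=loader)
    thread.start()

    results = []
    while True:
        idx = filled.get()
        if idx is None:
            break
        results.append(process(buffers[idx]))
        free.put(idx)           # hand the buffer back to the loader
    thread.join()
    return results
```

With two buffers the loader can stay at most one sequence ahead of the consumer, so memory use stays bounded regardless of database size. For example, `stream_database(db, lambda seq: len(seq))` processes the sequences in their original order.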
14 Performance Analysis
15 Conclusions and Future Work
- Demonstrated that the 3SoC chip multiprocessor architecture can be applied efficiently to comparative genomics.
- Optimize the implementation.
- Apply the next-generation 3SoC architecture to high-performance embedded systems for bioinformatics computing.
16 References
1. Smith, T.F. and Waterman, M.S. Identification of common molecular subsequences. J. Mol. Biol., 147 (1981).
2. Needleman, S. and Wunsch, C. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol., 48(3) (1970).
3. Hughey, R. Parallel hardware for sequence comparison and alignment. Comput. Appl. Biosci., 12 (1996).
4. Schmidt, B., Schröder, H. and Schimmler, M. Massively parallel solutions for molecular sequence analysis. International Parallel and Distributed Processing Symposium: IPDPS Workshops (2002).
5. Yamaguchi, Y. and Maruyama, T. High speed homology search with FPGA. Pacific Symposium on Biocomputing.
6. Grate, L., Diekhans, M., Dahle, D. and Hughey, R. Sequence analysis with the Kestrel SIMD parallel processor. Pacific Symposium on Biocomputing (2001).
17 Questions?