1
Optimal Sum of Pairs Multiple Sequence Alignment
David Kelley
2
Dynamic Programming Extension: Standard pairwise sequence alignment methods can be extended to handle k strings.
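A minimal sketch of that extension, assuming a simple scoring scheme that is not from the slides (match +1, mismatch -1, residue against gap -2, gap against gap 0). The function name `optimal_sp_score` and the constants are illustrative only. Each cell of the k-dimensional lattice is reached by one of 2^k - 1 moves, where each sequence either consumes a residue or takes a gap. The memoized recursion is only suitable for short toy inputs.

```python
from functools import lru_cache
from itertools import combinations, product

# Assumed scoring scheme (not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of symbols taken from a single alignment column."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def column_score(column):
    """Sum-of-pairs score of one column: sum over all unordered pairs."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def optimal_sp_score(seqs):
    """Optimal sum-of-pairs score for k sequences by exhaustive DP."""
    k = len(seqs)
    # Every non-empty subset of sequences may advance: 2^k - 1 moves.
    moves = [m for m in product((0, 1), repeat=k) if any(m)]

    @lru_cache(maxsize=None)
    def best(pos):
        if not any(pos):
            return 0  # origin of the lattice: nothing aligned yet
        result = float("-inf")
        for move in moves:
            prev = tuple(p - m for p, m in zip(pos, move))
            if min(prev) < 0:
                continue  # this move would step outside the lattice
            column = [seqs[i][pos[i] - 1] if move[i] else "-" for i in range(k)]
            result = max(result, best(prev) + column_score(column))
        return result

    return best(tuple(len(s) for s in seqs))
```

With k = 2 this reduces to ordinary global pairwise alignment; with k = 3 the same code fills a cube instead of a matrix.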
3
But… Runtime is O(2^k N^k), where k = number of sequences and N = average length of the sequences. Space is O(N^k). Quickly becomes infeasible.
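To make the growth concrete, here is a small arithmetic sketch using the estimates above. The helper name `dp_cost` is hypothetical; it counts N^k cells and 2^k - 1 moves per cell (the exact lattice has (N+1)^k cells, which does not change the picture).

```python
def dp_cost(k, n):
    """Return (cells, cell_updates) for k sequences of length n."""
    cells = n ** k                 # space estimate, O(N^k)
    moves_per_cell = 2 ** k - 1    # predecessors examined per cell
    return cells, cells * moves_per_cell


for k in (2, 3, 4, 6, 8):
    cells, updates = dp_cost(k, 300)
    print(f"k={k}: {cells:.2e} cells, {updates:.2e} cell updates")
```

For two sequences of length 300 this is 90,000 cells; for eight such sequences it is 300^8, about 6.6 x 10^19 cells, far beyond any memory.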
4
Enter Carrillo-Lipman: Lower-bound the score. Estimate the distance from a cell to the end: calculate the sum of all pairwise distances from the cell to the end. If current score + estimate < lower bound, ignore that path.
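The pruning test can be sketched as follows, in a score-maximizing formulation that matches the inequality on the slide. The scoring constants and the function names (`suffix_scores`, `remaining_upper_bound`, `can_prune`) are assumptions for illustration, not taken from the MSA program. The estimate is the sum, over all pairs, of the optimal pairwise score of the remaining suffixes. No multiple alignment can score more than that on its remaining columns, because each of its pairwise projections is itself a pairwise alignment.

```python
from itertools import combinations

# Assumed scoring scheme (not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def suffix_scores(x, y):
    """table[a][b] = optimal global alignment score of x[a:] against y[b:]."""
    n, m = len(x), len(y)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for a in range(n - 1, -1, -1):
        table[a][m] = table[a + 1][m] + GAP
    for b in range(m - 1, -1, -1):
        table[n][b] = table[n][b + 1] + GAP
    for a in range(n - 1, -1, -1):
        for b in range(m - 1, -1, -1):
            sub = MATCH if x[a] == y[b] else MISMATCH
            table[a][b] = max(
                table[a + 1][b + 1] + sub,  # align x[a] with y[b]
                table[a + 1][b] + GAP,      # x[a] against a gap
                table[a][b + 1] + GAP,      # y[b] against a gap
            )
    return table


def remaining_upper_bound(tables, pos):
    """Sum of optimal pairwise suffix scores from lattice cell pos."""
    return sum(t[pos[i]][pos[j]] for (i, j), t in tables.items())


def can_prune(score_so_far, tables, pos, lower_bound):
    """True if no path through pos can reach the known lower bound."""
    return score_so_far + remaining_upper_bound(tables, pos) < lower_bound
```

Here `tables` is assumed to be a dict keyed by index pairs, for example `{(i, j): suffix_scores(seqs[i], seqs[j]) for i, j in combinations(range(len(seqs)), 2)}`, built once before the search starts.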
5
MSA: The approach was implemented in the 1989 program MSA. It used a simple progressive alignment procedure to obtain a lower bound. “generally can align 6 to 8 sequences of length 200-300 residues”
6
Gupta 1995 update: Re-implemented MSA more efficiently. Uses a star-tree heuristic for the lower bound. Ran on a Sun SPARCstation 10 with 128 MB of RAM. Runtimes varied (depending also on the similarity of the sequences). 10 Globin B proteins of ~150 a.a. took 10 min.
7
Can we do better? Better hardware: more RAM, multi-core processors. Better heuristics: MUSCLE and MAFFT are very fast and accurate. A higher lower bound means more of the matrix can be ignored.
8
My Project: Implement concepts from Carrillo-Lipman. Use MUSCLE for the lower bound. Look for opportunities to parallelize using OpenMP. Run on modern hardware.
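Any valid alignment of the input sequences gives a lower bound on the optimal sum-of-pairs score, so the bound can be read directly off a heuristic alignment. A minimal sketch, assuming the alignment is already available as equal-length gapped strings (how it is produced, for instance by MUSCLE, is outside this snippet) and assuming the same illustrative scoring scheme, which is not from the slides:

```python
from itertools import combinations

# Assumed scoring scheme (not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a, b):
    """Score one pair of symbols taken from a single alignment column."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sp_score(rows):
    """Sum-of-pairs score of an alignment given as equal-length gapped rows."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all alignment rows must have the same length")
    return sum(
        pair_score(a, b)
        for column in zip(*rows)
        for a, b in combinations(column, 2)
    )
```

The returned value can serve as the lower bound in the pruning test: a better heuristic alignment yields a higher bound and therefore prunes more cells.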
9
Can optimal alignment be made practical? How much better can we do than the previous attempts? How will maximizing sum of pairs compare to more popular alignment programs? Compare on the multiple sequence alignment benchmark database BAliBASE.