Download presentation
Presentation is loading. Please wait.
1
Space Efficient Alignment Algorithms Dr. Nancy Warter-Perez June 24, 2005
2
Space Efficient Alignment Algorithms2 Outline Algorithm complexity Complexity of dynamic programming alignment algorithms Hirschberg’s Divide and Conquer algorithm
3
June 24, 2005Space Efficient Alignment Algorithms3 Algorithm Complexity Indicates the space and time (computational) efficiency of a program Space complexity refers to how much memory is required to execute the algorithm Time complexity refers to how long it will take to execute (compute) the algorithm Generally written in Big-O notation O represents the complexity (order) n represents the size of the data set Examples O(n) – “order n”, linear complexity O(n 2 ) – “order n squared”, quadratic complexity Constants and lower orders ignored O(2n) = O(n) and O(n 2 + n + 1) = O(n 2 )
4
June 24, 2005Space Efficient Alignment Algorithms4 Complexity of Dynamic Programming Algorithms for Global/Local Alignment Time complexity – O(m*n) For each cell in the score matrix, perform 3 operations Compute Up, Left, and Diagonal scores O(3*m*n) = O(m*n) Space complexity – O(m*n) Size of scoring matrix = m*n Size of trace back matrix = m*n O(2*m*n) = O(m*n) Where, m and n are the lengths of the sequences being aligned. Since m n, O( n 2 ) – quadratic complexity!
5
June 24, 2005Space Efficient Alignment Algorithms5 Memory Requirements For a sequence of 200-500 amino acids or nucleotides O(n 2 ) = 500 2 = 250,000 If store each score as a 32-bit value = 4 bytes, it requires 1,000,000 bytes to represent the scoring matrix! If store each trace back symbol as a character (8-bit value), it requires 250,000 bytes to represent the trace back matrix
6
June 24, 2005Space Efficient Alignment Algorithms6 Simple Improvement for Scoring Matrix In reality, the space complexity of the scoring matrix is only linear, i.e., O(2*min(m,n)) = O(min(m,n)) O(min(m,n)) O(n) for sequences of comparable lengths 2,000 bytes (instead of 1 million) But, trace back still quadratic space complexity
7
June 24, 2005Space Efficient Alignment Algorithms7 Hirschberg’s “Divide and Conquer” Space Efficient Algorithm Compute the score matrix(s) between the source (0,0) and (n, m/2). Save m/2 column of s. Compute the reverse score matrix (s reverse ) between the sink (n, m) and (0,m/2). Save the m/2 column of s reverse. Find middle (i, m/2) satisfies max 0 i n {s(i, m/2) + s reverse (n- i, m/2)} Recursively partition problem into 2 subproblems middle m/2m (0,0) (n,m) n i m/2m (0,0) (n,m) n middle m/2m (0,0) n middle (n,m) m (0,0) (n,m) n m (0,0) n(n,m) m (0,0) n(n,m) Source Sink
8
June 24, 2005Space Efficient Alignment Algorithms8 Pseudo Code of Space- Efficient Alignment Algorithm Path (source, sink) If source and sink are in consecutive columns output the longest path from the source to the sink Else middle middle vertex between source and sink Path (source, middle) Path (middle, sink)
9
June 24, 2005Space Efficient Alignment Algorithms9 Complexity of Space-Efficient Alignment Algorithm Time complexity Equal to the sum of the areas of the rectangles Area + ½ Area + ¼ Area + … 2*Area where, Area = n*m O(2n*m) = O(n*m) Quadratic time/computation complexity (same as before) Space complexity Need to save a column of s and s reverse for each computation (but can discard after computing middle) O(min(n,m)) – if m < n, switch the sequences (or save a row of s and s reverse instead) Linear space complexity!!
10
June 24, 2005Space Efficient Alignment Algorithms10 Project Teams and Presentation Assignments (Revised) Base Project (Global Alignment): Miguel and Joseph Extension 0 (Global Alignment - all alignments with max score): Mario Extension 1 (Ends-Free Global Alignment): Sherri and Dhonam Extension 2 (Local Alignment): Kung-Hua, and Swapna Extension 3 (Affine Gap Penalty): Cory and Nam Extension 4 (Database): Melinda and Dana Extension 5 (Space Efficient Algorithm): Michael and Sarah
11
June 24, 2005Space Efficient Alignment Algorithms11 Workshop Work on Sequence Alignment project Email me a progress report on Tuesday, June 28 th Specify the implementation status for each module List each function within a module and specify it’s status Date written Date testing completed Author Include functions in the list that are not completed (I.e., not written yet or fully tested). For these cases, write TBD (to be determined) in the respective date field. Only one report per group, but cc your partner on your e-mail!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.