Presentation is loading. Please wait.

Presentation is loading. Please wait.

Space Efficient Alignment Algorithms and Affine Gap Penalties

Similar presentations


Presentation on theme: "Space Efficient Alignment Algorithms and Affine Gap Penalties"— Presentation transcript:

1 Space Efficient Alignment Algorithms and Affine Gap Penalties
Dr. Nancy Warter-Perez

2 Space Efficient Alignment Algorithms
Outline Algorithm complexity Complexity of dynamic programming alignment algorithms Memory efficient algorithms Hirschberg’s Divide and Conquer algorithm Affine gap penalty Space Efficient Alignment Algorithms

3 Space Efficient Alignment Algorithms
Algorithm Complexity Indicates the space and time (computational) efficiency of a program Space complexity refers to how much memory is required to execute the algorithm Time complexity refers to how long it will take to execute (compute) the algorithm Generally written in Big-O notation O represents the complexity (order) n represents the size of the data set Examples O(n) – “order n”, linear complexity O(n2) – “order n squared”, quadratic complexity Constants and lower orders ignored O(2n) = O(n) and O(n2 + n + 1) = O(n2) Space Efficient Alignment Algorithms

4 Space Efficient Alignment Algorithms
Complexity of Dynamic Programming Algorithms for Global/Local Alignment Time complexity – O(m*n) For each cell in the score matrix, perform 3 operations Compute Up, Left, and Diagonal scores O(3*m*n) = O(m*n) Space complexity – O(m*n) Size of scoring matrix = m*n Size of trace back matrix = m*n O(2*m*n) = O(m*n) Where, m and n are the lengths of the sequences being aligned. Since m  n, O(n2 ) – quadratic complexity! Space Efficient Alignment Algorithms

5 Space Efficient Alignment Algorithms
Memory Requirements For a sequence of amino acids or nucleotides O(n2) = 5002 = 250,000 If store each score as a 32-bit value = 4 bytes, it requires 1,000,000 bytes to represent the scoring matrix! If store each trace back symbol as a character (8-bit value), it requires 250,000 bytes to represent the trace back matrix Space Efficient Alignment Algorithms

6 Simple Improvement for Scoring Matrix
In reality, the space complexity of the scoring matrix is only linear, i.e., O(2*min(m,n)) = O(min(m,n)) O(min(m,n))  O(n) for sequences of comparable lengths 2,000 bytes (instead of 1 million) But, trace back still quadratic space complexity Space Efficient Alignment Algorithms

7 Hirschberg’s “Divide and Conquer” Space Efficient Algorithm
middle m/2 m (0,0) (n,m) n i Source Sink Compute the score matrix(s) between the source (0,0) and (n, m/2). Save m/2 column of s. Compute the reverse score matrix (sreverse) between the sink (n, m) and (0,m/2). Save the m/2 column of sreverse. Find middle (i, m/2) satisfies max 0 in {s(i, m/2) + sreverse(n-i, m/2)} Recursively partition problem into 2 subproblems Space Efficient Alignment Algorithms

8 Pseudo Code of Space-Efficient Alignment Algorithm
Path (source, sink) If source and sink are in consecutive columns output the longest path from the source to the sink Else middle middle vertex between source and sink Path (source, middle) Path (middle, sink) Space Efficient Alignment Algorithms

9 Complexity of Space-Efficient Alignment Algorithm
Time complexity Equal to the sum of the areas of the rectangles Area + ½ Area + ¼ Area + …  2*Area where, Area = n*m O(2n*m) = O(n*m) Quadratic time/computation complexity (same as before) Space complexity Need to save a column of s and sreverse for each computation (but can discard after computing middle) O(min(n,m)) – if m < n, switch the sequences (or save a row of s and sreverse instead) Linear space complexity!! Reference: Space Efficient Alignment Algorithms

10 Space Efficient Alignment Algorithms
Gap Penalties Gap penalties account for the introduction of a gap - on the evolutionary model, an insertion or deletion mutation - in both nucleotide and protein sequences, and therefore the penalty values should be proportional to the expected rate of such mutations. Space Efficient Alignment Algorithms

11 Space Efficient Alignment Algorithms

12 Source: http://www.apl.jhu.edu/~przytyck/Lect03_2005.pdf
Space Efficient Alignment Algorithms

13 Space Efficient Alignment Algorithms

14 Space Efficient Alignment Algorithms

15 Space Efficient Alignment Algorithms

16 Space Efficient Alignment Algorithms

17 Space Efficient Alignment Algorithms

18 Space Efficient Alignment Algorithms

19 Space Efficient Alignment Algorithms
Project Verification - Use EMBOSS Pairwise Alignment Tool Space Efficient Alignment Algorithms

20 Space Efficient Alignment Algorithms
Project Verification – LALIGN Space Efficient Alignment Algorithms

21 Space Efficient Alignment Algorithms
Workshop Work on Sequence Alignment project me a progress report by 6 p.m. on Tuesday, July 3rd Specify the implementation status for each module List each function within a module and specify it’s status Date written Date testing completed Author Include functions in the list that are not completed (I.e., not written yet or fully tested). For these cases, write TBD (to be determined) in the respective date field. Only one report per group, but cc your partner on your ! Space Efficient Alignment Algorithms


Download ppt "Space Efficient Alignment Algorithms and Affine Gap Penalties"

Similar presentations


Ads by Google