Sequence Alignment Oct 9, 2002 Joon Lee Genomics & Computational Biology.


1 Sequence Alignment Oct 9, 2002 Joon Lee Genomics & Computational Biology

2 Dynamic Programming
Optimization problems: make the best decision at each step, one after another
Subproblems are not independent
Subproblems share subsubproblems
Solve each subproblem once and save its answer in a table
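To make the "save its answer in a table" idea concrete, here is a minimal top-down sketch that caches each subproblem the first time it is solved. The function name `best_score` and the scores (match +2, mismatch -1, gap -2, borrowed from the scoring slide later in this deck) are assumptions for illustration, not code from the lecture.

```python
from functools import lru_cache

# Assumed scores, taken from the scoring slide later in the deck.
MATCH, MISMATCH, GAP = 2, -1, -2


def best_score(a: str, b: str) -> int:
    """Best global alignment score of a and b, computed top-down with a cache."""

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        # solve(i, j) is the best score for aligning a[:i] with b[:j].
        if i == 0:
            return j * GAP
        if j == 0:
            return i * GAP
        pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
        # The three branches share sub-subproblems; the cache ensures each
        # (i, j) pair is solved only once.
        return max(
            solve(i - 1, j - 1) + pair,  # align a[i-1] with b[j-1]
            solve(i - 1, j) + GAP,       # a[i-1] against a gap
            solve(i, j - 1) + GAP,       # b[j-1] against a gap
        )

    return solve(len(a), len(b))


print(best_score("GAATTCAGTTA", "GGATCGA"))
```

Without the cache the same (i, j) pairs would be recomputed many times over; with it, the work is proportional to the size of the table.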

3 Four Steps of DP
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution in a bottom-up fashion
4. Construct an optimal solution from computed information

4 Sequence Alignment
Sequence 1: G A A T T C A G T T A
Sequence 2: G G A T C G A

5 Align or insert gap
G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A

G _ A A T T C A G T T A
|     |   | |   |     |
G G _ A _ T C _ G _ _ A
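One way to compare two candidate alignments like these is to score them column by column. The sketch below assumes the scores given later in the deck (match +2, mismatch -1, gap -2) and uses "_" as the gap character; `score_alignment` is a hypothetical helper name.

```python
# Assumed scores, taken from the scoring slide later in the deck.
MATCH, MISMATCH, GAP = 2, -1, -2


def score_alignment(top: str, bottom: str, gap_char: str = "_") -> int:
    """Sum the per-column scores of two equal-length gapped sequences."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == gap_char or y == gap_char:
            total += GAP        # one residue is aligned against a gap
        elif x == y:
            total += MATCH      # identical residues
        else:
            total += MISMATCH   # different residues
    return total


print(score_alignment("GAATTCAGTTA", "GGA_TC_G__A"))    # first alignment
print(score_alignment("G_AATTCAGTTA", "GG_A_TC_G__A"))  # second alignment
```

Under these assumed scores the first alignment comes out ahead of the second, because the second spends two extra gaps to avoid a single mismatch.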

6 Three Steps of SA
1. Initialization: gap penalty
2. Scoring: matrix fill
3. Alignment: trace back

7 Step 1: Initialization
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2
G  -4
A  -6
T  -8
C -10
G -12
A -14
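The initialization step can be sketched in a few lines. The layout follows the slide (sequence 1 across the columns, sequence 2 down the rows); the function name `init_matrix` and the default gap value of -2 are assumptions.

```python
def init_matrix(seq1: str, seq2: str, gap: int = -2) -> list[list[int]]:
    """Build the score matrix with only the first row and column filled in."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        S[0][j] = j * gap  # a prefix of seq1 aligned entirely against gaps
    for i in range(rows):
        S[i][0] = i * gap  # a prefix of seq2 aligned entirely against gaps
    return S


S = init_matrix("GAATTCAGTTA", "GGATCGA")
print(S[0])                    # first row
print([row[0] for row in S])   # first column
```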

8 Step 2: Scoring
A = a_1 a_2 … a_n, B = b_1 b_2 … b_m
S_ij: score at (i, j)
s(a_i, b_j): matching score between a_i and b_j
w: gap penalty
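A standard way to write the recurrence these symbols describe is given below. It assumes the gap penalty w is a negative number that is added to the score, which matches the worked cells in this deck, where the gap term is -2.

```latex
\[
S_{0,0} = 0, \qquad S_{i,0} = i\,w, \qquad S_{0,j} = j\,w
\]
\[
S_{i,j} = \max
\begin{cases}
S_{i-1,j-1} + s(a_i, b_j) & \text{align } a_i \text{ with } b_j \\
S_{i-1,j} + w & \text{align } a_i \text{ with a gap} \\
S_{i,j-1} + w & \text{align } b_j \text{ with a gap}
\end{cases}
\]
```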

9 Step 2: Scoring
Match: +2
Mismatch: -1
Gap: -2

10 Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2
G  -4
A  -6
T  -8
C -10
G -12
A -14
Candidates for the cell at row G, column G:
0 + 2 = 2 (diagonal, match)
-2 + (-2) = -4 (gap, from above or from the left)

11 Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0
G  -4
A  -6
T  -8
C -10
G -12
A -14
Candidates for the cell at row G, first column A:
-2 + (-1) = -3 (diagonal, mismatch)
-4 + (-2) = -6 (gap, from above)
2 + (-2) = 0 (gap, from the left)

12 Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0
G  -4   0
A  -6
T  -8
C -10
G -12
A -14
Candidates for the cell at second row G, column G:
-2 + 2 = 0 (diagonal, match)
2 + (-2) = 0 (gap, from above)
-4 + (-2) = -6 (gap, from the left)

13 Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0  -2  -4  -6  -8 -10 -12 -14 -16 -18
G  -4   0   1  -1  -3  -5  -7  -9  -8 -10 -12 -14
A  -6  -2   2   3   1  -1  -3  -5  -7  -9 -11 -10
T  -8  -4   0   1   5   3   1  -1  -3  -5  -7  -9
C -10  -6  -2  -1   3   4   5   3   1  -1  -3  -5
G -12  -8  -4  -3   1   2   3   4   5   3   1  -1
A -14 -10  -6  -2  -1   0   1   5   3   4   2   3
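The matrix fill can be sketched as a bottom-up double loop over the table. The function name `fill_matrix` is an assumption; the scores are the ones stated on the scoring slide (match +2, mismatch -1, gap -2), and the layout puts sequence 1 across the columns and sequence 2 down the rows.

```python
def fill_matrix(seq1: str, seq2: str, match: int = 2,
                mismatch: int = -1, gap: int = -2) -> list[list[int]]:
    """Fill the global alignment score matrix row by row."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    # Step 1: initialization with cumulative gap penalties.
    for j in range(cols):
        S[0][j] = j * gap
    for i in range(rows):
        S[i][0] = i * gap
    # Step 2: each cell takes the best of its three neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq2[i - 1] == seq1[j - 1] else mismatch
            S[i][j] = max(
                S[i - 1][j - 1] + pair,  # diagonal: align the two residues
                S[i - 1][j] + gap,       # from above: gap in seq1
                S[i][j - 1] + gap,       # from the left: gap in seq2
            )
    return S


for row in fill_matrix("GAATTCAGTTA", "GGATCGA"):
    print(" ".join(f"{v:3d}" for v in row))
```

The bottom-right cell holds the score of the best global alignment; the cells along the way are what the trace back step reads.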

14 Step 3: Trace back
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0  -2  -4  -6  -8 -10 -12 -14 -16 -18
G  -4   0   1  -1  -3  -5  -7  -9  -8 -10 -12 -14
A  -6  -2   2   3   1  -1  -3  -5  -7  -9 -11 -10
T  -8  -4   0   1   5   3   1  -1  -3  -5  -7  -9
C -10  -6  -2  -1   3   4   5   3   1  -1  -3  -5
G -12  -8  -4  -3   1   2   3   4   5   3   1  -1
A -14 -10  -6  -2  -1   0   1   5   3   4   2   3

15 Step 3: Trace back
G A A T T C A G T T A
G G A _ T C _ G _ _ A

G A A T T C A G T T A
G G A T _ C _ G _ _ A
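A trace back can be sketched by starting at the bottom-right cell and repeatedly stepping to whichever neighbour could have produced the current score. The function name `align` and the tie-breaking order (diagonal first, then left, then up) are assumptions; a different order can return a different, equally good alignment, which is why the slide shows two.

```python
def align(seq1: str, seq2: str, match: int = 2,
          mismatch: int = -1, gap: int = -2) -> tuple[str, str, int]:
    """Return one optimal global alignment (top, bottom) and its score."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        S[0][j] = j * gap
    for i in range(1, rows):
        S[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq2[i - 1] == seq1[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + pair,
                          S[i - 1][j] + gap,
                          S[i][j - 1] + gap)

    # Walk back from the bottom-right corner to the top-left corner.
    top, bottom = [], []
    i, j = rows - 1, cols - 1
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and seq2[i - 1] == seq1[j - 1] else mismatch
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + pair:
            top.append(seq1[j - 1])      # diagonal: both residues aligned
            bottom.append(seq2[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and S[i][j] == S[i][j - 1] + gap:
            top.append(seq1[j - 1])      # left: gap in seq2
            bottom.append("_")
            j -= 1
        else:
            top.append("_")              # up: gap in seq1
            bottom.append(seq2[i - 1])
            i -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), S[-1][-1]


top, bottom, score = align("GAATTCAGTTA", "GGATCGA")
print(top)
print(bottom)
print(score)
```

With this particular tie-breaking order the sketch should land on the first of the two alignments shown on the slide; preferring a different move at tied cells would give the second.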

16 Exercise
Sequence 1: G C A T C C G
Sequence 2: G A T C G
Match: +2
Mismatch: -1
Gap: -2

17 Exercise
        G   C   A   T   C   C   G
    0  -2  -4  -6  -8 -10 -12 -14
G  -2   2   0  -2  -4  -6  -8 -10
A  -4   0   1   2   0  -2  -4  -6
T  -6  -2  -1   0   4   2   0  -2
C  -8  -4   0  -2   2   6   4   2
G -10  -6  -2  -1   0   4   5   6
Match: +2
Mismatch: -1
Gap: -2
G C A T C C G
G A T C G

18 Amino acids
Match/mismatch → Substitution matrix
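For amino acids, the single match/mismatch pair is replaced by a lookup table with one score per residue pair. The sketch below shows only the lookup mechanics: the numbers in `TOY_SCORES` are invented for illustration and are not taken from BLOSUM, PAM, or any other published matrix, and the names are hypothetical.

```python
# Toy substitution scores, invented for illustration only.
# Keys are residue pairs in sorted order so each pair is stored once.
TOY_SCORES = {
    ("D", "D"): 6,
    ("I", "I"): 4,
    ("L", "L"): 4,
    ("I", "L"): 2,    # similar residues: mild reward instead of a flat mismatch
    ("D", "I"): -3,
    ("D", "L"): -4,   # dissimilar residues: stronger penalty
}


def substitution_score(x: str, y: str) -> int:
    """Look up the score for aligning residue x with residue y (symmetric)."""
    return TOY_SCORES[tuple(sorted((x, y)))]


print(substitution_score("L", "I"))
print(substitution_score("L", "D"))
```

In the alignment recurrence, this function would stand in for s(a_i, b_j); the rest of the algorithm is unchanged.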

19 Global & Local alignment
Global: Needleman-Wunsch Algorithm
Local: Smith-Waterman Algorithm
From Mount, Bioinformatics, Chap. 3
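The local variant differs from the global one in two places: no cell is allowed to drop below zero, and the answer is the best cell anywhere in the matrix rather than the bottom-right corner. The sketch below computes only the local score; the function name `local_score` is an assumption, and reusing the deck's scores (match +2, mismatch -1, gap -2) for local alignment is also an assumption.

```python
def local_score(seq1: str, seq2: str, match: int = 2,
                mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (Smith-Waterman style, linear gap cost)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    # The first row and column stay at 0: a local alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq2[i - 1] == seq1[j - 1] else mismatch
            H[i][j] = max(
                0,                       # restart: drop a poor prefix
                H[i - 1][j - 1] + pair,  # extend diagonally
                H[i - 1][j] + gap,       # gap in seq1
                H[i][j - 1] + gap,       # gap in seq2
            )
            best = max(best, H[i][j])    # the best cell can be anywhere
    return best


print(local_score("TTACGTT", "GGACGGG"))
```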

20 References
Sequence alignment with Java applet: http://linneus20.ethz.ch:8080/5_4_5.html

