Download presentation
Presentation is loading. Please wait.
Published byTamsin McCarthy Modified over 9 years ago
1
CSCI 256 Data Structures and Algorithm Analysis Lecture 16 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some by Iker Gondra
2
Dynamic Programming Review Recipe –Characterize structure of problem –Recursively define value of optimal solution –Compute value of optimal solution –Construct optimal solution from computed information Dynamic programming techniques –Binary choice: weighted interval scheduling –Multi-way choice: segmented least squares –Adding a new variable: knapsack –Dynamic programming over intervals: RNA secondary structure
3
RNA Secondary Structure RNA: String B = b 1 b 2 b n over alphabet { A, C, G, U } Secondary structure: RNA is single-stranded so it tends to loop back and form base pairs with itself. This structure is essential for understanding behavior of molecule G U C A GA A G CG A U G A U U A G A CA A C U G A G U C A U C G G G C C G Ex: GUCGAUUGAGCGAAUGUAACAACGUGGCUACGGCGAGA complementary base pairs: A-U, C-G
4
RNA Secondary Structure Secondary structure: A set of pairs S = { (b i, b j ) } that satisfy –[Watson-Crick] S is a matching and each pair in S is a Watson-Crick complement: A-U, U-A, C-G, or G-C –[No sharp turns] The ends of each pair are separated by at least 4 intervening bases. If (b i, b j ) S, then i < j - 4 –[Non-crossing] If (b i, b j ) and (b k, b l ) are two pairs in S, then we cannot have i < k < j < l C GG C A G U U UA A UGUGGCCAU ok GG C A G U UA G A UGGGCAU sharp turn G 44 C GG C A U G U UA A GUUGGCCAU crossing
5
RNA Secondary Structure Out of all the secondary structures that are possible for a single RNA molecule, which are the ones that are likely to arise? –Free energy: Usual hypothesis is that an RNA molecule will form the secondary structure with the optimum total free energy –Goal: Given an RNA molecule B = b 1 b 2 b n, find a secondary structure S that maximizes the number of base pairs approximate by number of base pairs http://www.genebee.msu.su/services/rna2_reduced.html
6
RNA Secondary Structure: Subproblems First attempt: OPT(j) = maximum number of base pairs in a secondary structure on substring b 1 b 2 b j. Either –j is not involved in a pair Find optimal secondary structure in: b 1 b 2 b j-1 –j pairs with t for some t < j – 4 OPT(j-1) 1 tj
7
RNA Secondary Structure: Subproblems If j pairs with some t where t < j – 4, then the no crossover rule tells us that we can’t have a base pair (k,l) where k < t < l < j; this implies we can’t have (k,l) where 1 ≤ k ≤ t -1 and t + 1 ≤ l ≤ j -1 This means that any other pair (k,l) in an optimal structure is either in b 1,b 2,…,b t-1 or in b t+1,…b j-1 So we must look at two subproblems which are decoupled due to the noncrossing constraint: Find the optimal secondary structure in: b 1 b 2 b t-1 Find the optimal secondary structure in: b t+1 b t+2 b j-1
8
RNA Secondary Structure: Subproblems What is different here???? The second subproblem is not on our list of subproblems, because it does not begin with b 1. We need more subproblems! We need to be able to work with subproblems that do not begin with b 1.
9
Dynamic Programming Over Intervals Notation: OPT(i, j) = maximum number of base pairs in a secondary structure of the substring b i b i+1 b j –Case 1: i j - 4 OPT(i, j) = 0 by no-sharp turns condition –Case 2: i < j - 4 If Base b j is not involved in a pair (Watson –Crick) OPT(i, j) = OPT(i, j-1) If Base b j pairs with b t for some t, i t < j - 4 the non-crossing constraint means that: OPT(i, j) = 1 + max t { OPT(i, t-1) + OPT(t+1, j-1) } take max over t such that i t < j-4 and b t and b j are Watson-Crick complements
10
Hence if i < j – 4 we want the maximum of the two values for Opt(i,j) Opt(i,j) = max ( OPT(i, j-1), 1 + max t { OPT(i, t-1) + OPT(t+1, j-1) } )
11
Dynamic Programming Over Intervals What order to solve the sub-problems? –Looking at the recurrence relation, we see that we are invoking solutions to subproblems on shorter intervals –Need to evaluate Opt for shortest intervals first – this is different from the subset sum (and knapsack) strategy of doing row by row –To achieve this need to set an auxillary variable k to a constant and use values of i and j which keep j-i = k –As k gets larger, the interval for the subproblem b i,b i+1,…b j grows
12
Dynamic Programming Over Intervals –Running time: O(n 3 ) Why??? RNA(b 1,…,b n ) { Initialize Opt[i, j] = 0 whenever i j-4 (ie, i+4 ≥ j) for k = 5, 6, …, n-1 for i = 1, 2, …, n-k set j = i + k Compute Opt[i, j] return Opt[1, n] } using recurrence
13
Running time analysis: There are O(n 2 ) subproblems to solve and evaluating the recurrence in each problem takes O(n) time (because we have to find the max over the t’s such that b t and b j are allowable pairs) So running time is O(n 3 )
14
Example: ACCGGUAGU Recall: base pairs allowed: AU, UA, CG, GC What is the basic array that we need to fill?? (here n= 9)
15
Example: ACCGGUAGU Note if i > j, let Opt (i,j) = 0 (Why??) Need two dimensions to present the array M of for values for Opt (i,j) – one for the left endpoint of the interval being considered, and one for the right endpoint Some initial values are 0 – whenever i ≥ j – 4 (Why??) Begin with k = 5; loop over the i’s from 1 to 4 (= 9 – 5) for Opt (1,6); t = 1 is only t with 1 ≤ t < 6 - 4, and b 1 b 6 is AU allowable base pair so Opt (1,6) = max( 0, max(1+0+0) ) = 1 Opt (2,7) t = 2 is only t with 2 ≤ t < 7-4; but b 2 b 7 is CA – not an allowable base pair so no ts satisfy the conditions and Opt(2,7) = Opt (2,6) = 0 Next value??
16
Example: ACCGGUAGU Opt(3,8); t = 3 only possible t and b 3 b 8 is allowable base pair so Opt(3,8)= max( Opt(3,7), max ( 1 + Opt(3,0) + Opt(2,7) ) ) = max ( 0, max(1 + 0+0) ) = 1 Next value to calculate??
17
Example: ACCGGUAGU It is Opt(4,9) Now let k = 6 and do Opt( 1, 7) then Opt( 2,8), then Opt (3,9) …. Note for Opt (1,7) both t = 1 and t = 2 satisfy the inequality i ≤ t < j – 4 (i = 1 and j = 7); are both base pair allowable? This is a fully worked example in the text – check out more values to make sure you are following the algorithm
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.