Improving Free Energy Functions for RNA Folding RNA Secondary Structure Prediction.

1 Improving Free Energy Functions for RNA Folding RNA Secondary Structure Prediction

2 Why RNA is Important Machinery of protein construction Catalytic role in cells –May be possible to destroy specific sequences of RNA (to interrupt protein production) –RNase P (Cech/Altman c.1981)

3 RNA Structural Levels Primary: AAUCG...CUUCUUCCA Secondary: [figure] Tertiary: [figure]

4 Abstracting the problem [Figure: example bases U C C A G G A C] Zuker (1981) Nucleic Acids Research 9(1) 133-149

5 Why it is hard Large search space (hard to enumerate) Hofacker et al. (1994) Monatsh. Chem. 125 167-188
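To give a feel for how large the search space is, the sketch below counts pseudoknot-free secondary structures on a chain of n bases with a standard recurrence. It is an illustration, not something taken from the slides: it assumes any base may pair with any other (so it is an upper bound for a real sequence), and the `min_loop` parameter (minimum number of unpaired bases enclosed by a pair) is an assumption introduced here.

```python
def count_structures(n: int, min_loop: int = 3) -> int:
    """Count pseudoknot-free secondary structures on n bases.

    Assumes every base can pair with every other base, so this is an
    upper bound on the number of structures for a concrete sequence.
    """
    # s[k] = number of structures on a chain of k bases
    s = [1] * (n + 1)
    for k in range(1, n + 1):
        total = s[k - 1]  # case 1: base k is unpaired
        # case 2: base k pairs with base j, enclosing k - j - 1 bases
        for j in range(1, k - min_loop):
            total += s[j - 1] * s[k - j - 1]
        s[k] = total
    return s[n]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

The count grows exponentially with n, which is why exhaustive enumeration is impractical for sequences of realistic length.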

6 Why it is hard Secondary structure does not exist. –Unlike proteins –Putative structures (prone to revision) Quality of Energy Functions –Discussed later

7 Current Algorithms Single-Strand –Minimum Free Energy (Zuker et al. 1981) –Partition Functions (McCaskill 1990) Comparative Sequence Analysis –Max. Weighted Matching (Nussinov et al. 1978) –Stochastic CFG (Sakakibara et al. 1994) –Phylogenetic Trees (Gulko et al. 1995) –Statistical Significance (Noller & Woese, early 80’s) See proposal for references
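As a rough illustration of the dynamic-programming style these algorithms share, here is a minimal base-pair maximization sketch in the spirit of the Nussinov recurrence. It is a simplification, not the weighted-matching method cited on the slide: every allowed pair scores 1, and the pairing rules (Watson-Crick plus G-U) and the `min_loop` constraint are assumptions made for this example.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (unweighted sketch)."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; short spans stay 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # base j left unpaired
            # base j pairs with base k, enclosing at least min_loop bases
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAACCC"))
```

The table fill is cubic in the sequence length, which is the cost that makes this family of methods tractable compared with enumeration.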

8 MFE / Tinoco Hypothesis The free energy of a secondary structure equals the sum of the free energies of the loops and stacked pairs Tinoco et al. (1971) Nature 230 362-367.
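Written as a formula, the additivity hypothesis reads roughly as follows. The notation is an assumption introduced here, not taken from the slides: e(·) is the free-energy contribution of a single loop or stacked pair, and loops(S) and stacks(S) are the sets of loops and stacked pairs in structure S.

```latex
E(S) \;=\; \sum_{L \,\in\, \mathrm{loops}(S)} e(L) \;+\; \sum_{P \,\in\, \mathrm{stacks}(S)} e(P)
```

Because each term depends only on a local piece of the structure, the minimum of E(S) can be found by dynamic programming.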

9 Proposed System [Diagram, stages numbered 1 2 3: sequence AAUCG...CUUCUUCCA, MFE (E), Secondary Structures, GA (E’)]

10 Step I - Calc MFE Structure Given a sequence, apply the MFE algorithm –Generates secondary structure S

11 Step II - Structural Similarity Given a database of experimentally verified RNA structures –Let Q be the database structure most similar to S –Based on RNase P Database (Brown 1999)
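The slides do not say which similarity measure is used. As one possible sketch, the code below compares two structures in dot-bracket notation by the overlap of their base-pair sets (a Jaccard index) and picks the closest database entry. The function names and the choice of measure are assumptions for illustration; comparing pairs by position also ignores alignment, which a real comparison across different sequences would need.

```python
def base_pairs(dot_bracket: str) -> set:
    """Return the set of (i, j) pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of base-pair sets (hypothetical measure)."""
    pa, pb = base_pairs(a), base_pairs(b)
    if not pa and not pb:
        return 1.0  # two fully unpaired structures count as identical
    return len(pa & pb) / len(pa | pb)


def most_similar(predicted: str, database: list) -> str:
    """Pick the database structure closest to the predicted one."""
    return max(database, key=lambda q: similarity(predicted, q))


if __name__ == "__main__":
    db = ["......", "(....)", "((..))"]
    print(most_similar("((..))", db))
```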

12 Step III - Construct E’ Create a new energy function:

13 Discussion on E’ E’ has global information Global information precludes the use of dynamic programming (MFE, Partition) Leaves (stochastic) combinatorial optimization –Ruled out: Gradient Descent (no ∂E/∂S) –Applicable: Genetic Algorithms / Simulated Annealing

14 Step IV - Genetic Algorithm RNA Structural Prediction by GA –Input: a sequence –Output: structure that minimizes E’ for that sequence –Steady State Genetic Algorithm –Pseudoknots forbidden (conflicts) –Fitness = -E’ –Effect of Similarity(Q, S) diminishes with each generation (pseudo-SA).
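A steady-state genetic algorithm replaces one individual per iteration instead of a whole generation. The skeleton below sketches that loop together with a decaying weight for the similarity term. Everything specific is an assumption made for illustration: the slides do not give the form of E’, the decay schedule, or the selection scheme, so the geometric schedule in `similarity_weight`, the uniform parent selection, and the replace-worst rule are hypothetical choices.

```python
import random


def similarity_weight(generation: int, w0: float = 1.0,
                      decay: float = 0.95) -> float:
    """Hypothetical geometric schedule: the similarity term fades out."""
    return w0 * decay ** generation


def steady_state_ga(population, fitness, breed, generations, rng):
    """Generic steady-state loop (needs at least two individuals).

    fitness(individual, generation) -> float, higher is better
    breed(parent_a, parent_b, rng)  -> child
    """
    pop = list(population)
    for g in range(generations):
        a, b = rng.sample(pop, 2)  # assumption: uniform parent choice
        child = breed(a, b, rng)
        worst = min(range(len(pop)), key=lambda i: fitness(pop[i], g))
        # replace the worst individual only if the child improves on it
        if fitness(child, g) > fitness(pop[worst], g):
            pop[worst] = child
    return max(pop, key=lambda ind: fitness(ind, generations))


if __name__ == "__main__":
    # Toy problem: integers, optimum at 7. Not an RNA energy function.
    rng = random.Random(1)
    fit = lambda x, g: -(x - 7) ** 2
    breed = lambda a, b, r: (a + b) // 2 + r.choice((-1, 0, 1))
    print(steady_state_ga([0, 20, 3, 15], fit, breed, 200, rng))
```

In an RNA setting, `fitness` would return -E’ for a candidate structure, with the similarity contribution scaled by something like `similarity_weight(generation)`.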

15 Genetic Algorithm - Repn. Stem-loop representation (Chen et al. 2000) –Window method (EMBOSS Palindrome) –Each stem is a tuple (start end length weight), e.g. (23 52 3 3.2)
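The stem tuple can be sketched as a small data type. The interpretation below is an assumption, not stated on the slide: a stem with start 23, end 52 and length 3 is read as the stacked pairs (23, 52), (24, 51), (25, 50). The `conflicts` helper, also hypothetical, flags two stems that share a base or cross each other (a pseudoknot).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stem:
    start: int     # 5' position of the outermost pair
    end: int       # 3' position of the outermost pair
    length: int    # number of stacked pairs
    weight: float  # score attached to the stem

    def pairs(self):
        """Stacked pairs, outermost first (assumed interpretation)."""
        return [(self.start + k, self.end - k) for k in range(self.length)]


def conflicts(a: Stem, b: Stem) -> bool:
    """True if the stems share a base or form a pseudoknot."""
    for i, j in a.pairs():
        for k, l in b.pairs():
            if len({i, j, k, l}) < 4:          # a base is used twice
                return True
            if i < k < j < l or k < i < l < j:  # crossing pairs
                return True
    return False


if __name__ == "__main__":
    stem = Stem(23, 52, 3, 3.2)
    print(stem.pairs())
    print(conflicts(stem, Stem(40, 60, 2, 1.0)))
```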

16 Genetic Algorithm - Operators Mutation –Add stem from stem pool to a child Crossover (parents P1, P2; children C1, C2) –Fit stems of P2 into C1 or C2 randomly. –Placement must be conflict free.
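The two operators can be sketched as follows, with a structure held as a list of stem tuples (start, end, length, weight). Several details are assumptions, since the slide leaves them open: both children are taken to start as copies of P1, each stem of P2 is offered to one randomly chosen child, and a stem is kept only if it shares no base with, and does not cross, the stems already present.

```python
import random
from itertools import combinations


def stem_pairs(stem):
    start, end, length, _weight = stem
    return [(start + k, end - k) for k in range(length)]


def conflicts(a, b):
    """True if two stems share a base or cross (pseudoknot)."""
    for i, j in stem_pairs(a):
        for k, l in stem_pairs(b):
            if len({i, j, k, l}) < 4 or i < k < j < l or k < i < l < j:
                return True
    return False


def fits(stem, structure):
    return all(not conflicts(stem, other) for other in structure)


def is_conflict_free(structure):
    return all(not conflicts(a, b) for a, b in combinations(structure, 2))


def mutate(child, stem_pool, rng):
    """Add one randomly chosen conflict-free stem from the pool, if any."""
    candidates = [s for s in stem_pool if s not in child and fits(s, child)]
    return child + [rng.choice(candidates)] if candidates else list(child)


def crossover(p1, p2, rng):
    """Assumption: children start as copies of P1, then absorb P2's stems."""
    c1, c2 = list(p1), list(p1)
    for stem in p2:
        target = rng.choice((c1, c2))  # offer the stem to one child
        if stem not in target and fits(stem, target):
            target.append(stem)
    return c1, c2


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [(1, 30, 3, 2.0), (5, 20, 2, 1.0), (10, 40, 2, 1.5), (35, 50, 3, 2.5)]
    print(crossover([pool[0], pool[1]], [pool[3]], rng))
```

Rejecting conflicting stems at placement time keeps every individual pseudoknot-free, so no repair step is needed afterwards.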

17 Preliminary Results E’ does not lead to a drastic speed-up Genetic algorithm is very slow –If the initial population is generated randomly from the stem pool. –Use suboptimal foldings for the initial population instead.

18 Preliminary Results Explained The real structure is usually very similar to the Tinoco-optimal structure. View E’ as a way of choosing among the suboptimal structures.

19 Future Work More testing on the entire RNase P Database (> 400 structures) Tune E’ Accuracy comparison to MFE and Partition Function Algorithms Parallelize genetic algorithm

20 END
