Lecture 9 CS566 RNA – The “REAL nucleic acid” Motivation Concepts Structural prediction –Dot-matrix –Dynamic programming Simple cost model Energy cost.

1 Lecture 9 CS566: RNA – The “REAL nucleic acid”
Motivation
Concepts
Structural prediction
–Dot-matrix
–Dynamic programming
  Simple cost model
  Energy cost model
–Covariance

2 Motivation
First molecule of life: “Jack of all trades, by necessity”
–RNA genes produce numerous kinds of RNA (distinct from the mRNA of protein genes)
  tRNA, rRNA, components in splicing, protein trafficking
Basis of global classification of life
–Phylogenetic studies of RNA shed light on the relationships between various life forms

3 Concepts
RNA has structure
–Base pairing: “No date? No problem, I complement myself, if imperfectly.”
–Stem, Loop, Bulge, Junction
Promiscuous base pairing
–Not just GC, AU, but also GU
1:n mapping between structure and sequence
–Distinction between structure and sequence as evolutionary constraints
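The pairing rules and the 1:n mapping can be illustrated with a short Python sketch. The dot-bracket notation and the function names are assumptions of this example, not part of the slides; the point is only that two different sequences can satisfy the same structure.

```python
# Allowed pairs from the slide: G-C, A-U, and the G-U wobble pair.
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}

def can_pair(a, b):
    """True if bases a and b may pair under the rules above."""
    return (a + b).upper() in PAIRS

def fits_structure(seq, structure):
    """Check that every bracketed pair in `structure` joins two pairable bases.

    `structure` uses dot-bracket notation (an assumption of this sketch):
    '(' and ')' mark paired positions, '.' marks an unpaired base.
    """
    if len(seq) != len(structure):
        return False
    open_positions = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                return False
            i = open_positions.pop()
            if not can_pair(seq[i], seq[j]):
                return False
    return not open_positions

hairpin = "(((....)))"
print(fits_structure("GGGAAAACCC", hairpin))  # one sequence for this stem-loop
print(fits_structure("GUGAAAACGC", hairpin))  # a different sequence, same structure
```

Both calls accept the same hairpin, which is the sense in which one structure maps to many sequences.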

4 Dot-matrix Analysis
Flip sequence
Generate complement
Compare result with original sequence
Limitation: the plot shows all candidate pairings but does not say which structure is best
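A minimal Python sketch of the three steps above. It follows the slide literally, so only Watson-Crick complements are matched (G-U pairs would need an extra rule); the function names are illustrative.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Steps 1 and 2: flip the sequence, then complement each base."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def dot_matrix(seq):
    """Step 3: compare the original (rows) with the flipped complement (columns).

    A 1 at row i, column j means base i of the original could pair with
    base len(seq)-1-j of the original.
    """
    flipped = reverse_complement(seq)
    return [[1 if a == b else 0 for b in flipped] for a in seq]

def print_dot_matrix(seq):
    flipped = reverse_complement(seq)
    print("  " + " ".join(flipped))
    for base, row in zip(seq, dot_matrix(seq)):
        print(base + " " + " ".join("*" if dot else "." for dot in row))

print_dot_matrix("GGGAAAACCC")
```

Runs of dots along a down-right diagonal correspond to candidate stems. Several such runs usually appear, which is the limitation noted on the slide.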

5 Maximum base-pairs algorithm
Premise: Best structure is the structure with the maximum number of base-pairs
Approach: Dynamic Programming
Optimality: Best structure for an RNA sequence must be built from the best structures of its subsequences
Scoring function: Base-pair = 1; Non-base-pair = 0

6 Maximum base-pairs algorithm
Algorithm:
S(i,j) = max[ S(i+1, j-1) + m_ij, S(i+1, j), S(i, j-1), max over k (i<k<j) of {S(i, k) + S(k+1, j)} ]
where m_ij = 1 if bases i and j form a base-pair and 0 otherwise
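The recurrence can be sketched in Python as below. The table is filled by increasing subsequence length so that every S value on the right-hand side is ready when needed. The `min_loop` parameter is an addition of this sketch (the slide sets no minimum hairpin-loop size); its default of 0 reproduces the recurrence as written.

```python
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}

def max_base_pairs(seq, min_loop=0):
    """Return S(0, n-1), the maximum number of base-pairs for seq."""
    n = len(seq)
    if n == 0:
        return 0
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):              # span = j - i
        for i in range(n - span):
            j = i + span
            # m_ij: 1 if bases i and j pair (and the loop is long enough), else 0
            m_ij = 1 if (seq[i] + seq[j] in PAIRS and j - i > min_loop) else 0
            inner = S[i + 1][j - 1] if j - i >= 2 else 0
            best = max(inner + m_ij,      # i pairs with j
                       S[i + 1][j],       # i left unpaired
                       S[i][j - 1])       # j left unpaired
            for k in range(i + 1, j):     # split into two independent parts
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S[0][n - 1]

print(max_base_pairs("GGGAAAACCC"))
```

The inner loop over k makes the running time cubic in the sequence length.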

7 Maximum base-pairs algorithm
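The recurrence yields only a count. One common way to recover the pairs themselves, assumed here rather than taken from the slides, is a traceback through the filled table. The sketch below uses Python, reports the result in dot-bracket notation (also an assumption), and returns just one of possibly many optimal structures.

```python
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}

def fill(seq):
    """Fill the S table of the maximum base-pairs recurrence."""
    n = len(seq)
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            m_ij = 1 if seq[i] + seq[j] in PAIRS else 0
            inner = S[i + 1][j - 1] if j - i >= 2 else 0
            best = max(inner + m_ij, S[i + 1][j], S[i][j - 1])
            for k in range(i + 1, j):
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S

def traceback(seq, S):
    """Walk back through S and collect one optimal set of (i, j) pairs."""
    pairs = []
    stack = [(0, len(seq) - 1)] if seq else []
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        inner = S[i + 1][j - 1] if j - i >= 2 else 0
        if S[i][j] == S[i + 1][j]:            # i unpaired
            stack.append((i + 1, j))
        elif S[i][j] == S[i][j - 1]:          # j unpaired
            stack.append((i, j - 1))
        elif seq[i] + seq[j] in PAIRS and S[i][j] == inner + 1:
            pairs.append((i, j))              # i pairs with j
            stack.append((i + 1, j - 1))
        else:                                 # bifurcation at some k
            for k in range(i + 1, j):
                if S[i][j] == S[i][k] + S[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return sorted(pairs)

def dot_bracket(seq):
    out = ["."] * len(seq)
    for i, j in traceback(seq, fill(seq)):
        out[i], out[j] = "(", ")"
    return "".join(out)

print(dot_bracket("GGGAAAACCC"))
```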

8 Case for better cost function
Limitations of maximal base pair algorithm:
–Fails to distinguish between contiguous versus scattered base-pairs
–Ignores other energetic contributions
  Stacking interactions (“Affinity between adjacent ballroom dancing pairs”)
  RNA flexible, but not infinitely so (“India rubber man, but still can’t fold into box”). Need to penalize distorted structures, proportional to degree of distortion
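The first limitation can be made concrete with a small Python sketch. The two structures below are invented for illustration: both contain three base-pairs, so the maximum base-pairs score cannot separate them, yet only one of them has adjacent (stacked) pairs.

```python
def count_stacks(pairs):
    """Count adjacent pair-on-pair contacts: (i, j) directly followed by (i+1, j-1)."""
    pair_set = set(pairs)
    return sum(1 for (i, j) in pair_set if (i + 1, j - 1) in pair_set)

# Hypothetical structures on a 12-base sequence, given as (i, j) index pairs.
contiguous = [(0, 11), (1, 10), (2, 9)]   # one uninterrupted stem
scattered = [(0, 11), (2, 9), (4, 7)]     # same pair count, no two pairs adjacent

for name, pairs in [("contiguous", contiguous), ("scattered", scattered)]:
    print(name, "pairs:", len(pairs), "stacks:", count_stacks(pairs))
```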

9 Minimum Energy Algorithm (MFOLD)
Key differences
–Score n base-pair interactions not explicitly but indirectly as n-1 stacking interactions
–Stacking costs depend on exact sequence of base-pairs (Table 8.2a)
–Also penalize non-paired bases, differentially (Table 8.2b)
–More involved implementation
–Suboptimal structures also found
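A minimal Python sketch of the first two points: a helix of n base-pairs is scored as n-1 lookups in a stacking table keyed by the two adjacent pairs. The table used here is a uniform placeholder chosen only so the example runs; it does not reproduce Table 8.2a, and the function name is illustrative. Penalties for unpaired bases (Table 8.2b) are not modeled.

```python
from itertools import product

def helix_energy(strand_5to3, partner_3to5, stack_table):
    """Sum stacking terms over adjacent base-pairs of one uninterrupted helix.

    strand_5to3[k] is paired with partner_3to5[k], so a helix of n pairs
    contributes n-1 table lookups.
    """
    if len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must have equal length")
    total = 0.0
    for k in range(len(strand_5to3) - 1):
        lower = strand_5to3[k] + partner_3to5[k]
        upper = strand_5to3[k + 1] + partner_3to5[k + 1]
        total += stack_table[(lower, upper)]   # depends on both adjacent pairs
    return total

# PLACEHOLDER values: every stack scores -1.0. These are NOT the Table 8.2a energies.
PAIR_TYPES = ["GC", "CG", "AU", "UA", "GU", "UG"]
placeholder_table = {key: -1.0 for key in product(PAIR_TYPES, repeat=2)}

print(helix_energy("GGG", "CCC", placeholder_table))  # 3 pairs, 2 stacking terms
```

With a real table, two helices of equal length but different sequence would generally receive different energies, which is what the second point on the slide describes.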

10 Minimum Energy Algorithm (MFOLD)
Steps (Fig 8.6, 8.7)
–First, exhaustive enumeration of possible pairs, noting exact pairs
–Identify local diagonals and score based on stacking costs
–Use these values, instead of just number of base pairs, to carry out dynamic programming similar to base-pair counting algorithm
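The first two steps can be sketched in Python as follows. This is an illustration of enumerating pairs and collecting local diagonals (runs of consecutive pairs), not the MFOLD implementation; `min_len` and the tuple layout are assumptions of the sketch. A diagonal of length n would then be scored with n-1 stacking terms.

```python
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}

def find_stems(seq, min_len=2):
    """Return maximal diagonals as (i, j, length).

    A diagonal starting at (i, j) consists of the pairs
    (i, j), (i+1, j-1), ..., (i+length-1, j-length+1).
    """
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            if seq[i] + seq[j] not in PAIRS:
                continue
            # Skip pairs that only continue a diagonal begun at (i-1, j+1).
            if i > 0 and j + 1 < n and seq[i - 1] + seq[j + 1] in PAIRS:
                continue
            length = 0
            while (i + length < j - length
                   and seq[i + length] + seq[j - length] in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems

for i, j, length in find_stems("GGGAAAACCC"):
    print(f"start=({i},{j}) pairs={length} stacking terms={length - 1}")
```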

11 Covariance based analysis
Based on
–Base pairing is conserved, even if individual bases are not
–Structure (pattern of base pairing, not base-pairs per se) is of key importance for RNA molecules
Approach
–Multiply align RNA sequences with similar function
–Identify columns across which base-pairing is conserved by statistical analysis of joint probability versus random probability
Benefits
–Identify structural core elements
–Rank search results of structure prediction methods
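One common way to compare joint probability with random (independent) probability for two alignment columns is mutual information; the slides do not name the statistic, so treating it as mutual information is an assumption of this Python sketch. The four-sequence alignment is a toy example invented for illustration.

```python
from collections import Counter
from math import log2

def mutual_information(alignment, col_i, col_j):
    """Mutual information (in bits) between two columns of an alignment.

    Compares the observed joint frequency of each base combination with the
    product of the single-column frequencies.
    """
    n = len(alignment)
    freq_i = Counter(seq[col_i] for seq in alignment)
    freq_j = Counter(seq[col_j] for seq in alignment)
    freq_ij = Counter((seq[col_i], seq[col_j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_independent = (freq_i[a] / n) * (freq_j[b] / n)
        mi += p_ab * log2(p_ab / p_independent)
    return mi

# Toy alignment: columns 0 and 4 vary, but always as a complementary pair.
toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]

print(mutual_information(toy_alignment, 0, 4))  # covarying columns
print(mutual_information(toy_alignment, 0, 1))  # column 1 is constant
```

Columns that change together while keeping a pair intact score high, even though neither column is conserved on its own.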

12 Miscellany
Stochastic grammars for RNA structure prediction/analysis (Fig. 8.16)
–Productions like “G. non-terminal. C” encode complementarity
Searches for RNA genes
–Sequence alignment per se not useful
–Searches based on self-complementary regions in sliding windows useful
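A sliding-window search for self-complementary regions might look like the Python sketch below. The scoring is a deliberate simplification assumed for this example: each window is folded onto itself as a single hairpin and the matching positions are counted. The window width and threshold are arbitrary illustrative parameters.

```python
PAIRS = {"GC", "CG", "AU", "UA", "GU", "UG"}

def hairpin_score(window):
    """Count positions k where window[k] can pair with its mirror position."""
    half = len(window) // 2
    return sum(1 for k in range(half) if window[k] + window[-1 - k] in PAIRS)

def scan(seq, width, threshold):
    """Slide a window along seq and report (start, score) for windows at or above threshold."""
    hits = []
    for start in range(len(seq) - width + 1):
        score = hairpin_score(seq[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy sequence: a GGG....CCC hairpin embedded in poly-A.
sequence = "AAAAAGGGAAAACCCAAAAA"
print(scan(sequence, width=10, threshold=3))
```

A fuller search could score each window with the dynamic programming methods from the earlier slides instead of this single-hairpin count.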

