
1 BCB 444/544 Lecture 24: Protein Tertiary Structure Prediction (#24_Oct17)
10/17/07, BCB 444/544 F07 ISU, Terribilini, #24 - RNA Secondary Structure Prediction

2 Required Reading (before lecture)
Mon Oct 15 - Lecture 23: Protein Tertiary Structure Prediction (Chp 15, pp 214-230)
Wed Oct 17 & Thurs Oct 18 - Lecture 24 & Lab 8 (Terribilini): RNA Structure/Function & RNA Structure Prediction (Chp 16, pp 231-242)
Fri Oct 19 - Lecture 25: Gene Prediction (Chp 8, pp 97-112)

3 New Reading & Homework Assignment
ALL: HomeWork #4 (emailed & posted online Sat AM). Due: Mon Oct 22 by 5 PM (not Fri Oct 19).
Read: Ginalski et al. (2005) Practical Lessons from Protein Structure Prediction, Nucleic Acids Res. 33:1874-91. http://nar.oxfordjournals.org/cgi/content/full/33/6/1874 (PDF posted on website)
Although somewhat dated, this paper provides a nice overview of protein structure prediction methods and of the evaluation of predicted structures. Your assignment is to write a summary of this paper; for details see HW#4, posted online & sent by email on Sat Oct 13.

4 Seminars this Week
BCB List of URLs for Seminars related to Bioinformatics: http://www.bcb.iastate.edu/seminars/index.html
Oct 18 Thur - BBMB Seminar, 4:10 in 1414 MBB: Sachdev Sidhu (Genentech), Phage peptide and antibody libraries in protein engineering and ligand selection
Oct 19 Fri - BCB Faculty Seminar, 2:10 in 102 ScI: Lyric Bartholomay (Ent, ISU), TBA

5 Chp 15 - Tertiary Structure Prediction
SECTION V STRUCTURAL BIOINFORMATICS
Xiong: Chp 15 Protein Tertiary Structure Prediction
- Methods
- Homology Modeling
- Threading and Fold Recognition
- Ab Initio Protein Structural Prediction
- CASP

6 Tertiary Structure Prediction Methods
2 (or 3) Major Methods:
1. Comparative Modeling: Homology Modeling (easiest!); Threading and Fold Recognition (harder)
2. Ab Initio Protein Structural Prediction (really hard)

7 Steps in Threading
[Figure: Target Sequence (ALKKGF…HFDTSE) and Structure Templates]
1. Align target sequence with template structures in fold library (usually from the PDB)
2. Calculate energy score to evaluate "goodness of fit" between target sequence & template structure
3. Rank models based on energy scores

8 A Local Example: Rapid Threading Approach for Protein Structure Prediction
Kai-Ming Ho, Physics; Haibo Cao; Yungok Ihm; Zhong Gao; James Morris; Cai-zhuang Wang
Drena Dobbs, GDCB; Jae-Hyung Lee; Michael Terribilini; Jeff Sander
Cao H, Ihm Y, Wang CZ, Morris JR, Su M, Dobbs D, Ho KM (2004) Three-dimensional threading approach to protein structure recognition. Polymer 45:687-697

9 Simplify: Template structure representation
Template structure → contact matrix $C_{ij}$, for residues $i, j = 1, \dots, N$:
$C_{ij} = 1$ if residues $i$ and $j$ lie within the distance cutoff (in Å) (contact)
$C_{ij} = 0$ otherwise, or if $j$ is a neighbor of $i$ in sequence (non-contact)
(Yungok Ihm)

10 Simplify: Energy Function
Interaction "counts" only if two hydrophobic amino acid residues are in contact.
At residue level, pair-wise hydrophobic interaction is dominant: $E = \sum_{i,j} C_{ij} U_{ij}$
$C_{ij}$: contact matrix; $U_{ij} = U(\text{residue } i, \text{residue } j)$
MJ: $U = U_{ij}$; LTW: $U = Q_i Q_j$; HP: $U = \{1, 0\}$
(Yungok Ihm)
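The sum $E = \sum_{i,j} C_{ij} U_{ij}$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hydrophobic residue set, the function names, and the choice to count each pair once (i < j) are not from the slides, and the slides do not state a sign convention for E.

```python
# Hypothetical hydrophobic set for the HP model (an assumption, not from the slides).
HYDROPHOBIC = set("AVLIMFWC")


def hp_potential(a, b):
    """HP model: U = 1 if both residues are hydrophobic, else 0."""
    return 1 if a in HYDROPHOBIC and b in HYDROPHOBIC else 0


def contact_energy(sequence, contacts, potential):
    """Sum C_ij * U(residue_i, residue_j) over pairs i < j.

    `contacts` is a symmetric 0/1 matrix (list of lists) aligned with `sequence`.
    Each pair is counted once, which is an assumption about the intended sum.
    """
    n = len(sequence)
    energy = 0
    for i in range(n):
        for j in range(i + 1, n):
            if contacts[i][j]:
                energy += potential(sequence[i], sequence[j])
    return energy


# Toy example: contacts between residue pairs (0,1), (0,3) and (1,2).
toy_contacts = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
print(contact_energy("AVKL", toy_contacts, hp_potential))
```

Passing the potential as a function keeps the MJ, LTW, and HP variants interchangeable: an LTW-style potential would return $Q_i Q_j$ from a 20-entry table instead.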

11 Energy calculation: Contact energy
Miyazawa-Jernigan (MJ) matrix: 210 parameters; statistical potential
Li-Tang-Wingreen (LTW): 20 parameters
Contact energy: computed with the contact matrix; terms labeled ~ solubility and ~ hydrophobicity
[Figure: excerpt of the interaction matrix for residues C, M, F, I, L (axes C M F I L V W); extracted values: 046 054 -020 049 -001 006 057 001 003 -008 052 018 010 -001 -004]
(Yungok Ihm)

12 Summary of Ho Threading Procedure
Template Structure → Contact Matrix: $C_{ij} = 1$ if $r_{ij} < 6.5$ Å; $C_{ij} = 0$ otherwise (or if $j$ is a neighbor of $i$ in sequence), for $i, j = 1, \dots, N$
Sequence (AVFMRIHNDIVYNDIANTTQ) → Sequence Vector
Contact Matrix + Sequence Vector → Scoring Function: Contact Energy
(Yungok Ihm)
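A contact matrix of this kind can be built from residue coordinates roughly as follows. This is a sketch, not the authors' code: the function name, the use of one representative point per residue, the default cutoff value, and the `min_separation` parameter (which excludes sequence neighbors) are all assumptions made for illustration.

```python
import math


def contact_matrix(coords, cutoff=6.5, min_separation=2):
    """Return a symmetric 0/1 contact matrix.

    coords: one (x, y, z) tuple per residue, in Å (assumed: one point per residue).
    cutoff: distance threshold in Å (default value is an assumption).
    min_separation: pairs closer than this in sequence are never contacts,
                    so direct sequence neighbors stay 0.
    """
    n = len(coords)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Toy coordinates: three residues in a line plus one folded back next to them.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
for row in contact_matrix(toy_coords):
    print(row)
```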

13 Can complexity be further reduced? Consider simplifying structure representation, too (target: ALKKGF…HFDTSE)
Sequence – Structure (1D – 3D problem)
Sequence – Contact Matrix (1D – 2D problem)
Sequence – 1D Profile (1D – 1D problem)
(Haibo Cao)

14 Represent contact matrix by its dominant eigenvector (1D profile)
First eigenvector (with highest eigenvalue) dominates the overlap between sequence and structure.
Higher ranking (rank > 4) eigenvectors are "sequence blind".
(Haibo Cao)
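One simple way to obtain the dominant eigenvector of a contact matrix is power iteration. The sketch below is an illustration only; the slides do not say how the eigenvector was computed. The function name and iteration count are assumptions, and the iteration is applied to (M + I) rather than M, which leaves the eigenvectors unchanged but avoids oscillation when the matrix has eigenvalues of equal magnitude and opposite sign.

```python
def dominant_eigenvector(matrix, iterations=200):
    """Approximate the eigenvector with the largest eigenvalue by power iteration.

    Iterates with (M + I): same eigenvectors as M, eigenvalues shifted by +1.
    Returns a unit-length vector with non-negative entries for a connected
    contact matrix.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        # w = (M + I) v
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            return v
        v = [x / norm for x in w]
    return v


# Toy contact matrix: residue 1 touches residues 0 and 2.
toy_matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
profile = dominant_eigenvector(toy_matrix)
print([round(x, 4) for x in profile])
```

In this toy case the middle residue, which has the most contacts, gets the largest profile value, which is the sense in which the 1D profile summarizes the 2D contact matrix.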

15 Threading Alignment Step - now fast!
Align target sequence vector (1D) with eigenvector profile of template structure (1D).
Maximize the overlap between the sequence (S) and the 1D profile (P), allowing gaps.
Calculate contact energy using the alignment: $E_c$
[Figure labels: 1D Profile, New profile]
Cao et al (2004) Polymer 45
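Maximizing the overlap between two 1D vectors while allowing gaps is a standard dynamic-programming alignment. The sketch below uses a Needleman-Wunsch style recurrence with the product $S_i P_j$ as the match score and a single uniform gap penalty. All of this is an assumption for illustration: the slides describe position-dependent gap penalties (and size penalties) whose values are not given, and the function name is hypothetical.

```python
def best_overlap(seq_vec, profile, gap=0.5):
    """Best gapped global alignment score between a sequence vector and a 1D profile.

    Match score for aligning seq_vec[i] with profile[j] is their product;
    every gap position costs `gap` (a uniform penalty, assumed here).
    """
    n, m = len(seq_vec), len(profile)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -gap * i
    for j in range(1, m + 1):
        score[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + seq_vec[i - 1] * profile[j - 1],  # align i with j
                score[i - 1][j] - gap,  # gap in the profile
                score[i][j - 1] - gap,  # gap in the sequence
            )
    return score[n][m]


# Toy example: the middle sequence position is skipped so both 1s overlap.
print(best_overlap([1, 0, 1], [1, 1]))
```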

16 Parameters for alignment?
Gap penalty: insertion/deletion in helices or strands is strongly penalized; smaller penalties for in/dels in loops. Gap penalties apply to alignment score only, not to energy calculation.
Size penalty: if a target residue and the aligned template residue differ in radius by > 0.5 Å and the residue is involved in > 2 contacts, the alignment is penalized. Size penalties apply to alignment score only, not to energy calculation.
[Figure: ALKKGFG…HFDTSE aligned to template regions labeled Loop, Helix]
(Yungok Ihm)

17 How to incorporate secondary structure?
Predict secondary structure of target sequence (PSIPRED, PROF, JPRED, SAM, GOR V).
$N_+$ = total number of matches between predicted secondary structure of target & actual secondary structure of template
$N_-$ = total number of mismatches
$N_s$ = total number of residues selected in alignment
"Global fitness": $f = 1 + (N_+ - N_-) / N_s$
$E_{mod} = f \cdot E_{threading}$
(Yungok Ihm)
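The global fitness factor is simple arithmetic once the aligned secondary-structure strings are in hand. The sketch below is illustrative: the function names, the one-letter H/E/C states, and the convention that '-' marks an unaligned position (skipped when counting $N_s$) are assumptions.

```python
def global_fitness(predicted_ss, template_ss):
    """f = 1 + (N_plus - N_minus) / N_s for two aligned secondary-structure strings.

    Positions where either string has '-' are treated as not selected (assumed).
    """
    pairs = [(p, t) for p, t in zip(predicted_ss, template_ss) if p != "-" and t != "-"]
    n_s = len(pairs)
    if n_s == 0:
        raise ValueError("no aligned residues")
    n_plus = sum(1 for p, t in pairs if p == t)
    n_minus = n_s - n_plus
    return 1 + (n_plus - n_minus) / n_s


def modified_energy(e_threading, predicted_ss, template_ss):
    """E_mod = f * E_threading."""
    return global_fitness(predicted_ss, template_ss) * e_threading


# 5 matches, 1 mismatch, 6 aligned residues: f = 1 + 4/6.
print(global_fitness("HHHEEC", "HHHECC"))
print(modified_energy(-10.0, "HHHEEC", "HHHECC"))
```

With this definition f ranges from 0 (every aligned position mismatches) to 2 (every position matches), so a perfect secondary-structure fit doubles the threading score.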

18 How much better is this "fit" than random?
$E_{relative} = E_{mod} - E_{shuffled}$
$E_{mod}$: E score modified to reflect fit with predicted secondary (2°) structure
$E_{shuffled}$: shuffled sequence vs structure; avg E score for the same sequence shuffled (randomized) many times
(Yungok Ihm)
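The shuffled-sequence baseline can be sketched as follows. Here `energy_fn` is a hypothetical stand-in for whatever computes the modified threading energy of a sequence against a fixed template; the function name, the number of shuffles, and the fixed random seed are assumptions for illustration.

```python
import random


def relative_energy(sequence, energy_fn, n_shuffles=100, seed=0):
    """E_relative = E(sequence) - mean E over shuffled copies of the sequence.

    energy_fn: callable taking a sequence string and returning a score
               (hypothetical stand-in for the threading energy).
    Shuffling preserves amino acid composition and destroys residue order.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        total += energy_fn("".join(residues))
    return energy_fn(sequence) - total / n_shuffles


# A score that depends only on composition cannot beat its shuffled baseline.
def count_alanines(seq):
    return seq.count("A")


print(relative_energy("AAKK", count_alanines))
```

The toy score above depends only on composition, so its relative energy is zero; only scores that depend on residue order can stand out against the shuffled average.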

19 Performance Evaluation?
"Blind Test": CASP5 Competition (CASP7 is most recent); CASP = Critical Assessment of Protein Structure Prediction
Given: amino acid sequence
Goal: predict 3-D structure (before experimental results published)

20 Typical Results (well, actually, our BEST results): HO = #1-ranked CASP5 prediction for this target
Target 174, PDB ID = 1MG7
[Figure: Actual Structure vs Predicted Structure; labels T174_1, T174_2]
Cao, Ihm, Wang, Dobbs, Ho

21 Overall Performance in CASP5 Contest: ~8th out of 180
FR Fold Recognition (targets manually assessed by Nick Grishin)
Rank  Z-Score  Ngood  Npred  NgNW  NpNW  Group-name
1     24.26    9.00   12.00  9     12    Ginalski
2     21.64    7.00   12.00  7     12    Skolnick Kolinski
3     19.55    8.00   12.50  9     14    Baker
4     16.88    6.00   10.00  6     10    BIOINFO.PL
5     15.25    7.00   7.00   7     7     Shortle
6     14.56    6.50   11.50  7     13    BAKER-ROBETTA
7     13.49    4.00   11.00  4     11    Brooks
8     11.34    3.00   6.00   3     6     Ho-Kai-Ming
9     10.45    3.00   5.50   3     6     Jones-NewFold
FR NgNW - number of good predictions without weighting for multiple models
FR NpNW - number of total predictions without weighting for multiple models
(M. Levitt, Stanford)

22 CASP - Check it out!
Critical Assessment of Protein Structure Prediction: http://predictioncenter.gc.ucdavis.edu/
CASP7 contest - 2006: http://www.predictioncenter.org/casp7/Casp7.html
Provides assessment of automated servers for protein structure prediction (LiveBench, CAFASP, EVA) & URLs for them
Related contests & resources:
- Protein Function Prediction (part of CASP)
- CAPRI = Critical Assessment of Predicted Interactions
- New: CASPM = CASP for M = Mutant proteins; predict effects of small (point) mutations, e.g., SNPs

23 Another Convenient List of Links for Protein Prediction Servers: http://en.wikipedia.org/wiki/List_of_protein_structure_prediction_software

24 Chp 13 - Protein Structure Visualization, Comparison & Classification
SECTION V STRUCTURAL BIOINFORMATICS
Xiong: Chp 13 Protein Structure Visualization, Comparison & Classification
- Protein Structural Visualization
- Protein Structure Comparison
- Protein Structure Classification

25 Protein Structure Comparison Methods
3 Basic Approaches for Aligning Structures (see Xiong textbook for details):
1. Intermolecular
2. Intramolecular
3. Combined
But this is a very active research area, with many recent new methods.
3 Popular Methods:
- DALI = Distance Matrix Alignment of Structures (Holm); FSSP Database
- SSAP = Sequential Structure Alignment Program (Orengo); CATH Database
- CE = Combinatorial Extension (Bourne); VAST at NCBI
URLs: http://en.wikipedia.org/wiki/Structural_alignment_software

26 Another local example: Combining Structure Prediction, Machine Learning & "Real" (wet-lab) Experiments to Investigate the Lentiviral Rev Protein: A Step Toward New HIV Therapies
Susan Carpenter (Washington State Univ); Wendy Sparks; Yvonne Wannemuehler
Drena Dobbs, GDCB; Jae-Hyung Lee; Michael Terribilini
Kai-Ming Ho, Physics; Yungok Ihm; Haibo Cao; Cai-zhuang Wang
Gloria Culver, BBMB; Laura Dutca

27 Macromolecular interactions mediated by Rev protein in lentiviruses (HIV & EIAV)
[Figure: Rev pathway between nucleus and cytoplasm. Labels: Provirus; pre-mRNA; Spliceosome; Early: Regulatory Proteins (Tat, Rev); NUCLEAR IMPORT; RNA BINDING; MULTIMERIZATION; NUCLEAR EXPORT; (protein-RNA); (protein-protein); Late: Structural Proteins; Progeny RNA]
(Susan Carpenter)

28 Rev is essential for lentiviral replication
- Rev is a small nucleocytoplasmic shuttling protein (HIV Rev 116 aa; EIAV Rev 165 aa)
- Recognizes a specific binding site on viral RNA: Rev Responsive Element (RRE)
- Interacts with CRM1 to export incompletely spliced viral RNAs from the nucleus to the cytoplasm
- Specific domains of Rev mediate nuclear localization, RNA binding, and nuclear export
- Critical role of Rev in lentiviral replication makes it an attractive target for antiviral (AIDS) therapy

29 Problem: no high resolution Rev structure!
Not even for HIV Rev, despite intense effort ($$).
Why?? Rev aggregates at concentrations needed for NMR or X-ray crystallography.
What about insights from sequence comparisons? "Undetectable" sequence similarity among Revs from different lentiviruses (e.g., EIAV vs HIV <10%).
But: we know that lentiviral Rev proteins are functionally "homologous", even in highly diverse lentiviruses.

30 Hypothesis: Rev proteins from diverse lentiviruses share structural features critical for function
Approach:
- Computationally model structures of lentiviral Rev proteins, using structural threading algorithm (with Ho et al)
- Predict critical residues for RNA-binding, protein interaction, using machine learning algorithms (with Honavar et al)
- Test model and predictions, using genetic/biochemical approaches (with Carpenter & Culver) and biophysical approaches (with Andreotti & Yu groups)
Initially: focus on EIAV Rev & RRE

31 Functional domains: EIAV vs HIV Rev
[Figure: domain maps. HIV-1 Rev (residues 1-116): NES; NLS/RBM (RQARRNRRRRWR). EIAV Rev (exon 1, exon 2; residues 1, 31, 165 marked): labels NES, NLS, RBM, Folding ?; motifs RRDRW, ERLE, KRRRK]
NES - Nuclear Export Signal; NLS - Nuclear Localization Signal; RBM - putative RNA Binding Motif

32 Predicted EIAV Rev Structure [Figure] (Yungok Ihm)

33 Comparison of Predicted Rev Structures [Figure panels: EIAV, HIV, FIV, SIV Dimer, HIV Dimer] (Yungok Ihm)

34 Predicted vs Experimental Structure of N-terminal region of HIV Rev
A: Predicted Structure, HIV Rev N-terminus
B: NMR Structure, HIV Rev N-terminal Peptide (Battiste & Williamson)
C: Overlay Alignment of Predicted & NMR Structures
(Yungok Ihm)

35 Location of functional residues in EIAV Rev?
[Figure labels: Putative RBM; NES]
Leu36, 45, 49: on surface, consistent with role in nuclear export
Leu95 & Leu109: buried in core, critical hydrophobic contacts for fold?
(Yungok Ihm)

36 Mutate hydrophobic residues predicted to be critical for helical packing in core: L65 vs L95 & L109
Single mutants: Leu to Ala, Leu to Asp; double mutants: Leu to Ala
- Single Ala mutation (L → A): negligible effect on Rev activity
- Single Asp mutation (L → D), inserts charged aa in hydrophobic core: dramatic change in Rev activity?
- Double Ala mutation (L, L → A, A): reduction in Rev activity?
(Yungok Ihm)

37 Functional Analysis of Rev Structural Mutants in vivo (CAT assay)
[Figure: Activity of Rev Structural Mutants; labels include Sham, RI, pcDNA3]
(Wendy Sparks)

38 Functional domains: EIAV vs HIV Rev
[Figure: domain maps. HIV-1 Rev (residues 1-116): NES; NLS/RBM (RQARRNRRRRWR). EIAV Rev: labels NES, NLS, RBM, Folding ?; motifs RRDRW, ERLE, KRRRK. Color key (Green, Red) for RNA interaction and protein interaction]
NES - Nuclear Export Signal; NLS - Nuclear Localization Signal; RBM - putative RNA Binding Motif

39 Putative RNA-binding Motifs & Predicted RNA-binding Residues Mapped onto Predicted EIAV Rev Structure
EIAV Rev residues 31-165 (putative motifs: RRDRW, ERLE, KRRRK):
31  DPQGPLESDQ WCRVLRQSLP EEKISSQTCI
61  ARRHLGPGPT QHTPSRRDRW IREQILQAEV LQERLEWRIR
101 GVQQVAKELG EVNRGIWREL
121 HFREDQRGDF SAWGDYQQAQ ERRWGEQSSP RVLRPGDSKR RRKHL
[Figure: '+' marks under the sequence flag predicted RNA-binding residues]
(Michael Terribilini, Yungok Ihm)

40 Express & purify MBP-ERev deletion mutants
[Figure: gel; marker sizes 60, 42, 30, 22; lanes: Marker, MBP, 1-165, 31-165, 31-145, 57-165, 57-145, 57-124, 125-165, 146-165]
[Figure: MBP-ERev construct map with boundaries at residues 1, 31, 57, 125, 146, 165; labels NES, NLS, RBM, Folding?]
(Jae-Hyung Lee)

41 MBP-ERev binds specifically to RRE in vitro
[Figure: UV crosslinking and competition assays with 32P-RRE; labels: sense, antisense, Undigested, No protein, No cold RRE, Cold RRE; proteins: BSA, MBP, 1-165, 31-165]
(Jae-Hyung Lee)

42 EIAV Rev: Binding Predictions vs Experiments
PREDICTED: structure; protein binding residues; RNA binding residues (KRRRK, RRDRW)
VALIDATED: protein binding residues; RNA binding residues
[Figure: '+' marks along EIAV Rev sequence segments (tick labels as on the slide):
41, 51: GPLESDQWCRVLRQSLPEEKISSQTCI
61, 71, 81, 91: ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
131, 141, 151, 161: QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL]
[Figure: constructs MBP, WT, 31-165, 31-145, 57-165, 145-165 on a domain map with boundaries 31, 57, 125, 145, 165; labels NES, FOLD?, NLS/RBM, RBM; motifs RRDRW, ERLE, KRRRK]
Lee et al (2006) J Virol 80:3844; Terribilini et al (2006) PSB 11:415
(Jae-Hyung Lee)

43 Roles of Putative RNA Binding Motifs?
[Figure: EIAV Rev domain map with boundaries 1, 31, 57, 124, 146, 165; labels NES, NLS, RBD; motifs RRDRW, ERLE, KRRRK; mutants AADAA, AALA, KAAAK, ERDE]
(Jae-Hyung Lee)

44 Rev RNA Binding Motifs: Predicted vs Experiment
PREDICTED: structure; protein binding residues; RNA binding residues (KRRRK, RRDRW)
VALIDATED: protein binding residues; RNA binding residues
[Figure: '+' marks along EIAV Rev sequence segments (tick labels as on the slide):
41, 51: GPLESDQWCRVLRQSLPEEKISSQTCI
61, 71, 81, 91: ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
131, 141, 151, 161: QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL]
[Figure: WT and mutants KAAAK, AADAA, AALA, ERDE on a domain map with boundaries 31, 57, 125, 145, 165; labels NES, NLS, RBM, FOLD?, NLS/RBM; motifs RRDRW, ERLE, KRRRK]
(Jae-Hyung Lee)

45 Summary: Predictions vs Experiments
[Figure: '+' marks along EIAV Rev sequence segments (tick labels as on the slide):
41, 51: GPLESDQWCRVLRQSLPEEKISSQTCI
61, 71, 81, 91: ARRHLGPGPTQHTPSRRDRWIREQILQAEVLQERLEWRI
131, 141, 151, 161: QRGDFSAWGDYQQAQERRWGEQSSPRVLRPGDSKRRRKHL]
[Figure: domain map with boundaries 31, 57, 125, 145, 165; labels NES, FOLD, NLS/RBM, RBM; motifs RRDRW, ERLE, KRRRK]
Lee et al (2006) J Virol 80:3844; Terribilini et al (2006) PSB 11:415

46 Conclusions & Future Directions
Combination of computational & wet lab approaches revealed that:
- EIAV Rev has a bipartite RNA binding domain
- Two Arg-rich RBMs are critical: RRDRW in central region (but not ERLE); KRRRK at C-terminus, overlapping the NLS
- Based on computational modeling, the RBMs are in close proximity within the 3-D structure of the protein
- Lentiviral Rev proteins & their cognate RRE binding sites may be more similar in structure than has been appreciated
Lee et al (2006) J Virol 80:3844; Terribilini et al (2006) PSB 11:415
Future:
- Computational: use Rev-RRE model system to discover "predictive rules" for protein-RNA recognition
- Experimental?

47 Experimentally determine the structure of Rev-RRE complex !!!

48 Building "Designer" Zinc Finger DNA-binding Proteins
J Sander, P Zaback, F Fu, J Townsend, R Winfrey, D Wright, K Joung, L Miller, D Dobbs, D Voytas
Wright et al (2006) Nature Protocols; Sander et al (2007) Nucleic Acids Res

49 Chp 16 - RNA Structure Prediction
SECTION V STRUCTURAL BIOINFORMATICS
Xiong: Chp 16 RNA Structure Prediction (Terribilini)
- RNA Function
- Types of RNA Structures
- RNA Secondary Structure Prediction Methods
- Ab Initio Approach
- Comparative Approach
- Performance Evaluation

50 RNA Function
- Storage/transfer of genetic information
- Newly discovered regulatory functions - RNAi pathways especially
- Catalytic

51 RNA types & functions
Types of RNAs: Primary Function(s)
- mRNA - messenger: translation (protein synthesis); regulatory
- rRNA - ribosomal: translation (protein synthesis)
- t-RNA - transfer: translation (protein synthesis)
- hnRNA - heterogeneous nuclear: precursors & intermediates of mature mRNAs & other RNAs
- scRNA - small cytoplasmic: signal recognition particle (SRP); tRNA processing
- snRNA - small nuclear: mRNA processing, poly A addition
- snoRNA - small nucleolar: rRNA processing/maturation/methylation
- regulatory RNAs (siRNA, miRNA, etc.): regulation of transcription and translation, other??

52 RNA Structure
- RNA forms complex 3D structures
- Mainly single stranded
- The single RNA strand can self-hybridize to form base paired regions

53 Levels of RNA Structure
Like proteins, RNA has primary, secondary, and tertiary structures:
- Primary structure - base sequence
- Secondary structure - single stranded or base paired
- Tertiary structure - 3D structure
(Rob Knight, Univ Colorado)

54 RNA Structure Prediction
- RNA tertiary structure is very difficult to predict
- Focus on predicting RNA secondary structure
- Given an RNA sequence, predict the secondary structure of the molecule
- Almost all methods ignore higher order secondary structures like pseudoknots

55 Base Pairing in RNA
G-C, A-U, G-U ("wobble") & variants
See: IMB Image Library of Biological Molecules, http://www.fli-leibniz.de/ImgLibDoc/nana/IMAGE_NANA.html#basepairs

56 Common structural motifs in RNA
- Helices
- Loops: Hairpin, Interior, Bulge, Multibranch
- Pseudoknots
Fig 6.2 Baxevanis & Ouellette 2005

57 RNA Secondary Structure Prediction Methods
Two main types of methods:
- Ab initio - based on calculating the most energetically favorable secondary structure
- Comparative approach - based on evolutionary comparison of multiple related RNA sequences

58 Ab Initio Prediction
- Only requires a single RNA sequence
- Calculates minimum free energy structure
- Base pairing lowers free energy of the structure, so methods attempt to find secondary structure with maximal base pairing

59 Ab Initio Prediction
- Free energy is calculated based on parameters determined in the wet lab
- Known energy associated with each type of base pair
- Base pair formation is not independent: multiple base pairs adjacent to each other are more favorable than individual base pairs (cooperative)
- Bulges and loops adjacent to base pairs have a free energy penalty

60 Ab Initio Energy Calculation Method
- Search for all possible base-pairing patterns
- Calculate the total energy of the structure based on all stabilizing and destabilizing forces
Fig 6.3 Baxevanis & Ouellette 2005

61 Dot Matrices
- Can be used to find all possible base pair patterns
- Compare the input sequence to itself and put a dot anywhere there is a complementary base
(R Knight 2005)
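A self-comparison dot matrix is easy to sketch. In the illustration below the function names are hypothetical, and the allowed pairs are limited to G-C, A-U, and the G-U wobble pair; other pairing variants are ignored.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (other variants ignored here).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True if bases a and b can form one of the allowed pairs."""
    return (a.upper(), b.upper()) in PAIRS


def dot_matrix(seq):
    """Compare the sequence with itself; entry [i][j] is 1 where base i can pair with base j."""
    n = len(seq)
    return [[1 if can_pair(seq[i], seq[j]) else 0 for j in range(n)] for i in range(n)]


def render(seq):
    """Return a text picture of the dot matrix, one row per base."""
    lines = ["  " + " ".join(seq)]
    for base, row in zip(seq, dot_matrix(seq)):
        lines.append(base + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


print(render("GGGAAAUCC"))
```

In the printed matrix, a run of dots along an anti-diagonal corresponds to a stretch of consecutive bases that could pair with a complementary stretch, that is, a candidate helix.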

62 Dynamic Programming
- Finding the best possible secondary structure is difficult - lots of possibilities
- Compare RNA sequence with itself
- Apply scoring scheme based on energy parameters for base pairs, cooperativity, and penalties for destabilizing forces
- Find path that represents the most energetically favorable secondary structure
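The dynamic-programming idea can be illustrated with a Nussinov-style recurrence that simply maximizes the number of base pairs. This is a deliberate simplification swapped in for the energy-based scoring described above: it uses no thermodynamic parameters, no cooperativity term, and no loop penalties beyond a minimum hairpin loop size. The function names and the `min_loop` default are assumptions.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a.upper(), b.upper()) in PAIRS


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (no pseudoknots).

    dp[i][j] holds the best count for the subsequence seq[i..j].
    min_loop is the minimum number of unpaired bases inside a hairpin.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case 2: base i pairs with base k
                if can_pair(seq[i], seq[k]):
                    inside = dp[i + 1][k - 1]
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]


# A small hairpin: three stacked pairs closing a three-base loop.
print(max_base_pairs("GGGAAAUCC"))
```

Replacing the "+1 per pair" score with stacking and loop energies, and maximizing with minimizing, gives the general shape of the energy-based methods, though real implementations need additional tables to handle the different loop types.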

63 Problem
- DP returns the SINGLE best structure
- There may be many structures with similar energies
- Also, your predicted secondary structure is only as good as the energy parameters used
Solution - return multiple structures with near optimal energies

64 Popular Ab Initio Prediction Programs
Mfold
- Combines DP with thermodynamic calculations
- Fairly accurate for short sequences, less accurate as sequence length increases
RNAfold
- Returns multiple structures near the optimal structure
- Computes a larger number of potential secondary structures than Mfold, so it uses a simplified energy function

65 Comparative Approach
- Uses multiple sequence alignment
- Assumes related sequences fold into the same secondary structure

66 Covariation
- RNA functional motifs are conserved
- To maintain RNA structure during evolution, a mutation in a base paired residue must be compensated for by a mutation in the base that it pairs with
- Comparative methods search for covariation patterns in MSA
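A naive covariation scan over an alignment might look like the sketch below. It is only an illustration: the function names are hypothetical, the alignment is assumed to be gap-free with equal-length rows, and real comparative methods use statistical measures (such as mutual information) rather than this all-or-nothing test.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a.upper(), b.upper()) in PAIRS


def covarying_columns(alignment):
    """Return column pairs (i, j) that look like compensated base pairs.

    A pair of columns qualifies when every sequence has pairable bases at
    (i, j) and at least two different pair types occur across the alignment.
    """
    length = len(alignment[0])
    hits = []
    for i in range(length):
        for j in range(i + 1, length):
            observed = {(seq[i], seq[j]) for seq in alignment}
            if len(observed) > 1 and all(can_pair(a, b) for a, b in observed):
                hits.append((i, j))
    return hits


# Toy alignment: the first and last columns change together (G-C, C-G, A-U).
toy_alignment = ["GAAAC", "CAAAG", "AAAAU"]
print(covarying_columns(toy_alignment))
```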

67 Consensus Structures
- Predict secondary structure of each individual sequence
- Compare all structures and see if there is a most common structure

68 Popular Comparative Prediction Programs
Two types:
- Require user to provide MSA
- No MSA required

69 RNAalifold
- Requires user to provide the MSA
- Creates a scoring matrix combining minimum free energy and covariation information
- DP is used to select the minimum free energy structure

70 Foldalign
- User provides a pair of unaligned RNA sequences
- Foldalign constructs alignment then computes a commonly conserved structure
- Suitable only for short sequences

71 Dynalign
- User provides two input sequences
- Dynalign calculates possible secondary structures using algorithm similar to Mfold
- Dynalign compares multiple structures from both sequences to find a common structure

72 Performance Evaluation
- Ab initio methods achieve correlation coefficient of 20-60%
- Comparative approaches achieve correlation coefficient of 20-80%
- Programs that require user to supply MSA are more accurate
- Comparative programs are consistently more accurate than ab initio programs
- Base-pairs predicted by comparative sequence analysis for large & small subunit rRNAs are 97% accurate when compared with high resolution crystal structures! (Gutell, Pace)
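The slides do not say how the correlation coefficient is computed. One convention used in RNA structure benchmarks, assumed here, is to compare predicted and reference base pairs and approximate the Matthews correlation coefficient by the geometric mean of sensitivity and positive predictive value. The function name and the representation of a structure as a set of (i, j) index pairs are illustrative choices.

```python
def pair_metrics(predicted, reference):
    """Compare two structures given as collections of (i, j) base-pair index tuples.

    Returns (sensitivity, ppv, approx_cc), where approx_cc is the geometric
    mean of sensitivity and PPV (an assumed approximation to the Matthews
    correlation coefficient).
    """
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)  # pairs predicted and present in the reference
    fp = len(predicted - reference)  # predicted pairs absent from the reference
    fn = len(reference - predicted)  # reference pairs that were missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    approx_cc = (sensitivity * ppv) ** 0.5
    return sensitivity, ppv, approx_cc


# Toy comparison: two of three predicted pairs agree with the reference.
toy_predicted = [(0, 8), (1, 7), (2, 6)]
toy_reference = [(0, 8), (1, 7), (3, 6)]
print(pair_metrics(toy_predicted, toy_reference))
```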

