Outline Nucleic Acids Basics

Presentation transcript:

Outline
- Nucleic Acids Basics
- The Problem of Predicting Nucleic Acid Structure
- Thermodynamics and Phylogeny Comparison
- A commonly used program for predicting RNA secondary structure: mFOLD
- Predicting RNA Tertiary Structures by Mc-Sym

Biological Functions of Nucleic Acids
DNA -> (transcription) -> mRNA -> (translation) -> Protein
- tRNA (adaptor in translation)
- rRNA (component of ribosome)
- snRNA (small nuclear RNA, component of spliceosome)
- snoRNA (small nucleolar RNA, takes part in processing of rRNA)
- RNase P (ribozyme, processes tRNA)
- SRP RNA (component of signal recognition particle)
- ...

Nucleic Acid Basics
Nucleic acids are polymers. Each monomer consists of three moieties:
- Nucleotide: a base + a ribose sugar + a phosphate
- Nucleoside: a base + a ribose sugar
A base can be one of the five rings (next):

Nucleic Acid Bases Pyrimidines Purines Pyrimidines and Purines Can Base-Pair (Watson-Crick Pairs)

Nucleic Acids As Heteropolymers
Nucleosides, nucleotides. [Figure: single-stranded DNA drawn 5’ to 3’] A single-stranded RNA will have OH groups at the 2’ positions. Note the directionality of DNA or RNA.
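A minimal sketch of what strand directionality means in practice, assuming sequences are plain strings written 5' to 3'. The pairing tables and the function name `reverse_complement` are illustrative and not part of the slides.

```python
# Watson-Crick partners (illustrative lookup tables).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq, pairs=DNA_PAIRS):
    """Return the complementary strand, also written 5' to 3'.

    The partner strand runs antiparallel, so the complement is read
    from the last base of `seq` back to the first.
    """
    return "".join(pairs[base] for base in reversed(seq))


print(reverse_complement("ATGC"))             # DNA strand
print(reverse_complement("AUGC", RNA_PAIRS))  # RNA strand
```

Reversing before complementing is what keeps both strands in the same 5'-to-3' writing convention.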

Structure Overview of Nucleic Acids
Unlike the three-dimensional structures of proteins, DNA molecules assume simple double-helical structures independent of their sequences. Three kinds of double helices have been observed in DNA: type A, type B, and type Z, which differ in their geometries. The double-helical structure is essential to the coding function of DNA. Watson (biologist) and Crick (physicist) first proposed the double-helix structure in 1953, based on X-ray diffraction data.
RNA, on the other hand, can adopt structures as diverse as those of proteins, as well as the simple type A double helix. Being both informational and structurally diverse suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World Hypothesis). In fact, some viruses encode their genetic material in RNA (e.g., retroviruses).

Three Dimensional Structures of Double Helices A-DNA A-RNA Minor Groove Major Groove

Forces That Stabilize Nucleic Acid Double Helix
There are two major forces that contribute to the stability of helix formation:
- Hydrogen bonding in base-pairing
- Hydrophobic interactions in base stacking
[Figure: same-strand stacking and cross-strand stacking, strands drawn 5’ to 3’]

Types of DNA Double Helix
- Type A: major conformation of RNA, minor conformation of DNA. Narrow, tight.
- Type B: major conformation of DNA. Wide, less tight.
- Type Z: minor conformation of DNA. Left-handed, least tight.

3D Structures of RNA: Transfer RNA Structures
[Figure labels: TΨC loop, anticodon stem, variable loop, D loop, anticodon loop]

3D Structures of RNA: Ribosomal RNA Secondary Structure Of large ribosomal RNA Tertiary Structure Of large ribosome subunit Ban et al., Science 289 (905-920), 2000

3D Structures of RNA: Catalytic RNA Tertiary Structure Of Self-splicing RNA Secondary Structure Of Self-splicing RNA

Secondary Structures of Nucleic Acids
DNA is primarily in duplex form. RNA is normally single stranded and can adopt diverse secondary structures other than the duplex.

More Secondary Structures Pseudoknots: Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) THE RNA WORLD. Cold Spring Harbor Laboratory Press. rRNA Secondary Structure Based on Phylogenetic Data

Predicting RNA Secondary Structures
- By thermodynamics method: minimize Gibbs free energy
- By phylogenetic comparison method: compare RNA sequences of identical function from different organisms
- By combination of the above two methods: in principle, this could be the most powerful method

Thermodynamics
Gibbs free energy, G, describes the energetics of biomolecules in aqueous solution. The change in free energy, $\Delta G$, for a chemical process such as nucleic acid folding can be used to determine the direction of the process:
- $\Delta G = 0$: equilibrium
- $\Delta G > 0$: unfavorable process
- $\Delta G < 0$: favorable process
Thus the natural tendency for biomolecules in solution is to minimize the free energy of the entire system (biomolecules + solvent).
$\Delta G = \Delta H - T \Delta S$
$\Delta H$ is the change in enthalpy, $\Delta S$ is the change in entropy, and T is the temperature in kelvin. Molecular interactions, such as hydrogen bonds, van der Waals and electrostatic interactions, contribute to the $\Delta H$ term. $\Delta S$ describes the change of order of the system. Thus, both molecular interactions and the order of the system determine the direction of a chemical process.
For any nucleic acid solution, it is extremely difficult to calculate the free energy from first principles. Biophysical methods can be used to measure free energy changes.
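As a worked illustration of the sign rule, the sketch below evaluates the Gibbs relation (free energy change = enthalpy change minus T times entropy change) at a few temperatures. The enthalpy and entropy values are hypothetical numbers chosen for the example, not measured parameters.

```python
def delta_g(delta_h, delta_s, temp_k):
    """Free energy change dG = dH - T*dS.

    Units assumed here: kcal/mol for dH, kcal/(mol*K) for dS, kelvin for T.
    """
    return delta_h - temp_k * delta_s


# Hypothetical folding parameters for a short hairpin (not measured data).
DH = -40.0    # kcal/mol, favorable interactions
DS = -0.120   # kcal/(mol*K), folding reduces disorder

for t_celsius in (25.0, 37.0, 80.0):
    t_kelvin = t_celsius + 273.15
    dg = delta_g(DH, DS, t_kelvin)
    verdict = "favorable" if dg < 0 else "unfavorable"
    print(f"{t_celsius:5.1f} C: dG = {dg:6.2f} kcal/mol ({verdict})")

# dG = 0 (equilibrium) where T = dH / dS.
t_equilibrium = DH / DS
print(f"dG = 0 at {t_equilibrium:.1f} K")
```

With these assumed values folding is favorable at low temperature and unfavorable at high temperature, because the entropy term grows with T.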

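The Equilibrium Partition Function
For a population of structures, S, a partition function Q and the probability of a particular folding, s, can be calculated:
$Q = \sum_{s \in S} e^{-\Delta G(s)/RT}$, $P(s) = e^{-\Delta G(s)/RT} / Q$
The heat capacity for the RNA can be obtained:
$G = -RT \ln Q$ and $C_p = -T \, \partial^2 G / \partial T^2$
Heat capacity can be measured experimentally.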
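A small numerical sketch of these quantities, assuming Boltzmann weights exp(-dG/RT), an ensemble free energy G = -RT ln Q, and heat capacity taken as -T times the second temperature derivative of G. The four free energies are hypothetical, and treating them as temperature-independent is a simplification made only for this example.

```python
import math

R = 0.0019872  # gas constant in kcal/(mol*K)


def partition_function(energies, temp_k):
    """Q = sum over structures of exp(-dG(s) / RT)."""
    return sum(math.exp(-g / (R * temp_k)) for g in energies)


def probabilities(energies, temp_k):
    """Boltzmann probability of each structure."""
    q = partition_function(energies, temp_k)
    return [math.exp(-g / (R * temp_k)) / q for g in energies]


def ensemble_free_energy(energies, temp_k):
    """G = -RT ln Q."""
    return -R * temp_k * math.log(partition_function(energies, temp_k))


def heat_capacity(energies, temp_k, step=1.0):
    """C = -T * d2G/dT2, estimated with a central finite difference."""
    g_plus = ensemble_free_energy(energies, temp_k + step)
    g_mid = ensemble_free_energy(energies, temp_k)
    g_minus = ensemble_free_energy(energies, temp_k - step)
    return -temp_k * (g_plus - 2.0 * g_mid + g_minus) / step ** 2


# Hypothetical free energies (kcal/mol) of four alternative foldings;
# 0.0 stands for the unfolded chain.
energies = [-5.0, -4.2, -1.0, 0.0]
probs = probabilities(energies, 310.15)
cp = heat_capacity(energies, 310.15)
print([round(p, 4) for p in probs], cp)
```

The lowest-energy folding dominates the ensemble, but the second structure still carries appreciable probability, which is why suboptimal foldings are worth inspecting.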

Energy Minimization Method (mFOLD)
An RNA sequence is written $R = \{r_1, r_2, r_3, \ldots, r_n\}$, where $r_i$ is the i-th ribonucleotide and belongs to the set {A, U, G, C}.
A secondary structure of R is a set S of base pairs i.j that satisfies:
1. $1 \le i < j \le n$;
2. $j - i > 4$ (a loop cannot contain fewer than 4 nucleotides);
3. if i.j and i'.j' are two base pairs (assume $i \le i'$), then either
   - $i = i'$ and $j = j'$ (same base pair),
   - $i < j < i' < j'$ (i.j precedes i'.j'), or
   - $i < i' < j' < j$ (i.j includes i'.j').
   This excludes pseudoknots, for which $i < i' < j < j'$.
If e(i,j) is the energy of base pair i.j, the total energy for R is
$E(S) = \sum_{i.j \in S} e(i,j)$
The objective is to minimize E(S).
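The conditions on S can be checked mechanically. The sketch below is one possible reading of them, with 1-based positions as on the slide. The pair energies in `PAIR_ENERGY` are placeholder integers chosen for illustration, not the measured parameters mFOLD uses.

```python
# Placeholder base-pair energies e(i,j); hypothetical values only.
PAIR_ENERGY = {
    ("G", "C"): -3, ("C", "G"): -3,
    ("A", "U"): -2, ("U", "A"): -2,
    ("G", "U"): -1, ("U", "G"): -1,
}


def is_valid_structure(pairs, n, min_sep=4):
    """Return True if `pairs` (1-based, i < j) is a pseudoknot-free structure."""
    pairs = sorted(set(pairs))          # identical pairs collapse to one
    for i, j in pairs:
        if not (1 <= i < j <= n and j - i > min_sep):
            return False
    for index, (i, j) in enumerate(pairs):
        for i2, j2 in pairs[index + 1:]:
            # Sorting guarantees i <= i2 here.
            if i == i2:
                return False            # one base paired twice
            precedes = j < i2           # i < j < i' < j'
            includes = i2 < j2 < j      # i < i' < j' < j
            if not (precedes or includes):
                return False            # shared base or pseudoknot
    return True


def total_energy(seq, pairs):
    """E(S) = sum of e(i,j) over all base pairs in S."""
    return sum(PAIR_ENERGY[(seq[i - 1], seq[j - 1])] for i, j in pairs)


seq = "GGGAAAAACCC"
hairpin = [(1, 11), (2, 10), (3, 9)]
pseudoknot = [(1, 8), (3, 11)]
print(is_valid_structure(hairpin, len(seq)), total_energy(seq, hairpin))
print(is_valid_structure(pseudoknot, len(seq)))
```

The pseudoknot example fails because its two pairs interleave (i < i' < j < j'), which is exactly the case the definition rules out.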

Free Energy Parameters
An extensive database of free energies has been obtained for the following RNA units (the so-called “Tinoco Rules” and “Turner Rules”):
- Single-strand stacking energy
- Canonical (AU, GC) and non-canonical (GU) base pairs in duplexes
Accurate free energy parameters are still lacking for:
- Loops
- Mismatches (AA, CA, etc.)
Using these energy parameters, the current version of mFOLD can predict ~73% of phylogenetically deduced secondary structures.

Dynamic Programming (mFOLD)
A matrix W(i,j) is computed that depends on the experimentally measured base-pair energies e(i,j). [Figure: an example of W(i,j)]
Once W is filled, the structure is recovered by a traceback that begins with i = 1, j = n:
1. If W(i+1,j) = W(i,j), then i is not paired. Set i = i+1 and start the recursion again.
2. If W(i,j-1) = W(i,j), then j is not paired. Set j = j-1 and start the recursion again.
3. If W(i,j) = W(i,k) + W(k+1,j), the fragment k+1...j is put on a stack and the fragment i...k is analyzed by setting j = k and going back to the beginning of the recursion.
4. If W(i,j) = e(i,j) + W(i+1,j-1), a base pair is identified and added to the list; set i = i+1 and j = j-1.
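The fill and traceback steps can be sketched as follows. This is a simplified base-pair-energy model in the spirit of the slide, not mFOLD's actual loop-based energy rules. The integer energies in `PAIR_ENERGY`, the `MIN_SEP` constant, and the function names are assumptions made for the example.

```python
# Placeholder pair energies e(i,j); integers keep equality tests exact.
PAIR_ENERGY = {
    ("G", "C"): -3, ("C", "G"): -3,
    ("A", "U"): -2, ("U", "A"): -2,
    ("G", "U"): -1, ("U", "G"): -1,
}
MIN_SEP = 4  # a pair i.j requires j - i > 4


def fill(seq):
    """W[i][j] = minimum energy of the fragment i..j (0-based, inclusive)."""
    n = len(seq)
    W = [[0] * n for _ in range(n)]
    for span in range(MIN_SEP + 1, n):
        for i in range(n - span):
            j = i + span
            best = min(W[i + 1][j], W[i][j - 1])       # i or j unpaired
            e = PAIR_ENERGY.get((seq[i], seq[j]))
            if e is not None:
                best = min(best, e + W[i + 1][j - 1])  # i pairs with j
            for k in range(i + 1, j):                  # split into two fragments
                best = min(best, W[i][k] + W[k + 1][j])
            W[i][j] = best
    return W


def traceback(seq, W):
    """Recover one optimal set of base pairs (reported 1-based)."""
    pairs = []
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        while j - i > MIN_SEP:
            e = PAIR_ENERGY.get((seq[i], seq[j]))
            if W[i + 1][j] == W[i][j]:
                i += 1                                  # i is not paired
            elif W[i][j - 1] == W[i][j]:
                j -= 1                                  # j is not paired
            elif e is not None and W[i][j] == e + W[i + 1][j - 1]:
                pairs.append((i + 1, j + 1))            # base pair found
                i, j = i + 1, j - 1
            else:
                for k in range(i + 1, j):
                    if W[i][j] == W[i][k] + W[k + 1][j]:
                        stack.append((k + 1, j))        # handle right part later
                        j = k
                        break
    return sorted(pairs)


seq = "GGGAAAAACCC"
W = fill(seq)
print(W[0][len(seq) - 1], traceback(seq, W))
```

The fill is cubic in sequence length because of the split loop over k, which is what makes the search tractable compared with enumerating every folding.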

Suboptimal Folding (mFOLD)
For any sequence of N nucleotides, the expected number of structures is greater than $1.8^N$. A sequence of 100 nucleotides has about $3 \times 10^{25}$ foldings. If a computer could evaluate 1000 structures per second, enumerating them would take about $10^{15}$ years!
mFOLD generates suboptimal foldings whose free energies fall within a certain range of values. Many of these structures differ only in trivial ways. These suboptimal foldings can still be useful for designing experiments.
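The arithmetic behind these estimates is easy to reproduce. The sketch below simply evaluates 1.8 to the power N for N = 100 and converts the enumeration time to years; the year length of 365.25 days is an assumption of the example.

```python
n = 100
foldings = 1.8 ** n                      # lower bound on the number of structures
rate = 1000                              # structures evaluated per second
seconds_per_year = 365.25 * 24 * 3600

years = foldings / rate / seconds_per_year
print(f"{foldings:.1e} foldings, {years:.1e} years to enumerate")
```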

Running mFOLD
http://bioinfo.math.rpi.edu/~mfold/rna/form1.cgi
Constraints can be entered, each on one line in the constraint box:
- Force bases i, i+1, ..., i+k-1 to be double stranded: F i 0 k
- Force consecutive base pairs i.j, i+1.j-1, ..., i+k-1.j-k+1: F i j k
- Force bases i, i+1, ..., i+k-1 to be single stranded: P i 0 k
- Prohibit the consecutive base pairs i.j, i+1.j-1, ..., i+k-1.j-k+1: P i j k
- Prohibit bases i to j from pairing with bases k to l: P i-j k-l

Fold 5’-CUUGGAUGGGUGACCACCUGGG-3’
[Figure panels: folding with no constraint; folding with the constraint F 1 21 2 entered]

Secondary Structure Prediction for Aligned RNA Sequences
Both energy and RNA sequence covariation can be combined to predict RNA secondary structures.
To quantify sequence covariation, let $f_i(X)$ be the frequency of base X at aligned position i and $f_{ij}(XY)$ be the frequency of finding X at i and Y at j. The mutual information score is (Chiu & Kolodziejczak and Gutell & Woese):
$M_{ij} = \sum_{X,Y} f_{ij}(XY) \log \frac{f_{ij}(XY)}{f_i(X) f_j(Y)}$
If, for instance, only GC and GU pairs occur at positions i and j, then $M_{ij} = 0$.
The total energy for the RNA is set to a linear combination of the measured free energy plus the covariance contribution.
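A short sketch of the mutual information score for two alignment columns, assuming the standard form (joint frequency times the log of the joint frequency over the product of the single-column frequencies, summed over observed base combinations). The base-2 logarithm and the function name are choices made for this example.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two aligned columns.

    col_i and col_j hold the bases seen at positions i and j,
    one character per aligned sequence.
    """
    n = len(col_i)
    f_i = Counter(col_i)
    f_j = Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))
    score = 0.0
    for (x, y), count in f_ij.items():
        p_xy = count / n
        score += p_xy * math.log2(p_xy / ((f_i[x] / n) * (f_j[y] / n)))
    return score


# Position i is always G; position j is C or U: pairs GC and GU only.
print(mutual_information("GGGG", "CUCU"))
# Fully covarying columns: GC, CG, AU, UA.
print(mutual_information("GCAU", "CGUA"))
```

The first case scores zero because position i never varies, so knowing it says nothing about position j, even though every observed combination can base-pair.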

Other Secondary Structure Prediction Methods
Vienna: http://www.tbi.univie.ac.at/~ivo/RNA/
- Uses the same recursive method to search the folding space
- Adds the option of computing the population of RNA secondary structures by the equilibrium partition function
- The specific heat of an RNA can be calculated by numerical differentiation from the equilibrium partition function
RNACAD: http://www.cse.ucsc.edu/research/compbio/ssurrna.html
- An effort to improve multiple RNA sequence alignment by taking into account both primary and secondary structure information
- Uses Stochastic Context-Free Grammars (SCFGs), an extension of the hidden Markov model (HMM) method
Bundschuh, R., and Hwa, T. (1999) RNA secondary structure formation: A solvable model of heteropolymer folding. Physical Review Letters 83, 1479-1482.
- This work treats RNA as a heteropolymer and uses a simplified Go-like model to provide an exact solution for the RNA transition between its native and molten phases.

Predicting RNA 3D Structures
Currently available RNA 3D structure prediction programs make use of the fact that a tertiary structure is built upon preformed secondary structures. So once a solid secondary structure can be predicted, it is possible to predict its 3D structure. The chances of obtaining a valid 3D structure can be increased by known spatial constraints among the different secondary-structure segments (e.g. cross-linking, NMR results). However, there are far fewer thermodynamic data on 3D RNA structures, which makes 3D structure prediction challenging.

Mc-Sym
Mc-Sym uses a “backtracking” method to solve a general problem in computer science called the constraint satisfaction problem (CSP). The backtracking algorithm organizes the search space as a tree in which each node corresponds to the application of an operator. At each application, if the partially folded RNA structure is consistent with its RNA conformational database, the next operator is applied; otherwise the entire attached branch is pruned and the algorithm backtracks to the previous node.
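The backtracking idea can be illustrated with a toy constraint satisfaction problem. This is a generic sketch, not Mc-Sym's implementation: the variables stand in for nucleotides, the integer values stand in for conformers, and the constraints (neighbours one unit apart, first and last within one unit) are invented for the example.

```python
def backtrack(order, domains, consistent, partial=None):
    """Yield every complete assignment, pruning inconsistent branches early."""
    partial = {} if partial is None else partial
    if len(partial) == len(order):
        yield dict(partial)
        return
    var = order[len(partial)]            # next node in the search tree
    for value in domains[var]:
        partial[var] = value
        if consistent(partial):          # otherwise the whole branch is pruned
            yield from backtrack(order, domains, consistent, partial)
        del partial[var]                 # backtrack to the previous node


def consistent(p):
    """Toy constraints on a partial assignment of positions on a line."""
    if 1 in p and p[1] != 0:
        return False                     # anchor the first residue
    for a in (1, 2, 3):
        if a in p and a + 1 in p and abs(p[a] - p[a + 1]) != 1:
            return False                 # neighbours must be one unit apart
    if 1 in p and 4 in p and abs(p[4] - p[1]) > 1:
        return False                     # long-range distance constraint
    return True


order = [1, 2, 3, 4]
domains = {v: range(4) for v in order}
solutions = list(backtrack(order, domains, consistent))
print(solutions)
```

Checking constraints on the partial assignment is what allows a whole subtree to be discarded after a single failed test, rather than after building every complete candidate.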

Mc-Sym (Continued) The selection of a spanning tree for a particular RNA is left to the user, but it is suggested that the nucleotides imposing the most constraints are introduced first Users also supply a particular Mc-Sym “conformation” for each nucleotide. These “conformers” are derived from currently available 3D databases

Mc-Sym (Continued)
Sample script:
    RELATIONS
    SEQUENCE ;;
      1 A r GAAUGCCUGCGAGCAUCCC ;;
    DECLARE
      1 helixA *
      2 helixA *
      3 helixA *
      4 helixA *
      5 helixA *
      6 helixA *
      ……………
      19 helixA *
    RELATIONS ;;
      18 helix * 19
      17 helix * 18
      16 helix * 17
      ……….
      5 helix * 6
      4 helix * 5
      3 helix * 4
      2 helix * 3
      1 helix * 2
    BUILD ;
      19 18 17 16 15 14 13 12 12 11 10 9 8 7 6 5 4 3 2 1
    CONSTRAINTS ;; (enter experimental constraints)
      18 2 3.0

RNA-protein Interactions
- There is currently no computational method that can predict RNA-protein interaction interfaces.
- Statistical methods have been applied to identify structural features at the protein-RNA interface. For instance, ENTANGLE finds that most atoms contributed by a protein to recognizing an RNA are from main chains (C, O, N, H), not from side chains! But much remains to be done.
- Electrostatic potential is of primary importance in protein-RNA recognition because of the negatively charged phosphate backbone. Efforts are being made to quantify the electrostatic potential at the molecular surfaces of a protein and an RNA in order to predict the site of RNA interaction. This often provides a good prediction, at least for the site on the protein.

References
Predicting RNA secondary structures (good reviews):
1. Turner, D. H., and Sugimoto, N. (1988) RNA structure prediction. Annu Rev Biophys Biophys Chem 17, 167-92.
2. Zuker, M. (2000) Calculating nucleic acid secondary structure. Curr Opin Struct Biol 10, 303-10.
Obtaining experimental thermodynamic parameters:
3. Xia, T., SantaLucia, J., Jr., Burkard, M. E., Kierzek, R., Schroeder, S. J., Jiao, X., Cox, C., and Turner, D. H. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry 37, 14719-35.
4. Borer, P. N., Dengler, B., Tinoco, I., Jr., and Uhlenbeck, O. C. (1974) Stability of ribonucleic acid double-stranded helices. J Mol Biol 86, 843-53.
Thermodynamic theory for RNA structure prediction:
5. Bundschuh, R., and Hwa, T. (1999) RNA secondary structure formation: A solvable model of heteropolymer folding. Physical Review Letters 83, 1479-1482.
6. McCaskill, J. S. (1990) The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29, 1105-19.