Bioinformatics Master Course II: DNA/Protein structure-function analysis and prediction
Lecture 12: DNA/RNA structure
Centre for Integrative Bioinformatics VU
Biological Functions of Nucleic Acids
DNA → (transcription) → mRNA → (translation) → Protein
transcription + translation = expression
- tRNA (transfer RNA, adaptor in translation)
- rRNA (ribosomal RNA, component of the ribosome)
- snRNA (small nuclear RNA, component of the spliceosome)
- snoRNA (small nucleolar RNA, takes part in processing of rRNA)
- RNase P (ribozyme, processes tRNA)
- SRP RNA (RNA component of the signal recognition particle)
- …
Eukaryotes have spliced genes… DNA makes RNA makes Protein
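The flow "DNA makes RNA makes Protein" can be illustrated with a minimal Python sketch. It assumes the standard genetic code, an input that is already the spliced coding (sense) strand written 5’ to 3’, and translation starting at the first base; splicing itself is not modelled. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Standard genetic code, packed in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding (sense) strand: T is replaced by U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        codon = mrna[pos:pos + 3].replace("U", "T")  # table is keyed by DNA codons
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy sequence, not a real gene.
mrna = transcribe("ATGGCCATTGTAATGGGCCGCTGA")
print(mrna)
print(translate(mrna))
```

Packing the code table as a 64-character string keeps the sketch short; a real pipeline would more likely rely on an established library.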
Some facts about human genes
- Comprise about 3% of the genome
- Average gene length: ~8,000 bp
- Average of 5-6 exons/gene
- Average exon length: ~200 bp
- Average intron length: ~2,000 bp
- ~8% of genes have a single exon
- Some exons can be as small as 1 or 3 bp
- HUMFMR1S is a typical gene: 17 exons bp long, comprising 3% of a 67,000 bp gene
- The human factor VIII gene (whose mutations cause hemophilia A) is spread over ~186,000 bp. It consists of 26 exons ranging in size from 69 to 3,106 bp, and its 25 introns range in size from 207 to 32,400 bp. The complete gene comprises ~9 kb of exon and ~177 kb of intron.
- The biggest human gene known so far is the one for dystrophin. It has >30 exons and is spread over 2.4 million bp.
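The factor VIII figures can be checked with a few lines of arithmetic. The sketch below uses only the rounded numbers quoted on the slide; the helper names are hypothetical.

```python
def exon_fraction(exon_bp: int, intron_bp: int) -> float:
    """Fraction of a gene's length that is exon sequence."""
    return exon_bp / (exon_bp + intron_bp)


def intron_count(n_exons: int) -> int:
    """Introns separate consecutive exons, so there is one fewer than exons."""
    return n_exons - 1


# Rounded factor VIII values from the slide: ~9 kb exon, ~177 kb intron, 26 exons.
gene_bp = 9_000 + 177_000
print(gene_bp)                                  # 186000
print(intron_count(26))                         # 25
print(f"{exon_fraction(9_000, 177_000):.1%}")   # 4.8%
```

So only about one base in twenty of this gene ends up in the mature mRNA.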
Nucleic Acid Basics
- Nucleic acids are polymers
- Each monomer consists of three moieties:
  - Nucleotide = a base + a ribose sugar (deoxyribose in DNA) + a phosphate
  - Nucleoside = a base + a ribose sugar
- A base can be one of the five rings:
  - Pyrimidines (cytosine, thymine, uracil)
  - Purines (adenine, guanine)
- Pyrimidines and purines can base-pair (Watson-Crick pairs)
Nucleic Acids As Heteropolymers
- Nucleosides, nucleotides
- Single-stranded DNA
- A single-stranded RNA will have OH groups at the 2’ positions
- Note the directionality of DNA or RNA: 5’ → 3’
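Because strands are written 5’ to 3’ and the two strands of a duplex run antiparallel, the partner of a strand is its reverse complement, not just its complement. A minimal sketch, assuming an unambiguous DNA alphabet (A, C, G, T only); the function name is hypothetical.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, also written 5'->3'."""
    return strand.upper().translate(DNA_COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```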
Stability of base-pairing
- C-G base pairing is more stable than A-T (A-U) base pairing
- The 3rd codon position has freedom to evolve (synonymous mutations)
- Species can therefore optimise their G-C content (e.g. thermophiles are GC rich)
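G-C content, overall and at the third codon position, is straightforward to compute. The sketch below assumes an in-frame coding sequence that starts at codon position 1; the function names (`gc_content`, `gc3_content`) and the example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3_content(cds: str) -> float:
    """G-C fraction at third codon positions of an in-frame coding sequence."""
    third_positions = cds.upper()[2::3]  # bases 3, 6, 9, ... (1-based)
    return gc_content(third_positions)


cds = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical toy sequence
print(round(gc_content(cds), 3))
print(round(gc3_content(cds), 3))
```

Comparing the two values across genomes is one simple way to see how much of a compositional bias is carried by the freely evolving third position.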
DNA compositional biases
- Base composition of genomes:
  - E. coli: 25% A, 25% C, 25% G, 25% T
  - P. falciparum (malaria parasite): 82% A+T
- Translation initiation: ATG (AUG in mRNA) is the near-universal motif indicating the start of translation in a DNA coding sequence.

Genetic diseases: Cystic Fibrosis
- Known since very early on (“Celtic gene”)
- Inherited autosomal recessive condition (Chr. 7)
- Symptoms:
  - Clogging and infection of lungs (early death)
  - Intestinal obstruction
  - Reduced fertility and (male) anatomical anomalies
- The CF gene CFTR has a 3-bp deletion leading to Del508 (loss of Phe) in the 1480-aa protein (an epithelial Cl⁻ channel); the protein is degraded in the ER instead of being inserted into the cell membrane
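A 3-bp deletion removes one residue but leaves the reading frame intact, whereas a deletion whose length is not a multiple of three shifts every downstream codon. The sketch below illustrates this on a hypothetical toy sequence (it is not the CFTR sequence); the helper names are likewise assumptions.

```python
def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based index `start`."""
    return seq[:start] + seq[start + length:]


toy = "ATGAAATTTGGCGTATAA"          # ATG AAA TTT GGC GTA TAA (toy example)
print(codons(toy))
print(codons(delete(toy, 6, 3)))    # in-frame: the TTT codon is lost, the rest is unchanged
print(codons(delete(toy, 6, 1)))    # frameshift: downstream codons all change
```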
Structure Overview of Nucleic Acids
- Unlike the three-dimensional structures of proteins, DNA molecules assume simple double-helical structures independent of their sequences.
- Three kinds of double helix have been observed in DNA: type A, type B, and type Z, which differ in their geometries.
- The double-helical structure is essential to the coding function of DNA.
- Watson (biologist) and Crick (physicist) proposed the double-helix structure in 1953, based on X-ray diffraction data.
- RNA, on the other hand, can adopt structures as diverse as those of proteins, as well as the simple type A double helix.
- The ability to be both informational and structurally diverse suggests that RNA was the prebiotic molecule that could function in both replication and catalysis (the RNA World hypothesis).
- In fact, some viruses encode their genetic material in RNA (e.g. retroviruses).
Three Dimensional Structures of Double Helices
[Figure: A-DNA and A-RNA helices, with the major groove and minor groove labelled]

Forces That Stabilize Nucleic Acid Double Helix
There are two major forces that contribute to the stability of helix formation:
- Hydrogen bonding in base-pairing
- Hydrophobic interactions in base stacking
[Figure: same-strand stacking and cross-strand stacking, strands labelled 5’ and 3’]
Types of DNA Double Helix
- Type A: major conformation of RNA, minor conformation of DNA
- Type B: major conformation of DNA
- Type Z: minor conformation of DNA

Type   Geometry
A      Narrow, tight
B      Wide, less tight
Z      Left-handed, least tight
[Figure: A, B and Z helices, strands labelled 5’ and 3’]
Secondary Structures of Nucleic Acids
- DNA is primarily in duplex form.
- RNA is normally single-stranded and can adopt diverse secondary structures other than the duplex.

Non-B-DNA secondary structures
- Cruciform
- Triple-helical H-DNA
- Slipped DNA
[Figure legend: = Hoogsteen basepair]
More Secondary Structures
- Pseudoknots. Source: Cornelis W. A. Pleij in Gesteland, R. F. and Atkins, J. F. (1993) THE RNA WORLD. Cold Spring Harbor Laboratory Press.

rRNA Secondary Structure Based on Phylogenetic Data
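A pseudoknot can be recognised in a list of base pairs because its pairs cross instead of nesting: for pairs (i, j) and (k, l), the ordering i < k < j < l occurs. The sketch below assumes base pairs are given as 0-based index tuples; the function name and the example pair lists are hypothetical.

```python
def has_pseudoknot(pairs) -> bool:
    """Return True if any two base pairs cross (i < k < j < l)."""
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for index, (i, j) in enumerate(ordered):
        for k, l in ordered[index + 1:]:
            if i < k < j < l:  # second pair opens inside the first and closes outside it
                return True
    return False


hairpin = [(0, 9), (1, 8), (2, 7)]                 # fully nested stem
pseudoknot = [(0, 10), (1, 9), (5, 15), (6, 14)]   # two interleaved stems
print(has_pseudoknot(hairpin))      # False
print(has_pseudoknot(pseudoknot))   # True
```

The pairwise comparison is quadratic in the number of pairs, which is adequate for illustration.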
3D Structures of RNA: Transfer RNA Structures
- Secondary structure of tRNA (labelled: anticodon stem, D loop, TΨC loop, variable loop, anticodon loop)
- Tertiary structure of tRNA

3D Structures of RNA: Ribosomal RNA Structures
- Secondary structure of the large ribosomal RNA
- Tertiary structure of the large ribosomal subunit
Ban et al., Science 289 ( ), 2000
3D Structures of RNA: Catalytic RNA
- Secondary structure of self-splicing RNA
- Tertiary structure of self-splicing RNA

Some structural rules:
- Base-pairing is stabilising
- Unpaired sections (loops) are destabilising
- The 3D conformation, with its tertiary interactions, compensates for this
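The first rule, that base-pairing is stabilising, is the idea behind the simplest secondary-structure predictors. The sketch below is a Nussinov-style dynamic programme that maximises the number of nested base pairs. It is a swapped-in illustration, not a method from this lecture: it counts pairs instead of computing energies, it allows G-U wobble pairs, it assumes a minimum hairpin loop of 3 unpaired bases, and it ignores pseudoknots and tertiary interactions.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs (Nussinov-style recursion)."""
    rna = rna.upper()
    n = len(rna)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]  # best[i][j]: max pairs within rna[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # leave i or j unpaired
            if (rna[i], rna[j]) in PAIRS:
                score = max(score, best[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):  # split into two independent substructures
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCCC"))  # 3: a three-pair stem closing a four-base loop
```

Practical predictors replace the pair count with thermodynamic parameters, so that the destabilising cost of loops is also taken into account.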
Sense/antisense
- Sense/antisense RNA: antisense RNA blocks translation through hybridization with the coding strand
- Sense/antisense peptides: have been used therapeutically
- Sense/antisense proteins: does it make (anti)sense?