The response of amino acid frequencies to directional mutation pressure in mitochondrial genomes is related to the physical properties of the amino acids.

Presentation transcript:

The response of amino acid frequencies to directional mutation pressure in mitochondrial genomes is related to the physical properties of the amino acids and to the structure of the genetic code.

Daniel Urbina, Bin Tang, Paul G Higgs. Department of Physics, McMaster University, Hamilton, Ontario L8S 4M1, Canada.

This is the genetic code used in vertebrate mitochondrial DNA. It shows the mapping between the 64 possible codons and the 20 possible amino acids. The shaded boxes are four-codon families. Third-position sites in four-codon families are synonymous (fourfold degenerate): base changes may occur at these sites without changing the amino acid, so selection on them should be negligible, or at least very weak. In contrast, most first- and second-position changes are non-synonymous, so selection should be significant at these sites.

Mitochondria are organelles inside eukaryotic cells. They possess their own genomes, distinct from the main genome in the nucleus. Typical animal mitochondrial genomes contain 13 protein-coding genes, 2 rRNAs and 22 tRNAs.

For each of the 473 species in OGRe, we measured T4 (the frequency of T at the fourfold-degenerate sites), and T1 and T2 (the frequencies of T at first and second positions). T4 varies enormously due to mutational pressure, from less than 10% to more than 90%. T1 and T2 vary almost linearly with T4, but over a narrower range. This shows that both mutation and selection influence T1 and T2. By fitting a mutation-selection model to the data, we can estimate the relative strengths of mutation and selection. The slope for T2 is smaller than that for T1, which shows that selection against second-position substitutions is stronger than selection against first-position substitutions. Similar plots are seen for A, C and G.

Result – The graph shows a strong correlation between Proximity and Responsiveness (R = 0.86, p < ).
This confirms the hypothesis in part 8, and means that physical properties have a direct influence on evolutionary properties.

Summary – The frequencies of bases and amino acids in mitochondrial genomes vary in a complex way due to the action of directional mutation pressure on the DNA and stabilizing selection pressure on the protein sequences. Our model of mutation-selection balance explains the trends seen in these frequencies (see 5, 7 and 8). We developed a measure of similarity between amino acids that enabled us to make quantitative predictions about the responsiveness of the different amino acids to mutation pressure (see 6 and 9). This work also reveals non-random patterns of similarity between neighbouring amino acids in the genetic code that are of interest from the point of view of the evolution of the genetic code itself.

The PCA plot shows that the amino acids F, L, I, M and V in the first column of the genetic code table form a tight cluster with very similar physical properties. The same is true for S, P, T and A in the second column. Most of the third-column amino acids are fairly similar to one another. Surprisingly, the fourth-column amino acids are all very different. There is also no particular similarity between amino acids in the same row of the genetic code. This explains what we saw in part 5: the similarity between amino acids within columns 1, 2 and 3 means that many first-position substitutions are only weakly selected against, whereas the dissimilarity between amino acids in the same row means that second-position substitutions are more strongly selected against.

Filled symbols are data points from fish genomes. Open symbols are derived from a mutation-selection model. The solid and dashed lines are linear regressions through the data and theory points, respectively.

This is the front page of OGRe, our relational database for the comparative analysis of mitochondrial genomes.
OGRe contains information on gene sequences, gene order and genome rearrangements. Please visit OGRe on-line at

Genetic code table (each cell: codon, amino acid, amino acid number):

FIRST    | SECOND POSITION                                 | THIRD
POSITION | T          | C          | A          | G          | POSITION
T        | TTT F    1 | TCT S    6 | TAT Y   10 | TGT C   17 | T
         | TTC F    1 | TCC S    6 | TAC Y   10 | TGC C   17 | C
         | TTA L    2 | TCA S    6 | TAA Stop   | TGA W   18 | A
         | TTG L    2 | TCG S    6 | TAG Stop   | TGG W   18 | G
C        | CTT L    2 | CCT P    7 | CAT H   11 | CGT R   19 | T
         | CTC L    2 | CCC P    7 | CAC H   11 | CGC R   19 | C
         | CTA L    2 | CCA P    7 | CAA Q   12 | CGA R   19 | A
         | CTG L    2 | CCG P    7 | CAG Q   12 | CGG R   19 | G
A        | ATT I    3 | ACT T    8 | AAT N   13 | AGT S   20 | T
         | ATC I    3 | ACC T    8 | AAC N   13 | AGC S   20 | C
         | ATA M    4 | ACA T    8 | AAA K   14 | AGA Stop   | A
         | ATG M    4 | ACG T    8 | AAG K   14 | AGG Stop   | G
G        | GTT V    5 | GCT A    9 | GAT D   15 | GGT G   21 | T
         | GTC V    5 | GCC A    9 | GAC D   15 | GGC G   21 | C
         | GTA V    5 | GCA A    9 | GAA E   16 | GGA G   21 | A
         | GTG V    5 | GCG A    9 | GAG E   16 | GGG G   21 | G

[Table of amino acid physical properties. Columns: Vol., Bulk., Polarity, pI, Hyd.1, Hyd.2, Surface Area, Fract. Area. Rows listed: Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gln (Q), Glu (E), Gly (G), His (H).]

Aims of this project – Here we study the variation in the frequencies of DNA bases in the protein-coding regions of mitochondrial genomes and the corresponding variation in the frequencies of amino acids in the proteins.

Directional mutation pressure in DNA – The rates of mutation between the four bases are usually not equal. This causes a mutational pressure that drives the base frequencies away from 25%. If no selection acts on the DNA, base frequencies will reach an equilibrium determined by mutation. The frequencies of bases at synonymous sites vary enormously in mitochondrial genomes, indicating that mutation pressure varies in direction among species.

Response of amino acid frequencies – Mutation pressure will alter the frequency of usage of codons in gene sequences. This will cause amino acid substitutions in the proteins that will often be deleterious. Selection will therefore oppose variation in the frequencies of bases and amino acids. In mitochondrial sequences, it is observed that amino acid frequencies vary considerably in response to base frequency changes. Mutation pressure is thus strong enough to drive amino acid frequencies away from their optimal values.
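The claim that, without selection, base frequencies settle at an equilibrium set by the mutation rates alone can be illustrated with a small sketch. The function name, the step size and the rate values below are illustrative assumptions, not taken from the poster; the rate matrix is a toy in which mutations towards T are twice as frequent as mutations towards any other base.

```python
BASES = "TCAG"

def equilibrium_frequencies(rates, steps=20000, dt=0.01):
    """Iterate df/dt = gain - loss until the base frequencies stop changing.

    rates[x][y] is the (hypothetical) mutation rate from base x to base y.
    """
    freqs = {b: 0.25 for b in BASES}  # start from equal frequencies
    for _ in range(steps):
        updated = {}
        for b in BASES:
            gain = sum(freqs[a] * rates[a][b] for a in BASES if a != b)
            loss = freqs[b] * sum(rates[b][a] for a in BASES if a != b)
            updated[b] = freqs[b] + dt * (gain - loss)
        freqs = updated
    return freqs

# Toy rates (assumed values): every base mutates to T at rate 2, to others at rate 1.
toy_rates = {x: {y: (2.0 if y == "T" else 1.0) for y in BASES if y != x}
             for x in BASES}
eq = equilibrium_frequencies(toy_rates)
```

With these toy rates the frequency of T rises from 25% to 40% and the other three bases fall to 20% each, so a bias in mutation rates is enough to move base frequencies away from 25% at sites where selection does not act.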
Influence of physical properties – Most observed amino acid substitutions are between amino acids with similar physical properties. Selection acts less strongly against these changes because they have a smaller effect on protein structure and function. Here we will show that the physical properties of the amino acids determine the degree to which amino acid frequencies can respond to mutation pressure.

Strand asymmetry – There is an asymmetry in the replication of the two DNA strands in mitochondrial genomes. The strands are subject to different mutational pressures, and the base frequencies are not equal on the two strands. All the data in this study refer to frequencies on the plus strand of the genome, which codes for the majority of genes.

Image reproduced from

This is a table of 8 physical properties of amino acids that are thought to influence protein folding and function (volume, polarity, hydrophobicity etc.). Using Principal Component Analysis, we projected this 8-dimensional space into two dimensions, so that the similarities between the amino acids can be clearly visualized.

On the left, we show the variation in the frequencies of three amino acids in response to the variation in T4. Serine shows a significant increase; threonine shows a significant decrease; and alanine shows no trend. The direction and magnitude of these trends are influenced by mutations at all three codon positions.

On the right, we show the slope of the linear regression of each of the amino acid frequencies against each of the four base frequencies. The amino acids are numbered in the order they appear in the genetic code diagram (see part 4). Note that serine has two separate blocks and is thus numbered twice.

Key point – The amino acids in the first two columns (numbers 1-9) have large slopes that may be either positive or negative, i.e. they are responsive to mutational pressure. The amino acids in the third and fourth columns (numbers 10-21) have slopes close to zero, i.e. they are non-responsive.

Hypothesis – An amino acid will respond significantly to mutational pressure at the DNA level if there are neighbouring amino acids in the genetic code to which it can mutate that have similar physical properties. If the neighbouring amino acids are very different in properties, selection will oppose these mutations, and the amino acid will not be responsive to mutation pressure.

Squares are data points from fish genomes. Triangles are derived from a mutation-selection model.

Responsiveness – We measured the slope for each amino acid against each of the four base frequencies for two independent data sets (fish and mammals). We define the responsiveness of an amino acid as the root mean square value of these 8 slopes.

Proximity – We define the distance d_ij between any pair of amino acids as the Euclidean distance between them in the 8-dimensional physical property space (after normalizing each property to have unit variance). We then define the proximity of an amino acid as the mean of 1/d_ij over all its neighbouring amino acids (i.e. those accessible by a single mutation in the DNA). A high-proximity amino acid is one whose neighbours have similar physical properties.
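The definitions of responsiveness and proximity can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, "unit variance" is taken to mean division by the population standard deviation, amino acids are identified by their one-letter code (so the two serine blocks are merged, unlike in the poster's numbering), and the property values in the usage example are placeholders, not the poster's data.

```python
import math
from statistics import mean, pstdev

BASES = "TCAG"
# Vertebrate mitochondrial code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

def neighbours(aa):
    """Amino acids reachable from any codon of `aa` by one base change."""
    found = set()
    for codon, residue in CODE.items():
        if residue != aa:
            continue
        for pos in range(3):
            for base in BASES:
                found.add(CODE[codon[:pos] + base + codon[pos + 1:]])
    return found - {aa, "*"}  # drop the amino acid itself and stop codons

def responsiveness(slopes):
    """Root mean square of the regression slopes (8 of them in the poster)."""
    return math.sqrt(mean([s * s for s in slopes]))

def proximity(aa, properties, neighbour_map):
    """Mean of 1/d_ij over the neighbours of `aa`, after scaling each
    property to unit variance (population variance is an assumption)."""
    names = list(properties)
    n_props = len(properties[names[0]])
    sds = [pstdev([properties[n][p] for n in names]) for p in range(n_props)]
    scaled = {n: [v / sd for v, sd in zip(properties[n], sds)] for n in names}
    return mean([1.0 / math.dist(scaled[aa], scaled[nb])
                 for nb in neighbour_map[aa]])

# Placeholder data: three hypothetical items with a single made-up property.
toy_properties = {"X": [0.0], "Y": [1.0], "Z": [3.0]}
toy_proximity = proximity("X", toy_properties, {"X": ["Y", "Z"]})
```

For real use, `neighbour_map` could be built as `{aa: neighbours(aa) for aa in set(AMINO) - {"*"}}` and `properties` filled with the eight measured properties per amino acid. The sketch fails with a division by zero if a property is constant or if two neighbouring amino acids have identical properties.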