Codon usage bias Ref: Chapter 9


Codon usage bias
Ref: Chapter 9
Xuhua Xia
xxia@uottawa.ca
http://dambe.bio.uottawa.ca

Objectives
- Understand how codon usage bias affects translation efficiency and gene expression
- Biomedical relevance
  - Protein drugs in the pharmaceutical industry
  - Transgenic experiments in agriculture
- Factors affecting codon usage bias
- Indices measuring codon usage bias
- Develop bioinformatic skills to study genomic codon usage

Codon Usage Bias
Observation: Strongly biased codon usage in a wide range of genomes, from viruses, mitochondria and plastids to prokaryotes and eukaryotes.
Hypotheses:
- Differential mutation hypothesis, e.g., the transcriptional hypothesis of codon usage (Xia 1996 Genetics 144:1309-1320)
- Differential selection hypothesis, e.g., (Xia 1998 Genetics 149:37-44)
Predictions:
- From the mutation hypothesis: concordance between codon usage and mutation pressure.
- From the selection hypothesis: concordance between differential availability of tRNA and differential codon usage. The concordance is stronger in highly expressed genes than in lowly expressed genes (CAI is positively correlated with gene expression).
[Figure: a polycistronic mRNA (Gene 1, Gene 2, Gene 3) with RNA polymerase, ribosomes and protein; tRNA labels UCC~tRNA~Gly and GCC~tRNA~Gly.]

Codon usage of HEGs in yeast
You may be wondering about the Cys codon family, which has 4 tRNAs matching UGC but none matching UGU. We would have predicted that UGC should be preferred, but the opposite is true. Why? One might think that, because Cys is rarely used, the codon family is not under selection, so that codon usage is at the mercy of mutation bias. Because the yeast genome is AT-biased, we would then expect U-ending codons to be more frequent than C-ending codons. Unfortunately, this explanation is wrong because 1) the mutation bias is not sufficient to produce the 3/39 ratio, and 2) the lowly expressed genes, which should be even more affected by mutation bias, do not exhibit a bias of comparable strength. This criticism also applies to another explanation, which states that the GCA anticodon can decode C-ending and U-ending codons equally well.
Xia 2007. Bioinformatics and the cell.

Calculation of RSCU
RSCU and proportion: different scaling.
RSCU (Sharp et al. 1986) is codon-specific.
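The formula on this slide did not survive in the transcript. As a minimal sketch, assuming the standard definition from Sharp et al. (1986), RSCU for a codon is its observed count divided by the mean count over its synonymous family, while the proportion divides by the family total. For a family of n codons, RSCU is therefore n times the proportion, which is the "different scaling" the slide refers to. The function names and the Cys counts below are hypothetical and only illustrate the arithmetic.

```python
def rscu(family_counts):
    """RSCU per codon: observed count / mean count within one synonymous family."""
    n = len(family_counts)
    total = sum(family_counts.values())
    if total == 0:
        # The amino acid is not used at all, so RSCU is undefined.
        return {codon: float("nan") for codon in family_counts}
    mean = total / n
    return {codon: count / mean for codon, count in family_counts.items()}


def proportion(family_counts):
    """Proportion per codon: observed count / family total (sums to 1)."""
    total = sum(family_counts.values())
    if total == 0:
        return {codon: float("nan") for codon in family_counts}
    return {codon: count / total for codon, count in family_counts.items()}


# Hypothetical counts for a two-codon family (not taken from the slides).
cys = {"UGU": 30, "UGC": 10}
print(rscu(cys))        # {'UGU': 1.5, 'UGC': 0.5}
print(proportion(cys))  # {'UGU': 0.75, 'UGC': 0.25}
```

RSCU values within a family sum to the number of codons in the family, so a value of 1 means "used as often as expected under even usage" regardless of family size.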

RSCU (HIV-1 vs Human)
[Figure, panel (a): plot of RSCU (HIV-1) and RSCU (Human), with separate markers for A-ending, C-ending, G-ending and U-ending codons; labelled amino acids: E, G, I, K, L, P, Q, R, S, T, V.]
Fig. 1. Relative synonymous codon usage (RSCU) of HIV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids.
van Weringh et al. 2011. MBE.

RSCU (HTLV-1 vs Human)
Relative synonymous codon usage (RSCU) of HTLV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids.

Calculation of CAI
N2,3,4: number of 2-, 3- and 4-fold codon families. Compound 6- or 8-fold codon families should be broken into two codon families.
CAI is gene-specific, with 0 ≤ CAI ≤ 1.
CAI values computed with different reference sets are not comparable.
Problem with computing w as Fi/Fi.max: suppose an amino acid is rarely used in highly expressed genes. There is then little selection on it, and its codon usage might be close to even, with wi ≈ 1. If a lowly expressed gene happens to be made entirely of this amino acid, its CAI would be 1, which is misleading. There has been no good alternative. Further research is needed.
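The problem described above can be reproduced in a few lines. This is a minimal sketch, assuming w is computed as each codon's count in the reference (highly expressed) set divided by the largest count in its family, and CAI as the count-weighted geometric mean of w over the codons of a gene. All codon counts and function names are hypothetical: His stands in for an amino acid that is rare and evenly used in the reference set, Lys for a common, strongly biased one.

```python
import math


def cai_weights(reference_families):
    """w = codon count / largest count in its family, from the reference set."""
    weights = {}
    for family in reference_families:
        top = max(family.values())
        for codon, count in family.items():
            weights[codon] = count / top
    return weights


def cai(gene_counts, weights):
    """Count-weighted geometric mean of w over the codons used by a gene."""
    total = sum(gene_counts.values())
    # Codons with zero count contribute nothing and are skipped.
    log_sum = sum(f * math.log(weights[c]) for c, f in gene_counts.items() if f > 0)
    return math.exp(log_sum / total)


# Hypothetical reference counts (not from the slides).
reference = [
    {"CAU": 5, "CAC": 5},      # rare amino acid, even usage -> both w = 1
    {"AAA": 20, "AAG": 180},   # common amino acid, strong bias
]
w = cai_weights(reference)

his_only_gene = {"CAU": 10, "CAC": 10}   # made entirely of the rare amino acid
lys_only_gene = {"AAA": 10, "AAG": 10}   # ignores the preferred Lys codon half the time

print(cai(his_only_gene, w))             # 1.0, although nothing is being optimized
print(round(cai(lys_only_gene, w), 4))   # 0.3333
```

The His-only gene reaches the maximum CAI of 1 purely because its codon family is unselected in the reference set, which is the misleading outcome the slide warns about.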

Weak mRNA predictive power
[Figure: protein abundance plotted against mRNA abundance, with regression line y = 5.6507x + 4.1367, R² = 0.1936; ENO1 and FRS2 are labelled.]

Effect of Codon Usage Bias
[Figure: protein abundance plotted against codon usage bias, with regression line y = 70.398x - 11.739, R² = 0.5668; ENO1 and FRS2 are labelled.]

Any problem with the mutation hypothesis?
Table 2. Frequency of A residues, length and codon adaptation index (CAI) for the three HIV-1 early (tat, rev and nef) and five late (gag-pol, vif, vpu, vpr, and env) coding sequences (CDS).

Gene   CDS (bp)   CAI
tat    261        0.66875
rev    351        0.66211
nef    621        0.67523
gag    1503       0.62784
pol    3012       0.58139
vif    579        0.61941
vpr    291        0.64272
vpu    249        0.49068
env    2571       0.61924

van Weringh et al. 2011. MBE.

Problem with CAI and a new ITE

AA   Codon   CF(non-HEG)   CF(HEG)   tRNA
A    GCA     20            40        3
     GCG     80            60

AA   Codon   CF(non-HEG)   CF(HEG)   w (CAI)   p(HEG)   p(non-HEG)   s      w (ITE)
A    GCA     20            40        2/3       0.4      0.2          2      1
     GCG     80            60        1         0.6      0.8          0.75   0.375

Other values on the slide: 50, 0.5, 0.3.

CAI is a special case of ITE (when there is no background codon usage bias).

Problem with CAI and a new ITE

AA   Codon   CF(non-HEG)   CF(HEG)   w     Gene1   Gene2
A    GCA     20            40        2/3   10      20
     GCG     80            60        1     40      30

$\mathrm{CAI} = \exp\left(\frac{\sum_i F_i \ln(w_i)}{\sum_i F_i}\right)$

CAI1 = 0.9221; CAI2 = 0.8503
Wrong conclusions:
1. Excellent codon adaptation in the codon family (high CAI values).
2. Gene 1 has better codon adaptation than Gene 2.

AA   Codon   CF(non-HEG)   CF(HEG)   p(HEG)   p(non-HEG)   s      w       Gene1   Gene2
A    GCA     20            40        0.4      0.2          2      1       10      20
     GCG     80            60        0.6      0.8          0.75   0.375   40      30

E. coli data

$I_{TE} = \exp\left(\frac{\sum_i F_i \ln(w_i)}{\sum_i F_i}\right)$

ITE.1 = 0.4563; ITE.2 = 0.5552
Correct conclusions:
1. Poor codon adaptation in the codon family (low ITE values).
2. Gene 2 has better codon adaptation than Gene 1.
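The reported index values can be checked directly from the formula, which is the same count-weighted geometric mean for both indices and differs only in the weights w. The sketch below takes the weights from the slide (2/3 and 1 for CAI; 1 and 0.375 for ITE). The per-gene codon counts are an assumption: Gene 1 with GCA = 10, GCG = 40 and Gene 2 with GCA = 20, GCG = 30 are the counts that agree with the two entries legible in the transcript (10 and 30) and reproduce the reported CAI and ITE values. The function name is illustrative.

```python
import math


def adaptation_index(codon_counts, weights):
    """exp( sum(F_i * ln w_i) / sum(F_i) ), used for both CAI and ITE."""
    total = sum(codon_counts.values())
    log_sum = sum(f * math.log(weights[c]) for c, f in codon_counts.items() if f > 0)
    return math.exp(log_sum / total)


# Weights as given on the slide.
w_cai = {"GCA": 2 / 3, "GCG": 1.0}
w_ite = {"GCA": 1.0, "GCG": 0.375}

# Assumed codon counts per gene (see the note above).
gene1 = {"GCA": 10, "GCG": 40}
gene2 = {"GCA": 20, "GCG": 30}

for name, gene in (("Gene1", gene1), ("Gene2", gene2)):
    print(name,
          round(adaptation_index(gene, w_cai), 4),
          round(adaptation_index(gene, w_ite), 4))
# Gene1 0.9221 0.4563
# Gene2 0.8503 0.5552
```

The ranking of the two genes flips between the indices because CAI treats GCG as the best codon (largest count in highly expressed genes), whereas ITE treats GCA as the best codon (largest enrichment in highly expressed genes relative to the background).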

Problem with CAI and a new ITE (E. coli data)

AA   Codon   CF(Other)   CF(HEG)   tRNA
A    GCA     25511       1973      3
     GCG     43261       2654

AA   Codon   CF(Other)   CF(HEG)   w (CAI)   p(HEG)   p(Other)   s        w (ITE)
A    GCA     25511       1973      0.7434    0.4264   0.3710     1.1495   1
     GCG     43261       2654      1         0.5736   0.6290     0.9118   0.7932

Other values on the slide: 0.5, 0.8528, 1.1472.

CAI is a special case of ITE (when there is no background codon usage bias).
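The columns of the E. coli table can be recomputed from the two count columns. This is a sketch of the procedure as the numbers on the slide imply it, not a reference implementation: p is the within-family proportion, s is the ratio of the proportion in highly expressed genes to the proportion in the other genes, and the ITE weight is s scaled by the largest s in the family. The function names are illustrative.

```python
import math


def cai_weights(cf_heg):
    """CAI weight: count in highly expressed genes / largest count in the family."""
    top = max(cf_heg.values())
    return {c: n / top for c, n in cf_heg.items()}


def ite_weights(cf_heg, cf_other):
    """Return (p_heg, p_other, s, w) for one codon family."""
    total_heg = sum(cf_heg.values())
    total_other = sum(cf_other.values())
    p_heg = {c: n / total_heg for c, n in cf_heg.items()}
    p_other = {c: n / total_other for c, n in cf_other.items()}
    s = {c: p_heg[c] / p_other[c] for c in cf_heg}   # enrichment in HEGs
    s_max = max(s.values())
    w = {c: s[c] / s_max for c in s}                  # best codon gets w = 1
    return p_heg, p_other, s, w


# Codon counts from the slide (E. coli, the two Ala codons shown).
cf_heg = {"GCA": 1973, "GCG": 2654}
cf_other = {"GCA": 25511, "GCG": 43261}

p_heg, p_other, s, w_ite = ite_weights(cf_heg, cf_other)
w_cai = cai_weights(cf_heg)

for codon in cf_heg:
    print(codon, round(w_cai[codon], 4), round(p_heg[codon], 4),
          round(p_other[codon], 4), round(s[codon], 4), round(w_ite[codon], 4))
# GCA 0.7434 0.4264 0.371 1.1495 1.0
# GCG 1.0 0.5736 0.629 0.9118 0.7932

# With an even background (no codon usage bias outside the HEGs),
# the ITE weights reduce to the CAI weights.
_, _, _, w_even = ite_weights(cf_heg, {"GCA": 1, "GCG": 1})
print(all(math.isclose(w_even[c], w_cai[c]) for c in cf_heg))  # True
```

The last check illustrates the slide's closing statement: when the background usage is even, dividing by it changes nothing, so ITE and CAI rank and weight the codons identically.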

Contrast between CAI and ITE
Kudla et al. (2009) engineered a synthetic library of 154 genes, all encoding the same protein but differing in degree of codon adaptation, to quantify the effect of differential codon usage on protein production in E. coli. They concluded that "codon bias did not correlate with gene expression" and that "translation initiation, not elongation, is rate-limiting for gene expression".
ITE reveals that:
- protein production is low when ITE is low, regardless of translation initiation efficiency;
- if translation initiation is efficient, protein production increases with ITE.

Hypothesis and Predictions

Amino acid   tRNA/anticodon   G-ending codon   A-ending codon
Met          tRNAMet/CAU      AUG              AUA
Leu          tRNALeu/UAA      UUG              UUA
Glu          tRNAGlu/UUC      GAG              GAA
Lys          tRNALys/UUU      AAG              AAA
Gln          tRNAGln/UUG      CAG              CAA
Arg          tRNAArg/UCU      AGG              AGA
Trp          tRNATrp/UCA      UGG              UGA

In the families other than Met, A-ending codons are favoured by both mutation and tRNA-mediated selection. AUA is favoured by mutation, but not by tRNA-mediated selection.
Predictions:
1. The proportion of A-ending codons (or RSCU) should be smaller in the Met codon family than in other R-ending codon families: P_NNA = N_NNA/N_NNG
2. Availability of tRNAMet/UAU should increase P_AUA.
Xia et al. 2007

Testing prediction 1
Carullo, M. and Xia, X. 2008. J Mol Evol 66:484–493.

Testing prediction 2
Fig. 5. Relationship between P_AUA and P_UUA, highlighting the observation that P_AUA is greater when both a tRNAMet/CAU and a tRNAMet/UAU are present than when only tRNAMet/CAU is present in the mtDNA, for bivalve species (a) and chordate species (b). The filled squares are for mtDNA containing both tRNAMet/CAU and tRNAMet/UAU genes, and the open triangles are for mtDNA without a tRNAMet/UAU gene.
Xia, X. 2012. In: RS Singh et al. Evolution in the fast lane: Rapidly evolving genes and genetic systems. Oxford University Press.