Signposts for translation initiation: An illustration of formulating a research project Xuhua Xia
The Protocol
What is known (which involves much reading and doing)
Formulate a hypothesis based on what is known
Derive predictions from the hypothesis:
–Predictions are always about the relationship between or among measurable variables.
–Predictions involving variables that cannot be measured are of no value in science.
Design experiments to test the predictions
–Methods to measure the variables relevant to the prediction
–Methods to assess the relationship among the variables to confirm or reject the predictions
Results
–All results should be presented with respect to the predictions.
–Anything that is biologically interesting but not directly related to the predictions should be in the Discussion section
Discussion
–Does the method measure the variables as you intend it to?
–Do your conclusions depend on assumptions that may not be valid under certain circumstances?
–.....
E. coli 5’ UTR
What is known:
From “reading”: Signposts for translation initiation are often located around or upstream of the start codon. The signposts are often short motifs.
From “doing”: a dramatic pattern
Hypothesis, prediction & methods
Hypothesis: the pattern is related to translation initiation, i.e., a dramatic increase in purine and dramatic decreases in pyrimidine enhance translation initiation.
Prediction: If the hypothesis is correct, then we expect highly expressed genes to exhibit the pattern more strongly than the lowly expressed genes.
–It is a relationship involving two variables
  –The gene expression
  –The strength of the pattern
–The variables need to be measurable
Methods: how should we measure the variables?
–Gene expression (CAI or results from wet lab measurements)
–The pattern:
  –Graphic characterization
  –Numerical characterization (e.g., the variance among the four frequencies)
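The numerical characterization mentioned above can be sketched as follows. This is a minimal illustration, not the author's actual procedure: it assumes the 5’ UTR sequences are already aligned at the start codon, have equal length, and contain only A, C, G, T (or U). The function name `site_frequency_variance` is hypothetical, and the population variance (divide by 4) is an arbitrary choice.

```python
from collections import Counter

def site_frequency_variance(aligned_seqs):
    """Return, for each site, the variance among the A, C, G, T frequencies.

    Assumes equal-length sequences aligned at the start codon.
    A variance of 0 means all four nucleotides are equally frequent;
    the maximum (0.1875) means a single nucleotide occupies the site.
    """
    n_sites = len(aligned_seqs[0])
    variances = []
    for site in range(n_sites):
        # Count nucleotides at this site, treating U as T.
        counts = Counter(seq[site].upper().replace("U", "T") for seq in aligned_seqs)
        total = sum(counts[base] for base in "ACGT")
        freqs = [counts[base] / total for base in "ACGT"]
        mean = sum(freqs) / 4  # always 0.25 when total > 0
        # Population variance among the four frequencies.
        variances.append(sum((f - mean) ** 2 for f in freqs) / 4)
    return variances

# Toy example: site 0 is always A, site 1 is uniform.
toy = ["AG", "AC", "AT", "AA"]
result = site_frequency_variance(toy)
```

Plotting these per-site values separately for highly and lowly expressed genes would give one numerical view of how strongly each group exhibits the pattern.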
Results testing the predictions
Highly expressed genes | Lowly expressed genes
You could do statistics to show that the pattern on the left is significantly stronger than that on the right, but often a picture is worth 1000 words + 10 p values.
Results not directly related to the prediction but should be discussed: the difference in frequency distribution at sites 0-70
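If a statistical comparison is wanted, one possible approach is a one-sided permutation test on a per-gene measure of pattern strength. The sketch below is only an assumption about how such a test could be set up: the function name `permutation_p_value`, the input lists, and the toy numbers are hypothetical and not taken from the slides.

```python
import random

def permutation_p_value(high, low, n_perm=2000, seed=0):
    """One-sided permutation test: is mean(high) greater than mean(low)?"""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = sum(high) / len(high) - sum(low) / len(low)
    pooled = list(high) + list(low)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_high = pooled[:len(high)]
        perm_low = pooled[len(high):]
        diff = sum(perm_high) / len(perm_high) - sum(perm_low) / len(perm_low)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p value above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical pattern-strength scores for two groups of genes.
p_separated = permutation_p_value([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
```

A small p value for the first call and a large one for the second is the expected behaviour; the test makes no distributional assumption about the scores.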
Prokaryotic translation initiation
The Shine-Dalgarno (SD) sequence in the 5’ UTR matches the anti-SD (ASD) sequence at the 3’ end of ssu rRNA
What is an SD?
–Outdated: the SD consensus is AGGAGG, binding to UCCUCC at the 3’ end of ssu rRNA. In E. coli, for example, the sequence is AGGAGGU. This sequence helps recruit the ribosome to the mRNA to initiate protein synthesis by aligning it with the start codon. The complementary sequence (UUCCUCC) lies in the ssu rRNA.
–Modern
Secondary structure of E. coli 16S rRNA
Yassin A et al. PNAS 2005;102:
ASD: 3’ AUUCCUCCACUA---..5’
SD: 5’..--AGGAGG---..AUG–..3’
Modern Definition of SD
[Figure, panels (a)-(d). Labels: mRNA, ssu, Ribosome, aSD (pyrimidine-rich), SD1, SD2, AUG, D1, D2, DtoAUG]
Is it important to have weak bonds here so that the stem can open a bit to increase flexibility?
Prabhakaran et al. 2015
Refine the hypothesis
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z2705 GAGATTAACTCAATCTAGAGGGTATTAATAATG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z5748 CTGAACATACGAATTTAAGGAATAAAGATAATG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z3810 AACCGCCGCTTACCAGCAGGAGGTGATGAAAUG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z2225 TGATCCGCGTATCGGACGTGGAGGTGGTGAATG
It is the pairing, not the motif AGGAGG, that is important.
DtoAUG = 17; DtoAUG = 15; DtoAUG = 14
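The idea that pairing matters more than the motif can be illustrated by searching for the longest stretch of an upstream sequence that pairs with the 16S rRNA 3’ tail. The sketch below is a simplification under stated assumptions: only Watson-Crick pairs count (G-U wobble pairs and pairing energy are ignored), the function names are hypothetical, and the distance DtoAUG is not computed.

```python
def complement(seq):
    """Base-by-base complement (no reversal), using DNA letters."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in seq)

def longest_pairing(mrna_5to3, rrna_3to5):
    """Find the longest contiguous Watson-Crick pairing.

    mrna_5to3: upstream sequence plus start codon, written 5' to 3'.
    rrna_3to5: 16S rRNA tail written 3' to 5', as on the slide.
    Returns (paired mRNA stretch, 0-based mRNA start, 0-based offset
    from the rRNA 3' end).
    """
    mrna = mrna_5to3.upper().replace("U", "T")
    # An mRNA stretch pairs with the tail where it equals the tail's complement.
    target = complement(rrna_3to5.upper().replace("U", "T"))
    best_len, best_end_m, best_end_t = 0, 0, 0
    prev = [0] * (len(target) + 1)
    # Longest-common-substring dynamic programming, one row at a time.
    for i in range(1, len(mrna) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if mrna[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end_m, best_end_t = cur[j], i, j
        prev = cur
    start_m = best_end_m - best_len
    return mrna[start_m:best_end_m], start_m, best_end_t - best_len

RRNA_TAIL = "ATTCCTCCACTAGGTTGGCG"  # 3' to 5', copied from the slide
z3810 = "AACCGCCGCTTACCAGCAGGAGGTGATGAAAUG"
hit = longest_pairing(z3810, RRNA_TAIL)
```

For Z3810 the longest stretch found is AGGAGGTGAT, longer than the AGGAGG motif itself. A fuller treatment would score wobble pairs and pairing stability rather than exact complementarity.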
Pairing and reading frame
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z4094 CAGTTTAACTAGTGACTTGAGGAAAACCTAATG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z4981 GGCACACTTAATTATTAAAGGTAATACACTATG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z0849 TATTAGATTTGTATTCACCGGAGTGATGTAATG
16S rRNA 3’ ATTCCTCCACTAGGTTGGCG--- 5’
Z0749 CATCTCATCGAAAACACGGAGGAAGTATAGATG
Multiple SD sequences bind to 16S rRNA 3n nucleotides apart.
Hypothesis: the pairing contributes to the determination of the reading frame
Prediction: highly expressed genes should exhibit the pattern more strongly than lowly expressed genes
Hypothesis, prediction, tests
Hypothesis: Pairing between the SD sequence and the aSD is essential for translation initiation
Prediction: Modifying the SD or aSD to disrupt base pairing will reduce protein production
The prediction was initially supported (A. Hui, H. de Boer PNAS 84:4762–4766)
–Mutating SD to disrupt the pairing: protein production decreased
–Mutating ASD to restore the pairing: protein production is restored.
Many counterexamples (SD not needed for initiation):
–The classic Nirenberg and Matthaei experiment with poly-U
–P. Melancon et al. The anti-Shine–Dalgarno region in Escherichia coli 16S ribosomal RNA is not essential for the correct selection of translational starts. Biochemistry, 29:3402–3407 (Removed the last ~30 nt in 16S rRNA)
–D.C. Fargo et al. Shine–Dalgarno-like sequences are not required for translation of chloroplast mRNAs in Chlamydomonas reinhardtii chloroplasts or in Escherichia coli. Mol. Gen. Genet. 257:271–282
–S. Sartorius-Neef, F. Pfeifer. In vivo studies on putative Shine–Dalgarno sequences of the halophilic archaeon Halobacterium salinarum. Mol. Microbiol., 51:579–588 (Efficient translation of leaderless mRNA)
What genes need SD (still an unanswered question)?
Progress of science
Observation
Hypothesis
Predictions and tests
Universally accepted: Working theory
New observations contradicting the theory
Refine hypothesis to accommodate new observations
New hypothesis to accommodate new observations
Effect of a single substitution
3’ end of ssu rRNA: E. coli GGAUCACCUCCUUA 3’; B. subtilis UCACCUCCUUUCUA 3’
[Figure: DtoAUG relative to AUG, compared between E. coli and B. subtilis]
A new hypothesis
An accessible initiation codon is essential for translation initiation (T. Nakamoto 2006 BBRC 341: ):
–A leaderless mRNA can be translated because the initiation AUG is highly accessible at the 5’ end
–SD and ASD pairing prevents secondary structure formation involving the initiation AUG and makes the AUG more accessible.
–Synthetic mRNAs that lack the SD sequence but are efficiently translated typically have no secondary structure, rendering the initiation AUG readily accessible.
Secondary structure may embed the SD or start codon and hide the translation start signal
Prediction: reduced secondary structure in sequences flanking the SD and start codon
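To test this prediction, secondary structure near the start codon has to be turned into a number. A very crude proxy, shown below, is the maximum number of nested base pairs a sequence window can form (the Nussinov dynamic programming algorithm). This is only an illustrative stand-in: real analyses would normally use minimum folding energy from a dedicated RNA folding program, and nothing here is claimed to be the method behind the slides. The function name and the minimum loop length of 3 are assumptions.

```python
def max_base_pairs(seq, min_loop=3):
    """Nussinov algorithm: maximum number of nested base pairs.

    Allows A-U, G-C and G-U pairs, and requires at least `min_loop`
    unpaired nucleotides inside any hairpin loop.
    """
    seq = seq.upper().replace("T", "U")
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            # case 2: position j pairs with some k, leaving a loop >= min_loop
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

hairpin = max_base_pairs("GGGAAACCC")   # three G-C pairs closing an AAA loop
unstructured = max_base_pairs("AAAAAAAA")
```

Under the hypothesis, windows flanking the SD and start codon of efficiently translated genes would be expected to score lower on a measure like this than windows elsewhere in the mRNA.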
Secondary structure and start codon
Prabhakaran et al. unpublished.
Another new hypothesis Translation initiation of both prokaryotic and eukaryotic genes depends on the ssu ribosome scanning along the mRNA Any mechanism that can pause the ssu ribosome near the initiation codon can enhance translation initiation.
Yeast 18S rRNA people.biochem.umass.edu
Yeast 5’ UTR
Gene expression and 5’ UTR