Volume 26, Issue 5, Pages 692-697 (March 2016)
The Effect of Local Sequence Context on Mutational Bias of Genes Encoded on the Leading and Lagging Strands
Jeremy W. Schroeder, William G. Hirst, Gabriella A. Szewczyk, Lyle A. Simmons
Current Biology. DOI: /j.cub. Copyright © 2016 Elsevier Ltd
Figure 1 Indels in Homopolymer Runs Are Enriched Outside of Coding Regions
All starts and ends of CDSs were aligned at relative position zero. Indels were counted in 50-bp bins, offset by 25. Negative distances indicate the indel was 5′ to a CDS start site or 3′ to a CDS end site. The lines in each plot represent the locally weighted polynomial regression (loess) fit to the data.
(A) The number of indels found in each bin without correcting for homopolymer run bias.
(B) Uncorrected indel counts separated by the homopolymer run length in which the indel was produced.
(C) Expected indel count in each bin after applying a correction for homopolymer run bias (see Supplemental Experimental Procedures). See also Figure S3.
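The binning described in this caption can be sketched as overlapping windows: each bin is 50 bp wide and successive bins start 25 bp apart, so every position falls into two bins. This is only an illustration of that idea. The function name, the half-open bin boundaries, and the example positions are assumptions, and the loess fit and homopolymer correction are not reproduced.

```python
def count_in_overlapping_bins(positions, start, stop, width=50, step=25):
    """Count positions in bins of `width` bp whose starts are `step` bp apart.

    Bins are half-open, [bin_start, bin_start + width). With step < width,
    neighbouring bins overlap, so one position can be counted more than once.
    """
    counts = {}
    for bin_start in range(start, stop - width + 1, step):
        bin_end = bin_start + width
        counts[(bin_start, bin_end)] = sum(
            bin_start <= p < bin_end for p in positions
        )
    return counts


# Hypothetical indel positions relative to an aligned CDS boundary (bp).
# Negative values lie outside the coding sequence.
indel_positions = [-60, -30, -10, 5, 40]
bins = count_in_overlapping_bins(indel_positions, start=-100, stop=100)

for (lo, hi), n in bins.items():
    print(f"[{lo:>5}, {hi:>4}) -> {n}")
```

Overlapping bins smooth the counts before any curve is fitted, at the cost of making neighbouring bins statistically dependent.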
Figure 2 Transitions Display Complementary Symmetry between Replichores
(A) Schematic representation of the B. subtilis chromosome. DNA replication initiates at oriC, which is at position zero in the reference genome, and proceeds bidirectionally toward terC, which is at 1.97 Mb in the reference genome. The left and right replichores of the chromosome are in green and black, respectively.
(B) Cumulative distributions of the indicated types of transitions along the genome. The origin of replication is indicated by the vertical dashed red line and the terminus is at both ends.
(C) A barplot displaying the mutation rate for the indicated types of transitions binned by replichore. The mutation rate is normalized to the number of each base in each replichore as described in Supplemental Experimental Procedures. Error bars represent 95% confidence intervals determined by bootstrapping. All comparisons between left and right replichores are statistically significant with the exception of T → C transitions in MMR+ lines. MMR intact refers to wild-type data, and MMR deficient refers to the pooled data for ΔmutSL, ΔwalJ, and mutL[E468K].
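As a rough illustration of how a bootstrapped 95% confidence interval like the error bars in (C) can be obtained, the sketch below applies a percentile bootstrap to a mean of normalized rates. The resampling unit, the number of resamples, and all data values are assumptions made for the example; the authors' actual procedure is the one in their Supplemental Experimental Procedures.

```python
import random


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical counts of one transition type per line in one replichore.
counts = [3, 5, 4, 6, 2, 5, 4, 3]
# Hypothetical number of the mutable base in that replichore.
bases_at_risk = 500_000

# Normalize each count to the number of bases that could have mutated.
rates = [c / bases_at_risk for c in counts]
low, high = bootstrap_ci(rates)
print(f"mean rate = {sum(rates) / len(rates):.2e}, 95% CI = ({low:.2e}, {high:.2e})")
```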
Figure 3 Regression of Base Pair Substitution Count against Coding Sequence Length, Expression, and Orientation
(A) A graphical representation of linear regression analysis. Blue triangles indicate head-on CDSs, and red circles represent codirectional CDSs. Lines indicate the linear fit to the data, and the shaded region around each line indicates the 95% confidence interval for the fit. The plot on the left includes all CDSs, and the plot on the right excludes CDSs more than 3 SD above or below the mean of either length or expression (RPKM). See Table S4 for a summary of each CDS, including which were determined to be outliers.
(B) A table listing the variables determined to be significantly associated with the average number of BPSs found in CDSs, either with or without outliers. See Table S3 for detailed results and Equation S5 in Supplemental Information for the regression model.
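The outlier rule in (A) can be sketched as a filter that keeps a CDS only if both its length and its expression lie within 3 SD of the respective means, followed by a least-squares fit. This is a simplified, single-predictor illustration with made-up data. The helper names are hypothetical, the population SD is an arbitrary choice, and the authors' actual model is their Equation S5, which is not reproduced here.

```python
from statistics import mean, pstdev


def within_sd(values, n_sd=3.0):
    """Flag each value that lies within n_sd standard deviations of the mean."""
    mu, sd = mean(values), pstdev(values)
    return [abs(v - mu) <= n_sd * sd for v in values]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical CDS lengths (bp); the last one is an extreme outlier.
lengths = [1000 + 10 * i for i in range(19)] + [30000]
# Hypothetical expression values (RPKM) with no outliers.
rpkm = [50 + i for i in range(20)]
# Hypothetical BPS counts, constructed to be proportional to length.
bps = [0.002 * length for length in lengths]

# A CDS is retained only if it is not an outlier in either variable.
keep = [a and b for a, b in zip(within_sd(lengths), within_sd(rpkm))]
x = [v for v, k in zip(lengths, keep) if k]
y = [v for v, k in zip(bps, keep) if k]

slope, intercept = fit_line(x, y)
print(f"retained {len(x)} of {len(lengths)} CDSs, slope = {slope:.4f}")
```

With only a handful of points, a single extreme value cannot exceed 3 population SDs, so a filter of this kind is only meaningful on reasonably large gene sets.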
Figure 4 Increased Mutation Rate in Head-On Genes Due to Sequence Composition
(A) The transition rate for the focal base in each of the 64 possible triplet nucleotide sequences in MMR− lines is shown. Rates are normalized to the number of times each triplet is present in the leading strand.
(B) The leading strand triplet composition of head-on CDSs is plotted versus that of codirectional CDSs. “Fraction of triplets” in the axis labels refers to the number of a given triplet divided by the total number of all triplets present in the leading strand of either head-on or codirectional CDSs. The triplets with the highest transition rates are plotted as larger red dots and indicated by arrows. The red dashed line indicates a slope of one so that differences between head-on and codirectional genes may be easily noticed.
(C) Monte Carlo simulations were performed to generate transitions using the MMR− context-dependent transition rates shown in (A). The boxplots on the left indicate the distribution of transition rates for codirectional (red) and head-on (blue) CDSs using the MMR− context-dependent transition rates. Boxplots on the right indicate the distribution of transition rates with the 5′-CCG-3′ triplet rate set artificially to zero. Each pair of boxplots represents the distribution of mutation rates resulting from 1,000 independent simulations of 500,000 generations. The lower and upper bounds of each box indicate the first and third quartile, respectively. The band inside each box indicates the median. The lower and upper whiskers include values within 1.5 times the interquartile range of the first and third quartiles, respectively.
(D) The simulation performed in (C) was carried out at a range of generations per iteration. For each number of generations, 1,000 iterations were performed, and hypothesis testing was carried out to test whether head-on genes had a mutation rate greater than that of codirectional genes. The proportion of those 1,000 p values less than or equal to 0.05 is plotted on the y axis against the number of generations per iteration. See also Figure S3.
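The "fraction of triplets" in (B) and the per-triplet normalization in (A) can be illustrated with a short sketch. It assumes triplets are counted with a sliding window of step 1 over the leading-strand sequence. The function names, the example sequence, and the counts are hypothetical, and the Monte Carlo simulation itself is not reproduced.

```python
from collections import Counter


def triplet_fractions(sequence):
    """Fraction of each overlapping 3-mer among all 3-mers in `sequence`."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {triplet: n / total for triplet, n in counts.items()}


def normalized_rate(transitions_at_focal_base, triplet_count):
    """Transitions observed in a triplet context per occurrence of that triplet."""
    return transitions_at_focal_base / triplet_count


# Hypothetical leading-strand fragment; its 3-mers are
# ACC, CCG, CGC, GCC, CCG, CGT.
fractions = triplet_fractions("ACCGCCGT")
print(fractions["CCG"])

# Hypothetical: 4 transitions observed across 2 occurrences of a triplet.
rate = normalized_rate(4, 2)
print(rate)
```

Comparing such fractions between head-on and codirectional CDSs shows which highly mutable contexts are over-represented in one orientation.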