Published by Torje Amundsen. Modified over 5 years ago.
Antigenic Variation in the Lyme Spirochete: Insights into Recombinational Switching with a Suggested Role for Error-Prone Repair
Theodore B. Verhey, Mildred Castellanos, George Chaconas
Cell Reports Volume 23, Issue 9, Pages 2595-2605 (May 2018) DOI: 10.1016/j.celrep.2018.04.117
Copyright © 2018 The Author(s)
Cell Reports 2018 23, 2595-2605. DOI: 10.1016/j.celrep.2018.04.117
Figure 1 Schematic of Antigenic Variation in B. burgdorferi and Inference of Recombinational and Mutational Sequence Changes
(A) The vls system is located adjacent to the right hairpin telomere of the lp28-1 plasmid in the B31 strain of B. burgdorferi. It includes the vlsE expression locus adjacent to the right telomere, as well as 15 silent cassettes, which share homology with the variable region of vlsE and are located upstream and in the opposite orientation. Unidirectional, segmental gene conversion events lead to cumulative and overlapping sequence changes in the expression locus to generate new chimeric VlsE antigens (Zhang et al., 1997; Zhang and Norris, 1998). The downward arrows indicate the position of the 17-bp direct repeats (DRs) that flank the silent cassettes and the vlsE variable region. (B) PacBio long-read sequences of an amplicon are mapped to the reference vlsE sequence and the cassettes using our VAST software. Sequence variation, represented by SNPs at different loci, can be identified as originating from one or more silent cassette sequences, or none. Inference yields the set of switch events (red and blue transparent boxes) most likely to explain the SNPs observed in the mapped read. SNPs that could not have been templated from the silent cassettes are designated as non-templated mutations (green). The Cs indicate the constant regions of vlsE.
Figure 2 Rate of Switch Event Accumulation, Cassette Utilization, and the Role of Percent Sequence Identity and the DRs
(A) The number of inferred switch events per read from weeks 0–5 post-infection. The units of observation are individual reads and the mean ± SEM is shown, although the error is too small to be visible. The least-squares linear regression line is shown with its 95% confidence band in gray. The rate of accumulation (slope) and its 95% confidence interval are labeled in red text. (B) The frequency of cassette usage for each cassette was calculated using two methods. In red, cassette usage frequency was determined by taking the mean of the frequencies of SNPs unique to each cassette. In blue, cassette frequencies are based on the inferred switch events, which take into account the neighboring SNPs to narrow down the possible sources of each templated SNP. This was computed by counting the recombination events attributed to each cassette (weighted appropriately for those switches that could have originated from multiple cassettes). (C) For each silent cassette, the percentage of sequence identity to the reference vlsE sequence (found in the expression locus) was computed using two methods. In red, insertions, deletions, and substitutions were scored equivalently; in green, indels were weighted in proportion to their size. (D) Silent cassettes were classified by whether they contained 0, 1, or 2 DRs identical to those flanking the variable region in vlsE. Each bar represents a group of cassettes and their mean frequency of cassette usage (mean ± SD).
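The caption for panel (B) does not say how switches with several possible source cassettes were weighted. A minimal sketch, assuming each ambiguous switch is split into equal fractional credit among its candidate cassettes, could look like this. The function name and the cassette labels are hypothetical and are not taken from the VAST software.

```python
from collections import defaultdict


def cassette_usage(switch_candidates):
    """Estimate cassette usage frequencies from inferred switch events.

    switch_candidates: list of sets; each set holds the cassettes that could
    have templated one switch event.
    Assumption (not from the paper): an ambiguous switch gives each of its
    k candidate cassettes a credit of 1/k.
    """
    if not switch_candidates:
        raise ValueError("need at least one switch event")
    credit = defaultdict(float)
    for candidates in switch_candidates:
        if not candidates:
            raise ValueError("each switch needs at least one candidate cassette")
        share = 1.0 / len(candidates)
        for cassette in candidates:
            credit[cassette] += share
    total = len(switch_candidates)
    # Normalize so the frequencies over all cassettes sum to 1.
    return {cassette: value / total for cassette, value in credit.items()}


# Hypothetical example: four switch events, two of them ambiguous.
events = [{"vls2"}, {"vls2", "vls3"}, {"vls7"}, {"vls3", "vls7"}]
usage = cassette_usage(events)
```

Because every switch contributes exactly one unit of credit in total, the resulting frequencies always sum to 1 regardless of how many events are ambiguous.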
Figure 3 Length of Switch Events
(A and B) Histograms of minimal switch event lengths for (A) SCID mice (9,949 recombination events from 4,065 reads) and (B) WT mice (3,678 recombination events from 1,254 reads), with the mean shown in blue. (C) A correction curve from 2.5 × 10^10 simulated switch events showing the relationship between the simulated length (x axis) and the observed minimal size (y axis). For each simulated switch size from 1 to 250 bp, the mean observed minimal switch size (black line) and one SD on either side (gray band) are shown (see text for further details).
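The idea behind the correction curve in panel (C) can be illustrated with a toy simulation. This is only a sketch under strong assumptions: a single hypothetical cassette, made-up SNP positions, and switch start points drawn uniformly at random. The actual simulation used the real cassette sequences and may differ in other respects. Here the observed minimal size of a switch is taken to be the span from the first to the last SNP that falls inside it, so a switch containing no SNP is invisible.

```python
import bisect
import random


def observed_minimal_length(start, length, snp_positions):
    """Span (in bp) from the first to the last SNP inside [start, start + length).

    snp_positions must be a sorted list. Returns None if the switch covers
    no SNP and would therefore go undetected.
    """
    lo = bisect.bisect_left(snp_positions, start)
    hi = bisect.bisect_left(snp_positions, start + length)
    if lo == hi:
        return None
    return snp_positions[hi - 1] - snp_positions[lo] + 1


def correction_curve(snp_positions, region_length, max_length, trials, seed=0):
    """Mean observed minimal size for each simulated switch length."""
    rng = random.Random(seed)
    curve = {}
    for length in range(1, max_length + 1):
        observed = []
        for _ in range(trials):
            start = rng.randrange(0, region_length - length + 1)
            size = observed_minimal_length(start, length, snp_positions)
            if size is not None:
                observed.append(size)
        if observed:  # skip lengths for which no simulated switch was detectable
            curve[length] = sum(observed) / len(observed)
    return curve


# Hypothetical SNP positions on a 100-bp region.
snps = [3, 10, 11, 25, 40, 41, 60, 90]
curve = correction_curve(snps, region_length=100, max_length=50, trials=200)
```

In this toy model the observed minimal size can never exceed the simulated length, which is why a curve of this kind is needed to translate observed sizes back into likely true sizes.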
Figure 4 Clustering of Switch Events
(A) Histogram of the number of switches per read from reads at week 1 in SCID mice (black), shown alongside the Poisson distribution expected from the same mean (red). A single read with 9 recombination events was an outlier and is not shown. Data were analyzed only from SCID mice to eliminate immune selection in WT mice that would confound analysis of recombination alone. (B) The same data (black) are shown alongside a zero-inflated Poisson distribution with maximum likelihood estimated parameters (red); the observed distribution corresponds very closely to the zero-inflated Poisson. (C) Distance between switch events for reads with exactly two inferred recombination events, with the mean ± SEM shown. The distance between the two switches was calculated three ways: (1) using the minimal switch lengths (which have the largest distance), (2) using the maximal switch lengths (the longest possible switch events, which have the shortest distance between the two), and (3) using the midpoint switch lengths (the midpoint between the minimal and maximal endpoints). For each method, the switching data (black) were compared to a set of 1 million simulated 2-switch variants (blue) where the recombination events were located randomly and originated from random cassette sequences. Two-sample t test p values are shown above the brackets.
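A zero-inflated Poisson model, as used in panel (B), mixes a point mass at zero (weight pi) with an ordinary Poisson distribution (mean lam). The caption does not say how the maximum likelihood estimates were obtained; the sketch below uses a standard EM iteration as one possible approach, and the per-read switch counts are invented for illustration.

```python
import math


def zip_pmf(k, pi, lam):
    """Probability of observing k events under a zero-inflated Poisson."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def fit_zip(counts, iterations=1000):
    """Maximum likelihood fit of (pi, lam) by expectation-maximization."""
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    n_zero = sum(1 for c in counts if c == 0)
    pi, lam = 0.5, total / n  # crude starting values
    for _ in range(iterations):
        # E-step: probability that an observed zero is a "structural" zero.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        pi = n_zero * z / n
        lam = total / (n - n_zero * z)
    return pi, lam


# Hypothetical switch counts per read: many more zeros than a Poisson predicts.
counts = [0] * 60 + [1] * 15 + [2] * 15 + [3] * 7 + [4] * 3
pi_hat, lam_hat = fit_zip(counts)
```

At the fitted values the model reproduces both the sample mean and the observed fraction of zero-switch reads, which is a convenient check that the iteration has converged.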
Figure 5 Association of Switch Boundaries with Local Sequence Identity and Relative Abundance of SNPs Originating from Cassette Edges
(A) Each point represents a boundary uncertainty region, defined as a specific region between templated SNPs on a specific cassette; it is the smallest unit for locating the position of a switch event boundary. Each region is plotted by the number of switch events beginning or ending in it against the mean of a 50-bp moving average of the sequence identity within the region. The least-squares exponential regression line is shown in red. (B) Each point represents a SNP from the silent cassette repertoire, plotted by distance from the nearest cassette edge (or average distance, for SNPs from multiple cassettes with different start/stop coordinates) for SNPs located up to 100 bp from the edges. The y axis represents the ratio of observed to theoretical frequency (F_Observed/F_Theoretical). A moving average is shown based on 10 neighboring points in each direction.
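The moving average in panel (B) uses 10 neighboring points in each direction. A minimal sketch of such a centered moving average follows. How points near the ends of the series were handled is not stated in the caption; truncating the window there is an assumption made here.

```python
def moving_average(values, half_width):
    """Centered moving average over `half_width` neighbors on each side.

    Assumption: near the ends, the window is truncated to the points that
    exist rather than padded.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Tiny illustrative series with one neighbor on each side.
example = moving_average([1, 2, 3, 4, 5], half_width=1)
```

For the figure, `half_width=10` would correspond to the 10 neighbors in each direction mentioned in the caption.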
Figure 6 Non-templated SNPs Accumulate over Time and Are Correlated with Switch Events
(A) The frequency of SNPs not attributable to silent cassettes (mean ± SEM) is shown over time in the SCID mouse infection alongside a negative control of the same spirochetes cultured for 5 weeks in growth medium. Data were analyzed only from SCID mice to eliminate immune selection in WT mice that would add an extra layer of complication when trying to analyze the process resulting in DNA changes. Least-squares regression lines and associated 95% confidence bands are plotted for the SCID infection (red) and the culture control (blue). The unit of observation is each base sequenced. (B) For reads with 0–10 switches, the number of non-templated SNPs per read (mean ± SEM) is shown. The least-squares regression line is shown in red with its 95% confidence band shown in gray.
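The regression lines in this figure are ordinary least-squares fits. As a sketch of how a slope and an approximate 95% confidence interval can be computed, the function below uses a normal approximation for the critical value, which is reasonable only when the number of observations is large (as it is when individual reads or bases are the unit of observation). An exact interval would use the t distribution with n - 2 degrees of freedom. The data points are invented.

```python
import math
from statistics import NormalDist


def slope_with_ci(xs, ys, level=0.95):
    """Least-squares slope with an approximate confidence interval.

    Assumption: uses a normal (large-sample) critical value instead of the
    exact t quantile.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, slope - z * se, slope + z * se


# Hypothetical data: non-templated SNPs per read against switches per read.
switches = [0, 1, 2, 3, 4, 5]
snps_per_read = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]
slope, lower, upper = slope_with_ci(switches, snps_per_read)
```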
Figure 7 Correlation between the Positions of Templated and Non-templated Indels and Substitutions and Preference for Non-templated SNPs in the Boundary Regions on the Left Side of Well-Defined Switch Events
(A) Frequency of templated indels (green, top) and non-templated indels (red, bottom) for each position in the vlsE amplicon. (B) Frequency of templated substitutions (green, top) and non-templated substitutions (red, bottom) for each position in the vlsE amplicon. (C) We examined 1,019 reads where there was a single switch event and no ambiguity in the position of the switch event. The left side of the graph shows switch events of all lengths and the right side shows only those of 10 bp or greater. Non-templated SNPs in each read were located in the interior, the exterior, or the left (5′ side on the coding strand) or right (3′ side) boundary uncertainty regions (BURs). Boundary uncertainty regions are the regions between SNPs where there is perfect sequence identity between the silent cassettes and the vlsE expression locus. The frequency of non-templated SNPs (mean ± SEM) is shown for each region. The unit of observation is the base, in order to account for the different sizes and locations of each region from read to read.