Automated Design of Synthetic Cell Classifier Circuits Using a Two-Step Optimization Strategy
Pejman Mohammadi, Niko Beerenwinkel, Yaakov Benenson
Cell Systems, Volume 4, Issue 2, Pages 207-218.e14 (February 2017). DOI: 10.1016/j.cels.2017.01.003. Copyright © 2017 Elsevier Inc.
Cell Systems 2017 4, 207-218.e14. DOI: 10.1016/j.cels.2017.01.003
Figure 1 Cell Type Classification with Synthetic Gene Circuits
(A) Schematic representation of classifier circuit operation. The circuit senses three endogenous miRNA species, designated by blue, red, and green colors. When delivered into a mixed cell population, the circuit triggers apoptosis (dotted red line) in cells with a high concentration of red or blue miRNA and a low concentration of green miRNA. (B) A classifier gene circuit (left) and the biochemical parameters used in the model. The corresponding logic circuit is on the right. (C) Schematic representation of the logic of a classifier circuit. The general circuit is shown using American National Standards Institute (ANSI) symbols for logic gates. A circuit may include disjunctions (OR gates) with up to three miRNA inputs and a number of standalone NOT gates, all of which feed into a conjunction (AND) gate, with the total number of inputs not exceeding ten.
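The logic in panel (C) can be illustrated with a minimal Python sketch. The function name `classify`, the data layout, and the miRNA labels are hypothetical, and the ten-input limit is read here as a cap on the total number of miRNA inputs, which is one interpretation of the caption rather than a statement of the authors' implementation.

```python
from typing import Dict, List

MAX_OR_INPUTS = 3      # caption: OR gates take up to three miRNA inputs
MAX_TOTAL_INPUTS = 10  # caption: total inputs do not exceed ten (assumed: miRNA inputs)


def classify(or_gates: List[List[str]], not_inputs: List[str],
             high: Dict[str, bool]) -> bool:
    """Return True (circuit on) if every OR gate sees a high input
    and every NOT input is low; the results are combined by one AND gate."""
    if any(len(gate) == 0 or len(gate) > MAX_OR_INPUTS for gate in or_gates):
        raise ValueError("each OR gate needs one to three inputs")
    total = sum(len(gate) for gate in or_gates) + len(not_inputs)
    if total > MAX_TOTAL_INPUTS:
        raise ValueError("circuit exceeds ten inputs")
    or_ok = all(any(high[m] for m in gate) for gate in or_gates)
    not_ok = all(not high[m] for m in not_inputs)
    return or_ok and not_ok


# Panel (A) example: (red OR blue) AND NOT green
circuit_or = [["red", "blue"]]
circuit_not = ["green"]
print(classify(circuit_or, circuit_not,
               {"red": True, "blue": False, "green": False}))  # True
print(classify(circuit_or, circuit_not,
               {"red": True, "blue": True, "green": True}))    # False
```

A YES gate is the special case of an OR gate with a single input, so the same representation covers the simpler circuits shown in later figures.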
Figure 2 Evaluating Classifier Circuit Performance Using Expected On-Off Ratio
(A) Schematics of calculating the expected On-Off ratio (YES gate, top; AND gate, bottom). The plots on the left show the probability distributions for input gene expression (G1 and G2). The middle plots show the logic output I next to the continuous model output, C(θ), calculated with parameters θ. The violin plots on the right show the output probability distributions for the samples that fall into the on and off input domains, respectively. (B) Distribution of parameter values in circuits with the top 10% of On-Off ratios, shown for a NOT gate (top) and a YES gate (bottom). Parameter correlations are displayed on the right.
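The idea in panel (A) can be sketched as a Monte Carlo estimate. Everything specific here is an assumption made for illustration: the Hill-type response stands in for the continuous model output C(θ), the input distribution is taken to be log-normal, and the On-Off ratio is taken to be the mean output over on-domain samples divided by the mean output over off-domain samples. The paper's actual model and definition may differ.

```python
import random
from typing import List


def yes_gate_output(x: float, k: float, n: float) -> float:
    """Hypothetical continuous YES-gate response (Hill function), standing in for C(theta)."""
    return x**n / (k**n + x**n)


def on_off_ratio(samples: List[float], threshold: float, k: float, n: float) -> float:
    """Mean continuous output over on-domain samples divided by
    mean continuous output over off-domain samples (assumed definition)."""
    on = [yes_gate_output(x, k, n) for x in samples if x >= threshold]
    off = [yes_gate_output(x, k, n) for x in samples if x < threshold]
    if not on or not off:
        raise ValueError("need samples in both the on and the off domain")
    return (sum(on) / len(on)) / (sum(off) / len(off))


random.seed(0)
# Assumed log-normal distribution of input expression
samples = [random.lognormvariate(0.0, 1.5) for _ in range(10_000)]

shallow = on_off_ratio(samples, threshold=1.0, k=1.0, n=1.0)
steep = on_off_ratio(samples, threshold=1.0, k=1.0, n=4.0)
print(f"shallow response: {shallow:.2f}, steep response: {steep:.2f}")
```

Under these assumptions a steeper response separates the on and off output distributions more strongly and therefore yields a larger ratio, which is the kind of comparison a parameter search over θ would rank.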
Figure 3 Classifier Gene Circuits before and after Parameter Optimization
(A) Left to right, input-output relation for a logical YES gate, a continuous YES gate with best-guess parameters (Table S1), and a continuous YES gate with parameters optimized for the highest On-Off ratio. (B) Plots similar to (A) for a two-input AND circuit. (C and D) Input-output relation for YES (C) and two-input AND (D) circuits calculated using three parameter sets optimized for binarization thresholds (t) of 0.2% (θ1), 1% (θ2), and 5% (θ3) of the total cellular miRNA pool (Table S1).
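Binarization against a threshold expressed as a fraction of the total miRNA pool, as in panels (C and D), can be sketched as follows. The function name `binarize`, the miRNA labels, and the expression values are hypothetical; only the three threshold values come from the caption.

```python
from typing import Dict


def binarize(profile: Dict[str, float], t: float) -> Dict[str, bool]:
    """Mark a miRNA as high when its share of the total pool is at least t."""
    total = sum(profile.values())
    return {name: level / total >= t for name, level in profile.items()}


# Hypothetical expression profile (arbitrary units, total pool = 1000)
profile = {"miR-A": 900.0, "miR-B": 80.0, "miR-C": 15.0, "miR-D": 4.0, "miR-E": 1.0}

for label, t in [("theta1", 0.002), ("theta2", 0.01), ("theta3", 0.05)]:
    high = [name for name, is_high in binarize(profile, t).items() if is_high]
    print(label, t, high)
```

Raising the threshold shrinks the set of miRNAs treated as high, which is why each parameter set is tied to one specific threshold.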
Figure 4 Evaluating Circuit Performance on Annotated Data
(A) Performance evaluation of two example classifier circuits over a given input dataset. The dataset (left) includes three positive and seven negative samples. Circuit outputs and the annotation are used to calculate the area under the receiver operating characteristic curve (AUC). TPR and FPR refer to the true positive rate and the false positive rate, respectively. Average margin (m̄) and worst margin (w) are shown. (B) Propagation of the binary margins along the circuit. A set of inputs, the output, and their respective margins are shown for a four-input example circuit. The output value, on/off, is calculated using Boolean logic. The margin at each gate is the smallest fold change that can switch its output value. The critical margin values propagated to the next part of the circuit are highlighted in bold red font.
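Both quantities in this figure can be sketched in Python. The AUC is computed with the standard pairwise-ranking formula. The margin propagation rules are one reading of "the smallest fold change that can switch its output value", assuming each input may be perturbed independently by at most that fold change; the function names, the `Signal` type, and all numeric values are hypothetical.

```python
from typing import List, Tuple

Signal = Tuple[bool, float]  # (Boolean state, margin as a fold change >= 1)


def auc(pos: List[float], neg: List[float]) -> float:
    """Fraction of positive/negative pairs ranked correctly (ties count one half)."""
    score = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return score / (len(pos) * len(neg))


def sense(level: float, threshold: float) -> Signal:
    """Binarize a positive input level and record its fold distance from the threshold."""
    high = level >= threshold
    return high, (level / threshold if high else threshold / level)


def not_gate(x: Signal) -> Signal:
    return (not x[0], x[1])  # inversion flips the state and keeps the margin


def or_gate(xs: List[Signal]) -> Signal:
    if any(s for s, _ in xs):
        # To switch off, every high input must drop: the largest such margin is critical.
        return True, max(m for s, m in xs if s)
    # To switch on, a single input must rise: the smallest margin is critical.
    return False, min(m for _, m in xs)


def and_gate(xs: List[Signal]) -> Signal:
    if all(s for s, _ in xs):
        # To switch off, a single input must drop: the smallest margin is critical.
        return True, min(m for _, m in xs)
    # To switch on, every low input must rise: the largest such margin is critical.
    return False, max(m for s, m in xs if not s)


# Three positive and seven negative samples, as in panel (A); outputs are made up.
pos = [0.9, 0.8, 0.4]
neg = [0.5, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
print(round(auc(pos, neg), 3))  # 0.952 (20 of 21 pairs ranked correctly)

# Four-input circuit: (a OR b) AND NOT c AND NOT d, all thresholds set to 1.0
a, b = sense(4.0, 1.0), sense(0.5, 1.0)
c, d = sense(0.25, 1.0), sense(0.5, 1.0)
out = and_gate([or_gate([a, b]), not_gate(c), not_gate(d)])
print(out)  # (True, 2.0)
```

In this example the output margin of 2.0 is inherited from input d, the input closest to its threshold on the critical path.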
Figure 5 Synthetic Data Experiment
(A) The simulation workflow. (B–D) Upper and lower panels correspond to the specific and general search methods, respectively. (B) Overall fraction of perfect classifiers identified using the search methods. (C) Probability of missing a perfect solution as a function of the employed convergence threshold. (D) Correlation between classification margins of circuits used to produce data (x axis) and the margin of learned circuits (y axis). The red line is the linear regression fit.
Figure 6 Summary of Classifier Circuits Found Using the Specific Search Method on 18 Example Problems from Three Public Datasets
For the breast cancer dataset and the human miRNA atlas data, all healthy samples in the data are used as negative samples. For the mouse lymphopoiesis data, the designated cell types are contrasted with all the remaining cell types in the dataset. (A) Circuit output levels. Each triangle corresponds to one sample, with red upward triangles corresponding to the positive and blue downward triangles to the negative samples. (B–D) Other panels show the area under the receiver operating characteristic (ROC) curve (AUC) (B), the average (C), and the worst (D) classification margin, respectively, for each of the classification tasks. The black bars show the values from the best learnt classifier (training performance), while the white boxes are the estimated values from 3-fold cross-validation (generalization performance). The error bars show the SEM estimate. Each classification was repeated for three different parameter sets θ1, θ2, and θ3 (Table S1), and the parameter set giving the best performance is reported in parentheses. (E) Performance of the identified classifier circuits as a function of implementation accuracy. Performance was estimated using Monte Carlo simulation. Each point is the average over 1,000 simulations of each circuit with different sets of biochemical parameters. All parameters are perturbed simultaneously, and the x axis is the expected deviation of the implemented circuit parameters from the intended optimized parameter values. (F and G) Classification accuracy as a function of cell-to-cell variation in the input miRNA expression (F) and in both miRNAs and the circuit parameters (G). Each point is estimated from 10,000 simulated single cells. The columns “Learning” and “Cross val.” correspond to the black and white bars in (B), and σ/μ is the coefficient of variation (see also Figures S5 and S6). Colors in the ROC AUC panels represent the average margin using the original parameters (i.e., the black bars in C).
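The single-cell simulation behind panel (F) can be illustrated with a small sketch. It assumes log-normal cell-to-cell variation around a bulk mean and a single thresholded YES gate, neither of which is confirmed by the caption; the function name and all numeric values are hypothetical. The only fixed relation used is the standard conversion from a coefficient of variation σ/μ to log-normal parameters.

```python
import math
import random


def single_cell_accuracy(mean: float, cv: float, threshold: float,
                         n_cells: int = 10_000, seed: int = 0) -> float:
    """Fraction of simulated cells whose binarized input matches the bulk call.

    Cells are drawn from a log-normal distribution (an assumption) with the
    given mean and coefficient of variation cv = sigma/mu.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1.0 + cv**2))  # log-space standard deviation
    mu = math.log(mean) - sigma**2 / 2.0       # log-space mean preserving `mean`
    bulk_state = mean >= threshold
    agree = sum((rng.lognormvariate(mu, sigma) >= threshold) == bulk_state
                for _ in range(n_cells))
    return agree / n_cells


# A miRNA expressed at twice the threshold, at increasing cell-to-cell variation
for cv in (0.1, 0.5, 1.0):
    print(cv, single_cell_accuracy(mean=2.0, cv=cv, threshold=1.0))
```

Accuracy falls as σ/μ grows because more individual cells land on the wrong side of the threshold even though the bulk measurement does not change.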
Figure 7 Reducing the Size of Learnt miRNA Classifier Circuits with Bootstrap-Based Pruning
(A) Examples of learnt networks for all breast cancer samples (upper panel) and breast cancer cell lines (lower panel). (B) The same examples after pruning. (C) The left panel shows the number of genes in the best learnt circuits and in the corresponding pruned circuits for 18 real data experiments. The next two panels show the area under the ROC curve and the average classification margin for the best and pruned classifiers, estimated from the learning data (L) and cross-validation (CV). (D) Comparison of the performance of circuits learnt using the specific and the general method in cross-validation samples.