Universal microbial diagnostics using random DNA probes

Universal microbial diagnostics using random DNA probes, by Amirali Aghazadeh, Adam Y. Lin, Mona A. Sheikh, Allen L. Chen, Lisa M. Atkins, Coreen L. Johnson, Joseph F. Petrosino, Rebekah A. Drezek, and Richard G. Baraniuk. Science Advances, Volume 2(9):e1600025, September 28, 2016. Copyright © 2016, The Authors

Fig. 1 Schematic of UMD platform. (A) Genomic DNA is extracted from a bacterial sample and thermal-cycled with M random DNA probes. The genome-probe binding is quantified, producing a probe-binding vector y; in this study, the random probes are in the form of molecular beacons (MBs), and the DNA-probe binding is quantified by the ratio of open/hybridized to closed/nonhybridized MBs. (B) The hybridization binding level of each probe to a potentially large reference database of N bacterial genomes (B1, B2, …, BN) is predicted using a thermodynamic model and stored in an M × N hybridization affinity matrix Φ. NCBI, National Center for Biotechnology Information. (C) Assuming that K bacterial species comprise the sample, the probe-binding vector y is a sparse linear combination of the corresponding K columns of the matrix Φ weighted by the bacterial concentrations x, that is, y = Φx + n, where the vector n accounts for noise and modeling errors. When K is small enough and M is large enough, Φ can be effectively inverted using techniques from compressive sensing, yielding an estimate of x, the microbial makeup of the sample; in this illustration, K = 2 bacteria, labeled B2 and B7, are present in the sample. Amirali Aghazadeh et al. Sci Adv 2016;2:e1600025 Copyright © 2016, The Authors
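The model y = Φx + n in panel (C) can be illustrated with a small numerical sketch. This is not the authors' solver: it replaces the compressive-sensing recovery with an exhaustive least-squares search over all size-K supports, which is only practical for small N and K. The matrix entries, concentrations, and noise level are made-up values, and the function name `recover_sparse` is hypothetical.

```python
from itertools import combinations

import numpy as np


def recover_sparse(phi, y, k):
    """Return (support, x_hat): the k columns of phi that best explain y.

    Exhaustive least-squares search over all size-k supports. A simple
    stand-in for a compressive-sensing solver, not the paper's method.
    """
    n_genomes = phi.shape[1]
    best_support, best_coef, best_resid = None, None, np.inf
    for support in combinations(range(n_genomes), k):
        cols = phi[:, list(support)]
        coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
        resid = np.linalg.norm(y - cols @ coef)
        if resid < best_resid:
            best_support, best_coef, best_resid = support, coef, resid
    x_hat = np.zeros(n_genomes)
    x_hat[list(best_support)] = best_coef
    return best_support, x_hat


# Assumed toy setup: M = 5 probes, N = 8 reference genomes, K = 2 present.
rng = np.random.default_rng(0)
n_probes, n_genomes, k = 5, 8, 2
phi = rng.random((n_probes, n_genomes))        # made-up affinity matrix
x_true = np.zeros(n_genomes)
x_true[[1, 6]] = [0.7, 0.3]                    # B2 and B7 (zero-based 1 and 6)
noise = 1e-4 * rng.standard_normal(n_probes)   # small additive noise n
y = phi @ x_true + noise                       # probe-binding vector

support, x_hat = recover_sparse(phi, y, k)
print("recovered support:", support)
print("estimated concentrations:", np.round(x_hat, 3))
```

With only five measurements and eight candidate genomes the system is underdetermined, so recovery relies on the sparsity assumption (K much smaller than N), which is the point of the compressive-sensing formulation.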

Fig. 2 Random probe design and hybridization affinity computation process. (A) DNA sequence structure of five test random DNA probes. (B) Both strands of the bacterial genome (blue lines) are first thermodynamically aligned with the probe sequence. The bacterial sequence is segmented into fragments of roughly equal length (~100 to 200 nt), each having a significant hybridization affinity with the probe. Then, all of the bacterial fragments and probe sequences, along with the experimental conditions, are fed into the DNA software (18) to predict all stable probe-bacteria complexes and their concentrations. These concentrations, in aggregate, determine the concentration of opened MBs, which is defined as the hybridization affinity of the probes to the bacterial genome. (C) Example of a predicted probe-bacteria fragment binding with many base pair mismatches.

Fig. 3 Binding patterns of five random probes correctly identify the bacteria present in nine diverse bacterial samples. (A) Experimentally measured FRET ratios quantifying hybridization between bacterial DNA and probes 1 to 5. (B) Hybridization affinity between DNA samples and probes, converted from the FRET ratio through the probe characteristic curve fit equations (table S1 and fig. S2). (C) Heat map of normalized inner products between the experimentally obtained hybridization affinities and the hybridization affinities predicted by the thermodynamic model for nine DNA samples, as a measure of the similarity of the probe measurements to the bacteria in the data set. DNA samples are clustered into three groups: (i) exact sequence known, (ii) exact sequence unknown, and (iii) clinical isolates (whose exact sequence is unknown). UMD correctly recovers the diagonally highlighted bacterium (with inner product >0.9). (D) The average ROC curve of UMD in detecting nine bacteria, assuming the independence of the different experiments. Each point on the curve corresponds to a threshold value in [−1, 1]. UMD achieves high values of the AUC (AUC > 0.9). (E) Correlation of measured and simulated hybridization affinities and the NRMSE of the prediction (straight line corresponds to maximum correlation). All experiments were performed in triplicate, and the results shown here are averaged over the trials, with the error bars representing SEM.
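The similarity measure in panel (C) can be sketched as follows. The caption does not spell out the normalization, so this example assumes a plain cosine similarity (inner product divided by the product of the vector norms). The affinity values and genome names are invented for illustration.

```python
import math


def normalized_inner_product(u, v):
    """Cosine similarity: inner product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical predicted affinities of 5 probes to 3 reference genomes.
predicted = {
    "genome_a": [0.9, 0.1, 0.4, 0.2, 0.7],
    "genome_b": [0.2, 0.8, 0.1, 0.6, 0.3],
    "genome_c": [0.5, 0.5, 0.9, 0.1, 0.2],
}
# Hypothetical measured affinities for one sample (close to genome_b).
measured = [0.21, 0.79, 0.12, 0.58, 0.33]

similarity = {
    name: normalized_inner_product(measured, column)
    for name, column in predicted.items()
}
best_match = max(similarity, key=similarity.get)

for name, value in similarity.items():
    print(f"{name}: {value:.3f}")
print("best match:", best_match)
```

Computing this score for every sample against every reference genome gives one row of the heat map per sample; a correct identification shows up as the largest entry on the diagonal.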

Fig. 4 Performance of UMD platform in genus-level recovery of 40 species listed as the most common human infectious genera by CDC, with different numbers of random probes M and noise variance σ. (A) The ROC curve for detecting a single bacterium (K = 1) at different noise levels. σ0 = 2.4 × 10⁻⁸ M denotes the variance of the additive white Gaussian noise used in the simulation. This value is obtained from the experiments in Fig. 3 by calculating the propagated variance of the measured FRET ratios. UMD performs more accurately with lower noise variance. The detection is almost perfect (AUC > 0.95) under noise variance σ = σ0/5. (B) The average ROC curve for detecting a single bacterium using different numbers of random probes M and fixed noise variance σ = σ0. Detection performance improves across all 40 species as the number of random probes increases. With 15 random probes, UMD achieves almost perfect detection performance (AUC > 0.95). (C) The percentage of simulated trials in which the K bacteria present in the samples were recovered correctly with zero false positives, among all possible bacterial mixtures (blue and red curves correspond to K = 2 and K = 3 bacteria, respectively). Simulations were repeated 1000 times with randomly selected MBs, and error bars represent SD. (D and E) Confusion matrices illustrating the detection results of UMD using M = 3 and M = 10 probes selected by the GPS algorithm.
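The ROC curves and AUC values quoted in Figs. 3 and 4 come from sweeping a detection threshold over similarity scores. A minimal sketch of that computation is given below, using the trapezoidal rule for the area. The scores and labels are invented, and the helper names `roc_points` and `auc` are hypothetical rather than taken from the paper.

```python
def roc_points(scores, labels):
    """Sweep a decision threshold over the scores; return (FPR, TPR) points."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A sample is called "present" when its score reaches the threshold.
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up similarity scores in [-1, 1]; label 1 = bacterium truly present.
scores = [0.95, 0.90, 0.60, 0.70, 0.20, 0.10, -0.30]
labels = [1, 1, 1, 0, 0, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print("ROC points:", curve)
print(f"AUC = {area:.4f}")
```

In this toy data one absent bacterium scores higher than one present bacterium, so 11 of the 12 present/absent pairs are ranked correctly and the AUC is 11/12.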