Automatic Analysis of Ion Mobility Spectrometry – Mass Spectrometry (IMS-MS) Data Hyejin Yoon School of Informatics Indiana University Bloomington December 5, 2008 Advisor: Dr. Haixu Tang
Outline 1. Introduction 2. Motivation 3. Workflow of IMS-MS Data Analysis 4. IMS-MS Analyzer 5. Results 6. Future Work 7. References 8. Acknowledgements
Mass Spectrometry (MS) Measures molecular mass (mass-to-charge ratio) of a sample Mass spectrum Tandem MS (MS/MS) Figure: Generic mass spectrometry (MS)-based proteomics experiment [Ruedi Aebersold et al.]
Application of MS Molecule identification/quantitation Accurate molecular weight Confirm the molecular formula Substitution of an amino acid or post-translational modification Structural and sequence information from MS/MS
Liquid Chromatography – Mass Spectrometry MS combined with Liquid Chromatography (LC): LC-MS, LC-MS/MS Advantages: Provides a steady stream of different samples; More precise; Higher confidence Limitations: Molecules at low abundance levels; Low depth of coverage for complex samples; Slow: liquid phase A schematic diagram of LC-MS [
Ion Mobility Spectrometry – Mass Spectrometry (IMS-MS) Ion mobility spectrometry (IMS) Fast: Gas phase Figure: drift tube schematic (gate, electric field E, buffer gas, detector) High-throughput proteomics platform based on ion-mobility time-of-flight mass spectrometry [Belov et al. ASMS]
Advantages of IMS-MS Distinguishes different ions having identical mass-to-charge ratios Separates out conformers Increases depth of coverage and confidence Used to measure cross-section Reduces noise Fast separation: Gas phase Figure: A schematic diagram of IMS-MS [Hoaglund CS, et al. 1998]
MS vs. IMS-MS
MS: Mass spectrum; 2-dimensional data: m/z, intensity; many tools to analyze LC-MS
IMS-MS: "Frame"; 3-dimensional data: drift time, m/z, intensity; 2D color map; rarely done so far, few analysis software tools
LC-IMS-MS: LC coupled to IMS-MS; 4-dimensional data: frame, drift time, m/z, intensity; multiple frames
Advantages: multiple measurements per LC peak; increased peak capacity; increased depth of coverage; reproducible, increased confidence
Motivation for Automatic IMS-MS Analysis Data analysis is challenging due to the multi-dimensional nature of the data Need for an automatic data analysis tool for studies using IMS-MS/LC-IMS-MS instruments Visualize IMS-MS and LC-IMS-MS data in m/z vs. drift time space and in mass vs. drift time space Feature/peak detection Deisotope isotopic distributions to get monoisotopic mass and charge state Identify IMS-MS peaks using two dimensions (mass/drift time) User-friendly
Workflow of IMS-MS Analysis Biological sample mixture -> IMS-MS / LC-IMS-MS System -> IMS-MS Data / LC-IMS-MS Data -> IMS-MS Analyzer: Visualization & Deisotoping Algorithm -> Monoisotope (peak) List(s); Visualization & Feature-finding Algorithm -> Feature List(s); Peak-picking Algorithm -> IMS-MS Peak List(s)
IMS-MS Analyzer: 2D Color Map and Deisotoping
2D Color Map and Zoom Input (drift scan, TOF bin, intensity) + calibration coefficients -> drift time, m/z, color code Plot drift time vs. m/z vs. intensity
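The conversion on this slide can be sketched roughly as follows. The slides do not give the calibration formula, so this assumes a common quadratic time-of-flight calibration, m/z = (slope * (t - intercept))^2, and a fixed drift-scan period. The `Calibration` fields, the function names, and the linear color scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Calibration:
    # Hypothetical calibration coefficients; none of these names come from the slides.
    bin_width_ns: float    # width of one TOF bin, in nanoseconds
    slope: float           # TOF calibration slope
    intercept_ns: float    # TOF calibration intercept, in nanoseconds
    scan_period_ms: float  # duration of one drift scan, in milliseconds


def tof_bin_to_mz(tof_bin, cal):
    """Assumed quadratic TOF calibration: m/z = (slope * (t - intercept))^2."""
    t = tof_bin * cal.bin_width_ns
    return (cal.slope * (t - cal.intercept_ns)) ** 2


def drift_scan_to_time(scan, cal):
    """Drift time in ms, assuming evenly spaced drift scans."""
    return scan * cal.scan_period_ms


def intensity_to_color(intensity, max_intensity, levels=256):
    """Map an intensity to a color index in 0..levels-1 on a linear scale."""
    if max_intensity <= 0:
        return 0
    frac = min(max(intensity / max_intensity, 0.0), 1.0)
    return min(int(frac * levels), levels - 1)


def to_map_points(records, cal):
    """Convert (drift scan, TOF bin, intensity) records to (drift time, m/z, color)."""
    max_i = max((r[2] for r in records), default=0)
    return [
        (drift_scan_to_time(s, cal), tof_bin_to_mz(b, cal), intensity_to_color(i, max_i))
        for s, b, i in records
    ]
```

Zooming then amounts to filtering the converted points to a drift time and m/z window before plotting.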
2D Color Map and Zoom
Single drift scan view
Single Drift Scan Processing Peak-picking on spectra Remove spectral noise Deisotoping Algorithm THRASH [Horn et al. 2000] algorithm Detect accurate monoisotopic mass and charge state
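As a rough illustration of what deisotoping produces, the sketch below infers a charge state from the spacing of adjacent isotopic peaks and converts the first peak to a neutral monoisotopic mass. This is not THRASH, which fits theoretical isotopic distributions to the spectrum. It assumes positive-mode protonated ions and that the lowest-m/z peak in the cluster is the monoisotopic one. The function names and tolerances are hypothetical.

```python
PROTON_MASS = 1.007276      # Da
ISOTOPE_SPACING = 1.003355  # Da, approximate 13C - 12C mass difference


def infer_charge(mz_peaks, max_charge=5, tol=0.01):
    """Guess the charge from the spacing between the first two isotopic peaks.

    mz_peaks must be sorted by m/z. Returns None if no charge matches.
    """
    if len(mz_peaks) < 2:
        return None
    spacing = mz_peaks[1] - mz_peaks[0]
    for z in range(1, max_charge + 1):
        # Isotopic peaks of a z-charged ion are about 1.003355 / z apart.
        if abs(spacing - ISOTOPE_SPACING / z) <= tol:
            return z
    return None


def monoisotopic_mass(mz, charge):
    """Neutral monoisotopic mass of a protonated ion: M = z * (m/z - m_proton)."""
    return charge * (mz - PROTON_MASS)
```

For example, peaks at m/z 500.0000 and 500.5017 are about 0.5017 apart, which corresponds to charge 2 and a neutral mass near 997.985 Da.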
THRASH on a frame THRASH the entire frame by running THRASH scan by scan Output: a peak list of monoisotopic masses observed across continuous drift times Results saved as a CSV file
IMS-MS Analyzer: THRASH 2D map and Feature Finding
THRASH 2D map 2D map of drift time vs. m/z -> THRASH frame -> 2D map of drift time vs. monoisotopic mass
Feature Finding Feature: a drift profile for a specific mass value Preliminary step to identify IMS-MS peaks Sliding-window approach: cluster monoisotopic ions located across continuous drift times Report the representative monoisotopic mass, drift-time value, maximum intensity, total intensity, charge, and drift-time range that correspond to a particular feature Feature profile view: manually visualize the Gaussian fit to the feature
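One way the clustering step might look is sketched below. The slides do not specify the window or tolerances, so this assumes a mass-sorted sliding window with an absolute mass tolerance, followed by a split wherever consecutive drift scans are further apart than `max_gap`. The `Ion` and `Feature` records and all parameter names are hypothetical.

```python
from collections import namedtuple

Ion = namedtuple("Ion", "scan mass intensity charge")
Feature = namedtuple("Feature", "mass scan max_intensity total_intensity charge scan_range")


def find_features(ions, mass_tol=0.02, max_gap=1, min_scans=2):
    """Cluster deisotoped ions of similar mass that span continuous drift scans."""
    features = []
    window = []
    # Slide over ions sorted by (charge, mass); close the window when the
    # charge changes or the next mass is more than mass_tol away.
    for ion in sorted(ions, key=lambda i: (i.charge, i.mass)) + [None]:
        if window and (
            ion is None
            or ion.charge != window[0].charge
            or ion.mass - window[-1].mass > mass_tol
        ):
            features.extend(_split_by_drift(window, max_gap, min_scans))
            window = []
        if ion is not None:
            window.append(ion)
    return features


def _split_by_drift(group, max_gap, min_scans):
    """Split one mass group into runs of (nearly) consecutive drift scans."""
    out, run = [], []
    for ion in sorted(group, key=lambda i: i.scan) + [None]:
        if run and (ion is None or ion.scan - run[-1].scan > max_gap):
            if len({i.scan for i in run}) >= min_scans:
                out.append(_summarize(run))
            run = []
        if ion is not None:
            run.append(ion)
    return out


def _summarize(run):
    """Report the apex ion's mass and scan plus the intensity totals and scan range."""
    apex = max(run, key=lambda i: i.intensity)
    total = sum(i.intensity for i in run)
    return Feature(apex.mass, apex.scan, apex.intensity, total, apex.charge,
                   (run[0].scan, run[-1].scan))
```

Here the representative mass and drift scan are taken from the most intense ion in the run. An intensity-weighted average would be an equally reasonable choice.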
Feature Finding
IMS-MS Analyzer: Peak-Picking
Peak-Picking Overlapping peaks: isomeric molecules or a conformational change in a molecule Apply Gaussian mixture models Use the Expectation-Maximization (EM) algorithm Use goodness-of-fit to find the best-fitting Gaussian mixture Choose the Gaussian means to represent IMS-MS peaks
Peak-picking Examples
Gaussian Mixture Models (GMMs)
There are $k$ Gaussian components
The $i$'th component: $w_i$
Mean of component $w_i$: $\mu_i$
Each component generates data from a Gaussian function with mean $\mu_i$ and variance $\sigma_i^2$
Each datapoint is generated according to the probability of component $i$: $P(w_i)\,N(\mu_i, \sigma_i^2)$
We need to find $\mu_1, \mu_2, \ldots, \mu_k$ which give maximum likelihood
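Written out, the model on this slide corresponds to the following mixture density and log-likelihood for a one-dimensional drift profile. The datapoints $x_1, \ldots, x_n$ and the count $n$ are notation introduced here and do not appear on the slide.

```latex
\begin{align*}
p(x) &= \sum_{i=1}^{k} P(w_i)\, N(x;\, \mu_i, \sigma_i^2),
\qquad
N(x;\, \mu_i, \sigma_i^2) = \frac{1}{\sqrt{2\pi\sigma_i^2}}
  \exp\!\left(-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right) \\
\log L &= \sum_{j=1}^{n} \log \sum_{i=1}^{k} P(w_i)\, N(x_j;\, \mu_i, \sigma_i^2)
\end{align*}
```

Maximum likelihood estimation looks for the parameters that make $\log L$ as large as possible. This has no closed-form solution for $k > 1$, which is why an iterative method such as EM is used.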
EM Algorithm Alternate between Expectation (E) step and Maximization (M) step E step computes an expectation of the likelihood by including the unobserved variables as if they were observed M step computes the maximum likelihood estimates of the parameters by maximizing the expected likelihood found on the E step Begin next round of the E step using the parameters found on the M step and repeat the process
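The alternation described above can be illustrated with a minimal EM loop for a one-dimensional Gaussian mixture. This is a sketch, not the analyzer's implementation. The initialization (evenly spaced means, shared variance, equal priors), the fixed iteration count, and the optional per-point weights (for example, intensities of a drift profile) are all assumptions.

```python
import math


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(xs, k, weights=None, iters=200, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture; returns (priors, means, variances)."""
    n = len(xs)
    w = list(weights) if weights is not None else [1.0] * n
    total = sum(w)
    lo, hi = min(xs), max(xs)
    # Assumed initialization: means spread evenly over the data range.
    means = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    grand_mean = sum(wj * x for wj, x in zip(w, xs)) / total
    var0 = max(sum(wj * (x - grand_mean) ** 2 for wj, x in zip(w, xs)) / total, min_var)
    variances = [var0] * k
    priors = [1.0 / k] * k
    for _ in range(iters):
        # E step: responsibility of each component for each datapoint.
        resp = []
        for x in xs:
            joint = [priors[i] * normal_pdf(x, means[i], variances[i]) for i in range(k)]
            s = sum(joint)
            resp.append([j / s if s > 0 else 1.0 / k for j in joint])
        # M step: re-estimate parameters from the weighted responsibilities.
        for i in range(k):
            mass = sum(w[j] * resp[j][i] for j in range(n))
            if mass <= 0:
                continue  # leave an empty component unchanged
            means[i] = sum(w[j] * resp[j][i] * xs[j] for j in range(n)) / mass
            variances[i] = max(
                sum(w[j] * resp[j][i] * (xs[j] - means[i]) ** 2 for j in range(n)) / mass,
                min_var,
            )
            priors[i] = mass / total
    return priors, means, variances
```

A production version would normally stop once the log-likelihood stops improving rather than after a fixed number of rounds.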
EM for GMMs On the t'th iteration let our estimates be E step M step
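The update equations for this slide are not reproduced in the text. The standard EM updates for a one-dimensional Gaussian mixture, written in the notation of the GMM slide, are presumably along these lines. The symbols $\lambda_t$, $x_j$, and $n$ are assumed notation, and the slide may have updated only the means.

```latex
\begin{align*}
\lambda_t &= \{\mu_i(t),\ \sigma_i^2(t),\ P_t(w_i)\}_{i=1}^{k} \\
\text{E step:}\quad
P(w_i \mid x_j, \lambda_t) &=
  \frac{P_t(w_i)\, N(x_j;\, \mu_i(t), \sigma_i^2(t))}
       {\sum_{l=1}^{k} P_t(w_l)\, N(x_j;\, \mu_l(t), \sigma_l^2(t))} \\
\text{M step:}\quad
\mu_i(t+1) &= \frac{\sum_{j=1}^{n} P(w_i \mid x_j, \lambda_t)\, x_j}
                   {\sum_{j=1}^{n} P(w_i \mid x_j, \lambda_t)} \\
\sigma_i^2(t+1) &= \frac{\sum_{j=1}^{n} P(w_i \mid x_j, \lambda_t)\, (x_j - \mu_i(t+1))^2}
                        {\sum_{j=1}^{n} P(w_i \mid x_j, \lambda_t)} \\
P_{t+1}(w_i) &= \frac{1}{n} \sum_{j=1}^{n} P(w_i \mid x_j, \lambda_t)
\end{align*}
```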
Goodness-of-Fit How well the model fits a set of observed data Discrepancy between observed values and the values expected under the model Based on goodness-of-fit, we determine the best-fitting Gaussian mixture within the user-specified maximum number of components
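The slides do not name the goodness-of-fit statistic. The sketch below uses the coefficient of determination (R squared) between the observed drift-profile intensities and the intensities expected under each candidate mixture. It prefers fewer components unless adding one improves the score by more than `min_gain`. The statistic, the selection rule, the uniform x spacing, and all names are assumptions.

```python
import math


def mixture_curve(xs, priors, means, variances, area=1.0):
    """Expected intensity at each x under a Gaussian mixture scaled to `area`."""
    def pdf(x, mu, var):
        return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return [
        area * sum(p * pdf(x, m, v) for p, m, v in zip(priors, means, variances))
        for x in xs
    ]


def r_squared(observed, expected):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, expected))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


def choose_best(xs, observed, candidates, max_components, min_gain=0.01):
    """Pick a component count from candidates = {k: (priors, means, variances)}.

    Assumes xs are evenly spaced so that the profile area is sum(observed) * dx.
    Returns (best_k, score).
    """
    area = sum(observed) * (xs[1] - xs[0])
    best_k, best_score = None, -math.inf
    for k in sorted(candidates):
        if k > max_components:
            continue
        score = r_squared(observed, mixture_curve(xs, *candidates[k], area=area))
        # Accept a larger model only if it improves the fit noticeably.
        if best_k is None or score > best_score + min_gain:
            best_k, best_score = k, score
    return best_k, best_score
```

The candidate parameters would come from running EM once per component count, from one up to the user-specified maximum.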
Peak-picking
Peak-picking Results
IMS-MS Analyzer: LC-IMS-MS Processing
Analyzing LC-IMS-MS data Data set of multiple frames 4D data Binary search algorithm to find the target frame Processing all frames automatically
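A minimal sketch of the frame lookup follows. It assumes the frames are indexed by a sorted list of frame numbers; the slides do not say what key the search uses. `find_frame` and `process_all_frames` are hypothetical names.

```python
def find_frame(frame_numbers, target):
    """Binary search for target in a sorted list; returns its index or -1."""
    lo, hi = 0, len(frame_numbers) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if frame_numbers[mid] == target:
            return mid
        if frame_numbers[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def process_all_frames(frame_numbers, process):
    """Apply a per-frame function (e.g. deisotoping) to every frame in order."""
    return {f: process(f) for f in frame_numbers}
```

Each lookup takes on the order of log2(N) comparisons for N frames, which matters when a dataset holds many frames and the user jumps between them interactively.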
2D Map of LC-IMS-MS
THRASH/peak-picking of LC-IMS-MS
Results IMS-MS sample (Cellobiose) LC-IMS-MS sample (Human Plasma) # of Deisotoped ions 5370~266 per frame # of IMS-MS peaks 350~18 per frame
Future Work Biological sample; LC-IMS-MS Systems; LC-IMS-MS dataset; IMS-MS/MS dataset; Feature Detector; Deisotoping; Peak Picking; Drift Profile Aligner; Precursor Feature/Peak List; Fragment Feature/Peak List; Precursor Peak List; Fragment Peak List; MS/MS Spectra + Precursor information; Downstream Computational Analysis: protein identification, protein quantitation, biological pathway reconstruction
References
Aebersold R, Mann M, Mass spectrometry-based proteomics, Nature Mar 13;422(6928):
Guerrera IC, Kleiner O, Application of mass spectrometry in proteomics, Biosci Rep Feb-Apr;25(1-2):
Clemmer DE, Jarrold MF, Ion mobility measurements and their applications to clusters and biomolecules, J Mass Spectrom. 1997;32:
Hoaglund CS, Valentine SJ, Sporleder CR, Reilly JP, Clemmer DE, Three-dimensional ion mobility/TOFMS analysis of electrosprayed biomolecules, Anal Chem Jun 1;70(11):
Baker ES, Clowers BH, Li F, Tang K, Tolmachev AV, Prior DC, Belov ME, Smith RD, Ion Mobility Spectrometry–Mass Spectrometry Performance Using Electrodynamic Ion Funnels and Elevated Drift Gas Pressures, J Am Soc Mass Spectrom Jul;18(7):
Horn DM, Zubarev RA, McLafferty FW, Automated reduction and interpretation of high resolution electrospray mass spectra of large molecules, J Am Soc Mass Spectrom Apr;11(4):
Acknowledgements Prof. Haixu Tang, School of Informatics Lab-mates Anoop Mayampurath, Mina Rho, Jun Ma, Yong Li, Paul Yu, Chao Ji, Indrani Sarkar Chemistry Department Stephen Valentine Manny Plasenci Ruwan Thushara Kurulugama Prof. David E. Clemmer Faculty and staff, School of Informatics