Overview Purpose: Accurately estimate peptide retention based on spectrum library data utilizing commonly observed peptides in place of synthetic standards.

Slides:



Advertisements
Similar presentations
Genomes and Proteomes genome: complete set of genetic information in organism gene sequence contains recipe for making proteins (genotype) proteome: complete.
Advertisements

Original Figures for "Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring"
Protein Quantitation II: Multiple Reaction Monitoring
Conclusion The workflow presented provides a strategy to incorporate unbiased glycopeptide identification to generate an initial list of targets for data.
Predictor of Customer Perceived Software Quality By Haroon Malik.
Targeted Quantification of O-Linked Glycosylation Site for Glycan Distribution Analysis Scott M. Peterman 1, Amol Prakash 1, Bryan Krastins 1, Mary Lopez.
ACCELERATING CLINICAL AND TRANSLATIONAL RESEARCH Metabolomics/Proteomics and Genomics at IUB Indiana CTSI – Purdue Retreat Monday,
MN-B-C 2 Analysis of High Dimensional (-omics) Data Kay Hofmann – Protein Evolution Group Week 5: Proteomics.
Data Processing Algorithms for Analysis of High Resolution MSMS Spectra of Peptides with Complex Patterns of Posttranslational Modifications Shenheng Guan.
Initial testing of longwave parameterizations for broken water cloud fields - accounting for transmission Ezra E. Takara and Robert G. Ellingson Department.
Evaluating Hypotheses
20-30% of a trypsinised proteome are constituted of peptides with Mw≥3000 (TReP) Identification of large peptides by shotgun MS is not efficient Isolation.
Previous Lecture: Regression and Correlation
FIGURE 5. Plot of peptide charge state ratios. Quality Control Concept Figure 6 shows a concept for the implementation of quality control as system suitability.
Improving Throughput for Highly Multiplexed Targeted Quantification Methods Using Novel API-Remote Instrument Control and State-Model Data Acquisition.
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Random Sampling  In the real world, most R.V.’s for practical applications are continuous, and have no generalized formula for f X (x) and F X (x). 
Proteomics Understanding Proteins in the Postgenomic Era.
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS Gygi et al (2003) PNAS 100(12), presented by Jessica.
Proteomics Informatics Workshop Part III: Protein Quantitation
Fa 05CSE182 CSE182-L9 Mass Spectrometry Quantitation and other applications.
Statistical Process Control
Conclusion  Comprehensive workflow identified approximately 70% more high confident peptide as compare to general search strategy.  The comprehensive.
Comparison of chicken light and dark meat using LC MALDI-TOF mass spectrometry as a model system for biomarker discovery WP 651 Jie Du; Stephen J. Hattan.
Proteomics and Biomarker Discovery Discovery to Targets for a Phosphoproteomic Signature Assay: One-stop shopping in Skyline Jake Jaffe Skyline Users Meeting.
Production of polypeptides, Da, and middle-down analysis by LC-MSMS Catherine Fenselau 1, Joseph Cannon 1, Nathan Edwards 2, Karen Lohnes 1,
Results Intelligent Data Acquisition Initial discovery experiments were performed to help drive targeted quantitative experiments (Figures 1, 2). The discovery.
2007 GeneSpring MS GeneSpring for Metabolite BioMarker Analysis using Mass Spectrometry data Agilent Q-TOF VIP Visit Jan 16-17, 2007 Santa Clara, CA Thon.
Introduction The GPM project (The Global Proteome Machine Organization) Salvador Martínez de Bartolomé Bioinformatics support –
Computer Science 101 Modeling and Simulation. Scientific Method Observe behavior of a system and formulate an hypothesis to explain it Design and carry.
An introduction and possible applications Ariane Kahnt
Proteomics and Biomarker Discovery “Research-grade” Targeted Proteomics Assay Development: PRMs for PTM Studies with Skyline or, “How I learned to ditch.
Bringing Metrology to Clinical Proteomic Research David Bunk Chemical Science and Technology Laboratory National Institute of Standards and Technology.
Acknowledgements This work is supported by NSF award DBI , and National Center for Glycomics and Glycoproteomics, funded by NIH/NCRR grant 5P41RR
Common parameters At the beginning one need to set up the parameters.
Improving Peptide Searching Workflow to Maximize Protein Identifications Shadab Ahmad 1, Amol Prakash 1, David Sarracino 1, Bryan Krastins 1, MingMing.
Integrated Targeted Quantitative Method for Insulin and its Therapeutic Analogs Eric Niederkofler 1, Dobrin Nedelkov 1, Urban Kiernan 1, David Phillips.
Analysis of Complex Proteomic Datasets Using Scaffold Free Scaffold Viewer can be downloaded at:
A Comprehensive Comparison of the de novo Sequencing Accuracies of PEAKS, BioAnalyst and PLGS Bin Ma 1 ; Amanda Doherty-Kirby 1 ; Aaron Booy 2 ; Bob Olafson.
A Phospho-Peptide Spectrum Library for Improved Targeted Assays Barbara Frewen 1, Scott Peterman 1, John Sinclair 2, Claus Jorgensen 2, Amol Prakash 1,
A new "Molecular Scanner" design for interfacing gel electrophoresis with MALDI-TOF ThP Stephen J. Hattan; Kenneth C. Parker; Marvin L. Vestal SimulTof.
CS 461b/661b: Bioinformatics Tools and Applications Software Algorithm Mathematical Models Biology Experiments and Data.
We obtained breast cancer tissues from the Breast Cancer Biospecimen Repository of Fred Hutchinson Cancer Research Center. We performed two rounds of next-gen.
Genome of the week - Enterococcus faecalis E. faecalis - urinary tract infections, bacteremia, endocarditis. Organism sequenced is vancomycin resistant.
Software Project MassAnalyst Roeland Luitwieler Marnix Kammer April 24, 2006.
Epigenetic Processes from a Molecular Perspective INBRE Meeting 2/16/10.
Inference: Probabilities and Distributions Feb , 2012.
Multiple flavors of mass analyzers Single MS (peptide fingerprinting): Identifies m/z of peptide only Peptide id’d by comparison to database, of predicted.
Overview of Mass Spectrometry
EBI is an Outstation of the European Molecular Biology Laboratory. In silico analysis of accurate proteomics, complemented by selective isolation of peptides.
© Copyright 2009 by the American Association for Clinical Chemistry Plasma Renin Activity by Liquid Chromatography– Tandem Mass Spectrometry (LC-MS/MS):
Isotope Labeled Internal Standards in Skyline
Proteomics Informatics (BMSC-GA 4437) Instructor David Fenyö Contact information
An unsupervised conditional random fields approach for clustering gene expression time series Chang-Tsun Li, Yinyin Yuan and Roland Wilson Bioinformatics,
Proteomics Informatics (BMSC-GA 4437) Course Directors David Fenyö Kelly Ruggles Beatrix Ueberheide Contact information
DIA Method Design, Data Acquisition, and Assessment
Protein quantitation I: Overview (Week 5). Fractionation Digestion LC-MS Lysis MS Sample i Protein j Peptide k Proteomic Bioinformatics – Quantitation.
Using Scaffold OHRI Proteomics Core Facility. This presentation is intended for Core Facility internal training purposes only.
Conjoint Analysis. 1. Managers frequently want to know what utility a particular product feature or service feature will have for a consumer. 2. Conjoint.
Data independent acquisition methods for metabolomics Stephen Tate, Ron Bonner AB SCIEX, 71 Four Valley Drive, Concord, ON, L4K 4V8 Canada A high resolution.
Date of download: 6/24/2016 Copyright © The American College of Cardiology. All rights reserved. From: Proteomic Strategies in the Search of New Biomarkers.
Table 1. Quality Parameters Being Considered for Evaluation
V. Protein Chips 1. What is Protein Chips 2. How to Make Protein Chips
Proteomics Informatics David Fenyő
NoDupe algorithm to detect and group similar mass spectra.
Is Proteomics the New Genomics?
The Coming Age of Complete, Accurate, and Ubiquitous Proteomes
Proteomics Informatics David Fenyő
Network-Based Coverage of Mutational Profiles Reveals Cancer Genes
Presentation transcript:

Overview Purpose: Accurately estimate peptide retention based on spectrum library data utilizing commonly observed peptides in place of synthetic standards. Methods: We consolidate many months’ worth of LC-MS/MS data into a library of MSMS spectra. Our automated analysis selects endogenous peptides to act as standards which are used to predict retention times of any peptide in the library. Results: Seventeen peptides were identified as appropriate endogenous standards. Relative retention time information stored in the library allowed us to predict the retention times of 1750 peptides more accurately than predictions based on hydrophobicity. Introduction Spectrum libraries are an invaluable starting point for developing targeted assays (e.g. SRM, PRM) because they provide information about fragmentation patterns and retention times. When library data are collected under a variety of LC conditions, the use of synthetic peptide standards can greatly improve the ability to accurately predict retention time in new experiments. Unfortunately, any samples not including those peptide standards cannot be used in the predictions. We present a method for selecting peptides endogenous to a sample to act as standards and demonstrate their use for predicting retention times of other peptides including those with chemical modifications, which indicate portability to both unmodified and post-translationally modified peptides. Methods Sample Preparation Activity-based protein profiling (ABPP) was performed on various human lung cancer cells and five pairs of tumor and adjacent control human tissue samples. Thermo Scientific™ Pierce™ ActivX ™ desthiobiotin ATP probes were used to interact with ATP utilizing enzymes and lysine close to the active sites were labeled with desthiobiotin. Liquid Chromatography and Mass Spectrometry Trypsin-digested samples were run on one of three gradients (2 hr on HPLC, 2 hr on UPLC, 4 hr on UPLC). The validation experiment used a 4 hr gradient on UPLC. Spectra were acquired on a Thermo Scientific™ LTQ Orbitrap™ MS using data-dependent acquisition. Data Analysis Peptide identification was done in Thermo Scientific™ Proteome Discoverer™ (PD) software. The spectrum library was built using the Crystal node for PD version 1.4. A custom script was written to analyze the library entries and find appropriate endogenous peptides to use as standards. Spectrum Library Retention Time Prediction Based on Endogenous Peptide Standards Barbara Frewen 1, Bin Fang 2, Scott Peterman 1, John Koomen 2, Jiannong Li 2, Eric Haura 2, Bryan Krastins 1, David Sarracino 1, Mary Lopez 1, Amol Prakash 1 1 Thermo Scientific BRIMS, Cambridge, MA, 2 Moffitt Cancer Center, Tampa, FL FIGURE 1. Frequency of peptide observation. The library collected spectra from 250 DDA runs. Peptides were observed with varying frequency, between 1 and 233 runs. We focused on the 50 most frequently seen peptides (circled in inset). Results Peptide Frequency in the Spectrum Library Assembly of the Crystal spectrum library collected the retention time information into one resource. The library contained 220,542 spectra from 250 LCMS runs including 9,109 peptide sequences (12,063 total with modified forms). As these samples did not contain a synthetic peptide standard, we first sought appropriate endogenous peptides. The best candidates for peptides to act as retention time landmarks are those most commonly seen from run to run. We looked at the frequency of peptides in the 250 runs used to build the library. No peptides were observed in every run, the most commonly seen peptide having 233 appearances. (Figure 1) We selected the 50 most commonly seen peptides which were seen in no fewer than 185 runs. Endogenous Peptides for Retention Time Landmarks Starting with the 50 most commonly seen peptides, we winnowed down the list to find a set of peptides that both covered the entire elution profile and consistently eluted in the same order relative to each other. An in-house script automated the process. First we record the relative order of the 50 peptides in all 250 runs, for each pair of peptides A and B, keeping track of how often A came before B. Next we use a greedy algorithm to select a consistent set. Start the set with one peptide. For each remaining peptide, try adding it to the set in the appropriate order. If it cannot be placed unambiguously relative to the existing peptides in the set, eliminate this peptide. We found seventeen that eluted in a consistent order. They are plotted at their observed retention times in several library runs in Figure 2. FIGURE 2. A. Retention times of landmark peptides in library data. The observed retention times of the seventeen peptides selected to act as landmarks were plotted for 68 runs in the library. Runs from each of three gradients are plotted together. The rank order of the peptides is the same in all runs, but the absolute times differ even for runs with the same gradient. Peptides are distributed across the entire gradient, with a higher density in the early-to-middle times. B. Histogram of number of landmark peptides in each run. Not every peptide was observed in every run, but there are enough in most cases to cover the whole gradient. Conclusion Endogenous peptides can successfully act as retention time landmarks and accurately estimate RT in new gradients.  Spectrum libraries capture valuable retention time information.  Our algorithm finds endogenous peptides with consistent elution behavior to act as standards.  We can accurately predict the retention time of any library peptide by estimating it relative to the standard peptides. Therefore, comparisons can more easily be made across datasets with accurate mass and retention time measurements (AMT). This capability also enables method transfer to scheduled LC-MRM.  Library-based estimated retention times are closer to the observed times than predictions made based on hydrophobicity. References 1.Krokhin OV, Spicer V. (2009) Peptide retention standards and hydrophobicity indexes in reversed-phase high-performance liquid chromatography of peptides. Anal Chem 81(22): Acknowledgments This research has been funded by the National Cancer Institute (P30 CA076292, P50-CA to EH, R21 CA to EH, R21 CA to JK) and the American Lung Association (LCD N to JK) as well as the Moffitt Foundation. All trademarks are the property of Thermo Fisher Scientific and its subsidiaries. For Research Use Only. Not for use in diagnostic procedures. This information is not intended to encourage use of these products in any manners that might infringe the intellectual property rights of others. FIGURE 3. A. Observed retention times of target peptides are stored as the distance between the two nearest landmark peptides. B. Retention time predictions are made by projecting the relative times on to the known times of the landmarks on a new gradient. Landmark Peptides Observed Number of Runs FIGURE 4. Comparison of estimated and observed retention times of 1750 peptides. A. Histogram of predicted minus observed retention times for both prediction methods. B. Library-predicted minus observed retention time vs. the observed retention time. Use relative retention times to estimate RT on new gradient The Crystal library computes a relative retention time for each peptide stored in the library as a distance between the two nearest landmark peptides. (Figure 3a) These are used to estimate the retention times on a new gradient (Figure 3b). First, the RT of the landmarks must be measured on the new gradients. Then the relative RTs can be projected on to this new gradient and the average time is taken as the estimate. We estimated the times of 1750 peptides on a 4 hour gradient. In addition, we compare our estimates to estimates based on peptide hydrophobicity (Krohkin, 2009). The accuracy of the estimate is measured as the difference between the estimated and observed times. Figure 4 plots the accuracy of the two estimation methods as well as accuracy of the library predictions as a function of the observed time. Library predictions are much closer than the hydrophobicity predictions to the observed retention times with most falling with in +/- 10 minutes of the observed time. Predictions are not consistently earlier or later than observed, but there is a slight trend for the prediction to be too early at the beginning of the run and too late at the end of the run. This may be due to having fewer landmarks at the ends of the run Time difference (minutes) Number of peptides Time difference (minutes) Retention time (minutes) AB Library Hydrophobicity A. B.