
MS/MS Libraries of Identified Peptides and Recurring Spectra in Protein Digests Lisa Kilpatrick, Jeri Roth, Paul Rudnick, Xiaoyu Yang, Steve Stein Mass Spectrometry Data Center

Library searching is not new: Organize for Reuse

MS Library Searching: Hertz, Hites and Biemann, Anal. Chem. (1971). PBM: McLafferty, Hertel, Villwock, Org. Mass Spectrom. (1974). SISCOM: Damen, Henneberg, Weimann, Anal. Chim. Acta (1978). INCOS: Sokolow, Karnofsky, Gustafson, Finnigan Application Report 2 (March 1978). Stein, Scott, J. Amer. Soc. Mass Spectrom. (1994).

‘Dot Product’ (cosine of ‘angle’ between a pair of spectra) Measured = f(m/z abundance) Reference = f(m/z abundance) f(abundance) : Weight as you like Sum over all peaks in common Normalize
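The dot-product score on this slide can be sketched in a few lines of Python. This is only an illustration, not the scoring used by any particular library-search program: the unit-width m/z binning, the square-root abundance weighting, and the names `bin_peaks` and `dot_product` are all assumptions chosen for the example.

```python
import math

def bin_peaks(peaks, weight, bin_width=1.0):
    """Map (m/z, abundance) pairs to {bin index: summed weighted abundance}."""
    binned = {}
    for mz, abundance in peaks:
        key = round(mz / bin_width)  # assumed: peaks match if they share a bin
        binned[key] = binned.get(key, 0.0) + weight(mz, abundance)
    return binned

def dot_product(measured, reference,
                weight=lambda mz, abundance: math.sqrt(abundance),
                bin_width=1.0):
    """Cosine of the 'angle' between two spectra (0 = no overlap, 1 = identical)."""
    m = bin_peaks(measured, weight, bin_width)
    r = bin_peaks(reference, weight, bin_width)
    # Sum over all peaks in common
    shared = sum(m[k] * r[k] for k in m.keys() & r.keys())
    # Normalize by the lengths of both weighted spectra
    norm = (math.sqrt(sum(v * v for v in m.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return shared / norm if norm else 0.0
```

The `weight` argument is where "weight as you like" enters: any function of m/z and abundance can be passed in place of the square root assumed here.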

Traditional GC/MS Library Search

Variability Depends on S/N ~7,000 Radiodurans Peptides, LCQ (PNNL/NCRR) Medians

Library Searching for Peptides: LIBQUEST (Yates) – Yates et al., Anal. Chem., 1998, 70, 3557; X!Hunter (Beavis) – Craig et al., J. Proteome Res., 2006, 5, 1843; BiblioSpec (MacCoss) – Frewen et al., Anal. Chem., 2006, 78, 5678; Spectral Comparison (Kearney) – Liu et al., Proteome Science, 2007, 5:3; SpectraST (Aebersold) – Lam et al., Proteomics; NIST Peptide Ion Fragmentation Library – June 2006 release (US-HUPO – March 2004)

Why Spectrum Libraries? More sensitive Better scoring Faster Annotation Unrestricted precursor ion

Identification by Spectrum Matching is More Sensitive than by Spectrum/Sequence Matching Simple Protein Mix

Spectrum/Spectrum Scores are More Robust than Sequence/Spectrum Scores Sequence score 99% Confidence

Matching Spectra is Faster than Matching Sequence: 0.005/s vs. 6.2/s per query spectrum

Reference Library Building Extract identified spectra from sequence search –Multiple search engines –Instrument-class specific Create ‘consensus’ spectra –Two or more matching spectra, also save best Assign probability of being correct –Refine confidence starting from decoy FDR –Classify peptides – tryptic, missed cleavage, semi, mods Create searchable spectral library –Resolve conflicts, add annotation
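The "create consensus spectra" step can be sketched as follows. The slide does not say how the consensus is computed, so everything here is an assumption made for illustration: peaks are matched by unit-width m/z bins, each replicate is scaled to its base peak, a peak is kept only if it appears in at least half of the replicates, and the function name `consensus_spectrum` is hypothetical.

```python
from collections import defaultdict

def consensus_spectrum(replicates, bin_width=1.0, min_fraction=0.5):
    """Merge replicate spectra of one peptide ion into {m/z bin: mean relative abundance}.

    replicates: list of spectra, each a list of (m/z, abundance) pairs.
    """
    if len(replicates) < 2:
        raise ValueError("a consensus needs two or more matching spectra")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for peaks in replicates:
        top = max(abundance for _, abundance in peaks)  # base peak of this replicate
        seen = {}
        for mz, abundance in peaks:
            key = round(mz / bin_width)
            # keep the largest relative abundance that falls in each bin
            seen[key] = max(seen.get(key, 0.0), abundance / top)
        for key, rel in seen.items():
            sums[key] += rel
            counts[key] += 1
    needed = min_fraction * len(replicates)
    # assumed rule: drop peaks seen in too few replicates (likely noise)
    return {key * bin_width: sums[key] / counts[key]
            for key in sorted(sums) if counts[key] >= needed}
```

Keeping the single best replicate alongside the consensus, as the slide mentions, would be a separate step not shown here.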

Three Classes of Libraries I. Conventional Target Identification –Peptides (Proteins) II. Identifiable –By unconventional searching III. Not Identifiable –Account for all recurring spectra –QA/QC

I. OMSSA overlap with MS/MS Library Search K 6/ K 6/07 Identified spectra (1% FDR) for 1-D Yeast NCI/CPTAC – Vanderbilt

II. Identify What we Can Derive Class-specific FDR Tryptic –Simple –Expected missed cleavages –Unexpected missed cleavages Semitryptic (cleaved tryptic) –No missed cleavage In source (with parent at same retention) In sample –Missed cleavage In source (with parent) In sample (obey rules) Uncommon – reject Others …
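One way to read "derive class-specific FDR" is to run a target-decoy estimate separately within each peptide class and pick a score threshold per class. The sketch below assumes the common decoys/targets estimate of FDR and a hypothetical input of (class, score, is_decoy) tuples; the slides do not specify the estimator actually used, and tied scores are not treated specially.

```python
def class_thresholds(matches, max_fdr=0.01):
    """Return {peptide class: lowest score at which estimated FDR <= max_fdr}.

    matches: iterable of (peptide_class, score, is_decoy) tuples.
    A class maps to None when no threshold reaches the requested FDR.
    """
    by_class = {}
    for cls, score, is_decoy in matches:
        by_class.setdefault(cls, []).append((score, is_decoy))

    thresholds = {}
    for cls, items in by_class.items():
        items.sort(key=lambda item: item[0], reverse=True)  # best scores first
        targets = decoys = 0
        best = None
        for score, is_decoy in items:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
            # assumed estimator: FDR ~ decoy hits / target hits above the cut
            if targets and decoys / targets <= max_fdr:
                best = score
        thresholds[cls] = best
    return thresholds
```

Because each class is thresholded on its own decoys, a class with a high error rate (for example, unexpected missed cleavages) ends up with a stricter cut than simple tryptic peptides.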

Atypical Peptide Ions use Sequence Search Method Tryptic only with many mods Less common: Methylation, Phosphorylation, … Artifacts: Na, K, Carbamyl InsPecT/Pevzner (Unidentified, +70) High charge states, >2 missed cleavages Use class specific score thresholds

HSA/Fibrinogen/Transferrin Mix 6124 Consensus Peptide Spectra, IT, Qtof, TofTof Ion Trap Peptide Ions: 1300 HSA, 1100 Fibrinogen, 700 Transferrin

contiguous = tryptic, exploded = semitryptic

III. Library of Recurring, Unidentified Spectra Create consensus spectra –From similar spectra from an experiment Combine from multiple experiments Identify spectra in other experiments –QA/QC: Artifacts, in standards, … –Apply other sequencing methods

Assign all Spectra Identified Spectrum –Matches library peptide or unidentified spectrum –Subset of peaks match library spectrum (impure) –Similar to a matched spectrum (cluster) Not a Peptide –Low S/N Maximum/Median <15 –High charge state (many large peaks) Proteins, large fragments, … –One dominant peak Stable ion, not peptide –Singly charged (high/low abund < 1.2) Probable artifact, lower probability of identification –Narrow m/z range Peptide?
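The low-S/N rule above (maximum/median abundance below 15) is simple enough to sketch directly. Only that one rule is shown; the function name and the handling of empty or all-zero spectra are assumptions, and the other "not a peptide" tests on the slide are not implemented because their thresholds are not fully specified.

```python
from statistics import median

def is_low_signal_to_noise(abundances, min_ratio=15.0):
    """Flag a spectrum whose maximum/median peak abundance is below min_ratio."""
    if not abundances:
        return True  # assumed: an empty spectrum carries no usable signal
    top = max(abundances)
    if top <= 0:
        return True  # assumed: all-zero abundances count as no signal
    mid = median(abundances)
    if mid <= 0:
        return False  # ratio is unbounded, so it cannot be below the cutoff
    return top / mid < min_ratio
```

For example, `is_low_signal_to_noise([1, 1, 1, 2, 10])` flags the spectrum (ratio 10), while the same peaks with a base peak of 100 would pass.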

exploded = identified, contiguous = unidentified

Library Pipeline of the Future (flowchart). Stages: Mass spectrometer; Garbage filter; Pep. Lib; Unass. Lib; Sequence Search, De Novo, Theoretical Spec, Similarity, ... Outcomes at each stage: assigned, unassigned, or No ID.

NCI/NIH - CPTAC: Clinical Proteomic Technology Assessment for Cancer. Goals: technology assessment; develop standard protocols and clinical reference sets; and evaluate methods to ensure data reproducibility. Sites: Broad Institute of MIT and Harvard, Memorial Sloan-Kettering Cancer Center, Purdue University, University of California, San Francisco, and Vanderbilt University School of Medicine. NCI grants (U24CA , U24CA , U24CA , U24CA , and U24CA ).

Run-to-Run Chromatographic Reproducibility

Lab-to-Lab Chromatography (peptide YICENQDSISSK). Panels: Broad Orbitrap, Vandy Orbitrap, NYU Orbitrap, INCAPS LTQ, NIST LTQ, Vandy LTQ, Purdue LTQ

HSA_CAM_SigmaA9511_5H_8MS2_m2_10de_040406_05

Measures of Reproducibility Identified ions –Unique peptides, Ions, Spectrum counts Unidentified components –Classify by type, link to origin Ion cluster analysis –MS1 linked to MS2 Chromatography –Time evolution of ion clusters

Ion Component Analysis

Ion Component Analysis (Yeast)

Components in Replicate Runs (plot legend: total, sampled, identified; ▲▼ run 1, 2; ■ in both)