
MS/MS Libraries of Identified Peptides and Recurring Spectra in Protein Digests Lisa Kilpatrick, Jeri Roth, Paul Rudnick, Xiaoyu Yang, Steve Stein Mass.


1 MS/MS Libraries of Identified Peptides and Recurring Spectra in Protein Digests Lisa Kilpatrick, Jeri Roth, Paul Rudnick, Xiaoyu Yang, Steve Stein Mass Spectrometry Data Center

2 Library searching is not new: Organize for Reuse

3 MS Library Searching
Hertz, Hites and Biemann, Anal. Chem. (1971).
PBM: McLafferty, Hertel, Villwock, Org. Mass Spectrom. (1974).
SISCOM: Damen, Henneberg, Weimann, Anal. Chim. Acta (1978).
INCOS: Sokolow, Karnofsky, Gustafson, Finnigan Application Report 2 (March 1978).
Stein, Scott, J. Amer. Soc. Mass Spectrom. (1994).

4 ‘Dot Product’ (cosine of ‘angle’ between a pair of spectra)
Measured = f(m/z, abundance)
Reference = f(m/z, abundance)
f(abundance): Weight as you like
Sum over all peaks in common
Normalize
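The dot product on slide 4 can be sketched in a few lines of Python. This is a minimal illustration, not the scoring used in any NIST tool: the function names, the unit-mass binning, and the square-root default weighting are all assumptions made here, since the slide only says "weight as you like".

```python
import math

def to_vector(peaks, weight):
    """Bin peaks [(mz, abundance), ...] to unit m/z and apply the weighting f."""
    binned = {}
    for mz, abundance in peaks:
        key = round(mz)  # assumption: unit-mass bins
        binned[key] = binned.get(key, 0.0) + abundance
    return {key: weight(key, ab) for key, ab in binned.items()}

def dot_product(measured, reference, weight=lambda mz, ab: math.sqrt(ab)):
    """Cosine of the angle between two weighted spectra (0.0 to 1.0)."""
    m = to_vector(measured, weight)
    r = to_vector(reference, weight)
    # Sum over all peaks in common.
    shared = sum(m[k] * r[k] for k in m.keys() & r.keys())
    # Normalize by the lengths of both vectors.
    norm = math.sqrt(sum(v * v for v in m.values())) * \
           math.sqrt(sum(v * v for v in r.values()))
    return shared / norm if norm else 0.0
```

Because the score is normalized, a spectrum and a uniformly scaled copy of it score 1.0, and two spectra with no peaks in common score 0.0.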

5 Traditional GC/MS Library Search

6 Variability Depends on S/N
~7,000 Radiodurans Peptides, LCQ (PNNL/NCRR); Medians

7 Library Searching for Peptides
LIBQUEST (Yates)
– Yates et al., Anal. Chem. 1998, 70, 3557
X!Hunter (Beavis)
– Craig et al., J. Proteome Res. 2006, 5, 1843
BiblioSpec (MacCoss)
– Frewen et al., Anal. Chem. 2006, 78, 5678
Spectral Comparison (Kearney)
– Liu et al., Proteome Science 2007, 5:3
SpectraST (Aebersold)
– Lam et al., Proteomics 2007, 7, 655-667
NIST Peptide Ion Fragmentation Library
– June 2006 release (US-HUPO – March 2004)

8

9 Why Spectrum Libraries?
More sensitive
Better scoring
Faster
Annotation
Unrestricted precursor ion

10 Identification by Spectrum Matching is More Sensitive than by Spectrum/Sequence Matching Simple Protein Mix

11 Spectrum/Spectrum Scores are More Robust than Sequence/Spectrum Scores
[Figure labels: Sequence score; 99% Confidence]

12 Matching Spectra is Faster than Matching Sequence
0.005/s vs. 6.2/s per query spectrum

13 Reference Library Building
Extract identified spectra from sequence search
– Multiple search engines
– Instrument-class specific
Create ‘consensus’ spectra
– Two or more matching spectra, also save best
Assign probability of being correct
– Refine confidence starting from decoy FDR
– Classify peptides – tryptic, missed cleavage, semi, mods
Create searchable spectral library
– Resolve conflicts, add annotation
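The "consensus spectra" step on slide 13 could look something like the sketch below. The slides do not give the actual procedure, so everything here is an assumption for illustration: the function name `consensus_spectrum`, unit-mass binning, base-peak normalization, and the rule that a peak must appear in at least half of the replicates.

```python
from collections import defaultdict

def consensus_spectrum(replicates, min_fraction=0.5):
    """Merge replicate peak lists [(mz, abundance), ...] into one spectrum.

    Returns sorted (mz_bin, mean_relative_abundance) pairs. A peak is kept
    only if it occurs in at least `min_fraction` of the replicates
    (hypothetical rule); absent peaks count as zero in the mean.
    """
    if len(replicates) < 2:
        raise ValueError("need two or more matching spectra")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for peaks in replicates:
        binned = defaultdict(float)
        for mz, abundance in peaks:
            binned[round(mz)] += abundance  # assumption: unit-mass bins
        base = max(binned.values(), default=0.0)
        if base <= 0:
            continue  # empty spectrum contributes nothing
        for key, abundance in binned.items():
            sums[key] += abundance / base  # scale to the base peak
            counts[key] += 1
    n = len(replicates)
    return sorted((k, sums[k] / n) for k in sums if counts[k] / n >= min_fraction)
```

With three replicates, a peak seen in only one of them is dropped, which is one simple way to suppress noise peaks that do not recur.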

14 Three Classes of Libraries
I. Conventional Target Identification
– Peptides (Proteins)
II. Identifiable
– By unconventional searching
III. Not Identifiable
– Account for all recurring spectra
– QA/QC

15 I. OMSSA overlap with MS/MS Library Search
Identified spectra (1% FDR) for 1-D Yeast, NCI/CPTAC – Vanderbilt
[Figure values as extracted: 747, 1350, 353 (34K, 6/06); 318, 1752, 833 (78K, 6/07)]

16

17 II. Identify What we Can
Derive Class-specific FDR
Tryptic
– Simple
– Expected missed cleavages
– Unexpected missed cleavages
Semitryptic (cleaved tryptic)
– No missed cleavage
   In source (with parent at same retention)
   In sample
– Missed cleavage
   In source (with parent)
   In sample (obey rules)
   Uncommon – reject
Others …

18 Atypical Peptide Ions use Sequence Search Method
Tryptic only with many mods
Less common: Methylation, Phosphorylation, …
Artifacts: Na, K, Carbamyl
InsPecT/Pevzner (Unidentified, +70)
High charge states, >2 missed cleavages
Use class specific score thresholds
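Slides 17 and 18 call for class-specific FDR and score thresholds. One common way to do this is a target/decoy estimate computed separately for each peptide class; the sketch below assumes that approach, and the function names, the `(score, is_decoy)` input shape, and the decoys/targets FDR estimate are illustrative choices rather than the authors' stated method. Tied scores are not handled specially.

```python
def threshold_at_fdr(hits, max_fdr=0.01):
    """Return the lowest score cutoff whose decoy/target ratio is <= max_fdr.

    hits: list of (score, is_decoy) tuples for ONE peptide class.
    Returns None if no cutoff satisfies the requested FDR.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff

def class_thresholds(hits_by_class, max_fdr=0.01):
    """Apply threshold_at_fdr separately to each class (tryptic, semitryptic, ...)."""
    return {cls: threshold_at_fdr(hits, max_fdr) for cls, hits in hits_by_class.items()}
```

Computing the cutoff per class means a class with many decoy hits (for example, heavily modified peptides) gets a stricter threshold, or none at all, without penalizing simple tryptic identifications.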

19 HSA/Fibrinogen/Transferrin Mix
6124 Consensus Peptide Spectra, IT, Qtof, TofTof
Ion Trap Peptide Ions: 1300 HSA, 1100 Fibrinogen, 700 Transferrin

20 contiguous = tryptic, exploded = semitryptic

21 III. Library of Recurring, Unidentified Spectra
Create consensus spectra
– From similar spectra from an experiment
Combine from multiple experiments
Identify spectra in other experiments
– QA/QC: Artifacts, in standards, …
– Apply other sequencing methods

22 Assign all Spectra
Identified Spectrum
– Matches library peptide or unidentified spectrum
– Subset of peaks match library spectrum (impure)
– Similar to a matched spectrum (cluster)
Not a Peptide
– Low S/N
   Maximum/Median <15
– High charge state (many large peaks)
   Proteins, large fragments, …
– One dominant peak
   Stable ion, not peptide
– Singly charged (high/low abund < 1.2)
   Probable artifact, lower probability of identification
– Narrow m/z range
   Peptide?
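The low-S/N rule on slide 22 (Maximum/Median < 15) is simple enough to sketch directly. Only that one threshold comes from the slide; the function name, the decision to ignore zero-abundance peaks, and the treatment of empty spectra as low S/N are assumptions made for this example.

```python
from statistics import median

def is_low_signal_to_noise(abundances, min_ratio=15.0):
    """Flag a spectrum whose maximum/median peak abundance is below min_ratio.

    abundances: iterable of peak abundances for one MS/MS spectrum.
    Empty spectra are treated as low S/N (assumption).
    """
    positive = [a for a in abundances if a > 0]
    if not positive:
        return True
    return max(positive) / median(positive) < min_ratio
```

A spectrum with a few tall peaks over a low baseline passes, while one in which all peaks have similar height is flagged as probable noise.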

23 exploded = identified, contiguous = unidentified

24

25 Library Pipeline of the Future
[Flow diagram. Stages as extracted: Mass spectrometer; Garbage filter; Pep. Lib; Unass. Lib; Sequence Search, De Novo, Theoretical Spec, Similarity, ... Outcome labels: assigned, unassigned, No ID]

26 NCI/NIH - CPTAC: Clinical Proteomic Technology Assessment for Cancer http://proteomics.cancer.gov
Technology assessment; develop standard protocols and clinical reference sets; and evaluate methods to ensure data reproducibility.
Broad Institute of MIT and Harvard, Memorial Sloan-Kettering Cancer Center, Purdue University, University of California, San Francisco, and Vanderbilt University School of Medicine.
NCI grants (U24CA126476-01, U24CA126485-01, U24CA126480-01, U24CA126477-01, and U24CA126479-01).

27

28 Run-to-Run Chromatographic Reproducibility

29 Lab-to-Lab Chromatography (peptide YICENQDSISSK)
[Figure panels: Broad Orbitrap, Vandy Orbitrap, NYU Orbitrap, INCAPS LTQ, NIST LTQ, Vandy LTQ, Purdue LTQ]

30 HSA_CAM_SigmaA9511_5H_8MS2_m2_10de_040406_05

31 Measures of Reproducibility
Identified ions
– Unique peptides, Ions, Spectrum counts
Unidentified components
– Classify by type, link to origin
Ion cluster analysis
– MS1 linked to MS2
Chromatography
– Time evolution of ion clusters

32 Ion Component Analysis

33 Ion Component Analysis (Yeast)

34 Components in Replicate Runs
[Figure. Series: total, sampled, identified. Legend: ▲▼ run 1, 2; ■ in both]

