Presentation is loading. Please wait.

Presentation is loading. Please wait.

Refining Peptide Fragmentation Models for Improved Confidence in Sequence/Spectrum Matching Karl Clauser Broad Institute of MIT and Harvard Cambridge,

Similar presentations


Presentation on theme: "Refining Peptide Fragmentation Models for Improved Confidence in Sequence/Spectrum Matching Karl Clauser Broad Institute of MIT and Harvard Cambridge,"— Presentation transcript:

1 Refining Peptide Fragmentation Models for Improved Confidence in Sequence/Spectrum Matching Karl Clauser Broad Institute of MIT and Harvard Cambridge, MA Strategies for Improving Reliability in Protein/Peptide Identification Workshop National Cancer Institute's (NCI) Clinical Proteomic Technologies for Cancer (CPTAC) Initiative and the National Institute of Standards and Technology (NIST) Gaithersburg, MD November 15, 2007 9/17/2018 7:50:32 AM

2 Continuum of MS/MS Interpretation Strategies
Unsequenced Genome Unexpected PTM Multiple Mutations Validate DB match Spectral Quality Sequenced Genome Expected PTM Expected Chem Mods Single Mutations Repeat Sample Analysis Related Spectra Unexpected Partial Chem Mods Overlapping Peptides De Novo Sequence Database Match Spectral Library Fragmentation Modeling Peak Detection Collection Curation Each Digestion Enzyme Each Instrument Model Each Collision Energy SCORING 9/17/2018 7:50:32 AM

3 Are Spectral Similarity Scoring Functions Appropriate for Scoring Database Search Sequence/Spectrum Matches? ( S( Ai * Bi) )2 SAi 2 * S Bi 2 Difference of the 3 Scores Extent of Emphasis on low abundance ions Dot product Sokolow, 1978 General Features of the Scores Normalize Intensities & Number of Peaks Score contribution for each ion scales with its intensity Max Value is 1.0 Spectral Contrast Angle Wan, 2001 S( Ai * Bi) ( SAi 2 * SBi 2)1/2 Numerator Bonus for Matched ions Denominator Penalty for Unmatched ions A Experimental Penalty for Unmatched ions B Theoretically Possible S( Ai * Bi ) 1/2 ( SAi * SBi )1/2 Similarity Zhang, 2004 The naïve fragment ion models in current database search programs tend to substantially over predict the number of fragment ions. I.e. allow for all possible ion types with similar intensities 9/17/2018 7:50:32 AM

4 (R)L/Q|A|E|A|Q|Q|L|R(K)
Zhang Kinetic Model ZSS: 0.826 SM Core Model Experimental 9/17/2018 7:50:32 AM

5 (R)L/Q|A|E|A|Q|Q|L|R(K)
Zhang Kinetic Model ZSS: 0.826 SM Dominant Ion Model v9 Experimental 9/17/2018 7:50:32 AM

6 Non-assignment Penalty
SpectrumMill Core Scoring of MS/MS Interpretations (R)E F E|I|I|W|V T K(H) 9.24 78.1% DEBS: -3 DFwdRev: 1.75 Score = Assignment Bonus (Ion Type Weighted) - Non-assignment Penalty (Intensity Weighted) Peak Selection: De-Isotoping, S/N thresholding, Parent - neutral removal, Charge assignment Match to Database Candidate Sequences SPI (%) Scored Peak Intensity 9/17/2018 7:50:32 AM

7 Refining Peptide Fragmentation Models to Account for Dominant Peaks
Goal Increase gap between score of correct and incorrect sequences for each spectrum Increase score of correct sequence Decrease score of incorrect sequence Principles to Capture Bonus - dominant peaks should score a bonus for ion types and AA adjacencies that should be dominant, i.e. at N-term of Pro. Penalty - dominant peaks should be penalized if assigned to sites that should not be dominant. Correct peak assignments should rarely be penalized (<1%) Charge location - At a dominant site, formation of a b or y type ion depends primarily on the location of basic residues in the peptide, not the fragmentation site. Scoring scheme should still work if an enzyme besides trypsin is used. Do not try to predict which of b, y, b++, y++ will dominate Frequency - dominant fragmentation does not always occur at particular AA adjacencies. Instead there are tendencies, additional factors not yet understood. How dominant - It is very difficult to predict how dominant a peak will be when it occurs at a cleavage that is expected to be dominant. Do not try to predict the extent of dominance. Scale a peak’s score with the extent of dominance that occurs. 9/17/2018 7:50:32 AM

8 Additional Intensity Scaled Score on Dominant Peaks
Bonus b or y at expected dominant cleavage Penalty not expected dominant cleavage M PM NM n R 2 2 2 c R 0 0 2 n P 2 2 2 c DnE 1 2 2 nc HK 2 2 0 c ILV 1 1 0 AG, FG, QG, YG 1 0 0 n-1: HALIVFYW, !n: PG 1* 1* 0 b2/yn bn-2 contains RHK 0 0 0 nterm E b-h2o 1 1 1 nterm Q b-nh yn-2 DnEST 1 1 0 yn-2 QH 1 1 0 Dominant Ions Expected by Peptide Sequence *not counted as expected, but bonus if occurs Proton Mobility Mobile: zpre > #Arg + #Lys + #His Partially mobile: zpre < #Arg + #Lys + #His and > #Arg Non-mobile: zpre < #Arg (amount to add to base score) Bonus Score 1.5 0.75 0.25 0.125 0.0 Class 1 Class 2 Class 0 (amount to subtract from base score) Penalty Score 1.5 0.25 0.0 Intensity (Dominant Peak/ Tallest non-dominant peak) Tallest non-dominant = peak min( seq: tallest not expected, spectrum: top 5 ratio gap ) 9/17/2018 7:50:32 AM

9 Frequency and Distribution of Dominant Ions v9
67% 72% 76% 5758 2974 177 Proton Mobility Mobile: zpre > #Arg + #Lys + #His Partially mobile: zpre < #Arg + #Lys + #His and > #Arg Non-mobile: zpre < #Arg Precursor z=2, 6699 spectra, <0.5% FPR from a trypsin GeLC/MS/MS experiment on an LTQ-FT 9/17/2018 7:50:32 AM

10 Frequency of AA Adjacency-Dependent Dominant Ions – v9, z=2
Mobile Partially Mobile 4525 spectra 2061 spectra # dominant ions # total cleavages >0.8 - (<3 obsv) Non-mobile 114 spectra 9/17/2018 7:50:32 AM

11 Frequency of Position-Dependent Dominant Ions – v9, z=2
Proton Mobility Mobile: zpre > #Arg + #Lys + #His Partially mobile: zpre < #Arg + #Lys + #His and > #Arg Non-mobile: zpre < #Arg 9/17/2018 7:50:32 AM

12 DIS performance Partially Mobile v9
P050531_PBD_WT 2061 spectra unique peptides 1477 spectra w/ dominant peaks 2060 spectra yield same match DIS v Orig z=2 only, m/z tolerance = 0.05 DIS better R1-R2 only DIS better both DIS worse both DIS better score only Score Rank1 – Rank2 DIS worse DIS worse DIS better DIS better 9/17/2018 7:50:32 AM

13 Spectral Similarity Comparison of Fragmentation Models
Zhang Kinetic Model SM Core Model v9 9/17/2018 7:50:32 AM

14 Zhang Kinetic Model SM Dominant Ion Model v9 SM Core Model v9
9/17/2018 7:50:32 AM

15 Dominant Ions – Mobile b2/yn-2
(K)V/N|I|V|P V I\A K(A) ZSS: 0.697 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

16 Dominant Ions – Mobile b2/yn-2
(K)A N|S/N/L/V L|Q|A|D\R(S) ZSS: 0.534 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

17 Dominant Ions – PM nterm Q b-nh3
(K)Q D/A Q/S/L/H|G D|I|P Q\K(Q) ZSS: 0.808 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

18 Dominant Ions – PM nterm E b-h2o
(K)E/G E/D/S/S/V/I/H|Y\D|D\K(A) ZSS: 0.753 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

19 Dominant Ions – PM y++-h2o @ yn-2 DnEST
(K)V/A|E|I/E|H|A\E\K(E) ZSS: 0.724 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

20 Dominant Ions – PM y++-nh3 @ yn-2 QH
(R)I/L|Q|E/H|E|Q\I\K(K) ZSS: 0.779 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

21 Dominant Ions – PM bn-2 contains RHK
(R)R/Q V E|E\E|I\L|A|L\K(A) ZSS: 0.799 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

22 Non-dominant Ions – Mobile
(R)T/V T V W E L I/S|S/E/Y|F|T|A|E\Q R(Q) ZSS: 0.427 Zhang Kinetic Model Experimental 9/17/2018 7:50:32 AM

23 Next Steps Refine definitions of dominant adjacencies & positions
Add/combine factors to increase frequency of occurrence/expectation Bonus Class Cleavage Matrix Refine intensity ratio and dominant ion categories across charge states. Verify suitability across multiple enzymes. Verify suitability across multiple collision energies 28, 30, 35. Test scoring system on phospho-peptides and other labeling chemistries that produce strong marker ions and/or bear charge. Examine diminishing the relative score of minor ion types (a, b/y-h2o, b/y-nh3) Minimize role of SPI (secondary score) in validation. Evaluate/Integrate Zhang Fragment model 9/17/2018 7:50:32 AM

24 Acknowledgements Broad Institute Steve Carr Jake Jaffe MIT
Michael Yaffe Majbrit Hjerrld Drew Lowery Agilent Technologies Joe Roark David Horn Amgen Zhongqi Zhang 9/17/2018 7:50:32 AM


Download ppt "Refining Peptide Fragmentation Models for Improved Confidence in Sequence/Spectrum Matching Karl Clauser Broad Institute of MIT and Harvard Cambridge,"

Similar presentations


Ads by Google