Manual De Novo Peptide MS/MS Interpretation For Evaluating Database Search Results Karl R. Clauser Broad Institute of MIT and Harvard Cold Spring Harbor Proteomics Course July, 2009 9/19/2018 7:05:56 AM
Outline: AA properties; Fragmentation pathways and ion types; b/y pairs; Fragment charge from mass defect; Non-mobile proton; Neutral loss ion types; Phosphosite ambiguity; Sample handling chemistry artifacts; Isobaric co-eluters; Mass tolerance units and isobaric AA’s; Other Tutorials; Dominant ions; AA adjacencies; Positions
AA structures (nominal residue masses in Da; pK values where shown)
pK: N-term 7.5, C-term 3.5
K 128 (pK 10), H 137 (pK 6), R 156 (pK 12)
D 115 (pK 4.0), E 129 (pK 4.5)
N 114, Q 128
S 87, T 101, Y 163
P 97
M 131, C 103 (+57 IAA)
L 113, I 113
G 57, A 71, V 99
F 147, W 186
http://ionsource.com/Card/clipart/aaclipart.htm
Charge-directed Fragmentation Scheme
[Figure: a protonated four-residue peptide, [M+zH]z+, H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, cleaves at an amide bond by b ion formation and/or y ion formation, giving b3+ or y1+. In each case the neutral partner is pumped away by the vacuum system.]
Proton Mobility
Mobile: zpre > #Arg + #Lys + #His
Partially mobile: zpre < #Arg + #Lys + #His and > #Arg
Non-mobile: zpre < #Arg
For peptides with non-mobile protons, fragmentation tends to proceed via charge-remote mechanisms. MS/MS spectra will be dominated by a few ions, typically from cleavage at the C-term side of D, E and the N-term side of P.
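The proton-mobility classes can be sketched as a small Python function. This is an illustrative sketch, not code from the course: the name `proton_mobility` is hypothetical, and because the inequalities on the slide leave the boundary cases (zpre equal to a residue count) unassigned, the sketch assumes those cases fall into the less mobile class.

```python
def proton_mobility(sequence: str, z_pre: int) -> str:
    """Classify a peptide by proton mobility from its precursor charge.

    Assumption (not stated on the slide): when z_pre equals a residue
    count, the peptide is placed in the less mobile class.
    """
    n_arg = sequence.count("R")
    n_basic = n_arg + sequence.count("K") + sequence.count("H")
    if z_pre > n_basic:
        return "mobile"
    if z_pre > n_arg:
        return "partially mobile"
    return "non-mobile"


# Example: a tryptic peptide with one Arg and one His at z = 2.
print(proton_mobility("IADAHLDR", 2))
```

Only uppercase R, K and H are counted, so lowercase letters used for modified residues do not affect the result.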
Sequence Specific Fragment Ion Types
[Figure: peptide backbone, [M+nH]n+, H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, with N-terminal fragments a3, b3, c3 and C-terminal fragments x1, y1, z1 marked around the bond between residues 3 and 4.]
Ion type | restrictions | residues | delta
a-NH3 | contains NH3 residue | RK NQ | -17
b-NH3, y-NH3 | contains NH3 residue | RK NQ | -17
b-H2O, y-H2O | contains H2O residue | ST DE | -18
b-H3PO4, y-H3PO4 | contains H3PO4 residue | st | -98
y++, b++ | contains charged residues | RHK | 1
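As a worked illustration of these ion types, the sketch below computes b and y ion m/z values and checks whether a fragment contains a residue that permits a given neutral loss. It is a minimal sketch under stated assumptions: the function names `by_ions` and `loss_allowed` are hypothetical, and the monoisotopic masses are standard values rather than numbers from the slides.

```python
# Monoisotopic residue masses (Da); standard values, not from the slides.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

# Residues that permit each neutral loss, as listed on the slide
# (lowercase s/t = phosphorylated Ser/Thr).
LOSS_RESIDUES = {"NH3": set("RKNQ"), "H2O": set("STDE"), "H3PO4": set("st")}


def by_ions(sequence, z=1):
    """Return (b_ions, y_ions) as m/z lists for b1..b(n-1) and y1..y(n-1)."""
    masses = [MONO[aa] for aa in sequence]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_neutral = sum(masses[:i])            # N-terminal residues only
        y_neutral = sum(masses[-i:]) + WATER   # C-terminal residues + H2O
        b_ions.append((b_neutral + z * PROTON) / z)
        y_ions.append((y_neutral + z * PROTON) / z)
    return b_ions, y_ions


def loss_allowed(fragment_sequence, loss):
    """True if the fragment contains a residue that permits the loss."""
    return any(aa in LOSS_RESIDUES[loss] for aa in fragment_sequence)


b, y = by_ions("IADAHLDR")
print([round(m, 3) for m in y[:3]])
print(loss_allowed("DR", "NH3"), loss_allowed("IA", "H2O"))
```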
Complementary Ions b/y pairs 128 99 99 128 E V Q L V|E/S|G|G|G L|V|K|P G G\S\L\R
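For singly charged fragments, a b ion and its complementary y ion sum to the neutral peptide mass plus two protons. The sketch below uses that relation to predict the complement of a peak and to list candidate b/y pairs in a peak list. The function names and the default tolerance are assumptions made for illustration.

```python
PROTON = 1.007276


def complement_mz(fragment_mz, precursor_mz, precursor_z):
    """m/z of the singly charged complement of a singly charged fragment."""
    neutral_mass = (precursor_mz - PROTON) * precursor_z
    return neutral_mass + 2 * PROTON - fragment_mz


def find_by_pairs(peaks_mz, precursor_mz, precursor_z, tol=0.5):
    """Return (low, high) peak pairs whose m/z values are complementary."""
    ordered = sorted(peaks_mz)
    pairs = []
    for i, low in enumerate(ordered):
        target = complement_mz(low, precursor_mz, precursor_z)
        for high in ordered[i + 1:]:
            if abs(high - target) <= tol:
                pairs.append((low, high))
    return pairs


# Toy example: dipeptide AK, precursor 2+ at m/z 109.5786 (b1 and y1).
print(find_by_pairs([72.044, 100.0, 147.113], 109.5786, 2, tol=0.01))
```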
Dual Picket Fence A E/D|T|A|L|Y|Y|C A\K 163 163 113 71 101 115 163 163 113 71 101 115 115 101 71 113 163 163
Uniqueness of a Peptide Sequence
Clauser, K. R.; Baker, P. R.; Burlingame, A. L. "Role of Accurate Mass Measurement (+/- 10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching", Anal. Chem. 1999, 71, 2871-2882.
Diagnose Doubly Charged Fragment Ions I/A|D|A|H|L|D|R
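When a peak is suspected to be a doubly charged fragment, converting it to its singly charged equivalent lets it be compared with the b/y ladder. The sketch below shows the conversion. The function names are hypothetical, and the y7 values for IADAHLDR in the example were computed here from standard monoisotopic masses, not read from the slide.

```python
PROTON = 1.007276


def to_singly_charged(mz, z):
    """Convert an m/z observed at charge z to the equivalent 1+ m/z."""
    return mz * z - (z - 1) * PROTON


def to_charge(mz_singly, z):
    """Convert a 1+ m/z to the m/z expected at charge z."""
    return (mz_singly + (z - 1) * PROTON) / z


# y7 of IADAHLDR: calculated 2+ m/z is about 399.199, 1+ about 797.390.
print(round(to_singly_charged(399.198649, 2), 3))
```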
Dominant Cleavage Proline N-side N F|P/S/P V D A A F R y9 b2 28 87 97
Sparse Dominant Fragmentation 115 202 202 115 (K)I S R|P G D|S D|D|S R(S) Non-mobile proton zpre < #Arg
Cry Babies (b-H2O & b pairs) P(m/z)-H2O P(m/z)-2H2O E/H/A|V/E|G/D|C D|F Q L L K
Interpreting MS/MS Spectra is Fun!! Kaitlin Aidan Jack Andrea
Source of Incorrect MS/MS Interpretations
Major
Database: Peptide not in database. Mutation. MS/MS not from a peptide.
Unanticipated Protein Chemistry: Chemical modification, post-translational modification.
Enzyme/Ion Source: Non-specific cleavage. In-source fragmentation yields MS3.
Minor
Algorithm: Fragment ion types of instrument not accounted for. Peak Detection.
Instrument Resolution: Wrong parent charge. Wrong fragment charge.
User Competence: Wrong parameters selected.
Phospho Site Ambiguity – S/T L P/S s/P/V|Y/E/D|A A S F K P(m/z)-H3PO4-H2O P(m/z)-H3PO4 P(m/z)-H3PO4-2H2O
Phospho Site Ambiguity – S/T L A G G Q/T/S Q|P T T|P L\T s/P Q R L A G G Q/T/S Q|P T T|P L\t S/P Q R
Reliability of LC/MS/MS Phosphoproteomic Literature
Citation | Approach | Instrument | #sites | #ambiguous sites | Scores Shown | Site Ambig. Labeled | Supplem. Spectra Shown
Ballif, BA, … Gygi, SP, 2004, MCP, 3, 1093-1101 | 1D gel, digest, SCX, LC/MS/MS | LCQ Deca XP | 546 | 86 | yes | yes | no
Rush, J, … Comb, MJ, 2005, Nat Biotech, 23, 94-101 | digest lysate, pTyr Ab, LC/MS/MS | LCQ Deca XP | 628 | 0 | yes | no | no
Collins, MO, … Grant, SGN, 2005, J Biol Chem, 280, 5972-5982 | protein IMAC, peptide IMAC, LC/MS/MS | Q-Tof Ultima | 331 | 42 | no | yes | no
Gruhler, A, … Jensen, ON, 2005, MCP, 4, 310-327 | digest lysate, SCX, IMAC, LC/MS/MS | LTQ-FT | 729 | 0 | yes | no | no
“Resulting sequences were inspected manually …. When the exact site of phosphorylation could not be assigned for a given phosphopeptide, it was tabulated as ambiguous.”
“All identified phosphopeptides were manually validated, and localization of phosphorylated residues within the individual peptide sequences were manually assigned…”
“All spectra supporting the final list of assigned peptides used to build the tables shown here were reviewed by at least three people to establish their credibility.”
“Assignment of phosphorylation sites was verified manually with the aid of PEAK Studio (Bioinformatics Solutions) software.”
Expect Woes & Nuisances
Sample Handling Chemistry
Modification | delta (Da) | site | cause
Carbamylation | +43 | nterm, Lys | urea in digest buffer
Deamidation | +1 | N -> D | sample in acid
pyroGlutamic acid | -17 | nterm Q | sample in acid
Oxidized Met | +16 | M | gels
Cys alkylation reagent | +x | n-term, W |
Data Dependent Acquisition Parameters: Isobaric Co-eluters
Protein Isoforms / Family Members: Isobaric peptides from related proteins
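A precursor mass error that matches one of these shifts is a useful hint that a sample-handling artifact is present. The sketch below looks up an observed mass difference against the nominal shifts listed on the slide. It assumes standard monoisotopic values for those shifts (the slide gives only nominal masses), and the function name and default tolerance are hypothetical.

```python
# Monoisotopic mass shifts (Da) for the artifacts listed on the slide.
# Exact values are standard chemistry, not taken from the slide itself.
ARTIFACT_DELTAS = {
    "carbamylation (N-term, Lys)": 43.005814,    # + CHNO
    "deamidation (N -> D)": 0.984016,            # - NH + O
    "pyroglutamic acid (N-term Q)": -17.026549,  # - NH3
    "oxidation (M)": 15.994915,                  # + O
}


def explain_mass_error(delta_da, tol_da=0.01):
    """Return the artifacts whose mass shift matches an observed error."""
    return [name for name, shift in ARTIFACT_DELTAS.items()
            if abs(delta_da - shift) <= tol_da]


print(explain_mass_error(0.986))   # a +0.986 Da precursor error
print(explain_mass_error(0.007))   # a small error matches nothing
```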
Stinkers (b-NH3) & Pyroglutamic Acid -17 Da Q to q (R)Q L/Q/L/A|Q/E/A|A Q\K(R) P(m/z)-NH3 (R)q L/Q|L|A|Q|E|A|A\Q\K(R)
Deamidation G S/E/S|G|I|F|T|n\T K G S/E/S|G|I|F|T|D\T K 18.35 96.9% +0.007 Da G S/E/S|G|I|F|T|D\T K G S/E S\G\I\F\T\N/T K 6.62 43.4% +0.986 Da
Deamidation of Asn +1Da Asn –NH + O = Asp ionsource.com
Carbamylation N/S/L/E/T/L/L|y/K|P V/D\R V/S T A/Q/D V/I|Q Q t L\C K +43 18.4 89% V/S T A/Q/D V/I|Q Q t L\C K +0 11.1 68% +0 18.5 93%
Carbamylation from Urea in Digest Buffer +43Da CNHO +43Da
Carbamylated N-term I/G/E|G/T/y/G V|V|Y\K +43 b ions P(m/z)-CNHO P(m/z)-CNHO-H2O
Met Oxidation – localizing the site (R)G V D L D Q L/L|D|M|S|Y|E|Q|L|m|Q|L/Y S A R(Q) (R)G V D L D Q L/L|D|m|S/Y/E|Q|L|M|Q|L/Y S A R(Q)
Know Your Chromatographic Peak Widths (K)E E m E S A E G|L|K\G P/m\K(S) Top Database Search Result 8.78 71.0% DFwdRev: 3.49 Merged 4 spectra same precursor 50 sec window different peptides
Consequences of Inappropriate Tolerance Units (using Da tolerance when instrument errors are in ppm)
[Figure panels: too loose, too tight, just right]
Isobaric AA’s
I = L (C6 H11 N O) = 113.08406
K ~ Q (C6 H12 N2 O, C5 H8 N2 O2) 128.09496 ~ 128.05858, Δ = 0.03638
F ~ m (C9 H9 N O, C5 H9 N O2 S) 147.06841 ~ 147.0354, Δ = 0.0330 (m = oxidized Met)
Isobaric AA combinations
GG = N (C4 H6 N2 O2, C4 H6 N2 O2) 114.04293
GA = Q ~ K (C5 H8 N2 O2, C5 H8 N2 O2, C6 H12 N2 O) 128.05858 ~ 128.09496, Δ = 0.03638
DA ~ W ~ VS (C7 H10 N2 O4, C11 H10 N2 O, C8 H14 N2 O3) 186.06405 ~ 186.07931 ~ 186.10044, Δ = 0.01526, Δ = 0.02113
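The difference between the two tolerance units is easy to see numerically: a ppm tolerance scales with m/z, while a Da tolerance is fixed. The sketch below converts between them and checks whether a given tolerance window can separate K from Q. The function names are hypothetical, and the rule used in `separates` (mass difference larger than the tolerance) is a simplifying assumption.

```python
K_MASS = 128.09496   # Lys residue, monoisotopic
Q_MASS = 128.05858   # Gln residue, monoisotopic


def ppm_to_da(tol_ppm, mz):
    """Width in Da of a ppm tolerance at a given m/z."""
    return mz * tol_ppm * 1e-6


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def separates(mass_a, mass_b, tol_da):
    """Assumed rule: two masses are separable if they differ by more than tol_da."""
    return abs(mass_a - mass_b) > tol_da


tol = ppm_to_da(10, 1000.0)                # 10 ppm at m/z 1000 is 0.01 Da
print(separates(K_MASS, Q_MASS, tol))      # a 0.01 Da window separates K and Q
print(separates(K_MASS, Q_MASS, 0.5))      # a 0.5 Da window does not
```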
Additional Resources
Google: “de novo sequencing tutorial”
Don Hunt and Jeff Shabanowitz - manual: http://www.ionsource.com/tutorial/DeNovo/DeNovoTOC.htm
Rich Johnson - manual: http://www.abrf.org/ResearchGroups/MassSpectrometry/EPosters/ms97quiz/SequencingTutorial.html
PEAKS - automated: http://www.bioinformaticssolutions.com/products/peaks/support/tutorials/PEAKS_De_Novo.html
Physicochemical Complications to Computational Interpretation
Incomplete Fragmentation
Inconsistent intensity of fragment ion types: Instrument type dependent; Amino acid dependent
Isobaric AA’s: I = L (C6 H11 N O); K ~ Q (C6 H12 N2 O, C5 H8 N2 O2)
Isobaric AA combinations: GG = N (C4 H6 N2 O2, C4 H6 N2 O2); GA = Q ~ K (C5 H8 N2 O2, C5 H8 N2 O2, C6 H12 N2 O); W ~ DA ~ VS (C11 H10 N2 O, C7 H10 N2 O4, C8 H14 N2 O3)
Parent charge uncertainty
Fragment charge uncertainty
Chemical or post-translational modifications
Frequency of Dominance at Adjacent AA’s – v9, z=2
[Figure: three panels, Mobile (4525 spectra), Partially Mobile (2061 spectra), Non-mobile (114 spectra). Plotted value: # dominant ions / # total cleavages. Legend bins: >0.8, 0.4 - 0.8, 0.1 - 0.4, and - (<3 obsv).]
Frequency and Distribution Dominant Ions v9
[Figure: values as extracted: 67%, 72%, 76%; 5758, 2974, 177.]
Proton Mobility
Mobile: zpre > #Arg + #Lys + #His
Partially mobile: zpre < #Arg + #Lys + #His and > #Arg
Non-mobile: zpre < #Arg
Precursor z=2, 6699 spectra from a trypsin GeLC/MS/MS experiment on an LTQ-FT
Short Peptides Often Yield a Dominant Ion Cleavage Between Residues 2 & 3
Bonus C-side b2: residues at position 3: PRKH
Bonus N-side b2: residues at position 1 or 2: PRKHNQqVILFYW
Bonus ignore b2: neither of the above but still dominant
If there is a mobile or partially mobile proton, peptides of length <14 are likely to yield at least one intense fragment ion between residues 2 and 3 (yellow and pink curves shifted to shorter lengths, purple curve shifted to longer lengths). Intense ions are favored by the presence of PRKH at residue 3 or the presence of PRKHNQqVILFYW at residues 1 or 2.
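The two residue rules for the cleavage between residues 2 and 3 can be written as a simple predicate. This is only a sketch of the rule as stated on the slide: the function name `b2_bonus` and the returned dictionary keys are hypothetical, and no attempt is made to model intensities.

```python
C_SIDE_RESIDUES = set("PRKH")            # favorable at position 3
N_SIDE_RESIDUES = set("PRKHNQqVILFYW")   # favorable at position 1 or 2


def b2_bonus(sequence):
    """Report which residue rules favor an intense ion between residues 2 and 3."""
    if len(sequence) < 3:
        return {"c_side": False, "n_side": False}
    return {
        "c_side": sequence[2] in C_SIDE_RESIDUES,
        "n_side": sequence[0] in N_SIDE_RESIDUES
                  or sequence[1] in N_SIDE_RESIDUES,
    }


# Peptide from the "Dominant Cleavage Proline N-side" slide: Pro at position 3.
print(b2_bonus("NFPSPVDAAFR"))
```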
Acknowledgements
Broad Institute: Steve Carr, Terri Addona, Jinyan Du
MIT: Michael Yaffe, Majbrit Hjerrld, Drew Lowery
Frequency of Position Dependent Dominant Ion(s) v9 Proton Mobility Mobile: zpre > #Arg + #Lys + #His Partially mobile: zpre < #Arg + #Lys + #His and > #Arg Non-mobile: zpre < #Arg
Related Proteins : Distinct Non-differentiable Peptides (R)N P P R\F A\F|V|E|F|E|D|P\R(D) (R)R G G/P P\F A\F|V|E|F|E|D|P R(D)
Setting Autovalidation Thresholds
Step 1 - Protein Mode: 2 or more peptides/protein. Each spectrum: moderate or better score
Step 2 - Peptide Mode: 1 peptide/protein. Each spectrum: excellent score
Distinguishing 7 Family Members (14-3-3 proteins)
Wt/Mut PBD | Gene Symbol
13.4 | YWHAZ
16.5 | YWHAE
13.0 | SFN
11.1 | YWHAB
11.4 | YWHAT
13.3 | YWHAG
13.5 | YWHAH
None have a PLK-PBD binding motif: S[st]P
Each has 2-4 PLK phosphorylation motifs: [ED]X[st][FLIYWVM]
No phosphorylated peptides were recovered.
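Motifs like these can be located in a protein sequence with regular expressions. The sketch below scans an unmodified sequence for candidate sites, so it matches uppercase S/T where the slide uses lowercase s/t for the phosphorylated residue, and it assumes X means any residue. The pattern and function names are hypothetical, and the test sequences are made up for illustration.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported.
PBD_BINDING = re.compile(r"(?=S[ST]P)")                 # S[st]P
PLK_SITE = re.compile(r"(?=[ED].[ST][FLIYWVM])")        # [ED]X[st][FLIYWVM]


def motif_starts(sequence, pattern):
    """Return 0-based start positions of every motif occurrence."""
    return [match.start() for match in pattern.finditer(sequence)]


print(motif_starts("MKSSPA", PBD_BINDING))
print(motif_starts("EASLDTTV", PLK_SITE))
```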
Chromatographic Peak Sampling: Abundance/Identity Trade-off
[Figure: Relative Abundance vs Retention Time. MS: Periodic Focus on Several Peptides (Abundance). MS/MS: Sliding Focus on Each Peptide (Identity).]
Distinguishing Family Members (ROCK1 & ROCK2)
#Same Peps | #Distinguishing Peps | WT/Mut PBD | Gene Symbol
5 | 39 | 30.8 | ROCK2
5 | 1 | 26.5 | ROCK1
Enabling Integrated Reverse Database Searches
Each database sequence candidate passing the parent mass filter is additionally subjected to “inner sequence reversal” and interpreted against the MS/MS spectrum, e.g. SAMPLER becomes SELPMAR.
Search time increases ~1.5X
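Inner sequence reversal keeps the first and last residues in place and reverses everything between them, so the reversed candidate has the same parent mass and the same terminal residues as the forward one. A minimal sketch follows; the function name `inner_reverse` is hypothetical and this is not SpectrumMill's own code.

```python
def inner_reverse(sequence):
    """Reverse the interior of a peptide, keeping both terminal residues."""
    if len(sequence) < 3:
        return sequence
    return sequence[0] + sequence[-2:0:-1] + sequence[-1]


print(inner_reverse("SAMPLER"))  # SELPMAR
```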
Reviewing Reversed Database Search Details This is an overview of the SpectrumMill software. The next several slides show some of these features in more detail.
Near-Identical Sequences Found in Reversed DB
SEQUEST - preliminary search
[Figure: three-step workflow diagram. Candidates from the Sequence Database are selected by parent mass (Pm). Operations shown: Filter (Experimental Spectrum to Filtered Experimental Spectrum); calculate ALL theoretical fragment ions for EACH sequence (Model Spectrum with b, y, a, b-NH3, y-NH3, b-H2O, y-H2O); Compare. Axes: Relative Abundance vs Mass (m/z).]
MS-Tag preliminary search
[Figure: three-step workflow diagram. Step 1, Transform: the Filtered Experimental Spectrum becomes an N-terminal Sequence Spectrum (b, b-NH3, b-H2O, a) and a C-terminal Sequence Spectrum (y, y-NH3). Step 2: calculate partial ladders for EACH sequence selected from the Sequence Database by parent mass (Pm), giving a Partial N-terminal Ladder and a Partial C-terminal Ladder. Step 3: Compare. Axes: Relative Abundance vs Mass (m/z).]
Mass Differences Correspond to Amino Acids
[Figure: spectrum, Intensity vs m/z, in which the spacings between successive peaks are labeled with letters that spell out a sequence.]
Graph Theory Based de novo Algorithms
Transform to Spectrum Graph: vertices (from peak m/z’s), edges (from mass differences)
Find best path
Sequence
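The spectrum-graph idea can be sketched in a few lines: peaks become vertices, an edge joins two peaks whose m/z difference matches a residue mass, and the longest path spells a sequence tag. This is a deliberately minimal sketch, assuming a single ion series, singly charged peaks, and "longest path" as the scoring rule. Real de novo algorithms score paths far more carefully. The function names and the tolerance are hypothetical, and L stands for both I and L.

```python
# Monoisotopic residue masses (Da); L represents the isobaric pair I/L.
RESIDUES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def build_edges(peaks, tol=0.02):
    """Vertices are sorted peaks; edges link peaks one residue mass apart."""
    ordered = sorted(peaks)
    edges = {i: [] for i in range(len(ordered))}
    for i, low in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            diff = ordered[j] - low
            for aa, mass in RESIDUES.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, aa))
    return ordered, edges


def longest_tag(peaks, tol=0.02):
    """Longest residue string readable from low to high m/z."""
    ordered, edges = build_edges(peaks, tol)
    best = [""] * len(ordered)
    # Work backwards so each vertex can reuse the best tag of its successors.
    for i in range(len(ordered) - 1, -1, -1):
        for j, aa in edges[i]:
            candidate = aa + best[j]
            if len(candidate) > len(best[i]):
                best[i] = candidate
    return max(best, key=len) if best else ""


# Calculated y1..y5 of IADAHLDR; reading upward gives the tag D, L, H, A.
print(longest_tag([175.118951, 290.145891, 403.229951, 540.288861, 611.325971]))
```

Because y ions are read from the C-terminus, the tag comes out in reverse order relative to the peptide sequence.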
SpectrumMill Scoring of MS/MS Interpretations
(R)E F E|I|I|W|V T K(H) 9.24 78.1% DEBS: -3 DFwdRev: 1.75
Score = Assignment Bonus (Ion Type Weighted) - Non-assignment Penalty (Intensity Weighted)
Peak Selection: De-Isotoping, S/N thresholding, Parent - neutral removal, Charge assignment
Match to Database Candidate Sequences
SPI (%): Scored Peak Intensity
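One way to read SPI (%) is as the share of selected peak intensity that the interpretation accounts for. The sketch below computes that share under this assumption. It is not SpectrumMill's implementation: the function name, the matching tolerance, and the matching rule are all hypothetical, and the ion-type and intensity weights of the score itself are not given on the slide, so the score is not modeled.

```python
def scored_peak_intensity(peaks, theoretical_mz, tol=0.5):
    """Percent of total peak intensity matched by theoretical fragment m/z.

    peaks: list of (mz, intensity) tuples after peak selection.
    theoretical_mz: fragment m/z values predicted for a candidate sequence.
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    matched = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - t) <= tol for t in theoretical_mz)
    )
    return 100.0 * matched / total


peaks = [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)]
print(scored_peak_intensity(peaks, [100.1, 300.2]))
```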
Peptide Sequencing
[Figure: four-residue peptide backbone, H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, with side-chain HO and NH3+ groups, annotated with N-terminal fragments a2, b2, a3, b2-H2O, b3-NH3 and C-terminal fragments y1, y2, y3, y2-NH3, y3-H2O.]
Cry Babies (b-H2O w/o b) No b/y pairs E/C|L/Q/T/C/R 113 128 101 160 160 101 128 113 113 128 101 160 P(m/z)-H2O P(m/z)-2H2O No b/y pairs