Manual De Novo Peptide MS/MS Interpretation

Presentation transcript:

Manual De Novo Peptide MS/MS Interpretation For Evaluating Database Search Results Karl R. Clauser Broad Institute of MIT and Harvard Cold Spring Harbor Proteomics Course July, 2009 9/19/2018 7:05:56 AM

Outline: AA properties; fragmentation pathways and ion types; b/y pairs; fragment charge from mass defect; non-mobile proton; neutral loss ion types; phosphosite ambiguity; sample handling chemistry artifacts; isobaric co-eluters; mass tolerance units and isobaric AA's; other tutorials; dominant ions; AA adjacencies; positions.

AA structures (nominal residue masses, Da)
pK: N-term 7.5; C-term 3.5
K 128, H 137, R 156 (side-chain pK: 10, 6, 12)
D 115, E 129 (side-chain pK: 4.0, 4.5)
N 114, Q 128
S 87, T 101, Y 163
P 97
M 131, C 103 (+57 IAA)
L 113, I 113
G 57, A 71, V 99
F 147, W 186
http://ionsource.com/Card/clipart/aaclipart.htm

Charge-directed Fragmentation Scheme. [Scheme: a protonated four-residue peptide, H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, fragments at a backbone amide bond by b ion formation and/or y ion formation, giving b3+ or y1+; in each case the complementary neutral fragment is pumped away by the vacuum system.]
Proton Mobility. Mobile: zpre > #Arg + #Lys + #His. Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg. Non-mobile: zpre ≤ #Arg.
For peptides with non-mobile protons, fragmentation tends to proceed via charge-remote mechanisms. MS/MS spectra will be dominated by a few ions, typically from cleavage at the C-term side of D, E and the N-term side of P.
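The proton mobility rule above can be sketched as a small Python function. This is only an illustration: the function name and return strings are made up here, and the equality cases (zpre equal to one of the counts) are assigned to the less mobile class, which is an assumption of this sketch.

```python
def proton_mobility(sequence, z_pre):
    """Classify a peptide by the proton mobility rule on the slide.

    sequence: one-letter amino acid string (case is ignored).
    z_pre:    precursor charge state.
    Equality cases go to the less mobile class (assumption of this sketch).
    """
    seq = sequence.upper()
    n_arg = seq.count("R")
    n_basic = n_arg + seq.count("K") + seq.count("H")
    if z_pre > n_basic:
        return "mobile"
    if z_pre > n_arg:
        return "partially mobile"
    return "non-mobile"


if __name__ == "__main__":
    # Peptides taken from later slides, all assumed to be z = 2 precursors.
    for pep in ("AEDTALYYCAK", "IADAHLDR", "ISRPGDSDDSR"):
        print(pep, proton_mobility(pep, 2))
```

With two Arg and z = 2, the peptide ISRPGDSDDSR from the "Sparse Dominant Fragmentation" slide comes out as non-mobile, matching the label on that slide.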

Sequence Specific Fragment Ion Types. [Scheme: backbone cleavage sites of a four-residue peptide, H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, showing N-terminal fragments a3, b3, c3 and C-terminal fragments x1, y1, z1.]
Ion type | restrictions | residues | delta
a-NH3 | contains NH3 residue | R K N Q | -17
b-NH3, y-NH3 | contains NH3 residue | R K N Q | -17
b-H2O, y-H2O | contains H2O residue | S T D E | -18
b-H3PO4, y-H3PO4 | contains H3PO4 residue | s t | -98
y++, b++ | contains charged residues | R H K | 1

Complementary Ions: b/y pairs. E V Q L V|E/S|G|G|G L|V|K|P G G\S\L\R [Spectrum figure; labeled mass differences: 128, 99, 99, 128.]
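A minimal Python sketch of why b and y ions come in complementary pairs: for singly charged fragments of an unmodified peptide, b_i + y_(n-i) equals the singly protonated precursor mass plus one proton. The monoisotopic residue masses are standard values; the function names (`by_ions`, `precursor_mh`) are illustrative and not taken from the slides.

```python
# Monoisotopic residue masses (Da); unmodified residues only.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def precursor_mh(peptide):
    """Singly protonated precursor mass [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def by_ions(peptide):
    """Return (b, y): singly charged b1..b(n-1) and y1..y(n-1) m/z lists."""
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[-j:]) + WATER + PROTON for j in range(1, n)]
    return b, y


if __name__ == "__main__":
    pep = "IADAHLDR"  # peptide from the doubly-charged-fragment slide
    b, y = by_ions(pep)
    n = len(pep)
    target = precursor_mh(pep) + PROTON
    for i in range(1, n):
        # b_i pairs with y_(n-i); each pair sums to [M+H]+ plus one proton.
        print(f"b{i} + y{n - i} = {b[i - 1] + y[n - i - 1]:.4f} (target {target:.4f})")
```

When both members of a pair are observed, their sum is a quick consistency check on the assigned precursor mass and charge.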

Dual Picket Fence. A E/D|T|A|L|Y|Y|C A\K [Spectrum figure; labeled mass differences along one ion series: 115, 101, 71, 113, 163, 163 (D, T, A, L, Y, Y), and the same values in reverse order along the complementary series.]

Uniqueness of a Peptide Sequence. Clauser, K. R.; Baker, P. R.; Burlingame, A. L. "Role of Accurate Mass Measurement (+/- 10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching", Anal. Chem. 1999, 71, 2871-2882.

Diagnose Doubly Charged Fragment Ions I/A|D|A|H|L|D|R

Dominant Cleavage: Proline N-side. N F|P/S/P V D A A F R [Spectrum figure; labeled ions: y9, b2; labeled mass differences: 28, 87, 97.]

Sparse Dominant Fragmentation. (K)I S R|P G D|S D|D|S R(S) Non-mobile proton: zpre ≤ #Arg. [Spectrum figure; labeled mass differences: 115, 202, 202, 115.]

Cry Babies (b-H2O & b pairs). E/H/A|V/E|G/D|C D|F Q L L K [Spectrum labels: P(m/z)-H2O, P(m/z)-2H2O.]

Interpreting MS/MS Spectra is Fun!! Kaitlin Aidan Jack Andrea

Sources of Incorrect MS/MS Interpretations
Major
Database: peptide not in database; mutation; MS/MS not from a peptide.
Unanticipated protein chemistry: chemical modification; post-translational modification.
Enzyme/ion source: non-specific cleavage; in-source fragmentation yields MS3.
Minor
Algorithm: fragment ion types of instrument not accounted for; peak detection.
Instrument resolution: wrong parent charge; wrong fragment charge.
User competence: wrong parameters selected.

Phospho Site Ambiguity – S/T. L P/S s/P/V|Y/E/D|A A S F K [Spectrum labels: P(m/z)-H3PO4, P(m/z)-H3PO4-H2O, P(m/z)-H3PO4-2H2O.]

Phospho Site Ambiguity – S/T. Two candidate site assignments for the same spectrum:
L A G G Q/T/S Q|P T T|P L\T s/P Q R
L A G G Q/T/S Q|P T T|P L\t S/P Q R

Reliability of LC/MS/MS Phosphoproteomic Literature
Citation | Approach | Instrument | #sites | #ambiguous sites | Scores shown | Site ambig. labeled | Supplem. spectra shown
Ballif, BA, … Gygi, SP, 2004, MCP, 3, 1093-1101 | 1D gel, digest, SCX, LC/MS/MS | LCQ Deca XP | 546 | 86 | yes | yes | no
Rush, J, … Comb, MJ, 2005, Nat Biotech, 23, 94-101 | digest lysate, pTyr Ab, LC/MS/MS | LCQ Deca XP | 628 | 0 | yes | no | no
Collins, MO, … Grant, SGN, 2005, J Biol Chem, 280, 5972-5982 | protein IMAC, peptide IMAC, LC/MS/MS | Q-Tof Ultima | 331 | 42 | no | yes | no
Gruhler, A, … Jensen, ON, 2005, MCP, 4, 310-327 | digest lysate, SCX, IMAC, LC/MS/MS | LTQ-FT | 729 | 0 | yes | no | no
“Resulting sequences were inspected manually …. When the exact site of phosphorylation could not be assigned for a given phosphopeptide, it was tabulated as ambiguous.”
“All identified phosphopeptides were manually validated, and localization of phosphorylated residues within the individual peptide sequences were manually assigned…”
“All spectra supporting the final list of assigned peptides used to build the tables shown here were reviewed by at least three people to establish their credibility.”
“Assignment of phosphorylation sites was verified manually with the aid of PEAK Studio (Bioinformatics Solutions) software.”

Expect Woes & Nuisances
Sample Handling Chemistry (modification | delta | site | cause):
Carbamylation | +43 | n-term, Lys | urea in digest buffer
Deamidation | +1 | N -> D | sample in acid
pyroGlutamic acid | -17 | n-term Q | sample in acid
Oxidized Met | +16 | M | gels
Cys alkylation reagent | +x | n-term, W
Data Dependent Acquisition Parameters: isobaric co-eluters.
Protein Isoforms / Family Members: isobaric peptides from related proteins.

Stinkers (b-NH3) & Pyroglutamic Acid: -17 Da, Q to q.
(R)Q L/Q/L/A|Q/E/A|A Q\K(R) [Spectrum label: P(m/z)-NH3.]
(R)q L/Q|L|A|Q|E|A|A\Q\K(R)

Deamidation. Two interpretations of the same spectrum:
G S/E/S|G|I|F|T|n\T K (equivalently G S/E/S|G|I|F|T|D\T K): score 18.35, SPI 96.9%, +0.007 Da
G S/E S\G\I\F\T\N/T K: score 6.62, SPI 43.4%, +0.986 Da

Deamidation of Asn: +1 Da. Asn - NH + O = Asp. (ionsource.com)

Carbamylation. N/S/L/E/T/L/L|y/K|P V/D\R. V/S T A/Q/D V/I|Q Q t L\C K: +43, 18.4, 89%. V/S T A/Q/D V/I|Q Q t L\C K: +0, 11.1, 68%. +0, 18.5, 93%.

Carbamylation from Urea in Digest Buffer: +43 Da (CNHO).
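The nominal shifts quoted on the sample-handling slides (+1 for deamidation, +43 for carbamylation, -17 for pyroglutamic acid, +16 for Met oxidation) follow from the elemental change of each reaction. The sketch below computes the monoisotopic values from standard atomic masses; the helper name `delta_mass` and the dictionary layout are illustrative only.

```python
# Monoisotopic atomic masses (Da).
ATOM = {"H": 1.00782503, "C": 12.0, "N": 14.0030740, "O": 15.9949146}


def delta_mass(gained, lost=None):
    """Mass change for a modification that gains and loses the given atoms."""
    lost = lost or {}
    plus = sum(ATOM[el] * n for el, n in gained.items())
    minus = sum(ATOM[el] * n for el, n in lost.items())
    return plus - minus


DELTAS = {
    # Asn - NH + O = Asp
    "deamidation (N -> D)": delta_mass({"O": 1}, {"N": 1, "H": 1}),
    # addition of CNHO at the N-terminus or Lys
    "carbamylation": delta_mass({"C": 1, "N": 1, "H": 1, "O": 1}),
    # N-terminal Q cyclizes with loss of NH3
    "pyroglutamic acid (n-term Q)": delta_mass({}, {"N": 1, "H": 3}),
    # addition of one oxygen to Met
    "oxidized Met": delta_mass({"O": 1}),
}

if __name__ == "__main__":
    for name, delta in DELTAS.items():
        print(f"{name:32s} {delta:+.4f} Da")
```

The deamidation value is about 0.984 Da rather than exactly 1. On an accurate-mass instrument, that makes a wrongly unmodified (or wrongly deamidated) interpretation show up as a precursor mass error of roughly 1 Da.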

Carbamylated N-term. I/G/E|G/T/y/G V|V|Y\K [Spectrum labels: +43 b ions; P(m/z)-CNHO, P(m/z)-CNHO-H2O.]

Met Oxidation – localizing the site. Two candidate site assignments:
(R)G V D L D Q L/L|D|M|S|Y|E|Q|L|m|Q|L/Y S A R(Q)
(R)G V D L D Q L/L|D|m|S/Y/E|Q|L|M|Q|L/Y S A R(Q)

Know Your Chromatographic Peak Widths. Top database search result: (K)E E m E S A E G|L|K\G P/m\K(S), score 8.78, SPI 71.0%, DFwdRev: 3.49. Merged 4 spectra: same precursor, 50 sec window, different peptides.

Consequences of Inappropriate Tolerance Units (using Da tolerance when instrument errors are in ppm). [Figure panels: too loose, too tight, just right.]
Isobaric AA's (m = oxidized Met):
I = L (C6 H11 N O) = 113.08406
K ~ Q (C6 H12 N2 O, C5 H8 N2 O2): 128.09496 ~ 128.05858, Δ = 0.03638
F ~ m (C9 H9 N O, C5 H9 N O2 S): 147.06841 ~ 147.0354, Δ = 0.0330
Isobaric AA combinations:
GG = N (C4 H6 N2 O2, C4 H6 N2 O2) = 114.04293
GA = Q ~ K (C5 H8 N2 O2, C5 H8 N2 O2, C6 H12 N2 O): 128.05858 ~ 128.09496, Δ = 0.03638
DA ~ W ~ VS (C7 H10 N2 O4, C11 H10 N2 O, C8 H14 N2 O3): 186.06405 ~ 186.07931 ~ 186.10044, Δ = 0.01526, Δ = 0.02113
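To make the tolerance-unit point concrete, a fixed Da window corresponds to a different ppm window at every m/z, and vice versa. The two helpers below are a minimal sketch (names are illustrative) for converting between the units and for seeing what the K/Q difference of 0.03638 Da means in ppm.

```python
def ppm_to_da(mz, ppm):
    """Width in Da of a tolerance of `ppm` parts per million at a given m/z."""
    return mz * ppm / 1e6


def da_to_ppm(mz, delta_da):
    """A mass difference in Da expressed in ppm at a given m/z."""
    return delta_da / mz * 1e6


if __name__ == "__main__":
    k_minus_q = 128.09496 - 128.05858  # 0.03638 Da
    for mz in (200.0, 500.0, 1000.0, 2000.0):
        print(
            f"m/z {mz:7.1f}: K/Q differ by {da_to_ppm(mz, k_minus_q):6.1f} ppm; "
            f"10 ppm = {ppm_to_da(mz, 10.0):.4f} Da; "
            f"0.5 Da = {da_to_ppm(mz, 0.5):7.1f} ppm"
        )
```

The same 0.03638 Da gap is about 182 ppm at m/z 200 but only about 18 ppm at m/z 2000, so a single Da setting is too loose at one end of the spectrum or too tight at the other.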

Additional Resources. Google: “de novo sequencing tutorial”
Don Hunt and Jeff Shabanowitz - manual: http://www.ionsource.com/tutorial/DeNovo/DeNovoTOC.htm
Rich Johnson - manual: http://www.abrf.org/ResearchGroups/MassSpectrometry/EPosters/ms97quiz/SequencingTutorial.html
PEAKS - automated: http://www.bioinformaticssolutions.com/products/peaks/support/tutorials/PEAKS_De_Novo.html

Physicochemical Complications to Computational Interpretation
Incomplete fragmentation.
Inconsistent intensity of fragment ion types: instrument type dependent; amino acid dependent.
Isobaric AA's: I = L (C6 H11 N O); K = Q (C6 H12 N2 O, C5 H8 N2 O2).
Isobaric AA combinations: GG = N (C4 H6 N2 O2, C4 H6 N2 O2); GA = K = Q (C5 H8 N2 O2, C6 H12 N2 O, C5 H8 N2 O2); W = DA = VS (C11 H10 N2 O, C7 H10 N2 O4, C8 H14 N2 O3).
Parent charge uncertainty.
Fragment charge uncertainty.
Chemical or post-translational modifications.

Frequency of Dominance at Adjacent AA’s – v9, z=2. [Figure: ratio of # dominant ions to # total cleavages for adjacent amino acids; panels: Mobile (4525 spectra), Partially Mobile (2061 spectra), Non-mobile (114 spectra); legend bins: >0.8, 0.4 - 0.8, 0.1 - 0.4, - (<3 obsv).]

Frequency and Distribution Dominant Ions v9. [Figure labels: 67%, 72%, 76%; 5758, 2974, 177.] Proton Mobility. Mobile: zpre > #Arg + #Lys + #His. Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg. Non-mobile: zpre ≤ #Arg. Precursor z=2, 6699 spectra from a trypsin GeLC/MS/MS experiment on an LTQ-FT.

Short Peptides Often Yield a Dominant Ion Cleavage Between Residues 2 & 3. Bonus C-side b2: residues at position 3: PRKH. Bonus N-side b2: residues at position 1 or 2: PRKHNQqVILFYW. Bonus ignore b2: neither of above but still dominant. If there is a mobile or partially mobile proton, peptides of length <14 are likely to yield at least one intense fragment ion between residues 2 and 3 (yellow and pink curves shifted to shorter lengths, purple curve shifted to longer lengths). Intense ions are favored by the presence of PRKH at residue 3 or the presence of PRKHNQqVILFYW at residues 1 or 2.

Acknowledgements. Broad Institute: Steve Carr, Terri Addona, Jinyan Du. MIT: Michael Yaffe, Majbrit Hjerrld, Drew Lowery.

Frequency of Position Dependent Dominant Ion(s) v9. Proton Mobility. Mobile: zpre > #Arg + #Lys + #His. Partially mobile: zpre ≤ #Arg + #Lys + #His and > #Arg. Non-mobile: zpre ≤ #Arg.

Related Proteins: Distinct Non-differentiable Peptides
(R)N P P R\F A\F|V|E|F|E|D|P\R(D)
(R)R G G/P P\F A\F|V|E|F|E|D|P R(D)

Setting Autovalidation Thresholds
Step 1 - Protein Mode: 2 or more peptides/protein; each spectrum: moderate or better score.
Step 2 - Peptide Mode: 1 peptide/protein; each spectrum: excellent score.
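The two-step thresholding can be sketched as follows. This is not the SpectrumMill implementation: the function name, the input layout, and the numeric cut-offs for a "moderate" and an "excellent" score are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def autovalidate(hits, moderate=7.0, excellent=13.0):
    """Two-step autovalidation sketch.

    hits: iterable of (protein, peptide, score) tuples.
    moderate, excellent: hypothetical score cut-offs (not from the slides).
    Returns the set of accepted (protein, peptide) pairs.
    """
    # Step 1 - protein mode: proteins with 2 or more distinct peptides,
    # each with a moderate or better score.
    by_protein = defaultdict(set)
    for protein, peptide, score in hits:
        if score >= moderate:
            by_protein[protein].add(peptide)
    accepted = {
        (protein, peptide)
        for protein, peptides in by_protein.items()
        if len(peptides) >= 2
        for peptide in peptides
    }
    # Step 2 - peptide mode: a single peptide is enough for a protein,
    # but only if its spectrum has an excellent score.
    for protein, peptide, score in hits:
        if score >= excellent:
            accepted.add((protein, peptide))
    return accepted


if __name__ == "__main__":
    hits = [
        ("P1", "AAK", 8.0), ("P1", "CCR", 9.0),  # two moderate hits: accepted
        ("P2", "DDK", 8.0),                      # lone moderate hit: rejected
        ("P3", "EEK", 14.0),                     # lone excellent hit: accepted
    ]
    print(sorted(autovalidate(hits)))
```

The design idea is that corroboration by a second peptide lowers the evidence needed per spectrum, while single-peptide identifications must carry all the evidence themselves.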

Distinguishing 7 Family Members (14-3-3 proteins)
Wt/Mut PBD | Gene Symbol
13.4 | YWHAZ
16.5 | YWHAE
13.0 | SFN
11.1 | YWHAB
11.4 | YWHAT
13.3 | YWHAG
13.5 | YWHAH
None have a PLK-PBD binding motif: S[st]P. Each has 2-4 PLK phosphorylation motifs: [ED]X[st][FLIYWVM]. No phosphorylated peptides were recovered.
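The two motifs can be searched for with regular expressions. In the sketch below the lowercase s/t of the slide notation (phosphorylated residue) is matched against plain S/T, since the input is assumed to be an unmodified protein sequence, and X is taken to mean any residue. The function name and the example sequence are made up for illustration.

```python
import re

# Lookahead groups so that overlapping occurrences are all reported.
PBD_BINDING = re.compile(r"(?=(S[ST]P))")            # S[st]P
PLK_PHOSPHO = re.compile(r"(?=([ED].[ST][FLIYWVM]))")  # [ED]X[st][FLIYWVM]


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched residues) for each motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    seq = "AASTPKEDSFGG"  # made-up sequence, not a real 14-3-3 protein
    print("PBD binding motifs:     ", find_motifs(seq, PBD_BINDING))
    print("PLK phosphorylation motifs:", find_motifs(seq, PLK_PHOSPHO))
```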

Chromatographic Peak Sampling: Abundance/Identity Trade-off. MS: periodic focus on several peptides (abundance). MS/MS: sliding focus on each peptide (identity). [Figure axes: relative abundance vs. retention time.]

Distinguishing Family Members (ROCK1 & ROCK2)
#Same Peps | #Distinguishing Peps | WT/Mut PBD | Gene Symbol
5 | 39 | 30.8 | ROCK2
5 | 1 | 26.5 | ROCK1

Enabling Integrated Reverse Database Searches. Each database sequence candidate passing the parent mass filter is additionally subject to “inner sequence reversal” and interpreted against the MS/MS spectrum, i.e. SAMPLER becomes SELPMAR. Search time increases ~1.5X.
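Judging from the SAMPLER example, “inner sequence reversal” keeps the first and last residues in place and reverses everything between them, so the decoy has the same parent mass and the same terminal residues as the original. A minimal sketch under that reading (the function name is illustrative):

```python
def inner_reverse(peptide):
    """Reverse the residues between the first and last positions."""
    if len(peptide) < 3:
        return peptide  # nothing to reverse
    return peptide[0] + peptide[1:-1][::-1] + peptide[-1]


if __name__ == "__main__":
    print(inner_reverse("SAMPLER"))
```

Because the composition is unchanged, the reversed candidate passes the same parent mass filter as the forward one and competes against it for the same spectrum.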

Reviewing Reversed Database Search Details. This is an overview of the SpectrumMill software. The next several slides show some of these features in more detail.

Near-Identical Sequences Found in Reversed DB

SEQUEST - preliminary search. [Diagram: the sequence database is filtered by parent mass (Pm). Step 1: calculate ALL theoretical fragment ions for EACH sequence, giving a model spectrum (a, b, y, b-NH3, y-NH3, b-H2O, y-H2O). Step 2: filter the experimental spectrum. Step 3: compare the filtered experimental spectrum with the model spectrum. Axes: relative abundance vs. mass (m/z).]

MS-Tag preliminary search. [Diagram: Step 1: transform the filtered experimental spectrum into an N-terminal sequence spectrum (b, b-NH3, b-H2O, a) and a C-terminal sequence spectrum (y, y-NH3). Step 2: calculate partial ladders (partial N-terminal ladder, partial C-terminal ladder) for EACH sequence in the database passing the parent mass (Pm) filter. Step 3: compare. Axes: relative abundance vs. mass (m/z).]

Mass Differences Correspond to Amino Acids. [Figure: spectrum (intensity vs. m/z) in which the spacings between adjacent peaks are labeled with the letters of “sequence”.]

Graph Theory Based de novo Algorithms. Transform the spectrum to a spectrum graph, with vertices (from peak m/z’s) and edges (from mass differences); find the best path; read off the sequence.
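A toy version of the spectrum-graph idea, as a sketch only: peaks are vertices, two peaks are joined by an edge when their m/z difference matches a residue mass, and the "best" path is simply the longest one. Real de novo tools score paths using intensities, multiple ion types, and charge states; none of that is modeled here. The function name, the tolerance, and the choice of longest path as the objective are assumptions of this sketch.

```python
# Monoisotopic residue masses (Da). I is omitted because it is isobaric
# with L, so "L" in the output stands for L or I.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def best_path(peaks, tol=0.01):
    """Longest residue tag readable from one singly charged ion series.

    peaks: iterable of peak m/z values.
    tol:   absolute tolerance in Da for matching a mass difference.
    """
    peaks = sorted(peaks)
    # best[j] = longest tag whose path ends at peak j
    best = [""] * len(peaks)
    for j in range(len(peaks)):
        for i in range(j):
            diff = peaks[j] - peaks[i]
            for aa, mass in RESIDUE_MASS.items():
                # Edge i -> j exists if the difference matches a residue.
                if abs(diff - mass) <= tol and len(best[i]) + 1 > len(best[j]):
                    best[j] = best[i] + aa
    return max(best, key=len, default="")


if __name__ == "__main__":
    # b1..b6 of SAMPLER, built from the residue masses plus one proton.
    mz, peaks = 1.007276, []
    for aa in "SAMPLE":
        mz += RESIDUE_MASS[aa]
        peaks.append(mz)
    print(best_path(peaks))
```

Because the peaks are sorted and edges only point toward higher m/z, the graph is acyclic and the longest path can be found with a single dynamic-programming pass.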

SpectrumMill Scoring of MS/MS Interpretations
Peak selection: de-isotoping, S/N thresholding, parent - neutral removal, charge assignment.
Match to database candidate sequences.
Score = Assignment Bonus (ion type weighted) - Non-assignment Penalty (intensity weighted)
SPI (%): Scored Peak Intensity
Example: (R)E F E|I|I|W|V T K(H), score 9.24, SPI 78.1%, DEBS: -3, DFwdRev: 1.75
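The shape of the score can be illustrated with a toy calculation. The slides do not give the ion-type weights or the form of the penalty, so every number and name below is a hypothetical placeholder. Only the structure (a bonus per assigned peak weighted by ion type, minus a penalty weighted by unassigned intensity, with SPI as the percentage of selected peak intensity that is assigned) follows the slide.

```python
# Hypothetical ion-type weights; the real SpectrumMill values are not
# given in the slides.
ION_BONUS = {"y": 1.0, "b": 1.0, "a": 0.5, "b-H2O": 0.5, "y-NH3": 0.5}


def spi(peaks, assigned):
    """Percent of selected peak intensity assigned to fragment ions.

    peaks:    list of (m/z, intensity) after peak selection.
    assigned: dict mapping peak index -> ion type label.
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    matched = sum(peaks[k][1] for k in assigned)
    return 100.0 * matched / total


def score(peaks, assigned, penalty_weight=1.0):
    """Assignment bonus minus non-assignment penalty (toy version)."""
    bonus = sum(ION_BONUS.get(ion, 0.25) for ion in assigned.values())
    unassigned_fraction = 1.0 - spi(peaks, assigned) / 100.0
    return bonus - penalty_weight * unassigned_fraction


if __name__ == "__main__":
    peaks = [(100.0, 50.0), (200.0, 30.0), (300.0, 20.0)]
    assigned = {0: "b", 1: "y"}  # the third peak is left unexplained
    print(f"SPI = {spi(peaks, assigned):.1f}%, score = {score(peaks, assigned):.2f}")
```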

Peptide Sequencing. [Structure diagram: four-residue peptide backbone, H-NH-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH, with cleavage sites labeled for the N-terminal ions a2, b2, a3, b2-H2O, b3-NH3 and the C-terminal ions y1, y2, y3, y2-NH3, y3-H2O.]

Cry Babies (b-H2O w/o b): no b/y pairs. E/C|L/Q/T/C/R [Spectrum labels: mass differences 113, 128, 101, 160; P(m/z)-H2O, P(m/z)-2H2O.]