2015.06.19 김지형


김지형

Introduction
Precursor peptides are dynamically selected for fragmentation, with exclusion to prevent repeated acquisition of MS/MS spectra for the same peptide.
This increases the throughput of proteomic experiments, but it also causes peptide ions with weak intensities to be fragmented.
Wrong interpretation: the fraction of precursor ion masses that are interpreted incorrectly is up to ~40%.
Overlapping isotopic clusters are often observed in complex proteome samples and result in wrong interpretation of their masses.

Introduction
Determining isotopic clusters and their monoisotopic masses is the first step in interpreting complex mass spectra generated by high-resolution mass spectrometers.
Fast, automated, and accurate interpretation of the very large amount of MS data is a fundamental and critical step in MS-based proteomic experiments, and it remains the subject of much research activity.

Introduction
Mann et al.: suggested a deconvolution algorithm to find charge states.
Senko et al.: introduced the notion of an "average" amino acid, called averagine, and suggested a computational method that uses it to determine monoisotopic masses.
Zscore: a fast and automated isotopic cluster identification algorithm based on a charge scoring scheme.
Other algorithms: ESI-ISOCONV, MATCHING, PepList, LASSO, AID-MS, THRASH.

Introduction
THRASH is one of the most widely used algorithms.
It employs the Fourier transform/Patterson method for charge determination and least-squares fitting to compare a peak cluster with an averagine isotopic distribution.
Cons: the use of least-squares fitting and/or the averagine isotopic distribution often leads to an inaccurate monoisotopic mass that differs from the correct value by 1-2 Da.

Methods
1. Present a probabilistic model of the isotopic distribution of a polypeptide.
2. Describe approximations of the intensity ratio and intensity ratio product functions.
   Intensity ratio function: the intensity ratio of two adjacent peaks.
   Intensity ratio product function: the intensity ratio product of three adjacent peaks.
3. Determine isotopic clusters and monoisotopic masses with the suggested algorithm.

Methods
1. Isotopic Distribution Model
Notations
A = {C, H, N, O, S}: the set of atoms that compose a polypeptide.
X_a: the +a isotope of an atom X (for each atom X ∈ A), with a ∈ {1, 2, 4}.
P_Xa: the existential probability of X_a. For example, P_C1 = [value not preserved in the transcript] %.
n_X: the number of atoms X in the polypeptide (the elemental composition of the polypeptide).

Methods
1. Isotopic Distribution Model
Notations
Isotopic distribution of a polypeptide: the theoretical masses and intensities of the peaks generated by all instances of the polypeptide.
I_k: the intensity of the k-th peak in an isotopic distribution (k >= 0).
I_0: the intensity of the monoisotopic peak.
I_k (k >= 1): the intensity of the peak whose mass difference from the monoisotopic peak is k.
[Figure: isotopic distribution with the peaks I_0 and I_k labeled]

Methods
1. Isotopic Distribution Model
Lemma 1
The intensity I_k in an isotopic distribution approximates to [equation and its "where" definitions not preserved in the transcript].

Methods
1. Isotopic Distribution Model
Lemma 1
I_k can be computed as the coefficient of x^k in the expansion of the following polynomial [polynomial not preserved in the transcript].
The intensity I_k in an isotopic distribution of a polypeptide is regarded as the sum of the existential probabilities of all polypeptide instances with mass difference k.
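The polynomial itself did not survive in this transcript. A minimal sketch of the idea, assuming the standard generating-function form P(x) = product over X in A of (P_X0 + P_X1 x + P_X2 x^2 + P_X4 x^4)^n_X, is shown below. The abundance values are approximate natural abundances supplied for illustration, and the names `ISOTOPE_PROBS`, `poly_mul`, `poly_pow`, and `isotopic_distribution` are hypothetical; none of them come from the slides.

```python
# Approximate natural isotope abundances (assumed values, not from the slides).
# Each entry maps a mass shift a (0, +1, +2, +4) to the probability P_Xa.
ISOTOPE_PROBS = {
    "C": {0: 0.9893, 1: 0.0107},
    "H": {0: 0.999885, 1: 0.000115},
    "N": {0: 0.99636, 1: 0.00364},
    "O": {0: 0.99757, 1: 0.00038, 2: 0.00205},
    "S": {0: 0.9499, 1: 0.0075, 2: 0.0425, 4: 0.0001},
}


def poly_mul(p, q, max_k):
    """Multiply two coefficient lists, keeping only terms up to x**max_k."""
    out = [0.0] * (max_k + 1)
    for i, a in enumerate(p):
        if a == 0.0:
            continue
        for j, b in enumerate(q):
            if i + j > max_k:
                break
            out[i + j] += a * b
    return out


def poly_pow(base, n, max_k):
    """Raise a polynomial to the n-th power by repeated squaring."""
    result = [1.0] + [0.0] * max_k
    base = (base + [0.0] * (max_k + 1))[: max_k + 1]
    while n > 0:
        if n & 1:
            result = poly_mul(result, base, max_k)
        base = poly_mul(base, base, max_k)
        n >>= 1
    return result


def isotopic_distribution(composition, max_k=6):
    """Return [I_0, ..., I_max_k] as the coefficients of x**k in P(x)."""
    dist = [1.0] + [0.0] * max_k
    for atom, count in composition.items():
        probs = ISOTOPE_PROBS[atom]
        base = [probs.get(a, 0.0) for a in range(max(probs) + 1)]
        dist = poly_mul(dist, poly_pow(base, count, max_k), max_k)
    return dist


if __name__ == "__main__":
    # Hypothetical elemental composition, for illustration only.
    print(isotopic_distribution({"C": 50, "H": 80, "N": 14, "O": 15, "S": 1}))
```

Truncating at `max_k` keeps the cost low, since only the first few peaks of a cluster carry noticeable intensity for peptide-sized molecules.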

Methods
1. Isotopic Distribution Model
Lemma 1
Intensity I_0: the probability that the polypeptide contains no isotopes, which is the constant term of the polynomial P(x).

Methods
1. Isotopic Distribution Model
Lemma 1
Intensity I_1: the probability that the polypeptide contains exactly one +1 isotope, which is the coefficient of x in P(x).
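The two special cases can be written out explicitly. The derivation below is a sketch that assumes P(x) has the usual product form over the atoms in A (the slide's own formula is not preserved in this transcript), using the notation P_{X_a} and n_X from the earlier slides.

```latex
\begin{align*}
P(x) &= \prod_{X \in A} \left( P_{X_0} + P_{X_1} x + P_{X_2} x^{2} + P_{X_4} x^{4} \right)^{n_X}
  && \text{(assumed form)} \\
I_0 &= P(0) = \prod_{X \in A} P_{X_0}^{\,n_X} \\
I_1 &= [x^{1}]\, P(x)
     = \sum_{X \in A} n_X \, P_{X_1} \, P_{X_0}^{\,n_X - 1} \prod_{Y \neq X} P_{Y_0}^{\,n_Y}
     = I_0 \sum_{X \in A} n_X \frac{P_{X_1}}{P_{X_0}} \\
\frac{I_1}{I_0} &= \sum_{X \in A} n_X \frac{P_{X_1}}{P_{X_0}}
\end{align*}
```

Under this assumption the ratio I_1/I_0 is linear in the atom counts n_X, which grow roughly in proportion to the polypeptide mass. That is consistent with the linear form I_{k+1}/I_k = cm + b stated in Theorem 1.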


Methods
2. Ratio Function and Ratio Product Functions
Theorem 1
I_{k+1}/I_k = cm + b, where m is the mass of the polypeptide and c and b are constants.
About 100,000 tryptic peptides of 400 Da to 5,200 Da, generated from UniProt DB 8.0, were sampled, and the ratio I_{k+1}/I_k was computed for each peptide.

Methods
2. Ratio Function and Ratio Product Functions
Theorem 1
I_{k+1}/I_k = cm + b
The reason for choosing the threshold of 1800 Da: for a polypeptide of up to 1800 Da, the first peak (the monoisotopic peak) is also the most abundant peak.
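The linear relationship can be illustrated for k = 0 with a small sketch. It does not reproduce the UniProt sampling described on the slide. Instead it assumes averagine-like compositions (Senko et al.) scaled to a given mass and approximate natural abundances, and it uses the identity I_1/I_0 = sum of n_X * P_X1 / P_X0. The fitted values are therefore illustrative and are not the paper's c and b; all names are hypothetical.

```python
# Averagine composition per residue and its average mass (Senko et al.),
# used here only as a stand-in for the sampled tryptic peptides.
AVERAGINE = {"C": 4.9384, "H": 7.7583, "N": 1.3577, "O": 1.4773, "S": 0.0417}
AVERAGINE_MASS = 111.1254

# Assumed approximate natural abundances: (P_X0, P_X1) per atom.
ABUNDANCE = {
    "C": (0.9893, 0.0107),
    "H": (0.999885, 0.000115),
    "N": (0.99636, 0.00364),
    "O": (0.99757, 0.00038),
    "S": (0.9499, 0.0075),
}


def ratio_i1_i0(mass):
    """I_1/I_0 = sum over atoms of n_X * P_X1 / P_X0 for an averagine-like peptide."""
    units = mass / AVERAGINE_MASS
    total = 0.0
    for atom, per_unit in AVERAGINE.items():
        n_x = round(per_unit * units)  # integer atom count at this mass
        p0, p1 = ABUNDANCE[atom]
        total += n_x * p1 / p0
    return total


def fit_line(xs, ys):
    """Ordinary least-squares fit y = c * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c = sxy / sxx
    return c, mean_y - c * mean_x


masses = list(range(400, 1801, 50))
ratios = [ratio_i1_i0(m) for m in masses]
c, b = fit_line(masses, ratios)
print(f"I_1/I_0 ~ {c:.3e} * m + {b:.3f}")
```

Under these assumptions `ratio_i1_i0(1800)` stays below 1, which agrees with the slide's remark that the monoisotopic peak is the most abundant peak for polypeptides of up to 1800 Da.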

Methods
[Figure: ratio product functions RP_max(k, m), RP_min(k, m), and RP_avg(k, m)]

Methods
3. Algorithm Overview
1) Peak picking
2) Pseudocluster identification
3) Isotopic cluster identification and monoisotopic mass determination
4) Duplicate cluster removal

Methods
3. Algorithm Overview
1) Peak picking
Remove noise and select relatively high-intensity peaks from the raw spectrum.
In our experiment, we used the peak picking algorithm of Decon2LS.

Methods
3. Algorithm Overview
2) Pseudocluster identification
Pseudoclusters are identified by scanning the selected peaks from low m/z to high m/z.
All pseudoclusters starting at each peak are found: first the pseudoclusters with charge state 1+, then the pseudoclusters with higher charge states.
Let X denote the m/z of the current peak. Then the m/z of the next peak lies in the range [X + (D - E)/C, X + (D + E)/C].
D: the estimated mass difference between two adjacent peaks in an isotopic cluster
E: the error bound
C: the charge state
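The window rule can be sketched as follows. This is a simplified illustration, not the presented implementation: the default values for D and E are placeholders chosen for the example, the function names are hypothetical, and where several peaks fall inside the window the sketch simply follows the one closest to the expected position instead of enumerating every possible pseudocluster.

```python
from bisect import bisect_left, bisect_right


def next_peak_window(x, charge, d=1.00235, e=0.01):
    """Return the m/z range [X + (D - E)/C, X + (D + E)/C] for the next peak."""
    return x + (d - e) / charge, x + (d + e) / charge


def grow_pseudocluster(mzs, start, charge, d=1.00235, e=0.01):
    """Chain peaks from mzs[start] upward; mzs must be sorted ascending.

    Returns the indices of the peaks in the pseudocluster. Requires d > e so
    that each window lies strictly above the current peak.
    """
    cluster = [start]
    current = mzs[start]
    while True:
        low, high = next_peak_window(current, charge, d, e)
        i = bisect_left(mzs, low)
        j = bisect_right(mzs, high)
        if i >= j:  # no peak inside the window: the chain ends here
            break
        expected = current + d / charge
        best = min(range(i, j), key=lambda idx: abs(mzs[idx] - expected))
        cluster.append(best)
        current = mzs[best]
    return cluster


if __name__ == "__main__":
    peaks = [500.0, 500.5012, 501.0023, 501.3, 501.5035]
    for charge in (1, 2):  # charge 1+ first, then higher charge states
        print(charge, grow_pseudocluster(peaks, 0, charge))
```

Running the same start peak once per charge state mirrors the order described on the slide, with charge 1+ tried before higher charge states.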

Methods
3. Algorithm Overview
3) Isotopic cluster identification and monoisotopic mass determination
Monoisotopic mass calculation [formula not preserved in the transcript]
m: monoisotopic mass (most abundant peak in the pseudocluster)
q: the most abundant peak is the q-th peak in the pseudocluster
p: the number of missing peaks
Score calculation (score of a pseudocluster) [formula not preserved in the transcript]
n: the number of peaks in the pseudocluster

Methods
3. Algorithm Overview
3) Isotopic cluster identification and monoisotopic mass determination
Score calculation (score of a pseudocluster)
Score for the ratio product I_k I_{k+2} / I_{k+1}^2
Score for the ratio I_{k+1} / I_k

Methods
R_min < I'_{k+1}/I'_k < R_max ⇒ scoreR(k, m, p) > 0

Methods
R_min < I'_{k+1}/I'_k < R_max ⇒ scoreRP(k, m, p) > 0

Methods
3. Algorithm Overview
3) Isotopic cluster identification and monoisotopic mass determination
Score ≤ 0: the pseudocluster is discarded.
0 < Score: the pseudocluster is selected and becomes an isotopic cluster.

Methods
3. Algorithm Overview
4) Duplicate cluster removal
Because this algorithm considers all possible pseudoclusters, many pseudoclusters can be generated from a single isotopic cluster. Duplicate clusters are removed as follows.
When two clusters have the same most abundant peak:
Charge states differ: remove the cluster with the smaller charge state.
Charge states are the same: remove the cluster with the lower score.
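One possible reading of the duplicate-removal rule is sketched below: among clusters that share the same most abundant peak, keep the one with the higher charge state, and break ties by the higher score. The `Cluster` record and its field names are hypothetical, and identifying the most abundant peak by its index in the picked peak list is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    top_peak_index: int  # index of the most abundant peak in the picked peak list
    charge: int
    score: float


def remove_duplicates(clusters):
    """Keep one cluster per most abundant peak.

    Among clusters sharing that peak, the one with the smaller charge state is
    removed; if the charge states are equal, the lower-scoring one is removed.
    """
    best = {}
    for cluster in clusters:
        kept = best.get(cluster.top_peak_index)
        if kept is None or (cluster.charge, cluster.score) > (kept.charge, kept.score):
            best[cluster.top_peak_index] = cluster
    return list(best.values())


if __name__ == "__main__":
    found = [
        Cluster(3, 1, 0.8),
        Cluster(3, 2, 0.5),  # same peak, higher charge: kept over the 1+ cluster
        Cluster(7, 2, 0.4),
        Cluster(7, 2, 0.9),  # same peak and charge, higher score: kept
    ]
    print(remove_duplicates(found))
```

Comparing the `(charge, score)` tuples applies both rules in one step, because Python orders tuples by the first element and only falls back to the second on a tie.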

Results and Discussion
Three programs were compared: RAPID, ICR2LS, and Decon2LS.
We counted the number of identified isotopic clusters of known peptides whose amino acid sequences were identified by MS/MS.
It is difficult to pick out the isotopic clusters of known peptides because the MS data can contain many peptides whose monoisotopic masses are very similar.

Results and Discussion
We therefore used the following method.
For each known peptide, find the isotopic clusters of this peptide in the MS scan where the peptide was identified by MS/MS.
If an isotopic cluster has a monoisotopic mass within a mass tolerance of 10 ppm, consider it a potentially correct isotopic cluster.
Also look for the peptide in adjacent scans. If no isotopic cluster is found within any of 10 consecutive scans, the cluster is discarded.
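The 10 ppm check can be illustrated with a short sketch. It covers only the mass tolerance test, not the search across adjacent scans, and the function names and example masses are hypothetical.

```python
def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def candidate_clusters(cluster_masses, peptide_mass, tol_ppm=10.0):
    """Return the monoisotopic masses that lie within tol_ppm of the peptide mass."""
    return [m for m in cluster_masses if abs(ppm_error(m, peptide_mass)) <= tol_ppm]


if __name__ == "__main__":
    # Made-up masses: about +5 ppm, +20 ppm, and -8 ppm from a 1000 Da peptide.
    print(candidate_clusters([1000.005, 1000.02, 999.992], 1000.0))
```

Because the tolerance is relative, the absolute window widens with mass: 10 ppm corresponds to 0.01 Da at 1000 Da and to 0.02 Da at 2000 Da.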

Results and Discussion
The number of isotopic clusters of 494 known peptides:
New method   10588
Decon2LS     10104
ICR2LS        9577

Results and Discussion
Reasons for different search results:
Some clusters are inherently ambiguous, and each program can make a different judgment.
THRASH-based algorithms make 1-2 Da errors when the position of the most abundant peak differs from that of averagine.

Results and Discussion
Identification of Overlapping Clusters
There are many overlapping clusters: hundreds of isotopic clusters are crowded into a narrow range.

Results and Discussion
Clusters that do not share peaks were identified by all programs.

Results and Discussion
Clusters that share peaks (blue) were identified only by the paper's method.

Results and Discussion Execution Time

Results and Discussion

Conclusion
A new probabilistic model and algorithm were presented for determining isotopic distributions and monoisotopic masses.
The suggested algorithm found more isotopic clusters of identified peptides and successfully resolved the 1-2 Da mismatch problem.
The suggested algorithm identified overlapping clusters well.
The suggested algorithm was faster than the other algorithms.