Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data Xiaobo Zhou HCNR-CBI, Harvard Medical School and Brigham & Women’s.

Presentation transcript:

Peak Detection with Chemical Noise Removal Using Short-Time FFT for a Kind of MALDI Data Xiaobo Zhou HCNR-CBI, Harvard Medical School and Brigham & Women’s Hospital, Boston, MA.

MS Profiling and Biomarkers Biosci Rep 25(1-2):107

MALDI-TOF Mass Spectrometry NCBI Books

Outline
- Introduction
- Peak detection method
  - Random noise removal
  - Chemical noise removal
  - Peak identification
- Results
- Conclusion

Introduction
- Peak detection is the first step in extracting biomarkers from MS data.
- A proper peak detection method should be based on the properties of the specific type of MS data.
- Properties of the MALDI (Matrix-Assisted Laser Desorption/Ionization) data of interest:
  - High mass accuracy and resolution over a wide mass range: m/z can be as large as Da.
  - A unique noise pattern: low-frequency noise peaks spaced 1 Da apart.

Example of a spectrum
Fig. 1 The largest m/z is more than 9000 Da. The two enlarged regions (panels: small m/z, large m/z) show the signal in detail where m/z is relatively small and relatively large in this spectrum. At small m/z the pattern of the chemical noise is clearly visible; it becomes weaker as m/z increases.

Example of chemical noise
Fig. 2 Part of a raw spectrum. The chemical noise has a distinctive pattern: its peaks are spaced 1 Da apart.

Existing methods
- Most methods:
  - Based on a priori knowledge of the peptide masses
  - Drawback: effective only for m/z below 4000 Da
- Yu W. et al.:
  - Gaussian filter to smooth the signal
  - Gabor quadrature filter to extract the envelope signal
  - Drawback: the threshold that defines peaks is chosen empirically, without considering the noise model, so some small peaks can be overlooked.
- Jurgen K. et al.:
  - Remove chemical noise by Fourier transform with a fixed window size
  - Drawback: useful for electrospray quadrupole TOF data, whose maximum m/z is below 2000 Da

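Peak detection method
Proposed model (additive form, with symbols reconstructed from the definitions below):
$$y_i(t_j) = x_i(t_j) + c_i(t_j) + \varepsilon_i(t_j)$$
- $i$: index of the spectrum
- $j$: index of the time-of-flight (TOF) point
- $x_i(t_j)$: the true intensity
- $y_i(t_j)$: the observed intensity
- $c_i(t_j)$: chemical noise
- $\varepsilon_i(t_j)$: random noise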
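To make the model concrete, here is a minimal sketch that builds one synthetic spectrum. It assumes the observed intensity is simply the sum of the true intensity, the chemical noise, and the random noise. The Gaussian peak shape, the sinusoidal chemical noise with a fixed number of points per Da, and every numeric value are illustrative assumptions, not taken from the slides.

```python
import math
import random


def simulate_spectrum(n_points=400, points_per_da=20, seed=0):
    """Return (observed, true, chemical, noise) for one synthetic spectrum.

    All shapes and amplitudes are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # True intensity: a single Gaussian-shaped peak centred at index 200.
    true = [100.0 * math.exp(-0.5 * ((j - 200) / 6.0) ** 2) for j in range(n_points)]
    # Chemical noise: a non-negative sinusoid that repeats every 1 Da.
    chemical = [
        5.0 * (1.0 + math.sin(2.0 * math.pi * j / points_per_da))
        for j in range(n_points)
    ]
    # Random noise: zero-mean Gaussian.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
    # Assumed additive combination of the three components.
    observed = [x + c + e for x, c, e in zip(true, chemical, noise)]
    return observed, true, chemical, noise
```

A spectrum built this way is convenient for checking each later processing step against a known ground truth.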

Three steps
- Random noise removal
  - Undecimated wavelet transform to obtain the smoothed signal
  - Software: Rice Wavelet Toolbox (RWT)
  - Website:
- Chemical noise removal
  - Short-time discrete Fourier transform (STDFT) with adaptive window size to estimate the chemical noise
- Peak identification
  - Normalization
  - Threshold approach via SNR
  - Identification
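The random noise removal step can be sketched with a small undecimated wavelet transform. The version below is not the Rice Wavelet Toolbox: it swaps in a Haar filter, a circular boundary, and a hard threshold on the raw detail coefficients, all of which are assumptions made to keep the example dependency-free. The function name `uwt_haar_denoise` is hypothetical, and the default threshold of 6 only echoes the setting quoted in the slides, whose units are not stated there.

```python
def uwt_haar_denoise(signal, levels=3, threshold=6.0):
    """Denoise by hard-thresholding an undecimated (a trous) Haar transform.

    Circular boundary handling; `threshold` is applied to the raw detail
    coefficients (an assumption, the slides do not give its units).
    """
    n = len(signal)
    approx = list(signal)
    details = []
    # Analysis: no downsampling, the filter spacing doubles at each level.
    for level in range(levels):
        step = 2 ** level
        shifted = [approx[(i + step) % n] for i in range(n)]
        detail = [(a - b) / 2.0 for a, b in zip(approx, shifted)]
        approx = [(a + b) / 2.0 for a, b in zip(approx, shifted)]
        # Hard threshold: keep only detail coefficients above the threshold.
        details.append([d if abs(d) > threshold else 0.0 for d in detail])
    # Synthesis: average the two redundant reconstructions of each sample.
    for level in reversed(range(levels)):
        step = 2 ** level
        d = details[level]
        approx = [
            0.5 * ((approx[i] + d[i]) + (approx[(i - step) % n] - d[(i - step) % n]))
            for i in range(n)
        ]
    return approx
```

With `threshold=0.0` the transform reconstructs the input exactly, which is a quick way to check the analysis and synthesis steps before choosing a real threshold.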

Chemical noise removal
- Why STDFT?
  - The chemical noise is approximately sinusoidal in shape.
  - The number of data points in a 1 Da region is not constant: it decreases from lower to higher m/z.
- Short-time discrete Fourier transform
  - Choose each window as a region in which every 1 Da interval contains the same number of points (see Fig. 3).
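One way to picture the windowed Fourier step: inside a window where every 1 Da interval holds the same number of points, the chemical noise sits in a single DFT bin, so it can be estimated by keeping only that bin. The sketch below does this with a direct single-bin DFT. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the window length must be a whole number of 1 Da periods, only the fundamental frequency is kept (harmonics are ignored), and the windows are supplied by the caller as `(start, stop, points_per_da)` tuples.

```python
import cmath
import math


def chemical_noise_in_window(segment, points_per_da):
    """Estimate the 1 Da periodic component of one window from a single DFT bin."""
    n = len(segment)
    if points_per_da < 3 or n % points_per_da != 0:
        raise ValueError("need >= 3 points per Da and a whole number of periods")
    k = n // points_per_da  # bin whose period is exactly points_per_da samples
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(segment))
    # Inverse DFT restricted to bins k and n - k (a conjugate pair for real input).
    return [2.0 * (coeff * cmath.exp(2j * math.pi * k * i / n)).real / n for i in range(n)]


def stdft_chemical_noise(signal, windows):
    """Apply the single-bin estimate window by window.

    `windows` is a list of (start, stop, points_per_da) tuples chosen by the caller.
    """
    estimate = [0.0] * len(signal)
    for start, stop, points_per_da in windows:
        estimate[start:stop] = chemical_noise_in_window(signal[start:stop], points_per_da)
    return estimate
```

Because the constant (DC) bin is excluded, a flat baseline inside the window is left untouched and only the oscillating part is returned.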

Window size of STDFT
Fig. 3 Approximate number of data points (window size) in a 1 Da region as a function of m/z. As m/z increases, the regions that share the same number of points per 1 Da become wider.
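The relationship in Fig. 3 can be computed directly from an m/z axis. The sketch below counts the samples in each 1 Da interval and merges neighbouring intervals that share the same count into one region. Both function names are hypothetical, and aligning the intervals to integer m/z values is an assumption. On real data the counts may flicker between adjacent values, so some smoothing of the counts would probably be needed before merging.

```python
import math


def points_per_da(mz):
    """Count samples in each 1 Da interval [m, m + 1) of an m/z axis."""
    counts = {}
    for value in mz:
        bin_start = math.floor(value)
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return sorted(counts.items())


def constant_count_regions(counts):
    """Merge consecutive 1 Da bins with equal counts into (start, stop, count)."""
    regions = []
    for bin_start, count in counts:
        if regions and regions[-1][2] == count and regions[-1][1] == bin_start:
            # Same count and directly adjacent: extend the current region.
            regions[-1] = (regions[-1][0], bin_start + 1, count)
        else:
            regions.append((bin_start, bin_start + 1, count))
    return regions


# Toy axis: 4 points per Da from 1000 to 1002, then 2 points per Da up to 1004.
mz_axis = [1000 + i / 4 for i in range(8)] + [1002 + i / 2 for i in range(4)]
print(constant_count_regions(points_per_da(mz_axis)))
# [(1000, 1002, 4), (1002, 1004, 2)]
```

Each resulting region is a candidate window for the short-time transform, with its count serving as the number of points per 1 Da period.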

Estimate of the chemical noise
Fig. 4 Estimate of the chemical noise for the two enlarged regions in Fig. 1 (panels: small m/z, large m/z). Black curve: smoothed signal; blue curve: estimate of the chemical noise. Wherever the smoothed signal minus the estimated chemical noise is below a given threshold, the result is set to zero.
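The subtraction and thresholding described in the caption can be sketched as follows. The example assumes the threshold is a percentile of the residual computed separately in each region; the slides' experiment settings use the 80th percentile. The nearest-rank percentile rule, the function names, and the `(start, stop)` index-range format for regions are all assumptions.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def remove_chemical_noise(smoothed, chemical, regions, q=80.0):
    """Subtract the chemical noise estimate and zero values below the cutoff.

    `regions` is a list of (start, stop) index ranges; the cutoff is the q-th
    percentile of the residual inside each region.
    """
    cleaned = [0.0] * len(smoothed)
    for start, stop in regions:
        residual = [s - c for s, c in zip(smoothed[start:stop], chemical[start:stop])]
        cutoff = percentile(residual, q)
        for offset, value in enumerate(residual):
            # Residuals below the regional cutoff are set to zero.
            cleaned[start + offset] = value if value >= cutoff else 0.0
    return cleaned
```

Computing the cutoff per region lets the threshold follow the local noise level, which changes along the m/z axis.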

Peak identification
- Normalization of the processed spectrum: the spectrum is divided by its mean intensity to correct for systematic differences between spectra.
- Definition of the signal-to-noise ratio (SNR)
- Peak identification: the peak is placed at the position of highest intensity within a cluster of peaks.
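A minimal sketch of the normalization and the cluster-maximum rule is shown below. It assumes a cluster is a contiguous run of non-zero intensities left after chemical noise removal, which stands in for the envelope-based clusters on the slides. The SNR threshold is left out because its definition is not given in the transcript. Function names are hypothetical.

```python
def normalize_by_mean(intensity):
    """Divide a processed spectrum by its mean intensity."""
    mean = sum(intensity) / len(intensity)
    if mean == 0:
        raise ValueError("spectrum has zero mean intensity")
    return [value / mean for value in intensity]


def cluster_maxima(mz, intensity):
    """Return (m/z, intensity) of the highest point in each run of non-zero values."""
    peaks = []
    best = None
    for m, value in zip(mz, intensity):
        if value > 0:
            # Inside a cluster: remember the highest point seen so far.
            if best is None or value > best[1]:
                best = (m, value)
        elif best is not None:
            # A zero closes the current cluster.
            peaks.append(best)
            best = None
    if best is not None:
        peaks.append(best)
    return peaks
```

For example, the intensities `[0, 2, 6, 4, 0, 0, 2, 0]` contain two clusters, so the function reports one peak at the position of the 6 and one at the position of the final 2.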

Peak identification
Fig. 5 Examples of envelope extraction and peak detection (panels: small m/z, large m/z). The blue curves show the extracted envelope. Each concave part of the curve corresponds to the same protein. The red circles mark the peaks detected by our method.

Results
- Data sets
  - The data were obtained from the serum of a cohort of patients undergoing endarterectomy (EA) (therapy) for occluded carotid arteries (disease).
  - Here we use the data from one group among all the patients: the symptomatic EA patients (EAS).
  - After sample preparation and data acquisition, we obtained 35 spectra.

- Numerical experiment settings
  - Threshold of the wavelet transform: 6
  - Threshold for chemical noise removal: the 80th percentile of the values obtained by subtracting the estimated chemical noise from the smoothed signal in each adaptive region
  - True peaks: a location counts as a true peak if the number of peaks detected in the same 1 Da region exceeds 15% of the number of spectra in the group.
  - Peak position: the mean m/z over all the peaks in that 1 Da region
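The true-peak rule above can be sketched as a simple vote across spectra. The example assumes the 1 Da regions are aligned to integer m/z values and that every detected peak in a region counts towards the total, even if two come from the same spectrum. The function name `consensus_peaks` and its parameters are hypothetical.

```python
import math


def consensus_peaks(peak_lists, min_fraction=0.15):
    """Keep 1 Da regions whose peak count exceeds min_fraction of the spectra.

    `peak_lists` holds one list of peak m/z values per spectrum. Each kept
    region is reported as the mean m/z of the peaks that fall inside it.
    """
    n_spectra = len(peak_lists)
    bins = {}
    for peaks in peak_lists:
        for mz in peaks:
            bins.setdefault(math.floor(mz), []).append(mz)
    result = []
    for bin_start in sorted(bins):
        members = bins[bin_start]
        if len(members) > min_fraction * n_spectra:
            result.append(sum(members) / len(members))
    return result
```

With 10 spectra, a region needs at least 2 peaks to pass the 15% rule, so a peak seen in a single spectrum is dropped.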

Peaks in the EAS group
Fig. 6 All peaks detected in the EAS group when the SNR is 2. One bad spectrum was removed, so 34 spectra in total are considered.

SNR comparison
Fig. 7 Comparison of the SNR of the same spectrum without and with chemical noise removal. Left: SNR obtained without chemical noise removal. Right: SNR obtained with chemical noise removal.

Conclusions
- We proposed an automatic peak detection method for a kind of MALDI data; chemical noise removal increases the SNR of some small peaks.
- Future work:
  - Peak alignment: align the peaks that correspond to the same protein but are shifted between spectra
  - Feature extraction: select the biomarkers

References
- Berndt, P., Hobohm, U., Langen, H., Reliable automatic protein identification from matrix-assisted laser desorption ionization mass spectrometric peptide finger prints, Electrophoresis, 1999, 20.
- Breen, E. J., Hopwood, F. G., Williams, K. L., Wilkins, M. R., Automatic Poisson peak harvesting for high throughput protein identification, Electrophoresis, 2000, 21.
- Coombes, K. R., Tsavachidis, S., Morris, J. S., Baggerly, K. A., Hung, M. C., Kuerer, H. M., Improved peak detection and quantification of mass spectrometry data acquired from surface-enhanced laser desorption and ionization by denoising spectra with the undecimated discrete wavelet transform, Proteomics, 2005, 5.
- Du, P., Kibbe, W. A., Lin, S. M., Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching, Bioinformatics, 2006, 22.
- Gras, R., Muller, M., Gasteiger, E., Gay, S., Binz, P. A., Bienvenut, W., Hoogland, C., Sanchez, J. C., Bairoch, A., Hochstrasser, D. R., Appel, R. D., Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection, Electrophoresis, 1999, 20.

References
- Jurgen, K., Marc, G., Keith, R., Matthias, W., Noise filtering techniques for electrospray quadrupole time of flight mass spectra, J. Am. Soc. Mass Spectrom., 2003, 14.
- Krutchinsky, A. N., Chait, B. T., On the nature of the chemical noise in MALDI mass spectra, J. Am. Soc. Mass Spectrom., 2002, 13.
- Morris, J. S., Coombes, K. R., Koomen, J., Baggerly, K. A., Kobayashi, R., Feature extraction and quantification for mass spectrometry in biomedical applications using the mean spectrum, Bioinformatics, 2005, 21.
- Samuelsson, J., Dalevi, D., Levander, F., Rognvaldsson, T., Modular, scriptable and automated analysis tools for high-throughput peptide mass fingerprinting, Bioinformatics, 2004, 20.
- Yasui, Y., McLerran, D., Adam, B., Winget, M., Thornquist, M., Feng, Z., An automated peak-identification/calibration procedure for high-dimensional protein measures from mass spectrometers, Journal of Biomedicine and Biotechnology, 2003, 4.
- Yu, W., Wu, B., Lin, N., Stone, K., Williams, K., Zhao, H., Detecting and Aligning Peaks in Mass Spectrometry Data with Applications to MALDI, Computational Biology and Chemistry, 2006, 30.