Identification of variables and parameters for protein data analysis in clinical diagnostics David Yang Leighton Ing Mentor: Dr. Tina Xiao JPL/NASA.

Presentation transcript:

Identification of variables and parameters for protein data analysis in clinical diagnostics David Yang Leighton Ing Mentor: Dr. Tina Xiao JPL/NASA

Proteomics: National Cancer Institute and Early Detection Research Network (EDRN). Clinical diagnostics: analyzing protein signatures for general characterization of normal vs. pathogenic states.

Project Goals: Characterize the experimental variables that affect mass spectrometry (MS) output and the necessary steps of MS data processing. What influences output, and how do we correct for those influences? What information do other users need? Identify parameters for software evaluation in the processing of MS data.

Methodology: Research a method of protein analysis. Research the mechanics. Analyze how the mechanics influence the output. Recognize data important to other users. Identify the data processing steps for extracting a useful spectrum.

Method of Protein Analysis: Mass spectrometry. Measures the quantity of molecules with specific mass-to-charge ratios. Produces output which could be used as a protein signature. Matrix Assisted Laser Desorption/Ionization Time of Flight (MALDI-TOF) for protein analysis.

Matrix Assisted Laser Desorption/Ionization (MALDI). [Figure labels: light, mass analyzer, protein sample]

Time of Flight (TOF): Ionized particles are accelerated by an electric field. For a given charge, lighter ions reach the detector sooner than heavier ones.
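
The relation between flight time and mass-to-charge ratio can be written out for an idealized linear instrument. This is a textbook sketch, not taken from the slides: the symbols U (accelerating potential), L (field-free drift length), e (elementary charge), and z (charge state) are assumptions introduced here, and initial ion velocities are ignored.

```latex
z e U = \tfrac{1}{2} m v^{2}, \qquad t = \frac{L}{v}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{2 e U}{L^{2}}\, t^{2}
```

Under these assumptions m/z grows with the square of the flight time, which is why a calibration step is needed to turn the recorded time series into a mass axis.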

MALDI-TOF-MS: MALDI-TOF mass spectrometry of a protein sample has three elements with parameters that influence output. Inconsistencies between them reduce the ability to compare samples and produce variation which is not necessarily caused by the protein composition of the sample.

Sample: Freeze/thaw cycles. Source of sample (serum vs. tissue). Fractionated? Digested with protease?

Laser Ionization/Desorption: Plate and matrix used in LDI. Crystallization pattern. Laser intensity.

Plate and Matrix

Crystallization: Randomized process. Introduces variation between shots.

Mass analyzer: Mass calibration (internal vs. external). Reflectron usage. Detector voltage. Detector saturation.

Mass Calibration: Internal: sample + standard together. External: sample and standard measured separately.

Reflectron

Output Processing: Understanding the mechanics tells us what we need to do to process the output. The usability of raw output for protein signature comparison is limited.

Baseline Correction: High-KE ions saturate the detector, resulting in a higher intensity output. Figure source: Malyarenko et al., Enhancement of Sensitivity and Resolution of Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometric Records for Serum Peptides Using Time-Series Analysis Techniques.
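
A minimal sketch of baseline correction, assuming a simple rolling-minimum estimate of the baseline. This is not the time-series method of Malyarenko et al.; the function names, the window size, and the sample intensities are all hypothetical and chosen only to illustrate the idea of subtracting a slowly varying offset.

```python
def rolling_min_baseline(intensities, half_window):
    """Estimate the baseline as the minimum intensity in a sliding window."""
    n = len(intensities)
    baseline = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline.append(min(intensities[lo:hi]))
    return baseline


def subtract_baseline(intensities, half_window=2):
    """Return intensities with the rolling-minimum baseline removed."""
    baseline = rolling_min_baseline(intensities, half_window)
    return [y - b for y, b in zip(intensities, baseline)]


# Hypothetical raw trace: two peaks sitting on a decaying offset.
raw = [10.0, 9.0, 30.0, 8.0, 7.0, 6.0, 25.0, 5.0, 4.0]
corrected = subtract_baseline(raw, half_window=2)
print(corrected)
```

The window must be wider than a peak, otherwise the estimate follows the peak itself and removes real signal.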

Mass Calibration: Required to convert the time-series output into m/z ratios.
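
The conversion can be sketched with a two-point calibration, assuming the common quadratic model m/z = a (t - t0)^2 for a linear TOF instrument. The model choice, the function names, and the calibrant times and masses below are illustrative assumptions, not values from the presentation.

```python
import math


def fit_two_point_calibration(t1, mz1, t2, mz2):
    """Solve m/z = a * (t - t0)**2 for a and t0 from two known calibrants.

    Taking square roots makes the model linear in t:
    sqrt(m/z) = sqrt(a) * (t - t0).
    """
    k = (math.sqrt(mz2) - math.sqrt(mz1)) / (t2 - t1)  # k = sqrt(a)
    t0 = t1 - math.sqrt(mz1) / k
    return k * k, t0


def time_to_mz(t, a, t0):
    """Convert a flight time to m/z with the fitted constants."""
    return a * (t - t0) ** 2


# Hypothetical calibrants: (flight time in microseconds, known m/z).
a, t0 = fit_two_point_calibration(20.0, 1000.0, 40.0, 4000.0)
print(time_to_mz(30.0, a, t0))
```

With internal calibration the two calibrant peaks come from the same spectrum as the sample; with external calibration the constants are fitted on a separate standard and reused.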

Normalization: Scale the intensities based on the largest intensity. Improves the ability to compare samples by reducing the variability of intensity between spectra.
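
A minimal sketch of this scaling, assuming each spectrum is a plain list of intensities. The function name and the two example spectra are hypothetical.

```python
def normalize_to_max(intensities):
    """Scale a spectrum so that its largest intensity becomes 1.0."""
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("spectrum has no positive intensity to scale by")
    return [y / peak for y in intensities]


# Two hypothetical spectra with the same shape but different overall signal.
spectrum_a = [2.0, 8.0, 4.0]
spectrum_b = [10.0, 40.0, 20.0]
norm_a = normalize_to_max(spectrum_a)
norm_b = normalize_to_max(spectrum_b)
print(norm_a, norm_b)
```

After scaling, the two spectra coincide, so differences in overall signal strength no longer mask the shared peak pattern.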

Smoothing: Decreases the effects of electrical system noise.
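
One simple way to do this is a moving average, sketched below. The presentation does not say which filter was considered, so the choice of filter, the function name, and the window size are assumptions for illustration.

```python
def moving_average(intensities, half_window=1):
    """Replace each point by the mean of its neighbours within half_window.

    Near the edges the window shrinks, so the output has the same length
    as the input.
    """
    n = len(intensities)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = intensities[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical trace dominated by point-to-point noise.
noisy = [0.0, 3.0, 0.0, 3.0, 0.0]
smoothed = moving_average(noisy, half_window=1)
print(smoothed)
```

A wider window removes more noise but also broadens and lowers genuine peaks, so the window size trades resolution against noise suppression.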

Peak detection: Identifies potential masses. Reduces the number of features which need to be compared. Where am I?
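
A minimal sketch of peak detection, assuming the simplest possible rule: a point is a peak if it is a local maximum and exceeds a height threshold. This is not the wavelet-based method cited later in the deck (Do, 2006); the function name, threshold, and data are hypothetical.

```python
def detect_peaks(mz, intensities, min_height):
    """Return (m/z, intensity) pairs for local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        is_local_max = y > intensities[i - 1] and y >= intensities[i + 1]
        if is_local_max and y >= min_height:
            peaks.append((mz[i], y))
    return peaks


# Hypothetical spectrum: two real peaks and one small bump below threshold.
mz_axis = [1000.0, 1001.0, 1002.0, 1003.0, 1004.0, 1005.0, 1006.0]
signal = [1.0, 5.0, 1.0, 2.0, 1.0, 9.0, 2.0]
peaks = detect_peaks(mz_axis, signal, min_height=3.0)
print(peaks)
```

Seven data points reduce to two features here, which is the reduction the slide refers to.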

Peak alignment: Aligns corresponding peaks across samples. Reduces phase variation across samples by ensuring that the same peptides share a common set of peak locations.
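
A minimal sketch of alignment, assuming peaks from different samples belong together when their m/z values lie within a fixed tolerance of a neighbouring peak. The clustering rule, the function name, and the tolerance are illustrative assumptions, not the method used in the cited work.

```python
def align_peaks(samples, tolerance):
    """Map each sample's peak m/z values onto shared cluster centres.

    samples: list of lists of peak m/z values, one list per sample.
    Returns (centres, aligned), where aligned mirrors the shape of samples.
    """
    all_peaks = sorted(p for sample in samples for p in sample)
    clusters = []
    for p in all_peaks:
        # Extend the current cluster while the gap to its last member is small.
        if clusters and p - clusters[-1][-1] <= tolerance:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    centres = [sum(c) / len(c) for c in clusters]
    mapping = {p: centre for c, centre in zip(clusters, centres) for p in c}
    aligned = [[mapping[p] for p in sample] for sample in samples]
    return centres, aligned


# Hypothetical peak lists from three samples with small m/z shifts.
samples = [[1000.0, 2000.0], [1000.5, 2001.0], [999.5, 3000.0]]
centres, aligned = align_peaks(samples, tolerance=1.0)
print(centres)
print(aligned)
```

Chained gaps can merge distinct peaks if the tolerance is too large, so the tolerance should stay below the expected spacing between neighbouring peptides.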

Averaging of spectra: Addresses variability between runs by averaging replicates (recall crystallization and shot variability). Averaging of multiple laser shots is often performed by the machine.
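
A minimal sketch of replicate averaging, assuming all replicates have already been placed on the same m/z axis so that intensities can be averaged point by point. The function name and replicate values are hypothetical.

```python
def average_spectra(replicates):
    """Return the pointwise mean of equally long replicate spectra."""
    if not replicates:
        raise ValueError("need at least one replicate")
    length = len(replicates[0])
    if any(len(r) != length for r in replicates):
        raise ValueError("replicates must share the same m/z axis")
    return [sum(column) / len(replicates) for column in zip(*replicates)]


# Three hypothetical replicate runs of the same sample.
replicates = [
    [1.0, 10.0, 2.0],
    [3.0, 14.0, 2.0],
    [2.0, 12.0, 2.0],
]
mean_spectrum = average_spectra(replicates)
print(mean_spectrum)
```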

Results: Identified vital information that affects the output of the machine (information useful for a researcher using the spectra). Researched the processes which make the output more useful as a protein signature. Next step: identify parameters for software evaluation in MS data processing.

Goal: Identify parameters for evaluating software capabilities in the processing and analysis of mass spectrometry data. Three candidates: VIBE (Incogen Inc.), geWorkbench (Forge), S-PLUS (Insightful Corp.).

Software Evaluation: General parameters. Input formats. Algorithms for processing and analysis of proteomics data. Results. Benefits. Limitations.

General Parameters: Platform/operating system compatibility? Is the software open source? Is the software capable of performing the necessary tasks independently? Additional modifications? Internet access? Server?

Data Input: What types of file formats can the software open or import? What format must the data be in? DNA (nucleotides: A, T, G, C). Proteins (amino acids: M, L, A, I, etc.).

Software algorithms necessary for proteomics data analysis. Can the software perform: baseline subtraction? Mass calibration? Noise reduction? Peak identification? Normalization? Peak alignment?
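
The checklist above can be applied mechanically, as in the sketch below. The step names come from the slide; the function name and the example tool are hypothetical and do not describe any of the three evaluated packages.

```python
# Processing steps the slide lists as necessary for proteomics MS data.
REQUIRED_STEPS = (
    "baseline subtraction",
    "mass calibration",
    "noise reduction",
    "peak identification",
    "normalization",
    "peak alignment",
)


def missing_steps(supported):
    """Return the required steps that a tool does not claim to support."""
    supported_set = {name.lower() for name in supported}
    return [step for step in REQUIRED_STEPS if step not in supported_set]


# Hypothetical tool that only offers two of the six steps.
hypothetical_tool = ["Normalization", "Noise reduction"]
gaps = missing_steps(hypothetical_tool)
print(gaps)
```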

Baseline Subtraction (Malyarenko, et al. 2005)

Mass Calibration (Kearsley, et al. 2005)

Smoothing/Noise Reduction (Malyarenko, et al. 2005)

Peak Identifications (Do, 2006)

Normalization (Kearsley, et al. 2005)

Peak Alignments (Malyarenko, et al. 2005)

Results: Visualization of results: How can you visualize the data? Save/export work: Can you save/export your results? If yes, in what formats? Once saved, can the files be opened by other software packages? Print out: Can you print a hard copy for the record?

Visualization MUSCLE (Edgar) VIBE (Incogen Inc.)

Software Benefits: What benefits does the software offer? Convenience of integrated modules. Efficiency: saves the “man-power” of having to sit there and do everything. User-friendly interface.

Convenience of Integrated Modules

Efficiency

User-friendly Interface?

Software Limitations: Limits on customization: Small modifications to existing modules? Adding a new module? Internet/server dependent?

Conclusion: We have identified these parameters to be crucial for the processing of MS data: baseline subtraction, mass calibration, noise reduction, peak identification, normalization, peak alignment.

Conclusion: Software candidates. VIBE: capable of manipulating protein sequences, but unable to process raw data. geWorkbench: did not pass the general parameters for installation. S-PLUS: evaluation still in progress…

VIBE (by Incogen Inc.): Convenient integration of nucleotide and amino acid analysis tools. BLAST (–X, –N, –P, TBLASTN, TBLASTP). Nucleotide and AA search: FASTA, –X, –Y, Smith-Waterman, etc. Sequence manipulations: Primer3, conditional filters, translations, etc. Sequence alignments: Crossmatch, ClustalW, Hidden Markov Model, etc.

Literature Citations
1) Do, P. Improved Peak Detection in Mass Spectrometry Spectrum by Incorporating Continuous Wavelet Transform-based Pattern Matching. Robert H. Lurie Comprehensive Cancer Center, Northwestern University. ppt slides, 2006.
2) Kearsley, A., Wallace, W.E., Bernal, J., and C.M. Guttman. A numerical method for mass spectral data analysis. Applied Mathematics Letters. 18:1412–1417, 2005.
3) Malyarenko, D.I., Cooke, W.E., Adam, B-L., Malik, G., Chen, H., Tracy, E.R., Trosset, M.W., Sasinowski, M., Semmes, O.J. and D.M. Manos. Enhancement of Sensitivity and Resolution of Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometric Records for Serum Peptides Using Time-Series Analysis Techniques. Clinical Chemistry. 51(1):

Acknowledgements Jet Propulsion Laboratory Dr. Tina Xiao Southern California Bioinformatics Summer Institute (SoCalBSI) Dr. Sandra Sharp Dr. Jamil Momand Dr. Wendie Johnston Dr. Nancy Warter-Perez Ronnie Cheng Friends Duke University Medical Center Dr. Simon Lin Centers for Disease Control and Prevention (CDC) Dr. R Cameron Craddock Huntington Medical Research Institute (HMRI) Dr. James Riggins Dr. Alfred Fonteh