1
Identification of variables and parameters for protein data analysis in clinical diagnostics. David Yang, Leighton Ing. Mentor: Dr. Tina Xiao, JPL/NASA
2
Proteomics: National Cancer Institute and Early Detection Research Network, Clinical Diagnostics. Analyzing protein signatures for general characterization of normal vs. pathogenic states.
3
Project Goals: Characterize the experimental variables that affect mass spectrometry (MS) output and the necessary steps of MS data processing. What influences the output, and how do we correct for those influences? What information do other users need? Identify parameters for software evaluation in the processing of MS data.
4
Methodology: Research a method of protein analysis. Research the mechanics. Analyze how the mechanics influence the output. Recognize data important to other users. Identify the data processing steps for extracting a useful spectrum.
5
Method of Protein Analysis: Mass spectrometry measures the quantity of molecules with specific mass-to-charge (m/z) ratios and produces output that could be used as a protein signature. Matrix-Assisted Laser Desorption/Ionization Time of Flight (MALDI-TOF) is the technique considered here for protein analysis.
6
Matrix-Assisted Laser Desorption/Ionization (MALDI) [Diagram labels: light, protein sample, mass analyzer]
7
Time of Flight (TOF): Ionized particles are accelerated by an electric field; lighter ions reach the detector sooner than heavier ions of the same charge.
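The time-of-flight principle can be illustrated with a short calculation. This is a minimal sketch for an idealized linear analyzer, assuming an ion of mass m and charge z·e gains kinetic energy z·e·U from an accelerating potential U and then drifts a field-free length L, so that t = L·sqrt(m / (2·z·e·U)). The voltage (20 kV) and drift length (1 m) are illustrative values, not settings taken from the slides.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms


def flight_time(mz, accel_voltage, drift_length):
    """Ideal drift time in seconds for an ion with the given m/z (Da per charge)."""
    mass_per_charge = mz * DALTON / E_CHARGE  # kilograms per coulomb
    return drift_length * math.sqrt(mass_per_charge / (2.0 * accel_voltage))


# Illustrative (assumed) instrument settings: 20 kV acceleration, 1 m drift tube.
t_light = flight_time(1000.0, 20000.0, 1.0)
t_heavy = flight_time(4000.0, 20000.0, 1.0)
print(f"m/z 1000: {t_light * 1e6:.2f} us, m/z 4000: {t_heavy * 1e6:.2f} us")
```

Because flight time scales with the square root of m/z, an ion four times heavier takes twice as long to arrive, which is what lets the detector's time axis be converted into a mass axis.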
8
MALDI-TOF-MS: MALDI-TOF mass spectrometry of a protein sample has three elements (sample, laser desorption/ionization, mass analyzer) with parameters that influence the output. Inconsistencies in these parameters between runs reduce the ability to compare samples and produce variation that is not necessarily caused by the protein composition of the sample.
9
Sample: Freeze/thaw cycles. Source of sample (serum vs. tissue). Fractionated? Digested with protease?
10
Laser Ionization/Desorption (LDI): Plate and matrix used in LDI. Crystallization pattern. Laser intensity.
11
Plate and Matrix
13
Crystallization: A randomized process that introduces variation between shots.
15
Mass analyzer: Mass calibration (internal vs. external). Reflectron usage. Detector voltage. Detector saturation.
16
Mass Calibration: Internal (sample + standard together) vs. external (sample and standard measured separately).
18
Reflectron
21
Output Processing: Understanding the mechanics tells us what we need to do to process the output. The usability of raw output for protein signature comparison is limited.
22
Baseline Correction: High-kinetic-energy ions saturate the detector, resulting in a higher-intensity output. Source: Malyarenko et al., Enhancement of Sensitivity and Resolution of Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometric Records for Serum Peptides Using Time-Series Analysis Techniques.
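To make the idea of baseline correction concrete, here is a minimal sketch that estimates the baseline as a rolling minimum and subtracts it. This is a deliberately simple stand-in, not the time-series method of Malyarenko et al.; the function name, window size, and sample intensities are all illustrative assumptions.

```python
def subtract_baseline(intensities, half_window=2):
    """Subtract a rolling-minimum baseline estimate from each intensity."""
    n = len(intensities)
    corrected = []
    for i, value in enumerate(intensities):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline = min(intensities[lo:hi])  # local floor around point i
        corrected.append(value - baseline)
    return corrected


# Made-up spectrum: a decaying baseline with two peaks riding on top of it.
raw = [10, 9, 8, 20, 7, 6, 5, 15, 4, 3]
corrected = subtract_baseline(raw, half_window=2)
print(corrected)
```

The window must be wider than a typical peak, otherwise the peak itself is treated as baseline and flattened.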
23
Mass Calibration: Required to convert the time-series output into m/z ratios.
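One common way to perform this conversion can be sketched as a two-point calibration. The sketch assumes the idealized TOF relation sqrt(m/z) = k·(t − t0), where k and t0 are fitted from two calibrant peaks of known mass. The function names and the calibrant values are hypothetical, and real instruments typically use more calibrants and higher-order terms.

```python
import math


def fit_two_point_calibration(t1, mz1, t2, mz2):
    """Solve sqrt(m/z) = k * (t - t0) for k and t0 from two calibrant peaks."""
    k = (math.sqrt(mz2) - math.sqrt(mz1)) / (t2 - t1)
    t0 = t1 - math.sqrt(mz1) / k
    return k, t0


def time_to_mz(t, k, t0):
    """Convert a flight time to m/z using the fitted constants."""
    return (k * (t - t0)) ** 2


# Hypothetical calibrants: (flight time in arbitrary units, known m/z).
k, t0 = fit_two_point_calibration(16.5, 1024.0, 32.5, 4096.0)
unknown_mz = time_to_mz(24.5, k, t0)
print(k, t0, unknown_mz)
```

With internal calibration the two calibrant peaks come from the same spectrum as the sample; with external calibration the constants are fitted on a separate standard run and reused.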
24
Normalization: Scale the intensities based on the largest intensity. Improves the ability to compare samples by reducing the variability of intensity between spectra. Source: www.psrc.usm.edu/mauritz/maldi.html
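Scaling to the largest intensity, as described above, can be sketched in a few lines. The function name and sample values are illustrative; other schemes (for example, scaling by total ion current) exist but are not shown here.

```python
def normalize_to_max(intensities):
    """Scale a spectrum so that its largest intensity becomes 1.0."""
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("spectrum has no positive intensity to scale by")
    return [value / peak for value in intensities]


# Two made-up spectra with the same shape but different absolute intensity.
spectrum_a = normalize_to_max([2.0, 4.0, 8.0])
spectrum_b = normalize_to_max([20.0, 40.0, 80.0])
print(spectrum_a, spectrum_b)
```

After scaling, spectra that differ only in overall signal strength become directly comparable.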
25
Smoothing: Decreases the effects of electrical system noise.
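A moving average is one of the simplest smoothing filters and serves here as a minimal sketch of the step. The function name, window size, and data are assumptions for illustration; the slides do not say which filter a given package uses.

```python
def moving_average(intensities, half_window=1):
    """Replace each point by the mean of its neighbours within half_window."""
    n = len(intensities)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = intensities[lo:hi]  # window shrinks at the edges
        smoothed.append(sum(window) / len(window))
    return smoothed


noisy = [0.0, 3.0, 0.0, 3.0, 0.0]
smoothed = moving_average(noisy, half_window=1)
print(smoothed)
```

A wider window removes more noise but also broadens and lowers genuine peaks, so the window is usually kept narrower than the expected peak width.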
26
Peak detection: Identify potential masses. Reduces the number of features that need to be compared. Where am I?
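Peak detection can be sketched as a search for local maxima above an intensity threshold. This is a simplified illustration with hypothetical names and data, not the wavelet-based approach cited later in the deck.

```python
def detect_peaks(mz_values, intensities, min_intensity):
    """Return (m/z, intensity) pairs for local maxima at or above min_intensity."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        value = intensities[i]
        rises = value > intensities[i - 1]
        falls_or_flat = value >= intensities[i + 1]  # a plateau counts once
        if value >= min_intensity and rises and falls_or_flat:
            peaks.append((mz_values[i], value))
    return peaks


mz_axis = [1000, 1001, 1002, 1003, 1004, 1005, 1006]
signal = [1, 5, 2, 1, 9, 3, 1]
peaks = detect_peaks(mz_axis, signal, min_intensity=4)
print(peaks)
```

The seven-point spectrum is reduced to two candidate masses, which is the feature reduction the slide refers to.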
27
Peak alignment: Aligns corresponding peaks across samples. Reduces phase variation across samples by ensuring that the same peptides share a common set of peak locations.
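One simple way to align peaks is to pool the peak positions from all samples, group positions that lie within a tolerance of each other, and assign each group a common location. The sketch below does this with a sorted single pass; the function name, tolerance, and peak lists are illustrative assumptions.

```python
def align_peaks(samples, tolerance):
    """Group peak m/z values across samples; return (center, sample indices)."""
    pooled = sorted(
        (mz, index) for index, peaks in enumerate(samples) for mz in peaks
    )
    clusters = []
    for mz, index in pooled:
        # Join the current cluster if close enough to its last member.
        if clusters and mz - clusters[-1][-1][0] <= tolerance:
            clusters[-1].append((mz, index))
        else:
            clusters.append([(mz, index)])
    aligned = []
    for cluster in clusters:
        center = sum(mz for mz, _ in cluster) / len(cluster)
        members = sorted({index for _, index in cluster})
        aligned.append((center, members))
    return aligned


# Made-up peak lists from three spectra of similar samples.
samples = [[1000.0, 2000.5], [1000.4, 2001.0], [1000.2]]
aligned = align_peaks(samples, tolerance=0.5)
print(aligned)
```

Because each peak is compared only with the previous member of its cluster, long chains of closely spaced peaks can merge; a production method would also cap the total cluster width.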
28
Averaging of spectra: Address variability between runs by averaging replicates (recall crystallization and shot-to-shot variability). Averaging of multiple laser shots is often performed by the machine.
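Replicate averaging can be sketched as a point-by-point mean, assuming all replicates have already been placed on the same m/z axis. The function name and values are illustrative.

```python
def average_spectra(replicates):
    """Return the point-by-point mean of equally long replicate spectra."""
    if not replicates:
        raise ValueError("at least one replicate is required")
    length = len(replicates[0])
    if any(len(replicate) != length for replicate in replicates):
        raise ValueError("replicates must share the same m/z axis length")
    count = len(replicates)
    return [sum(values) / count for values in zip(*replicates)]


# Two made-up replicate spectra of the same sample.
mean_spectrum = average_spectra([[1.0, 4.0], [3.0, 8.0]])
print(mean_spectrum)
```

Random shot-to-shot fluctuations tend to cancel in the mean, while peaks present in every replicate are preserved.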
29
Results: Identified vital information that affects the output of the machine and is useful for a researcher using the spectra. Researched the processes that make the output more useful as a protein signature. Next step: Identify parameters for software evaluation in MS data processing.
30
Goal – Identify parameters for evaluating software capabilities in the processing and analysis of Mass Spectrometry data. Three candidates VIBE (Incogen Inc.) geWorkbench (Forge) S-PLUS (Insightful Corp.)
31
Software Evaluation General parameters Input formats Algorithms for processing and analysis of proteomics data Results Benefits Limitations
33
General Parameters: Platform/operating system compatibility? Is the software open source? Is the software capable of performing the necessary tasks independently? Additional modifications? Internet access? Server?
35
Data Input What types of file formats can the software open? Import? What type of format must the data be? DNA (nucleotides – A, T, G, C) Proteins (amino acids – M, L, A, I, etc.)
37
Software algorithms necessary for Proteomics data analysis Can the software perform: Baseline subtractions? Mass calibrations? Noise reductions? Peak identifications? Normalization? Peak alignments?
38
Baseline Subtraction (Malyarenko, et al. 2005)
39
Mass Calibration (Kearsley et al. 2005)
40
Smoothing/Noise Reduction (Malyarenko, et al. 2005)
41
Peak Identifications (Do, 2006)
42
Normalization (Kearsley et al. 2005)
43
Peak Alignments (Malyarenko, et al. 2005)
45
Results – Visualization of results How can you visualize the data? Save/Export work Can you save/export your results? If yes, what format can it save/export? Once saved, can the files be opened by other software packages? Print out Can you print out a hard copy for record?
46
Visualization MUSCLE (Edgar) VIBE (Incogen Inc.)
49
Software Benefits What benefits does the software offer? Convenience of integrated modules Efficient – saves “man-power” of having to sit there and do everything User-friendly interface
50
Convenience of Integrated Modules
51
Efficiency
52
User-friendly Interface?
54
Software Limitations: Limits on customization. Small modifications to existing modules? Adding a new module? Internet/server dependent?
55
Conclusion – We have identified these parameters to be crucial for the processing of MS data. Baseline subtractions Mass calibrations Noise reductions Peak identifications Normalization Peak alignments
56
Conclusion – VIBE: capable of manipulating protein sequences, but unable to process raw data. geWorkbench: did not pass general parameters for installation. S-PLUS: evaluation still in progress…
57
VIBE (by Incogen Inc.): Convenient integration of nucleotide and amino acid analysis tools – BLAST (–X, –N, –P, TBLASTN, TBLASTP). Nucleotide and AA search: FASTA, –X, –Y, Smith-Waterman, etc. Sequence manipulations: Primer3, Conditional Filters, Translations, etc. Sequence alignments: Crossmatch, ClustalW, Hidden Markov Model, etc.
61
Literature Citations: 1) Do, P. Improved Peak Detection in Mass Spectrometry Spectrum by Incorporating Continuous Wavelet Transform-based Pattern Matching. Robert H. Lurie Comprehensive Cancer Center, Northwestern University. ppt slides. 2006. 2) Kearsley, A., Wallace, W.E., Bernal, J., and Guttman, C.M. A numerical method for mass spectral data analysis. Applied Mathematics Letters. 18:1412–1417. 2005. 3) Malyarenko, D.I., Cooke, W.E., Adam, B.-L., Malik, G., Chen, H., Tracy, E.R., Trosset, M.W., Sasinowski, M., Semmes, O.J., and Manos, D.M. Enhancement of Sensitivity and Resolution of Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometric Records for Serum Peptides Using Time-Series Analysis Techniques. Clinical Chemistry. 51(1):65–74. 2005.
62
Acknowledgements Jet Propulsion Laboratory Dr. Tina Xiao Southern California Bioinformatics Summer Institute (SoCalBSI) Dr. Sandra Sharp Dr. Jamil Momand Dr. Wendie Johnston Dr. Nancy Warter-Perez Ronnie Cheng Friends Duke University Medical Center Dr. Simon Lin Center for Disease Control and Prevention (CDC) Dr. R Cameron Craddock Huntington Medical Research Institute (HMRI) Dr. James Riggins Dr. Alfred Fonteh