Novel Algorithms for the Quantification Confidence in Quantitative Proteomics with Stable Isotope Labeling*

Presentation transcript:

Novel Algorithms for the Quantification Confidence in Quantitative Proteomics with Stable Isotope Labeling*
Chongle Pan 1,2; David L. Tabb 1; Dale Pelletier 1; W. Hayes McDonald 1; Greg Hurst 1; Nagiza F. Samatova 1; Robert L. Hettich 1
1 Oak Ridge National Laboratory, Oak Ridge, TN
2 Genome Science and Technology, UT-ORNL
* Research support provided by the U.S. Department of Energy, Office of Biological and Environmental Research.

Uncertainty in the Measurements
Mass spectrometric measurement of a protein: Mr = 23,564 Da ± 10 Da (95% confidence)
Relative quantification of a protein in quantitative proteomics: abundance ratio = 1:1, 95% confidence interval = [2:1, 1:2]
The principal aim
RelEx 1, ASAPratio 2, XPRESS 3, MSQuan 4
1 Anal Chem; 2 Anal Chem; 3 Nat Biotechnol; 4 Nat Biotechnol

Experimental
1. Metabolic labeling of Rhodopseudomonas palustris with the stable isotope 15N
2. Standard mixtures of natural and 15N-labeled proteomes at pre-determined mixing ratios
3. Shotgun proteomics analysis
–MS instrument: linear ion trap (LTQ, Finnigan)
–2D-LC method: 24-hour MudPIT technique 5
4. Protein identification
–Database searching: DBDigger 6
–Identification filtering: DTASelect 7
5 Int. J. of Mass Spec; 6 Anal Chem; 7 J. Proteome Res

Benchmark Data
Peptide I.D. filtering: 95% true positive rate
Protein I.D. filtering: minimum of 2 peptides
Data quality
Reproducibility

Block Diagram
mass spectral data (MS1 or mzXML format)
→ SIC reconstruction → selected ion chromatogram
→ peak detection (parallel paired covariance) → chromatographic peak
→ peptide quantification (principal component analysis) → peptide abundance ratio, confidence score
→ protein quantification (maximum likelihood estimation) → protein abundance ratio, confidence interval

Peak Detection
Parallel paired covariance chromatogram (PPC)
Peak boundaries are defined as the local minima in the PPC that enclose all MS/MS scans matching the peptide.
[Figure: light isotopologue SIC, heavy isotopologue SIC, and PPC with the peak boundaries marked; annotated S/N = 3, S/N = 13, S/N = 42; axes: scan number vs. ion intensity and covariance]
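
The slide does not give the formula for the PPC, so the following is only a minimal sketch under a stated assumption: the PPC value at each scan is taken to be the product of the baseline-subtracted light and heavy intensities, which is one term of a covariance sum. The function names, the use of the minimum intensity as baseline, and the toy chromatograms are all hypothetical. Boundaries are found by climbing from the MS/MS scans to the nearest apex and then descending to the local minima on each side.

```python
def ppc_chromatogram(light, heavy):
    """Per-scan product of baseline-subtracted intensities (assumed PPC form)."""
    if len(light) != len(heavy):
        raise ValueError("both SICs must cover the same scans")
    base_l, base_h = min(light), min(heavy)  # crude baseline: assumption
    return [(l - base_l) * (h - base_h) for l, h in zip(light, heavy)]


def peak_boundaries(ppc, msms_scans):
    """Local minima of the PPC on both sides of the peak holding the MS/MS scans."""
    lo, hi = min(msms_scans), max(msms_scans)
    # Start from the highest PPC point among the MS/MS scans, then climb uphill.
    apex = max(range(lo, hi + 1), key=lambda i: ppc[i])
    while True:
        if apex > 0 and ppc[apex - 1] > ppc[apex]:
            apex -= 1
        elif apex < len(ppc) - 1 and ppc[apex + 1] > ppc[apex]:
            apex += 1
        else:
            break
    # Descend from the apex until the PPC stops decreasing.
    left = right = apex
    while left > 0 and ppc[left - 1] < ppc[left]:
        left -= 1
    while right < len(ppc) - 1 and ppc[right + 1] < ppc[right]:
        right += 1
    # Make sure every MS/MS scan lies inside the reported boundaries.
    return min(left, lo), max(right, hi)


light = [1, 1, 2, 5, 9, 5, 2, 1, 1, 3, 1]      # toy light isotopologue SIC
heavy = [2, 2, 4, 10, 18, 10, 4, 2, 2, 6, 2]   # toy heavy isotopologue SIC
ppc = ppc_chromatogram(light, heavy)
bounds = peak_boundaries(ppc, msms_scans=[3])
```

Because the product is large only where both isotopologues elute together, a noise spike in one SIC alone contributes little to the PPC, which is consistent with the higher S/N annotated for the PPC on the slide.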

Peptide Quantification
Peptide abundance ratios can be estimated by
• Peak height ratio
• Peak area ratio
ASAPratio, MSQuan, XPRESS
[Figure: two chromatogram panels; axes: scan number vs. ion intensity]
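
As a rough illustration of the two estimators named above, the sketch below computes a peak height ratio and a peak area ratio from two selected ion chromatograms restricted to the detected peak. The function names, the trapezoidal integration with unit scan spacing, and the choice of which isotopologue is the numerator are assumptions made for the example, not details taken from the slides.

```python
def peak_height_ratio(numerator, denominator):
    """Ratio of the maximum intensities of the two chromatographic peaks."""
    return max(numerator) / max(denominator)


def trapezoid_area(intensities):
    """Trapezoidal area under a chromatogram, assuming unit scan spacing."""
    return sum((a + b) / 2.0 for a, b in zip(intensities, intensities[1:]))


def peak_area_ratio(numerator, denominator):
    """Ratio of the integrated areas of the two chromatographic peaks."""
    return trapezoid_area(numerator) / trapezoid_area(denominator)


light = [0, 2, 4, 2, 0]          # toy intensities inside the peak boundaries
heavy = [0, 1, 2, 1, 0]
heavy_spiked = [0, 1, 4, 1, 0]   # same peak with a single-scan noise spike

height = peak_height_ratio(light, heavy)
area = peak_area_ratio(light, heavy)
height_spiked = peak_height_ratio(light, heavy_spiked)
area_spiked = peak_area_ratio(light, heavy_spiked)
```

In the spiked example the height ratio drops from 2 to 1 while the area ratio drops only to about 1.33, which shows why a single noisy scan affects the height estimate more than the area estimate.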

Peptide Quantification
RelEx: linear regression; RelEx ratio = tan(θ)
Principal component analysis (PCA): signal-to-noise ratio = PCA-SNR
[Figure: chromatogram (scan number vs. ion intensity) and scatter plot of heavy vs. light isotopologue ion intensity, showing the angle θ and the principal components PC 1 and PC 2]
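
A minimal sketch of the PCA idea, under assumptions the slide does not confirm: each scan inside the peak is treated as a point (x, y) of the two isotopologue intensities, the data are mean-centered, the abundance ratio is read as tan(θ) of the first principal component, and PCA-SNR is taken to be the ratio of the first to the second eigenvalue. The function name is hypothetical, and the 2x2 eigenproblem is solved in closed form so that no external library is needed.

```python
import math


def pca_ratio_and_snr(x, y):
    """Return (tan(theta) of PC 1, eigenvalue ratio) for paired intensities."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least 2 scans")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Closed-form eigenvalues of [[sxx, sxy], [sxy, syy]].
    root = math.hypot(sxx - syy, 2.0 * sxy)
    lam1 = (sxx + syy + root) / 2.0   # variance along PC 1 (signal)
    lam2 = (sxx + syy - root) / 2.0   # variance along PC 2 (noise)
    snr = lam1 / lam2 if lam2 > 0 else float("inf")
    return math.tan(theta), snr


denominator_sic = [0.0, 1.0, 2.0, 3.0, 4.0]
numerator_sic = [0.1, 1.9, 4.2, 5.8, 8.0]   # roughly twice the denominator
ratio, snr = pca_ratio_and_snr(denominator_sic, numerator_sic)
```

Unlike ordinary least-squares regression, which attributes all noise to the y variable, the first principal component treats noise in both isotopologues symmetrically; anti-correlated input would yield a negative tan(θ), which a real implementation would need to reject.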

Quantification Accuracy
[Figure: histograms of peptide counts vs. log2(ratio) at the 1:5 mixing ratio for peak height ratio, peak area ratio, and PCA/linear regression, with the expected log2(ratio) marked]

Quantification Accuracy
[Figure: histograms of peptide counts vs. log2(ratio) at mixing ratios 1:10, 1:5, 1:1, 5:1, and 10:1]

Quantification Confidence
2D histogram of peptide log2(ratio) and log2(PCA-SNR) at the 5:1 mixing ratio
[Figure axes: log2(ratio), log2(PCA-SNR), peptide counts]

Quantification Confidence
Bin the peptides by their log2(PCA-SNR) value
Bias: the deviation of the average estimated log2(ratio) from the expected log2(ratio)
Bias increases as PCA-SNR decreases below a threshold
[Figure: log2(ratio) vs. log2(PCA-SNR) at the 5:1 mixing ratio]

Quantification Confidence
Bin the peptides by their log2(PCA-SNR) value
Variance: the variability of the estimated log2(ratio)
Variance increases as PCA-SNR decreases
[Figure: log2(ratio) vs. log2(PCA-SNR) at the 5:1 mixing ratio]

Quantification Confidence
Comet-like two-dimensional distribution
As log2(SNR) decreases,
• the spread of log2(ratio) estimates increases
• the average of log2(ratio) estimates regresses to zero
[Figure: log2(ratio) vs. log2(PCA-SNR) at mixing ratios 1:10, 1:5, 1:1, 5:1, and 10:1]

Quantification Confidence
The quantification bias and variance for peptides are linear functions of PCA-SNR
[Figure: | mean { log2(ratio) } | vs. log2(PCA-SNR) and standard deviation { log2(ratio) } vs. log2(PCA-SNR), each for mixing ratios 1:1, 5:1 & 1:5, and 10:1 & 1:10]
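
The binning procedure described on the preceding slides can be sketched as follows. This is an illustrative reading only: the bin width, the minimum of two peptides per bin, the use of the log2 scale for PCA-SNR (suggested by the figure axes), and the ordinary least-squares fit are all assumptions, and the helper names and toy numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def bin_by_snr(log_ratios, log_snrs, width=1.0):
    """Group peptides by log2(PCA-SNR); return (bin center, mean, sd) rows."""
    bins = {}
    for r, s in zip(log_ratios, log_snrs):
        bins.setdefault(math.floor(s / width), []).append(r)
    rows = []
    for k in sorted(bins):
        values = bins[k]
        if len(values) >= 2:  # a standard deviation needs at least 2 peptides
            rows.append(((k + 0.5) * width, mean(values), stdev(values)))
    return rows


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Toy peptides from a hypothetical 5:1 mixture (expected log2 ratio about 2.32).
log_ratios = [1.0, 3.0, 2.0, 2.6, 2.2, 2.4]
log_snrs = [1.2, 1.8, 3.1, 3.9, 5.5, 5.2]
rows = bin_by_snr(log_ratios, log_snrs)
centers = [row[0] for row in rows]
sds = [row[2] for row in rows]
sd_slope, sd_intercept = fit_line(centers, sds)
```

In this toy data the fitted slope of the standard deviation is negative, mirroring the slide's observation that the spread of log2(ratio) estimates grows as PCA-SNR falls.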

Protein Quantification
Maximum likelihood point estimate of a protein’s abundance ratio is the ratio that best explains its measured peptides’ estimated log2(ratio) at the calculated log2(PCA-SNR)
[Figure: a series of theoretical probability distributions of peptide abundance ratio estimates at each PCA-SNR level (mean and 2 sd), overlaid with the measured peptides; axes: log2(ratio) vs. log2(PCA-SNR)]

Quantification Accuracy
RelEx filtering:
–> 0.7 correlation at 1
–> 0.4 correlation at 10
–> 3 signal-to-noise
–> 2 peptides
PRATIO filtering:
–> 2 PCA-SNR
–> 2 peptides
–< 4 95% confidence interval width for log2(ratio)
MSE: Mean Square Error
[Figure: protein counts vs. log2(ratio)]
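
The slide names MSE as the accuracy measure without spelling it out. A minimal sketch, assuming it is the mean squared deviation of the estimated protein log2(ratio) values from the expected log2(ratio) of the standard mixture; the function name and the numbers are hypothetical.

```python
import math


def mean_square_error(estimated_log_ratios, expected_log_ratio):
    """Mean squared deviation of the estimates from the known mixing ratio."""
    if not estimated_log_ratios:
        raise ValueError("need at least one estimate")
    return sum((e - expected_log_ratio) ** 2
               for e in estimated_log_ratios) / len(estimated_log_ratios)


expected = math.log2(5.0)                 # a 5:1 standard mixture
estimates = [2.1, 2.4, 2.3, 2.6]          # toy protein log2(ratio) estimates
mse = mean_square_error(estimates, expected)
```

A lower MSE at the same number of retained proteins would indicate more accurate quantification, which is how two programs could be compared on the same benchmark.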

Quantification Accuracy
RelEx: red; PRATIO: blue
[Figure: protein counts vs. log2(ratio) at mixing ratios 1:10, 1:5, 1:1, 5:1, and 10:1]

Confidence Interval Estimation
Display of the point estimates (+) and the 95% confidence interval estimates for protein abundance ratios
[Figure: log2(ratio) for each protein at the 1:5 mixing ratio]
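
To make the maximum likelihood estimate and its confidence interval concrete, here is a sketch under loudly stated assumptions. Each peptide's estimated log2(ratio) is modeled as normally distributed, with a mean shrunk toward zero and a standard deviation that grows as log2(PCA-SNR) falls. The linear coefficients in `peptide_model` are invented for illustration, not fitted values from the talk, and the interval uses a standard likelihood-ratio cutoff (log-likelihood within 1.92 of the maximum, half the 95% chi-square quantile with one degree of freedom), which may differ from the authors' procedure.

```python
import math


def peptide_model(true_log_ratio, log_snr):
    """Hypothetical mean and sd of a peptide estimate at a given log2(PCA-SNR)."""
    shrink = min(1.0, max(0.0, log_snr / 4.0))   # bias: regress to zero at low SNR
    sd = max(0.2, 2.0 - 0.4 * log_snr)           # variance: wider at low SNR
    return true_log_ratio * shrink, sd


def log_likelihood(true_log_ratio, peptides):
    """Sum of normal log densities over (estimated log2 ratio, log2 SNR) pairs."""
    total = 0.0
    for estimate, log_snr in peptides:
        mu, sd = peptide_model(true_log_ratio, log_snr)
        z = (estimate - mu) / sd
        total += -math.log(sd) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
    return total


def estimate_protein(peptides, lo=-7.0, hi=7.0, step=0.01):
    """Grid-search MLE of the protein log2(ratio) with a 95% interval."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    ll = [log_likelihood(r, peptides) for r in grid]
    best = max(ll)
    mle = grid[ll.index(best)]
    inside = [r for r, value in zip(grid, ll) if value >= best - 1.92]
    return mle, (inside[0], inside[-1])


peptides = [(2.3, 5.0), (2.4, 6.0), (2.2, 4.5)]   # toy high-SNR peptides
mle, interval = estimate_protein(peptides)
```

With three high-SNR peptides the interval is roughly 0.45 log2 units wide; adding low-SNR peptides to this model would widen it very little, because their broad distributions carry little information about the protein ratio.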

Confidence Interval Estimation
Point estimates and confidence interval estimates of protein abundance ratios
[Figure: log2(ratio) panels for mixing ratios 1:10, 1:5, 1:1, 5:1, and 10:1]

Conclusions
Three novel algorithms
• Parallel paired covariance for peak detection
• Principal component analysis for peptide quantification
• Maximum likelihood estimation for protein quantification
Improved protein quantification accuracy
Rigorous confidence interval estimation
The fully automated program with a graphical user interface is freely available for testing by contacting C. Pan.