ADVANCEMENT IN PROTEIN INFERENCE FROM SHOTGUN PROTEOMICS USING PEPTIDE DETECTABILITY
PEDRO ALVES
Advisor: Predrag Radivojac
School of Informatics, BLOOMINGTON



Overview
Shotgun Proteomics
Protein Inference Problem
Protein Identification Using Peptide Detectability
Results
Limitations and Improvements

Degenerate Peptides
Rat Sample/Rat IPI Database: 60%
Nesvizhskii, A.I. and Aebersold, R. (2005) Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell Proteomics, 4, 1419–1440.

Protein Inference Problem
Solution 1: (A, E)
Solution 2: (B, C, D)
Minimum Protein Set
11 Possible Solutions
Nesvizhskii, A.I. and Aebersold, R. (2005) Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell Proteomics, 4, 1419–1440.
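The ambiguity behind the protein inference problem can be illustrated with a small brute-force sketch. It is not taken from the slides: the function name minimum_protein_sets and the toy protein-to-peptide map are hypothetical, and exhaustive enumeration is only practical for very small inputs. It lists every smallest set of proteins that explains all identified peptides.

```python
from itertools import combinations


def minimum_protein_sets(protein_peptides, identified):
    """Return every smallest set of proteins whose peptides cover `identified`.

    protein_peptides: dict mapping protein name -> set of its peptides
    identified: iterable of identified peptide names
    """
    proteins = sorted(protein_peptides)
    target = set(identified)
    # Try set sizes in increasing order; the first size that yields a cover is minimal.
    for size in range(1, len(proteins) + 1):
        covers = [
            combo
            for combo in combinations(proteins, size)
            if target <= set().union(*(protein_peptides[p] for p in combo))
        ]
        if covers:
            return covers
    return []


# Hypothetical toy data: every peptide is shared by two proteins (degenerate).
toy = {
    "A": {"p1", "p2"},
    "B": {"p2", "p3"},
    "C": {"p1", "p3"},
}
solutions = minimum_protein_sets(toy, {"p1", "p2", "p3"})
print(solutions)
```

In this toy case three different pairs of proteins explain the data equally well, so the minimum protein set criterion alone cannot choose between them.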

GMPSA: Greedy Minimum Protein Set Algorithm
[Figure: identified peptides mapped to proteins]
Nesvizhskii, A.I. and Aebersold, R. (2005) Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell Proteomics, 4, 1419–1440.
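A greedy minimum protein set procedure can be sketched as a standard greedy set cover. This is a minimal illustration under assumptions, not the exact GMPSA from the talk: the function name greedy_protein_set, the alphabetical tie-breaking rule, and the toy data are all hypothetical.

```python
def greedy_protein_set(protein_peptides, identified):
    """Greedy set cover: repeatedly pick the protein explaining the most uncovered peptides.

    Returns (chosen proteins in pick order, peptides no protein could explain).
    Ties are broken alphabetically, which is an arbitrary choice made for this sketch.
    """
    uncovered = set(identified)
    chosen = []
    while uncovered:
        # max() keeps the first maximum, so sorting makes tie-breaking deterministic.
        best = max(
            sorted(protein_peptides),
            key=lambda p: len(protein_peptides[p] & uncovered),
        )
        gained = protein_peptides[best] & uncovered
        if not gained:
            break  # remaining peptides map to no protein in the database
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical toy data: B and C both explain p4 once A has been picked.
toy = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4"},
    "C": {"p4"},
    "D": {"p5"},
}
chosen, unexplained = greedy_protein_set(toy, {"p1", "p2", "p3", "p4", "p5"})
print(chosen, unexplained)
```

After A is picked, B and C each explain exactly one remaining peptide, so the greedy step has to pick between tied proteins without any evidence. That is the kind of tie the detectability-based approach is meant to resolve.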

Resolving Ambiguity
Detectability of a peptide: the probability that the peptide will be observed in a standard sample analyzed by a standard proteomics routine.
Tang, H., Arnold, R. J., Alves, P., Xun, Z., Clemmer, D. E., Novotny, M. V., Reilly, J. P. & Radivojac, P. (2006). A computational approach toward label-free protein quantification using predicted peptide detectability. Bioinformatics, 22 (14): e481-e488.

Factors Affecting Peptide Detection
Four classes of factors:
1) Chemical properties of the peptide (and parent protein)
2) Limitations of the peptide identification protocol
3) Abundance of the peptide in the sample
4) Presence of other peptides that compete for detection

Peptide Detectability Prediction
Mean accuracy: 71%
Mean AUC: 78%
Synthetic: ~30% of peptides identified
Real: ~10% of peptides identified
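For readers unfamiliar with the AUC figure quoted for the detectability predictor, the sketch below shows one standard way to compute it: the fraction of (detected, undetected) peptide pairs that the predictor ranks correctly. The function name auc and the scores and labels are hypothetical illustration data, not values from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted detectability per peptide
    labels: 1 if the peptide was detected, 0 otherwise
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one detected and one undetected peptide")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four peptides.
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(example_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of detected from undetected peptides.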

MDAP
Minimum Missed Peptides
[Figure: identified peptides, proteins, and a missed peptide]
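One way to make the idea of a missed peptide concrete is sketched below. The definition used here is an assumption made for illustration, not necessarily the one in the talk: a peptide counts as missed for a protein if it was not identified even though its predicted detectability is higher than that of at least one identified peptide from the same protein. The function name missed_peptides and all detectability values are hypothetical.

```python
def missed_peptides(detectability, identified):
    """Return the peptides of one protein that look 'missed'.

    detectability: dict mapping each peptide of the protein -> predicted detectability
    identified: set of peptides identified in the experiment
    Assumed definition: unidentified peptides whose detectability exceeds the
    lowest detectability among this protein's identified peptides.
    """
    seen = [d for pep, d in detectability.items() if pep in identified]
    if not seen:
        return set()  # no identified peptide, so no cutoff to compare against
    cutoff = min(seen)
    return {
        pep
        for pep, d in detectability.items()
        if pep not in identified and d > cutoff
    }


# Hypothetical tie: proteins X and Y both explain the identified peptides p1 and p2.
identified = {"p1", "p2"}
protein_x = {"p1": 0.9, "p2": 0.6, "p3": 0.8, "p4": 0.2}
protein_y = {"p1": 0.9, "p2": 0.6, "p5": 0.3}

missed_x = missed_peptides(protein_x, identified)
missed_y = missed_peptides(protein_y, identified)
print(missed_x, missed_y)
```

Under this assumed definition, protein X leaves a highly detectable peptide unexplained while protein Y does not, so preferring the protein set with fewer missed peptides would favor Y.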

LDFA
[Figure: identified peptides mapped to proteins]

RESULTS
Synthetic Sample with 12 Proteins
GMPSA: 7 correct proteins, 5 tied proteins
LDFA: 10 correct proteins, 1 tied protein, 1 incorrect tied protein

GMPSA vs LDFA in an R. norvegicus sample
Rat Sample/Rat IPI Database
[Figure: GMPSA and LDFA results, with indistinguishable pairs marked]

GMPSA vs LDFA (columns: GMPSA, LDFA) Total proteins identified 62% 81% Percent of proteins assigned with no ties Total assignments with no ties 149 Proteins assigned due to unique peptides 4 75 Total unambiguous assignments excluding the proteins with unique peptides Identified Proteins Unambiguously Identified Proteins

Limitations and Improvements
Include missed-cleavage peptides
Include lower scoring peptides to aid in the differentiation of tied proteins
Include peptides identified with charges +1 and +3
Train on other analytical platforms
Study the effects of detectability prediction on algorithm results

Publications
PSB 2007 – Alves, P., Arnold, R., Novotny, M., Radivojac, P., Reilly, J., Tang, H. (2007). Advancement in Protein Inference from Shotgun Proteomics Using Peptide Detectability. Pac. Symp. Biocomput., (2007) 12:
ISMB 2006 – Tang, H., Arnold, R. J., Alves, P., Xun, Z., Clemmer, D. E., Novotny, M. V., Reilly, J. P. & Radivojac, P. (2006). A computational approach toward label-free protein quantification using predicted peptide detectability. Bioinformatics, (2006) 22 (14): e481-e488.

Acknowledgements
Predrag Radivojac
Haixu Tang
Randy Arnold
IU School of Informatics
IU Chemistry Dept.