
Improving the Sensitivity of Peptide Identification from Tandem Mass Spectra using Meta-Search, Grid-Computing, and Machine-Learning

Nathan J. Edwards, Georgetown University Medical Center
Annual Meeting, 2009
Poster produced by Faculty & Curriculum Support (FACS), Georgetown University Medical Center

Application of meta-search, grid-computing, and machine-learning can significantly improve the sensitivity of peptide identification. The PepArML meta-search engine is publicly available, free of charge, on-line from: http://edwardslab.bmcb.georgetown.edu

Introduction

Automatic search engine configuration and execution, parameterized by:
- Instrument & proteolytic agent
- Fixed and variable modifications
- Protein sequence database & MS/MS spectra file
- Peptide candidate selection

Unified MS/MS Search Interface

- Simple search description

Peptide Candidate Selection

- Missed cleavages, specific or semi-specific proteolysis
- Precursor matching parameters, including precursor mass tolerance & 13C peak correction
- Charge state guessing and/or enumeration

MS/MS Spectra Reformatting

- Charge and precursor enumeration for peptide candidate selection (for charge & 13C peak correction)
- Search engine formatting constraints (MGF/mzXML)
- Consistent MS/MS spectrum identifier tracking
- Spectrum file "chunking"

Peptide Identification Meta-Search via Grid-Computing

Meta-search with five search engines; automatic target & decoy searches.
- Secure communication
- Heterogeneous compute resources
- Scales to 250+ simultaneous searches
- Free, instant registration

Compute resources (figure labels): NSF TeraGrid 1000+ CPUs; Edwards Lab Scheduler & 48+ CPUs.
Search engines (figure labels, in source order): Tandem, KScore, OMSSA; X!Tandem, KScore, OMSSA, MyriMatch, Mascot (1 core).
Other figure labels: Job management; Result combining.

PepArML – Unsupervised Machine-Learning Combiner

[Figure panels: Q-TOF, LTQ, MALDI. Series labels: Heuristic, C-TMO, U-TMO, U*-TMO.]

Legend:
- T, M, O: Tandem, Mascot, OMSSA
- M*: Mascot w/ Peptide Prophet
- H: Heuristic
- C-T, C-M, C-O, C-TM, C-TO, C-MO, C-TMO: Classifier w/ 5-fold-CV
- U-TMO: Unsupervised classifier w/ 5-fold-CV
- U*-TMO: Unsupervised classifier w/ no-CV

Real Data: Peptide Atlas – A8_IP

Peptide Atlas A8_IP LTQ MS/MS Dataset
- Tryptic search of Human ESTs using PepSeqDB [2]
- 107084 spectra searched ~26 times: target + 2 decoys, 5 engines, 1+ vs 2+/3+ charge
- 8685 search jobs, 25.7 days of total CPU time
- 5211 TeraGrid TKO jobs in < 2 hours (143 machines)
- Total elapsed time (Mascot bottleneck): < 26 hours

Conclusions

The PepArML meta-search engine provides:
- A unified MS/MS search interface for Mascot, X!Tandem, KScore, OMSSA, and MyriMatch
- Search job scheduling on multiple large-scale heterogeneous computational grids
- Unsupervised, model-free result combining using machine-learning (PepArML [1])

The PepArML meta-search engine improves peptide identification sensitivity, significantly increasing the number of peptide ids at 10% FDR.

References

1. N. Edwards, X. Wu, and C.-W. Tseng. "An Unsupervised, Model-Free, Machine-Learning Combiner for Peptide Identifications from Tandem Mass Spectra." Clinical Proteomics 5.1 (2009).
2. N.J. Edwards. "Novel Peptide Identification using Expressed Sequence Tags and Sequence Database Compression." Molecular Systems Biology 3.102 (2007).
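The poster mentions charge state enumeration and 13C peak correction for peptide candidate selection but does not show how they are carried out. The following is a minimal Python sketch of the general idea, not PepArML's actual implementation: for one observed precursor m/z, it lists the neutral peptide masses implied by each assumed charge state and each assumed number of 13C atoms in the selected isotope peak. The function names, the default charge states, the 13C limit, and the mass tolerance are all hypothetical choices made for illustration; only the two physical constants are standard values.

```python
# Illustrative sketch only; names and defaults are assumptions, not PepArML's API.

PROTON_MASS = 1.007276466  # Da, mass of a proton
C13_DELTA = 1.0033548      # Da, mass difference between 13C and 12C


def candidate_neutral_masses(precursor_mz, charges=(2, 3), max_c13=1):
    """Enumerate (charge, n_c13, monoisotopic_mass) hypotheses for one precursor m/z.

    For charge z, the neutral mass implied by the observed peak is
    z * (m/z - proton mass). If the instrument selected a peak containing
    k 13C atoms instead of the monoisotopic peak, the monoisotopic mass
    is lower by k * C13_DELTA.
    """
    candidates = []
    for z in charges:
        observed_mass = z * (precursor_mz - PROTON_MASS)
        for k in range(max_c13 + 1):
            candidates.append((z, k, observed_mass - k * C13_DELTA))
    return candidates


def matching_hypotheses(peptide_mass, precursor_mz, tol_da=0.5,
                        charges=(2, 3), max_c13=1):
    """Return the (charge, n_c13) hypotheses under which a peptide of the
    given monoisotopic mass falls within tol_da of the precursor."""
    return [
        (z, k)
        for z, k, mass in candidate_neutral_masses(precursor_mz, charges, max_c13)
        if abs(mass - peptide_mass) <= tol_da
    ]


if __name__ == "__main__":
    for z, k, mass in candidate_neutral_masses(500.0):
        print(f"charge {z}+, {k} x 13C -> monoisotopic mass {mass:.4f} Da")
```

With a tight tolerance, a peptide whose mass sits about 1 Da below the uncorrected precursor mass is only accepted under the one-13C hypothesis, which is the kind of candidate that a search without 13C peak correction would miss.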

