Proteome Discoverer Workshop I Orbi 4, 2012. 2 Overview Why Proteome Discoverer? Support and additional information BRIMS portal, Planet Orbitrap, Sparkplug.


Proteome Discoverer Workshop I Orbi 4, 2012

2 Overview Why Proteome Discoverer? Support and additional information BRIMS portal, Planet Orbitrap, Sparkplug Proteome Discoverer 1.3 Unique features of 1.3 Quan applications Some tips and tricks Proteome Discoverer 1.4 and 2.0 Supplementary information

3 Why Proteome Discoverer? Easy: processing nodes can be connected for simple or complex analyses. Powerful: Mascot, Sequest, and ZCore can all be used to create a single result file. Correct: RAW data from Orbitrap and Exactive instruments is interpreted correctly, whereas other software commonly uses the less accurate pre-scan precursor mass. If you use other software, it is crucial that you first convert the RAW file to a generic format using Proteome Discoverer.

4 Proteome Discoverer Use As of August, PD had been cited in >270 publications. In September alone, 10 papers in MCP and JPR cited the use of Proteome Discoverer.

5 Proteome Discoverer 1.3 Support If you upgraded from PD 1.2 to PD 1.3, you received 1 year of support with the upgrade. If you purchased PD 1.3 (not an upgrade), you received 3 years of support. Support = free upgrades during the support period and free use of the Annotator module. Additional years of support may be purchased. If you upgraded from PD 1.2 to PD 1.3 and your support has expired, you can still upgrade to PD 1.4 without charge.

6 Additional Support: Thermo-BRIMS.com Over 1,200 registered users Over 290 discussion threads Over 90% of the threads concern PD If your question hasn’t been asked before, someone will answer it.

7 Additional Support: Planet Orbitrap Constantly updated information on: New hardware Applications and methods Recent publications

8 Additional Support: Sparkplug Mobile links to seminars in your area Curated lists of publications using Orbitrap instruments Proteomics and small molecule support

9 Proteome Discoverer 1.3 Search Modes As with previous versions of PD, you have 3 separate ways to search your data. Wizards: for when you want to process one file at a time. Workflows: automated processing. Daemon: automated batch processing on single or multiple remote PCs.

10 Searching With Wizards

11 Open/modify a Default Workflow …or create your own from scratch

12 Workflows As in previous iterations, workflows can vary in complexity. If a connection between nodes isn’t logical, PD 1.3 won’t allow you to make it. In PD 1.3, all workflows must contain a false discovery application node, either Peptide Validator or Percolator.

13 Discoverer Daemon: Data Processing

Unique Features in PD 1.3

15 Percolator FDR calculations are based on score distributions and can be almost arbitrary. Percolator is a machine learning algorithm that takes the guesswork out of picking an FDR. Percolator q-value rescoring provides a common metric for comparing Sequest and Mascot results. Percolator rescoring adds processing time compared with a traditional FDR calculation. This is most pronounced when processing multiple files through a MudPIT scheme.
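The relationship between a score cutoff, the FDR, and a q-value can be illustrated with a plain target-decoy calculation. This is a minimal sketch and not Percolator itself (Percolator additionally learns a new score before this step); the function name `q_values` and the input layout are assumptions made for illustration.

```python
def q_values(psms):
    """Compute q-values from a target-decoy search.

    psms: list of (score, is_decoy) tuples, where a higher score is a better match.
    Returns a list of (score, is_decoy, q_value) sorted from best to worst score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value of a PSM is the lowest FDR at which it would still be accepted,
    # so take a running minimum from the worst score upwards.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]
```

Because every search engine's PSMs end up on the same q-value scale, results from different engines can be filtered with one threshold (for example, q ≤ 0.01) instead of engine-specific score cutoffs.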

16 Percolator in Use Collaborator experiment (unpublished): plasma from 4 patients with severe malaria and 4 with ‘non-severe’ malaria was iTRAQ labeled. The data were processed with PD 1.2 using stringent criteria: observations must be consistent in all samples, with a 2 peptide minimum. The data were later re-processed with PD 1.3 using the same criteria, but with Percolator rather than a strict FDR.

17 Percolator in Use PD 1.2 FDR: ASS1 pathway implicated, pathway score 16, 4th place. PD 1.3 Percolator: ASS1 pathway implicated, pathway score 42, 2nd place.

18 Percolator in Use Under particularly stringent conditions, improved FDR calculations can make or break your experiments. Although the argininosuccinate synthetase (ASS1) pathway was implicated in the first analysis, applying Percolator to the same data set allowed several new proteins to meet the cutoff conditions, strengthening the evidence for this pathway. As plasma arginine levels correlate with malaria severity, changes in this pathway were expected. Another important finding was HMOX1, which is critical to the immune response to malaria.

19 Percolator Crashes?!? Every time Percolator runs, it sends a ‘ping’ to the University of Washington. Recently, new stringent IT security protocols have identified this as a threat! If you find that your workflows are now crashing during the Percolator steps – for the free

20 PhosphoRS phosphoRS is a separate scoring algorithm that provides confidence in phosphorylation site assignment. Each potential phosphorylation site receives a score that estimates the probability that the phosphorylation in question occurred at that particular position.

21 PhosphoRS in Action The pRS probability is the likelihood that there is a phosphorylation. The pRS Site probability is the likelihood of the phosphorylation occurring at each individual S, T, or Y.
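To make the idea of per-site probabilities concrete, the sketch below lists the candidate S/T/Y positions in a peptide and keeps the sites whose probability clears a cutoff. It does not reproduce the phosphoRS scoring itself; the function names, the `{"S3": 98.5}` dictionary layout, and the 75% default cutoff are assumptions chosen for illustration.

```python
def candidate_sites(peptide):
    """Return (1-based position, residue) for every S, T or Y in the peptide."""
    return [(i, aa) for i, aa in enumerate(peptide, start=1) if aa in "STY"]


def confident_sites(site_probabilities, threshold=75.0):
    """Keep sites whose site probability (in percent) meets the threshold.

    site_probabilities: dict such as {"S3": 98.5, "T5": 1.0} (hypothetical layout).
    threshold: assumed cutoff in percent, not a value taken from the workshop.
    """
    return sorted(site for site, p in site_probabilities.items() if p >= threshold)
```

For a peptide with one phosphate and three candidate residues, a result such as S3 = 98.5% against T5 = 1.0% and Y6 = 0.5% localizes the modification to S3; probabilities spread evenly across sites mean the spectrum cannot distinguish them.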

22 Annotation Annotator helps link your MS/MS observations to biological pathways by gene ontology. This data is downloaded directly from Thermo ProteinCenter. When Annotator is added to your workflow, your output file contains: molecular function, cellular component, and biological pathway.

23 Annotator in Action In this experiment, membrane proteins were enriched via gradient centrifugation. The cellular component tab shows that membrane proteins are well-represented in this data set. Known functions and processes are also shown, reducing the hassle of sorting biologically relevant data.

PD 1.3 Quan

25 The Three Quan Modules PD 1.3 can be used for quantification. Default workflows are written for: SILAC, TMT/iTRAQ, and peak area calculations.

26 SILAC A workflow template for SILAC (K6, R10) comes with installation. The workflow can be easily adapted to other labels, with multiple options for output, including ratio reporting, calculation, and setting experiment bias.

27 Reporter Ions (TMT and iTRAQ) Reporter ion quantification is even more flexible than SILAC. An unlimited number of ratios may be reported for a dataset, including pair-wise analysis, group comparison, and even ‘round robin’ outputs. Reporting the ratios in multiple ways helps you quickly identify the most interesting and reproducible observations.
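As an illustration of reporting ratios in more than one way, the sketch below computes every channel against every other channel, which is one plausible reading of a ‘round robin’ output. The function name and the channel labels are assumptions; this is not how PD is configured, only the arithmetic behind the ratios.

```python
from itertools import combinations


def pairwise_ratios(intensities):
    """Return {"numerator/denominator": ratio} for every pair of reporter channels.

    intensities: dict mapping a channel label (e.g. "114") to its reporter ion
    intensity. Pairs whose denominator is zero are skipped.
    """
    ratios = {}
    for den, num in combinations(sorted(intensities), 2):
        if intensities[den] > 0:
            ratios[f"{num}/{den}"] = intensities[num] / intensities[den]
    return ratios
```

With n channels this yields n(n-1)/2 ratios, so a 4-plex experiment gives 6 ratios and an 8-plex gives 28, which is why filtering for reproducible changes across several ratios is useful.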

28 Label Free Quan with PIAD Adding the PIAD node to your workflow gives you an average of the areas of the 3 most intense peptides from each protein. This workflow provides you with the area only; determining regulation requires an additional step.

29 Label Free Quan with PIAD After your files are processed individually (or as a MudPIT experiment), the MSF output files may be compared by selecting the option shown above. The output will list all identified proteins as well as the precursor ion areas for that protein found in each file. If the protein was not found in both samples, one of the areas will be listed as 0.00.
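The two label-free steps described here, a per-protein area from the 3 most intense peptides and a comparison between files, amount to simple arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, and returning `None` for a protein missing from one file is a choice made here, not documented PD behavior.

```python
def top3_area(peptide_areas):
    """Average of the three largest peptide areas (fewer if fewer are available)."""
    top = sorted(peptide_areas, reverse=True)[:3]
    return sum(top) / len(top) if top else 0.0


def area_ratio(area_sample2, area_sample1):
    """Sample 2 / Sample 1 area ratio for one protein.

    Returns None when either area is 0.00, i.e. the protein was not found in
    both files, because a ratio is not meaningful in that case.
    """
    if area_sample1 == 0 or area_sample2 == 0:
        return None
    return area_sample2 / area_sample1
```

Treating the 0.00 entries separately keeps proteins that are absent from one sample from showing up as infinite or zero-fold changes in the regulation list.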

30 Hints and Tricks: 1) Interesting Peptides With No ID Spectra without IDs may be exported for further analysis in a few steps. First, go to the search input tab in your open report. Right click on the report to open “Group by Column”.

31 Interesting Peptides With No ID Next, drag the column header “PSM Ambiguity” into the new search input bar and expand the 0 match bar. The pair above differ by <1 ppm and eluted within 9 s of one another on consecutive gradients, but the area is 2.2 times higher in Sample 2. Checkmark the interesting spectra and pull down the Export menu as shown above. Export them in the most useful format for further interrogation. If the number of unidentified spectra is daunting, export the list to Excel and filter it with database software such as DigDB (DigDB.com).
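The comparison quoted on this slide (masses within 1 ppm, elution within 9 s) can be written as a small check for pairing the same unidentified feature across two runs. This is a sketch only: the function names are hypothetical, the default tolerances simply reuse the numbers from the slide, and the ppm difference is taken relative to the mean of the two observed m/z values because neither is a theoretical mass.

```python
def ppm_difference(mz_a, mz_b):
    """Absolute mass difference in parts per million, relative to the mean m/z."""
    return abs(mz_a - mz_b) / ((mz_a + mz_b) / 2.0) * 1e6


def likely_same_feature(feature_a, feature_b, ppm_tol=1.0, rt_tol_s=9.0):
    """feature_x: (m/z, retention time in seconds).

    True when both the mass and the retention time agree within tolerance.
    """
    mz_a, rt_a = feature_a
    mz_b, rt_b = feature_b
    return ppm_difference(mz_a, mz_b) <= ppm_tol and abs(rt_a - rt_b) <= rt_tol_s
```

At m/z 500, a 1 ppm window is only 0.0005 wide, which is why this kind of matching relies on accurate precursor masses.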

32 Hints and Tricks: 2) Static Exclusion Lists On a hybrid instrument, you can exclude up to 2,000 individual ions. A QE can exclude up to 5,000 ions. On both instruments, this can be done with or without retention time windows. On plasma samples, protein coverage has been shown to increase by as much as 50% when an exclusion list is employed (as compared to identical technical replicates).

33 To Make an Exclusion List With PD 1.3 (1) Open your report and sort your proteins by coverage or most matched PSMs. Highlight the top hits, right click anywhere, and click “Check Selected”.

34 To Make an Exclusion List With PD 1.3 (2) Next, go to File > Export > Export to Xcalibur Exclusion List. If this exclusion list is for a hybrid instrument, it is ready for input. Opening the list in Excel will let you see how many masses are included. If the list is for a Q Exactive, there is an extra step.

35 To Make an Exclusion List With PD 1.3 (3) QE software was written with direct Pinpoint compatibility. The difference is the Polarity column. Open the list in Excel, add a column containing only the word Positive, save as text, and it is ready for import into the QE.
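The Excel step can also be scripted. The sketch below assumes the exported exclusion list is tab-delimited text with a header row, which is an assumption about the file layout rather than something stated in the workshop; the function names and the `"hybrid"`/`"qe"` labels are hypothetical, while the capacities are the 2,000 and 5,000 ion limits quoted earlier in the workshop.

```python
import csv
import io

# Ion capacities quoted in the workshop; the dictionary keys are hypothetical labels.
MAX_IONS = {"hybrid": 2000, "qe": 5000}


def add_polarity_column(text, polarity="Positive"):
    """Append a Polarity column to tab-delimited exclusion list text.

    Assumes the first row is a header; every data row receives `polarity`.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return text
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(rows[0] + ["Polarity"])
    for row in rows[1:]:
        writer.writerow(row + [polarity])
    return out.getvalue()


def fits_instrument(text, instrument):
    """True when the number of data rows (header excluded) is within capacity."""
    n_rows = max(len(text.splitlines()) - 1, 0)
    return n_rows <= MAX_IONS[instrument]
```

Checking the row count before import avoids building a list that the instrument method cannot accept in full.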

36 Hints and Tricks: 3) Single RAW File Conversion As mentioned earlier, conversion of RAW files to other formats can be performed with PD 1.3. The PD converter is currently the only program that is guaranteed to extract the correct precursor mass from the various RAW data outputs. This is the only way to ensure that you are looking at the correct value every time.

37 Hints and Tricks: 4) Batch Conversion of RAW Files Open the PD 1.3 Daemon. Link it to your remote PC running PD 1.3, or to your ‘local host’ if you have only 1 PC.

38 Hints and Tricks: 4) Batch Conversion (for free) You do not need a full version of PD 1.3 installed in order to convert RAW files. To set up a free converter, download the PD 1.3 free trial version from the BRIMS portal. During your 30 day trial, create a separate method for each alternative data format (dta, mgf, mzdata, mzml). These methods will be available through the Daemon even after your free trial of PD 1.3 expires.

39 Hints and Tricks: 5) Identifying Contaminants With cRAP The Global Proteome Machine maintains a constantly updated database of common contaminants found in LC-MS experiments. This FASTA ranges from the expected (keratin) to the odd (sheep’s wool, and peptides found on latex?). The database can be used to construct a universal exclusion list, or the FASTA may be appended to the end of the one for your organism of interest. It can be found at thegpm.org, or by typing “gpm crap” into any search engine.
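Appending the contaminant FASTA to an organism FASTA is a plain text concatenation. The sketch below also prefixes each contaminant header with a tag so that contaminant hits are easy to spot in a report; that tagging step, the `CONTAM_` prefix, and the function name are assumptions added here and are not part of the workshop instructions.

```python
def append_fasta(organism_fasta, contaminant_fasta, tag="CONTAM_"):
    """Concatenate two FASTA texts, tagging every contaminant header line.

    organism_fasta, contaminant_fasta: FASTA file contents as strings.
    tag: hypothetical prefix inserted right after '>' in contaminant headers.
    """
    tagged = []
    for line in contaminant_fasta.splitlines():
        if line.startswith(">"):
            line = ">" + tag + line[1:]
        tagged.append(line)
    return organism_fasta.rstrip("\n") + "\n" + "\n".join(tagged) + "\n"
```

Searching the combined file lets contaminant spectra match their true source instead of being forced onto a protein from the organism of interest.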

Questions on PD 1.3?

41 PD Get More From Each RAW File! New search engines More comprehensive results Same processing format

42 New Features: Library Searching Integrated library search tools: SpectraST, written by Henry Lam, and MSPepSearch from NIST. Uses spectral libraries provided by NIST. Provides an emerging search methodology as a standard option. Improved sensitivity over sequence database search. Combine spectral library searching with classical database searching for fast and reliable identification of a maximum number of true positive Peptide Spectrum Matches.
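Spectral library searching compares an experimental spectrum against previously observed spectra rather than against theoretical fragments. A common similarity measure for this kind of comparison is a normalized dot product over binned peaks, sketched below. Whether and how SpectraST or MSPepSearch apply it inside PD is not stated in the workshop, so the function names, the 1.0 m/z bin width, and the scoring itself should be read as illustrative assumptions.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks: iterable of (m/z, intensity) pairs.
    """
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra, between 0.0 and 1.0."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

Because the library spectrum carries real fragment intensities, a match scores on peak heights as well as peak positions, which is one reason library searching can be more sensitive than a sequence database search.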

43 Spectral Libraries Upload libraries same way as FASTA files 2 different formats! We will distribute some on the DVD National Institute of Standards and Technology (NIST) Libraries can be downloaded at More can be downloaded on the homepage of Peptide Atlas ( NIST format for MSPepSearch SPLIB for SpectraST

44 Spectral Library Searching: 2 Tools SpectraST. Pros: automatic generation of decoy database; more libraries are available. Cons: slower; fixed mass tolerance for fragments; precursor mass tolerance min. 1 Da; only 32 bit, so an in-memory search of a large (human) library is impossible. MSPepSearch. Pros: faster; flexible tolerances. Cons: no decoy libraries; only NIST libraries.

45 New Feature: Faster and More Powerful Sequest HT Large data set: Sequest: 12 cores 70 raw files, 18 GB, spectra Takes advantage of the latest computer architecture to accelerate database searches

46 ISE Search Speed Improvements Over Sequest

47 New Feature: Faster and More Powerful Search Engines Linear Scaling of search performance with the number of available cores

48 Sequest HT New configuration parameters: Work Load Level: Automatic mode, or Manual (select # spectra to be processed at once). Number of parallel tasks: number of nodes used by Sequest. For the 12 core machine, I use all of them.

49 Advanced Annotation Features for Functional Studies User-definable filters for biological functions to study proteomics results in functional context. Annotation of proteins with Entrez Gene IDs retrieved from the ProteinCenter web service. User-definable GO slim and Annotation Aspects to annotate proteins with specific annotation terms. Annotates the number of PTM sites at the protein level.

50 Proteome Discoverer 2.0 Confident analysis of LARGE datasets. Complete reworking of the data processing workflow to make analysis of hundreds of files easy. Statistical analysis of the data at the PSM, peptide AND protein level. Release in 2013.

51 Proteome Discoverer 2.0 High performance for proteome-sized datasets. Automatically create tailor-made reports during data processing for very large data sets. Optimized for flexible reading and writing of large amounts of hierarchical data. Generation of report files: generate a report file from one or multiple .msf files. All data processing (filtering, protein scoring, FDR / probability calculation) will happen while generating the report file.

52 New Feature: Peptide and Protein Level Validation Absolute confidence in protein identifications. Increased confidence through the integration of statistically sound estimation of true posterior probabilities for identified peptides and proteins.

53 New Feature: Byonic Search Engine for PTMs Byonic is a next generation database search engine with many advanced features. Error tolerant search mode to find unexpected modifications automatically. Byonic finds more identifications compared to the combination of Sequest and Percolator.