Published by Lynne Hawkins. Modified over 9 years ago.
1
Swiss Bio Grid: Proteomics Project (PP)
Patricia Hernandez
Geneva, 28th September 2006
4
Context: Proteomics
Definition: set of technologies and methodologies for large-scale studies of proteins (identification / characterization / quantification)
Typical proteomic study: identify proteins that are differentially expressed between two samples (e.g. normal vs disease state)
Technology: mass spectrometry (MSMS) = mass measurement of protein fragments
5
Identification of proteins: principle
Many available tools; all work in the same way:
- a LIST OF MSMS SPECTRA, processed sequentially (tens to thousands)
- a LIST OF POSSIBLE SOLUTIONS, e.g. a list of known protein sequences (thousands to millions)
6
Identification of proteins: principle
- solutions are (sequentially) evaluated against the spectra using a COMPARISON FUNCTION
- some display (OUTPUT) of the identified proteins (with/without additional features such as statistics, result export, etc.)
(spectra: tens to thousands; possible solutions: thousands to millions)
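The principle on this slide can be sketched in a few lines of Python. This is only a toy illustration, not how any of the tools named in this talk actually score: the comparison function is a simple shared-peak count, the candidate "solutions" are assumed to come with precomputed theoretical fragment masses, and all names (`shared_peak_count`, `identify`, `PEPTIDE_A`, the mass values) are made up for the example.

```python
from typing import Dict, List, Tuple

Spectrum = List[float]  # fragment masses (m/z values)


def shared_peak_count(observed: Spectrum, theoretical: Spectrum, tol: float = 0.5) -> int:
    """Toy COMPARISON FUNCTION: count observed peaks that lie within
    `tol` of at least one theoretical peak."""
    return sum(1 for m in observed if any(abs(m - t) <= tol for t in theoretical))


def identify(
    spectra: Dict[str, Spectrum],
    candidates: Dict[str, Spectrum],
    tol: float = 0.5,
) -> Dict[str, Tuple[str, int]]:
    """Process the spectra sequentially; for each one, evaluate every
    candidate solution and keep the best-scoring candidate."""
    results: Dict[str, Tuple[str, int]] = {}
    for spec_id, observed in spectra.items():
        best_name, best_peaks = max(
            candidates.items(),
            key=lambda kv: shared_peak_count(observed, kv[1], tol),
        )
        results[spec_id] = (best_name, shared_peak_count(observed, best_peaks, tol))
    return results


# Made-up input: two spectra, two candidate solutions.
spectra = {"scan1": [100.1, 200.2, 300.3], "scan2": [150.0, 250.0]}
candidates = {
    "PEPTIDE_A": [100.0, 200.0, 300.0],
    "PEPTIDE_B": [150.1, 250.2, 400.0],
}

# OUTPUT: best candidate and its score for each spectrum.
hits = identify(spectra, candidates)
for spec_id, (name, score) in hits.items():
    print(spec_id, name, score)
```

The nested loop (every spectrum against every candidate) is what makes the list sizes on the slide matter: the work grows with the product of the two lists.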
8
Key idea and main objectives of the PP
Key idea: give access, through a single web portal, to several spectrum analysis programs in a workflow-oriented data analysis platform: the swissPIT platform
Main objectives:
- increase the coverage of identified proteins
- automate analysis workflows
- provide an environment for parameter optimisation studies and for benchmarking
10
Motivation
- many identification/characterization tools are available that differ in several aspects (e.g. type of database, scoring scheme for the comparison, validation of the results…) and that lead to partially overlapping identifications (Keller, Mol. Syst. Biol., 2005)
- hence the need to use several software tools in order to get a better coverage of identified spectra
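To make the coverage argument concrete, here is a small sketch that compares the sets of spectra identified by several tools. The tool names (`tool_a`, `tool_b`, `tool_c`) and spectrum identifiers are invented placeholders, not results from any real program; the function `coverage_report` is likewise hypothetical.

```python
from typing import Dict, Set


def coverage_report(ids_by_tool: Dict[str, Set[str]]) -> Dict[str, object]:
    """Summarise how the identifications of several tools overlap.

    union  : spectra identified by at least one tool (combined coverage)
    common : spectra identified by every tool
    unique : per tool, spectra that no other tool identified
    """
    all_sets = list(ids_by_tool.values())
    union: Set[str] = set().union(*all_sets)
    common: Set[str] = set.intersection(*all_sets) if all_sets else set()
    unique = {
        tool: ids - set().union(*(other for t, other in ids_by_tool.items() if t != tool))
        for tool, ids in ids_by_tool.items()
    }
    return {"union": union, "common": common, "unique": unique}


# Invented example: three tools with partially overlapping identifications.
ids_by_tool = {
    "tool_a": {"s1", "s2", "s3"},
    "tool_b": {"s2", "s3", "s4"},
    "tool_c": {"s3", "s5"},
}
report = coverage_report(ids_by_tool)
print(len(report["union"]), "spectra identified by at least one tool")
print(len(report["common"]), "spectra identified by all tools")
```

In this invented case each tool alone identifies at most three spectra, while the union covers five, which is the effect the slide argues for.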
11
swissPIT overview: three distinct parts
Interaction with the user:
- MSMS data upload
- choice of workflows and parameter configuration
- result visualisation
- data/result sharing
12
swissPIT overview: three distinct parts
Execution of the analysis workflow selected by the user:
- data exploitation or high-throughput centered workflows
- task-specific workflows (= personalized for a given lab)
13
swissPIT overview: three distinct parts
Easy parallelisation:
- in a workflow, several analysis tools may be called at the same time (and independently)
- for a given identification tool, the spectrum list and/or the database can be split into bundles, and each bundle analysed independently
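The bundle idea can be sketched as follows. This is an assumption-laden illustration: local threads stand in for grid or cluster jobs, `analyse_bundle` is a hypothetical placeholder for a call to an identification tool, and the bundle size is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def split_into_bundles(items: Sequence[str], bundle_size: int) -> List[Sequence[str]]:
    """Cut a list (spectra or database entries) into consecutive bundles."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    return [items[i:i + bundle_size] for i in range(0, len(items), bundle_size)]


def analyse_bundle(bundle: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for running one identification tool on one bundle."""
    return [f"{spectrum_id}:analysed" for spectrum_id in bundle]


def run_in_parallel(
    items: Sequence[str],
    bundle_size: int,
    worker: Callable[[Sequence[str]], List[str]],
    max_workers: int = 4,
) -> List[str]:
    """Analyse each bundle independently and merge the results in input order."""
    bundles = split_into_bundles(items, bundle_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_bundle = list(pool.map(worker, bundles))  # map() preserves bundle order
    return [result for bundle_result in per_bundle for result in bundle_result]


spectrum_ids = [f"scan{i}" for i in range(1, 8)]
results = run_in_parallel(spectrum_ids, bundle_size=3, worker=analyse_bundle)
print(results)
```

Because the bundles share no state, the same split works whether the workers are threads, processes, or jobs submitted to remote nodes; only the merge step needs to see all results.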
14
swissPIT overview: three distinct parts
Use of distributed resources:
- each site decides which databases and tools to install and maintain
- this corresponds to the « reality »: research groups and proteomics facilities are geographically scattered and need to collaborate
15
Current status of swissPIT
- Web-based interface
- 4 protein identification tools: Phenyx, X!Tandem, Popitam, InsPecT
- 2 protein sequence databases: UniProtKB/Swiss-Prot (> 230’000 entries), UniProtKB/TrEMBL (> 3’180’000 entries)
- swissBioGrid compatible (submission to a grid is transparent for the user)
16
Current status: swissPIT from inside
[Diagram: User layer | System layer]
18
Current status: swissPIT from inside
[Diagram: User layer | System layer; label: submit mgf]
20
Current status: swissPIT from inside
[Diagram: User layer | System layer; labels: submit mgf, Grid/cluster]
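For readers unfamiliar with the "mgf" input mentioned in the diagram: MGF (Mascot Generic Format) is a plain-text peak list in which each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=value` lines (such as `TITLE`, `PEPMASS`, `CHARGE`) followed by "m/z intensity" pairs. The reader below is a minimal sketch written for this transcript, not code from swissPIT; it ignores file-level headers and other details of the format, and the sample values are invented.

```python
from typing import Dict, List


def parse_mgf(text: str) -> List[Dict[str, object]]:
    """Minimal MGF reader: one dict per BEGIN IONS / END IONS block,
    with the KEY=value lines under "params" and the peak list under "peaks"."""
    spectra: List[Dict[str, object]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                fields = line.split()
                mz = float(fields[0])
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


# Invented sample file with two spectra.
SAMPLE_MGF = """\
BEGIN IONS
TITLE=scan1
PEPMASS=500.25
CHARGE=2+
100.1 10
200.2 20.5
END IONS
BEGIN IONS
TITLE=scan2
PEPMASS=612.3
150.0 5
END IONS
"""

parsed = parse_mgf(SAMPLE_MGF)
print(len(parsed), "spectra read")
```

Since each block is self-contained, a file like this can be cut at `END IONS` boundaries, which is what makes splitting a spectrum list into independent bundles straightforward.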
22
Current status: swissPIT from outside
http://swisspit.cscs.ch/
[Screenshot: login with username and pwd]
23
Current status: swissPIT from outside
- some global parameters
24
Current status: swissPIT from outside
- list of the installed software
- check/uncheck boxes to select the software to be run on the data
26
Current status: swissPIT from outside
- click on the link to display and configure software-specific parameters
- press to run the software
27
Current status: swissPIT from outside
- go to the user space
- browse old/new projects
28
Current status: swissPIT from outside
Results are visualized in native format:
- raw text for Popitam, InsPecT
- XML with style sheet for X!Tandem
- advanced Java interface for Phenyx
31
Upcoming work
Code improvement: improve readability and maintainability of the code
Standardisation:
- unify parameters as much as possible
- display results in one format
Workflows: find a way to implement workflows using XML configuration files
[Workflow diagram labels: screen for unsuspected modifications; screen for proteins; remove spectra with low peak statistics]
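One possible shape for an XML-configured workflow is sketched below. Everything here is an assumption made for illustration: the element and attribute names (`workflow`, `step`, `param`, `after`), the tool names, and the ordering of the three steps are invented and do not describe an actual swissPIT configuration format.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Tuple

# Hypothetical configuration: each step names a tool and, optionally,
# the step that must finish before it ("after").
WORKFLOW_XML = """
<workflow name="example">
  <step id="filter" tool="spectrum_filter">
    <param name="min_peaks" value="10"/>
  </step>
  <step id="identify" tool="identification_tool" after="filter"/>
  <step id="modifications" tool="modification_screen" after="identify"/>
</workflow>
"""


def load_workflow(xml_text: str) -> Tuple[Optional[str], List[Dict[str, object]]]:
    """Read the workflow name and its steps from the XML text."""
    root = ET.fromstring(xml_text)
    steps = []
    for step in root.findall("step"):
        steps.append({
            "id": step.get("id"),
            "tool": step.get("tool"),
            "after": step.get("after"),  # None when the step has no prerequisite
            "params": {p.get("name"): p.get("value") for p in step.findall("param")},
        })
    return root.get("name"), steps


def execution_order(steps: List[Dict[str, object]]) -> List[str]:
    """Order the steps so that each one runs after its prerequisite."""
    done, order = set(), []
    pending = list(steps)
    while pending:
        ready = [s for s in pending if s["after"] is None or s["after"] in done]
        if not ready:
            raise ValueError("workflow has an unresolvable dependency")
        for s in ready:
            order.append(s["id"])
            done.add(s["id"])
        pending = [s for s in pending if s["id"] not in done]
    return order


name, steps = load_workflow(WORKFLOW_XML)
print(name, execution_order(steps))
```

Keeping the dependency in an attribute rather than relying on document order means steps with no mutual dependency can be detected as ready in the same pass, which fits the parallel execution described earlier in the talk.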
32
Involved people, acknowledgments
Ron Appel, Patricia Hernandez, Celine Hernandez, Andreas Quandt, Marc Tuloup, Pierre-Alain Binz, Markus Müller
Alexandre Masselot, Nicolas Budin
Vital-it team + Bruno Nyffeler
Peter Kunszt, Sergio Maffioletti, Arthur Thomas
33
Involved people, acknowledgments
Thank you for your attention