Patricia Hernandez, Geneva, 28th September 2006. Swiss Bio Grid: Proteomics Project (PP)


1 Swiss Bio Grid: Proteomics Project (PP)
Patricia Hernandez, Geneva, 28th September 2006

2-4 Context: Proteomics
Definition: set of technologies and methodologies for large-scale studies of proteins
- identification / characterization / quantification
Typical proteomic study:
- identify proteins that are differentially expressed between two samples (e.g. normal vs. disease state)
Technology:
- mass spectrometry (MSMS) = mass measurement of protein fragments

5 Identification of proteins: principle
Many tools are available; all work in the same way:
- a LIST OF MSMS SPECTRA, processed sequentially
- a LIST OF POSSIBLE SOLUTIONS, e.g. a list of known protein sequences
Size labels shown on the slide: thousands to millions; tens to thousands

6 Identification of proteins: principle
- solutions are (sequentially) evaluated against the spectra using a COMPARISON FUNCTION
- some display (OUTPUT) of the identified proteins (with/without additional features such as statistics, result export, etc.)
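The principle on these two slides (spectra processed one by one, each candidate solution scored with a comparison function, best match reported) can be sketched roughly as follows. This is only an illustration: the `Spectrum` and `Candidate` types, the shared-peak-count score and the 0.5 mass tolerance are assumptions made for the example, not the scoring scheme of any of the tools named in the talk.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    title: str
    peaks: list  # observed fragment masses (toy values)


@dataclass
class Candidate:
    name: str
    fragment_masses: list  # theoretical fragment masses (toy values)


def shared_peak_count(spectrum, candidate, tolerance=0.5):
    """Toy COMPARISON FUNCTION: number of observed peaks that lie within
    `tolerance` of at least one theoretical fragment mass."""
    return sum(
        1
        for peak in spectrum.peaks
        if any(abs(peak - mass) <= tolerance for mass in candidate.fragment_masses)
    )


def identify(spectra, candidates, score=shared_peak_count):
    """Process the spectra sequentially and keep the best-scoring candidate
    for each one. Returns (spectrum title, candidate name, score) tuples."""
    results = []
    for spectrum in spectra:
        best = max(candidates, key=lambda c: score(spectrum, c))
        results.append((spectrum.title, best.name, score(spectrum, best)))
    return results


if __name__ == "__main__":
    spectra = [Spectrum("s1", [100.0, 200.0, 300.0])]
    candidates = [
        Candidate("A", [100.2, 250.0]),
        Candidate("B", [100.1, 199.8, 300.4]),
    ]
    for row in identify(spectra, candidates):
        print(row)
```

Real tools differ mainly in the comparison function and in how the candidate list is built, which is what makes their results only partially overlapping.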

7-8 Key idea and main objectives of the PP
Key idea: give access, through a single web portal, to several spectrum analysis tools in a workflow-oriented data analysis platform
- the swissPIT platform
Main objectives:
- increase the coverage of identified proteins
- automate analysis workflows
- provide an environment for parameter optimisation studies and for benchmarking

9-10 Motivation
Many identification/characterization tools are available. They differ in several aspects
- e.g. type of database, scoring scheme for the comparison, validation of the results…
and they lead to partially overlapping identifications (Keller, Mol. Syst. Biol., 2005).
Hence the need to use several tools in order to get a better coverage of identified spectra.
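To make the coverage argument concrete, here is a small sketch of how the gain from combining tools could be measured. The tool names, spectrum ids and the `coverage_report` helper are hypothetical; the numbers are made up and are not results from the cited study.

```python
def coverage_report(identified_by_tool, total_spectra):
    """Compare per-tool coverage with the coverage of the combined results.

    identified_by_tool: dict mapping a tool name to the set of spectrum ids
                        that tool identified.
    total_spectra:      number of spectra submitted.
    """
    id_sets = list(identified_by_tool.values())
    combined = set().union(*id_sets)
    by_all = set.intersection(*id_sets) if id_sets else set()
    return {
        "per_tool": {
            tool: len(ids) / total_spectra
            for tool, ids in identified_by_tool.items()
        },
        "combined": len(combined) / total_spectra,
        "identified_by_all": len(by_all),
    }


if __name__ == "__main__":
    # Two hypothetical tools whose identifications only partially overlap.
    report = coverage_report({"tool_a": {1, 2, 3}, "tool_b": {3, 4}}, total_spectra=10)
    print(report)
```

With partially overlapping results, the combined coverage is at least as high as that of the best single tool, and usually higher.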

11 swissPIT overview: three distinct parts
Interaction with the user:
- MSMS data upload
- choice of workflows and parameter configuration
- result visualisation
- data/result sharing

12 swissPIT overview: three distinct parts
Execution of the analysis workflow selected by the user:
- data exploitation or high-throughput centered workflows
- task-specific workflows (= personalized for a given lab)

13 swissPIT overview: three distinct parts
Easy parallelisation:
- in a workflow, several analysis tools may be called at the same time (and independently)
- for a given identification tool, the spectrum list and/or the database can be split into bundles, and each bundle analysed independently
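The bundle-based parallelisation can be sketched as below. This is a minimal illustration under stated assumptions: `analyse_bundle` is a stand-in for one run of an identification tool, and a local thread pool stands in for the grid or cluster, where each bundle would be submitted as a separate job.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_bundles(items, bundle_size):
    """Cut a list (of spectra, or of database entries) into consecutive bundles."""
    return [items[i:i + bundle_size] for i in range(0, len(items), bundle_size)]


def analyse_bundle(bundle):
    """Hypothetical stand-in for running one identification tool on one bundle."""
    return [(spectrum_id, f"result-for-{spectrum_id}") for spectrum_id in bundle]


def analyse_in_parallel(spectrum_ids, bundle_size, max_workers=4):
    """Analyse each bundle independently, then merge the per-bundle results."""
    bundles = split_into_bundles(spectrum_ids, bundle_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input bundles.
        per_bundle = list(pool.map(analyse_bundle, bundles))
    return [hit for bundle_result in per_bundle for hit in bundle_result]


if __name__ == "__main__":
    print(split_into_bundles(list(range(5)), 2))
    print(analyse_in_parallel(["s1", "s2", "s3"], bundle_size=2))
```

Splitting works because each spectrum is scored independently of the others, so merging the per-bundle results is a plain concatenation. When the database is split instead, the merge step has to compare scores across bundles, which this sketch does not cover.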

14 swissPIT overview: three distinct parts
Use of distributed resources:
- each site decides which databases and tools to install and maintain
- corresponds to the "reality": research groups and proteomics facilities are geographically scattered and need to collaborate

15 Current status of swissPIT
- Web-based interface
- 4 protein identification tools: Phenyx, X!Tandem, Popitam, InsPecT
- 2 protein sequence databases: UniProtKB/Swiss-Prot (> 230’000 entries), UniProtKB/TrEMBL (> 3’180’000 entries)
- swissBioGrid compatible (submission to a grid is transparent for the user)

16-21 Current status: swissPIT from inside
[Diagram built up over six slides. Labels: User layer, System layer, submit mgf, Grid/cluster]
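The diagram shows the user submitting an mgf file. For readers unfamiliar with it, MGF (Mascot Generic Format) is a plain-text peak-list format in which each spectrum sits between `BEGIN IONS` and `END IONS`, with `KEY=value` header lines followed by mass/intensity pairs. The reader below is a minimal sketch covering only that common subset; it is not the parser used by swissPIT, and the sample spectrum is invented.

```python
def parse_mgf(text):
    """Read spectra from MGF text.

    Returns a list of dicts with 'params' (header fields such as TITLE,
    PEPMASS, CHARGE) and 'peaks' (list of (m/z, intensity) tuples).
    Anything outside BEGIN IONS / END IONS blocks is ignored.
    """
    spectra = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                fields = line.split()
                mz = float(fields[0])
                # Intensity is treated as optional in this sketch.
                intensity = float(fields[1]) if len(fields) > 1 else 0.0
                current["peaks"].append((mz, intensity))
    return spectra


if __name__ == "__main__":
    sample = (
        "BEGIN IONS\n"
        "TITLE=spec1\n"
        "PEPMASS=500.25\n"
        "CHARGE=2+\n"
        "100.1 10\n"
        "200.2 20.5\n"
        "END IONS\n"
    )
    for spectrum in parse_mgf(sample):
        print(spectrum["params"]["TITLE"], len(spectrum["peaks"]), "peaks")
```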

22 Current status: swissPIT from outside
[Screenshot labels: http://swisspit.cscs.ch/, username, pwd]

23 Current status: swissPIT from outside
Some global parameters

24 Current status: swissPIT from outside
List of installed software. Check/uncheck boxes to select the software to be run on the data

25-26 Current status: swissPIT from outside
Click on the link to display and configure software-specific parameters. Press to run the software

27 Current status: swissPIT from outside
Go to the user space. Browse old/new projects

28 Current status: swissPIT from outside
Results are visualized in native format:
- raw text for Popitam, InsPecT
- XML with style sheet for X!Tandem
- advanced Java interface for Phenyx

29-31 Upcoming work
Code improvement
- improve readability and maintainability of the code
Standardisation
- unify parameters as much as possible
- display results in one format
Workflows
- find a way to implement workflows using XML configuration files
Example workflow steps shown on the slide: screen for unsuspected modifications; screen for proteins; remove spectra with low peak statistics
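As an illustration of what a workflow described in an XML configuration file might look like, here is a small sketch. Everything about the format is an assumption: the `<workflow>` and `<step>` elements, the `id`, `action` and `after` attributes, and the ordering of the three steps are invented for the example and do not describe the actual swissPIT design.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow description; the step labels come from the slide,
# the schema and the dependencies between steps are assumed.
WORKFLOW_XML = """
<workflow name="example">
  <step id="filter" action="remove spectra with low peak statistics"/>
  <step id="proteins" action="screen for proteins" after="filter"/>
  <step id="modifications" action="screen for unsuspected modifications" after="proteins"/>
</workflow>
"""


def load_steps(xml_text):
    """Parse the workflow file into a list of step dictionaries."""
    root = ET.fromstring(xml_text.strip())
    return [
        {"id": step.get("id"), "action": step.get("action"), "after": step.get("after")}
        for step in root.findall("step")
    ]


def execution_order(steps):
    """Order step ids so that each step runs after the one named in 'after'."""
    done = set()
    ordered = []
    pending = list(steps)
    while pending:
        ready = [s for s in pending if s["after"] is None or s["after"] in done]
        if not ready:
            raise ValueError("cyclic or unresolved 'after' reference")
        for step in ready:
            ordered.append(step["id"])
            done.add(step["id"])
        pending = [s for s in pending if s["id"] not in done]
    return ordered


if __name__ == "__main__":
    steps = load_steps(WORKFLOW_XML)
    print(execution_order(steps))
```

Keeping the dependencies in the file rather than in code means a lab-specific workflow could be added without touching the platform itself, which is in line with the task-specific workflows mentioned earlier in the talk.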

32 Involved people, acknowledgments
Ron Appel, Patricia Hernandez, Celine Hernandez, Andreas Quandt, Marc Tuloup, Pierre-Alain Binz, Markus Müller
Alexandre Masselot, Nicolas Budin
Vital-it team + Bruno Nyffeler
Peter Kunszt, Sergio Maffioletti, Arthur Thomas

33 Involved people, acknowledgments
Thank you for your attention

