Published by Gwen Jacobs. Modified over 9 years ago.
1 Daehee Hwang, Leroy Hood. Institute for Systems Biology
2 Why Prequips for systems biology with proteomic data?
- Need for visualization, analysis, and integration of multiple proteomic datasets at the raw data level, the peptide level, the protein level, and in multi-sample analysis.
- Need for an interface between proteomic data and systems biology analytical tools such as network and pathway analyses.
3 Integration of proteomic data at various levels
The diagram shows the same Trans-Proteomic Pipeline chain three times: Raw Data (MS, MS/MS); Peptide Id + Quantitation; Protein Id + Quantitation; followed by an open question mark (?). Communication between the chains is not possible!
4 Pep3D: Quality Assessment
Pep3D properties:
- quality assessment
- 2D gel-like visualization
Diagram labels: Prequips (Multi Sample); Trans-Proteomic Pipeline: Raw Data (MS, MS/MS), Peptide Id + Quantitation, Protein Id + Quantitation, ?; Gaggle; Network Analysis: Cytoscape; Interaction Database: STRING; Pathway Database: KEGG; Microarray Data Analysis: Mayday, TIGR.
5 Pep3D: Quality Assessment
Pep3D Instance 1 and Pep3D Instance 2: communication not possible!
6 Interface to Systems Biology
Diagram labels: Trans-Proteomic Pipeline: Raw Data (MS, MS/MS), Peptide Id + Quantitation, Protein Id + Quantitation, ?; Gaggle; Network Analysis: Cytoscape; Interaction Database: STRING; Pathway Database: KEGG; Microarray Data Analysis: Mayday, TIGR. Communication not possible!
7 Prequips Overview
Key properties:
- handles multiple samples at all levels
- integrates high-level analysis tools
- is extensible
Diagram labels: Prequips (Multi Sample); Trans-Proteomic Pipeline: Raw Data (MS, MS/MS), Peptide Id + Quantitation, Protein Id + Quantitation, ?; Gaggle; Network Analysis: Cytoscape; Interaction Database: STRING; Pathway Database: KEGG; Microarray Data Analysis: Mayday, TIGR.
8 Integration of proteomic datasets at various levels
- Mass Spectrometer: raw data (e.g. mzXML, mzData, ...)
- Trans-Proteomic Pipeline steps: Database Search, Validation, Peptide Quantification, Protein Inference, Protein Quantitation
- peptide-level data (e.g. pepXML, AnalysisXML, ...)
- protein-level data (e.g. protXML, ...)
- annotation; further analysis results
9 Data model
- Levels: Raw Data, Peptide Level, Protein Level
- Diagram labels: Core, Meta, Single-Sample Analysis, Multi-Sample Analysis, Project, Data Providers, Data Structures, Viewers, Perspectives
- Raw data level: e.g. mzXML or mzData files
- Peptide-level data source: e.g. pepXML, dta or AnalysisXML files
- Protein-level data source: e.g. protXML files
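The data model on this slide can be illustrated with a small sketch. It is not Prequips code (Prequips is written in Java); it only shows one way a project could group several samples and sort their data sources into the raw, peptide, and protein levels. The suffix table, `data_level`, `Sample`, and `Project` are all hypothetical names, and matching on the file suffix alone is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Hypothetical mapping from file suffix to data level, following the example
# formats named on the slide. Real files may use other naming schemes.
LEVEL_BY_SUFFIX = {
    ".mzxml": "raw",
    ".mzdata": "raw",
    ".pepxml": "peptide",
    ".dta": "peptide",
    ".protxml": "protein",
}


def data_level(path: str) -> str:
    """Return the data level for a file, based only on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in LEVEL_BY_SUFFIX:
        raise ValueError(f"no data provider registered for {path!r}")
    return LEVEL_BY_SUFFIX[suffix]


def _empty_levels() -> dict:
    return {"raw": [], "peptide": [], "protein": []}


@dataclass
class Sample:
    """One sample with its data sources grouped by level."""
    name: str
    sources: dict = field(default_factory=_empty_levels)

    def add_source(self, path: str) -> None:
        self.sources[data_level(path)].append(path)


@dataclass
class Project:
    """A project holds several samples so they can be analysed together."""
    name: str
    samples: list = field(default_factory=list)

    def sources_at(self, level: str) -> dict:
        """Collect the sources of every sample at one level (multi-sample view)."""
        return {s.name: list(s.sources[level]) for s in self.samples}
```

Keeping the level lookup separate from `Sample` mirrors the slide's split between data providers and data structures: a new file format would only need a new table entry.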
10 Case Study: Toponomic change in drug-treated Mø
[Figure: proteins Calreticulin, BiP, Bcl2, ATPase, Lamp1 across fraction # 2 to 20 (in steps of 2), with labels 8% and 28%; labels 114, 115, 116, 117; samples Mock 1, Mock 2, Thapsigargin.]
11 Visualization: Single experiment
Screenshot annotations:
- project manager
- all scans of Mock 1 experiment
- peak map for run 29
- level 1 spectrum & corresponding CID spectra (level 1, level 2)
- CID spectra that have been selected
- detailed information about one of the level 2 spectra
12 Visualization: Multiple experiments
(Polymer?) contamination in all 4 runs (this would be hard to see with Pep3D). Color scale: green = 0, red = 1.
13 Visualization: assess, quantify, etc.
[Mock-up (software is under development): peak maps (map 1 to map 6) with axes m/z (min to max) and retention time (min to max). In a comparison of map 1 to map 4, one map is marked X X X with the note: "Doesn't really match the remaining 3 maps!"]
14 Prequips & the Gaggle
Diagram labels: Gaggle Boss; Prequips; Mayday; R statistical environment; Cytoscape; KEGG; DAVID; Browser.
Exchange of data structures such as name lists, lists of name-value pairs, matrices, and networks.
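The hub-and-spoke exchange on this slide can be sketched as follows. This is an illustration only and does not reproduce the real Gaggle interfaces: the type aliases, the `Boss` class, and its `register` and `broadcast` methods are hypothetical, chosen to show how a central boss can relay the four kinds of data structures between tools.

```python
from typing import Callable, Dict, List, Tuple, Union

# The four kinds of data structures named on the slide, modelled with plain
# Python types. These aliases are illustrative, not Gaggle's own types.
NameList = List[str]
NameValuePairs = List[Tuple[str, float]]
Matrix = Dict[str, List[float]]   # row name -> values
Network = List[Tuple[str, str]]   # edges between named nodes
Message = Union[NameList, NameValuePairs, Matrix, Network]
Handler = Callable[[str, Message], None]


class Boss:
    """Hypothetical hub that relays messages between registered tools."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, tool: str, handler: Handler) -> None:
        self._handlers[tool] = handler

    def broadcast(self, source: str, message: Message, target: str = "all") -> int:
        """Deliver a message to one tool, or to all tools except the sender.

        Returns the number of tools that received the message.
        """
        delivered = 0
        for tool, handler in self._handlers.items():
            if tool == source or target not in ("all", tool):
                continue
            handler(source, message)
            delivered += 1
        return delivered
```

Because every tool talks only to the boss, adding a new tool requires one registration rather than a pairwise connection to each existing tool.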
15 Mayday
16 Cytoscape
Overall mouse protein/protein interaction map in Cytoscape.
17 Analysis: Feature extraction
Screenshot annotations: protein table; filters; Gaggle plugin for interaction with other tools.
18 Analysis: Feature extraction
Gaggle plugin: selection for broadcast (calreticulin).
19 Analysis: Feature selection
Samples: Mock 1, Mock 2, Thapsigargin.
20 Broadcast to Gaggle
21 Prequips to Gaggle
[Gaggle diagram as on slide 14.]
22 Gaggle Boss
23 Gaggle to Cytoscape
[Gaggle diagram as on slide 14.]
24 Integration: Network Analysis
Network labels: proteasome complex; ribosome large subunit; chaperones; actin filament regulation. Legend: Thapsigargin 114 iTRAQ ratio.
25 Cytoscape to Prequips
[Gaggle diagram as on slide 14.]
26 Analysis: Feature extraction - Module selection
The ids sent from Cytoscape through the Gaggle: proteasome proteins.
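A minimal sketch of what this step amounts to: a list of ids arrives from another tool, and the protein table is narrowed to the matching rows. The function name `select_rows`, the `id` key, and the placeholder ids are assumptions made for illustration; the slides do not describe how Prequips implements the selection.

```python
def select_rows(protein_table, received_ids):
    """Return the rows whose 'id' is among the received ids, in table order."""
    wanted = set(received_ids)
    return [row for row in protein_table if row["id"] in wanted]


# Placeholder table and ids, not real data.
protein_table = [
    {"id": "P1", "sample": "Mock1"},
    {"id": "P2", "sample": "Mock1"},
    {"id": "P3", "sample": "Thapsigargin"},
]
module = select_rows(protein_table, ["P3", "P1", "P9"])
```

Ids that are not present in the table (here `P9`) are simply ignored, so a broadcast from a larger network does not fail on proteins that were never identified.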
27 Prequips & the Gaggle
[Gaggle diagram as on slide 14.]
28 Analysis: Functional enrichment
The proteasome complex is enriched compared to a mouse genome background.
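The slide does not say which statistic underlies the enrichment result. A common choice for this kind of question is a one-sided hypergeometric test, sketched here purely as an assumption: given a background of `background_size` genes of which `background_hits` carry an annotation, it computes the probability of seeing at least `selection_hits` annotated genes in a selection of `selection_size`. All names are hypothetical.

```python
from math import comb


def enrichment_p_value(background_size, background_hits, selection_size, selection_hits):
    """One-sided hypergeometric tail probability P(X >= selection_hits).

    X counts annotated items when selection_size items are drawn without
    replacement from a background containing background_hits annotated items.
    """
    upper = min(background_hits, selection_size)
    total = comb(background_size, selection_size)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, selection_size - k)
        for k in range(selection_hits, upper + 1)
    )
    return tail / total
```

A small p-value means the annotation appears in the selection more often than random draws from the background would explain. Tools that test many annotations at once usually also correct for multiple testing, which this sketch omits.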
29 Prequips Summary
Key properties:
- handles multiple samples at all levels
- integrates high-level analysis tools
- is extensible
Diagram labels: Prequips (Multi Sample); Trans-Proteomic Pipeline: Raw Data (MS, MS/MS), Peptide Id + Quantitation, Protein Id + Quantitation, ?; Gaggle; Network Analysis: Cytoscape; Interaction Database: STRING; Pathway Database: KEGG; Microarray Data Analysis: Mayday, TIGR.
30 Conclusion
- General and extensible software for systems biology research with proteomics mass spectrometry data.
- Capability to integrate data from various sources for visualization and analysis.
- An interactive environment that supports (visual) data exploration.
31 Software details
- implemented in Java, based on the Eclipse Rich Client Platform
- extremely modular architecture
- multiple plugin interfaces
  - e.g. viewers, data providers, algorithms
- meta information framework
  - analysis results, sequence information, annotation, ...
  - data structures as plugins
  - requirement to support future analytical tools and data sources
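The idea of multiple plugin interfaces can be sketched with a small registry. This is an illustration in Python, not the Eclipse Rich Client Platform mechanism that Prequips builds on; `ExtensionRegistry` and its methods are hypothetical names. Each named extension point (viewers, data providers, algorithms) accepts contributions without the core needing to know them in advance.

```python
from typing import Callable, Dict, List


class ExtensionRegistry:
    """Hypothetical registry: plugins contribute callables to named extension points."""

    def __init__(self, extension_points: List[str]) -> None:
        self._points: Dict[str, Dict[str, Callable]] = {p: {} for p in extension_points}

    def contribute(self, point: str, name: str, plugin: Callable) -> None:
        """Add a plugin under an extension point, rejecting unknown points and duplicates."""
        if point not in self._points:
            raise KeyError(f"unknown extension point: {point}")
        if name in self._points[point]:
            raise ValueError(f"{name!r} is already registered for {point!r}")
        self._points[point][name] = plugin

    def plugins(self, point: str) -> List[str]:
        """List the plugin names contributed to one extension point."""
        return sorted(self._points[point])

    def get(self, point: str, name: str) -> Callable:
        return self._points[point][name]


registry = ExtensionRegistry(["viewers", "data providers", "algorithms"])
registry.contribute("algorithms", "count", len)
```

Supporting a future tool or data source then means contributing one more plugin, which matches the extensibility requirement listed on the slide.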
32 Acknowledgements
- Special thanks to Nils Gehlenborg
- Hood Lab: Inyoul Lee
- Kay Nieselt
- Aebersold Lab: Nichole King, James Eddes, Eric Deutsch, Ning Zhang, David Shteynberg, Wei Yan, and Andrew Garbutt
- Paul Shannon for help with the Gaggle
33 Prequips (diagram labels): Core; Mayday; Database; Gaggle; R; Visualization; Excel; PostgreSQL database; MySQL database; R environment; Bioconductor; SBEAMS installation; Machine Learning; WEKA Library; anything else.
34 Cytoscape