1
Using Spotfire for Proteomic Analysis
Steve Marshall, Bioinformatics Scientist, Caprion Pharmaceuticals, October 2003
2
Caprion Overview
Montreal-based private company founded in 1998
80 employees (27 PhD, 30 MSc)
Collaborations in prion disease diagnostics, tumor immunotherapy and biomarkers of disease
3
CellCarta PROTEOMICS
Fractionate and enrich biological samples containing disease-related proteins
Comprehensively and quantitatively compare proteomes from normal and diseased samples
4
CellCarta PROGRAMS
Target discovery
  Oncology: tumor antigen targets
  Metabolic diseases: therapeutic proteins
Biomarker discovery
  Pre-clinical and clinical markers of therapeutic efficacy
  Diagnostics development
Mechanism of action studies
  Protein expression profiling
  Sub-cellular protein location and translocation
  Phosphorylation and other post-translational modifications
5
Proteomics Pipeline
Purified Sample
Protein Separation
LC-MS Injections (raw data)
Expression Profiling: Custom Tools
Protein Identification & Peptide Sequencing: Mascot & Custom Tools
Validation of Expression & Identification: Custom Tools
Protein Annotation & Analysis (function, location, IP, novelty, expression, etc.): ??
Protein reports
On a per-sample basis, we go from about 2 GB of data down to a couple of MB.
6
Proteomics Pipeline (cont’d)
Annotation
  Automatic: computationally derived (InterPro, SignalP, TMHMM)
  Manual curation (Tuva)
We developed a web-based prototype called Tuva that enabled our scientists to provide additional scientific information (e.g. protein location, disease mechanism, etc.).
The prototype served its purpose by giving our scientists a means of accessing the data and asking different questions about it.
However, it didn’t scale well and wasn’t flexible. We needed a more viable long-term solution.
7
Information Overload?
External data: genome, literature
Internal data: peptide expression, lab info, sample info
How do you cope with all this data? Scientists also need to play with it and get a feel for the data. How do you manage it?
8
Data Analysis
Scientists require a tool that:
  Allows them to navigate the data
  Lets them interact with it dynamically
  Links various data sources
  Automates certain tasks
  Has a short learning curve
  Supports presentation of results
Bioinformatics / IT requirements:
  Scalable
  Customizable
  Extensible
With this in mind, we felt that it would be better to bring in a third-party tool that could help us in this process.
With the increasing amounts of data available to a scientist, visualization and integration of data sets are necessary.
Visualization lets you find patterns, clusters, relationships, etc., and identify outliers and anomalous data.
Integration links information together.
We needed an interactive tool that is easy to use, scalable, customizable and extensible.
The next slide shows where Spotfire fits in the proteomics platform at Caprion.
9
Proteomics Pipeline
LC-MS Injections (raw data)
Expression Profiling: Custom Tools
Peptide Sequencing & Protein Identification: Mascot & Custom Tools
Validation of Expression & Identification: Custom Tools
Protein Analysis (function, location, IP, novelty, expression, etc.): DecisionSite
Protein reports
On a per-sample basis, we go from about 2 GB of data down to a couple of MB.
10
Proteomics Pipeline (cont’d)
We see Spotfire as a platform to build upon.
Using an Information Link, an SQL query, or a join by column, the scientist can view different types of information together.
The annotation db (which is based on protein accession from NCBI) has information such as TM domains, InterPro domains, signal peptides, etc.
Sample info has information such as band, patient, etc.
In expression file
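The "join data by column" idea above can be illustrated outside DecisionSite with a minimal Python sketch. It is not Caprion's implementation or Spotfire's API: the table layouts, the field names (`accession`, `sample`, `ratio`, `tm_domains`, `signal_peptide`, `band`, `patient`) and all values are hypothetical, chosen only to mirror the annotation, sample and expression sources named on the slide.

```python
# Hypothetical tables; every identifier and value here is made up.
annotation = {  # keyed by protein accession
    "ACC_1": {"tm_domains": 2, "signal_peptide": True},
    "ACC_2": {"tm_domains": 0, "signal_peptide": False},
}
sample_info = {  # keyed by sample id
    "S1": {"band": 4, "patient": "A"},
    "S2": {"band": 7, "patient": "B"},
}
expression = [  # one row per protein observed in a sample
    {"accession": "ACC_1", "sample": "S1", "ratio": 2.4},
    {"accession": "ACC_2", "sample": "S1", "ratio": 0.7},
    {"accession": "ACC_3", "sample": "S2", "ratio": 1.9},
]


def join_rows(expression, annotation, sample_info):
    """Left-join expression rows to annotation (by accession)
    and to sample info (by sample id)."""
    joined = []
    for row in expression:
        merged = dict(row)  # copy so the input rows are not modified
        # Rows without a match simply keep their original columns.
        merged.update(annotation.get(row["accession"], {}))
        merged.update(sample_info.get(row["sample"], {}))
        joined.append(merged)
    return joined


rows = join_rows(expression, annotation, sample_info)
```

A left join is used here so that an expression row with no annotation yet (`ACC_3` in this toy data) is still visible to the scientist instead of silently disappearing.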
11
Workflow Example
Once proteins are identified, it is necessary to collapse the data into a more manageable form.
We accomplish this by clustering the data either by peptides or by protein homology within a given sample.
This way, biologists can choose how to view the data.
Demonstration: this demo shows how we annotate clusters of proteins for a given sample, in this case some old Caco data.
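The slide does not say how the peptide-based clustering is done. As one possible reading, the sketch below groups proteins within a sample whenever they share at least one identified peptide, using a union-find structure. The function name `cluster_by_shared_peptides`, the input layout and the protein and peptide labels are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_by_shared_peptides(protein_peptides):
    """Group proteins that share at least one peptide.

    protein_peptides maps a protein id to the set of peptides
    identified for it in one sample. Sharing is transitive: if A
    shares a peptide with B, and B with C, all three are grouped.
    """
    parent = {p: p for p in protein_peptides}

    def find(p):
        # Walk up to the cluster representative, shortening the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_owner = {}  # peptide -> first protein seen with it
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            if peptide in first_owner:
                union(first_owner[peptide], protein)
            else:
                first_owner[peptide] = protein

    clusters = defaultdict(set)
    for protein in protein_peptides:
        clusters[find(protein)].add(protein)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sample: prot_A and prot_B share pep2, prot_C stands alone.
sample = {
    "prot_A": {"pep1", "pep2"},
    "prot_B": {"pep2"},
    "prot_C": {"pep3"},
}
clusters = cluster_by_shared_peptides(sample)
```

Clustering by protein homology, the other option mentioned on the slide, would need sequence comparison and is not covered by this sketch.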
12
Workflow Example (cont’d)
13
Workflow Example (cont’d)
14
Queries??
Once the annotation process is complete, some typical queries include:
  Return all PM-localized peptides over-expressed in disease in 40% of patients
  Find the intersection of all peptides or proteins that are upregulated in condition 1 vs 2 and condition 3 vs 4
  Find all proteins that span X gel bands, or identify discrepancies between a protein’s theoretical and actual MW
DecisionSite gives the bench scientist the power to analyze data easily.
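To make the query types above concrete, here is a small Python sketch of three of them as plain filters. It is an illustration only, not how DecisionSite evaluates them. The record layouts, the fold-change cutoff of 2.0 for "over-expressed", the reading of "in 40% of patients" as "at least 40%", and the 20% MW tolerance are all assumptions.

```python
def overexpressed_fraction(ratios_by_patient, threshold=2.0):
    """Fraction of patients whose disease/normal ratio meets the threshold."""
    if not ratios_by_patient:
        return 0.0
    hits = sum(1 for r in ratios_by_patient.values() if r >= threshold)
    return hits / len(ratios_by_patient)


def pm_overexpressed(peptides, min_fraction=0.4, threshold=2.0):
    """Ids of PM-localized peptides over-expressed in enough patients."""
    return [
        p["id"]
        for p in peptides
        if p["location"] == "PM"
        and overexpressed_fraction(p["ratios"], threshold) >= min_fraction
    ]


def upregulated_in_both(up_1_vs_2, up_3_vs_4):
    """Intersection of two lists of upregulated peptide or protein ids."""
    return set(up_1_vs_2) & set(up_3_vs_4)


def mw_discrepancies(proteins, tolerance=0.2):
    """Ids of proteins whose observed MW differs from the theoretical MW
    by more than the given relative tolerance."""
    return [
        p["id"]
        for p in proteins
        if abs(p["observed_mw"] - p["theoretical_mw"]) / p["theoretical_mw"]
        > tolerance
    ]


# Hypothetical records for illustration.
peptides = [
    {"id": "pepA", "location": "PM",
     "ratios": {"P1": 2.5, "P2": 3.0, "P3": 2.2, "P4": 0.9, "P5": 1.1}},
    {"id": "pepB", "location": "PM",
     "ratios": {"P1": 2.5, "P2": 1.0, "P3": 1.0, "P4": 1.0, "P5": 1.0}},
    {"id": "pepC", "location": "cytosol",
     "ratios": {"P1": 3.0, "P2": 3.0, "P3": 3.0, "P4": 3.0, "P5": 3.0}},
]
proteins = [
    {"id": "prot1", "theoretical_mw": 50.0, "observed_mw": 52.0},
    {"id": "prot2", "theoretical_mw": 50.0, "observed_mw": 75.0},
]

pm_hits = pm_overexpressed(peptides)
shared = upregulated_in_both(["pepA", "pepB"], ["pepB", "pepC"])
mw_hits = mw_discrepancies(proteins)
```

In this toy data only `pepA` passes the first query, because it is PM-localized and over the cutoff in 3 of 5 patients, while `pepC` is over the cutoff everywhere but is not PM-localized.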
15
Summary
DecisionSite allows our scientists to access and analyze data quickly and easily
Important to communicate with your end users
Future work: possibility of integrating additional in-house tools for protein expression, pathway viewer
16
Acknowledgments
Bioinformatics Group: Paul Kearney, Jason Yen
IT: Marcelo Filgueira
Scientists: Nathan Currier, Joachim Ostermann, Michel Dominguez