Using Spotfire for Proteomic Analysis

Steve Marshall, Bioinformatics Scientist, Caprion Pharmaceuticals, October 2003

Caprion Overview
Montreal-based private company founded in 1998
80 employees (27 PhD, 30 MSc)
Collaborations in prion disease diagnostics, tumor immunotherapy and biomarkers of disease

CellCarta PROTEOMICS
Fractionate and enrich biological samples containing disease-related proteins
Comprehensively and quantitatively compare proteomes from normal and diseased samples

CellCarta PROGRAMS
Target discovery
  Oncology: tumor antigen targets
  Metabolic diseases: therapeutic proteins
Biomarker discovery
  Pre-clinical and clinical markers of therapeutic efficacy
  Diagnostics development
Mechanism of action studies
  Protein expression profiling
  Sub-cellular protein location and translocation
  Phosphorylation and other post-translational modifications

Proteomics Pipeline
Purified Sample
Protein Separation
LC-MS Injections (raw data)
Expression Profiling (Custom Tools)
Protein Identification & Peptide Sequencing (Mascot & Custom Tools)
Validation of Expression & Identification (Custom Tools)
Protein Annotation & Analysis: function, location, IP, novelty, expression, etc. (??)
Protein reports
On a per-sample basis, we go from about 2 GB of data down to a couple of MB.

Proteomics Pipeline (cont'd)
Annotation
  Automatic: computationally derived (InterPro, SignalP, TMHMM)
  Manual curation (Tuva)
We developed a web-based prototype called Tuva that enabled our scientists to provide additional scientific information (e.g. protein location, disease mechanism, etc.).
The prototype served its purpose by giving our scientists a means of accessing the data and asking different questions about it.
However, it didn't scale well and wasn't flexible. We needed a more viable long-term solution.

Information Overload?
How do you cope with all this data?
External data: genome, literature
Internal data: peptide expression, lab info, sample info
Scientists also need to play with it and get a feel for the data.
How do you manage it?

Data Analysis
Scientists require a tool that:
  Allows them to navigate the data
  Lets them interact with the data dynamically
  Links various data sources
  Automates certain tasks
  Has a short learning curve
  Supports presentation of results
Bioinformatics / IT requirements:
  Scalable
  Customizable
  Extensible
With this in mind, we felt that it would be better to bring in a third-party tool that could help us in this process.
With the increasing amounts of data available to a scientist, visualization and integration of data sets are necessary.
Visualization allows you to find patterns, clusters, relationships, etc., and to identify outliers and anomalous data.
Integration links information together.
Interactive.
We needed a tool that is easy to use, scalable, customizable and extensible.
In the next slide, I'll show you where Spotfire fits in the proteomics platform at Caprion.

Proteomics Pipeline
LC-MS Injections (raw data)
Expression Profiling (Custom Tools)
Peptide Sequencing & Protein Identification (Mascot & Custom Tools)
Validation of Expression & Identification (Custom Tools)
Protein Analysis: function, location, IP, novelty, expression, etc. (DecisionSite)
Protein reports
On a per-sample basis, we go from about 2 GB of data down to a couple of MB.

Proteomics Pipeline (cont'd)
We see Spotfire as a platform to build upon.
Using an Information Link, an SQL query, or a join on a data column lets the scientist view different types of information.
Annotation db (based on protein accession from NCBI): information such as TM domains, InterPro domains, signal peptide, etc.
Sample info: information like band, patient, etc.
Expression file
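As a rough illustration of what joining these sources on a shared column means, here is a minimal sketch using Python's built-in sqlite3 module. It is not Caprion's schema or a Spotfire Information Link: every table name, column name and value is hypothetical, and it assumes the expression data carries a protein accession and a sample identifier that can serve as join keys.

```python
import sqlite3

# Hypothetical schema and values, invented only to illustrate a join on
# shared key columns (protein accession, sample id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotation (accession TEXT PRIMARY KEY,
                         tm_domains INTEGER, signal_peptide INTEGER);
CREATE TABLE sample_info (sample_id TEXT PRIMARY KEY,
                          band INTEGER, patient TEXT);
CREATE TABLE expression (accession TEXT, sample_id TEXT, ratio REAL);
INSERT INTO annotation VALUES ('ACC_A', 7, 0), ('ACC_B', 0, 1);
INSERT INTO sample_info VALUES ('S1', 12, 'patient_01'),
                               ('S2', 15, 'patient_02');
INSERT INTO expression VALUES ('ACC_A', 'S1', 2.5),
                              ('ACC_B', 'S1', 0.8),
                              ('ACC_A', 'S2', 3.1);
""")

# One row per expression measurement, enriched with annotation and sample info.
rows = conn.execute("""
SELECT e.accession, s.patient, s.band, a.tm_domains, a.signal_peptide, e.ratio
FROM expression AS e
JOIN annotation AS a ON a.accession = e.accession
JOIN sample_info AS s ON s.sample_id = e.sample_id
ORDER BY e.accession, s.patient
""").fetchall()

for row in rows:
    print(row)
```

The point of the sketch is only that once the three sources share key columns, a single query returns annotation, sample and expression values side by side.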

Workflow Example
Once proteins are identified, it is necessary to collapse the data into a more manageable form.
We accomplish this by clustering the data either by peptides or by protein homology within a given sample.
This way biologists can choose how to view the data.
Demonstration with some Caco data
This demo will show how we annotate clusters of proteins for a given sample, in this case some old Caco data.
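The slides do not say how the clustering is computed. One simple way to collapse proteins by peptides is to place proteins that share at least one identified peptide in the same cluster (connected components via union-find). The sketch below assumes that approach; the protein names and peptide strings are made up.

```python
from collections import defaultdict

def cluster_by_shared_peptides(protein_peptides):
    """Group proteins that share at least one peptide, directly or transitively."""
    parent = {p: p for p in protein_peptides}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_owner = {}  # peptide -> first protein seen containing it
    for protein, peptides in protein_peptides.items():
        for pep in peptides:
            if pep in first_owner:
                union(first_owner[pep], protein)
            else:
                first_owner[pep] = protein

    clusters = defaultdict(list)
    for protein in protein_peptides:
        clusters[find(protein)].append(protein)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical identifications for one sample.
sample = {
    "protA": {"PEPTIDEK", "SAMPLER"},
    "protB": {"SAMPLER", "ANOTHERK"},
    "protC": {"UNIQUEK"},
    "protD": {"ANOTHERK"},
}
clusters = cluster_by_shared_peptides(sample)
print(clusters)
```

Here protA, protB and protD end up in one cluster because they are linked through shared peptides, while protC stays on its own. Clustering by protein homology would need a sequence-similarity step instead and is not shown.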

Workflow Example (cont'd)

Queries??
Once the annotation process is complete, some typical queries include:
  Return all PM-localized peptides over-expressed in disease in 40% of patients
  Find the intersection of all peptides or proteins that are upregulated in condition 1 vs 2 and condition 3 vs 4
  Find all proteins that span X number of gel bands, or identify discrepancies between a protein's theoretical vs actual MW
DecisionSite gives the bench scientist the power to analyze data easily.
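To make the logic of these queries concrete, here is a minimal sketch in plain Python. It does not show how the queries are issued in DecisionSite. All field names, labels and numbers are hypothetical, "in 40% of patients" is read as "in at least 40% of patients", "span X gel bands" is read as "seen in at least X distinct bands", and the 20% MW tolerance is an arbitrary choice.

```python
# Hypothetical records; the "PM" label is taken from the slide as-is.
peptides = [
    {"peptide": "pep1", "location": "PM", "overexpressed_in": {"p1", "p2", "p3"}},
    {"peptide": "pep2", "location": "PM", "overexpressed_in": {"p1"}},
    {"peptide": "pep3", "location": "cytosol",
     "overexpressed_in": {"p1", "p2", "p3", "p4"}},
]
all_patients = {"p1", "p2", "p3", "p4", "p5"}

def pm_overexpressed(records, patients, min_fraction=0.4):
    """PM-localized peptides over-expressed in at least min_fraction of patients."""
    return [r["peptide"] for r in records
            if r["location"] == "PM"
            and len(r["overexpressed_in"] & patients) / len(patients) >= min_fraction]

# Query 2: intersection of two upregulated sets (condition 1 vs 2, 3 vs 4).
up_1_vs_2 = {"protA", "protB", "protC"}
up_3_vs_4 = {"protB", "protC", "protD"}
up_in_both = sorted(up_1_vs_2 & up_3_vs_4)

def spanning_bands(observations, min_bands):
    """Proteins seen in at least min_bands distinct gel bands."""
    bands = {}
    for protein, band in observations:
        bands.setdefault(protein, set()).add(band)
    return sorted(p for p, b in bands.items() if len(b) >= min_bands)

def mw_discrepancies(theoretical, observed, tolerance=0.2):
    """Proteins whose observed MW deviates from theoretical by more than tolerance."""
    return sorted(p for p in theoretical
                  if p in observed
                  and abs(observed[p] - theoretical[p]) / theoretical[p] > tolerance)

hits = pm_overexpressed(peptides, all_patients)
wide = spanning_bands([("protA", 3), ("protA", 4), ("protA", 5), ("protB", 7)], 3)
odd_mw = mw_discrepancies({"protA": 50.0, "protB": 30.0},
                          {"protA": 52.0, "protB": 45.0})
print(hits, up_in_both, wide, odd_mw)
```

Each query reduces to a filter or a set operation over the annotated data, which is why it maps well onto interactive filtering in a visualization tool.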

Summary
DecisionSite allows our scientists to access and analyze data quickly and easily.
It is important to communicate with your end users.
Future work:
  Possibility of integrating additional in-house tools for protein expression
  Pathway viewer
DecisionSite gives the bench scientist the power to analyze data easily.

Acknowledgments
Bioinformatics Group: Paul Kearney, Jason Yen
IT: Marcelo Filgueira
Scientists: Nathan Currier, Joachim Ostermann, Michel Dominguez
