ISMB, June
From protein sequences… to protein networks
Sequence search: a query sequence (GACTGCATTAC) against a database of DNA and protein sequences returns a family of homologous genes
Network search: a cellular response of interest against a database / scaffold of molecular interactions returns the interaction pathways associated with that cellular response
Cytoscape.org
Cytoscape is a freely available (open-source, Java-based) bioinformatics software platform for visualizing biological networks (e.g. molecular interaction networks) and analyzing networks with gene expression profiles and other state data.
Additional features are available as plugins:
–jActiveModules: identify significant “active” subnetworks
–Expression Correlation Network: cluster expression data
–Agilent Literature Search: build networks by extracting interactions from scientific literature
–MCODE: find clusters of highly interconnected regions in networks
–cPath: query, retrieve and visualize interactions from the MSKCC Cancer Pathway database
–BiNGO: determine which Gene Ontology (GO) categories are statistically over-represented in a set of genes
–Motif Finder: run a Gibbs sampling motif detector on sequences for nodes in a Cytoscape network
–CytoTalk: interact with Cytoscape from Perl, Python, R, shell scripts, or C or C++ programs
Core Features
–Customize network data display using visual styles
–Powerful graph layout tools
–Easily organize multiple networks
–Easily navigate large networks
–Filter the network
–Plugin API
Input/Output
–Protein-protein interactions from BIND, TRANSFAC databases
–Gene functional annotations from Gene Ontology (GO) and KEGG databases
–Biological models from Systems Biology Markup Language (SBML)
–cPath: Cancer Pathway database
–Proteomics Standards Initiative Molecular Interaction (PSI-MI) or Biopathway Exchange Language (BioPAX) formats
–Oracle Spatial Network data model
Outline
Introduction (5 min)
–Cytoscape as a network integration and query tool
Basic features demo (15 min)
–Load network
–Navigate/Zoom/Select/Filter Nodes
–Create subnetworks
–Visual styles
–Layout
Plugin demo (20 min)
–MCODE and BiNGO
–Agilent Literature Search Plug-in
–cPath and BioPAX plugin
Plugin development intro (10 min)
Upcoming features – Version 2.2 (5 min)
Future work (5 min)
cPath PlugIn
cPath: Overview
cPath: XML Web Service
cPath Cytoscape PlugIn
–Demo: Download sample protein-protein interaction network.
–Demo: Drill down to protein details.
cPath: XML Web Services API
Provides a URL-HTTP XML Web Services API to all cPath data.
Formats:
–PSI-MI: Proteomics Standards Initiative Molecular Interaction Format
–BioPAX: Biological Pathway Exchange Format
Commands:
–Query by keyword; query by interactor name; query by PubMed ID, etc.
Example Query:
&cmd=get_by_interactor_name_xref&q=P04273&format=psi_mi&startIndex=0&organism=&maxHits=10
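The example query above can be assembled programmatically. The following is a minimal sketch, not the cPath client itself: the parameter names are taken from the example query, while the base URL and the function name `build_cpath_query` are hypothetical placeholders, since the slide does not show the service address.

```python
from urllib.parse import urlencode

# ASSUMPTION: placeholder address; the slide does not show the real cPath host or path.
CPATH_BASE_URL = "http://cpath.example.org/webservice.do"


def build_cpath_query(interactor_xref, fmt="psi_mi", start_index=0,
                      organism="", max_hits=10):
    """Build a query URL using the parameters shown in the example query."""
    params = [
        ("cmd", "get_by_interactor_name_xref"),  # command name from the slide
        ("q", interactor_xref),                  # e.g. an external reference such as P04273
        ("format", fmt),                         # psi_mi, as in the example query
        ("startIndex", start_index),             # paging offset
        ("organism", organism),                  # left empty in the slide's example
        ("maxHits", max_hits),                   # page size
    ]
    return CPATH_BASE_URL + "?" + urlencode(params)


url = build_cpath_query("P04273")
print(url)
```

Keeping the parameters in an ordered list reproduces the order used on the slide, which makes the generated URL easy to compare against the example by eye.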
cPath Cytoscape PlugIn
Enables Cytoscape users to easily query, download and visually render interactions stored in cPath.
Utilizes the cPath XML Web Service
Automatically bundled with Cytoscape 2.1
–Works out of the box
Additional details available on the Cytoscape PlugIn home page:
–
cPath PlugIn Demo
BioPAX Plugin Demo
Plugin Development
Cytoscape is a development platform for computational network biology
“Just write the algorithm”
100% open source Java 1.4
–Java 1.5 when stable on Mac
Rich core + plugin API
Plugins are independently licensed
Template code samples
Hello World Plugin!
Getting Started
1. Read “Concepts Document” on wiki
2. Follow “Cytoscape Plugin Tutorial” on wiki
3. Visit API Javadoc
New Features for 2.2
Due October 2005
Improved node/edge attribute browsing/editing
Network editing v1.0
Gene Ontology annotation support
Window panels
Easier network comparison (side-by-side)
FUTURE DIRECTIONS: Cross-comparison of networks
(1) Alignment of networks across species (network conservation)
(2) Correspondence between physical and genetic networks
(3) Conserved regions in the presence vs. absence of stimulus
Species shown: Baker’s yeast (Saccharomyces cerevisiae), nematode worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster)
Network alignment with PathBLAST
P is a path in the global alignment graph; v and e denote the vertices and edges in P.
The value p(v) is the probability of true homology for the proteins aligned at v.
The value q(e) is the probability that the protein interaction at e is real, i.e., not a false positive.
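The slide defines p(v) and q(e), but the scoring formula itself did not survive in this transcript. A minimal sketch of how such a path score could be computed follows, assuming the log-odds form described in the PathBLAST publications: each vertex and edge contributes the log10 ratio of its probability to a background value. The function name `path_score` and the default background values `p_random` and `q_random` are assumptions chosen for illustration.

```python
import math


def path_score(vertex_probs, edge_probs, p_random=0.5, q_random=0.5):
    """Log-odds score of a path P in the alignment graph.

    vertex_probs: p(v) for each vertex v in P (probability of true homology)
    edge_probs:   q(e) for each edge e in P (probability the interaction is real)
    p_random, q_random: ASSUMED background probabilities, for illustration only
    """
    vertex_term = sum(math.log10(p / p_random) for p in vertex_probs)
    edge_term = sum(math.log10(q / q_random) for q in edge_probs)
    return vertex_term + edge_term


# A path of three aligned protein pairs joined by two conserved interactions.
score = path_score([0.9, 0.8, 0.95], [0.7, 0.6])
print(round(score, 3))
```

Under this form, a path whose probabilities all equal the background scores zero, so only paths with better-than-background homology and interaction support score positively.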
Example yeast/worm/fly alignments Roded Sharan et al. PNAS 2005
Integration of genetic and physical interactions
160 between-pathway models
101 within-pathway models
Num interactions: 1,102 genetic; 933 physical
Ryan Kelley et al. Nature Biotechnology 2005
A between-pathway model
Upcoming Events
Cytoscape Conference: Nov 30th and Dec 1st, 2005
RECOMB Satellite Conference on Network Biology and Gene Regulation: Dec 2nd-4th, 2005
Mailing lists
Cytoscape Team Trey Ideker Nerius Landys Ryan Kelley Chris Workman Past contributors: Nada Amin Owen Ozier Jonathan Wang Mark Anderson Benno Schwikowski Lee Hood Richard Bonneau Rowan Christmas Past contributors: Iliana Avila-Campillo Larissa Kamenkovich Andrew Markiel Paul Shannon Chris Sander Gary Bader Ethan Cerami Rob Sheridan Ben Gross Agilent Annette Adler Allan Kuchinsky Aditya Vailaya Mike Creech
Funding Sources NIH (NIGMS) R01 GM Program Manager: John Whitmarsh NCI caBIG Ken Buetow, Peter Covitz Unilever, PLC Guy Werner PathBLAST network comparison NSF Quantitative Systems Biology Program Manager: Mitra Basu
Layout 16 algorithms available through plugins Zooming, hide/show, alignment
yFiles Organic
yFiles Circular
Visual Styles
Map graph attributes to visual attributes
Define visual styles for later use
Graph has node and edge attributes
–E.g. expression data, interaction type, GO function
Mapped to visual attributes
–E.g. node/edge size, shape, color, font…
Take continuous gene expression data and visualize it as continuous node colors
Visual Styles Load “Your Favorite Network”
Visual Styles Load “Your Favorite Expression” Dataset
Visual Styles Map expression values to node colors using a continuous mapper
Visual Styles Expression data mapped to node colors
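The idea behind a continuous mapper can be sketched in a few lines: an expression value is rescaled to the range 0..1 and used to interpolate between two colors. This is an illustration of the concept, not Cytoscape's own implementation; the function names and the green-to-red default colors are assumptions.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples for t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def continuous_color(value, low, high,
                     low_color=(0, 255, 0), high_color=(255, 0, 0)):
    """Map an expression value in [low, high] to an RGB node color.

    Values outside the range are clamped to the nearest end color.
    The default colors are ASSUMED for illustration.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    t = (value - low) / (high - low)
    t = max(0.0, min(1.0, t))  # clamp out-of-range values
    return lerp_color(low_color, high_color, t)


# Hypothetical log-ratio expression values for three nodes.
expression = {"geneA": -2.0, "geneB": 0.5, "geneC": 3.1}
node_colors = {name: continuous_color(v, -2.0, 2.0) for name, v in expression.items()}
print(node_colors)
```

Clamping keeps outliers from producing invalid colors, at the cost of making all values beyond the range look identical.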
Visual Styles
Node attributes: node color, border color, border type, node shape, size, label, font
Edge attributes: edge color, line types, arrows, label, font
Multidimensional visual attribute mapping soon
MCODE and Biomodules Plugins (MSKCC and ISB)
Clusters in a protein-protein interaction network have been shown to represent protein complexes and parts of pathways
Clusters in a protein similarity network represent protein families
Network clustering is available through the MCODE Cytoscape plugin
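To make "highly interconnected regions" concrete, the sketch below extracts a k-core: the subgraph that remains after repeatedly removing nodes with fewer than k neighbors. This is not the MCODE algorithm, which adds vertex weighting and seeded cluster growth; it is a simpler stand-in that only illustrates density-based peeling. The function name and the toy network are assumptions.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the set of nodes in the k-core of an undirected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    alive = set(neighbors)
    changed = True
    while changed:  # peel until every remaining node has at least k live neighbors
        changed = False
        for node in list(alive):
            if len(neighbors[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Toy network: a four-protein clique (A-D) with a two-protein tail (E, F).
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
dense_region = k_core(toy_edges, 3)
print(sorted(dense_region))
```

In the toy network the tail proteins are peeled away and the clique remains, which is the kind of region one would inspect as a candidate complex.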
Proteasome 26S Proteasome 20S Ribosome RNA Pol core RNA Splicing
Biomodules (ISB) Prinz S, Avila-Campillo I, Aldridge C, Srinivasan A, Dimitrov K, Siegel AF, and Galitski T Genome Res :
Agilent Literature Search Plugin for Cytoscape
[Flowchart] Components: query interface (search terms, context), meta-search, retrieved documents, information extraction routine, output ALFA network, output Cytoscape network
Information extraction routine steps: get document; sentence tokenization; extract nouns/verbs (user context/BNS); is the sentence interesting? (yes/no); normalize nouns (user context/BNS); classify sentence into interaction type (bind, cleave, inhibit, promote, catalyze); convert to ALFA
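The information extraction steps named on this slide (sentence tokenization, noun normalization, the "interesting sentence" check, and classification into an interaction type) can be illustrated with a toy pipeline. This is a deliberately simplified sketch, not the Agilent implementation: the verb lexicon, the alias table, and all function names are assumptions, and a real system would use proper linguistic analysis rather than word lookup.

```python
import re

# ASSUMED toy lexicon: interaction types from the slide, with a few verb forms each.
INTERACTION_VERBS = {
    "bind": ("bind", "binds", "bound", "binding"),
    "cleave": ("cleave", "cleaves", "cleaved"),
    "inhibit": ("inhibit", "inhibits", "inhibited"),
    "promote": ("promote", "promotes", "promoted"),
    "catalyze": ("catalyze", "catalyzes", "catalyzed"),
}


def split_sentences(text):
    """Naive sentence tokenization on terminal punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_interactions(text, aliases):
    """Return (source, interaction_type, target) triples found in text.

    aliases maps lowercase synonyms to a canonical name (noun normalization).
    """
    triples = []
    for sentence in split_sentences(text):
        words = re.findall(r"[A-Za-z0-9-]+", sentence)
        lowered = [w.lower() for w in words]
        # Normalize recognized names, dropping duplicates but keeping order.
        names = list(dict.fromkeys(aliases[w] for w in lowered if w in aliases))
        if len(names) < 2:
            continue  # not an "interesting" sentence
        for kind, forms in INTERACTION_VERBS.items():
            if any(form in lowered for form in forms):
                triples.append((names[0], kind, names[1]))
                break
    return triples


aliases = {"mdm2": "MDM2", "hdm2": "MDM2", "p53": "TP53",
           "caspase-3": "CASP3", "parp": "PARP1"}
abstract = ("Hdm2 binds p53 in the nucleus. The samples were stored overnight. "
            "Caspase-3 cleaves PARP during apoptosis.")
network_edges = extract_interactions(abstract, aliases)
print(network_edges)
```

Each triple corresponds to one candidate edge in the resulting network; keeping the source sentence alongside each triple would allow the evidence for an edge to be shown to the user.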
Cytoscape Network produced by Literature Search. Abstract from the scientific literature Sentences for an edge
Active Modules (UCSD) Ideker T, Ozier O, Schwikowski B, Siegel AF Bioinformatics. 2002;18 Suppl 1:S233-40
Active Modules