Published by Eustacia Hutchinson. Modified over 8 years ago.
1
Human Genetics Integrative Bioinformatics using Cytoscape (and R2)
2
Human Genetics (Bio)Chemistry versus Molecular Biology: some basic concepts
(Bio)Chemistry: Concentrations; Molecular structures; Reaction equations; Quantitative; Defined experimental setup
Molecular Biology: Regulation; Large biomolecules; Large-scale processes; Qualitative; Complex experimental setup (by necessity!)
3
Human Genetics Molecular Biology: New techniques
Integrative Bioinformatics needed
(Deep) Sequencing – Arrays – Proteomics
Quantitative analysis
–handling large datasets
–statistics
Capturing complexity
–integration
–graphs
Integrative Bioinformatics: Integrated Bioinformaticians!
4
Human Genetics Integrative Bioinformatics: An example
5
Human Genetics Integrative Bioinformatics: What they did
1. Sequence the genome; assign gene function using protein sequence and structural similarities (Bonneau et al., 2004; Ng et al., 2000).
2. Perturb cells: environmental factors; knockouts (Baliga et al., 2004; Kaur et al., 2006; Kottemann et al., 2005).
3. Measure changes: microarrays (Baliga et al., 2004; Kaur et al., 2006; Whitehead et al., 2006).
4. Integrate diverse data (mRNA levels, evolutionarily conserved associations among proteins, metabolic pathways, cis-regulatory motifs, etc.) with the cMonkey algorithm to reduce data complexity and identify subsets of genes that are coregulated in certain environments (biclusters) (Reiss et al., 2006).
5. Using the machine learning algorithm Inferelator, construct a dynamic network model for the influence of changes in environmental factors (EFs) and transcription factors (TFs) on the expression of coregulated genes (Bonneau et al., 2006).
6. Explore the network with Gaggle, a framework for data integration and software interoperability, to formulate and then experimentally test hypotheses that drive additional iterations of steps 2–6 (Shannon et al., 2006).
6
Human Genetics Integrative Bioinformatics: Their framework
7
Human Genetics Integrative Bioinformatics: results
8
Human Genetics Goes to show that:
1. Aggregate: combine data from different sources
2. Search/Visualize: filter
3. Analyze/Feedback: algorithms
Need for adaptable software
Goal: facilitate ideas
9
Human Genetics Cytoscape - Network Visualization and Analysis
Freely available (open-source, Java) software, easily extensible (Plugin API)
Visualizing networks (e.g. molecular interaction networks)
Analyzing networks with gene expression profiles and other cell state data (GO, proteomics, …)
Used in several hundred analyses in recent literature
Continuity guaranteed
10
Human Genetics An example Cytoscape work-flow
11
Human Genetics Cytoscape Workflow
1. Load Networks (Import network data into Cytoscape)
2. Load Attributes (Get data about networks into Cytoscape)
3. Analyze and Visualize Networks
4. Prepare for Publication
A specific example of this workflow:
–Cline et al. “Integration of biological networks and gene expression data using Cytoscape”, Nature Protocols, 2, 2366-2382 (2007).
12
Human Genetics Networks as graphs
A network is a collection of
–Nodes (or vertices)
–Edges connecting nodes (directed or undirected, weighted, multiple edges, self-edges)
Nodes can represent proteins, genes, metabolites, or groups of these (e.g. complexes): any sort of object
Edges can be physical or functional interactions, activators, regulators, reactions: any sort of relation
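The graph vocabulary above can be sketched in a few lines of Python. This is only an illustration of the data structure, not Cytoscape's internal model; the `Network` class, its method names, and the node names (`TF1`, `geneA`, `geneB`) are hypothetical.

```python
from collections import defaultdict


class Network:
    """Tiny directed multigraph: nodes plus typed, optionally weighted edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = []                     # (source, target, interaction, weight)
        self.adjacency = defaultdict(list)  # source -> indices into self.edges

    def add_edge(self, source, target, interaction, weight=1.0):
        """Add one directed edge; both endpoints are registered as nodes."""
        self.nodes.update((source, target))
        self.adjacency[source].append(len(self.edges))
        self.edges.append((source, target, interaction, weight))

    def neighbors(self, node):
        """Distinct targets reachable over one outgoing edge, sorted by name."""
        return sorted({self.edges[i][1] for i in self.adjacency.get(node, [])})


# Hypothetical example network (names are made up for illustration).
net = Network()
net.add_edge("TF1", "geneA", "activates")
net.add_edge("TF1", "geneB", "represses", weight=0.4)  # weighted edge
net.add_edge("geneA", "geneA", "self-regulates")       # self-edge
net.add_edge("TF1", "geneA", "binds")                  # second edge between the same pair
```

Storing edges as a list rather than a set of node pairs is what allows multiple edges between the same two nodes, each with its own interaction type.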
13
Human Genetics Cytoscape Workflow 1. Load Networks (Get network data into Cytoscape) 2. Load Attributes (Get data about networks into Cytoscape) 3. Analyze and Visualize Networks 4. Prepare for Publication
14
Human Genetics Creating a network
15
Human Genetics Free-format Text and Excel Files Specify Input File Define Columns Text Parsing Options Preview
16
Human Genetics Pathways: plenty of resources
http://pathguide.org : over 240 pathway databases
17
Human Genetics All kinds of network data…
Physical interactions
–Protein – Protein interactions
–Protein – DNA interactions
–Metabolic interactions
Functional interactions
–Co-expression relations
–Genetic interactions
–Knockout/siRNA – targets
18
Human Genetics Pre-formatted Network Files
Cytoscape supports many popular file formats:
–SIF (Simple Interaction Format)
–GML (Graph Markup Language)
–XGMML (eXtensible Graph Markup and Modeling Language)
–BioPAX (Biological Pathway Data)
–PSI-MI 1 & 2.5 (Proteomics Standards Initiative)
–SBML Level 2 (Systems Biology Markup Language)
Available for download from data sources (URLs, web services, formatted table files)
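Of the formats listed, SIF is the simplest: each line holds a source node, an interaction type, and one or more target nodes, and a line with a single name declares an isolated node. A minimal reader might look like the sketch below. It assumes whitespace-delimited files with no spaces inside node names (names containing spaces would need tab splitting instead); `parse_sif` and the example node names are hypothetical.

```python
def parse_sif(text):
    """Parse SIF text into (nodes, edges); edges are (source, interaction, target)."""
    nodes, edges = set(), []
    for line in text.splitlines():
        fields = line.split()  # assumption: whitespace-delimited, no spaces in names
        if not fields:
            continue           # skip blank lines
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:  # one line may list several targets
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Hypothetical input: "pd" (protein-DNA) and "pp" (protein-protein) interaction types.
sif_text = """TF1 pd geneA geneB
geneA pp geneC
orphan
"""
sif_nodes, sif_edges = parse_sif(sif_text)
```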
19
Human Genetics Internet Databases
Cytoscape version 2.6
–web service clients: import networks directly from several trusted internet resources
IntAct (EMBL-EBI)
PathwayCommons (collection of data resources)
NCBI Entrez Gene
Many more will be included...
20
Human Genetics Interaction Database Search Import Visualize and Analyze
21
Human Genetics Cytoscape Workflow 1. Load Networks (Get network data into Cytoscape) 2. Load Attributes (Get data about networks into Cytoscape) 3. Analyze and Visualize Networks 4. Prepare for Publication
22
Human Genetics What are Attributes?
Any data that describes or provides details about the nodes and edges in the network
–Gene Expression Data
–Mass Spectrometry Data
–Protein Structure Information
–Gene Ontology (GO) terms
–Interaction Confidence Values, etc.
Cytoscape supports multiple data types
–Numbers (integers, floats)
–Text (strings)
–Logical (booleans)
–Lists…
23
Attribute Management
Node or Edge ID
Specific Attribute Tabs
Select Attributes for Display
String and floating-point attribute types
24
Human Genetics Load Attributes: Import Attribute Files
Map data about networks onto networks. Attributes can be loaded in many of the same ways as networks.
–Import pre-formatted attribute files
–Import formatted text or Excel files
–Create attributes manually in the attribute editor
–Load attributes from web services
–ID mapping through node attributes
25
Human Genetics ID Mapping
Mapping identifiers from one source to another is a major challenge
Multiple levels of IDs, e.g. probe -> gene -> peptide -> protein
Cytoscape provides ID mapping through the BioMart web service of EBI to convert the IDs
Not perfect but sufficient
Additional mapping mechanism underway
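The multi-level mapping problem can be illustrated with plain lookup tables chained one after another. The sketch below does not call BioMart; the `map_ids` function, both tables, and all identifiers in them are made up for illustration. It also shows why the result is "not perfect": an identifier is lost as soon as any level lacks an entry.

```python
def map_ids(ids, *tables):
    """Chain lookup tables (e.g. probe -> gene, gene -> protein).

    Returns (mapped, unmapped): a dict of successfully converted IDs and a
    list of IDs that dropped out at some level.
    """
    mapped, unmapped = {}, []
    for identifier in ids:
        current = identifier
        for table in tables:
            current = table.get(current)
            if current is None:  # no entry at this level: give up on this ID
                break
        if current is None:
            unmapped.append(identifier)
        else:
            mapped[identifier] = current
    return mapped, unmapped


# Hypothetical tables; real ones would come from a mapping service or file.
probe_to_gene = {"probe_1": "GENE_A", "probe_2": "GENE_B"}
gene_to_protein = {"GENE_A": "PROT_A"}

mapped, unmapped = map_ids(["probe_1", "probe_2", "probe_3"],
                           probe_to_gene, gene_to_protein)
```

Reporting the unmapped identifiers explicitly, rather than silently dropping them, makes the coverage of each mapping step visible.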
26
Human Genetics Cytoscape Workflow 1. Load Networks (Get network data into Cytoscape) 2. Load Attributes (Get data about networks into Cytoscape) 3. Analyze and Visualize Networks 4. Prepare for Publication
27
Human Genetics Visual Data Integration
1. Network Data
YDR382W pp YDL130W
YDR382W pp YFL039C
YFL039C pp YCL040W
YFL039C pp YHR179W
2. Attribute Data
ExpressionValue
YCL040W = 0.542
YDL130W = -0.123
YDR382W = -0.058
YFL039C = 0.192
YHR179W = 0.078
VizMapper
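Conceptually, this integration step is a join between the edge list and the attribute table on the node identifier. The sketch below uses the yeast identifiers and expression values from the slide; the `annotate_edges` helper is hypothetical and only illustrates the join, not how Cytoscape stores attributes.

```python
# Network data from the slide: (source, interaction, target).
edges = [
    ("YDR382W", "pp", "YDL130W"),
    ("YDR382W", "pp", "YFL039C"),
    ("YFL039C", "pp", "YCL040W"),
    ("YFL039C", "pp", "YHR179W"),
]

# Attribute data from the slide: node ID -> ExpressionValue.
expression = {
    "YCL040W": 0.542,
    "YDL130W": -0.123,
    "YDR382W": -0.058,
    "YFL039C": 0.192,
    "YHR179W": 0.078,
}


def annotate_edges(edges, values):
    """Attach the attribute value of both endpoints to every edge (None if missing)."""
    return [(source, target, values.get(source), values.get(target))
            for source, _interaction, target in edges]


annotated = annotate_edges(edges, expression)
network_nodes = {node for source, _, target in edges for node in (source, target)}
missing = sorted(network_nodes - set(expression))  # nodes without an attribute value
```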
28
Human Genetics VizMapper List of Data Attributes Default Visual Style Editor List of Visual Attributes Mapping definition List of Visual Styles
29
Human Genetics Types of mappings
Continuous
–Continuous Data mapped to Continuous Visual Attributes (e.g. gene expression levels mapped to node color)
–Continuous Data mapped to Discrete Visual Attributes (e.g. p-value categories mapped to node shape)
Discrete
–Discrete (categorical) Data mapped to Discrete Visual Attributes (e.g. GO annotation mapped to node shape)
–Discrete Data mapped to Continuous Visual Attributes (e.g. multiple GO terms mapped to pie coloring)
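Two of these mapping types are easy to sketch: a continuous mapping interpolates a visual value from a number, and a discrete mapping is a lookup table with a default. The code below is an illustration under assumed settings, not the VizMapper implementation; the function names, the blue-white-red gradient, the value range of -1 to 1, and the category-to-shape table are all hypothetical choices.

```python
def continuous_color(value, low=-1.0, high=1.0,
                     low_rgb=(0, 0, 255), mid_rgb=(255, 255, 255),
                     high_rgb=(255, 0, 0)):
    """Continuous -> continuous: map a number to an RGB color by linear interpolation."""
    mid = (low + high) / 2
    value = max(low, min(high, value))  # clamp values outside the range
    if value <= mid:
        start, end, fraction = low_rgb, mid_rgb, (value - low) / (mid - low)
    else:
        start, end, fraction = mid_rgb, high_rgb, (value - mid) / (high - mid)
    return tuple(round(a + (b - a) * fraction) for a, b in zip(start, end))


def discrete_shape(category, table, default="ellipse"):
    """Discrete -> discrete: look up a node shape for a category."""
    return table.get(category, default)


# Hypothetical category-to-shape table.
shape_table = {"kinase": "diamond", "transcription factor": "triangle"}

low_color = continuous_color(-1.0)  # most strongly down-regulated end of the gradient
tf_shape = discrete_shape("transcription factor", shape_table)
```

Clamping keeps outliers at the ends of the gradient instead of producing invalid color components.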
30
Human Genetics Network Filtering
31
Human Genetics Several Layout Algorithms Spring-embedded Circular Hierarchical
32
Human Genetics Linkout Nodes and Edges act as hyperlinks to external databases. User-configurable URLs Collection of the biological results for the publication
33
Human Genetics Cytoscape Workflow 1. Load Networks (Get network data into Cytoscape) 2. Load Attributes (Get data about networks into Cytoscape) 3. Analyze and Visualize Networks 4. Prepare for Publication
34
Human Genetics Prepare for Publication
Fine-tune the figures
Manual layout manipulation options (align, scale, rotate)
Manually override visual styles
–place labels, change colors, etc.
35
Human Genetics Finalizing the Figures Publication Quality Graphics in several formats PDF, EPS, SVG, PNG, JPEG, and BMP Export Session to HTML for Web
36
Human Genetics Cytoscape: So what? The big Pro Cyto argument: EXTENSIBLE Plugins, Plugins, Plugins –In our case enabled extended array data analysis
37
Human Genetics Cytoscape is Extensible
Cytoscape is free, open-source software
A plugin interface allows any programmer to write their own extensions to Cytoscape
Plugins are the primary biological analysis mechanism in Cytoscape
Plugins are distributed from a central Cytoscape database and can be installed while Cytoscape is running
Several dozen plugins currently available (www.cytoscape.org/plugins/index.php)
38
Human Genetics Hello World Plugin http://cytoscape.org/cgi-bin/moin.cgi/Hello_World_Plugin http://cytoscape.org/cgi-bin/moin.cgi/Developer_Homepage
39
Human Genetics Extending the workflow through plugins Graph based integration and analysis of molecular biological data
40
Human Genetics Integrative Bioinformatics in our group
Aggregate data: 18000+ Affymetrix arrays
–Tumor series
–Public data
–Experiments
Manipulate cell lines; Lentiviral library
Search/Visualize/Selection: R2
–Statistical cutoffs
–Correlations: R2
–Clinical data coupling
Analysis/Feedback: R2 and Cytoscape
–Known Interactions
–Transcription Factor binding
41
Human Genetics Integrative Bioinformatics in our group
[Diagram; component labels: External data sources; Statistical analysis; Perl module; Cytoscape webstart; AMC Plugin; Canonical paths DB; Patient data; GEO arrays; Algorithms; Array data: Tumor and Experiments; R2-array analysis interface; Cytoscape interface; HGServer]
42
Human Genetics Array data analysis: R2 Mainly work by Jan Koster
43
Human Genetics R2 interface: Demo
44
Human Genetics R2 interface
45
Human Genetics R2 interface
46
Human Genetics R2 interface
47
Human Genetics R2 interface
48
Human Genetics R2 interface
49
Human Genetics Timeseries in R2 / Cytoscape (Demo)
50
Human Genetics Timeseries in R2
51
Human Genetics Timeseries in R2
52
Human Genetics Timeseries in R2 Integration with Cytoscape through webstart
53
Human Genetics Timeseries in Cytoscape: Visualization
54
Human Genetics Timeseries in Cytoscape: Aggregate data
55
Human Genetics Timeseries in Cytoscape: Search/Filter
56
Human Genetics Timeseries in Cytoscape: Filter
57
Human Genetics Timeseries in Cytoscape
58
Human Genetics Timeseries in Cytoscape
59
Human Genetics Tf (green) and partners (red)
60
Human Genetics Filtering
61
Human Genetics Filtering
62
Human Genetics Coloring, layout
63
Human Genetics Summarizing:
1. Aggregate: combine NOTCH3 knockout data with TF and PPI data
2. Search/Visualize: lay out timeseries; find downstream targets
3. Analyze/Feedback: identify MSX1/knockout in new experiment
64
Human Genetics More Plugin Examples
–BiNGO (Enriched GO categories found in the sub-network)
–WikiPathways (Visualize curated pathways)
–MCODE (Putative protein complexes)
–GenePro (Protein-Protein interaction cluster visualization)
–jActiveModules (Search for significant sub-networks)
–NetworkAnalyzer (Statistical analysis of networks)
–Agilent Literature Search (Network creation)
–CyGoose (Gaggle communication)
See http://cytoscape.org/plugins for many more
65
Human Genetics Timeseries and BiNGO: Aggregate
66
Human Genetics Timeseries and BiNGO: Analyze
67
Human Genetics Timeseries and BiNGO
68
Human Genetics Timeseries and BiNGO
69
Human Genetics GOlorize plug-in (Pasteur)
Node placement on the basis of both the connection structure (the edges) and the class structure (GO)
A modification of the classic force-directed layout algorithm
Beyond GO classes, other class information can be used through attributes (e.g. active modules, complexes)
70
Human Genetics GOlorize plug-in interface Default settings for the class attractive force and separation factor Class-directed network layout
71
Human Genetics Example: genetic interaction network Standard Spring-embedded layout algorithm in Cytoscape
72
Human Genetics Example: genetic interaction network Spring-embedded layout algorithm with GO colour-coding
73
Human Genetics Example: genetic interaction network Final results of the GOlorize layout algorithm in Cytoscape Garcia et al. Bioinformatics 2007
74
Human Genetics Find Network Clusters - MCODE Plugin
Network clusters are highly interconnected sub-networks that may also partly overlap
Clusters in a protein-protein interaction network have been shown to represent protein complexes and parts of biological pathways
Clusters in a protein similarity network represent protein families
Network clustering is available through the MCODE Cytoscape plugin
75
Human Genetics Network Clustering 7000 Yeast interactions among 3000 proteins
76
Human Genetics Bader & Hogue, BMC Bioinformatics 2003 4(1):2
77
Human Genetics Proteasome 26S Proteasome 20S Ribosome RNA Pol core RNA Splicing Bader & Hogue, BMC Bioinformatics 2003 4(1):2
78
Human Genetics Find Network Motifs - NetMatch plugin
A network motif is a sub-network that occurs significantly more often than expected by chance alone
Input: query and target networks, optional node/edge labels
Output: topological query matches as subgraphs of the target network
Supports: subgraph matching, node/edge labels, label wildcards, approximate paths
http://alpha.dmi.unict.it/~ctnyu/netmatch.html
79
Human Genetics Finding query sub-networks
[Figure panels: Query; Results]
Ferro et al. Bioinformatics 2007
80
Human Genetics Finding Signaling Pathways
Potential signaling pathways from plasma membrane to nucleus via cytoplasm
[Figure: signaling pathway example, the MAP kinase cascade (Ras, Raf-1, Mek, MAPK, TFs; nucleus: growth control, mitogenesis). Panels: NetMatch query; NetMatch results; shortest path between subgraph matches]
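The "shortest path between subgraph matches" idea can be illustrated with a breadth-first search over a directed network. This sketch is not NetMatch's algorithm. The cascade edges follow the MAP kinase example named on the slide; the `shortest_path` function and the two detour nodes (`AdaptorX`, `AdaptorY`) are hypothetical additions that give the search an alternative, longer route.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# MAP kinase cascade from the slide plus a hypothetical longer detour.
signaling = {
    "Ras": ["Raf-1", "AdaptorX"],
    "Raf-1": ["Mek"],
    "Mek": ["MAPK"],
    "MAPK": ["TFs"],
    "AdaptorX": ["AdaptorY"],  # hypothetical node
    "AdaptorY": ["Mek"],       # hypothetical node
}

cascade = shortest_path(signaling, "Ras", "TFs")
```

Breadth-first search finds a path with the fewest edges; a weighted network (for example with interaction confidence values) would need Dijkstra's algorithm instead.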
81
Human Genetics Find Active Subnetworks
Active modules are sub-networks that show differential expression over user-specified conditions or time-points
–Microarray gene-expression attributes
–Mass-spectrometry protein abundance
Method
–Calculate a z-score per node and an aggregate score z_A per subgraph; correct for random expression data sampling
–Score over multiple experimental conditions
–A simulated annealing-based search method is used to find the high-scoring networks
Ideker T, Ozier O, Schwikowski B, Siegel AF. Bioinformatics. 2002;18 Suppl 1:S233-40
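The scoring part of the method can be sketched as follows, assuming the formulation in the cited paper: each p-value is converted to a z-score with the inverse normal CDF, a subgraph of k nodes gets the aggregate score z_A = (sum of z_i) / sqrt(k), and z_A is then calibrated against randomly sampled node sets of the same size. The simulated annealing search and the scoring over multiple conditions are left out. All function names, the p-values, and the gene names are hypothetical, and this is not the jActiveModules code.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_from_p(p_value):
    """Convert a p-value (0 < p < 1) to a z-score; small p gives large z."""
    return NormalDist().inv_cdf(1.0 - p_value)


def aggregate_z(z_scores):
    """Aggregate score z_A of a subgraph with k nodes: sum(z_i) / sqrt(k)."""
    return sum(z_scores) / sqrt(len(z_scores))


def corrected_score(module, z_by_node, samples=1000, seed=0):
    """Calibrate z_A against random node sets of the same size."""
    rng = random.Random(seed)
    all_nodes = list(z_by_node)
    observed = aggregate_z([z_by_node[n] for n in module])
    background = [
        aggregate_z([z_by_node[n] for n in rng.sample(all_nodes, len(module))])
        for _ in range(samples)
    ]
    return (observed - mean(background)) / stdev(background)


# Hypothetical p-values for differential expression of six genes.
p_values = {"g1": 0.001, "g2": 0.002, "g3": 0.5, "g4": 0.7, "g5": 0.9, "g6": 0.4}
z_by_node = {gene: z_from_p(p) for gene, p in p_values.items()}

module_score = corrected_score(["g1", "g2"], z_by_node)
```

Dividing by sqrt(k) keeps aggregate scores comparable across subgraph sizes, and the background sampling expresses how unusual the observed score is compared with arbitrary node sets of that size.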
82
Human Genetics Finding active modules
Ideker T et al. Science 2001; Bioinformatics 2002
jActiveModules plug-in
Input: interaction network and p-values for gene expression values over several conditions
Output: significant sub-networks that show differential expression over one or several conditions
83
Human Genetics Cerebral: Cellular location and expression data
84
Human Genetics Concluding
Cytoscape is a proven, valuable tool for integrative bioinformatics
Easily extensible: well suited to answering new biological research questions
Analyses can be tedious for biologists; it is up to bioinformaticians to translate these into simple workflows
Therefore: bioinformaticians, integrate into wet-lab research groups!
85
Human Genetics Some notes…
Plugin lifetime
–Maintenance
–Interoperability
Visualization issues…
–Standard biologist layouts
–Fancy visuals
Cytoscape 3.0 aims to solve these issues (amongst others)
86
Human Genetics Availability
Cytoscape:
–http://cytoscape.org
–cytoscape-discuss@googlegroups.com
–cytoscape-helpdesk@googlegroups.com
R2
–Available shortly through http://humangenetics-amc.nl
–Keep yourself posted on http://groups.google.com/group/r2-announce