Presentation is loading. Please wait.

Presentation is loading. Please wait.

Canadian Bioinformatics Workshops

Similar presentations


Presentation on theme: "Canadian Bioinformatics Workshops"— Presentation transcript:

1 Canadian Bioinformatics Workshops

2 Cold Spring Harbor Laboratory
In collaboration with Cold Spring Harbor Laboratory & New York Genome Center

3 Module #: Title of Module
3

4 Module 3 Lab Cytoscape & Enrichment Map
Jüri Reimand High-throughput Biology: From Sequence to Networks April 27-May 3, 2015

5 Network Analysis Workflow
A specific example of this workflow: Cline, et al. “Integration of biological networks and gene expression data using Cytoscape”, Nature Protocols, 2, (2007). TODO: What: Schedules first part of talk How Bridge: So this is all about networks....

6 Pathways vs Networks - Detailed, high-confidence consensus
- Biochemical reactions - Small-scale, fewer genes - Concentrated from decades of literature - Simplified cellular logic, noisy - Abstractions: directed, undirected - Large-scale, genome-wide - Constructed from omics data integration

7 Networks Represent relationships of biological molecules
Physical, regulatory, genetic, functional interactions Useful for discovering relationships in large data sets Better than tables in Excel Visualize multiple data types together Discover interesting patterns Network analysis Finding sub-networks with certain properties (densely connected, co-expressed, frequently mutated, clinical characteristics) Finding paths between nodes (or other network “motifs”) Finding central nodes in network topology (“hub” genes)

8 Network basics (i): nodes and edges
A simple mapping one molecule - node one interaction - edge A more realistic mapping Cell localization cell cycle cell type taxonomy Physiologically relevant Edges - diverse relationships Critical: what do nodes and edges mean? Edges and nodes : a network is composed of 2 elements : edges and nodes. We can have 2 types of network when we speak about the output of a pathway enrichment analysis. A gene gene network or a pathway network. Node (circle) are going to represent gene and are going to be related by edges (lines) . We also can have a pathway network : nodes (circle) are representing a gene-set and edge represents the number of overlapping genes

9 Network basics (i): nodes and edges
Node (molecule) Gene Protein Transcript Drug MicroRNA A simple mapping one molecule - node one interaction - edge A more realistic mapping Cell localization cell cycle cell type taxonomy Physiologically relevant Edges - diverse relationships Critical: what do nodes and edges mean? Edge (interaction) Genetic interaction Physical protein interaction Co-expression Signaling interaction Metabolic reaction DNA-binding Directed or undirected? Edges and nodes : a network is composed of 2 elements : edges and nodes. We can have 2 types of network when we speak about the output of a pathway enrichment analysis. A gene gene network or a pathway network. Node (circle) are going to represent gene and are going to be related by edges (lines) . We also can have a pathway network : nodes (circle) are representing a gene-set and edge represents the number of overlapping genes

10 Network Representations

11 Network basics (ii) - layout and visual attributes
Use layout to interpret network Local relationships Guilt-by-association Dense clusters Global relationships Visualise multiple types of data

12 Automatic network layout

13 Force-directed network layout
Force-directed: nodes repel and edges pull Good for up to 500 nodes Bigger networks give hairballs - reduce number of edges Advice: try force directed first, or hierarchical for tree-like networks Tips for better looking networks Manually adjust layout Load network into a drawing program (e.g. Illustrator) and adjust labels

14 force directed layout 1 2 3 4

15 Dealing with ‘hairballs’: zoom or filter
Focus PKC Cell Wall Integrity

16 Visual Features Node and edge attributes Visual attributes
Attribute color Arrow shape Visual Features Node and edge attributes Represent properties of genes, interactions Text (string), integer, float, Boolean, list Visual attributes Node, edge visual properties Colour, shape, size, borders, opacity... Edge shape Node shape

17 What Have We Learned? Networks help discover relationships in large data sets Networks help integrate several datasets or types Important to understand meaning of nodes and edges Avoid hairballs by focusing analysis Automatic layout is required to visualize networks Visual attributes enable multiple types of data to be shown at once – useful to see their relationships 7 Part II

18 Cytoscape - Network Visualization
Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data. a lot of apps are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web. Network visualization can be done using the Cytoscape software. You can create and visualize networks using Cytoscape.

19 Manipulate Networks Filter/Query Interaction Database Search Automatic Layout

20 The Cytoscape App Store
Pathway analysis Gene expression analysis Complex detection Literature mining Network motif search Pathway comparison

21 Introduction to Cytoscape (3.1.0)
save your session save image Results panel Control panel And finally these next slides represent a quick Cytoscape tutorial. Cytoscape has 3 major panels : control panel, data panel and results panel. You can save your session as a .cys file and reopen it later Table panel

22 Navigate through the network (3.1.0)
move the blue square to navigate through the network When you want to navigate through the network, you can use move the blue square in the network window of the control panel 7 Part II

23 Try different Cytoscape layouts (3.1.0)
Circular Layout If you want to redo your layout, you can go to layout in the menu bar and choose the layout of your choice 7 Part II

24 yFiles Circular

25 yFiles Organic

26 Change visual Features (3.1.0)
STYLE Colors Shapes Sizes Brushes Transparency You can play with visual features using Vizmapper in the control panel window 7 Part II

27 Visual Style Load “Your Favorite Network”
File > Import > Network > File …

28 Visual Style Load “Your Favorite Expression” Dataset
File > Import > Table > File …

29 Visual Style Map expression values to node colours using a continuous mapper

30 Visual Style Expression data mapped to node colours

31 Fine-tuning network layout
Move, zoom/pan, rotate, scale, align

32 Create Subnetwork

33 Create Subnetwork

34 Select nodes with at least 20 interactions
Network Filtering Select nodes with at least 20 interactions

35 Interaction Database Search
Query the BioGrid database for interactions of the cancer gene KRAS

36 What have we learned Cytoscape is a useful, free software tool for network visualisation and analysis Provides basic network features Automated layouts Mapping of genomic attributes to visual attributes Apps are available to extend functionality to diverse analyses

37 Enrichment Map A Cytoscape app to visualize and interpret results of pathway enrichment analysis Enriched pathways are visualized as networks Edges connect pathways with many shared genes pathway network node (pathway) edge: # of overlapping genes

38 GO.id GO.name p.value covercover.rat Deg.mdn Deg.iqr
GO: taxis E GO: chemotaxis E GO: adaptive immune response based on somatic recombination E GO: adaptive immune response E GO: leukocyte mediated immunity GO: B cell mediated immunity GO: myeloid cell differentiation GO: immune effector process GO: regulation of phagocytosis GO: positive regulation of phagocytosis GO: lymphocyte mediated immunity GO: growth factor binding GO: protein polymerization GO: endoplasmic reticulum membrane GO: immunoglobulin mediated immune response GO: heart development GO: response to bacterium GO: regulation of endocytosis GO: acute inflammatory response GO: positive regulation of endocytosis GO: myeloid leukocyte activation GO: amino acid biosynthetic process GO: regulation of inflammatory response GO: activation of immune response GO: positive regulation of immune system process GO: positive regulation of immune response GO: antigen processing and presentation GO: regulation of immune system process GO: regulation of immune response GO: negative regulation of enzyme activity GO: phagocytosis GO: myeloid leukocyte differentiation GO: humoral immune response GO: lymphocyte activation GO: leukocyte chemotaxis GO: negative regulation of protein kinase activity GO: negative regulation of transferase activity GO: transforming growth factor beta receptor signaling pathw GO: insulin-like growth factor binding GO: T cell activation GO: humoral immune response mediated by circulating immunogl GO: cytosolic ribosome (sensu Eukaryota)

39

40 Creating an Enrichment Map with data from g:Profiler
Go to g:Profiler website - . Select and copy all genes in the tutorial file MCF7_24hr_topgenes.txt in the Query box In Options, check Significant only, No electronic GO annotations Set the Output type to Generic Enrichment Map (TAB) Show advanced options Set Max size of functional category to 1000 and Min size to 5. Set Min size of gene list and functional category overlap Q&T to 2. Set Significance threshold to Benjamini-Hochberg FDR . Choose GO biological process, molecular function, KEGG and Reactome from the color legend. Download g:Profiler data as gmt: name Click on g:Profile! to run the analysis Download the result file: Download data in Generic Enrichment Map (GEM) format

41 g:Profiler files for Enrichment Map
Two files are required: Tabular text file with enriched processes&pathways GMT file with all processes&pathways and associated genes A. B.

42 TAB file – Enrichment Map
Process ID Process name P-value, FDR Phenotype positive 1; negative -1 Genes common to input list and process/pathway

43 GMT file – Enrichment Map
Process ID Process name Gene list

44 Getting the Enrichment Map app
A. Cytoscape App Store B. In Cytoscape software Apps > App manager > Search > Install

45 Constructing an Enrichment Map
A. Set input file format to “Generic” (format of g:Profiler results) A. B. Select GMT file from file system (this file contains gene lists of pathways) B. C. C. Select enrichments file from file system D. D. Set equal P-value and Q-value cutoffs (because g:Profiler only provides corrected P-values=Q-values) Click Build !

46 Enrichment Map – a network of pathways
Nodes – pathways, processes, functions Edges – between pathways with many shared genes Node size – genes in pathway Node color – enrichment strength Edge weight – genes shared

47 Enrichment Map – a network of pathways
What are major functional themes? Apoptosis Prolife- ration Mitochondrial processes Protein degradation Tissue development

48 Edge definition determines granularity
Edges between pathways defined by % of common genes Increasing edge stringency reveals finer granularity of functional themes Protein degradation Mitochondrial processes Apoptosis

49 Mapping attributes to network Node labels mapped to pathway IDs by default
Pathway names Pathway IDs

50 Compare two pathway enrichment analyses g:Profiler analyses of MCF7_12hr_topgenes.txt and MCF7_24hr_topgenes.txt 1st dataset maps to node fillings 2nd dataset maps to node edges

51 Mapping further gene sets to Enrichment Map E. g
Mapping further gene sets to Enrichment Map E.g. which processes associate to known cancer genes? CANCER_GENES B. Select GMT file from file system (this file contains gene lists used in original Enrichment Map) B. Select another GMT file from file system (this file contains gene lists mapped on top of original Enrichment Map)

52 Post-Analysis: New edges link further gene lists to existing pathways
Purple edges – pathway genes in this analysis are significantly enriched in known cancer genes

53 What have we learned? Enrichment map uses network visualization to summarize results of pathway enrichment analysis Edges connect pathways with shared genes, and edge stringency determines granularity of maps Two sets of pathways can be compared Post-analysis links further gene sets to existing map

54 Tips for publishable Enrichment Maps
Manually curate clusters of connected pathways as “functional themes” e.g. by picking most general process/pathway in the group Check genes involved Review small groups and singleton nodes Many are probably part of larger groups – safe to remove Some interesting singletons may be the only representatives! Assign further visual attributes using your omics data Export as PDF and use graphics software (AI) for finalizing Sometimes less is more

55 Cytoscape Tips & Tricks
Network views When you open a large network, no view is shown by default To improve interactive performance, Cytoscape has the concept of “Levels of Detail” Some visual attributes will only be apparent when you zoom in The levels for various attributes can be changed in the preferences To see what things will look like at full detail: ViewShow Graphics Details

56 Cytoscape Tips & Tricks
Sessions Sessions save pretty much everything: Networks Properties Visual styles Screen sizes Saving a session on a large screen may require some resizing when opened on your laptop

57 Cytoscape Tips & Tricks
Memory Cytoscape uses lots of it Doesn’t like to let go of it An occasional restart when working with large networks helps Destroy views when you don’t need them Java doesn’t have a good way to get the memory right at start Since version 2.7, Cytoscape does a much better job at “guessing” good default memory sizes than previous versions

58 Cytoscape Tips & Tricks
CytoscapeConfiguration directory In your user/home directory Your defaults and any apps downloaded from the app store will go here Sometimes, if things get really messed up, deleting (or renaming) this directory can give you a “clean slate”

59 We are on a Coffee Break & Networking Session


Download ppt "Canadian Bioinformatics Workshops"

Similar presentations


Ads by Google