Tutorial session 2: Network annotation
Exploring PPI networks using Cytoscape
EMBO Practical Course, Session 8
Nadezhda Doncheva and Piet Molenaar
Overview
- Focus: Network annotation and visualization
  - Loading and manipulating attributes
  - Identifier mapping
  - Mapping data onto the network
  - Use visuals to convey data
- Concepts: VizMapper
- Data: Human neuroblastoma mutated genes list
10/18/2015
Attributes
- Nodes and edges can have attributes associated with them
  - Gene expression data
  - Mass spectrometry data
  - Protein structure information
  - Gene Ontology terms, etc.
- Cytoscape supports multiple data types: numbers, text, logical values, lists, ...
Loading attributes
- Use pre-formatted attribute files
- Import attributes from a table
  - Excel file
  - Comma- or tab-delimited text
- Import attributes from web services
  - NCBI Entrez Gene
  - Ensembl Biomart
- Use ‘import attribute or expression matrix’
- Create attributes manually in the attribute browser
Loading attributes from table (Demo)
Use Case 2.1: Neuroblastoma
- Childhood neuroendocrine tumor
  - Young children
  - Variable clinical outcome
- Low stages
  - Good prognosis
  - Numeric changes of chromosomal copy numbers
- High stages
  - Poor prognosis
  - Structural chromosomal defects (LOH 1p / 11q, etc.)
- Few gene defects identified
  - MYCN amplification (20%)
  - ALK activation (7%)
  - CCND1 / PHOX2B / NF1
Use Case 2.1: Neuroblastoma
- Poor prognosis
  - Subgroup (~1/3) characterized by MYCN amplification
  - Rest unknown
Use case: Assignment 2.1
- Whole-genome sequencing of 86 tumors vs. blood
- 1043 genes with mutations
1. Load the list of genes (neuroblastoma_mutated_symbols.txt) as a network
2. Use the tab-separated dataset (neuroblastoma_mutated_annotations.txt) to map additional information
   1. Make sure the attributes have informative names
Attribute Table Files
Assignment 2.1 results
1. Load the list; use the same importer
   1. No interactions yet
2. Load the annotations
   1. Check text import settings
   2. Check mapping settings
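The two import steps above amount to creating one node per gene symbol and then joining a tab-separated table onto those nodes by a key column. The following is a minimal Python sketch of that idea, not Cytoscape's importer. The inline text stands in for the two assignment files, and the column names (`symbol`, `mutation_count`, `tumor_ids`) and all values are made-up placeholders, since the real file layout is not shown here.

```python
import csv
import io

# Placeholder stand-ins for neuroblastoma_mutated_symbols.txt and
# neuroblastoma_mutated_annotations.txt. Column names and values are
# assumptions for illustration only.
SYMBOLS_TEXT = "MYCN\nALK\nPHOX2B\n"
ANNOTATIONS_TEXT = (
    "symbol\tmutation_count\ttumor_ids\n"
    "MYCN\t3\tN1,N2,N3\n"
    "ALK\t2\tN1,N4\n"
)


def load_nodes(text):
    """Create one node per non-empty line; there are no edges yet."""
    return {line.strip(): {} for line in text.splitlines() if line.strip()}


def map_annotations(nodes, text, key_column="symbol"):
    """Attach each table row to the node whose ID equals the key column.

    Returns the keys that matched no node, so mapping problems are visible.
    """
    unmatched = []
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        key = row.pop(key_column)
        if key in nodes:
            nodes[key].update(row)
        else:
            unmatched.append(key)
    return unmatched


nodes = load_nodes(SYMBOLS_TEXT)
unmatched = map_annotations(nodes, ANNOTATIONS_TEXT)
print(nodes["MYCN"])
print(unmatched)
```

Choosing the wrong delimiter or key column is the usual reason for an empty mapping, which is why the results slide says to check both the text import settings and the mapping settings.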
Attribute management
- Select attributes for display
- Specific attribute tabs for nodes, edges, and network
- Node or edge ID
- Different types of attributes: strings, numbers, …
Tips & Tricks: Root Graph and sessions
“There is one graph to rule them all...”
- The networks in Cytoscape are all “views” on a single graph
- Changing an attribute of a node in one network also changes that attribute for the node with the same ID in all other loaded networks
- There is no way to “copy” a node and keep the same ID
- Make a copy of the session instead
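The behaviour described above can be illustrated with a toy model in Python. This is only a sketch of the concept, not Cytoscape's internal design: the class names `RootGraph` and `NetworkView` and the attribute name `mutation_count` are hypothetical.

```python
class RootGraph:
    """Single shared store: node ID -> {attribute name: value}."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, node_id, name, value):
        self.attributes.setdefault(node_id, {})[name] = value


class NetworkView:
    """A network is just a selection of node IDs on top of the root graph."""

    def __init__(self, root, node_ids):
        self.root = root
        self.node_ids = set(node_ids)

    def get_attribute(self, node_id, name):
        if node_id not in self.node_ids:
            raise KeyError(node_id)
        return self.root.attributes.get(node_id, {}).get(name)


root = RootGraph()
full_network = NetworkView(root, ["MYCN", "ALK"])
filtered_network = NetworkView(root, ["MYCN"])

# Setting the attribute once changes what both networks report,
# because neither network holds its own copy of the node.
root.set_attribute("MYCN", "mutation_count", 3)
print(full_network.get_attribute("MYCN", "mutation_count"))
print(filtered_network.get_attribute("MYCN", "mutation_count"))
```

In this model the only way to get an independent copy of a node's attributes is to duplicate the whole store, which corresponds to copying the session.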
Identifier mapping
- Identifiers (IDs) are ideally unique, stable names or numbers
- But: too many IDs and different database records for gene, DNA, RNA, protein
- The ID mapping challenge:
  - Avoid errors by mapping IDs correctly
  - Gene names are ambiguous
  - Excel introduces errors
  - Problems reaching 100% coverage
- Recommendations (for proteins and genes):
  - Map everything to Entrez Gene IDs using a spreadsheet
  - Manually curate missing mappings to achieve 100% coverage
  - Be careful of Excel auto-conversions
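Two of the checks recommended above are easy to automate before importing a list: spotting symbols that Excel has turned into dates (for example SEPT2 becoming "2-Sep" or MARCH1 becoming "1-Mar"), and measuring how many symbols actually have a mapping. The Python sketch below assumes the mapping is available as a plain dictionary; the IDs `ID_A` and `ID_B` are placeholders, not real Entrez Gene IDs.

```python
import re

# Matches strings such as "2-Sep" or "1-Mar", the form Excel gives to
# symbols like SEPT2 or MARCH1 when it auto-converts them to dates.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)$",
    re.IGNORECASE,
)


def find_date_like(symbols):
    """Return entries that look like Excel date conversions."""
    return [s for s in symbols if DATE_LIKE.match(s)]


def coverage(symbols, mapping):
    """Return (fraction mapped, list of unmapped symbols)."""
    missing = [s for s in symbols if s not in mapping]
    if not symbols:
        return 0.0, missing
    return (len(symbols) - len(missing)) / len(symbols), missing


# Toy input; the mapped IDs are placeholders for illustration.
symbols = ["MYCN", "ALK", "2-Sep", "1-Mar"]
mapping = {"MYCN": "ID_A", "ALK": "ID_B"}

suspicious = find_date_like(symbols)
fraction, missing = coverage(symbols, mapping)
print(suspicious)
print(fraction, missing)
```

The unmapped list is the starting point for the manual curation step: each entry is either a damaged symbol to repair or a genuine gap to look up by hand.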
Identifier mapping (Demo)
Use case: Assignment 2.2
1. Use the Biomart plugin to map UniProt identifiers onto the genes
Name Mapping
Assignment 2.2 results
- Use the Ensembl 68 set
- Input data type is HGNC symbols
- Import more than just the UniProt IDs
Data mapping
Mapping of data values associated with graph elements onto graph visuals
Data mapping
- Visual attributes
  - Node: fill color, border color, border width, size, shape, opacity, label
  - Edge: type, color, width, ending type, ending size, ending color
- Mapping types
  - Passthrough (labels)
  - Continuous (numeric values)
  - Discrete (categories)
- Visual style
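The three mapping types differ only in how a data value is turned into a visual value. A minimal Python sketch, assuming simple function forms that are not part of any Cytoscape API: passthrough copies the value, discrete looks it up in a table, and continuous interpolates linearly between two endpoints. The attribute values and the shape table are illustrative.

```python
def passthrough(value):
    """Passthrough mapping: show the data value itself, e.g. as a label."""
    return str(value)


def discrete(value, table, default=None):
    """Discrete mapping: one visual value per category."""
    return table.get(value, default)


def continuous(value, data_min, data_max, visual_min, visual_max):
    """Continuous mapping: linear interpolation, clamped to the data range."""
    if data_max == data_min:
        return visual_min
    t = (value - data_min) / (data_max - data_min)
    t = max(0.0, min(1.0, t))
    return visual_min + t * (visual_max - visual_min)


# Illustrative use: label from the gene symbol, shape from a category,
# node size from a mutation count between 1 and 3.
label = passthrough("MYCN")
shape = discrete("kinase", {"kinase": "diamond"}, default="ellipse")
size = continuous(2, 1, 3, 20.0, 60.0)
print(label, shape, size)
```

A visual style is then a named collection of such mappings, one per visual attribute, plus defaults for everything that is not mapped.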
VizMapper
- Default visual style editor
- List of data attributes
- List of visual styles
- List of visual attributes
- Mapping definition
Data mapping (Demo)
Tips & Tricks: Data mapping
- Avoid cluttering your visualization with too much data
  - Map the data you are specifically interested in to call out meaningful differences
  - Mapping too much data to visual attributes may just confuse the viewer
- Create multiple networks and map different values
Use case: Assignment 2.3
1. Map the size of the nodes to the number of occurrences
2. Map color to the tumor IDs
   1. Hint: use a rainbow pattern
Styles
Assignment 2.3 results
- Use gradient
  - Readily shows higher number of mutations
- Use rainbow
  - Similar names, similar colors
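One way to see why a rainbow gives "similar names, similar colors" is to sort the category names and spread them evenly around the hue circle, so that names adjacent in sort order receive adjacent hues. The Python sketch below is one possible construction using the standard `colorsys` module, not the algorithm Cytoscape uses; the tumor IDs `N1`, `N2`, `N3` are placeholders.

```python
import colorsys


def rainbow_palette(categories):
    """Assign evenly spaced hues to the sorted, de-duplicated categories."""
    ordered = sorted(set(categories))
    palette = {}
    for index, name in enumerate(ordered):
        hue = index / len(ordered)  # 0.0 is red; hues wrap around at 1.0
        red, green, blue = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        palette[name] = "#{:02X}{:02X}{:02X}".format(
            round(red * 255), round(green * 255), round(blue * 255)
        )
    return palette


# Placeholder tumor IDs for illustration.
palette = rainbow_palette(["N2", "N1", "N3", "N1"])
for tumor_id, color in palette.items():
    print(tumor_id, color)
```

The resulting dictionary has the same shape as a discrete mapping table: one color per category.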
Assignment 2.3 results
Exploring expression data
- VistaClara plugin
  - Exploratory data analysis of multi-experiment microarray studies
  - A graphical and interactive alternative to the standard attribute browser
VistaClara (Demo)
Filtering & editing data
- Use filters
  - QuickFind nodes and edges: index the network based on a node or edge attribute
  - Dynamic filtering for numerical attributes
  - Build complex filters using AND, OR, NOT relations
  - Define topological filters (which consider properties of nearby nodes)
- Create subnetworks
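The filter types listed above compose naturally as predicates on nodes. The Python sketch below is a conceptual illustration, not Cytoscape's filter API: all function names, the dictionary-based graph layout, the node names `A`, `B`, `C`, and the `mutation_count` values are assumptions made up for the example.

```python
def attr_gt(name, threshold):
    """Numeric filter: attribute value strictly above a threshold."""
    return lambda graph, node: graph["attrs"][node].get(name, 0) > threshold


def all_of(*preds):
    """AND: every predicate must hold."""
    return lambda graph, node: all(p(graph, node) for p in preds)


def any_of(*preds):
    """OR: at least one predicate must hold."""
    return lambda graph, node: any(p(graph, node) for p in preds)


def negate(pred):
    """NOT: invert a predicate."""
    return lambda graph, node: not pred(graph, node)


def has_neighbor(pred, minimum=1):
    """Topological filter: enough direct neighbours satisfy pred."""
    def check(graph, node):
        hits = sum(1 for other in graph["edges"].get(node, ()) if pred(graph, other))
        return hits >= minimum
    return check


def select(graph, pred):
    """Return the set of nodes passing the filter."""
    return {node for node in graph["attrs"] if pred(graph, node)}


def subnetwork(graph, keep):
    """New network with only the kept nodes and the edges among them."""
    return {
        "attrs": {n: dict(graph["attrs"][n]) for n in keep},
        "edges": {n: {m for m in graph["edges"].get(n, ()) if m in keep} for n in keep},
    }


# Toy network with placeholder node names and counts.
graph = {
    "attrs": {
        "A": {"mutation_count": 3},
        "B": {"mutation_count": 1},
        "C": {"mutation_count": 2},
    },
    "edges": {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}},
}

recurrent = attr_gt("mutation_count", 1)
recurrent_nodes = select(graph, recurrent)
# Nodes that are not recurrent themselves but touch a recurrent node.
bystander_neighbors = select(graph, all_of(negate(recurrent), has_neighbor(recurrent)))
recurrent_network = subnetwork(graph, recurrent_nodes)
most_mutated = max(graph["attrs"], key=lambda n: graph["attrs"][n]["mutation_count"])
print(sorted(recurrent_nodes), sorted(bystander_neighbors), most_mutated)
```

Building the subnetwork from the selected set, rather than deleting nodes in place, leaves the original network available for a different filter later.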
Filtering & editing data (Demo)
Use case: Assignment 2.4
1. Which gene has the most mutations?
2. Filter the network for genes with more than one mutation (why?) and save the new network.
3. Use the Bisogenet plugin
   1. ...to find interactions among these
   2. ...to find interactions among these and their first neighbours (or explore different settings of Bisogenet according to your taste)
4. Store your session for later use
Filtering Nodes and Edges
Assignment 2.4 results
1. MYCN
   1. Frequently amplified; no additional information
2. More likely not to be bystanders
3. Bisogenet
   1. Between: Only large genes
   2. Neighbours: Promising hairball
To be continued…
- Build, visualize and analyze your own network with Cytoscape
  - Network generation
  - Network annotation and visualization
  - Network analysis
    - Identify active subnetworks
    - Analyze Gene Ontology enrichment
    - Perform topological analysis
    - Find network clusters
    - Find network motifs