Information Visualization Using Graph Algorithms
Symeonidis Alkiviadis
Contents
- Preliminaries
- Gene clustering
- Graph extraction from biological data
- Graph visualization
- Open issues
- Discussion
Preliminaries
- Goal: visualize the clusters of genes produced by clustering over gene expressions
- Gene expression: the set of values a gene takes over a set of patients
Preliminaries
- Graph G(V, E): a set of vertices V, with a set of edges E joining pairs of vertices
- Each vertex represents a gene
- Each edge represents a strong correlation between two genes
- Clustering => groups of vertices
Gene clustering: correlation
- Compute Pearson's correlation coefficient r for every pair of genes
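As an illustration of this step, the following is a minimal pure-Python sketch. It assumes the expression data is a list of rows, one row of patient values per gene; the names `pearson`, `correlation_matrix` and `expr` are hypothetical and not taken from the slides.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A gene with constant expression has sx == 0 and would need special handling.
    return cov / (sx * sy)

def correlation_matrix(expr):
    """Symmetric matrix r with r[i][j] = correlation of gene i and gene j."""
    n = len(expr)
    r = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r[i][j] = r[j][i] = pearson(expr[i], expr[j])
    return r

# Made-up example: 3 genes measured over 4 patients.
expr = [
    [1.0, 2.0, 3.0, 4.0],  # gene 0
    [2.0, 4.0, 6.0, 8.0],  # gene 1: rises with gene 0
    [4.0, 3.0, 2.0, 1.0],  # gene 2: falls as gene 0 rises
]
r = correlation_matrix(expr)
```

Computing all pairs is quadratic in the number of genes, which matches the cost quoted for the clustering step.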
Gene clustering: greedy clustering
- For every unclassified gene x:
  - create a cluster that includes x
  - add all genes y whose correlation with x is > threshold
- Cost: O(|genes|^2)
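The greedy procedure can be sketched as follows. This is one possible reading, assuming that only genes not yet assigned to a cluster may be added and that genes are scanned in index order; the function name and the example matrix are hypothetical.

```python
def greedy_clusters(r, threshold):
    """Greedy clustering over a correlation matrix r (list of lists).

    Each still-unclassified gene x seeds a new cluster, which then absorbs
    every unclassified gene y with r[x][y] > threshold.
    """
    n = len(r)
    classified = [False] * n
    clusters = []
    for x in range(n):
        if classified[x]:
            continue
        classified[x] = True
        members = [x]
        # Every gene before x is already classified, so scan only y > x.
        for y in range(x + 1, n):
            if not classified[y] and r[x][y] > threshold:
                classified[y] = True
                members.append(y)
        clusters.append(members)
    return clusters

# Made-up correlation matrix for 4 genes.
r = [
    [1.0, 0.9, 0.1, 0.8],
    [0.9, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.1],
    [0.8, 0.3, 0.1, 1.0],
]
clusters = greedy_clusters(r, threshold=0.7)
```

In this example genes 1 and 3 end up together because each correlates with the seed gene 0, although their mutual correlation is only 0.3. That is a property of the greedy scheme rather than of this sketch.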
Graph extraction from biological data
- Genes → vertices ✓
- Clusters → groups ✓
- Edges ?
Graph extraction from biological data: in-cluster relation
- Compute the mean value of the correlation coefficients for all genes in a cluster
- All pairs of genes with correlation higher than threshold * mean are considered highly correlated
- Edge meaning: (very) strong correlation
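A minimal sketch of the in-cluster rule, assuming the mean is taken over all distinct pairs of genes in the cluster. The multiplier that the slide calls "threshold" is named `factor` here; that name, the function name and the example data are hypothetical.

```python
from itertools import combinations

def in_cluster_edges(r, cluster, factor):
    """Edges between genes of one cluster whose correlation exceeds factor * mean."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return []  # a single-gene cluster has no pairs and so no edges
    mean = sum(r[a][b] for a, b in pairs) / len(pairs)
    return [(a, b) for a, b in pairs if r[a][b] > factor * mean]

# Made-up correlation matrix for 4 genes; genes 0, 1 and 3 form one cluster.
r = [
    [1.0, 0.9, 0.1, 0.8],
    [0.9, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.1],
    [0.8, 0.3, 0.1, 1.0],
]
edges = in_cluster_edges(r, [0, 1, 3], factor=1.0)
```

With `factor=1.0` the cutoff is the cluster mean itself (about 0.67 here), so the weakly correlated pair (1, 3) gets no edge.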
Graph extraction from biological data: inter-cluster relation
- Compute the mean value of the correlation coefficients for each cluster (mean1, mean2)
- All pairs of genes, one from each cluster, with correlation higher than threshold * (mean1 + mean2) / 2 are considered highly correlated
- Edge meaning: possibly wrong classification
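The inter-cluster rule can be sketched in the same way. The treatment of single-gene clusters (mean taken as 1.0) is an assumption made only so the example runs; the slides do not say how that case is handled. All names and data are hypothetical.

```python
from itertools import combinations

def cluster_mean(r, cluster):
    """Mean correlation over all distinct gene pairs of a cluster."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0  # assumption: a single-gene cluster is treated as mean 1.0
    return sum(r[a][b] for a, b in pairs) / len(pairs)

def inter_cluster_edges(r, c1, c2, factor):
    """Cross-cluster pairs whose correlation exceeds factor * (mean1 + mean2) / 2."""
    cutoff = factor * (cluster_mean(r, c1) + cluster_mean(r, c2)) / 2
    return [(a, b) for a in c1 for b in c2 if r[a][b] > cutoff]

# Made-up correlation matrix: clusters {0, 1} and {2, 3}.
r = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.6, 0.3],
    [0.1, 0.6, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]
cross = inter_cluster_edges(r, [0, 1], [2, 3], factor=0.5)
```

Here the two cluster means are 0.9 and 0.7, so the cutoff is 0.5 * 0.8 = 0.4 and only the pair (1, 2) is flagged as a possible misclassification.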
Graph extraction from biological data
- Genes → vertices ✓
- Clusters → groups ✓
- Edges ✓: all highly correlated pairs of genes
Graph visualization
- Gene → vertex → circle
- High correlation → edge → line
- Cluster → group → circle with its genes (vertices) on its periphery
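Placing a group's vertices on the periphery of its circle amounts to spacing them at equal angles. A minimal sketch, with a hypothetical function name and gene labels:

```python
import math

def circle_positions(vertices, center=(0.0, 0.0), radius=1.0):
    """Map each vertex, in the given order, to a point on the group's circle."""
    n = len(vertices)
    cx, cy = center
    return {
        v: (cx + radius * math.cos(2 * math.pi * i / n),
            cy + radius * math.sin(2 * math.pi * i / n))
        for i, v in enumerate(vertices)
    }

layout = circle_positions(["g1", "g2", "g3", "g4"])
```

The order of the input list is the circular order of the vertices, which is why the later steps concentrate on choosing that order well.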
Graph visualization
- Place groups
- Determine ordering of vertices in group
- Try to reduce crossings
Graph visualization: placing groups
- Force-directed method over groups
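The slides do not say which force-directed variant is used. As a rough sketch, the code below treats each group as a single node and applies a basic spring embedder with Fruchterman-Reingold style forces, where the attraction between two groups is weighted by the number of edges between them. The force model, all parameter values and the function name are assumptions.

```python
import math

def place_groups(n_groups, links, iterations=200, k=1.0, step=0.05):
    """Position groups with a basic spring embedder.

    links maps a pair (i, j) with i < j to the number of inter-group edges.
    Returns a list of [x, y] positions, one per group.
    """
    # Deterministic start: groups evenly spaced on the unit circle.
    pos = [[math.cos(2 * math.pi * i / n_groups),
            math.sin(2 * math.pi * i / n_groups)] for i in range(n_groups)]
    for _ in range(iterations):
        disp = [[0.0, 0.0] for _ in range(n_groups)]
        for i in range(n_groups):
            for j in range(i + 1, n_groups):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9
                # Repulsion k^2/d between every pair, attraction w*d^2/k on links.
                force = k * k / d - links.get((i, j), 0) * d * d / k
                disp[i][0] += force * dx / d
                disp[i][1] += force * dy / d
                disp[j][0] -= force * dx / d
                disp[j][1] -= force * dy / d
        for i in range(n_groups):
            length = math.hypot(disp[i][0], disp[i][1])
            if length > 0:
                scale = min(length, step) / length  # cap the move per iteration
                pos[i][0] += disp[i][0] * scale
                pos[i][1] += disp[i][1] * scale
    return pos

# Groups 0 and 1 share three edges; group 2 is unrelated to both.
pos = place_groups(3, {(0, 1): 3})
d01 = math.hypot(pos[0][0] - pos[1][0], pos[0][1] - pos[1][1])
d02 = math.hypot(pos[0][0] - pos[2][0], pos[0][1] - pos[2][1])
```

Linked groups settle close together while unrelated groups are pushed apart. A real drawing would also have to keep group circles from overlapping, which this sketch ignores.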
Graph visualization: determine ordering of vertices in group (tree)
- Tree: order the vertices by depth-first search discovery time
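For a group whose graph is a tree, the ordering can be sketched as a preorder traversal. The adjacency-list representation and the choice of root are assumptions; the slides do not specify either.

```python
def dfs_order(adj, root):
    """Vertices of a tree in depth-first discovery order.

    adj maps each vertex to the list of its neighbors. For a tree, marking
    a vertex when it is pushed yields the same order as a recursive DFS.
    """
    order = []
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        # Push neighbors in reverse so the first listed neighbor is visited first.
        for v in reversed(adj[u]):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return order

# Made-up tree: 0 has children 1 and 2; 1 has children 3 and 4.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
order = dfs_order(tree, root=0)
```

Each subtree occupies a contiguous run in this order, so when the vertices are placed around the circle in that order the tree edges can be drawn without crossing each other.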
Graph visualization: determine ordering of vertices in group (biconnected)
- Biconnected graph: remains connected after removing any one vertex (or edge)
Graph visualization: determine ordering of vertices in group (biconnected)
- For every node u:
  - identify triangles (u, v, w), or create them
  - store (v, w)
  - remove u
[Figure: vertex u with neighbors v and w]
Graph visualization: determine ordering of vertices in group (biconnected)
- Restore the graph
- Remove all stored edges
- Perform DFS, compute the longest path and place it
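The slide does not define "longest path" precisely. The sketch below assumes it means the longest root-to-leaf path in the DFS tree, and covers only that last step; the earlier triangle and edge-removal steps are not modeled. The function name and example graph are hypothetical.

```python
def longest_dfs_path(adj, start):
    """Longest root-to-leaf path in the DFS tree rooted at start.

    adj maps each vertex to the list of its neighbors. Recursive, so very
    large graphs may need a higher recursion limit or an iterative rewrite.
    """
    parent = {start: None}
    depth = {start: 0}

    def visit(u):
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                depth[v] = depth[u] + 1
                visit(v)

    visit(start)
    # Walk back from the deepest vertex to the root.
    node = max(depth, key=depth.get)
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

# Made-up biconnected graph on 4 vertices.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
path = longest_dfs_path(graph, start=0)
```

The vertices of this path would be placed consecutively on the circle; any vertices not on it are handled by the placement step that follows.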
Graph visualization: determine ordering of vertices in group (biconnected)
- Place any remaining vertices:
  - next to 2 neighbors
  - next to 1 neighbor
  - next to 0 neighbors
Graph visualization: determine ordering of vertices in group (non-biconnected)
- Non-biconnected graph … under development
- There is a vertex whose removal disconnects the graph
- Decompose into biconnected components: biconnected subgraphs
- Get articulation points: vertices responsible for non-biconnectivity
Graph visualization: determine ordering of vertices in group (non-biconnected)
- Articulation points + biconnected components → block-cut-point tree
- DFS on the block-cut-point tree => relative ordering of components
- For each biconnected component, act as before
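The decomposition can be sketched with the standard DFS low-point technique (Hopcroft-Tarjan). This is a textbook method chosen for illustration, not necessarily the implementation behind the slides. It assumes a connected simple graph given as an adjacency dict; all names are hypothetical.

```python
def biconnected_components(adj):
    """Return (blocks, cut_vertices) of a connected simple undirected graph."""
    disc, low = {}, {}
    blocks, cuts, edge_stack = [], set(), []

    def visit(u, parent):
        disc[u] = low[u] = len(disc)
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                edge_stack.append((u, v))
                visit(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= disc[u]:
                    # Nothing below v reaches above u: close one block.
                    if parent is not None or children > 1:
                        cuts.add(u)
                    block = set()
                    while True:
                        a, b = edge_stack.pop()
                        block.update((a, b))
                        if (a, b) == (u, v):
                            break
                    blocks.append(block)
            elif v != parent and disc[v] < disc[u]:
                edge_stack.append((u, v))  # back edge to an ancestor
                low[u] = min(low[u], disc[v])

    visit(next(iter(adj)), None)
    return blocks, cuts

def block_cut_tree(adj):
    """Tree edges (cut_vertex, block_index) linking cut vertices to their blocks."""
    blocks, cuts = biconnected_components(adj)
    tree = [(c, i) for i, block in enumerate(blocks)
            for c in sorted(cuts) if c in block]
    return blocks, cuts, tree

# Made-up graph: two triangles sharing vertex 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3, 4], 3: [2, 4], 4: [2, 3]}
blocks, cuts, tree = block_cut_tree(graph)
```

In the example, vertex 2 is the only articulation point and the block-cut-point tree joins it to both triangles. A DFS over that tree would then fix the relative order of the components.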
Graph visualization: cost of determining ordering of vertices in group
- Tree: DFS, O(|E| + |V|) = O(|E|)
- Biconnected graph: dominated by DFS, O(|E|)
- Non-biconnected graph: dominated by extracting the block-cut-point tree, O(|E|)
Graph visualization … until now
- Determine groups' positions ✓
- Determine vertices' ordering ✓
Graph visualization
- Place groups ✓
- Determine ordering in group ✓
- Try to reduce crossings
Graph visualization: reduce crossings
- Spin groups, trying to minimize energy
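The energy function is not given on the slide. The sketch below assumes one plausible choice: the sum of squared lengths of a group's inter-group edges. It tries a fixed set of rotation angles for one group and keeps the best. The energy definition, the discretization into `steps` angles and all names are assumptions.

```python
import math

def spin_group(n, center, radius, anchors, steps=360):
    """Rotation angle (radians) minimizing the group's inter-group edge energy.

    n: number of vertices on the group's circle.
    anchors: list of (index, (x, y)) pairs, one per inter-group edge, giving
    the position of the vertex in this group's ordering and the coordinates
    of the edge's other endpoint.
    """
    def energy(angle):
        total = 0.0
        for index, (ox, oy) in anchors:
            t = angle + 2 * math.pi * index / n
            x = center[0] + radius * math.cos(t)
            y = center[1] + radius * math.sin(t)
            total += (x - ox) ** 2 + (y - oy) ** 2  # squared edge length
        return total

    candidates = [2 * math.pi * s / steps for s in range(steps)]
    return min(candidates, key=energy)

# Vertex 0 of a 4-vertex group is linked to a point far to the left.
angle = spin_group(4, center=(0.0, 0.0), radius=1.0, anchors=[(0, (-5.0, 0.0))])
```

Rotating the group so that its linked vertex faces the neighboring group shortens the edge, which tends to remove crossings through the group's own circle. Shorter edges are used here only as a proxy; the sketch does not count crossings directly.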
Graph visualization: edge coloring
- Each edge is assigned a weight: weight(x_node, y_node) = r(x_gene, y_gene)
- The color of each edge reflects its weight: brighter color → stronger correlation
- In-group edges have a different color than inter-group edges
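One way to realize this coloring is to map the correlation to the brightness of a fixed hue. The specific hues (red for in-group, blue for inter-group) and the clamping to [0, 1] are assumptions made for illustration; the slides state only that the two edge kinds differ in color.

```python
def edge_color(r, in_group):
    """RGB triple for an edge of weight r; brighter means stronger correlation."""
    level = int(round(255 * max(0.0, min(1.0, r))))  # clamp r to [0, 1]
    # Assumed hues: red for in-group edges, blue for inter-group edges.
    return (level, 0, 0) if in_group else (0, 0, level)

strong = edge_color(0.9, in_group=True)
weak = edge_color(0.6, in_group=True)
cross = edge_color(1.0, in_group=False)
```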
Graph visualization Overall Initially…
Graph visualization overall Finally…
Open issues
- Clustering
- Edge translation
- Visualize large data sets
  - Zoom
  - Layered drawing
  - Scrollbars