The Relative Vertex-to-Vertex Clustering Value 1 A New Criterion for the Fast Detection of Functional Modules in Protein Interaction Networks Zina Mohamed.


Slide 1: The Relative Vertex-to-Vertex Clustering Value: A New Criterion for the Fast Detection of Functional Modules in Protein Interaction Networks
Zina Mohamed Ibrahim (King’s College, London, UK)
Alioune Ngom (University of Windsor, Windsor, Canada)

Slide 2: Protein Complexes and Functional Modules
• Protein complex: proteins interacting with each other at the same time and place [Spirin et al. 2004]
• Functional module: set of proteins involved in a common elementary biological function
  - Proteins bind each other at different times and places
  - May include multiple protein complexes [Chen et al.]

Slide 3: Identification of Functional Modules
• Protein Interaction Networks (PINs)
• Functional modules correspond to highly connected subgraphs in a PIN
• Many graph clustering approaches
  - Clique-based methods: strict and not scalable to large PINs
  - Density-based methods: issues with low-degree nodes and low topological connectivity
  - Hierarchical methods
    - Hierarchical organization of the modules within PINs
    - Global metric: not scalable to large PINs
    - Local metric: common misclassification of low-degree nodes
    - Poor performance on noisy PINs, i.e., PINs with false-positive interactions

Slide 4: Graph Clustering
• Find non-overlapping communities in PINs

Slide 5: Hierarchical Methods -- Related Works
• Divisive approaches
• Iteratively remove an edge with the
  - Highest edge betweenness score: CNM method [Clauset et al. 2004], O(m h log n)
  - Lowest edge clustering coefficient: Radicchi method [Radicchi et al. 2004], O(m²)
• These are global measures

Slide 6: Hierarchical Methods -- Related Works
• Agglomerative approaches
• Iteratively merge two clusters C_u and C_v
• Edge Clustering Value: [formula not preserved in the transcript]
• Local similarity metric between nodes
• HC-PIN algorithm [Wang et al. 2011]

Slide 7: Our New Criterion – Un-weighted PINs
• Relative Vertex-to-Vertex Clustering Value: [formula not preserved in the transcript]
• 0 ≤ R(u → v) ≤ 100
• Likelihood of u being in v’s cluster
• Not the likelihood that both u and v lie in the same cluster
• Local similarity pre-metric
• Principle of preferential attachment in scale-free networks
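The formula for R(u → v) is not preserved in this transcript. Purely as an illustration, the sketch below assumes R(u → v) is the percentage of u's closed neighborhood (u plus its neighbors) that also lies in v's closed neighborhood. This form is an assumption, chosen only because it is asymmetric and bounded by 0 and 100 as the slide requires; the function name and the adjacency-dictionary representation are hypothetical.

```python
def relative_clustering_value(adj, u, v):
    """Assumed form (not confirmed by the slides):
    100 * |N[u] & N[v]| / |N[u]|, where N[x] is x together with its neighbors."""
    closed_u = adj[u] | {u}
    closed_v = adj[v] | {v}
    return 100.0 * len(closed_u & closed_v) / len(closed_u)


# Toy network: a low-degree node "a" attached to a hub "b" that sits in a triangle.
adj = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c"},
}

print(relative_clustering_value(adj, "a", "b"))  # 100.0
print(relative_clustering_value(adj, "b", "a"))  # 50.0
```

Under this assumed form the score is directional: the low-degree node a lies entirely inside the hub's neighborhood, while only half of the hub's neighborhood is shared with a.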

Slide 8: Our New Criterion – Weighted PINs
• [Formula not preserved in the transcript]
• where w(x, y) is the weight on interaction edge (x, y)

Slide 9: FAC-PIN Algorithm – Test for Inclusion
• Insert u into C_v whenever
  1. R(u → v) = [value not preserved in the transcript]
  2. R(u → v) > R(v → u)
  3. R(u → v) = R(v → u) and either
     a. R(u → v) = R(v → u) = 100, or
     b. R(u → v) > 50
• That is, whenever R(u → v) > 50μ and R(u → v) ≥ R(v → u)
• Algorithm: for each v, iteratively insert its neighbors u into C_v whenever the test is true for u
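The condensed test on this slide can be written as a small predicate. This is a minimal sketch: the function name is hypothetical, and the two R values are taken as inputs so that nothing is assumed about how R itself is computed.

```python
def passes_inclusion_test(r_uv: float, r_vu: float, mu: float) -> bool:
    """Return True when u should be inserted into v's cluster.

    r_uv is R(u -> v) and r_vu is R(v -> u), both on the 0..100 scale;
    mu is the merging parameter.
    """
    return r_uv > 50.0 * mu and r_uv >= r_vu


print(passes_inclusion_test(100.0, 50.0, mu=0.5))  # True
print(passes_inclusion_test(50.0, 100.0, mu=0.5))  # False: R(u -> v) < R(v -> u)
print(passes_inclusion_test(20.0, 20.0, mu=0.5))   # False: 20 is not above 50 * 0.5
```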

Slide 10: FAC-PIN Algorithm - Clustering
• Initialization phase
  - Form singleton cluster C(v) for each v
• Community detection phase
  - For each v, include each neighbor u into C(v) whenever [ R_w(u → v) > 50μ and R_w(u → v) ≥ R_w(v → u) ] is true, with merging parameter 0 ≤ μ < 2
• Partition computation phase
  - Obtain the induced subgraph of G for each C(v) as a sub-network cluster
• Evaluation phase
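The initialization, community detection, and induced-subgraph steps listed on this slide might be sketched as follows. This is not the authors' implementation: the scoring function is passed in as a parameter, and `overlap_score` is a hypothetical stand-in for R_w, whose formula is not preserved here. The transcript also does not say how the sets C(v) are reconciled into a non-overlapping partition, so the sketch stops at one candidate set per vertex.

```python
from typing import Callable, Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def detect_clusters(adj: Graph,
                    score: Callable[[Graph, Hashable, Hashable], float],
                    mu: float) -> Dict[Hashable, Set[Hashable]]:
    # Initialization phase: one singleton cluster per vertex.
    clusters = {v: {v} for v in adj}
    # Community detection phase: apply the merging test to every neighbor of v.
    for v in adj:
        for u in adj[v]:
            r_uv = score(adj, u, v)
            r_vu = score(adj, v, u)
            if r_uv > 50.0 * mu and r_uv >= r_vu:
                clusters[v].add(u)
    return clusters


def induced_subgraph(adj: Graph, nodes: Set[Hashable]) -> Graph:
    # Keep only the edges whose endpoints both lie in `nodes`.
    return {x: adj[x] & nodes for x in nodes}


def overlap_score(adj: Graph, u: Hashable, v: Hashable) -> float:
    # Hypothetical stand-in for R(u -> v); the real formula is an open assumption.
    closed_u, closed_v = adj[u] | {u}, adj[v] | {v}
    return 100.0 * len(closed_u & closed_v) / len(closed_u)


adj = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
clusters = detect_clusters(adj, overlap_score, mu=0.5)
print(clusters["b"])  # the hub collects all of its neighbors
print(induced_subgraph(adj, clusters["c"]))
```

With the stand-in score, the low-degree vertex a keeps its singleton set while the hub b absorbs every neighbor, which is the direction of attachment the criterion is meant to favor.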

Slide 11: FAC-PIN Algorithm - Clustering
[Figure not preserved in the transcript]

Slide 12: Computational Complexities
• Given n nodes and m edges
• CNM algorithm: O(m h log n), where h = height
• Radicchi algorithm: O(m²)
• HC-PIN algorithm: O(m δ²)
• FAC-PIN algorithm: O(n δ²) << O(n D²)
• δ = average degree and D = maximum degree

Slide 13: Computational Experiments
For any given PIN:
1. Apply FAC-PIN with merging parameters μ
2. Evaluate modularity of the resulting partitions P_{k,μ}
   - Three modularity functions
3. P_k = best P_{k,μ}
4. Execution time to obtain P_{k,μ}
5. Functional enrichment validations with SGD GO
   - P-value cutoff = 0.05
   - Retain significant clusters and the number of significant clusters
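Steps 1 to 4 of this protocol amount to a parameter sweep. The sketch below shows one way it could be organized; the clustering and quality functions are supplied by the caller, and the toy stand-ins used in the demonstration are hypothetical placeholders, not FAC-PIN or any of the three modularity functions.

```python
import time
from typing import Callable, Iterable, List, Set, Tuple

Partition = List[Set[str]]
Row = Tuple[float, float, float, Partition]  # (mu, quality, seconds, partition)


def sweep_merging_parameter(cluster_fn: Callable[[float], Partition],
                            quality_fn: Callable[[Partition], float],
                            mus: Iterable[float]) -> List[Row]:
    rows = []
    for mu in mus:
        start = time.perf_counter()
        partition = cluster_fn(mu)                 # step 1: cluster with this mu
        seconds = time.perf_counter() - start      # step 4: execution time
        rows.append((mu, quality_fn(partition), seconds, partition))  # step 2
    if not rows:
        raise ValueError("mus must contain at least one value")
    return rows


def best_row(rows: List[Row]) -> Row:
    # Step 3: keep the partition with the highest quality score.
    return max(rows, key=lambda row: row[1])


# Toy stand-ins (hypothetical): fixed partitions per mu and a made-up quality score.
toy_partitions = {
    0.5: [{"a", "b", "c", "d"}],
    1.0: [{"a", "b"}, {"c", "d"}],
    1.5: [{"a"}, {"b"}, {"c"}, {"d"}],
}
rows = sweep_merging_parameter(lambda mu: toy_partitions[mu],
                               lambda p: -abs(len(p) - 2),
                               [0.5, 1.0, 1.5])
best_mu, best_quality, _, best_partition = best_row(rows)
print(best_mu, best_quality, len(best_partition))
```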

Slide 14: Data Sets
• 8 un-weighted PIN data sets from the REACTOME database
  - Including PIN data of S. cerevisiae (yeast SC-1): 5697 proteins, [number missing] interactions
• 1 un-weighted PIN and the corresponding weighted PIN data of S. cerevisiae (yeast SC-2) from the DIP database
  - 4726 proteins
  - [number missing] interactions
• Protein complexes from the MIPS database

Slide 15: Results – Effect of Merging Parameter μ (SC-2; 4726 proteins and [number missing] interactions)
• Recall: merging test = [ R_w(u → v) > 50μ and R_w(u → v) ≥ R_w(v → u) ]
• Fewer neighbors are merged with v as μ increases, hence the number of clusters k increases with μ

Slide 16: Results – Execution Times in Seconds (PINs from Reactome database; μ = 0.5)
[Table not preserved in the transcript]

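Slide 17: Results – Modularity Functions
• Function Q: [formula not preserved in the transcript]
• Function Ω: [formula not preserved in the transcript]
• Function D: [formula not preserved in the transcript]
• where w(u, v) = 0 or 1 for un-weighted PINs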
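None of the three formulas is preserved in this transcript. As a hedged illustration of what a modularity function computes, the sketch below implements the standard weighted Newman-Girvan modularity, on the assumption (not confirmed by the slides) that Q refers to it. Ω and D are not attempted. The function name and the edge-dictionary input format are hypothetical.

```python
from typing import Dict, Hashable, List, Set, Tuple


def newman_modularity(edges: Dict[Tuple[Hashable, Hashable], float],
                      partition: List[Set[Hashable]]) -> float:
    """Q = sum over communities c of [ W_in(c) / W - (deg(c) / (2 W))**2 ].

    `edges` maps each undirected edge (u, v), listed once, to its weight w(u, v);
    use weight 1 for every edge of an un-weighted network.
    """
    total = sum(edges.values())
    if total == 0:
        return 0.0
    community_of = {node: i for i, members in enumerate(partition) for node in members}
    internal = [0.0] * len(partition)  # weight of edges inside each community
    degree = [0.0] * len(partition)    # summed weighted degree of each community
    for (u, v), w in edges.items():
        cu, cv = community_of[u], community_of[v]
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            internal[cu] += w
    return sum(internal[i] / total - (degree[i] / (2.0 * total)) ** 2
               for i in range(len(partition)))


# Two triangles joined by a single edge, all weights equal to 1.
edges = {(1, 2): 1, (2, 3): 1, (1, 3): 1, (4, 5): 1, (5, 6): 1, (4, 6): 1, (3, 4): 1}
print(newman_modularity(edges, [{1, 2, 3}, {4, 5, 6}]))    # 5/14, about 0.357
print(newman_modularity(edges, [{1, 2, 3, 4, 5, 6}]))      # 0.0
```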

Slide 18: Results – Modularity of FAC-PIN Partitions (PINs from Reactome database; μ = 0.5)
[Table not preserved in the transcript; column headers Q_w, Ω_w, D_w, repeated twice]

Slide 19: Functional Module Prediction
• Recall indicates how effectively proteins with the same functional category in the network are extracted
• Precision indicates how consistently proteins in the same module are annotated
• The f-measure is used to evaluate the overall performance
• The average f-measure is taken as the accuracy of the algorithms
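These measures can be made concrete with a short sketch. It assumes the usual set-based definitions for a predicted module M and a functional category F: precision = |M ∩ F| / |M|, recall = |M ∩ F| / |F|, and f-measure as their harmonic mean. Matching each module to its best-scoring category before averaging is also an assumption; the slides do not state the matching rule. All names are hypothetical.

```python
from typing import List, Set


def f_measure(module: Set[str], category: Set[str]) -> float:
    overlap = len(module & category)
    if overlap == 0:
        return 0.0
    precision = overlap / len(module)   # how consistently the module is annotated
    recall = overlap / len(category)    # how much of the category is recovered
    return 2.0 * precision * recall / (precision + recall)


def average_f_measure(modules: List[Set[str]], categories: List[Set[str]]) -> float:
    # Assumed matching rule: score each module against its best-matching category.
    if not modules or not categories:
        return 0.0
    best = [max(f_measure(m, c) for c in categories) for m in modules]
    return sum(best) / len(best)


module = {"p1", "p2", "p3", "p4"}
category = {"p3", "p4", "p5"}
print(f_measure(module, category))  # precision 1/2, recall 2/3, f-measure 4/7
```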

Slide 20: Functional Enrichment of FAC-PIN Modules
• Hypergeometric distribution…
• …
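The slide's details are elided, so the following is only a sketch of the standard hypergeometric enrichment test commonly used for this kind of validation. It computes the probability of seeing at least the observed number of annotated proteins in a module by chance. The function and parameter names are hypothetical, and the exact formulation used in the talk is not confirmed by the transcript.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       module_size: int, overlap: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, module_size).

    population:  number of proteins in the network
    annotated:   number of those proteins carrying the functional annotation
    module_size: number of proteins in the module
    overlap:     number of module proteins carrying the annotation
    """
    upper = min(module_size, annotated)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(population, module_size)


# 10 proteins, 4 annotated; a module of 3 proteins contains 2 annotated ones.
p = enrichment_p_value(population=10, annotated=4, module_size=3, overlap=2)
print(p)          # (6*6 + 4*1) / 120 = 1/3
print(p < 0.05)   # False: not significant at the 0.05 cutoff
```

A module would be retained as significant when this value falls below the P-value cutoff of 0.05 given in the experimental setup.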

Slide 21: Results – Functional Enrichment Validations (Un-weighted SC-1; 5697 proteins and [number missing] interactions; μ = 0.5)
[Table not preserved in the transcript]