The Relative Vertex-to-Vertex Clustering Value: A New Criterion for the Fast Detection of Functional Modules in Protein Interaction Networks


1 The Relative Vertex-to-Vertex Clustering Value: A New Criterion for the Fast Detection of Functional Modules in Protein Interaction Networks
Zina Mohamed Ibrahim (King’s College, London, UK)
Alioune Ngom (University of Windsor, Windsor, Canada)

2 Protein Complexes and Functional Modules
• Protein complex: proteins interacting with each other at the same time and place [Spirin et al. 2004]
• Functional module: set of proteins involved in a common elementary biological function
  - Proteins bind each other at different times and places
  - May comprise multiple protein complexes [Chen et al. 2005]

3 Identification of Functional Modules
• Protein Interaction Networks (PINs)
• Functional modules correspond to highly connected subgraphs in a PIN
• Many graph clustering approaches
  - Clique-based methods: strict and not scalable to large PINs
  - Density-based methods: issues with low-degree nodes and low topological connectivity
  - Hierarchical methods
    - Hierarchical organization of the modules within PINs
    - Global metric: not scalable to large PINs
    - Local metric: common misclassification of low-degree nodes
    - Poor performance on noisy PINs, i.e., PINs with false-positive interactions

4 Graph Clustering
Find non-overlapping communities in PINs

5 Hierarchical Methods -- Related Works
• Divisive approaches
• Iteratively remove an edge with the
  - Highest edge betweenness score: CNM method [Clauset et al. 2004], O(m h log n)
  - Lowest edge clustering coefficient: Radicchi method [Radicchi et al. 2004], O(m^2)
• These are global measures

6 Hierarchical Methods -- Related Works
• Agglomerative approaches
• Iteratively merge two clusters C_u and C_v
• Edge Clustering Value: [formula not preserved]
• Local similarity metric between nodes
• HC-PIN algorithm [Wang et al. 2011]

7 Our New Criterion – Unweighted PINs
• Relative Vertex-to-Vertex Clustering Value: [formula not preserved]
• 0 ≤ R(u → v) ≤ 100
• Likelihood of u being in v’s cluster
• Not the likelihood that u and v both lie in the same cluster
• Local similarity pre-metric
• Principle of preferential attachment in scale-free networks
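The formula for R(u → v) did not survive in this transcript. As a rough illustration only, the sketch below assumes R(u → v) is the percentage of u's closed neighborhood N[u] (u plus its neighbors) that is shared with N[v]. This assumed definition is not confirmed by the slides. It was chosen only because it respects the stated properties: it lies in [0, 100], it is asymmetric, and it favors attaching low-degree nodes to high-degree ones. The names `closed_neighborhood`, `rvcv`, and `adj` are hypothetical.

```python
def closed_neighborhood(adj, u):
    """Return N[u]: the neighbors of u together with u itself."""
    return set(adj[u]) | {u}


def rvcv(adj, u, v):
    """Assumed R(u -> v): percentage of N[u] that is shared with N[v].

    This definition is an assumption made for illustration; the slide's
    own formula is not available in the transcript.
    """
    n_u = closed_neighborhood(adj, u)
    n_v = closed_neighborhood(adj, v)
    return 100.0 * len(n_u & n_v) / len(n_u)


# Toy star network: hub "h" with three leaves.
adj = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}

print(rvcv(adj, "a", "h"))  # leaf -> hub
print(rvcv(adj, "h", "a"))  # hub -> leaf
```

Under this assumption the leaf-to-hub value is larger than the hub-to-leaf value, which matches the slide's reading of R(u → v) as the likelihood of u belonging to v's cluster rather than a symmetric similarity.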

8 Our New Criterion – Weighted PINs
[formula not preserved]
where w(x, y) is the weight on interaction edge (x, y)

9 FAC-PIN Algorithm – Test for Inclusion
• Insert u into C_v whenever
  1. R(u → v) = 100
  2. R(u → v) > R(v → u)
  3. R(u → v) = R(v → u) and either
     a. R(u → v) = R(v → u) = 100, or
     b. R(u → v) > 50
• That is, whenever R(u → v) > 50μ and R(u → v) ≥ R(v → u)
• Algorithm: for each v, iteratively insert its neighbors u into C_v whenever the test is true for u
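The combined condition on this slide can be written as a small predicate. The sketch below implements only the summary form, R(u → v) > 50μ and R(u → v) ≥ R(v → u), and takes the two R values as plain numbers. The function name `should_insert` and the default μ = 1 are illustrative choices, not taken from the slides.

```python
def should_insert(r_uv, r_vu, mu=1.0):
    """Combined inclusion test: R(u -> v) > 50*mu and R(u -> v) >= R(v -> u)."""
    return r_uv > 50.0 * mu and r_uv >= r_vu


print(should_insert(100.0, 50.0))          # u fully attached to v
print(should_insert(50.0, 50.0))           # tie exactly at the threshold
print(should_insert(40.0, 10.0, mu=0.5))   # lower mu relaxes the threshold
```

With μ = 1 the threshold is 50. Smaller values of μ lower it and larger values raise it.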

10 FAC-PIN Algorithm - Clustering
• Initialization phase
  - Form a singleton cluster C(v) for each v
• Community detection phase
  - For each v, include each neighbor u in C(v) whenever [ R_w(u → v) > 50μ and R_w(u → v) ≥ R_w(v → u) ] is true, with merging parameter 0 ≤ μ < 2
• Partition computation phase
  - Obtain the induced subgraph of G for each C(v) as a subnetwork cluster
• Evaluation phase
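The first three phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The slides do not say how clusters are kept non-overlapping, so a union-find structure is assumed here. The scoring function is passed in as a parameter, and `toy_score` is a hypothetical stand-in for R(u → v). The names `fac_pin_sketch`, `adj`, and `toy_score` are all illustrative.

```python
def fac_pin_sketch(adj, score, mu=1.0):
    """Cluster a graph given as {node: set(neighbors)} with a score(u, v) function."""
    # Initialization phase: every node starts in its own singleton cluster.
    parent = {v: v for v in adj}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Community detection phase: apply the inclusion test to each neighbor u of v.
    for v in adj:
        for u in adj[v]:
            r_uv, r_vu = score(u, v), score(v, u)
            if r_uv > 50.0 * mu and r_uv >= r_vu:
                # Assumption: merging u's cluster into v's keeps clusters disjoint.
                parent[find(u)] = find(v)

    # Partition computation phase: collect the node set of each cluster.
    clusters = {}
    for v in adj:
        clusters.setdefault(find(v), set()).add(v)
    return list(clusters.values())


# Toy network: two hubs (h1, h2) joined by one edge, each with two leaves.
adj = {
    "h1": {"a", "b", "h2"}, "h2": {"c", "d", "h1"},
    "a": {"h1"}, "b": {"h1"}, "c": {"h2"}, "d": {"h2"},
}


def toy_score(u, v):
    """Hypothetical stand-in for R(u -> v): % of N[u] shared with N[v]."""
    n_u, n_v = adj[u] | {u}, adj[v] | {v}
    return 100.0 * len(n_u & n_v) / len(n_u)


clusters = fac_pin_sketch(adj, toy_score)
print(sorted(sorted(c) for c in clusters))
```

On this toy network each leaf joins its hub, and the two hubs stay apart because their mutual score does not exceed the threshold.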

11 FAC-PIN Algorithm - Clustering

12 Computational Complexities
• Given n nodes and m edges
• CNM algorithm: O(m h log n), where h = height
• Radicchi algorithm: O(m^2)
• HC-PIN algorithm: O(m δ^2)
• FAC-PIN algorithm: O(n δ^2) << O(n D^2)
• δ = average degree and D = maximum degree
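To get a feel for these bounds, the sketch below plugs in the size of the SC-2 network reported in the data sets slide (4726 proteins, 15166 interactions). It ignores constant factors, so the numbers are only indicative of growth, not of running time. CNM is left out because h is not given. The average degree is computed as 2m/n, which holds for any undirected graph.

```python
n, m = 4726, 15166      # SC-2: proteins (nodes), interactions (edges)
delta = 2 * m / n       # average degree of an undirected graph

costs = {
    "Radicchi O(m^2)": m ** 2,
    "HC-PIN O(m * delta^2)": m * delta ** 2,
    "FAC-PIN O(n * delta^2)": n * delta ** 2,
}

# Print the bounds from smallest to largest.
for name, value in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:.3g}")
```

The ratio between the HC-PIN and FAC-PIN bounds is exactly m/n, i.e. half the average degree.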

13 Computational Experiments
For any given PIN:
1. Apply FAC-PIN with merging parameters μ
2. Evaluate the modularity of the resulting partitions P_{k,μ}
   - Three modularity functions
3. P_k = best P_{k,μ}
4. Execution time to obtain P_{k,μ}
5. Functional enrichment validations with SGD GO
   - P-value cutoff = 0.05
   - Retain significant clusters and the number of significant clusters

14 Data Sets
• 8 unweighted PIN data sets from the Reactome database
  - Including PIN data of S. cerevisiae (yeast SC-1): 5697 proteins, 50675 interactions
• 1 unweighted PIN and the corresponding weighted PIN of S. cerevisiae (yeast SC-2) from the DIP database
  - 4726 proteins
  - 15166 interactions
• Protein complexes from the MIPS database

15 Results – Effect of Merging Parameter μ (SC-2; 4726 proteins and 15166 interactions)
Recall: merging test = [ R_w(u → v) > 50μ and R_w(u → v) ≥ R_w(v → u) ]
Fewer neighbors are merged with v as μ increases, hence the number of clusters k increases with μ

16 Results – Execution Times in Seconds (PINs from Reactome database; μ = 0.5)

17 Results – Modularity Functions
• Function Q: [formula not preserved]
• Function Ω: [formula not preserved]
• Function D: [formula not preserved]
where w(u, v) = 0 or 1 for unweighted PINs
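The three formulas are missing from this transcript. As an illustration of what a weighted modularity function looks like, the sketch below assumes that Q is the standard Newman-Girvan modularity, Q = Σ_c [ W_c / W − (S_c / 2W)^2 ]. Here W is the total edge weight, W_c is the weight of edges inside cluster c, and S_c is the summed weighted degree of the nodes in c. Whether the slide's Q matches this exactly is an assumption, and Ω and D are not attempted. The function name and the edge-list format are hypothetical.

```python
def newman_modularity(edges, clusters):
    """Newman-Girvan modularity for edges [(u, v, w), ...] and a list of node sets."""
    total = sum(w for _, _, w in edges)
    label = {node: i for i, c in enumerate(clusters) for node in c}
    inside = [0.0] * len(clusters)     # W_c: weight of edges within each cluster
    strength = [0.0] * len(clusters)   # S_c: summed weighted degree per cluster
    for u, v, w in edges:
        strength[label[u]] += w
        strength[label[v]] += w
        if label[u] == label[v]:
            inside[label[u]] += w
    return sum(i / total - (s / (2 * total)) ** 2 for i, s in zip(inside, strength))


# Two triangles joined by a single edge; w = 1 everywhere (unweighted case).
edges = [(1, 2, 1), (2, 3, 1), (1, 3, 1), (4, 5, 1), (5, 6, 1), (4, 6, 1), (3, 4, 1)]
print(newman_modularity(edges, [{1, 2, 3}, {4, 5, 6}]))
print(newman_modularity(edges, [{1, 2, 3, 4, 5, 6}]))
```

Setting every weight to 1 gives the unweighted case mentioned on the slide.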

18 Results – Modularity of FAC-PIN Partitions (PINs from Reactome database; μ = 0.5)
[Table not preserved; column headers: Q_w, Ω_w, D_w, repeated twice]

19 Functional Module Prediction
• Recall indicates how effectively proteins with the same functional category in the network are extracted
• Precision indicates how consistently proteins in the same module are annotated
• The f-measure evaluates overall performance
• The average f-measure is taken as the accuracy of the algorithms
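The slide gives no formulas, so the sketch below uses the common set-overlap convention. For a predicted module M and a functional category F, precision = |M ∩ F| / |M|, recall = |M ∩ F| / |F|, and the f-measure is their harmonic mean. Whether the authors match modules to categories in exactly this way is an assumption. The function name and the protein labels are hypothetical.

```python
def precision_recall_f(module, category):
    """Return (precision, recall, f-measure) for two non-empty sets of proteins."""
    overlap = len(module & category)
    precision = overlap / len(module)
    recall = overlap / len(category)
    if overlap == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


module = {"p1", "p2", "p3", "p4"}                 # predicted module
category = {"p1", "p2", "p3", "p5", "p6", "p7"}   # proteins sharing one annotation
print(precision_recall_f(module, category))
```

Averaging the f-measure over all predicted modules then gives a single accuracy figure of the kind the slide describes.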

20 Functional Enrichment of FAC-PIN Modules
• Hypergeometric distribution…
• …
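The details on this slide are elided. A common way to score enrichment with the hypergeometric distribution is the upper-tail probability of seeing at least the observed number of annotated proteins in a cluster. The sketch below assumes that formulation, and all parameter names are hypothetical. It uses only `math.comb` from the standard library.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, hits):
    """P(X >= hits) for X ~ Hypergeometric(population, annotated, cluster_size).

    population:   number of proteins in the network
    annotated:    proteins in the network carrying the annotation
    cluster_size: proteins in the cluster being tested
    hits:         proteins in the cluster carrying the annotation
    """
    upper = min(annotated, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(population, cluster_size)


p = enrichment_p_value(population=10, annotated=4, cluster_size=3, hits=2)
print(p, p < 0.05)  # compare against the 0.05 cutoff used in the experiments
```

A cluster would be retained as significant when its smallest p-value over the tested annotations falls below the cutoff.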

21 Results – Functional Enrichment Validations (Unweighted SC-1; 5697 proteins and 50675 interactions; μ = 0.5)

