Community Detection by Modularity Optimization Jooyoung Lee

Slides:



Advertisements
Similar presentations
Class 12: Communities Network Science: Communities Dr. Baruch Barzel.
Advertisements

Fast algorithm for detecting community structure in networks M. E. J. Newman Department of Physics and Center for the Study of Complex Systems, University.
Cartography of complex networks: From organizations to the metabolism Cartography of complex networks: From organizations to the metabolism Roger Guimerà.
Network analysis Sushmita Roy BMI/CS 576
DREAM4 Puzzle – inferring network structure from microarray data Qiong Cheng.
Social network partition Presenter: Xiaofei Cao Partick Berg.
ICDE 2014 LinkSCAN*: Overlapping Community Detection Using the Link-Space Transformation Sungsu Lim †, Seungwoo Ryu ‡, Sejeong Kwon§, Kyomin Jung ¶, and.
A Hierarchical Multiple Target Tracking Algorithm for Sensor Networks Songhwai Oh and Shankar Sastry EECS, Berkeley Nest Retreat, Jan
KDD 2009 Scalable Graph Clustering using Stochastic Flows Applications to Community Discovery Venu Satuluri and Srinivasan Parthasarathy Data Mining Research.
An Efficient Algorithm Based on Self-adapted Fuzzy C-Means Clustering for Detecting Communities in Complex Networks Jianzhi Jin 1, Yuhua Liu 1, Kaihua.
Modularity and community structure in networks
Community Detection Laks V.S. Lakshmanan (based on Girvan & Newman. Finding and evaluating community structure in networks. Physical Review E 69,
Community Detection Algorithm and Community Quality Metric Mingming Chen & Boleslaw K. Szymanski Department of Computer Science Rensselaer Polytechnic.
Graph Partitioning Dr. Frank McCown Intro to Web Science Harding University This work is licensed under Creative Commons Attribution-NonCommercial 3.0Attribution-NonCommercial.
1 Modularity and Community Structure in Networks* Final project *Based on a paper by M.E.J Newman in PNAS 2006.
V4 Matrix algorithms and graph partitioning
University of Buffalo The State University of New York Spatiotemporal Data Mining on Networks Taehyong Kim Computer Science and Engineering State University.
Discovering Overlapping Groups in Social Media Xufei Wang, Lei Tang, Huiji Gao, and Huan Liu Arizona State University.
Modularity in Biological networks.  Hypothesis: Biological function are carried by discrete functional modules.  Hartwell, L.-H., Hopfield, J. J., Leibler,
INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.
Global topological properties of biological networks.
A scalable multilevel algorithm for community structure detection
The community-search problem and how to plan a successful cocktail party Mauro SozioAris Gionis Max Planck Institute, Germany Yahoo! Research, Barcelona.
1 Modularity and Community Structure in Networks* Final project *Based on a paper by M.E.J Newman in PNAS 2006.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.
Systems Biology, April 25 th 2007Thomas Skøt Jensen Technical University of Denmark Networks and Network Topology Thomas Skøt Jensen Center for Biological.
Epistasis Analysis Using Microarrays Chris Workman.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
The Relative Vertex-to-Vertex Clustering Value 1 A New Criterion for the Fast Detection of Functional Modules in Protein Interaction Networks Zina Mohamed.
Department of Engineering Science Department of Zoology Soft partitioning in networks via Bayesian nonnegative matrix factorization Ioannis Psorakis, Steve.
Interactive Image Segmentation of Non-Contiguous Classes using Particle Competition and Cooperation Fabricio Breve São Paulo State University (UNESP)
1 Using Heuristic Search Techniques to Extract Design Abstractions from Source Code The Genetic and Evolutionary Computation Conference (GECCO'02). Brian.
Science & Technology Centers Program Center for Science of Information Bryn Mawr Howard MIT Princeton Purdue Stanford Texas A&M UC Berkeley UC San Diego.
A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua.
Genetic Regulatory Network Inference Russell Schwartz Department of Biological Sciences Carnegie Mellon University.
Community detection algorithms: a comparative analysis Santo Fortunato.
Efficient Identification of Overlapping Communities Jeffrey Baumes Mark Goldberg Malik Magdon-Ismail Rensselaer Polytechnic Institute, Troy, NY.
Particle Competition and Cooperation for Uncovering Network Overlap Community Structure Fabricio A. Breve 1, Liang Zhao 1, Marcos G. Quiles 2, Witold Pedrycz.
Uncovering Overlap Community Structure in Complex Networks using Particle Competition Fabricio A. Liang
Efficient Route Computation on Road Networks Based on Hierarchical Communities Qing Song, Xiaofan Wang Department of Automation, Shanghai Jiao Tong University,
Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning Mingzhen Mo and Irwin King Department of Computer Science and Engineering.
ELeaRNT: Evolutionary Learning of Rich Neural Network Topologies Authors: Slobodan Miletic 3078/2010 Nikola Jovanovic 3077/2010
Relative Validity Criteria for Community Mining Evaluation
Sharon Bruckner, Bastian Kayser, Tim Conrad Freie Uni. Berlin Finding Modules in Networks with Non-modular Regions.
Communities. Questions 1.What is a community (intuitively)? Examples and fundamental hypothesis 2.What do we really mean by communities? Basic definitions.
Community Detection Algorithms: A Comparative Analysis Authors: A. Lancichinetti and S. Fortunato Presented by: Ravi Tiwari.
Community Discovery in Social Network Yunming Ye Department of Computer Science Shenzhen Graduate School Harbin Institute of Technology.
Extracting information from complex networks From the metabolism to collaboration networks Roger Guimerà Department of Chemical and Biological Engineering.
Typically, classifiers are trained based on local features of each site in the training set of protein sequences. Thus no global sequence information is.
Discovering functional interaction patterns in Protein-Protein Interactions Networks   Authors: Mehmet E Turnalp Tolga Can Presented By: Sandeep Kumar.
Community detection via random walk Draft slides.
G LOBAL S IMILARITY B ETWEEN M ULTIPLE B IONETWORKS Yunkai Liu Computer Science Department University of South Dakota.
Community structure in graphs Santo Fortunato. More links “inside” than “outside” Graphs are “sparse” “Communities”
An Energy-Efficient Approach for Real-Time Tracking of Moving Objects in Multi-Level Sensor Networks Vincent S. Tseng, Eric H. C. Lu, & Kawuu W. Lin Institute.
Computer Science and Engineering PhD in Computer Science Monday, November 07, :00 a.m. – 11:00 a.m. Swearingen Conference Room 3A75 Network Based.
Selected Topics in Data Networking Explore Social Networks:
James Hipp Senior, Clemson University.  Graph Representation G = (V, E) V = Set of Vertices E = Set of Edges  Adjacency Matrix  No Self-Inclusion (i.
Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.
Overview  Introduction  Biological network data  Text mining  Gene Ontology  Expression data basics  Expression, text mining, and GO  Modules and.
A Connectivity-Based Popularity Prediction Approach for Social Networks Huangmao Quan, Ana Milicic, Slobodan Vucetic, and Jie Wu Department of Computer.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.
Mining Coherent Dense Subgraphs across Multiple Biological Networks Vahid Mirjalili CSE 891.
Graph clustering to detect network modules
Hierarchical Agglomerative Clustering on graphs
Community detection in graphs
Finding modules on graphs
Michael L. Nelson CS 495/595 Old Dominion University
Noémi Gaskó, Rodica Ioana Lung, Mihai Alexandru Suciu
3.3 Network-Centric Community Detection
Presentation transcript:

Community Detection by Modularity Optimization Jooyoung Lee http://lee Community Detection by Modularity Optimization Jooyoung Lee http://lee.kias.re.kr Center for in-silico Protein Science Korea Institute for Advanced Study Survey Science Group Workshop High1 Resort, Korea Feb 15, 2013 Network Science and Modules What is Community Detection? Community Detection by Modularity Optimization Protein function prediction by community detection of a network

For the decade, a network has become a important tool for studying complex biological systems For example, Protein-Protein interaction network of yeast, brain network, metabolic network and disease network

Communities (modules) in networks Communities in yeast protein-protein interaction network Biological networks are consisted of communities Community Module Partition Cluster Functional modules Protein complexes Gene clusters Biological networks consist of communities. Communities can also be called as module, partition or cluster Existence of communities is a universal property of biological networks Communities in biological network can be “Functional modules of proteins” or “Protein complexes” or “Gene clusters”

“Community/Module Detection” by Modularity Optimization Divide a network into sub-graphs/modules nodes are more densely connected internally The most commonly used objective function to evaluate the quality of partition is Q proposed by Girvan and Newman

Which one is better? Zachary’s karate club network Friendship network of members Which partitioning do you think is better? It is hard to determine at first glance Therefore we need an objective function to judge Q=0.420 Q=0.379 We need an objective function!

Two Issues with Modularity Q (1) Difficulty of the problem: Finding the best Q partition is a hard combinatorial optimization problem (NP-hard) The current best stochastic optimization method is simulated annealing (SA) (2) Relevance of the objective function: Is a higher-Q solution more useful to extract hidden information from a network? "So far, most works in the literature on graph clustering focused on the development of new algorithms, and applications were limited to those few benchmark graphs that one typically uses for testing" from Community Detection in Graphs (2010), Physics Report The first issue is about finding the maximum modularity. It is well-known that community detection problem is NP-hard which means enumeration is impractical

Three Benchmark Tests of Q-Optimization We performed ~

Benchmark Test #1: LFR graphs Networks are generated from known/assigned community structure. More (less) edges are assigned between nodes within (between) a community. We generate edges with a fixed mixing probability: Mixing probability 0.1 10% of inter-community edges 90% of intra-community edges LFR benchmark graph is a network with known community structure. Based on known communities of nodes, edges are generated with a mixing probability For example, if mixing probability is 0.1, 10 percent of edges are generated between communities and 90 percent of edges are generated inside the community.

Mixing probability = 0.1 This is a visual example of a LFR network with mixing probability 0.1 And community structures are easily identified.

Mixing probability = 0.6 This is an example of network with mixing probability of 0.6 And communities are not easily identified, which means the problem gets more difficult than the previous one.

Test on LFR graphs 50 simulations of CSA and SA (PRE 85, 056702 (2012) Mixing Prob. Modularity Accuracy tCSA(sec) tSA(sec) <QCSA>/<QSA> <ACCCSA>/ <ACCSA> 0.10 0.8638 / 0.8638 0.9994 / 0.9994 107.5 2422.4 0.20 0.7585 / 0.7585 0.9990 / 0.9990 100.4 4095.3 0.30 0.6641 / 0.6641 0.9974 / 0.9974 128.1 3596.5 0.40 0.5641 / 0.5639 0.9936 / 0.9926 175.1 4784.9 0.50 0.4675 / 0.4665 0.9705 / 0.9675 276.4 8350.1 0.60 0.3711 / 0.3691 0.8671 / 0.8545 699.2 94170.1 This table shows the benchmark result obtained by CSA and SA simulations. Since both methods are stochastic, We performed 50 iterations for each method We performed tests with a number of benchmark graphs with different mixing probabilities. The second column shows modularities And the third shows the accuracy which means how many nodes are correctly classified compared to the answer. The last two columns show computational time. Higher modularity and more accurate partitions are obtained using less computational time (only ~5%)

Benchmark Test #2: real-world networks PRE 85, 056702 (2012)

Conclusions We developed a highly efficient modularity optimization method by using CSA Higher modularity partitions are functionally more coherent We developed the first module-assisted function prediction algorithm which outperforms neighbor-assisted methods Extracting maximal information from network topology itself Our method is general and can be applied to other networks

Postdoc/Researcher Positions Available Acknowledgements Community Detection: Juyong Lee (KIAS, NIH) Steven Gross (UC Irvine) Cluster computers: KIAS/CAC Supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korean government (MEST) (No. 2009-0063610) Postdoc/Researcher Positions Available