Discovering Larger Network Motifs

1 Discovering Larger Network Motifs
Wooyoung Kim and Li Chen, 4/24/2009
CSC 8910 Analysis of Biological Network, Spring 2009, Dr. Yi Pan

2 Outline
Project Topic
Related Works
Proposed Ideas
Unsolved Problems

3 Project Topic: Discovering Larger Network Motifs
Given a biological network (PPI network, transcriptional regulatory network, gene network, etc.), find network motifs of large size (more than 15 nodes).

4 Related Works (1)
Network Motif Discovery using subgraph enumeration and symmetry breaking (motif size <= 15).
Given a candidate subgraph, find all of its instances in the graph, using symmetry breaking so that each instance is counted only once, then evaluate the candidate by its frequency.
Problem: how do we find candidate subgraphs?
Proposed solution: cluster the whole network and take a representative of each cluster as a candidate subgraph.
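The role of symmetry in counting a candidate's frequency can be illustrated with a small brute-force sketch. This is not the Grochow-Kellis algorithm, which avoids the redundant work up front with symmetry-breaking conditions. It only shows why a candidate with many automorphisms is overcounted unless symmetry is accounted for. The function names and the adjacency-dict graph format are assumptions made for this example.

```python
from itertools import permutations

def induced_isomorphisms(g_adj, h_adj):
    """Count injective maps from H's nodes into G's nodes that
    preserve both edges and non-edges (brute force, small graphs only)."""
    h_nodes = sorted(h_adj)
    count = 0
    for image in permutations(sorted(g_adj), len(h_nodes)):
        m = dict(zip(h_nodes, image))
        if all((b in h_adj[a]) == (m[b] in g_adj[m[a]])
               for a in h_nodes for b in h_nodes if a != b):
            count += 1
    return count

def count_instances(g_adj, h_adj):
    """Number of distinct induced occurrences of H in G.
    Each occurrence is found once per automorphism of H,
    so the raw mapping count is divided by |Aut(H)|."""
    automorphisms = induced_isomorphisms(h_adj, h_adj)
    return induced_isomorphisms(g_adj, h_adj) // automorphisms

# Hypothetical toy network: a triangle 0-1-2 with a pendant node 3 on node 2.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(induced_isomorphisms(g, triangle))  # raw mappings, inflated by symmetry
print(count_instances(g, triangle))       # distinct instances
```

The brute-force search tries every ordered choice of nodes, so it is only practical for very small graphs. Pruning those symmetric duplicates early is what makes enumeration practical at the motif sizes quoted above.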

5 Related Works (2) Motif Discovery Algorithm
Exact algorithms for motifs with a small number of nodes:
1. Exhaustive Recursive Search (ERS): motif size <= 4.
2. ESU: starts with individual nodes and adds one node at a time until the required size k is reached (motif size <= 14).
3. Compact Topological Motifs.
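The node-by-node growth used by ESU can be sketched as follows. This is a minimal version assuming an undirected graph stored as a dict of neighbor sets with comparable node labels. The function name `esu` and the data layout are choices made for this example, not taken from the slides. Each connected k-node subgraph is produced exactly once because a subgraph is only grown from its smallest-labeled node, and a new node enters the extension set only if it is not already adjacent to the current subgraph.

```python
def esu(adj, k):
    """Enumerate every connected induced k-node subgraph of an
    undirected graph once, returned as frozensets of node labels."""
    results = []

    def extend(sub, ext, v):
        if len(sub) == k:
            results.append(frozenset(sub))
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            # Nodes already in the subgraph or adjacent to it.
            covered = set(sub)
            for u in sub:
                covered |= adj[u]
            # Exclusive neighbors of w: larger label than the root v,
            # and not reachable from the current subgraph already.
            new_ext = ext | {u for u in adj[w] if u > v and u not in covered}
            extend(sub | {w}, new_ext, v)

    for v in sorted(adj):
        extend({v}, {u for u in adj[v] if u > v}, v)
    return results

# Hypothetical toy network: a triangle 0-1-2 with a pendant node 3 on node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
for nodes in esu(adj, 3):
    print(sorted(nodes))
```

The sketch only enumerates node sets. Classifying them into isomorphism classes and comparing frequencies against randomized networks are separate steps not shown here.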

6 Related Works (3) Approximate Algorithms
Search Algorithm Based on Sampling (MFINDER)
Rand-ESU
NeMoFINDER
Sub-graph Counting by Scalar Computation
A-priori-based Motif Detection

7 Related Works (4) Network Clustering
Compact representation of a network. Type I: minimum number of clusters. Type II: maximum cohesiveness.
Aggregation of topological motifs (combining smaller network motifs to observe the whole structure).
However, in our proposed solution, the clustering task groups similar network patterns together, rather than similar nodes (sequences). Nor is it used for aggregating motifs.

8 Proposed Ideas
Given a graph G = (V, E), a desired motif size t, and a number of motifs k, find network motifs of size t.
1. List all graph patterns with t (or more) nodes: represent the network as an adjacency matrix A with entries 1, -1, or 0, and scan A for all t x t sub-matrices.
2. Cluster the subgraphs into k clusters, using any numerical clustering algorithm (K-means, NMF, etc.).
3. Find a representative subgraph for each cluster, using the symmetry-breaking technique. Each representative is a candidate network motif.
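The scanning and grouping steps can be sketched in plain Python. Several pieces here are assumptions, not part of the proposal: the t x t sub-matrices are taken to be principal sub-matrices (same node subset for rows and columns), the feature vector is a sorted degree sequence chosen because it does not depend on node order, and exact grouping by that feature stands in for the K-means or NMF step. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def principal_submatrices(A, t):
    """Yield (nodes, sub-matrix) for every t-node subset of the graph."""
    for nodes in combinations(range(len(A)), t):
        yield nodes, [[A[i][j] for j in nodes] for i in nodes]

def degree_signature(sub):
    """Order-independent feature: sorted count of non-zero entries per row."""
    return tuple(sorted(sum(1 for x in row if x != 0) for row in sub))

def group_by_signature(A, t):
    """Group t-node subsets whose sub-matrices share the same signature.
    A real pipeline would pass the feature vectors to a clustering
    algorithm with k clusters instead of grouping exact matches."""
    groups = defaultdict(list)
    for nodes, sub in principal_submatrices(A, t):
        groups[degree_signature(sub)].append(nodes)
    return dict(groups)

# Hypothetical toy network: a triangle 0-1-2 with a pendant node 3 on node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
for signature, members in group_by_signature(A, 3).items():
    print(signature, members)
```

A graph with n nodes has C(n, t) such subsets, so exhaustive scanning grows very quickly with t. For the motif sizes targeted here (t > 15), some form of sampling or restriction to connected subsets would likely be needed.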

9 Unsolved Problems How to cluster the graphs?
The choice of clustering algorithm depends on which features we use for the data. What type of clustering algorithm: Type I or Type II?
How to find the representative subgraph of each cluster? Should we consider network alignment first?
Should we consider sequence similarities as well? Is there any relationship between sequence motifs and network motifs? Could sequence motifs be encoded in a vertex attribute matrix?
Compact topological motifs.
Large network motifs vs. small network motifs.

10 Discovering Topological Motifs Using a Compact Notation

11 Compact Notation Main Idea
A topological motif can be represented either as the motif itself or as a collection of location lists of the motif's vertices. The method works in the space of location lists in order to discover motifs.

12 Compact Notation Method
Step 1: compute an exhaustive list of potential location lists of motif vertices, stored as compact location lists.
Step 2: enlarge the collection of compact location lists computed in Step 1 by including all the non-empty intersections, along with the differences.
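Step 2 can be pictured as closing a collection of sets under intersection and difference. The sketch below is a simplification under stated assumptions: location lists are modeled as plain sets of vertex ids, intersections and differences are taken pairwise and repeated until nothing new appears, and the conjugacy relations between lists that the compact notation also tracks are ignored. The function name is hypothetical.

```python
from itertools import combinations

def close_under_intersection_and_difference(lists):
    """Return the input collection enlarged with every non-empty
    pairwise intersection and difference, iterated to a fixed point."""
    collection = {frozenset(l) for l in lists if l}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for a, b in combinations(list(collection), 2):
            for candidate in (a & b, a - b, b - a):
                if candidate and candidate not in collection:
                    collection.add(candidate)
                    changed = True
    return collection

# Hypothetical location lists over vertex ids 1..4.
initial = [{1, 2, 3}, {2, 3, 4}]
for s in sorted(close_under_intersection_and_difference(initial),
                key=lambda x: (len(x), sorted(x))):
    print(sorted(s))
```

The loop terminates because only subsets of the original vertices can ever be added, and there are finitely many of them.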

13 Compact Notation An Example
Different colors indicate different attributes.

14 Compact Notation G1’s adjacency matrices

15 Compact Notation Adjacency Matrix B1 (the conjugacy relationship of two lists is shown by “”) L = {ℓ1, ℓ2, ℓ3, ℓ4}

16 Compact Notation Initialization Step

17 Compact Notation Iterative Step

18 References
[1] Bill Andreopoulos, Aijun An, Xiaogang Wang, and Michael Schroeder. A roadmap of clustering algorithms: finding a match for a biomedical application. Brief Bioinform, pages bbn058+, February 2009.
[2] Alberto Apostolico, Matteo Comin, and Laxmi Parida. Bridging Lossy and Lossless Compression by Motif Pattern Discovery. Electronic Notes in Discrete Mathematics, 21: , General Theory of Information Transfer and Combinatorics.
[3] Giovanni Ciriello and Concettina Guerra. A review on models and algorithms for motif discovery in protein-protein interaction networks. Brief Funct Genomic Proteomic, 7(2): , 2008.
[4] Jun Huan, Wei Wang, and Jan Prins. Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism. Data Mining, IEEE International Conference on, 0:549, 2003.
[5] Michihiro Kuramochi and George Karypis. Finding Frequent Patterns in a Large Sparse Graph. Data Mining and Knowledge Discovery, 11(3): , November 2005.
[6] Laxmi Parida. Discovering Topological Motifs Using a Compact Notation. Journal of Computational Biology, 14(3): , 2007.

19 References
[7] Radu Dobrin, Qasim K. Beg, Albert-Laszlo Barabasi, and Zoltan N. Oltvai. Aggregation of topological motifs in the Escherichia coli transcriptional regulatory network. BMC Bioinformatics, 5:10, 2004.
[8] McKay, B.D. Isomorph-free exhaustive generation. J. Algorithms, 26: , 1998.
[9] Middendorf, M., Ziv, E., and Wiggins, C.H. Inferring network mechanisms: the Drosophila melanogaster protein interaction network. PNAS, 102 (9): , Mar
[10] Grochow, J. A. and Kellis, M. Network motif discovery using subgraph enumeration and symmetry-breaking. In RECOMB 2007, Lecture Notes in Computer Science 4453, pp Springer-Verlag, 2007.

20 Thank you so much!

