Kyushu University Intensive Lecture in Mathematical Sciences: Comparison, Analysis, and Control of Biological Networks (7) Partial k-Trees, Color Coding, and Comparison of Graphs
Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University

Tree Decomposition and Partial k-Tree [Flum, Grohe: Parameterized Complexity Theory, Springer]

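Tree Decomposition
Tree decomposition of G(V,E): a pair $(T, (B_t)_{t \in V_T})$ of a rooted tree $T(V_T, E_T)$ and a family of sets of vertices $B_t \subseteq V$ such that
- for all $v \in V$, the set $\{t \in V_T \mid v \in B_t\}$ is nonempty and connected in T;
- for all $\{u,v\} \in E$, $u, v \in B_t$ holds for some $t \in V_T$.
Width: $\max_{t \in V_T} |B_t| - 1$
Treewidth: the minimum width over all tree decompositions of G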

Examples
⇒ The treewidth of a tree (with at least one edge) is 1.
⇒ The treewidth of a cycle is 2.
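The two conditions of a tree decomposition can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names `is_tree_decomposition` and `width`, the dictionary layout of the bags, and the two-bag decomposition of the 4-cycle are all illustrative assumptions. It assumes the supplied T is already a tree and only verifies edge coverage and connectivity.

```python
from collections import deque

def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the two conditions of a tree decomposition.

    bags maps each tree node t to its vertex set B_t; tree_edges lists the
    edges of T. That T itself is a tree is assumed, not checked.
    """
    # Every edge {u, v} of G must lie inside some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Adjacency sets of T.
    neighbors = {t: set() for t in bags}
    for s, t in tree_edges:
        neighbors[s].add(t)
        neighbors[t].add(s)
    # For every vertex v, the nodes whose bags contain v must form a
    # nonempty connected set in T (checked by BFS restricted to those nodes).
    for v in vertices:
        nodes = {t for t, bag in bags.items() if v in bag}
        if not nodes:
            return False
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            t = queue.popleft()
            for s in neighbors[t] & nodes:
                if s not in seen:
                    seen.add(s)
                    queue.append(s)
        if seen != nodes:
            return False
    return True

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1

# Hypothetical example: the 4-cycle 0-1-2-3-0 split into two bags of size 3.
cycle_vertices = [0, 1, 2, 3]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}}
cycle_tree = [("a", "b")]
print(is_tree_decomposition(cycle_vertices, cycle_edges, cycle_bags, cycle_tree))
print(width(cycle_bags))
```

The decomposition of the 4-cycle has width 2, matching the cycle example. The checker says nothing about whether a smaller width is possible.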

Several Properties
Prop. Let s be the parent and $t_1, \dots, t_h$ be the children of node t. For all j, [formula missing in the source]
Prop. Let $t_1, \dots, t_h$ be the children of node t in $T(V_T, E_T)$. For all $i \neq j$, [formula missing in the source]
Thm. A graph with treewidth at most k is a partial k-tree, and a partial k-tree has treewidth at most k. (The definition of partial k-tree is omitted.)
Thm. For fixed k, a tree decomposition of a partial k-tree can be computed in linear time.
Thm. Determining the treewidth of a graph is NP-hard.
⇒ Many optimization problems can be solved in a bottom-up manner.

DP Algorithm for Partial k-Trees
For fixed k, many NP-hard problems can be solved in polynomial time using dynamic programming.
Ex. Vertex cover problem
- Ch(t): set of children of node t in tree T
- Dynamic programming algorithm: [recurrence missing in the source]
  where $W_t$ is a vertex cover of the subgraph induced by $B_t$, and r is the root of T.

Explanation of DP Algorithm
[Figure: node t with bag $B_t$ and its children s, s' with bags $B_s$, $B_{s'}$]
$OPT_t(W_t)$: size of a minimum vertex cover of G(t) under the condition that $W_t$ is the cover of $B_t$
T(t): subtree of T induced by t and its descendants
G(t): subgraph of G induced by $\bigcup_{s \in V_{T(t)}} B_s$
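The bottom-up computation of $OPT_t(W_t)$ can be sketched in code. The recurrence on the slide did not survive in this transcript, so the sketch uses one standard formulation, which is an assumption and may differ from the lecture's exact formula:

$OPT_t(W_t) = |W_t| + \sum_{s \in Ch(t)} \min_{W_s :\, W_s \cap B_t = W_t \cap B_s} \left( OPT_s(W_s) - |W_s \cap W_t| \right)$,

with the answer $\min_{W_r} OPT_r(W_r)$ at the root r. The function name `min_vertex_cover_size` and the input layout (bags and children as dictionaries) are hypothetical.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def min_vertex_cover_size(edges, bags, children, root):
    """Minimum vertex cover size via DP over a given tree decomposition.

    Assumed recurrence (not taken from the slides):
    OPT_t(W_t) = |W_t| + sum over children s of
                 min over compatible W_s of (OPT_s(W_s) - |W_s & W_t|).
    """
    edge_sets = [frozenset(e) for e in edges]
    bag_sets = {t: frozenset(b) for t, b in bags.items()}

    def covers_bag(w, bag):
        # w must touch every edge of G that lies entirely inside the bag.
        return all(e & w for e in edge_sets if e <= bag)

    def solve(t):
        child_tables = [(s, solve(s)) for s in children.get(t, [])]
        table = {}
        for w_t in subsets(bag_sets[t]):
            if not covers_bag(w_t, bag_sets[t]):
                continue
            total = len(w_t)
            feasible = True
            for s, table_s in child_tables:
                # W_s and W_t must agree on the shared vertices B_s & B_t.
                options = [cost - len(w_s & w_t)
                           for w_s, cost in table_s.items()
                           if w_s & bag_sets[t] == w_t & bag_sets[s]]
                if not options:
                    feasible = False
                    break
                total += min(options)
            if feasible:
                table[w_t] = total
        return table

    return min(solve(root).values())

# Hypothetical example: 4-cycle 0-1-2-3-0 with bags {0,1,2} and {0,2,3}.
c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
c4_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}}
c4_children = {"a": ["b"]}
print(min_vertex_cover_size(c4_edges, c4_bags, c4_children, "a"))
```

Each node enumerates at most $2^{k+1}$ subsets of its bag, and each parent-child pair compares at most $4^{k+1}$ pairs of subsets. The recursion is written for clarity and would need an explicit stack for very deep decomposition trees.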

Analysis of Time Complexity
Let k be a constant. A tree decomposition can be computed in linear time.
For each $t \in V_T$, at most $2^{k+1}$ sets $W_t$ are tested.
To compute the min inside the Σ, $2^{k+1} \times 2^{k+1} = 4^{k+1}$ pairs are tested per edge of T.
Thus, the total complexity is $O(4^k \, \mathrm{poly}(n))$.

Applications to Bioinformatics
Graphs representing structures of proteins and RNAs are considered to have small treewidth.
Examples
- Protein threading
- Protein side-chain packing
- Protein structure alignment
- Comparison of RNA secondary structures
- Attractor detection in Boolean networks

Color Coding [Alon et al.: J. ACM 1995]

k-Path Problem
Input: undirected graph G(V,E), integer k
Output: a simple path of G (no vertex repeated) with k vertices
NP-hard ⇐ it is the Hamiltonian path problem if k = n (= |V|)
Naïve algorithm: for each vertex v, examine neighbors, neighbors of neighbors, … ⇒ $O(n^k)$ time
Idea: partition V into k subsets (color the vertices using k colors). If lucky, all vertices of the path lie in different subsets (analysis of this probability ⇒ randomized algorithm).

DP Algorithm
P(u,C): 1 if there exists a path from v to u using each color in C exactly once, otherwise 0 (C is a subset of {1,2,…,k})
Initialization: $P(v, \{f(v)\}) \leftarrow 1$, all others 0 (f(v) is the color of v)
Recursion (in the order of |C| = 1 to |C| = k−1): $P(u, C \cup \{f(u)\}) \leftarrow 1$ if $P(w, C) = 1$, $\{u,w\} \in E$, and $f(u) \notin C$
For each v, examine whether there exists a k-path starting from v.
The path can be reconstructed by traceback.
[Figure: example with colors R, Y, B, G: P(v,{R}) = 1, P(w,{R,Y,B}) = 1, P(u_1,{R,Y,B,G}) = 1]
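For one fixed coloring, the table P(u,C) and the traceback can be sketched as follows. This is an illustrative sketch under assumptions, not the lecture's code: the name `find_colorful_path`, the adjacency-dictionary input, and storing only the reachable states (those with P = 1) in a dictionary are choices made here.

```python
def find_colorful_path(adj, color, k):
    """Return a path with k vertices whose colors are pairwise distinct, or None.

    adj maps each vertex to its neighbors; color maps each vertex to its color.
    A state (u, C) is stored exactly when P(u, C) = 1 for the current start v.
    """
    for v in adj:
        start = (v, frozenset([color[v]]))
        parent = {start: None}      # state -> predecessor state, for traceback
        frontier = [start]          # states with |C| equal to the current size
        for _ in range(k - 1):      # grow |C| from 1 up to k
            next_frontier = []
            for w, used in frontier:
                for u in adj[w]:
                    if color[u] in used:
                        continue    # color already used on this path
                    state = (u, used | {color[u]})
                    if state not in parent:
                        parent[state] = (w, used)
                        next_frontier.append(state)
            frontier = next_frontier
        if frontier:                # some state with |C| = k is reachable
            path = []
            state = frontier[0]
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
    return None

# Hypothetical example: path graph 0-1-2-3 colored with four distinct colors.
example_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
example_color = {0: "R", 1: "Y", 2: "B", 3: "G"}
print(find_colorful_path(example_adj, example_color, 4))
```

Because all colors on the returned path are distinct, no vertex can repeat, so the result is always a simple path. If the coloring is unlucky the function returns None even when a k-path exists.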

Analysis of Time Complexity
Lemma: The above algorithm works in $O(2^k \, \mathrm{poly}(n))$ time.
Proof: The number of subsets C is $2^k$. Thus, it is enough to examine $2^k n$ entries P(u,C). This computation should be done for every initial vertex v, which adds an O(n) factor.

Analysis of Success Probability
Lemma: Let P be a k-path of G. Under a uniformly random coloring, the probability that the k vertices of P have different colors is ≥ $e^{-k}$.
Proof: The number of colorings of P is $k^k$. On the other hand, the number of successful colorings is $k!$. Therefore, by using Stirling's formula, we have $k!/k^k > e^{-k}$.
Theorem: By repeating the algorithm at least $e^k$ times, a solution can be obtained (if any) with probability ≥ 1/2.
Proof: The probability that all trials fail is bounded by $(1 - e^{-k})^{e^k} \le 1/e < 1/2$.
The algorithm never outputs a wrong solution.
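The two bounds can be checked numerically for small k. The sketch below is only a sanity check of the inequalities, with hypothetical helper names `colorful_probability` and `all_fail_bound`; it does not run the path-finding algorithm itself.

```python
import math

def colorful_probability(k):
    """Probability that k fixed vertices receive k distinct colors: k! / k^k."""
    return math.factorial(k) / k ** k

def all_fail_bound(k, repetitions):
    """Bound (1 - e^{-k})^repetitions on the chance that every trial fails."""
    return (1.0 - math.exp(-k)) ** repetitions

# Columns: k, exact success probability, lower bound e^{-k},
# number of repetitions ceil(e^k), bound on the all-fail probability.
for k in range(1, 11):
    trials = math.ceil(math.exp(k))
    print(k, colorful_probability(k), math.exp(-k), trials, all_fail_bound(k, trials))
```

The number of repetitions grows as $e^k$, so the overall running time stays of the form $2^{O(k)} \mathrm{poly}(n)$.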

Derandomization
Idea: use of hash function families
k-perfect hash functions: Let F be a family of hash functions from V = {1,2,…,n} to {1,2,…,k}. F is called a family of k-perfect hash functions if, for any k-element subset S of V, there exists a function f ∈ F that is one-to-one on S.
Theorem: For any n and k, a family of k-perfect hash functions with $2^{O(k)} \cdot \log^2 n$ functions can be constructed in $2^{O(k)} \cdot n \cdot \log^2 n$ time.
⇒ In place of random coloring, it is enough to examine all f given by this theorem.
Corollary: The k-Path Problem can be solved in $2^{O(k)} \cdot \mathrm{poly}(n)$ time.
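The definition of a k-perfect family can be tested by brute force on tiny instances. The sketch below only checks the defining property; it does not construct the small families promised by the theorem. The name `is_k_perfect`, the list-based representation of each function, and the 0-based vertex and color labels are assumptions made for illustration.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """Check that every k-subset of {0, ..., n-1} is colored injectively by some f.

    Each f in family is a list of length n with f[v] in {0, ..., k-1}.
    The check enumerates all k-subsets, so it is only usable for small n and k.
    """
    for subset in combinations(range(n), k):
        if not any(len({f[v] for v in subset}) == k for f in family):
            return False
    return True

# Hypothetical example with n = 3 and k = 2.
f1 = [0, 1, 0]
f2 = [0, 0, 1]
print(is_k_perfect([f1, f2], 3, 2))
print(is_k_perfect([f1], 3, 2))
```

With such a family in hand, running the DP once per function f guarantees that any k-path is colorful under at least one of the colorings.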

Applications of Color Coding
The "path" in color coding can be extended to small trees and small subgraphs (network motifs) ⇒ applications to bioinformatics
- Network motif [Alon et al.: Bioinformatics, 2008]
- Signal pathway analysis [Huffner et al.: Bioinformatics 2007 & Algorithmica 2008]
- Network marker [Dao et al.: Bioinformatics 2011]
- Pathway search/alignment [Shlomi et al.: BMC Bioinformatics 2006]

Comparison of Chemical Graphs

Chemical Structures and Graphs
Tree
- graph without cycles
Almost tree
- tree + some edges (in each biconnected component)
Outerplanar graph
- can be drawn with no crossing edges
- no internal vertex (every vertex lies on the outer face)
Partial k-tree
- decomposed into a tree by identifying at most k+1 vertices as one node

Partial k-trees
Partial k-tree (treewidth ≤ k)
- decomposed into a tree by identifying at most k+1 vertices as one node
- outerplanar graphs are partial 2-trees
Chemical compounds in NCI database [Horvath & Ramon, TCS 2010]

treewidth    number of compounds
1 (tree)     21,950
2            221,675
3            6,548
≥ 4          65

If we can design efficient algorithms for partial 4-trees, we can cover almost all chemical compounds.

Three Matching Problems
Graph isomorphism
- Are two graphs essentially the same?
Subgraph isomorphism
- Is one graph a part of the other graph?
Maximum common subgraph
- Largest (connected) common part between two given graphs
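The first two problems can be made concrete with an exponential-time brute-force check on small graphs. This is only meant to pin down the definitions, not to reflect the polynomial-time algorithms cited in these slides. The function names, the 0-based vertex labels, and the choice of non-induced subgraph matching are assumptions; graphs are taken to be simple (no loops or multi-edges).

```python
from itertools import permutations

def normalize(edges):
    """Represent undirected edges as a set of 2-element frozensets."""
    return {frozenset(e) for e in edges}

def is_subgraph_isomorphic(small_n, small_edges, big_n, big_edges):
    """True if the small graph maps injectively into the big one, edge to edge.

    Vertices are 0..n-1. Tries every injective vertex mapping.
    """
    small = normalize(small_edges)
    big = normalize(big_edges)
    for image in permutations(range(big_n), small_n):
        if all(frozenset(image[x] for x in e) in big for e in small):
            return True
    return False

def is_isomorphic(n1, edges1, n2, edges2):
    """Two graphs are isomorphic iff sizes match and one embeds in the other."""
    if n1 != n2 or len(normalize(edges1)) != len(normalize(edges2)):
        return False
    return is_subgraph_isomorphic(n1, edges1, n2, edges2)

# Hypothetical examples: a 4-cycle, a relabeled 4-cycle, and a 4-vertex path.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
c4_relabeled = [(0, 2), (2, 1), (1, 3), (3, 0)]
p4 = [(0, 1), (1, 2), (2, 3)]
print(is_isomorphic(4, c4, 4, c4_relabeled))
print(is_subgraph_isomorphic(4, p4, 4, c4))
print(is_isomorphic(4, p4, 4, c4))
```

The maximum common subgraph problem generalizes both: it asks for the largest graph that embeds into each of the two inputs.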

Complexity of Graph Comparison Problems
Graph isomorphism
- Polynomial time for bounded degree graphs [Luks, JCSS, 1982]
- However, not practical because the algorithm is too complicated (based on group theory)
Subgraph isomorphism
- Polynomial time for partial k-trees of bounded degree [Matousek & Thomas, Disc. Math., 1992]
- However, the algorithm is still too complicated
Maximum common subgraph
- trees: polynomial time [Matula, Ann. Disc. Math, 1978]
- almost trees: polynomial time [Akutsu, IEICE Trans., 1993]
- outerplanar graphs of bounded degree: polynomial time [Akutsu & Tamura, Algorithms, 2013]
- partial k-trees of bounded degree: NP-hard for k = 11 [Akutsu & Tamura, Proc. ISAAC 2013]
- partial k-trees with k = 3: open problem (the NP-hardness result was recently improved to k = 4)

Algorithm for Outerplanar Graphs: Key Idea
Difficulty: need to find cut points ⇒ easily leads to combinatorial explosion
Idea: introduction of the concept of a blade
Lemma: the number of blades is $O(n^2)$. ⇒ polynomial time algorithm

Maximum Common Subgraph: Summary
Trees
- polynomial time [Matula, Ann. Disc. Math, 1978]
Almost trees
- polynomial time [Akutsu, IEICE Trans., 1993]
Outerplanar graphs of bounded degree
- polynomial time [Akutsu & Tamura, Algorithms, 2013]
Partial k-trees of bounded degree
- NP-hard [Akutsu & Tamura, Proc. ISAAC 2013]
  ⇔ in contrast, polynomial time for subgraph isomorphism [Matousek & Thomas, Disc. Math., 1992]

Summary
Tree Decomposition
- For fixed k, many NP-hard problems can be solved in polynomial time by DP algorithms
- Applications to analysis of protein/RNA structures
Color Coding
- Useful for finding small paths/subgraphs in networks
- Applications to biological pathway analysis
Comparison of Chemical Graphs
- The maximum common subgraph problem is NP-hard even for partial k-trees with k = 4, but is solvable in polynomial time for outerplanar graphs
