Kyushu University Mathematics Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (7) Partial k-Trees, Color Coding, and Comparison of Graphs Tatsuya Akutsu Bioinformatics.

Kyushu University Mathematics Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (7) Partial k-Trees, Color Coding, and Comparison of Graphs
Tatsuya Akutsu, Bioinformatics Center, Institute for Chemical Research, Kyoto University

Tree Decomposition and Partial k-Tree [Flum, Grohe: Parameterized Complexity Theory, Springer]

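Tree Decomposition
Tree decomposition of $G(V,E)$: a pair $(T, (B_t)_{t \in V_T})$ of a rooted tree $T(V_T,E_T)$ and a family of sets of vertices $B_t \subseteq V$ such that
- for all $v \in V$, the set $\{t \in V_T \mid v \in B_t\}$ is nonempty and connected in $T$;
- for all $\{u,v\} \in E$, $u, v \in B_t$ holds for some $t \in V_T$.
Width: $\max_{t \in V_T} |B_t| - 1$
Treewidth: the minimum width over all tree decompositions of $G$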
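
The two conditions of the definition can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `is_tree_decomposition` and `width` and the dictionary-based input format are assumptions made for illustration. The tree is treated as unrooted because the root plays no role in the validity check.

```python
from collections import defaultdict, deque

def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check the definition: bags maps each tree node t to its set B_t.

    tree_edges lists the edges of T (assumed free of duplicates).
    """
    nodes = list(bags)
    adj = defaultdict(set)
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)

    def connected(subset):
        # BFS in T restricted to the tree nodes in `subset`;
        # an empty subset is reported as not connected.
        subset = set(subset)
        if not subset:
            return False
        start = next(iter(subset))
        seen = {start}
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y in subset and y not in seen:
                    seen.add(y)
                    queue.append(y)
        return seen == subset

    # T must be a tree: connected with |V_T| - 1 edges.
    if len(tree_edges) != len(nodes) - 1 or not connected(nodes):
        return False
    # Condition 1: the nodes whose bags contain v form a nonempty subtree.
    for v in vertices:
        if not connected(t for t in nodes if v in bags[t]):
            return False
    # Condition 2: every edge of G lies inside some bag.
    return all(any(u in bags[t] and v in bags[t] for t in nodes)
               for u, v in edges)

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(b) for b in bags.values()) - 1

# A 5-cycle 0-1-2-3-4-0 with a decomposition along a path a-b-c.
cycle_vertices = [0, 1, 2, 3, 4]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}, "c": {0, 3, 4}}
cycle_tree = [("a", "b"), ("b", "c")]
```

With these inputs, `is_tree_decomposition(cycle_vertices, cycle_edges, cycle_tree, cycle_bags)` holds and `width(cycle_bags)` is 2, which matches the treewidth of a cycle.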

Examples
- The treewidth of a tree (with at least one edge) is 1.
- The treewidth of a cycle is 2.

Several Properties
Prop. Let $s$ be the parent and $t_1,\ldots,t_h$ be the children of node $t$. For all $j$, [formula not preserved in the transcript]
Prop. Let $t_1,\ldots,t_h$ be the children of node $t$ in $T(V_T,E_T)$. For all $i \neq j$, [formula not preserved in the transcript]
Thm. A graph has treewidth at most $k$ if and only if it is a partial $k$-tree. (The definition of partial $k$-tree is omitted.)
Thm. For fixed $k$, a tree decomposition of a partial $k$-tree can be computed in linear time.
Thm. Determining the treewidth of a graph is NP-hard.
⇒ Many optimization problems can be solved in a bottom-up manner.

DP Algorithm for Partial k-Trees
For fixed $k$, many NP-hard problems can be solved in polynomial time using dynamic programming.
Ex. Vertex cover problem
- $Ch(t)$: the set of children of node $t$ in tree $T$
- Dynamic programming algorithm: [recurrence not preserved in the transcript], where $W_t$ is a vertex cover of the subgraph induced by $B_t$, and $r$ is the root of $T$.

Explanation of DP Algorithm
[Figure: bags $B_t$, $B_s$, $B_{s'}$]
- $OPT_t(W_t)$: size of a minimum vertex cover of $G(t)$ under the condition that $W_t$ is the part of the cover inside $B_t$
- $T(t)$: subtree of $T$ induced by $t$ and its descendants
- $G(t)$: subgraph of $G$ induced by $\bigcup_{s \in V_{T(t)}} B_s$
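
The bottom-up computation of $OPT_t(W_t)$ can be sketched in Python as follows. This is an illustration under stated assumptions, not the recurrence from the slides: it assumes $OPT_t(W_t) = |W_t| + \sum_{s \in Ch(t)} \min_{W_s} (OPT_s(W_s) - |W_s \cap W_t|)$, where the minimum ranges over covers $W_s$ of the subgraph induced by $B_s$ that agree with $W_t$ on $B_s \cap B_t$. The function name `min_vertex_cover` and the input format (`bags` and `children` dictionaries) are hypothetical.

```python
from itertools import combinations

def min_vertex_cover(edges, bags, children, root):
    """Size of a minimum vertex cover, given a rooted tree decomposition.

    bags: tree node -> set B_t; children: tree node -> list Ch(t).
    Assumed recurrence (see the lead-in): each child contributes the best
    table value that agrees with W_t on the shared vertices, minus the
    vertices already counted in W_t.
    """
    edge_set = {frozenset(e) for e in edges}

    def covers(bag, w):
        # w covers every edge of G with both endpoints in the bag.
        return all(u in w or v in w
                   for u, v in combinations(bag, 2)
                   if frozenset((u, v)) in edge_set)

    def candidates(bag):
        # All W subsets of the bag that cover the subgraph induced by it.
        items = list(bag)
        for r in range(len(items) + 1):
            for chosen in combinations(items, r):
                w = frozenset(chosen)
                if covers(bag, w):
                    yield w

    def table(t):
        # Returns a dict mapping W_t to OPT_t(W_t).
        child_tables = [(frozenset(bags[s]), table(s))
                        for s in children.get(t, [])]
        bag_t = frozenset(bags[t])
        result = {}
        for w_t in candidates(bag_t):
            total = len(w_t)
            for bag_s, tab_s in child_tables:
                total += min(val - len(w_s & w_t)
                             for w_s, val in tab_s.items()
                             if w_s & bag_t == w_t & bag_s)
            result[w_t] = total
        return result

    return min(table(root).values())

# 5-cycle with bags along a path a-b-c, rooted at a.
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
c5_bags = {"a": {0, 1, 2}, "b": {0, 2, 3}, "c": {0, 3, 4}}
c5_children = {"a": ["b"], "b": ["c"]}
print(min_vertex_cover(c5_edges, c5_bags, c5_children, "a"))  # prints 3
```

Each table has at most $2^{k+1}$ entries, and each parent-child pair compares at most $2^{k+1} \times 2^{k+1}$ combinations.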

Analysis of Time Complexity
Let $k$ be a constant.
- A tree decomposition can be computed in linear time.
- For each $t \in V_T$, at most $2^{k+1}$ sets $W_t$ are tested.
- To compute the min inside the sum, at most $2^{k+1} \times 2^{k+1} = 4^{k+1}$ pairs are tested per edge of $T$.
Thus, the total complexity is $O(4^k \, \mathrm{poly}(n))$.

Applications to Bioinformatics
Graphs representing structures of proteins and RNAs are considered to have small treewidth.
Examples
- Protein threading
- Protein side-chain packing
- Protein structure alignment
- Comparison of RNA secondary structures
- Attractor detection in Boolean networks

Color Coding [Alon et al.: J. ACM 1995]

k-Path Problem
Input: undirected graph $G(V,E)$, integer $k$
Output: a simple path of $G$ (no vertex repeated) with $k$ vertices
NP-hard ⇐ it is the Hamiltonian path problem when $k = n$ $(= |V|)$
Naïve algorithm: for each vertex $v$, examine its neighbors, neighbors of neighbors, … ⇒ $O(n^k)$ time
Idea
- Partition $V$ into $k$ subsets (color the vertices using $k$ colors).
- If lucky, all vertices of a $k$-path lie in different subsets (analysis of this probability ⇒ randomized algorithm).
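
For comparison with color coding, the naïve search can be sketched as a depth-first enumeration of simple paths. This is a minimal Python illustration; the function name `naive_k_path` and the adjacency-dictionary input are assumptions, and a $k$-path is taken to mean a simple path on $k$ vertices.

```python
def naive_k_path(adj, k):
    """Return a simple path on k vertices as a list, or None if none exists.

    adj: dict mapping each vertex to an iterable of its neighbors.
    Explores neighbors, neighbors of neighbors, ... from every start vertex,
    which takes O(n^k) time in the worst case.
    """
    def extend(path, used):
        if len(path) == k:
            return list(path)
        for w in adj[path[-1]]:
            if w not in used:
                path.append(w)
                used.add(w)
                found = extend(path, used)
                if found is not None:
                    return found
                path.pop()
                used.remove(w)
        return None

    for v in adj:
        found = extend([v], {v})
        if found is not None:
            return found
    return None

# Path graph 0-1-2-3.
p4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

On `p4`, a call with `k = 4` returns one of the two orientations of the whole path, while `k = 5` returns `None`.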

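DP Algorithm
$P(u,C)$: 1 if there exists a path from $v$ to $u$ using each color in $C$ exactly once, otherwise 0 ($C$ is a subset of $\{1,2,\ldots,k\}$)
Initialization: $P(v,\{f(v)\}) \leftarrow 1$, all others 0 ($f(v)$ is the color of $v$)
Recursion (in the order of $|C| = 1$ to $|C| = k-1$): $P(u, C \cup \{f(u)\}) \leftarrow 1$ if $P(w,C) = 1$ for some $\{u,w\} \in E$ with $f(u) \notin C$
For each $v$, examine whether there exists a $k$-path starting from $v$.
The path can be reconstructed by traceback.
[Figure: example on vertices $v$, $w$, $u_1$, $u_2$ with colors R, Y, B, G: $P(v,\{R\}) = 1$, $P(w,\{R,Y,B\}) = 1$, $P(u_1,\{R,Y,B,G\}) = 1$]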
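
The table $P(u,C)$ can be sketched in Python for one fixed coloring. This is an illustration only: the name `colorful_k_path_exists` and the input format are assumptions, only the pairs with $P(u,C) = 1$ are stored, and the traceback that recovers the path is omitted for brevity.

```python
def colorful_k_path_exists(adj, color, k):
    """Decide whether some path on k vertices uses k distinct colors.

    adj: dict vertex -> iterable of neighbors; color: dict vertex -> color.
    For each start vertex v, reach[u] holds the color sets C with P(u, C) = 1.
    A walk that never repeats a color never repeats a vertex, so any entry
    with |C| = k corresponds to a simple path.
    """
    for v in adj:
        reach = {u: set() for u in adj}
        start = frozenset([color[v]])
        reach[v].add(start)
        frontier = [(v, start)]          # entries with |C| = 1
        for _ in range(1, k):            # grow |C| from 1 up to k
            nxt = []
            for w, c in frontier:
                for u in adj[w]:
                    if color[u] not in c:
                        c2 = c | {color[u]}
                        if c2 not in reach[u]:
                            reach[u].add(c2)
                            nxt.append((u, c2))
            frontier = nxt
        if frontier:                     # some P(u, C) = 1 with |C| = k
            return True
    return False

path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
lucky = {0: "R", 1: "Y", 2: "B", 3: "G"}
unlucky = {0: "R", 1: "Y", 2: "R", 3: "Y"}
```

With the `lucky` coloring the 4-vertex path is found; with the `unlucky` coloring no path on 3 or more vertices is colorful, which is why the random coloring is repeated.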

Analysis of Time Complexity
Lemma: The above algorithm works in $O(2^k \, \mathrm{poly}(n))$ time.
Proof: The number of color sets $C$ is $2^k$. Thus, it is enough to examine $2^k n$ values $P(u,C)$. This computation is done for every initial vertex $v$, which adds a factor of $O(n)$.

Analysis of Success Probability
Lemma: Let $P$ be a $k$-path of $G$. Under a uniformly random coloring, the probability that the $k$ vertices of $P$ receive different colors is $\geq e^{-k}$.
Proof: The number of colorings of $P$ is $k^k$, and the number of successful colorings is $k!$. Therefore, by Stirling's formula, $k!/k^k \geq e^{-k}$.
Theorem: By repeating the algorithm at least $e^k$ times, a solution can be obtained (if any) with probability $\geq 1/2$.
Proof: The probability that all trials fail is bounded by $(1 - e^{-k})^{e^k} \leq e^{-1} < 1/2$.
The algorithm never outputs a wrong solution.
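
The two bounds can be checked numerically for small $k$. The Python sketch below is only a sanity check, not a proof; the function names are hypothetical, and the number of repetitions is taken as $\lceil e^k \rceil$.

```python
import math

def success_probability(k):
    """Exact probability k!/k^k that k fixed vertices get k distinct colors."""
    return math.factorial(k) / k**k

def failure_bound(k):
    """Upper bound (1 - e^{-k})^t on the probability that t = ceil(e^k)
    independent colorings all fail."""
    trials = math.ceil(math.exp(k))
    return (1 - math.exp(-k)) ** trials

for k in range(1, 11):
    print(k, success_probability(k) >= math.exp(-k), failure_bound(k) <= 0.5)
```

Every row prints `True True`, consistent with the lemma and the theorem.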

Derandomization
Idea: use families of hash functions.
$k$-perfect hash functions: Let $F$ be a family of hash functions from $V = \{1,2,\ldots,n\}$ to $\{1,2,\ldots,k\}$. $F$ is called a family of $k$-perfect hash functions if, for any $k$-element subset $S$ of $V$, there exists a function $f \in F$ that is one-to-one on $S$.
Theorem: For any $n$ and $k$, a family of $k$-perfect hash functions consisting of $2^{O(k)} \cdot \log^2 n$ functions can be constructed in $2^{O(k)} \cdot n \cdot \log^2 n$ time.
⇒ In place of random coloring, it is enough to examine all $f$ given by this theorem.
Corollary: The $k$-Path Problem can be solved in $2^{O(k)} \cdot \mathrm{poly}(n)$ time.
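
The definition of a $k$-perfect family can be tested by brute force on tiny instances. The Python sketch below only verifies the property for a given family; it does not construct the family from the theorem. The name `is_k_perfect` and the two example functions are assumptions chosen for illustration.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """Check that every k-element subset of {1..n} is mapped one-to-one
    by at least one function in `family` (a list of callables)."""
    for subset in combinations(range(1, n + 1), k):
        if not any(len({f(x) for x in subset}) == k for f in family):
            return False
    return True

def parity(x):
    # Hypothetical hash function: odd numbers -> 2, even numbers -> 1.
    return 1 + (x % 2)

def half(x):
    # Hypothetical hash function: {1, 2} -> 1, {3, 4} -> 2.
    return 1 if x <= 2 else 2

print(is_k_perfect([parity, half], 4, 2))  # prints True
print(is_k_perfect([parity], 4, 2))        # prints False
```

The single function `parity` fails on the subset $\{1,3\}$, which `half` separates. Running the DP once per function in a $k$-perfect family guarantees that every $k$-path becomes colorful under at least one coloring.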

Applications of Color Coding
The "path" in color coding can be extended to small trees and small subgraphs (network motifs) ⇒ applications to bioinformatics
- Network motif [Alon et al.: Bioinformatics, 2008]
- Signal pathway analysis [Huffner et al.: Bioinformatics 2007 & Algorithmica 2008]
- Network marker [Dao et al.: Bioinformatics 2011]
- Pathway search/alignment [Shlomi et al.: BMC Bioinformatics 2006]

Comparison of Chemical Graphs

Chemical Structures and Graphs
- Tree: a connected graph without cycles
- Almost tree: a tree plus some additional edges (a bounded number in each biconnected component)
- Outerplanar graph: can be drawn with no crossing edges and no internal vertex (every vertex lies on the outer face)
- Partial $k$-tree: can be decomposed into a tree by identifying at most $k+1$ vertices as one node

Partial k-trees
Partial $k$-tree (treewidth $\leq k$)
- Decomposed into a tree by identifying at most $k+1$ vertices as one node
- Outerplanar graphs are partial 2-trees
Chemical compounds in the NCI database [Horvath & Ramon, TCS 2010]
treewidth        number of compounds
1 (tree)         21,[digits missing]
2                [digits missing],675
3                36,548
$\geq 4$         65
If we can design efficient algorithms for partial 4-trees, we can cover almost all chemical compounds.

Three Matching Problems
- Graph isomorphism: Are two graphs essentially the same?
- Subgraph isomorphism: Is one graph a part of the other graph?
- Maximum common subgraph: the largest (connected) common part of two given graphs
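
To make the first two problems concrete, here is a brute-force Python sketch for very small graphs. It is exponential and unrelated to the polynomial-time algorithms cited in these slides; the function names and the edge-list input format are assumptions, and "part of" is read as (not necessarily induced) subgraph containment.

```python
from itertools import permutations

def is_subgraph_isomorphic(p_vertices, p_edges, h_vertices, h_edges):
    """Is the pattern graph isomorphic to a subgraph of the host graph?

    Tries every injective mapping of pattern vertices to host vertices.
    """
    host = {frozenset(e) for e in h_edges}
    pattern = list(p_vertices)
    for image in permutations(h_vertices, len(pattern)):
        phi = dict(zip(pattern, image))
        if all(frozenset((phi[u], phi[v])) in host for u, v in p_edges):
            return True
    return False

def is_isomorphic(v1, e1, v2, e2):
    """Simple graphs with equal vertex and edge counts are isomorphic
    exactly when one embeds into the other."""
    return (len(v1) == len(v2) and len(e1) == len(e2)
            and is_subgraph_isomorphic(v1, e1, v2, e2))

triangle = ([0, 1, 2], [(0, 1), (1, 2), (2, 0)])
k4 = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
path4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
```

For example, the triangle embeds into `k4` but not into `path4`, since a path has no cycle.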

Complexity of Graph Comparison Problems
Graph isomorphism
- Polynomial time for bounded degree graphs [Luks, JCSS, 1982]
- However, not practical because the algorithm is too complicated (based on group theory)
Subgraph isomorphism
- Polynomial time for partial $k$-trees of bounded degree [Matousek & Thomas, Disc. Math., 1992]
- However, the algorithm is still too complicated
Maximum common subgraph
- trees: polynomial time [Matula, Ann. Disc. Math, 1978]
- almost trees: polynomial time [Akutsu, IEICE Trans., 1993]
- outerplanar graphs: polynomial time [Akutsu & Tamura, Algorithms, 2013]
- partial $k$-trees: NP-hard for $k = 11$ [Akutsu & Tamura, Proc. ISAAC 2013], recently improved to $k = 4$
- partial $k$-trees with $k = 3$: open problem

Algorithm for Outerplanar Graphs: Key Idea
Difficulty: cut points need to be found, which easily leads to a combinatorial explosion.
Idea: introduce the concept of a blade.
Lemma: The number of blades is $O(n^2)$. ⇒ polynomial time algorithm

Maximum Common Subgraph: Summary
- Trees: polynomial time [Matula, Ann. Disc. Math, 1978]
- Almost trees: polynomial time [Akutsu, IEICE Trans., 1993]
- Outerplanar graphs of bounded degree: polynomial time [Akutsu & Tamura, Algorithms, 2013]
- Partial $k$-trees of bounded degree: NP-hard [Akutsu & Tamura, Proc. ISAAC 2013] ⇔ polynomial time for subgraph isomorphism [Matousek & Thomas, Disc. Math., 1992]

Summary
Tree Decomposition
- For fixed $k$, many NP-hard problems can be solved in polynomial time by DP algorithms
- Applications to analysis of protein/RNA structures
Color Coding
- Useful for finding small paths/subgraphs in networks
- Applications to biological pathway analysis
Comparison of Chemical Graphs
- The maximum common subgraph problem is NP-hard even for partial $k$-trees with $k = 4$, but is solvable in polynomial time for outerplanar graphs