Kyushu University Mathematical Sciences Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (4) Analysis and Control of Boolean Networks. Tatsuya Akutsu, Bioinformatics Center Institute.


Kyushu University Mathematical Sciences Intensive Lecture: Comparison, Analysis, and Control of Biological Networks (4) Analysis and Control of Boolean Networks
Tatsuya Akutsu
Bioinformatics Center, Institute for Chemical Research, Kyoto University

Contents
- Boolean Network
- Attractor Detection/Enumeration
- Algorithms for Singleton Attractor Detection/Enumeration
- Control of Boolean Networks
- Integer Linear Programming-based Approach

Boolean Network

Mathematical model of genetic networks
- Node ⇔ gene
  - State of a node: 1 (active) / 0 (inactive)
- Regulation rules
  - Boolean functions (AND, OR, NOT, ...)
  - Edge from y to x ⇔ y directly controls x
- Synchronous update
  - Almost the same as clocked digital circuits
[Kauffman, The Origins of Order, 1993]

Example of Boolean Network
- Boolean network with nodes A, B, C and update rules
  - A' = B
  - B' = A and C
  - C' = not A
- [Table: state transition table; input A, B, C at time t, output A', B', C' at time t+1]
- Example of state transitions: 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ ...
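The three update rules above fully determine the state transition table, so it can be regenerated mechanically. The following is a minimal sketch; the function names `step` and `trajectory` are illustrative choices, not names from the slides.

```python
from itertools import product

def step(state):
    """One synchronous update of the example network: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)

def trajectory(state, steps):
    """Return the list of states visited from `state` over `steps` updates."""
    visited = [state]
    for _ in range(steps):
        state = step(state)
        visited.append(state)
    return visited

# Print the full state transition table (time t on the left, time t+1 on the right).
for s in product((0, 1), repeat=3):
    print("".join(map(str, s)), "->", "".join(map(str, step(s))))
```

Calling `trajectory((1, 1, 1), 5)` follows the same path as the example of state transitions given on the slide.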

Why Boolean Networks?
- Criticism that the BN model is too simplified
  - Without simplification, theoretical analysis, inference, and control are difficult, although complex models can be used for simulation
  - BNs may still be useful for qualitative analyses
- One of the simplest non-linear models
  - Negative results on BNs suggest negative results on more general (non-linear) models
- Almost the same as digital circuits
  - Theories and techniques from computer science can be utilized

Our focus: Time Complexity
- Many problems for BNs are NP-hard
  - NP-hard means that there is no polynomial-time algorithm (unless P=NP)
- Naïve methods take O(2^n) time or more
- But we want to do much better
  - An O(1.1^n) time algorithm can handle n = 300, and an O(1.05^n) time algorithm can handle n = 600
- Important for coping with large-scale networks
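To see why the base of the exponent matters so much, one can ask how large n may be before base^n exceeds a fixed amount of work. The sketch below assumes a hypothetical budget of 10^12 basic steps; that budget and the helper name `max_feasible_n` are illustrative and do not come from the slides.

```python
import math

def max_feasible_n(base, budget):
    """Largest integer n such that base**n stays within `budget` basic steps."""
    return math.floor(math.log(budget) / math.log(base))

BUDGET = 1e12  # hypothetical budget of basic steps (an assumption)

for base in (2.0, 1.1, 1.05):
    print(f"O({base}^n): n up to about {max_feasible_n(base, BUDGET)}")
```

Under this assumed budget the feasible n grows from about 39 for base 2 to roughly 290 for base 1.1 and roughly 570 for base 1.05, which is the same order as the n = 300 and n = 600 figures quoted above.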

Attractor Detection

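Attractor (1)
- Steady state: a state, or a cycle of states, that the network eventually reaches and never leaves
- Different attractors ⇔ Different cell types
- Examples
  - 011 ⇒ 101 ⇒ 010 ⇒ 101 ⇒ 010 ⇒ ...
  - 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ ...
- [Table: state transition table; input A, B, C at time t, output A', B', C' at time t+1]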

Attractor (2)
[Figure: state transition table; input A, B, C at time t, output A', B', C' at time t+1]
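For a network this small, all attractors can be found by following the trajectory from every one of the 2^n states until a state repeats. The sketch below does this for the example network A' = B, B' = A and C, C' = not A; the function name `attractors` and the choice to report each cycle starting from its smallest state are illustrative assumptions.

```python
from itertools import product

def step(state):
    """Example network: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)

def attractors(step_fn, n):
    """Enumerate all attractors (cycles of states) of an n-node synchronous BN."""
    found = set()
    for start in product((0, 1), repeat=n):
        position = {}          # state -> index at which it was first visited
        path = []
        s = start
        while s not in position:
            position[s] = len(path)
            path.append(s)
            s = step_fn(s)
        cycle = path[position[s]:]          # the repeating part of the path
        k = cycle.index(min(cycle))         # rotate to a canonical starting state
        found.add(tuple(cycle[k:] + cycle[:k]))
    return sorted(found, key=lambda c: (len(c), c))

for cycle in attractors(step, 3):
    print(" => ".join("".join(map(str, s)) for s in cycle))
```

This exhaustive approach visits all 2^n states, so it only illustrates the definition and does not scale to large n.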

N-K Model (Kauffman Network)
- N: number of nodes (we use n instead of N)
- K: indegree
  - Indegree = the number of input edges = the number of genes directly affecting node v
  - Each node has (maximum or average) indegree K
- The Boolean function assigned to each node is randomly selected
[Figure: a node v with indegree 2 and a node v with indegree 3]
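A random N-K network can be generated in a few lines. The sketch below makes several assumptions that the slide leaves open: every node has exactly K distinct inputs chosen uniformly at random, and each Boolean function is drawn uniformly from all truth tables over K inputs. The names `random_nk_network` and `step` are illustrative.

```python
import random
from itertools import product

def random_nk_network(n, k, seed=None):
    """Return a list of (inputs, truth_table) pairs, one per node."""
    rng = random.Random(seed)
    network = []
    for _ in range(n):
        inputs = tuple(rng.sample(range(n), k))   # K distinct input nodes (assumption)
        table = {bits: rng.randint(0, 1)          # random Boolean function as a truth table
                 for bits in product((0, 1), repeat=k)}
        network.append((inputs, table))
    return network

def step(network, state):
    """One synchronous update of all nodes."""
    return tuple(table[tuple(state[j] for j in inputs)] for inputs, table in network)

net = random_nk_network(n=10, k=2, seed=0)
state = tuple([0] * 10)
print(state, "->", step(net, state))
```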

Distribution of Attractors in N-K Model
- Classical conjecture
  - The number of attractors is [formula not preserved in the transcript]
- Recent results suggest that this conjecture may not be true
  - Superpolynomial growth (> n^γ for any γ) of the number of attractors (Samuelsson & Troein, PRL, 2003)
  - Superpolynomial growth of the average size of attractors (Drossel et al., PRL, 2005)
- No conclusive result is known

Singleton Attractor (or Point Attractor)
- Biological interpretation of attractors
  - Different attractors ⇔ Different cell types
- Point attractor
  - Attractor with period 1
  - Corresponds to a steady state
  - Definition: a global state v satisfying v(t+1) = v(t)
- Attractor Detection
  - Input: a Boolean network
  - Output: a point attractor (if any)

Attractor Detection: Previous Works
- Time on the order of 2^n (up to a polynomial factor) is enough, since there are 2^n global states
  - But this cannot be applied to large n
  - Several heuristics are known, but they have no theoretical guarantee [Irons, Physica D, 2006], [Devloo et al., Bull. Math. Biol. 2003], ...
- Detection of a singleton attractor is NP-hard [Akutsu et al., GIW 1998]
- We developed algorithms with average-case theoretical bounds [Zhang et al., EURASIP JBSB 2007]
- We also developed algorithms with guaranteed time bounds [bound not preserved in the transcript] for AND-OR BNs [Tamura & Akutsu, FCT07, Trans. IEICE 2009] [Tamura & Akutsu, AB08, Math. in CS 2009] [Melkman, Tamura & Akutsu, 2010]
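The exhaustive baseline mentioned above simply tests every global state against the fixed-point condition. A minimal sketch, using the earlier example network A' = B, B' = A and C, C' = not A as a stand-in; the function name is an illustrative choice.

```python
from itertools import product

def step(state):
    """Example network: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)

def singleton_attractors_bruteforce(step_fn, n):
    """Check all 2^n global states; keep those that map to themselves."""
    return [v for v in product((0, 1), repeat=n) if step_fn(v) == v]

print(singleton_attractors_bruteforce(step, 3))
```

Each of the 2^n states costs one network update to check, which is why this approach stops being usable well before n reaches the sizes of realistic networks.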

Algorithms for Singleton Attractor Detection/Enumeration

Singleton Attractor (= Attractor with Period 1)
[Figure: illustration of a singleton attractor]

Indegree
- Indegree = the number of input edges = the number of genes directly affecting node v
- We use K to denote the maximum indegree
[Figure: a node v with indegree 2 and a node v with indegree 3]

Simple Recursive Enumeration Algorithm (1)
- Examine 0-1 assignments to the nodes one by one, and backtrack as soon as a contradiction occurs [Zhang et al., EURASIP JBSB 2007]

Illustration of Recursive Algorithm
[Figure: illustration of the recursive algorithm and its output]

Simple Recursive Enumeration Algorithm (2)
- Examine 0-1 assignments to the nodes one by one, and backtrack as soon as a contradiction occurs
  - 0 ⇒ 00 ⇒ X ⇒ backtrack ⇒ 01 ⇒ 010 ⇒ X ⇒ backtrack ⇒ 011 ⇒ X ⇒ backtrack ⇒ 10
- Several variants exist, depending on the ordering of nodes
- Much better than the trivial O(n 2^n) time

  K                 2        3        4        5        6
  Basic             1.35^n   1.43^n   1.49^n   1.53^n   1.57^n
  Outdegree-based   1.19^n   1.27^n   1.34^n   1.41^n   1.45^n
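The backtracking idea can be sketched as follows. This is an illustrative reading of the "basic" variant with nodes assigned in index order, not the authors' implementation; the network representation (a list of `(inputs, function)` pairs) and all names are assumptions.

```python
def singleton_attractors(nodes):
    """Enumerate singleton attractors by assigning nodes 0..n-1 in order with backtracking.

    `nodes[i]` is a pair (inputs, f): the indices of the input nodes of node i
    and a Boolean function f taking their values as positional arguments.
    """
    n = len(nodes)
    state = [None] * n
    found = []

    def consistent(m):
        # A node can be checked once it and all of its inputs lie among the first m nodes.
        for i in range(m):
            inputs, f = nodes[i]
            if all(j < m for j in inputs):
                if f(*(state[j] for j in inputs)) != state[i]:
                    return False          # v_i(t+1) != v_i(t): contradiction
        return True

    def extend(m):
        if m == n:
            found.append(tuple(state))
            return
        for bit in (0, 1):
            state[m] = bit
            if consistent(m + 1):         # otherwise backtrack immediately
                extend(m + 1)
        state[m] = None

    extend(0)
    return found

# Example network: A' = B, B' = A and C, C' = not A (nodes indexed A=0, B=1, C=2).
example = [
    ((1,), lambda b: b),
    ((0, 2), lambda a, c: a & c),
    ((0,), lambda a: 1 - a),
]
print(singleton_attractors(example))
```

A node ordering that lets contradictions surface early prunes more of the search tree, which is presumably the motivation for the outdegree-based variant listed in the table.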

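Analysis of Average Case Time Complexity
- Probability that v_i(0) ≠ v_i(1) is detected when the 0-1 assignment for the first m bits is examined: [formula not preserved in the transcript]
- Probability that a random assignment for m bits is consistent (with the definition of a singleton attractor): [formula not preserved in the transcript]
- Expected number of consistent 0-1 assignments for m bits: [formula not preserved in the transcript]
- By taking the maximum of the above over m in [1...n], we can estimate the complexity
[Figure: nodes v_1, ..., v_{m-1}, v_m, v_{m+1}, ... at t=0 and t=1, with K input edges]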
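The estimate described above can be explored numerically. The formulas themselves are not preserved in the transcript, so the sketch below rests on an assumption: that a contradiction at a node is detectable with probability about 0.5 (m/n)^K (all K inputs fall among the first m nodes, and a random function disagrees with probability 0.5), which gives an expected 2^m (1 - 0.5 (m/n)^K)^m consistent partial assignments. The function name `estimated_base` is illustrative.

```python
import math

def estimated_base(K, n=1000):
    """Base b such that max over m of 2^m * (1 - 0.5*(m/n)^K)^m is about b^n (assumed model)."""
    best_log = 0.0
    for m in range(1, n + 1):
        p_consistent = 1.0 - 0.5 * (m / n) ** K           # assumed per-node consistency probability
        log_count = m * (math.log(2.0) + math.log(p_consistent))
        best_log = max(best_log, log_count)
    return math.exp(best_log / n)

for K in range(2, 7):
    print(K, round(estimated_base(K), 3))
```

Under this assumed model the bases for K = 2 and K = 3 come out close to the 1.35 and 1.43 listed in the "Basic" row of the table in the slides, which is some evidence that the model matches the intended analysis.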

Computational Experiment
- Running time increases exponentially, but the bases are less than 2

  Empirical Time Complexity
  K                 2        3        4        5        6
  Basic             1.39^n   1.46^n   1.53^n   1.57^n   1.60^n
  Outdegree-based   1.23^n   1.30^n   1.37^n   1.42^n   1.47^n

Issues on Worst Case Time Complexity
- Detection of a singleton attractor for BNs with maximum indegree K ⇒ (K+1)-SAT
  - O(1.322^n) time for K=2 (randomized)
  - We developed an O((1.322-δ)^n) time algorithm for K=2
- The detection problem remains NP-hard even for K=2
- O(1.587^n) time algorithm for BNs with AND/OR nodes (no constraint on K) [Melkman, Tamura & Akutsu, 2010]

Reduction from BN-ATTRACTOR to SAT
- Detection of a singleton attractor with maximum indegree K ⇒ (K+1)-SAT (Boolean SATisfiability problem)
[Figure: a node v_i with input nodes v_j and v_k]
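One standard way to realize such a reduction is to forbid, for every node and every input pattern, the single output value that disagrees with the node's function; each resulting clause involves the node and its at most K inputs, hence at most K+1 literals. The sketch below follows that textbook construction, which may differ in detail from the one on the slide, and checks satisfiability by exhaustive search instead of a real SAT solver. All names are illustrative.

```python
from itertools import product

def bn_to_cnf(nodes):
    """Encode the fixed-point condition v_i = f_i(inputs) as CNF clauses.

    A literal is a pair (variable, value) and is satisfied when x[variable] == value.
    """
    clauses = []
    for i, (inputs, f) in enumerate(nodes):
        for bits in product((0, 1), repeat=len(inputs)):
            out = f(*bits)
            # Clause: some input differs from `bits`, or v_i equals f(bits).
            clause = [(j, 1 - b) for j, b in zip(inputs, bits)]
            clause.append((i, out))
            clauses.append(clause)
    return clauses

def satisfying_assignments(clauses, n):
    """Exhaustive check, standing in for a SAT solver in this sketch."""
    return [x for x in product((0, 1), repeat=n)
            if all(any(x[v] == val for v, val in clause) for clause in clauses)]

# Example network: A' = B, B' = A and C, C' = not A (nodes indexed A=0, B=1, C=2).
example = [
    ((1,), lambda b: b),
    ((0, 2), lambda a, c: a & c),
    ((0,), lambda a: 1 - a),
]
cnf = bn_to_cnf(example)
print(len(cnf), "clauses; satisfying assignments:", satisfying_assignments(cnf, 3))
```

The satisfying assignments of the formula are exactly the singleton attractors of the network, so any SAT algorithm for (K+1)-SAT yields a detection algorithm.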

Basic Idea in O(1.587^n) Time Algorithm
- Consider recursive assignment of 0-1 values to nodes
  - (A) v=0 ⇒ u=0, v=1 ⇒ w=1
  - (B) v=0 ⇒ u=0 and w=1
- Let f(k) be the number of assignments for a BN with k variables [recurrence not preserved in the transcript]
- By solving the recurrence (as for Fibonacci numbers), a bound of the form O(c^n) on f(k) is obtained [base c not preserved in the transcript]
- However, the above procedure cannot be applied to all cases (e.g., not to bipartite networks)
  - Combination with SAT is required ⇒ O(1.587^n) time
[Figure: nodes u, v, w for cases (A) and (B); all nodes are OR; NOT input]
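Recurrences of this kind are solved by finding the root of a characteristic equation, exactly as for Fibonacci numbers. The sketch below computes that root for a branching rule that fixes d_1, d_2, ... variables in its branches. Reading case (A) as fixing two variables in each branch and case (B) as fixing three variables in one branch and one in the other is an interpretation made here, since the recurrence itself is not preserved in the transcript.

```python
def branching_number(decrements, tol=1e-12):
    """Root x > 1 of sum(x**(-d) for d in decrements) == 1, found by bisection.

    If f(k) = sum of f(k - d) over the branches, then f(k) grows as O(x^k).
    """
    def total(x):
        return sum(x ** (-d) for d in decrements)

    lo, hi = 1.0, 2.0
    while total(hi) > 1:          # widen the bracket if the root exceeds 2
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi

print("Fibonacci, f(k) = f(k-1) + f(k-2):", round(branching_number((1, 2)), 4))
print("assumed case (A), f(k) = 2 f(k-2):", round(branching_number((2, 2)), 4))
print("assumed case (B), f(k) = f(k-1) + f(k-3):", round(branching_number((1, 3)), 4))
```

Whatever the exact recurrence on the slide is, the same routine applies: the worst branching number over all cases gives the base of the exponential bound.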

Attractor Detection: Previous Works (2)

Singleton Attractors
- Recursive (average time)
  - K=2: O(1.19^n)
  - K=3: O(1.27^n)
- SAT-based (detection)
  - K=2: O(1.323^n)
  - K=3: O(1.474^n)
  - N/A otherwise
- Our algorithms (detection)
  - K=2: O((1.323-δ)^n) (δ = ...) [IEICE, 2009]
  - AND/OR of literals (any K): O(1.587^n) [IPL, 2010]
  - Nested canalyzing (any K): O(1.871^n) [JCB, 2013]
  - Nested canalyzing (any K) in a partial k-tree, with period p: O(n^{2p(w+1)} poly(n)) [TCBB, 2012]

Cyclic Attractors (Recursive, Average Case)
  K          2           3           4           5
  period=2   O(1.57^n)   O(1.70^n)   O(1.78^n)   O(1.83^n)
  period=3   O(1.72^n)   O(1.86^n)   O(1.92^n)   O(1.95^n)

Control of Boolean Network

BN-Control: Previous Works
- Datta et al. defined a control problem for PBNs (a probabilistic extension of BNs) and proposed a dynamic programming-based method [Machine Learning, 52: , 2003]
  - They also proposed various extensions
  - But their method must handle 2^n × 2^n matrices
- BN-Control (and also PBN-Control) is NP-hard
- BN-Control can be solved in polynomial time if the network has a tree structure [Akutsu et al., JTB 2007]
- Practical approach based on model checking/SAT [Langmead & Jha, APBC 2008, JBCB 2009]
- Theoretical studies using the semi-tensor product [Cheng, 2009]

Definition of BN-Control [Akutsu et al., J. Theo. Biol. 2007]
- Input
  - BN with internal nodes v_1, ..., v_n and external nodes u_1, ..., u_m
  - Initial state v^0 and desired state v^M
- Output
  - Sequence of states of the external nodes, u(0), u(1), ..., u(M), such that v(0) = v^0 and v(M) = v^M (leading to the desired state at time M)

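Dynamic Programming for Control of BN
- BN version of the algorithm by Datta et al.
- DP table: [table definition not preserved in the transcript]
  - An entry takes 1 if there is a control sequence leading to the target state
  - It can be computed by [recurrence not preserved in the transcript]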

Illustration of DP Algorithm
[Figure: DP computation; labels: D[0,1,1, 3] = 1; D[1,1,1, 2] = 1; u_1=1, u_2=1; D[0,0,0, 2] = 0]
- But the size of the DP table is exponential
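The DP idea can be sketched as a backward pass over time: an entry for (state, t) is 1 when some control input leads to a state whose entry at time t+1 is 1. The code below is a minimal illustration under stated assumptions, not the slide's exact formulation: the toy network in `toy_step` (two internal nodes, one external node) is hypothetical, and the returned sequence contains the M control inputs u(0), ..., u(M-1) that are actually applied.

```python
from itertools import product

def control_sequence(step_fn, n, m, v_init, v_target, M):
    """Return controls u(0..M-1) driving v_init to v_target at time M, or None."""
    states = list(product((0, 1), repeat=n))      # 2^n entries: the table is exponential
    controls = list(product((0, 1), repeat=m))
    reach = {v: v == v_target for v in states}    # table slice for time M
    choices = []                                  # choices[t][v]: a control that works at time t
    for _ in range(M):                            # fill slices for t = M-1 down to 0
        choice, new_reach = {}, {}
        for v in states:
            new_reach[v] = False
            for u in controls:
                if reach[step_fn(v, u)]:
                    new_reach[v] = True
                    choice[v] = u
                    break
        choices.append(choice)
        reach = new_reach
    if not reach[v_init]:
        return None
    choices.reverse()
    sequence, v = [], v_init
    for t in range(M):
        u = choices[t][v]
        sequence.append(u)
        v = step_fn(v, u)
    return sequence

def toy_step(v, u):
    """Hypothetical network: v1' = u1, v2' = v1 and u1."""
    (v1, v2), (u1,) = v, u
    return (u1, v1 & u1)

print(control_sequence(toy_step, 2, 1, (0, 0), (1, 1), M=2))
```

The table has one entry per global state and time step, which makes the exponential size noted on the slide concrete.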

Integer Linear Programming-Based Approach

Integer Programming
- Linear Programming (LP)
  - Maximize (or minimize) a linear objective function under constraints given by linear inequalities
- Integer Linear Programming (ILP)
  - LP + constraints that specified variables must take integer values
  - Several efficient solvers: CPLEX, Gurobi
  - Used for solving various NP-hard problems

ILP for Attractor Detection (1)
- x_i: state of v_i
[Formulas not preserved in the transcript]

ILP for Attractor Detection (2)
[Formulas not preserved in the transcript]

ILP for Attractor Detection (3)
[Formulas not preserved in the transcript; annotation: "dummy for using ILP"]
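Since the ILP formulas are not preserved, the sketch below only illustrates the general technique of expressing Boolean update rules as linear inequalities over 0-1 variables. It uses the standard textbook linearizations of AND, OR, and NOT, which are not necessarily the encoding used on the slides, and it checks feasibility by enumeration instead of calling a solver such as CPLEX or Gurobi. All function names are illustrative.

```python
from itertools import product

def and_constraints(y, x1, x2):
    """y = x1 AND x2 as linear inequalities: y <= x1, y <= x2, y >= x1 + x2 - 1."""
    return y <= x1 and y <= x2 and y >= x1 + x2 - 1

def or_constraints(y, x1, x2):
    """y = x1 OR x2 as linear inequalities: y >= x1, y >= x2, y <= x1 + x2."""
    return y >= x1 and y >= x2 and y <= x1 + x2

def not_constraints(y, x):
    """y = NOT x as a linear equality: y + x = 1."""
    return y + x == 1

def feasible_points():
    """0-1 points satisfying the fixed-point constraints of A' = B, B' = A and C, C' = not A."""
    return [(a, b, c) for a, b, c in product((0, 1), repeat=3)
            if a == b and and_constraints(b, a, c) and not_constraints(c, a)]

print(feasible_points())
```

Because only feasibility matters for detection, any objective function can be attached when such constraints are handed to a solver, which is one plausible reading of the "dummy" annotation.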

ILP formalization for BN-Control
[Formulas not preserved in the transcript; annotation: "major changes from Attractor Detection"]

Summary

- Boolean network
  - A discrete model of a genetic network
  - Similar to digital circuits
- Attractor Detection/Enumeration
  - NP-hard
  - Much better than the naïve O(2^n) bound for bounded-indegree cases
  - Identification of cyclic attractors is more difficult
- Control of Boolean networks
  - NP-hard
  - Can be solved by a DP algorithm (but in exponential time)
- Integer Linear Programming-based Approach
  - Simple
  - Flexible for modifications/extensions
  - Fast if indegree ≤ 2