Polynomial dynamical systems over finite fields, with applications to modeling and simulation of biological networks. IMA Workshop on Applications of.

Slides:



Advertisements
Similar presentations
Systems biology SAMSI Opening Workshop Algebraic Methods in Systems Biology and Statistics September 14, 2008 Reinhard Laubenbacher Virginia Bioinformatics.
Advertisements

Stochastic algebraic models SAMSI Transition Workshop June 18, 2009 Reinhard Laubenbacher Virginia Bioinformatics Institute and Mathematics Department.
Vector Spaces A set V is called a vector space over a set K denoted V(K) if is an Abelian group, is a field, and For every element vV and K there exists.
Singularity of a Holomorphic Map Jing Zhang State University of New York, Albany.
CSCI 115 Chapter 6 Order Relations and Structures.
Orthogonal Drawing Kees Visser. Overview  Introduction  Orthogonal representation  Flow network  Bend optimal drawing.
A Mathematical Formalism for Agent- Based Modeling 22 nd Mini-Conference on Discrete Mathematics and Algorithms Clemson University October 11, 2007 Reinhard.
An Intro To Systems Biology: Design Principles of Biological Circuits Uri Alon Presented by: Sharon Harel.
Approximation Algorithms Chapter 14: Rounding Applied to Set Cover.
Degenerations of algebras. José-Antonio de la Peña UNAM, México Advanced School and Conference on Homological and Geometrical Methods in Representation.
DYNAMICS OF RANDOM BOOLEAN NETWORKS James F. Lynch Clarkson University.
Kick-off Meeting, July 28, 2008 ONR MURI: NexGeNetSci Distributed Coordination, Consensus, and Coverage in Networked Dynamic Systems Ali Jadbabaie Electrical.
6.896: Topics in Algorithmic Game Theory Lecture 11 Constantinos Daskalakis.
Discrete models of biological networks Segunda Escuela Argentina de Matematica y Biologia Cordoba, Argentina June 29, 2007 Reinhard Laubenbacher Virginia.
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Representing Relations Using Matrices
Predicting Gene Expression using Logic Modeling and Optimization Abhimanyu Krishna New Challenges in the European Area: Young Scientist’s 1st International.
Entropy Rates of a Stochastic Process
Complexity 16-1 Complexity Andrei Bulatov Non-Approximability.
Computability and Complexity 23-1 Computability and Complexity Andrei Bulatov Search and Optimization.
Date:2011/06/08 吳昕澧 BOA: The Bayesian Optimization Algorithm.
On the Toric Varieties Associated with Bicolored Metric Trees Khoa Lu Nguyen Joseph Shao Summer REU 2008 at Rutgers University Advisors: Prof. Woodward.
Dagstuhl 2010 University of Puerto Rico Computer Science Department The power of group algebras for constrained multilinear monomial detection Yiannis.
1. Elements of the Genetic Algorithm  Genome: A finite dynamical system model as a set of d polynomials over  2 (finite field of 2 elements)  Fitness.
Approximation Algorithms
The Equivalence between Static (rigid) and Kinematic (flexible, mobile) Systems through the Graph Theoretic Duality Dr. Offer Shai Tel-Aviv University.
Introduction to Gröbner Bases for Geometric Modeling Geometric & Solid Modeling 1989 Christoph M. Hoffmann.
Thompson’s Group Jim Belk. Associative Laws Let  be the following piecewise-linear homeomorphism of  :
CS5371 Theory of Computation Lecture 1: Mathematics Review I (Basic Terminology)
Discrete models of biochemical networks Algebraic Biology 2007 RISC Linz, Austria July 3, 2007 Reinhard Laubenbacher Virginia Bioinformatics Institute.
Conjugacy and Dynamics in Thompson’s Groups Jim Belk (joint with Francesco Matucci)
A Differential Approach to Inference in Bayesian Networks - Adnan Darwiche Jiangbo Dang and Yimin Huang CSCE582 Bayesian Networks and Decision Graphs.
Gröbner Bases Bernd Sturmfels Mathematics and Computer Science University of California at Berkeley.
Applied Discrete Mathematics Week 10: Equivalence Relations
MA5209 Algebraic Topology Wayne Lawton Department of Mathematics National University of Singapore S ,
Preperiodic Points and Unlikely Intersections joint work with Laura DeMarco Matthew Baker Georgia Institute of Technology AMS Southeastern Section Meeting.
A Study of The Applications of Matrices and R^(n) Projections By Corey Messonnier.
A Framework for Distributed Model Predictive Control
Polynomial models of finite dynamical systems Reinhard Laubenbacher Virginia Bioinformatics Institute and Mathematics Department Virginia Tech.
Yaomin Jin Design of Experiments Morris Method.
Section 2-8 First Applications of Groebner Bases by Pablo Spivakovsky-Gonzalez We started this chapter with 4 problems: 1.Ideal Description Problem: Does.
Week 10Complexity of Algorithms1 Hard Computational Problems Some computational problems are hard Despite a numerous attempts we do not know any efficient.
Week 11 What is Probability? Quantification of uncertainty. Mathematical model for things that occur randomly. Random – not haphazard, don’t know what.
ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition Objectives: Reestimation Equations Continuous Distributions.
Simultaneously Learning and Filtering Juan F. Mancilla-Caceres CS498EA - Fall 2011 Some slides from Connecting Learning and Logic, Eyal Amir 2006.
Mathematical Preliminaries
CSCI 115 Chapter 8 Topics in Graph Theory. CSCI 115 §8.1 Graphs.
Basic Concepts of Information Theory Entropy for Two-dimensional Discrete Finite Probability Schemes. Conditional Entropy. Communication Network. Noise.
Maximum Entropy Model, Bayesian Networks, HMM, Markov Random Fields, (Hidden/Segmental) Conditional Random Fields.
ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition Objectives: Reestimation Equations Continuous Distributions.
Analysis of Boolean Functions and Complexity Theory Economics Combinatorics …
Commuting birth-and-death processes Caroline Uhler Department of Statistics UC Berkeley (joint work with Steven N. Evans and Bernd Sturmfels) MSRI Workshop.
Analogue and digital techniques in closed loop regulation applications
Lecture 7: Constrained Conditional Models
CS 9633 Machine Learning Support Vector Machines
State Space Representation
Richard Anderson Lecture 26 NP-Completeness
Analytics and OR DP- summary.
Richard Anderson Lecture 26 NP-Completeness
Paper Code : BCA-27 Paper Title : Discrete Mathematics.
Computability and Complexity
Digital Control Systems (DCS)
Hidden Markov Models Part 2: Algorithms
Richard Anderson Lecture 25 NP-Completeness
Steven Lindell Scott Weinstein
State Space Analysis UNIT-V.
LECTURE 15: REESTIMATION, EM AND MIXTURES
Social Networks and Simplicial Complexes
Locality In Distributed Graph Algorithms
Representing Relations Using Matrices
Presentation transcript:

Polynomial dynamical systems over finite fields, with applications to modeling and simulation of biological networks. IMA Workshop on Applications of Algebraic Geometry in Biology, Dynamics, and Statistics March 6, 2007 Reinhard Laubenbacher Virginia Bioinformatics Institute and Mathematics Department Virginia Tech

Polynomial dynamical systems Let k be a finite field and f1, … , fn  k[x1,…,xn] f = (f1, … , fn) : kn → kn is an n-dimensional polynomial dynamical system over k. Natural generalization of Boolean networks. Fact: Every function kn → k can be represented by a polynomial, so all finite dynamical systems kn → kn are polynomial dynamical systems.

k = F3 = {0, 1, 2}, n = 3 f1 = x1x22+x3, f2 = x2+x3, f3 = x12+x22. Example k = F3 = {0, 1, 2}, n = 3 f1 = x1x22+x3, f2 = x2+x3, f3 = x12+x22. Dependency graph (wiring diagram)

http://dvd.vbi.vt.edu

Motivation: Gene regulatory networks “[The] transcriptional control of a gene can be described by a discrete-valued function of several discrete-valued variables.” “A regulatory network, consisting of many interacting genes and transcription factors, can be described as a collection of interrelated discrete functions and depicted by a wiring diagram similar to the diagram of a digital logic circuit.” Karp, 2002

Nature 406 2000

Motivation (2): a mathematical formalism for agent-based simulation Example 1: Game of life Example 2: Large-scale simulations of population dynamics and epidemiological networks (e.g., the city of Chicago) Need a mathematical formalism.

Network inference using finite dynamical systems models Variables x1, … , xn with values in k. (s1, t1), … , (sr, tr) state transition observations with sj, tj  kn. Network inference: Identify a collection of “most likely” models/dynamical systems f=(f1, … ,fn): kn → kn such that f(sj)=tj.

Important model information obtained from f=(f1, … ,fn): The “wiring diagram” or “dependency graph” directed graph with the variables as vertices; there is an edge i → j if xi appears in fj. The dynamics directed graph with the elements of kn as vertices; there is an edge u → v if f(u) = v.

The Hallmarks of Cancer Hanahan & Weinberg (2000)

I = <f  k[x1, … xn] | f(si)=0 for all i>. The model space Let I be the ideal of the points s1, … , sr, that is, I = <f  k[x1, … xn] | f(si)=0 for all i>. Let f = (f1, … , fn) be one particular feasible model. Then the space M of all feasible models is M = f + I = (f1 + I, … , fn + I).

Problem: Given data (si, ti), i=1, … , r, Wiring diagrams Problem: Given data (si, ti), i=1, … , r, (a collection of state transitions for one node in the network), find all minimal (wrt inclusion) sets of variables y1, … , ym  {x1, … , xn} such that (f +I) ∩ k[y1, … , ym] ≠ Ø. Each such minimal set corresponds to a minimal wiring diagram for the variable under consideration.

The “minimal sets” algorithm For a  k, let Xa = {si | ti = a}. Let X = {Xa | a  k}. Then f 0+I = M = {f  k[x] | f(p) = a for all p  Xa}. Want to find f  M which involves a minimal number of variables, i.e., there is no g  M whose support is properly contained in supp(f).

Definitions. For F  {1, … , n}, let RF = k[xi | i  F]. The algorithm Definitions. For F  {1, … , n}, let RF = k[xi | i  F]. Let ΔX = {F | M ∩ RF ≠ Ø}. For p  Xa, q  Xb, a ≠ b  k, let m(p, q) = pi≠qi xi. Let MX = monomial ideal in k[x1, … , xn] generated by all monomials m(p, q) for all a, b  k. (Note that ΔX is a simplicial complex, and MX is the face ideal of the Alexander dual of ΔX.)

The algorithm Proposition. (Jarrah, L., Stigler, Stillman) A subset F of {1, … , n} is in ΔX if and only if the ideal < xi | i  F > contains the ideal MX.

The algorithm Corollary. To find all possible minimal wiring diagrams, we need to find all minimal subsets of variables y1, … , ym such that MX is contained in <y1, … , ym>. That is, we need to find all minimal primes containing MX.

This scoring method has a bias toward small sets. Let F = {F1, … , Ft} be the output of the algorithm. For s = 1, … , n, let Zs = # sets in F with s elements. For i = 1, … , n, let Wi(s) = # sets in F of size s which contain xi. S(xi) := ΣWi(s) / sZs where the sum extends over all s such that Zs ≠ 0. T(Fj) := ΠxiFj S(xi). Normalization  probability distribution on F of min. var. sets This scoring method has a bias toward small sets.

Problem: The model space f + I is WAY TOO BIG Model selection Problem: The model space f + I is WAY TOO BIG Solution: Use “biological theory” to reduce it.

Limit the admissible dynamical properties of models. “Biological theory” Limit the structure of the coordinate functions fi to those which are “biologically meaningful.” (Characterize special classes computationally.) Limit the admissible dynamical properties of models. (Identify and computationally characterize classes for which dynamics can be predicted from structure.)

Nested canalyzing functions

Nested canalyzing functions

A non-canalyzing Boolean network f1 = x4 f2 = x4+x3 f3 = x2+x4 f4 = x2+x1+x3

A nested canalyzing Boolean network g1 = x4 g2 = x4*x3 g3 = x2*x4 g4 = x2*x1*x3

Polynomial form of nested canalyzing Boolean functions

The vector space of Boolean polynomial functions

The variety of nested canalyzing functions

Input and output values as functions of the coefficients

The algebraic geometry Corollary. The ideal of relations defining the class of nested canalyzing Boolean functions on n variables forms an affine toric variety over the algebraic closure of F2. The irreducible components correspond to the functions that are nested canalyzing with respect to a given variable ordering. (joint work with Jarrah, Raposa)

Dynamics from structure Theorem. Let f = (f1, … , fn) : kn → kn be a monomial system. If k = F2, then f is a fixed point system if and only if every strongly connected component of the dependency graph of f has loop number 1. (Colón-Reyes, L., Pareigis) The case for general k can be reduced the Boolean + linear case. (Colón-Reyes, Jarrah, L., Sturmfels)

What extra mathematical structure is needed to make progress? Questions What are good classes of functions from a biological and/or mathematical point of view? What extra mathematical structure is needed to make progress? How does the nature of the observed data points affect the structure of f + I and MX?

Modeling and Simulation of Biological Networks Advertisement 1 Modeling and Simulation of Biological Networks Symposia in Pure and Applied Math, AMS in press articles by Allman-Rhodes, Pachter, Stigler, … .

Special year 2008-09 at SAMSI: Advertisement 2 Special year 2008-09 at SAMSI: “Algebraic methods in biology and statistics” (subject to final approval)