Using Gröbner Bases to Reconstruct Regulatory Modules in C. elegans Brandilyn Stigler Southern Methodist University.



SAMSI September 16, 2008
Brandy’s Bio
Education and Training
- PhD in mathematics, 2005, Virginia Tech
  - Advisor: Reinhard Laubenbacher
- Postdoctoral Fellow, 2008, Math. Biosci. Inst.
  - Math mentor: Winfried Just (Math, Ohio U.)
  - Bio mentor: Helen Chamberlin (Molecular genetics, OSU)
Research Interests
- Systems biology
  - Reverse engineering of gene regulatory networks
- Computational algebra
  - Gröbner bases of zero-dimensional radical ideals

Systems Biology
- (Kitano) The study of an organism, viewed as an integrated and interacting network of biochemicals, through understanding of its structure and dynamics and its methods of control and design.
- (Ideker) The study of biological systems by perturbing them, monitoring the responses, integrating the data, and formulating mathematical models that describe the system structure and its response.

Gene regulatory networks are the main objects of study in molecular systems biology (SB)

Molecular Systems Biology
[Figure: diagram labeled “forward engineering” and “reverse engineering”]


Overview
- Models and methods in reverse engineering (RE)
- Polynomial dynamical systems
- An algorithm for reverse engineering using computational algebra
- Application to tissue development in C. elegans

Mathematical Methods for Modeling
- Continuous systems
- Linear algebra
- Statistics, Bayesian inference
- Boolean algebra
- Logic
- Stochastic processes
Ideker et al. Building with a scaffold: emerging strategies for high- to low-level cellular modeling. Trends Biotech 2003.

Reverse engineering: continuous systems
Yeung et al. (2002) built a model of linear ODEs for a gene regulatory network near a steady state.
- X = mRNA concentrations (given)
- W = type and strength of interaction (unknown)
- B = external stimuli (given)
- Robust regression to select the sparsest matrix; W_0 is a particular solution, C vanishes on X
- Singular value decomposition; U, V orthogonal
[Equations not preserved in this transcript]
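The equations on this slide did not survive extraction. As a rough sketch only, assuming the model has the generic form dX/dt = WX + B, with X an n × m data matrix of full column rank (fewer time points than genes), the roles of W_0, C, U, and V can be written as below. The thin SVD and the symbol Σ are notation introduced here, and the exact formulation in Yeung et al. may differ.

```latex
\begin{aligned}
\dot{X} &= W X + B, \\
X &= U \Sigma V^{T}, \qquad U^{T} U = I,\quad V^{T} V = I, \\
W_0 &= (\dot{X} - B)\, V \Sigma^{-1} U^{T}
      \quad\Longrightarrow\quad W_0 X = \dot{X} - B, \\
W &= W_0 + C \quad \text{with } C X = 0 .
\end{aligned}
```

Every W in this family fits the data equally well, which is why a further criterion (here, sparsity via robust regression) is needed to select one.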

Challenges of RE Methods
- Many models may fit the same data.
  - Analysis of the solution (model) space may be difficult.
  - Model selection is crucial.
- Continuous models: parameters may not be known.
  - Needed: methods to “learn” parameters
  - Solution: genetic algorithms, for example
- Boolean models: algorithms based on enumeration.
  - Needed: algorithms to compute the “space” of models
  - Solution: use of algebraic techniques

Mathematical Methods for Modeling
- Computational algebra
- Polynomial dynamical systems
Ideker et al. Building with a scaffold: emerging strategies for high- to low-level cellular modeling. Trends Biotech 2003.

Polynomial Dynamical Systems
- Genes (proteins, etc.) g_1, g_2, …, g_n
- Variables x_1, x_2, …, x_n with states in a finite set S
- Transition functions f_i
- Finite dyn. sys. f = ( f_1(x_1,…,x_n), f_2(x_1,…,x_n), …, f_n(x_1,…,x_n) )
- If |S| is prime, then S can be given the structure of a field.
- Theorem: every function f_i : S^n → S is a polynomial in S[x_1,…,x_n].
- Polynomial dynamical system (PDS) := finite dyn. sys. f : S^n → S^n over a finite field
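The theorem above can be checked by hand in the one-variable case. The following is a minimal sketch, assuming S = Z_p and using the standard indicator polynomial 1 - (x - a)^(p-1), which equals 1 at x = a and 0 elsewhere by Fermat's little theorem. The function names are illustrative and do not come from the talk.

```python
def poly_mul(a, b, p):
    """Multiply two polynomials (coefficient lists, lowest degree first) mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def interpolate(table, p):
    """Coefficients of a polynomial over Z_p that takes the value table[a] at each a."""
    coeffs = [0] * p
    for a in range(p):
        # Build (x - a)^(p-1), then turn it into the indicator 1 - (x - a)^(p-1).
        power = [1]
        for _ in range(p - 1):
            power = poly_mul(power, [(-a) % p, 1], p)
        indicator = [(-c) % p for c in power]
        indicator[0] = (indicator[0] + 1) % p
        for k, c in enumerate(indicator):
            coeffs[k] = (coeffs[k] + table[a] * c) % p
    return coeffs


def evaluate(coeffs, x, p):
    """Evaluate a coefficient list at x, mod p."""
    return sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p


# An arbitrary function Z_3 -> Z_3 given as a lookup table: 0 -> 2, 1 -> 0, 2 -> 1.
table = [2, 0, 1]
coeffs = interpolate(table, 3)
values = [evaluate(coeffs, a, 3) for a in range(3)]  # reproduces the table
```

The same construction works in several variables by multiplying one indicator per coordinate.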

PDSs store structure and dynamics
f = (f_1, f_2, f_3) : (Z_3)^3 → (Z_3)^3
[Polynomials only partly preserved: f_1 = – x… x_1, f_2 = x_3^2 – x…, f_3 = – x… x…]
[Figure: wiring diagram (WD) on nodes x_1, x_2, x_3; state space showing a fixed point and a limit cycle]
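Since the slide's polynomials are not fully legible, the sketch below uses a made-up PDS on (Z_3)^3 to show how both kinds of information can be read off a PDS: the wiring diagram (which variables each f_i depends on) and the attractors of the state space (fixed points and limit cycles). The polynomials and function names are hypothetical and are not the ones from the talk.

```python
from itertools import product

P = 3  # Z_3 is a field because 3 is prime


def f(s):
    """Hypothetical PDS: f1 = x1*x2, f2 = x3^2 - x1, f3 = 2*x2 (all mod 3)."""
    x1, x2, x3 = s
    return ((x1 * x2) % P, (x3 * x3 - x1) % P, (2 * x2) % P)


STATES = list(product(range(P), repeat=3))


def wiring_diagram():
    """Edge (j, i): changing coordinate j alone can change coordinate i of f."""
    edges = set()
    for s in STATES:
        for j in range(3):
            for v in range(P):
                t = s[:j] + (v,) + s[j + 1:]
                for i in range(3):
                    if f(s)[i] != f(t)[i]:
                        edges.add((j, i))
    return edges


def attractors():
    """Iterate f from every state; collect each cycle once, in a canonical rotation."""
    cycles = set()
    for s in STATES:
        seen = []
        while s not in seen:
            seen.append(s)
            s = f(s)
        cycle = seen[seen.index(s):]
        k = cycle.index(min(cycle))
        cycles.add(tuple(cycle[k:] + cycle[:k]))
    return cycles


edges = wiring_diagram()
cycles = attractors()
fixed_points = [c[0] for c in cycles if len(c) == 1]
limit_cycles = [c for c in cycles if len(c) > 1]
```

Exhaustive enumeration is only feasible for tiny systems like this one (27 states), which is part of the motivation for working with the polynomials directly.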

Computing PDSs from Data
Input: T = {s_1,…,s_t}, a time series in k^n (k = a finite field)
Output: F = a minimal PDS
1. Find one particular solution f^0 = (f_1,…,f_n) with f^0(s_i) = s_{i+1}.
2. Construct the ideal of vanishing functions I = … (generators lost in extraction).
3. All PDSs that fit T: f^0 + I := { (f_1,…,f_n) + (g_1,…,g_n) }, with each g_i in I.
4. Select the minimal PDS F = (F_1,…,F_n) with F_i = f_i % I.
Implemented in Macaulay 2. Available at (URL not preserved)
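The first three steps can be illustrated without a computer algebra system. The sketch below assumes k = Z_3 and a small illustrative time series; it builds a particular solution from indicator functions and then shows that adding a function vanishing on the data gives a different model that fits equally well. It works with function values rather than symbolic polynomials, so the Gröbner basis reduction of the last step is not reproduced here. All names are hypothetical.

```python
P = 3
# Illustrative time series s_1, ..., s_5 in (Z_3)^3.
T = [(1, 0, 0), (2, 1, 2), (0, 2, 0), (0, 1, 0), (0, 1, 0)]
INPUTS, OUTPUTS = T[:-1], T[1:]


def indicator(point, x):
    """Product over j of 1 - (x_j - point_j)^(P-1): 1 if x == point, else 0 (mod P)."""
    r = 1
    for xj, pj in zip(x, point):
        r = r * (1 - pow(xj - pj, P - 1, P)) % P
    return r


def f0(x):
    """Particular solution: sum over i of s_{i+1} * indicator_{s_i}(x), per coordinate."""
    return tuple(
        sum(out[k] * indicator(inp, x) for inp, out in zip(INPUTS, OUTPUTS)) % P
        for k in range(len(x))
    )


def g(x):
    """A function that vanishes on every input state, i.e. an element of I."""
    return (1 - sum(indicator(inp, x) for inp in INPUTS)) % P


def f_alt(x):
    """Another member of f0 + I: perturb the first coordinate by a multiple of g."""
    y = f0(x)
    return ((y[0] + 2 * g(x)) % P,) + y[1:]


fits_f0 = all(f0(s) == t for s, t in zip(INPUTS, OUTPUTS))
fits_alt = all(f_alt(s) == t for s, t in zip(INPUTS, OUTPUTS))
differ_off_data = f0((1, 1, 1)) != f_alt((1, 1, 1))
```

Both models agree on the data and disagree elsewhere, which is the ambiguity that the choice of a minimal PDS is meant to resolve.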

Gröbner Bases
Let > be a term order, I an ideal in k[x_1,…,x_n], and f a polynomial.
- G = {g_1,…,g_s} ⊂ I is a Gröbner basis for I if the leading term of every nonzero polynomial in I is divisible by the leading term of some g_i under >.
- The normal form of f with respect to G, NF(f, G), is the remainder of f on division by G.
- Gröbner bases exist (not unique).
- NF(f, G) is unique for a fixed term order.
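To make the normal form concrete, here is a minimal sketch of multivariate division over Z_p. It assumes polynomials are stored as dictionaries from exponent tuples to coefficients and uses the lexicographic order with x_1 > x_2 > …, chosen only because it coincides with Python's tuple comparison. It computes a remainder on division by G and does not compute a Gröbner basis itself.

```python
def normal_form(f, G, p):
    """Remainder of f on division by the list G over Z_p (p prime), lex order."""
    f = {e: c % p for e, c in f.items() if c % p}
    remainder = {}
    while f:
        lt = max(f)          # leading exponent under lex order
        c = f[lt]
        for g in G:
            lg = max(g)
            if all(a >= b for a, b in zip(lt, lg)):
                # Subtract (c / lc(g)) * x^(lt - lg) * g to cancel the leading term.
                shift = tuple(a - b for a, b in zip(lt, lg))
                q = c * pow(g[lg], p - 2, p) % p   # inverse via Fermat's little theorem
                for e, gc in g.items():
                    e2 = tuple(a + b for a, b in zip(e, shift))
                    v = (f.get(e2, 0) - q * gc) % p
                    if v:
                        f[e2] = v
                    else:
                        f.pop(e2, None)
                break
        else:
            # No leading term of G divides lt: move the term to the remainder.
            remainder[lt] = c
            del f[lt]
    return remainder


# G = {x^3 - x, y^3 - y} over Z_3: a Gröbner basis (coprime leading terms)
# of the ideal of functions vanishing on all of (Z_3)^2.
G = [{(3, 0): 1, (1, 0): 2}, {(0, 3): 1, (0, 1): 2}]
nf1 = normal_form({(4, 1): 1}, G, 3)             # x^4 * y
nf2 = normal_form({(3, 3): 1, (0, 0): 2}, G, 3)  # x^3 * y^3 + 2
```

Here `nf1` is the dictionary for x^2 y and `nf2` the one for xy + 2; both have every exponent below 3, as expected for representatives of functions on (Z_3)^2.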

RE Methods using PDSs
- R Laubenbacher, BS: Gröbner bases (GB)
- E Allen, J Fetrow, L Daniel, S Thomas, D John: GB and Deegan–Packel Index of Power (DPIP)
- E Dimitrova, A Jarrah, R Laubenbacher, BS: Gröbner fan (GF) and DPIP
- D Heldt, M Kreuzer, S Pokutta, H Poulisse: Approximate GB
- P Vera-Licona: Evolutionary algorithm
- A Jarrah, R Laubenbacher, M Stillman, BS: Minimal sets (MS)
- BS, A Jarrah, R Laubenbacher, P Mendes: MS and GF

Reverse Engineering using PDSs
[Figure: workflow diagram with stages labeled Experim. data, Mutual information, Discrete data, Ideal variety, Model space, Primary decomposition, Minimal WD, Gröbner fan, Minimal PDS]
Model space: { f + I | f(t_i) = t_{i+1}, I = … }
Discrete data:
  Time  t_1  t_2  t_3  t_4  t_5
  x_1    1    2    0    0    0
  x_2    0    1    2    1    1
  x_3    0    2    0    0    0
Stigler, Jarrah, Laubenbacher, Mendes. Reverse engineering of dynamic networks. NY Acad Sci 2007.
Jarrah, Laubenbacher, Stillman, Stigler. Reverse engineering of polynomial dynamical systems. Adv Appl Math 2007.

Computing Minimal WDs
[Figure: workflow diagram and discrete time series table repeated from the previous slide]
Primary decomposition produces the minimal sets of variables required to define a PDS, thereby computing the intersection of all wiring diagrams.
Transitions (input state → next value of x_2):
  (1 0 0) → 1
  (2 1 2) → 2
  (0 2 0) → 1
  (0 1 0) → 1
Encode: [monomial ideal and its primary decomposition; generators lost in extraction]
Interpret: x_1 → x_2 (or x_3 → x_2) in all WDs
Jarrah et al. Reverse engineering of polynomial dynamical systems. Adv Appl Math 2007.
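The interpretation on this slide can be reproduced from the four transitions by a brute-force search. This is a sketch under the assumption that a variable set is admissible when it separates every pair of inputs with different outputs; it enumerates minimal hitting sets directly instead of computing a primary decomposition, and the function names are hypothetical.

```python
from itertools import combinations

# (input state, next value of x2), as listed on the slide.
DATA = [((1, 0, 0), 1), ((2, 1, 2), 2), ((0, 2, 0), 1), ((0, 1, 0), 1)]


def differing_sets(data):
    """For each pair of inputs with different outputs: coordinates where they differ."""
    sets = []
    for (a, fa), (b, fb) in combinations(data, 2):
        if fa != fb:
            sets.append(frozenset(j for j in range(len(a)) if a[j] != b[j]))
    return sets


def minimal_sets(data):
    """Inclusion-minimal variable sets that meet every differing set."""
    need = differing_sets(data)
    n = len(data[0][0])
    hits = []
    for size in range(n + 1):
        for cand in combinations(range(n), size):
            c = set(cand)
            if all(c & d for d in need) and not any(h <= c for h in hits):
                hits.append(c)
    return hits


result = minimal_sets(DATA)  # indices are 0-based: 0 means x1, 2 means x3
```

The differing sets correspond to the monomials x1·x2·x3 and x1·x3, so the search returns {x1} and {x3}, matching the slide's reading that x_1 or x_3 must feed into x_2.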

Exploring the Model Space
[Figure: workflow diagram repeated from the previous slides]
Model space: { f + I | f(t_i) = t_{i+1}, I = … }
- Term orders in a “cone” give the same model.
- The Gröbner fan partitions the term order “space” and allows for efficient exploration of the model space to find the most representative model.

Method Validation: Segment polarity network in the fruit fly
- Network in cell: 15 genes, proteins
- Boolean model (Albert, Othmer 2003)
  - 44 known interactions
  - 6 extracellular interactions
- Time series data
  - Generated wildtype, knockout
  - < 0.01% of 2^21 total states
- 4 most likely PDSs
  - 82% links (36 TP, 2 FP, 8 TN)
  - 100% terms for 6 fncs; 88% TP, 39.5% TN for 9 fncs
  - Missing terms = unobserved interactions
  - 100% fixed points
Stigler et al. Reverse-Engineering of Dynamic Networks. NY Acad of Sci 2007.

Identification of Muscle Module in Caenorhabditis elegans
Fukushige et al. Defining the transcriptional redundancy of early bodywall muscle development in C. elegans: evidence for a unified theory of animal muscle development. Genes and Development 2006.

Regulatory Modules in C. elegans
Baugh et al. (Development 2005) identified tissue-identity genes (TIGs) := targets of PAL-1.
Our goals:
- Model the TIG network using their published microarray time series data.
- Reconstruct the muscle module.
- Identify the ectoderm module.
Joint work with
- H. Chamberlin (OSU)
- R. Hill (OSU)
- R. Laubenbacher (VBI)

Regulatory Module for TIG Network
- Time series contains 10 points.
- Data discretized to 5 states.
- Predicted modules for muscle, ectoderm.
- Most edges in the muscle module are supported in the literature.
- New prediction for the timing of regulatory interactions.
[Figure: network diagram with nodes pal-1, C55C2.1, unc-120, hlh-1, hnd-1]

PDS for TIG Network
Does polynomial “form” encode “phenotype”?

Conclusions
- Algorithm reverse engineers networks by
  - identifying minimal WDs
  - computing all minimal PDSs on the WD.
- Advantages of PDSs
  - Provide a compact representation of the model space and a framework within which to analyze it
  - Facilitate hypothesis generation for further network exploration and discovery.
- Applications to gene regulatory networks
  - High identification in the fruit fly network
  - Reconstructed C. elegans muscle module, proposed ectoderm module
  - Generated new hypotheses for regulation timing
  - Potential for predictions about mechanisms

Collaborations
- C. elegans
  - H. Chamberlin (OSU, molecular genetics)
  - R. Hill (OSU, molecular genetics)
  - R. Laubenbacher (VBI, comp. algebra)
- Yeast
  - VBI
- Simulated networks
  - D. Camacho (Boston U, biochemistry)
  - E. Dimitrova (Clemson, comp. algebra)
  - A. Jarrah (VBI, comp. algebra)
  - R. Laubenbacher (VBI, comp. algebra)
  - P. Mendes (Manchester, biochemistry)
  - P. Vera Licona (DIMACS, comp. algebra)
- Development of theory
  - W. Just (Ohio U, logic/math bio)
  - A. Taylor (Colorado College, comm. algebra)
- Development of algorithms
  - W. Just (Ohio U, logic/math bio)
  - R. Laubenbacher (VBI, comp. algebra)
  - M. Stillman (Cornell, comp. algebra)


Computing PDSs from …
- Step 1: f^0 = (f_1^0, f_2^0, f_3^0)
- Step 2: I = … = … ∩ … (ideal generators lost in extraction)
- Step 3: f_i = f_i^0 mod GB(I), for i = 1, 2, 3 (resulting polynomials only partly preserved: f_1 = – x… x_3, f_2 = x_3^2 – x…, f_3 = – x… x…)
Requires a term order: grevlex with x_1 > x_2 > x_3

Arithmetic in a Finite Number System
- Z_p = integers modulo p = {0, 1, …, p-1}
- Z_12 = {0, 1, …, 11}: “clock” arithmetic
- p prime ⇒ Z_p is a field
- p not prime ⇒ Z_p is a ring but not a field
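The difference between the prime and non-prime cases comes down to whether every nonzero element has a multiplicative inverse. A small sketch (function names are illustrative):

```python
def units(n):
    """Nonzero elements of Z_n that have a multiplicative inverse mod n."""
    return [a for a in range(1, n) if any(a * b % n == 1 for b in range(1, n))]


def is_field(n):
    """Z_n is a field exactly when every nonzero element is a unit."""
    return len(units(n)) == n - 1


clock_units = units(12)   # only the residues coprime to 12 are invertible
z3_is_field = is_field(3)
z12_is_field = is_field(12)
```

In Z_12, for example, 3 · 4 = 0, so neither 3 nor 4 can be inverted, which is why polynomial interpolation arguments that divide by nonzero elements need p to be prime.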

Gene regulatory networks are the main objects of study in molecular systems biology (SB)
Interconnected biochemicals, including DNA-derived (mRNAs and proteins) and non-DNA-derived (metabolites)
- DNA = recipe book
- Gene = recipe
- mRNA = copy of recipe
- Protein = outcome of recipe
- Metabolites = other “helpers”

A new mathematical modeling approach to biochemical networks, with an application to oxidative stress in yeast
- Develop mathematical tools to model biochemical networks given experimental data, combining
  - continuous models (ODEs)
  - discrete models (PDSs)
- Apply to the oxidative stress response network in the yeast S. cerevisiae
[Figure labels: Glutathione metabolism; Yeast]

Transcriptomic, proteomic, and metabolomic data, each with:
- 7 mutants (knockouts) + 1 wildtype
- 3 replicates
- 2 treatments (with and without CHP)
- 8 time points
3 data types × 8 strains × 3 replicates × 2 treatments × 8 time points = 1152 total data points!

Theoretical Improvements
- Computing Gröbner bases (with W. Just, Ohio U)
  - Implemented algorithm using LU decomposition in Macaulay 2
  - Identifies essential variables (EV) (= support of the standard monomials)
  - Reduces computation to the ring in the EV
  - Complexity = O(nm^2 + m^4)
- Computing GB structure (with A. Taylor, Colorado College)
  - Extended Shape Lemma for graded orders
  - Connecting to term detection in statistics, with the solution being noiseless linear regression
Just and Stigler. Efficiently Computing Gröbner Bases of Ideals of Points. Submitted, 2007.
Stigler and Taylor. Reverse Engineering Gröbner Bases. In preparation, 2008.