1 Pattern storage in gene-protein networks Ronald Westra Department of Mathematics Maastricht University.


1 Pattern storage in gene-protein networks Ronald Westra Department of Mathematics Maastricht University

2 1. Problem formulation 2. Modeling of gene/proteins interactions 3. Information Processing in Gene-Protein Networks 4. Information Storage in Gene-Protein Networks 5. Conclusions Items in this Presentation

3 1. Problem formulation How much genome is required for an organism to survive in this World? Some observations...

4 Minimal genome sizes
Mycoplasma genitalium: 500 nm, 580 Kbp, 477 genes, 74% coding DNA, obligatory parasitic endosymbiont
Nanoarchaeum equitans: 400 nm, 460 Kbp, 487 ORFs, 95% coding DNA, obligatory parasitic endosymbiont
SARS CoV: 100 nm, 30 Kbp, 5 ORFs, 98% coding DNA, RNA virus (coronavirus)

5 Organisms like Mycoplasma genitalium, Nanoarchaeum equitans, and the SARS coronavirus exhibit a large number of complex and well-tuned behavioural patterns despite an extremely small genome. A pattern of behaviour here is the adequate conditional sequence of responses of the gene-protein interaction network to an external input: light, oxygen stress, pH, pheromones, and numerous organic and inorganic molecules.

6 Problem formulation Questions:
* How do gene-protein networks perform computations and how do they process real-time information?
* How is information stored in gene-protein networks?
* How do processing speed, computation power, and storage capacity relate to network properties?

7 CENTRAL THOUGHT [1] What is the capacity of a gene-protein network to store input-output patterns, where the stimulus is the input and the behaviour is the output? How does the pattern storage capacity of an organism relate to the size n of its genome and the number m of external stimuli?

8 CENTRAL THOUGHT [2] Conjecture: The task of reverse engineering a gene regulatory network from a time series of m observations is identical to the task of storing m patterns in that network. In the first case an engineer tries to design a network that fits the observations; in the second case Nature selects those networks/organisms that best perform the input-output mapping.

9 Requirements For studying the pattern storage capacity of a gene-protein interaction system we need:
1. a suitable parametrized formal model
2. a method for fixing the model parameters with the given set of input-parameters
We will visit these items in the following slides...

10 2. Modeling the Interactions between Genes and Proteins A prerequisite for the successful reconstruction of gene-protein networks is the way in which the dynamics of their interactions is modeled.

11 Components in Gene-Protein networks
Genes: ON/OFF switches
RNA & proteins: vectors of information exchange between genes
External inputs: interact with higher-order proteins

12 General state space dynamics The evolution of the n-dimensional state vector x (gene expressions) depends on the p-dimensional inputs u, the parameters θ, and Gaussian white noise ξ.
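
The equation on this slide did not survive in the transcript. One plausible general form consistent with the description, given here as an assumption rather than as the author's exact formula (the function name f and the additive placement of the noise are assumed), is:

```latex
\dot{x}(t) = f\bigl(x(t),\, u(t) \mid \theta\bigr) + \xi(t),
\qquad x(t) \in \mathbb{R}^{n},\quad u(t) \in \mathbb{R}^{p}
```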

13 Example of a general dynamics network topology [Figure labels: external inputs, genes/proteins, input-coupling, interaction-coupling]

14 The general case is too complex: it is strongly dependent on unknown microscopic details, and the relevant parameters are unidentified and thus unknown. Therefore approximate interaction potentials and qualitative methods seem appropriate.

15 1. Linear stochastic state-space models Following Yeung et al. and others.
x: the vector (x_1, x_2, ..., x_n), where x_i is the relative gene expression of gene i
u: the vector (u_1, u_2, ..., u_p), where u_i is the value of external input i (e.g. a toxic agent)
ξ(t): white Gaussian noise
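
The model equation itself is missing from the transcript. A minimal simulation sketch follows, assuming the common linear form dx/dt = A x + B u + ξ(t) with an interaction matrix A and an input-coupling matrix B (the names A and B appear later in the slides; the exact form, the Euler-Maruyama integrator, and all numerical values are illustrative assumptions).

```python
import numpy as np

def simulate_linear_sde(A, B, u, x0, dt=0.01, steps=1000, noise_std=0.0, seed=0):
    """Euler-Maruyama integration of dx = (A x + B u) dt + noise_std dW.

    Assumed model form; u is held constant over the whole run.
    Returns an array of shape (steps + 1, n) with the state trajectory.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    trajectory = np.empty((steps + 1, x.size))
    trajectory[0] = x
    for t in range(steps):
        drift = A @ x + B @ u
        noise = noise_std * np.sqrt(dt) * rng.standard_normal(x.size)
        x = x + dt * drift + noise
        trajectory[t + 1] = x
    return trajectory

# Hypothetical 3-gene cascade driven by one external input.
A = np.array([[-1.0, 0.0, 0.0],
              [0.5, -1.0, 0.0],
              [0.0, 0.8, -1.0]])
B = np.array([[1.0], [0.0], [0.0]])
u = np.array([2.0])

# Noise-free run: the state should settle at the equilibrium -A^{-1} B u.
traj = simulate_linear_sde(A, B, u, x0=np.zeros(3), dt=0.01, steps=3000)
x_star = -np.linalg.solve(A, B @ u)
```

Setting noise_std above zero adds the white-noise term; the noise-free run is used here only because its end state can be compared with the analytic equilibrium.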

16 2. Piecewise Linear Models Following Mestl, Plahte, Omholt 1995 and others. b_il: sum of step functions s^+, s^-

17 3. More complex non-linear interaction models Example: including quadratic terms;

18 Our mathematical framework for non-linear gene-protein interactions

19 3. Information processing in sparse hierarchic gene-protein networks Consider a network as described before with only a few connections (= sparse) and where a few genes/proteins control a considerable number of the others (= hierarchic).

20 Information Processing in random sparse Gene-Protein Interactions [Figure: random sparse network, n = 64, k = 2; largest cluster therein]

21 Information Processing in random sparse Gene-Protein Interactions Now consider the information processing time (= number of iterations) necessary to reach all nodes (proteins) as a function of the number of connections (= number of non-zero elements) in the network.
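
The measurement described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the author's experiment: the network is a uniformly random directed graph, a signal starts at node 0, and one iteration propagates it along every outgoing connection of the nodes reached so far. The function names are hypothetical.

```python
import numpy as np

def random_sparse_adjacency(n, n_edges, rng):
    """Directed graph with n_edges distinct off-diagonal connections.

    adj[i, j] == True means node i influences node j.
    """
    flat = rng.choice(n * (n - 1), size=n_edges, replace=False)
    src = flat // (n - 1)
    dst = flat % (n - 1)
    dst = dst + (dst >= src)  # shift indices to skip the diagonal
    adj = np.zeros((n, n), dtype=bool)
    adj[src, dst] = True
    return adj

def iterations_to_reach_all(adj, source=0):
    """Synchronous propagation steps until every node is reached.

    Returns None when some node cannot be reached from the source.
    """
    reached = np.zeros(adj.shape[0], dtype=bool)
    reached[source] = True
    steps = 0
    while not reached.all():
        new = reached | adj[reached].any(axis=0)
        if (new == reached).all():
            return None  # propagation stalled
        reached = new
        steps += 1
    return steps

def mean_processing_time(n, n_edges, trials, seed=0):
    """Fraction of fully reachable networks and their mean processing time."""
    rng = np.random.default_rng(seed)
    times = [iterations_to_reach_all(random_sparse_adjacency(n, n_edges, rng))
             for _ in range(trials)]
    finite = [t for t in times if t is not None]
    mean = float(np.mean(finite)) if finite else float("nan")
    return len(finite) / trials, mean

# Sweep the number of connections for n = 64 nodes.
for n_edges in (64, 128, 256, 512, 1024):
    frac, mean = mean_processing_time(64, n_edges, trials=50)
    print(n_edges, frac, mean)
```

Plotting the mean number of iterations against the number of connections is one way to look for the transition from slow to fast processing mentioned on the next slide.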

22 phase transition from slow to fast processing

23

24 4. Memory storage in gene-protein networks
* Ben-Hur, Siegelmann: Computation with Gene Networks, Chaos, January 2004
* Skarda and Freeman: How brains make chaos in order to make sense of the world, Behavioral and Brain Sciences, Vol
Philosophy: Information is stored in the network topology (weights, sparsity, hierarchy) and the system dynamics.

25 Memory storage in gene-protein networks We assume a hierarchic, non-symmetric, and sparse gene/protein network (with k out of n possible connections per node) with linear state space dynamics. Suppose we want to store M patterns in the network.

26 Linearized form of a subsystem The first-order linear approximation of the system separates the state vector x and the inputs u.

27 Input-output pattern: The organism has learned (through evolution) to react to an external input u (e.g. toxic agent, viral infection) with a gene-protein activity x(t). This combination (x,u) is the input-output PATTERN.

28 Memory Storage = Network Reconstruction Using these definitions it is possible to map the problem of pattern storage to the * solved * problem of gene network reconstruction with sparse estimation

29 Information Pattern: Now, suppose that we have M patterns we want to store in the network:

30 Pattern Storage: method 1.0 The relation between the desired patterns (state derivatives, states and inputs) defines constraints on the data matrices A and B, which have to be computed.

31 Pattern Storage: method 1.0 Computing the optimal A and B for storing the patterns. The matrices A and B are sparse (most elements are zero). Using optimization techniques from robust/sparse optimization, this problem can be defined as:
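
The optimization problem itself is not preserved in the transcript. A common sparse-estimation formulation, assumed here for illustration, minimizes the L1 norm of A and B subject to the pattern constraints Xdot = A X + B U, where the columns of X, U, and Xdot hold the M patterns. The sketch below solves it row by row as a linear program with scipy.optimize.linprog; the function name sparse_fit and the test data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def sparse_fit(Xdot, X, U):
    """Minimize ||A||_1 + ||B||_1 subject to Xdot = A X + B U.

    X, Xdot: arrays of shape (n, M); U: array of shape (p, M).
    Each row of [A B] is an independent LP; a row w is split as
    w = w_plus - w_minus with both parts non-negative.
    """
    n, _ = X.shape
    Z = np.vstack([X, U])              # (n + p) x M regressors
    d = Z.shape[0]
    cost = np.ones(2 * d)              # sum(w_plus) + sum(w_minus) = ||w||_1
    A_eq = np.hstack([Z.T, -Z.T])      # M x 2d equality constraints
    W = np.zeros((n, d))
    for i in range(n):
        res = linprog(cost, A_eq=A_eq, b_eq=Xdot[i],
                      bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(f"LP for row {i} failed: {res.message}")
        W[i] = res.x[:d] - res.x[d:]
    return W[:, :n], W[:, n:]

# Illustrative data: a sparse "true" system and M random patterns.
rng = np.random.default_rng(1)
n, p, M = 6, 2, 6
A_true = -np.eye(n)
A_true[np.arange(n), (np.arange(n) + 1) % n] = 0.5
B_true = np.zeros((n, p))
B_true[np.arange(n), np.arange(n) % p] = 1.0
X = rng.standard_normal((n, M))
U = rng.standard_normal((p, M))
Xdot = A_true @ X + B_true @ U

A_hat, B_hat = sparse_fit(Xdot, X, U)
```

The estimate reproduces all M patterns and its L1 norm cannot exceed that of the generating system, since that system is itself feasible. Whether the estimate coincides with the generating system depends on M, on the sparsity, and on the data.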

32 Number of retrieval errors as a function of the number of nonzero entries k, with: M = 150 patterns, N = genes. 1st order phase transition from error-free memory retrieval. [Figure label: k_C]

33 Number of retrieval errors versus M with fixed N = 50000, k = 10. 1st order phase transition to error-free memory retrieval. [Figure label: k_C]

34 Critical number of patterns M_crit versus the problem size N

35 Pattern Storage: method 2.0 A pattern corresponds to a converged state of the system. Therefore a sparse system Σ = {A,B} is sought that maps the inputs to the patterns {U,X}.

36 Computing optimal sparse matrices LP, subject to:
1. condition for stationary equilibrium
2. condition to avoid A = B = 0
3. avoid A = 0 by using degradation of proteins and auto-decay of genes: diag(A) < 0
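
The LP on this slide is only partly preserved. The sketch below is one way to instantiate the three listed conditions, under assumptions that are not from the slides: the stationary condition is taken as A X + B U = 0, the objective is the L1 norm of the off-diagonal part of A plus B, and the diagonal of A is pinned to a fixed negative decay rate, which enforces diag(A) < 0 and excludes the trivial solution. The function name and test data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def sparse_equilibrium_fit(X, U, decay=1.0):
    """Sparse A, B with A X + B U = 0 and diag(A) = -decay.

    X: (n, M) stored equilibrium patterns; U: (p, M) matching inputs.
    Row i solves: sum_{j != i} a_ij X_j + b_i U = decay * X_i,
    minimizing the L1 norm of the free coefficients.
    """
    n, _ = X.shape
    p = U.shape[0]
    A = np.zeros((n, n))
    B = np.zeros((n, p))
    for i in range(n):
        others = [j for j in range(n) if j != i]
        Z = np.vstack([X[others], U])        # (n - 1 + p) x M regressors
        d = Z.shape[0]
        cost = np.ones(2 * d)                # L1 norm via w = w_plus - w_minus
        A_eq = np.hstack([Z.T, -Z.T])
        res = linprog(cost, A_eq=A_eq, b_eq=decay * X[i],
                      bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(f"LP for row {i} failed: {res.message}")
        w = res.x[:d] - res.x[d:]
        A[i, others] = w[:n - 1]
        A[i, i] = -decay                     # protein degradation / auto-decay
        B[i] = w[n - 1:]
    return A, B

# Illustrative patterns generated as equilibria of a sparse "true" system.
rng = np.random.default_rng(2)
n, p, M = 8, 4, 4
A_true = -np.eye(n)
A_true[np.arange(n), (np.arange(n) + 1) % n] = 0.3
B_true = np.zeros((n, p))
B_true[np.arange(n), np.arange(n) % p] = 1.0
U = rng.standard_normal((p, M))
X = -np.linalg.solve(A_true, B_true @ U)     # equilibrium: A X + B U = 0

A_hat, B_hat = sparse_equilibrium_fit(X, U)
```

Pinning the diagonal is a design choice made to keep the problem a plain LP; other normalizations that rule out A = B = 0 would also fit the slide's description.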

37 The sparsity in A and B The sparsity in the gene/protein interaction matrix A is k_A: the number of non-zero elements in A. This can be scaled to the size N of A, which gives p_A = k_A / N. Similarly for the input-coupling B: p_B = k_B / P.

38 Results [Figure: panels A (gene-gene) and B (input-gene)]

39 [Figure: panels A (gene-gene) and B (input-gene)]

40 Sparsity versus the number of stored patterns There are three distinct regions with different 'learning' strategies separated by order transitions. [Figure: panels A (gene-gene) and B (input-gene)]

41 Sparsity versus the number of stored patterns
Region I: all information is exclusively stored in B.
Region II: information is preferably stored in A.
Region III: no clear preference for A or B.
[Figure annotations: highest 'order', highest 'disorder'; panels A (gene-gene) and B (input-gene)]

42 Sparsity versus the number of stored patterns I: 'impulsive', II: 'rational', III: 'hybrid'. [Figure: panels A (gene-gene) and B (input-gene)]

43 Phase transitions and entropy The entropy of the macroscopic system relates to the relative fraction of connections p_A and p_B as: As A and B are indiscernible the total entropy is:

44 Information entropy The entropy of the microscopic system A relates to the degree distribution: the number of connections f_i of node i. Let P(v) be the probability that a given node has v outgoing connections.
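
The entropy formula on this slide is missing from the transcript. As a hedged illustration, the sketch below estimates P(v) from the out-degrees of an interaction matrix and evaluates the standard Shannon entropy S = -Σ P(v) ln P(v); the use of this particular formula and the function name degree_entropy are assumptions.

```python
import numpy as np

def degree_entropy(adj):
    """Shannon entropy (in nats) of the empirical out-degree distribution.

    adj[i, j] != 0 means node i has an outgoing connection to node j.
    """
    out_degree = (np.asarray(adj) != 0).sum(axis=1)
    counts = np.bincount(out_degree)
    P = counts[counts > 0] / counts.sum()   # empirical P(v), zero bins dropped
    return float(-(P * np.log(P)).sum())

# Every node has exactly one outgoing connection: no uncertainty in the degree.
ring = np.roll(np.eye(6), 1, axis=1)
S_ring = degree_entropy(ring)

# Half the nodes have one outgoing connection, half have none.
half = ring.copy()
half[::2] = 0
S_half = degree_entropy(half)
```

A regular network gives zero entropy, and the two-valued example gives ln 2, which makes the function easy to sanity-check before applying it to estimated matrices.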

45 Information entropy [2] With P the Laplace distribution, for large networks the average entropy per node converges to: (with Euler's constant)

46 Information gain per node This also allows the computation of the gain in information entropy if one connection is added: If this formalism is applied to our network structure we obtain:

47 Information gain per node Left: the entropy S versus for n = 100, p = 30, based on 1180 observations. Right: the gain in entropy for the same data set. Again the three learning strategies are clearly visible: {impulsive, rational, hybrid}.

48 Relation between sparsities Relation between p_A = k_A / n and p_B = k_B / p averaged for measurements.

49 5. Conclusions
Non-linear time-invariant state space models for gene-protein networks exhibit a range of complex behaviours for storing input-output patterns in sparse representations.
In this model, information processing (= computing) and pattern storage (= learning) exhibit multiple distinct 1st and 2nd order continuous phase transitions.
There are two second-order phase transitions that divide the network learning into three distinct regions: 'impulsive', 'rational', 'hybrid'.

50 Other members of the trans-national University Limburg - Bioinformatics Research Team
University of Hasselt (Belgium): Goele Hollanders (PhD student), Geert Jan Bex, Marc Gyssens
University of Maastricht (Netherlands): Stef Zeemering (PhD student), Karl Tuyls, Ralf Peeters

51 Discussion … Ronald Westra Department of Mathematics Maastricht University