Published by Diane Goodman. Modified over 9 years ago.
1
François Fages LOPSTR-SAS 2005
Temporal Logic Constraints in the Biochemical Abstract Machine BIOCHAM
François Fages, Project-team: Contraintes, INRIA Rocquencourt, France
http://contraintes.inria.fr/
Joint work with: Nathalie Chabrier-Rivier, Sylvain Soliman, Laurence Calzone
2002-2004: ARC CPBIO “Process Calculi and Biology of Molecular Networks”: A. Bockmayr, LORIA; V. Danos, CNRS PPS; V. Schächter, Genoscope Evry
2
Systems Biology?
Multidisciplinary field aiming at getting over the complexity walls to reason about biological processes at the system level.
Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments).
Beyond providing tools to biologists, Computer Science has much to offer in terms of concepts and methods.
Bioinformatics: end of the 90s, from genomic sequences to post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …)
Need for a strong effort on:
- the formal representation of biological processes,
- formal tools for modeling and reasoning about their global behavior.
3
Language Approach to Cell Systems Biology
Qualitative models: from diagrammatic notation to
- Boolean networks [Thomas 73]
- Petri nets [Reddy 93]
- Milner’s π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00]
- Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
- Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
- Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
- Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03]
Quantitative models: from differential equation systems to
- Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
- Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
- Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
- Rules with continuous dynamics BIOCHAM-2 [Chabrier-Fages-Soliman 04]
4
Outline of the Presentation
1. Introduction
2. Biocham Rule Language for Modeling Biochemical Systems
   1. Syntax of objects and reactions
   2. Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
3. Biocham Temporal Logic for Formalizing Biological Properties
   1. CTL for Boolean semantics
   2. Constraint LTL for Concentration semantics
4. Learning Rules and Parameters from Temporal Properties
   1. Learning reaction rules from CTL specification
   2. Learning kinetic parameter values from Constraint-LTL specification
5. Conclusion and collaborations
5
2. Modeling Biochemical Systems
Small molecules: covalent bonds (outer electrons shared), 50-200 kcal/mol
- 70% water
- 1% ions
- 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals, 1-5 kcal/mol
Stability and bindings determined by the number of weak bonds: 3D shape
- 20% proteins (50-10^4 amino acids)
- RNA (10^2-10^4 nucleotides AGCU)
- DNA (10^2-10^6 nucleotides AGCT)
6
Formal Proteins
Cyclin dependent kinase 1: Cdk1 (free, inactive)
Complex Cdk1-Cyclin B: Cdk1-CycB (low activity)
Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity), also called Mitosis Promotion Factor MPF
7
BIOCHAM Syntax of Objects
E == compound | E-E | E~{p1,…,pn}
    Compound: molecule, #gene binding site, abstract @process, …
    - : binding operator for protein complexes, gene binding sites, … Associative and commutative.
    ~{…} : modification operator for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent).
O == E | E::location
    Location: symbolic compartment (nucleus, cytoplasm, membrane, …)
S == _ | O+S
    + : solution operator (associative, commutative, neutral element _)
8
Six Main Reaction Rule Schemas
Complexation: A + B => A-B    Decomplexation: A-B => A + B
    cdk1+cycB => cdk1-cycB
Phosphorylation: A =[C]=> A~{p}    Dephosphorylation: A~{p} =[C]=> A
    Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
    Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A.
    _ =[#Ge2-E2f13-Dp12]=> cycA
Degradation: A =[C]=> _.
    cycE =[@UbiPro]=> _ (not for cycE-cdk2, which is stable)
9
BIOCHAM Syntax of Reaction Rules
R ::= S=>S | S=[O]=>S | S<=>S | S<=[O]=>S
    where A=[C]=>B stands for A+C=>B+C,
    A<=>B stands for A=>B and B=>A, etc.
N ::= expr for R (import/export SBML format)
Three abstraction levels:
1. Boolean semantics: presence/absence of molecules
   Concurrent transition system (asynchronous, non-deterministic)
2. Concentration semantics: number / volume of diffusion
   Ordinary differential equations (deterministic)
3. Population of molecules: number of molecules
   Stochastic multiset rewriting
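The two abbreviations in the rule syntax (catalysts written inside the arrow, and reversible arrows) can be unfolded mechanically. The following Python sketch illustrates that unfolding; the function name `expand` and the tuple representation of rules are hypothetical choices made for this example and are not part of BIOCHAM.

```python
def expand(reactants, products, catalyst=None, reversible=False):
    """Unfold a rule into the plain rules (reactants, products) it stands for.

    Hypothetical representation: reactants and products are tuples of
    molecule names; catalyst is a molecule name or None.
    """
    reactants, products = tuple(reactants), tuple(products)
    if catalyst is not None:
        # A =[C]=> B stands for A + C => B + C
        reactants = reactants + (catalyst,)
        products = products + (catalyst,)
    rules = [(reactants, products)]
    if reversible:
        # A <=> B stands for A => B and B => A
        rules.append((products, reactants))
    return rules


# Phosphorylation rule taken from the rule schemas, catalyzed by Myt1.
print(expand(("Cdk1-CycB",), ("Cdk1~{thr161}-CycB",), catalyst="Myt1"))
```

A reversible catalyzed rule simply combines both steps: the catalyst is added on each side first, then the reversed rule is appended.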
10
Cell Cycle: G1 → DNA Synthesis → G2 → Mitosis
G1: Cdk4-CycD, Cdk6-CycD, Cdk2-CycE
S: Cdk2-CycA
G2, M: Cdk1-CycA, Cdk1-CycB (MPF)
11
Mammalian Cell Cycle Model [Kohn 99]
12
Zoom on Cdk1
cdk1~{p1,p2,p3} + cycA => cdk1~{p1,p2,p3}-cycA.
cdk1~{p1,p2,p3} + cycB => cdk1~{p1,p2,p3}-cycB.
...
cdk1~{p1,p3}-cycA =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycA.
cdk1~{p1,p3}-cycB =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycB.
cdk1~{p2,p3}-cycA =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycA.
cdk1~{p2,p3}-cycB =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycB.
...
cdk1~{p1,p2,p3} =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}.
cdk1~{p1,p2,p3}-cycA =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycA.
cdk1~{p1,p2,p3}-cycB =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycB.
...
_ =[ E2F13-DP12-gE2 ]=> cycA.
cycB =[ APC~{p1} ]=> _.
...
800 rules, 165 proteins/genes, 500 variables [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
13
Boolean Semantics
Associate:
- Boolean state variables to molecules, denoting the presence/absence of molecules in the cell or compartment
- A finite concurrent transition system [Shankar 93] to rules (asynchronous), over-approximating the set of all possible behaviors
A reaction A+B=>C+D is translated into 4 transition rules, one for each possibly complete consumption of the reactants:
    A+B → A+B+C+D
    A+B → ¬A+B+C+D
    A+B → A+¬B+C+D
    A+B → ¬A+¬B+C+D
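The translation of one reaction into several Boolean transitions can be sketched as follows: a state is the set of molecules that are present, and each reactant may or may not be completely consumed. This is only an illustration in Python; the function name `successors` is hypothetical, and keeping a molecule present when it also appears among the products (a catalyst, for instance) is an assumption of this sketch.

```python
from itertools import product


def successors(state, reactants, products):
    """Successor states of `state` for the reaction reactants => products.

    state: frozenset of the molecules currently present.
    Returns a set of frozensets, one per choice of consumed reactants.
    """
    if not set(reactants) <= state:
        return set()  # the reaction is not enabled in this state
    result = set()
    # Each reactant is either left present or completely consumed.
    for consumed in product([False, True], repeat=len(reactants)):
        nxt = set(state) | set(products)
        for mol, gone in zip(reactants, consumed):
            # Assumption: a molecule that is also a product stays present.
            if gone and mol not in products:
                nxt.discard(mol)
        result.add(frozenset(nxt))
    return result


# The reaction A+B=>C+D yields four successor states.
for s in successors(frozenset({"A", "B"}), ["A", "B"], ["C", "D"]):
    print(sorted(s))
```

Enumerating all consumption choices is what makes the Boolean transition system an over-approximation of the possible behaviors.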
14
Concentration Semantics
k1cc for _=>preMPF.
k3cc*[C25~{s1,s2}]*[preMPF] for preMPF=[C25~{s1,s2}]=>MPF.
(k14cc*[CKI]*[MPF],k15cc*[CKI-MPF]) for CKI+MPF<=>CKI-MPF.
k2cc*[preMPF] for preMPF=>_.
k2cc*[MPF] for MPF=>_.
k2u*[APC]*[MPF] for MPF=[APC]=>_.
k4cc*[Wee1]*[MPF] for MPF=[Wee1]=>preMPF.
…
parameter(k1cc,0.25).
…
present({preMPF, Wee1m}).
Compiles into an ODE system (or a stochastic process under the Population semantics)
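How rules with kinetic expressions give rise to an ODE system can be illustrated with a small Python sketch: each rule contributes its rate negatively to its reactants and positively to its products. The representation of rules as `(rate_fn, reactants, products)` triples and the value chosen for `k2cc` are assumptions made for this example; only `k1cc = 0.25` comes from the slide.

```python
def derivatives(rules, conc):
    """Right-hand side of the ODE system induced by a list of rules.

    rules: list of (rate_fn, reactants, products), where rate_fn maps the
           concentration dictionary to the reaction rate.
    conc:  dict mapping molecule names to concentrations.
    """
    d = {name: 0.0 for name in conc}
    for rate_fn, reactants, products in rules:
        v = rate_fn(conc)
        for m in reactants:
            d[m] -= v
        for m in products:
            d[m] += v
    return d


# k1cc is taken from the slide; the value of k2cc is hypothetical.
params = {"k1cc": 0.25, "k2cc": 0.5}
rules = [
    (lambda c: params["k1cc"], [], ["preMPF"]),                # k1cc for _=>preMPF.
    (lambda c: params["k2cc"] * c["preMPF"], ["preMPF"], []),  # k2cc*[preMPF] for preMPF=>_.
]
print(derivatives(rules, {"preMPF": 0.0}))
```

With these two rules alone, the derivative of preMPF vanishes when its concentration equals k1cc/k2cc, which is the expected steady state of a synthesis/degradation pair.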
15
Plan
1. Biocham Rule Language for Modeling Biochemical Systems
   1. Syntax of objects and reactions
   2. Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Biocham Temporal Logic for Formalizing Biological Properties
   1. Computation Tree Logic for Boolean semantics
   2. Constraint Linear Time Logic for Concentration semantics
3. Learning Rules and Parameters from Temporal Properties
   1. Learning reaction rules from CTL properties
   2. Learning kinetic parameter values from Constraint LTL properties
4. Conclusion, collaborations
16
2. Formalizing Biological Properties in Temporal Logics
Boolean Semantics: Computation Tree Logic CTL

Time \ Choice     E (exists)             A (always)
X (next time)     EX(φ) = ¬AX(¬φ)        AX(φ)
F (finally)       EF(φ) = ¬AG(¬φ)        AF(φ)
G (globally)      EG(φ) = ¬AF(¬φ)        AG(φ)
U (until)         E(φ1 U φ2)             A(φ1 U φ2)
20
Biological Properties formalized in CTL [Chabrier Fages 03]
About reachability:
- Can the cell produce some protein P? reachable(P) == EF(P)
About pathways:
- Is it possible to produce P without having Q? E(¬Q U P)
- Is state s2 a necessary checkpoint for reaching state s? checkpoint(s2,s) == ¬E(¬s2 U s)
About stationarity:
- Is a (partially described) state s a stable state? stable(s) == AG(s)
- Is s a steady state (with possibility of escaping)? steady(s) == EG(s)
- Can the cell reach a stable state? EF(stable(s))
About oscillations (approximation without strong fairness):
- Can the system exhibit a cyclic behavior w.r.t. the presence of P? oscillation(P) == EG((P ⇒ EF ¬P) ∧ (¬P ⇒ EF P))
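The CTL queries above can be evaluated on a small explicit transition system with the usual fixpoint computations. The Python sketch below is an explicit-state illustration only, not the symbolic algorithm of NuSMV; the function names and the toy transition system are hypothetical.

```python
def pre_exists(trans, targets):
    """States having at least one successor in `targets`."""
    return {s for s, succs in trans.items() if succs & targets}


def EU(trans, a, b):
    """E(a U b), computed as a least fixpoint starting from b."""
    result = set(b)
    while True:
        new = result | (a & pre_exists(trans, result))
        if new == result:
            return result
        result = new


def EF(trans, states, p):
    """EF(p) is E(true U p)."""
    return EU(trans, states, p)


def EG(trans, p):
    """EG(p), computed as a greatest fixpoint starting from p."""
    result = set(p)
    while True:
        new = result & pre_exists(trans, result)
        if new == result:
            return result
        result = new


def checkpoint(trans, states, s2, s):
    """checkpoint(s2, s) == not E(not s2 U s)."""
    return states - EU(trans, states - s2, s)


# Toy system: 0 -> 1 -> 2 (loop on 2), and 0 -> 3 (loop on 3).
trans = {0: {1, 3}, 1: {2}, 2: {2}, 3: {3}}
states = set(trans)
print(EF(trans, states, {2}))               # states from which state 2 is reachable
print(checkpoint(trans, states, {1}, {2}))  # states where 1 is a checkpoint for 2
```

Each function returns the set of states satisfying the formula, so a property holds in an initial state when that state belongs to the returned set.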
21
Cell Cycle Model-Checking
biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}).
Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is true
biocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB).
Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is true
biocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB)))).
Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is false
biocham: why.
-- Loop starts here
cycB-cdk1~{p1,p2,p3} is present
cdk7 is present
cycH is present
cdk1 is present
Myt1 is present
cdc25C~{p1} is present
rule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}.
cycB-cdk1~{p2,p3} is present
cycB-cdk1~{p1,p2,p3} is absent
rule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}.
cycB-cdk1~{p2,p3} is absent
cycB-cdk1~{p1,p2,p3} is present
22
Cell Cycle Model-Checking
800 rules, 165 proteins and genes, 500 variables.
BIOCHAM-NuSMV symbolic model-checker, time in seconds:

Initial state G2                   Query                                                Time
Compiling                                                                               29s
Reachability G1                    EF CycE                                              2s
Reachability G1                    EF CycD                                              1.9s
Reachability G1                    EF PCNA-CycD                                         1.7s
Checkpoint for mitosis complex     ¬E(¬Cdc25~{Nterm} U Cdk1~{Thr161}-CycB)              2.2s
Cycle                              EG((CycA ⇒ EF ¬CycA) ∧ (¬CycA ⇒ EF CycA))            31.8s
23
Concentration Semantics: Constraint LTL
Constraints over concentrations and derivatives as FOL formulae over the reals:
    [M] > 0.2
    [M]+[P] > [Q]
    d([M])/dt < 0
Constraint LTL operators for time: F, U, G (no non-determinism).
    F([M]>0.2)
    FG([M]>0.2)
    F ([M]>2 & F (d([M])/dt 0 & F(d([M])/dt<0))))
    oscil(M,n) = F (d([M])/dt>0 & F(d([M])/dt<0 & … ))
Language to formalize the relevant properties observed in experiments
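The nested pattern behind oscil(M,n) can be checked on a sampled sequence of derivative values by looking for alternating signs in order. The Python sketch below assumes that n counts the number of alternating sign phases (positive, negative, positive, …); this reading of n, and the function name `oscil` as a Python function, are assumptions of the example rather than the BIOCHAM definition.

```python
def oscil(derivs, n):
    """True if `derivs` contains, in order, n alternating signs +, -, +, ...

    derivs: values of d([M])/dt at successive time points.
    Mirrors F(d>0 & F(d<0 & F(d>0 & ...))) with n nested constraints.
    """
    want_positive = True
    found = 0
    for d in derivs:
        if (want_positive and d > 0) or (not want_positive and d < 0):
            found += 1
            want_positive = not want_positive
    return found >= n


# Hypothetical derivative samples: rising, falling, rising, falling.
samples = [1.0, 0.5, -0.5, -1.0, 0.5, 1.0, -1.0]
print(oscil(samples, 4), oscil(samples, 5))
```

A greedy left-to-right scan is enough here because each nested F only looks at the remainder of the trace, and the two constraints cannot hold at the same time point.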
24
Outline
1. Biocham Rule Language for Modeling Biochemical Systems
   1. Syntax of objects and reactions
   2. Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Biocham Temporal Logic for Formalizing Biological Properties
   1. Computation Tree Logic for Boolean semantics
   2. Constraint Linear Time Logic for Concentration semantics
3. Learning Rules and Kinetics from Temporal Properties
   1. Learning reaction rules
   2. Learning kinetic parameter values
4. Conclusion, collaborations
25
3. Learning Rules from Temporal Properties
General framework of Theory Revision [de Raedt 92]
Theory T: BIOCHAM model
- molecule declarations
- reaction rules: complexation, phosphorylation, etc.
Training examples φ: biological properties formalized in temporal logic
- Reachability
- Checkpoints
- Stable states
- Oscillations
Bias P: rule patterns and parameter range
- Kind of reaction rules to change
Find R in P such that T,R |= φ
26
Learning Reaction Rules from CTL Specification
The biological properties of the system are added as CTL formulas:
biocham: add_spec({reachable(MPF),checkpoint(cdc25C~{p1,p2},MPF),...}).
Suppose that the MPF activation rule is missing in the model:
biocham: delete_rule(MPF~{p}=[cdc25C~{p1,p2}]=>MPF).
biocham: check_all.
The specification is not satisfied.
This formula is the first not verified: Ei(EF(MPF))
Rules can be searched to correct the model w.r.t. the specification:
biocham: learn_one_rule(all_elementary_interaction_rules).
Possible rules to be added: 3
_=[cdc25C~{p1,p2}]=>MPF
MPF~{p}=[cdc25C~{p1,p2}]=>MPF
CKI+MPF~{p}=[cdc25C~{p1,p2}]=>CKI-MPF
27
Learning Reaction Rules from CTL Specification
Example: finding an intermediary step between MPF and APC activation
biocham: absent(X). add_rule(_=>X). add_rule(X=>_).
biocham: add_specs({
    Ei(reachable(X)),
    Ai(oscil(X)),
    Ai(AG(!APC->checkpoint(X,APC))),
    Ai(AG(!X->checkpoint(MPF,X)))
}).
biocham: check_all.
The specification is not satisfied.
This formula is the first not verified: Ai(AG(!APC->!(E(!X U APC))))
Biocham searches for revisions of the model satisfying the specification:
biocham: revise_model.
Deletion(s):
_=[MPF]=>APC.
_=>X.
Addition(s):
_=[X]=>APC.
_=[MPF]=>X.
28
Theory Revision Algorithm
General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule:
- ACTL formulae contain only A quantifiers: checkpoint, …
  If false, they remain false after adding a rule ⇒ delete a rule.
  Remove a rule on the path given by the model checker (why command).
- ECTL formulae contain only E quantifiers: reachability, oscillation, …
  If false, they remain false after deleting a rule ⇒ add a rule.
- Unclassified CTL formulae: mixed E and A quantifiers.
Guides the backtracking search of the possible changes to the model.
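The classification step can be sketched in Python as a traversal that collects the path quantifiers of a formula, flipping E and A under negation so that, for example, ¬E(¬s2 U s) counts as an A formula. The tuple encoding of formulas and the function names are hypothetical choices for this illustration, not BIOCHAM's internal representation.

```python
E_OPS = {"EX", "EF", "EG", "EU"}
A_OPS = {"AX", "AF", "AG", "AU"}


def quantifiers(f, positive=True):
    """Set of path quantifiers in f once negations are pushed inward.

    f is an atom (a string) or a tuple (operator, argument, ...).
    """
    if isinstance(f, str):
        return set()
    op, *args = f
    if op == "not":
        return quantifiers(args[0], not positive)
    if op == "->":  # a -> b is (not a) or b
        return quantifiers(args[0], not positive) | quantifiers(args[1], positive)
    found = set()
    if op in E_OPS:
        found.add("E" if positive else "A")
    elif op in A_OPS:
        found.add("A" if positive else "E")
    for a in args:
        found |= quantifiers(a, positive)
    return found


def classify(f):
    q = quantifiers(f)
    if q == {"A"}:
        return "ACTL"  # a false formula can only be repaired by deleting rules
    if q == {"E"}:
        return "ECTL"  # a false formula can only be repaired by adding rules
    return "unclassified"


checkpoint = ("not", ("EU", ("not", "s2"), "s"))
print(classify(("EF", "P")), classify(checkpoint))
```

The class then tells the search whether to try rule deletions or rule additions first, which prunes the backtracking over candidate model changes.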
29
Learning Kinetic Parameters with Constraint-LTL
parameter(k3cc,0.1).
k3cc*[MPF~{p}]*[cdc25C~{p1,p2}] for MPF~{p}=[cdc25C~{p1,p2}]=>MPF.
biocham: trace_get([k3cc],[(0,5)],20, oscil(MPF,4)&F([MPF]>1),100).
Found parameters that make oscil(MPF,4) & F([MPF]>1) true:
parameter(k3cc,2.5).
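The search performed by such a command can be pictured as a scan over candidate parameter values, each followed by a simulation and a check of the temporal property on the resulting trace. The Python sketch below is only an illustration under several assumptions: the arguments are read as a range, a number of scan steps, a property and a time horizon; the model is a hypothetical one-variable system dX/dt = k - X rather than the MPF model; and integration uses a plain Euler scheme.

```python
def simulate(k, horizon, dt=0.01):
    """Euler integration of the toy model dX/dt = k - X from X(0) = 0."""
    x, xs = 0.0, [0.0]
    for _ in range(int(horizon / dt)):
        x += dt * (k - x)
        xs.append(x)
    return xs


def scan(lo, hi, steps, prop, horizon):
    """Return the first value on a regular grid over [lo, hi] whose
    simulated trace satisfies `prop`, or None if there is none."""
    for i in range(steps + 1):
        k = lo + i * (hi - lo) / steps
        if prop(simulate(k, horizon)):
            return k
    return None


# Property F([X] > 1): some point of the trace exceeds 1.
found = scan(0.0, 5.0, 20, lambda xs: any(x > 1 for x in xs), 100)
print(found)
```

In this toy model X converges to k from below, so the property holds exactly for the grid values of k strictly greater than 1.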
30
Traces from Numerical Simulation
From a system of Ordinary Differential Equations dX/dt = f(X)
Numerical integration produces a discretization of time (adaptive step size Runge-Kutta, and Rosenbrock method for stiff systems)
The trace is a linear Kripke structure:
    (t_0, X_0), (t_1, X_1), …, (t_n, X_n), …
The derivatives can be added to the trace:
    (t_0, X_0, dX_0/dt), (t_1, X_1, dX_1/dt), …, (t_n, X_n, dX_n/dt), …
Equality x=v is true if x_i ≤ v & x_{i+1} ≥ v, or if x_i ≥ v & x_{i+1} ≤ v
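Because a simulated trace only samples the solution at discrete time points, an equality constraint is evaluated by looking for a crossing between two consecutive samples. A minimal Python sketch of that test follows; the function name `equals_at` and the sample values are hypothetical.

```python
def equals_at(xs, v):
    """Indices i such that x = v is considered true between samples i and i+1,
    i.e. the value v is crossed (upwards or downwards) between x_i and x_{i+1}."""
    return [
        i
        for i in range(len(xs) - 1)
        if (xs[i] <= v <= xs[i + 1]) or (xs[i] >= v >= xs[i + 1])
    ]


# Hypothetical samples of one variable along a trace.
print(equals_at([0.0, 0.4, 0.9, 0.6, 0.2], 0.5))
```

The test catches both upward and downward crossings, which a pointwise comparison x_i == v would almost never detect on floating-point samples.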
31
Constraint-Based LTL (Forward) Model Checking
Hypothesis 1: the initial state is completely known
Hypothesis 2: the formula can be checked over a finite period of time [0,T]
Simple algorithm based on the trace of the numerical simulation:
1. Run the numerical simulation from 0 to T, producing values at a finite sequence of time points
2. Iteratively label the time points with the sub-formulae of φ that are true:
   - Add φ to the time points where a FOL formula φ is true,
   - Add F φ to the previous time points labeled by φ,
   - Add φ1 U φ2 to the predecessor time points of φ2 while they satisfy φ1,
   - (Add G φ to the states satisfying φ until T (optimistic abstraction…))
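The labeling procedure above can be sketched in Python as a recursive evaluation that computes, for each sub-formula, its truth value at every time point by scanning the trace backwards. The tuple encoding of formulas, the function name `holds` and the sample trace are assumptions of this illustration; the implementation mentioned in the talk is written in Prolog.

```python
def holds(formula, trace):
    """Truth value of `formula` at each time point of a finite trace.

    formula: ("atom", predicate) | ("and", f, g) | ("F", f) | ("G", f) | ("U", f, g)
    trace:   list of states (here dictionaries of concentrations).
    """
    op, n = formula[0], len(trace)
    if op == "atom":
        return [bool(formula[1](state)) for state in trace]
    if op == "and":
        a, b = holds(formula[1], trace), holds(formula[2], trace)
        return [x and y for x, y in zip(a, b)]
    if op in ("F", "G"):
        a = holds(formula[1], trace)
        out = [False] * n
        acc = (op == "G")  # G starts optimistic at the end of the trace
        for i in range(n - 1, -1, -1):
            acc = (acc and a[i]) if op == "G" else (acc or a[i])
            out[i] = acc
        return out
    if op == "U":
        a, b = holds(formula[1], trace), holds(formula[2], trace)
        out, nxt = [False] * n, False
        for i in range(n - 1, -1, -1):
            nxt = b[i] or (a[i] and nxt)
            out[i] = nxt
        return out
    raise ValueError("unknown operator: %r" % (op,))


# Hypothetical trace of [M] at successive time points.
trace = [{"M": m} for m in (0.0, 0.1, 0.3, 0.25, 0.3)]
above = ("atom", lambda s: s["M"] > 0.2)
print(holds(("F", above), trace)[0], holds(("F", ("G", above)), trace)[0])
```

Treating G as true when the constraint holds up to the last time point corresponds to the optimistic abstraction mentioned in the last step of the algorithm.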
32
Conclusion
The biochemical abstract machine BIOCHAM implements:
- A simple rule-based language for modeling biochemical processes, with three abstraction levels:
  - Boolean semantics: presence/absence of molecules
  - Molecule Concentration semantics (ODE)
  - Molecule Population semantics (stochastic)
- A powerful temporal logic language for formalizing biological properties:
  - CTL (implemented with the NuSMV model checker)
  - Constraint LTL (implemented in Prolog)
- An original machine learning system:
  - Reaction rule discovery from CTL specification
  - Parameter estimation from constraint LTL specification
Issue of compositionality: model reuse in different contexts
Issue of abstraction/refinement: model simplification/decomposition
33
Collaborations
STREP APRIL 2: Applications of probabilistic inductive logic programming
    Luc de Raedt, Freiburg, Stephen Muggleton, Imperial College London, …
    Learning in a probabilistic logic setting
NoE REWERSE: Reasoning on the web with rules and semantics
    François Bry, Munich, Rolf Backofen, Jena, Mike Schroeder, Dresden, …
    Connecting Biocham to the semantic web: gene and protein ontologies
INRIA Bang, Jean Clairambault, Benoît Perthame
INSERM, Villejuif, Francis Lévi, “Cancer chronotherapies”
ULB, Albert Goldbeter, Bruxelles
    Coupled models of cell cycle, circadian cycle, drugs.