François Fages ICLP December 2003 The Biochemical Abstract Machine BIOCHAM Logic programming steps towards formal biology François Fages, INRIA Rocquencourt.

Presentation transcript:

François Fages, ICLP, December 2003
The Biochemical Abstract Machine BIOCHAM: Logic programming steps towards formal biology
François Fages, INRIA Rocquencourt
Joint work with Nathalie Chabrier-Rivier and Sylvain Soliman
In collaboration with ARC CPBIO: Alexander Bockmayr, LORIA Nancy; Vincent Danos, CNRS PPS Paris 7; Vincent Schächter, Genoscope

Current revolution in Biology
Elucidation of high-level biological processes in terms of their biochemical basis at the molecular level.
Mass production of genomic and post-genomic data: RNA expression, protein synthesis, protein-protein interactions, …
Need for a strong parallel effort on the formal representation of biological processes.
Need for formal tools for modeling and reasoning about their global behavior.

Formalisms for modeling biochemical systems
Diagrammatic notation
Boolean networks [Thomas 73]
Milner's π-calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]
Concurrent transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 03]
Biochemical abstract machine BIOCHAM [Chabrier-Fages-Soliman 03]
Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
Differential equations
Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
Hybrid concurrent constraint languages [Bockmayr-Courtois 01]

Our goal
Beyond simulation, provide formal tools for querying, validating and completing biological models.
Our proposal:
  Use of temporal logic CTL as a query language for models of biological processes;
  Use of concurrent transition systems for their modeling;
  Use of symbolic and constraint-based model checkers for automatically evaluating CTL queries in qualitative and quantitative models;
  Use of inductive logic programming for learning models [EU APRIL 2].
Along the way, learn and teach bits of biology with constraint logic programs.

Plan of the talk
1. Introduction
2. A simple algebra of cell molecules
3. Concurrent transition systems of biochemical reactions
   Example of the mammalian cell cycle control
4. Temporal logic CTL as a query language
   Computational results with BIOCHAM
5. Learning models
   An experiment with inductive logic programming
6. Quantitative models
   Simulation with differential equations
   Constraint-based model checking
7. Conclusion

References
A wonderful textbook: Molecular Cell Biology. 5th Edition, 1100 pages+CD. Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. Freeman Publ., Nov.
Genes and signals. Ptashne, Gann. CSHL Press.
Modeling dynamic phenomena in molecular and cellular biology. Segel. Cambridge Univ. Press.
Modeling and querying bio-molecular interaction networks. Chabrier, Chiaverini, Danos, Fages, Schächter. To appear in TCS.
The biochemical abstract machine BIOCHAM. Chabrier, Fages, Soliman.

2. A Simple Algebra of Cell Molecules
Small molecules: covalent bonds (outer electrons shared) kcal/mol
  70% water, 1% ions, 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals, 1-5 kcal/mol
  Stability and bindings determined by the number of weak bonds: 3D shape
  20% proteins ( amino acids), RNA ( nucleotides AGCU), DNA ( nucleotides AGCT)

Structure levels of proteins
1) Primary structure: word of n amino acid residues (20^n possibilities) linked with C-N bonds. Example: ICLP = Isoleucine Cysteine Leucine Proline
2) Secondary structure: word of m α-helices, β-strands, random coils, … (3^m to 10^m), stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding stabilized by hydrophobic interactions

Formal proteins (BIOCHAM syntax)
Cyclin dependent kinase 1: Cdk1 (free, inactive)
Complex Cdk1-Cyclin B: Cdk1-CycB (low activity)
Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity)

Gene expression: DNA → RNA → protein
DNA: word over 4 nucleotides Adenine, Guanine, Cytosine, Thymine; double helix of pairs A--T and C---G
Replication: DNA synthesis
Genes: parts of DNA
Transcription: RNA copying from a gene, e.g. #ERCC1-(PRB-JUN-CFOS)

Genome Size
Species                 Genome size   Chromosomes    Coding DNA
E. coli (bacteria)      4 Mb          1 (circular)   100 %
S. cerevisiae (yeast)   12 Mb         16             70 %
Mouse, Human            3 Gb          20, 23         15 %
Onion                   15 Gb         8              1 %
Lungfish                140 Gb                       0.7 %
3,200,000,000 pairs of nucleotides; single nucleotide polymorphism: 1 / 2 kb

Algebra of Cell Molecules
E ::= Name | E-E | E~{E,…,E} | (E)
S ::= _ | E+S
Names: proteins, #genes, molecules, abstract processes, …
- : binding operator for protein complexes, gene bindings, …; non-associative, non-commutative (although it could be treated as both in most cases)
~{…} : modification operator for phosphorylated sites, …; associative, commutative, idempotent
+ : solution operator ("soup aspect"); associative, commutative, idempotent, with neutral element _
No membranes, no transport formalized; see bitonal calculi [Cardelli 03].
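The algebraic laws on this slide can be mirrored by ordinary data structures. The following is a minimal Python sketch, not BIOCHAM's implementation: the class names `Name`, `Complex`, `Modified` and the helpers `paren` and `solution` are hypothetical, and modification sites are assumed to be plain strings rather than arbitrary terms E.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Name:
    ident: str

    def __str__(self) -> str:
        return self.ident


@dataclass(frozen=True)
class Complex:
    # Binding E-E: an ordered pair, so neither associative nor commutative.
    left: "Molecule"
    right: "Molecule"

    def __str__(self) -> str:
        return f"{paren(self.left)}-{paren(self.right)}"


@dataclass(frozen=True)
class Modified:
    # Modification E~{...}: a frozenset of sites gives the ACI laws for free.
    base: "Molecule"
    sites: FrozenSet[str]

    def __str__(self) -> str:
        return paren(self.base) + "~{" + ",".join(sorted(self.sites)) + "}"


Molecule = Union[Name, Complex, Modified]


def paren(m: Molecule) -> str:
    """Parenthesize nested complexes so the printed term stays unambiguous."""
    return f"({m})" if isinstance(m, Complex) else str(m)


def solution(*molecules: Molecule) -> FrozenSet[Molecule]:
    """Solution S: a frozenset is ACI, and the empty set plays the role of _."""
    return frozenset(molecules)


cdk1, cycb = Name("Cdk1"), Name("CycB")
active = Complex(Modified(cdk1, frozenset({"thr161"})), cycb)
print(active)  # Cdk1~{thr161}-CycB
```

Using frozensets for sites and solutions means that order and repetition are ignored by equality, whereas `Complex(cdk1, cycb)` and `Complex(cycb, cdk1)` remain distinct terms.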


3. Concurrent Transition Systems of Biochemical Reactions
Enzymatic reactions: R ::= S=>S | S=[E]=>S | S=[R]=>S | S<=>S | S<=[E]=>S
These rules define a concurrent transition system (CTS) over integer state variables denoting the multiplicity of the molecules (multiset rewriting).
One can associate a finite abstract CTS over boolean state variables denoting the presence/absence of molecules. It correctly over-approximates the set of all possible behaviors if we translate a reaction A+B=>C by 4 rules for possible consumption:
A+B → A+B+C
A+B → ¬A+B+C
A+B → A+¬B+C
A+B → ¬A+¬B+C
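The four-rule translation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BIOCHAM's code: the function name `abstract_successors` is hypothetical, a boolean state is represented as the set of molecules present, and a reactant that also appears among the products is assumed to stay present.

```python
from itertools import product


def abstract_successors(state, reactants, products):
    """Boolean abstraction of one reaction: every reactant may or may not be
    entirely consumed, and every product becomes present."""
    state = frozenset(state)
    reactants = frozenset(reactants)
    products = frozenset(products)
    if not reactants <= state:
        return set()  # the reaction is not enabled in this state
    # Assumption: a molecule on both sides of the rule is never removed.
    consumable = sorted(reactants - products)
    successors = set()
    for flags in product((False, True), repeat=len(consumable)):
        consumed = {m for m, flag in zip(consumable, flags) if flag}
        successors.add((state - consumed) | products)
    return successors


# A+B=>C from the state where A and B are present: 2^2 = 4 successor states.
for s in sorted(abstract_successors({"A", "B"}, {"A", "B"}, {"C"}), key=sorted):
    print(sorted(s))
```

A reaction with n consumable reactants yields 2^n abstract successors, which is why the abstraction over-approximates every possible stoichiometry.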

Four Rule Schemas
Complexation: A + B => A-B
  Cdk1+CycB => Cdk1-CycB
Phosphorylation: A =[C]=> A~{p}
  Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
  Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A
  _ =[#Ge2-E2f13-Dp12]=> CycA
Degradation: A =[C]=> _
  CycE =[UbiPro]=> _ (not for CycE-Cdk2, which is stable)

An Actin-Myosin Engine with ATP fuel
A two-stroke nano-engine:
Myosin + ATP => Myosin-ATP
Myosin-ATP => Myosin + ADP

Cell Cycle: G1 → DNA Synthesis → G2 → Mitosis
G1: Cdk4-CycD, Cdk6-CycD, Cdk2-CycE
S: Cdk2-CycA
G2, M: Cdk1-CycA, Cdk1-CycB

Mammalian Cell Cycle Control Map [Kohn 99] (figure)

Kohn's map detail for Cdk2
Complexation with CycA and CycE
Phosphorylation sites PY15 and P
Concurrent Transition Rules [ARC CPBIO]:
cdk2+cycA => cdk2-cycA.
cdk2~{p2}+cycA => cdk2~{p2}-cycA.
cdk2~{p1}+cycA => cdk2~{p1}-cycA.
cdk2~{p1,p2}+cycA => cdk2~{p1,p2}-cycA.
cdk2+cycE => cdk2-cycE.
cdk2+cycE~{p1} => cdk2-cycE~{p1}.
cdk2~{p2}+cycE => cdk2~{p2}-cycE.
…
700 rules, 165 proteins and genes, 500 variables, 2^500 states.

Translation in Prolog
Encode states with a single predicate p(A,B,C,D,E). Each reaction becomes a clause whose head is the predecessor state and whose body is the successor state:
A+B → C+D.   p(1,1,_,_,E) :- p(_,_,1,1,E).
C → A.       p(_,B,1,D,E) :- p(1,B,_,D,E).
Theorem [Delzanno-Podelski 99]: Predecessor(S) = T_P(S).
Backward analysis by computing lfp(T_{P ∪ {p(x) :- s}}).
CLP-based Deductive Model Checker DMC [Delzanno-Podelski 99].
More efficient implementation using the state-of-the-art symbolic model checker NuSMV [Cimatti Clarke Giunchiglia Giunchiglia Pistore 02].
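The backward analysis can be imitated on explicit states. The Python sketch below is only an illustration of the least-fixpoint computation, not DMC or NuSMV: it enumerates the 32 boolean states of p(A,B,C,D,E), reads the two clauses of the slide as a successor relation, and iterates the predecessor operator. The function names are hypothetical.

```python
from itertools import product

# All boolean valuations of the state tuple (A, B, C, D, E).
STATES = list(product((0, 1), repeat=5))


def successors(state):
    """Successor states under the two clauses of the slide.
    Anonymous variables (_) are read as 'any value'."""
    a, b, c, d, e = state
    out = set()
    if a == 1 and b == 1:  # A+B -> C+D: p(1,1,_,_,E) :- p(_,_,1,1,E).
        for a2, b2 in product((0, 1), repeat=2):
            out.add((a2, b2, 1, 1, e))
    if c == 1:             # C -> A: p(_,B,1,D,E) :- p(1,B,_,D,E).
        for c2 in (0, 1):
            out.add((1, b, c2, d, e))
    return out


def predecessors(targets):
    """One application of the predecessor operator."""
    return {s for s in STATES if successors(s) & targets}


def backward_reachable(targets):
    """Least fixpoint: all states from which some target state is reachable."""
    reached = set(targets)
    while True:
        new = predecessors(reached) - reached
        if not new:
            return reached
        reached |= new


d_present = {s for s in STATES if s[3] == 1}
can_produce_d = backward_reachable(d_present)
print((0, 1, 1, 0, 0) in can_produce_d)  # C and B present: C -> A, then A+B -> C+D
print((1, 0, 0, 0, 0) in can_produce_d)  # A alone: no rule applies
```

A symbolic model checker performs the same iteration on boolean formulas (OBDDs) instead of enumerated state sets, which is what makes 500 variables tractable.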


4. Temporal Logic CTL as a Query Language
Computation Tree Logic
Choice: E exists, A always
Time:
X next time:   EX(φ), AX(φ)
F finally:     EF(φ) = ¬AG(¬φ), AF(φ) (liveness)
G globally:    EG(φ) = ¬AF(¬φ), AG(φ) (safety)
U until:       E(φ1 U φ2), A(φ1 U φ2)

Kripke Structures
A Kripke structure K is a triple (S; R; L) where S is a set of states, R ⊆ S×S is a total relation, and L labels each state with the atomic propositions true in it.
s |= φ if φ is true in s,
s |= E ψ if there is a path π from s such that π |= ψ,
s |= A ψ if for every path π from s, π |= ψ,
π |= φ if s |= φ where s is the starting state of π,
π |= X ψ if π^1 |= ψ,
π |= F ψ if there exists k ≥ 0 such that π^k |= ψ,
π |= G ψ if for every k ≥ 0, π^k |= ψ,
π |= ψ U ψ' iff there exists k ≥ 0 such that π^k |= ψ' and for all j < k, π^j |= ψ.
Following [Emerson 90] we identify a formula φ with the set of states which satisfy it: φ ~ {s ∈ S : s |= φ}.

Symbolic Model Checking
Model checking is an algorithm for computing, in a given finite Kripke structure, the set of states satisfying a CTL formula: {s ∈ S : s |= φ}.
Basic algorithm: represent K as a graph and iteratively label the nodes with the subformulas of φ which are true in that node.
  Add φ to the states satisfying φ.
  Add EF φ (EX φ) to the (immediate) predecessors of states labeled by φ.
  Add E(φ1 U φ2) to the predecessor states of φ2 while they satisfy φ1.
  Add EG φ to the states for which there exists a path leading to a non-trivial strongly connected component of the subgraph of states satisfying φ.
Symbolic model checking: use OBDDs to represent states and transitions as boolean formulas (S is finite).
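The labeling algorithm can be written directly over explicit state sets. The sketch below is a hedged Python illustration on a hypothetical four-state Kripke structure; unlike the slide, it computes EG as a greatest fixpoint rather than through strongly connected components, and it uses plain sets rather than OBDDs.

```python
def pre(R, X):
    """States having at least one successor in X."""
    return {s for s, succs in R.items() if succs & X}


def EX(R, phi):
    return pre(R, phi)


def EU(R, phi1, phi2):
    """E(phi1 U phi2): start from phi2, add predecessors that satisfy phi1."""
    result = set(phi2)
    while True:
        new = (pre(R, result) & phi1) - result
        if not new:
            return result
        result |= new


def EF(R, phi):
    return EU(R, set(R), phi)


def EG(R, phi):
    """Greatest fixpoint: drop states with no successor left inside the set."""
    result = set(phi)
    while True:
        keep = result & pre(R, result)
        if keep == result:
            return result
        result = keep


def AG(R, phi):
    return set(R) - EF(R, set(R) - phi)


def AF(R, phi):
    return set(R) - EG(R, set(R) - phi)


# Hypothetical total transition relation: 0 -> {1, 2}, 1 -> 1, 2 -> 3, 3 -> 3.
R = {0: {1, 2}, 1: {1}, 2: {3}, 3: {3}}
p = {3}      # states where atomic proposition p holds
q = {0, 2}   # states where atomic proposition q holds

print(sorted(EF(R, p)))      # states that can reach p
print(sorted(EU(R, q, p)))   # states with a q-path leading to p
print(sorted(AF(R, p)))      # states that must reach p
```

Each operator returns the set of states satisfying the formula, which matches the identification of a formula with its satisfying states.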

Biological Queries (1/3)
About reachability:
  Given an initial state init, can the cell produce some protein P? init → EF(P)
  Which are the states from which a set of products P1, ..., Pn can be produced simultaneously? EF(P1 ∧ … ∧ Pn)
About pathways:
  Can the cell reach a state s while passing by another state s2? init → EF(s2 ∧ EF s)
  Is state s2 a necessary checkpoint for reaching state s? ¬E(¬s2 U s)
  Is it possible to produce P without using or creating Q? E(¬Q U P)
  Can the cell reach a state s without violating some constraints c? init → E(c U s)

Biological Queries (2/3)
About stability:
  Is a certain (partially described) state s a stable state? s → AG(s) (s denotes both the state and the formula describing it)
  Is s a steady state (with possibility of escaping)? s → EG(s)
  Can the cell reach a stable state? init → EF(AG(s)) (not an LTL formula)
  Must the cell reach a stable state? init → AF(AG(s))
  What are the stable states? Not expressible in CTL [Chan 00].
  Can the system exhibit a cyclic behavior w.r.t. the presence of P? init → EG((P → EF ¬P) ∧ (¬P → EF P))

Biological Queries (3/3)
About the correctness of the model:
  Can one see the inaccuracies of the model and correct them?
  Exhibit a counterexample pathway or a witness.
  Suggest refinements of the model or biological experiments to validate/invalidate the property of the model.
About durations:
  How long does it take for a molecule to become activated?
  In a given time, how many Cyclins A can be accumulated?
  What is the duration of a given cell cycle's phase?
CTL operators abstract from durations. Time intervals can be modeled in first-order logic by adding numerical arguments for start times and durations.


Cell to Cell Signaling by Hormones and Receptors
Receptor tyrosine kinase RTK, mitogen activated protein kinase MAPK
RAF + RAFK -> RAF-RAFK
RAF~p + RAFPH -> RAF~p-RAFPH
MEK~p + RAF~p -> MEK~p-RAF~p
…
RAF-RAFK -> RAF + RAFK.
RAF~p-RAFPH -> RAF~p + RAFPH.
MEK~p-RAF~p -> MEK~p + RAF~p.
…
RAF-RAFK -> RAFK + RAF~p.
RAF~p-RAFPH -> RAF + RAFPH.
MEK~p-RAF~p -> MEK~{p,q} + RAF~p.
…
… -> MAPK~{p,q}.
MEK~p is a checkpoint for the cascade (producing MAPK~{p,q}):
?- nusmv(!(E(!(MEK~p) U MAPK~{p,q}))).
true
The PH complexes are only here to "slow down" the cascade:
?- nusmv(E(!(MEK~p-MEKPH) U MAPK~{p,q})).
true


Mammalian Cell Cycle Control Benchmark
700 rules, 165 proteins and genes, 500 variables, 2^500 states.
BIOCHAM NuSMV model checker, time in seconds (initial state G2):
Query                                                                      Time (s)
Compiling                                                                  29
Reachability G1: EF CycE                                                   2
Reachability G1: EF CycD                                                   1.9
Reachability G1: EF PCNA-CycD                                              1.7
Checkpoint for mitosis complex: ¬E(¬Cdc25~{Nterm} U Cdk1~{Thr161}-CycB)    2.2
Cycle: EG((CycA → EF ¬CycA) ∧ (¬CycA → EF CycA))                           31.8


5. Learning Models
Basic idea: learn reaction rules from temporal properties of the system.
Learning of yeast cell cycle rules from reachability properties and counterexamples with Progol [Muggleton 00].
reaction([m_CP,m_Y],[m_pM]).
reaction([m_CP],[m_C2]).
% reaction([m_pM],[m_M]).
reaction([m_M],[m_C2,m_YP]).
reaction([m_C2],[m_CP]).
reaction([m_YP],[]).
reaction([],[m_Y]).
pathway(S1,S2) :- same(S1,S2).
pathway(S1,S2) :- reaction(L1,L2), transition(S1,L1,S3,L2), pathway(S3,S2).

Inductive Logic Programming
reaction([m_pM],[m_M]) learned…
6th PCRD APRIL 2 "Applications of Probabilistic Inductive Logic Programming", Luc de Raedt, Univ. Freiburg, Stephen Muggleton, Univ. London.
Positive examples:
pathway([m_CP,m_Y],[m_M]).
pathway([m_CP,m_Y],[m_M,m_pM]).
pathway([m_CP,m_Y],[m_M,m_Y]).
pathway([m_CP,m_Y],[m_M,m_Y,m_pM]).
pathway([m_CP,m_Y],[m_M,m_CP]).
pathway([m_CP,m_Y],[m_M,m_CP,m_Y]).
pathway([m_CP,m_Y],[m_M,m_CP,m_pM]).
pathway([m_CP,m_Y],[m_M,m_CP,m_Y,m_pM]).
pathway([m_pM],[m_C2,m_YP]).
pathway([m_pM],[m_M,m_C2,m_YP]).
pathway([m_pM],[m_pM,m_C2,m_YP]).
pathway([m_pM],[m_M,m_pM,m_C2,m_YP]).
Negative examples:
:- pathway([],[m_C2]).
:- pathway([],[m_CP]).
:- pathway([],[m_C2,m_CP]).
:- pathway([],[m_M]).
:- pathway([],[m_YP]).
:- pathway([],[m_YP,m_Y]).
:- pathway([],[m_Y,m_pM]).
:- pathway([],[m_CP,m_pM]).
:- pathway([],[m_Y,m_M]).
:- pathway([m_CP,m_C2],[m_YP]).
:- pathway([m_CP],[m_YP]).
:- pathway([m_C2],[m_YP]).
:- pathway([m_Y],[]).
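To see why the missing rule matters, the pathway predicate can be imitated by a reachability search over sets of molecules. The Python sketch below is an illustration only: the slides do not define same/2 or transition/4, so it assumes that states are sets, that the target must be matched exactly, and that each reactant may or may not be consumed (the boolean abstraction). The names `transitions` and `pathway` are hypothetical.

```python
from itertools import product

# Reaction rules of the yeast cell cycle example, without the rule to learn.
REACTIONS = [
    ({"m_CP", "m_Y"}, {"m_pM"}),
    ({"m_CP"}, {"m_C2"}),
    ({"m_M"}, {"m_C2", "m_YP"}),
    ({"m_C2"}, {"m_CP"}),
    ({"m_YP"}, set()),
    (set(), {"m_Y"}),
]
LEARNED = ({"m_pM"}, {"m_M"})  # the rule reported as learned on the slide


def transitions(state, reactions):
    """Successor states; assumes each reactant is optionally consumed."""
    for reactants, products in reactions:
        if reactants <= state:
            consumable = sorted(reactants - products)
            for flags in product((False, True), repeat=len(consumable)):
                removed = {m for m, flag in zip(consumable, flags) if flag}
                yield frozenset((state - removed) | products)


def pathway(start, goal, reactions):
    """True if the exact state `goal` is reachable from `start`."""
    start, goal = frozenset(start), frozenset(goal)
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        if state == goal:
            return True
        for nxt in transitions(state, reactions):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


full_model = REACTIONS + [LEARNED]
print(pathway({"m_CP", "m_Y"}, {"m_M"}, REACTIONS))   # positive example fails
print(pathway({"m_CP", "m_Y"}, {"m_M"}, full_model))  # and holds once the rule is added
```

Under these assumptions the first positive example is unprovable without reaction([m_pM],[m_M]), which is the kind of gap an ILP system can fill from the examples.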


6. Quantitative Models
Enzymatic reactions with rates k1, k2, k3:
E+S →(k1) C →(k2) E+P
E+S ←(k3) C
can be compiled by the law of mass action into a system of Ordinary Differential Equations:
dE/dt = -k1*E*S + (k2+k3)*C
dS/dt = -k1*E*S + k3*C
dC/dt = k1*E*S - (k2+k3)*C
dP/dt = k2*C
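These four equations can be integrated numerically to observe their behavior. The following Python sketch uses a plain forward Euler scheme; the rate constants, initial concentrations and step size are arbitrary values chosen for illustration and do not come from the talk.

```python
def simulate(k1, k2, k3, E, S, C, P, dt, steps):
    """Forward Euler integration of the mass action ODEs for
    E+S -> C (k1), C -> E+P (k2), C -> E+S (k3)."""
    for _ in range(steps):
        binding = k1 * E * S
        dE = -binding + (k2 + k3) * C
        dS = -binding + k3 * C
        dC = binding - (k2 + k3) * C
        dP = k2 * C
        E, S, C, P = E + dt * dE, S + dt * dS, C + dt * dC, P + dt * dP
    return E, S, C, P


# Hypothetical parameters: 1 unit of enzyme, 10 units of substrate.
E0, S0 = 1.0, 10.0
E, S, C, P = simulate(k1=1.0, k2=0.5, k3=0.1, E=E0, S=S0, C=0.0, P=0.0,
                      dt=0.001, steps=10000)
print(f"E={E:.4f} S={S:.4f} C={C:.4f} P={P:.4f}")
print(f"enzyme total    = {E + C:.6f}")      # conserved: free + bound enzyme
print(f"substrate total = {S + C + P:.6f}")  # conserved: S + C + P
```

The two conservation laws (dE/dt + dC/dt = 0 and dS/dt + dC/dt + dP/dt = 0) follow term by term from the equations and give a quick sanity check on any implementation.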

Circadian Cycle Model
C'  = -(k1*C) - k4*C - kdC*C + k2*CN + k3*P2*T2
CN' = k1*C - k2*CN - kdN*CN
MP' = (KIP^n*nusP)/(KIP^n+CN^n) - kd*MP - (numP*MP)/(KmP+MP)
MT' = (KIT^n*nusT)/(KIT^n+CN^n) - MT*(kd + numT/(KmT+MT))
P0' = ksP*MP - kd*P0 - (V1P*P0)/(K1P+P0) + (V2P*P1)/(K2P+P1)
P1' = (V1P*P0)/(K1P+P0) - kd*P1 - (V2P*P1)/(K2P+P1) - (V3P*P1)/(K3P+P1) + (V4P*P2)/(K4P+P2)
P2' = k4*C + (V3P*P1)/(K3P+P1) - kd*P2 - (V4P*P2)/(K4P+P2) - (nudP*P2)/(KdP+P2) - k3*P2*T2
T0' = ksT*MT - kd*T0 - (V1T*T0)/(K1T+T0) + (V2T*T1)/(K2T+T1)
T1' = (V1T*T0)/(K1T+T0) - kd*T1 - (V2T*T1)/(K2T+T1) - (V3T*T1)/(K3T+T1) + (V4T*T2)/(K4T+T2)
T2' = k4*C + (V3T*T1)/(K3T+T1) - k3*P2*T2 - (V4T*T2)/(K4T+T2) - T2*(kd + nudT/(KdT+T2))

Gene Interaction Networks
Gene interaction example [Bockmayr-Courtois 01]
Hybrid Concurrent Constraint Programming HCC [Saraswat et al.]
2 genes x and y:
dx/dt = 0.01 - 0.02*x   if y < 0.8
dx/dt = -0.02*x         if y ≥ 0.8
dy/dt = 0.01*x

Concurrent Transition System
Time discretized using Euler's method (Runge-Kutta method in HCC):
y < 0.8 → x' = x + dt*(0.01 - 0.02*x), y' = y + dt*0.01*x
y ≥ 0.8 → x' = x + dt*(-0.02*x), y' = y + dt*0.01*x
Initial condition: x=0, y=0.
Associated Constraint Logic Program over reals CLP(R) for dt=1:
init :- X=0, Y=0, p(X,Y).
p(X,Y) :- X>=0, Y>=0, Y<0.8, X1=X-0.02*X+0.01, Y1=Y+0.01*X, p(X1,Y1).
p(X,Y) :- X>=0, Y>=0, Y>=0.8, X1=X-0.02*X, Y1=Y+0.01*X, p(X1,Y1).
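The discretized system is small enough to run forward from the initial condition. The Python sketch below simulates one trajectory with dt=1 and records the largest value of x; it is a plain simulation for illustration, not the constraint-based backward analysis, and the function names and the number of steps are arbitrary choices.

```python
def step(x, y, dt=1.0):
    """One Euler step of the two-gene system; y >= 0.8 switches off x's synthesis."""
    if y < 0.8:
        return x + dt * (0.01 - 0.02 * x), y + dt * 0.01 * x
    return x + dt * (-0.02 * x), y + dt * 0.01 * x


def trajectory(steps, dt=1.0):
    """Values of x along the run started from x=0, y=0."""
    x, y = 0.0, 0.0
    xs = [x]
    for _ in range(steps):
        x, y = step(x, y, dt)
        xs.append(x)
    return xs


xs = trajectory(1000)
peak = max(xs)
print(f"peak of x = {peak:.4f}")
print("x >= 0.2 reached:", peak >= 0.2)
print("x >= 0.6 reached:", peak >= 0.6)
```

While y < 0.8, x approaches the fixed point 0.5 of x' = 0.98*x + 0.01 from below, and it decays once y crosses the threshold, so a single run already suggests that x >= 0.2 is reachable and x >= 0.6 is not.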

Proving CTL properties by computing fixpoints of CLP programs
Theorem [Delzanno Podelski 99]: EF(φ) = lfp(T_{P ∪ {p(x) :- φ}}), EG(φ) = gfp(T_{P ∧ φ}).
Safety property AG(φ) iff ¬EF(¬φ) iff init ∉ lfp(T_{P ∪ {p(x) :- ¬φ}})
Liveness property AG(φ1 → AF(φ2)) iff ¬EF(φ1 ∧ EG(¬φ2)) iff init ∉ lfp(T_{P ∪ {p(x) :- φ1 ∧ gfp(T_{P ∧ ¬φ2})}})
Prolog-based implementation with constraints in CLP(R,B) [Delzanno 00]
Proofs of protocols, cache consistency, etc. [Delzanno 01]

Deductive Model Checker DMC: Gene Interaction
r(init, p(s_s,A,B), {A=0,B=0}).
r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0.8,C=A-0.02*A,D=B+0.01*A}).
r(p(s_s,A,B), p(s_s,C,D), {A>=0,B>=0,B<0.8,C=A-0.02*A+0.01,D=B+0.01*A}).
| ?- prop(P,S).
P = unsafe, S = p:s*(x>=0.6)
| ?- ti.
Property satisfied. Execution time 0.0
| ?- ls.
s(0, p(s_s,A,_), {A>=0.6}, 1, (0,0)).

Demonstration DMC (continued)
| ?- prop(P,S).
P = unsafe, S = p:s*(x>=0.2) ?
| ?- ti.
Property NOT satisfied. Execution time 1.5
| ?- ls.
s(0, p(s_s,A,_), {A>=0.2}, 1, (0,0)).
s(1, p(s_s,A,B), {B =-0.0,A>= }, 2, (2,1)).
…
s(26, p(s_s,A,B), {B>=0.0,A>=0.0, B *A< }, 27, (2,26)).
s(27, init, {}, 28, (1,27)).

7. Conclusion
The great ambition of logic programming is to make programming first and foremost a modeling task, with equations, constraints and logical formulae.
In this respect, computational molecular biology offers numerous challenges to the logic programming community at large.
Besides the combinatorial search and optimization problems coming from molecular biology (DNA and protein sequence comparison, protein structure prediction, …), there is a need to model the system at hand globally and to automate reasoning on all its possible behaviors.

Conclusion
The biochemical abstract machine BIOCHAM project aims at developing:
Qualitative models of complex biochemical processes:
  Intracellular and extracellular signaling, cell-cycle control, …
  Prolog-based implementation + BDD symbolic model checking
  ILP-based learning of models from temporal properties [6th PCRD APRIL 2]
  Membranes and transportation not modeled: bitonal algebras [Cardelli et al. 03], BioAmbients, Brane calculi [Cardelli et al. 03]
Quantitative models:
  Differential equations
  Hybrid concurrent constraint programming [Bockmayr-Courtois-Eveillard 03]
  Constraint-based model checking [Delzanno-Podelski 99] [Chabrier-Fages 03]

Perspectives
Collaboration with biologists on BIOCHAM models of the cell-cycle control:
  Colon cancer therapies, Domenjoud, UHP Nancy
  Chronotherapies, Clairambaud, INSERM
Hybrid constraint logic programming:
  Multi-scale molecular-electro-physiological models [Sorine et al. 03]