Abstract Processes Go Live: Representing Biomolecular Process Networks with Process Algebra Ehud Shapiro Joint work with Aviv Regev, Bill Silverman, Naama Barkai
2 The Cell
3
4 DNA, RNA, and Ribosomes creating Proteins Ribosomes translate RNA to Proteins RNA Polymerase transcribes DNA to RNA
5 Ribosomes in operation Computationally: a stateless string transducer from the RNA alphabet of nucleic acids to the protein alphabet of amino acids [Figure: ribosomes in operation; scale bar 25 nm]
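The transducer view can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `translate` is hypothetical, and `CODON_TABLE` holds only a handful of entries from the standard genetic code (the full table has 64 codons). Each output symbol depends only on the current three-letter codon, which is the sense in which the mapping is stateless. Stopping at a stop codon is a simplification added here.

```python
# Partial codon table taken from the standard genetic code (illustrative subset).
CODON_TABLE = {
    "AUG": "M",                                        # methionine (usual start codon)
    "UUU": "F", "UUC": "F",                            # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",    # glycine
    "AAA": "K", "AAG": "K",                            # lysine
    "UGG": "W",                                        # tryptophan
    "UAA": None, "UAG": None, "UGA": None,             # stop codons
}


def translate(rna: str) -> str:
    """Map an RNA string to a protein string, one codon (3 letters) at a time."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon!r} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:      # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGUUUGGAAAAUGGUAA"))
```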
6 Biochemical Pathways
7 Types of biological knowledge Sequence: Sequence of the genome ( long sequences of 4 nucleic acids) and identity of its protein products ( long sequences of 22 amino acids) Structure: 3D structure of biomolecules Molecular interactions: Inter-molecular interaction capabilities Network behavior: Behavior of biomolecular networks
8 Computer representation and generation of biological knowledge
9 Formal representation languages for biological knowledge Sequence: Strings Structure: 3D modelling language Molecular interaction: ? Network behavior: ? Computer use is key driver of explosive growth in sequence and structure branches of biology
10 Our goal: Formal language for representing biomolecular networks
11 Formal language for representing biomolecular process networks Enables objective repository Deposited knowledge can be easily shared and critically evaluated. Enables computer manipulation Analysis and discovery Distant goal (10-25yrs): Full simulations of virtual cells and of virtual organisms, derived from such knowledge repositories
12 Our approach: Represent biomolecular networks as computational process networks
13 The molecule as a computational process Example: The enzyme ERK1 Ser/Thr kinase Detect: Binds proteins with Ser and Thr amino acids Modify: Adds a phosphate group to these amino acids Be modified: Can be bound and phosphorylated by other proteins and enzymes
14 The molecule as a computational process: structure and function Binding MP1 molecules Regulatory T-loop: Change conformation Kinase site: Phosphorylate Ser/Thr residues (PXT/SP motifs) ATP binding site: Bind ATP, and use it for phosphorylation Binding to substrates [Figure labels: NH2, Nt lobe, catalytic core, Ct lobe, COOH, p-Y, p-T]
15 The correspondence between molecular and computational processes
16 Our approach, specifically: Represent biomolecular networks as stochastic π-calculus programs
17 Process algebras (calculi) Research direction began in the late 70’s: Guarded commands (Dijkstra), CSP, Occam (Hoare), CCS (Milner) Culminated in the π-calculus in the late 80’s Except for Occam, no viable implementations (viewed as mathematical tools, not “real” programming languages)
18 The π-calculus (Milner, Walker and Parrow, 1989) A program specifies a network of interacting processes Processes are defined by their potential communication activities Communication occurs via channels, defined by names Communication content: Channel names (mobility, reconfiguration)
19 Syntax: Channels All communication events, input or output, occur on channels
20 Syntax: Processes Processes are composed of communication events and of other processes
21 The π-calculus: Reduction rules
COMM: ( … + x ! z. Q ) | ( … + x ? y. P )  →  Q | P {z/y}
Left summand: ready to send z on x. Right summand: ready to receive y on x.
After the step, z replaces y in P; the actions are consumed and the alternative choices are discarded.
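The COMM rule can be illustrated with a toy Python sketch. Everything here is hypothetical scaffolding rather than the spiFCP implementation: a process is a list of alternative branches (a sum), and each continuation is reduced to a label plus the tuple of names it mentions. The sketch performs a plain renaming and does not handle name capture, which a full implementation would need.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Branch:
    kind: str                     # "send" for x ! z, "recv" for x ? y
    channel: str                  # the channel x
    name: str                     # the sent name z, or the bound variable y
    cont_label: str               # label of the continuation (Q or P)
    cont_names: Tuple[str, ...]   # names mentioned by the continuation


Continuation = Tuple[str, Tuple[str, ...]]


def comm(left: List[Branch], right: List[Branch]) -> Optional[Tuple[Continuation, Continuation]]:
    """Apply COMM once: pick a send in `left` and a matching receive in `right`.

    Returns (Q, P{z/y}); all other alternatives of both sums are discarded.
    Returns None when no pair of branches shares a channel.
    """
    for s in left:
        if s.kind != "send":
            continue
        for r in right:
            if r.kind == "recv" and r.channel == s.channel:
                # Substitute the sent name z for the bound variable y in P.
                renamed = tuple(s.name if n == r.name else n for n in r.cont_names)
                return (s.cont_label, s.cont_names), (r.cont_label, renamed)
    return None


if __name__ == "__main__":
    sender = [Branch("send", "x", "z", "Q", ("a",))]
    receiver = [Branch("recv", "w", "y", "P0", ("y",)),
                Branch("recv", "x", "y", "P", ("y", "b"))]
    print(comm(sender, receiver))
```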
22 Principles for mapping molecules to π-calculus
Domain = Process
SYSTEM ::= R_GENE | A_GENE | R | R | A | ...
A ::= (new internal_channels) (BINDING_DOMAIN | CATALYTIC_DOMAIN)
Residue stretches = Global (free) channel names and co-names
BINDING_DOMAIN(rbs) ::= rbs ? {e}. BOUND_DOMAIN(e)
[Figure labels: A, R, R_GENE]
23 Principles for mapping molecules to π-calculus
Molecular integrity (molecule) = Local channels as unique identifiers
A ::= (new e) (BINDING_DOMAIN | CATALYTIC_DOMAIN)
Molecule binding = Exporting local channels
rbs ! {e}. e ! { … } | rbs ? {cross_e}. cross_e ? {…}
[Figure labels: A, R]
24 Principles for mapping molecules to π-calculus
Molecular interaction and modification = Communication and change of channel names
tyr ! p-tyr. CATALYTIC_DOMAIN | … + tyr ? tyr’. BINDING_DOMAIN  →  CATALYTIC_DOMAIN | BINDING_DOMAIN {p-tyr / tyr }
[Figure label: Y]
25 Quantitative aspects ~10^9 protein molecules within the cell Packed tightly in space Important proteins in only small amounts (100’s, 1000’s) Stochastic effects on molecular interaction
26 Stochastic π-calculus (Priami, 1995) Every channel x or internal communication has an attached delay parameter d The delay for each communication is drawn from an exponential distribution with parameter d At each time step all enabled communications occur
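Drawing exponentially distributed delays can be sketched with Python's standard library. This is an illustration only: it assumes that d acts as the rate of the exponential distribution (so the mean delay is 1/d), which the slide does not state, and the helper name `sample_delays` and the example parameter values are hypothetical.

```python
import random
from typing import Dict


def sample_delays(delay_params: Dict[str, float], rng: random.Random) -> Dict[str, float]:
    """Draw one exponential delay per enabled communication.

    `delay_params` maps a channel name to its parameter d, treated here as the
    rate of the exponential distribution (an assumption of this sketch).
    """
    return {channel: rng.expovariate(d) for channel, d in delay_params.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters for two channels; larger d gives shorter delays on average.
    delays = sample_delays({"spR": 1.0, "degpR": 100.0}, rng)
    for channel, delay in sorted(delays.items(), key=lambda item: item[1]):
        print(f"{channel}: {delay:.4f}")
```

Under this reading, a channel with a larger d tends to fire sooner, which is how different reaction speeds would show up in a simulation.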
27 A simple example of stochastic molecular processes Fast synthesis of protein R from single gene R Slow degradation of each protein R Result: Steady state level of protein R [Figure labels: R_GENE, R, synthesis machinery, degradation machinery]
28 A simple example of stochastic molecular processes: π-calculus code
TEST ::= R_GENE | SYNTHESIS | DEGRADATION
R_GENE ::= spR(1,0) ? []. ( R_GENE | R )
R ::= degpR(100,0) ? []. 0
SYNTHESIS ::= spR(1,0) ! []. SYNTHESIS
DEGRADATION ::= degpR(100,0) ! []. DEGRADATION
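The steady state produced by synthesis and degradation can be reproduced with a small stochastic simulation. The sketch below uses Gillespie's direct method in plain Python, which is a swapped-in technique and not the spiFCP system. The rate constants `k_syn` and `k_deg` are hypothetical values chosen for illustration and are not the channel parameters shown on the slide. Under these assumptions the expected steady-state count is `k_syn / k_deg`.

```python
import random
from typing import List, Tuple

Trace = List[Tuple[float, int]]


def simulate(k_syn: float, k_deg: float, t_end: float, seed: int = 0) -> Trace:
    """Simulate synthesis (rate k_syn) and per-molecule degradation (rate k_deg)."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    trace: Trace = [(t, n)]
    while True:
        syn = k_syn              # propensity of producing one R
        deg = k_deg * n          # propensity of degrading one of the n copies of R
        total = syn + deg
        t_next = t + rng.expovariate(total)   # time to the next reaction
        if t_next > t_end:
            break
        t = t_next
        if rng.random() * total < syn:
            n += 1               # synthesis fired
        else:
            n -= 1               # degradation fired (only possible when n > 0)
        trace.append((t, n))
    return trace


def time_average(trace: Trace, t_start: float, t_end: float) -> float:
    """Time-weighted mean of the molecule count over [t_start, t_end]."""
    area = 0.0
    ends = [t for t, _ in trace[1:]] + [t_end]
    for (t0, n0), t1 in zip(trace, ends):
        lo, hi = max(t0, t_start), min(t1, t_end)
        if hi > lo:
            area += n0 * (hi - lo)
    return area / (t_end - t_start)


if __name__ == "__main__":
    trace = simulate(k_syn=10.0, k_deg=0.1, t_end=2000.0)
    print("mean count after burn-in:", round(time_average(trace, 200.0, 2000.0), 1))
```

The count fluctuates around the steady-state level rather than settling on it exactly, which is the stochastic effect the slides emphasize for proteins present in small numbers.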
29 A simple example of stochastic molecular processes: spiFCP simulation [Figure: simulation output for the R_GENE, synthesis machinery and degradation machinery example]
30 The spiFCP simulation system Based on the Logix system (Flat Concurrent Prolog) Supports synchronous interaction Appropriate insulated surface syntax (PiFCP) Compiler: Generates FCP computational processes from input π-calculus code (in PiFCP syntax) Each channel maps to an FCP message stream Each process maps to an FCP process
31 The spiFCP simulation system Regular version Step-by-step execution and tracing, at process and channel level Stochastic version A Scheduler mechanism ensures behavior Monitoring time evolution of quantities of process instances
32 Circadian Clocks: Implementations J. Dunlap, Science (1998)
33 The circadian clock machinery (Barkai and Leibler, Nature 2000) [Figure: hysteretic oscillator; labels: A, R, AR, mA, mR, PA, PR; annotations: induced, repressed, fast]
34 The circadian clock machinery (Barkai and Leibler, Nature 2000) [Figure 1: A_GENE and R_GENE are transcribed to A_RNA and R_RNA, which are translated to A and R; labels: PA, PR, UTR A, UTR R, AR complex; processes: transcription, translation, degradation]
35 The circadian clock machinery: Stochastic aspects (Barkai and Leibler, Nature 2000) Appropriate behavior requires different rates:
Basal transcription: A >>> R
Promoted transcription: A > R
Degradation: A degradation > R degradation
Binding: R binding to A (repression) >> A binding to pA or pR (promotion)
36 The machinery in π-calculus: “A” molecules
A gene_a ::= PROMOTED_A + BASAL_A
PROMOTED_A ::= pA ? {e}. ACTIVATED_TRANSCRIPTION_A(e)
BASAL_A ::= bA ? []. ( A gene_a | A mRNA_a )
ACTIVATED_TRANSCRIPTION_A ::= 1. (ACTIVATED_TRANSCRIPTION_A | A mRNA_a ) + e ? []. A gene_a
A mRNA_a ::= TRANSLATION_A + DEGRADATION_mA
TRANSLATION_A ::= utrA ? []. (A mRNA_a | A prot_A )
DEGRADATION_mA ::= degmA ? []. 0
A prot_A ::= (new e1,e2,e3) PROMOTION_A-R + BINDING_R + DEGRADATION_A
PROMOTION_A-R ::= pA ! {e2}. e2 ! []. A prot_A + pR ! {e3}. e3 ! []. A prot_A
BINDING_R ::= rbs ! {e1}. BOUND_A prot_A
BOUND_A prot_A ::= e1 ! []. A prot_A + degpA ? []. e1 ! []. 0
DEGRADATION_A ::= degpA ? []. 0
37 The machinery in π-calculus: “R” molecules
R gene_r ::= PROMOTED_R + BASAL_R
PROMOTED_R ::= pR ? {e}. ACTIVATED_TRANSCRIPTION_R(e)
BASAL_R ::= bR ? []. ( R gene_r | R mRNA_r )
ACTIVATED_TRANSCRIPTION_R ::= 2. (ACTIVATED_TRANSCRIPTION_R | R mRNA_r ) + e ? []. R gene_r
R mRNA_r ::= TRANSLATION_R + DEGRADATION_mR
TRANSLATION_R ::= utrR ? []. (R mRNA_r | R prot_R )
DEGRADATION_mR ::= degmR ? []. 0
R prot_R ::= BINDING_A + DEGRADATION_R
BINDING_A ::= rbs ? {e}. BOUND_R prot_R
BOUND_R prot_R ::= e ? []. R prot_R
DEGRADATION_R ::= degpR ? []. 0
38 spiFCP simulation [Plot legend: free A protein, A mRNA, free R protein, R mRNA, A-R complex]
39 Utilizing concurrency theory to assign function to biomolecular ensembles Semantic concept: Two processes are equivalent if they can be exchanged within a context without changing system behavior Build two representations in the π-calculus: molecular level (implementation) and functional module level (specification) Show the equivalence of both representations: by computer simulation, by formal verification
40 The circadian clock specification: Hysteresis module (Barkai and Leibler, Nature 2000) [Figure labels: A, R, ON, OFF, fast]
41 Hysteresis module “R” processes remain intact All “A” processes replaced by a single process, the H-MODULE
H-MODULE ::= (new e1, e2, e3) ON_H-MODULE(C_A)
42 Hysteresis module (ON)
ON_H-MODULE(C_A) ::= {C_A <= T1}. OFF_H-MODULE(C_A) + {C_A > T1}. ( rbs ! {e1}. ON_DECREASE + e1 ! []. ON_H-MODULE + pR ! {e2}. (e2 ! []. 0 | ON_H-MODULE) + 1. ON_INCREASE )
ON_INCREASE ::= {C_A ++}. ON_H-MODULE
ON_DECREASE ::= {C_A --}. ON_H-MODULE
43 Hysteresis module (OFF)
OFF_H-MODULE(C_A) ::= {C_A > T2}. ON_H-MODULE(C_A) + {C_A <= T2}. ( rbs ! {e1}. OFF_DECREASE + e1 ! []. OFF_H-MODULE + 2. OFF_INCREASE )
OFF_INCREASE ::= {C_A ++}. OFF_H-MODULE
OFF_DECREASE ::= {C_A --}. OFF_H-MODULE
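The switching logic of the two modules can be sketched as a two-state machine in Python. This is an interpretation rather than the authors' code: it assumes that the OFF module turns ON when the counter C_A exceeds T2, as the guard above states, and that the ON module turns OFF when C_A is at or below T1, which mirrors the OFF guard and is an assumption of this sketch. The names `next_state` and `run` and the threshold values are hypothetical.

```python
from typing import Iterable, List


def next_state(state: str, count: int, t1: int, t2: int) -> str:
    """Return the module state after inspecting the counter C_A.

    Assumed guards: ON -> OFF when count <= t1, OFF -> ON when count > t2.
    With t1 < t2, counts between the thresholds keep the current state,
    which is what makes the switch hysteretic.
    """
    if state == "ON" and count <= t1:
        return "OFF"
    if state == "OFF" and count > t2:
        return "ON"
    return state


def run(counts: Iterable[int], t1: int, t2: int, state: str = "OFF") -> List[str]:
    """Feed a sequence of counter values through the switch and record the states."""
    states = []
    for count in counts:
        state = next_state(state, count, t1, t2)
        states.append(state)
    return states


if __name__ == "__main__":
    # Hypothetical thresholds T1 = 2 and T2 = 5.
    print(run([0, 3, 6, 4, 3, 2, 4, 6], t1=2, t2=5))
```

The same counter value (for example 4) leaves the module ON on the way down and OFF on the way up, so the output depends on history as well as on the current level of A.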