1 BioUML - Biological Universal Modeling Language Biosoft.Ru, Novosibirsk, Russia. Laboratory of Bioinformatics, Digital Design Technologies Institute, SB RAS. Copyright © 2003 Biosoft.Ru. All rights reserved.
2 The challenge "We now have unprecedented ability to collect data about nature but there is now a crisis developing in biology, in that completely unstructured information does not enhance understanding. We need a framework to put all of this knowledge and data into - that is going to be the problem in biology. We've reached the stage where we can't talk to each other - we've all become highly specialized. We need a framework, a framework where people can come back to us and say, 'Yes, I understand.' Driving toward that framework is really the big challenge." Sydney Brenner, 2002 Nobel Prize winner
3
4 BioUML workbench v. 0.5
5 Visual Modeling The problem of modeling and simulating complex systems can be significantly simplified for users by computer systems that provide visual modeling. Such visual depictions offer an alternative syntax for specifying models completely and formally. A number of visual syntaxes have been developed and implemented in computer systems for electrical engineering and computer science. The best-known graphical language for computer science is UML, the Unified Modeling Language.
6 The OMG specification states: "The Unified Modeling Language (UML) is a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system. The UML offers a standard way to write a system's blueprints, including conceptual things such as business processes and system functions as well as concrete things such as programming language statements, database schemas, and reusable software components."
7 UML diagrams: use case diagram; class diagram; behavior diagrams (statechart diagram, activity diagram, and the interaction diagrams: sequence diagram, collaboration diagram); implementation diagrams (component diagram, deployment diagram).
8 UML use case diagram
9 UML state chart diagram
10 UML deployment diagram
11 If we consider the UML architecture from a developer's viewpoint, we note: 1) UML was designed for modeling software systems and is hardly suitable for other problem domains. 2) UML has a complicated structure that is quite hard to implement; the OMG specification is more than 700 pages long. 3) UML was not designed for visual modeling and simulation of the dynamics of complex systems. That is why we need a new language for modeling biological systems, and we called this language BioUML.
12 Graphical notations for biological pathways
13 Some graphical notations for biological pathways:
Kohn K.W. (1999). Molecular Interaction Map of the Mammalian Cell Cycle Control and DNA Repair Systems. Mol. Biol. Cell. 10,
Kitano H. (2003). A graphical notation for biochemical networks. BIOSILICO Vol. 1. No. 5.
R. Maimon and S. Browning (2001). Diagrammatic Notation and Computational Grammar for Gene Networks. Proceedings of the International Conference on Systems Biology
Cook D.L. et al. (2001). A basis for a visual language for describing, archiving and analyzing functional models of complex biological systems. Genome Biol. 2. RESEARCH 0012.
Database specific notations:
- KEGG/Metabolic pathways;
- GeneNet system;
- TRANSPATH
- …
14 Kohn K.W. (1999). Molecular Interaction Map of the Mammalian Cell Cycle Control and DNA Repair Systems. Mol. Biol. Cell. 10,
15
16 Representation of multimolecular complexes: stimulatory and inhibitory complexes of E2F1, DP1, and pRb. (a) E2F1:DP1 dimer; (b) E2F1:DP1:pRb trimer; (c) E2F1:DP1 bound to promoter element E2 (transcriptional activation shown); (d) E2F1:DP1:pRb bound to E2 (transcriptional inhibition shown). Note that the promoter element can be occupied either by E2F1:DP1 or by E2F1:DP1:pRb (alternative binding represented by interaction lines joined at an acute angle).
17
18 Kitano H. (2003). A graphical notation for biochemical networks. BIOSILICO Vol. 1. No. 5.
19
20
21 R. Maimon and S. Browning (2001). Diagrammatic Notation and Computational Grammar for Gene Networks. Proceedings of the International Conference on Systems Biology
22
23 p53, The gatekeeper of death:
24 Cook D.L. et al. (2001). A basis for a visual language for describing, archiving and analyzing functional models of complex biological systems. Genome Biol. 2. RESEARCH 0012.
25
26
27 KEGG - metabolic pathways
28 KEGG - signaling pathways
29 GeneNet system A chemical formalism was employed as a basis for describing the events occurring in biological pathways. There are two types of relationships between entities: reaction - an interaction between entities that leads to the appearance of a new entity; regulatory event - the effect of an entity on a certain reaction.
30 GeneNet – antiviral response
31 TRANSPATH – p53 pathway
32 BioUML architecture
33
34
35 BioUML meta model The core of the BioUML workbench is its meta model. Unlike the UML meta model, the BioUML meta model is problem-domain neutral and provides an abstract layer for a comprehensive formal description of a wide range of biological and other complex systems. The content of databases on biological pathways, or of SBML models, is expressed in terms of the meta model and can then be used by other workbench plug-ins.
36 Example of a formalized description of a system of two chemical reactions
System structure is described as a graph: A -> R1 -> B -> R2 -> C
Mathematical model of the system (rate terms shown in the figure): -k1[A], -k2[B], k2[B]
Description of system components in the database:
ID A CC..... //
ID R1 A->B... //
ID B CC..... //
ID R2 B->C... //
ID C CC..... //
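Assuming simple mass-action kinetics with rate constants k1 and k2 (an assumption suggested by the rate terms k1[A] and k2[B] in the figure), the mathematical model for the two reactions R1: A->B and R2: B->C can be written as:

```latex
\begin{aligned}
\frac{d[A]}{dt} &= -k_1 [A] \\
\frac{d[B]}{dt} &= k_1 [A] - k_2 [B] \\
\frac{d[C]}{dt} &= k_2 [B]
\end{aligned}
```

The three right-hand sides sum to zero, so the total amount [A] + [B] + [C] is conserved.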
37
38 <!ATTLIST dml version CDATA "0.9.2" appVersion CDATA "0.7.0" > <!ATTLIST diagram diagramType CDATA #REQUIRED > Detailed description of BioUML diagrams markup language is available at:
39 Diagram type concept Diagram type defines: · what system components can be shown in the diagram; · diagram view builder - it is used to generate view for each diagram element taking into account problem domain peculiarities; · semantic controller - provides semantic integrity of the diagram during its editing; · filters – hide or highlight diagram elements according to some selection criteria.
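As a rough illustration of how these four responsibilities might fit together, here is a minimal Python sketch. It is not the BioUML API: the names `Element`, `DiagramType`, `can_add` and `apply_filter` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Element:
    """A diagram element; `kind` is e.g. 'protein' or 'gene' (hypothetical)."""
    name: str
    kind: str


@dataclass
class DiagramType:
    """Hypothetical container for the four responsibilities of a diagram type."""
    allowed_kinds: Set[str]                        # which components can be shown
    view_builder: Callable[[Element], str]         # element -> (textual) view
    filters: Dict[str, Callable[[Element], bool]] = field(default_factory=dict)

    def can_add(self, element: Element) -> bool:
        """Semantic controller: reject kinds this diagram type does not allow."""
        return element.kind in self.allowed_kinds

    def apply_filter(self, name: str, elements: List[Element]) -> List[Element]:
        """Return only the elements selected by the named filter."""
        return [e for e in elements if self.filters[name](e)]


pathway = DiagramType(
    allowed_kinds={"gene", "protein", "reaction"},
    view_builder=lambda e: f"[{e.kind}] {e.name}",
    filters={"proteins_only": lambda e: e.kind == "protein"},
)

elements = [Element("p53", "protein"), Element("TP53", "gene")]
accepted = [e for e in elements if pathway.can_add(e)]
views = [pathway.view_builder(e) for e in accepted]
proteins = pathway.apply_filter("proteins_only", accepted)
```

In this sketch the view builder returns a string; a real view builder would produce graphical primitives instead.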
40
41 Module concept The module concept allows a developer to define new diagram types and to incorporate other databases on biological pathways into the BioUML framework. A module defines the mapping of database content into diagram elements and the diagram types that can be used with the database. A module also provides a query engine that the BioUML workbench can use to find interacting components of the system.
42 Modules: standard BioUML module for biological pathways; module for models in SBML format; module for GeneNet database; module for KEGG/Pathways database (draft); module for TRANSPATH database (draft).
43 Standard BioUML module for biological pathways The module defines the most common biological data types (gene, protein, RNA, substance, reaction, etc.), their mapping into a simple text database, and three diagram types for describing biological pathways on several semantic levels: 1. Semantic network (ontology) - this diagram type is used to describe semantic relationships between system components, system states, and related problem domain concepts. 2. Pathway diagram type - used for a formalized description of biological pathway structure. This diagram type uses the GeneNet graphical notation. 3. Pathway simulation diagram type - an extension of the pathway structure diagram in which variables are associated with graph nodes and differential equations with graph edges. This allows the BioUML workbench to automatically generate a mathematical model of the system and simulate its dynamics.
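To make the idea of a query engine concrete, the following Python sketch finds the components that interact with a given one. It assumes a toy in-memory "database" of reactions; the names `REACTIONS` and `interacting_components` are hypothetical and do not come from BioUML.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical database content: reaction id -> (reactants, products).
REACTIONS: Dict[str, Tuple[List[str], List[str]]] = {
    "R1": (["A"], ["B"]),
    "R2": (["B"], ["C"]),
}


def interacting_components(component: str) -> Set[str]:
    """Return every component that shares at least one reaction with `component`."""
    found: Set[str] = set()
    for reactants, products in REACTIONS.values():
        participants = set(reactants) | set(products)
        if component in participants:
            found |= participants - {component}
    return found


neighbours_of_b = interacting_components("B")
```

A real module would answer the same kind of question by querying its underlying pathway database rather than a dictionary.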
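The automatic generation of a mathematical model from a pathway simulation diagram can be sketched as follows. This Python example is only an illustration under stated assumptions: nodes are variables, each reaction edge carries a rate law, and the system is the two-reaction chain A->B->C with assumed rate constants. The names `build_rhs` and `simulate` are hypothetical, and forward Euler is used for brevity, not because the workbench uses it.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
# A reaction edge: (reactants, products, rate law); all names are illustrative.
Reaction = Tuple[List[str], List[str], Callable[[State], float]]


def build_rhs(reactions: Dict[str, Reaction]) -> Callable[[State], State]:
    """Assemble dy/dt from the graph: each reaction subtracts its rate from
    its reactants and adds it to its products."""
    def rhs(state: State) -> State:
        dy = {name: 0.0 for name in state}
        for reactants, products, rate_law in reactions.values():
            rate = rate_law(state)
            for s in reactants:
                dy[s] -= rate
            for s in products:
                dy[s] += rate
        return dy
    return rhs


def simulate(rhs: Callable[[State], State], state: State,
             dt: float, steps: int) -> State:
    """Fixed-step forward Euler integration (chosen for brevity, not accuracy)."""
    state = dict(state)
    for _ in range(steps):
        dy = rhs(state)
        state = {name: value + dt * dy[name] for name, value in state.items()}
    return state


k1, k2 = 0.5, 0.2  # assumed rate constants
reactions = {
    "R1": (["A"], ["B"], lambda y: k1 * y["A"]),
    "R2": (["B"], ["C"], lambda y: k2 * y["B"]),
}
rhs = build_rhs(reactions)
final = simulate(rhs, {"A": 1.0, "B": 0.0, "C": 0.0}, dt=0.01, steps=1000)
```

Because every reaction removes from its reactants exactly what it adds to its products, the total amount of A, B and C stays constant during the simulation.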
44
45
46
47 Formal description and modeling of biological systems require the coordinated efforts of different groups of researchers: programmers - they should provide computer tools for this task; problem domain experts - they should specify what should be described and how; experimenters and annotators - they should describe the corresponding data following these rules; mathematicians - they should provide methods for model analysis and simulation. The BioUML meta model separates these tasks so that each can be solved effectively by the corresponding group of researchers, and provides a simple contract for how these groups and the corresponding software parts interact.
48
49 BioUML from practical view point
50 Diagram viewer
51
52 Diagram editor
53 Database search engine
54 Graph search engine
55 MATLAB plug-in
56 %script for 'CellCycle_1991Gol' model simulation

%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
Reaction1_vi =        % value not shown on the slide
Reaction2_kd =        % value not shown on the slide
Reaction4_K1 = 0.1
Reaction4_Kc = 0.3
Reaction4_VM1 = 0.5
Reaction5_K3 = 0.1
Reaction5_VM3 = 0.2
Reaction6_K2 = 0.1
Reaction6_V2 =        % value not shown on the slide
Reaction7_K4 = 0.1
Reaction7_V4 = 0.1

%Model rate variables and their initial values
y = []
y(1) = 0.0 % y(1) - $cytoplasm.C
y(2) = 0.0 % y(2) - $cytoplasm.EmptySet
y(3) = 0.0 % y(3) - $cytoplasm.M
y(4) = 0.0 % y(4) - $cytoplasm.X

%numeric equation solving
[t,y] = ode23('CellCycle_1991Gol_dy',[0 100],y)

%plot the solver output
plot(t,y(:,1),'-',t,y(:,2),'-',t,y(:,3),'-',t,y(:,4),'-')
title ('Solving Goldbeter problem')
ylabel ('y(t)')
xlabel ('t')
legend('$cytoplasm.C','$cytoplasm.EmptySet','$cytoplasm.M','$cytoplasm.X');
57 Function to calculate dy/dt for the model

function dy = CellCycle_1991Gol_dy(t, y)
% Calculates dy/dt for 'CellCycle_1991Gol' model.

%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4

% write rules to calculate some equation parameters
rateOfReaction1 = Reaction1_vi;
rateOfReaction4 = ((1 - y(3))*Reaction4_VM1*y(1))/((1 + Reaction4_K1 - y(3))*(Reaction4_Kc + y(1)));
rateOfReaction5 = (Reaction5_VM3*(1 - y(4))*y(3))/(1 + Reaction5_K3 - y(4));
rateOfReaction6 = (y(3)*Reaction6_V2)/(Reaction6_K2 + y(3));
rateOfReaction7 = (Reaction7_V4*y(4))/(Reaction7_K4 + y(4));
rateOfReaction2 = y(1)*Reaction2_kd;

% calculates dy/dt for 'CellCycle-1991Gol.xml' model
% rows: y(1) - C, y(2) - EmptySet, y(3) - M, y(4) - X
dy = [ + rateOfReaction1 - rateOfReaction2
       - rateOfReaction1 - rateOfReaction4 - rateOfReaction5 + rateOfReaction6 + rateOfReaction7 + rateOfReaction2
       + rateOfReaction4 - rateOfReaction6
       + rateOfReaction5 - rateOfReaction7];
58 JavaScript plug-in
59 What is SBML? SBML - Systems Biology Markup Language - is a description language for simulations in systems biology. SBML is oriented towards representing biochemical networks, including cell signaling pathways, metabolic pathways, biochemical reactions, gene regulation, and many others. It is mostly useful for exchanging models between different software tools. SBML was developed by the Caltech unit of the ERATO Kitano project, with frequent input from the community.
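To give a feel for what such an exchanged model looks like, the sketch below reads a tiny hand-written SBML Level 2 document with Python's standard XML parser. The model content (ids `two_reactions`, `cell`, `A`, `B`, `R1`) is invented for illustration and the document has not been validated against the SBML schema; a real tool would more likely use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# A hand-written, minimal SBML Level 2 document (illustrative only).
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="two_reactions">
    <listOfCompartments>
      <compartment id="cell"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="1.0"/>
      <species id="B" compartment="cell" initialConcentration="0.0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

NS = {"sbml": "http://www.sbml.org/sbml/level2"}
root = ET.fromstring(SBML_TEXT)

# Species id -> initial concentration.
species = {
    s.get("id"): float(s.get("initialConcentration"))
    for s in root.findall(".//sbml:species", NS)
}

# Reaction id -> (reactant ids, product ids).
reactions = {}
for r in root.findall(".//sbml:reaction", NS):
    reactants = [x.get("species")
                 for x in r.findall("sbml:listOfReactants/sbml:speciesReference", NS)]
    products = [x.get("species")
                for x in r.findall("sbml:listOfProducts/sbml:speciesReference", NS)]
    reactions[r.get("id")] = (reactants, products)
```

The resulting dictionaries hold the same kind of information (components and the reactions connecting them) that a diagram-based description of the pathway needs.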
60 SBML plug-in
61 What is SBW? The goal of the Systems Biology Workbench (SBW) project is to create an open-source, integrated software environment for systems biology that enables sharing of models and resources between simulation and analysis tools. SBW libraries are available for many different programming languages: C, C++, Delphi, Java, Perl, and Python. Software applications that implement different functions can be connected to each other through SBW using a straightforward application programming interface (API).
62
63 SBW plug-in
64
65 Further work The biggest room in the world is the room for improvement
66 Release | Description | Estimated date
v | GeneNet module, database search engine, new graph layouts and graph search support. | 2004, Q1
v | Database modules (improved version): KEGG, TRANSPATH | 2004, Q2
v | Using MathML for formulas, equations editor. Library of predefined kinetic laws, own simulation engine. | 2004, Q3
v | Rule concept support, event concept support. | 2004, Q4
67 Availability BioUML workbench (including source code) is freely available at
There is a special forum dedicated to the BioUML workbench where you can post your questions and suggestions:
68 Acknowledgments This work was partially supported by a grant from the Volkswagen-Stiftung (I/75941). The author is grateful to Alexander Kel and Sergey Zhatchenko for useful comments and discussions, as well as to Igor Tyazhev, Vlad Zhvaleev and Oleg Onegov for technical support.