BioUML Fedor Kolpakov Institute of Systems Biology (spin-off of DevelopmentOnTheEdge.com) Laboratory of Bioinformatics, Design Technological Institute.

Presentation transcript:

BioUML Fedor Kolpakov Institute of Systems Biology (spin-off of DevelopmentOnTheEdge.com) Laboratory of Bioinformatics, Design Technological Institute of Digital Techniques Novosibirsk, Russia

Agenda
- Part 1: overview of BioUML workbench
- Coffee break
- Part 2: new concepts and possibilities (versions – 0.8.3)
- Further development
- Questions and discussion

Part 1: overview of BioUML workbench
Overview
- Main concepts
- Meta model
- Architecture overview
- Diagram types
- Database module concepts
- Full text search
- Graph search
- Simulation engine
- BioUML server
- BMOND/Biopath database
Live demonstration
- Installation of BioUML workbench
- Creating and simulating a simple model
- SBML - Biomodels module
- BioPAX import
- BMOND database web interface
- JavaScript shell

Part 2: new concepts and possibilities
Overview
- Reconstruction as solitaire game
- Levels of biological information
- BioHub concept
- Composite database module
- Composite diagram
- Experiment concept
- Graphic notation editor
- Microarray data analysis
Live demonstration
- Loading database modules from server
- Text search
- Graph search
- Creating a composite database module
- Creating a composite diagram
- Experiment
- Graphic notation editor
- Microarray data analysis

Useful resources
- Flash movies that demonstrate how to work with BioUML workbench
- User guide, >200 pages (HTML version, MS Word document)
- Examples of pathway annotation: BMOND – Biological Models aNd Diagrams database

Part 1 Overview of BioUML workbench

Main BioUML concepts and ideas
- Visual modeling
  - Meta model – a problem domain neutral level of abstraction that describes a system as a compartmentalized graph.
  - Diagram type concept – formally defines a graphical notation and provides its incorporation into BioUML workbench.
  - Automated code generation for model simulation.
- Database module concept – allows a developer to incorporate databases on biological pathways into BioUML workbench, taking into account database peculiarities.
- Plug-in based architecture (Eclipse platform runtime from IBM).

From databases to simulation:
- Biological databases: data search and retrieval
- Visual modeling: formal description of the structure of a biological system
- Automated code generation for simulation of model behavior
Generated code:
- MATLAB code: simulation using MATLAB. JMatLink allows BioUML workbench to start MATLAB and retrieve simulation results.
- Java code: Java simulation plug-in. Contains ODE solvers ported from odeToJava and methods for hybrid model support.
- … code

Meta model

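Example: a system of two chemical reactions, R1: A -> B and R2: B -> C.
Corresponding mathematical model:
\[ \frac{d[A]}{dt} = -k_1[A], \qquad \frac{d[B]}{dt} = k_1[A] - k_2[B], \qquad \frac{d[C]}{dt} = k_2[B] \]
k1 – reaction rate constant for R1
k2 – reaction rate constant for R2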

Meta-model: example of formal description of a system of two chemical reactions
- System structure is described as a graph: A -> R1 -> B -> R2 -> C
- Mathematical model of the system: rate terms k1[A] for R1 and k2[B] for R2
- Description of system components in the database:
  ID A CC..... //
  ID R1 A->B... //
  ID B CC..... //
  ID R2 B->C... //
  ID C CC..... //
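The two-reaction example can be checked numerically. The sketch below is only an illustration in Python, not code generated by BioUML: it assumes mass-action kinetics, and the rate constants, initial amounts and the hand-written Runge-Kutta integrator are all chosen for the example.

```python
import math


def derivatives(state, k1, k2):
    """Right-hand side for R1: A -> B and R2: B -> C (mass-action kinetics)."""
    a, b, _c = state
    r1 = k1 * a  # rate of R1
    r2 = k2 * b  # rate of R2
    return (-r1, r1 - r2, r2)


def simulate(initial, k1, k2, t_end, steps):
    """Integrate the system with the classical 4th-order Runge-Kutta method."""
    h = t_end / steps
    state = tuple(initial)
    for _ in range(steps):
        d1 = derivatives(state, k1, k2)
        d2 = derivatives(tuple(s + 0.5 * h * d for s, d in zip(state, d1)), k1, k2)
        d3 = derivatives(tuple(s + 0.5 * h * d for s, d in zip(state, d2)), k1, k2)
        d4 = derivatives(tuple(s + h * d for s, d in zip(state, d3)), k1, k2)
        state = tuple(
            s + h / 6.0 * (p + 2 * q + 2 * r + u)
            for s, p, q, r, u in zip(state, d1, d2, d3, d4)
        )
    return state


# Illustrative values only: all material starts as A.
K1, K2, T_END = 0.5, 0.2, 10.0
final = simulate((1.0, 0.0, 0.0), K1, K2, T_END, steps=1000)
expected_a = math.exp(-K1 * T_END)  # analytic solution for [A](t)
print(final, expected_a)
```

Because every reaction only converts one species into another, the sum [A] + [B] + [C] stays constant, which gives a quick sanity check on any simulator output.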

The suggested approach can be applied for modeling biological systems using:
– Systems of ordinary differential equations
– Systems of differential-algebraic equations
– State and transition diagrams
– Hybrid models
– Boolean and logical networks
– Petri nets
– Markov chains
– Stochastic models
– Cellular automata
– …
Some limitations:
– Spatial models
– PDE
– …

BioUML architecture

Plug-in based architecture
Each plug-in consists of a plugin.xml file, Java jar files, etc., and runs on top of the Eclipse platform runtime.
A plug-in is the smallest unit of BioUML workbench function that can be developed and delivered separately into BioUML workbench. A plug-in is described in an XML manifest file, called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by the Eclipse runtime.
- Extension points are well-defined function points in the system where other plug-ins can contribute functionality.
- An extension is a specific contribution to an extension point.
Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.
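The relationship between extension points and extensions can be illustrated with a toy registry. This is a Python sketch of the idea only; it is not the Eclipse plug-in registry API, and the class, method names and identifiers are hypothetical.

```python
class PluginRegistry:
    """Toy registry: plug-ins declare extension points and contribute extensions."""

    def __init__(self):
        self._extensions = {}  # extension point id -> list of (plug-in id, extension)

    def declare_extension_point(self, point_id):
        # A plug-in announces a place where others may contribute functionality.
        self._extensions.setdefault(point_id, [])

    def contribute(self, point_id, plugin_id, extension):
        # An extension is a specific contribution to a declared extension point.
        if point_id not in self._extensions:
            raise KeyError(f"unknown extension point: {point_id}")
        self._extensions[point_id].append((plugin_id, extension))

    def extensions(self, point_id):
        return list(self._extensions.get(point_id, []))


registry = PluginRegistry()
registry.declare_extension_point("example.diagramType")           # hypothetical id
registry.contribute("example.diagramType", "plugin.sbml", "SBML diagram")
registry.contribute("example.diagramType", "plugin.kegg", "KEGG pathway")
print(registry.extensions("example.diagramType"))
```

The host only needs to know the extension point identifier; it never imports the contributing plug-ins directly, which is what lets them be developed and delivered separately.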

Formal description and modeling of biological systems require coordinated efforts of different groups of researchers:
- programmers: they should provide computer tools for this task.
- problem domain experts: they should specify what should be described and how.
- experimenters and annotators: they should describe the corresponding data following these rules.
- mathematicians: they should provide methods for model analysis and simulation.
BioUML architecture separates these tasks so that each can be solved effectively by the corresponding group of researchers, and provides a simple contract for how these groups and the corresponding software parts should communicate.

Diagram types

Diagram type concept
A diagram type defines:
· types of biological components and their interactions that can be shown on the diagram;
· diagram view builder – it is used to generate a view for each diagram element, taking into account problem domain peculiarities;
· semantic controller – provides semantic integrity of the diagram during its editing;
· filters – hide or highlight diagram elements according to some selection criteria.

Reconstruction and formal description of biological systems using different diagram types
Diagram types, in order of increasing formality and detail:
1. Semantic network
2. Pathway diagram (semantic network + gene network or metabolic pathway)
3. Metabolic pathway
4. Gene network
5. Pathway simulation (mathematical model)
Data described, from less to more formal:
- Semi-structured data
- Structured data (reactions and their components)
- Kinetic data (kinetic laws, constants, initial values)

Graphic notation

Stimulus activating NF-kappaB (semantic network, ontology)

NF-kappaB family (semantic network, ontology)

Function of human DNA methyltransferases (pathway diagram)

The biosynthesis of catecholamines (metabolic pathway)

Cell cycle model of mammalian G1/S transition control with E2F feedback loops (pathway simulation diagram)

DGR0356 “NF-kB model” (Hoffmann et al., 2002)

NF-kB dynamics in nucleus and cytoplasm before and after TNF-alpha stimulation (Hoffmann et al., 2002)

Regulation of caspase-3 activation and degradation (Stucki and Simon, 2005 )

Database module concept
The database module concept allows a developer to define new diagram types and incorporate other databases on biological pathways into the BioUML framework. The database module defines the mapping of database content into diagram elements, and the diagram types that can be used with the database. The module also provides a query engine that BioUML workbench can use to find interacting components of the system.

BioUML database modules
BioUML standard module
Databases:
- EBI databases: Ensembl, UniProt, ChEBI, GeneOntology
- Biopath/BMOND
- KEGG/Ligand
- TRANSPATH
- GeneNet
Formats:
- SBML – Systems Biology Markup Language, level 1, 2
- CellML – Cell Markup Language
- BioPAX – Biological Pathways Exchange
- PSI-MI
- OBO
- GXL – Graph eXchange Language

KEGG pathway

CellML model

SBML model

Full text search

User interface for full text search: 1) pop-up menu; 2) menu buttons for selected entity; 3) full text search pane.

Full text search (uses Lucene engine)

Graph search

Graph search engine

Simulation engine

From databases to simulation:
- Biological databases: data search and retrieval
- Visual modeling: formal description of the structure of a biological system
- Automated code generation for simulation of model behavior
Generated code:
- MATLAB code: simulation using MATLAB. JMatLink allows BioUML workbench to start MATLAB and retrieve simulation results.
- Java code: Java simulation plug-in. Contains ODE solvers ported from odeToJava and methods for hybrid model support.
- … code

%script for 'CellCycle_1991Gol' model simulation
%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
Reaction1_vi =
Reaction2_kd =
Reaction4_K1 = 0.1
Reaction4_Kc = 0.3
Reaction4_VM1 = 0.5
Reaction5_K3 = 0.1
Reaction5_VM3 = 0.2
Reaction6_K2 = 0.1
Reaction6_V2 =
Reaction7_K4 = 0.1
Reaction7_V4 = 0.1
%Model rate variables and their initial values
y = []
y(1) = 0.0 % y(1) - $cytoplasm.C
y(2) = 0.0 % y(2) - $cytoplasm.EmptySet
y(3) = 0.0 % y(3) - $cytoplasm.M
y(4) = 0.0 % y(4) - $cytoplasm.X
%numeric equation solving
[t,y] = ode23('CellCycle_1991Gol_dy',[0 100],y)
%plot the solver output
plot(t,y(:,1),'-',t,y(:,2),'-',t,y(:,3),'-',t,y(:,4),'-')
title ('Solving Goldbeter problem')
ylabel ('y(t)')
xlabel ('x(t)')
legend('$cytoplasm.C','$cytoplasm.EmptySet','$cytoplasm.M','$cytoplasm.X');

Function to calculate dy/dt for the model
function dy = CellCycle_1991Gol_dy(t, y)
% Calculates dy/dt for 'CellCycle_1991Gol' model.
%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
% write rules to calculate some equation parameters
rateOfReaction1 = Reaction1_vi;
rateOfReaction4 = ((1 - y(3))*Reaction4_VM1*y(1))/((1 + Reaction4_K1 - y(3))*(Reaction4_Kc + y(1)));
rateOfReaction5 = (Reaction5_VM3*(1 - y(4))*y(3))/(1 + Reaction5_K3 - y(4));
rateOfReaction6 = (y(3)*Reaction6_V2)/(Reaction6_K2 + y(3));
rateOfReaction7 = (Reaction7_V4*y(4))/(Reaction7_K4 + y(4));
rateOfReaction2 = y(1)*Reaction2_kd;
% calculates dy/dt for 'CellCycle-1991Gol.xml' model
% one row per variable: y(1) = C, y(2) = EmptySet, y(3) = M, y(4) = X
dy = [ + rateOfReaction1 - rateOfReaction2
       - rateOfReaction1 - rateOfReaction4 - rateOfReaction5 + rateOfReaction6 + rateOfReaction7 + rateOfReaction2
       + rateOfReaction4 - rateOfReaction6
       + rateOfReaction5 - rateOfReaction7]
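For readers who prefer equations to generated code, the rate laws in the function above can be written out as follows. This is a transcription of the code, not an independent statement of the published model. The grouping of the twelve terms of dy into four rows (C, EmptySet, M, X, in the order of the initial values) is an assumption, since the slide text has lost its line breaks.

```latex
\begin{aligned}
v_1 &= v_i, \qquad v_2 = k_d\,C,\\
v_4 &= \frac{(1 - M)\,V_{M1}\,C}{(1 + K_1 - M)\,(K_c + C)}, \qquad
v_5 = \frac{V_{M3}\,(1 - X)\,M}{1 + K_3 - X},\\
v_6 &= \frac{V_2\,M}{K_2 + M}, \qquad
v_7 = \frac{V_4\,X}{K_4 + X},\\[4pt]
\frac{dC}{dt} &= v_1 - v_2, \qquad
\frac{dM}{dt} = v_4 - v_6, \qquad
\frac{dX}{dt} = v_5 - v_7,\\
\frac{d\,\mathrm{EmptySet}}{dt} &= -v_1 - v_4 - v_5 + v_6 + v_7 + v_2 .
\end{aligned}
```

Here C, M and X stand for y(1), y(3) and y(4), and each symbol corresponds to the global constant with the matching suffix (for example, V_{M1} is Reaction4_VM1).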

Results of SBML semantic tests

BioModels – comparison of BioUML simulation results with other simulators

Simulators comparison criteria
Passed – CSV file was generated by the simulator
Interval criteria:
- no difference: * min < x < * max, or x < ZERO and max < ZERO
- small difference: 0.5 * min < x < 1.5 * max
- significant difference: otherwise
Median criteria:
- no difference: abs((x – median)/median) < 0.01, or x < ZERO and median < ZERO
- small difference: abs((x – median)/median) < 0.5
- significant difference: otherwise
x – variable value provided by the compared simulator
min, max, median – calculated from values provided by the other simulators with which the specified simulator is being compared.
Implementation note: if the result file was not generated by BioUML, the other simulators can still be compared with one another.
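The median criterion is fully specified on the slide and can be sketched as a small Python function. The value of ZERO is not given in the slides, so the constant below is an assumption, as are the function name and the handling of a median that is exactly zero. The interval criterion is not sketched, because the coefficients of its "no difference" case did not survive in the transcript.

```python
from statistics import median

ZERO = 1e-20  # assumed threshold; the slides do not state its value


def median_criterion(x, other_values):
    """Classify value x of one simulator against the values of the other simulators."""
    m = median(other_values)
    if x < ZERO and m < ZERO:
        return "no difference"
    if m == 0:
        # Relative difference is undefined; treated as significant (assumption).
        return "significant difference"
    relative = abs((x - m) / m)
    if relative < 0.01:
        return "no difference"
    if relative < 0.5:
        return "small difference"
    return "significant difference"


print(median_criterion(1.2, [1.0, 1.0, 1.0]))
```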

BioUML Enterprise Edition: BioUML server

BioUML EE architecture (diagram). Components shown on the client and server sides: BioUML workbench, Web browser, servlet container (Tomcat), BioUML servlet, BeanExplorer Enterprise Edition, database module, JDBC DB module, Lucene full text search engine, JDBC, MySQL database.

BMOND Biological MOdels aNd Diagrams database (former name – Biopath)

BMOND system architecture (diagram). Components shown on the client and server sides: BioUML workbench, Web browser, servlet container (Tomcat), BeanExplorer Enterprise Edition, Biopath module, JDBC, Biopath MySQL database.

Figure 4. G1/S entry model (Kel et al., 2000) described using BioUML technology.

BMOND web interface live demonstration
- Interface overview
- View diagrams
- View diagram components
- List of diagram components
- Categories (classification)
- Filter
- Dynamic columns
- Web forms for editing components

Part 2 New concepts and possibilities

Part 2: new concepts and possibilities
Overview
- Reconstruction as solitaire game
- Levels of biological information
- BioHub concept
- Composite database module
- Composite diagram
- Experiment concept
- Graphic notation editor
- Microarray data analysis
Live demonstration
- Loading database modules from server
- Text search
- Graph search
- Creating a composite database module
- Creating a composite diagram
- Experiment
- Graphic notation editor
- Microarray data analysis

Metaphor: reconstruction of biological systems as a solitaire (patience) game
- Desk – BioUML editor
- Solitaire – biological pathway
- Cards – biological objects (genes, proteins, lipids, etc.)
- Pack of cards – different biological databases

Levels of biological information
- Level 1: Catalogs of biological objects – UniProt, Ensembl, ChEBI, GO
- Level 2: Pathways, models – BMOND, GeneModels; refers to level 1
- Level 3: Problem specific; refers to the previous levels. Each has a wiki:
  - Cyclonet: leads, actions, targets
  - LipidNet classifications: lipids, genes
  - UbiProt classifications: E1, E2, E3, …
Main idea for data integration and pathway reconstruction:
- avoid information duplication
- classify components of biological pathways by levels
- each next level should refer to, but not duplicate, information from previous levels
- use free EBI databases whenever possible.

Add-on technology
This approach should help to solve difficulties with the usage of external catalogs, when an external catalog does not contain a needed entity (for example, a gene or substance) or when we would like to add some information to an existing entity description.
Example for BMOND2, gene: a special table allows us to add a new entity to BMOND2 if that entity is missing in the corresponding external catalog.
- Gene catalog: Ensembl
- Gene add-on table: synonyms, description, DB references, literature references, classification
- Data flow shown on the slide: SQL query, Java object, Lucene document; BioUML; BeanExplorer web interface
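The add-on idea amounts to looking an entity up in two places and merging the results. The Python sketch below illustrates this under stated assumptions: the two dictionaries stand in for the external catalog and the add-on table, and all identifiers and field values are invented for the example.

```python
# Hypothetical stand-ins for an external gene catalog and the local add-on table.
external_catalog = {
    "GENE_A": {"name": "gene A", "description": "entry from the external catalog"},
}
addon_table = {
    "GENE_A": {"synonyms": ["gA"], "classification": "example class"},
    "GENE_B": {"name": "gene B", "description": "missing in the external catalog"},
}


def lookup_gene(gene_id):
    """Return the catalog entry merged with add-on fields, or None if unknown."""
    external = external_catalog.get(gene_id)
    addon = addon_table.get(gene_id)
    if external is None and addon is None:
        return None
    merged = dict(external or {})
    merged.update(addon or {})  # add-on fields extend (or override) catalog fields
    return merged


print(lookup_gene("GENE_A"))
print(lookup_gene("GENE_B"))
```

The external catalog is never modified: local knowledge lives only in the add-on table and is combined with the catalog entry at read time.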

BioHub

BioHub concept
BioHUB – an approach to linking information from different databases.
Main usage:
– binding microarray (omics) data to pathway diagrams
– graph search
– DBReferences editor
– microarray (omics) data analysis
Follows the MIRIAM standard:
– References to database objects
– Relationships between biological objects
Simple Java API

BioHub structure
- Entities: DB_ID, version, ID, AC, species, description, key words
- Relations: DB_ID_1, DB_version_1, ID_1, DB_ID_2, DB_version_2, ID_2, relation, evidence, comment
- Databases: DB_ID, name, description, URL, url_patern_ID, url_patern_AC
- RelationTypes: relation, description, backwardRelation, comment
- RelationInfo: DB_ID_1, DB_ID_2, relation, comment
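To make the table structure concrete, the sketch below builds two of the listed tables in an in-memory SQLite database and follows a relation in both directions. It is an illustration in Python, not the BioHub Java API: the column types, the sample row, the database names and the function name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE relation_types (
        relation TEXT PRIMARY KEY,
        description TEXT,
        backwardRelation TEXT,
        comment TEXT
    );
    CREATE TABLE relations (
        DB_ID_1 TEXT, DB_version_1 TEXT, ID_1 TEXT,
        DB_ID_2 TEXT, DB_version_2 TEXT, ID_2 TEXT,
        relation TEXT REFERENCES relation_types(relation),
        evidence TEXT,
        comment TEXT
    );
    """
)
# Invented sample data: one relation type and one link between two databases.
conn.execute(
    "INSERT INTO relation_types VALUES (?, ?, ?, ?)",
    ("encodes", "gene encodes protein", "encoded by", None),
)
conn.execute(
    "INSERT INTO relations VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("GENE_DB", "1", "GENE_1", "PROTEIN_DB", "1", "PROT_1", "encodes", None, None),
)


def linked_objects(db_id, object_id):
    """Return (database, id, relation) for objects linked in either direction."""
    forward = conn.execute(
        "SELECT DB_ID_2, ID_2, relation FROM relations "
        "WHERE DB_ID_1 = ? AND ID_1 = ?",
        (db_id, object_id),
    ).fetchall()
    backward = conn.execute(
        "SELECT r.DB_ID_1, r.ID_1, t.backwardRelation "
        "FROM relations r JOIN relation_types t ON t.relation = r.relation "
        "WHERE r.DB_ID_2 = ? AND r.ID_2 = ?",
        (db_id, object_id),
    ).fetchall()
    return forward + backward


print(linked_objects("PROTEIN_DB", "PROT_1"))
```

Storing backwardRelation in RelationTypes means each link is recorded once and can still be read from either end, which is what graph search and data binding need.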

Linking with experimental data and results of analysis (diagram)
- Level 1: Catalogs of biological objects – UniProt, Ensembl, ChEBI, GO
- Level 2: Pathways, models – BMOND, GeneModels
- Level 3: Problem specific – Cyclonet (leads, actions, targets), LipidNet classifications (lipids, genes), UbiProt classifications (E1, E2, E3, …)
- BioHUB links these levels with experimental data and results of analysis: OMICS data, MSigDB, GeneAtlas, NCI60

Linking with external databases (diagram)
- Level 1: Catalogs of biological objects – UniProt, Ensembl, ChEBI, GO
- Level 2: Pathways, models – BMOND, GeneModels
- Level 3: Problem specific – Cyclonet (leads, actions, targets), LipidNet classifications (lipids, genes), UbiProt classifications (E1, E2, E3, …)
- BioHUB links these levels with experimental data and results of analysis: OMICS data, MSigDB, GeneAtlas, NCI60
- External databases: KEGG, LipidMap, LipidBank, Reactome, …

Coloring diagram according to microarray data. Each bar corresponds to one value from corresponding microarray series.

Coloring diagram according to omics data

BioHub usage: graph search engine

Composite database module Flash movie: XML_module.exe

Composite database module
A composite database module is defined formally as an XML document. It allows one to:
- specify dependencies on other database modules
- specify data types that can be used from external database modules
- describe dynamic properties for the add-on technology
- specify what dynamic properties can be added to data types from external modules. This information is stored in the local module and merged dynamically with information from external modules. In this way the user can add information to external catalogs like Ensembl, UniProt, etc.
- specify data types used by the local module
- specify diagram types used by the local module
- specify the QueryEngine

DTD
<!ELEMENT dbModule (jdbcConnection, properties?, dependencies?, types?)>
  name CDATA #REQUIRED
  title CDATA #REQUIRED
  description PCDATA
  version CDATA "0.8.0"
  type CDATA text|SQL
  databaseType CDATA
  databaseVersion CDATA
  databaseName CDATA >
  name CDATA #REQUIRED
  jdbcDriverClass CDATA #REQUIRED
  jdbcURL CDATA #REQUIRED
  jdbcUser CDATA
  jdbcPassword CDATA >

<!ATTLIST property
  name CDATA #REQUIRED
  type CDATA #REQUIRED
  short-description CDATA #IMPLIED
  value CDATA >
<!ATTLIST tag
  name CDATA #REQUIRED
  value CDATA #IMPLIED >
<!ATTLIST propertyRef
  name CDATA #REQUIRED
  value CDATA >

  name CDATA #REQUIRED >
<!ATTLIST externalType
  name CDATA #REQUIRED
  readOnly CDATA true|false >
  name CDATA #REQUIRED
  type CDATA Java|XML
  class CDATA
  path CDATA >

  section CDATA #REQUIRED
  name CDATA #REQUIRED
  class CDATA #REQUIRED
  transformer CDATA #REQUIRED >
  class CDATA #REQUIRED
  luceneIndexes CDATA >
  class CDATA #REQUIRED
  table CDATA >

Editor for composite database module

Current status
Implemented:
- Database modules (initial version): Ensembl, UniProt, ChEBI, GO, IntAct, Reactome, BioModels
- Composite module (external references)
  – Defined as XML
  – Composite module editor
- Selecting and loading modules from server
In progress:
- BioHUB
- Protein state concept
- Add-on technology
- BMOND2 – redesigned version of BMOND.

From huge theory to practical output
Automated language translation. Practical output:
- electronic dictionaries
- spell checkers
Biological data integration. Practical output:
- catalogs (Ensembl, UniProt, ChEBI)
- controlled vocabularies, ontologies
- hubs

Model composition

Composite diagram: main concepts
Block types:
1) block – only mathematical equations. Used mainly for physiological models;
2) subdiagram – another diagram.
Connection types:
1) directed – input -> output. A transformation function can be used;
2) undirected – contact. Indicates that 2 nodes in the model are the same entity.
Semantic constraints: there are semantic constraints, for example: a block can have only one input for each variable. Two inputs for the same variable are forbidden.
Flat model: before MATLAB or Java code generation, the composite model is transformed into a flat model and the usual generation routines are used.
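The "one input per variable" constraint is easy to state as a check over directed connections. The Python sketch below is only an illustration: the representation of a connection as a pair of (block, variable) tuples and the block names are assumptions, not BioUML data structures.

```python
from collections import Counter


def inputs_with_conflicts(directed_connections):
    """Return (block, variable) targets that receive more than one directed input."""
    counts = Counter(target for _source, target in directed_connections)
    return sorted(target for target, n in counts.items() if n > 1)


# Each connection is (source, target); both ends are (block, variable) pairs.
connections = [
    (("heart", "pressure"), ("kidney", "pressure")),
    (("vessels", "pressure"), ("kidney", "pressure")),  # second input: forbidden
    (("kidney", "volume"), ("heart", "volume")),
]
conflicts = inputs_with_conflicts(connections)
print(conflicts)
```

A semantic controller could run such a check whenever a connection is added and reject the edit if the result is not empty.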

Experiment

To perform a virtual experiment, it is frequently necessary to modify the initial model. Typical modifications (changes) are:
- changing initial values
- changing model parameters to imitate different conditions or mutations
- deleting some model elements to imitate knock-out mutations
- adding events to imitate external influences on the model
To avoid duplicating the model for each virtual experiment, we introduce the “changes” concept.
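One way to picture the “changes” concept is a base model plus a list of change records that is applied on demand. The Python sketch below rests on assumptions: the dictionary layout of the model, the change kinds and the function name are invented for illustration and are not BioUML's representation.

```python
import copy

# Hypothetical base model; only this one copy is stored.
base_model = {
    "initial_values": {"A": 1.0, "B": 0.0},
    "parameters": {"k1": 0.5, "k2": 0.2},
    "reactions": {"R1": "A -> B", "R2": "B -> C"},
    "events": [],
}


def apply_changes(model, changes):
    """Return a modified copy of the model; the base model is left untouched."""
    variant = copy.deepcopy(model)
    for change in changes:
        kind = change["kind"]
        if kind == "initial_value":
            variant["initial_values"][change["name"]] = change["value"]
        elif kind == "parameter":
            variant["parameters"][change["name"]] = change["value"]
        elif kind == "delete":
            del variant["reactions"][change["name"]]
        elif kind == "add_event":
            variant["events"].append(change["event"])
        else:
            raise ValueError(f"unknown change kind: {kind}")
    return variant


# A knock-out experiment described only by its changes.
knockout = apply_changes(base_model, [
    {"kind": "delete", "name": "R2"},
    {"kind": "parameter", "name": "k1", "value": 1.0},
])
print(knockout["reactions"], knockout["parameters"])
```

Each experiment then stores a few change records instead of a full copy of the model, so a later fix to the base model reaches every experiment automatically.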

Graphic notation formal definition as XML document Flash movie: Graphic_Notations_Editor.exe

Graphic notation versus graph layout
A graphic notation, unlike a graph layout:
- allows editing a diagram
- allows creating a new diagram
- allows different graphic notations to be applied to the same SBML model
- allows SBGN to be formally defined and used in SBML models
- can be reused by many tools

Graphic notation can be defined formally as an XML document
- properties – formal definition of properties that can be used as properties of nodes and edges (for example, title, multimer, etc.). The definition of a property includes:
  – name
  – type
  – short description
  – controlled vocabulary (optional)
- node types – the definition of a node includes:
  – name
  – icon
  – properties
  – view function (JavaScript)
  – short description
- edge types – the definition of an edge includes:
  – name
  – icon
  – properties
  – view function (JavaScript)
  – short description
- semantic controller – defines rules for semantic control of diagram integrity. For this purpose it defines the following functions:
  – canAccept (JavaScript)
  – isResizable (JavaScript)
  – move (JavaScript)
- examples – a set of diagrams that can be used as test cases, legend and examples for the graphic notation. DML (Diagram Markup Language) is used for this purpose.

Basic software architecture for rendering of biological models according to specified graphic notation and layout information
- Initial data: SBML, BioPAX, …; layout information; graphic notation
- APIs over the initial data: Diagram Model API, Layout API, Notation API
- JavaScript API for data access
- JavaScript functions: build node/edge view, semantic control
- Rendering engine
- Rendering API: JavaScript API for creating primitives, similar to the SBML layout extension

Formal definition of graphic notation as XML document and integration with SBML format

Graphic notation editor: main concepts
- a graphic notation is defined formally as an XML document
- the graphic notation editor provides a user-friendly interface for editing the XML document
- the SBGN graphic notation (prototype) is implemented
- BioUML workbench allows creating and editing diagrams using a graphic notation defined as an XML document
The graphic notation editor may be useful to the SBGN community for:
– improving the SBGN specification
– testing the SBGN specification by creating different diagrams

BioUML workbench. Select the ‘Data’ tab to see the list of available graphic notations.

Right-click the selected graphic notation to open it in the Graphic Notation Editor.

Main sections of formal definition of graphic notation

Properties editor: list of specific properties that are used by the graphic notation.

The user can right-click the Properties node to create a new property.

Nodes – contains the list of all node types used by the graphic notation.

For each node type the user can define: - name - properties - icon - view function (JavaScript)

By right-clicking “Nodes” the user can create a new node type.

In the same way the user can define an edge type: - name - properties - icon - view function (JavaScript)

The “Examples” node contains a set of diagrams that demonstrate usage of the graphic notation.

The user can create and edit such diagrams.

When the user selects an element on the diagram, they can edit: - object properties - the JavaScript that builds a view for the selected diagram element

The “Semantic controller” node contains a list of JavaScript functions that provide semantic constraints and semantic integrity of the diagram.

A graphic notation defined as an XML document can be used by BioUML workbench to create the corresponding diagram.

Graphic Notation Editor SBGN examples created in BioUML

Skins

Microarray plug-in (alpha version)

Microarray plug-in
- Import microarray data in tab delimited format
- Show data as a table
- Filter data by different criteria
- Microarray data analysis
  - Revealing up/down regulated genes
  - Meta-analyses
- Binding with diagram nodes by ID
- Coloring diagrams
- JavaScript functions
  - Data manipulation (filter, join, intersect, trim, etc.)
  - Statistical analysis

Microarray plug-in
Current work:
- Powerful user interface for coloring diagrams
- Support of other formats for microarray data and results of analyses
- Sophisticated binding algorithm using different database references and IDs (gene hub)
Further work:
- Server module that will provide access to ArrayExpress data

BioUML workbench. The Data tab contains the section “Microarray”. The user can import microarray data in tab delimited format into this section.

Probe sets can be filtered: - by column values - by selecting only those probe sets that can be linked to the specified diagram

Microarray analysis

Coloring diagram according to microarray data. Each bar corresponds to one value from corresponding microarray series.

Coloring diagram according to omics data

Further development: Protein state

BioUML workbench: further development
- Protein states
- Complexes
- Improving team work on annotation
  – Login, single sign-on
  – Editing history (what data were modified, by whom and when)
  – Passing of changes from server to client
- Sequence analysis and visualization
- Agent based modeling

Protein state

Modification
The functions of macromolecular entities (mainly proteins) are often determined not only by their primary sequences, but also by the chemical modifications they have undergone.
- In BMOND2, unmodified and modified forms of a protein refer to the same entity in the UniProt database.
- The list of possible modifications is extracted from the UniProt Feature Table.
- BMOND2 modifications table – allows describing modifications that are not described in UniProt. These modifications are automatically added to the protein referred to from BMOND2.
- Modification type – controlled vocabulary that describes possible modification types (for example, phosphorylation, acetylation, ubiquitination).
- To take protein modifications into account, the State concept is used.

UniProt Feature Table
FT CHAIN Cytosolic purine 5'-nucleotidase.
FT /FTId=PRO_
FT REGION Substrate binding (Potential).
FT COMPBIAS Asp/Glu-rich (acidic).
FT ACT_SITE Nucleophile.
FT ACT_SITE Proton donor.
FT METAL Magnesium.
FT METAL Magnesium (via carbonyl oxygen).
FT METAL Magnesium.
FT BINDING Allosteric activator 1.
FT BINDING Allosteric activator 2.
FT BINDING Allosteric activator 2.
FT BINDING Allosteric activator 1; via carbonyl
FT oxygen.
FT BINDING Allosteric activator 2.
FT MOD_RES Phosphoserine (By similarity).
FT VARIANT 3 3 T -> A (in dbSNP:rs ).
FT /FTId=VAR_
FT VARIANT Q -> R (in dbSNP:rs ).
FT /FTId=VAR_

Modification
- position
- amino acid
- modification type (controlled vocabulary)
- evidence: experimental, by similarity, predicted
- comment
- publication reference

State concept
State – describes the states of all amino acids available for modification. Possible values:
– ? – unknown, not specified
– * – any
– - – unmodified
– p – phosphorylated
– ac – acetylated
– … – from controlled vocabulary
Protein states are described in the BMOND2 states table.
Reaction – the user should specify the protein state.
Diagram – the user should specify the protein state.

State table
- module (database)
- id
- state – short name (like TRANSPATH)
- position
- modification

SBGN
Mapping BMOND2 -- SBGN:
- modification – state variable
- state – state of macromolecule

Complex concept

A complex is a biochemical entity composed of other biochemical entities, whether macromolecules, small molecules, multimers, or themselves complexes.
- A complex is specified as a set of units.
- Complex modifications
  – all possible modifications of its units (some of them cannot occur due to physical interactions between units; how can we take this into account?)
- Complex state
  – var. 1 – list of modifications for its subunits
  – var. 2 – list of states for its units

Complex tables
Complex
– ID
– title (short name)
– complete name
– species
– synonyms
– comment
References:
– States
– Synonyms
– Structure
– DBReferences
– Publications
Complex Units
– complexDB
– complexID
– unitDB
– unitID
– multimer

SBGN

Reaction
- Reaction components
  – component identification: DB id [state] [compartment]
- Reaction
  – [compartment]
- Reaction dialog
  – species state
  – species compartment
  – reaction compartment
- Tables
  – Reaction: compartment
  – Reaction components: state, compartment

Diagrams
- Macromolecule state
  – “New diagram element” dialog
- Graphic notation
  – BioUML: states – right label, one modification; complexes
  – SBGN skin

Acknowledgements
Part of this work was supported by the following grants:
- European Committee grant № “Net2Drug”
- Siberian Branch of Russian Academy of Sciences (interdisciplinary projects № 46)
- Volkswagen-Stiftung (I/75941)
- INTAS Nr
- RFBR Nr а
The author is grateful to Alexander Kel and Sergey Zhatchenko for useful comments, discussions and technical support.
Software developers and annotators: Nikita Tolstyh, Mikhail Puzanov, Ruslan Sharipov, Sergey Lapukhov, Ilya Kiselev, Ivan Yevshin, Alexander Magdysyuk, Denis Ryumin, Elena Cheremushkina, Vlad Zhvaleev, Alexandr Koshukov, Ekaterina Kalashnikova, Vasiliy Hudyakov, Igor Tyazhev, Sergey Graschenko, Oleg Onegov