Model Reproducibility and Part Repositories Model Sharing Group Presentation Interagency Modeling and Analysis Group Herbert Sauro Maxwell Neal University.

Slides:



Advertisements
Similar presentations
The ERATO Systems Biology Workbench Michael Hucka, Hamid Bolouri, Andrew Finney, Herbert Sauro ERATO Kitano Systems Biology Project California Institute.
Advertisements

Theory. Modeling of Biochemical Reaction Systems 2 Assumptions: The reaction systems are spatially homogeneous at every moment of time evolution. The.
Snejina Lazarova Senior QA Engineer, Team Lead CRMTeam Dimo Mitev Senior QA Engineer, Team Lead SystemIntegrationTeam Telerik QA Academy SOAP-based Web.
CellDesigner Tutorial Laurence Calzone, Andrei Zinovyev UMR U900 INSERM/Institut Curie/Ecole des Mines de Paris Wednesday, April 30th.
Åbo Akademi University & TUCS, Turku, Finland Ion PETRE Andrzej MIZERA COPASI Complex Pathway Simulator.
Multiscale Systems Biology WG Breakout session Sept. 3, 2014.
CellML in practice: modularity & reuse; versioning & provenance; embedded workspaces David Nickerson Auckland Bioengineering Institute University of Auckland.
University of Leeds Department of Chemistry The New MCM Website Stephen Pascoe, Louise Whitehouse and Andrew Rickard.
Computational Biology, Part 19 Cell Simulation: Virtual Cell Robert F. Murphy, Shann-Ching Chen, Justin Newberg Copyright  All rights reserved.
Software Development for Systems Biology Herbert M Sauro Frank Bergmann Keck Graduate Institute 535 Watson Drive Claremont, CA,
Applying Multi-Criteria Optimisation to Develop Cognitive Models Peter Lane University of Hertfordshire Fernand Gobet Brunel University.
Principle of Functional Verification Chapter 1~3 Presenter : Fu-Ching Yang.
Mgt 240 Lecture Website Construction: Software and Language Alternatives March 29, 2005.
CASE Tools And Their Effect On Software Quality Peter Geddis – pxg07u.
4 th NeuroML Development Workshop & BrainScaleS CodeJam, Edinburgh, March NeuroML: Where are we at? Padraig Gleeson Department.
Functional Curation with the Cardiac Electrophysiology Web Lab Gary Mirams Computational Biology, University of Oxford CellML Workshop, 14 th April 2015.
January, 23, 2006 Ilkay Altintas
Lecture 3: Pathway Generation Tool I: CellDesigner: A modeling tool of biochemical networks Y.Z. Chen Department of Pharmacy National University of Singapore.
The ERATO Systems Biology Workbench Michael Hucka, Andrew Finney, Herbert Sauro, Hamid Bolouri ERATO Kitano Systems Biology Project California Institute.
Breakout Report: Model and Data Sharing Working Group Peter Hunter auckland.ac.nzauckland.ac.nz Herbert Sauro uw.edu uw.edu Jim Bassingthwaighte uw.edu.
Composition and Aggregation in Modeling Regulatory Networks Clifford A. Shaffer* Ranjit Randhawa* John J. Tyson + Departments of Computer Science* and.
CARDIAC ELECTROPHYSIOLOGY WEB LAB The underlying software.
David Nickerson 1 (with help from Dagmar Waltemath 2 & Frank Bergmann 3 ) 1 Auckland Bioengineering Institute, University of Auckland 2 University of Rostock.
1 Research Groups : KEEL: A Software Tool to Assess Evolutionary Algorithms for Data Mining Problems SCI 2 SMetrology and Models Intelligent.
EBI is an Outstation of the European Molecular Biology Laboratory. BioModels Database, a public model- sharing resource In silico systems biology: network.
Database structure for the European Integrated Tokamak Modelling Task Force F. Imbeaux On behalf of the Data Coordination Project.
Metadata and Geographical Information Systems Adrian Moss KINDS project, Manchester Metropolitan University, UK
Measure Model Manipulate The Center for Cell Analysis and Modeling focuses on creating new technologies for understanding the dynamic distributions of.
BioUML integrated platform for building virtual cell and virtual physiological human Fedor Kolpakov Institute of Systems Biology Laboratory of Bioinformatics,
HERA/LHC Workshop, MC Tools working group, HzTool, JetWeb and CEDAR Tools for validating and tuning MC models Ben Waugh, UCL Workshop on.
Chad Berkley NCEAS National Center for Ecological Analysis and Synthesis (NCEAS), University of California Santa Barbara Long Term Ecological Research.
The Optimization Plug-in for the BioUML Platform E. O. Kutumova 1,2,*, A. S. Ryabova 1,3, N. I. Tolstyh 1, F. A. Kolpakov 1,2 1 Institute of Systems Biology,
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
Copyright OpenHelix. No use or reproduction without express written consent1.
Virtual Cell and CellML The Virtual Cell Group Center for Cell Analysis and Modeling University of Connecticut Health Center Farmington, CT – USA.
FlexElink Winter presentation 26 February 2002 Flexible linking (and formatting) management software Hector Sanchez Universitat Jaume I Ing. Informatica.
The european ITM Task Force data structure F. Imbeaux.
The Physiome Model Repository – PMR David Nickerson Auckland Bioengineering Institute The University.
The ERATO Systems Biology Workbench Hamid Bolouri ERATO Kitano Systems Biology Project California Institute of Technology & University of Hertfordshire,
K.Furukawa, Nov Database and Simulation Codes 1 Simple thoughts Around Information Repository and Around Simulation Codes K. Furukawa, KEK Nov.
Herbert M Sauro Install from the web site sys-bio.org.
Modelling epithelial transport David P. Nickerson¹, Kirk L. Hamilton², Peter J. Hunter¹ ¹Auckland Bioengineering Institute, Auckland, New Zealand ²Department.
BioRAT: Extracting Biological Information from Full-length Papers David P.A. Corney, Bernard F. Buxton, William B. Langdon and David T. Jones Bioinformatics.
Sharing Models. How Can I Exchange Models? SBML (Systems Biology Markup Language): de facto standard for representing cellular networks. A large number.
RoSBM Registry of Standard Biological Models Barry Canton (MIT) Vincent Rouilly (Imperial College) Registry Workshop November 2007,Boston.
CellDesigner and Virtual Cell Leang Chhun and Chanchala Kaddi Georgia Institute of Technology 29 June, 2006.
The ERATO Systems Biology Workbench: Enabling Interaction and Exchange Between Tools for Computational Biology Michael Hucka, Andrew Finney, Herbert Sauro,
Development of a Distributed MATLAB Environment with Real-Time Data Visualization Authors: Joseph Diamond, Richard McEver Affiliation: Dr. Jian Huang,
Summary of Progress During the last 12 Months in the Standards Community Model Sharing Working Group Peter Hunter and Herbert M Sauro IMAG Meeting 2012.
Biological systems and pathway analysis
SOAP-based Web Services Telerik Software Academy Software Quality Assurance.
Mining the Biomedical Research Literature Ken Baclawski.
Systems Biology Markup Language Ranjit Randhawa Department of Computer Science Virginia Tech.
Composition in Modeling Macromolecular Regulatory Networks Ranjit Randhawa September 9th 2007.
20081 Converting workspaces and using SALT & subversion to maintain them. V1.02.
Fusing and Composing Macromolecular Regulatory Network Models Ranjit Randhawa* Clifford A. Shaffer* John J. Tyson + Departments of Computer Science* and.
© Geodise Project, University of Southampton, Integrating Data Management into Engineering Applications Zhuoan Jiao, Jasmin.
JigCell Nicholas A. Allen*, Kathy C. Chen**, Emery D. Conrad**, Ranjit Randhawa*, Clifford A. Shaffer*, John J. Tyson**, Layne T. Watson* and Jason W.
AHM04: Sep 2004 Nottingham CCLRC e-Science Centre eMinerals: Environment from the Molecular Level Managing simulation data Lisa Blanshard e- Science Data.
© Geodise Project, University of Southampton, Workflow Application Fenglian Xu 07/05/03.
Climate-SDM (1) Climate analysis use case –Described by: Marcia Branstetter Use case description –Data obtained from ESG –Using a sequence steps in analysis,
BioUML – integrated platform for building virtual cell and virtual physiological human Fedor Kolpakov 1,2, Nikita Tolstykh 1,2, Elena Kutumova 1,2, Ilya.
David Adams ATLAS ATLAS Distributed Analysis and proposal for ATLAS-LHCb system David Adams BNL March 22, 2004 ATLAS-LHCb-GANGA Meeting.
BENG/CHEM/Pharm/MATH 276 HHMI Interfaces Lab 2: Numerical Analysis for Multi-Scale Biology Modeling Cell Biochemical and Biophysical Networks Britton Boras.
Advanced Higher Computing Science
Fig. 1. Top-level classes of SBRML-OM.
Model Curation Edmund J. Crampin Auckland Bioengineering Institute
Digital Human Meeting FAS, July 23, 2001 NLM, Bethesda, MD
Lecture 1: Multi-tier Architecture Overview
Overview of Workflows: Why Use Them?
Presentation transcript:

Model Reproducibility and Part Repositories Model Sharing Group Presentation Interagency Modeling and Analysis Group Herbert Sauro Maxwell Neal University of Washington

Definitions Repeatability – The ability for an individual to show that an experiment, repeated using the same material and equipment, yields the same statistical result. Reproducibility – The ability for different individuals to show that an experiment repeated using different but similar material and different equipment yields the same statistical result.

Graphically….

Model Repeatability What do we mean by model repeatability? Running the model multiple times on the same computer using the same software yields the same result. For a stochastic simulation, multiple runs on the same computer will yield the same statistical distribution.

Model Reproducibility What do we mean by model reproducibility? The ability to recreate a published simulation without necessarily using the same software or computer that was used in the original publication. At the present time it is a non-trivial exercise to show reproducibility in a published model.

Why Reproducibility is Hard There are a number of reasons why reproducibility of computational models is difficult: 1.The model itself is difficult to extract from a published article. 2.Details of the algorithm and settings that were used are often missing. We already have a solution to problem 1: Model repositories.

Model Repositories: Biomodels, CellML and JSim Repositories Biomodels: SBML, CellML, Matlab etc CellML JSim: MML

Model Repositories: Biomodels, CellML and JSim Repositories Both SBML and CellML are standard formats for specifying a model. They do not specify how to generate a simulation (experiment).

Model Repositories: Biomodels, CellML and JSim Repositories There are other tools such Jarnac, JDesigner COPASI, TinkerCell, JSim, VCell, CompuCell3D etc where simulations can be specified. The details of the simulation experiments cannot however be easily transferred to another tool. For example a simulation experiment in Jarnac cannot be run in COPASI. All these tools can however generate SBML so that the models are portable.

Current Portfolio of Community Standards in Computational Systems Biology Rate KineticsAlgorithmsBehaviors Modified from Nicolas Le Novere SBRML SBRML = Systems Bio Results (data) Markup Language

SBRML: Systems Biology Results Markup Language Can be use to store information such as: 1.Time Course Simulations 2.Steady State Results 3.Enzyme Kinetic Data 4.Experimental Data 5.Results of parameter scans 6.Who generated the data? 7.etc. – very flexible proposal Developed by Mendes Manchester Group in the UK

SED-ML: Simulation Experiment Description Language

….. ModelsData‘Simulation Experiments’ Inputs

SED-ML: Simulation Experiment Description Language ….. ModelsData Tasks ….. Inputs Connecting a model to an experiment and data. ‘Simulation Experiments’ Note: One model per task but multiple data and experiments per task.

SED-ML: Simulation Experiment Description Language ….. ModelsData Tasks ….. Data Generators Inputs Post-processing of results ‘Simulation Experiments’ Connecting a model to an experiment and data.

SED-ML: Simulation Experiment Description Language ….. ModelsData Tasks ….. Data Generators Outputs Inputs Post-processing of results ‘Simulation Experiments’ Connecting a model to an experiment and data.

Concatenating Experiments => SED-ML Optimization Sensitivity Analysis Time Course Simulation

SED-ML: Simulation Experiment Description Language ….. ModelsData Inputs ‘Simulation Experiments’ Models: SBML CellML XMML Currently must be expressible in XML, i.e cannot be Matlab, Java etc. Must be able to annotate and must have a formal structure Data: No standard format yet but SBRML is a possibility. Could be virtual patients, time course data for fitting, flux balance data, Etc. Experiments: Perturbations, Time Course, Steady State, Numerical Methods Settings, Parameter Scans, Optimization, Sensitivity Analysis, Bifurcation Analysis, etc

SED-ML: Simulation Experiment Description Language Because the models, data and experiments are based on XML, the different files need not reside on the same computer. For example it would be possible to reference a model remotely from Biomodels or the Cellml repository. The same goes for data and experiments. In the case of large datasets it might not be practical to have the data on a local machine and remote access it more convenient.

SED-ML: Archival Format Rather than have separate files for models, data and experiments, it is also proposed to have a single archival file. This file will be a zip file containing models (SBML, etc), Data (SBRML) and Experiments (in SED-ML). Models Data SED-ML Zip File

Ancillary Efforts KiSAO: Kinetic Simulation Algorithm Ontology KiSAO can be used to identify both the algorithm used and the initial setup. For example what ODE solver was used and what tolerances etc where specified.

What does SED-ML Look like? Comparing Limit Cycles and strange attractors for oscillation in Drosophila <model id="model1" name="Circadian Oscillations" language="urn:sedml:language:cellml" source=" a47e222bc3b3d9117baa08d2e7246d67eedd/leloup_gonze_goldbeter_1999_a.cellml"/> <changeAttribute newValue="0.28"/> <changeAttribute newValue="4.8"/> Etc….

What about the User? Obviously the user isn’t going to write XML What might the user experience be? There are at least three approaches here

Forms Based Entry of SED-ML Web Tools/ Frank Bergmann COPASI (copasi.org)

Script Based Definition of SED-ML model myModel (Xo, X1) var S1, S2; ext Xo, X1; // Declare variables and boundary species Initial // Set up initial conditions and parameter values Xo = 10.0; X1 = 0.0; k1 = 0.1; k2 = 0.3; start = 5; duration = 2; slope = 0.2 Signals Xo = ramp (start, duration, slope); Events when S1 > 5 do k1 = k1 / 2; Network // Define the biochemical network Xo -> S1; k1*Xo; S1 -> S2; k2*S1; S2 -> X1; k3*S2; end; // Specify Simulation Experiment m1 = runSimulation (0, 10, 100); reset; k1 = k1 * 2; m2 = runSimulation ( ); output (m1, m2);

Graphical UI Tracking All operations are tracked by the software so that model changes and simulations can be replayed. JDesigner (sys-bio.org) Tinkercell (tinkercell.com)

What’s in it for me? Reproducibility!

What’s in it for me? Download a pdf article from pubmed Dynamic modelling of oestrogen signalling and cell fate in breast cancer cells, Tyson el al, Nature Rev Cancer, 11, (2011) From the figure extract all the information required to reproduce the three simulation experiments at the three stress levels: 1. Model including the SBGN network diagram 2. Data (possibly for fitting or comparison) 3. SED-ML that describes three experiments. Load into your favorite tool to recreate figure. Unfolded protein response (UPR), y axis % unfolded protein, total chaperone and free protein

But there’s more… Once we have the ability to encode simulation experiments there are a few other related things we can think about doing, these include: 1.Tracking Model Changes 2.Recording Simulations to Exchange or Replay 3.Versioning 4.Unit Tests 5.Multiple data sets, eg support virtual patients 6.Supporting Animations 7.Formalize and automate simulator testing

Versioning and Tracking

SED-ML ● SEDML homepage: ● SEDML at Sourceforge: Support library to read and write SEDML, developed by VCell and CSBE (Edinburgh) groups - jlibSEDML ● SEDML mailing list: Prototypes developed by: Frank Bergmann (part of SBW Project) Jacky Snoep (JWS Online Simulator) Richard Adams (SBSI: Edinburgh) Ion Moraru (VCell) via SBW grant Peter Hunter (PCEnv) Future developments Jim Bassingthwaighte (JSim) via SBW grant. Sven Sahle (COPASI, EU) Nicolas Le Novere (Biomodels, EU/UK)

Community Effort US: Herbert Sauro (UW, SBW) Ion Moraru (Connecticut, VCell) Jim Bassingthwaighte (UW, JSim) Frank Bergmann (Caltech, SBW) Mike Hucka (Caltech, SBML) Europe: Dagmar Waltemath (Rostock, SEDML) Nicolas Le Novere (EBI, Biomodels) Richard Adams (Edinburgh, SBSI) Sven Sahle (Heidelberg, COPASI) Henning Schmidt (SB Toolbox) Fedor Kolpakov (BioUML) New Zealand Andrew Miller (Auckland, CellML) David Nickerson (Auckland, CellML) Funding:

Component Repositories Possibly one of the hardest things about modeling a cellular network is researching the rate laws and parameters that one should use to build the model. What if there were a repository of ready made parts that could be dropped into a model?

Exploiting Biomodels The biomodels repository has almost 400 curated models, many of which are annotated. We can take the annotations and use them to break up every biomodel into its constituent parts. That will result in over 4000 parts that could be used in new models.

What does a biomodels part look like? etc Name of the enzyme The rate law The parameter values that were used

How was it done A knowledge base was created that represents each individual reaction that is annotated against a GO term in the BioModels repository. Each individual reaction is associated with it's GO term, its synonyms, its codename in the SBML model, and the SBML model free-text description. A search using the tool sends a pattern match query across these attributes of the reactions. The tool accesses the repository itself so that it is always up to date. This tool can be incorporated in to simulators allowing users to browse for suitable parts they can include in their model. We’d like to do the same for CellML and the JSim repositories so that users can browse for parts related to physiological systems.

How to get hold of SRF SBML Reaction Finder Description: Downloads: