SysMO-DB: A pragmatic approach to sharing information amongst Systems Biology projects in Europe Carole Goble, University of Manchester, UK


SysMO-DB: A pragmatic approach to sharing information amongst Systems Biology projects in Europe Carole Goble, University of Manchester, UK

Pan-European collaboration. Systems Biology of Microorganisms. The transition from growing to non-growing Bacillus subtilis cells; Energy and Saccharomyces cerevisiae; Biology of Clostridium acetobutylicum; Gene interaction networks and models of cation homeostasis in Saccharomyces cerevisiae.

Eleven individual projects, 91 institutes. Different research outcomes. A cross-section of microorganisms, incl. bacteria, archaea and yeast. Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way. Present these processes in the form of computerized mathematical models. Pool research capacities and know-how. Running since April. Two phases – more later! BaCell-SysMO, COSMIC, SUMO, KOSMOBAC, SysMO-LAB, PSYSMO, Valla, MOSES, TRANSLUCENT, STREAM, SulfoSYS.

Types of stuff: Multiple ‘omics (genomics, transcriptomics, proteomics, metabolomics). Images. Reaction Kinetics. Models. Relationships between data sets/experiments. Procedures, experiments, data, results and models. Analysis of data. The same across many Systems Biology projects.

The Problem (1): No one concept of experimentation or modelling. No planned, shared infrastructure for pooling.

SysMO-DB: Started July 2008, 3 years + 3 years. 4 people, 3 teams over 3 sites. Sensitively retrofit a data access, model handling and data integration platform. Support and manage the diversity of data, models and competencies. Web-based solution: exchange of data, models and processes; search across the initiative's assets; dissemination of results.

The Problem (2): Own solutions. Suspicion. Data issues. Resource issues. Own data solutions and collaboration environments: wikis, e-Groupware, PHProjekt, BaseCamp, PLONE, Alfresco, bespoke commercial … files and spreadsheets. Suspicion and caution over sharing. Interesting interplay between modellers, experimentalists and bioinformaticians. Many do not have data, or follow the standards that exist, or know who is doing what. Much of the data cannot be compared: different organisms, different strains. No extra resources for the consortiums. 91 institutes, 11 consortiums, some overlapping.

Principles… A series of small victories. Realistic. Don't reinvent. Sustainable and extensible. Migrate to community standards. Provide instant gratification. Address doubt and anxiety. Keep barriers low.

Social Approach: PALS - Power Contributors! 18 postdocs and PhD students. All three kinds of people. Design and technical collaboration team. Very intense collaboration. UK and Continental PALS Chapters. Audits and Sharing. Methods, data, models, standards, software, schemas, spreadsheets, SOPs… 20 questions they want answered. Summer Schools.

Communication via PALs: DB team, PALS, Projects. Show what is there. Suggest what is possible. Ask for requirements. Give requirements. Tell priorities. Rate outcomes. Suggest improvements. Double check. Transmit. Disseminate. Collect answers.

Picking Pain Points. Keeping it Real. Project Directors: Data remains with us. We control who sees what. Just enough exchange. Responsibility. PALs: Spreadsheets. Yellow Pages. Standard Operating Procedures.

SysMO SEEK: Assets Catalogue. Archive. Social Network. Sharing Space. Gateway. Yellow Pages: People. Expertise. Projects. Institutions. Facilities. Studies. Data: Experimental data sets and analysed results. Gateway to data stores – SABIO-RK, ‘omics. Models: Store. Stimulate. Publish. Curate. Gateway to COPASI, JWS Online, BioModels. Processes: Laboratory protocols – Standard Operating Procedures. Bioinformatics analyses – computational workflows – Taverna. Model population and validation – workflows – Taverna. Gateway to myExperiment, MolMeth, OpenWetWare… Interlinking.

SysMO SEEK: Is there any group generating kinetic data? Is this data available? Who is working with which organism? What methods are being used to determine enzyme activity? Under which experimental conditions are my partners measuring glucose concentration?

Social Networks

Access Permissions. Protect: Just Enough Sharing. Reusing myExperiment.
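
The "we control who sees what" idea behind this slide can be illustrated with a small sketch. The scope names, the `Asset` and `User` types and the `can_view` function are all assumptions made for illustration; they are not taken from SEEK or myExperiment.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Scope(Enum):
    """Hypothetical sharing scopes, from narrowest to widest."""
    OWNER = "owner"
    PROJECT = "project"
    CONSORTIUM = "consortium"
    PUBLIC = "public"


@dataclass
class User:
    name: str
    project: Optional[str] = None  # None: not a member of any project


@dataclass
class Asset:
    title: str
    owner: str
    project: str
    scope: Scope = Scope.OWNER                           # default: nothing is shared
    shared_with: Set[str] = field(default_factory=set)   # named individuals


def can_view(user: Optional[User], asset: Asset) -> bool:
    """Decide whether a (possibly anonymous) user may see an asset."""
    if asset.scope is Scope.PUBLIC:
        return True
    if user is None:
        return False  # anonymous visitors only see public assets
    if user.name == asset.owner or user.name in asset.shared_with:
        return True
    if asset.scope is Scope.CONSORTIUM:
        return user.project is not None
    if asset.scope is Scope.PROJECT:
        return user.project == asset.project
    return False
```

In this sketch the default scope is the owner alone, so sharing is always a deliberate act by whoever holds the asset.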

Attribution, Credit, Reward and Provenance. Reusing myExperiment.

[Architecture diagram labels: Human-readable web pages; Yellow pages; Web Service Access; Assets catalogue; Asset archive; JERM; Plug-in Architecture; Applications and Resources; Workflows; SysMO CMS Sites; Backup; SysMO users; Monitor; Models; Community Databases; Workflows; SOPs; Processes; myExperiment; JWS Online; SABIO-RK]

Just Enough Results Model (JERM): Harvest standards, e.g. MIAME (MIBBI.org), consortium schemas and spreadsheets. JERMs for each data type – microarray, metabolomics, proteomics. Map to projects. Distribute as spreadsheet templates. “I only want to collect and share just enough results.”
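
As a rough illustration of distributing a minimal model as spreadsheet templates, the sketch below generates a header-only template per data type and reports which required cells a filled-in sheet leaves blank. The column names in `JERM_TEMPLATES` are invented for the example and are not the actual JERM fields.

```python
import csv
import io

# Hypothetical "just enough" columns per data type (not the real JERM fields).
JERM_TEMPLATES = {
    "microarray": ["project", "contributor", "organism", "strain", "sop", "data_file"],
    "metabolomics": ["project", "contributor", "organism", "strain", "sop",
                     "metabolite", "unit", "data_file"],
}


def template_csv(data_type):
    """Return an empty template (header row only) as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(JERM_TEMPLATES[data_type])
    return buffer.getvalue()


def missing_fields(data_type, csv_text):
    """List (row_number, column) pairs where a required cell is absent or blank."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in JERM_TEMPLATES[data_type]:
            if not (row.get(column) or "").strip():
                problems.append((row_number, column))
    return problems
```

A project would fill in rows under the header and could run `missing_fields` before sharing, so that only the agreed minimum is ever enforced.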

Just Enough Results Model: Experimental Data. Metadata. People. Projects. Assay. Study. Experimental conditions. Factors studied. Models. SOPs. Homogenised terminology and values in the datasets themselves. Workflows. ISA-TAB compliant. Investigation.

COSMIC and BaCell (Alfresco, document management system)

Keeping data safe at home (harvest): Project X Content Management System, Harvester, Extractor, Register, Assets Catalogue, Search, Fetch.

Keeping data safe at home (upload): Project X Content Management System, Upload, Extractor, Register, Assets Catalogue, Search, Fetch.

Keeping data safe with SEEK (upload): Project X, Upload, Content Management System, Extractor, Register, Assets Catalogue, Search, Fetch.
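
The harvest, extract, register, search and fetch labels on these slides suggest a catalogue that holds descriptions and pointers while the data stays in the project's own store. A minimal sketch of that pattern follows; every class, function and the `cms://` location scheme are hypothetical and do not reflect SEEK's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    title: str
    project: str
    location: str  # pointer to where the data lives, not the data itself


class AssetsCatalogue:
    """Central register of metadata only."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def search(self, term):
        term = term.lower()
        return [e for e in self._entries if term in e.title.lower()]


def extract(project, location, document):
    """Extractor: keep only the descriptive metadata of a stored document."""
    return CatalogueEntry(title=document["title"], project=project, location=location)


def harvest(project, store, catalogue):
    """Harvester: walk a project's own store and register each item."""
    for location, document in store.items():
        catalogue.register(extract(project, location, document))


def fetch(entry, stores):
    """Fetch: follow the pointer back to the owning project's store."""
    return stores[entry.project][entry.location]["content"]


# Example: the project's store keeps the content; the catalogue never copies it.
stores = {"Project X": {"cms://x/1": {"title": "Glucose time series",
                                      "content": "t,glc\n0,5.0\n"}}}
catalogue = AssetsCatalogue()
harvest("Project X", stores["Project X"], catalogue)
hits = catalogue.search("glucose")
```

Because a `CatalogueEntry` carries no content field, removing an item from the project store makes it unfetchable even if its description remains listed, which is one way to read "data remains with us".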

Models Exchange. Experiment Data Exchange. Verification. Comparison. Population. Prediction. ISA-TAB. SBML. MIRIAM. MIBBI Standards. OBO Controlled Vocabularies.

Models Exchange. Experiment Data Exchange. Verification. Comparison. Population. Prediction. Just Enough Results Model. ISA-TAB. SBML. MIRIAM. MIBBI Standards. OBO Controlled Vocabularies. SBRML. SB-TAB.

Quality of Data – Reliable Interpretation. Publication standards by stealth. Controlled vocabulary plug-in. BioPortal.

Observations - PALs: Dissemination of standards. Debunking myths. Tools exchange. Modeller – Experimentalist Trust. Like, talking together. Transcended the projects. Project power politics. PALs did their jobs…

Observations - Sharing: Methods sharing. Protective of models: in-progress vs published models. Access and version management. Curator-Rival conflict. Reluctant to share data, even within their own projects. Legacy spreadsheets dominate. Curation practices vary. Centralised archive take-up. Point-to-point exchange. Nature 461, 145 (10 Sept 2009).

SysMO2. Musical Chairs. Incentive Model for Sharing. Future Funding Phase 2 - SysMO2: Projects dropped and added. People dropped and added. Institutions dropped and added. Others reconstituted and added. Incentive Model for Sharing? Convenience, Added Value? Personal benefit? Consortium Policies?

A Platform for Systems Biology Exchange. Preservation and archiving. Widen participation of mothership. Community Exchange Bazaar. Widen adoption of platform and enable exchange. Accelerant to standards. Adoption of JERM. Curation tools. CMS + JERM bundling. Widen access to External Resources, incl. publication. Added value and convenience. Preparation for publishing. EMBL-EBI ‘omics datasets. Public Model repositories. ISA-TAB. SBML.

Research Objects and e-Laboratories: Packaged Assets. Workflows linked to models linked to data linked to SOPs. Community standards. Mixed resources. External and central. Trust. Spreadsheets. Integration via RDF linked data. myExperiment, MethodBox, NEMA, BioCatalogue.

Summary: Reality is messy. Extreme Technology Determinism vs Voluntarist Sociocultural shaping. Extreme and continuous partnership with users. Act Local, Think Global. Agile development environment facilitated a stream of features to tackle pain points. Leverage other e-Laboratories. Maintaining scientists’ buy-in. Socio-Political Axis dominates the Technical Axis. Collaboration evolutions. Confidence in exchange. Consortium Policies.

SysMO-DB Team. University of Stellenbosch, South Africa / University of Manchester, UK: Jacky Snoep. EML Research gGmbH, Germany: Isabel Rojas, Olga Krebs, Wolfgang Müller. University of Manchester, UK: Carole Goble, Stuart Owen, Katy Wolstencroft, Sergejs Aleksejevs, Finn Bacall.

Acknowledgements: myExperiment, Taverna, JWS Online, SABIO-RK.