Workflow and Data Management for Nuclear Magnetic Resonance.

● Introduction ● CCPN and WeNMR ● Data and workflow of macromolecular NMR ● WMS Workflow Management System ● Goals ● Organization ● Example ● Plans ● Credits

CCPN ● Collaborative Computing Project for NMR (Nuclear Magnetic Resonance) ● Funded by BBSRC since 1999 ● Goals: ● Unifying platform for NMR software ● Community-based, open-source software development ● Meetings and courses ● Member of WeNMR project

CCPN Results ●Software development ●CcpNmr suite of NMR applications ●Integrating external software ●CCPN Data standard for NMR and structural biology ●Abstract data model ●Data access subroutine libraries ●Multiple programming languages ●Memops: Data modelling and code generation tools

WeNMR

WeNMR goals ●Science gateway for NMR and SAXS communities ●Virtual research platform for data storage and exchange ●Operate and expand eNMR grid infrastructure ●Support users and developers ●Extend integration with related disciplines and Grid initiatives. ●WeNMR maintains and operates web portals allowing Grid submission for over 25 NMR and structure calculation programs.

Macromolecular NMR pipeline [Diagram: pipeline stages NMR processing, Analysis, Assignment, Structure generation, Validation] ●Macromolecular structures and dynamics ●Underlying information heterogeneous and extremely complex ●Workflow often branched or recursive ●Multiple, incompatible data formats ●Multiple, complex data transformations

Peculiarities of the NMR field ●Data in electronic form from the beginning ●No direct mathematical relationship between results and original data ●Peak-atom mapping (‘assignment’) is ‘puzzle solving’ ●Redone for each sample group ●Not fully automatic ●Semi-ambiguous ●Limited resources ●Programs often written by a single person, who has since left or become a professor

Programs: Native Disorganisation [Diagram: Task1, Task2 and Task3 linked to one another through many separate ‘Convert’ steps]

Integration with Data Standard [Diagram: Task1, Task2 and Task3 each linked through a ‘Convert’ step to a central Data Standard]
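
The contrast between the two diagrams can be put in numbers. The sketch below assumes every program has its own native format and that each conversion direction needs its own converter; these counting rules are an illustration, not a figure from the talk.

```python
def pairwise_converters(n_programs: int) -> int:
    """One-directional converters needed if every program exchanges
    data directly with every other program."""
    return n_programs * (n_programs - 1)


def hub_converters(n_programs: int) -> int:
    """Converters needed if every program only imports from and
    exports to one shared data standard."""
    return 2 * n_programs


# Compare both layouts for a few (illustrative) numbers of programs.
for n in (3, 5, 25):
    print(n, pairwise_converters(n), hub_converters(n))
```

With three programs the two layouts need the same number of converters (6), but at 25 programs the direct layout needs 600 against 50 for the shared standard.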

CCPN Data Standard ●Precisely defined ●A single central description ●Validation directly against standard ●Comprehensive – cover everything, including intermediate results ●Ensure consistency and validity for changing data ●Support different implementations in parallel ●Easy to maintain and modify

Pipeline and CCPN data model [Diagram: the pipeline stages (NMR processing, Analysis, Assignment, Structure generation, Validation) around the central CCPN data model; CcpNmrFormatConverter with external formats and reference data; CcpNmrECI] ●Deposition in Protein Data Bank (PDB) and BioMagResBank using CCPN XML files

Workflow Management Goals ●Standardized interface to WeNMR portals ●Application-independent data selection ●Standard submission and result gathering ●Submit to multiple programs ●Seamless, invisible format conversion ●Start and end on precisely defined CCPN data ●Combine jobs into workflows ●Easy use for non-specialists

Data Management Goals ●Central data store, with access control ●Track jobs and data flow ●NMR analysis is rarely linear ●Alternative jobs from single starting point ●Run – modify – re-run ●Identified as desirable also for non-Grid data
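
One way to picture the tracking goal is a small provenance record linking each job to its input and output data. The following is a minimal sketch only; the class names, fields, and identifiers are hypothetical and are not taken from the WMS code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JobRecord:
    job_id: str
    program: str
    input_data: str                    # id of the data set version used as input
    output_data: Optional[str] = None  # id of the data set version produced


@dataclass
class Provenance:
    jobs: Dict[str, JobRecord] = field(default_factory=dict)

    def add(self, job: JobRecord) -> None:
        self.jobs[job.job_id] = job

    def jobs_from(self, data_id: str) -> List[str]:
        """All jobs started from the same data (alternative branches)."""
        return sorted(j.job_id for j in self.jobs.values()
                      if j.input_data == data_id)

    def lineage(self, data_id: str) -> List[str]:
        """Walk back from a result to the original starting data."""
        produced_by = {j.output_data: j for j in self.jobs.values()
                       if j.output_data is not None}
        chain = [data_id]
        while chain[-1] in produced_by:
            chain.append(produced_by[chain[-1]].input_data)
        return chain


prov = Provenance()
prov.add(JobRecord("job1", "ARIA", "d0", "d1"))   # first run
prov.add(JobRecord("job2", "CYANA", "d0", "d2"))  # alternative from same start
prov.add(JobRecord("job3", "CING", "d1", "d3"))   # follow-up on job1 result
print(prov.jobs_from("d0"))  # branches from the starting data
print(prov.lineage("d3"))    # path back to the starting data
```

Because each record stores only an input and an output identifier, branched and re-run analyses need no special handling: they are simply further records pointing at an existing data set.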

WP7: Research Platform ● O7.1: Design and implement a grid-based multidisciplinary approach for the characterization of biomolecular interactions, based on the joint use of NMR, SAXS, bioinformatics and biophysical tools. ● O7.2: Establish a SAXS Grid-enabled infrastructure providing secure remote access to SAXS instrumentation. ● O7.3: Develop an end-user local platform making use of portals and web services. ● O7.4: Establish an infrastructure and tools for data and structure validation. ● O7.5: Provide web services and/or simple direct upload mechanisms for the web portal applications. ● O7.6: Implement a WeNMR end-user virtual machine.

WP7 – End User Local Platform ●WMS is a web-based end user platform for accessing web-based services and executing workflows ●Development of the Extend-NMR project ●Funded as part of WeNMR ●Accesses services through adaptor modules ●Allows direct access from CcpNmr Analysis

WMS – Architecture [Diagram components: Client (GWT, Web); Analysis (Python, Desktop); Server (Java / Hibernate); Database (Postgres); Remote Execution Server (Java web service wrapper, Python i/o and CGI code, Java CGI code; CS-ROSETTA, ARIA, CING); WeNMR Web Portals and Services (CS-ROSETTA, ARIA, CING); Bioinformatics Web Services; Taverna] ● Plan to use TAVERNA for the actual workflow management

WMS – Adaptor service [Diagram: Adaptor Servlet taking CCPN in plus nmrCalcId and returning CCPN out; I/O Module converting to and from misc formats; Execution Module(s) for Web, Local and GRID] ● Format conversion. Access existing web portals using CGI approaches ● Exposed as WSDL-defined web services for consumption by TAVERNA etc.
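
The division of labour on this slide (an I/O module for format conversion, plus interchangeable execution modules) might be sketched as follows. This is an illustration under stated assumptions: the class, the function names, and the dictionary standing in for a CCPN data set are all hypothetical, and the execution functions are placeholders that run nothing.

```python
from typing import Callable, Dict

Executor = Callable[[str], str]  # program-specific input text -> output text


def run_locally(native_input: str) -> str:
    """Placeholder: a real module would launch the program locally."""
    return f"local:{native_input}"


def run_on_grid(native_input: str) -> str:
    """Placeholder: a real module would submit the job to the grid."""
    return f"grid:{native_input}"


class Adaptor:
    """Wraps one program: convert in, execute, convert back."""

    def __init__(self,
                 to_native: Callable[[dict, str], str],
                 from_native: Callable[[dict, str, str], dict],
                 executors: Dict[str, Executor]) -> None:
        self.to_native = to_native      # I/O module, export direction
        self.from_native = from_native  # I/O module, import direction
        self.executors = executors      # execution modules by target name

    def run(self, ccpn_in: dict, nmr_calc_id: str, target: str) -> dict:
        native_in = self.to_native(ccpn_in, nmr_calc_id)
        native_out = self.executors[target](native_in)
        return self.from_native(ccpn_in, nmr_calc_id, native_out)


# Trivial converters, standing in for real format conversion code.
adaptor = Adaptor(
    to_native=lambda data, calc: data["runs"][calc],
    from_native=lambda data, calc, out: {**data, "results": {calc: out}},
    executors={"local": run_locally, "grid": run_on_grid},
)
result = adaptor.run({"runs": {"calc1": "input"}}, "calc1", "grid")
print(result["results"]["calc1"])
```

Keeping the execution target as a lookup key means the same conversion code serves web, local, and grid execution alike.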

Data handling ●Data stored as tarred, zipped CCPN data sets ●Repository-type storage planned once CCPN data sets become ‘diff-able’ ●Workflow tracks starting data, end data, job ●Run data and parameters stored within CCPN data set in ‘Calculation’ package ●Run input and output transferred as CCPN data set plus calculation ID
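
As a rough illustration of transferring a run as "data set plus calculation ID", the sketch below packs a project directory into a gzip-compressed tar archive with Python's standard `tarfile` module. The `RunPackage` type and function names are hypothetical, and gzip is an assumption about what "zipped" means here.

```python
import tarfile
from pathlib import Path
from typing import NamedTuple


class RunPackage(NamedTuple):
    archive: Path  # tarred, gzipped data set directory
    calc_id: str   # identifies the run inside the data set


def pack_dataset(dataset_dir: Path, archive: Path, calc_id: str) -> RunPackage:
    """Archive a data set directory and pair it with a calculation ID."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(dataset_dir, arcname=dataset_dir.name)
    return RunPackage(archive, calc_id)


def unpack_dataset(package: RunPackage, target_dir: Path) -> Path:
    """Unpack an archive below target_dir (trusted archives only)."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(package.archive, "r:gz") as tar:
        tar.extractall(target_dir)
    return target_dir
```

Both ends of a transfer then only need to agree on the archive layout and on how the calculation ID is looked up inside the data set.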

Protocol and interface specification ●Data selection driven from protocol specification ●Parameters: names, types and default values ●Types of data to select ●Specific widget for each data type (structures, peak lists, …) ●New protocols can be specified by users, with JSON file or protocol editor (forthcoming) ●Layout specification as part of protocol specification
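
The slide does not show the JSON format itself, so the following is purely a hypothetical example of what a protocol file with parameters, data types, and a layout could look like, together with a loader that derives default values and picks a widget per data type. Every key, type name, and widget name is invented for illustration.

```python
import json

# Hypothetical protocol file content; not the actual WMS format.
PROTOCOL_JSON = """
{
  "name": "example_structure_calculation",
  "parameters": [
    {"name": "n_structures", "type": "int", "default": 20},
    {"name": "use_water_refinement", "type": "bool", "default": true}
  ],
  "data": [
    {"name": "restraints", "type": "peak_list"},
    {"name": "starting_model", "type": "structure"}
  ],
  "layout": ["restraints", "starting_model",
             "n_structures", "use_water_refinement"]
}
"""

PYTHON_TYPES = {"int": int, "bool": bool, "float": float, "str": str}
WIDGET_FOR_TYPE = {"peak_list": "PeakListSelector",
                   "structure": "StructureSelector"}


def load_protocol(text: str) -> dict:
    """Parse a protocol and run a basic check on parameter defaults."""
    spec = json.loads(text)
    for param in spec["parameters"]:
        expected = PYTHON_TYPES[param["type"]]
        if not isinstance(param["default"], expected):
            raise TypeError(f"bad default for {param['name']}")
    return spec


def default_parameters(spec: dict) -> dict:
    return {p["name"]: p["default"] for p in spec["parameters"]}


def widgets_for(spec: dict) -> list:
    """One selection widget per data item, chosen by data type."""
    return [(d["name"], WIDGET_FOR_TYPE[d["type"]]) for d in spec["data"]]


spec = load_protocol(PROTOCOL_JSON)
print(default_parameters(spec))
print(widgets_for(spec))
```

Driving the interface from such a description is what would let a new protocol appear in the user interface without new client code.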

Data conversion ●Takes place in adaptor ●Decoupled from server ●Python, working on CCPN data set ●Data export ●Data selection from Calculation package ●To program-specific files ●Result import ●Re-integrated in input Calculation package ●Starting data known ●Mapping information kept as needed
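
The export and import steps, and the reason for keeping mapping information, can be illustrated with a toy round trip. The nested dictionary stands in for a CCPN data set and the line-based text for a program-specific file; both are assumptions made for this sketch and do not reflect the real CCPN API or any real program format.

```python
def export_for_program(dataset: dict, calc_id: str):
    """Select the run's input and write it in a numbered text format."""
    atoms = dataset["calculations"][calc_id]["input"]["atoms"]
    # The program only knows serial numbers, so remember what they mean.
    mapping = {index: name for index, name in enumerate(atoms, start=1)}
    text = "\n".join(f"{index} {name}" for index, name in mapping.items())
    return text, mapping


def import_results(dataset: dict, calc_id: str,
                   result_text: str, mapping: dict) -> dict:
    """Translate serial numbers back and store results with the same run."""
    values = {}
    for line in result_text.splitlines():
        index, value = line.split()
        values[mapping[int(index)]] = float(value)
    dataset["calculations"][calc_id]["output"] = values
    return dataset


dataset = {"calculations": {"calc1": {"input": {"atoms": ["A1.H", "A1.N"]}}}}
native_text, atom_map = export_for_program(dataset, "calc1")
dataset = import_results(dataset, "calc1", "1 8.25\n2 120.5", atom_map)
print(dataset["calculations"]["calc1"]["output"])
```

Because results are written back into the calculation record they started from, the starting data of every result stays known.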

WMS – Home page

WMS – Running a task

WMS – Workflows

Status and plans ●Current: ●System working at alpha test level ●ARIA, CING, CS-ROSETTA integrated ●Short term: ●Integrate UNIO, CYANA, Autostructure ●Parallel structure determination ■ARIA, UNIO, CYANA, Autostructure from a single input selection ■Results captured together; CcpNmr Analysis used to analyze them ●Longer term: ●Improve user interface and robustness ●Integrate more programs ●Replace CGI wrappers with WSDL services on the portals
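
The planned parallel structure determination (one input selection sent to several programs, results gathered together) could look roughly like this. The `fake_submit` function is a stand-in for a real portal submission, which is not shown in the talk; only the fan-out and gather pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_submit(program: str, selection: list) -> dict:
    """Stand-in for submitting one job; returns a dummy result."""
    return {"program": program, "n_inputs": len(selection)}


def run_in_parallel(programs: list, selection: list, submit=fake_submit) -> dict:
    """Send the same input selection to every program and gather results."""
    with ThreadPoolExecutor(max_workers=max(1, len(programs))) as pool:
        futures = {name: pool.submit(submit, name, selection)
                   for name in programs}
        return {name: future.result() for name, future in futures.items()}


results = run_in_parallel(["ARIA", "UNIO", "CYANA", "Autostructure"],
                          ["peak lists", "chemical shifts"])
print(sorted(results))
```

Collecting the results in one dictionary keyed by program name mirrors the stated aim of capturing all outcomes together for later comparison.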

CCPN People ■Cambridge (Biochemistry) ●Ernest Laue ●Wayne Boucher ●Rasmus Fogh ●John Ionides ●Tim Stevens ●Alan Sousa da Silva ■EBI (PDBe), Hinxton ●Kim Henrick ●Wim Vranken ■SpronkNMR ●Chris Spronk

Funding ■BBSRC ■Industry ●AstraZeneca, Dupont Pharma (now BMS), Genentech, GlaxoSmithKline, Vernalis, Syngenta ■European Community ●WeNMR, EXTEND-NMR, EU-NMR, NMR-Life, NMRQUAL, and TEMBLOR contracts

END