Published by Ann Dixon. Modified over 8 years ago.
1
Workflow and Data Management for Nuclear Magnetic Resonance
2
● Introduction ● CCPN and WeNMR ● Data and workflow of macromolecular NMR ● WMS Workflow Management System ● Goals ● Organization ● Example ● Plans ● Credits
3
CCPN ● Collaborative Computing Project for NMR (Nuclear Magnetic Resonance) ● Funded by BBSRC since 1999 ● Goals: ● Unifying platform for NMR software ● Community-based, open-source software development ● Meetings and courses ● Member of WeNMR project
4
CCPN Results ●Software development ●CcpNmr suite of NMR applications ●Integrating external software ●CCPN Data standard for NMR and structural biology ●Abstract data model ●Data access subroutine libraries ●Multiple programming languages ●Memops: Data modelling and code generation tools
5
WeNMR
6
WeNMR goals ●Science gateway for NMR and SAXS communities ●Virtual research platform for data storage and exchange ●Operate and expand eNMR grid infrastructure ●Support users and developers ●Extend integration with related disciplines and Grid initiatives. ●WeNMR maintains and operates web portals allowing Grid submission for over 25 NMR and structure calculation programs.
8
Macromolecular NMR pipeline: NMR processing, Analysis, Assignment, Structure generation, Validation ● Macromolecular structures and dynamics ● Underlying information heterogeneous and extremely complex ● Workflow often branched or recursive ● Multiple, incompatible data formats ● Multiple, complex data transformations
9
Peculiarities of the NMR field ● Data in electronic form from the beginning ● No direct mathematical relationship between results and original data ● Peak-atom mapping (‘assignment’) is ‘puzzle solving’ ● Redone for each sample group ● Not fully automatic ● Semi-ambiguous ● Limited resources ● Programs often written by a single person, who has since left or become a professor
10
Programs: Native Disorganisation [Diagram: programs for Task1, Task2 and Task3, linked to one another through separate Convert steps]
11
Integration with Data Standard [Diagram: programs for Task1, Task2 and Task3 arranged around a central Data Standard, connected to it directly or through a Convert step]
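The contrast between pairwise conversion and conversion through a central standard can be put in numbers. This is a minimal sketch, assuming every program has its own native format and every converter works in one direction only; both assumptions are for illustration and are not figures from the slides.

```python
def pairwise_converters(n_programs: int) -> int:
    """One-directional converters needed when every program talks to every other."""
    return n_programs * (n_programs - 1)


def hub_converters(n_programs: int) -> int:
    """Converters needed when each program only reads from and writes to one standard."""
    return 2 * n_programs


for n in (3, 10, 25):
    print(n, pairwise_converters(n), hub_converters(n))
```

Under these assumptions three programs break even at 6 converters each way, while 25 programs need 600 pairwise converters against 50 through the standard.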
12
CCPN Data Standard ●Precisely defined ●A single central description ●Validation directly against standard ●Comprehensive – cover everything, including intermediate results ●Ensure consistency and validity for changing data ●Support different implementations in parallel ●Easy to maintain and modify
13
Pipeline and CCPN data model [Diagram: the CCPN data model sits at the centre of the pipeline stages (NMR processing, Analysis, Assignment, Structure generation, Validation) ● CcpNmrFormatConverter: external formats and reference data ● CcpNmrECI: deposition in Protein Data Bank (PDB) and BioMagResBank using CCPN XML files]
15
Workflow Management Goals ●Standardized interface to WeNMR portals ●Application-independent data selection ●Standard submission and result gathering ●Submit to multiple programs ●Seamless, invisible format conversion ●Start and end on precisely defined CCPN data ●Combine jobs into workflows ●Easy use for non-specialists
16
Data Management Goals ●Central data store, with access control ●Track jobs and data flow ●NMR analysis is rarely linear ●Alternative jobs from single starting point ●Run – modify – re-run ●Identified as desirable also for non-Grid data
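The tracking goals above (alternative jobs from one starting point, run, modify, re-run) amount to keeping a branching record of which job consumed and produced which data set. The following is a minimal sketch of such a record; the class names, fields and identifiers are hypothetical and do not describe the WMS database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    """One run: which program, on which input data set, producing which output."""
    job_id: str
    program: str
    input_data: str
    output_data: Optional[str] = None


@dataclass
class JobTracker:
    jobs: Dict[str, Job] = field(default_factory=dict)

    def add(self, job: Job) -> None:
        self.jobs[job.job_id] = job

    def branches_from(self, data_id: str) -> List[Job]:
        """All jobs started from the same data set (alternative runs or re-runs)."""
        return [j for j in self.jobs.values() if j.input_data == data_id]

    def history(self, data_id: str) -> List[str]:
        """Walk back from a data set to the original starting data."""
        produced_by = {j.output_data: j for j in self.jobs.values() if j.output_data}
        chain = [data_id]
        while chain[-1] in produced_by:
            chain.append(produced_by[chain[-1]].input_data)
        return list(reversed(chain))


tracker = JobTracker()
tracker.add(Job("j1", "ARIA", "start", "aria_out"))
tracker.add(Job("j2", "CING", "aria_out", "cing_out"))
tracker.add(Job("j3", "CS-ROSETTA", "start", "rosetta_out"))
```

Here `tracker.branches_from("start")` returns the two alternative jobs, and `tracker.history("cing_out")` traces the validation result back to the starting data.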
17
WP7: Research Platform ● O7.1: Design and implement a grid-based multidisciplinary approach for the characterization of biomolecular interactions, based on the joint use of NMR, SAXS, bioinformatics and biophysical tools. ● O7.2: Establish a SAXS Grid-enabled infrastructure providing secure remote access to SAXS instrumentation. ● O7.3: Develop an end-user local platform making use of portals and web services. ● O7.4: Establish an infrastructure and tools for data and structure validation. ● O7.5: Provide web services and/or simple direct upload mechanisms for the web portal applications. ● O7.6: Implement a WeNMR end-user virtual machine.
19
WP7 – End User Local Platform ● WMS is a web-based end-user platform for accessing web-based services and executing workflows ● Development of the Extend-NMR project ● Funded as part of WeNMR ● Accesses services through adaptor modules ● Allows direct access from CcpNmr Analysis
20
WMS – Architecture [Diagram components] ● Client: GWT, Web ● Analysis: Python, Desktop ● Server: Java / Hibernate ● Database: Postgres ● Remote Execution Server: Java web service wrapper; Python i/o and CGI code (CS-ROSETTA); Java CGI code (ARIA, CING) ● WeNMR Web Portals and Services: CS-ROSETTA, ARIA, CING ● Bioinformatics Web Services ● Taverna ● Plan to use TAVERNA for the actual workflow management
21
WMS – Adaptor service [Diagram: Adaptor Servlet with an I/O Module (CCPN in, CCPN out, nmrCalcId, misc. format) and Execution Module(s) for Web, Local and GRID] ● Format conversion. Access existing web portals using CGI approaches ● Exposed as WSDL-defined web services for consumption by TAVERNA etc.
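The split between an I/O module and interchangeable execution modules can be sketched as follows. This is an illustration only: the class and method names are hypothetical, a "data set" is a plain dictionary rather than real CCPN data, and the runner is a placeholder instead of a call to a portal.

```python
from typing import Callable, Dict

# Hypothetical stand-in: a "data set" is a plain dict here, not real CCPN data.
DataSet = Dict[str, str]


class IOModule:
    """Converts between the CCPN side and a program-specific ("misc") format."""

    def export(self, ccpn_in: DataSet, nmr_calc_id: str) -> str:
        # Select only the entry that belongs to the requested calculation.
        return ccpn_in[nmr_calc_id]

    def import_result(self, ccpn_in: DataSet, nmr_calc_id: str, result: str) -> DataSet:
        # Return a new data set with the result stored under the same calculation ID.
        ccpn_out = dict(ccpn_in)
        ccpn_out[nmr_calc_id + ":result"] = result
        return ccpn_out


class Adaptor:
    def __init__(self, io: IOModule, executors: Dict[str, Callable[[str], str]]):
        self.io = io
        self.executors = executors  # keyed by target: "web", "local" or "grid"

    def run(self, ccpn_in: DataSet, nmr_calc_id: str, target: str) -> DataSet:
        program_input = self.io.export(ccpn_in, nmr_calc_id)
        program_output = self.executors[target](program_input)
        return self.io.import_result(ccpn_in, nmr_calc_id, program_output)


# Placeholder "program": upper-cases its input instead of running a calculation.
adaptor = Adaptor(IOModule(), {"local": str.upper})
ccpn_out = adaptor.run({"calc1": "restraints"}, "calc1", "local")
```

Because the caller only ever sees CCPN data going in and coming out, swapping the execution target does not affect the format conversion code.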
22
Data handling ● Data stored as tarred, zipped CCPN data sets ● Repository-type storage planned when CCPN data sets become ‘diff-able’ ● Workflow tracks starting data, end data, job ● Run data and parameters stored within CCPN data set in ‘Calculation’ package ● Run input and output transferred as CCPN data set plus calculation ID
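Transferring a run as a packed data set plus a calculation ID could look roughly like this. It is a sketch using only the Python standard library, assuming gzip compression and a side file named `CALC_ID`; both are illustrative conventions, not the WMS transfer format.

```python
import io
import tarfile
from pathlib import Path


def pack_run(project_dir: Path, calc_id: str) -> bytes:
    """Pack a project directory as .tar.gz, with the calculation ID in a side file."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.add(str(project_dir), arcname=project_dir.name)
        # Hypothetical convention: the ID travels in a small text member.
        payload = calc_id.encode("utf-8")
        info = tarfile.TarInfo(name="CALC_ID")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_calc_id(packed: bytes) -> str:
    """Recover the calculation ID from a packed run."""
    with tarfile.open(fileobj=io.BytesIO(packed), mode="r:gz") as archive:
        member = archive.extractfile("CALC_ID")
        return member.read().decode("utf-8")
```

The receiving side calls `read_calc_id` to find out which calculation inside the data set it is expected to run.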
23
Protocol and interface specification ● Data selection driven from protocol specification ● Parameters: names, types and default values ● Types of data to select ● Specific widget for each data type (structures, peak lists, …) ● New protocols can be specified by users, with JSON file or protocol editor (forthcoming) ● Layout specification as part of protocol specification
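A user-written protocol file of the kind described here might carry parameter names, types and defaults together with the data types to select. The example below is a guess at what such a file could contain; every field name, widget name and value is hypothetical and not taken from the WMS schema.

```python
import json

# Hypothetical protocol file; the field names are illustrative, not the WMS schema.
PROTOCOL_JSON = """
{
  "name": "example_structure_calculation",
  "parameters": [
    {"name": "n_structures", "type": "int", "default": 20},
    {"name": "use_water_refinement", "type": "bool", "default": true}
  ],
  "data": [
    {"name": "peaks", "type": "peak_list", "widget": "peak_list_selector"},
    {"name": "start_model", "type": "structure", "widget": "structure_selector"}
  ]
}
"""

TYPES = {"int": int, "bool": bool, "float": float, "str": str}


def load_protocol(text: str) -> dict:
    """Parse a protocol and check that each default matches its declared type."""
    protocol = json.loads(text)
    for param in protocol["parameters"]:
        expected = TYPES[param["type"]]
        if not isinstance(param["default"], expected):
            raise TypeError(f"{param['name']}: default is not {param['type']}")
    return protocol


def default_values(protocol: dict) -> dict:
    """Parameter values to pre-fill in the submission form."""
    return {p["name"]: p["default"] for p in protocol["parameters"]}
```

A front end could then build one input field per parameter from `default_values(load_protocol(PROTOCOL_JSON))` and one selection widget per entry under `data`.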
24
Data conversion ● Takes place in adaptor ● Decoupled from server ● Python, working on CCPN data set ● Data export ● Data selection from Calculation package ● To program-specific files ● Result import ● Re-integrated in input Calculation package ● Starting data known ● Mapping information kept as needed
26
WMS – Home page
27
WMS – Running a task
28
WMS – Workflows
30
Status and plans ● Current: ● System working at alpha test level ● ARIA, CING, CS-ROSETTA integrated ● Short term: ● Integrate UNIO, CYANA, Autostructure ● Parallel structure determination ■ ARIA, UNIO, CYANA, Autostructure, from single input selection ■ Results captured together; CcpNmr Analysis to analyze ● Longer term: ● Improve user interface and robustness ● Integrate more programs ● Replace CGI wrappers with WSDL services on the portals
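Parallel structure determination from a single input selection is a fan-out and gather pattern. A minimal sketch with the Python standard library follows; the runner functions are placeholders that return a string and do not call ARIA, UNIO, CYANA or Autostructure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_in_parallel(selection: str, programs: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Submit one input selection to every program and gather results by name."""
    with ThreadPoolExecutor(max_workers=len(programs)) as pool:
        futures = {name: pool.submit(run, selection) for name, run in programs.items()}
        return {name: future.result() for name, future in futures.items()}


# Placeholder runners; real ones would call the portals or local installations.
programs = {
    name: (lambda selection, name=name: f"{name} result for {selection}")
    for name in ("ARIA", "UNIO", "CYANA", "Autostructure")
}
results = run_in_parallel("selection-1", programs)
```

Keying the results by program name keeps them together for side-by-side comparison afterwards.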
32
CCPN People ■Cambridge (Biochemistry) ●Ernest Laue ●Wayne Boucher ●Rasmus Fogh ●John Ionides ●Tim Stevens ●Alan Sousa da Silva ■EBI (PDBe), Hinxton ●Kim Henrick ●Wim Vranken ■SpronkNMR ●Chris Spronk
33
Funding ■BBSRC ■Industry ●AstraZeneca, Dupont Pharma (now BMS), Genentech, GlaxoSmithKline, Vernalis, Syngenta ■European Community ●WeNMR, EXTEND-NMR, EU-NMR, NMR-Life, NMRQUAL, and TEMBLOR contracts
34
END