Research Traceability using Provenance Services for Biomedical Analysis Dr Peter Bloodsworth CCCS Research Centre UWE, Bristol, UK

Presentation transcript:

Research Traceability using Provenance Services for Biomedical Analysis
Dr Peter Bloodsworth, CCCS Research Centre, UWE, Bristol, UK
HealthGrid Presentation: 29th of June 2010

Talk Structure
- The neuGRID Project.
- Requirements from Users.
- The Bigger Picture.
- A Provenance Service.
- CRISTAL.
- Conclusion.

The neuGRID Consortium
- Vrije Universiteit Medical Centre, THE NETHERLANDS
- CF consulting s.r.l., ITALY
- Provincia Lombardo Veneta Fatebenefratelli, ITALY
- Karolinska Institutet, SWEDEN
- University of the West of England, Bristol, UK
- Neuralyse Europe (Prodema Medical), SWITZERLAND
- Maat Gknowledge, SPAIN
- HealthGrid, FRANCE

Project Objectives
- To build a new user-friendly Grid-based research e-Infrastructure.
- Collection/archiving of large amounts of imaging data.
- Paired with computationally intensive data analyses.
- To enable EU neuroscientists to carry out cutting-edge research.
- Imaging of degenerative brain diseases.

neuGRID Provenance Requirements
Provenance in neuGRID relates to:
1. Data provenance (source, quality control applied, and other facets).
2. Workflow provenance (author, versioning, certification, etc.).
3. Analysis Result provenance (data set, workflow chosen, settings, errors, etc.).
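One way to picture the three kinds of provenance is as linked records, where an analysis result points back to the data set and workflow it came from. The sketch below is illustrative only: the class names, field names, and sample values are assumptions made for this example and are not taken from neuGRID or CRISTAL.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProvenance:
    # Where the data came from and which quality-control steps were applied.
    source: str
    quality_control: List[str] = field(default_factory=list)


@dataclass
class WorkflowProvenance:
    # Who wrote the workflow, which version it is, and whether it is certified.
    author: str
    version: str
    certified: bool = False


@dataclass
class AnalysisResultProvenance:
    # Ties a result back to its data set, workflow, settings, and any errors.
    data_set: DataProvenance
    workflow: WorkflowProvenance
    settings: Dict[str, object] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


# Hypothetical sample record; all values are invented for illustration.
record = AnalysisResultProvenance(
    data_set=DataProvenance(source="site-A", quality_control=["visual check"]),
    workflow=WorkflowProvenance(author="researcher-1", version="1.0", certified=True),
    settings={"smoothing_mm": 8},
)
```

Keeping the data and workflow records nested inside the result record means a single result can be traced back along the whole chain without extra lookups.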

The Bigger Picture
- Real-world end users care about doing their research and getting their results.
- They don't care about the grid, certificates, or virtual organisations.
- They don't want to learn grid-speak.
- They don't all want to do the same things in the same way.
- They expect services that help them to do their work.
- They expect reliability and a high level of integration between services.

The neuGRID Provenance Service

The Provenance Architecture
[Diagram components: Provenance API, Translator, CRISTAL Core, Provenance DB]

Service Wrapper
Provides a web service-based interface to the Provenance Service.
Consists of methods for:
- Creating workflows
- Creating workflow instances
- Storing workflow provenance
- Retrieving workflow provenance
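The four operations listed above could be sketched as a small in-memory class. This is a minimal sketch under stated assumptions, not the neuGRID service itself: the class name, method names, and signatures are hypothetical, and the real wrapper is a web service whose interface is not given in the slides.

```python
import itertools
from typing import Dict, List


class ProvenanceServiceSketch:
    """Hypothetical in-memory stand-in for the four wrapper operations."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._workflows: Dict[int, str] = {}
        self._instances: Dict[int, int] = {}
        self._provenance: Dict[int, List[dict]] = {}

    def create_workflow(self, definition: str) -> int:
        # Register a workflow definition and return its identifier.
        workflow_id = next(self._ids)
        self._workflows[workflow_id] = definition
        return workflow_id

    def create_workflow_instance(self, workflow_id: int) -> int:
        # An instance must refer to a workflow that already exists.
        if workflow_id not in self._workflows:
            raise KeyError(f"unknown workflow {workflow_id}")
        instance_id = next(self._ids)
        self._instances[instance_id] = workflow_id
        self._provenance[instance_id] = []
        return instance_id

    def store_provenance(self, instance_id: int, record: dict) -> None:
        # Append a copy so later changes by the caller do not alter history.
        self._provenance[instance_id].append(dict(record))

    def retrieve_provenance(self, instance_id: int) -> List[dict]:
        # Return a copy of the records stored for this instance, in order.
        return list(self._provenance[instance_id])
```

A caller would create a workflow, create an instance of it, store records as steps run, and read them back with `retrieve_provenance`.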

Translator
- To prevent lock-in to a specific workflow format, the Provenance Service includes an adaptor-based translator that converts user workflows into the CRISTAL workflow format.
- Acts as a bridge between users and the CRISTAL core.
CRISTAL Core
- Provenance management is handled internally by CRISTAL.
- Workflows need to be translated between the user format and the CRISTAL format.
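An adaptor-based translator can be sketched as a registry with one adaptor per input format, all producing the same internal representation. Everything here is an assumption for illustration: the format names `json-steps` and `line-list` are invented, and a plain list of activity names stands in for the CRISTAL workflow format, which the slides do not describe.

```python
import json
from typing import Callable, Dict, List

# Stand-in for the target workflow format: an ordered list of activity names.
InternalWorkflow = List[str]

_ADAPTORS: Dict[str, Callable[[str], InternalWorkflow]] = {}


def adaptor(fmt: str):
    """Register a converter for one user workflow format."""
    def register(fn: Callable[[str], InternalWorkflow]):
        _ADAPTORS[fmt] = fn
        return fn
    return register


@adaptor("json-steps")
def from_json_steps(text: str) -> InternalWorkflow:
    # Hypothetical format: {"steps": [{"name": "..."}, ...]}
    return [step["name"] for step in json.loads(text)["steps"]]


@adaptor("line-list")
def from_line_list(text: str) -> InternalWorkflow:
    # Hypothetical format: one activity name per non-empty line.
    return [line.strip() for line in text.splitlines() if line.strip()]


def translate(fmt: str, text: str) -> InternalWorkflow:
    # Look up the adaptor for the given format and apply it.
    try:
        convert = _ADAPTORS[fmt]
    except KeyError:
        raise ValueError(f"no adaptor registered for format {fmt!r}") from None
    return convert(text)
```

Supporting a new user format then means adding one adaptor function, with no change to the code that consumes the internal representation.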

CRISTAL was designed to track the development of LHC detector components at CERN.

CRISTAL in neuGRID Overview
[Diagram labels: Researcher; Input Data; Derived Data; Analysis Suite; LORIS; CRISTAL; Process & Data Tracking; Provenance Data; Workflow steps; Analysis data; Histories; A Complete Analysis Knowledge Base]

CRISTAL Main Functions
- Complete capture of system functionality in workflows.
- As every action is represented by a workflow activity, every operation is recorded and stored in a replayable way.
- Every piece of data, including descriptions, is versioned, so all previous states of items are available.
- Several interfaces exist to bridge to other components for database storage, job distribution, definition management, etc.
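The idea that every operation is recorded and every item is versioned can be illustrated with an append-only log. This is a simplified sketch and not CRISTAL's data model: the class and method names are hypothetical, and each activity simply records the value it produced so that earlier versions can be read back.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class VersionedItem:
    """Hypothetical item whose every change is kept as a numbered version."""

    name: str
    _events: List[Tuple[str, Any]] = field(default_factory=list, init=False)

    def apply(self, activity: str, value: Any) -> int:
        # Record the activity and its resulting value; never overwrite.
        self._events.append((activity, value))
        return len(self._events)  # 1-based version number

    def state_at(self, version: int) -> Any:
        # Read back the value recorded at a given version.
        if not 1 <= version <= len(self._events):
            raise IndexError(f"no version {version} for item {self.name!r}")
        return self._events[version - 1][1]

    def history(self) -> List[Tuple[str, Any]]:
        # Ordered list of (activity, value) pairs, oldest first.
        return list(self._events)
```

Because nothing is overwritten, the history doubles as an audit trail: the order of activities and the value each one produced remain available after later changes.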

Service Architecture

Further Developments
- Composite jobs: if some tasks are clustered together, they should be executed by CRISTAL as a composite activity.
- In composite jobs, each sub-job should send its feedback to CRISTAL as soon as it completes its execution.
- The Glueing Service should hold user-related information to map users to jobs and provenance data.
- The Querying Service should query both CRISTAL provenance and LORIS data.
- The translation component in the pipeline service should map user workflows to CRISTAL workflows. The translation should be two-way.
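The composite-job behaviour described above, where each sub-job reports back as soon as it finishes, might look like the following. This is a sketch under assumptions: the class name, the status strings, and the callback signature are invented for illustration and do not reflect how CRISTAL implements composite activities.

```python
from typing import Callable, Dict, List


class CompositeActivitySketch:
    """Hypothetical composite activity that forwards sub-job feedback at once."""

    def __init__(
        self,
        name: str,
        sub_jobs: List[str],
        on_feedback: Callable[[str, str], None],
    ) -> None:
        self.name = name
        self._status: Dict[str, str] = {job: "pending" for job in sub_jobs}
        self._on_feedback = on_feedback

    def report(self, job: str, status: str) -> None:
        # Called by a sub-job the moment it finishes, not at the end of the batch.
        if job not in self._status:
            raise KeyError(f"unknown sub-job {job!r}")
        self._status[job] = status
        self._on_feedback(job, status)

    @property
    def complete(self) -> bool:
        # The composite is complete once no sub-job is still pending.
        return all(status != "pending" for status in self._status.values())


# Usage: feedback is collected in a list that stands in for the provenance store.
received: List[tuple] = []
activity = CompositeActivitySketch(
    "preprocess", ["align", "segment"], lambda job, status: received.append((job, status))
)
activity.report("align", "done")
```

Forwarding feedback per sub-job means partial progress is recorded even if a later sub-job fails.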

Conclusions
- A robust provenance system is necessary if users are to have confidence in, and use, the neuGRID infrastructure for their research.
- Provenance is important throughout neuGRID, from data input through to analysis output. Errors that occur at any stage may affect the final results.
- It can be thought of as a chain of evidence and spans:
  - Data provenance (source, quality control applied, and other facets).
  - Workflow provenance (source, versioning, certification, etc.).
  - Analysis Result provenance (data set, workflow chosen, settings, errors, etc.).
- We need CRISTAL, a resource that is both powerful and flexible in the way it captures provenance data.

Question Time
None like this please!!

CRISTAL Enabled Provenance