Taverna Workbench
Stuart Owen, University of Manchester, UK

What is a workflow?
Data workflows
–A task is invoked once its expected data has been received, and when complete it passes any resulting data downstream.
–B starts when it receives data from A.
–C and D run in parallel when they receive data from B.
–E starts once it has received data from both C and D.
Control workflows
–A task is invoked once the tasks it depends on have completed.
–B starts when A has completed.
–C and D run in parallel once B has completed.
–E starts once both C and D have completed.
[Diagram: workflow graph of tasks A, B, C, D, E and F]
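The data-driven rule above (a task fires once all of its expected inputs have arrived) can be sketched in a few lines of Python. This is only an illustration, not how Taverna is implemented: the function `run_dataflow`, the task functions and the values are hypothetical, and tasks that could run in parallel (C and D) are simply run one after the other.

```python
from collections import deque


def run_dataflow(tasks, edges):
    """Run tasks in data-driven order.

    tasks: dict mapping a task name to a function that takes a dict
           {upstream_name: value} and returns a value.
    edges: list of (upstream, downstream) data links.
    Returns (order, results).
    """
    upstream = {name: [] for name in tasks}
    downstream = {name: [] for name in tasks}
    for src, dst in edges:
        upstream[dst].append(src)
        downstream[src].append(dst)

    received = {name: {} for name in tasks}
    # Tasks with no upstream links have nothing to wait for.
    ready = deque(name for name in tasks if not upstream[name])
    order, results = [], {}
    while ready:
        name = ready.popleft()
        results[name] = tasks[name](received[name])
        order.append(name)
        # Pass the result downstream; a task becomes ready only once
        # every one of its inputs has been received.
        for nxt in downstream[name]:
            received[nxt][name] = results[name]
            if len(received[nxt]) == len(upstream[nxt]):
                ready.append(nxt)
    return order, results


# Hypothetical tasks matching the A, B, C, D, E example on the slide.
tasks = {
    "A": lambda inputs: 1,
    "B": lambda inputs: inputs["A"] + 1,
    "C": lambda inputs: inputs["B"] * 2,
    "D": lambda inputs: inputs["B"] * 3,
    "E": lambda inputs: inputs["C"] + inputs["D"],
}
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]
order, results = run_dataflow(tasks, edges)
```

A control workflow would use the same bookkeeping, but count completed predecessors instead of received data values.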

Advantages of workflows
acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

Advantages of workflows
High-level abstraction
–Easier to understand and modify.
–Easier to describe and discuss with others.
–Describes what you want to do, not how to do it.
Automation
Systematic
Sharing and re-use
–Either on its own, or within other workflows!

Workflows within Taverna
Predominantly based around the flow of data, but control constraints are allowed as well.
Service-oriented workflows. Services may or may not be grid enabled.
High-level GUI approach separated from lower-level coding: you don’t have to be a coder to build a workflow.
Enactment can take place separately from the GUI, allowing workflows to be executed from the command line or within other systems.

Taverna 1.4 Workbench
Integral part of the myGrid project
Java based; runs on Windows, Mac OS, Linux, Solaris
Open source and user-driven development
Taverna in OMII-UK
–Dedicated team of developers focused on design, implementation, testing and support, leading to production-quality software.
–Development of Taverna 2.0

Taverna 1.4 workbench

[Architecture diagram]
Components: Taverna Workbench; Scufl + Workflow Object Model; Freefluo workflow enactor; processors for Web Service, Soaplab, Local App, BioMOBY and others (?)
SCUFL (Simple Conceptual Unified Flow Language)
Layers:
–Application data flow layer: Scufl graph + service introspection
–Execution flow layer: list management; implicit iteration mechanism; MIME & semantic type decoration; fault management; service alternates
–Processor invocation layer: workflow execution

Nested workflows
A processor can itself be a workflow.
Encourages the reuse of workflows within a more complex scenario.
Gives greater abstraction of an overall process, making it more manageable.

Iterations
Scufl handles iterations implicitly, i.e. Taverna handles them automagically; there's no need for the user to indicate that an iteration is required.
Taverna recognises the data mismatch and repeatedly runs the task over each data element in the list.
The iteration strategy for multiple inputs can be configured:
–“Cross product”: all against all
–“Dot product”: first against first, second against second, and so on
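The two iteration strategies can be illustrated with Python's standard library. This is a sketch of the idea only: `cross_product`, `dot_product`, `iterate` and the toy `join` service are hypothetical names, not part of Taverna or Scufl. Note that `zip` stops at the shortest list; how Taverna treats lists of unequal length is not covered by the slides.

```python
from itertools import product


def cross_product(*input_lists):
    """All against all: every combination of one element per input list."""
    return list(product(*input_lists))


def dot_product(*input_lists):
    """First against first, second against second, and so on."""
    return list(zip(*input_lists))


def iterate(service, input_lists, strategy):
    """Call a single-item service once per combination chosen by the strategy."""
    return [service(*args) for args in strategy(*input_lists)]


def join(a, b):
    # Hypothetical service that expects single values, not lists.
    return f"{a}{b}"


inputs = [["a", "b"], ["x", "y"]]
crossed = iterate(join, inputs, cross_product)
dotted = iterate(join, inputs, dot_product)
```

With two lists of two elements each, the cross product makes four service calls and the dot product makes two.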

What about when a service fails?
Most services are owned by other people
No control over service failure
Some are research level
Workflows are only as good as the services they connect!
To help, Taverna can:
–Notify failures
–Instigate retries
–Set criticality
–Substitute alternative services
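The four fault-handling options listed above can be combined in one small wrapper. The following Python sketch shows the general pattern under stated assumptions: `invoke_with_fallback`, `primary` and `alternate` are hypothetical names, and nothing here reflects Taverna's actual configuration options or defaults.

```python
def invoke_with_fallback(services, data, retries=2, critical=True):
    """Try each service in turn (primary first, then alternates).

    Each service gets one initial attempt plus `retries` retries.
    Every failure is recorded so it can be reported to the user.
    If all services fail, a critical task raises an error, while a
    non-critical task returns None so the workflow can continue.
    """
    failures = []
    for service in services:
        for attempt in range(1, retries + 2):
            try:
                return service(data), failures
            except Exception as exc:
                failures.append((service.__name__, attempt, str(exc)))
    if critical:
        raise RuntimeError(f"all services failed: {failures}")
    return None, failures


def primary(data):
    # Hypothetical third-party service that is currently down.
    raise ConnectionError("service unavailable")


def alternate(data):
    # Hypothetical substitute service offering the same operation.
    return data.upper()


result, failures = invoke_with_fallback([primary, alternate], "acgt", retries=1)
skipped, skipped_failures = invoke_with_fallback(
    [primary], "acgt", retries=0, critical=False
)
```

Here the primary service fails twice (one attempt, one retry), the alternate is substituted, and both failures remain available for notification.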

Provenance Data?
Supports scientific method and best practice.
Metadata about the origin of a resource (workflow, service, data, experiment hypothesis etc.) and the process by which a resource was generated.
The Who?, What?, When?, Where? and Why? about resources.
Stored as RDF triples.
Also available as OWL, opening it up to complex reasoning.
[Diagram: Input, Result, Provenance Record]

Typed Workflow Run
[Diagram: provenance graph typed by the Provenance Ontology]
Classes: Workflow, WorkflowRun, ProcessRun, Experimenter, Organization
Properties: runs, launchedBy, belongsTo, executed
Resources: urn:lsid:..:wfInstance:8, urn:lsid:…:org:HY7, urn:lsid:…:person:4, urn:lsid:…:workflow:6, urn:lsid:…:processRun:84, urn:lsid:…:processRun:51
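To make "stored as RDF triples" concrete, here is a minimal Python sketch that keeps provenance statements as (subject, predicate, object) tuples and answers simple pattern queries. The property names come from the diagram, but the `ex:` identifiers and the direction of each relation are assumptions made for illustration; a real system would use an RDF store and the full LSIDs.

```python
# Each statement is a (subject, predicate, object) triple.
# All "ex:" identifiers are hypothetical placeholders.
triples = [
    ("ex:workflowRun1", "runs", "ex:workflow1"),
    ("ex:workflowRun1", "launchedBy", "ex:person1"),
    ("ex:person1", "belongsTo", "ex:org1"),
    ("ex:workflowRun1", "executed", "ex:processRun1"),
    ("ex:workflowRun1", "executed", "ex:processRun2"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Who launched the run, and which organisation do they belong to?
launcher = match(triples, "ex:workflowRun1", "launchedBy")[0][2]
organisation = match(triples, launcher, "belongsTo")[0][2]

# Which process runs were executed as part of the workflow run?
process_runs = [o for (_, _, o) in match(triples, "ex:workflowRun1", "executed")]
```

Chaining two pattern matches answers a "who and where" question without any schema beyond the triples themselves.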

Provenance Browser

New plans for Taverna 2.0

Evolving challenges
Long-running, data-intensive workflows
Manipulation of confidential or otherwise protected information
Use with classical grid systems
Publishing and sharing of workflows
Better use of provenance

Runtime Service Binding
Service definition consists of an abstract description.
Resolved at workflow runtime to one or more concrete resources by a broker.
Allows load balancing or economic-model-based service selection over grid environments.
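A broker that resolves an abstract description to a concrete resource at run time might look like the following Python sketch. Everything here is hypothetical (the `Endpoint` and `Broker` classes, the capability names, the example URLs and the load figures); it shows load-based selection only, not the planned Taverna 2.0 design.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    url: str          # concrete resource (hypothetical example URL)
    capability: str   # abstract description the endpoint satisfies
    load: float       # current load, 0.0 (idle) to 1.0 (saturated)


class Broker:
    """Resolve an abstract capability to a concrete endpoint at run time."""

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)

    def resolve(self, capability):
        candidates = [e for e in self.endpoints if e.capability == capability]
        if not candidates:
            raise LookupError(f"no service offers {capability!r}")
        # Simple load balancing: pick the least loaded candidate.
        return min(candidates, key=lambda e: e.load)


broker = Broker([
    Endpoint("http://example.org/align-a", "sequence_alignment", 0.9),
    Endpoint("http://example.org/align-b", "sequence_alignment", 0.2),
    Endpoint("http://example.org/render", "image_rendering", 0.1),
])
chosen = broker.resolve("sequence_alignment")
```

Swapping the `key` function for a price lookup would give the economic-model variant mentioned on the slide.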

Processor Dispatch Stack

3rd-party data transfers
Allows ‘in place’ referencing of data
–Large data sets no longer round-trip between workflow engine and data provider
–Allows restricted access to sensitive data
Automatic de-reference when a reference type is linked to a value type within a workflow.
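The reference-versus-value idea can be sketched as follows in Python. The names (`DataReference`, `link`, `STORE`, the `store://` location) are hypothetical and only illustrate the behaviour described: data is fetched when a reference is linked to a value-typed input, and not before.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical remote data store, keyed by a made-up location string.
STORE = {"store://example/sequence-1": "acatttctac"}
fetch_count = 0


def fetch(location: str) -> str:
    """Pull the actual data from the provider (counted for illustration)."""
    global fetch_count
    fetch_count += 1
    return STORE[location]


@dataclass(frozen=True)
class DataReference:
    location: str
    resolver: Callable[[str], str]

    def dereference(self) -> str:
        return self.resolver(self.location)


def pass_along(ref):
    # Reference-typed service: forwards the reference without fetching.
    return ref


def sequence_length(value: str) -> int:
    # Value-typed service: needs the data itself.
    return len(value)


def link(output, consumer, consumer_takes_value):
    """Connect an output to a consumer, de-referencing only when needed."""
    if consumer_takes_value and isinstance(output, DataReference):
        output = output.dereference()
    return consumer(output)


ref = DataReference("store://example/sequence-1", fetch)
forwarded = link(ref, pass_along, consumer_takes_value=False)
fetches_after_forward = fetch_count
length = link(forwarded, sequence_length, consumer_takes_value=True)
```

The data crosses the network once, at the point where a value is actually required, rather than at every hop through the workflow engine.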

Streaming Data
Allow execution of downstream workflow stages on partially complete results from upstream.
[Diagram: Service 1, Service 2, Service 3]
–Non-streaming (Taverna 1): the entire iteration must complete at each stage
–Streamed data: Service 2 starts operating on partial results from Service 1
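The difference in ordering between the two modes can be shown with Python generators. This is a single-threaded sketch with hypothetical stage functions; it demonstrates only the order in which items are processed, not real concurrency or Taverna's implementation.

```python
def stage(name, func, items, log):
    """Apply func to each item lazily, logging when each item is processed."""
    for item in items:
        log.append(f"{name}:{item}")
        yield func(item)


def add_one(x):
    return x + 1


def double(x):
    return x * 2


def run_batch(data, log):
    # Non-streaming: service 1 finishes the whole list before service 2 starts.
    first = list(stage("s1", add_one, data, log))
    return list(stage("s2", double, first, log))


def run_streamed(data, log):
    # Streaming: service 2 consumes each result as soon as service 1 yields it.
    first = stage("s1", add_one, data, log)
    return list(stage("s2", double, first, log))


batch_log, streamed_log = [], []
batch_result = run_batch([1, 2], batch_log)
streamed_result = run_streamed([1, 2], streamed_log)
```

Both modes give the same results; in the streamed log the second service handles its first item before the first service has touched its second.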

Conclusions
Taverna and its source code are free to download.
Taverna is being adopted by a number of disciplines outside its bio-science origins, including chemoinformatics, social science and astronomy.
Open architecture and support for plugins to cope with an open world, which allows expansion into other areas.
User-driven development
–Taverna users mailing list
–Taverna hackers mailing list
Production-quality software within OMII-UK

Acknowledgements
The myGrid group, past and present
OMII-UK
All our users
Carole Goble, Katy Wolstencroft, Daniele Turi, Matthew Gamble, Tom Oinn, Paul Fisher