Introduction to Taverna, an environment for designing and executing workflows. Franck Tanoh, University of Manchester.



Download Taverna for your platform:
- Windows or Linux: if you are using a modern version of Windows (Windows 2000 or XP, with XP preferred) or any form of Linux, Solaris, etc., download the workbench zip file. On Windows, Taverna can be unzipped and used directly. On Linux you will also need to install GraphViz (use the appropriate rpm for your platform).
- Mac OS X: download the .dmg workbench file. Double-click to open the disk image and copy both components (Taverna and GraphViz) onto your hard disk to run the application.
- You will also need a modern Java Runtime Environment (JRE) or Java Software Development Kit (SDK), Java 5 or above, from http://java.sun.com

The Taverna workbench has a standard menu of 6 tabs:
- File, with the following items:
  - Open a new workspace
  - Load a workflow from a file
  - Load a workflow from the web
  - Close existing workflow
  - Save workflow
  - Import workflow from a file
  - Import workflow from the web
  - Run your workflow
  - Close the workbench

Standard menu (continued):
- Tools: plug-ins and updates
- Workflow: list of all created workflows
- Advanced: create new perspectives
- Design: workflow design space
- Result: view workflow results

The Taverna Design view is composed of 3 main windows.
1. Available Services
Lists the services available by default in Taverna:
- Local Java services
- Simple web services
- Soaplab services (legacy command-line applications)
- BioMart database services
- BioMoby services
It also allows the user to add new services or workflows from the web or from the file system.

2. AME (Advanced Model Explorer)
The Advanced Model Explorer (AME) is the primary editing component within Taverna. Through it you can load, save and edit any property of a workflow. It enables:
- building
- loading
- editing
- saving
workflows.

3. Workflow Diagram Window
A visual representation of the workflow:
- Shows inputs and outputs, services and control flows
- Enables saving of workflow diagrams for publishing and sharing

- Go to the ‘Tools’ menu at the top of the workbench and select the Plugin manager.
- Select ‘Find new plugins’.
- Tick the boxes for Feta, Execute remotely and LogBook, and install these plugins.
- Three more options, ‘Execute remotely’, ‘Discover’ and ‘LogBook’, will now have appeared at the top of your screen.
- Feta is now available through the Discover tab.
The Discover tab can be used to search for web services by name, task, input and output parameters…

New services can be gathered from anywhere on the web.
Go to the following page and copy the web page address: http://developerdays.com/cgi-bin/tempconverter.exe/wsdl/ITempConverter
These services were not designed for use in Taverna, but Taverna can use them if you supply the address of the WSDL file.

- Go to the ‘Available services’ panel and right-click on ‘Available Processors’. For each type of service, you are given the option to add a new service, or set of services.
- Select ‘Add new wsdl scavenger’. A window will pop up asking for a web address.
- Enter the web service address you have just copied.
- Scroll down to the bottom of the ‘Available Services’ panel and look at the Temperature Conversion web service that is now included.

- Expand the [+] next to the ‘tempconverter’ (Temperature Conversion) web service.
- Right-click on the ‘CtoF’ operation and select ‘Invoke’. This operation converts a temperature from Celsius to Fahrenheit.
- In the pop-up ‘Run workflow’ window, add a temperature value in Celsius by selecting ‘temp’ and right-clicking. Select ‘new input value’ and enter a value in the box on the right.
- Click ‘Run workflow’ and the service is invoked.
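For reference, the ‘CtoF’ operation amounts to the standard Celsius-to-Fahrenheit formula. The sketch below is a local Python stand-in, not a call to the remote web service; the function name `c_to_f` is an assumption made for illustration.

```python
def c_to_f(temp_c: float) -> float:
    """Convert Celsius to Fahrenheit (local stand-in for the remote 'CtoF' operation)."""
    return temp_c * 9.0 / 5.0 + 32.0


# Example: the value you would type into the 'temp' input.
print(c_to_f(25.0))
```

This is handy for checking by hand that the value returned by the service is plausible.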

- Click on ‘text/plain’ in the left panel.
- The temperature in Fahrenheit is displayed on the right.
- Click on ‘Process Report’.
- Look at the processes. This shows the experiment provenance: where and when processes were run.
- Click on ‘Status’.
- Look at the options. As workflows run, you can monitor their progress here.

The processes for running and invoking a single service are the basis of any workflow. The tracking of processes and the generation of results are the same however complicated a workflow becomes. In the next few exercises, we will look at some example workflows and build some of our own from scratch.

- Switch to the design view by clicking on ‘Design’.
- Select ‘Open Workflow’ from the File menu at the top of the workbench. You will see a selection of .xml files in an examples directory. These are workflow definition files.
- Select ‘ConvertedEMBOSSTutorial.xml’ and a pre-defined workflow will be loaded.
- View the workflow diagram. You will see services in different colours.

- Find out what the workflow does by reading the workflow metadata.
- In the AME, click on the name of the workflow (in this case ‘A workflow version of the EMBOSS tutorial’) and then select the ‘workflow metadata’ tab at the top of the AME. You will see a text description of the workflow, its author and its unique LSID.
When publishing workflows for others, this annotation is useful information and allows the acknowledgement of intellectual property.

- Run the workflow by selecting ‘Run workflow’ from the File menu.
- Watch the progress of the workflow in the ‘enactor invocation’ window. As services complete, the enactor reports the events. If a service fails, the enactor reports this too.

- Go to the webpage.
- Select ‘ConditionalBranchChoice’ and copy the web address.
- Go back to the Taverna workbench and select ‘Open workflow location’ from the File menu.
- Paste the address in the pop-up window and click ‘OK’.
- Run the workflow using ‘true’ or ‘false’ as the input value.

- You will see at least one of the services fail. What happens when it fails depends on whether the service is set as a critical one. If it is, the workflow will abort; if it isn’t, the workflow will continue.
- You can set a service to critical by ticking the critical box in the AME.
- Set the service to ‘critical’ and run the workflow again.
- The entire workflow fails this time.
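The difference between critical and non-critical failures can be sketched in a few lines of Python. This is an illustration of the behaviour described here, not Taverna's enactor; `run_services`, `WorkflowAborted` and the service tuples are hypothetical names.

```python
class WorkflowAborted(Exception):
    """Raised when a service marked as critical fails."""


def run_services(services):
    """Run (name, callable, critical) tuples in order and report each status.

    A failing non-critical service is recorded and the run continues;
    a failing critical service aborts the whole run.
    """
    status = {}
    for name, func, critical in services:
        try:
            func()
            status[name] = "ok"
        except Exception as exc:
            status[name] = "failed"
            if critical:
                raise WorkflowAborted(f"critical service {name!r} failed") from exc
    return status
```

With the critical flag off, the returned report shows which services failed; with it on, the caller sees a single `WorkflowAborted` error instead.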

- Go back to the Design view.
- Look at the workflow diagram.
- You will see black arrows and white circles. Black arrows show the flow of the data and white circles are control links.
- A control link specifies that, even though there is no data flowing between two services, the second should not start until the first has finished.
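One way to picture control links is as extra ordering constraints alongside the data links. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to compute a valid run order; the service names in the usage lines are hypothetical and the function is not part of Taverna.

```python
from graphlib import TopologicalSorter


def execution_order(data_links, control_links):
    """Return one valid run order for (upstream, downstream) link pairs.

    Data links and control links constrain the order in the same way:
    the downstream service may not start before the upstream one ends.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in list(data_links) + list(control_links):
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


# 'Fetch' and 'Report' share no data, but a control link orders them.
order = execution_order(
    data_links=[("Input", "Fetch"), ("Input", "Report")],
    control_links=[("Fetch", "Report")],
)
print(order)
```

Without the control link, `Fetch` and `Report` could run in either order; with it, `Report` always comes after `Fetch`.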

- Open a new workspace by selecting ‘New workflow’ from the File menu.
- Find the ‘CtoF’ service in the ‘Available services’ panel (you can use the search form on top of ‘Available Processors’).
- Right-click on ‘CtoF’ and import it into the workbench by selecting ‘Add to Model’.
- In the AME window, ‘CtoF’ shows:
  - 1 input (green arrow pointing up)
  - 2 outputs (purple arrows pointing down)

- Define a new workflow input by right-clicking on ‘Workflow Input’ and selecting ‘create new Input’.
- Supply a suitable name, e.g. ‘temperatureInCelsius’.
- Connect this new input to the ‘CtoF’ service by right-clicking on ‘temperatureInCelsius’ and selecting ‘CtoF -> temp’.
You always build workflows with the flow of data.

- Define a new workflow output by right-clicking on ‘workflow output’ and selecting ‘create new output’.
- Supply a suitable name, e.g. ‘temperatureInFahrenheit’.
- Connect this new output to the ‘CtoF’ service output (‘return’): right-click the output ‘return’ on the ‘CtoF’ service and select ‘workflow output -> temperatureInFahrenheit’.
Congratulations! You have built a simple workflow from scratch.
- Run the workflow. You will again need to supply a temperature value in Celsius, e.g. 25.
- Save your workflow.

In the following section you will learn to connect more than one service together. You are going to convert a temperature value from Celsius to Fahrenheit, then back to Celsius again, using only one workflow.
- Open a new workflow workspace.
- Search for the ‘CtoF’ web service in the Available services panel and add it to the AME window.
- Search for the ‘FtoC’ web service in the Available services panel and add it to the AME window.
- Create an input called ‘TempC’ and connect it to the ‘temp’ input on the ‘CtoF’ service.

- The temperature input for the ‘FtoC’ service will be the output from the ‘CtoF’ service. Connect the output ‘return’ on the ‘CtoF’ web service to the input ‘temp’ on the ‘FtoC’ web service.
- Create an output called ‘temp_in_C’ and connect it to the output ‘return’ on the ‘FtoC’ service.
Remember: you always build workflows with the flow of data.
- Run the workflow.
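The two-service workflow can be read as a chain of function calls, one per data link. The sketch below uses local stand-ins for the ‘CtoF’ and ‘FtoC’ operations (they are not calls to the web service), and `run_workflow` is a hypothetical name; the port names ‘TempC’ and ‘temp_in_C’ are the ones used in this exercise.

```python
def c_to_f(temp_c: float) -> float:
    """Local stand-in for the 'CtoF' operation."""
    return temp_c * 9.0 / 5.0 + 32.0


def f_to_c(temp_f: float) -> float:
    """Local stand-in for the 'FtoC' operation."""
    return (temp_f - 32.0) * 5.0 / 9.0


def run_workflow(inputs: dict) -> dict:
    """Follow the data links: TempC -> CtoF.temp, CtoF.return -> FtoC.temp."""
    ctof_return = c_to_f(inputs["TempC"])
    ftoc_return = f_to_c(ctof_return)
    return {"temp_in_C": ftoc_return}


print(run_workflow({"TempC": 25.0}))
```

Because the second service undoes the first, the workflow output should match the value you supplied, which makes this a convenient self-checking exercise.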

- Go back to the design view.
- Select and right-click the workflow input ‘TempC’.
- Select ‘Remove from model’ to delete it.
- Select ‘string constant’ from ‘Available Services’.
- Right-click and select ‘add to model with name…’.

- Enter ‘TemperatureC’ in the pop-up window.
- Right-click on ‘TemperatureC’ and select ‘Edit string value’.
- Enter a temperature value in Celsius.
- Connect the output ‘value’ on ‘TemperatureC’ to the input ‘temp’ on the ‘CtoF’ service.
- Run the workflow. It will run with the default value, without asking for an input.

Taverna provides several options for saving data:
1. Individual data items can be saved by right-clicking on them.
2. All data can be saved to disk.
3. Textual/tabular data can be saved to Excel.
- Save all the data from your workflow.

- Build a workflow following the model below. The names of the web services (shown in purple and green) and the input values are given in the diagram. Hint: use the Discover tab to find the services.
- Annotate your workflow (name, author, date…).
- Run the workflow.
[Workflow diagram. Inputs: ID: Q09093, program: blastp, database: SWISS. Services: Get_Protein_Fasta, SearchSimple (Blast-DDBJ). Output: blast_result.]

The previous exercises have covered the basics of myGrid workflows. The following demos and exercises cover more advanced features, such as rendering output, dealing with service failure and iterating over datasets. You may not reach the end of these exercises, but they will provide some examples to take home.

- Output format
- Iteration
- Substituting services
- Fault tolerance
- Nested workflow
- Shim
- XMLSplitters
- T2 (need to understand this)
- Provenance (demo)
- Workflow re-use (myExperiment demo)

Taverna is able to display results using a specific type of renderer if the workflow output is configured correctly.
- Load the workflow ‘convertedEMBOSSTutorial’ from the ‘examples’ directory.
- Run the workflow.

- Look at the results. For ‘tmapPlot’ and ‘outputPlot’, you will see the results are displayed graphically. This is achieved by specifying a particular mime-type in the output.
- Go back to the AME and look at the metadata for ‘tmapPlot’ and ‘outputPlot’ (e.g. select ‘tmapPlot’ and click on ‘Metadata for tmapPlot’).
- Select MIME Types. As you can see, each has the image/png mime-type associated with it.
If you wish to render results in anything other than plain text, you MUST specify the mime-type in the workflow output.

The following mime-types are currently used by Taverna:
- text/plain = Plain Text
- text/xml = XML Text
- text/html = HTML Text
- text/rtf = Rich Text Format
- text/x-graphviz = Graphviz Dot File
- image/png = PNG Image
- image/jpeg = JPEG Image
- image/gif = GIF Image
- application/zip = Zip File
- chemical/x-swissprot = SWISSPROT Flat File
- chemical/x-embl-dl-nucleotide = EMBL Flat File
- chemical/x-ppd = PPD File
- chemical/seq-aa-genpept = Genpept Protein
- chemical/seq-na-genbank = Genbank Nucleotide
- chemical/x-pdb = Protein Data Bank Flat File
- chemical/x-mdl-molfile
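The idea of choosing a renderer from the declared mime-type can be sketched as a simple lookup. The table below copies a few entries from the list above; the function name `describe_output` and the fall-back to plain text for unknown types are assumptions made for illustration, not documented Taverna behaviour.

```python
# A few of the mime-types listed above, mapped to their display labels.
RENDERER_LABELS = {
    "text/plain": "Plain Text",
    "text/xml": "XML Text",
    "text/html": "HTML Text",
    "image/png": "PNG Image",
    "image/jpeg": "JPEG Image",
    "image/gif": "GIF Image",
    "chemical/x-pdb": "Protein Data Bank Flat File",
}


def describe_output(mime_type: str) -> str:
    """Return the label for a declared mime-type.

    Assumption: anything not declared or not recognised is shown as plain text.
    """
    return RENDERER_LABELS.get(mime_type, RENDERER_LABELS["text/plain"])


print(describe_output("image/png"))
```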

The ‘chemical/’ mime-types are rendered using SeqVista to view formatted sequence data.
- Load ‘FetchPDBFlatFile’ from the ‘examples/library’ directory.
- Run the workflow using ‘1atp’ as the example input.
The chemical/x-pdb mime-type can be used to view rotating 3D protein images.

Nested workflows encourage the reuse of workflows within a more complex scenario and provide abstraction.
- Select the workflow “temperature conversion” of exercise 6: workflow1.
- Click on ‘Add Nested Workflow’ under ‘Advanced model explorer’.
- Select ‘Open File’ and choose the workflow you saved in exercise 5.3: workflow2.
- Connect both workflows together so that workflow2 becomes a subworkflow of workflow1.
- Run the workflow. Hint: you may need to create a new workflow output.

Taverna has an implicit iteration framework. If you connect a set of data objects (for example, a set of fasta sequences) to a process that expects a single data item at a time, the process will iterate over each sequence.
- Load the BiomartandEMBOSSAnalysis.xml workflow from the examples directory and run it.
- Watch the progress report. You will see several services with ‘Invoking with Iteration’.
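Implicit iteration can be pictured as follows: when a service that expects one item receives a list, it is called once per item and the results are collected into a list. This is a minimal sketch of that idea, not Taverna's implementation; `invoke` and `sequence_length` are hypothetical names.

```python
def invoke(service, value):
    """Call `service` once for a single item, or once per item for a list."""
    if isinstance(value, list):
        return [invoke(service, item) for item in value]
    return service(value)


def sequence_length(sequence: str) -> int:
    """A stand-in service that expects exactly one sequence."""
    return len(sequence)


print(invoke(sequence_length, "ACGT"))
print(invoke(sequence_length, ["ACGT", "AC", "ACGTAC"]))
```

The workflow designer never writes the loop; it follows from the mismatch between what was supplied (a list) and what the service expects (one item).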

The user can also specify more complex iteration strategies using the service metadata tag.
- Load ‘IterationStrategyExample.xml’ from the examples directory.
- Read the workflow metadata to find out what the workflow does.
- Select the ‘ColourAnimals’ service and read the metadata for that service. Under the description is the iteration strategy.
- Click on ‘dot product’. This allows you to switch to cross product.

- Run the workflow twice: once with ‘dot product’ and once with ‘cross product’.
- Save the first results so you can compare them. What is the difference? What does it mean to specify dot or cross product?
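As a way of checking your answer, the two strategies correspond to two familiar ways of combining lists: pairing items position by position, or forming every combination. The sketch below illustrates this with Python's standard library, assuming two input lists of equal length; the colour and animal values are made up for illustration.

```python
from itertools import product


def dot_product(first, second):
    """Pair items position by position: (first[0], second[0]), (first[1], second[1]), ..."""
    return list(zip(first, second))


def cross_product(first, second):
    """Pair every item of `first` with every item of `second`."""
    return list(product(first, second))


colours = ["red", "green"]
animals = ["cat", "dog"]
print(dot_product(colours, animals))
print(cross_product(colours, animals))
```

For two lists of length n, the dot product yields n service invocations and the cross product yields n × n.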

Taverna does not own many of the services it provides. This means that it cannot control their reliability. Instead, Taverna provides strategies for dealing with services being unavailable.
- Reload ‘convertedEMBOSSTutorial.xml’ from the ‘examples’ directory.
- Look at the metadata for the ‘emma’ service. It is an implementation of clustalw.
- Find the DDBJ clustalw service ‘analyseSimple’. Hint: use the Feta discovery tool.

- When you have added this service to your workflow, right-click on it and select ‘add as alternate’.
- In the resulting menu select ‘emma’.
- The DDBJ version of the clustalw service is now added as an alternative to emma in the AME. It will be called ‘alternate1’.
- Select ‘alternate1’ and look at the inputs and outputs. These need to be mapped to the correct inputs and outputs in emma.

- Right-click on the ‘query’ input in alternate1 and map it to ‘sequence_direct_data’. In both services, these inputs expect a set of fasta sequences.
- Right-click on the ‘result’ output and map it to ‘outseq’ in emma in the same way.
- Now you have a workflow which will run using emma when it is available, but will substitute DDBJ clustalw for it if emma fails!
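The alternate mechanism behaves like a fall-back chain: try the primary service, and only if it fails try the next one. The sketch below illustrates that behaviour in Python; `run_with_alternates` is a hypothetical helper, and the services passed to it stand in for ‘emma’ and ‘alternate1’.

```python
def run_with_alternates(primary, alternates, data):
    """Return the result of the first service that succeeds.

    The primary is tried first, then each alternate in order. If every
    service fails, a RuntimeError is raised, chained to the last failure.
    """
    last_error = None
    for service in [primary, *alternates]:
        try:
            return service(data)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all services failed") from last_error
```

Note that the alternate must accept the same data as the primary, which is why the input and output ports are mapped onto each other in the steps above.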

Taverna also allows the user to specify the number of times a service is retried before it is considered to have failed. Sometimes network traffic is heavy, so a working service needs to be retried.
- Select ‘tmap’ from the same workflow. To the right of the service name is a series of 0s and 1s. By simply typing the numbers, the user can specify the number of retries and the time between the retries.
- Change it to 3 retries for ‘tmap’ and set the status to ‘critical’ using the final tickbox. Now that it is critical, the whole workflow will be aborted if ‘tmap’ fails after 3 retries. Failures in non-critical services will not abort the workflow run.
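The retry settings can be understood as a loop around the service call. The following is a minimal sketch of that behaviour, assuming that "3 retries" means one initial attempt plus up to three further attempts; `call_with_retries` and its parameters are hypothetical names, not Taverna settings.

```python
import time


def call_with_retries(service, retries=3, delay_seconds=0.0, sleep=time.sleep):
    """Call `service`, retrying up to `retries` times after a failure.

    Waits `delay_seconds` between attempts and re-raises the last error
    once all attempts are used up.
    """
    attempts = retries + 1  # the initial attempt plus the retries
    for attempt in range(attempts):
        try:
            return service()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(delay_seconds)
```

Only when the final attempt fails does the error reach the caller, at which point the critical setting decides whether the whole workflow is aborted.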

A shim is a service that doesn’t do anything scientific, but helps two scientific services fit together. There are many myGrid shim services. These are currently being described in a shim library, but for now, a small collection is documented here.

Beanshell scripts allow users to write small, bespoke Java scripts to allow incompatible services to work together.
- Create a new beanshell processor by right-clicking “Beanshell scripting host” in the service panel and selecting “Add to model” (you may change the name of the processor).
- Right-click the beanshell processor you created and select “Configure beanshell…”.
- Create 2 input ports named myName and mySurname.
- Create 1 output port named myFullname.
Note that these ports are automatically added to the AME window.

- Select the script tab and paste the following script:
    myFullname = myName + "\t" + mySurname;
- Create 2 workflow inputs and 1 workflow output and connect them to the configured beanshell processor.
- Run the workflow.
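To see what the script does with its ports, it can help to read it as a function from input port values to output port values. The Python sketch below mirrors the one-line script for illustration only (the processor itself runs Beanshell, not Python); `run_script` is a hypothetical name, and the port names are the ones configured above.

```python
def run_script(inputs: dict) -> dict:
    """Mirror the Beanshell script: join the two input ports with a tab."""
    my_name = inputs["myName"]
    my_surname = inputs["mySurname"]
    my_fullname = my_name + "\t" + my_surname
    return {"myFullname": my_fullname}


print(run_script({"myName": "Ada", "mySurname": "Lovelace"}))
```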

Adding XMLSplitters
Some web services do not explicitly expose their inputs and/or outputs. ‘XMLSplitters’ are used to present these inputs and/or outputs to the user.
- Open a new workflow workspace.
- Add the following wsdl service
- Add the service ‘DailyDilbertImagePath’ to the AME window.
- It has 2 outputs but no input.

Adding XMLSplitters
- Select the output ‘parameters’ on the ‘DailyDilbertImagePath’ service.
- Right-click and select ‘add XML splitter’.
- A new service ‘parametersXML’ is added with its input connection already made.
- Search for the ‘Get image from URL’ web service and add it to the AME window.
- Connect the output ‘DailyDilbertImagePathResult’ on the ‘parametersXML’ service to the input ‘url’ on the ‘Get image from URL’ service.

Adding XMLSplitters
- The second input ‘base’ on the ‘Get image from URL’ service is optional. Leave it unconnected.
- Create a new workflow output ‘DailyDilbert’ and connect it to the output ‘image’ on the ‘Get image from URL’ service.
- Run the workflow.
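Conceptually, an XML splitter unpacks a service's XML output into its named child elements so that each can be wired to another service. The sketch below shows the same idea with Python's standard `xml.etree.ElementTree`. The sample XML document and its URL are made up for illustration (the real service response may look different), and `split_output` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def split_output(xml_text: str, child_name: str) -> str:
    """Return the text of the first element whose local name is `child_name`."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Compare local names so the sketch works with or without XML namespaces.
        if element.tag.rsplit("}", 1)[-1] == child_name:
            return element.text
    raise KeyError(child_name)


# Hypothetical 'parameters' output; only the element name comes from the exercise.
sample = (
    "<parameters>"
    "<DailyDilbertImagePathResult>http://example.org/strip.gif</DailyDilbertImagePathResult>"
    "</parameters>"
)
print(split_output(sample, "DailyDilbertImagePathResult"))
```

The extracted value is what would flow into the ‘url’ input of the next service.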

- The Taverna Remote Execution plugin allows workflows to be run on a Remote Execution Server.
- To install the Remote Execution plugin, use the Plugin Manager in the Tools menu.
- Configuration information and instructions for using remote execution are available here:

Taverna user manual:  Taverna mailing lists: