Exploring Taverna engine Aleksandra Pawlik materials by Katy Wolstencroft University of Manchester.


• In this tutorial we will explore the Taverna workflow engine: iteration, looping, and parallel invocation.
• As in the previous tutorial, the workflows in this practical use small data sets and are designed to run in a few minutes. In the real world, you would be using larger data sets, and workflows would typically run for longer.

As you have already seen, Taverna can automatically iterate over sets of data. When two sets of iterated data are combined, however, Taverna needs extra information about how they should be combined. You can have:
• A cross product: combining every item from list 1 with every item from list 2 (all against all)
• A dot product: combining only item 1 from list 1 with item 1 from list 2, item 2 with item 2, and so on (line against line)

• Find and load the workflow ‘Demonstration of configurable iteration’ by Stian Soiland-Reyes from myExperiment.
• Read the workflow metadata (under ‘Details’) to find out what the workflow does.
• Run the workflow and look at the results.

• Select the ‘ColourAnimals’ service, select ‘Details’ in the workflow explorer, and then ‘configure list handling’.
• Click on ‘dot product’ in the pop-up window. This allows you to switch to cross product (see the next slide).
• Run the workflow again and look at the results.

• What is the difference between the results of the two runs? What does it mean to specify dot or cross product?
NOTE: The iteration strategy is very important. Setting cross product instead of dot product when you have 2000 data items can cause a large and unnecessary increase in computation!
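To get a feel for the scale, the short Python sketch below counts service invocations under each strategy. It assumes two input lists of 2000 items each, which is only one reading of the note above, and the function names are illustrative rather than part of Taverna.

```python
def cross_product_invocations(len_a: int, len_b: int) -> int:
    """Every item of list A is paired with every item of list B."""
    return len_a * len_b


def dot_product_invocations(len_a: int, len_b: int) -> int:
    """Items are paired by position, so the shorter list sets the count."""
    return min(len_a, len_b)


if __name__ == "__main__":
    n = 2000  # assumed size of each input list
    print("cross:", cross_product_invocations(n, n))  # 4000000
    print("dot:  ", dot_product_invocations(n, n))    # 2000
```

Under that assumption, choosing the wrong strategy multiplies the number of service calls by 2000.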

How does Taverna combine them?
List 1, e.g. red, green, blue, yellow
List 2, e.g. cat, donkey, koala

Cross product
List 1: Red, Green, Blue, Yellow
List 2: Cat, Donkey, Koala
Result:
Red cat, red donkey, red koala
Green cat, green donkey, green koala
Blue cat, blue donkey, blue koala
Yellow cat, yellow donkey, yellow koala

Dot product
List 1: Red, Green, Blue, Yellow
List 2: Cat, Donkey, Koala
Result:
Red cat
Green donkey
Blue koala
There is no yellow animal because the list lengths don’t match!
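The two strategies can be mimicked in a few lines of Python. This is only an analogy using the standard library (`itertools.product` for all against all, `zip` for line against line); it does not show how the Taverna engine is implemented.

```python
from itertools import product

colours = ["red", "green", "blue", "yellow"]
animals = ["cat", "donkey", "koala"]

# Cross product: all against all (4 x 3 = 12 combinations).
cross = [f"{c} {a}" for c, a in product(colours, animals)]

# Dot product: line against line. zip stops at the shorter list,
# so "yellow" is never paired with an animal.
dot = [f"{c} {a}" for c, a in zip(colours, animals)]

print(cross)
print(dot)
```

The `dot` list has three entries and the `cross` list has twelve, matching the colour and animal example above.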

• The default in Taverna is cross product.
• Be careful! All against all in large iterations gives very big numbers!
Exercise: Iteration

• From myExperiment, find the workflow ‘InterproScan without Looping’ by Katy Wolstencroft.
• InterproScan analyses a given protein sequence (or set of sequences) for functional motifs and domains.
• This workflow is asynchronous. This means that when you submit data to the ‘runInterproScan’ service, it will return a job ID and place your job in a queue (this is very useful if your job will take a long time!).
• The ‘Status’ nested workflow will query your job ID to find out if it is complete.

The default behaviour in a workflow is to call each service only once for each item of data, so what if your job has not finished when the ‘Status’ workflow asks?
• Run the workflow, using the default protein sequence and your own address (the EBI requires an academic address for it to run).
• Almost every time, the workflow will fail because the results have not been returned before the workflow reaches the ‘get_results’ service.

This is where looping is useful. Taverna can keep running the ‘Status’ service until it reports that the job is done.
• Select the ‘Status’ nested workflow and right-click. Select ‘Configure running’ and then ‘Looping...’ from the drop-down list (you could also just click on ‘Details’ in the workflow explorer).

• If you clicked on ‘Details’ in the workflow explorer, select ‘advanced’ and click on ‘add looping’.
• Use the drop-down boxes in the looping window to set ‘getStatus_output_status’ ‘is_not_equal_to’ RUNNING.

• Save the workflow and run it again.
• This time, the workflow will run until the ‘Status’ nested workflow reports that it is either DONE or has an ERROR.
• You will see results for ‘TextResults’, but you will still get an error for ‘Graphical_results’. This is because there is one more configuration to change: we also need ‘Control Links’.
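The looping behaviour is essentially a polling loop. The Python sketch below illustrates the idea with a fake status service; `make_fake_status`, `wait_for_job` and the `max_polls` safety limit are hypothetical names introduced for illustration and are not Taverna or EBI APIs.

```python
import itertools
import time


def make_fake_status(running_polls: int):
    """Return a stand-in status check: RUNNING a few times, then DONE."""
    responses = itertools.chain(["RUNNING"] * running_polls,
                                itertools.repeat("DONE"))
    return lambda job_id: next(responses)


def wait_for_job(job_id, get_status, poll_interval=0.0, max_polls=100):
    """Call get_status repeatedly until it stops returning RUNNING."""
    for _ in range(max_polls):
        status = get_status(job_id)
        if status != "RUNNING":  # mirrors the 'is_not_equal_to' RUNNING condition
            return status        # DONE or ERROR
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still RUNNING after {max_polls} polls")


if __name__ == "__main__":
    print(wait_for_job("job-1", make_fake_status(3)))
```

The loop exits on any status other than RUNNING, which is why the workflow stops on both DONE and ERROR.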

• A control link specifies that one service depends on another even though no data flows between them.
• A control link is drawn as a line with a white circle at the end that connects two services (see the link between the ‘Status’ nested workflow and ‘get_Result_input’).

• We will add control links to the other output type.
• Right-click on getResult_graphical_input and select ‘Run after’ from the drop-down menu.
• Set it to ‘Run after’ -> ‘Status’.
• Save and run the workflow.
• Now you will see each result returned.
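One way to think about a control link is as an extra ordering constraint in a dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to show how adding a ‘Run after’ edge changes the valid execution order. The service names and the data links are a simplified assumption about the InterproScan workflow, not an export of it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps to the set of services that must finish before it starts.
# Assumed data links: both services consume the job ID from runInterproScan.
data_links = {
    "Status": {"runInterproScan"},
    "getResult_graphical": {"runInterproScan"},
}
# Control link ("Run after" Status): an ordering constraint with no data flow.
control_links = {
    "getResult_graphical": {"Status"},
}


def execution_order(data, control):
    """Merge both kinds of link and return one valid execution order."""
    merged = {}
    for links in (data, control):
        for service, predecessors in links.items():
            merged.setdefault(service, set()).update(predecessors)
    return list(TopologicalSorter(merged).static_order())


print(execution_order(data_links, control_links))
```

Without the control link, the result fetch would be free to run as soon as the job was submitted; with it, the fetch can only come after ‘Status’ has finished.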

• Web services can sometimes fail due to network connectivity problems.
• If you are iterating over lots of data items, you can guard against these temporary interruptions by adding retries to your workflow.
• In the Workbench, select the service “create lots of strings” (from “test”) and add it to the workflow; then select “sometimes fails”.

• Add an output port and connect the service as shown in the picture below.
• Run the workflow as it is and count the number of failed iterations.

• Now, select the ‘sometimes_fails’ service and select the ‘details’ tab in the workflow explorer panel.
• Click on ‘advanced’ and ‘configure’ for retries.
• In the pop-up box, change it so that it retries each service iteration 2 times.
• Run the workflow again. How many failures do you get this time?
• Change the workflow to retry 5 times. Does it work every time now?
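The effect of retries can be simulated outside Taverna. In the Python sketch below, `sometimes_fails` is a stand-in that fails at random with an assumed 50% failure rate (the real test service's rate is not stated here), and `call_with_retries` is a hypothetical helper showing what "retry N times" means for a single iteration.

```python
import random


def sometimes_fails(item, rng, failure_rate=0.5):
    """Stand-in for an unreliable service call (failure rate is assumed)."""
    if rng.random() < failure_rate:
        raise RuntimeError(f"service failed for {item}")
    return item.upper()


def call_with_retries(service, item, max_retries):
    """Try once, then retry up to max_retries more times before giving up."""
    for attempt in range(max_retries + 1):
        try:
            return service(item)
        except RuntimeError:
            if attempt == max_retries:
                raise  # out of retries: this iteration counts as failed


def count_failures(items, max_retries, seed=0):
    """Run every item through the flaky service and count failed iterations."""
    rng = random.Random(seed)
    failures = 0
    for item in items:
        try:
            call_with_retries(lambda x: sometimes_fails(x, rng), item, max_retries)
        except RuntimeError:
            failures += 1
    return failures


if __name__ == "__main__":
    items = [f"string{i}" for i in range(40)]
    for retries in (0, 2, 5):
        print(f"retries={retries}: {count_failures(items, retries)} failures")
```

Each extra retry makes a failed iteration less likely, but with a randomly failing service no finite number of retries guarantees success every time.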

• If Taverna is iterating over lots of independent input data, you can improve the efficiency of the workflow by running those iterated jobs in parallel.
• Run the Retry workflow again and time how long it takes.
• Go back to the design window, right-click on the ‘sometimes_fails’ service, and select ‘configure running’.
• This time select ‘Parallel jobs’ and change the maximum number to 20.
• Run the workflow again.
• Does it run faster?

• Setting parallel jobs makes your workflows run faster, but you should be careful if you are using remote services. Sometimes they have policies on the number of concurrent jobs an individual should run (e.g. the EBI asks that you do not submit more than 25 at once).
• If you exceed this number, your service invocations may be blocked by the provider. In extreme cases, the provider may block your whole institution!
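The same trade-off can be sketched in Python with a bounded thread pool. `slow_service` is a hypothetical stand-in for a remote call, and `max_parallel_jobs` plays the role of the ‘Parallel jobs’ maximum; this illustrates the idea and is not how Taverna schedules jobs internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_service(item, delay=0.05):
    """Stand-in for a remote call that mostly waits on the network."""
    time.sleep(delay)
    return item * 2


def run_jobs(items, max_parallel_jobs):
    """Run slow_service over items with at most max_parallel_jobs in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel_jobs) as pool:
        return list(pool.map(slow_service, items))  # results keep input order


def timed(items, max_parallel_jobs):
    """Return the results together with the elapsed wall-clock time."""
    start = time.perf_counter()
    results = run_jobs(items, max_parallel_jobs)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(40))
    _, serial = timed(items, 1)
    _, parallel = timed(items, 20)  # keep this below the provider's limit
    print(f"1 job at a time: {serial:.2f}s, 20 in parallel: {parallel:.2f}s")
```

Capping the pool size is what keeps the number of concurrent requests within a provider's policy, such as the limit of 25 mentioned above.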