Stian Soiland-Reyes
myGrid, School of Computer Science, University of Manchester, UK
UKOLN DevSci: Workflow Tools, Bath, 2010-11-30



What is myGrid?
- An e-Science collaboration since 2001
- Not a grid!
- Numerous partners involved:
  - University of Manchester
  - University of Southampton
  - University of Oxford
  - EMBL-EBI
- Provides sustainable and production-quality software
  - Supported by OMII-UK, EPSRC and BBSRC
- Mixture of developers, bioinformaticians and researchers
Software | Services | Content | Skills | Community

Motivation
- Challenge: Bioinformatics
  - Large amounts of data
  - Many open questions
  - Numerous freely available public datasets and analysis tools

Huge amounts of data
(Figure labels: 100+ Genes QTL regions Microarray Genes Next Gen Sequencing 10,000+ Genes)
How do I look at all the genes systematically?

Manual approach
- Search using public web sites and databases
  Pubmed Uniprot EBI BioMart
- Copy and paste to web tools for analysis
  NCBI Blast EBI InterPro
- Further processing locally
  R Perl Python

Manual: disadvantages
- Scale of analysis task overwhelms researchers – lots of data
- User bias and premature filtering of datasets – cherry picking
- Hypothesis-driven approach to data analysis
- Constant changes in data - problems with re-analysis of data
- Implicit methodologies (hyper-linking through web pages)
- Error proliferation from any of the listed issues – notably human error

Web services and workflows
- Web services
  - Technology and standards for exposing code and data resources that can be programmatically consumed by a remote third party
  - Description of how to interact with the service, its parameters and documentation
- Workflows
  - General technique for describing and executing a process
  - Describe what you want to do, and which services to run to do it

Taverna workflows
- A set of (local and remote) services to analyze or manage data
- Nested workflows are also services
- Data links connect services, i.e. output from service A is input to services B and C
  - Describes the desired dataflow instead of process coordination
- Automatic iterations
- Can customize list handling and control links
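The dataflow idea on this slide can be illustrated with a small sketch. It is not Taverna code and does not use any Taverna API: services are stood in for by plain Python functions with a single input, and the names `run_dataflow`, `iterate`, `to_upper`, `length` and `reverse` are hypothetical. It shows only two points from the slide: data links decide what runs when, and a service written for a single value is applied element-wise when a list arrives.

```python
from typing import Any, Callable, Dict, List, Tuple

Service = Callable[[Any], Any]


def iterate(service: Service, value: Any) -> Any:
    """Apply a single-value service; if given a list, apply it to each element."""
    if isinstance(value, list):
        return [iterate(service, item) for item in value]
    return service(value)


def run_dataflow(
    services: Dict[str, Service],
    links: List[Tuple[str, str]],
    inputs: Dict[str, Any],
) -> Dict[str, Any]:
    """Run services in whatever order the data links allow.

    Each link is (source, target): the source is a workflow input or a
    service name, the target is a service name. Assumption for this sketch:
    every service has exactly one incoming link.
    """
    values: Dict[str, Any] = dict(inputs)
    pending = list(links)
    while pending:
        progressed = False
        for link in list(pending):
            source, target = link
            if source in values:  # upstream data is ready, so the target can run
                values[target] = iterate(services[target], values[source])
                pending.remove(link)
                progressed = True
        if not progressed:
            raise ValueError(f"links can never be satisfied: {pending}")
    return values


if __name__ == "__main__":
    # Hypothetical stand-ins for services A, B and C from the slide.
    services = {
        "to_upper": str.upper,             # service A
        "length": len,                     # service B
        "reverse": lambda s: s[::-1],      # service C
    }
    links = [("sequences", "to_upper"), ("to_upper", "length"), ("to_upper", "reverse")]
    print(run_dataflow(services, links, {"sequences": ["acgt", "ttga"]}))
```

No execution order is written down anywhere in the example; it follows from the links alone, which is the sense in which a workflow describes dataflow rather than process coordination.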

What types of services?
- Public/private/secured WSDL/SOAP web services
- RESTful web services
- Spreadsheet import
- Command line tools (local/ssh)
- Inline scripts (Beanshell, R)
- Java APIs
- Customizations:
  - BioMart, BioMoby / SADI
  - Soaplab
  - Grid services (Globus, EGEE gLite, caGrid)
  - … your tool (Plugin tutorial on wiki)

Which services?
- Taverna is general, can connect to standard web services for any domain
- Bioinformatics: from professional third-party organisations providing robust & open data/analysis services... to under-the-desk web services for one particular purpose, run by PhD students
- services from 130 providers – crowd sourced and quality monitored: http://biocatalogue.org/

Taverna workbench
- Graphical desktop tool
- No server installation required
- Drag-and-drop services into diagram
- Connect services, run, reconnect, rerun
- Integrates diverse set of tools

Sharing workflows
- myExperiment.org allows users to share, find, download and rate workflows
- “Facebook for the scientist”
- 3000 members, 1100 workflows

Extensible UI and engine
- Plugins can provide new “perspectives”, e.g. BioCatalogue, myExperiment
- Provide service-specific customization
  - BioMart interface replicates web site
- Adding new functionality
  - Looping, branching, dynamic service resolution
  - New service types
  - Design helpers, “Find matching service”

Taverna 3 “Next-gen”
- Under development for 2011
  - Interactive, component-centric and data-centric workflow design
  - Pre-packaged workflow components
  - Searching for workflow components from BioCatalogue and myExperiment
  - New myGrid workflow components library

Taverna command line
- Executes from a Windows/Linux/OS X shell
- Takes a predefined workflow with files as inputs and outputs
- Quick way to “productionize” a workflow
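As a rough illustration of running a workflow with files as inputs and outputs, the sketch below only assembles an argument list and does not run anything. The launcher name `executeworkflow.sh` and the `-inputfile` and `-outputdir` options are assumptions based on the Taverna 2 command line tool and should be checked against the installed version. The helper `build_taverna_command`, the port name `gene_ids` and all file names are hypothetical.

```python
from typing import Dict, List


def build_taverna_command(
    workflow: str,
    input_files: Dict[str, str],
    output_dir: str,
    launcher: str = "executeworkflow.sh",  # assumed launcher script name
) -> List[str]:
    """Assemble an argument list for a command line workflow run.

    input_files maps workflow input port names to files holding their values.
    The option names below are assumptions; verify them with the tool's help.
    """
    command = [launcher]
    for port_name, path in input_files.items():
        command += ["-inputfile", port_name, path]  # assumed option
    command += ["-outputdir", output_dir]           # assumed option
    command.append(workflow)
    return command


if __name__ == "__main__":
    # Hypothetical workflow, port and file names, for illustration only.
    argv = build_taverna_command(
        workflow="annotate_genes.t2flow",
        input_files={"gene_ids": "gene_ids.txt"},
        output_dir="results",
    )
    print(" ".join(argv))
```

A list like this could be handed to `subprocess.run` from a cron job or batch script, which is one way to read “productionize” here.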

Taverna Server
- REST/SOAP interface to execute workflows
- Client libraries for Ruby and Java
- Two demonstration web interfaces
  - Ruby
  - Java Portlets
- Future
  - Detailed execution support and control
  - Security delegation

Taverna portlet
- Example portlet implementation
- Executes workflows using Taverna Server

Ruby web interface
- Example customized web interface
- Uses Ruby gem t2-server

Taverna on the cloud
- Use case: SNP analysis and annotation of genomes sequenced from breeds of cows in Africa – why are some of them resistant to X?
  - Amazon EC2 with Taverna Server and local services
  - Custom (built-in-a-week) Ruby on Rails web interface
  - Runs through 31 chromosomes in 6.5 hours using 10 instances - $26
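The figures on this slide can be unpacked with a little arithmetic. The sketch assumes the $26 covers only the compute time of the 10 instances, and that all instances ran for the full 6.5 hours. Neither assumption is stated on the slide, so the derived rates are indicative only. The function name `cloud_run_summary` is hypothetical.

```python
from typing import Dict


def cloud_run_summary(
    chromosomes: int,
    wall_clock_hours: float,
    instances: int,
    total_cost_usd: float,
) -> Dict[str, float]:
    """Derive simple rates from the run figures quoted on the slide."""
    # Assumes every instance was busy for the whole wall-clock time.
    instance_hours = instances * wall_clock_hours
    return {
        "instance_hours": instance_hours,
        "usd_per_instance_hour": total_cost_usd / instance_hours,
        "usd_per_chromosome": total_cost_usd / chromosomes,
        "instance_hours_per_chromosome": instance_hours / chromosomes,
    }


if __name__ == "__main__":
    summary = cloud_run_summary(
        chromosomes=31, wall_clock_hours=6.5, instances=10, total_cost_usd=26.0
    )
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the run used 65 instance-hours, which works out to about $0.40 per instance-hour and about $0.84 per chromosome.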

Open source, open development
- The Taverna suite of tools is all open source and free to use
- Large user community, active mailing lists
- Lead developers: myGrid in Manchester
- Contributors from across the world
- PAL programme
- myGrid provides training, tutorials and documentation

Acknowledgements

More information