ISWC 2005, Galway. Seven Bottlenecks to Workflow Reuse and Repurposing. Antoon Goderis, Ulrike Sattler, Phillip Lord, Carole Goble. University of Manchester.


Take home message
New problem
– Workflow reuse and repurposing is happening, how do we make it scale?
Data: Survey of 6 e-Science middleware projects
Requirements analysis: 7 bottlenecks
– Creating a pool of process knowledge
– Accessing this pool

e-Science
Support sharing and collaboratories in science
The world of distributed web services
– A boom in services: e.g bio services in the myGrid project
Pulled together as in silico experiments
– Scientist-friendly workflow languages
– Hard to build (>1 year!)
– A boom in workflows? 100 workflows in myGrid, up to 50 services

Evolving e-Science to a Web of Science?
In silico experiments as commodities and know-how
Share, reuse, repurpose – authoring time, quality and provenance collection
[Figure labels: Manchester, CS; Manchester, Biology; Newcastle, CS]

[Figure: workflow reuse and repurposing cycle. Roles: Scientists; Scientists & developers; 3rd party annotation providers. Steps: Discover existing work; Edit workflow (repurposing actions); Try out workflow; Register and annotate workflow and new services for reuse; Deploy workflow; Workflow by example; Maintain reuse/repurpose history.]
Wroe, Goble, Goderis, Lord et al. Recycling workflows and services through discovery and reuse. CCPE 2005

Analyze This

Analyze This x #scientists x #workflows x #versions x #runs

Workflow and Web service compared
Workflow: describes process; different workflow languages (BPEL, Scufl etc.); orchestration/choreography of Web and web services; executable with workflow enactor; can be published as a web or Web service
Web service: SOAP/WSDL interface; participant in a workflow; executable

Workflow reuse | Web service reuse
Reuse of editable processes | Reuse of encapsulated processes
Repurpose / build on other people’s work | Incorporate other people’s work
Hackable; change data/control flow | Parametrisable operations
Discovery based on data/control flow | Discovery based on WSDL operations
Measures of aggregated task similarity and flow similarity | Measures of task similarity

Repurposing, discovery and composition
Discovery
– The process of finding, ranking and selecting existing resources
Composition
– The process of combining resources into a new working assembly
– (auto-) discovery + (auto-) integration
Repurposing
– Auto discovery + manual integration
– Need techniques for composition-oriented discovery
  Discovery supporting integration through rankings
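
The definition of discovery given here (finding, ranking and selecting existing resources) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' method: the `Workflow` record, the keyword-overlap score and the sample catalogue entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """Hypothetical catalogue entry: a name plus descriptive keywords."""
    name: str
    keywords: frozenset


def discover(catalogue, query, top_k=1):
    """Find, rank and select workflows for a keyword query.

    The score is plain keyword overlap, an assumption made for this sketch.
    """
    query = frozenset(query)
    # Find: keep only workflows sharing at least one keyword with the query.
    found = [w for w in catalogue if w.keywords & query]
    # Rank: most overlapping keywords first, ties broken by name.
    ranked = sorted(found, key=lambda w: (-len(w.keywords & query), w.name))
    # Select: return the top candidates; in repurposing, a person would
    # then integrate them manually.
    return ranked[:top_k]


# Hypothetical catalogue, for illustration only.
catalogue = [
    Workflow("align_sequences", frozenset({"dna", "alignment"})),
    Workflow("annotate_genes", frozenset({"dna", "annotation", "gene"})),
    Workflow("plot_results", frozenset({"chart"})),
]
best = discover(catalogue, {"dna", "gene"})
```

In this reading, repurposing stops after `discover` and leaves integration to the scientist, whereas composition would also automate the integration step.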

A field report of six projects
– reuse by collaborators
– personal reuse (versioning)
– 10 complex workflows
– reuse of distributed execution models
– intranet exchanges within large pharmas
– 150 Matlab functions, 10 scripts
– reuse of function combinations
No support for comparing workflows!
No third party reuse!

7 bottlenecks to reuse & repurposing
Service availability; Workflow interoperability; Workflow rigidity; Discovery model; Process KA; IP rights; Ranking
[Diagram marker: We are here]

Step 1: Collect as many workflows as possible
[Diagram: Ranking; Service availability; Workflow interoperability; Workflow rigidity; Discovery model; Process KA; IP rights]

Step 2: Make this collection usable
[Diagram: Ranking; Service availability; Workflow interoperability; Workflow rigidity; Discovery model; Process KA; IP rights]

Wanted: technology providers
e-Science community
Semantic Web community?
[Diagram: Ranking; Service availability; Workflow interoperability; Workflow rigidity; Discovery model; Process KA; IP rights]

The bottlenecks, in more detail
1. Service availability
– web services: Kepler actors, myGrid processors, InforSense services
– Local services: Web enable, encode, repository
2. Intellectual property rights
– Anonymization; journal policies
3. Workflow rigidity
– Evolution and adaptation: parametrisation

4. The nice thing about workflow standards…
Workflow languages abound
Out of 6 projects, 5 do not use BPEL
Behavioural semantics left implicit, as a feature
Repurposing in case of multiple workflow systems
– outside system boundaries
– and across
[Figure labels: Benesh notation; Labanotation]

4. The nice thing about workflow standards…
Bring out the behavioural semantics
– Comparing 3 projects through workflow patterns
  E.g. simple merge
– Scientific workflows use functional programming patterns
– How do these combine into different distributed execution models?
– WSMO/SWSI/OWL-S?
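
The simple merge pattern named here can be illustrated in Python. In the workflow patterns literature, a simple merge joins two or more branches into one subsequent branch without synchronisation, so each arrival on an incoming branch triggers the next task once. The sketch below is one possible reading under that assumption; `simple_merge`, `next_task` and the branch contents are hypothetical names and data, not taken from any of the surveyed systems.

```python
def simple_merge(*branches):
    """Pass every item from any incoming branch on to a single output.

    No synchronisation: items are not paired up or waited for across
    branches. Branches are drained one after another, which assumes that
    only one branch is active per case.
    """
    for branch in branches:
        for item in branch:
            yield item


def next_task(item):
    """Hypothetical downstream task, run once per merged item."""
    return f"processed {item}"


# Three incoming branches; the middle one never fires.
results = [next_task(x) for x in simple_merge(["a1"], [], ["c1"])]
```

A synchronising join would instead wait for all three branches before running `next_task`, which is exactly the kind of behavioural difference that stays implicit when a workflow language does not state its semantics.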

5. What belongs in the discovery model?
How to retrieve existing scientific workflows?
– Scientists & developers facing distributed programs
For scientists? Data flow discovery, in jargon, largely abstracting from control
[Figure residue: ACAAGATGCCATTGT; = ? ?]
For developers? Control flow discovery, largely abstracting from data
– Workflow patterns, Kepler distributed execution models
  Process networks, process algebra, Petri nets…

5. What belongs in the discovery model?
For scientists
– WSMO Capability and OWL-S Profile clearly not intended for data flow-based queries
– OWL DL: A-Box based workflow queries [Goderis+DL’05]
For developers
– Workflow patterns, Kepler distributed execution models
  Pattern example based retrieval
  An early table of combined execution models
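
A data flow-based query, as opposed to a profile or capability lookup, asks whether a workflow contains a given fragment of connected services. The Python sketch below shows the idea with plain set containment over data links. It is a deliberately simplified stand-in and not the OWL DL A-Box approach cited on the slide; the workflow names, service names and the `matches_fragment` helper are hypothetical.

```python
def matches_fragment(workflow_links, fragment_links):
    """Return True if every data link of the fragment occurs in the workflow.

    A data link is a (producer_service, consumer_service) pair. Control
    flow is ignored on purpose, mirroring "largely abstracting from control".
    """
    return set(fragment_links) <= set(workflow_links)


# Hypothetical workflow repository: name -> set of data links.
workflows = {
    "wf_blast": {("fetch_sequence", "blast"), ("blast", "parse_hits")},
    "wf_align": {("fetch_sequence", "align"), ("align", "render_tree")},
}

# Query: "which workflows feed a fetched sequence into blast?"
query = {("fetch_sequence", "blast")}
hits = sorted(
    name for name, links in workflows.items() if matches_fragment(links, query)
)
```

Exact containment is the strictest possible match. A description logic or pattern-based model would also allow matching on service classes rather than on exact service names.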

6. New challenges in Knowledge Acquisition
Who does the annotation?
What should be in the annotation?
– Workflow fragments
  Task aggregation/prediction
  “Service decomposition”
– The things that went wrong!

6. New challenges in Knowledge Acquisition
Who does the annotation?
– Updated service ontology learning and automated service annotation techniques
What should be in the annotation?
– Workflow fragments
  “Service decomposition”
– Cutting up service webs
  » Social network analysis (services as users!)
– The things that went wrong
  Web site usability mining

7. Ranking workflow relevance
Repurposing → measuring integration effort
Ranking data flow (in jargon)
Structural edit distance
E.g. services to remove/add/replace to equal 2 workflows
For OWL workflow ontology, need abduction or off-line processing
Ranking control flow
Relationship between control flow constructs
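
The structural edit distance example (services to remove, add or replace to make two workflows equal) can be sketched as follows. This is a minimal illustration that treats a workflow as a bare set of service names and ignores the links between them, which is a simplifying assumption; the function name and the sample workflows are hypothetical.

```python
def service_edit_distance(source, target):
    """Count the edits needed to turn one set of services into another.

    One surplus service paired with one missing service counts as a single
    replacement; whatever is left over counts as a removal or an addition.
    """
    source, target = set(source), set(target)
    surplus = source - target   # services only in the source workflow
    missing = target - source   # services only in the target workflow
    replace = min(len(surplus), len(missing))
    remove = len(surplus) - replace
    add = len(missing) - replace
    return {"replace": replace, "remove": remove, "add": add,
            "total": replace + remove + add}


# Hypothetical workflows, for illustration only.
wf_a = {"fetch", "blast", "parse"}
wf_b = {"fetch", "align", "parse", "plot"}
effort = service_edit_distance(wf_a, wf_b)
```

A lower `total` suggests less integration effort, so candidate workflows could be ranked by it in ascending order. A fuller measure would also compare data links and control flow constructs.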

Take home message
Problem: Workflow reuse and repurposing is happening, how do we make it scale?
Data: Survey of 6 e-Science middleware projects
Requirements analysis: 7 bottlenecks
– Creating a pool of process knowledge
  Workflow interoperability
– Accessing this pool of knowledge
  Workflow discovery, KA and ranking

Acknowledgements
This work is supported by the UK e-Science programme EPSRC GR/ R
The authors would like to acknowledge the myGrid team. Hannah Tipney developed the Williams’ syndrome workflow and is supported by The Wellcome Foundation (G/R: ). We thank the survey interviewees for their contribution: Chris Wroe, Mark Greenwood and Peter Li (myGrid), Ilkay Altintas (Kepler), Vasa Curcin (InforSense), Ian Wang (Triana), Colin Puleston (Geodise) and Ben Butchart (Sedna). Sean Bechhofer provided useful comments on the draft.