“Workflow” in Data Access and Integration: An OGSA-DAI/DAIS Perspective
Mario Antonioletti, EPCC



e-Science Workflow Services - Talk Overview
- Background: OGSA-DAI and DAIS
- Motivation and Definitions
- Hierarchies of Service Coordination
- Conclusions

OGSA-DAI and DAIS
- GGF DAIS WG
  - Database Access and Integration Services
  - Attempting to standardise interfaces based on OGSI
- OGSA-DAI
  - Aims to provide an implementation of DAIS
  - Serves the UK e-Science community
- OGSA-DAI and DAIS
  - Currently not aligned
- Data service interface in OGSA-DAI is coarse grained
  - Based on an earlier version of DAIS
- Data service interface in DAIS is currently fine grained
  - Scope for more coarse-grained interfaces
  - OGSA-DAI will realign with DAIS once the latter stabilizes

OGSA-DAI Project Partners
Powered by … [partner logos not preserved in the transcript]

Simple Data Service Scenario
[Diagram: Client, Data Service, Data Resource]
1. Provides access to a data resource.
2. May provide integration of several data resources.

Some Definitions
- Data Resource
  - An object that can source/sink data
  - Currently databases are in scope
    - Files and file systems may come into scope
- Data Services
  - Grid services
  - Provide a common interface to data resources
  - Expose some capabilities of a data resource
    - SQL queries, XPath, BinX, …
  - Can also provide additional capabilities
    - Transformations, third-party data delivery, etc.

Motivation
- Want common interfaces for:
  - Data access
  - Data integration
- Requests to a data service may produce lots of data
  - Want to minimise data movement
- Hence encapsulate interactions with the service
  - Serialise multiple interactions into one interaction
  - Abstract each interaction into an “activity”
  - Data flows between activities
  - Use a document mechanism to describe this
- DAIS and OGSA-DAI
  - Concerned with data flow
  - Currently do not have control constructs
    - No looping, conditionals, splits, joins, …

Service Coordination Patterns
[Diagram: Client, Data Service, Data Service, Service]
1. Coordination of activities performed at one Data Service.
2. Client choreographs a set of services to work together … or a service may orchestrate on behalf of the client.
3. Orchestration of services using a document directed to one service.
4. Possibly interface with standard workflow languages, e.g. BPEL4WS, WSCI, …

Coordination Hierarchies
- Service coordination may take place:
  - Intra service
    - Document based
  - Inter service – application driven
    - Choreographed/orchestrated by a client or service
  - Inter service – document driven
    - Orchestration
    - Ideally would look the same as the intra-service document-based interface
  - Combined with other workflow languages

Intra Service Processing
- Service processing described by a document
- Possible activities (OGSA-DAI perspective):
  - Statement
    - SQL query, XPath query
  - Delivery
    - Input data from a third party
    - Output data to a third party
    - Deliver data in the response
  - Transformations
    - XSL Transformations, compression
- OGSA-DAI has produced a framework for this

Simple Example: no data flow
[Diagram: sqlQueryStatement and DeliverToURL activities, not connected; the statement is "select * from myTable where id=10"]

Simple Example: with data flow
[Diagram: sqlQueryStatement and DeliverToURL activities, connected by a data flow; the statement is "select * from myTable where id=10"]
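
As a rough illustration of two activities joined by a data flow, the Python sketch below enacts a tiny in-memory stand-in for a perform document. It is not OGSA-DAI code: the dictionary layout, the `enact` function, the `outbox` stand-in for delivery, the table contents and the example URL are all assumptions made for this sketch. Only the activity names and the query come from the slide.

```python
import sqlite3

# Hypothetical in-memory "perform document": two activities joined by a named stream.
PERFORM = [
    {"activity": "sqlQueryStatement",
     "expression": "select * from myTable where id=10",
     "output": "rows"},
    {"activity": "DeliverToURL",
     "input": "rows",
     "url": "ftp://example.org/results"},  # made-up destination
]


def enact(perform, conn, outbox):
    """Run each activity in order, passing data through named streams."""
    streams = {}
    for step in perform:
        if step["activity"] == "sqlQueryStatement":
            streams[step["output"]] = conn.execute(step["expression"]).fetchall()
        elif step["activity"] == "DeliverToURL":
            # Stand-in for real delivery: record the payload instead of sending it.
            outbox[step["url"]] = streams[step["input"]]
        else:
            raise ValueError("unknown activity: " + step["activity"])
    return streams


def demo():
    """Build a small test table, enact the document, and return what was delivered."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table myTable (id integer, name text)")
    conn.executemany("insert into myTable values (?, ?)", [(9, "ann"), (10, "bob")])
    outbox = {}
    enact(PERFORM, conn, outbox)
    return outbox
```

The client sends one request and the query result never travels back to it; the data moves only between the two activities, which is the point of the "with data flow" variant.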

The Perform Document
<gridDataServicePerform xmlns="…" xmlns:xsi="…" xsi:schemaLocation="…">
[Remaining element markup not preserved in the transcript]
Documentation text: This example performs a simple select statement to retrieve one row from the test database. The results are delivered within the response document.
Statement: select * from littleblackbook where id=10

Predefined Building Blocks
- sqlQueryStatement, sqlStoredProcedure, sqlUpdateStatement, sqlBulkLoadRowset
- xPathStatement, xUpdateStatement, xQueryStatement
- xmlResourceManagement, xmlCollectionManagement, relationalResourceManager
- gzipCompression, zipArchive, xslTransform
- inputStream, outputStream
- DeliverFromURL, DeliverToURL, DeliverToGFTP, DeliverFromGFTP, DeliverToStream, DeliverFromGDT, DeliverToGDT

Activities: positives
- Simple sequence pattern
  - Data-flow
- Avoid multiple message exchanges
- Minimise data movement
- Extensible
  - XML Schema excerpt gives syntax
  - Associate an implementation with activity
  - Done at configuration
- Allows optimisation
  - Enactment engine can optimise interaction
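
The extensibility point (an implementation is associated with an activity name at configuration) can be sketched as a simple registry. This is an illustrative Python sketch, not how OGSA-DAI is implemented: the `REGISTRY` dictionary, the `register` decorator, `run_sequence` and the `upperCase` activity are assumptions made for this example. Only the name `gzipCompression` comes from the slides.

```python
import gzip

# Hypothetical registry: activity name -> implementation, filled at configuration time.
REGISTRY = {}


def register(name):
    """Return a decorator that binds an implementation to an activity name."""
    def bind(func):
        REGISTRY[name] = func
        return func
    return bind


@register("gzipCompression")
def gzip_compression(data: bytes) -> bytes:
    """Compress the incoming byte stream with gzip."""
    return gzip.compress(data)


@register("upperCase")  # made-up extension activity, added without touching the engine
def upper_case(data: bytes) -> bytes:
    """Upper-case the incoming byte stream."""
    return data.upper()


def run_sequence(names, data):
    """Apply the named activities in order, feeding each output to the next input."""
    for name in names:
        data = REGISTRY[name](data)
    return data
```

Adding an activity means registering one more function; the code that walks the sequence does not change.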

Activities: negatives
- Incomplete syntax
  - Activity inputs and outputs are not typed
  - No typing of data streams
  - Possible issue in coming up with a sensible document
- Activity implementation & XML schema loosely coupled
  - Keeping activity and implementation in sync
- Semantics are not specified
- Puts workload on the server
  - Workloads on the server may need to be managed
- Activities not exposed at the interface level
  - This may change in line with DAIS
- Perform document factored out from DAIS base specs
  - Standardisation to become a DAIS informational document
  - Scope may be bigger than DAIS
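
Even without typed inputs and outputs, a document can be given a basic sanity check before enactment: every input should name a stream that an earlier activity produced. The Python sketch below shows that one check only. The `inputs`/`outputs` dictionary layout and the `check_dataflow` function are assumptions for illustration, not part of OGSA-DAI or DAIS, and the check does nothing about the missing types themselves.

```python
def check_dataflow(perform):
    """Return a list of problems: inputs that no earlier activity produced."""
    produced = set()
    problems = []
    for index, step in enumerate(perform):
        for name in step.get("inputs", []):
            if name not in produced:
                problems.append(
                    f"step {index} ({step['activity']}): unknown input '{name}'"
                )
        produced.update(step.get("outputs", []))
    return problems


# A well-wired document: the delivery consumes what the query produces.
GOOD = [
    {"activity": "sqlQueryStatement", "outputs": ["rows"]},
    {"activity": "DeliverToURL", "inputs": ["rows"]},
]

# A badly wired document: the delivery refers to a stream nobody produces.
BAD = [
    {"activity": "DeliverToURL", "inputs": ["rows"]},
]
```

`check_dataflow(GOOD)` returns an empty list, while `check_dataflow(BAD)` reports the dangling `rows` input.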

Inter Service Application Defined "Workflow"
- Services stitched together by an application
  - Could be a client
    - Use the OGSA-DAI GridDataTransport (GDT) portType
  - Could be another service
    - Distributed Query Processing (DQP)
- Services configured separately
  - Each performs its part in the workflow

Client Driven Scenario (aka poor man's data integration)
[Diagram: Client, Data Service … …, GDT]
Client creates Data Services.

Service Driven Scenario
[Diagram labels: Client; Query planning, compilation, scheduling, evaluation, partitioning; GDQS; GQES; Evaluate sub-queries; Distributed Query Processing]

More Complex DQP Scenario
[Diagram not preserved in the transcript]

Application Driven "Workflow"
- Labour intensive
  - Client driven (service choreography)
    - Restricted to small numbers of services
  - Need tooling
  - Even then this is best done through other means
  - Service driven (service orchestration)
    - DQP hides details
    - There may be other examples …
- Need to explore this space further
  - Can probably accommodate these patterns in an existing workflow language
- For more general data integration, need to:
  - Describe more sophisticated behaviour

Inter Service Document Coordination
- Currently evolving
- Document describes:
  - Sequence of operations that may span multiple services
- Single document includes enough information to:
  - Run an expression on a source data service
  - Deliver the results to a target data service
  - Run an expression on the target data service
- Informational document to be presented at GGF10
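
The three steps a single document has to describe (run on the source, deliver to the target, run on the target) can be sketched in Python with two in-memory SQLite databases standing in for the two data services. Everything here is an assumption for illustration: the `DOCUMENT` layout, the `make_service` and `coordinate` functions, and the `people` table are hypothetical and do not reflect the DAIS informational document.

```python
import sqlite3


def make_service(rows=()):
    """Create a stand-in data service: an in-memory database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table people (id integer, name text)")
    conn.executemany("insert into people values (?, ?)", rows)
    return conn


# Hypothetical single document covering both services.
DOCUMENT = {
    "source": "select id, name from people where id >= 2",  # run on the source
    "deliver": "insert into people values (?, ?)",           # load into the target
    "target": "select count(*) from people",                 # run on the target
}


def coordinate(doc, source, target):
    """Run the source expression, deliver its rows to the target, then query the target."""
    rows = source.execute(doc["source"]).fetchall()
    target.executemany(doc["deliver"], rows)
    return target.execute(doc["target"]).fetchone()[0]
```

In this sketch a single function plays the orchestrating role; in the document-driven pattern the slide describes, the document would be directed to one service, which would act on behalf of the client.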

A Dataset Example
[Diagram labels: Client; Data Service; Request; DataRequest.xsd …; RemoteRequiredTable; DataAccessRecipe.xsd … …; Data Service]

Document Driven "Workflow"
- Work in this area is tentative
  - No implementations as yet
- OGSA-DAI needs to see how it matures
  - Shows versatility
- Carries over some of the OGSA-DAI activity framework
  - Focused on data
- Can track provenance in the dataSet
- Needs to be positioned against general workflow languages

Traditional Workflow
- OGSA-DAI has not explored this space … yet
  - May need such a framework to facilitate data integration
- Traditionally, workflow:
  - Revolves around the execution of atomic activities
  - Uses a processing model, e.g. WfMC based
    - Akin to how people talk about service orchestration
- Want to use existing frameworks as far as possible
  - OGSA-DAI does not want to define its own workflow
  - DAIS may come up with something
- Clearly:
  - Activity model can be used to implement a workflow
  - Collecting use cases

Workflow Issues
- OGSA-DAI needs to play to see what works
- Standards still evolving
  - IP rights:
    - BPEL4WS
      - Royalty-free … ?
    - WSCI
      - Royalty-free
- Need workflow engines
- Tooling to construct workflow
  - Ptolemy II … Triana … ?

Summary & Conclusions
- Base standards in a state of flux
  - DAIS not settled down yet
    - If you don't like what you see, get involved and change it
  - Document-based interface needs to be re-worked
- OGSA-DAI implemented simple "workflow" patterns
  - Successful for data access
  - Shied away from real workflow
  - Should try to use emerging standards if possible
- Data integration will require workflow patterns
  - Need to examine use cases
- Positioning of OGSA-DAI
  - Want it to be the leaves of your complex workflow graphs
  - Wrap your data sources and sinks
- Try OGSA-DAI and give feedback!

Further information
[URLs not preserved in the transcript]
- The OGSA-DAI Project Site
- The DAIS-WG site
- OGSA-DAI Users Mailing list
  - General discussion on grid DAI matters
- Formal support for OGSA-DAI releases
- OGSA-DAI training courses