Enabling Grids for E-sciencE
Astronomical data processing workflows on a service-oriented Grid architecture
Valeria Manna, INAF - SI
The 17th Global Grid Forum - GGF17, Tokyo, Japan, May 10-12, 2006
Motivation and Objectives
Working Group: Claudio Vuerli, Giuliano Taffoni, Valeria Manna, Andrea Barisani, Fabio Pasian
– Analysis of the problems involved in integrating astronomical applications into a Grid environment
– Description of astronomical task sequences with workflow languages, in order to process complex compositions of astronomical applications
Astrophysics and distributed computing
Computational problems:
– Theoretical codes
– Computation and analysis of many TBs of images
– Database of catalogues of the detected elements
– Data-mining tools
– Data source and organization interaction
– Geographically distributed resources (databases, telescopes, storage, etc.)
Distributed computing is required to provide:
– User virtualisation
– Resource virtualisation
GRID
Astrophysics Application Tasks:
– Database Access
– Data Reduction
– Pure Computational
– Instrument Monitoring
Workload Management
The task sequence of the astronomical application is managed by a Workload Management System that maps each task to a job.
The user has to:
– Submit a job
– Write scripts to monitor the job status
– Write scripts to link the tasks together
– Retrieve the I/O data
Hence the need for a Workflow Management System
Workflow Management
A workflow language describes the execution flow of pipeline tasks:
– Query astronomical catalogues
– Transfer data
– Synchronize jobs
– Analyse the resulting data
Mapping: task → service
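The task → service mapping can be sketched in Python as follows. This is only an illustration under stated assumptions: the four service functions, the `SERVICES` table and `run_workflow` are hypothetical names, and the functions return canned data instead of calling real catalogue or Grid services.

```python
from typing import Callable, Dict, List, Optional

# Data passed from one pipeline task to the next.
Context = Dict[str, object]


# Hypothetical stand-ins for real services (assumption: canned results only).
def query_catalogue(ctx: Context) -> Context:
    return {**ctx, "sources": ["src-1", "src-2"]}


def transfer_data(ctx: Context) -> Context:
    staged = [f"/scratch/{name}.fits" for name in ctx["sources"]]
    return {**ctx, "staged": staged}


def synchronize_jobs(ctx: Context) -> Context:
    # Pretend every staged file was processed by one finished job.
    return {**ctx, "jobs_done": len(ctx["staged"])}


def analyse(ctx: Context) -> Context:
    return {**ctx, "report": f"analysed {ctx['jobs_done']} files"}


# Mapping task -> service: the workflow names tasks, the table binds them.
SERVICES: Dict[str, Callable[[Context], Context]] = {
    "query_catalogue": query_catalogue,
    "transfer_data": transfer_data,
    "synchronize_jobs": synchronize_jobs,
    "analyse": analyse,
}

# The execution flow is data, not a hand-written script.
WORKFLOW: List[str] = [
    "query_catalogue",
    "transfer_data",
    "synchronize_jobs",
    "analyse",
]


def run_workflow(
    workflow: List[str],
    services: Dict[str, Callable[[Context], Context]],
    ctx: Optional[Context] = None,
) -> Context:
    """Run each task in order, feeding the output of one into the next."""
    current: Context = dict(ctx or {})
    for task in workflow:
        current = services[task](current)
    return current


if __name__ == "__main__":
    print(run_workflow(WORKFLOW, SERVICES)["report"])
```

Because the flow is a plain list of task names, a task can be rebound to a different service by changing one entry in `SERVICES`, without touching the flow description.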
Web Services Composition
The composition of Web Services (WS) consists of providing logic around a set of interactions between the composition and the WS that participate in it. These interactions are simply invocations of the operations offered by the services involved.
One approach to providing the control and data logic is to use a workflow that specifies:
– the execution order of operations from a collection of WS
– the data shared among the composed WS
– the partners involved
– how the partners are involved in the flow process
Service orchestration: web services interact with each other at the message level, including the flow logic and execution order of the interactions. In orchestration, the process is always controlled from the perspective of one of the parties.
Service choreography: typically associated with the public message exchanges that occur between multiple web services, rather than with a specific business process executed by a single party. Each party involved in the process describes the part it plays in the interaction.
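The difference between the two styles can be shown with a small Python sketch. Everything here is an assumption made for illustration: `reduce_service`, `catalogue_service`, the `MessageBus` class and the topic names are hypothetical and stand in for real web services and message exchanges.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


# Two hypothetical services (assumption: pure functions instead of WS calls).
def reduce_service(image: str) -> str:
    return image + ".reduced"


def catalogue_service(reduced: str) -> str:
    return f"catalogue({reduced})"


def orchestrate(image: str) -> str:
    """Orchestration: one party owns the flow logic and the call order."""
    reduced = reduce_service(image)
    return catalogue_service(reduced)


class MessageBus:
    """Minimal in-memory message exchange used for the choreography case."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[["MessageBus", str], None]]] = {}
        self._queue: Deque[Tuple[str, str]] = deque()
        self.delivered: List[Tuple[str, str]] = []

    def subscribe(self, topic: str, handler: Callable[["MessageBus", str], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: str) -> None:
        self._queue.append((topic, payload))

    def run(self) -> None:
        # Deliver messages in FIFO order until no participant publishes more.
        while self._queue:
            topic, payload = self._queue.popleft()
            self.delivered.append((topic, payload))
            for handler in self._handlers.get(topic, []):
                handler(self, payload)


# Choreography: each participant only describes its own part of the exchange.
def reducer(bus: MessageBus, image: str) -> None:
    bus.publish("image.reduced", reduce_service(image))


def cataloguer(bus: MessageBus, reduced: str) -> None:
    bus.publish("catalogue.ready", catalogue_service(reduced))


def choreograph(image: str) -> str:
    bus = MessageBus()
    bus.subscribe("image.ready", reducer)
    bus.subscribe("image.reduced", cataloguer)
    bus.publish("image.ready", image)
    bus.run()
    # The payload of the last delivered message is the final result.
    return bus.delivered[-1][1]
```

Both functions produce the same result for the same input; what changes is where the flow logic lives. In `orchestrate` it sits in one place, while in `choreograph` it is spread across the subscriptions of the participants.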
A workflow language: BPEL
BPEL (Business Process Execution Language) is a workflow-based composition language for Web Services
– Standard
– Open Source
– Supports Service Orchestration
Features:
– Interaction: define WSDL port types for each interface used in the workflow definition
– Basic Activities: invoke, reply and receive, which allow interaction with the applications being composed
– Structured Activities: conditional execution and iteration / recursion of a sequence of activities, to manage the process flow
Likely to be adopted as EGEE standard
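How basic and structured activities combine can be modelled with a few Python classes. This is not BPEL syntax and not the API of any BPEL engine: the class names `Invoke`, `Sequence`, `While` and `Switch` simply borrow the activity names, and the frame-processing operations are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Process variables shared by all activities of one process instance.
Variables = Dict[str, object]


class Activity:
    def run(self, variables: Variables) -> None:
        raise NotImplementedError


@dataclass
class Invoke(Activity):
    """Basic activity: call one operation of a composed application."""
    operation: Callable[[Variables], None]

    def run(self, variables: Variables) -> None:
        self.operation(variables)


@dataclass
class Sequence(Activity):
    """Structured activity: run the child activities in order."""
    activities: List[Activity]

    def run(self, variables: Variables) -> None:
        for activity in self.activities:
            activity.run(variables)


@dataclass
class While(Activity):
    """Structured activity: repeat the body while the condition holds."""
    condition: Callable[[Variables], bool]
    body: Activity

    def run(self, variables: Variables) -> None:
        while self.condition(variables):
            self.body.run(variables)


@dataclass
class Switch(Activity):
    """Structured activity: conditional execution of one of two branches."""
    condition: Callable[[Variables], bool]
    then: Activity
    otherwise: Activity

    def run(self, variables: Variables) -> None:
        branch = self.then if self.condition(variables) else self.otherwise
        branch.run(variables)


# Hypothetical operations standing in for invoked services.
def reduce_next(v: Variables) -> None:
    v["reduced"].append(v["pending"].pop(0) + ".red")


def publish(v: Variables) -> None:
    v["status"] = f"published {len(v['reduced'])} frames"


def skip(v: Variables) -> None:
    v["status"] = "nothing to publish"


process = Sequence([
    While(lambda v: len(v["pending"]) > 0, Invoke(reduce_next)),
    Switch(lambda v: len(v["reduced"]) > 0, Invoke(publish), Invoke(skip)),
])


def run_process(frames: List[str]) -> Variables:
    variables: Variables = {"pending": list(frames), "reduced": []}
    process.run(variables)
    return variables
```

Calling `run_process(["f1", "f2", "f3"])` reduces the frames one by one inside the loop and then takes the publishing branch; with an empty list the loop body never runs and the other branch is taken.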
BPEL: process flow
Hypothetical Scenario using BPEL
Diagram labels: WF, BPEL ENGINE, CREAM, EGEE SERVICE, WEB SERVICES, GRID/WEB SERVICES, ASTROGRID SERVICES
WSIF
Problem
– Currently the LCG middleware is not Web Service oriented; in gLite (the middleware of the EGEE project) the WM and the other services are expected to offer a Web Service interface. In this scenario the use of BPEL is not feasible.
– In a BPEL workflow it is very useful to provide a set of services that are always available.
Solution
– Use the BPEL partner abstraction: partners are the services that a workflow needs to use, and they are typically mapped to Web (or Grid) services.
– It is possible to map some BPEL partners to locally implemented services, as long as a WSDL port type for the partner is provided.
– WSIF (Web Services Invocation Framework) is a simple Java API for invoking Web services; it makes it possible to interact with abstract representations of Web services through their WSDL descriptions.
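The idea of binding one abstract port type to different implementations can be sketched as follows. WSIF itself is a Java API and this Python code does not reproduce it; `JobPort`, `LocalCommandBinding`, `RemoteStubBinding`, `fake_cli` and the `reduce.jdl` file name are all hypothetical, and the local binding talks to a fake command runner rather than to real middleware commands.

```python
from typing import Callable, List, Protocol, Tuple


class JobPort(Protocol):
    """Abstract port type: the only thing the workflow knows about a partner."""

    def submit(self, description: str) -> str: ...

    def status(self, job_id: str) -> str: ...


class LocalCommandBinding:
    """Binds the port type to a locally implemented service.

    `run_command` is an injected callable (assumption) that would wrap the
    command-line tools of middleware that offers no Web Service interface.
    """

    def __init__(self, run_command: Callable[[List[str]], str]) -> None:
        self._run = run_command

    def submit(self, description: str) -> str:
        return self._run(["submit", description]).strip()

    def status(self, job_id: str) -> str:
        return self._run(["status", job_id]).strip()


class RemoteStubBinding:
    """Stands in for a binding to a real Web (or Grid) service."""

    def __init__(self) -> None:
        self._count = 0

    def submit(self, description: str) -> str:
        self._count += 1
        return f"remote-{self._count}"

    def status(self, job_id: str) -> str:
        return "Running"


def fake_cli(args: List[str]) -> str:
    # Canned replies that imitate the text output of a command-line client.
    if args[0] == "submit":
        return "job-1\n"
    if args[0] == "status":
        return "Done\n"
    raise ValueError(f"unknown command: {args[0]}")


def submit_and_check(port: JobPort, description: str) -> Tuple[str, str]:
    """Workflow step written against the port type, not against a binding."""
    job_id = port.submit(description)
    return job_id, port.status(job_id)


if __name__ == "__main__":
    print(submit_and_check(LocalCommandBinding(fake_cli), "reduce.jdl"))
    print(submit_and_check(RemoteStubBinding(), "reduce.jdl"))
```

The workflow step is unchanged when the local binding is later replaced by a binding to a Web Service, which is the property that makes the partner abstraction useful during a middleware transition.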
Current Scenario with WSIF
Diagram labels: WF, BPEL ENGINE, WSIF, LCG, GLOBUS 2x, WEB SERVICES
Conclusions
– Astronomical applications described using a workflow language
– A solution for integrating astronomical task sequences in the gLite middleware
– Use of Service Oriented technology
– Use of Grid Service composition
– A solution integrated in the current middleware and in its future development
The End
Thank you for your attention