Protocols in Workflow
Membership
First session
–Chair: John Brooke
–Note Takers: Raj Bose, Mario Antonioletti
–Geoff Lusted
–Michael Burns
Second session (above plus)
–Peter Furniss
–Jon Blower
–Denise Ecklund
–Martin Craig
Scope
What do we mean by a protocol?
–SOAP is not sufficient for using large data sets within a Web/Grid services context
–Concentrate on scientific workflows
–Protocols used to communicate data between activities, e.g. GridFTP, HTTP, FTP, sockets …
–Context: the context determines some of the detail
Data sources/sinks
Databases
–accessing
–querying
–transferring: data rates, state (of transmission)
Streaming
–connection (possibly using sockets)
–data rates
Files & Formats
–transferring
–state
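The "state (of transmission)" item above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the state names, the `ALLOWED` transition table and the `Transfer` class are hypothetical and are not taken from GridFTP, OGSA-DAI or any other system mentioned in these slides.

```python
from enum import Enum


class TransferState(Enum):
    """Hypothetical states a workflow engine might report for one transfer."""
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"


# Assumed transition table: a failed transfer may be retried from PENDING.
ALLOWED = {
    TransferState.PENDING: {TransferState.ACTIVE, TransferState.FAILED},
    TransferState.ACTIVE: {TransferState.DONE, TransferState.FAILED},
    TransferState.DONE: set(),
    TransferState.FAILED: {TransferState.PENDING},
}


class Transfer:
    """Tracks the state and progress of a single data transfer."""

    def __init__(self, total_bytes: int) -> None:
        self.total_bytes = total_bytes
        self.sent_bytes = 0
        self.state = TransferState.PENDING

    def move_to(self, new_state: TransferState) -> None:
        # Reject transitions that the table above does not allow.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.name} to {new_state.name}")
        self.state = new_state

    def record(self, nbytes: int) -> None:
        # Progress can only be recorded while the transfer is active.
        if self.state is not TransferState.ACTIVE:
            raise ValueError("transfer is not active")
        self.sent_bytes = min(self.total_bytes, self.sent_bytes + nbytes)
        if self.sent_bytes == self.total_bytes:
            self.move_to(TransferState.DONE)

    def fraction_done(self) -> float:
        return self.sent_bytes / self.total_bytes
```

A workflow enactor could poll `state` and `fraction_done()` to decide when a downstream activity may start, which is one way the control and data concerns end up coupled.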
Control and data protocols
We distinguished between protocols concerned with workflow control and protocols concerned with data transfer. However, the more we looked at it, the more we realised that these are inherently linked in scientific workflows. We concentrated for a long time on data transfer between components of the workflow.
Security Issues
Trust delegation
–Globus model with proxy certificates
–Unicore, where all activities are pre-signed: static model
–Need to specify the level of security of a workflow: sub-workflow? whole of the workflow?
–Trust chain in the workflow and encapsulation: trust between members in the chain
Sometimes it's not the data that needs to be moved but the computation
–Algorithms may be commercially sensitive
More Issues
Level of encapsulation of the workflow
–Level of workflow granularity required of the enactment, e.g. the security/state/type of data transfer between services
Push vs Pull data transfer models
–Implies an ordering in the process flow
Does WS* have the capability to express everything that people want to do over the Grid?
–State issues, e.g. status of file transfer tasks
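The push vs pull distinction, and the ordering it implies, can be sketched in a few lines. This is a minimal illustration, not a description of any Grid middleware: the function names and the event log are assumptions made for the example. In the pull model the consumer drives and the source produces a chunk only when asked; in the push model the source drives and calls the sink.

```python
from typing import Callable, Iterator, List


def pull_source(n_chunks: int, log: List[str]) -> Iterator[int]:
    """Pull model: a chunk is produced only when the consumer requests it."""
    for i in range(n_chunks):
        log.append(f"source yields {i}")
        yield i


def pull_consumer(source: Iterator[int], log: List[str]) -> List[int]:
    """The consumer controls the pace by iterating over the source."""
    received = []
    for chunk in source:
        log.append(f"consumer got {chunk}")
        received.append(chunk)
    return received


def push_source(n_chunks: int, sink: Callable[[int], None], log: List[str]) -> None:
    """Push model: the source controls the pace and calls the sink itself."""
    for i in range(n_chunks):
        log.append(f"source sends {i}")
        sink(i)


if __name__ == "__main__":
    pull_log: List[str] = []
    print(pull_consumer(pull_source(2, pull_log), pull_log))
    print(pull_log)

    push_log: List[str] = []
    pushed: List[int] = []
    push_source(2, pushed.append, push_log)
    print(pushed)
    print(push_log)
```

In the pull case the log interleaves source and consumer events, because each chunk is produced on demand; in the push case the sink has to be ready before the source starts, which is the ordering constraint on the process flow.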
Even More Issues
Provenance
Metadata
Failure recovery/notification
The issue here is how far into the workflow protocols these considerations should extend. They are often built on lower-level protocols, which are in turn built on still lower-level ones, and so on. As in Simons fleas.
Next steps??
Need to examine more workflows. We looked at OGSA-DAI, Unicore and the GADS data-serving web service, but there are many more.
We suggest that the transfer of very large amounts of data in different ways is a differentiating area for scientific workflows.
Security is also viewed differently, so we should look at this.
Considerable overlap with the concerns of other breakout groups.