Bulk Data Copy Generalization
Some DMI/JSDL overlap (this indeed might be out of scope of JSDL).
Extensibility options / possibly some new requirements for recursive file/dir copying between multiple sources and sinks?
In-Scope
1. Job Submission Description Language (JSDL): an activity description language for generic compute applications.
2. OGSA Data Movement Interface (DMI): low-level schema for defining the transfer of bytes between a single source and sink.
3. JSDL HPC File Staging Profile (HPCFS): designed to address file staging, not bulk copying.
4. OGSA Basic Execution Service (BES): defines a basic framework for defining and interacting with generic compute activities: JSDL + extensible state and information models.
5. Others that I am sure that I have missed! (…ByteIO)
None fully captures our requirements (not a criticism; they are designed to address their own use cases, which only partially overlap with the requirements for our bulk data copy activity).
Other
Condor Stork: based on Condor ClassAds.
Not sure if Globus has/intends a similar definition in its new developments (e.g. SaaS); anyone? I believe Ravi was originally supportive of a DMI for data transfers between multiple sources/sinks.
Stork: Condor ClassAds
Example of a Stork job request:
    [
      dest_url = "gsiftp://eric1.loni.org/scratch/user/";
      arguments = "p 4 dbg vb";
      src_url = "file:///home/user/test/";
      dap_type = "transfer";
      verify_checksum = true;
      verify_filesize = true;
      set_permission = "755";
      recursive_copy = true;
      network_check = true;
      checkpoint_transfer = true;
      output = "user.out";
      err = "user.err";
      log = "userjob.log";
    ]
Purportedly the first batch scheduler for data placement and data movement in a heterogeneous environment.
Developed for use with Condor.
Uses Condor's ClassAd job description language and is designed to understand the semantics and characteristics of data placement tasks.
Recently received NSF funding for development as a production service.
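JSDL Data Staging 1 and the HPC File Staging Profile
    <jsdl:DataStaging>
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source><jsdl:URI>gsiftp://griddata1.dl.ac.uk:2811/myhome/fileA</jsdl:URI></jsdl:Source>
      <jsdl:Target><jsdl:URI>ftp://ngs.oerc.ox.ac.uk:2811/myhome/fileA</jsdl:URI></jsdl:Target>
    </jsdl:DataStaging>
    …
Define both the source and target within the same data staging element, which is permitted in JSDL.
The HPC File Staging Profile (Wasson et al. 2008) limits the use of credentials to a single credential definition within a data staging element.
Different credentials will be required for the source and the target.
Maybe profile the use of credentials within the JSDL Source and Target?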
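As an illustration of the single-element approach, here is a minimal Python sketch that builds one data staging element carrying both a source and a target. The helper name `data_staging` and its parameters are hypothetical; the element names follow the JSDL 1.0 schema, and credentials are deliberately left out since the slide only raises them as an open question.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespace; the "jsdl" prefix is registered only for readable output.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("jsdl", JSDL_NS)


def data_staging(file_name, source_uri=None, target_uri=None,
                 creation_flag="overwrite", delete_on_termination=True):
    """Build one jsdl:DataStaging element (hypothetical helper, no credentials)."""
    def q(tag):
        # Qualified tag name in ElementTree's {namespace}tag notation.
        return f"{{{JSDL_NS}}}{tag}"

    staging = ET.Element(q("DataStaging"))
    ET.SubElement(staging, q("FileName")).text = file_name
    ET.SubElement(staging, q("CreationFlag")).text = creation_flag
    ET.SubElement(staging, q("DeleteOnTermination")).text = (
        str(delete_on_termination).lower()
    )
    if source_uri:
        source = ET.SubElement(staging, q("Source"))
        ET.SubElement(source, q("URI")).text = source_uri
    if target_uri:
        target = ET.SubElement(staging, q("Target"))
        ET.SubElement(target, q("URI")).text = target_uri
    return staging


element = data_staging(
    "fileA",
    source_uri="gsiftp://griddata1.dl.ac.uk:2811/myhome/fileA",
    target_uri="ftp://ngs.oerc.ox.ac.uk:2811/myhome/fileA",
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Because the source and target share one element, there is nowhere obvious to attach two different credentials, which is the limitation noted above.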
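JSDL Data Staging 2
    <jsdl:DataStaging>
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:FilesystemName>MY_SCRATCH_DIR</jsdl:FilesystemName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source><jsdl:URI>gsiftp://griddata1.dl.ac.uk:2811/myhome/fileA</jsdl:URI></jsdl:Source>
      <!-- credential, e.g. MyProxyToken -->
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>fileA</jsdl:FileName>
      <jsdl:FilesystemName>MY_SCRATCH_DIR</jsdl:FilesystemName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:Target><jsdl:URI>ftp://ngs.oerc.ox.ac.uk:2811/myhome/fileA</jsdl:URI></jsdl:Target>
      <!-- credential, e.g. wsa:Username/password token -->
    </jsdl:DataStaging>
Coupled staging elements: a source data staging element for fileA and a corresponding target element for staging out the same file.
By specifying that the input file is deleted after the job has executed, this example simulates the effect of a data copy from one location to another through the staging host.
No multiple data locations (alternative sources and sinks; we think this is kinda useful).
Some more (proprietary?) elements required (e.g. DMI transfer requirements, file selectors, URI connection properties).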
OGSA DMI
The OGSA Data Movement Interface (DMI) (Antonioletti et al. 2008) defines a number of elements for describing and interacting with a data transfer activity.
The data source and destination are each described separately with a Data End Point Reference (DEPR), which is a specialized form of WS-Addressing endpoint reference (Box et al. 2004).
In contrast to the JSDL data staging model, a DEPR allows one or more <dmi:Data> elements to be defined within a <dmi:DataLocations> element. This is used to define alternative locations for the data source and/or sink.
An implementation can select between its supported protocols and retry different source/sink combinations from the available list (this improves resilience and the likelihood of a successful copy).
There are some limitations:
DMI is intended to describe only a single data copy operation between one source and one sink. To do several transfers, multiple invocations of a DMI service factory would be required to create multiple DMI service instances.
We require a single (atomic) message packet that wraps multiple transfers (e.g. for routing through a message broker).
Some of the existing constructs require extension / slight modification.
Therefore:
DMI/JSDL discussion at OGF to canvass possible new extensions.
Maybe build on DMI, and/or integrate more closely with JSDL data staging, to describe a bulk copy activity.
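The retry behaviour that alternative locations make possible can be sketched as follows. This is a minimal illustration and not taken from the DMI specification: the function `copy_with_alternatives`, the `transfer` callback, the use of `OSError` to signal a failed attempt, and the sink URL are all assumptions made for the example.

```python
from itertools import product


def copy_with_alternatives(sources, sinks, transfer, retries_per_pair=1):
    """Try each source/sink combination in order until one transfer succeeds.

    `transfer(source, sink)` is a hypothetical callback that raises OSError
    when a copy attempt fails. Returns the pair that worked, or None.
    """
    for source, sink in product(sources, sinks):
        for _ in range(retries_per_pair):
            try:
                transfer(source, sink)
                return source, sink
            except OSError:
                continue  # retry this pair, then fall through to the next one
    return None


def fake_transfer(source, sink):
    # Stand-in for a real transfer: pretend the GridFTP location is down.
    if source.startswith("gsiftp://"):
        raise OSError("source unreachable")


sources = [
    "gsiftp://example.org/name/of/the/dir/",
    "srm://example.org/name/of/the/dir/",
]
sinks = ["ftp://sink.example.org/dir/"]  # hypothetical sink location

result = copy_with_alternatives(sources, sinks, fake_transfer, retries_per_pair=2)
print(result)
```

Here the first source fails on both attempts, so the copy falls back to the second listed location.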
Bulk DMI Draft: A pseudo-example
Source (wsa:EndpointReference type):
    <dmi:Data ProtocolUri=" " DataUrl="gsiftp://example.org/name/of/the/dir/">
    <dmi:Data ProtocolUri="urn:my-project:srm" DataUrl="srm://example.org/name/of/the/dir/">...
Sink (wsa:EndpointReference type): ... Sink Details ...
Transfer Requirements (needs extending)
A DEPR defines alternative locations for the data source and/or sink, and each nests its own credentials.
Some overlap with JSDL data staging.
Bulk Data Copy and JSDL Integration?
Some (sketchy) integration options:
    /usr/bin/datacopyagent.sh myBulkDataCopyDoc.xml...
    myBulkDataCopyDoc.xm...
Possible options for integrating the proposed document within JSDL: a) nesting it within a JSDL element, or b) staging in the document as input for the named executable (ideas, advice…).
Or New Staging Requirements?
JSDL is intended to be a generic compute activity description language. Rather than using a separate document to describe a bulk data copy activity, would it be better to propose some JSDL extensions to cater for bulk copying? (ideas, advice…)
This is potentially a better route to more widespread adoption (e.g. by existing BES implementations).
Other thoughts: orchestration of copy activities / DAG?
BES and DMI sub-state specialisations?
Profile the OGSA BES state model for DMI sub-state specializations.
Adds optional DMI sub-state specializations.
A client/service may recognize only the main BES states if necessary.
Suspend, resume, cancel.
Add DMI fault types?
State diagram (BES states with DMI-based sub-states):
    Pending
    Running: Transferring
    Running: Suspended
    Finished
    Cancelled
    Failed: Clean, Unclean, Unknown
Requests: Suspend(), Resume(), Cancel()
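One way to picture the profiled state model is as a small state machine in which each DMI sub-state is written as `MainState:SubState`, so that a client which only understands the main BES states can ignore everything after the colon. The sketch below is illustrative only: the request names `start` and `complete` and the exact set of allowed transitions are assumptions, not part of BES or DMI.

```python
from enum import Enum


class State(Enum):
    PENDING = "Pending"
    TRANSFERRING = "Running:Transferring"
    SUSPENDED = "Running:Suspended"
    FINISHED = "Finished"
    CANCELLED = "Cancelled"
    FAILED_CLEAN = "Failed:Clean"
    FAILED_UNCLEAN = "Failed:Unclean"
    FAILED_UNKNOWN = "Failed:Unknown"

    @property
    def bes_state(self):
        """Main BES state, i.e. the part before any sub-state specialization."""
        return self.value.split(":")[0]


# Assumed transition table: (current state, request) -> next state.
TRANSITIONS = {
    (State.PENDING, "start"): State.TRANSFERRING,
    (State.TRANSFERRING, "suspend"): State.SUSPENDED,
    (State.SUSPENDED, "resume"): State.TRANSFERRING,
    (State.TRANSFERRING, "complete"): State.FINISHED,
    (State.PENDING, "cancel"): State.CANCELLED,
    (State.TRANSFERRING, "cancel"): State.CANCELLED,
    (State.SUSPENDED, "cancel"): State.CANCELLED,
}


def apply(state, request):
    """Return the next state, or raise ValueError for a disallowed request."""
    try:
        return TRANSITIONS[(state, request)]
    except KeyError:
        raise ValueError(f"{request!r} not allowed in state {state.value}") from None


state = apply(State.PENDING, "start")
state = apply(state, "suspend")
print(state.value, "->", state.bes_state)
```

A client unaware of the DMI sub-states would see only `Running` for both the transferring and suspended cases.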
Message Model Requirements
Document Message
Bulk Data Copy Activity description.
Capture all information required to connect to each source URI and sink URI and subsequently enact the data copy activity.
Transfer requirements, e.g. additional URI properties, file selectors (regular expressions), scheduling parameters to define a batch window, retry count, source/sink alternatives, checksums?, sequential ordering?, DAG?
Serialized user credential definitions for each source and sink.
Control Messages
Interact with a state/lifecycle model (e.g. stop, resume, cancel).
Event Messages
Standard fault types and status updates.
Information Model
To advertise the service capabilities / properties / supported protocols.
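To make the document message requirement more concrete, here is a rough Python sketch of a single message that wraps several transfers, each with alternative sources and sinks, per-endpoint credentials, and a few transfer requirements. Every class and field name is hypothetical, the credential strings are placeholders, and JSON is used purely for brevity; the proposal itself concerns an XML schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Endpoint:
    uri: str
    credential: Optional[str] = None  # placeholder for a serialized credential


@dataclass
class Transfer:
    sources: List[Endpoint]               # alternative source locations
    sinks: List[Endpoint]                 # alternative sink locations
    file_selector: Optional[str] = None   # regular expression over file names
    retry_count: int = 0
    verify_checksum: bool = False


@dataclass
class BulkCopyActivity:
    transfers: List[Transfer] = field(default_factory=list)


activity = BulkCopyActivity(transfers=[
    Transfer(
        sources=[Endpoint("gsiftp://example.org/name/of/the/dir/", "token-a"),
                 Endpoint("srm://example.org/name/of/the/dir/", "token-b")],
        sinks=[Endpoint("ftp://sink.example.org/dir/", "token-c")],
        file_selector=r".*\.dat$",
        retry_count=3,
        verify_checksum=True,
    ),
    Transfer(
        sources=[Endpoint("file:///home/user/test/")],
        sinks=[Endpoint("gsiftp://sink.example.org/scratch/")],
    ),
])

# One atomic message that could be routed through a message broker.
message = json.dumps(asdict(activity), indent=2)
print(message)
```

Control and event messages would be separate, smaller messages that refer to an activity like this one rather than repeating its content.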