K. Harrison
CERN, 20th April 2004
AJDL interface and LCG submission
- Overview of AJDL
- Using AJDL from Python
- LCG submission

AJDL basics
- AJDL (Analysis Job Description Language) provides for a high-level description of job content
- Basic elements include:
    Application: executable to process data
    Task: user configuration of application
    Dataset: specifies input and output
    Configuration: resource requirements
- AJDL aims to define the way in which job information is transmitted between software components:
    Allow multiple clients to communicate with independently implemented services by defining a common interface
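
The four basic elements can be pictured as plain data containers. The sketch below is illustrative only: the class names mirror the slide, but every field name (and the Job wrapper that groups the elements) is an assumption made for the example, not the DIAL class interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Application:
    """Executable used to process data (field names are hypothetical)."""
    name: str
    version: str


@dataclass
class Task:
    """User configuration of the application, e.g. job-options files."""
    files: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    """Input or output data, identified here by logical file names."""
    name: str
    logical_files: List[str] = field(default_factory=list)


@dataclass
class Configuration:
    """Resource requirements as free-form key/value pairs."""
    requirements: Dict[str, str] = field(default_factory=dict)


@dataclass
class Job:
    """Hypothetical wrapper that bundles the four elements."""
    application: Application
    task: Task
    dataset: Dataset
    configuration: Configuration


job = Job(
    application=Application(name="athena", version="hypothetical-1.0"),
    task=Task(files=["myJobOptions.py"]),
    dataset=Dataset(name="example", logical_files=["file1.root"]),
    configuration=Configuration(requirements={"max_cpu_hours": "2"}),
)
print(job.application.name, job.dataset.logical_files)
```

Keeping the elements separate means, for instance, that the same Task could be paired with a different Dataset without touching the Application.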

Client-service communication via AJDL

Working with AJDL
- AJDL represents job data using XML
- To do anything useful, need classes that interpret and manipulate the data
    C++ implementation of relevant classes provided by DIAL
- User should need to know nothing (or very little) about AJDL, but will work with client tools that use AJDL in the background
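
To illustrate what "job data as XML plus classes to interpret it" involves, the sketch below writes and reads a small job description with Python's standard xml.etree.ElementTree module. The element and attribute names (job, application, task, file, dataset) are invented for the example and are not the AJDL schema.

```python
import xml.etree.ElementTree as ET


def job_to_xml(job):
    """Serialise a plain dict into an XML string (hypothetical layout)."""
    root = ET.Element("job")
    ET.SubElement(root, "application", name=job["application"])
    task = ET.SubElement(root, "task")
    for path in job["task_files"]:
        ET.SubElement(task, "file").text = path
    ET.SubElement(root, "dataset", name=job["dataset"])
    return ET.tostring(root, encoding="unicode")


def xml_to_job(text):
    """Parse the XML string back into the same dict structure."""
    root = ET.fromstring(text)
    return {
        "application": root.find("application").get("name"),
        "task_files": [f.text for f in root.find("task").findall("file")],
        "dataset": root.find("dataset").get("name"),
    }


job = {
    "application": "athena",
    "task_files": ["myJobOptions.py"],
    "dataset": "example",
}
xml_text = job_to_xml(job)
print(xml_text)
restored = xml_to_job(xml_text)
```

A client tool built this way lets the user work with ordinary Python objects, while the XML is produced and consumed in the background.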

Analysis on a local batch system
- User workflow for running a Gaudi/Athena analysis on a local batch system might typically include the following steps:
    1) Write code (Gaudi/Athena algorithm)
    2) Specify job options
    3) Select dataset
    4) Provide job-execution script
    5) Submit job to batch system, for example using: bsub < myJobScript
    6) Intermittently query job status, for example using: bjobs
    7) When job completes, look at output
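
Steps 5) and 6) could be scripted roughly as below. This is a sketch only, assuming an LSF installation where bsub reports a line of the form "Job <1234> is submitted to queue <normal>."; the function names and the injectable run parameter (there so the commands can be replaced when LSF is not available) are choices made for the example.

```python
import re
import subprocess


def parse_job_id(bsub_output):
    """Extract the numeric job ID from bsub's 'Job <1234> is submitted...' line."""
    match = re.search(r"Job <(\d+)>", bsub_output)
    if match is None:
        raise ValueError("no job ID found in: %r" % bsub_output)
    return match.group(1)


def submit(script_path, run=subprocess.run):
    """Equivalent of 'bsub < script_path'; returns the job ID as a string."""
    with open(script_path) as script:
        result = run(["bsub"], stdin=script, capture_output=True,
                     text=True, check=True)
    return parse_job_id(result.stdout)


def status(job_id, run=subprocess.run):
    """Equivalent of 'bjobs <job_id>'; returns the raw text for the user to read."""
    result = run(["bjobs", job_id], capture_output=True, text=True, check=True)
    return result.stdout
```

With LSF available, usage would be along the lines of job_id = submit("myJobScript") followed by occasional calls to status(job_id).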

Analysis on a distributed system
- Aim is that user workflow for running Gaudi/Athena analysis with distributed resources and datasets should be no more complicated than for analysis on local batch system
- In ADA, steps 1) to 4) of workflow become as follows:
    Steps 1) and 2): Define task
    Step 3): Define dataset
    Step 4): Define application
- Client tools should help with these, and should deal with job submission, status queries and output retrieval

AJDL and Ganga
- Ganga already has its own job-description scheme and job-related classes, as do ATLAS and LHCb production systems
    Largely subjective as to which scheme is better
- AJDL not needed internally by Ganga, but is essential for interacting with ATLAS analysis services
- Two approaches possible:
    1) Use current Ganga job description internally, and translate to/from AJDL as needed
    2) Adopt AJDL internally also
    Second approach currently being investigated

Using AJDL from Python (1)
- Two approaches to supporting AJDL from Python:
    1) Provide Python bindings for C++ implementation of DIAL
        Advantages:
        - Makes use of work already done
        - Little maintenance required (on Python side)
        - Demonstrates possibility to use Python with non-Python components
        Disadvantage:
        - Lose portability: rely on significant number of shared libraries, and must recompile on each platform

Using AJDL from Python (2)
    2) Write Python implementation of AJDL
        Advantage:
        - Portable
        Disadvantages:
        - Work has to start from scratch
        - Code must be maintained, to reflect any future changes in AJDL
- Ideally, probably want both solutions:
    Python bindings for C++ implementation to give most complete functionality
    Minimal Python implementation to allow portability

Binding DIAL C++ implementation of AJDL (1)
- Start by providing Python bindings for DIAL C++ implementation of AJDL
- From outside Python, use lcgdict command, provided by SEAL, to generate LCG dictionaries from C++ class header files
- Compile dictionary files (suffix _dict.cpp) to produce shared-object libraries
- Construct Python package, with initialisation file that uses PyLCGDict/PyLCGDict2 from SEAL to load libraries and allow class access

Binding DIAL C++ implementation of AJDL (2)
- Dictionaries and libraries successfully created for DIAL AJDL classes, with help from M. Marino
- Have been able to create AJDL objects from Python/Ganga, and have tested some simple methods
- Some limitations with current binding software:
    Not possible to import free functions
        Problem addressed in newer version of Reflection package?
    Overloaded << operator not mapped to __str__
        Will find workaround to allow object printing
    Mapping between std::string and Python string
        Works in PyLCGDict, not yet available with PyLCGDict2
- More extensive testing to start soon
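
One possible workaround for the missing __str__ mapping is to attach a __str__ method from the Python side. The sketch below assumes that the bound class exposes some method returning a printable description; BoundDataset and its describe() method are stand-ins invented for the example and do not correspond to actual DIAL or PyLCGDict names. Whether classes produced by the binding software accept new attributes in this way would need to be checked.

```python
class BoundDataset:
    """Stand-in for a class exposed through the bindings (hypothetical)."""

    def __init__(self, name):
        self._name = name

    def describe(self):
        # Hypothetical method playing the role of the C++ operator<<.
        return "Dataset(%s)" % self._name


def add_str(cls, method_name):
    """Give cls a __str__ that delegates to an existing method of the class."""
    def __str__(self):
        return getattr(self, method_name)()
    cls.__str__ = __str__
    return cls


add_str(BoundDataset, "describe")
print(BoundDataset("example"))
```

The helper is applied once, when the Python package is initialised, so user code can print objects without knowing about the delegation.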

Job submission to LCG
- Submission of LHCb analysis jobs under LCG1 tested successfully outside of Ganga
- Data for analysis copied in advance to preselected destination site
- Needed only very simple JDL file:
    Executable = "/usr/bin/python2.2";
    Arguments = "DaVinci.py";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"DaVinci.py","InstallDaVinci.py","SoftwareDistribution.py","RunDaVinci.py","DaVinci.opts"};
    OutputSandbox = {"std.out","std.err"};
    Requirements = other.GlueCEUniqueID == "farm012.hep.phy.cam.ac.uk:2119/jobmanager-lcgpbs-cgSq";
- Python modules loaded in sandbox take care of installing required software and running analysis executable
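
A JDL file of this kind is simple enough to generate from a client tool. The sketch below reproduces the attributes shown on the slide; the helper names jdl_value and make_jdl are hypothetical, and only the two value forms used above (quoted strings and brace-delimited lists) are handled.

```python
def jdl_value(value):
    """Format a Python value as a JDL literal: quoted string or {list}."""
    if isinstance(value, (list, tuple)):
        return "{" + ",".join('"%s"' % item for item in value) + "}"
    return '"%s"' % value


def make_jdl(attributes, requirements=None):
    """Build JDL text from (name, value) pairs.

    requirements, if given, is written out unquoted as a raw expression.
    """
    lines = ["%s = %s;" % (name, jdl_value(value)) for name, value in attributes]
    if requirements is not None:
        lines.append("Requirements = %s;" % requirements)
    return "\n".join(lines)


jdl = make_jdl(
    [
        ("Executable", "/usr/bin/python2.2"),
        ("Arguments", "DaVinci.py"),
        ("StdOutput", "std.out"),
        ("StdError", "std.err"),
        ("InputSandbox", ["DaVinci.py", "InstallDaVinci.py",
                          "SoftwareDistribution.py", "RunDaVinci.py",
                          "DaVinci.opts"]),
        ("OutputSandbox", ["std.out", "std.err"]),
    ],
    requirements='other.GlueCEUniqueID == '
                 '"farm012.hep.phy.cam.ac.uk:2119/jobmanager-lcgpbs-cgSq"',
)
print(jdl)
```

Passing the attributes as an ordered list of pairs keeps the output in the same order as the hand-written file.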

LCG submission from Ganga
- In Ganga, could use job handler very similar to the one used for EDG submission (or an improved version of this)
- For ATLAS, would want to add an AJDL interface
- Need to understand if Ganga should continue to provide components for directly interacting with Grid/batch systems
    Might be better to use components developed for ATLAS and LHCb production systems