DIAL/ADA JDL and catalogs
David Adams
BNL
December 4, 2003
ATLAS software workshop, Production session
CERN

Contents
  DAC mandate
  Scope
  Strategy
  Scenario for first release
  Plans for the first release
  GANGA status
  DIAL status
  Deliverables for the first release
  Conclusions

ADA Strategy
  Implement DA as a collection of grid services
    As described in ARDA document
    Use ARDA components where possible
    Add missing and ATLAS-specific pieces
  Provide clients for ATLAS analysis environments
    Python, ROOT, command line
  Regular releases
    Perhaps for each SW week and ATLAS X.0
    Provide useful tool
    Demonstrate functionality
    Expand functionality with each release

ADA Strategy (cont)
  Look to common projects for most of the pieces
    ARDA, GANGA, DIAL, …
    Share as much as possible with ATLAS production
      – Also distributed
      – Similar interfaces and code for bulk and user-level production
    ADA (ATLAS distributed analysis) must identify these pieces and tie them together
  Deployment
    ADA services must be deployed at relevant sites
    Provide testing and monitoring of these services
    Work with facilities to deploy and maintain
      – Also to develop facility-specific features

DIAL JDL
  High-level JDL
    DIAL envisions a hierarchy of schedulers
    Interface to these schedulers constitutes a high-level JDL (job definition language)
      – Job submission, monitoring and gathering of results
      – See figure
    Would like to standardize this JDL so schedulers can be shared between projects and experiments
      – See figure
    Exchanged objects have XML representations

Components of DIAL high-level JDL
  [Figure: boxes labeled User Analysis, Application, Task, Code, Dataset, Dataset 1, Dataset 2, Scheduler, Job 1, Job 2 and Result; annotations "e.g. ROOT" and "e.g. athena".]
  Labeled steps: 1. Create or locate; 2. select; 3. Create or select; 4. select; 5. submit(app,tsk,ds); 6. split; 7. create; 9. fill; 10. gather

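The submit, split, create, fill and gather steps named in the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classes `Dataset`, `Result` and `Scheduler`, the `count_selected` application and the `min_event` task parameter are all hypothetical and are not the DIAL classes. Sub-jobs run sequentially here, whereas a real scheduler would dispatch them.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical dataset: nothing more than a list of event numbers."""
    events: list

    def split(self, nparts):
        """Split into at most nparts sub-datasets of roughly equal size."""
        size = max(1, -(-len(self.events) // nparts))  # ceiling division
        return [Dataset(self.events[i:i + size])
                for i in range(0, len(self.events), size)]


@dataclass
class Result:
    """Hypothetical result: a single counter filled by the application."""
    count: int = 0

    def merge(self, other):
        self.count += other.count


class Scheduler:
    """Hypothetical scheduler that runs its sub-jobs one after another."""

    def __init__(self, nparts=2):
        self.nparts = nparts

    def submit(self, app, tsk, ds):
        subsets = ds.split(self.nparts)               # step 6: split
        partial = [app(tsk, sub) for sub in subsets]  # steps 7 and 9: create sub-jobs, fill results
        total = Result()
        for res in partial:                           # step 10: gather
            total.merge(res)
        return total


def count_selected(tsk, ds):
    """Stand-in application: count events passing the cut carried by the task."""
    return Result(sum(1 for e in ds.events if e >= tsk["min_event"]))


# Step 5: submit(app, tsk, ds)
result = Scheduler(nparts=2).submit(count_selected, {"min_event": 3},
                                    Dataset(list(range(10))))
```

The user only sees the single gathered `Result`; how the dataset was split stays internal to the scheduler, which is what allows schedulers to be stacked in a hierarchy.
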
DIAL status: sharing via JDL

Scheduler
  Web service interface
    Scheduler class has similar interface
    See DIAL JDL page for complete interface & WSDL
    Partial list (last argument is return)
      – has_application(XML app, bool stat)
      – add_task(XML app, XML tsk, bool stat)
      – submit(XML app, XML tsk, XML dst, XML jobid)
      – job(XML jobid, XML job)
  Provide clients with similar interface
    Command line
    C++ (imported in ROOT)
    Python (future from GANGA)

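As a rough illustration of what a Python client with a similar interface might look like, the sketch below mirrors the four listed operations with an in-memory stand-in. It is not the DIAL web service or its WSDL: the class name `InMemoryScheduler`, the XML element names and the use of Python return values in place of the trailing return argument are all assumptions made for the example.

```python
import itertools


class InMemoryScheduler:
    """Hypothetical stand-in mirroring the four listed scheduler operations.

    Arguments and return values are XML strings, except for the two
    status flags, which are returned as bool.
    """

    def __init__(self, applications):
        self._apps = set(applications)
        self._tasks = set()
        self._jobs = {}
        self._ids = itertools.count(1)

    def has_application(self, app):
        return app in self._apps

    def add_task(self, app, tsk):
        if app not in self._apps:
            return False
        self._tasks.add((app, tsk))
        return True

    def submit(self, app, tsk, dst):
        if (app, tsk) not in self._tasks:
            raise ValueError("task has not been added for this application")
        jobid = f"<jobid>{next(self._ids)}</jobid>"
        # Element names below are invented for the example.
        self._jobs[jobid] = (f"<job>{jobid}{app}{tsk}{dst}"
                             "<status>running</status></job>")
        return jobid

    def job(self, jobid):
        return self._jobs[jobid]


app = "<application name='demo' version='1.0'/>"
tsk = "<task id='t1'/>"
dst = "<dataset id='d1'/>"

sched = InMemoryScheduler([app])
task_added = sched.add_task(app, tsk)
jobid = sched.submit(app, tsk, dst)
job_xml = sched.job(jobid)
```

Because every argument is an XML string, the same calls could in principle be forwarded unchanged to a command-line, C++ or web service client.
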
Application
  Contents
    Name
    Version
  Corresponds to a software package
    Same name and version
    Provides two entry points
      – Build task
      – Process a dataset and generate a result
    Specifies dependencies (other packages)
    Eventually install with package management system
      – Now applications must be preinstalled

Task
  Contents
    Collection of named files
      – Embedded text, PFN (physical file name) or LFN (logical file name)
  Usage
    Input to task build
      – Used only by application

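Since exchanged objects have XML representations, a task could be serialized along the following lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the element name `file` and the attributes `name` and `kind` are assumptions for illustration and do not reproduce the actual DIAL schema.

```python
import xml.etree.ElementTree as ET

KINDS = ("text", "pfn", "lfn")  # embedded text, physical or logical file name


def task_to_xml(files):
    """Serialize a task given as {file name: (kind, value)}.

    The XML layout is hypothetical, not the DIAL schema.
    """
    root = ET.Element("task")
    for name, (kind, value) in files.items():
        if kind not in KINDS:
            raise ValueError(f"unknown file kind: {kind}")
        entry = ET.SubElement(root, "file", name=name, kind=kind)
        entry.text = value
    return ET.tostring(root, encoding="unicode")


def task_from_xml(xml):
    """Inverse of task_to_xml."""
    root = ET.fromstring(xml)
    return {f.get("name"): (f.get("kind"), f.text or "")
            for f in root.findall("file")}


task = {
    "cuts.txt": ("text", "pt_min = 20"),
    "fill_histos.C": ("lfn", "lfn:example/fill_histos.C"),
}
xml = task_to_xml(task)
restored = task_from_xml(xml)
```

Embedding small files as text keeps a task self-contained, while PFN and LFN entries leave larger files where they are and only carry a reference.
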
Dataset
  Contents
    Depend on type
      – See following class diagram for existing types
  Usage
    User
      – select from catalog (DSC)
      – query for content, # events, …
    System
      – locate accessible replica (DRC)
      – split
      – determine logical files for staging
      – extract application view (e.g. event collection)
    And more…

Dataset classes
  [Class diagram of existing dataset types. Annotations: "Single combined ntuple file (input to application)", "User selects", "Used for splitting".]

Result
  Content
    Depends on type
    Perhaps should be fixed as list of files
      – as for task
    Also need collection of results
  Usage
    Communicate results
      – From application to scheduler
      – From scheduler to user
    Provides code to merge with another result
      – This should probably move to a SW package
        > Then carry package name in result?

Job
  Content
    Application, task, dataset
      – provenance
    Status (running, done, failed, …)
    Time of start, stop and last update
    Result
      – May be empty or partial
  Usage
    User
      – Check status
      – Access partial results

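The job content listed above maps naturally onto a small record type. The sketch below is one possible shape, assuming hypothetical field names and only the three status values spelled out on the slide; it is not the DIAL job class.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    application: str   # provenance: application name and version
    task_id: str       # provenance: task used
    dataset_id: str    # provenance: dataset processed
    status: Status = Status.RUNNING
    start_time: float = field(default_factory=time.time)
    stop_time: Optional[float] = None
    update_time: Optional[float] = None
    result: Optional[dict] = None  # may be empty or partial while running

    def update(self, partial_result, now=None):
        """Record a partial result and the time of the last update."""
        self.result = partial_result
        self.update_time = time.time() if now is None else now

    def finish(self, result, failed=False, now=None):
        """Record the final result, stop time and final status."""
        self.update(result, now)
        self.stop_time = self.update_time
        self.status = Status.FAILED if failed else Status.DONE


job = Job("demo-1.0", "t1", "d1", start_time=0.0)
job.update({"events": 50}, now=1.0)    # user can already read a partial result
status_while_running = job.status
job.finish({"events": 100}, now=2.0)
```

Keeping the result on the job itself is what lets a user check status and access partial results through a single object.
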
Catalogs
  Two categories
    Repository
      – Provides access to XML description indexed by ID
      – Might be XML DB
    Selection
      – Enables user to select with query on catalog contents
      – Might be relational table
  Grid service interface
    User access only through grid service interface
    Same for access from other services?
    What granularity?
      – 1 svc, 1 svc/catalog, …

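The difference between the two catalog categories can be made concrete with a toy example: a repository returns the XML description for a known ID, while a selection catalog returns the IDs that match a query. Both classes below are in-memory stand-ins with hypothetical names, methods and attributes; a real deployment would sit behind a grid service interface.

```python
class Repository:
    """Hypothetical repository: XML descriptions indexed by ID."""

    def __init__(self):
        self._xml_by_id = {}

    def insert(self, object_id, xml):
        if object_id in self._xml_by_id:
            raise KeyError(f"ID already in use: {object_id}")
        self._xml_by_id[object_id] = xml

    def extract(self, object_id):
        return self._xml_by_id[object_id]


class SelectionCatalog:
    """Hypothetical selection catalog: one row of attributes per ID."""

    def __init__(self):
        self._rows = {}

    def insert(self, object_id, **attributes):
        self._rows[object_id] = attributes

    def select(self, **criteria):
        """Return the sorted IDs whose attributes equal all given criteria."""
        return sorted(oid for oid, row in self._rows.items()
                      if all(row.get(k) == v for k, v in criteria.items()))


repo = Repository()
catalog = SelectionCatalog()
for oid, kind in [("ds-001", "ntuple"), ("ds-002", "ntuple"),
                  ("ds-003", "event_collection")]:
    repo.insert(oid, f"<dataset id='{oid}' kind='{kind}'/>")
    catalog.insert(oid, kind=kind)

ids = catalog.select(kind="ntuple")          # query yields IDs
selected = [repo.extract(i) for i in ids]    # IDs yield XML descriptions
```
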
Catalogs (cont)
  Application
    Which applications are available
  Task
    Repository
      – Task XML accessible by ID
      – Part of provenance system
    Selection catalog?
    Application-task consistency?

Catalogs (cont)
  Dataset
    Repository
    Selection catalog
      – Very important user interface
      – Only virtual datasets?
    Replica catalog
      – Virtual to non-virtual mapping
    Single file catalog
  File
    Replica catalog
  Result
    Repository

Catalogs (cont)
  Job
    Repository
      – Maybe use selection catalog as repository
    Selection
      – Contents
        > Job ID
        > Application name and version
        > Task ID
        > Dataset ID (and type?)
        > Result ID (and type?)
        > Parent job ID
        > Status
      – Is this the same catalog that feeds the production supervisor?

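If the job selection catalog were a relational table, its contents could look like the sketch below, which uses Python's built-in `sqlite3` with an in-memory database. The table name, column names and sample rows are made up for illustration; only the list of columns follows the contents given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_selection (
        job_id        TEXT PRIMARY KEY,
        app_name      TEXT NOT NULL,
        app_version   TEXT NOT NULL,
        task_id       TEXT,
        dataset_id    TEXT,
        result_id     TEXT,
        parent_job_id TEXT,   -- NULL for a top-level job
        status        TEXT
    )
""")

# Made-up rows: one parent job and the two sub-jobs it was split into.
rows = [
    ("J1",   "demo", "1.0", "t1", "d1",   "r1",   None, "running"),
    ("J1.1", "demo", "1.0", "t1", "d1.1", "r1.1", "J1", "done"),
    ("J1.2", "demo", "1.0", "t1", "d1.2", None,   "J1", "failed"),
]
conn.executemany("INSERT INTO job_selection VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Example user query: which sub-jobs of J1 failed?
failed = [row[0] for row in conn.execute(
    "SELECT job_id FROM job_selection "
    "WHERE parent_job_id = ? AND status = ? ORDER BY job_id",
    ("J1", "failed"))]
```

The parent job ID column is what makes the scheduler hierarchy visible in the catalog: a user can walk from a top-level job down to the sub-jobs that produced each partial result.
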
Connection to ATLAS production

Conclusions