Published by Clinton Chapman. Modified over 9 years ago.

Don Quijote
Data Management for the ATLAS Automatic Production System
Miguel Branco – CERN ATC
miguel.branco@cern.ch

10/05/2004 Don Quijote - Status & Plans

Overview
Don Quijote
  o New Focus
Functionalities
  o POOL
Architecture
Current Status
  o NorduGrid
  o US Grid 3(+)
  o LCG-2
  o Integration with ATLAS prodsys
Future plans

Don Quijote
Data Management for the ATLAS Automatic Production System
Allow transparent registration and movement of replicas between all grid “flavors” used by ATLAS
  o US Grid
  o NorduGrid
  o LCG
  o (support for legacy systems might be introduced soon)
Avoid creating yet another catalog
  o which grid middleware (e.g. Resource Brokers) wouldn't recognize
  o use existing catalogs and data management tools
  o find common features between tools and catalogs
  o bridge them and provide a unified interface
Accessible as a service
  o lightweight clients

Don Quijote – new focus
Provide a single tool to end-users to manage data files
  o Integrates all tools that users would have to know about into a single one. E.g.:
      FCpublish, FCregister, … (POOL File Catalogs)
      edg-rm, edg-rmc, edg-lrc, … (EDG)
      globus-rls-cli, globus-url-copy, … (Globus)
      ldapsearch, … (querying information system)
      rfdir, rfcp, … (common use of Castor)
Acts as a POOL-aware Replica Manager
Eases security requirements for end-users
  o Temporarily!

Functionalities
Replica Catalogs Manipulation
File Movement
LPN = Logical Collection Name + Logical File Name (unique)
  o search | fullSearch | searchHosts ( lpn )
  o add[Restricted] ( lpn, url [, guid, fsize, md5sum ] )
  o addTemporary[Restricted] ( lpn, url, nrhours [, guid, fsize, md5sum ] )
  o keepUntil ( url, nrhours )
  o makePermanent ( url )
  o removeReplica ( url )
  o remove ( lpn )
  o rename ( old lpn, new lpn )
  o stageOut ( url )
  o getToDestination ( src SE, lpn, dest )
  o putToSE ( src turl, lpn, dest SE [, guid, md5sum] )
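
To make the catalog operations listed on this slide concrete, here is a minimal in-memory sketch of how search, add, addTemporary, keepUntil, makePermanent, removeReplica, remove and rename might relate to each other. It is not Don Quijote code: the class name, the Python method names, the treatment of an LPN as an opaque string, and the choice to hide (rather than delete) expired temporary replicas are all assumptions made for illustration.

```python
import time


class InMemoryReplicaCatalog:
    """Toy stand-in for a grid replica catalog (hypothetical, not DQ code)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        # lpn -> {url: expiry timestamp, or None for a permanent replica}
        self._replicas = {}

    def add(self, lpn, url):
        """Register a permanent replica of lpn at url."""
        self._replicas.setdefault(lpn, {})[url] = None

    def add_temporary(self, lpn, url, nrhours):
        """Register a replica that expires nrhours from now."""
        self._replicas.setdefault(lpn, {})[url] = self._clock() + nrhours * 3600

    def keep_until(self, url, nrhours):
        """Reset the lifetime of an existing replica to nrhours from now."""
        self._set_expiry(url, self._clock() + nrhours * 3600)

    def make_permanent(self, url):
        """Turn a temporary replica into a permanent one."""
        self._set_expiry(url, None)

    def search(self, lpn):
        """Return the sorted URLs of all unexpired replicas of lpn."""
        now = self._clock()
        return sorted(url for url, expiry in self._replicas.get(lpn, {}).items()
                      if expiry is None or expiry > now)

    def remove_replica(self, url):
        """Forget a single replica, wherever it is registered."""
        for urls in self._replicas.values():
            urls.pop(url, None)

    def remove(self, lpn):
        """Forget an LPN together with all of its replicas."""
        self._replicas.pop(lpn, None)

    def rename(self, old_lpn, new_lpn):
        """Move all replicas from old_lpn to new_lpn."""
        if new_lpn in self._replicas:
            raise ValueError("LPN already exists: " + new_lpn)
        self._replicas[new_lpn] = self._replicas.pop(old_lpn)

    def _set_expiry(self, url, expiry):
        for urls in self._replicas.values():
            if url in urls:
                urls[url] = expiry
                return
        raise KeyError(url)
```

The injectable clock is only there so that expiry can be exercised without waiting; a real service would also need the optional guid, fsize and md5sum arguments, which this sketch omits.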

Functionalities - POOL
Integrates file movement with POOL XML File Catalogs
  o Uses DQ + POOL FC command line tools
  o Python scripts
Use-cases:
  o Get local copy of file and generate or update corresponding PoolFileCatalog.xml (to provide input data and input POOL XML catalog for a job)
  o Copy and register a local copy of a file to a grid flavor given UUID in the local PoolFileCatalog.xml (to register output data from a job)
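
The first use-case (generate or update a PoolFileCatalog.xml for a local copy) can be sketched with Python's standard XML library. This is an illustration only: the slide says DQ drives the POOL FC command line tools, whereas this sketch edits the XML directly. The function name is hypothetical, and the element layout used here (a POOLFILECATALOG root with File entries holding physical/pfn and logical/lfn children) is a simplified assumption about the catalog format; header lines such as a DOCTYPE declaration are not preserved.

```python
import xml.etree.ElementTree as ET


def _child(parent, tag):
    """Return the first child with this tag, creating it if missing."""
    node = parent.find(tag)
    return node if node is not None else ET.SubElement(parent, tag)


def register_local_copy(catalog_xml, guid, lfn, pfn, filetype="ROOT_All"):
    """Return catalog XML text with an entry for guid added or updated.

    catalog_xml may be an empty string, in which case a new catalog
    is generated. Existing pfn/lfn entries are not duplicated.
    """
    if catalog_xml:
        root = ET.fromstring(catalog_xml)
    else:
        root = ET.Element("POOLFILECATALOG")

    # Find the File entry for this GUID, or create one.
    entry = next((f for f in root.findall("File") if f.get("ID") == guid), None)
    if entry is None:
        entry = ET.SubElement(root, "File", {"ID": guid})

    physical = _child(entry, "physical")
    logical = _child(entry, "logical")
    if not any(p.get("name") == pfn for p in physical.findall("pfn")):
        ET.SubElement(physical, "pfn", {"filetype": filetype, "name": pfn})
    if not any(l.get("name") == lfn for l in logical.findall("lfn")):
        ET.SubElement(logical, "lfn", {"name": lfn})

    return ET.tostring(root, encoding="unicode")
```

Calling the function once with an empty string generates a fresh catalog; calling it again with the returned text and a different pfn adds a second physical file name under the same GUID instead of creating a new entry.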

Architecture
Python Client
  o C++ client library
  o Configuration file indicating endpoint of each server
Servers
  o Per grid-flavor
  o GSI and insecure
  o Configuration file
User interface tool written in Python
Servers and client library written in C++
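
As an illustration of a client configuration file that names one server endpoint per grid flavor, the sketch below parses an INI-style file with Python's standard configparser. The slide does not show the real file format, so the section names, the endpoint and secure keys, and the example.org URLs are all hypothetical.

```python
import configparser

# Hypothetical client configuration: one section per grid flavor.
EXAMPLE_CONFIG = """
[lcg]
endpoint = https://dq-lcg.example.org:8443/dq
secure = yes

[nordugrid]
endpoint = http://dq-ng.example.org:8080/dq
secure = no

[usgrid]
endpoint = https://dq-us.example.org:8443/dq
secure = yes
"""


def load_endpoints(text):
    """Map each grid flavor to (endpoint URL, use-secure-transport flag)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        flavor: (parser[flavor]["endpoint"],
                 parser[flavor].getboolean("secure"))
        for flavor in parser.sections()
    }
```

The secure flag stands in for the choice between the GSI and insecure server variants mentioned on the slide; how the real client selects between them is not stated.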

Changes on Server-side
Why was server-side code rewritten?
  o Partly because of CMS experience
      Persistent connections were necessary
      Connection pooling mechanism
      Each request could not instantiate a connection to the grid catalog – too slow!
  o Partly from our initial experience
      Flexible security mechanism
      Either provide a single certificate for all, or delegate credentials
Initial version:
  o A command line tool for each grid flavor with the same syntax and same “output”
  o Clarens server was forking out a process that executed the request by calling the command line tool
  o This proved to be inefficient and too restrictive – e.g. could not maintain persistent connections across multiple requests!
Therefore,
  o Server code was built by extending the command line tools – each tool is now a daemon
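
The connection pooling idea from this slide (open catalog connections once in a long-running daemon and reuse them across requests) can be sketched as follows. The actual servers are written in C++; this Python version only illustrates the pattern, and the class name and the connect callable are hypothetical.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of reusable connections (illustrative sketch)."""

    def __init__(self, connect, size):
        # Pay the connection cost once, at daemon start-up.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection for one request, then hand it back."""
        conn = self._idle.get(timeout=timeout)  # blocks while all are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)
```

A request handler would wrap its catalog calls in `with pool.connection() as conn:`, so the number of connections opened is bounded by the pool size rather than by the number of requests. A production pool would also need to detect and replace broken connections, which this sketch does not attempt.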

Current Status
Current structure (component diagram):
  o Components: DqCore, DqFakePoolFileCatalog, DqGlobusRls, DqLcgPoolFileCatalog, DqClassicReplicaAccess, DqLcgReplicaAccess, DqPoolRls, DqConfigFile, DqFactory, DqInterface, DqMonitor, DqUI, dms.py
  o Diagram labels: Python Module; C++ Python wrapper (user interface); C++ Client Module
  o Information services: DqLcgInfoService, DqVdtInfoService, DqNgInfoService
  o Servers: DqServerLcg, DqServerNg, DqServerVdt

NorduGrid
Globus RLS 2.x
Only Classic Storage Elements (GridFTP servers)
Information System
  o Connects to LDAP
  o Special attributes in the RLS
Components: DqCore, DqFakePoolFileCatalog, DqGlobusRls, DqClassicReplicaAccess, DqConfigFile, DqFactory, DqInterface, DqMonitor, DqUI, DqNgInfoService, DqServerNg

LCG-2
EDG/LCG RLS (v2.2)
GFAL support:
  o SRM/Castor support
  o SRM/dCache support
  o Classic Storage Element support
Information System:
  o LDAP-based (MDS)
Native POOL Support
  o Using POOL-1.6.5
Components: DqCore, DqLcgPoolFileCatalog, DqPoolRls, DqLcgReplicaAccess, DqConfigFile, DqFactory, DqInterface, DqMonitor, DqUI, DqLcgInfoService, DqServerLcg

US Grid 3(+)
Globus RLS 2.x
DQ currently supports only Classic Storage Elements (GridFTP servers)
No “information system” interface
  o DQ creates a “dummy” information system, which consists of a local configuration file
Components: DqCore, DqFakePoolFileCatalog, DqGlobusRls, DqClassicReplicaAccess, DqConfigFile, DqFactory, DqInterface, DqMonitor, DqUI, DqVdtInfoService, DqServerVdt
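
The three status slides (NorduGrid, LCG-2, US Grid 3(+)) each list the flavor-specific components that sit on top of the shared ones. The sketch below tabulates those lists so the differences are easy to see. The component names are taken from the slides; the role names (pool_catalog, replica_catalog and so on), the flavor keys and the helper functions are assumptions added for illustration, and nothing here reflects how DqFactory actually selects components.

```python
from collections import namedtuple

# Role names are an interpretation of the component names, not DQ terminology.
FlavorComponents = namedtuple(
    "FlavorComponents",
    "pool_catalog replica_catalog replica_access info_service server",
)

COMPONENTS = {
    "nordugrid": FlavorComponents(
        "DqFakePoolFileCatalog", "DqGlobusRls", "DqClassicReplicaAccess",
        "DqNgInfoService", "DqServerNg"),
    "lcg": FlavorComponents(
        "DqLcgPoolFileCatalog", "DqPoolRls", "DqLcgReplicaAccess",
        "DqLcgInfoService", "DqServerLcg"),
    "usgrid": FlavorComponents(
        "DqFakePoolFileCatalog", "DqGlobusRls", "DqClassicReplicaAccess",
        "DqVdtInfoService", "DqServerVdt"),
}


def components_for(flavor):
    """Return the flavor-specific component names for one grid flavor."""
    try:
        return COMPONENTS[flavor]
    except KeyError:
        raise ValueError("unknown grid flavor: " + flavor) from None


def differing_roles(flavor_a, flavor_b):
    """List the roles in which two flavors use different components."""
    a, b = components_for(flavor_a), components_for(flavor_b)
    return [role for role in FlavorComponents._fields
            if getattr(a, role) != getattr(b, role)]
```

Comparing "nordugrid" with "usgrid" this way shows that, per the slides, the two share catalog and replica-access components and differ only in the information service and server.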

Integration with ATLAS prodsys
Executors are using their “native” grid tools to do file registration
  o But are adding the extra metadata attributes required by DQ
  o This allows integration with DQ
Windmill is using DQ
  o To locate replicas of files
  o To rename logical files to their final names (after validation)
  o This week: move files across grids so that each executor finds at least one replica of every file required by its jobs
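
The last bullet on this slide describes a small planning problem: given which files each executor's grid needs and where replicas currently exist, work out which cross-grid copies are missing. A minimal sketch, with a hypothetical function name and data layout, and with source selection reduced to "alphabetically first grid that holds a replica" as a placeholder:

```python
def plan_transfers(required, replicas):
    """Plan the copies needed so every grid holds the files its jobs require.

    required: {grid: set of LPNs needed by jobs on that grid}
    replicas: {lpn: set of grids that already hold a replica}
    Returns a list of (lpn, source_grid, dest_grid) tuples.
    """
    plan = []
    for dest in sorted(required):
        for lpn in sorted(required[dest]):
            holders = replicas.get(lpn, set())
            if dest in holders:
                continue  # a replica is already available on this grid
            if not holders:
                raise LookupError("no replica known for " + lpn)
            # Placeholder choice of source; not a "best replica" policy.
            plan.append((lpn, min(holders), dest))
    return plan
```

In practice the replica locations would come from catalog searches and each planned tuple would be handed to a file-movement call; both steps are outside this sketch.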

Future plans
Better integration with POOL
  o Must come from end-users' experience
Better end-user documentation and support
  o For now, focus has been only on the Automatic Production System
Get “best” replica (not high priority)
  o within a grid
  o between grids
Monitoring
  o Still being discussed…
Reliable transfer service
  o Using MySQL database to manage transfers and automatic retries
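
The planned reliable transfer service is described only as a database that manages transfers and automatic retries. One way such a queue could work is sketched below. To keep the example self-contained it uses Python's built-in sqlite3 in place of MySQL; the table layout, function names, state names and retry limit are all hypothetical.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (and if needed create) the transfer queue table."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS transfers (
            id       INTEGER PRIMARY KEY,
            src      TEXT NOT NULL,
            dest     TEXT NOT NULL,
            attempts INTEGER NOT NULL DEFAULT 0,
            state    TEXT NOT NULL DEFAULT 'pending'
        )""")
    return db


def submit(db, src, dest):
    """Queue one transfer request."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO transfers (src, dest) VALUES (?, ?)",
                   (src, dest))


def run_once(db, copy, max_attempts=3):
    """Try every pending transfer once; failures stay pending for a retry."""
    rows = db.execute(
        "SELECT id, src, dest, attempts FROM transfers "
        "WHERE state = 'pending' ORDER BY id").fetchall()
    for tid, src, dest, attempts in rows:
        try:
            copy(src, dest)
            state = "done"
        except Exception:
            # Give up only after max_attempts failed tries.
            state = "failed" if attempts + 1 >= max_attempts else "pending"
        with db:
            db.execute(
                "UPDATE transfers SET attempts = attempts + 1, state = ? "
                "WHERE id = ?", (state, tid))


def states(db):
    """Return the state of every transfer, in submission order."""
    return [row[0] for row in db.execute(
        "SELECT state FROM transfers ORDER BY id")]
```

Because the queue lives in a database rather than in memory, a restarted service can pick up pending transfers where it left off; calling run_once periodically gives the automatic retries.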

Future plans
Release command line tools appropriate for end-users
  o Request has been made to provide such tools for the Combined Test Beam effort
Provide servers as Pacman-caches
Much to improve
  o Reliability
  o Easy installation of client tool for users outside “grid”
      Get local copies of files to non-grid machine → wrap in Pacman the minimal Globus GridFTP libraries
As true interoperability comes, Don Quijote goes…
  o Common information schema & similar catalogs
  o Common interface to storage resource “managers”