Slide 1: Using services in DIRAC
A. Tsaregorodtsev, CPPM, Marseille
2nd ARDA Workshop, 21-23 June 2004, CERN
LHCb week, 27 May 2004, CERN
Slide 2: DIRAC Services and Resources
[Architecture diagram]
- User interfaces: Production manager, GANGA UI, user CLI, Job monitor, BK query webpage, FileCatalog browser
- DIRAC services: Job Management Service, JobMonitorSvc, JobAccountingSvc (with its AccountingDB), InformationSvc, FileCatalogSvc, MonitoringSvc, BookkeepingSvc
- DIRAC resources: DIRAC CE; DIRAC sites running an Agent in front of CE 1, CE 2, CE 3; the LCG Resource Broker with its CE; DIRAC Storage (DiskFile, accessed via gridftp, bbftp, rfio)
Slide 3: GT3 attempt
- A JobReceiver prototype Grid Service was implemented
- The service receives jobs, checks their integrity, and stores them in the Job DB (MySQL)
- It notifies the Optimizers via Jabber
- Uses Tomcat and GT3
Slide 4: DIRAC WM architecture
[Diagram: a job source submits a job description (JDL) to the WMS and queries the status of the job.]
Slide 5: Globus Toolkit 3 (aka GT3)
What is GT3?
- The Globus implementation of OGSI
- Really just the new version of Globus, prepared predominantly by Argonne National Laboratory (ANL) and IBM
- Re-implements much of Globus Toolkit v2.4, but in Java
- Based on Web Services; runs from the Tomcat Java application server
Slide 6: OGSI vs. GT3
Concepts:
- OGSA: the idea ("The Physiology of the Grid")
- OGSI: the formal standard (v1.0)
- Web Services: the foundation for OGSI
Implementations:
- GT3: the Globus OGSI implementation
- Axis: Web Service framework
- Tomcat: Java application server
- DIRAC services sit on top of this stack
Slide 7: GT3 attempt results
- GT3 is currently difficult to install, configure, and administer
  - It took 2+ weeks to get GT3 fully installed and running properly
  - The discuss@globus.org mailing list is full of similar stories
- Service development and deployment is not straightforward
  - One must define GWSDL, WSDD, XML Schemas, etc. (don't worry about the acronyms)
- Service performance is very low: more than a minute to submit one job
- Good news: the client side is not too complicated, and should be language independent
Slide 8: Other OGSI implementations
- OGSI::Lite: a Perl version developed at Manchester
- pyGridWare: a 100% Python OGSI implementation
  - Currently the client may be ready; the server is not
- We are interested because of the possibility of a Ganga OGSI client
- We expect it to be lightweight, easy to deploy, robust, and high performance, but these expectations must still be confirmed
Slide 9: XML-RPC protocol
- Standard and simple; available out of the box in the standard Python library (both server and client), using the expat XML parser
- Server: very simple, socket based, multithreaded
- Client: dynamically built service proxy
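The points above can be sketched with the Python standard library alone. This is a minimal illustration of a multithreaded XML-RPC server and a dynamically built client proxy (it uses today's `xmlrpc` module names; the method `ping` and the port choice are illustrative, not part of DIRAC):

```python
import threading
import xmlrpc.client
from socketserver import ThreadingMixIn
from xmlrpc.server import SimpleXMLRPCServer

class ThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    """Serve each XML-RPC request in its own thread."""
    daemon_threads = True

def ping(name):
    """A trivial method standing in for a real service call."""
    return "pong: " + name

# Bind to an ephemeral port so the sketch can run anywhere.
server = ThreadedXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(ping)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client-side proxy is built dynamically: attribute access on the
# proxy object is turned into an XML-RPC call over HTTP at run time.
proxy = xmlrpc.client.ServerProxy("http://127.0.0.1:%d" % port)
result = proxy.ping("dirac")
server.shutdown()
```

The dynamic proxy is what keeps the client so thin: no stub generation or WSDL step is needed, in contrast to the GT3 experience described earlier.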
Slide 10: File catalog service
- The LHCb Bookkeeping was not meant to be used as a File (Replica) Catalog
  - Its main use is as a Metadata and Job Provenance database
  - Its replica catalog is based on specially built views
- The AliEn File Catalog was chosen to get a full set of the necessary functionality:
  - Hierarchical structure: logical organization of the data with optimized queries, ACLs by directory, metadata by directory, a file-system paradigm
  - Robust, proven implementation
  - Easy to wrap as an independent service
- Inspired by the ARDA RTAG work
Slide 11: AliEn FileCatalog in DIRAC
- The AliEn FC SOAP interface was not ready at the beginning of 2004
  - We had to provide our own XML-RPC wrapper, compatible with the XML-RPC BK File Catalog
  - Implemented using the AliEn command line ("alien -exec"): ugly, but it works
- The service is built on top of AliEn and runs as the lhcbprod AliEn user
  - Not really using the AliEn security mechanisms
  - Using AliEn version 1.32
- So far in DC2004: >100,000 files with >250,000 replicas; very stable performance
Slide 12: File catalogs
[Diagram: DIRAC applications and services use a common FileCatalog client with two interchangeable backends:
- the AliEn FileCatalog Service: an XML-RPC server driving the AliEn UI in front of the AliEn FC (MySQL);
- the BK FileCatalog Service: an XML-RPC server in front of the LHCb BK DB (Oracle).]
Slide 13: FileCatalog common interface
addFile(lfn, guid, size, SE, pfn, poolType)
rmFile(lfn)
rmFileByGuid(guid)
addPfn(lfn, pfn, SE)
addPfnByGuid(guid, pfn)
removePfn(lfn, pfn)
getPfnsByLfn(lfn)
getPfnsByGuid(guid)
getGuidByLfn(lfn)
getLfnByGuid(guid)
exists(lfn)
existsGuid(guid)
getFileSize(lfn)
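To make the shape of this interface concrete, here is an in-memory stub implementing the methods listed above. It is purely illustrative: neither of the real backends (AliEn or BK) works this way internally, and the dictionary layout is an assumption of this sketch.

```python
class FileCatalogStub:
    """In-memory sketch of the common FileCatalog interface (illustrative only)."""

    def __init__(self):
        self.files = {}        # lfn -> {"guid": ..., "size": ..., "pfns": {pfn: SE}}
        self.guid_to_lfn = {}  # guid -> lfn

    def addFile(self, lfn, guid, size, SE, pfn, poolType):
        self.files[lfn] = {"guid": guid, "size": size, "pfns": {pfn: SE}}
        self.guid_to_lfn[guid] = lfn
        return True

    def addPfn(self, lfn, pfn, SE):
        self.files[lfn]["pfns"][pfn] = SE
        return True

    def addPfnByGuid(self, guid, pfn):
        # The listed signature carries no SE, so none is recorded here.
        self.files[self.guid_to_lfn[guid]]["pfns"][pfn] = None
        return True

    def removePfn(self, lfn, pfn):
        self.files[lfn]["pfns"].pop(pfn, None)
        return True

    def rmFile(self, lfn):
        entry = self.files.pop(lfn, None)
        if entry:
            self.guid_to_lfn.pop(entry["guid"], None)
        return True

    def rmFileByGuid(self, guid):
        return self.rmFile(self.guid_to_lfn.get(guid))

    def getPfnsByLfn(self, lfn):
        return list(self.files[lfn]["pfns"])

    def getPfnsByGuid(self, guid):
        return self.getPfnsByLfn(self.guid_to_lfn[guid])

    def getGuidByLfn(self, lfn):
        return self.files[lfn]["guid"]

    def getLfnByGuid(self, guid):
        return self.guid_to_lfn[guid]

    def exists(self, lfn):
        return lfn in self.files

    def existsGuid(self, guid):
        return guid in self.guid_to_lfn

    def getFileSize(self, lfn):
        return self.files[lfn]["size"]
```

Every method works through LFNs or GUIDs, which is what lets one client class drive either backend over XML-RPC unchanged.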
Slide 14: Reliability issues
- Regular back-up of the underlying database
- Journaling of all the write operations
- Running more than one instance of each service
  - e.g. the Configuration service runs at CERN and in Oxford
- Running services reliably with the runit set of tools
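The journaling point can be sketched as follows: every write operation is appended to a journal file before it is applied, so a fresh service instance can replay the journal after a crash or a database restore. The record format and the `JournaledStore` class are inventions of this sketch, not DIRAC's actual journal.

```python
import json
import time

class JournaledStore:
    """Sketch of journaling write operations before applying them (illustrative)."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.state = {}

    def _journal(self, op, args):
        # Append-only record, written before the operation is applied.
        with open(self.journal_path, "a") as f:
            f.write(json.dumps({"t": time.time(), "op": op, "args": args}) + "\n")

    def addFile(self, lfn, pfn):
        self._journal("addFile", [lfn, pfn])
        self.state[lfn] = pfn

    def rmFile(self, lfn):
        self._journal("rmFile", [lfn])
        self.state.pop(lfn, None)

    def replay(self):
        # Rebuild state from the journal, e.g. on a fresh instance
        # after restoring a database back-up.
        self.state = {}
        with open(self.journal_path) as f:
            for line in f:
                rec = json.loads(line)
                if rec["op"] == "addFile":
                    self.state[rec["args"][0]] = rec["args"][1]
                elif rec["op"] == "rmFile":
                    self.state.pop(rec["args"][0], None)
```

Combined with regular database back-ups, the journal covers the window between the last back-up and a failure.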
Slide 15: runit service management tool
http://smarden.org/runit/
runit package features:
- Automatic restart on failure
- Automatic logging of standard output with time/date stamps
- Automatic log rotation
- Cleanup scripts on process completion
- Simple interface to pause, stop, and send control signals to services
- Runs the service in daemon mode
- Can be used entirely in user space
- Light-weight service management
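A runit-supervised service is defined by a `run` script in a service directory, along the lines of the fragment below. The directory name and the Python entry point are illustrative, not DIRAC's actual layout.

```shell
#!/bin/sh
# run script for a supervised service, e.g. /etc/service/dirac-fc/run
# (the path and the script name are illustrative).
# runsv restarts the process automatically whenever it exits.
exec 2>&1          # merge stderr into the supervised log stream
exec python /opt/dirac/FileCatalogSvc.py
```

A companion `log/run` script executing `exec svlogd -tt ./main` provides the timestamped, automatically rotated logs mentioned above, and `sv up`, `sv down`, and `sv term` give the simple pause/stop/signal interface.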
Slide 16: Using Instant Messaging
- A mechanism to pass messages across the network (ICQ, MSN, etc.)
  - Asynchronous and robust
  - The client needs only outbound connectivity for bidirectional message exchange
- Jabber as the IM protocol:
  - Open-source, mature server implementation
  - Uses XML for message formatting
  - Possibility to build a network of Jabber servers (load balancing, proxies)
  - Client APIs exist in many languages, Python included
Slide 17: Using Instant Messaging (2)
- Initially used for communication between WMS components: the JobReceiver notifies the Optimizers using Jabber
- Monitoring of Agents
  - Uses the chat-room paradigm, where agents update their "presence" status
  - Possibility of broadcasting messages to agents
- Monitoring of jobs
  - A Jabber client runs next to the job
  - Gives access to the job's std.out/err
  - Provides an interactive control channel to the job
Slide 18: Using Instant Messaging (3)
- Services can communicate via the IM protocol: XML-RPC over Jabber works just fine
- Agents deployed on the fly on a computing resource can work as fully functional services with outbound connectivity only (no firewall problems)
  - This paradigm is already used to run jobs on LCG
- Adding a proxy for message passing is just a matter of yet another Jabber server
- Client and server keep the connection open and authenticate only once
- Scalability issues: problems when monitoring thousands of jobs
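What makes "XML-RPC over Jabber" work is that the XML-RPC marshalling is independent of the transport: the request body built below could travel as the payload of a Jabber message instead of an HTTP POST. This sketch shows only the marshalling round trip with the Python standard library; the Jabber transport itself is omitted, and the method name and handler are illustrative.

```python
import xmlrpc.client

# Client side: marshal a call into an XML request body.
request = xmlrpc.client.dumps(("job-1234",), methodname="getJobStatus")

# ...the request string would travel to the service inside a Jabber message...

# Server side: unmarshal, dispatch to a handler, marshal the response.
params, method = xmlrpc.client.loads(request)
handlers = {"getJobStatus": lambda job_id: "running"}
response = xmlrpc.client.dumps((handlers[method](*params),), methodresponse=True)

# Client side again: decode the response payload.
(result,), _ = xmlrpc.client.loads(response)
```

Since the marshalled strings are plain XML, they fit naturally inside Jabber's XML message stream, and the agent needs only its single outbound Jabber connection.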
Slide 19: Conclusions
- Building services from existing components is rather easy; making services secure is not
- Having more than one implementation of each service is a must
- Instant Messaging is very promising for creating dynamically deployed, light-weight services (agents)