The EDG Testbed
The European DataGrid Project Team

Contents
- User's Perspective of the Grid
- Grid Services
- Hardware Components of an EDG Testbed
- The EDG Testbed Configuration
- How to set up an EDG Testbed
  - Obtaining code
  - Configuring different machines

A 3-Tier Business Architecture
Client -> Application Server -> Data Server (Request, Result; Data Request)
On the EDG: User Interface -> Computing Element / Worker Nodes -> Storage Element

Situation on a Grid

Information Services
- Hardware:
  - EDG Information Service
  - Information Providers
- Data:
  - Replica Catalog
- Software & Services:
  - EDG Grid Services:
    - Information Service
  - Application Services:
    - Currently only EDG applications directly supported
Machine Types:
- Information Service (IS)
- Replica Catalog (RC)

Situation on a Grid (Cont'd)
[Figure labels: Info Service, Information Providers, Replica Catalog]

Main EDG Grid Services
- Authentication & Authorization
- Job submission service
  - Resource Broker
- Replica Management
  - Grid Data Mirroring Package (GDMP)
  - EDG-Replica-Manager (Globus Replica Manager)
  - Mass storage system support
- Logging & Bookkeeping

EDG Logical Machine Types
- User Interface (UI)
- Information Service (IS)
- Computing Element (CE)
  - Frontend Node
  - Worker Nodes (WN)
- Storage Element (SE)
- Replica Catalog (RC)
- Resource Broker (RB)

Services per Machine Type
Table columns (machine types): UI, IS, CE (frontend), WN, SE, RC, RB
Table rows (daemons):
- Globus Gatekeeper
- Replica Catalog
- GSI-enabled FTPd
- Globus MDS
- Info-MDS
- Broker
- Job submission
- Information Index
- Logging & Bookkeeping
- Local Logger
- CRL Update
- Grid mapfile Update
- RFIO
- GDMP

A Simple Testbed Configuration
[Figure components: User Interface, Resource Broker, Replica Catalog, Information Service, Storage Element 1, Storage Element 2, Computing Element 1, Computing Element 2; figure label: "CLOSE"]

Testbeds
Application Testbed: End-user Applications
- Software: Stable, certified release (EDG 1.4.7)
Certification Testbed: Extended, Detailed Testing
- Software: Tagged release
- State: Starting…; Collaboration with Testing Group/LCG.
Development Testbed: Integration & Evaluation of SW
- Software: Current tagged release + new pkg. New tagged release.
- State: Active use; 5 sites involved.
Development Machines: Testing of Middleware in Isolation
- Software: Bleeding edge versions.
- State: Varied; under control of middleware work packages.

DataGrid Testbeds
Application testbed:
- More than 1000 CPUs
- 5 Terabytes of storage
- EDG software installed at more than 40 sites

Application Testbed Resources
Since Last Year:
- Improved software (EDG 1.4.7).
- Doubled sites. More waiting…
  - Australia, Taiwan, USA (U. Wisc.), UK Sites, INFN, French sites, CrossGrid, …
- Significantly more CPU/Storage.
Hidden Infrastructure
- MDS Hierarchy
- Resource Brokers
- User Interfaces
- VO Replica Catalogs
- VO Membership Servers
- Certification Authorities

Site            Country  CPUs / Storage
CC-IN2P3*       FR       GB
CERN*           CH       GB
CNAF*           IT       GB
Ecole Poly.     FR       6220 GB
Imperial Coll.  UK       92450 GB
Liverpool       UK       210 GB
Manchester      UK       915 GB
NIKHEF*         NL       GB
Oxford          UK       130 GB
Padova          IT       11666 GB
RAL*            UK       6332 GB
SARA            NL       GB
TOTAL                    GB
* also Dev. TB; +200 TB including tape

Example IS Content
Site: NIKHEF

CE tbn09.nikhef.nl:2119/jobmanager-pbs-qlong:
- PBS queue "qlong" with 96 hours time limit
- Software installed: CMS ATLAS ALICE LHCb IDL-5.4 NIKHEF D0MCC
There are 0 jobs running and 0 waiting, with 16 CPUs free
Close SE tbn03.nikhef.nl with mount point /flatfiles

CE tbn09.nikhef.nl:2119/jobmanager-pbs-qshort:
- PBS queue "qshort" with 240 minutes time limit
- Software installed: CMS ATLAS ALICE LHCb IDL-5.4 NIKHEF D0MCC
There are 0 jobs running and 0 waiting, with 16 CPUs free
Close SE tbn03.nikhef.nl with mount point /flatfiles

SE tbn03.nikhef.nl close to 2 CEs:
- tbn09.nikhef.nl:2119/jobmanager-pbs-qshort
- tbn09.nikhef.nl:2119/jobmanager-pbs-qlong
- VOs supported: alice atlas biomedical cms earthob lhcb iteam
- gridftp on port rfio on port file Mb of free space
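
Information of this kind is what a job submission service needs in order to pick a suitable Computing Element. The following is a minimal Python sketch of such a selection, using the NIKHEF values from the slide. The `ComputingElement` class, its field names, and the `eligible_ces` function are hypothetical and chosen for illustration; they are not the EDG information schema or the Resource Broker's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputingElement:
    """Hypothetical record holding the per-queue facts shown on the slide."""
    ce_id: str
    max_minutes: int      # queue time limit in minutes
    software: frozenset   # installed software tags
    free_cpus: int
    close_se: str         # Storage Element published as "close" to this CE


SOFTWARE = frozenset(
    {"CMS", "ATLAS", "ALICE", "LHCb", "IDL-5.4", "NIKHEF", "D0MCC"}
)

CES = [
    ComputingElement("tbn09.nikhef.nl:2119/jobmanager-pbs-qlong",
                     96 * 60, SOFTWARE, 16, "tbn03.nikhef.nl"),
    ComputingElement("tbn09.nikhef.nl:2119/jobmanager-pbs-qshort",
                     240, SOFTWARE, 16, "tbn03.nikhef.nl"),
]


def eligible_ces(ces, software, minutes, data_se=None):
    """Return the ids of CEs that could run a job with these requirements."""
    matches = []
    for ce in ces:
        if software not in ce.software:
            continue  # required software tag is not installed
        if minutes > ce.max_minutes:
            continue  # job would exceed the queue time limit
        if ce.free_cpus < 1:
            continue  # no free CPU at the moment
        if data_se is not None and ce.close_se != data_se:
            continue  # input data is not on the CE's close SE
        matches.append(ce.ce_id)
    return matches
```

For example, a 300-minute ATLAS job whose input sits on tbn03.nikhef.nl only fits the "qlong" queue, because "qshort" is limited to 240 minutes.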

EDG Software Distribution
- All software available as source & binary RPMs
- Binaries for RedHat 6.2 and RedHat 7.3
- > 600 packages including
  - Complete Globus distribution
  - EDG packages (~50 packages)
  - Support tools (perl, ant, jdk, …)
- Pre-packaged for different machine types

Automatic EDG Fabric Management
Setup Tasks
- Node Installation & Management
- Configuration Management
Runtime Tasks
- Monitoring & Fault Tolerance
- Resource Management
Runtime tasks may automatically trigger setup tasks
- New machines join the grid
- Failure detection/repair (e.g. restarting daemons)

LCFG (Local ConFiGuration system)
- Developed at University of Edinburgh
- Widely used fabric installation & configuration tool
- Automated installation and configuration in a very diverse and evolving environment
[Figure: LCFG SERVER: LCFG configuration files -> Compiler (mkxprof) -> XML Profile -> Web Server. LCFG CLIENT: rdxprof (HTTP) -> DBM File -> ldxprof -> LCFG Components (Generic Component). Between server and client: Notify (UDP), Acknowledge.]
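
The flow suggested by the figure labels (compile on the server, notify the client, fetch the profile, acknowledge) can be mimicked with a toy in-memory model. This is only a sketch under stated assumptions: the `Server` and `Client` classes and their methods are invented for illustration, and none of the real tools or transports (mkxprof, rdxprof, HTTP, UDP, the DBM file) are invoked.

```python
class Server:
    """Toy stand-in for the LCFG server side (hypothetical API)."""

    def __init__(self):
        self.profiles = {}  # hostname -> (version, compiled profile)
        self.acks = {}      # hostname -> last version the client acknowledged

    def compile(self, hostname, resources):
        """Stand-in for compiling config files into a published profile."""
        version = self.profiles.get(hostname, (0, None))[0] + 1
        self.profiles[hostname] = (version, dict(resources))
        return version

    def notify(self, client):
        """Tell the client a new profile exists and record its acknowledgement."""
        self.acks[client.hostname] = client.on_notify(self)


class Client:
    """Toy stand-in for the LCFG client side (hypothetical API)."""

    def __init__(self, hostname):
        self.hostname = hostname
        self.version = 0
        self.local_store = {}  # plays the role of the client's local profile copy

    def on_notify(self, server):
        """Fetch the profile if it changed, then acknowledge the version held."""
        version, profile = server.profiles[self.hostname]
        if version != self.version:
            self.local_store = dict(profile)
            self.version = version
        return self.version
```

In this model, an update on the server becomes visible on a node only after `compile` and `notify` have both run, which mirrors the "update config files, run mkxprof" cycle in spirit only.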

Example LCFG Configuration File
Config files:
  +inet.services telnet login ftp
  +inet.allow telnet login ftp sshd
  +inet.allow_telnet ALLOWED_NETWORKS
  +inet.allow_login ALLOWED_NETWORKS
  +inet.allow_ftp ALLOWED_NETWORKS
  +inet.allow_sshd ALL
  +inet.daemon_sshd yes
  auth.users mickey
  +auth.userhome_mickey /home/mickey
  +auth.usershell_mickey /bin/tcsh
mkxprof
XML profiles:
  , /home/MickeyMouseHome /bin/tcsh
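
To make the structure of such lines concrete, here is a small Python sketch that groups them by component. It assumes each line has the form `[+]component.resource value`, as in the example on the slide. The function name `parse_resources` is hypothetical, the leading `+` is simply stripped, and none of mkxprof's actual processing is modeled.

```python
def parse_resources(text):
    """Group 'component.resource value' lines into a nested dict."""
    resources = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        key = key.lstrip("+")  # the meaning of '+' is not modeled here
        component, _, resource = key.partition(".")
        resources.setdefault(component, {})[resource] = value.strip()
    return resources


SAMPLE = """
+inet.allow telnet login ftp sshd
+inet.allow_sshd ALL
+inet.daemon_sshd yes
+auth.userhome_mickey /home/mickey
+auth.usershell_mickey /bin/tcsh
"""

parsed = parse_resources(SAMPLE)
```

Here `parsed["inet"]` holds the three `inet` resources and `parsed["auth"]` the two `auth` resources, keyed by resource name.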

Fabric Monitoring & Fault Tolerance
[Figure labels: Sensor, Collector agent, Local Node, Cache, monitoring, Decision unit, Actuator agent, Actuator, Rule config, Central Repository, DB, Consumer]
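
The components named in the figure suggest a loop in which sensor readings are cached, a decision unit evaluates configured rules, and actuators carry out repairs. A minimal Python sketch of one pass through such a loop follows. All names (`decide`, `run_once`, the metric keys, the action names) are hypothetical, and the sketch does not reproduce the EDG monitoring framework itself.

```python
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]
Rule = Tuple[Callable[[Metrics], bool], str]  # (condition, actuator name)


def decide(metrics: Metrics, rules: List[Rule]) -> List[str]:
    """Decision unit: return the actuator names whose rule conditions hold."""
    return [action for condition, action in rules if condition(metrics)]


def run_once(sensor: Callable[[], Metrics],
             rules: List[Rule],
             actuators: Dict[str, Callable[[], None]],
             cache: List[Metrics]) -> List[str]:
    """One cycle: read the sensor, cache the sample, decide, then act."""
    metrics = sensor()
    cache.append(metrics)  # local cache of collected samples
    fired = decide(metrics, rules)
    for name in fired:
        actuators[name]()  # e.g. restart a failed daemon
    return fired


# Example wiring with made-up metrics and actions.
log: List[str] = []
cache: List[Metrics] = []
rules: List[Rule] = [
    (lambda m: m["daemon_up"] == 0, "restart_daemon"),
    (lambda m: m["disk_used_pct"] > 90, "clean_tmp"),
]
actuators = {
    "restart_daemon": lambda: log.append("restart_daemon"),
    "clean_tmp": lambda: log.append("clean_tmp"),
}
fired = run_once(lambda: {"daemon_up": 0, "disk_used_pct": 50},
                 rules, actuators, cache)
```

Keeping the rules as data, separate from the actuators, corresponds to the "Rule config" box in the figure: the repair policy can change without touching the code that performs the repair.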

Wrap Up
- Logical machine types of an EDG Testbed
- Mapping of services to logical machines
- Example and current EDG Testbed configuration
- Code distribution strategy
- Fabric management strategy
=> How to set up an EDG Testbed

LCFG Installation
Server setup:
- Download rpms (perl + lcfg + apache)
- Install rpms
- Start http server (apache, …)
- Create configuration files
- Run mkxprof on them
Client setup:
- Download rpms (perl + lcfg)
- Install rpms
- Reboot (rdxprof will be started)
Configuration management (server):
- Update config files
- Run mkxprof

EDG Machine Installation
On the LCFG server:
- Create directories for rpms
- Download rpms from central EDG repository
- Create LCFG profile for each client machine:
  - Filename = hostname; includes machine type specific config file and site specific config file (needs to be customized!)
  - Example templates + rpm-lists are provided
  - Run mkxprof on each of these files
On the LCFG clients:
- Set up clients as described before
DONE
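
The per-machine profile described above (file named after the host, pulling in a machine-type config and a site config) can be sketched in Python as follows. This is an illustration under assumptions: the include syntax is assumed to be C-preprocessor style, and the file names `site-cfg.h` and `<type>-cfg.h` as well as the function `build_profile` are hypothetical; the templates shipped with EDG should be consulted for the real names.

```python
MACHINE_TYPES = {"UI", "IS", "CE", "WN", "SE", "RC", "RB"}


def build_profile(hostname, machine_type, site_config="site-cfg.h"):
    """Return (filename, content) for one client's profile source file."""
    if machine_type not in MACHINE_TYPES:
        raise ValueError(f"unknown machine type: {machine_type}")
    lines = [
        f"/* profile for {hostname} ({machine_type}) */",
        f'#include "{site_config}"',           # site-specific settings (assumed name)
        f'#include "{machine_type}-cfg.h"',    # machine-type settings (assumed name)
    ]
    # The file is named after the host, as stated on the slide.
    return hostname, "\n".join(lines) + "\n"


name, text = build_profile("ce01.example.org", "CE")
```

Generating these files from a host list keeps the per-host part trivial, so that site customization stays in the one shared site config file.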

Manual Setup (without LCFG)
- Download rpms directly on machine (RPM-lists per machine type exist)
- Install rpms
- Configure individual services (see installation guide)

Issues when Adding New Sites to the Testbed
- EDG is currently setting up procedures explaining how to add new sites
  - Variations already tested with Taiwan and Romania
  - Step-by-step instructions produced, which we expect to become simpler over time
- Need to clarify the “minimum requirements” for a site to become a member of the testbed
  - A number of regular tasks have to be performed by the site administrators
  - A maximum delay needs to be defined for responding to requests/problems if the testbed is to run efficiently
- Sites from new countries have to identify/create a supporting CA
  - Since CAs need mutual trust, this could lead to an explosion of inspection activities
- Some tasks will fall on the people responsible for managing the VOs
  - HEP experiment secretariats already perform some level of authentication of their institutes and members. How can we get some leverage from this?

Summary
- Logical machine types of an EDG Testbed
- Mapping of services to logical machines
- Example and current EDG Testbed configuration
- Code distribution strategy
- Fabric management strategy
- How to obtain EDG software
- How to automatically configure machines

Outlook
- Release currently deployed
- Release 2.0 (currently being deployed, rollout expected May 2003) will contain more advanced services
  - Advanced information systems (based upon relational databases)
  - Enhanced security
  - Optimization (resource broker and replica management)
  - Fabric management with monitoring, automatic fault detection & recovery

Further Information
- EDG Testbed homepage:
- Fabric management: fabric/
- LCFG on EDG Testbed information: oc/