USING THE GLOBUS TOOLKIT
  This summary by:
    Asad Samar / CALTECH/CMS
    Ben Segal / CERN-IT
  FULL INFO AT:

Points of Discussion
  Our plans
  Our requirements
  What GLOBUS can do for us
    –The GLOBUS toolkit
    –Using the GLOBUS tools
  Pros and cons of using GLOBUS
  Any other options to consider?
  Conclusions

Our Requirements
  A common infrastructure to avoid
    –Repetition of code
    –A piecemeal collection of parts with no structured tool set
    –Inconsistencies in interfaces
  Security
  Performance
  In line with standards (existing & upcoming)

Executing the plan
  Setting up an information service for the shared resources
    –CPU power
    –Storage capability
    –Network status
  Adding security features
  Managing resource allocation for remote requests
  Fault detection mechanisms

The GLOBUS toolkit
  Provides basic services for a computational GRID infrastructure.
  Toolkit components include
    –MDS
    –GRAM, RSL, DUROC
    –GSI
    –GASS, GEM, RIO
    –GloPerf, HBM

Using GLOBUS Toolkit - 1
  Information Service
    –MDS (Metacomputing Directory Service)
      Provides static and dynamic information about compute resources, network performance, etc.
      Yellow pages: list all computers of a particular class or with a certain property.
      White pages: look up the IP address, memory, or CPU power of a particular machine.
      Information is stored in a set of LDAP servers (using LDAPv3 with referrals).

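The yellow-pages / white-pages distinction can be illustrated with a minimal sketch. It uses a hypothetical in-memory list of entries rather than real LDAP servers; the host names, addresses, and every attribute name other than "hn" are assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for MDS entries. A real deployment would
# query LDAP servers; attribute names other than "hn" are assumptions.
ENTRIES = [
    {"hn": "abc", "ip": "192.0.2.10", "os": "linux", "cpus": 4, "memory_mb": 512},
    {"hn": "lmn", "ip": "192.0.2.11", "os": "solaris", "cpus": 8, "memory_mb": 1024},
    {"hn": "xyz", "ip": "192.0.2.12", "os": "linux", "cpus": 2, "memory_mb": 256},
]


def yellow_pages(entries, predicate):
    """Return the host names of every entry that satisfies the predicate."""
    return [entry["hn"] for entry in entries if predicate(entry)]


def white_pages(entries, hostname):
    """Return the full record for one named host, or None if it is unknown."""
    for entry in entries:
        if entry["hn"] == hostname:
            return entry
    return None


# Yellow pages: which machines belong to a class or have a property?
linux_hosts = yellow_pages(ENTRIES, lambda e: e["os"] == "linux")
big_hosts = yellow_pages(ENTRIES, lambda e: e["cpus"] >= 4)

# White pages: what do we know about one particular machine?
abc_record = white_pages(ENTRIES, "abc")
```

Yellow-pages queries filter on properties and return many hosts; white-pages queries start from one known host and return its attributes.
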
Using GLOBUS Toolkit - 1a
  MDS Usage
    –Enroll in MDS
    –Initialize and populate
      Build the Directory Information Tree (DIT)
      Define the object class values
    –Visualizing MDS
      MDS object class browser
      MDS explorer
      Command line searches
  [Figure: Mapping resources to DIT]

Using GLOBUS Toolkit - 2
  Security
    –GSI (GLOBUS Security Infrastructure: PKI + X.509 certificates + proxies)
    –GSI-enabled SSH
    –GSI-enabled FTP
  Usage
    –Exchange certificates, authenticate & delegate
    –Check grid-map file
    –Check services
    –Run service program, e.g. job manager
  [Figure: Steps to GLOBUS authentication]

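The "check grid-map file" step can be sketched as follows. This assumes the common grid-map layout of one entry per line, a quoted certificate subject followed by a local account name; the subject names and accounts shown are made up, and real files may carry variations (such as several local accounts per entry) that this sketch ignores.

```python
import shlex


def parse_grid_map(text):
    """Map certificate subjects to local accounts from grid-map style text."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            continue  # ignore entries this simple sketch does not understand
        subject, local_user = parts
        mapping[subject] = local_user
    return mapping


def local_account(mapping, subject):
    """Return the local account for an authenticated subject, or None."""
    return mapping.get(subject)


# Hypothetical file contents, for illustration only.
SAMPLE = '''
# subject                                  local account
"/O=Grid/O=CERN/OU=cern.ch/CN=Jane Doe"    jdoe
"/O=Grid/O=Caltech/CN=John Smith"          jsmith
'''

grid_map = parse_grid_map(SAMPLE)
jane = local_account(grid_map, "/O=Grid/O=CERN/OU=cern.ch/CN=Jane Doe")
stranger = local_account(grid_map, "/O=Grid/O=Nowhere/CN=Unknown")
```

A subject that authenticates successfully but has no entry (here `stranger`) is not mapped to any local account, so the request would be refused before any service is started.
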
Using GLOBUS Toolkit - 3
  Resource Management
    –GRAM (GLOBUS Resource Allocation Manager)
    –DUROC (Dynamically Updated Request Online Co-allocator)
  Usage
    –globusrun, globus-duroc
    –RSL (Resource Specification Language)
    –GRAM client, myjob and jobmanager APIs and DUROC libraries for application development
  [Figure: A co-allocation multi-request. Labels: Application, Information service (queries, information), RSL, DUROC, GRAM, Local resource managers (LSF, EASY-LL, NQE)]

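To make the RSL and multi-request ideas concrete, here is a minimal sketch that only builds request strings; it does not submit anything. It assumes the usual RSL conventions in which "&" introduces a conjunction of (attribute=value) relations and "+" introduces a multi-request. The helper names and the host names are hypothetical.

```python
import re

# Values made only of these characters are emitted unquoted in this sketch.
_PLAIN = re.compile(r"[A-Za-z0-9_./:-]+")


def relation(attribute, value):
    """Format one (attribute=value) relation, quoting the value if needed."""
    text = str(value)
    if '"' in text:
        raise ValueError("embedded quotes are not handled in this sketch")
    if not _PLAIN.fullmatch(text):
        text = f'"{text}"'
    return f"({attribute}={text})"


def single_request(**attributes):
    """Build a conjunction request for one resource manager."""
    return "&" + "".join(relation(k, v) for k, v in attributes.items())


def multi_request(*subjobs):
    """Combine several single requests into one co-allocation multi-request."""
    return "+" + "".join(f"({subjob})" for subjob in subjobs)


job = single_request(executable="/bin/hostname", count=4)

# Host names below are placeholders, not real gatekeepers.
co_allocated = multi_request(
    single_request(resourceManagerContact="host-a.example.org",
                   count=2, executable="/bin/hostname"),
    single_request(resourceManagerContact="host-b.example.org",
                   count=2, executable="/bin/hostname"),
)
```

The single request is the kind of string handed to one GRAM; the multi-request is what a co-allocator such as DUROC would split into per-site subjobs.
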
Using GLOBUS Toolkit - 4
  Remote File Access
    –GASS (Global Access to Secondary Storage)
    –GEM (GLOBUS Executable Management)
  Usage
    –File access API
    –Cache management API
    –GASS client API
    –GASS server EZ API
  [Figure: Accessing remote data and executable management. Labels: GASS server, GASS client, Data store, Compute resource, Data request, Data, Results, Job request, GEM, WAN]

Using GLOBUS Toolkit - 5
  Communication
    –Globus I/O communication library
    –NEXUS (now obsolete)
    –MPICH-G (Grid-enabled MPI, not for us)
  Fault Detection
    –HBM (Heart Beat Monitor)

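The general idea behind heartbeat-based fault detection can be sketched in a few lines. This is a generic illustration, not the HBM API: the class, its methods, and the process names are all hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Track the last heartbeat of each process and flag the silent ones."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated
        self.last_seen = {}     # process name -> time of last heartbeat

    def beat(self, name, now):
        """Record a heartbeat from the named process at time `now`."""
        self.last_seen[name] = now

    def suspects(self, now):
        """Return the processes whose last heartbeat is older than the timeout."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=10.0)
monitor.beat("jobmanager", now=0.0)
monitor.beat("gass-server", now=8.0)

# At t = 12 s the job manager has been silent for 12 s, the server for 4 s.
late = monitor.suspects(now=12.0)
```

A missed heartbeat only makes a process a suspect; deciding whether it has really failed, and what to do about it, is left to the application.
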
Pros and Cons of using GLOBUS
  Pros
    –Using GIS to obtain resource information
      Search can extend to directories containing multiple hosts using filters
      No need to know OS commands on every host
      Same data format for the info of compute & network resources
  [Figure: Searching directories. Labels: o = CERN, ou = CERN/IT, hn = ‘abc’, hn = ‘lmn’, hn = ‘xyz’, search, results, WAN]

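A directory search scoped to one branch of the tree, with an optional filter, might look like the sketch below. It models distinguished names as tuples of (attribute, value) pairs, most specific first, instead of talking to an LDAP server. The hosts abc, lmn and xyz under ou = CERN/IT come from the figure; the extra host, the second organizational unit, and the "os" attribute are assumptions added for contrast.

```python
IT = (("ou", "CERN/IT"), ("o", "CERN"))
EP = (("ou", "CERN/EP"), ("o", "CERN"))  # hypothetical second branch

# Distinguished name -> attributes. The "os" attribute is illustrative.
DIRECTORY = {
    (("hn", "abc"),) + IT: {"os": "linux"},
    (("hn", "lmn"),) + IT: {"os": "solaris"},
    (("hn", "xyz"),) + IT: {"os": "linux"},
    (("hn", "pqr"),) + EP: {"os": "linux"},  # hypothetical host
}


def search(directory, base, predicate=lambda attrs: True):
    """Return host names under `base` whose attributes satisfy `predicate`."""
    depth = len(base)
    hits = []
    for name, attrs in directory.items():
        under_base = len(name) >= depth and name[len(name) - depth:] == base
        if under_base and predicate(attrs):
            hits.append(name[0][1])  # value of the leading hn component
    return sorted(hits)


it_hosts = search(DIRECTORY, IT)
it_linux = search(DIRECTORY, IT, lambda a: a["os"] == "linux")
all_cern = search(DIRECTORY, (("o", "CERN"),))
```

One query against the directory replaces logging in to each host and running OS-specific commands, and the results come back in the same format for every host.
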
Pros and Cons of using GLOBUS
  ...Pros
    –Local delegation: authenticate once using proxies
    –Submitting a job in parallel to multiple hosts using DUROC
    –Ease of use in general
    –A common infrastructure to build on
  [Figure: Authenticate once using proxy. Labels: Proxy, WAN, ANL, USC, NASA]

Pros and Cons of using GLOBUS
  Cons
    –Will become more evident as we start using the toolkit.
    –More content today for computational grids than for data grids.

Pros and Cons of using GLOBUS
  Status of GLOBUS work on data grids
    –Grid storage API defined
      Interface to storage systems including local file access, HTTP servers and DPSS network caches
      Provides create, delete, open, close, read, write operations on file instances
      Support for storage-to-storage transfers
    –Simple replica management and meta-data services
      Store and query attribute info about file instances, storage systems and replica catalogs using MDS

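The shape of such a storage interface can be sketched with a toy in-memory backend. The six operation names come from the slide; everything else (the class, the handle scheme, the `transfer` helper, the file name) is a hypothetical illustration and not the actual Grid storage API.

```python
class InMemoryStorage:
    """Toy storage system that keeps file instances as byte strings."""

    def __init__(self):
        self._files = {}       # file name -> contents
        self._handles = {}     # open handle -> file name
        self._next_handle = 0

    def create(self, name):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = b""

    def delete(self, name):
        del self._files[name]

    def open(self, name):
        if name not in self._files:
            raise FileNotFoundError(name)
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = name
        return handle

    def close(self, handle):
        del self._handles[handle]

    def read(self, handle):
        return self._files[self._handles[handle]]

    def write(self, handle, data):
        self._files[self._handles[handle]] += data  # append-only for brevity


def transfer(source, destination, name):
    """Storage-to-storage copy built only from the six basic operations."""
    src = source.open(name)
    destination.create(name)
    dst = destination.open(name)
    destination.write(dst, source.read(src))
    source.close(src)
    destination.close(dst)


site_a, site_b = InMemoryStorage(), InMemoryStorage()
site_a.create("run1.dat")
handle = site_a.open("run1.dat")
site_a.write(handle, b"events")
site_a.close(handle)

transfer(site_a, site_b, "run1.dat")
```

Because `transfer` relies only on the common operations, the same code would work between any two backends that implement them, which is the point of defining one storage API over local files, HTTP servers and network caches.
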
Any other options to consider?
  Any alternative should offer the same kind of core infrastructure, with applications built on top.
  Looking at:
    –Legion meta computing system
    –Condor high throughput computing facilities
    –CORBA architecture
    –Netbroker for network management issues

Conclusions
  The GLOBUS toolkit can provide us with the core infrastructure required.
  GLOBUS provides most of the required services, except remote data management.
  We will need to develop application-specific tools around these core services.