Presentation is loading. Please wait.

Presentation is loading. Please wait.

The Information service Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY 25-29 September 2006.

Similar presentations


Presentation on theme: "The Information service Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY 25-29 September 2006."— Presentation transcript:

1 The Information service Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY 25-29 September 2006

2 2 Introduction The Information Service (IS) provides information about the Grid resources and their status. This information is essential for the operation of the whole Grid, as it is via the IS that resources are discovered. The published information is also used for monitoring and accounting purposes. Two IS systems are used in gLite 3.0: the Globus Monitoring and Discovery Service (MDS), used for resource discovery and to publish the resource status. The Relational Grid Monitoring Architecture (RGMA), used for accounting, monitoring and publication of user- level information.

3 3 MDS The Globus MDS implements the GLUE Schema using OpenLDAP, an open source implementation of the Lightweight Directory Access Protocol (LDAP), a specialised database optimised for reading, browsing and searching information. No authentication is required for reading informations through ldap protocol. e.g. ldapsearch command or a graphical interface (LDAP browser/Editor) http://www-unix.mcs.anl.gov/~gawcr/ldap

4 4 The Information System (IS) provides information about grid resources and their actual status. Computing and Storage resources at a site report their static and dynamic status information via GRISes (Grid Resource Information Server)  one per grid element At each site an element called GIIS (Grid Index Information Server) collects information from all the site GRISes (From LCG2.3.0 site GIIS has been replaced by “local” BDII)  one per grid site The BDII (Berkley Database Information Index) queries the GIISes (of different sites) and acts as a cache storing information about the Grid status in its database. MDS System Overview

5 5 IS Components: GRISs, GIISs and BDII Each site can run a BDII. It collects the information coming from the GIISs At each site, a site GIIS collects the information given by the GRISs Local GRISes run on CEs and SEs at each site and report dynamic and static information Abbreviations: BDII: Berkeley DataBase Information Index GIIS: Grid Index Information Server GRIS: Grid Resource Information Server From LCG2.3.0 site GIIS has been replaced by “local” BDII

6 6 GLUE Schema The GLUE Schema (namely the LDAP implementation of the GLUE Schema) describes the Grid resources information that is stored by the Information System. Is there an object class hierarchy Each class contains a set of attributes.

7 7 GLUE Schema e.g. the GlueCEInfo objectClass General Info for the queue associated to the CE (objectclass GlueCEInfo) – GlueCEInfoLRMSType: name of the local batch system – GlueCEInfoLRMSVersion: version of the local batch system – GlueCEInfoGRAMVersion: version of GRAM – GlueCEInfoHostName: fully qualified name of the host where the gatekeeper runs – GlueCEInfoGateKeeperPort: port number for the gatekeeper – GlueCEInfoTotalCPUs: number of CPUs in the cluster associated to the CE ……..

8 How to query the IS? In order to query directly the IS elements two high level tools are provided. lcg-infositeslcg-info These tools should be enough for most common user needs and will usually avoid the necessary of raw LDAP queries.

9 lcg-infosites The lcg-infosites command can be used as an easy way to retrieve information on Grid resources for the most use cases. USAGE: lcg-infosites --vo options -v --is

10 lcg-infosites options

11 11 lcg-infosites The "lcg-infosites" command is actually just a perl script wrapping a series of LDAP commands and was developed to allow the user to retrieve information on Grid resources for the most common cases. Before beginning it is worth observing that "lcg-infosites" does not use your VOMS proxy certificate and hence all commands need to include the option "--vo gilda" the BDII defined into the LCG_GFAL_INFOSYS environment variable will be interrogated e.g. $echo $LCG_GFAL_INFOSYS $grid004.ct.infn.it:2170 --is option : BDII user wishes to query. e.g. $lcg-infosites –vo atlas ce --is prod-bdii.cern.ch

12 Obtaining information about CE $ lcg-infosites --vo gilda ce **************************************************************** These are the related data for gilda: (in terms of queues and CPUs) **************************************************************** #CPU Free Total Jobs Running Waiting ComputingElement ------------------------------------------------------------------------------------------ 4 3 0 0 0 cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long 4 3 0 0 0 cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short 34 33 0 0 0 grid010.ct.infn.it:2119/jobmanager-lcgpbs-long 16 16 0 0 0 grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long 1 1 0 0 0 grid006.cecalc.ula.ve:2119/jobmanager-lcgpbs-log 2 1 1 0 1 gildace.oact.inaf.it:2119/jobmanager-lcgpbs-short $ lcg-infosites --vo gilda ce -v 2 RAMMemory Operating System System Version Processor CE Name --------------------------------------------------------------------------------------------------------------------------------- 1024 SLC 3 P4 ced-ce0.datagrid.cnr.it 4096 SLC 3 Xeon cn01.be.itu.edu.tr 1024 SLC 3 PIII cna02.cna.unicamp.br 917 SLC 3 PIII gilda-ce-01.pd.infn.it 1024 SLC 3 Athlon gildace.oact.inaf.it 1024 SLC 3 Xeon grid-ce.bio.dist.unige.it

13 Obtaining information about SE $ lcg-infosites --vo gilda se ************************************************************** These are the related data for gilda: (in terms of SE) ************************************************************** Avail Space(Kb) Used Space(Kb) Type SEs -------------------------------------------------------------------------------------- 143547680 2472756 disk cn02.be.itu.edu.tr 168727984 118549624 disk grid009.ct.infn.it 13908644 2819288 disk grid003.cecalc.ula.ve 108741124 2442872 disk gildase.oact.inaf.it 28211488 2948292 disk testbed005.cnaf.infn.it 349001680 33028 disk gilda-se-01.pd.infn.it 31724384 2819596 disk cna03.cna.unicamp.br 387834656 629136 disk grid-se.bio.dist.unige.it

14 Listing the close Storage Elements $ lcg-infosites --vo gilda closeSE Name of the CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long Name of the close SE: cn02.be.itu.edu.tr Name of the CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short Name of the close SE: cn02.be.itu.edu.tr Name of the CE: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long Name of the close SE: grid009.ct.infn.it Name of the CE: grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long Name of the close SE: testbed005.cnaf.infn.it

15 15 lcg-info QUERY & list of attributes: The "lcg-info" command is similar to the "lcg-infosites" except that it is used to list either CE's or SE’s satisfying a given set of conditions on their attributes and to print, for each of them, the values of a given set of attributes. HELPFUL for “Requirements” tag: This is very similar to the usage of the "Requirements" tag in a JDL file along with the command "glite-job-list-match". The "lcg-info" command can therefore be useful when constructing the "Requirements" tag in a JDL file.

16 lcg-info --list-ce [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list] lcg-info --list-se [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list] lcg-info --list-attrs lcg-info --help lcg-info usage

17 lcg-info options

18 $ lcg-info --list-attrs Attribute name Glue object class Glue attribute name MaxTime GlueCE GlueCEPolicyMaxWallClockTime CEStatus GlueCE GlueCEStateStatus TotalJobs GlueCE GlueCEStateTotalJobs CEVOs GlueCE GlueCEAccessControlBaseRule TotalCPUs GlueCE GlueCEInfoTotalCPUs FreeCPUs GlueCE GlueCEStateFreeCPUs CE GlueCE GlueCEUniqueID WaitingJobs GlueCE GlueCEStateWaitingJobs RunningJobs GlueCE GlueCEStateRunningJobs CloseCE GlueCESEBindGroup GlueCESEBindGroupCEUniqueID CloseSE GlueCESEBindGroup GlueCESEBindGroupSEUniqueID SEVOs GlueSA GlueSAAccessControlBaseRule UsedSpace GlueSA GlueSAStateUsedSpace AvailableSpace GlueSA GlueSAStateAvailableSpace Type GlueSE GlueSEType SE GlueSE GlueSEUniqueID Protocol GlueSEAccessProtocol GlueSEAccessProtocolType ArchType GlueSL GlueSLArchitectureType Processor GlueSubCluster GlueHostProcessorModel OS GlueSubCluster GlueHostOperatingSystemName Cluster GlueSubCluster GlueSubClusterUniqueID Tag GlueSubCluster GlueHostApplicationSoftwareRunTimeEnvironment Memory GlueSubCluster GlueHostMainMemoryRAMSize

19 List all the CE(s) in the BDII satisfying given conditions $ lcg-info –-vo gilda --list-ce --query 'TotalCPUs=10,OS=SL*' --attrs 'RunningJobs,FreeCPUs‘ - CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-long - RunningJobs 0 - FreeCPUs 10 - CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-short - RunningJobs 0 - FreeCPUs 10 - CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-infinite - RunningJobs 1 - FreeCPUs 10 - CE: gilda-ce-01.pd.infn.it:2119/jobmanager-lcgpbs-long - RunningJobs 0 - FreeCPUs 10

20 Print all the tags published by a specific query $ lcg-info –-vo gilda --list-ce --query 'CE=*grid-ce.bio.dist.unige.it*‘ --attrs ‘Tag’ LCG-2_1_1 LCG-2_2_0 LCG-2_3_0 LCG-2_3_1 LCG-2_4_0 R-GMA AFS CMS-1.1.0 ATLAS-6.0.4 GATE-1.0.0-3 LHCb-1.1.1 IDL-5.4 CMSIM-125 ALICE-4.01.00 ALIEN-1.32.14 POVRAY-3.5 DEMTOOLS-1.0 CSOUND-4.13 MPICH VIRGO-1.0 CMS-OSCAR-2.4.5 LHCb_dbase_common-v3r1 GEANT4-6 VLC-0.7.2 EGEODE-1.0 RASTER3D SCILAB-2.6 G95-3.5.0 MAGIC-6.19 CODESA3D-1.0 VO-gilda-slc3_ia32_gcc323 VO-gilda-CMKIN_5_1_1 VO-gilda-GEANT VO-gilda-GKS05

21 21 List the CEs with a particular SW $ lcg-info –-vo gilda --list-ce --query ‘Tag=*MPICH*’ --attrs ‘CE’ - CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long - CE cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long - CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short - CE cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short - CE: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long - CE grid010.ct.infn.it:2119/jobmanager-lcgpbs-long - CE: grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long - CE grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long - CE: ced-ce0.datagrid.cnr.it:2119/jobmanager-lcgpbs-long - CE ced-ce0.datagrid.cnr.it:2119/jobmanager-lcgpbs-long

22 22 lcg-info: useful query I lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs'

23 23 lcg-info: useful query I lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs' Filters out info which are not available to the gilda VO

24 24 lcg-info: useful query I lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs' Lists CE (for the following query)

25 25 lcg-info: useful query I lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs' We wish to find all sites that support the MPICH package

26 26 lcg-info: useful query I lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs' We wish to display how many CPU's are available

27 27 lcg-info: useful query II lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE Filters out info which are not available to the gilda VO

28 28 lcg-info: useful query II lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE Lists SE (for the following query)

29 29 lcg-info: useful query II lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE We wish to find all entries containing ‘aliserv6.ct.infn.it’

30 30 lcg-info: useful query II lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE We wish to display the CloseCE attribute

31 31 Relational Grid Monitoring Architecture (R-GMA)  Provides Information (which resources are available on the Grid) and Monitoring Services  Developed as part of the EuropeanDataGrid Project (EDG)  Now as part of the EGEE project.  Implementation of the Grid Monitoring Architecture (GMA) from the Global Grid Forum (GGF). Uses a relational data model.  Data are viewed as tables.  Data structure defined by the columns.  Each entry is a row (tuple).  Queried using Structured Query Language (SQL). Introduction to R-GMA

32 32 Data are organised in relational tables, and inserted and queried with SQL-style INSERT and SELECT statements (the allowed syntax is a subset of SQL, but reasonably complete for most purposes). There are some differences to bear in mind. The most basic is that a standard relational database can only have one row (tuple) with a given primary key value, but R-GMA usually has more than one. Introduction to R-GMA II

33 33 Latest query Each tuple has a timestamp, and for a given primary key value you can query the most recent tuple. History query A history of all tuples within some defined retention period. Continuous query Streaming Introduction to R-GMA: Three different query types.

34 34 The data model is relational. The table definition is globally unique and is stored in the Schema. The Registry stores the Producers table name as well as the URL. The data is inserted in the form of a tuple. The Consumer gets the tuple from Producer. Producers publish: SQL “INSERT” Consumers collect: SQL “SELECT” Registry ProducerConsumer Execute or Stream data Schema Store Location Look up Location Relational GMA

35 35 To Start the R-GMA command line tool run the following command: >rgma On startup you should receive the following message: R-GMA Command Line Tool (1)

36 36 Commands are entered by typing at the rgma> prompt and hitting ‘enter’ to execute the command. A history of the commands executed can be accessed using the Up and Down arrow keys. To search a command from history use CTRL-R and type the first few letters of the command to recall. Command autocompletion is supported (use Tab when you have partly entered a command). Entering Command

37 37 General Commands help Display general help information. help Display help for a specific command. exit or quit Exit from R-GMA command line interface. Show tables Display the name of all tables existing in the Schema Describe Show all information about the structure of a table

38 38 Querying Data (1) Querying data uses the standard SQL SELECT statement, e.g.: rgma> SELECT * FROM GlueService The behaviour of SELECT varies according to the type of query being executed. In R-GMA there are three basic types of query: LATEST Queries only the most recent tuple for each primary key HISTORY Queries all historical tuples for each primary key CONTINUOUS Queries returns tuples continuously as they are inserted.

39 39 The type of query can be changed using the SET QUERY command as follow: rgma> SET QUERY LATEST or rgma> SET QUERY CONTINUOUS The current query type can be displayed using rgma> SHOW QUERY Querying Data (2)

40 40 Exercises 1.Display all the table of the Schema rgma>show tables 2.Display information about GlueSite table rgma>describe GlueSite 3.Basic select query on the table named GlueSite rgma>set query latest rgma>show query rgma>select Name,Latitude,Longitude from GlueSite

41 41 Maximum AGE of tuples The maximum age of tuples to return can also be controlled. To limit the age of latest or historical tuples use the SET MAXAGE command. The following are equivalent: rgma> SET MAXAGE 2 minutes rgma> SET MAXAGE 120 The current maximum tuple age can be displayed using rgma> SHOW MAXAGE To disable the maximum age, set it to none: rgma> SET MAXAGE none

42 42 Query Timeout The final property affecting queries is timeout. –For a latest or history query the timeout exists to prevent a problem (e.g. network failure) from stopping the query from completing. –For a continuous query, timeout indicates how long the query will continue to return new tuples. Default timeout is 1 minute and it can be changed using rgma>SET TIMEOUT 3 minutes or SET TIMEOUT 180 The current timeout can be displayed using rgma>SHOW TIMEOUT

43 43 Producer & Inserting Data The SQL INSERT statement may be used to add data to the system: rgma> INSERT INTO userTable VALUES (‘a’, ‘b’, ‘c’, ‘d’) In R-GMA, data is inserted into the system using a Producer component which handles the INSERT statement. Using the command line tool you may work with one producer at a time. The current producer type can be displayed using: rgma>show producer The producer type can be set using: rgma>set producer latest

44 44 Exercises 1.Insert and Select using Primary Producer to support Continuos + History query rgma>describe userTable rgma>set producer continuous rgma>insert into userTable values('cod','string',1.4,66) rgma>set query continuous rgma>set maxage 1 minutes rgma>set timeout 5 seconds rgma>select * from userTable

45 45 Secondary Producer To instruct the secondary producer to consume from table MyTable: rgma> SECONDARYPRODUCER userTable Like the producer, the secondary producer may be configured to answer latest and/or history queries: rgma> SET SECONDARYPRODUCER latest (By default the secondary producer can answer both latest and history queries. ) The current secondary producer type can be displayed using: rgma> SHOW SECONDARYPRODUCER

46 46 Exercises 2.Insert and Select using the Secondary Producer to support the latest query. rgma>set secondaryproducer latest rgma>secondaryproducer userTable rgma>show producers of userTable rgma>set producer continuous rgma>insert into userTable values ('cod','string',5.2,44) rgma>set query latest rgma>select * from userTable

47 47 Questions…

48 48 References GLITE 3.0 user guide manuals series https://edms.cern.ch/file/722398//gLite-3-UserGuide.pdf BDII server installation and configuration http://agenda.euchinagrid.org/askArchive.php?base=agenda&categ =a0615&id=a0615s0t5/transparencies The gLite Information System http://agenda.euchinagrid.org/askArchive.php?base=agenda&categ =a0615&id=a0615s4t10/transparencies RGMA Server Installation http://agenda.euchinagrid.org/askArchive.php?base=agenda&categ =a0615&id=a0615s0t7/transparencies


Download ppt "The Information service Alessandro Costa INAF Catania Corso di Calcolo Parallelo Grid Computing Catania - ITALY 25-29 September 2006."

Similar presentations


Ads by Google