The Information Service
Alessandro Costa, INAF Catania
Corso di Calcolo Parallelo (Parallel Computing Course): Grid Computing
Catania, Italy, 25-29 September 2006



Introduction
The Information Service (IS) provides information about Grid resources and their status. This information is essential to the operation of the whole Grid, because resources are discovered through the IS. The published information is also used for monitoring and accounting.
Two information systems are used in gLite 3.0:
- the Globus Monitoring and Discovery Service (MDS), used for resource discovery and for publishing resource status;
- the Relational Grid Monitoring Architecture (R-GMA), used for accounting, monitoring and publication of user-level information.

MDS
The Globus MDS implements the GLUE Schema using OpenLDAP, an open-source implementation of the Lightweight Directory Access Protocol (LDAP). An LDAP directory is a specialised database optimised for reading, browsing and searching information.
No authentication is required to read information over LDAP, for example with the ldapsearch command or with a graphical client (LDAP Browser/Editor).
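
An anonymous LDAP read of the kind described above can be sketched as a small helper that assembles an ldapsearch command line. This is only an illustration: the port 2170 and the search base mds-vo-name=local,o=grid are assumptions about a typical BDII setup and are not stated in these slides, and the helper name build_ldapsearch is hypothetical.

```python
import shlex


def build_ldapsearch(host, port=2170, base="mds-vo-name=local,o=grid",
                     ldap_filter="(objectClass=GlueCE)", attrs=()):
    """Return the argv list for an anonymous ldapsearch query.

    The default port and search base are assumptions, not values
    taken from the slides; adjust them for the server being queried.
    """
    argv = [
        "ldapsearch",
        "-x",                              # simple bind, no credentials
        "-LLL",                            # LDIF output without comments
        "-H", f"ldap://{host}:{port}",     # server to contact
        "-b", base,                        # where the search starts
        ldap_filter,                       # which entries to return
    ]
    argv.extend(attrs)                     # optional list of attributes
    return argv


if __name__ == "__main__":
    cmd = build_ldapsearch("grid004.ct.infn.it",
                           attrs=["GlueCEUniqueID", "GlueCEStateFreeCPUs"])
    print(shlex.join(cmd))
```

The helper only builds the command; running it requires an OpenLDAP client and a reachable server.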

MDS System Overview
The Information System (IS) provides information about grid resources and their current status.
- Computing and storage resources at a site report their static and dynamic status through GRISes (Grid Resource Information Servers): one per grid element.
- At each site, a GIIS (Grid Index Information Server) collects information from all the site's GRISes: one per grid site. (Since LCG 2.3.0 the site GIIS has been replaced by a "local" BDII.)
- The BDII (Berkeley Database Information Index) queries the GIISes of the different sites and acts as a cache, storing information about the Grid status in its database.

IS Components: GRISes, GIISes and BDII
- Each site can run a BDII, which collects the information coming from the GIISes.
- At each site, a site GIIS collects the information provided by the GRISes.
- Local GRISes run on the CEs and SEs at each site and report dynamic and static information.
Abbreviations:
BDII: Berkeley Database Information Index
GIIS: Grid Index Information Server
GRIS: Grid Resource Information Server
Since LCG 2.3.0 the site GIIS has been replaced by a "local" BDII.
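
The three-level collection chain can be modelled in a few lines. The sketch below is a toy model only, not the MDS implementation: the class layout, the site name and the attribute values are all hypothetical, and it exists to show that the BDII answers from a cached copy that changes only when it is refreshed.

```python
from dataclasses import dataclass, field


@dataclass
class GRIS:
    """Reports the information of one grid element (a CE or an SE)."""
    element: str
    info: dict

    def report(self):
        # Hand out a copy so that collectors hold a snapshot.
        return {self.element: dict(self.info)}


@dataclass
class GIIS:
    """Collects the reports of all GRISes at one site."""
    site: str
    grises: list = field(default_factory=list)

    def collect(self):
        merged = {}
        for gris in self.grises:
            merged.update(gris.report())
        return merged


@dataclass
class BDII:
    """Queries the site GIISes and caches what they return."""
    giises: list = field(default_factory=list)
    cache: dict = field(default_factory=dict)

    def refresh(self):
        self.cache = {giis.site: giis.collect() for giis in self.giises}

    def lookup(self, site, element):
        return self.cache[site][element]


if __name__ == "__main__":
    ce = GRIS("gildace.oact.inaf.it", {"FreeCPUs": 4})   # made-up value
    bdii = BDII([GIIS("example-site", [ce])])            # made-up site name
    bdii.refresh()
    print(bdii.lookup("example-site", "gildace.oact.inaf.it"))
```

Because lookups read the cache, a change on the resource becomes visible to users only after the next refresh.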

GLUE Schema
The GLUE Schema (here, its LDAP implementation) describes the Grid resource information stored by the Information System. It is organised as a hierarchy of object classes, and each class contains a set of attributes.

GLUE Schema: the GlueCEInfo object class
General information about the queue associated with the CE (object class GlueCEInfo):
- GlueCEInfoLRMSType: name of the local batch system
- GlueCEInfoLRMSVersion: version of the local batch system
- GlueCEInfoGRAMVersion: version of GRAM
- GlueCEInfoHostName: fully qualified name of the host where the gatekeeper runs
- GlueCEInfoGateKeeperPort: port number for the gatekeeper
- GlueCEInfoTotalCPUs: number of CPUs in the cluster associated with the CE
- ...
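
To make the attribute list concrete, here is a minimal sketch that reads one entry in LDIF-style "name: value" form into a dictionary. The sample values are invented for illustration, the attribute names are spelled as in the list above, and the parser is deliberately simplified (it ignores folded lines and base64-encoded values).

```python
SAMPLE_ENTRY = """\
GlueCEInfoLRMSType: pbs
GlueCEInfoHostName: gildace.oact.inaf.it
GlueCEInfoGateKeeperPort: 2119
GlueCEInfoTotalCPUs: 10
"""


def parse_entry(text):
    """Parse 'name: value' lines into {name: [values]}.

    Values are kept in lists because LDAP attributes may be multi-valued.
    Blank lines and comment lines starting with '#' are skipped.
    """
    entry = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        name, _, value = line.partition(": ")
        entry.setdefault(name, []).append(value)
    return entry


if __name__ == "__main__":
    entry = parse_entry(SAMPLE_ENTRY)
    for name in sorted(entry):
        print(name, "=", ", ".join(entry[name]))
```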

How to query the IS?
Two high-level tools are provided for querying the IS elements directly:
- lcg-infosites
- lcg-info
These tools should be enough for the most common user needs and usually avoid the need for raw LDAP queries.

lcg-infosites
The lcg-infosites command is an easy way to retrieve information on Grid resources for the most common use cases.
USAGE: lcg-infosites --vo <vo name> <option> [-v <verbosity level>] [--is <BDII to query>]

lcg-infosites options

lcg-infosites
The lcg-infosites command is a Perl script that wraps a series of LDAP queries; it was developed to let users retrieve information on Grid resources in the most common cases.
- lcg-infosites does not use your VOMS proxy certificate, so every command must name the VO explicitly, e.g. with the option --vo gilda.
- By default, the BDII defined in the LCG_GFAL_INFOSYS environment variable is queried, e.g.
  $ echo $LCG_GFAL_INFOSYS
  grid004.ct.infn.it:
- The --is option selects the BDII the user wishes to query, e.g.
  $ lcg-infosites --vo atlas ce --is prod-bdii.cern.ch
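
The choice of which BDII gets queried can be sketched as a small function. It assumes that an explicit --is value takes precedence over LCG_GFAL_INFOSYS, which is the natural reading of the slide but is not spelled out there; the function name resolve_bdii and the port in the example are hypothetical.

```python
import os


def resolve_bdii(is_option=None, environ=None):
    """Pick the BDII endpoint to query.

    Assumption: a value passed with --is wins; otherwise the
    LCG_GFAL_INFOSYS environment variable is used.
    """
    environ = os.environ if environ is None else environ
    if is_option:
        return is_option
    endpoint = environ.get("LCG_GFAL_INFOSYS")
    if not endpoint:
        raise RuntimeError("no BDII given: set LCG_GFAL_INFOSYS or pass --is")
    return endpoint


if __name__ == "__main__":
    # Hypothetical environment; the port is an assumption.
    env = {"LCG_GFAL_INFOSYS": "grid004.ct.infn.it:2170"}
    print(resolve_bdii(environ=env))
    print(resolve_bdii("prod-bdii.cern.ch", environ=env))
```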

Obtaining information about CE
$ lcg-infosites --vo gilda ce
****************************************************************
These are the related data for gilda: (in terms of queues and CPUs)
****************************************************************
#CPU  Free  Total Jobs  Running  Waiting  ComputingElement
                                          cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
                                          cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short
                                          grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
                                          grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long
                                          grid006.cecalc.ula.ve:2119/jobmanager-lcgpbs-log
                                          gildace.oact.inaf.it:2119/jobmanager-lcgpbs-short

$ lcg-infosites --vo gilda ce -v 2
RAMMemory  Operating System  System Version  Processor  CE Name
           SLC               3               P4         ced-ce0.datagrid.cnr.it
4096       SLC               3               Xeon       cn01.be.itu.edu.tr
1024       SLC               3               PIII       cna02.cna.unicamp.br
917        SLC               3               PIII       gilda-ce-01.pd.infn.it
1024       SLC               3               Athlon     gildace.oact.inaf.it
1024       SLC               3               Xeon       grid-ce.bio.dist.unige.it

Obtaining information about SE
$ lcg-infosites --vo gilda se
**************************************************************
These are the related data for gilda: (in terms of SE)
**************************************************************
Avail Space(Kb)  Used Space(Kb)  Type  SEs
                                 disk  cn02.be.itu.edu.tr
                                 disk  grid009.ct.infn.it
                                 disk  grid003.cecalc.ula.ve
                                 disk  gildase.oact.inaf.it
                                 disk  testbed005.cnaf.infn.it
                                 disk  gilda-se-01.pd.infn.it
                                 disk  cna03.cna.unicamp.br
                                 disk  grid-se.bio.dist.unige.it

Listing the close Storage Elements
$ lcg-infosites --vo gilda closeSE
Name of the CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
Name of the close SE: cn02.be.itu.edu.tr
Name of the CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short
Name of the close SE: cn02.be.itu.edu.tr
Name of the CE: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
Name of the close SE: grid009.ct.infn.it
Name of the CE: grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long
Name of the close SE: testbed005.cnaf.infn.it
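
When the closeSE listing has to feed a script, it can be turned into a mapping from each CE to its close SEs. The sketch below assumes that every "Name of the CE:" and "Name of the close SE:" field sits on its own line; the exact layout of the real output is not recoverable from the slide, and the function name parse_close_se is hypothetical.

```python
SAMPLE_OUTPUT = """\
Name of the CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
Name of the close SE: cn02.be.itu.edu.tr
Name of the CE: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
Name of the close SE: grid009.ct.infn.it
"""


def parse_close_se(output):
    """Map each CE to the list of close SEs that follow it."""
    mapping = {}
    current_ce = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Name of the CE:"):
            # Split on the first colon only: the CE id contains ':2119'.
            current_ce = line.split(":", 1)[1].strip()
            mapping[current_ce] = []
        elif line.startswith("Name of the close SE:") and current_ce:
            mapping[current_ce].append(line.split(":", 1)[1].strip())
    return mapping


if __name__ == "__main__":
    for ce, ses in parse_close_se(SAMPLE_OUTPUT).items():
        print(ce, "->", ", ".join(ses))
```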

lcg-info
Query and list of attributes: the lcg-info command is similar to lcg-infosites, except that it lists the CEs or SEs that satisfy a given set of conditions on their attributes and prints, for each of them, the values of a given set of attributes.
Helpful for the Requirements tag: this is very similar to the way the Requirements tag in a JDL file is used together with the glite-job-list-match command, so lcg-info can be useful when constructing the Requirements tag in a JDL file.

lcg-info usage
lcg-info --list-ce [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]
lcg-info --list-se [--bdii bdii] [--vo vo] [--sed] [--query query] [--attrs list]
lcg-info --list-attrs
lcg-info --help

lcg-info options

$ lcg-info --list-attrs
Attribute name   Glue object class      Glue attribute name
MaxTime          GlueCE                 GlueCEPolicyMaxWallClockTime
CEStatus         GlueCE                 GlueCEStateStatus
TotalJobs        GlueCE                 GlueCEStateTotalJobs
CEVOs            GlueCE                 GlueCEAccessControlBaseRule
TotalCPUs        GlueCE                 GlueCEInfoTotalCPUs
FreeCPUs         GlueCE                 GlueCEStateFreeCPUs
CE               GlueCE                 GlueCEUniqueID
WaitingJobs      GlueCE                 GlueCEStateWaitingJobs
RunningJobs      GlueCE                 GlueCEStateRunningJobs
CloseCE          GlueCESEBindGroup      GlueCESEBindGroupCEUniqueID
CloseSE          GlueCESEBindGroup      GlueCESEBindGroupSEUniqueID
SEVOs            GlueSA                 GlueSAAccessControlBaseRule
UsedSpace        GlueSA                 GlueSAStateUsedSpace
AvailableSpace   GlueSA                 GlueSAStateAvailableSpace
Type             GlueSE                 GlueSEType
SE               GlueSE                 GlueSEUniqueID
Protocol         GlueSEAccessProtocol   GlueSEAccessProtocolType
ArchType         GlueSL                 GlueSLArchitectureType
Processor        GlueSubCluster         GlueHostProcessorModel
OS               GlueSubCluster         GlueHostOperatingSystemName
Cluster          GlueSubCluster         GlueSubClusterUniqueID
Tag              GlueSubCluster         GlueHostApplicationSoftwareRunTimeEnvironment
Memory           GlueSubCluster         GlueHostMainMemoryRAMSize
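
The table above is a lookup from short attribute names to Glue names. The sketch below shows how a query string such as 'TotalCPUs=10,OS=SL*' could be expanded with that lookup. It is illustrative only and does not reproduce how lcg-info itself processes a query: only a few rows of the table are included, only the "=" operator is handled, and the function name expand_query is hypothetical.

```python
# A subset of the table above: short name -> (object class, Glue attribute).
ATTRIBUTES = {
    "TotalCPUs": ("GlueCE", "GlueCEInfoTotalCPUs"),
    "FreeCPUs": ("GlueCE", "GlueCEStateFreeCPUs"),
    "CE": ("GlueCE", "GlueCEUniqueID"),
    "SE": ("GlueSE", "GlueSEUniqueID"),
    "OS": ("GlueSubCluster", "GlueHostOperatingSystemName"),
    "Tag": ("GlueSubCluster",
            "GlueHostApplicationSoftwareRunTimeEnvironment"),
}


def expand_query(query):
    """Expand 'Name=value,Name=value' into (class, attribute, value) terms.

    Only '=' conditions are supported in this sketch.
    """
    terms = []
    for condition in query.split(","):
        name, sep, value = condition.partition("=")
        if not sep:
            raise ValueError(f"not a 'name=value' condition: {condition!r}")
        object_class, glue_name = ATTRIBUTES[name.strip()]
        terms.append((object_class, glue_name, value.strip()))
    return terms


if __name__ == "__main__":
    for term in expand_query("TotalCPUs=10,OS=SL*"):
        print(term)
```

Note that the two conditions in the example land in different object classes (GlueCE and GlueSubCluster), which is why the expansion keeps the class alongside each attribute.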

List all the CE(s) in the BDII satisfying given conditions
$ lcg-info --vo gilda --list-ce --query 'TotalCPUs=10,OS=SL*' --attrs 'RunningJobs,FreeCPUs'
- CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-long
  - RunningJobs 0
  - FreeCPUs 10
- CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-short
  - RunningJobs 0
  - FreeCPUs 10
- CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-infinite
  - RunningJobs 1
  - FreeCPUs 10
- CE: gilda-ce-01.pd.infn.it:2119/jobmanager-lcgpbs-long
  - RunningJobs 0
  - FreeCPUs 10
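
Output of this shape is easy to post-process. The following sketch turns it into a list of records, assuming one "- CE: ..." header line per resource followed by one "- Attribute value" line per requested attribute; that line layout is an assumption about the real tool's formatting, and the function name parse_lcg_info is hypothetical.

```python
SAMPLE_OUTPUT = """\
- CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-long
  - RunningJobs 0
  - FreeCPUs 10
- CE: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-infinite
  - RunningJobs 1
  - FreeCPUs 10
"""


def parse_lcg_info(output):
    """Return one dict per listed CE or SE, with its attribute values."""
    records = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line.startswith("- "):
            continue
        body = line[2:]
        if body.startswith(("CE:", "SE:")):
            # Header line: starts a new record.
            kind, _, name = body.partition(":")
            records.append({kind: name.strip()})
        elif records:
            # Attribute line: name, whitespace, value.
            key, _, value = body.partition(" ")
            records[-1][key] = value.strip()
    return records


if __name__ == "__main__":
    busy = [r["CE"] for r in parse_lcg_info(SAMPLE_OUTPUT)
            if int(r["RunningJobs"]) > 0]
    print(busy)
```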

Print all the tags published by a specific CE
$ lcg-info --vo gilda --list-ce --query 'CE=*grid-ce.bio.dist.unige.it*' --attrs 'Tag'
LCG-2_1_1 LCG-2_2_0 LCG-2_3_0 LCG-2_3_1 LCG-2_4_0 R-GMA AFS CMS ATLAS GATE LHCb
IDL-5.4 CMSIM-125 ALICE ALIEN POVRAY-3.5 DEMTOOLS-1.0 CSOUND-4.13 MPICH VIRGO-1.0
CMS-OSCAR LHCb_dbase_common-v3r1 GEANT4-6 VLC EGEODE-1.0 RASTER3D SCILAB-2.6 G
MAGIC-6.19 CODESA3D-1.0 VO-gilda-slc3_ia32_gcc323 VO-gilda-CMKIN_5_1_1
VO-gilda-GEANT VO-gilda-GKS05

List the CEs with a particular software package
$ lcg-info --vo gilda --list-ce --query 'Tag=*MPICH*' --attrs 'CE'
- CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
  - CE cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
- CE: cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short
  - CE cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short
- CE: grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
  - CE grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
- CE: grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long
  - CE grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long
- CE: ced-ce0.datagrid.cnr.it:2119/jobmanager-lcgpbs-long
  - CE ced-ce0.datagrid.cnr.it:2119/jobmanager-lcgpbs-long

lcg-info: useful query I
lcg-info --vo gilda --list-ce --query 'Tag=MPICH' --attrs 'FreeCPUs'
- --vo gilda: filters out information that is not available to the gilda VO
- --list-ce: lists CEs (for the following query)
- --query 'Tag=MPICH': we wish to find all sites that support the MPICH package
- --attrs 'FreeCPUs': we wish to display how many CPUs are available

lcg-info: useful query II
lcg-info --vo gilda --list-se --query 'SE=aliserv6.ct.infn.it' --attrs CloseCE
- --vo gilda: filters out information that is not available to the gilda VO
- --list-se: lists SEs (for the following query)
- --query 'SE=aliserv6.ct.infn.it': we wish to find all entries containing 'aliserv6.ct.infn.it'
- --attrs CloseCE: we wish to display the CloseCE attribute
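
Both useful queries follow the same pattern, which a script can capture in a small command builder. The sketch below uses only the options shown in the lcg-info usage slide (--list-ce, --list-se, --bdii, --vo, --query, --attrs); the function name lcg_info_command is hypothetical, and nothing is executed.

```python
import shlex


def lcg_info_command(kind, vo=None, query=None, attrs=None, bdii=None):
    """Build the argv list for an lcg-info listing.

    kind is 'ce' or 'se'; attrs is a list of short attribute names.
    """
    if kind not in ("ce", "se"):
        raise ValueError("kind must be 'ce' or 'se'")
    argv = ["lcg-info", f"--list-{kind}"]
    if bdii:
        argv += ["--bdii", bdii]
    if vo:
        argv += ["--vo", vo]
    if query:
        argv += ["--query", query]
    if attrs:
        argv += ["--attrs", ",".join(attrs)]
    return argv


if __name__ == "__main__":
    # Useful query I: free CPUs on the CEs that publish the MPICH tag.
    print(shlex.join(lcg_info_command("ce", vo="gilda",
                                      query="Tag=MPICH",
                                      attrs=["FreeCPUs"])))
    # Useful query II: CEs close to a given SE.
    print(shlex.join(lcg_info_command("se", vo="gilda",
                                      query="SE=aliserv6.ct.infn.it",
                                      attrs=["CloseCE"])))
```

Passing the arguments as a list (for example to subprocess.run) avoids shell quoting problems with wildcard queries such as Tag=*MPICH*.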

Introduction to R-GMA
Relational Grid Monitoring Architecture (R-GMA)
- Provides information services (which resources are available on the Grid) and monitoring services.
- Developed as part of the European DataGrid (EDG) project, and now as part of the EGEE project.
- An implementation of the Grid Monitoring Architecture (GMA) from the Global Grid Forum (GGF).
- Uses a relational data model:
  - data are viewed as tables;
  - the data structure is defined by the columns;
  - each entry is a row (tuple);
  - data are queried using Structured Query Language (SQL).

Introduction to R-GMA II
Data are organised in relational tables, and are inserted and queried with SQL-style INSERT and SELECT statements (the allowed syntax is a subset of SQL, but reasonably complete for most purposes).
There are some differences from a standard relational database to bear in mind. The most basic is that a standard relational database can hold only one row (tuple) with a given primary key value, whereas R-GMA usually holds more than one.

Introduction to R-GMA: three different query types
- Latest query: each tuple has a timestamp, and for a given primary key value you can query the most recent tuple.
- History query: returns a history of all tuples within some defined retention period.
- Continuous query: tuples are streamed to the consumer.
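
The difference between the three query types can be shown on a handful of timestamped tuples. The sketch below is a toy model under stated assumptions, not R-GMA code: the Reading class, its fields and the function names are hypothetical, and timestamps are plain numbers in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    key: str          # primary key value
    value: int
    timestamp: float  # seconds


def latest_query(tuples):
    """Keep only the most recent tuple for each primary key."""
    newest = {}
    for t in tuples:
        if t.key not in newest or t.timestamp > newest[t.key].timestamp:
            newest[t.key] = t
    return sorted(newest.values(), key=lambda t: t.key)


def history_query(tuples, now, retention):
    """Return every tuple published within the retention period."""
    return [t for t in tuples if now - t.timestamp <= retention]


def continuous_query(stream):
    """Yield tuples one by one as they arrive on the stream."""
    for t in stream:
        yield t


if __name__ == "__main__":
    rows = [Reading("ce1", 4, 100.0),
            Reading("ce1", 2, 160.0),
            Reading("ce2", 8, 150.0)]
    print(latest_query(rows))
    print(history_query(rows, now=200.0, retention=60.0))
```

Two tuples share the key "ce1" here, which a standard relational table would not allow; the latest query is what collapses them to one.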

Relational GMA
- The data model is relational.
- The table definition is globally unique and is stored in the Schema.
- The Registry stores each Producer's table name and URL.
- Data is inserted in the form of tuples.
- The Consumer gets the tuples from the Producer.
- Producers publish: SQL "INSERT".
- Consumers collect: SQL "SELECT".
[Diagram: Producer, Consumer, Registry and Schema. The Producer stores its location in the Registry, the Consumer looks up the location in the Registry, and the Producer executes the query or streams data to the Consumer.]
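
The interaction between the four components can be sketched as follows. This is a toy, in-process model written only to illustrate the roles; the class and method names, the column names of userTable and the producer URL are all hypothetical and do not correspond to the R-GMA API.

```python
class Registry:
    """Remembers which producers publish which table."""

    def __init__(self):
        self._producers = {}

    def register(self, table, producer):
        self._producers.setdefault(table, []).append(producer)

    def lookup(self, table):
        return list(self._producers.get(table, []))


class Producer:
    """Publishes tuples for one table and registers its location."""

    def __init__(self, url, table, schema, registry):
        self.url = url
        self.table = table
        self.columns = schema[table]   # table definition from the Schema
        self.rows = []
        registry.register(table, self)

    def insert(self, row):
        if len(row) != len(self.columns):
            raise ValueError("row does not match the table definition")
        self.rows.append(tuple(row))


class Consumer:
    """Finds producers through the Registry and collects their tuples."""

    def __init__(self, registry):
        self.registry = registry

    def select(self, table):
        rows = []
        for producer in self.registry.lookup(table):
            rows.extend(producer.rows)
        return rows


if __name__ == "__main__":
    schema = {"userTable": ("userId", "aString", "aReal", "anInt")}
    registry = Registry()
    producer = Producer("http://producer.example.org", "userTable",
                        schema, registry)
    producer.insert(("cod", "string", 1.4, 66))
    print(Consumer(registry).select("userTable"))
```

The consumer never needs to know the producer's address in advance: it asks the Registry for the table and contacts whatever producers are returned.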

R-GMA Command Line Tool (1)
To start the R-GMA command line tool, run the following command:
> rgma
On startup you should receive the following message:

Entering Commands
- Commands are entered by typing at the rgma> prompt and pressing Enter to execute them.
- The history of executed commands can be browsed with the Up and Down arrow keys.
- To search the history, press CTRL-R and type the first few letters of the command to recall.
- Command autocompletion is supported (press Tab when you have partly entered a command).

General Commands
- help: display general help information.
- help <command>: display help for a specific command.
- exit or quit: exit from the R-GMA command line interface.
- show tables: display the names of all tables existing in the Schema.
- describe <table name>: show all information about the structure of a table.

Querying Data (1)
Querying data uses the standard SQL SELECT statement, e.g.:
rgma> SELECT * FROM GlueService
The behaviour of SELECT varies according to the type of query being executed. In R-GMA there are three basic types of query:
- LATEST: returns only the most recent tuple for each primary key.
- HISTORY: returns all historical tuples for each primary key.
- CONTINUOUS: returns tuples continuously as they are inserted.

Querying Data (2)
The type of query can be changed using the SET QUERY command as follows:
rgma> SET QUERY LATEST
or
rgma> SET QUERY CONTINUOUS
The current query type can be displayed using:
rgma> SHOW QUERY

Exercises
1. Display all the tables of the Schema
rgma> show tables
2. Display information about the GlueSite table
rgma> describe GlueSite
3. Basic select query on the table named GlueSite
rgma> set query latest
rgma> show query
rgma> select Name,Latitude,Longitude from GlueSite

Maximum age of tuples
The maximum age of the tuples to return can also be controlled. To limit the age of latest or historical tuples, use the SET MAXAGE command. The following are equivalent:
rgma> SET MAXAGE 2 minutes
rgma> SET MAXAGE 120
The current maximum tuple age can be displayed using:
rgma> SHOW MAXAGE
To disable the maximum age, set it to none:
rgma> SET MAXAGE none
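
The equivalence between "2 minutes" and "120" can be checked with a small duration parser. This is a hypothetical helper, not the parser used by the rgma tool: it assumes that a bare number means seconds and accepts only the units that appear in these slides (seconds and minutes), plus the keyword none.

```python
UNIT_SECONDS = {"second": 1, "seconds": 1, "minute": 60, "minutes": 60}


def parse_duration(text):
    """Convert '2 minutes', '120' or 'none' to seconds (or None).

    Assumption: a bare number is a number of seconds.
    """
    parts = text.split()
    if len(parts) == 1:
        if parts[0].lower() == "none":
            return None               # no limit
        return int(parts[0])
    if len(parts) != 2:
        raise ValueError(f"cannot parse duration: {text!r}")
    value, unit = parts
    return int(value) * UNIT_SECONDS[unit.lower()]


def within_max_age(age_seconds, max_age):
    """True if a tuple of the given age passes the MAXAGE limit."""
    return max_age is None or age_seconds <= max_age


if __name__ == "__main__":
    print(parse_duration("2 minutes"), parse_duration("120"))
    print(within_max_age(300, parse_duration("none")))
```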

Query Timeout
The final property affecting queries is the timeout.
- For a latest or history query, the timeout exists to prevent a problem (e.g. a network failure) from stopping the query from completing.
- For a continuous query, the timeout indicates how long the query will continue to return new tuples.
The default timeout is 1 minute. It can be changed using:
rgma> SET TIMEOUT 3 minutes
or
rgma> SET TIMEOUT 180
The current timeout can be displayed using:
rgma> SHOW TIMEOUT

Producer & Inserting Data
The SQL INSERT statement may be used to add data to the system:
rgma> INSERT INTO userTable VALUES ('a', 'b', 'c', 'd')
In R-GMA, data is inserted into the system using a Producer component, which handles the INSERT statement. With the command line tool you can work with one producer at a time.
The current producer type can be displayed using:
rgma> show producer
The producer type can be set using:
rgma> set producer latest

Exercises
1. Insert and select using a primary producer that supports continuous and history queries
rgma> describe userTable
rgma> set producer continuous
rgma> insert into userTable values('cod','string',1.4,66)
rgma> set query continuous
rgma> set maxage 1 minutes
rgma> set timeout 5 seconds
rgma> select * from userTable

Secondary Producer
To instruct the secondary producer to consume from the table userTable:
rgma> SECONDARYPRODUCER userTable
Like the producer, the secondary producer may be configured to answer latest and/or history queries:
rgma> SET SECONDARYPRODUCER latest
(By default the secondary producer can answer both latest and history queries.)
The current secondary producer type can be displayed using:
rgma> SHOW SECONDARYPRODUCER

Exercises
2. Insert and select using the secondary producer to support the latest query
rgma> set secondaryproducer latest
rgma> secondaryproducer userTable
rgma> show producers of userTable
rgma> set producer continuous
rgma> insert into userTable values ('cod','string',5.2,44)
rgma> set query latest
rgma> select * from userTable

Questions…
