GIIS Implementation and Requirements F. Semeria INFN European Datagrid Conference Amsterdam, 7 March 2001.


1 GIIS Implementation and Requirements F. Semeria INFN European Datagrid Conference Amsterdam, 7 March 2001

2 Introduction › In a distributed environment like a Grid, one of the primary needs is to collect and retrieve resource information. › The Grid Information Service (GIS) is the way of making information available to Grid applications.

3 GIS=GRIS+GIIS › Globus implements the GIS by using two kinds of LDAP servers:  GRIS (Grid Resource Information Service) runs on each resource (machine). It registers itself with a GIIS, providing information about itself.  GIIS (Grid Index Information Service) usually runs on a few machines per organization and is a search engine for a set of GRISes.

4 Why LDAP? › LDAP is a protocol for accessing directories. › A directory is like a database, but easier to implement.

5 Why LDAP? › PROs:  it is a standard way to describe and collect data.  it provides a distributed topological model for the data. › CONs:  directories are designed more for reading than for writing. Good for a DNS, but not for storing dynamic data like the CPU load of a machine.

6 General implementation › The proposed implementation of the GIS is a hierarchical structure of GIISes with a root server at CERN. › Each organization has its top level GIIS registered on the root server, but can choose its own low level topology.

7 EU GIIS (o=grid) › INFN (Italy): dc=infn,dc=it,o=grid › IN2P3 (France): dc=in2p3,dc=fr,o=grid › LIP (Portugal): dc=lip,dc=pt,o=grid › ...

8 INFN implementation › INFN has implemented a hierarchical structure of GIISes based on INFN departments (about 25). › Each GRIS registers itself with the site GIIS, which in turn registers itself with the top level INFN GIIS.

9 Top level GIIS (dc=infn,dc=it,o=grid) › GIIS Milano, GRIS (dc=mi,dc=infn,dc=it,o=grid) › GIIS Bologna, GRIS (dc=bo,dc=infn,dc=it,o=grid)
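The naming scheme on the slides above can be read as a tree in which each site suffix extends the suffix of its parent GIIS. The following is a minimal sketch of that idea, assuming simple DNs without escaped commas; the helper names (`parse_dn`, `is_under`, `parent_dn`) are illustrative and are not part of Globus or any LDAP library.

```python
def parse_dn(dn):
    """Split a simple DN such as 'dc=mi,dc=infn,dc=it,o=grid' into (attr, value) pairs.

    Assumption: no escaped commas or multi-valued RDNs, which real DNs may contain.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr.lower(), value.lower()))
    return pairs


def is_under(dn, base):
    """Return True if `dn` equals `base` or lies below it in the directory tree."""
    dn_pairs, base_pairs = parse_dn(dn), parse_dn(base)
    if len(base_pairs) > len(dn_pairs):
        return False
    # A DN is below a base when the base is its rightmost suffix.
    return dn_pairs[len(dn_pairs) - len(base_pairs):] == base_pairs


def parent_dn(dn):
    """Drop the leftmost RDN to get the parent entry, or None at the root."""
    _, sep, tail = dn.partition(",")
    return tail if sep else None
```

With these helpers, `is_under("dc=mi,dc=infn,dc=it,o=grid", "dc=infn,dc=it,o=grid")` holds for the Milano site, while the IN2P3 suffix falls outside the INFN subtree.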

10 INFN top level GIIS › 11 GIISes registered › More than 40 GRISes

11

12 GIS Requirements › Each experiment needs to be able to select its own set of machines (with its own namespace?). › We need more attributes to describe the status of jobs and machines. › Superior knowledge: referral to upper GIIS. › Data replication for backup and mirroring.

13 Experiments resources › Each GRIS can register itself with several GIISes. › This allows resources to be partitioned by experiment.

14 EU CMS GIIS (exp=cms,o=grid) › GIIS Milano, GRIS › GIIS Bologna, GRIS › Top level INFN GIIS (dc=infn,dc=it,o=grid)
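Registering one GRIS with several GIISes amounts to a many-to-many mapping between resources and index servers. The sketch below models that mapping in memory only; the class `GiisRegistry` and the GRIS names `gris-mi-01` and `gris-bo-01` are hypothetical, and the real registration protocol is not shown.

```python
class GiisRegistry:
    """Toy in-memory stand-in for GRIS-to-GIIS registrations (not Globus code)."""

    def __init__(self):
        self._members = {}  # GIIS base DN -> set of registered GRIS names

    def register(self, gris, giis_base):
        """Record that `gris` has registered with the GIIS serving `giis_base`."""
        self._members.setdefault(giis_base, set()).add(gris)

    def members(self, giis_base):
        """Return the GRISes visible through one GIIS, sorted by name."""
        return sorted(self._members.get(giis_base, set()))

    def giises_of(self, gris):
        """Return every GIIS base DN that a given GRIS is registered with."""
        return sorted(base for base, m in self._members.items() if gris in m)


registry = GiisRegistry()
# The same (hypothetical) Milano resource appears in the site view and the CMS view.
registry.register("gris-mi-01", "dc=mi,dc=infn,dc=it,o=grid")
registry.register("gris-mi-01", "exp=cms,o=grid")
registry.register("gris-bo-01", "dc=bo,dc=infn,dc=it,o=grid")
```

An experiment-level search then only sees the machines registered under `exp=cms,o=grid`, independently of the organizational tree.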

15 Jobs and machines info › The underlying resource management systems, such as Condor, LSF, and PBS, provide useful information about machines and jobs that should be published in the GIS.

16 Examples of jobs info › job id › current status of the job › the size of the executable › the name of the user › the submitting and the executing host › why the job is not running › etc.

17 Example of machines info › the total and available physical memory and swap space › the speed of the machine in MIPS › the number of CPUs › the CPU load average › etc.

18 Extending the GRIS › The GRIS uses programs called information providers to collect information from the machine. › The requirements for an information provider are:  the program must emit LDIF objects to stdout  the generated objects must conform to the Globus schema
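As an illustration of the first requirement, the sketch below prints one LDIF entry describing the local machine to stdout. The object class and attribute names (`ExampleComputeResource`, `cpucount`, `loadaverage1min`) and the `hn=` RDN are placeholders chosen for this example; they are not taken from the Globus schema, which a real provider would have to follow. Values that need base64 encoding in LDIF are not handled.

```python
import os
import socket
import sys


def to_ldif(dn, attributes):
    """Render one entry as LDIF text: a 'dn:' line, attribute lines, then a blank line."""
    lines = [f"dn: {dn}"]
    for name, values in attributes.items():
        if not isinstance(values, (list, tuple)):
            values = [values]
        for value in values:
            lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n\n"


def machine_entry(hostname, suffix, cpu_count, load1):
    """Build an LDIF entry for one machine; all attribute names are hypothetical."""
    return to_ldif(
        f"hn={hostname},{suffix}",
        {
            "objectclass": "ExampleComputeResource",
            "cpucount": cpu_count,
            "loadaverage1min": f"{load1:.2f}",
        },
    )


if __name__ == "__main__":
    # os.getloadavg() is not available on every platform, so fall back to 0.0.
    load = os.getloadavg()[0] if hasattr(os, "getloadavg") else 0.0
    sys.stdout.write(
        machine_entry(socket.gethostname(), "dc=bo,dc=infn,dc=it,o=grid",
                      os.cpu_count() or 1, load)
    )
```

Keeping the formatting in `to_ldif` separate from the data collection makes it easy to add further entries, for example one per job reported by the local batch system.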

19 Caching › Information is not pushed periodically from a GRIS to a GIIS; instead, the GIIS queries the GRISes when an application needs information. › Information is stored in the cache for a period of time (TTL = Time To Live). › The higher the level of the GIIS, the higher the TTL and the lower the level of detail.
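The pull-on-demand behaviour with a TTL can be sketched as a small cache that calls a fetch function only when an entry is missing or has expired. This is an illustrative model, not the Globus implementation; the class name `TTLCache` and the injectable `clock` parameter are assumptions made so the example can be exercised without waiting.

```python
import time


class TTLCache:
    """Cache that fetches a value on demand and reuses it until its TTL expires."""

    def __init__(self, ttl_seconds, fetch, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.fetch = fetch      # called as fetch(key) on a miss, e.g. a query to a GRIS
        self.clock = clock      # injectable time source, useful for testing
        self._entries = {}      # key -> (expiry time, cached value)

    def get(self, key):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]       # still fresh: answer from the cache
        value = self.fetch(key)  # missing or expired: pull from the source
        self._entries[key] = (now + self.ttl, value)
        return value
```

A higher-level GIIS could be modelled as another `TTLCache` with a longer TTL whose `fetch` reads from the lower-level cache, which matches the rule that TTL grows with the level of the GIIS.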

20

21 Performance › In the worst case the whole set of machines must be queried. › Indexing techniques should be used to prune the search space. › A periodic information update mechanism could also be investigated.
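One possible form of search space pruning is an inverted index from attribute values to the GRISes that published them, so that a query contacts only the matching machines. The sketch below supports equality matches on slowly changing attributes only; the class `AttributeIndex`, the attribute names, and the GRIS names are hypothetical.

```python
from collections import defaultdict


class AttributeIndex:
    """Inverted index: (attribute, value) -> names of GRISes that published it."""

    def __init__(self):
        self._index = defaultdict(set)
        self._all = set()

    def publish(self, gris, attrs):
        """Record the static attributes of one GRIS."""
        self._all.add(gris)
        for item in attrs.items():
            self._index[item].add(gris)

    def candidates(self, **required):
        """Return only the GRISes matching every required attribute=value pair."""
        result = set(self._all)
        for item in required.items():
            result &= self._index.get(item, set())
        return sorted(result)


index = AttributeIndex()
index.publish("gris-a", {"os": "linux", "cpus": 2})
index.publish("gris-b", {"os": "linux", "cpus": 4})
index.publish("gris-c", {"os": "solaris", "cpus": 2})
```

Here `index.candidates(os="linux", cpus=2)` narrows three machines down to one before any dynamic data, such as CPU load, has to be fetched.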

22 Some tests › We tested how performance depends on caching and CPU load. › Tests were run over a WAN. › The same queries on a GIIS take 10 sec. when off

23 Some tests (cont.) › When a GRIS has a loaded CPU, the response time from its own GIIS is much higher when the cache has expired (> 1 min. vs. 1 sec.). › Also, when a GIIS has a loaded CPU and the cache has not expired, the response time is higher (6-7 sec.): better not to use a GIIS for computation!

24 Security and access policies › In the current implementation, any machine can register itself with a GIIS. › There is no access control when searching the GIIS. From any LDAP client, one can run: ldapsearch -p 389 -h mds.infn.it -b "o=grid" -s sub "*=*" and get all the information from the GIIS.
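To make the parts of that query explicit, the sketch below assembles the same command line from named parameters without running it. The function name `ldapsearch_argv` is hypothetical; the flags, host, base, scope, and filter are the ones shown on the slide. Newer OpenLDAP clients generally prefer a `-H` URI over `-h` and `-p`, so treat this as a description of the slide's command rather than a recommendation.

```python
import shlex


def ldapsearch_argv(host, port=389, base="o=grid", scope="sub", search_filter="*=*"):
    """Build the argument list for the anonymous search shown on the slide."""
    return [
        "ldapsearch",
        "-p", str(port),      # LDAP port of the GIIS
        "-h", host,           # GIIS host name
        "-b", base,           # search base: the root of the Grid tree
        "-s", scope,          # 'sub' searches the whole subtree under the base
        search_filter,        # filter exactly as written on the slide
    ]


argv = ldapsearch_argv("mds.infn.it")
# shlex.join quotes arguments so the line can be pasted into a shell safely.
print(shlex.join(argv))
```

Narrowing `base` to a site suffix such as `dc=bo,dc=infn,dc=it,o=grid` restricts the search to one branch, but with no access control any client can still choose the widest base.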

25 Documentation › The documentation is currently at www.infn.it/grid, which also links to:  INFN Globus documentation  INFN Globus toolkits distribution  INFN testbed (www.infn.it/testbed-grid) › For testbed GIS support, use the mailing list: is-datagrid@infn.it › More general documentation for Datagrid will be available soon.

26 Conclusions › The Globus Information Service is based on a standard protocol (LDAP). › It provides flexibility and a potentially good distributed data model. › But...

27 Conclusions (cont.) › A good topology for the HEP experiments must still be implemented. › The GRIS must be extended with new information providers. › Data replication is lacking. › New mechanisms should be introduced to improve performance and security.

