IST E-infrastructure shared between Europe and Latin America
The gLite Information System(s)
Christian Grunfeld, UNLP
EELA Tutorial, La Plata, December 2006
Santiago, Chile, EELA Tutorial
Information System
What?
–A system to collect information on the state of resources.
Why?
–To discover the resources of the grid and their nature.
–To have useful data so that whoever is in charge of managing the workload can do it more efficiently.
–To check the health status of resources.
How?
–By monitoring the state of resources locally and publishing fresh data to the information system.
–By adopting a data model that MUST be well known to all components that want to access the monitored information.
–By using different approaches, which we investigate in the next slides.
Uses of the IS in Grid
If you are a middleware developer
–Workload Management System: matching job requirements and Grid resources
–Monitoring Services: retrieving information on the status and availability of Grid resources
If you are a user
–Retrieve information on Grid resources and their status
–Get information on the status of your jobs
If you are a site manager or service
–You “generate” the information, for example related to your site or to a given service
LCG Information System
LCG adopts a combination of solutions
–Globus MDS
  At the lowest level of the information system
  Used to discover and monitor resources and to publish information
  Grid Security Infrastructure (GSI) credentials
  Caching
–BDII
  At the highest level of the system
  Introduced because MDS had scalability problems
  Used by the Resource Broker for the matchmaking process
  Can be configured by each VO
  Queries the underlying systems periodically (every 2 minutes)
Hierarchical system
–Information is collected at the leaves of a hierarchical tree and travels towards the root
–Clients can query the hierarchical tree at every level
–The higher the level against which a query is made, the older the information obtained
Information Systems in gLite
The BDII (Berkeley DB Information Index)
–Has been adopted in the LCG middleware as the Information System provider.
–It is an evolution of the Globus Monitoring and Discovery Service (MDS).
–LCG-2 currently adopts BDII as its Information System.
–It is based on a Lightweight Directory Access Protocol (LDAP) server.
The Relational Grid Monitoring Architecture (R-GMA)
–Is a relational implementation of the Grid Monitoring Architecture (GMA) standardized by the Global Grid Forum (GGF).
–It is strongly Web Services oriented.
–It uses standard SQL query syntax.
Collecting Information
Gathering of information at different levels
–Lower level: Grid Resource Information Server (GRIS) - MDS
  Collects information on the state of a given resource
  One GRIS on top of each resource: CE, SE, RB, MyProxy
  A set of scripts and sensors that try to extract useful information about the resource
–Medium level: Grid Index Information Server (GIIS) – Local BDII
  Collects information on the resources of a given site
  One GIIS for each site
–Higher level: Top-level BDII
  Collects information on the resources of a given VO
  One BDII for each VO (suggested solution)
How information is collected
–Pull model (higher level servers periodically query lower level servers)
–LDAP query model
The hierarchy
How it works
–One GRIS for each resource
–One GIIS for each site, collecting information from the GRISes below it
–One BDII for a given VO, collecting information from the GIISes below it
–Two LDAP servers, one for write access and one for read access
–Every two minutes a cron job runs a script that collects information from a list of site GIISes
–The list of GIISes is kept in the configuration file of the BDII
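The pull model above can be sketched with a small in-memory toy. This is not the BDII implementation and involves no LDAP: the classes `Gris` and `Cache`, the `refresh` method, the `free_cpus` attribute and the integer `now` clock are all hypothetical names introduced for illustration. The sketch only shows why an answer obtained at a higher level can be older than what the resource itself would report.

```python
class Gris:
    """Leaf of the tree: reports the current state of one resource (toy model)."""

    def __init__(self, name):
        self.name = name
        self.free_cpus = 0  # hypothetical monitored value

    def query(self, now):
        # A leaf always answers with fresh data, stamped with the current time.
        return {self.name: {"free_cpus": self.free_cpus, "measured_at": now}}


class Cache:
    """Intermediate or top level: answers from a cache filled by pulling."""

    def __init__(self, children):
        self.children = children
        self.cache = {}

    def refresh(self, now):
        # Pull model: the higher level queries the servers below it.
        merged = {}
        for child in self.children:
            merged.update(child.query(now))
        self.cache = merged

    def query(self, now):
        # Clients and higher levels see whatever was collected last.
        return dict(self.cache)


Giis = Cache     # one per site, pulls from the GRISes below it
TopBdii = Cache  # one per VO, pulls from the site GIISes below it

ce = Gris("ce01")
site = Giis([ce])
bdii = TopBdii([site])

ce.free_cpus = 10
site.refresh(now=0)    # the site pulls from the GRIS
bdii.refresh(now=0)    # the top level pulls from the site

ce.free_cpus = 3       # the resource state changes
site.refresh(now=120)  # only the site has refreshed so far

fresh = ce.query(now=130)["ce01"]
at_site = site.query(now=130)["ce01"]
at_top = bdii.query(now=130)["ce01"]
```

In this run the same query gives three different ages: `fresh` is measured at the time of the query, `at_site` dates from the last site refresh, and `at_top` still holds the value collected before the change.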
R-GMA
The Relational Grid Monitoring Architecture (R-GMA)
–It is the relational implementation of the GMA defined by the GGF
–Adopts a database model with tables and relations between tables
–Implements a virtual database
–The user queries R-GMA as if querying a conventional database (with an SQL string)
–Implements different types of queries
The information
–Is produced and accessed locally at its site
–Is always fresh
–Can be collected by an entity (secondary producer) so that it can be accessed faster
GMA Architecture and Relational Model
The Producer stores its location (URL) in the Registry.
The Consumer looks up producer URLs in the Registry.
The Consumer contacts the Producer to get all the data.
Or the Consumer can listen to the Producer for new data.
[Diagram: the Producer and the Registry are linked by "Store Location"; the Consumer and the Registry by "Look up Location"; the Consumer and the Producer by "Execute or Stream data"]
[Example: a table people with columns name, ID, birth, Group and a row for Tom in group HR, queried with SELECT * FROM people WHERE group='HR']
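The three roles can be sketched as plain Python objects. This is a minimal illustration, not the R-GMA API: the class and method names, the `network` dictionary that stands in for real HTTP connections, the producer URL and the rows for Ann and Bob are all assumptions made for the example. Only the name Tom and the group HR come from the slide.

```python
class Registry:
    """Maps a table name to the URLs of the producers that publish it."""

    def __init__(self):
        self._urls = {}

    def store_location(self, table, url):
        self._urls.setdefault(table, []).append(url)

    def look_up_location(self, table):
        return list(self._urls.get(table, []))


class Producer:
    def __init__(self, url, table, registry):
        self.url = url
        self.rows = []
        self._listeners = []
        registry.store_location(table, url)  # "Store Location"

    def insert(self, row):
        self.rows.append(row)
        for listener in self._listeners:  # push new data to streaming consumers
            listener(row)

    def execute(self, predicate):
        # One-off query: return all stored rows that match.
        return [row for row in self.rows if predicate(row)]

    def stream(self, listener):
        # Continuous query: call listener for every row inserted from now on.
        self._listeners.append(listener)


registry = Registry()
network = {}  # stands in for the network: URL -> producer object
producer = Producer("http://producer.example/people", "people", registry)
network[producer.url] = producer

producer.insert({"name": "Tom", "group": "HR"})
producer.insert({"name": "Ann", "group": "IT"})  # hypothetical extra row

# Consumer: "Look up Location", then execute the query at each producer.
urls = registry.look_up_location("people")
hr = [row for url in urls
      for row in network[url].execute(lambda r: r["group"] == "HR")]

# Consumer: listen for new data instead of asking once.
seen = []
network[urls[0]].stream(seen.append)
producer.insert({"name": "Bob", "group": "HR"})  # hypothetical new row
```

The Registry never sees the data itself; it only tells the Consumer where to ask, which is the point of the GMA split.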
Multiple Producers
The Consumer will get all the URLs that could satisfy the query.
The Consumer will connect to all the Producers.
Producers that can satisfy the query will send the tuples to the Consumer.
The Consumer will merge these tuples to form one result set.
[Diagram: Producer 1 publishes TableName tuples (Value 1, Value 2); Producer 2 publishes TableName tuples (Value 3, Value 4); the Registry maps TableName to URL 1 and URL 2; the Consumer ends up with TableName tuples (Value 1, Value 2) and (Value 3, Value 4)]
Select * from CPULoad

CPULoad (Producer 1)
Country  Site  Facility
UK       RAL   CDF
UK       RAL   ATLAS

CPULoad (Producer 2)
Country  Site  Facility
UK       GLA   CDF
UK       GLA   ALICE

CPULoad (Producer 3)
Country  Site  Facility
CH       CERN  ATLAS
CH       CERN  CDF

CPULoad (Consumer)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF
UK       RAL   ATLAS
UK       GLA   CDF
UK       GLA   ALICE
CH       CERN  ALICE
CH       CERN  CDF
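The merge step can be tried out with Python's standard `sqlite3` module, using one in-memory database per producer. This is only a sketch of the idea of a virtual table: real R-GMA producers are web services, not SQLite databases, and the helper names `make_producer` and `query_all` as well as every Load and Timestamp value below are made up for the example (the slide does not preserve them).

```python
import sqlite3


def make_producer(rows):
    """Create one in-memory database that plays the role of a producer."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE CPULoad "
        "(Country TEXT, Site TEXT, Facility TEXT, Load REAL, Timestamp INTEGER)"
    )
    db.executemany("INSERT INTO CPULoad VALUES (?, ?, ?, ?, ?)", rows)
    return db


# Country, Site and Facility follow the slide; Load and Timestamp are invented.
producers = [
    make_producer([("UK", "RAL", "CDF", 0.3, 1000),
                   ("UK", "RAL", "ATLAS", 1.6, 1000)]),
    make_producer([("UK", "GLA", "CDF", 0.4, 1001),
                   ("UK", "GLA", "ALICE", 0.5, 1001)]),
    make_producer([("CH", "CERN", "ATLAS", 1.2, 1002),
                   ("CH", "CERN", "CDF", 0.6, 1002)]),
]


def query_all(producers, sql, params=()):
    """Send the same SQL to every producer and merge the returned tuples."""
    merged = []
    for db in producers:
        merged.extend(db.execute(sql, params).fetchall())
    return merged


all_rows = query_all(producers, "SELECT * FROM CPULoad")
uk_cdf = query_all(
    producers,
    "SELECT Site FROM CPULoad WHERE Country = ? AND Facility = ?",
    ("UK", "CDF"),
)
```

A producer with no matching tuples simply contributes nothing to the merged result, so the consumer sees a single CPULoad table regardless of how many producers sit behind it.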
Joins

Service
URI  VO  type  Contact  site

ServiceStatus
URI          VO     type  up  status
gppse01      alice  SE    y   SE is running
gppse01      atlas  SE    y   SE is running
gppse02      cms    SE    n   SE ERROR 101
lxshare0404  alice  SE    y   SE is running
lxshare0404  atlas  SE    y   SE is running

Result Set (Consumer)
URI  Contact

SELECT S.URI, S.Contact FROM Service S, ServiceStatus SS WHERE (S.URI = SS.URI AND SS.up = 'n')
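The join can be checked locally with Python's standard `sqlite3` module. This sketch puts both tables in one in-memory database, which is a simplification of what R-GMA does across producers. The ServiceStatus rows are those shown on the slide; the Contact and site values in Service are not preserved in the slide, so the ones below are placeholders invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Service (URI TEXT, VO TEXT, type TEXT, Contact TEXT, site TEXT)")
db.execute("CREATE TABLE ServiceStatus (URI TEXT, VO TEXT, type TEXT, up TEXT, status TEXT)")

# Contact and site are hypothetical placeholder values.
db.executemany("INSERT INTO Service VALUES (?, ?, ?, ?, ?)", [
    ("gppse01", "alice", "SE", "admin1@example.org", "site-a"),
    ("gppse01", "atlas", "SE", "admin1@example.org", "site-a"),
    ("gppse02", "cms", "SE", "admin2@example.org", "site-a"),
    ("lxshare0404", "alice", "SE", "admin3@example.org", "site-b"),
    ("lxshare0404", "atlas", "SE", "admin3@example.org", "site-b"),
])

# Rows as shown on the slide.
db.executemany("INSERT INTO ServiceStatus VALUES (?, ?, ?, ?, ?)", [
    ("gppse01", "alice", "SE", "y", "SE is running"),
    ("gppse01", "atlas", "SE", "y", "SE is running"),
    ("gppse02", "cms", "SE", "n", "SE ERROR 101"),
    ("lxshare0404", "alice", "SE", "y", "SE is running"),
    ("lxshare0404", "atlas", "SE", "y", "SE is running"),
])

# Find whom to contact for every service that is down.
result = db.execute(
    "SELECT S.URI, S.Contact "
    "FROM Service S, ServiceStatus SS "
    "WHERE (S.URI = SS.URI AND SS.up = 'n')"
).fetchall()
```

With this data only gppse02 is reported as down, so the result set holds a single row with its URI and contact.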
GLUE Schema
Definition and main goals
Schema: a description of the objects and attributes needed to describe Grid resources, and of the relationships between the objects.
Main goals: define a minimum common schema requirement for interoperability
–Compute Elements, Network Elements, Storage Elements
Glue Schema
Grid Laboratory Uniform Environment (GLUE) Schema
–It is a data model to describe information on grid resources (static and dynamic) in a meaningful way
–It is the result of a collaboration between the EU-DataTAG and iVDGL projects
–EGEE, NorduGrid, LCG and Grid3/OSG contributed to the definition of the schema
XML Schema
–The GLUE Schema is now being mapped to an XML representation
Examples of attributes
Operating System
–OSName
–OSRelease
–OSVersion
QueueState
–RunningJobs
–TotalJobs
–QueueStatus
–WaitQueueLength
–WorstResponseTime
–EstimatedResponseTime
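To see how a matchmaker might use attributes like these, here is a small sketch. It uses plain dictionaries keyed by the attribute names listed above, not the real LDAP rendering of the schema. The function `best_queue`, the `id` field, the status values "Production" and "Closed", and all numbers are assumptions made for the example.

```python
# Hypothetical published records, one per queue.
queues = [
    {"id": "queue-a", "OSName": "Scientific Linux", "QueueStatus": "Production",
     "RunningJobs": 40, "TotalJobs": 55, "WaitQueueLength": 15,
     "EstimatedResponseTime": 900, "WorstResponseTime": 3600},
    {"id": "queue-b", "OSName": "Scientific Linux", "QueueStatus": "Production",
     "RunningJobs": 10, "TotalJobs": 12, "WaitQueueLength": 2,
     "EstimatedResponseTime": 120, "WorstResponseTime": 600},
    {"id": "queue-c", "OSName": "Scientific Linux", "QueueStatus": "Closed",
     "RunningJobs": 0, "TotalJobs": 0, "WaitQueueLength": 0,
     "EstimatedResponseTime": 0, "WorstResponseTime": 0},
    {"id": "queue-d", "OSName": "Debian", "QueueStatus": "Production",
     "RunningJobs": 1, "TotalJobs": 1, "WaitQueueLength": 0,
     "EstimatedResponseTime": 10, "WorstResponseTime": 60},
]


def best_queue(queues, required_os):
    """Pick the open queue with the required OS and the shortest estimated wait."""
    candidates = [
        q for q in queues
        if q["OSName"] == required_os and q["QueueStatus"] == "Production"
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q["EstimatedResponseTime"])


choice = best_queue(queues, "Scientific Linux")
no_match = best_queue(queues, "Windows")
```

Static attributes such as OSName filter out unsuitable resources, while dynamic ones such as EstimatedResponseTime rank the remaining candidates. A common schema is what lets one such rule work across sites.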
References
–gLite 3.0 User Guide
–R-GMA home page
–GLUE Schema
Questions…
Thanks to Roberto Barbera, who first developed these slides