1
Information and Monitoring System
Fabio Scibilia INFN – Catania Catania,
2
Grid Information System
EGEE Tutorial, Roma,
3
Information System: What is it? Why? How?
What is it?
  A system that collects information on the state of resources.
Why?
  To discover the resources of the grid and their nature.
  To provide data that helps whoever manages the workload to do so more efficiently.
  To check the health status of resources.
How?
  By monitoring the state of resources locally and publishing the right information to the information system.
  By adopting a data model that MUST be well known to all components that want to access the monitored information.
  By using different approaches, which the next slides investigate.
4
Design of Information Systems
About measures
  Measures SHOULD be relevant to the goal the users want to achieve.
  Measures SHOULD be accurate enough to be considered valid.
  The rate at which measures are taken MUST be adequate for their intended use.
About the gathering of information
  How and when should collected information be published?
  Where should collected information be stored?
  How long should this information be kept in storage?
Querying the Information System
  Where should queries be sent to get a response?
  What syntax and protocols have to be adopted to make queries?
  What data model is adopted to describe resources?
Security
  Who is allowed to execute queries against the IS, and what types of queries may they run?
  How are user rights and credentials managed?
Others...
5
Adopted Information Systems
The BDII (Berkeley Database Information Index)
  Has been adopted in the LCG middleware as the Information System provider.
  Is an evolution of the Globus Monitoring and Discovery Service (MDS).
  LCG-2 currently adopts BDII as its Information System.
  Is based on a system of Lightweight Directory Access Protocol (LDAP) servers.
The Relational Grid Monitoring Architecture (R-GMA)
  Is a relational implementation of the Grid Monitoring Architecture (GMA) standardized by the Global Grid Forum (GGF).
  Is strongly Web Services oriented.
  Will be adopted by the next releases of the gLite middleware.
6
LCG Information System
7
LCG Information System
LCG adopts a combination of solutions.
Globus MDS
  At the lowest level of the information system.
  Discovers and monitors resources and publishes information.
  Uses Grid Security Infrastructure (GSI) credentials.
  Provides caching.
BDII
  At the highest level of the system.
  Introduced because MDS had scalability problems.
  Used by the Resource Broker for matchmaking.
  Can be configured by each VO.
  Queries the underlying systems periodically (every 2 minutes).
Hierarchical system
  Information is collected at the leaves of a hierarchical tree and travels towards the root.
  Clients can query the hierarchical tree at every level.
  The higher the level against which a query is made, the older the information obtained.
8
Collecting of Information
Information is gathered at different levels.
Lower level: Grid Resource Information Service (GRIS)
  Collects information on the state of a given resource.
  One GRIS on top of each resource.
  A set of scripts and sensors that try to extract useful information about the resource.
Medium level: Grid Index Information Service (GIIS)
  Collects information on the resources of a given site.
  One GIIS for each site.
Higher level: BDII
  Collects information on the resources of a given VO.
  One BDII for each VO (suggested solution).
Way of collecting information
  Pull model: higher-level servers periodically query lower-level servers.
  LDAP query model.
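The pull model and its effect on the age of information can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Gris` and `Index`, the `free_cpus` field and the explicit `now` timestamps are hypothetical and do not come from the MDS or BDII code.

```python
from dataclasses import dataclass, field


@dataclass
class Gris:
    """Leaf of the tree: reports the current state of one resource."""
    resource: str
    free_cpus: int

    def query(self, now: float) -> dict:
        # A GRIS measures on demand, so its answer is stamped with "now".
        return {self.resource: {"free_cpus": self.free_cpus, "measured_at": now}}


@dataclass
class Index:
    """Higher node (GIIS or BDII): caches what it last pulled from its children."""
    children: list
    cache: dict = field(default_factory=dict)

    def pull(self, now: float) -> None:
        # Pull model: the higher level queries the lower level, never the reverse.
        self.cache = {}
        for child in self.children:
            self.cache.update(child.query(now))

    def query(self, now: float) -> dict:
        # Answers come from the cache, so they can be older than "now".
        return dict(self.cache)


def age(entry: dict, now: float) -> float:
    """Seconds elapsed since the value was measured on the resource."""
    return now - entry["measured_at"]


gris = Gris("ce01", free_cpus=8)
giis = Index([gris])
bdii = Index([giis])

giis.pull(now=0.0)     # the site level refreshes at t = 0
gris.free_cpus = 2     # the resource state changes afterwards
bdii.pull(now=120.0)   # the VO level refreshes two minutes later, from the GIIS cache

at_bdii = bdii.query(now=130.0)["ce01"]
at_gris = gris.query(now=130.0)["ce01"]
```

Querying the top of the tree returns the value cached at t = 0, while querying the leaf returns the current value: the higher the level, the older the answer.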
9
Globus MDS (the past)
Globus Monitoring and Discovery Service (MDS)
  Is a hierarchical system based on LDAP servers.
  GRISes are the leaves of the tree.
  GIISes are the intermediate nodes of the tree.
  The user can query the system at every level.
  The higher the information is in the tree, the older it is.
Grid Resource Information Service (GRIS)
  One for each Grid resource (CE or SE).
  Collects information on that resource, static or dynamic.
  Uses measurement techniques such as sensors.
Grid Index Information Service (GIIS)
  One for each site.
  Collects information from the GRISes below it.
  Caches information according to its validity time.
  Queries the GRISes or GIISes below it when needed.
10
BDII (the present)
The Berkeley Database Information Index (BDII)
  Developed within the context of the LCG project.
  Solves the instability problems that MDS shows when the number of sites grows too large.
  Sits on top of the site GIISes.
  One for each VO.
  Centralized system.
  Three levels of hierarchy.
  Accessed by the Workload Management System.
Way of working
  One GRIS for each resource.
  One GIIS for each site, collecting information from the GRISes below it.
  One BDII for a given VO, collecting information from the GIISes below it.
  Two LDAP servers, one for write access and one for read access.
  Every two minutes a cron job runs a script that collects information from a list of site GIISes.
  The list of GIISes is placed in the configuration file of the BDII.
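The slide says that the BDII uses two LDAP servers, one for write access and one for read access, refreshed by a periodic job. One plausible reading, sketched here as an assumption and not as a description of the real BDII scripts, is a double buffer: the refresh fills the write copy from the configured list of sites and then swaps it with the read copy. The class name, the `site_fetchers` mapping and the site names are all hypothetical.

```python
class DoubleBufferedIndex:
    """Hypothetical stand-in for a BDII with separate read and write copies."""

    def __init__(self, site_fetchers):
        # Maps a site name to a callable that returns that site's data.
        # In a real deployment this would be an LDAP query to the site GIIS.
        self.site_fetchers = site_fetchers
        self._read_copy = {}
        self._write_copy = {}

    def refresh(self):
        """One run of the periodic job (every two minutes on the slide)."""
        self._write_copy = {}
        for site, fetch in self.site_fetchers.items():
            try:
                self._write_copy[site] = fetch()
            except OSError:
                # An unreachable site is skipped and drops out of the index.
                continue
        # Clients keep reading the old copy until the new one is complete.
        self._read_copy, self._write_copy = self._write_copy, self._read_copy

    def search(self, site):
        return self._read_copy.get(site)


def unreachable():
    raise OSError("site GIIS not reachable")


index = DoubleBufferedIndex({
    "site-a": lambda: {"free_cpus": 8},
    "site-b": unreachable,
})
before = index.search("site-a")   # nothing has been collected yet
index.refresh()
after = index.search("site-a")
```

The point of the swap is that a slow or failing site never leaves readers looking at a half-built index.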
11
R-GMA (the future)
The Relational Grid Monitoring Architecture (R-GMA)
  Is an implementation of the GMA defined by the GGF.
  Adopts a database model with tables and relations between tables.
  Implements a virtual database.
  The user accesses R-GMA as if it were a classical database (SQL strings).
  Implements different types of queries.
The information
  Is produced and accessed locally at its site.
  Is almost always fresh.
  Can be collected by an entity (a secondary producer) for fast access.
12
R-GMA
13
R-GMA Services
Three types of Producer service
  Primary Producer: produces tuples and stores them in its tuple store.
  Secondary Producer: collects tuples by querying Primary Producers and stores them locally.
  On Demand Producer: produces tuples on demand and never stores them.
One type of Consumer service
  Consumer: queries the virtual database for tuples using SQL query syntax.
Virtual Database with three types of service
  Registry: contains the information needed to match consumer queries with the right producers.
  Schema: stores table schemas as column definitions.
  Mediator: matches queries made by consumers with the right producers.
14
Registration as Producer of Tuples
The Producer
  Calls a declareTable primitive to declare its intention to publish tuples in a given table.
  Hands a predicate string to the mediator in the same call.
The Mediator
  Receives the request.
  Stores this data in the registry.
The Predicate
  A string describing which tuples this producer will publish.
  A conjunction of AND clauses in which the values of some columns are fixed to constants.
Example: "I want to publish tuples in table userTable where userId=10."
[Diagram: the Registry entry for this Producer records Table: userTable and Predicate: userId=10.]
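The registration step can be sketched as follows. This is a minimal illustration, assuming a predicate of the form `column=value AND column=value`; `parse_predicate`, the `Registry` class, `declare_table` and the producer URL are hypothetical names and do not reproduce the real R-GMA API.

```python
import re


def parse_predicate(predicate):
    """Turn 'userId=10 AND aString=x' into {'userId': '10', 'aString': 'x'}."""
    fixed = {}
    for clause in re.split(r"\s+AND\s+", predicate.strip(), flags=re.IGNORECASE):
        if not clause:
            continue  # an empty predicate fixes no column
        column, _, value = clause.partition("=")
        fixed[column.strip()] = value.strip().strip("'")
    return fixed


class Registry:
    """Hypothetical registry: one entry per (producer, table) declaration."""

    def __init__(self):
        self.entries = []

    def declare_table(self, producer_url, table, predicate=""):
        # Record which tuples this producer announces it will publish.
        self.entries.append({
            "producer": producer_url,
            "table": table,
            "predicate": parse_predicate(predicate),
        })


registry = Registry()
registry.declare_table("https://producer.example/p1", "userTable", "userId=10")
```

Values are kept as strings here; a fuller version would consult the table schema to convert them to the column types.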
15
Insertion of a new Tuple
The user stores a new tuple in the virtual database through the Producer: insert into (tuple description).
Tuples remain stored locally at the producer service, because the database is only virtual.
[Diagram: the Producer's tuple storage, an in-memory database, holding the tuple (10, Jackson, 1.3) under the columns userId and aString.]
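A producer's local, in-memory tuple storage can be imitated with the `sqlite3` module from the Python standard library. This is a sketch only: R-GMA does not necessarily use SQLite, and the column name `aReal` for the third value is an assumption made for the example.

```python
import sqlite3

# The producer keeps its tuples in a private in-memory database.
store = sqlite3.connect(":memory:")
store.execute(
    "CREATE TABLE userTable (userId INTEGER, aString TEXT, aReal REAL)"
)

# The user's "insert into ..." ends up in the producer's local storage.
store.execute(
    "INSERT INTO userTable (userId, aString, aReal) VALUES (?, ?, ?)",
    (10, "Jackson", 1.3),
)

# A later query against this producer reads from the same local storage.
rows = store.execute(
    "SELECT userId, aString, aReal FROM userTable WHERE userId = ?", (10,)
).fetchall()
```

Nothing is copied to a central server: the "virtual database" exists only as the union of such local stores.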
16
Getting Tuples
Consumer query: select * from userTable where userId = 10
The Consumer asks the Registry: "Tell me about producer(s) for table 'userTable' WHERE userId = 10".
The Registry, which holds an entry with Table: userTable and Predicate: userId = 10, answers with the URLs of the matching Producers.
[Diagram: the Producer's storage holds the tuple userId = 10, aString = Jackson, real = 1.3, which is returned to the Consumer.]
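The matching between a consumer query and the registered predicates can be sketched as below. It assumes that predicates and query constraints are simple column/value equalities; the function names, the list-based registry and the producer URLs are hypothetical.

```python
def may_hold_matches(predicate, constraints):
    """A producer is ruled out only if it fixes a queried column to another value."""
    return all(predicate.get(col, val) == val for col, val in constraints.items())


def find_producers(registry, table, constraints):
    """Return the URLs of the producers that could hold tuples for this query."""
    return [
        entry["producer"]
        for entry in registry
        if entry["table"] == table and may_hold_matches(entry["predicate"], constraints)
    ]


registry = [
    {"producer": "https://producer.example/p1", "table": "userTable", "predicate": {"userId": 10}},
    {"producer": "https://producer.example/p2", "table": "userTable", "predicate": {"userId": 11}},
    {"producer": "https://producer.example/p3", "table": "userTable", "predicate": {}},
    {"producer": "https://producer.example/p4", "table": "otherTable", "predicate": {"userId": 10}},
]

# select * from userTable where userId = 10
urls = find_producers(registry, "userTable", {"userId": 10})
```

Producer p3 is kept because an empty predicate makes no promise about `userId`, so it might still hold matching tuples.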
17
Types of Producers
Primary Producer
  Creates tuples and stores them in its local tuple storage.
Secondary Producer
  Only stores tuples in its tuple storage.
  Collects tuples by acting as a Consumer of Primary Producers.
On Demand Producer
  Generates tuples just in time, in response to Consumer queries.
  Dedicated code provides the just-in-time production of tuples.
[Diagram: a Consumer service queries tuples from each kind of producer. The Primary and Secondary Producers answer from their tuple storage; the On Demand Producer answers from its tuple generator code.]
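The difference between the three producer types comes down to where the tuples live when a query arrives. The sketch below shows this with three small classes; the class and method names are hypothetical and the tuples are plain dictionaries.

```python
class PrimaryProducer:
    """Creates tuples and keeps them in its own storage."""

    def __init__(self):
        self.store = []

    def insert(self, tup):
        self.store.append(tup)

    def query(self, matches=lambda tup: True):
        return [t for t in self.store if matches(t)]


class SecondaryProducer:
    """Creates nothing itself: it copies tuples from primary producers."""

    def __init__(self, primaries):
        self.primaries = primaries
        self.store = []

    def collect(self):
        # Acts as a consumer of each primary producer.
        self.store = [t for p in self.primaries for t in p.query()]

    def query(self, matches=lambda tup: True):
        return [t for t in self.store if matches(t)]


class OnDemandProducer:
    """Stores nothing: tuples are generated when a query arrives."""

    def __init__(self, generate):
        self.generate = generate  # the "tuples generator code"

    def query(self, matches=lambda tup: True):
        return [t for t in self.generate() if matches(t)]


primary = PrimaryProducer()
primary.insert({"userId": 10})
secondary = SecondaryProducer([primary])
secondary.collect()
on_demand = OnDemandProducer(lambda: [{"userId": n} for n in (9, 10, 11)])

wanted = lambda tup: tup["userId"] == 10
```

All three expose the same `query` call, which is why a consumer does not need to know which kind of producer it is talking to.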
18
Types of Queries
Continuous
  Creates a stream channel from producers to consumers that works in real time (more in the next slide).
Latest
  Gets the latest tuple published by a Producer.
  The Producer specifies a Latest Retention Period (LRP) as the expiry deadline for the produced tuple.
  Tuples are stored in a Latest Storage.
History
  Gets all tuples younger than a History Retention Period (HRP).
  Tuples are stored in a History Storage.
Static
  Supported only by On Demand Producers.
  Makes non-R-GMA data storage accessible through the R-GMA infrastructure.
[Diagram: inserted tuples go to a Latest Tuples Store and a History Tuples Store; the SQL Query Processor serves continuous, latest and history queries.]
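The effect of the two retention periods on latest and history queries can be sketched as follows. The `TupleStore` class, the use of a `key` to decide which tuple replaces which in the latest store, and the explicit `now` timestamps are assumptions made for the example.

```python
class TupleStore:
    """Hypothetical producer storage with a latest store and a history store."""

    def __init__(self, latest_retention, history_retention):
        self.lrp = latest_retention    # Latest Retention Period, in seconds
        self.hrp = history_retention   # History Retention Period, in seconds
        self.latest = {}               # key -> (timestamp, tuple)
        self.history = []              # (timestamp, tuple), oldest first

    def insert(self, key, tup, now):
        self.latest[key] = (now, tup)      # overwrites the previous latest tuple
        self.history.append((now, tup))    # history keeps every tuple

    def query_latest(self, now):
        return [t for ts, t in self.latest.values() if now - ts <= self.lrp]

    def query_history(self, now):
        return [t for ts, t in self.history if now - ts <= self.hrp]


store = TupleStore(latest_retention=60, history_retention=600)
store.insert("ce01", {"free_cpus": 8}, now=0)
store.insert("ce01", {"free_cpus": 2}, now=100)

latest_at_120 = store.query_latest(now=120)    # the newest tuple is 20 s old
history_at_120 = store.query_history(now=120)  # both tuples are within the HRP
latest_at_200 = store.query_latest(now=200)    # the newest tuple is past its LRP
history_at_650 = store.query_history(now=650)  # the first tuple is past the HRP
```

A latest query can legitimately return nothing: once the LRP has expired, the producer no longer vouches for its last value.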
19
Focusing on Continuous Queries
Way of working
  A Consumer asks the Registry for the Producers that put tuples satisfying its query into the virtual database.
  The Registry responds with a list of suitable Producers.
  The Consumer subscribes to these Producers to get all new tuples that satisfy the query.
  Each time any of the suitable Producers inserts a new tuple that satisfies the query, that tuple is streamed to the Consumer.
  One-way streams of tuples are thus created between the suitable Producers and the Consumer.
  The Registry keeps track of the Consumer so that it can notify it when a new suitable Producer is added.
Duration of the stream
  A time parameter defines how long the channel has to work.
  The Registry periodically checks that the Consumer is alive. If it is not, the Registry removes the Consumer from its list of Continuous Consumers.
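The subscription flow described above can be sketched with three small classes. This is an in-process illustration only, with direct method calls standing in for network streams; all class and method names are hypothetical, and the channel lifetime parameter is left out.

```python
class Producer:
    def __init__(self, name):
        self.name = name
        self.subscribers = []   # (matches, deliver) pairs, one per consumer

    def subscribe(self, matches, deliver):
        self.subscribers.append((matches, deliver))

    def insert(self, tup):
        # Every new tuple is pushed to each consumer whose query it satisfies.
        for matches, deliver in self.subscribers:
            if matches(tup):
                deliver(tup)


class Consumer:
    def __init__(self, table, matches):
        self.table = table
        self.matches = matches
        self.received = []
        self.alive = True

    def attach(self, producer):
        producer.subscribe(self.matches, self.received.append)

    def start(self, registry):
        # Ask the registry for suitable producers, then subscribe to each.
        for producer in registry.start_continuous(self):
            self.attach(producer)


class Registry:
    def __init__(self):
        self.producers = {}     # table name -> list of producers
        self.continuous = []    # consumers with a running continuous query

    def add_producer(self, table, producer):
        self.producers.setdefault(table, []).append(producer)
        # Notify running continuous consumers about the new producer.
        for consumer in self.continuous:
            if consumer.table == table:
                consumer.attach(producer)

    def start_continuous(self, consumer):
        self.continuous.append(consumer)
        return list(self.producers.get(consumer.table, []))

    def drop_dead_consumers(self):
        # The periodic "are you still alive?" check.
        self.continuous = [c for c in self.continuous if c.alive]


registry = Registry()
p1, p2 = Producer("P1"), Producer("P2")
registry.add_producer("userTable", p1)

consumer = Consumer("userTable", lambda tup: tup["userId"] == 10)
consumer.start(registry)

p1.insert({"userId": 10, "aString": "Jackson"})   # streamed to the consumer
p1.insert({"userId": 11, "aString": "Smith"})     # does not satisfy the query
registry.add_producer("userTable", p2)            # late producer gets attached
p2.insert({"userId": 10, "aString": "Jones"})     # streamed as well
```

In this sketch a dead consumer is only removed from the registry's list; a fuller version would also cancel its subscriptions at the producers.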
20
Continuous Queries
[Diagram: the Consumer submits "Select * from Artists where surname = 'Jackson'" with type of query = continuous. The Registry, holding an entry with Table: userTable and Predicate: userId == 10, answers with Producers P1 and P2. The Consumer subscribes to P1 and P2, which stream tuples to it over SSL/TLS. Periodically the Registry asks the Consumer: "Are you still alive?"]
21
Security in R-GMA
Network Security
  All network communications run over SSL-encrypted channels.
  One weak point: the URI to which On Demand Producers forward their messages, because there are no guarantees about it.
Authentication
  X.509 certificates.
  Mutual authentication between peers.
Authorization
  Grid credentials extracted from the certificate.
  Adoption of VOMS.
Ownership
  One R-GMA infrastructure for each VO.
  One owner for each VDB, who decides who can read, write, etc.
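Mutual authentication with X.509 certificates means that the server also demands a certificate from the client. With the Python standard library `ssl` module this can be sketched as below. It is a generic TLS illustration, not R-GMA code: the function name and its file path parameters are hypothetical, and grid proxy certificates and VOMS attributes are not handled.

```python
import ssl


def make_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side TLS context that requires a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Mutual authentication: refuse peers that present no valid certificate.
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # The service's own certificate and private key.
        context.load_cert_chain(cert_file, key_file)
    if ca_file:
        # The certificate authorities trusted to sign client certificates.
        context.load_verify_locations(cafile=ca_file)
    return context


context = make_server_context()
```

After the handshake, the authorization step would read the peer's identity from the verified certificate, for example with `SSLSocket.getpeercert()`.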
22
GLUE Schema
23
GLUE Schema
Grid Laboratory Uniform Environment (GLUE) Schema
  A data model for describing information on grid resources (static and dynamic) in a meaningful way.
  The result of a collaboration between the EU-DataTAG and iVDGL projects.
  EGEE, NorduGrid, LCG and Grid3/OSG have since contributed to the definition of the schema.
XML Schema
  The GLUE Schema is now being mapped to an XML representation.
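Because the information system is queried through LDAP and the data model is GLUE, a client query is in practice an LDAP search filter over GLUE attributes. The sketch below only builds such a filter string and contacts no server. The helper names are hypothetical, and the object class and attribute names (`GlueCE`, `GlueCEStateStatus`) are assumed from the LDAP rendering of GLUE; check them against the deployed schema before relying on them.

```python
def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    escaped = []
    for ch in value:
        if ch in "\\*()\0":
            escaped.append("\\%02x" % ord(ch))   # backslash plus two hex digits
        else:
            escaped.append(ch)
    return "".join(escaped)


def and_filter(**equals):
    """Build '(&(attr=value)...)' from keyword arguments, in the order given."""
    clauses = "".join(
        "(%s=%s)" % (attr, escape_filter_value(str(value)))
        for attr, value in equals.items()
    )
    return "(&%s)" % clauses


# All computing elements that are currently in production (assumed attribute names).
ce_filter = and_filter(objectClass="GlueCE", GlueCEStateStatus="Production")
```

The resulting string can be passed to any LDAP client as the search filter; escaping matters because attribute values coming from users may contain parentheses or asterisks.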
24
Site Element
25
Cluster Element
26
Computing Element
27
More information
R-GMA overview page
R-GMA in EGEE
R-GMA Documentation
GLUE Schema