WP3 Werner Nutt (Heriot-Watt University) R-GMA – DataGrid’s Monitoring System 1/7/2003.



R-GMA = Relational Grid Monitoring Architecture
Grid Monitoring and Information System developed within DataGrid (Work Package 3)
Based on the “Grid Monitoring Architecture” of the Global Grid Forum
Code is open source and freely available
Homepage: type “wp3” into Google

Contributors
Heriot-Watt, Edinburgh
  – Andrew Cooke, Alasdair Gray, Lisha Ma, Werner Nutt
IBM-UK
  – James Magowan, Manfred Oevers, Paul Taylor
Queen Mary, University of London
  – Roney Cordenonsi
CCLRC/PPARC
  – Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, Jason Leake
  – Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
SZTAKI, Hungary
  – Peter Kacsuk, Norbert Podhorszki
Trinity College Dublin
  – Brian Coghlan, Stuart Kenny, David O’Callaghan

Overview
Grid monitoring: Requirements
The R-GMA approach: A virtual monitoring database
Components of R-GMA:
  – Schema
  – Producers and Consumers
  – Registry
  – Republishers
Query Planning

Major Components of DataGrid
[Diagram labels: Storage Element, User Interface, Resource Broker, Logging and Bookkeeping, Replica Catalogue, Computer, Computing Element, Monitoring System; flows: Status Information, Data Transfer, Job Submission]

WP7: R-GMA Collects Network Monitoring Data

The Grid Monitoring Problem
In a Grid we have
  – Computers
  – Storage elements
  – Network nodes and connections
  – Application programmes, …
Monitoring:
  – What is the current state of the system?
  – How did the system behave in the past?

Monitoring Data Come in Two Kinds
A Grid monitoring system makes available two kinds of data:
static data “pools”, e.g., databases on
  – network topology, nodes connected
  – applications available (versions, licences, ...)
“streams” of data, e.g.,
  – sensor data (cpu load, network traffic, ...)
Data streams may give rise to data pools if they are archived
Today: R-GMA is tailored towards streams, but not pools

Examples of Monitoring Queries
“Show me the (average) cpu-load of computers at Heriot-Watt!”
“Between which nodes was yesterday's average transportation time for 1 MB packets higher than 0.… seconds?”
“For every computing element CE, how many computers of CE currently have a cpu-load of no more than 30%?”

Grid Monitoring Requirements
Support for publishing data “pools” and “streams”
Support for locating data sources (automatic, if possible)
Queries with different temporal interpretations (continuous, latest state, history)
Scalability (there may be thousands of data sources)
Resilience to failure (data sources may become unavailable)
Flexibility (we don’t know which queries will be posed)

Architecture Approach 1: A Monitoring Data Warehouse
Idea:
  – store all data about the Grid status in a huge database
  – and query it
Not realistic:
  Loading takes time
  Data occupy space
  Connections to the warehouse may fail
Often monitoring data flow as data streams, and queries ask for data streams as output

Approach 2: Monitoring with a “Multi-agent System”
The Grid Monitoring Architecture (GMA) of the Global Grid Forum distinguishes between:
Consumers of information
Producers of information
Directory Service
  – Producers register their supply
  – Consumers register their demand
Directory Service mediates between producers and consumers
[Diagram labels: Consumer, Producer, Monitoring Application, Data Base, Sensor, Directory Service, find/register]
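
The register/find interaction on this slide can be sketched as a small in-memory directory. This is only an illustration in Python: the class `DirectoryService`, its method names, the topic string and the host names are hypothetical and are not taken from GMA or R-GMA.

```python
from collections import defaultdict


class DirectoryService:
    """Hypothetical in-memory directory mapping a topic to producer addresses."""

    def __init__(self):
        self._producers = defaultdict(list)

    def register_producer(self, topic, address):
        # A producer announces what it can supply.
        self._producers[topic].append(address)

    def find_producers(self, topic):
        # A consumer looks up who supplies the topic it needs.
        return list(self._producers.get(topic, []))


directory = DirectoryService()
directory.register_producer("cpuLoad", "producer-1.example.org")
directory.register_producer("cpuLoad", "producer-2.example.org")
matches = directory.find_producers("cpuLoad")
```

After the lookup the consumer would contact the producers directly; the directory only mediates the introduction.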

Questions about GMA:
Which kinds of producers and consumers are there?
In which language do producers register their supply and consumers their demand?
What is the meaning of a registration?
How does a consumer find suitable producers? And how does a producer find suitable consumers?
Producers have different capabilities to answer queries (e.g. selections, joins, …). Which of them should they register?

R-GMA: A Virtual Monitoring Data Warehouse
Language of producers and consumers: relational queries (SQL)
Vocabulary: Relations in a global schema
Consumer: poses queries over global schema
Producer:
  – has a type (stream p., database p.)
  – publishes relations R1, …, Rk
  – for every R, registers a simple view V on the global schema
[Diagram labels: Consumer, Query, Global Schema S, Registry, Views V1, V2, ..., Vn on S, DB-Producer, DB, Stream Producer, Sensor]

Schema & Contributions
[Load and Timestamp values are not preserved in this transcript]

CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF
UK       RAL   ATLAS
UK       GLA   CDF
UK       GLA   ALICE
CH       CERN  ALICE
CH       CERN  CDF

CPULoad (Stream Producer 1)
UK       RAL   CDF
UK       RAL   ATLAS

CPULoad (Stream Producer 2)
UK       GLA   CDF
UK       GLA   ALICE

CPULoad (Stream Producer 3)
CH       CERN  ATLAS
CH       CERN  CDF

Contributions are Views
CPULoad (Producer 1): rows UK RAL CDF, UK RAL ATLAS
  SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'
CPULoad (Producer 2): rows UK GLA CDF, UK GLA ALICE
  SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
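
To make the idea concrete, the following sketch uses Python's `sqlite3` module to model each producer's contribution as a selection view over `cpuLoad`. It is not how R-GMA stores data (the global relation is virtual there). The load and timestamp values are invented, and the third view is an assumption made by analogy with the two views shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, "
    "load REAL, timestamp INTEGER)"
)
# Illustrative rows; the load and timestamp values are made up.
rows = [
    ("UK", "RAL", "CDF", 0.25, 100),
    ("UK", "RAL", "ATLAS", 1.5, 100),
    ("UK", "GLA", "CDF", 0.5, 100),
    ("UK", "GLA", "ALICE", 0.75, 100),
    ("CH", "CERN", "ALICE", 1.0, 100),
    ("CH", "CERN", "CDF", 2.0, 100),
]
conn.executemany("INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)", rows)

views = {
    "producer1": "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'",
    "producer2": "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'",
    # Assumed by analogy; the slide only gives the first two views.
    "producer3": "SELECT * FROM cpuLoad WHERE country = 'CH' AND site = 'CERN'",
}
for name, query in views.items():
    conn.execute(f"CREATE VIEW {name} AS {query}")

producer1_rows = conn.execute("SELECT * FROM producer1").fetchall()
# The union of all contributions gives back the whole global relation.
union_rows = conn.execute(
    "SELECT * FROM producer1 UNION SELECT * FROM producer2 "
    "UNION SELECT * FROM producer3"
).fetchall()
```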

Keys in the Global Schema
Network throughput: tp(src, dest, method, pcktSize, timestamp, time)
Intuitively, tp has the primary key (src, dest, method, pcktSize, timestamp).
We need to know the primary keys
  – to understand the global schema
  – to answer latest snapshot queries
Primary keys are declared, but not enforced!
Although sometimes they hold globally if they hold locally!

Metaphor: Roles and Agents
R-GMA Clients: Grid components or Grid applications
Clients can play the roles of producers or consumers
A client would need special capabilities for a role:
  Clients are supported in their roles by agents
Implementation:
  APIs for client roles: “new StreamProducer(…)”
  Agents are objects on a Web server

Primary Producers
Database producer
  supports queries over a fixed set of tuples (static queries)
  can be used to publish a database
Stream producer
  supports queries over a changing set of tuples (continuous queries)
  supports “latest snapshot queries”
    – offers up-to-date values for each primary key in a db
Today: DatabaseProducers and StreamProducers in R-GMA are different from the above!

Communication Modes of Stream Producers
Stream Producers may offer two communication modes for continuous queries:
  – lossless (… but tuples could become stale)
  – lossy (… but tuples are fresh)
[Diagram labels: Producer, Producer Servlet, Queue, Consumer Servlet, Consumer]
Today: R-GMA’s StreamProducers are resilient and support lossless communication
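
The trade-off between the two modes can be illustrated with two buffers between a producer and a slow consumer. This is a minimal sketch, not R-GMA's queue implementation; the buffer size of 3 and the integer "tuples" are arbitrary choices for the example.

```python
from collections import deque

lossless_queue = deque()        # unbounded: nothing is dropped, but old tuples linger
lossy_queue = deque(maxlen=3)   # bounded: the oldest tuple is discarded when full

# The producer emits seven tuples before the consumer reads anything.
for sequence_number in range(1, 8):
    lossless_queue.append(sequence_number)
    lossy_queue.append(sequence_number)

lossless_backlog = list(lossless_queue)  # every tuple, including stale ones
lossy_backlog = list(lossy_queue)        # only the freshest three tuples
```

A consumer draining `lossless_queue` sees the complete history at the cost of reading stale tuples first, while a consumer draining `lossy_queue` sees only recent tuples and has lost the rest.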

Republishers Publish Query Answers
Archiver: shows the history of a stream.
Stream Republisher: enables
  – merging,
  – thinning,
  – summarising
of streams …
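
As a rough illustration of the three stream republisher operations, the sketch below treats a stream as a timestamp-ordered list of `(timestamp, load)` pairs. The function names, the fixed-size window and the sample values are assumptions made for this example only.

```python
import heapq


def merge_streams(*streams):
    """Merge timestamp-ordered streams into one timestamp-ordered stream."""
    return list(heapq.merge(*streams, key=lambda t: t[0]))


def thin(stream, every):
    """Keep only every `every`-th tuple of the stream."""
    return stream[::every]


def summarise(stream, window):
    """Average the load over consecutive groups of `window` tuples."""
    result = []
    for i in range(0, len(stream), window):
        chunk = stream[i:i + window]
        mean = sum(load for _, load in chunk) / len(chunk)
        # Stamp the summary with the last timestamp of the group.
        result.append((chunk[-1][0], mean))
    return result


stream_a = [(1, 0.25), (3, 0.5)]
stream_b = [(2, 0.75), (4, 1.0)]
merged = merge_streams(stream_a, stream_b)
thinned = thin(merged, 2)
summary = summarise(merged, 2)
```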

Republishers in R-GMA Today
Republishers are called “archivers” (although some of them don't archive anything)
An archiver (= republisher)
  is defined by a query
  consumes only from “stream producers”
  publishes the query result according to its type, using
    – a “stream producer”, or
    – a “latest snapshot producer”, or
    – a “database producer” (which keeps an archive)
Republishers are used to answer complex queries!

The Next Step: Hierarchies of Stream Republishers
[Diagram labels: National Republisher (country = 'uk'); Local/site Republishers (site = 'hw', site = 'ral'); Stream Producers (ral, hw)]

Republisher Hierarchies: The Issues
Republishers are defined by queries: hierarchies have to be maintained automatically
  new stream producers must only be added to republishers at “lowest level”
  hierarchy has to be replanned if a republisher fails
  difficult: transition from one plan to the other without loss of tuples
How well can we describe the content of a stream?
Possibly need for descriptions that join
  stream relations: CPULoad(machineID, load, timestamp)
  static relations: locatedAt(machineID, site)

What is the Meaning of a Query in R-GMA?
Assumption: the views of (primary) producers are selections on a single relation, i.e., queries of the form
  SELECT * FROM cpu_load WHERE machine_id = 'AB123' AND loc = 'hw'
(each producer contributes its parts of a relation)
The virtual database contains the union of the data of all the primary producers
Conceptually, a query is evaluated over the entire virtual db

Stream Queries can have Various Temporal Interpretations
Consider a query over the relation “Transport Time”
  tt(src, dest, pcktSize, method, timestamp, time)
  SELECT * FROM tt WHERE src = ral AND dest = bologna
What is meant? Measurements
  – from now? (Continuous Query)
  – up until now? (History Query)
  – right now? (Latest Snapshot Query)
Today: Queries can be “flagged” with their type
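
The three interpretations can be compared on a small, finite list of `tt` tuples. This is a sketch under stated assumptions: `now` stands for the time the query is posed, the sample measurements are invented, a real continuous query would receive its tuples as they arrive rather than from a list, and the latest snapshot is taken per key without the timestamp attribute.

```python
KEY = ("src", "dest", "pcktSize", "method")


def history(stream, now):
    """Measurements up until now."""
    return [t for t in stream if t["timestamp"] <= now]


def continuous(stream, now):
    """Measurements from now on (simulated over an already known list)."""
    return [t for t in stream if t["timestamp"] > now]


def latest_snapshot(stream, now):
    """The most recent measurement per key, as of now."""
    latest = {}
    for t in history(stream, now):
        key = tuple(t[a] for a in KEY)
        if key not in latest or t["timestamp"] > latest[key]["timestamp"]:
            latest[key] = t
    return list(latest.values())


def measurement(pckt_size, timestamp, time):
    # Hypothetical helper that builds one tt tuple for the example.
    return {"src": "ral", "dest": "bologna", "pcktSize": pckt_size,
            "method": "ping", "timestamp": timestamp, "time": time}


stream = [measurement(1, 10, 0.5), measurement(2, 15, 0.75),
          measurement(1, 20, 0.25), measurement(1, 30, 1.0)]
now = 20
past = history(stream, now)
future = continuous(stream, now)
current = latest_snapshot(stream, now)
```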

Advanced Queries: Mixing Temporal Query Types
“Which connections currently have a transportation time that is higher than last week's average?” (latest snapshot and history)
“Show me the cpu load of those machines where it is lower than yesterday's load average!” (continuous and history)
We do not intend to support such queries in R-GMA!

In R-GMA Query Answering Needs Mediation
Suppose P1, P2 publish for tp (throughput)
  P1: … WHERE src = hw
  P2: … WHERE src = ral AND pcktSize > 20
A global consumer poses its query over global relations
  SELECT * FROM tp WHERE pcktSize > 10
A mediator translates this into queries over local relations
  SELECT * FROM P1.tp WHERE pcktSize > 10 UNION SELECT * FROM P2.tp
Today: R-GMA’s mediator handles simple queries like the one above
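
The claim that the translated query returns the same answer as the global one can be checked on sample data. The sketch below uses Python's `sqlite3` with two local tables standing in for P1.tp and P2.tp; the table names `p1_tp` and `p2_tp`, the reduced column list and the rows are assumptions for illustration. The condition on `pcktSize` can be dropped for P2 because its registered view already guarantees `pcktSize > 20`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("p1_tp", "p2_tp"):
    conn.execute(f"CREATE TABLE {table} (src TEXT, pcktSize INTEGER, time REAL)")

# P1 publishes src = 'hw'; P2 publishes src = 'ral' AND pcktSize > 20.
conn.executemany("INSERT INTO p1_tp VALUES (?, ?, ?)",
                 [("hw", 5, 0.25), ("hw", 50, 1.0)])
conn.executemany("INSERT INTO p2_tp VALUES (?, ?, ?)",
                 [("ral", 30, 0.5), ("ral", 100, 1.5)])

# The global relation is the union of the producers' contributions.
conn.execute("CREATE VIEW tp AS SELECT * FROM p1_tp UNION SELECT * FROM p2_tp")

global_answer = conn.execute(
    "SELECT * FROM tp WHERE pcktSize > 10 ORDER BY src, pcktSize"
).fetchall()

# The mediated plan: filter P1, take all of P2.
mediated_answer = conn.execute(
    "SELECT * FROM p1_tp WHERE pcktSize > 10 "
    "UNION SELECT * FROM p2_tp ORDER BY src, pcktSize"
).fetchall()
```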

Global and Local Consumers
Global consumers pose queries over global relations
  SELECT * FROM tp WHERE pcktSize > 10,
which are translated into queries over local relations
  SELECT * FROM P1.tp WHERE pcktSize > 10 UNION SELECT * FROM P2.tp
Local consumers pose queries over local relations directly
  SELECT * FROM P1.tp WHERE method = ping
Today: a consumer can be global or local, but local relations cannot be referred to explicitly

How does the Mediator Find Suitable Publishers?
P1, P2, P3 publish for tt (Transport Time)
  P1: … src = hw
  P2: … src = ral AND pcktSize > 20
  P3: … src = ral AND method = ping
  Q: SELECT * FROM tt WHERE src = ral AND method = ping
We see: P1 is not suitable for Q, but P2 and P3 are. Why?
  src = hw AND src = ral AND method = ping is never true
  src = ral AND pcktSize > 20 AND … is sometimes true
Satisfiability Test!
Today: implemented
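
A satisfiability test for conjunctions like these can be sketched in a few lines. The representation of a condition as an `(attribute, operator, value)` triple and the restriction to `=` and `>` are assumptions of this example, not a description of R-GMA's mediator.

```python
def satisfiable(conditions):
    """Return True if a conjunction of '=' and '>' conditions can hold."""
    equals = {}
    lower_bounds = {}
    for attribute, operator, value in conditions:
        if operator == "=":
            # Two different required values for one attribute clash.
            if attribute in equals and equals[attribute] != value:
                return False
            equals[attribute] = value
        elif operator == ">":
            previous = lower_bounds.get(attribute, value)
            lower_bounds[attribute] = max(previous, value)
        else:
            raise ValueError(f"unsupported operator: {operator}")
    # A required value must also exceed any lower bound on the attribute.
    return all(equals[a] > lower_bounds[a] for a in equals if a in lower_bounds)


views = {
    "P1": [("src", "=", "hw")],
    "P2": [("src", "=", "ral"), ("pcktSize", ">", 20)],
    "P3": [("src", "=", "ral"), ("method", "=", "ping")],
}
query = [("src", "=", "ral"), ("method", "=", "ping")]

# A producer is relevant if its view and the query can be true together.
relevant = [name for name, view in views.items() if satisfiable(view + query)]
```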

… So Which Publishers Should the Mediator Ask?
  P2: … src = ral AND pcktSize > 20
  P3: … src = ral AND method = ping
  Q: SELECT * FROM tt WHERE src = ral AND method = ping
All answers to Q returned by P2 are also returned by P3: whenever
  src = ral AND pcktSize > 20 AND src = ral AND method = ping
is true, then
  src = ral AND method = ping AND src = ral AND method = ping
is true.
Hence, R-GMA only needs to ask P3
Entailment Test!
Needed for Republisher Hierarchies! (not yet implemented)
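
The entailment check in this example can be sketched as follows. The triple representation of conditions is an assumption of the example, the premise is assumed to be satisfiable, and the test is deliberately conservative: it may answer False for an entailment it cannot see, but it should not answer True wrongly for conjunctions of `=` and `>` conditions.

```python
def implies(premise, condition):
    """Check whether one condition follows from a conjunction of conditions."""
    attribute, operator, value = condition
    for p_attribute, p_operator, p_value in premise:
        if p_attribute != attribute:
            continue
        if operator == "=" and p_operator == "=" and p_value == value:
            return True
        if operator == ">" and p_operator == "=" and p_value > value:
            return True
        if operator == ">" and p_operator == ">" and p_value >= value:
            return True
    return False


def entails(premise, conclusion):
    """Premise entails conclusion if every condition of the conclusion follows."""
    return all(implies(premise, c) for c in conclusion)


p2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
p3 = [("src", "=", "ral"), ("method", "=", "ping")]
query = [("src", "=", "ral"), ("method", "=", "ping")]

# Are P2's answers to the query covered by P3, and the other way round?
p2_covered_by_p3 = entails(p2 + query, p3 + query)
p3_covered_by_p2 = entails(p3 + query, p2 + query)
```

Since P2's answers are covered by P3 but not the reverse, asking P3 alone is enough, provided P3's registered view is a complete description of what it publishes.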

… But What Did the Producers Promise?
P registers view V
Does P promise
  – some of V? (sound description)
  – all of V? (sound and complete description)
The Entailment Test only makes sense when the registered views are sound and complete descriptions
Producers should register completeness flags

… Why May a Producer not be Complete?
The language of views is more restricted than the language of queries
  Hence: republishers may be unable to say exactly what they publish
Archivers may archive in lossy mode
Producers may lose tuples
A producer may not know everything about the real world
Open to debate

Summary (1)
Monitoring data come in Pools and Streams
Global Schema
  primary keys
Types of Stream Queries
  continuous vs. history vs. latest snapshot
Producers
  DB producers: publish database
  stream producers: lossless vs. lossy communication modes