WP3 Werner Nutt (Heriot-Watt University) R-GMA – Architecture and Query Mediation 24/4/2003.

1 WP3 Werner Nutt (Heriot-Watt University) R-GMA – Architecture and Query Mediation 24/4/2003

2 Contributors
Rob Byrom (RAL)
Andy Cooke (Heriot-Watt)
Roney Cordenonsi (QMUL)
Abdeslem Djaoui (RAL)
Laurence Field (PPARC)
Steve Fisher (RAL)
Alasdair Gray (Heriot-Watt)
Steve Hicks (RAL)
Jason Leake (RAL)
Lisha Ma (Heriot-Watt)
James Magowan (IBM-UK)
Werner Nutt (Heriot-Watt)
Norbert Podhorszki (SZTAKI)
Manish Soni (PPARC)
Paul Taylor (IBM-UK)
Antony Wilson (PPARC)

3 Grid Monitoring: Where are the Concepts?
There are two styles of talking about the Grid:
– General metaphors (virtual organisations, services, …)
– Low-level technicalities and jargon (LDAP, XML, SOAP, OGSA, OGSI, ...)
What is missing:
– Clear definitions of the problems
– Intuitive concepts for solving them
Both are needed for communication with users and developers alike

4 The Grid Monitoring Problem
In a Grid we have
– Computers
– Storage elements
– Network nodes and connections
– Application programmes, …
Monitoring:
– What is the current state of the system?
– How did the system behave in the past?

5 Monitoring Data Come in Two Kinds
A Grid monitoring system makes available two kinds of data
static data pools, e.g., databases on
– network topology, nodes connected
– applications available (versions, licences, ...)
streams of data, e.g.,
– sensor data (cpu load, network traffic, ...)
Data streams may give rise to data pools if they are archived
Today: R-GMA is tailored towards streams, but not pools

6 Examples of Monitoring Queries
Show me the (average) cpu-load of computers at Heriot-Watt!
Between which nodes was yesterday's average transportation time for 1 MB packets higher than 0.… seconds?
For every node N, how many computers connected to N currently have a cpu-load of no more than 30%?

7 Stream Queries can have Various Temporal Interpretations
Consider a query over the relation Transport Time
tt(src, dest, pcktSize, method, timestamp, time)
SELECT * FROM tt WHERE src = ral AND dest = bologna
What is meant? Measurements
– from now? (Continuous Query)
– up until now? (History Query)
– right now? (Latest Snapshot Query)
Today: Queries can be flagged with their type
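
The three interpretations can be made concrete with a small Python sketch. It is illustrative only: the tuple layout follows the tt relation on the slide, but the function names, the sample values, the in-memory list standing in for a stream, and the choice to group latest values by (src, dest, pcktSize, method) are assumptions, not R-GMA's API.

```python
from typing import Iterable, Iterator, List, NamedTuple


class TT(NamedTuple):
    """One Transport Time measurement, mirroring tt(...) on the slide."""
    src: str
    dest: str
    pcktSize: int
    method: str
    timestamp: int
    time: float


def matches(t: TT) -> bool:
    # WHERE src = ral AND dest = bologna
    return t.src == "ral" and t.dest == "bologna"


def history_query(stream: Iterable[TT], now: int) -> List[TT]:
    # "up until now": every matching tuple measured at or before `now`
    return [t for t in stream if matches(t) and t.timestamp <= now]


def continuous_query(stream: Iterable[TT], now: int) -> Iterator[TT]:
    # "from now": yield matching tuples as they arrive after `now`
    for t in stream:
        if matches(t) and t.timestamp > now:
            yield t


def latest_snapshot_query(stream: Iterable[TT], now: int) -> List[TT]:
    # "right now": the most recent matching tuple per measurement series
    # (assumed here to be identified by src, dest, pcktSize, method)
    latest = {}
    for t in stream:
        if matches(t) and t.timestamp <= now:
            key = (t.src, t.dest, t.pcktSize, t.method)
            if key not in latest or t.timestamp > latest[key].timestamp:
                latest[key] = t
    return list(latest.values())


# Hypothetical sample data; timestamps are plain integers for brevity.
stream = [
    TT("ral", "bologna", 1, "ping", 10, 0.30),
    TT("ral", "bologna", 1, "ping", 20, 0.25),
    TT("hw", "bologna", 1, "ping", 20, 0.40),
    TT("ral", "bologna", 1, "ping", 30, 0.35),
]
NOW = 20
```

The same SELECT statement therefore yields three different answers depending on the flag attached to it, which is why the query type has to be stated explicitly.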

8 Advanced Queries: Mixing Temporal Query Types
Which connections currently have a transportation time that is higher than last week's average? (latest snapshot and history)
Show me the cpu load of those machines where it is lower than yesterday's load average! (continuous and history)
We do not intend to support such queries in R-GMA!

9 Architecture Approach 1: A Monitoring Data Warehouse
Idea:
– store all data about the Grid status in a huge database
– and query it
Not realistic:
Loading takes time
Data occupy space
Connections to the warehouse may fail
Monitoring data often flow as data streams, and queries ask for data streams as output

10 Approach 2: Monitoring with a Multi-agent System
The Grid Monitoring Architecture (GMA) of the Global Grid Forum distinguishes between:
Consumers of information
Producers of information
Directory Service
– Producers register their supply
– Consumers register their demand
[Diagram: Consumer, Producer, Monitoring Application, Database, Sensor, Directory Service; arrows labelled find/register]
Directory Service mediates between producers and consumers

11 Questions about GMA:
Which kinds of producers and consumers are there?
In which language do producers register their supply and consumers their demand?
What is the meaning of a registration?
How does a consumer find suitable producers? And how does a producer find suitable consumers?
Producers have different capabilities to answer queries (e.g. selections, joins, …). Which of them should they register?

12 R-GMA: A Virtual Monitoring Data Warehouse
Language of producers and consumers: relational queries (SQL)
Vocabulary: Relations in a global schema
[Diagram: Consumer, DB Producer (DB), Stream Producer (Sensor), Registry, Query, Global Schema S, views V1, V2, ..., Vn on S]
Consumer: poses queries over global schema
Producer:
– has a type (stream producer, database producer)
– publishes relations R1, …, Rk
– for every R, registers a simple view V on the global schema

13 Primary Producers
Database producer
supports queries over a fixed set of tuples (static queries)
can be used to publish a database
Stream producer
supports queries over a changing set of tuples (continuous queries)
supports latest snapshot queries
– offers up-to-date values for each primary key
Today: DatabaseProducers and StreamProducers in R-GMA are different from the above!

14 Communication Modes of Stream Producers
Stream Producers may offer two communication modes for continuous queries:
– lossless (… but tuples could become stale)
– lossy (… but tuples are fresh)
[Diagram: Producer, Producer Servlet with queue, Consumer Servlet with queue, Consumer]
Today: R-GMA's StreamProducers are resilient and support lossless communication
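
The trade-off between the two modes can be sketched with two queues. This is a minimal illustration under assumptions: the bounded buffer, its capacity of 2, and the tuple format are hypothetical and say nothing about how R-GMA's servlets are actually built.

```python
from collections import deque


def publish(queue, tuples):
    """Producer side: push every tuple into the consumer-facing queue."""
    for t in tuples:
        queue.append(t)


# Hypothetical sensor readings: (metric, timestamp)
readings = [("cpu_load", ts) for ts in range(5)]

# Lossless: an unbounded queue keeps everything, so a slow consumer
# eventually reads every tuple, but the oldest ones may be stale.
lossless = deque()

# Lossy: a bounded queue silently drops the oldest tuples when full,
# so a slow consumer only ever sees recent (fresh) tuples.
lossy = deque(maxlen=2)

publish(lossless, readings)
publish(lossy, readings)

lossless_view = list(lossless)  # all five readings, oldest first
lossy_view = list(lossy)        # only the two most recent readings
```

A consumer that cares about completeness (for example an archiver) would want the first behaviour, while a live display would usually prefer the second.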

15 Republishers Publish Query Answers
Archiver: shows the history of a stream.
Stream Republisher: enables
– merging,
– thinning,
– summarising
of streams …
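
One possible reading of the three stream operations is sketched below. The slide does not define them, so the interpretations (merge by timestamp, keep every n-th tuple, average over fixed-size windows), the function names, and the (timestamp, value) tuple format are all assumptions.

```python
import heapq
from itertools import islice
from statistics import mean


def merge(*streams):
    """Merge several timestamp-ordered streams into one ordered stream."""
    return heapq.merge(*streams, key=lambda t: t[0])


def thin(stream, n):
    """Keep only every n-th tuple of a stream."""
    return islice(stream, 0, None, n)


def summarise(stream, window):
    """Replace each full window of tuples by (last timestamp, mean value).

    A trailing partial window is dropped in this sketch.
    """
    batch = []
    for ts, value in stream:
        batch.append(value)
        if len(batch) == window:
            yield ts, mean(batch)
            batch = []


# Hypothetical (timestamp, value) streams from two producers
a = [(1, 10.0), (3, 30.0)]
b = [(2, 20.0), (4, 40.0)]

merged = list(merge(a, b))
thinned = list(thin(merged, 2))
summary = list(summarise(merged, 2))
```

Each function consumes a stream and produces a stream, which matches the idea that a republisher is itself a producer of query answers.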

16 Republishers in R-GMA Today
Republishers are called archivers (although some of them don't archive anything)
An archiver (= republisher)
is defined by a query
consumes only from stream producers
publishes the query result according to its type, using
– a stream producer, or
– a latest snapshot producer, or
– a database producer (which keeps an archive)

17 Which View should a Republisher Register?
Problem: Republishers may compute complex queries … but complex views would confuse the mediator!
Ideas:
– register a simplified view for a complex query
– register a new table

18 What is the Meaning of a Query in R-GMA?
Assumption: the views of (primary) producers are selections on a single relation, i.e., queries of the form
SELECT * FROM cpu_load WHERE machine_id = AB123 AND loc = hw
(each producer contributes its parts of a relation)
The virtual database contains the union of the data of all the primary producers
Conceptually, a query is evaluated over the entire virtual db

19 In R-GMA Query Answering Needs Mediation
Suppose P1, P2 produce for tt (Transport Time)
P1: … WHERE src = hw
P2: … WHERE src = ral AND pcktSize > 20
A global consumer poses its query over global relations
SELECT * FROM tt WHERE pcktSize > 10
A mediator translates this into queries over local relations
SELECT * FROM P1.tt WHERE pcktSize > 10 UNION SELECT * FROM P2.tt
(P2 needs no extra condition: its registered view already guarantees pcktSize > 20)
Today: R-GMA's mediator handles simple queries like the one above
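
The equivalence between the global query and its translation can be checked on toy data. The sketch below is an assumption-laden illustration: the local relations are plain Python lists with made-up rows, only two attributes are modelled, and the rewriting is written out by hand rather than computed by a mediator.

```python
# Hypothetical contents of the local relations. Each row respects the view
# its producer registered: P1 has src = hw; P2 has src = ral, pcktSize > 20.
P1_tt = [
    {"src": "hw", "pcktSize": 5},
    {"src": "hw", "pcktSize": 15},
]
P2_tt = [
    {"src": "ral", "pcktSize": 25},
    {"src": "ral", "pcktSize": 40},
]


def global_answer():
    """Conceptual semantics: evaluate the query over the virtual database,
    i.e. over the union of all producers' tuples."""
    virtual_tt = P1_tt + P2_tt
    return [t for t in virtual_tt if t["pcktSize"] > 10]


def mediated_answer():
    """Mediated plan: one local query per producer, then UNION."""
    # SELECT * FROM P1.tt WHERE pcktSize > 10
    part1 = [t for t in P1_tt if t["pcktSize"] > 10]
    # SELECT * FROM P2.tt  (the view already implies pcktSize > 10)
    part2 = list(P2_tt)
    return part1 + part2
```

Both functions return the same three rows here, without the virtual database ever being materialised in the mediated plan.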

20 Global and Local Consumers
Global consumers pose queries over global relations
SELECT * FROM tt WHERE pcktSize > 10,
which are translated into queries over local relations
SELECT * FROM P1.tt WHERE pcktSize > 10 UNION SELECT * FROM P2.tt
Local consumers pose queries over local relations directly
SELECT * FROM P1.tt WHERE method = ping
Today: a consumer can be global or local, but local relations cannot be referred to explicitly

21 How does the Mediator Find Suitable Producers?
P1, P2, P3 produce for tt (Transport Time)
P1: … src = hw
P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping
Q: SELECT * FROM tt WHERE src = ral AND method = ping
We see: P1 is not suitable for Q, but P2 and P3 are. Why?
src = hw AND src = ral AND method = ping is never true
src = ral AND pcktSize > 20 AND … is sometimes true
Satisfiability Test!
Today: implemented
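
A satisfiability test for this restricted kind of condition can be sketched in a few lines. This is not R-GMA's implementation: the representation of a condition as a list of (attribute, operator, value) atoms, the restriction to "=" and ">" over unbounded domains, and the function name are assumptions made for illustration.

```python
def satisfiable(*conjunctions):
    """Return True if the conjunction of all given conditions can be true.

    Each condition is a list of atoms (attribute, op, value) with op being
    "=" or ">". Attribute domains are assumed to be unbounded, so a set of
    lower bounds on its own is always satisfiable.
    """
    eq = {}     # attribute -> required value
    lower = {}  # attribute -> strictest lower bound seen so far
    for cond in conjunctions:
        for attr, op, value in cond:
            if op == "=":
                if attr in eq and eq[attr] != value:
                    return False  # e.g. src = hw AND src = ral
                eq[attr] = value
            elif op == ">":
                lower[attr] = max(lower.get(attr, value), value)
            else:
                raise ValueError(f"unsupported operator: {op}")
    # A required value must also exceed any lower bound on the same attribute.
    return all(eq[a] > bound for a, bound in lower.items() if a in eq)


# Views and query from the slide
P1 = [("src", "=", "hw")]
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]
Q = [("src", "=", "ral"), ("method", "=", "ping")]

suitable = [name for name, view in [("P1", P1), ("P2", P2), ("P3", P3)]
            if satisfiable(view, Q)]
```

With the views above, `suitable` contains P2 and P3 but not P1, matching the slide's conclusion.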

22 … So Which Producers Should the Mediator Ask?
P2: … src = ral AND pcktSize > 20
P3: … src = ral AND method = ping
Q: SELECT * FROM tt WHERE src = ral AND method = ping
All answers to Q returned by P2 are also returned by P3: whenever
src = ral AND pcktSize > 20 AND src = ral AND method = ping
is true, then
src = ral AND method = ping AND src = ral AND method = ping
is true.
Hence, R-GMA only needs to ask P3
Entailment Test!
Today: not implemented
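
The entailment check can be sketched for the same restricted conditions. As before, this is a hypothetical illustration rather than R-GMA code: conditions are lists of (attribute, operator, value) atoms with "=" and ">" only, the premise is assumed to be satisfiable, and the helper names are invented.

```python
def normalise(cond):
    """Collect required values and strictest lower bounds of a condition."""
    eq, lower = {}, {}
    for attr, op, value in cond:
        if op == "=":
            eq[attr] = value
        elif op == ">":
            lower[attr] = max(lower.get(attr, value), value)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return eq, lower


def entails(premise, conclusion):
    """True if every tuple satisfying `premise` also satisfies `conclusion`.

    Assumes `premise` is satisfiable.
    """
    eq, lower = normalise(premise)
    for attr, op, value in conclusion:
        if op == "=":
            if attr not in eq or eq[attr] != value:
                return False
        elif op == ">":
            if attr in eq:
                implied = eq[attr] > value
            else:
                implied = attr in lower and lower[attr] >= value
            if not implied:
                return False
        else:
            raise ValueError(f"unsupported operator: {op}")
    return True


# Views and query from the slide
P2 = [("src", "=", "ral"), ("pcktSize", ">", 20)]
P3 = [("src", "=", "ral"), ("method", "=", "ping")]
Q = [("src", "=", "ral"), ("method", "=", "ping")]

p2_covered_by_p3 = entails(P2 + Q, P3 + Q)
p3_covered_by_p2 = entails(P3 + Q, P2 + Q)
```

The check is asymmetric: P2's contribution to Q is covered by P3, but not the other way round, so only P2 can be skipped.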

23 … But What Did the Producers Promise?
P registers view V
Does P promise
– some of V? (sound description)
– all of V? (sound and complete description)
The Entailment Test only makes sense when the registered views are sound and complete descriptions
Producers should register completeness flags

24 … Why May a Producer not be Complete?
The language of views is more restricted than the language of queries
Hence: republishers may be unable to say exactly what they publish
Archivers may archive in lossy mode
Producers may lose tuples
A producer may not know everything about the real world
Open to debate

25 Keys in the Global Schema
tt(src, dest, method, pcktSize, timestamp, time)
Intuitively, tt has the primary key (src, dest, method, pcktSize, timestamp).
We need to know the primary keys
to understand the global schema
to answer latest snapshot queries
But can we enforce them?
Sometimes, they hold globally if they hold locally!
Today: global tables have keys, which are used to keep a latest snapshot cache

26 Summary (1)
Types of Stream Queries: continuous vs. history vs. latest snapshot
Producers: primary producers vs. republishers
DB producers: publish database
stream producers: lossless vs. lossy communication modes
republishers: materialised views vs. archivers vs. stream republishers

27 Summary (2)
Global Schema: primary keys
Consumers: global vs. local consumers
Mediator: translates global query into local queries; applies Satisfiability Test to find suitable producers
Query Planning: Entailment Test; sound vs. sound and complete producers

