Supporting Join Queries Talk by: Andy Cooke Collaborators: Alasdair Gray, Lisha Ma, and Werner Nutt Heriot-Watt University.



What queries would users like to ask? (1)
- A continuously executing query that might involve matching tuples across several streams: “stream to me average net traffic passing between two ComputingElements (CEs)”
  - need to specify in the query the age of tuples that can be matched (a “sliding window”)
  - e.g. “consider only tuples no older than 5 min. from now”
Possibly interesting?
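The sliding-window idea above can be illustrated with a small sketch. It is not R-GMA code: the stream names "A" and "B", the tuple layout and the function name are assumptions made for the example. Tuples from two streams are matched on a key only while both are no older than the window.

```python
from collections import deque


def sliding_window_join(events, window_seconds):
    """Join tuples from two hypothetical streams, "A" and "B", on their key.

    events: iterable of (timestamp, stream, key, value), ordered by timestamp.
    Two tuples match only if they share a key and the older one is at most
    window_seconds older than the newer one.
    """
    buffers = {"A": deque(), "B": deque()}
    results = []
    for ts, stream, key, value in events:
        other = "B" if stream == "A" else "A"
        # Expire tuples that have fallen out of the window.
        for buf in buffers.values():
            while buf and ts - buf[0][0] > window_seconds:
                buf.popleft()
        # Match the new tuple against what is still buffered on the other stream.
        for _, other_key, other_value in buffers[other]:
            if other_key == key:
                a_val, b_val = (value, other_value) if stream == "A" else (other_value, value)
                results.append((key, a_val, b_val))
        buffers[stream].append((ts, key, value))
    return results


if __name__ == "__main__":
    events = [
        (0, "A", "ce1", 10),
        (100, "B", "ce1", 20),   # within 5 minutes of the A tuple: joins
        (500, "B", "ce1", 30),   # the A tuple is now older than 5 minutes: no join
    ]
    print(sliding_window_join(events, window_seconds=300))
```

The window is what keeps the buffers bounded: without it, a continuous join would have to remember every tuple ever seen.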

What queries would users like to ask? (2)
- A “latest snapshot” query that joins the latest values of keys: “return all CEs that Steve is allowed to use” (Resource Broker)
  - This query would involve joining tuples from CE tables, VO tables and denied users tables
  Probably interesting!
- A “history” query involving self-joins and aggregation: “what was the growth in net traffic since last week?”
  Possibly interesting?

How can R-GMA answer such queries?
Observation:
- If all the relevant tuples are inside one DBMS, then we can pass the query on to that DBMS's query engine. EASY!
- If there is more than one relevant producer, then our mediator probably needs an execution engine! HARD!
In any case, we know that some R-GMA users are defining Archivers and querying these directly. However:
- the local answer may only be a subset of the global answer.
- they may get a wrong answer (if the query involves max, avg, count, etc.)
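A toy illustration of why aggregates go wrong on a partial view. The site names and load figures are invented for the example. A plain selection over one Archiver merely returns fewer rows, whereas max, avg and count return values that look valid but differ from the global answer.

```python
def aggregates(values):
    """Return (max, avg, count) for a non-empty list of numbers."""
    return max(values), sum(values) / len(values), len(values)


# Hypothetical CPU load readings (percent) held at two sites.
site_a = [20, 40]
site_b = [90]

local_answer = aggregates(site_a)             # what one site's Archiver would report
global_answer = aggregates(site_a + site_b)   # what the user actually asked for

if __name__ == "__main__":
    print("local :", local_answer)
    print("global:", global_answer)
```

Nothing in the local result signals that it is incomplete, which is why the second problem is worse than the first.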

Answering Joins using Archivers
Diagram labels: tables: cpuLoad, discspace; condition: country = ‘britain’
Requirements:
- Complete views (“I publish everything!”)
- “Latest” or “History” query type (so the data is in a database, not a buffer).
- A smart registry (“hmm.. just need to go to 1 Archiver.”)
Tuple matching always needs to take place in the same database, and never across databases.
e.g. “SELECT * FROM cpuload c, discspace s WHERE c.site = s.site” can easily be answered using site archivers
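One way to picture the “smart registry” requirement is a lookup that returns a single Archiver covering every table in the join. This is a sketch under stated assumptions: `ArchiverEntry`, its fields and `find_single_archiver` are hypothetical names, not part of the R-GMA API, and the registration condition is reduced to a single country value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiverEntry:
    """Hypothetical registry record: what one Archiver declares it publishes."""
    name: str
    tables: frozenset
    country: str


def find_single_archiver(registry, tables, country):
    """Return the name of one Archiver holding all tables for the country, else None."""
    needed = frozenset(tables)
    for entry in registry:
        if needed <= entry.tables and entry.country == country:
            return entry.name
    return None


if __name__ == "__main__":
    registry = [
        ArchiverEntry("archiver-1", frozenset({"cpuload"}), "britain"),
        ArchiverEntry("archiver-2", frozenset({"cpuload", "discspace"}), "britain"),
    ]
    print(find_single_archiver(registry, ["cpuload", "discspace"], "britain"))
```

If the lookup succeeds, the whole join can be handed to that Archiver's database, so no tuple matching across databases is needed.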

Problems with Answering Joins using Archivers (1)
- Archivers can’t access the tuples introduced by LatestProducers and DatabaseProducers
  - A new LatestProducer registers.
  - The Archiver can’t stream from it.
  - The mediator needs to mediate between two producers, but doesn’t have a query engine!

Answering Joins using Archivers (2)
Problems:
- Archivers can’t access the tuples introduced by LatestProducers and DatabaseProducers
  - If a new LatestProducer is registered, the Archiver cannot access these tuples because LatestProducers can’t answer stream queries.
  - Consider an Archiver at some site that pours the tuples from several StreamProducers into a LatestProducer.
  - Therefore the Mediator can’t rely on the Archiver’s query engine to return a complete answer, and so must mediate (hard!).
- What if one Archiver isn’t enough?
  - Consumer.canAnswer()? Consumer.getPlan()? (“you need an Archiver with these declarations”) (“I can’t answer your query, but could answer this sub-query”)
- What if the Archiver disappears before the consumer calls start()?
- Would a “Latest Archiver” be up-to-date enough?
Diagram: a new LatestProducer registers; the Archiver can’t stream from it; the mediator needs to mediate between two producers, but doesn’t have a query engine!
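The slide only floats `Consumer.canAnswer()` and `Consumer.getPlan()` as questions, so the following is a guess at what such calls might return, not an existing API. The function names, the dictionary layout and the table-level granularity are all assumptions.

```python
def can_answer(query_tables, archivers):
    """True if some single Archiver declares every table the query needs.

    archivers: hypothetical mapping of Archiver name -> list of declared tables.
    """
    needed = set(query_tables)
    return any(needed <= set(tables) for tables in archivers.values())


def get_plan(query_tables, archivers):
    """Describe what is missing and which sub-queries could still be answered."""
    needed = set(query_tables)
    if can_answer(query_tables, archivers):
        return {"answerable": True, "needed_declarations": [], "sub_queries": {}}
    # "I can't answer your query, but could answer this sub-query"
    sub_queries = {
        name: sorted(needed & set(tables))
        for name, tables in archivers.items()
        if needed & set(tables)
    }
    # "you need an Archiver with these declarations"
    return {
        "answerable": False,
        "needed_declarations": sorted(needed),
        "sub_queries": sub_queries,
    }


if __name__ == "__main__":
    archivers = {"a1": ["cpuload"], "a2": ["discspace", "netload"]}
    print(get_plan(["cpuload", "discspace"], archivers))
```

A plan of this kind does not address the other open questions on the slide, such as an Archiver disappearing between planning and `start()`.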


Answering Joins without Archivers
Query Planning and Execution:
- What are the relevant Producers?
- What sub-queries should we send them?
- How should results be combined and operated on? (need a query engine!) Where?
Possible Query Engines:
- MySQL: dump all the data into MySQL … easy!
- Polar Star (Manchester)? … compatibility?
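The “dump all the data into one database” option can be sketched as follows. To keep the example self-contained it uses Python's built-in sqlite3 module in place of MySQL. The column names other than `site` are assumptions, and the row lists stand in for the results of sub-queries sent to the relevant producers.

```python
import sqlite3


def join_via_local_db(cpuload_rows, discspace_rows):
    """Load sub-query results into a scratch database and let it do the join.

    cpuload_rows:   list of (site, cpu_load) tuples   (hypothetical columns)
    discspace_rows: list of (site, free_gb) tuples    (hypothetical columns)
    """
    conn = sqlite3.connect(":memory:")  # SQLite standing in for MySQL
    try:
        conn.execute("CREATE TABLE cpuload (site TEXT, cpu_load REAL)")
        conn.execute("CREATE TABLE discspace (site TEXT, free_gb REAL)")
        conn.executemany("INSERT INTO cpuload VALUES (?, ?)", cpuload_rows)
        conn.executemany("INSERT INTO discspace VALUES (?, ?)", discspace_rows)
        cursor = conn.execute(
            "SELECT c.site, c.cpu_load, s.free_gb "
            "FROM cpuload c, discspace s "
            "WHERE c.site = s.site ORDER BY c.site"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(join_via_local_db(
        [("ral", 0.5), ("hw", 0.25)],
        [("ral", 100.0), ("cern", 50.0)],
    ))
```

The approach is easy because the database does all the matching, but every relevant tuple has to be copied first, which is the inefficiency the talk acknowledges.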

Conclusions
We could support some “global” join queries quite easily:
- when just one Archiver is enough (needs a smarter Registry)
- suggestions could be given when there isn’t one Archiver available (consumer.getPlan())
- and/or ad hoc joins could be answered (inefficiently) by first loading data into MySQL
But:
- what queries do users want to pose?
- shouldn’t we restrict users to using only StreamProducers?