An Abstract Semantics and Concrete Language for Continuous Queries over Streams and Relations
Presenter: Liyan Zhang
Presentation for ICS 224

Outline
– Introduction
– Related Work
– Running Example
– Streams and Relations
  – Modeling the Running Example
  – Mapping Operators
– Abstract Semantics
  – Relation-to-Stream Operators
  – Example
– Concrete Query Language
  – Window Specification Language
  – Syntactic Shortcuts and Defaults
  – Example Queries
– Discussion
– Conclusion

What is CQL?
– SQL: Structured Query Language, for one-time queries over stored data sets
– CQL: Continuous Query Language, for continuous queries over continuously arriving data
– Growing interest in query processing over data streams
  – E.g., computer network traffic, phone conversations, ATM transactions, web searches, and sensor data
– Simple queries are easy to handle using SQL:
  – take a relational query language
  – replace references to relations with references to streams
  – register the query with the stream processor
  – wait for answers to arrive
– Complex queries raise difficulties:
  – aggregation, subqueries, windowing constructs, relations mixed with streams
  – Example notation: S is a stream, R is a relation, and [Rows 5] specifies a sliding window

How to define CQL?
– Define an abstract semantics based on three components:
  – any relational query language
  – any window specification language
  – a set of relation-to-stream operators
– Define a concrete language that instantiates the abstract semantics
– Several goals in mind:
  – exploit well-understood relational semantics
  – make queries that perform simple tasks easy and compact to write
  – enable new transformations specific to streams
– Contributions of this paper:
  – formalize streams, updateable relations, and their interrelationship
  – define an abstract semantics for continuous queries
  – propose a concrete language, CQL (Continuous Query Language)
  – consider two issues: exploiting CQL equivalences for query-rewrite optimization, and dealing with time-related issues

Related Work
– Focus on languages and semantics for continuous queries
– Continuous queries were first introduced in Tapestry, with a SQL-based language called TQL
  – a TQL query is executed once at every time instant as a one-time SQL query
  – the results of all the one-time queries are merged using set union
  – the semantics is based on periodic execution of one-time queries
– Several systems support procedural continuous queries
  – Aurora: users directly create a network of stream operators
    – a large number of operator types, from simple stream filters to complex windowing and aggregation operators
  – Tribeca: a stream-processing system for network traffic analysis
    – supports windows, a set of operators adapted from relational algebra, and a simple language for composing query plans from them
    – does not support joins across streams
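
The Tapestry-style semantics on this slide can be illustrated with a small Python sketch. The function name `periodic_union`, the `database_at` callback, and the sample rows are illustrative assumptions, not TQL itself; the sketch only shows a one-time query executed at every time instant, with the results merged by set union.

```python
def periodic_union(query, database_at, horizon):
    """Run a one-time query at each instant 0..horizon and union the results."""
    result = set()
    for t in range(horizon + 1):
        result |= query(database_at(t))
    return result


# Hypothetical stored data: the set of (item, price) rows visible at time t.
def database_at(t):
    snapshots = {
        0: {("lamp", 20)},
        1: {("lamp", 20), ("desk", 90)},
        2: {("desk", 90)},  # the lamp row has been deleted by now
    }
    return snapshots[t]


def cheap_items(rows):
    """One-time query: items priced under 50."""
    return {item for item, price in rows if price < 50}


answers = periodic_union(cheap_items, database_at, horizon=2)
```

Because results are merged by union, the lamp stays in `answers` even after its row disappears at time 2; under this semantics the answer set can only grow.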

Running Example
– Online auction application
– Users:
  – register, providing a name and current state of residence
  – deregister
– Three transactions:
  – place an item for auction and specify a starting price
  – close an auction they previously started
  – bid on currently active auctions by specifying a bid price
– Continuous queries:
  – Users can register various monitoring queries in the system
    – For example, a user might request to be notified about any auction placed by a user from California within a specified price range.
  – The auction system can run continuous queries for administrative purposes
    – Whenever an auction is closed, generate an entry with the closing price of the auction based on the bid history
    – Maintain the current set of active auctions and the currently highest bid for each
    – Maintain the current top 100 “hot items,” i.e., the 100 items with the most bids in the last hour.

Streams and Relations
– Stream: an element (s, t) means that tuple s arrives on stream S at time t
  – for a given t, there can be zero, one, or multiple elements with timestamp t in stream S
  – base streams: source streams
  – derived streams: streams resulting from queries or subqueries
– Relation: R(t) denotes an unordered bag of tuples at any time instant t
  – base relations: stored relations
  – derived relations: relations resulting from queries or subqueries
– Timestamp t denotes logical time, not physical time
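
A minimal Python sketch of these two data types may help. The representation is an assumption made for illustration: a stream is a list of (tuple, timestamp) pairs in arrival order, and a relation is a function from a time instant to a bag (`collections.Counter`). The names `bid_stream`, `elements_at`, and `unbounded_relation` are hypothetical.

```python
from collections import Counter

# A stream as a bag of (tuple, timestamp) elements; timestamps are logical.
bid_stream = [
    (("item1", 10), 1),
    (("item2", 25), 1),  # several elements may share one timestamp
    (("item1", 12), 3),  # and some timestamps (here 2) may have none
]


def elements_at(stream, t):
    """Bag of tuples whose timestamp is exactly t."""
    return Counter(tup for tup, ts in stream if ts == t)


def unbounded_relation(stream, t):
    """A time-varying relation R(t): every tuple seen on the stream up to t."""
    return Counter(tup for tup, ts in stream if ts <= t)
```

Here `unbounded_relation` is only one example of a derived relation; any function from time instants to bags of tuples fits the definition on the slide.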

Modeling the Running Example
– The input to the online auction system consists of the following five streams:
  – Register
  – Deregister
  – Open
  – Close
  – Bid

Mapping Operators
– Three classes of operators: stream-to-relation, relation-to-relation, and relation-to-stream
– Stream-to-relation example: take a “sliding window” over the stream that contains the bids over the last ten minutes
– Relation-to-stream example: stream the average price produced by the relation-to-relation operator every time the average price changes
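
The three operator classes can be chained as in the following Python sketch. It assumes integer timestamps measured in minutes and a stream of (price, timestamp) pairs; the function names and sample bids are hypothetical, and the average is modeled as a single-row relation.

```python
def range_window(stream, t, T):
    """Stream-to-relation: tuples with timestamps in [max(t - T, 0), t]."""
    return [tup for tup, ts in stream if max(t - T, 0) <= ts <= t]


def avg_price(prices):
    """Relation-to-relation: single-row relation with the average, if any."""
    return {sum(prices) / len(prices)} if prices else set()


def istream_avg(stream, T, horizon):
    """Relation-to-stream: emit (avg, t) whenever a new average appears."""
    out, prev = [], set()
    for t in range(horizon + 1):
        cur = avg_price(range_window(stream, t, T))
        out.extend((value, t) for value in cur - prev)
        prev = cur
    return out


bids = [(5.0, 0), (15.0, 4), (10.0, 12)]  # (price, minute), assumed data
changes = istream_avg(bids, T=10, horizon=15)
# changes == [(5.0, 0), (10.0, 4), (15.0, 11), (12.5, 12), (10.0, 15)]
```

Note that the average also changes at minutes 11 and 15, when old bids slide out of the ten-minute window, not only when a new bid arrives.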

Abstract Semantics
– Relation-to-relation operators: any relational query language
– Stream-to-relation operators: a window specification language that extracts tuples from streams
– Relation-to-stream operators: Istream, Dstream, and Rstream
– If R is the output of a window operator over a stream S, R at time t is computed by applying the window semantics to the elements of S up to t
– If R is the output of a relational query, R at time t is computed by applying the semantics of the relational query to the input relations at time t

Relation-to-Stream Operators
– Istream
– Dstream (the counterpart of Istream)
– Rstream
– Rstream subsumes the combination of Istream and Dstream
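
The operator definitions did not survive in this transcript, so the Python sketch below relies on the usual reading of the three names: Istream emits the tuples inserted at each instant, R(t) − R(t−1); Dstream emits the tuples deleted at each instant, R(t−1) − R(t); Rstream emits the whole of R(t) at every instant. Representing a relation as a list of bags indexed by time, with an empty relation before time 0, is an assumption of the sketch.

```python
from collections import Counter


def istream(rel):
    """Emit (tuple, t) for each tuple in R(t) but not in R(t-1)."""
    out, prev = [], Counter()
    for t, cur in enumerate(rel):
        out.extend((tup, t) for tup in (cur - prev).elements())
        prev = cur
    return out


def dstream(rel):
    """Emit (tuple, t) for each tuple in R(t-1) but not in R(t)."""
    out, prev = [], Counter()
    for t, cur in enumerate(rel):
        out.extend((tup, t) for tup in (prev - cur).elements())
        prev = cur
    return out


def rstream(rel):
    """Emit (tuple, t) for every tuple in R(t), at every instant t."""
    return [(tup, t) for t, cur in enumerate(rel) for tup in cur.elements()]


# rel[t] is the bag R(t) for t = 0, 1, 2 (assumed sample data).
rel = [Counter({"a": 1}), Counter({"a": 1, "b": 1}), Counter({"b": 1})]
inserted = istream(rel)  # [("a", 0), ("b", 1)]
deleted = dstream(rel)   # [("a", 2)]
full = rstream(rel)      # [("a", 0), ("a", 1), ("b", 1), ("b", 2)]
```

`Counter` subtraction drops non-positive counts, which matches bag difference and keeps duplicates handled correctly.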

Example
– Using relational algebra, the previous example is written as a join of S[5] with R
– At any time instant t, S[5] is an instantaneous relation containing the last five tuples in S up to t, which is then joined with R(t)
– The resulting relation may change whenever a new tuple arrives in S or R is updated
– Adding an outermost Istream to this query converts the relational result into a stream
– With Istream semantics, a new element (u, t) is streamed whenever tuple u is inserted into S[5] ⋈ R at time t, as the result of a stream arrival or a relation update
– S is a stream, R is a relation, and [Rows 5] specifies a sliding window
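
A rough Python rendering of this example follows. The join condition is not preserved in the transcript, so the sketch assumes an equijoin on the first attribute of each side; the sample stream `S`, the time-varying relation `R`, and all function names are hypothetical.

```python
from collections import Counter


def rows_window(stream, t, n):
    """S[Rows n] at time t: the n most recent tuples with timestamp <= t."""
    seen = [tup for tup, ts in stream if ts <= t]  # stream is in arrival order
    return seen[-n:]


def join(window, relation):
    """Equijoin on the first attribute (an assumed join condition)."""
    return Counter((s, r) for s in window for r in relation if s[0] == r[0])


def istream_of_join(stream, relation_at, n, horizon):
    """Istream over the join of S[Rows n] with R(t)."""
    out, prev = [], Counter()
    for t in range(horizon + 1):
        cur = join(rows_window(stream, t, n), relation_at(t))
        out.extend((tup, t) for tup in (cur - prev).elements())
        prev = cur
    return out


S = [(("k1", "s-a"), 0), (("k2", "s-b"), 1), (("k1", "s-c"), 2)]


def R(t):
    """Relation R(t): a row for k2 is inserted at time 3."""
    rows = [("k1", "r-x")]
    if t >= 3:
        rows.append(("k2", "r-y"))
    return rows


result = istream_of_join(S, R, n=5, horizon=3)
```

In this run, the elements emitted at times 0 and 2 come from stream arrivals, while the element emitted at time 3 comes from the update to R, which are the two triggers named on the slide.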

Concrete Query Language
– CQL contains three syntactic extensions to SQL:
  – Anywhere a relation may be referenced in SQL, a stream may be referenced in CQL
  – In CQL, every reference to a stream (base or derived) must be followed immediately by a window specification
  – In CQL, any reference to a relation (base or derived) may be converted into a stream by applying any of the operators Istream, Dstream, or Rstream
– Defaults:
  – Default windows
    – When a stream is referenced in a CQL query and is not followed by a window specification, an Unbounded window is applied by default.
  – Default relation-to-stream operators: an intended Istream tends to be omitted in two cases
    – On the outermost query, even when streamed results rather than stored results are desired
    – On an inner subquery, even though a window is specified on the subquery result
    – An Istream is added by default when the query produces a monotonic relation

Window Specification Language
– CQL supports only sliding windows, of three types
– Time-based windows
  – Parameter: a time interval T
  – Specified by “S [Range T]”, which slides an interval of size T time units over S
  – Special cases:
    – T = 0: tuples from elements of S with timestamp t, written “S [Now]”
    – T = ∞: tuples obtained from all elements of S up to t, written “S [Range Unbounded]”
– Tuple-based windows
  – Parameter: a positive integer N
  – Specified by “S [Rows N]”: the N elements with the largest timestamps <= t
  – Special case:
    – N = ∞, written “S [Rows Unbounded]”
– Partitioned windows
  – Parameters: a positive integer N and a subset {A1, ..., Ak} of S’s attributes
  – Specified by “S [Partition By A1, ..., Ak Rows N]”
  – Partitions S into different substreams based on the attributes (similar to SQL Group By), computes a tuple-based sliding window of size N independently on each substream, then takes the union of these windows to produce the output relation
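
The three window types can be sketched in Python as plain functions from a stream and a time instant to a list of tuples. The stream layout ((tuple, timestamp) pairs in arrival order), the function names, and the sample registrations are assumptions for illustration; ties between equal timestamps are broken by arrival order, which the slide does not specify.

```python
def range_window(stream, t, T):
    """S[Range T]: tuples with timestamps in [max(t - T, 0), t].

    T = 0 gives S[Now]; T = float("inf") gives S[Range Unbounded].
    """
    return [tup for tup, ts in stream if max(t - T, 0) <= ts <= t]


def rows_window(stream, t, n):
    """S[Rows n]: the n most recent tuples with timestamp <= t (n >= 1)."""
    seen = [tup for tup, ts in stream if ts <= t]
    return seen[-n:]


def partitioned_window(stream, t, key, n):
    """S[Partition By ... Rows n]: a Rows n window per partition, then union."""
    groups = {}
    for tup, ts in stream:
        if ts <= t:
            groups.setdefault(key(tup), []).append(tup)
    out = []
    for tuples in groups.values():
        out.extend(tuples[-n:])
    return out


# Hypothetical registrations: (user, state) with logical timestamps.
register = [(("u1", "CA"), 1), (("u2", "NY"), 2), (("u1", "WA"), 5)]

last_three = range_window(register, 5, 3)         # [("u2", "NY"), ("u1", "WA")]
now = range_window(register, 5, 0)                # [("u1", "WA")]
everything = range_window(register, 5, float("inf"))
latest_two = rows_window(register, 5, 2)          # [("u2", "NY"), ("u1", "WA")]
latest_per_user = partitioned_window(register, 5, lambda tup: tup[0], 1)
# latest_per_user == [("u1", "WA"), ("u2", "NY")]
```

The last call mirrors the kind of query that keeps only the most recent registration for each user.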

Example Queries
– Window specification default
  – the Open stream is referenced without a window
– Istream default
  – the output relation is monotonic
  – the default converts the output relation into a stream
– The query can be rewritten with an explicit window specification
– Nonmonotonic result (a count), so no default Istream
  – If an Istream is added, the result streams a new value whenever the count changes
  – If an Rstream is added, the count is streamed at each time instant
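
The difference between adding Istream and adding Rstream to a count query can be seen in a small Python sketch. The query text itself is not in this transcript, so the sketch assumes a count over a time-based window of size T on a hypothetical stream of bids; all names and data are illustrative.

```python
def range_window(stream, t, T):
    """Tuples with timestamps in [max(t - T, 0), t]."""
    return [tup for tup, ts in stream if max(t - T, 0) <= ts <= t]


def count_relation(stream, t, T):
    """Single-row relation holding the count over the window at time t."""
    return {len(range_window(stream, t, T))}


def istream_count(stream, T, horizon):
    """Emit (count, t) only at instants where the count changes."""
    out, prev = [], set()
    for t in range(horizon + 1):
        cur = count_relation(stream, t, T)
        out.extend((c, t) for c in cur - prev)
        prev = cur
    return out


def rstream_count(stream, T, horizon):
    """Emit (count, t) at every instant."""
    return [(c, t) for t in range(horizon + 1) for c in count_relation(stream, t, T)]


bids = [("b1", 0), ("b2", 1), ("b3", 1), ("b4", 5)]  # assumed data
changes = istream_count(bids, T=2, horizon=5)
every_instant = rstream_count(bids, T=2, horizon=5)
# changes       == [(1, 0), (3, 1), (2, 3), (0, 4), (1, 5)]
# every_instant == [(1, 0), (3, 1), (3, 2), (2, 3), (0, 4), (1, 5)]
```

The count goes down as well as up, which is why the relation is nonmonotonic and no Istream is applied by default.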

Example Queries
– Unbounded windows are applied by default on both Open and Close
– The default Istream is not applied
  – The subquery returns a monotonic relation, but no window specification follows the subquery
  – The result of the entire query is not monotonic, since auction tuples are deleted from the result when the auction is closed, and therefore an outermost Istream operator is not applied
– A partitioned window on the Register stream obtains the latest registration for each user
– The Where clause filters out users who have already deregistered

Example Queries
– Join the Open stream with the User relation
– If an Unbounded window were used on Open, then whenever a user moved to California, all previous auctions started by that user would be generated in the result stream
– If a stream is joined with a relation (in order to add attributes to or filter the stream), then a Now window on the stream coupled with an Istream or Rstream operator usually provides the desired behavior
– Stream any item_id from Close whose corresponding Open tuple arrived within the last 5 hours
– Unbounded windows are applied by default on the Bid and Open streams
– An Istream operator is applied to the Union result by default, since the relational output of the Union subquery is monotonic and the subquery is followed by a window specification

Discussion
– Stream-only query language
  – CQL distinguishes two fundamental data types, relations and streams
  – a stream-only language can be derived from CQL
– Equivalences and query transformations
  – Window reduction: a query using an Unbounded window and an Istream operator is rewritten to use a Now window and an Rstream operator
    – Unbounded windows require buffering the entire history of a stream, while Now windows allow a stream tuple to be discarded as soon as it is processed
  – Filter-window commutativity
– Timestamps and physical time
  – no direct relationship between T and physical clock time at the Data Stream Management System
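
The window-reduction idea can be checked on a small case with the Python sketch below. It assumes the simplest query shape, a filter over a single stream, and compares Istream over an Unbounded window with Rstream over a Now window; the function names, the predicate, and the data are hypothetical, and the sketch demonstrates one instance rather than proving the equivalence.

```python
from collections import Counter


def istream_unbounded(stream, pred, horizon):
    """Istream over a filter on S[Range Unbounded]; rebuilds history each step."""
    out, prev = [], Counter()
    for t in range(horizon + 1):
        cur = Counter(tup for tup, ts in stream if ts <= t and pred(tup))
        out.extend((tup, t) for tup in (cur - prev).elements())
        prev = cur
    return out


def rstream_now(stream, pred, horizon):
    """Rstream over a filter on S[Now]; only looks at tuples stamped t."""
    out = []
    for t in range(horizon + 1):
        out.extend((tup, t) for tup, ts in stream if ts == t and pred(tup))
    return out


prices = [(5, 0), (12, 1), (12, 1), (7, 2), (20, 2)]  # (price, timestamp)


def over_six(price):
    return price > 6


a = sorted(istream_unbounded(prices, over_six, horizon=2))
b = sorted(rstream_now(prices, over_six, horizon=2))
# a == b == [(7, 2), (12, 1), (12, 1), (20, 2)]
```

The first form keeps the whole filtered history in `prev`, while the second needs no state between instants, which is the buffering saving the slide describes.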

Conclusion
– This paper first presented an abstract semantics based on
  – any relational query language
  – any window specification language to map from streams to relations
  – a set of operators to map from relations to streams
– Proposed CQL, a concrete language
  – using SQL as the relational query language
  – with window specifications derived from SQL-99
– Identified several practical issues arising from CQL
  – syntactic shortcuts and defaults
  – intuitive query formulation
  – equivalences for query optimization

Q&A
Thanks!