Continuous Query Languages for DSMS
Notes by Carlo Zaniolo
UCLA CSD
Spring 2009
Relational Algebra Operators

DB Relations
- Selection, Projection
- Union
- Join (including the cross product ×) on tables
- Set Difference
- Aggregates:
  - Traditional blocking aggregates
  - OLAP functions on windows or unlimited preceding

Data Streams (DS)
- Selection, Projection: same (no duplicates eliminated)
- DS Union by sort-merging on timestamps
- Join of DS with table: OK
- Window joins on streams (timestamps merged into one column)
- DS difference not supported (blocking!); difference of a stream with a table is OK
- Aggregates:
  - No blocking aggregates
  - OLAP functions on windows or unlimited preceding
  - Slides and tumbles
Cascading of Streams

  CREATE STREAM OpenAuction (itemID int, sellerID char(10), price real, start_time timestamp)
    ORDER BY start_time /* external timestamps */
    SOURCE 'port4445';

  CREATE STREAM expensiveItems AS
    SELECT itemID, sellerID, price, start_time
    FROM OpenAuction
    WHERE price > 1000;

  SELECT itemID, price, start_time
  FROM expensiveItems
  WHERE sellerID = 'ABCwarehouse';

[Figure: query graph from Source (port4445) to Sink, with stream OpenAuction filtered by σ into ExpensiveItems, which is then processed by σπ.]
Continuous Query Graph: many components, arbitrary DAGs

[Figure: example query graphs built from sources, sinks, and operators (σ, ∑1, ∑2, O1, O2, O3), including a join Source1 ⋈ Source2 and a union Source1 U Source2.]
Data Stream of Bids on Bolts and Nuts

  create stream bids(bid#, item, offer, Time)

  % example of selection followed by union
  create stream mybids as
    (select bid#, offer, Time from bids where item = 'bolt'
     union
     select bid#, offer, Time from bids where item = 'nut')

  % Result same as:
  select bid#, offer, Time from bids where item = 'bolt' or item = 'nut'
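The equivalence between the two formulations can be checked on a small example. The following Python sketch is only an illustration: the sample tuples and the helper names `select_item` and `union_by_time` are assumptions, not part of the notes, and the union is modeled as a sort-merge on the `Time` field.

```python
from heapq import merge

# Hypothetical sample of the bids stream: (bid#, item, offer, Time), ordered by Time.
bids = [
    (1, "bolt", 10.0, 1),
    (2, "nut", 12.0, 2),
    (3, "washer", 5.0, 3),
    (4, "bolt", 11.0, 5),
    (5, "nut", 10.0, 6),
]

def select_item(stream, item):
    """Selection on item, followed by projection onto (bid#, offer, Time)."""
    return [(b, offer, t) for (b, it, offer, t) in stream if it == item]

def union_by_time(left, right):
    """Union of two time-ordered streams by sort-merging on Time (last field)."""
    return list(merge(left, right, key=lambda tup: tup[2]))

# Selection followed by union ...
mybids = union_by_time(select_item(bids, "bolt"), select_item(bids, "nut"))
# ... versus a single selection with a disjunctive condition.
same = [(b, offer, t) for (b, it, offer, t) in bids if it in ("bolt", "nut")]
```

On this sample, `mybids` and `same` hold the same four tuples in the same timestamp order.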
Window Joins

We could create a stream called interesting bids by, say, joining bids with the 'interesting_items' table.

For each bolt bid, find all the nut bids in the last 5 minutes:

  create stream selfjoinbids as
    select S1.bid#, S1.offer, S2.bid#, S1.Time
    from bids as S1, bids as S2 [window of 5 minutes]
    where S1.item = 'bolt' and S2.item = 'nut' and S1.offer = S2.offer

- Each S1 tuple sees only the S2 tuples that are at most 5 minutes older: S1.Time >= S2.Time and S2.Time >= S1.Time - 5 minutes.
- These are logical (time-based) windows; physical windows are defined by a tuple count.
- Symmetrically, a window can also be defined on S1.
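The window condition can be made concrete with a small sketch. This Python fragment evaluates the self-join declaratively over a finite sample; the data, the use of seconds for `Time`, and the function name `selfjoinbids` as a Python helper are assumptions made for illustration, and a real DSMS would evaluate the join incrementally rather than with nested loops.

```python
WINDOW = 5 * 60  # window length in seconds (5 minutes)

# Hypothetical bids: (bid#, item, offer, Time in seconds), ordered by Time.
bids = [
    (1, "nut", 10.0, 0),
    (2, "nut", 10.0, 200),
    (3, "bolt", 10.0, 310),
    (4, "nut", 7.5, 400),
    (5, "bolt", 7.5, 650),
    (6, "bolt", 10.0, 1000),
]

def selfjoinbids(stream, window=WINDOW):
    """Pair each bolt bid (S1) with the nut bids (S2) of equal offer in its window."""
    out = []
    for (b1, item1, offer1, t1) in stream:
        if item1 != "bolt":
            continue
        for (b2, item2, offer2, t2) in stream:
            # S1.Time >= S2.Time and S2.Time >= S1.Time - window
            in_window = t1 - window <= t2 <= t1
            if item2 == "nut" and offer2 == offer1 and in_window:
                out.append((b1, offer1, b2, t1))
    return out

result = selfjoinbids(bids)
```

Here bid 3 joins with nut bid 2 but not with nut bid 1, which is more than 5 minutes older, and bid 6 finds no nut bid in its window.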
Window Join of Stream A and Stream B

[Figure: query graph with SourceA, SourceB, operators op1 and op2, the join A ⋈ B, and Sink.]

Join of stream A, having window W(A), with stream B, having window W(B).

When tuples are present at both inputs and the timestamp of the A tuple is less than or equal to that of the B tuple, perform the following operations (the symmetric operations are performed when the timestamp of the B tuple is less than or equal to that of the A tuple):
- Production: compute the join of the tuple in A with the tuples in W(B) and add the resulting tuples to the output buffer (these tuples take their timestamps from the tuple in A).
- Consumption: remove the current tuple in A from the input and add it to the window buffer W(A) (from which the expired tuples are also removed).
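The production and consumption steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular DSMS: tuples are `(timestamp, value)` pairs, windows are time-based with lengths `w_a` and `w_b`, `match` is a hypothetical join predicate, and the explicit timestamp check during production is an assumption added so that only tuples still inside the window are joined.

```python
from collections import deque

def expire(window, cutoff):
    """Drop tuples with a timestamp older than cutoff from a window buffer."""
    while window and window[0][0] < cutoff:
        window.popleft()

def window_join(input_a, input_b, w_a, w_b, match):
    """Join two timestamp-ordered inputs; output tuples are (ts, a_value, b_value)."""
    a, b = deque(input_a), deque(input_b)
    win_a, win_b = deque(), deque()
    out = []
    while a and b:  # proceed only when tuples are present at both inputs
        if a[0][0] <= b[0][0]:
            ts, val = a[0]
            # Production: join the A tuple with W(B); output takes A's timestamp.
            out += [(ts, val, v) for (t, v) in win_b
                    if t >= ts - w_b and match(val, v)]
            # Consumption: move the A tuple into W(A) and expire old tuples.
            a.popleft()
            win_a.append((ts, val))
            expire(win_a, ts - w_a)
        else:
            ts, val = b[0]
            # Symmetric production and consumption for the B tuple.
            out += [(ts, v, val) for (t, v) in win_a
                    if t >= ts - w_a and match(v, val)]
            b.popleft()
            win_b.append((ts, val))
            expire(win_b, ts - w_b)
    return out

A = [(1, "x"), (4, "y"), (9, "x")]
B = [(2, "x"), (5, "y"), (12, "x")]
joined = window_join(A, B, 5, 5, lambda u, v: u == v)
```

With these inputs `joined` is `[(2, "x", "x"), (5, "y", "y")]`: the A tuple at timestamp 9 does not join with the B tuple at timestamp 2, which has left the 5-unit window. The loop stops as soon as one input is empty, so the B tuple at timestamp 12 stays unprocessed, as the rule above requires tuples at both inputs.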
Union: Merging Tuples by Timestamps

- The Union operator performs a merge operation on tuples sorted by their timestamps.
  - The tuple with the smallest timestamp goes through first.
  - Output tuples are thus sorted by timestamp.
- Tuples on Union are subject to idle-waiting.
  - Tuples on some input might be slow to arrive because of network traffic, operator scheduling, etc.
  - When some input is empty, tuples on the other inputs have to wait.
  - Input tuples idle-wait for future arrivals, which greatly increases query response time.
- We use a union operator to explain the idle-waiting problem.

[Figure: query graph with Source1, Source2, U, σ, and Sink.]
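The merge rule, and the waiting it causes, can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `union`, the `(timestamp, payload)` tuple layout, and the sample data are assumptions.

```python
from collections import deque

def union(inputs):
    """Emit tuples in timestamp order while every input buffer is non-empty.

    inputs: list of deques of (ts, payload), each ordered by ts.
    Tuples that remain in a buffer when the function returns are idle-waiting.
    """
    out = []
    while all(inputs):  # an empty deque is falsy, so this stops at the first empty input
        src = min(inputs, key=lambda buf: buf[0][0])  # smallest head timestamp
        out.append(src.popleft())
    return out

a = deque([(1, "a1"), (3, "a2")])
b = deque([(2, "b1"), (6, "b2")])
first = union([a, b])   # stops when input a runs empty
a.append((7, "a3"))     # a later arrival on input a
second = union([a, b])  # releases the tuple that was waiting on input b
```

After the first call the tuple with timestamp 6 sits in `b` even though it is ready, because a tuple with a smaller timestamp could still arrive on `a`. It is released only by the second call, after the arrival with timestamp 7.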
The Idle-Waiting Problem

[Figure: two snapshots of the query graph (Source1, Source2, U, σ, ∑1, ∑2, Sink) with buffers A, B, and C. Only timestamps are shown for tuples in buffers. The first snapshot shows timestamps 1, 6, and 3 and an empty buffer marked "?"; the second shows only 6 and the empty buffer.]

- The tuple with TS=1 goes through the union first, followed by the one with TS=3.
- The union produces tuples by increasing timestamp values.
- Nothing is produced until there is a tuple in A: this is idle waiting.
- Idle waiting means poor response time, and also extra memory used.
Idle Waiting: Simultaneous Tuples

[Figure: two snapshots of the same query graph with buffers A, B, and C. Only timestamps of tuples are shown in buffers. The first snapshot shows timestamps 1 and 4 and an empty buffer marked "?"; the second shows only 4 and the empty buffer.]

- The tuple with TS=1 goes through the union first, followed by one with TS=4.
- A remaining tuple that has the same timestamp as the one that just went through has no need to wait here.
- Timestamp memory registers can solve that problem.
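One possible reading of timestamp memory registers is sketched below in Python. It is an assumption-laden illustration, not the mechanism of any specific system: the class name `Union`, the `tsm` list, and the rule "a tuple may pass if its timestamp does not exceed the last timestamp seen on every empty input" are all choices made for this sketch, and inputs are assumed to be ordered by timestamp.

```python
from collections import deque

class Union:
    """Union with one timestamp register per input (hypothetical design)."""

    def __init__(self, n_inputs):
        self.buffers = [deque() for _ in range(n_inputs)]
        # tsm[i]: last timestamp seen on input i; future tuples cannot be older.
        self.tsm = [float("-inf")] * n_inputs

    def push(self, i, ts, payload):
        self.buffers[i].append((ts, payload))
        self.tsm[i] = ts

    def run(self):
        out = []
        while True:
            heads = [(buf[0][0], i) for i, buf in enumerate(self.buffers) if buf]
            if not heads:
                break
            ts, i = min(heads)
            # Lower bound on any future tuple from the inputs that are empty now.
            bound = min((self.tsm[j] for j, buf in enumerate(self.buffers)
                         if not buf), default=float("inf"))
            if ts > bound:
                break  # a smaller timestamp could still arrive: idle-waiting
            out.append(self.buffers[i].popleft())
        return out

u = Union(2)
u.push(0, 1, "b1")
u.push(0, 4, "b2")
u.push(1, 4, "a1")
released = u.run()       # both tuples with timestamp 4 go through
u.push(1, 6, "a2")
still_waiting = u.run()  # timestamp 6 exceeds the register of input 0
```

In this run the second tuple with timestamp 4 is released without waiting, because the register of the other input already holds 4. The later tuple with timestamp 6 still idle-waits, so registers help with simultaneous tuples but do not remove idle waiting in general.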
A General Solution for Idle Waiting

[Figure: the same query graph with buffers A, B, and C; buffer A is empty (?) while a tuple with timestamp 6 waits.]

- To avoid idle waiting, we need to get values into A fast. How?
- By going back to ∑1, which checks B for tuples to be processed and sent to A. If B is empty, we go further back to the operator that processes the tuples in C. This process is called backtracking.
- Other execution models, such as those used by other DSMS, will not do. For example, with round robin, a fixed execution order can take us to different components or different branches. Backtracking takes us back to the only buffers and operators that can unblock the idle waiting.
- Yes, but what if the source buffer C is empty? Then we use on-demand timestamp generation, with punctuation tuples to deliver the information: basically, these are tuples with only timing information.
- Punctuation tuples were also used to deliver end-of-input information for blocking operators.
Time-stamped Punctuation Marks

- Heartbeats: timestamps are generated periodically and sent out from the source. [Gigascope]
  - Effective but far from optimal: too few when needed, too many when not needed.
- On-demand generation, to:
  - Avoid useless operations when there is no idle-waiting.
  - Send requests to the right source nodes, the ones that can fix the idle-waiting.
  - Achieve much lower response time and use less memory, but
  - an execution model capable of supporting backtracking is needed for on-demand generation. [Stream Mill]

Regular heartbeat tuples have a number of problems:
- The response time improvement is limited by the heartbeat frequency.
- To obtain large improvements, the heartbeat rate must be high, and the heartbeat tuples themselves then bring a high overhead in both memory and CPU time.
- They are not on-demand, and therefore incur overhead even when no idle-waiting is occurring.
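The trade-off between heartbeat frequency and overhead can be illustrated with a toy calculation. The Python sketch below is a deliberately simplified model and every part of it is an assumption: one input of a union is idle, each tuple on the busy input waits until the next heartbeat from the idle source, time is in integer units, and an on-demand request is answered at once with no propagation delay. It is not a measurement of any system.

```python
def heartbeat_model(timestamps, period, horizon):
    """Tuples on the busy input wait for the next periodic heartbeat.

    Returns (worst-case wait, number of heartbeat tuples sent up to horizon).
    """
    # -(-t // period) is the ceiling of t / period in integer arithmetic.
    waits = [-(-t // period) * period - t for t in timestamps]
    return max(waits), horizon // period

def on_demand_model(timestamps):
    """At most one timestamp request per waiting tuple, answered immediately."""
    return 0, len(timestamps)

arrivals = [3, 12, 25, 31]  # hypothetical arrival timestamps on the busy input
slow = heartbeat_model(arrivals, period=10, horizon=40)
fast = heartbeat_model(arrivals, period=2, horizon=40)
demand = on_demand_model(arrivals)
```

In this toy model, shortening the heartbeat period from 10 to 2 cuts the worst-case wait from 9 to 1 time units but raises the number of heartbeat tuples from 4 to 20, while the on-demand variant sends at most one punctuation tuple per waiting tuple.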
Backtracking without Tears

A simple rule for Next Operator Selection (NOS), based on the input and output buffers:
- YIELD is true if the output buffer of the current operator contains some tuples.
- MORE is true if there are still tuples in the input buffer of the current operator.

NOS for Depth-First:
  [Forward:]   if YIELD then next := succ
  [Encore:]    else if MORE then next := self
  [Backtrack:] else next := pred

Note that the DFS and BFS rules differ only in the order of the two condition checks, for YIELD and MORE. Slight modifications can also be made to support other strategies, such as round robin (RR).

[Figure: query graph with Source, σ, ∑1, ∑2, and Sink; one buffer is marked "?".]
A General Model: Breadth/Depth First

A simple rule for Next Operator Selection (NOS), based on the input and output buffers:
- YIELD is true if the output buffer of the current operator contains some tuples.
- MORE is true if there are still tuples in the input buffer of the current operator.

NOS for Depth-First:
  [Forward:]   if YIELD then next := succ
  [Encore:]    else if MORE then next := self
  [Backtrack:] else next := pred

NOS for Breadth-First:
  [Encore:]    if MORE then next := self
  [Forward:]   else if YIELD then next := succ
  [Backtrack:] else next := pred

Note that the DFS and BFS rules differ only in the order of the two condition checks, for YIELD and MORE. Slight modifications can also be made to support other strategies, such as round robin (RR).

[Figure: query graph with Source, σ, ∑1, ∑2, and Sink; one buffer is marked "?".]
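The two NOS rules translate directly into code. The Python sketch below encodes them and traces the resulting execution order on a linear path of operators. The function names, the representation of buffers as tuple counts, and the simplifications (each execution processes one tuple, no operator filters tuples, the last operator delivers straight to the sink) are assumptions made for illustration.

```python
def next_operator(strategy, yield_, more):
    """Next Operator Selection: returns 'succ', 'self', or 'pred'."""
    if strategy == "DFS":
        checks = [(yield_, "succ"), (more, "self")]  # Forward, then Encore
    elif strategy == "BFS":
        checks = [(more, "self"), (yield_, "succ")]  # Encore, then Forward
    else:
        raise ValueError("unknown strategy: " + strategy)
    for condition, choice in checks:
        if condition:
            return choice
    return "pred"                                    # Backtrack

def run(strategy, n_ops, n_tuples):
    """Return the sequence of operator indices executed on a linear path.

    buffers[i] counts the tuples in the input buffer of operator i;
    buffers[i + 1] is its output buffer, and the last entry is the sink.
    """
    buffers = [n_tuples] + [0] * n_ops
    trace, cur = [], 0
    while cur >= 0:  # backtracking past the first operator ends the run
        if buffers[cur] > 0:  # execute: move one tuple downstream
            buffers[cur] -= 1
            buffers[cur + 1] += 1
            trace.append(cur)
        # The last operator feeds the sink, so it has nothing to yield to.
        yield_ = cur + 1 < n_ops and buffers[cur + 1] > 0
        more = buffers[cur] > 0
        step = next_operator(strategy, yield_, more)
        cur += {"succ": 1, "self": 0, "pred": -1}[step]
    return trace

dfs_trace = run("DFS", n_ops=2, n_tuples=2)
bfs_trace = run("BFS", n_ops=2, n_tuples=2)
```

With two operators and two input tuples, the depth-first trace is `[0, 1, 0, 1]`, so each tuple is pushed to the sink before the next one is touched, while the breadth-first trace is `[0, 0, 1, 1]`, so each operator drains its input before control moves forward.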
Timestamp Propagation by Special Arcs

Timestamps can be propagated back to the idle-waiting operators:
- By punctuation marks.
- By special arcs that connect the source to the idle-waiting operators, shown as dashed arcs in the Enhanced Query Graph (EQG).

It is important that timestamp propagation occurs only after unsuccessful backtracking, since we need to make sure that there are no tuples left in the intermediate non-IWP operators.

[Figure: Enhanced Query Graph with Source1, Source2, Source3, ∑1, ∑2, σ, U, and Sink.]
Execution Model Benefits

- Simple and regular: the same basic cycle is shared by all strategies, with only the NOS rules being different.
- Amenable to an efficient implementation based on Deterministic Finite Automata (DFA).
- Optimization/scheduling flexibility:
  - At run time, we can easily switch between policies.
  - Different strategies can be used at the same time in different components.
- Highly reconfigurable at run time.

Next we will take a look at how it is implemented and how the reconfiguration happens.
Experiments: Timestamp Propagation

[Figure: union query with unmatched arrival rates at its inputs.]

- These experiments are done with internal timestamps, where the ETS (Enabling TimeStamp) value is easy to generate.
- The experiments use one union operator whose two inputs have very different arrival rates (50 tuples/sec and 0.05 tuples/sec).
- Periodic timestamp propagation reduces latency in proportion to the heartbeat rate. However, memory overhead increases when the heartbeat tuple rate is high.
- On-demand timestamp propagation reduces latency to very small values with no memory overhead.
- Not shown here: on-demand timestamp propagation performs very close to the optimal case of latent timestamps, where input tuples go through the union operator as they arrive, without timestamp consideration (timestamps are assigned upon arrival).
DFS vs. BFS

How DFS and BFS behave under different input burstiness

- To show that our execution model supports different strategies, we compare DFS and BFS under different input burstiness.
- We introduce bursts of nearly simultaneous tuples.
- Both DFS and BFS show increased latency as burstiness increases, but BFS has a steeper increase, which is dictated by the nature of the strategies.
References

A. Arasu, S. Babu, and J. Widom. An abstract semantics and concrete language for continuous queries over streams and relations. Technical report, Stanford University, 2002.
C. Cranor, Y. Gao, T. Johnson, V. Shkapenyuk, and O. Spatscheck. Gigascope: A stream database for network applications. In SIGMOD Conference, pages 647-651. ACM Press, 2003.
J. Kang, J. F. Naughton, and S. Viglas. Evaluating window joins over unbounded streams. In ICDE, pages 341-352, 2003.
U. Srivastava and J. Widom. Flexible time management in data stream systems. In PODS, pages 263-274, 2004.
P. A. Tucker, D. Maier, T. Sheard, and L. Fegaras. Exploiting punctuation semantics in continuous data streams. IEEE Trans. Knowl. Data Eng., 15(3):555-568, 2003.
T. Johnson et al. A heartbeat mechanism and its application in Gigascope. In VLDB, pages 1079-1088, 2005.
Y. Bai, H. Thakkar, H. Wang, and C. Zaniolo. Optimizing timestamp management in data stream management systems. In ICDE, 2007.
Relational Algebra Operators

Stored data
- Selection, Projection
- Union
- Join (including the cross product ×) on tables
- Set Difference
- Aggregates:
  - Traditional blocking aggregates
  - OLAP functions on windows or unlimited preceding

Data Streams
- Selection, Projection: same
- Union by sort-merging on timestamps
- Join of stream with table
- Window joins on streams (timestamps merged into one column)
- No stream difference (blocking); difference of a stream with a table is OK
- Aggregates:
  - No blocking aggregates
  - OLAP functions on windows or unlimited preceding
  - Slides and tumbles
  - Including UDAs