1
Adaptive Query Processing (Background)
Advisor: Elke A. Rundensteiner
Luping Ding, Brad Pielech
5/21/2019 DSRG TALK
2
Contents
- Motivation
- Issues to consider when building an adaptive query system
- Categories of adaptivity and related issues
- Related work
- Our initial ideas thus far (to be continued…)
3
Motivation
- New environment and applications
  - Internet and web-based query systems
  - Sample applications
    - Network monitoring systems
    - Financial applications: stock trading, …
- Characteristics
  - Distributed, heterogeneous, autonomous data sources
  - Unpredictable, variable data volume and transfer rate
4
Adaptive Query Processor
[Figure: architecture diagram with the labels User Query, XML View, Adaptive Query Processor (operators N, S, J, T), and data sources DS1, DS2, …, DSn]
5
Motivation II
- Requirements
  - Ability to process streaming data using non-blocking operators
  - Dynamic inter- and intra-operator scheduling to adapt to the data transfer rate
  - Sharing and reuse of sub-plans across multiple queries
  - Ability to output partial/approximate results according to user preferences (discussed later)
6
Traditional vs. Adaptive
Traditional:
- Ready data
- One-time query
- Blocking operators
- Query optimization before execution
- Exact answer

Adaptive:
- Streaming data
- May be a continuous query
- Non-blocking operators
- Query optimization before and during execution
- Partial/approximate answer
7
Challenges and Possible Solutions
- The data arrive at a very high speed
  - Sample the data and compute an approximate answer
- Unpredictable changes in the data transfer rate due to sources drying up or network congestion
  - Interleave query execution and optimization to rework the query plan and minimize execution downtime
- Blocking operators appear in the query plan, caused by GroupBy, OrderBy, and Join clauses
  - Implement non-blocking alternatives for blocking operators
- Unbounded or huge data streams need unbounded or huge intermediate storage
  - Compute an approximate answer
  - Switch between memory and disk
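As one illustration of a non-blocking alternative to a blocking operator, the sketch below shows a symmetric hash join, a well-known technique that the slides do not name explicitly. The function name `symmetric_hash_join` and the `(side, key, value)` input format are assumptions made for this example. Each arriving tuple is inserted into its own side's hash table and immediately probed against the other side, so matches are emitted without waiting for either input to finish.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Any, Any]],
) -> Iterator[Tuple[Any, Any, Any]]:
    """Yield (key, left_value, right_value) as soon as a match exists.

    `arrivals` is an interleaved stream of (side, key, value) items,
    where side is "L" or "R" (a hypothetical format for this sketch).
    """
    tables: Dict[str, Dict[Any, List[Any]]] = {
        "L": defaultdict(list),
        "R": defaultdict(list),
    }
    for side, key, value in arrivals:
        other = "R" if side == "L" else "L"
        # Build step: remember the new tuple on its own side.
        tables[side][key].append(value)
        # Probe step: match against everything seen so far on the other side.
        for match in tables[other].get(key, []):
            if side == "L":
                yield (key, value, match)
            else:
                yield (key, match, value)


arrivals = [("L", 1, "a"), ("R", 1, "x"), ("R", 2, "y"), ("L", 2, "b"), ("L", 1, "c")]
joined = list(symmetric_hash_join(arrivals))
```

Both hash tables grow with the input, which is the intermediate-storage challenge listed above; a fuller version would have to spill to disk or discard state.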
8
Contents
- Motivation
- Issues to consider when building an adaptive system
- Categories of adaptivity and related issues
- Related work
9
General Issues I
- Decide granularity of stream data
  - Each token
  - Individual element
  - Decided by the XPath specified by the query
10
for $b in document("bib.xml")/bib/book
return
  <result>
    { $b/title }
    { $b/author }
  </result>

<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author>W. Stevens</author>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author>Serge Abiteboul</author>
    <author>Peter Buneman</author>
    <author>Dan Suciu</author>
    <price>39.95</price>
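A minimal sketch of element-level granularity for this query, assuming the path /bib/book from the query decides the unit of streaming. It uses Python's standard `xml.etree.ElementTree.iterparse`; the helper name `stream_books` is hypothetical, and the closing `</book>` and `</bib>` tags are added here only so that the sample parses.

```python
import io
import xml.etree.ElementTree as ET

# Sample data from the slide; closing tags added so the document is well-formed.
BIB_XML = b"""<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author>W. Stevens</author>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author>Serge Abiteboul</author>
    <author>Peter Buneman</author>
    <author>Dan Suciu</author>
    <price>39.95</price>
  </book>
</bib>"""


def stream_books(source):
    """Yield one (title, authors) unit each time a <book> element completes."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "book":
            title = elem.findtext("title")
            authors = [a.text for a in elem.findall("author")]
            yield title, authors
            elem.clear()  # drop the subtree once the unit has been emitted


results = list(stream_books(io.BytesIO(BIB_XML)))
```

Each `book` is handed downstream as soon as its end tag is seen, rather than after the whole document has been read.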
11
General Issues II
- Give order-sensitive results
  - Assign a unique ID to each data unit (sequence number or timestamp)
  - Either each algebra node keeps the order of the data
  - Or algebra nodes do not keep the order, and the top node does the sorting
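The second option (nodes that ignore order, with a sort at the top) can be sketched as follows. All names are hypothetical, and `unordered_operator` is only a stand-in that deliberately scrambles its input to mimic an algebra node that does not preserve order.

```python
from typing import Any, Iterable, Iterator, List, Tuple

Tagged = Tuple[int, Any]


def tag(units: Iterable[Any]) -> Iterator[Tagged]:
    """Assign a sequence number to each data unit as it enters the plan."""
    for seq, unit in enumerate(units):
        yield seq, unit


def unordered_operator(tagged: Iterable[Tagged]) -> Iterator[Tagged]:
    """Stand-in for a node that does not keep order: odd IDs first, then even."""
    buffered = list(tagged)
    yield from (t for t in buffered if t[0] % 2 == 1)
    yield from (t for t in buffered if t[0] % 2 == 0)


def top_node_sort(tagged: Iterable[Tagged]) -> List[Any]:
    """Restore the original order at the top of the plan using the IDs."""
    return [unit for _seq, unit in sorted(tagged, key=lambda t: t[0])]


scrambled = list(unordered_operator(tag(["a", "b", "c", "d"])))
restored = top_node_sort(scrambled)
```

Sorting at the top is itself blocking on a finite input, so this design trades simpler operators for a delayed ordered result.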
12
General Issues III
- Generate approximate results
  - Answers to aggregate queries may change as new tuples arrive, so the results are approximate
- Generate partial results
  - New tuples will not change the validity of existing results
- Both require non-blocking operator implementations to provide the answer so far
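The difference between the two kinds of result can be sketched with two small generators; the names `running_average` and `partial_selection` are made up for this example. The aggregate's answer so far is revised by later tuples (approximate), while every tuple emitted by the selection remains valid (partial).

```python
from typing import Callable, Iterable, Iterator


def running_average(values: Iterable[float]) -> Iterator[float]:
    """Approximate result: the average so far, revised as each tuple arrives."""
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
        yield total / count


def partial_selection(
    values: Iterable[float], predicate: Callable[[float], bool]
) -> Iterator[float]:
    """Partial result: each emitted tuple stays valid no matter what arrives later."""
    for v in values:
        if predicate(v):
            yield v


stream = [10, 20, 60]
averages = list(running_average(stream))
selected = list(partial_selection(stream, lambda v: v >= 20))
```

Both are generators, so a consumer can stop at any point and keep the answer produced so far.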
13
General Issues IV
- Compute statistics
  - Data arrival speed
  - Selectivity of operator
  - Execution cost of operator
- Introduce control messages for synchronization
  - Within algebra node
  - Along with data stream
[Figure: data stream * * * * * P * * * * * * * * * P * * * *]
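A minimal sketch of per-operator statistics, assuming a selection operator and a control message carried in the data stream. `OperatorStats`, `MonitoredSelect`, `CONTROL` and `run` are hypothetical names, and reading the "P" in the figure as such a control message is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Tuple

CONTROL = object()  # hypothetical in-stream control message


@dataclass
class OperatorStats:
    tuples_in: int = 0
    tuples_out: int = 0
    busy_seconds: float = 0.0

    @property
    def selectivity(self) -> float:
        return self.tuples_out / self.tuples_in if self.tuples_in else 0.0

    @property
    def cost_per_tuple(self) -> float:
        return self.busy_seconds / self.tuples_in if self.tuples_in else 0.0


@dataclass
class MonitoredSelect:
    predicate: Callable[[Any], bool]
    stats: OperatorStats = field(default_factory=OperatorStats)

    def process(self, item: Any) -> bool:
        start = time.perf_counter()
        keep = self.predicate(item)
        self.stats.busy_seconds += time.perf_counter() - start
        self.stats.tuples_in += 1
        if keep:
            self.stats.tuples_out += 1
        return keep


def run(stream: Iterable[Any], op: MonitoredSelect) -> Tuple[List[Any], List[float]]:
    """Filter the stream; record a selectivity snapshot at each control message."""
    output, snapshots = [], []
    for item in stream:
        if item is CONTROL:
            snapshots.append(op.stats.selectivity)
            continue
        if op.process(item):
            output.append(item)
    return output, snapshots


op = MonitoredSelect(lambda x: x % 2 == 0)
output, snapshots = run([1, 2, 3, 4, CONTROL, 6, 8, CONTROL], op)
```

The snapshots taken at control messages give a consistent point at which a re-optimizer could read the counters.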
14
General Issues V
- Design mechanisms for query plan re-optimization
  - When to re-optimize
    - Event-condition-action rules (Tukwila)
    - Signal in the stream (Niagara)
  - How to re-optimize
    - Reorder joins based on statistics
    - Possibly find alternative sources for data that comes from slow sources
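The "when" and "how" above can be combined in a small sketch: a generic event-condition-action rule triggers a reordering of joins by observed selectivity. This is not Tukwila's actual rule language or API; `Rule`, `RuleEngine`, the `"timeout"` event name, the `rate` statistic and the most-selective-first heuristic are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    event: str                                   # event that wakes the rule
    condition: Callable[[Dict[str, float]], bool]  # checked against statistics
    action: Callable[[], None]                   # run if the condition holds


class RuleEngine:
    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def signal(self, event: str, stats: Dict[str, float]) -> int:
        """Fire every matching rule whose condition holds; return how many fired."""
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(stats):
                rule.action()
                fired += 1
        return fired


def reorder_by_selectivity(selectivity: Dict[str, float]) -> List[str]:
    """Heuristic: run the most selective join (lowest selectivity) first."""
    return sorted(selectivity, key=lambda name: selectivity[name])


plan = ["J1", "J2", "J3"]
observed = {"J1": 0.9, "J2": 0.1, "J3": 0.5}


def reoptimize() -> None:
    plan[:] = reorder_by_selectivity(observed)


engine = RuleEngine()
engine.register(Rule("timeout", lambda s: s["rate"] < 10.0, reoptimize))
fired_fast = engine.signal("timeout", {"rate": 50.0})  # condition false
plan_after_fast = list(plan)
fired_slow = engine.signal("timeout", {"rate": 2.0})   # condition true
```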
15
Contents
- Motivation
- Issues to consider when building an adaptive system
- Categories of adaptivity and related issues
- Related work
- Our initial ideas thus far (to be continued…)
16
Categories of Adaptivity
An adaptive system can adapt at many different levels, including:
- Batch: adapt query plans after X units of time
- Per query: adapt after every query
- Inter-operator: adapt after several operators
- Intra-operator: adapt within an operator
- Per tuple: adapt after one or more tuples
17
Per Query Adaptivity Illustration
- Adapt after every query has been executed
- Share execution of common sub-expressions between similar queries
- Reuse optimized sub-plans
[Figure: query plan with operators N, S, J, T; other labels: XML View, Data Sources]
18
Inter-Operator Adaptivity Illustration
- Adapt after one or more operators have been executed
- Modify query execution plans on the fly when delays are encountered at runtime
- Operator scheduling for CPU and memory allocation
- Alternative source selection
[Figure: query plan with operators N, S, J, T; other labels: XML View, Data Sources]
19
Intra-Operator Adaptivity Illustration
- Adapt during the execution of one operator
- Change the execution of one operator to another semantically correct implementation
- Input stream scheduling
[Figure: join operators (J) with inputs labeled N and S; other labels: XML View, Data Sources]
20
Per Tuple Adaptivity Illustration
- Adapt some operator's execution on a tuple-by-tuple basis
- Each tuple can be routed to a different join in the query plan so that each join is busy at all times
- Uses timestamps to keep track of which tuples have run through which joins
[Figure: a Tuple Router feeding join operators (J); other labels: T, N, S, XML View, Data Sources]
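A rough sketch of per-tuple routing under two simplifications, both assumptions of this example: simple predicates stand in for the joins, and each tuple carries a set of already-visited operators in place of the timestamps mentioned above. `TupleRouter` and `RoutedTuple` are hypothetical names. The router sends each tuple to the least-loaded operator it has not yet visited, so the visiting order can differ from tuple to tuple.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Set


@dataclass
class RoutedTuple:
    value: Any
    visited: Set[str] = field(default_factory=set)  # operators already applied


class TupleRouter:
    def __init__(self, operators: Dict[str, Callable[[Any], bool]]) -> None:
        self.operators = operators
        self.load = {name: 0 for name in operators}  # tuples sent to each operator

    def route(self, value: Any) -> bool:
        """Return True if the tuple passes every operator, in any order."""
        t = RoutedTuple(value)
        while len(t.visited) < len(self.operators):
            pending = [n for n in self.operators if n not in t.visited]
            # Pick the least-loaded operator this tuple has not visited yet.
            name = min(pending, key=lambda n: self.load[n])
            self.load[name] += 1
            t.visited.add(name)
            if not self.operators[name](t.value):
                return False  # tuple dropped; no need to visit the rest
        return True


router = TupleRouter({
    "even": lambda x: x % 2 == 0,
    "small": lambda x: x < 10,
})
passed = [v for v in range(20) if router.route(v)]
```

The result does not depend on the order chosen for each tuple, which is what lets the router adapt freely.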
21
Contents
- Motivation
- Issues to consider when building an adaptive system
- Categories of adaptivity and related issues
- Related work
22
Related Work
- Tukwila project at U. of Washington
  - Pure XML AQP through the integration of query planning and execution
  - Optimizes for time to first tuple first, then for the whole result
  - Dynamic scheduling of operators to adjust to I/O delays and flow rates
  - Breaks the query into execution groups or fragments and can re-optimize the plan after each group has been executed
  - Uses event-condition-action rules to determine whether re-optimization should take place
23
Related Work II
- Havasu project at Arizona State U.
  - User preference driven query optimization
- Niagara project at U. of Wisconsin
  - User doesn't have to specify the sources for a query
  - Allows the user to ask "give me results so far" even in the presence of aggregation operators
- MIX system at San Diego State
  - Information integration system using XML as the intermediate data model
  - Lazy navigation into the result, controlled by the user
  - Doesn't adapt the query plan during execution
24
Related Work III
- Aurora project at Brown/MIT/Brandeis
- Telegraph project at UC Berkeley
- Stream project at Stanford Univ.
25
To be continued…