
1 Adaptive Query Processing Adapted from a Tutorial given at SIGMOD 2006 by: Amol Deshpande, University of Maryland Joseph M. Hellerstein, University of California, Berkeley Vijayshankar Raman, IBM Almaden Research Center

2 Data Independence Redux  The taproot of modern database technology  Separation of specification (“what”) from implementation (“how”)  Refamiliarizing ourselves: Why do we care about data independence? Because applications change far more slowly than their environments: d(app)/dt ≪ d(env)/dt

3 D. I. → Adaptivity  Query Optimization: the key to data independence –bridges specification and implementation –isolates static applications from dynamic environments  How does a DBMS account for dynamics in the environment? ADAPTIVITY  This tutorial is on a 30-year-old topic –with a 21st-century renaissance

4 Why the Renaissance?  Breakdown of traditional query optimization –Queries over many tables –Unreliability of traditional cost estimation –Success & maturity make problems more apparent and critical (cf. Oracle v6!)  Query processing in new environments –E.g. data integration, web services, streams, P2P, sensornets, hosting, etc. –Unknown and dynamic characteristics for data and runtime –Increasingly aggressive sharing of resources and computation –Interactivity in query processing  Note two separate themes –Unknowns: even static properties are often unknown in new environments, and often unknowable a priori –Dynamics: d(env)/dt can be very high -- motivates intra-query adaptivity

5 20th Century Summary  System R’s optimization scheme deemed the winner for 25 years  Nearly all 20th-century research varied System R’s individual steps –More efficient measurement (e.g. sampling) –More efficient/effective models (samples, histograms, sketches) –Expanded plan spaces (new operators, bushy trees, richer queries and data models, materialized views, parallelism, remote data sources, etc.) –Alternative planning strategies (heuristic and enumerative)  Speaks to the strength of the scheme –independent innovation on multiple fronts –as compared with the tight coupling of INGRES  But… minimal focus on the interrelationship of the steps –which, as we saw from INGRES, also affects the plan space

6 21st Century Adaptive Query Processing  (well, starts in the late 1990s)  Revisit the basic architecture of System R –In effect, change the basic adaptivity loop: Measure/Analyze/Plan/Actuate!  As you examine schemes, keep an eye on: –Rate of change in the environment that is targeted –How radical the scheme is w.r.t. the System R scheme (ease of evolutionary change) –Increase in plan space: are there new, important opportunities? (even if the environment is ostensibly static!) –New overheads introduced –How amenable the scheme is to independent innovation at each step

7 Tangentially Related Work  An incomplete list!  Competitive Optimization [Antoshenkov93] –Choose multiple plans, run them in parallel for a time, let the most promising finish (one-shot feedback: execution doesn’t affect planning after the competition)  Parametric Query Optimization [INSS92, CG94, etc.] –Given partial stats in advance, do some planning and prune the space; at runtime, given the rest of the statistics, quickly finish planning (changes the interaction of Measure/Model and Planning; no feedback whatsoever, so nothing to adapt to!)  “Self-Tuning”/“Autonomic” Optimizers [CR94, CN97, BC02, etc.] –Measure query execution (e.g. cardinalities): enhances measurement, but on its own doesn’t change the loop –Consider building non-existent physical access paths (e.g. indexes, partitions): in some senses a separate loop over longer timescales — adaptive database design

8 Tangentially Related Work II  Robust Query Optimization [CHG02, MRS+04, BC05, etc.] –Goals: pick plans that remain predictable across wide ranges of scenarios; pick the least-expected-cost plan –Changes the cost function for planning, not necessarily the loop; if such functions are used in adaptive schemes, there is less fluctuation [MRS+04] –Hence fewer adaptations, less adaptation overhead  Adaptive query operators [NKT88, KNT89, PCL93a, PCL93b] –E.g. memory-adaptive sort and hash join –Don’t address whole-query optimization problems –However, if used with AQP, can result in complex feedback loops, especially if their actions affect each other’s models!

9 Extended Topics in Adaptive QP  An incomplete list!!  Parallelism & Distribution –River [A-D03] –FLuX [SHCF03, SHB04] –Distributed eddies [TD03]  Data Streams –Adaptive load shedding –Shared query processing

10 Adaptive Selection Ordering

11 Selection Ordering  Complex predicates on relations are common –E.g., on an employee relation: ((salary > 120000) AND (status = 2)) OR ((salary between 90000 and 120000) AND (age < 30) AND (status = 1)) OR …  Selection ordering problem: decide the order in which to evaluate the individual predicates against the tuples  We focus on evaluating conjunctive predicates (containing only ANDs)  Example query: select * from R where R.a = 10 and R.b < 20 and R.c like ‘%name%’;

12 Why Study Selection Ordering  Many join queries reduce to this problem –Queries posed against a star schema –Queries where only pipelined left-deep plans are considered –Queries involving web indexes  Increasing interest in recent years –Web indexes [CDY’95, EHJKMW’96, GW’00] –Web services [SMWM’06] –Data streams [AH’00, BMMNW’04] –Sensor Networks [DGMH’05]  Similar to many problems in other domains –Sequential testing (e.g. for fault detection) [SF’01, K’01] –Learning with attribute costs [KKM’05]

13 Execution Strategies  Pipelined execution (tuple-at-a-time): for each tuple r ∈ R, apply predicate R.a = 10 first; if the tuple satisfies the selection, apply R.b < 20; if both are satisfied, apply R.c like ‘%name%’  Static optimization: 1. The KBZ algorithm: order predicates by c/(1-p) (cost over one minus selectivity); assumes predicate independence 2. A greedy algorithm, known to be 4-approximate
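The static rank ordering above can be sketched in a few lines of Python. This is an illustrative helper, not code from the tutorial; the predicate names and selectivities are the running example from these slides, and the cost for the `like` predicate is an assumed value:

```python
def rank_order(predicates):
    """Order conjunctive selection predicates by rank c/(1-p), where c is
    per-tuple cost and p is selectivity (fraction of tuples passing).
    Assumes predicate independence, as KBZ-style ordering does."""
    return sorted(predicates, key=lambda pr: pr["cost"] / (1.0 - pr["sel"]))

preds = [
    {"name": "R.c like '%name%'", "cost": 5.0, "sel": 0.20},  # assumed cost
    {"name": "R.b < 20",          "cost": 1.0, "sel": 0.10},
    {"name": "R.a = 10",          "cost": 1.0, "sel": 0.05},
]
order = [p["name"] for p in rank_order(preds)]
```

With identical unit costs the cheap, highly selective predicates sort first, matching the intuition on the next slides.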

14 Adaptive Greedy [BMMNW’04]  Context: pipelined query plans over streaming data  Example: three independent predicates R.a = 10, R.b < 20, R.c like …, with initial estimated selectivities 0.05, 0.1, 0.2 and identical costs of 1 unit each  The optimal execution plan orders the predicates by selectivity (because the costs are identical)

15 Adaptive Greedy [BMMNW’04]  Monitor the selectivities: randomly sample R tuples into a profile, and estimate the selectivities of the predicates over the tuples of the profile  Switch order if the predicates are not ordered by selectivities: the reoptimizer checks IF the current plan is not optimal w.r.t. these new selectivities, and THEN reoptimizes using the profile

16 Adaptive Greedy [BMMNW’04]  Correlated selections: must monitor conditional selectivities — not just sel(R.a = 10), sel(R.b < 20), sel(R.c …), but also sel(R.b < 20 | R.a = 10), sel(R.c like … | R.a = 10), sel(R.c like … | R.a = 10 and R.b < 20)  The reoptimizer uses the conditional selectivities to detect violations and the profile to reoptimize  O(n²) selectivities need to be monitored
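A heavily simplified sketch of this monitor-and-reoptimize loop, assuming independent predicates and so tracking only unconditional selectivities over the sampled profile (the real A-Greedy algorithm also tracks the conditional selectivities listed above). All predicate names and data here are illustrative:

```python
import random

def estimate_selectivities(profile, predicates):
    """Estimate each predicate's selectivity over the sampled profile."""
    return {name: sum(1 for t in profile if pred(t)) / len(profile)
            for name, pred in predicates.items()}

def reoptimize(current_order, sels):
    """If the current order is not sorted by selectivity, switch to the
    order sorted by increasing selectivity; otherwise keep the plan."""
    best = sorted(current_order, key=lambda n: sels[n])
    return best if best != current_order else current_order

# Hypothetical predicates over tuples (a, b).
predicates = {
    "a=10": lambda t: t[0] == 10,
    "b<20": lambda t: t[1] < 20,
}
random.seed(0)
profile = [(random.randint(0, 50), random.randint(0, 50)) for _ in range(1000)]
sels = estimate_selectivities(profile, predicates)
order = reoptimize(["b<20", "a=10"], sels)  # switches to the better order
```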

17 Adaptive Greedy [BMMNW’04]  Advantages: –Can adapt very rapidly –Theoretical guarantees on performance (not known for any other AQP protocols)  Disadvantages: –Limited applicability: only applies to selection ordering and specific types of join queries –Possibly high runtime overheads (several heuristics described in the paper)

18 Eddies [AH’00]  Treat query processing as routing of tuples through operators  Instead of a traditional pipelined query plan, an eddy operator intercepts tuples from sources and output tuples from operators, and executes the query by routing source tuples through the operators (pipelined query execution using an eddy)

19 Eddies [AH’00]  An R tuple r1 with (a = 15, b = 10, c = “AnameA…”) arrives at the eddy

20 Eddies [AH’00]  Each tuple carries a ready bitmap: ready bit i = 1 means operator i can be applied; 0 means operator i can’t be applied  Tuple r1: (a = 15, b = 10, c = “AnameA…”), ready = 111, done = 000

21 Eddies [AH’00]  Each tuple also carries a done bitmap: done bit i = 1 means operator i has been applied; 0 means operator i hasn’t been applied  Tuple r1: ready = 111, done = 000

22 Eddies [AH’00]  Together, the ready and done bits are used to decide the validity and need of applying operators

23 Eddies [AH’00]  r1 is routed to an operator and satisfies it; its bitmaps become ready = 101, done = 010  It is then routed to another operator, which it does not satisfy, so r1 is dropped and the eddy looks at the next tuple  For a query with only selections, ready = complement(done)
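The ready/done bookkeeping on these slides can be sketched as a tiny eddy over selection operators. This is a hedged illustration, not the actual Telegraph code; the tuples and predicates are the slides' running example, and `first_ready` is a hypothetical routing function:

```python
def run_eddy(tuples, operators, route):
    """Minimal eddy over selections: each tuple carries a done bitmap, and
    for pure selections ready is simply the complement of done. `route`
    picks which ready operator to apply next."""
    n = len(operators)
    out = []
    for t in tuples:
        done = 0
        alive = True
        while alive and done != (1 << n) - 1:
            ready = ~done & ((1 << n) - 1)   # ready = complement(done)
            i = route(ready)                 # choose a ready operator
            if operators[i](t):
                done |= 1 << i               # mark operator i as applied
            else:
                alive = False                # predicate failed: drop tuple
        if alive:
            out.append(t)                    # done = 11..1: send to output
    return out

ops = [lambda t: t["a"] == 10, lambda t: t["b"] < 20]
first_ready = lambda ready: (ready & -ready).bit_length() - 1  # lowest set bit
result = run_eddy([{"a": 10, "b": 15}, {"a": 15, "b": 10}], ops, first_ready)
```

Swapping in a different `route` function changes the order per tuple, which is exactly the adaptivity hook the next slides exploit.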

24 Eddies [AH’00]  The next R tuple r2, with (a = 10, b = 15, c = “AnameA…”), is routed through the operators and satisfies each of them

25 Eddies [AH’00]  When r2 has done = 111 (ready = 000), it has satisfied all operators, and the eddy sends it to the output

26 Eddies [AH’00]  Adapting the order is easy –Just change the operators to which tuples are sent –Can be done on a per-tuple basis –Can be done in the middle of a tuple’s “pipeline”  How are the routing decisions made? –Using a routing policy

27 Routing Policy 1: Non-adaptive  Simulating a single static order –E.g. operator 1, then operator 2, then operator 3  Routing policy: if done = 000 → route to 1; done = 100 → route to 2; done = 110 → route to 3  Table lookups → very efficient
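The non-adaptive policy is literally a lookup table keyed on the done bitmap. A sketch (encoding the bitmap as a 3-character string, mirroring the slide):

```python
# Routing table simulating the single static order (operator 1, 2, then 3).
ROUTE = {"000": 1, "100": 2, "110": 3}

def next_operator(done):
    """Constant-time table lookup: given a tuple's done bitmap, return the
    next operator under the fixed static order."""
    return ROUTE[done]
```

This is why the overhead measured on the next slide can be kept so low: routing is one dictionary (or array) lookup per hop.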

28 Overhead of Routing  PostgreSQL implementation of eddies using bitset lookups [Telegraph Project]  Queries with 3 selections, of varying cost –Routing policy uses a single static order, i.e., no adaptation

29 Routing Policy 2: Deterministic  Monitor costs and selectivities continuously  Reoptimize periodically using KBZ  Statistics maintained: costs of operators, selectivities of operators  Routing policy: use a single order for a batch of tuples; periodically apply KBZ  Can use the A-Greedy policy for correlated predicates

30 Overhead of Routing and Reoptimization  Adaptation using batching –Reoptimized every X tuples using monitored selectivities –Identical selectivities throughout, so the experiment measures only the overhead

31 Routing Policy 3: Lottery Scheduling  Originally suggested routing policy [AH’00]  Applicable when each operator runs in a separate “thread” –Can also be done single-threaded, via an event-driven query executor  Uses two easily obtainable pieces of information for making routing decisions: –Busy/idle status of operators –Tickets per operator

32 Routing Policy 3: Lottery Scheduling  Routing decisions based on busy/idle status of operators  Rule: IF an operator is busy, THEN do not route more tuples to it  Rationale: every thread gets equal time, SO IF an operator is busy, THEN its cost is perhaps very high

33 Routing Policy 3: Lottery Scheduling  Routing decisions based on tickets  Rule 1: route a new tuple randomly, weighted according to the number of tickets  E.g. with tickets(O1) = 10, tickets(O2) = 70, tickets(O3) = 20, a new tuple r will be routed to O1 w.p. 0.1, O2 w.p. 0.7, O3 w.p. 0.2

34 Routing Policy 3: Lottery Scheduling  The new tuple r is routed to one of the operators according to these weights

35 Routing Policy 3: Lottery Scheduling  Rule 2: when routing a tuple to an operator O_i, tickets(O_i)++  E.g. after routing r to O1: tickets(O1) = 11, tickets(O2) = 70, tickets(O3) = 20

36 Routing Policy 3: Lottery Scheduling  Rule 3: when O_i returns a tuple to the eddy, tickets(O_i)--

37 Routing Policy 3: Lottery Scheduling  E.g. after O1 returns r: tickets(O1) = 10, tickets(O2) = 70, tickets(O3) = 20  The returned tuple r is next routed among the remaining operators: O2 w.p. 0.777, O3 w.p. 0.222

38 Routing Policy 3: Lottery Scheduling  Rationale: tickets(O_i) roughly corresponds to (1 - selectivity(O_i)), so more tuples are routed to the more selective operators
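The three ticket rules above can be sketched as a small routing class. This is a hedged illustration of the idea, not the Telegraph implementation; the class name, the initial one-ticket state, and the seeded RNG are assumptions for reproducibility:

```python
import random

class LotteryEddy:
    """Ticket-based routing sketch: +1 ticket when a tuple is routed to an
    operator, -1 when the operator returns it, so ticket counts track how
    often each operator drops tuples (roughly 1 - selectivity)."""
    def __init__(self, n_ops, seed=42):
        self.tickets = [1] * n_ops           # start with one ticket each
        self.rng = random.Random(seed)

    def pick(self, idle):
        """Hold a lottery among the idle operators, weighted by tickets
        (busy operators are simply excluded from `idle`)."""
        total = sum(self.tickets[i] for i in idle)
        draw = self.rng.uniform(0, total)
        for i in idle:
            draw -= self.tickets[i]
            if draw <= 0:
                return i
        return idle[-1]

    def routed(self, i):                     # Rule 2
        self.tickets[i] += 1

    def returned(self, i):                   # Rule 3
        self.tickets[i] -= 1

eddy = LotteryEddy(3)
eddy.tickets = [10, 70, 20]                  # ticket state from the slide
probs = [t / 100 for t in eddy.tickets]      # routing probabilities
```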

39 Routing Policy 3: Lottery Scheduling  Effect of the combined lottery scheduling policy: –Low-cost operators get more tuples –Highly selective operators get more tuples –Some tuples are randomly, knowingly routed according to sub-optimal orders, to explore: necessary to detect selectivity changes over time

40 Routing Policy 4: Content-based Routing  Routing decisions made based on the values of the attributes [BBDW’05]  Also called “conditional planning” in a static setting [DGHM’05]  Less useful unless the predicates are expensive –At the least, more expensive than r.d > 100  Example: the eddy notices that R.d > 100 → sel(op1) > sel(op2), and R.d < 100 → sel(op1) < sel(op2)  Routing decision for a new tuple r: IF (r.d > 100) route to op1 first w.h.p., ELSE route to op2 first w.h.p.
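A minimal sketch of the content-based decision on this slide. The 0.9 "with high probability" value is an illustrative assumption (the slide only says w.h.p.), and the attribute name `d` and threshold 100 come from the slide's example:

```python
import random

def content_based_route(t, rng):
    """Pick which expensive operator to apply first based on the tuple's
    own attribute value, exploring the other order with small probability."""
    preferred = "op1" if t["d"] > 100 else "op2"
    other = "op2" if preferred == "op1" else "op1"
    return preferred if rng.random() < 0.9 else other  # 0.9 is assumed

rng = random.Random(7)
choice = content_based_route({"d": 150}, rng)
```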

41 Eddies: Post-Mortem  Cost of adaptivity: –Routing overheads: minimal with careful engineering, e.g. using bitset-indexed routing arrays; “batching” helps tremendously –Statistics maintenance –Executing the routing-policy logic

42 Discussion  Benefits for AQP techniques come from two places –Increased explored plan space Can use different plans for different parts of data –Adaptation Can change the plan according to changing data characteristics  Selection ordering is STATELESS –No inherent “cost of switching plans” Can switch the plan without worrying about operator states –Key to the simplicity of the techniques

43 Discussion  Adaptation is not free –Costs of monitoring and maintaining statistics can be very high: a selection operation may take only 1 instruction to execute, comparable to updating a count –“Sufficient statistics”: may need to maintain only a small set of statistics to detect violations, e.g. the O(n²) matrix in Adaptive-Greedy [BMMNW’04]

44 Adaptive Join Processing

45 Outline  20th Century Adaptivity: Intuition from the Classical Systems  Adaptive Selection Ordering  Adaptive Join Processing –Additional complexities beyond selection ordering –Four plan spaces: Simplest: pipelines of Nested Loop Joins; Traditional: Trees of Binary Operators (TOBO); Multi-TOBO: horizontal partitioning; Dataflows of unary operators –Handling asynchrony  Research Roundup

46 Select-Project-Join Processing  Query: select count(*) from R, S, T where R.a=S.a and S.b=T.b and S.c like ‘%name%’ and T.d = 10  An example execution plan: a tree of an NLJ and an SMJ over R, S, T  Cost metric: CPU + I/O  Plan Space: –Traditionally, a tree of binary join operators (TOBO): access methods, join algorithms, join order –Adaptive systems: some use the same plan space but switch between plans during execution; others use much larger plan spaces –Different adaptation techniques adapt within different plan spaces

47 Approach  Pipelined nested-loops plans –Static rank ordering –Dynamic rank ordering: eddies, competition  Trees of Binary Join Operators (TOBO) –Static: System R –Dynamic: switching plans during execution  Multiple Trees of Binary Join Operators –Convergent query processing –Eddies with binary join operators –STAIRs: moving state across joins  Dataflows of Unary Operators –N-way joins; switching join algorithms during execution  Asynchrony in Query Processing

48 Pipelined Nested Loops Join  Simplest method of joining tables –Pick a driver table (R); call the rest driven tables –Pick access methods (AMs) on the driven tables –Order the driven tables –Flow R tuples through the driven tables: for each r ∈ R, look for matches for r in A; for each match a, look for matches for a in B; …

49 Adapting a Pipelined Nested Loops Join  Same setup: for each r ∈ R, look for matches for r in A; for each match a, look for matches for a in B; …  Keep the driver table and access methods fixed for now, and adapt only the order of the driven tables  This is almost identical to selection ordering
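The "flow each driver tuple through the ordered driven tables" loop can be sketched as follows, modeling each access method as a dict index. The table names, keys, and data are hypothetical; reordering the `driven` list is the adaptation knob this slide describes:

```python
def pipelined_nlj(driver, driven):
    """Pipelined nested-loops join: flow each driver tuple through the
    ordered driven tables, looking up matches via an access method
    (modeled here as a dict keyed on the join attribute)."""
    def recurse(partial, tables):
        if not tables:
            yield partial                    # matched in every driven table
            return
        index, key = tables[0]
        for match in index.get(partial[key], []):
            yield from recurse({**partial, **match}, tables[1:])
    for r in driver:
        yield from recurse(r, driven)

# Hypothetical data: index A on R.x, index B on A.y.
A = {1: [{"x": 1, "y": "p"}], 2: [{"x": 2, "y": "q"}]}
B = {"p": [{"y": "p", "z": 10}]}
rows = list(pipelined_nlj([{"x": 1}, {"x": 2}], [(A, "x"), (B, "y")]))
```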

50 Tree Of Binary Join Operators (TOBO) recap  Standard plan space considered by most DBMSs today –Pick access methods, join order, join algorithms –Search in this plan space is done by dynamic programming

51 Switching plans during query execution  Monitor cardinalities at well-defined checkpoints  If the actual cardinality is too different from the estimate, re-optimize to switch to a new plan  Most widely studied technique: federated systems (InterViso 90, MOOD 96), Red Brick, query scrambling (96), mid-query re-optimization (98), progressive optimization (04), proactive reoptimization (05), …

52 Questions  Where to place checkpoints? –Typically at materialization points  When to reoptimize? –State reuse is key –The cost of switching shouldn’t be more than the cost of continuing  How to switch? –Try to reuse the work already done

53 Next  Eddies…

54 Example Database  Students(Name, Level): (Joe, Junior), (Jen, Senior)  Enrolled(Name, Course): (Joe, CS1), (Jen, CS2)  Courses(Course, Instructor): (CS2, Smith)  Query: select * from students, enrolled, courses where students.name = enrolled.name and enrolled.course = courses.course  Students ⋈ Enrolled: (Joe, Junior, CS1), (Jen, Senior, CS2)  (Students ⋈ Enrolled) ⋈ Courses: (Jen, Senior, CS2, Smith)

55 Symmetric Hash Join  Pipelined hash join [WA’91]: simultaneously builds and probes hash tables on both sides  Example: select * from students, enrolled where students.name = enrolled.name over Students = {(Jen, Senior), (Joe, Junior)} and Enrolled = {(Joe, CS1), (Jen, CS2), (Joe, CS2)} yields (Jen, Senior, CS2), (Joe, Junior, CS1), (Joe, Junior, CS2)  Widely used: adaptive QP, stream joins, online aggregation, …  Naïve version degrades to NLJ once memory runs out –Quadratic time complexity –Memory needed = sum of inputs  Improved by XJoin [UF 00]  Needs more investigation
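The build-and-probe-on-both-sides behavior can be sketched directly over the slide's example data. A naive in-memory sketch (no XJoin-style disk spilling); the `"L"`/`"R"` stream encoding is an assumption for illustration:

```python
from collections import defaultdict

def symmetric_hash_join(stream, key):
    """Symmetric (pipelined) hash join: every arriving tuple is built into
    its own side's hash table, then immediately probes the other side's
    table, so results stream out as soon as both matching tuples exist.
    `stream` yields ("L", tuple) or ("R", tuple) in arrival order."""
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    out = []
    for side, t in stream:
        k = t[key]
        tables[side][k].append(t)             # build into own table
        other = "R" if side == "L" else "L"
        for m in tables[other].get(k, []):    # probe the other table
            out.append({**t, **m})
    return out

students = [{"name": "Jen", "level": "Senior"}, {"name": "Joe", "level": "Junior"}]
enrolled = [{"name": "Joe", "course": "CS1"}, {"name": "Jen", "course": "CS2"}]
stream = [("L", students[0]), ("R", enrolled[0]),
          ("L", students[1]), ("R", enrolled[1])]
out = symmetric_hash_join(stream, "name")
```

Because both tables are kept in memory for the whole run, memory use is the sum of the inputs, which is exactly the limitation the slide notes.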

56 Query Execution using Eddies  The tuple (Joe, Junior) from Students is routed to the S ⋈ E join: it is inserted (with key hash(Joe)) into the hash table on S.Name, and probes the hash table on E.Name to find matches  No matches; the eddy processes the next tuple

57 Query Execution using Eddies  The tuple (Joe, CS1) from Enrolled is inserted into the hash table on E.Name and probes the hash table on S.Name, producing the match (Joe, Jr, CS1), which is routed onward; meanwhile the Courses tuple (CS2, Smith) has arrived

58 Query Execution using Eddies  The tuple (Jen, CS2) probes the hash table on S.Name to produce (Jen, Sr, CS2), which then probes the hash table on C.Course (containing (CS2, Smith)) to produce the output tuple (Jen, Sr, CS2, Smith)

59 Eddies: Postmortem (1)  The eddy executes different TOBO plans for different parts of the data: e.g. (Joe, CS1) was processed via (Students ⋈ Enrolled) ⋈ Courses, while (Jen, CS2) was processed via (Courses ⋈ Enrolled) ⋈ Students

60 Routing Policies  Routing of tuples determines the join order  Same policies as described for selections: –can simulate a static order –lottery scheduling –can tune the policy for an interactivity metric [RH02]  Much more work remains to be done

61 Summary  Eddies dynamically reorder pipelined operators on a per-tuple basis –selections, nested loop joins, symmetric hash joins –extends naturally to combinations of the above  This can also be extended to non-pipelined operators (e.g. hybrid hash)  But the eddy can adapt only when it gets control (e.g., after a hash table build) –just like with mid-query reoptimization  Next we study two extensions that widen the plan space still further –STAIRs: moving state across join operators –SteMs: unary operators for adapting access methods and join algorithms dynamically

62 N-way Symmetric Hash Join  Pipelined join over Students(Name, Level), Enrolled(Name, Course), and Courses(Course, Instructor) = {(CS1, Prof. A), (CS2, Prof. B)}, producing (Jen, Senior, CS2, Prof. B), (Joe, Junior, CS1, Prof. A), (Joe, Junior, CS2, Prof. B) –XJoin-like disk spilling possible [VNB’03]  Simplest version atomically does one build + (n-1) probes  Breaking this atomicity is key to adaptation
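One step of the naive n-way operator, the atomic "one build + (n-1) probes" for an arriving Enrolled tuple, can be sketched as below. Table and attribute names are assumptions based on the slide's example; the point is that the probe sequence is hard-wired inside one function, which is the atomicity that SteMs break so the probe order can be chosen adaptively:

```python
from collections import defaultdict

# Hash tables for Students, Enrolled, Courses (hypothetical n-way SHJ state).
S = defaultdict(list)
E = defaultdict(list)
C = defaultdict(list)

def insert_enrolled(t):
    """Arrival of an Enrolled tuple: atomically build it into E, then probe
    the other n-1 tables (Students on name, Courses on course) in a fixed,
    hard-wired order, emitting any fully joined results."""
    E[t["name"]].append(t)                       # the one build
    out = []
    for s in S[t["name"]]:                       # probe 1: Students
        for c in C[t["course"]]:                 # probe 2: Courses
            out.append({**s, **t, **c})
    return out

S["Jen"].append({"name": "Jen", "level": "Senior"})
C["CS2"].append({"course": "CS2", "instructor": "Prof. B"})
result = insert_enrolled({"name": "Jen", "course": "CS2"})
```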

63 N-way Joins → State Modules (SteMs)  The same example query, with the n-way join’s hash tables on Students, Enrolled, and Courses now exposed as separately routable State Modules

64 References  [A-D03] R. Arpaci-Dusseau. Runtime Adaptation in River. ACM TOCS 2003.  [AH’00] R. Avnur, J. M. Hellerstein: Eddies: Continuously Adaptive Query Processing SIGMOD Conference 2000: 261-272  [Antoshenkov93] G. Antoshenkov: Dynamic Query Optimization in Rdb/VMS. ICDE 1993: 538-547.  [BBD’05] S. Babu, P. Bizarro, D. J. DeWitt. Proactive Reoptimization. VLDB 2005: 107-118  [BBDW’05] P. Bizarro, S. Babu, D. J. DeWitt, J. Widom: Content-Based Routing: Different Plans for Different Data. VLDB 2005: 757-768  [BC02] N. Bruno, S. Chaudhuri: Exploiting statistics on query expressions for optimization. SIGMOD Conference 2002: 263-274  [BC05] B. Babcock, S. Chaudhuri: Towards a Robust Query Optimizer: A Principled and Practical Approach. SIGMOD Conference 2005: 119-130  [BMMNW’04] S. Babu, et al: Adaptive Ordering of Pipelined Stream Filters. SIGMOD Conference 2004: 407-418  [CDHW06] Flow Algorithms for Two Pipelined Filter Ordering Problems; Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu. PODS 2006.  [CDY’95] S. Chaudhuri, U. Dayal, T. W. Yan: Join Queries with External Text Sources: Execution and Optimization Techniques. SIGMOD Conference 1995: 410-422  [CG94] R. L. Cole, G. Graefe: Optimization of Dynamic Query Evaluation Plans. SIGMOD Conference 1994: 150-160.  [CHG02] F. C. Chu, J. Y. Halpern, J. Gehrke: Least Expected Cost Query Optimization: What Can We Expect? PODS 2002: 293-302  [CN97] S. Chaudhuri, V. R. Narasayya: An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. VLDB 1997: 146-155

65 References (2)  [CR94] C-M Chen, N. Roussopoulos: Adaptive Selectivity Estimation Using Query Feedback. SIGMOD Conference 1994: 161-172  [DGHM’05] A. Deshpande, C. Guestrin, W. Hong, S. Madden: Exploiting Correlated Attributes in Acquisitional Query Processing. ICDE 2005: 143-154  [DGMH’05] A. Deshpande, et al.: Model-based Approximate Querying in Sensor Networks. In VLDB Journal, 2005  [DH’04] A. Deshpande, J. Hellerstein: Lifting the Burden of History from Adaptive Query Processing. VLDB 2004.  [EHJKMW’96] O. Etzioni, et al: Efficient Information Gathering on the Internet. FOCS 1996: 234-243  [GW’00] R. Goldman, J. Widom: WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web. SIGMOD Conference 2000: 285-296  [INSS92] Y. E. Ioannidis, R. T. Ng, K. Shim, T. K. Sellis: Parametric Query Optimization. VLDB 1992.  [Ives02] Z. G. Ives. Efficient Query Processing for Data Integration. Ph.D., U. Washington, 2002.  [K’01] M.S. Kodialam. The throughput of sequential testing. In Integer Programming and Combinatorial Optimization (IPCO) 2001.  [KBZ’86] R. Krishnamurthy, H. Boral, C. Zaniolo: Optimization of Nonrecursive Queries. VLDB 1986.  [KD’98] N. Kabra, D. J. DeWitt: Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans. SIGMOD Conference 1998: 106-117  [KKM’05] H. Kaplan, E. Kushilevitz, and Y. Mansour. Learning with attribute costs. In ACM STOC, 2005.  [KNT89] Masaru Kitsuregawa, Masaya Nakayama and Mikio Takagi, "The Effect of Bucket Size Tuning in the Dynamic Hybrid GRACE Hash Join Method”. VLDB 1989.

66 References (3)  [LEO 01] M. Stillger, G. M. Lohman, V. Markl, M. Kandil: LEO - DB2's LEarning Optimizer. VLDB 2001.  [MRS+04] Volker Markl, et al.: Robust Query Processing through Progressive Optimization. SIGMOD Conference 2004: 659-670  [MSHR’02] S. Madden, M. A. Shah, J. M. Hellerstein, V. Raman: Continuously adaptive continuous queries over streams. SIGMOD Conference 2002: 49-60  [NKT88] M. Nakayama, M. Kitsuregawa, and M. Takagi. Hash partitioned join method using dynamic destaging strategy. In VLDB 1988.  [PCL93a] H. Pang, M. J. Carey, M. Livny: Memory-Adaptive External Sorting. VLDB 1993: 618-629  [PCL93b] H. Pang, M. J. Carey, M. Livny: Partially Preemptive Hash Joins. SIGMOD Conference 1993.  [RH’05] N. Reddy, J. Haritsa: Analyzing Plan Diagrams of Database Query Optimizers; VLDB 2005.  [SF’01] M.A. Shayman and E. Fernandez-Gaucherand: Risk-sensitive decision-theoretic diagnosis. IEEE Trans. Automatic Control, 2001.  [SHB04] M. A. Shah, J. M. Hellerstein, E. Brewer. Highly-Available, Fault-Tolerant, Parallel Dataflows, SIGMOD, June 2004.  [SHCF03] M. A. Shah, J. M. Hellerstein, S. Chandrasekaran and M. J. Franklin. Flux: An Adaptive Partitioning Operator for Continuous Query Systems, ICDE, March 2003.  [SMWM’06] U. Srivastava, K. Munagala, J. Widom, R. Motwani: Query Optimization over Web Services; VLDB 2006.  [TD03] F. Tian, D. J. Dewitt. Tuple Routing Strategies for Distributed Eddies. VLDB 2003.  [UFA’98] T. Urhan, M. J. Franklin, L. Amsaleg: Cost Based Query Scrambling for Initial Delays. SIGMOD Conference 1998: 130-141  [UF 00] T. Urhan, M. J. Franklin: XJoin: A Reactively-Scheduled Pipelined Join Operator. IEEE Data Eng. Bull. 23(2): 27-33 (2000)  [VNB’03] S. Viglas, J. F. Naughton, J. Burger: Maximizing the Output Rate of Multi-Way Join Queries over Streaming Information Sources. VLDB 2003: 285-296  [WA’91] A. N. Wilschut, P. M. G. Apers: Dataflow Query Execution in a Parallel Main-Memory Environment. PDIS 1991: 68-77

