Published by Chester Benson. Modified over 9 years ago.
1
Multi-Query Optimization and Applications
Prasan Roy
Indian Institute of Technology - Bombay
2
Multi-Query Optimization and Applications, May 2000
Motivation
Queries often involve repeated computation
– Queries on overlapping views, stored procedures, nested queries, etc.
– Update expressions for a set of overlapping materialized views
– Automatically generated queries
  – XML-QL complex path expressions
  – SQL query batches
Our focus: faster query processing by avoiding repeated computation
3
Outline
Multi-query optimization
Application to related problems
– Query result caching
– Materialized view selection and maintenance
Conclusions and future work
4
Multi-Query Optimization
Prasan Roy, S. Seshadri, S. Sudarshan and Siddhesh Bhobe, Efficient and Extensible Algorithms for Multi-Query Optimization, ACM SIGMOD 2000
5
Motivating Example
Figure: best plan for A JOIN B JOIN C and best plan for B JOIN C JOIN D (operator costs of 10 and 100 shown in the plan trees)
Foreign Key Dependency: A B C D
Total Cost = 460
6
Motivating Example
Figure: plans for the two queries over A, B, C, D with the node BC shared (operator costs of 10 and 100 shown in the plan trees)
Foreign Key Dependency: A B C D
Total Cost = 370
Benefit = 90
7
Problem Statement
Find the cheapest plan exploiting transiently materialized common subexpressions (CSEs)
– Assumption: No shared pipelines
Figure: plans over A, B, C, D with the common subexpression marked
8
Problems
Locally optimal subplans may not be globally optimal
Mutually exclusive alternatives
– Queries: (A JOIN B JOIN C), (B JOIN C JOIN D), (C JOIN D JOIN E)
– What to share: (B JOIN C) or (C JOIN D)?
Materializing and sharing a CSE is not necessarily cheaper
9
Example
Figure: best plan for A JOIN B JOIN C and best plan for B JOIN C JOIN D (operator costs of 1, 10 and 100 shown in the plan trees)
Foreign Key Dependency: A B C D
Total Cost = 154
10
Example
Figure: plans for the two queries over A, B, C, D with the node BC shared (operator costs of 1, 10 and 100 shown in the plan trees)
Foreign Key Dependency: A B C D
Total Cost = 172
Benefit = -18
11
Approach
1. Set up the search space of execution plans
2. Explore the search space to find the best execution plan
12
Representation of Plan Space
AND/OR Query DAG
– Equivalence Class (OR node)
– Operation (AND node)
Figure: query DAG with equivalence nodes A, B, C, D, AB, BC, CD, ABC, BCD, and an example plan (solution graph)
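The AND/OR structure can be sketched in a few lines of Python. This is only an illustration under assumed names and costs: `EqNode`, `OpNode`, `best_cost`, `scan` and every cost value are hypothetical, not taken from the slides. An equivalence (OR) node is satisfied by any one of its operations; an operation (AND) node needs all of its inputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EqNode:
    """OR node: an equivalence class; any ONE of its operations computes it."""
    name: str
    ops: List["OpNode"] = field(default_factory=list)


@dataclass
class OpNode:
    """AND node: one operation; ALL of its input classes are required."""
    label: str
    local_cost: float
    inputs: List[EqNode] = field(default_factory=list)


def best_cost(eq: EqNode) -> float:
    """Min over alternatives (OR) of local cost plus the sum over inputs (AND)."""
    return min(op.local_cost + sum(best_cost(i) for i in op.inputs)
               for op in eq.ops)


def scan(name: str, cost: float) -> EqNode:
    """Base relation: a single scan operation with no inputs."""
    return EqNode(name, [OpNode("scan " + name, cost)])


# Hypothetical costs, chosen only for illustration.
a, b, c = scan("A", 10), scan("B", 10), scan("C", 10)
ab = EqNode("AB", [OpNode("A JOIN B", 20, [a, b])])
bc = EqNode("BC", [OpNode("B JOIN C", 5, [b, c])])
abc = EqNode("ABC", [OpNode("AB JOIN C", 30, [ab, c]),
                     OpNode("A JOIN BC", 30, [a, bc])])

print(best_cost(abc))
```

A plan (solution graph) corresponds to picking one operation under each equivalence node that the plan reaches; `best_cost` picks the cheapest such choice for a single query and ignores sharing.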
13
DAG Generation Modifications: Unification
Volcano: Duplicate subexpressions; no CSEs!
Modification: Duplicate subexpressions unified
Figure: DAG over A, B, C, D with nodes AB, BC, CD, ABC, BCD; BC appears twice before unification
14
DAG Generation Modifications: Subsumption
Volcano: No expression subsumption; missed CSEs
Modification: Subsumption derivations introduced
Figure labels: (A<10) (A>50) (A 50) (A>50) (A>10) (A>50)
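As a rough illustration of a subsumption derivation, the sketch below computes a weaker selection once and derives a stronger one from its result. The data, the column name `A`, and the helpers `select_gt` and `gt_subsumes` are all hypothetical; the sketch covers only predicates of the form A > bound and is not the optimizer's actual mechanism.

```python
# Hypothetical input relation with a single column A.
rows = [{"A": v} for v in (5, 12, 40, 51, 77)]


def select_gt(rs, bound):
    """Selection sigma(A > bound) over a list of rows."""
    return [r for r in rs if r["A"] > bound]


def gt_subsumes(weak_bound, strong_bound):
    """sigma(A > strong) can be derived from sigma(A > weak) when strong >= weak."""
    return strong_bound >= weak_bound


shared = select_gt(rows, 10)      # computed once, candidate for sharing
derived = select_gt(shared, 50)   # derived from the shared result
direct = select_gt(rows, 50)      # computed from scratch, for comparison

print(derived == direct)
```

Without the derivation edge in the DAG, the two selections look unrelated and the opportunity to share the weaker result is missed.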
15
Exploring the Search Space: An Exhaustive Algorithm
Input: DAG for query Q
Output: Set of nodes to materialize, corresp. best plan
1. Y = set of equivalence nodes in DAG
2. Pick X ⊆ Y which minimizes BestCost(Q, X)
3. Return X
BestCost(Q, X) = cost of the best plan for Q given that the nodes in X are transiently materialized
Too expensive! Need heuristics.
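The exhaustive search can be sketched as an enumeration of every subset of Y. In this sketch `best_cost` stands in for BestCost(Q, X) and is backed by a hypothetical lookup table: the values 460 and 370 echo the motivating example, while the entries involving CD are invented for illustration.

```python
from itertools import chain, combinations

# Hypothetical BestCost(Q, X) values for a DAG with two candidate nodes.
TABLE = {
    frozenset(): 460,
    frozenset({"BC"}): 370,
    frozenset({"CD"}): 400,
    frozenset({"BC", "CD"}): 380,
}


def toy_best_cost(x):
    """Stand-in for BestCost(Q, X): look the set up in the table."""
    return TABLE[frozenset(x)]


def exhaustive(candidates, best_cost):
    """Try every subset X of the candidates and keep the cheapest."""
    cands = sorted(candidates)
    subsets = chain.from_iterable(
        combinations(cands, r) for r in range(len(cands) + 1))
    return min((frozenset(s) for s in subsets), key=best_cost)


print(sorted(exhaustive({"BC", "CD"}, toy_best_cost)))
```

The number of subsets doubles with every additional equivalence node, which is why the approach does not scale beyond tiny DAGs.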
16
Exploring the Search Space: A Greedy Heuristic
Input: DAG for query Q
Output: Set of nodes to materialize, corresp. best plan
1. X = {}; Y = set of equivalence nodes in DAG
2. While (Y ≠ {})
     Pick z ∈ Y which maximizes Benefit(z | Q, X)
     If (Benefit(z | Q, X) > 0)
       Y = Y – {z}; X = X ∪ {z}
     Else
       Y = {}
3. Return X
Benefit(z | Q, X) = BestCost(Q, X) - BestCost(Q, X ∪ {z})
Appeared in [Gupta, ICDT97]. Our contribution: improve efficiency
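The greedy loop translates almost line for line into Python. The sketch below assumes that BestCost(Q, X) is available as a function; here it is a hypothetical lookup table (`TABLE`, `toy_best_cost`) whose 460 and 370 entries echo the motivating example and whose other entries are invented.

```python
# Hypothetical BestCost(Q, X) values for a DAG with two candidate nodes.
TABLE = {
    frozenset(): 460,
    frozenset({"BC"}): 370,
    frozenset({"CD"}): 400,
    frozenset({"BC", "CD"}): 380,
}


def toy_best_cost(x):
    """Stand-in for BestCost(Q, X)."""
    return TABLE[frozenset(x)]


def greedy(candidates, best_cost):
    """Repeatedly materialize the node with the largest positive benefit."""
    x = set()
    y = set(candidates)
    while y:
        current = best_cost(x)

        def benefit(n):
            # Benefit(n | Q, X) = BestCost(Q, X) - BestCost(Q, X U {n})
            return current - best_cost(x | {n})

        z = max(sorted(y), key=benefit)
        if benefit(z) > 0:
            y.remove(z)
            x.add(z)
        else:
            break  # corresponds to Y = {} in the pseudocode
    return x


print(sorted(greedy({"BC", "CD"}, toy_best_cost)))
```

With these numbers BC is picked first (benefit 90); CD then has a negative benefit given BC, so the loop stops. Each iteration recomputes the benefit of every remaining candidate, and that recomputation is the cost the efficiency improvements target.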
17
Improving Efficiency: Summary
Input: DAG for query Q
Output: Set of nodes to materialize, corresp. best plan
1. X = {}; Y = set of equivalence nodes in DAG
2. While (Y ≠ {})
     Pick z ∈ Y which maximizes Benefit(z | Q, X)
     If (Benefit(z | Q, X) > 0)
       Y = Y – {z}; X = X ∪ {z}
     Else
       Y = {}
3. Return X
Restrict the set of materialization candidates
Compute Benefit efficiently
Heuristically avoid computing Benefit for some nodes
18
Improving Efficiency: Only CSEs Materialized
CSEs identified in a bottom-up traversal
Figure: query DAG with equivalence nodes A, B, C, D, AB, BC, CD, ABC, BCD, with the common subexpression marked
20
Efficient Benefit Computation: Incremental Re-optimization
X: set of CSEs already materialized
z: unmaterialized CSE
Goal: go from the best plan given X materialized to the best plan given X ∪ {z} materialized
Observation: best plans change only for the ancestors of z
21
Incremental Re-optimization: Example
Figure: query DAG over A, B, C, D with equivalence nodes AB, BC, CD, ABC, BCD. For the best plan with X = {}, the nodes carry costs of 10, 100 and 230; after materializing z = (B JOIN C), the figure shows costs of 10, 120 and 130.
22
Incremental Re-optimization: Efficient Propagation
Ancestor nodes visited bottom-up in a topological order
– Guarantees no revisits
Propagation path pruned if the current node's best cost remains unchanged
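A minimal sketch of the propagation step, under an assumed cost model: a node's cost is the minimum over its operations of the local cost plus the cost of each input, where a materialized input is charged its reuse cost if that is cheaper. The DAG, the names `OPS`, `TOPO`, `REUSE`, and all numbers are hypothetical. Materialization costs are left out; only the cost propagation is shown.

```python
import heapq

# Hypothetical DAG: equivalence node -> alternative operations,
# each given as (local_cost, [input equivalence nodes]).
OPS = {
    "A": [(10, [])], "B": [(10, [])], "C": [(10, [])], "D": [(10, [])],
    "BC": [(80, ["B", "C"])],
    "ABC": [(120, ["A", "BC"])],
    "BCD": [(120, ["BC", "D"])],
}
# Topological numbers: every node comes after all of its inputs.
TOPO = {n: i for i, n in enumerate(["A", "B", "C", "D", "BC", "ABC", "BCD"])}
REUSE = {"BC": 10}  # hypothetical cost of reading a materialized BC


def input_cost(n, cost, x):
    """Cost charged to a parent for input n, given materialized set x."""
    return min(cost[n], REUSE[n]) if n in x else cost[n]


def node_cost(n, cost, x):
    return min(lc + sum(input_cost(c, cost, x) for c in ins)
               for lc, ins in OPS[n])


def full_optimize(x):
    """Recompute every node's best cost from scratch."""
    cost = {}
    for n in sorted(OPS, key=TOPO.get):
        cost[n] = node_cost(n, cost, x)
    return cost


def incremental(cost, x, z):
    """Update best costs after adding z to x, touching only ancestors of z."""
    parents = {n: [p for p in OPS if any(n in ins for _, ins in OPS[p])]
               for n in OPS}
    new_x = x | {z}
    cost = dict(cost)
    heap = [(TOPO[p], p) for p in parents[z]]
    heapq.heapify(heap)
    queued = set(parents[z])
    visited = []
    while heap:
        _, n = heapq.heappop(heap)  # lowest topological number first
        visited.append(n)
        new = node_cost(n, cost, new_x)
        if new != cost[n]:          # unchanged cost: prune, parents not queued
            cost[n] = new
            for p in parents[n]:
                if p not in queued:
                    queued.add(p)
                    heapq.heappush(heap, (TOPO[p], p))
    return cost, visited


before = full_optimize(frozenset())
after, visited = incremental(before, frozenset(), "BC")
print(visited, after["ABC"], after["BCD"])
```

Popping nodes in topological order means a node is recomputed only after all of its changed descendants have settled, so it never needs a second visit.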
24
Avoiding Benefit Computation
Monotonicity Assumption
– Benefit of a node does not increase due to materialization of other nodes
Often true
An earlier benefit of a node is an upper bound on its current benefit
Do not recompute a node's benefit if another node's current benefit is greater
Optimization costs decrease by 90%
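Under the monotonicity assumption, the greedy loop can keep stale benefits in a max-heap and recompute only the node at the top. The sketch below shows one way to do this; `lazy_greedy`, `SAVINGS` and `toy_best_cost` are hypothetical, and the toy cost model is additive, so it satisfies monotonicity trivially.

```python
import heapq

# Hypothetical, additive savings per materialized node (toy model only).
SAVINGS = {"BC": 90, "CD": 50, "AB": -5}


def toy_best_cost(x):
    """Stand-in for BestCost(Q, X)."""
    return 460 - sum(SAVINGS[n] for n in x)


def lazy_greedy(candidates, best_cost):
    """Greedy selection that treats earlier benefits as upper bounds."""
    x = set()
    base = best_cost(x)
    # Heap entries: (-benefit, node, size of X when the benefit was computed).
    heap = [(-(base - best_cost({n})), n, 0) for n in sorted(candidates)]
    heapq.heapify(heap)
    while heap:
        neg_b, n, size = heapq.heappop(heap)
        if size == len(x):
            # The benefit is current and at least as large as every other
            # node's upper bound, so n is the greedy choice.
            if -neg_b <= 0:
                break
            x.add(n)
        else:
            # Stale bound: recompute only this node and push it back.
            b = best_cost(x) - best_cost(x | {n})
            heapq.heappush(heap, (-b, n, len(x)))
    return x


print(sorted(lazy_greedy(SAVINGS, toy_best_cost)))
```

In this run, when CD is selected, the benefit of AB is never recomputed for that round, because its stale upper bound is already below the current benefit of CD. If monotonicity does not hold, the result may differ from plain greedy.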
25
Experimental Results
TPCD-0.1 on Microsoft SQL Server 6.5
– using SQL rewriting for MQO
26
Alternatives to Greedy: Volcano-SH
A lightweight post-pass heuristic
1. Compute the best plan for each query independently, using Volcano
2. Find the set of nodes in the best plans to materialize (cost-based)
Similar previous work [Subramanian and Venkataraman, SIGMOD 1998]
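One plausible form of a cost-based materialization test is sketched below. It is an assumption for illustration, not necessarily the exact rule used in Volcano-SH: a node used `numuses` times is materialized when computing it once, writing it out, and reading it back for the remaining uses is cheaper than recomputing it each time. The function name and all four parameters are hypothetical.

```python
def worth_materializing(cost, matcost, reusecost, numuses):
    """Assumed test: share only if the shared plan is strictly cheaper.

    cost      -- cost of computing the node once
    matcost   -- cost of writing the result out
    reusecost -- cost of reading the materialized result
    numuses   -- number of times the node is used in the chosen plans
    """
    without_sharing = cost * numuses
    # Assumption: the first use consumes the result as it is computed,
    # so only the remaining uses pay the reuse cost.
    with_sharing = cost + matcost + reusecost * (numuses - 1)
    return with_sharing < without_sharing


print(worth_materializing(100, 10, 10, 2))  # expensive node
print(worth_materializing(10, 10, 10, 2))   # cheap node
```

A node that is expensive to compute passes the test, while a cheap node does not, which matches the earlier point that materializing and sharing a CSE is not necessarily cheaper.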
27
Alternatives to Greedy: Volcano-RU
A lightweight extension of Volcano
1. Batched queries optimized in sequence Q1, Q2, …, Qn
2. Find the best plan for query Qi given the best plans for queries Qj, j < i
3. Cost-based materialization of nodes in best plans of Qj, j < i
Plan quality sensitive to the query sequence
28
Experimental Results
TPCD-0.1 query batches
30
Features
Easily implemented
– First MQO implementation integrated with a state-of-the-art optimizer (as far as we know)
– Also partially prototyped on Microsoft SQL-Server
Support for index selection
– Index modeled as physical property (like “interesting order”)
Extensible and flexible
– New operators, data models
– Readily adapts to other problems
  – Query result caching
  – Materialized view selection/maintenance
31
Query Result Caching
P. Roy, K. Ramamritham, S. Seshadri, P. Shenoy and S. Sudarshan, Don't Trash Your Intermediate Results, Cache 'em, Submitted for publication
32
Problem Statement
Minimize the total execution time of an online workload by
– Caching intermediate/final results of individual queries, and
– Using these cached results to answer later queries
33
System Model
34
Contributions
Intermediate as well as final results cached
– Optimizer-driven cache management
– Adapts to workload changes
Cache-aware cost-based optimization
– Novel framework for cached result matching
35
Experimental Results
Overheads negligible
Performance on a 900-query, TPCD-1-based uniform cube-point workload
36
Materialized View Selection and Maintenance
Hoshi Mistry, Prasan Roy, K. Ramamritham and S. Sudarshan, Materialized View Selection and Maintenance Using Multi-Query Optimization, Submitted for publication
37
Problem Statement
Speed up maintenance of a set of materialized views by
– Exploiting CSEs between different view maintenance expressions
– Selecting additional views to be materialized
38
Contributions
Optimization of maintenance expressions
– Support for transiently materialized “delta” views
Nicely integrates transient vs permanent view materialization choices
39
Experimental Results
Overheads negligible
Performance benefit for maintenance of two TPCD-0.1-based SPJA views
40
Conclusion
MQO is practical
– Low overheads, high benefits
– Easily implemented and integrated
Leads to novel solutions to related problems
– Query result caching
– Materialized view selection and maintenance
41
Future Work
Further extensions of MQO
– Shared execution pipelines
Query result caching in presence of updates
Other problems
– Continuous queries, XML view caching, etc.
42
Other Contributions
Garbage Collection in Object-Oriented Databases
– Developed a “transaction-aware” cyclic reference counting algorithm
– Provided a formal proof of correctness
S. Ashwin, Prasan Roy, S. Seshadri, Avi Silberschatz and S. Sudarshan, Garbage Collection in Object-Oriented Databases Using Transactional Cyclic Reference Counting, VLDB 1997
Prasan Roy, S. Seshadri, Avi Silberschatz, S. Sudarshan and S. Ashwin, Garbage Collection in Object-Oriented Databases Using Transactional Cyclic Reference Counting, Invited Paper, VLDB Journal, August 1998