Published by Anderson Rennick. Modified over 10 years ago.
1
Robust query processing
Goetz Graefe, Christian König, Harumi Kuno, Volker Markl, Kai-Uwe Sattler
Dagstuhl – September 2010
2
April 13, 2015, Dagstuhl - Robust Query Processing
Max-diff histograms
Figure legend: True distribution; Average value; Equal width; Equal area; Max-diff; Equal height?
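The slide only names the histogram types, so the following is a minimal sketch under an assumption: it uses the common definition of a max-diff histogram, in which bucket boundaries are placed between the adjacent values whose frequencies differ most. The function name `maxdiff_buckets` and the `(value, frequency)` input layout are hypothetical and not taken from the slides.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (value, frequency); hypothetical input layout


def maxdiff_buckets(freqs: List[Pair], num_buckets: int) -> List[List[Pair]]:
    """Split (value, frequency) pairs, sorted by value, into buckets.

    Assumed definition: boundaries go between the adjacent pairs whose
    frequencies differ most.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    if not freqs:
        return []
    # Frequency difference between neighbors, tagged with the split position.
    diffs = [(abs(freqs[i + 1][1] - freqs[i][1]), i + 1)
             for i in range(len(freqs) - 1)]
    # Keep the (num_buckets - 1) largest differences; ties go to the left-most.
    largest = sorted(diffs, key=lambda d: (-d[0], d[1]))[: num_buckets - 1]
    cuts = sorted(pos for _, pos in largest)
    buckets, start = [], 0
    for cut in cuts + [len(freqs)]:
        buckets.append(freqs[start:cut])
        start = cut
    return buckets
```

With frequencies 10, 11, 50, 52, 9 and three buckets, the two largest jumps (11 to 50 and 52 to 9) become the boundaries, so values with similar frequencies share a bucket.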
3
Histograms with slope
Figure legend: True distribution; Average value; Linear regression; Max-diff with slope; Max-diff
4
Slope, patterns, extrapolation
5
Detecting query slowdown
6
External merge sort
Initial runs: size M, count N/M
Merge fan-in F = M − read-ahead buffers
Merge depth = merge levels = log_F(N/M)
Figure labels: Size = M; Fan-in = F; Size = F×M
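The quantities on this slide can be checked with a small calculation. This is a sketch under assumptions: N and M are measured in the same unit (pages here), each merge input needs one buffer page, and the merge depth is rounded up to whole levels. The function name `merge_plan` and its parameter names are hypothetical.

```python
from typing import Tuple


def merge_plan(n_pages: int, memory_pages: int,
               read_ahead_buffers: int = 0) -> Tuple[int, int, int]:
    """Return (initial run count, merge fan-in F, merge levels).

    Follows the slide: runs of size M, count N/M, F = M - read-ahead buffers,
    depth = log_F(N/M), here rounded up to whole merge levels.
    """
    fan_in = memory_pages - read_ahead_buffers
    if fan_in < 2:
        raise ValueError("fan-in must be at least 2 to make progress")
    runs = -(-n_pages // memory_pages)  # ceiling division
    levels, remaining = 0, runs
    # Each level merges up to fan_in runs into one; integer arithmetic
    # avoids floating-point error in the logarithm.
    while remaining > 1:
        remaining = -(-remaining // fan_in)
        levels += 1
    return runs, fan_in, levels
```

For example, `merge_plan(1_000_000, 100, 10)` gives 10,000 initial runs, a fan-in of 90, and 3 merge levels, since log_90(10,000) is slightly above 2.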
7
Hybrid hash join
Applies if M < N_1 ≤ F×M
  1 < N_1/M ≤ F
  0 < log_F(N_1/M) ≤ 1
Actual fan-out K: 1 < K ≤ F
Hash table + K output buffers
  (M−K) + (K×M) ≥ N_1
  K ≥ (N_1−M) / (M−1)
Fairly smooth cost function
  Eases query optimization
  Eases memory management
Figure labels: 1 … K
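The fan-out bound follows from rearranging (M−K) + (K×M) ≥ N_1 into K×(M−1) ≥ N_1−M. The sketch below computes the smallest integer K that satisfies it; it checks only the upper bound K ≤ F and the applicability condition M < N_1 ≤ F×M. The function name `hybrid_hash_fanout` and the page-based units are assumptions, not part of the slides.

```python
def hybrid_hash_fanout(build_pages: int, memory_pages: int, max_fanout: int) -> int:
    """Smallest K with (M - K) + K * M >= N_1, i.e. K >= (N_1 - M) / (M - 1).

    Returns 0 when the build input fits in memory (no partitioning needed).
    """
    if memory_pages < 2:
        raise ValueError("need at least 2 pages of memory")
    if build_pages <= memory_pages:
        return 0  # plain in-memory hash join; hybrid hash join does not apply
    if build_pages > max_fanout * memory_pages:
        raise ValueError("input exceeds F x M; one partitioning level is not enough")
    # Ceiling of (N_1 - M) / (M - 1) using integer arithmetic.
    k = -(-(build_pages - memory_pages) // (memory_pages - 1))
    if k > max_fanout:
        raise ValueError("required fan-out exceeds F")
    return k
```

With N_1 = 1000 and M = 100, the result is K = 10: nine partitions would cover only 91 + 900 = 991 pages, while ten cover 90 + 1000 = 1090.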
8
Merging vs. partitioning
Duality of sorting & hashing
9
Multiple optimization techniques are needed to find this plan
Join clause inferred between line item & part supply
Group-by list reduced by functional dependencies
Grouping (on alternative column) pushed down through join
“Interesting orderings” between scans, joins, grouping
10
Multiple optimization techniques in a hash-based plan
Same as previous example, plus
Integrated hash operation …
… within a hash team
Disk-order scans
11
Star joins: semi-join reduction
First, join each dimension table with an index of the fact table;
then, (hash-) intersect bookmark lists;
finally, fetch fact table rows
Also considered: Cartesian products of dimension tables
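The three steps on this slide can be illustrated with in-memory dictionaries standing in for tables and indexes. This is a sketch under assumptions: bookmarks are modeled as integer row identifiers, each fact-table index maps a foreign-key value to a set of bookmarks, and the dimension predicates have already been evaluated into lists of qualifying keys. The names `build_index` and `star_join` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set

Row = Dict[str, Hashable]


def build_index(fact_rows: Dict[int, Row], column: str) -> Dict[Hashable, Set[int]]:
    """Secondary index on one fact-table column: key value -> set of bookmarks."""
    index: Dict[Hashable, Set[int]] = defaultdict(set)
    for rid, row in fact_rows.items():
        index[row[column]].add(rid)
    return index


def star_join(fact_rows: Dict[int, Row],
              fact_indexes: Dict[str, Dict[Hashable, Set[int]]],
              dimension_keys: Dict[str, Iterable[Hashable]]) -> List[Row]:
    """Semi-join reduction: index joins, bookmark intersection, then fetch."""
    bookmark_lists = []
    for column, keys in dimension_keys.items():
        # Step 1: join the qualifying dimension keys with the fact-table index.
        index = fact_indexes[column]
        rids: Set[int] = set()
        for key in keys:
            rids |= index.get(key, set())
        bookmark_lists.append(rids)
    if not bookmark_lists:
        return [fact_rows[rid] for rid in sorted(fact_rows)]
    # Step 2: (hash-) intersect the bookmark lists.
    qualifying = set.intersection(*bookmark_lists)
    # Step 3: fetch only the fact rows that survived every dimension.
    return [fact_rows[rid] for rid in sorted(qualifying)]
```

The fetch in step 3 touches only rows that satisfy all dimension predicates, which is the point of reducing with the indexes first.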
12
Symmetric semi-join reduction
Query: Select … from T1 join T2 on T1.a = T2.a where …
Figure labels: Index T1 (a, s); Index T2 (a, s); Join “T1.a = T2.a”; Fetch using T1.s; Fields T1.s, T2.s; Fields T1.*, T2.s; Fields T2.a, T2.s; Fields T1.*, T2.*
13
Index-to-index navigation performance
Figure label: Trad. fetch
14
2-dimensional parameter space
15
Fast loads and fast queries
Figure axes: Query performance; Load bandwidth
Figure labels: Multiple indexes; No indexes or statistics; Zone maps; Partitioned B-trees; Zone filters; Zone indexes; ?
16
Traditional index choices
Don’t index. Scan for each query – no cost for index creation
Index creation before query processing
–Useful for predictable workloads
“Monitoring and tuning” wizard
–Extra effort, hard to predict
Figure labels: Scan; Index creation; Index searches; Adaptive Indexing; Index tuning
17
Adaptive merging in partitioned B-trees
Figure panels: run generation; merging; … after merging a-j
Figure labels: partitions #0 to #4; keys a to z
18
Adaptive merging vs database cracking
Figure legend: Database cracking; Improved cracking; Adaptive merging
19
Tree of losers
Traditional priority queue
–Enter and exit at root
–2 log_2 M comparisons
Tree of winners
–Enter at leaf, exit at root
–log_2 M comparisons
–Specific entry points
–Duplicate entries
–M/2 entries
Tree of losers
–Enter at leaf, exit at root
–No duplicates, M entries
Figure: tree of losers over eight runs with current keys 0: F, 1: G, 2: E, 3: D, 4: A, 5: D, 6: C, 7: B, stored in array slots 0 to 7; node labels include “Run 4: key A” and “Run 3: key D”
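A tree of losers can be sketched as a k-way merge over sorted runs. This is an illustrative sketch, not the implementation behind the slides: it assumes a textbook array layout (slot 0 holds the overall winner, slots 1 to k−1 hold the loser of each match, and run i enters at the implicit leaf position k + i), and the name `loser_tree_merge` is hypothetical.

```python
from typing import Any, Iterable, Iterator, List

_DONE = object()  # sentinel marking an exhausted run


def loser_tree_merge(runs: List[Iterable[Any]]) -> Iterator[Any]:
    """Merge sorted runs using a tree of losers (one comparison per tree level)."""
    iters = [iter(r) for r in runs]
    k = len(iters)
    if k == 0:
        return
    heads = [next(it, _DONE) for it in iters]

    def beats(a: int, b: int) -> bool:
        """True if run a's current key should be output before run b's."""
        if heads[a] is _DONE:
            return False
        if heads[b] is _DONE:
            return True
        return (heads[a], a) < (heads[b], b)  # ties go to the lower run number

    # Build bottom-up: winners move up, losers stay in the internal nodes.
    tree = [0] * k
    winner = [0] * (2 * k)
    for node in range(2 * k - 1, 0, -1):
        if node >= k:
            winner[node] = node - k  # leaf: the run itself
        else:
            left, right = winner[2 * node], winner[2 * node + 1]
            if beats(left, right):
                winner[node], tree[node] = left, right
            else:
                winner[node], tree[node] = right, left
    tree[0] = winner[1]

    while heads[tree[0]] is not _DONE:
        run = tree[0]
        yield heads[run]
        heads[run] = next(iters[run], _DONE)
        # Enter at the leaf and replay only the path to the root.
        candidate, node = run, (run + k) // 2
        while node >= 1:
            if beats(tree[node], candidate):
                tree[node], candidate = candidate, tree[node]
            node //= 2
        tree[0] = candidate
```

Using the eight keys from the figure as single-key runs, `"".join(loser_tree_merge(["F", "G", "E", "D", "A", "D", "C", "B"]))` produces the keys in sorted order, starting with A from run 4.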
20
Graceful degradation
Exploit large memory
–Even during small merge
–Merge from memory
Smooth transition
–Run generation to merging
Continuous cost function
–Effect of hybrid hash join
–2 × 6 GB ÷ 100 MB/s = 120 sec = 2 min
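The closing arithmetic on this slide can be reproduced in a few lines. This sketch assumes decimal units (1 GB = 1000 MB, which the slide's result implies) and reads the factor 2 as two passes over the data, such as one write and one read; the slide does not state what the factor stands for. The function name `spill_seconds` is hypothetical.

```python
def spill_seconds(data_gb: float, bandwidth_mb_per_s: float, passes: int = 2) -> float:
    """Seconds needed to move data_gb through a device `passes` times."""
    megabytes = data_gb * 1000  # assumption: decimal units, 1 GB = 1000 MB
    return passes * megabytes / bandwidth_mb_per_s
```

`spill_seconds(6, 100)` evaluates 2 × 6000 MB ÷ 100 MB/s and returns 120.0 seconds, matching the 2 minutes on the slide.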
21
Graceful degradation in memory hierarchy
Figure labels: Output; Main memory; Flash memory; A few runs on disk; Rotating disk drive; Run in memory; A few runs on flash; Buffer for large disk pages; High fan-in merge
22
SQL Server lock modes
23
Optimal B-tree node sizes in 1997
24
Hilbert space-filling curve
25
Automatic Tuning: Relaxation-based
Nicolas Bruno and Surajit Chaudhuri, Automatic Physical Database Tuning: A Relaxation-based Approach, in Proceedings of the ACM International Conference on Management of Data (SIGMOD), Association for Computing Machinery, Inc., 2005.
26
Self-Tuning DB: AutoAdmin
Sanjay Agrawal, Nicolas Bruno, Surajit Chaudhuri, and Vivek Narasayya, AutoAdmin: Self-Tuning Database Systems Technology, in Data Engineering Bulletin, IEEE Computer Society, 2006.
27
Continuous Monitoring: SQLCM
Surajit Chaudhuri, Arnd Christian König, and Vivek Narasayya, SQLCM: A Continuous Monitoring Framework for Relational Database Engines, in ICDE 2004.