Download presentation
Presentation is loading. Please wait.
Published by裙 蔺 Modified over 8 years ago
1
Database cracking Stratos Idreos, Martin Kersten and Stefan Manegold
CWI Amsterdam, The Netherlands
2
Outline Motivation A cracking architecture (so far...) Experiments
Motivation A cracking architecture (so far...) Experiments Conclusions and future work
3
Motivation Create a database with self-* properties
Create a database with self-* properties Continuously adapt the way: Data is stored Queries are answered Without external administration Handle huge data sets, e.g., scientific databases We will present database cracking in the context of column oriented databases as a promising direction to handle these challenges
4
Cracking a database Each query triggers physical reorganization of the database A query select A < v will crack at most one piece A query select v1 < A < v2 will crack at most two pieces select A>=8 select A<=3 3 3 3 <= 3 8 6 2 < 8 6 2 6 > 3 2 4 4 12 13 13 13 8 >= 8 8 >= 8 4 12 12
5
A promissing direction
We can adapt to query workload No external administration Do just enough Handle big data sets
6
Cracking vs Indices and Sorting
How cracking compares to a sorting strategy? Sort data upfront and then use binary search For a sorting strategy we have to make an investment upfront Sorting needs prior knowledge of query workload Similar arguments stand for indices.
7
Design The first time a range query is posed on an attribute A, a cracking database makes a copy of column A. We call this copy the cracker column of A, denoted as Acrk Acrk will be continuously physically reorganized based on queries that need to touch attribute Definition. Cracking based on a query q on an attribute A is the act of physically reorganizing Acrk in such a way that the values that satisfy q are stored in a contiguous space For each cracker column, there is a cracker index
8
The cracker select operator
The simple select operator: Scans the column Return a new column that contains qualifying values The crackers select operator: Searches the cracker index Physically reorganizes found pieces Update the cracker index Return a slice of the cracker column as result More steps but faster because we touch less data
9
Impact on query plan Becomes expensive
select R.c from R where 5 < R.a < 10 and 9 < R.b < 20 Ra1 Rb1 crackers.select(Ra, 5, 10) crackers.select(Rb, 9, 20) algebra.select(Ra, 5, 10) algebra.select(Rb, 9, 20) algebra.joinselect(Ra1, Rb,9,20) algebra.OIDintersect(Ra1, Rb1) Becomes expensive Ra2 algebra.fetch(Ra2, Rc)
10
Testing the select operator
11
TPC-H query 6
12
Future work Optimization
Optimization optimal piece size / granularity / index (avl) depth Exploit cracking for join queries, aggregate queries etc. Concurency issues
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.