MonetDB, Cracking and Recycling. Martin Kersten, CWI Amsterdam, 2008.

1 M.Kersten 2008. MonetDB, Cracking and recycling. Martin Kersten, CWI Amsterdam.

2 Try to maximize performance. Past, Present, Potency. Cracking; B-tree and hash indices; materialized views.

3 Indices in database systems rest on two assumptions: all tuples are equally important for fast retrieval, and there are ample resources to maintain the indices. MonetDB instead cracks the database into pieces based on the actual query load. Find a trusted fortune teller.

4 Cracking algorithms. Physical reorganization happens per column, based on selection predicates. Split a piece of a column into two new pieces: A<10 and A>=10.

5 Cracking algorithms. Physical reorganization happens per column. Split a piece of a column into two new pieces (A<10, A>=10) or into three new pieces (A<=5, 5<A<10, A>=10).
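
The two splitting primitives can be sketched as in-place partitioning. This is a minimal illustration, not MonetDB's implementation: the function names `crack_in_two` and `crack_in_three` are hypothetical, the bounds are treated as half-open, and values are assumed to be integers so that "A>5" can be written as "A>=6".

```python
def crack_in_two(col, lo, hi, pivot):
    """Reorder col[lo:hi] in place so that values < pivot come first.

    Returns the position of the first value >= pivot.
    """
    i, j = lo, hi - 1
    while i <= j:
        if col[i] < pivot:
            i += 1
        else:
            # Move the non-qualifying value to the back of the piece.
            col[i], col[j] = col[j], col[i]
            j -= 1
    return i


def crack_in_three(col, lo, hi, low, high):
    """Split col[lo:hi] into A<low | low<=A<high | A>=high.

    Implemented as two successive two-way cracks (an assumption made
    for brevity; a single-pass variant is also possible).
    """
    mid = crack_in_two(col, lo, hi, low)
    end = crack_in_two(col, mid, hi, high)
    return mid, end


# Values taken from the cracking example slides; select A>5 and A<10.
column = [3, 8, 6, 2, 15, 13, 4, 17, 12]
mid, end = crack_in_three(column, 0, len(column), 6, 10)
print(column[:mid], column[mid:end], column[end:])
```

The order of values inside each piece depends on the partitioning scheme; only the piece boundaries are guaranteed.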

6-23 Cracking example. Column A holds the values 3 8 6 2 15 13 4 17 12 (slide 6 lists them as 3 8 6 2 12 13 4 17 15). Query: select A>5 and A<10. Over the animation frames the cracker column is split step by step into three pieces: <=5, >5 and <10, and >=10.

24 Cracking example. The pieces left behind (<=5, >5, >=10) improve data access for future queries.

25-43 Cracking example. Second query: select A>3 and A<14. The pieces left by the first query are cracked further. The bound 3 falls in the piece <=5 and the bound 14 falls in the piece >=10, so only those two pieces are split. The result is five pieces: <=3, >3, >5, >=10, >=14.

44 Cracking example. The more we crack, the more we learn.

45 Design. The first time a range query is posed on an attribute A, a cracking DBMS makes a copy of column A, called the cracker column of A. A cracker column is continuously physically reorganized based on the queries that touch the attribute, so that each result lies in a contiguous space. For each cracker column there is a cracker index.
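
A rough sketch of how a cracker column and its cracker index could work together. The class `CrackerColumn` and its fields are hypothetical, the index is kept as two parallel sorted lists rather than a tree, and range bounds are half-open (`low <= A < high`), so the slide's integer query "A>5 and A<10" becomes `select(6, 10)`.

```python
import bisect


class CrackerColumn:
    def __init__(self, base):
        self.col = list(base)  # copy of the base column, made on first use
        self.bounds = []       # cracker index: sorted pivot values
        self.starts = []       # starts[k] = first position with value >= bounds[k]

    def _crack(self, pivot):
        k = bisect.bisect_left(self.bounds, pivot)
        if k < len(self.bounds) and self.bounds[k] == pivot:
            return self.starts[k]  # already cracked on this value
        # Only the single piece that contains the pivot is reorganized.
        lo = self.starts[k - 1] if k > 0 else 0
        hi = self.starts[k] if k < len(self.bounds) else len(self.col)
        i, j = lo, hi - 1
        while i <= j:
            if self.col[i] < pivot:
                i += 1
            else:
                self.col[i], self.col[j] = self.col[j], self.col[i]
                j -= 1
        self.bounds.insert(k, pivot)
        self.starts.insert(k, i)
        return i

    def select(self, low, high):
        """Return the values with low <= A < high as one contiguous slice."""
        a = self._crack(low)
        b = self._crack(high)
        return self.col[a:b]


cc = CrackerColumn([3, 8, 6, 2, 15, 13, 4, 17, 12])
first = cc.select(6, 10)    # select A>5 and A<10
second = cc.select(4, 14)   # select A>3 and A<14
print(first, second, cc.bounds)
```

After the two queries the index holds four bounds, which corresponds to the five pieces shown at the end of the example.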

46 A simple range query. Try to avoid useless investments.

47 TPC-H query 6. Try to avoid useless investments.

48 Cracking is easy in a column store and is part of the critical execution path. Cracking works under high-volume updates. Try to avoid useless investments.

49 Updates. Base columns are updated as usual. We also need to update the cracker column and the cracker index, efficiently and while maintaining the self-organization properties. Two issues: when and how.

50 When to propagate updates in cracking. Follow the workload to maintain self-organization: updates become part of query processing. When an update arrives, it is not applied immediately. For each cracker column there is a pending insertions column and a pending deletions column. Pending updates are applied only when a query needs the specific values.

51 Update-aware select. We extended the cracker select operator to apply the needed updates before cracking. The select operator:
1. Search the pending insertions column
2. Search the pending deletions column
3. If Step 1 or 2 finds tuples, run an update algorithm
4. Search the cracker index
5. Physically reorganize the cracker column
6. Update the cracker index
7. Return a slice of the cracker column
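
The deferred-update part of these steps (Steps 1 to 3) can be sketched as follows. This is a simplified illustration under stated assumptions: the class `LazyColumn` is hypothetical, Steps 4 to 6 are replaced by a plain scan instead of real cracking, deleted values are assumed to exist in the column, and the range is half-open.

```python
class LazyColumn:
    def __init__(self, values):
        self.col = list(values)
        self.pending_ins = []   # pending insertions column
        self.pending_del = []   # pending deletions column

    def insert(self, v):
        self.pending_ins.append(v)   # not applied on arrival

    def delete(self, v):
        self.pending_del.append(v)   # not applied on arrival

    def select(self, low, high):
        def in_range(v):
            return low <= v < high

        # Steps 1-2: search both pending columns for qualifying values.
        ins = [v for v in self.pending_ins if in_range(v)]
        dels = [v for v in self.pending_del if in_range(v)]
        # Step 3: apply only the updates this query needs.
        if ins or dels:
            self.pending_ins = [v for v in self.pending_ins if not in_range(v)]
            self.pending_del = [v for v in self.pending_del if not in_range(v)]
            self.col.extend(ins)
            for v in dels:
                self.col.remove(v)
        # Steps 4-7 simplified: a scan stands in for index lookup and cracking.
        return [v for v in self.col if in_range(v)]


lazy = LazyColumn([7, 2, 10, 29, 25, 31, 57, 42, 53])
for v in (5, 9, 16, 35):
    lazy.insert(v)
result = lazy.select(7, 15)
print(result, lazy.pending_ins)
```

Only the value 9 is merged by this query; 5, 16 and 35 stay pending until a later query asks for them.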

52 Merging. Cracker column: 7 2 10 29 25 31 57 42 53. Cracker index: start position 1, values >1; start position 4, values >12; start position 7, values >35. Insert a new tuple with value 9. The new tuple belongs to the first piece (shown in blue on the slide).

53 Merging. With the new tuple in place the cracker index reads: start position 1, values >1; start position 5, values >12; start position 8, values >35. Pieces in the cracker column are ordered; tuples inside a piece are not ordered. Shifting is not a viable solution.

54 Merging by Hopping. Insert a new tuple with value 9. We need to make enough room to fit the new tuples. In the frame shown, 57 has left the front of the last piece (the column reads 7 2 10 29 25 31 42 53, with 57 and 9 drawn separately) and the index reads: start position 1, values >1; start position 4, values >12; start position 8, values >35.
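
Because tuples inside a piece are unordered, room can be made by moving one tuple per piece instead of shifting everything. The sketch below is one plausible reading of the hopping idea, not the authors' code; `hop_insert` is a hypothetical name and positions are 0-based, whereas the slides count from 1.

```python
def hop_insert(col, starts, piece, value):
    """Insert value at the end of piece number `piece`.

    col    : the cracker column (list)
    starts : 0-based start position of every piece, ascending
    One tuple per later piece hops from the front of its piece to the
    free slot just behind it, so the hole travels towards the target.
    """
    col.append(None)              # free slot at the very end
    hole = len(col) - 1
    for p in range(len(starts) - 1, piece, -1):
        col[hole] = col[starts[p]]    # first tuple of piece p hops back
        hole = starts[p]
        starts[p] += 1                # piece p now starts one slot later
    col[hole] = value


col = [7, 2, 10, 29, 25, 31, 57, 42, 53]
starts = [0, 3, 6]                # slides: start positions 1, 4, 7
hop_insert(col, starts, 0, 9)
print(col, starts)
```

In this run two tuples move (57 and 29) instead of six, and the resulting start positions correspond to 1, 5 and 8 in the slides' 1-based numbering.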

55 Merge Gradually. A query merges only the qualifying values, i.e., only the values that it needs for a correct and complete result. We avoid the large peaks, but the average cost increases significantly. Chart panels: Merge Completely, Merge Gradually.

56 The Ripple. Touch only the pieces that are relevant for the current query.

57-67 The Ripple. Cracker column: 7 2 10 29 25 31 57 42 53. Cracker index: start position 1, values >1; start position 4, values >22; start position 7, values >35. Query: select 7<=A<15. Pending insertions: 5 9 16 35. Principles introduced over the animation frames: touch only the pieces that are relevant for the current query; avoid shifting down non-interesting pieces; immediately make room for the new tuples.

68-70 The Ripple. In the final frames 29 is taken out of the column (remaining values 7 2 10 25 31 57 42 53) and drawn next to the pending insertions. The piece with values >22 then starts at position 5, while the piece with values >35 still starts at position 7.
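
One way to read these frames is that the incoming tuple takes the slot of the first tuple of the next piece, and the displaced tuple joins the pending insertions instead of being shifted down through the rest of the column. The sketch below encodes that reading as an assumption; `ripple_insert` is a hypothetical name, positions are 0-based, and the piece after the target is assumed to be non-empty.

```python
def ripple_insert(col, starts, piece, value, pending):
    """Merge `value` into piece number `piece`, touching only its neighbour.

    Assumption: the displaced tuple is parked in `pending` rather than
    rippled through all later pieces.
    """
    nxt = piece + 1
    if nxt == len(starts):
        col.append(value)          # target is the last piece: just append
        return
    pos = starts[nxt]
    pending.append(col[pos])       # displaced tuple waits for a later query
    col[pos] = value
    starts[nxt] += 1               # the neighbour piece shrinks by one slot


col = [7, 2, 10, 29, 25, 31, 57, 42, 53]
starts = [0, 3, 6]                 # slides: start positions 1, 4, 7
pending = [5, 9, 16, 35]

# The query 7 <= A < 15 needs only the pending value 9.
pending.remove(9)
ripple_insert(col, starts, 0, 9, pending)
print(col, starts, pending)
```

The last piece is never touched, which matches the unchanged start position 7 for values >35 in the final frames.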

71-72 The Ripple. Maintain high performance through the whole query sequence in a self-organizing way. Chart panels: Merge Gradually, Merge Completely, Merge Ripple.

73 Recycling intermediates

74 An Architecture for Recycling Intermediates. M. Ivanova, M. L. Kersten, N. Nes, R. Goncalves. SIGMOD'09, Providence, RI, 30/06/2009. MonetDB Background: operator-at-a-time execution paradigm; canonical implementation of a column-store; reduced dimensionality; finer granularity; simplified overlap analysis; recycler extension of the MonetDB engine.

75 MonetDB Architecture. Diagram labels: Run-time Support, Recycler Optimizer, SQL, XQuery, MonetDB Server, Tactical Optimizer, MonetDB Kernel, MAL, Recycle Pool, Admission & Eviction. Example MAL plan shown on the slide:
function user.s1_2(A0:date,...):void;
    X5 := sql.bind("sys","lineitem",...);
    X10 := algebra.select(X5,A0);
    X12 := sql.bindIdx("sys","lineitem",...);
    X15 := algebra.join(X10,X12);
    X25 := mtime.addmonths(A1,A2);
    ...

76 Instruction Matching. Run-time comparison of instruction types and argument values. Exact matching: the instruction Y3 := sql.bind("sys","orders","o_orderdate",0) matches the pooled instruction X1 := sql.bind("sys","orders","o_orderdate",0).
Name   Value      Data type          Size
X1     10         :bat[:oid,:date]
T1     "sys"      :str
T2     "orders"   :str
…
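
Exact matching can be pictured as a lookup keyed on the instruction's operation and argument values. The sketch below is illustrative only: `RecyclePool`, `execute` and the stand-in `bind` function are hypothetical names, not MonetDB APIs.

```python
class RecyclePool:
    def __init__(self):
        self.entries = {}   # (operation, argument values) -> stored result
        self.hits = 0

    def execute(self, op, args, compute):
        key = (op, tuple(args))
        if key in self.entries:        # same instruction type, same arguments
            self.hits += 1
            return self.entries[key]   # reuse the stored intermediate
        result = compute(*args)
        self.entries[key] = result
        return result


calls = []


def bind(schema, table, column, access):
    # Stand-in for the real column access; records that it was executed.
    calls.append((schema, table, column, access))
    return [schema, table, column]


pool = RecyclePool()
x1 = pool.execute("sql.bind", ("sys", "orders", "o_orderdate", 0), bind)
y3 = pool.execute("sql.bind", ("sys", "orders", "o_orderdate", 0), bind)
print(len(calls), pool.hits)
```

The second instruction is answered from the pool, so the underlying operator runs only once.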

77 Instruction Subsumption. The pool holds X3 := algebra.select(X1,10,80) and X5 := algebra.select(X1,20,60). The instruction Y3 := algebra.select(X1,20,45) is covered by both ranges; the slide highlights X5, the smaller of the two intermediates.
Name   Value   Data type         Size
X1     10      :bat[:oid,:int]   2000
X3     130     :bat[:oid,:int]   700
X5     150     :bat[:oid,:int]   350
…
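
A minimal sketch of range subsumption, assuming inclusive bounds and a pool keyed on (source, low, high). The function name `subsumed_select` and the pool layout are hypothetical, and the data below is synthetic, so the sizes differ from the table on the slide.

```python
def subsumed_select(pool, base, source, lo, hi):
    """Answer select(source, lo, hi) from the smallest covering intermediate.

    Returns the result and the number of values that had to be scanned.
    """
    covering = [vals for (src, c_lo, c_hi), vals in pool.items()
                if src == source and c_lo <= lo and hi <= c_hi]
    # Fall back to the base column when no pooled range covers the request.
    inp = min(covering, key=len) if covering else base
    result = [v for v in inp if lo <= v <= hi]
    pool[(source, lo, hi)] = result
    return result, len(inp)


base = list(range(200))   # synthetic stand-in for column X1
pool = {}
_, scanned_x3 = subsumed_select(pool, base, "X1", 10, 80)
_, scanned_x5 = subsumed_select(pool, base, "X1", 20, 60)
y3, scanned_y3 = subsumed_select(pool, base, "X1", 20, 45)
print(scanned_x3, scanned_x5, scanned_y3)
```

Each later selection scans a smaller input than the one before it, which is the saving that subsumption aims for.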

78 Recycle Pool: a Cache with Lineage. Query Q1 contains an algebra.join over algebra.select on sql.bind("C1") and sql.bind("C2"). Its instructions enter the pool as X1 := sql.bind("C1"); X2 := algebra.select(X1); X3 := sql.bind("C2"); X4 := algebra.join(X2,X3).

79 Recycle Pool: a Cache with Lineage. Query Q2 contains the same subplan plus a further algebra.join with sql.bind("C3"). The slide lists the pooled entries X1, X2, X3, X4 next to Q2's plan.

80 Mismatching. Q2's instructions Y1 := sql.bind("C1"); Y2 := algebra.select(Y1); Y3 := sql.bind("C2"); Y4 := algebra.join(Y2,Y3) are compared with the pooled X1 to X4. The slide marks the mismatches !=X2 and !=X3.

81 Admission Policies. Decide about storing the results. KEEPALL: all instructions advised by the optimizer. CREDIT: instructions are supplied with credits; storage is 'paid' with 1 credit; reuse returns credits; lack of reuse limits admission and resource claims.
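
The CREDIT idea can be sketched as a per-instruction counter. Several details here are assumptions rather than facts from the slides: the class name `CreditAdmission`, the initial credit count, and the rule that a stored entry refunds its credit once, on first reuse.

```python
class CreditAdmission:
    def __init__(self, initial_credits=3):
        self.initial = initial_credits   # hypothetical starting budget
        self.credits = {}                # instruction id -> credits left
        self.refunded = set()            # entries that already paid back

    def admit(self, instr):
        """Pay 1 credit to store a result; refuse when none are left."""
        left = self.credits.setdefault(instr, self.initial)
        if left == 0:
            return False
        self.credits[instr] = left - 1
        return True

    def reused(self, instr, entry_id):
        """A reused entry returns its credit (assumed: once per entry).

        Expected to be called only for entries admitted via admit().
        """
        if entry_id not in self.refunded:
            self.refunded.add(entry_id)
            self.credits[instr] += 1


policy = CreditAdmission(initial_credits=2)
decisions = [policy.admit("q1.select") for _ in range(3)]
policy.reused("q1.select", "entry-1")
after_reuse = policy.admit("q1.select")
print(decisions, after_reuse)
```

An instruction whose results are never reused runs out of credits and stops claiming pool space, while one that is reused keeps earning admission.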

82 Cache Policies. Decide about eviction of intermediates. Filter the 'top' instructions without dependents, then pick the instructions with the smallest utility. LRU: time of computation or last reuse. BENEFIT: estimated contribution to performance (CPU and I/O costs, recycling). Eviction is triggered by resource limitations (memory or entries).
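
The two-stage eviction choice, with LRU as the utility, might look like the sketch below. The entry layout (`deps`, `last_used`) and the function name `pick_victim` are hypothetical; the lineage mirrors the X1 to X4 example from the recycle pool slides, with made-up timestamps.

```python
def pick_victim(entries):
    """Choose an entry to evict.

    entries: name -> {"deps": set of input names, "last_used": int}
    Stage 1 keeps only entries that no other entry depends on.
    Stage 2 picks the least recently used among them.
    """
    used_as_input = set()
    for entry in entries.values():
        used_as_input |= entry["deps"]
    leaves = [name for name in entries if name not in used_as_input]
    return min(leaves, key=lambda name: entries[name]["last_used"])


entries = {
    "X1": {"deps": set(), "last_used": 1},          # sql.bind("C1")
    "X2": {"deps": {"X1"}, "last_used": 2},         # algebra.select(X1)
    "X3": {"deps": set(), "last_used": 3},          # sql.bind("C2")
    "X4": {"deps": {"X2", "X3"}, "last_used": 4},   # algebra.join(X2,X3)
}
first_victim = pick_victim(entries)
del entries[first_victim]
second_victim = pick_victim(entries)
print(first_victim, second_victim)
```

Restricting eviction to entries without dependents keeps the lineage in the pool intact: an intermediate is never removed while a stored result still refers to it.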

83 TPC-H Evaluation. SF1. Baseline performance. Impact of design choices: admission policies, cache policies.

84 CREDIT Admission Impact. Chart panels: reused memory, reused entries. Hit ratio relative to KeepAll.

85 Cache Policies Evaluation. 200 TPC-H queries. RP, Total, Reuse: Memory 4GB 42.7%; Entries 521928%.

86 SkyServer Evaluation. 100 GB subset of DR4. 100-query batch from the January 2008 log. 1.5 GB of intermediates, 99% reuse. Join intermediates are the major contributor to savings.

87 Summary. Database architecture augmented with recycling of intermediates. Significant performance benefits in SkyServer and TPC-H. Self-organizing technique. Extension to MonetDB that transforms materialization overhead into a benefit.

88 Future Work. Refining and developing admission and cache policies. Opportunities by query class recognition. Automatic switch to suitable policies. Application to pipelined architectures.

89 Recycling Is Green

