1
Cache Conscious Indexing for Decision-Support in Main Memory
Pradip Dhara
2
Why in-memory databases: telecommunications; CAD tools; Moore’s law will allow us to store relations in memory.
3
Redesigning DBMSs: optimize memory-CPU performance rather than disk-memory performance. Re-evaluate the space/time tradeoff, since space isn’t cheap. Given a certain space requirement, optimize response time for lookups.
4
Indices in in-memory DBMSs: little extra space vs. increased performance. Index design takes on new dimensions when looking at in-memory databases. Space overhead cannot be ignored, so hash tables are unacceptable.
5
Hardware solutions: caches. There is a growing disparity between CPU performance and memory performance. Cache misses can’t be overlapped.
6
Solution: CSS-tree indices exploit cache behavior to get improved performance.
7
Direct Mapped Cache
8
Fully Associative Cache
9
2-Way Set Associative Cache
10
Binary search on sorted array: store the relation in sorted order on a key. Cache performance depends on tuple size. (Figure: sorted array with slots 1 to 14.)
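A minimal sketch of the lookup this slide describes, assuming the keys sit in a plain Python list. The function name `binary_search` and the `probes` list are illustrative additions, not from the slides. Recording the probed positions shows why cache behavior suffers: early probes land far apart in the array.

```python
def binary_search(keys, target):
    """Return (position, probes) for target in the sorted list keys.

    position is -1 when target is absent; probes lists every array
    index that was inspected, in order.
    """
    lo, hi = 0, len(keys) - 1
    probes = []
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(mid)
        if keys[mid] == target:
            return mid, probes
        if keys[mid] < target:
            lo = mid + 1      # continue in the upper half
        else:
            hi = mid - 1      # continue in the lower half
    return -1, probes


if __name__ == "__main__":
    keys = list(range(1, 15))          # the 14 slots drawn on the slide
    print(binary_search(keys, 11))
```

With larger tuples fewer keys share a cache line, so each probe is more likely to touch a different line.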
11
T-trees. (Figure: tree nodes holding key and pointer-to-record entries; the nodes shown begin 4,* 8,* …, 0,* 3,* … and 10,* 16,* ….)
12
Enhanced B+ trees. (Figure: leaf entries of key and pointer pairs for keys 1 to 20, under an internal node with keys 5, 9, 13, 17.)
13
Hash indices: put however many pairs fit into a cache line. (Figure: directory with slots 000 to 111; a bucket holds entries such as 0,* 8,* 80,* ….)
14
Idea behind CSS-trees: save space by not storing pointers. Use an array as a tree, and implicitly store pointers as offsets into the array.
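A small sketch of what "pointers as offsets" can look like. It assumes nodes are numbered level by level starting from 0 at the root, each node holds `m` keys, and each node has `m + 1` children stored contiguously. The helper names `child_range`, `parent`, and `key_offset` are hypothetical.

```python
def child_range(b, m):
    """Inclusive (first, last) node numbers of the children of node b."""
    first = b * (m + 1) + 1
    return first, first + m


def parent(c, m):
    """Node number of the parent of node c (c must not be the root)."""
    return (c - 1) // (m + 1)


def key_offset(node, m):
    """Offset of the first key of a node inside the flat key array."""
    return node * m


if __name__ == "__main__":
    m = 2
    for b in range(2):
        print(b, child_range(b, m))
```

No child pointer is stored anywhere: the position of a child follows from the parent's node number alone.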
15
Useful formulas for CSS-trees: children of a node $b$ are nodes $b(m+1)+1$ to $b(m+1)+(m+1)$ (EQ 1); $N = n/m$ (EQ 2), where $n$ = # of elements, $m$ = # of elements per node, $N$ = # of leaf nodes; # of internal nodes (EQ 3); first leaf node in bottom level (EQ 4).
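The expressions for the number of internal nodes and the first bottom-level leaf did not survive in the slide text. The following is a reconstruction, offered as an assumption rather than the slide's own wording. It was chosen because it reproduces the worked-example values on the next slide ($m = 2$, $N = 5$ giving 2 internal nodes and first bottom-level leaf 4).

```latex
\begin{align*}
k &= \lceil \log_{m+1} N \rceil \\
\text{first leaf node in bottom level} &= \frac{(m+1)^k - 1}{m} \\
\text{\# of internal nodes} &= \frac{(m+1)^k - 1}{m}
    - \left\lfloor \frac{(m+1)^k - N}{m} \right\rfloor
\end{align*}
```

For $m = 2$, $N = 5$: $k = 2$, the first bottom-level leaf is $(9-1)/2 = 4$, and the number of internal nodes is $4 - \lfloor 4/2 \rfloor = 2$.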
16
How it works. (Figure: a sorted array of keys 1 to 10, the CSS-tree array (directory), and the full CSS-tree drawn with its internal nodes and leaf nodes.) Values (Lemma 4.1): $m$ (# keys per node) = 2; $n$ (# keys) = 10; $N$ (# of leaf nodes) = 5; $k = \lceil \log_{m+1} N \rceil$ = 2; internal nodes = 2; first leaf node in bottom level = 4.
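The values on this slide can be checked numerically. The sketch below assumes the layout expressions first leaf = ((m+1)^k - 1)/m and internal nodes = first leaf - floor(((m+1)^k - N)/m). These are a reconstruction that matches the slide's numbers, not text taken from the slide. The function name `css_layout` is hypothetical.

```python
def css_layout(n, m):
    """Return (N, k, internal_nodes, first_bottom_leaf) for n keys, m keys per node.

    The two layout expressions are assumed (see the lead-in); integer
    arithmetic is used throughout to avoid floating-point logarithms.
    """
    N = -(-n // m)                 # ceil(n / m) leaf nodes
    k, capacity = 0, 1             # smallest k with (m + 1) ** k >= N
    while capacity < N:
        capacity *= m + 1
        k += 1
    first_bottom_leaf = (capacity - 1) // m
    internal_nodes = first_bottom_leaf - (capacity - N) // m
    return N, k, internal_nodes, first_bottom_leaf


if __name__ == "__main__":
    print(css_layout(10, 2))       # the slide's example: n = 10, m = 2
```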
17
Building a full CSS-tree
18
Searching within a node. (Figure: a node with key slots 1 to 8.)
19
Level CSS-trees: $m = 2^t$; entries per node = $m - 1$. (Figure: a node with key slots 1 to 7, annotated "value of largest key in subtree".)
20
Level vs. full CSS-trees. Level CSS-trees will be deeper due to the difference in branching factor. Level CSS-trees have fewer comparisons per node. Level CSS-trees have more cache accesses and node traversals. Comparisons (level vs. full): $\log_2 N$ vs. $\log_2 N \cdot \log_{m+1} m \cdot (1 + 2/(m+1))$. Node traversals (level vs. full): $\log_m N$ vs. $\log_{m+1} N$.
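To see the tradeoff with concrete numbers, the two pairs of expressions on this slide can be evaluated directly. This sketch only plugs values into those expressions. The function names and the sample values of `N` and `m` are illustrative assumptions.

```python
import math


def comparisons(N, m):
    """(level, full) comparison counts from the slide's expressions."""
    level = math.log2(N)
    full = math.log2(N) * math.log(m, m + 1) * (1 + 2 / (m + 1))
    return level, full


def node_traversals(N, m):
    """(level, full) node traversals: log_m N vs. log_{m+1} N."""
    return math.log(N, m), math.log(N, m + 1)


if __name__ == "__main__":
    N, m = 10**6, 8                # sample values, chosen for illustration
    print("comparisons:", comparisons(N, m))
    print("traversals: ", node_traversals(N, m))
```

For these sample values the level tree does fewer comparisons but slightly more node traversals, which is the tradeoff the slide states.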
21
Time analysis. Parameters: $R$ (size of rid) = 4 bytes; $K$ (size of key) = 4 bytes; $P$ (size of pointer) = 4 bytes; $h$ = 1.2; $n$ (# records) = $10^7$; $c$ (cache line) = 32 bytes; $s$ (node size/$c$) = 1, where $s = mK/c$. $D$ = time to dereference a pointer; $A_b$ = time to compute child address for binary search; $A_{fcss}$ = time to compute child address for full CSS; $A_{lcss}$ = time to compute child address for level CSS.
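As a quick arithmetic check, the listed parameters fix the number of keys per node through $s = mK/c$. The sketch below solves for $m$ and then counts leaf nodes, taking $N = n/m$ as in the worked example ($n = 10$, $m = 2$, $N = 5$). The helper names are hypothetical.

```python
def keys_per_node(s, c, K):
    """Solve s = m * K / c for m (s cache lines per node, K bytes per key)."""
    return (s * c) // K


def leaf_nodes(n, m):
    """Number of leaf nodes needed for n keys at m keys per node (rounded up)."""
    return -(-n // m)


if __name__ == "__main__":
    m = keys_per_node(s=1, c=32, K=4)
    print(m, leaf_nodes(10**7, m))
```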
22
Space analysis. Parameters: $R$ (size of rid) = 4 bytes; $K$ (size of key) = 4 bytes; $P$ (size of pointer) = 4 bytes; $h$ = 1.2; $n$ (# records) = $10^7$; $c$ (cache line) = 32 bytes; $s$ (node size/$c$) = 1, where $s = mK/c$. $D$ = time to dereference a pointer; $A_b$ = time to compute child address for binary search; $A_{fcss}$ = time to compute child address for full CSS; $A_{lcss}$ = time to compute child address for level CSS.
23
Experiment: results are for an Ultra Sparc II. Keys are randomly generated integers between 0 and 1 million. Performed 5 tests of 100,000 searches for random keys.
24
Figure 5a: Array Size vs. Time
25
Figure 5b: Array Size vs. Time
26
Figure 6a: Array Size vs. 2nd-level cache accesses
27
Figure 6b: Array Size vs. 2nd-level cache misses
28
Figure 7: Node Size vs. Time
29
CSS performance on other queries: CSS is very good for individual selection queries. CSS will probably perform the best in range queries. Index nested loops join vs. sort merge join.
30
Doubts about CSS: flexibility of CSS-trees across different cache designs; any applicability to variable-sized records; multiple CSS-tree indices on different keys.
31
Conclusion: CSS-trees improve searching performance by being cache conscious.
32
One last thought: cache designs. Should we redesign them to let programmers have control?