Topics 10: Cache Conscious Indexes


As main memory gets cheaper, it becomes affordable to build computers with large memories. In future databases, all data except a few large tables will be memory-resident. It is therefore important to build efficient main-memory indexes. These indexes should take into account the memory hierarchy and the memory-access bottleneck.
Dr. N. Mamoulis, Advanced Database Technologies

Characteristics of cache conscious indexes
They should cluster data according to the access pattern: data that are likely to be accessed together (or in sequence) should be close in memory.
They should compress information, so that only useful data are fetched into the cache. This means that the index should contain only comparison keys and reference pointers to the searched data.
They should not be much larger than the indexed information.

Why is binary search poor?
If the searched array is large, the number of cache misses is determined by the number of search comparisons: O(log_2 n). This is because each comparison fetches a whole cache line, but only one of the keys in that line is actually used.
[Figure: a sorted key array in main memory (... 539 545 568 579 582 589 595 602 609 612 617 623 625 ...); the 128-byte cache line that contains the current comparison key is copied into the cache.]
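The effect described above can be illustrated with a small simulation. This is only a sketch: it counts how many distinct cache lines the probes of an ordinary binary search would touch, assuming 4-byte keys (the key size is not given in the slides) and the 128-byte cache line shown in the figure. The function name `binary_search_lines` is made up for this example.

```python
LINE_BYTES = 128                      # cache-line size quoted on the slide
KEY_BYTES = 4                         # assumed key size (not stated in the slides)
KEYS_PER_LINE = LINE_BYTES // KEY_BYTES

def binary_search_lines(keys, target):
    """Binary search that also reports how many distinct cache lines it probed.

    Returns (found, probes, distinct_lines).
    """
    lo, hi = 0, len(keys) - 1
    probes = 0
    lines = set()
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        lines.add(mid // KEYS_PER_LINE)   # index of the cache line holding keys[mid]
        if keys[mid] == target:
            return True, probes, len(lines)
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes, len(lines)

# One million sorted keys; range() supports len() and indexing like an array.
keys = range(0, 2_000_000, 2)
found, probes, lines = binary_search_lines(keys, 1_234_566)
print(found, probes, lines)
```

Only the last few probes, once the search interval has shrunk to about one cache line, reuse a line that is already in the cache. All earlier probes land on a different line each, which is why the miss count grows with the number of comparisons.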

Enhanced Main Memory B+-trees
Although the B+-tree was designed as a secondary-memory index, it can also be used for search in main memory. The node size of the tree is set to a multiple of the cache line size (e.g., 1 node = 2 cache lines). The number of cache misses is then proportional to the number of tree nodes accessed during search: O(log_F n), where F is the fanout of the tree.
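A rough back-of-the-envelope comparison of the two bounds, as a sketch only. It assumes 4-byte keys and 4-byte pointers (neither size is given in the slides), and nodes of two 128-byte cache lines as in the example above. The helper `tree_levels` is a hypothetical name. It computes the ceiling of log_F n with integer arithmetic to avoid floating-point rounding.

```python
def tree_levels(n_keys, fanout):
    """Smallest h such that fanout**h >= n_keys, i.e. ceil(log_fanout(n_keys))."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

LINE_BYTES = 128
NODE_BYTES = 2 * LINE_BYTES            # 1 node = 2 cache lines, as in the slide
KEY_BYTES = PTR_BYTES = 4              # assumed sizes, not stated in the slides

fanout = NODE_BYTES // (KEY_BYTES + PTR_BYTES)   # key/pointer pairs per node
n = 1_000_000

node_accesses = tree_levels(n, fanout)           # B+-tree nodes visited per search
max_line_misses = node_accesses * (NODE_BYTES // LINE_BYTES)
binary_probes = tree_levels(n, 2)                # comparisons of plain binary search

print(fanout, node_accesses, max_line_misses, binary_probes)
```

Under these assumptions the tree visits far fewer nodes than binary search makes probes, even if every node costs a miss on each of its cache lines.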

Problems of Main Memory B+-trees
Nodes contain as many pointers as key values. During search, many key values in a node may be compared, but only one pointer will be followed, so most of the space taken by pointers is fetched without being used.
Binary search in a node could be expensive (requiring many comparisons).

The Cache Sensitive Search (CSS) tree
Same as the B+-tree, but it does not store pointers. The children of each node are stored sequentially, so the location of a child is computed from positional memory offsets instead of being read from a stored pointer.
[Figure: a CSS-tree and a B+-tree built over the same sorted keys 4 8 9 12 13 17 19 21 23 24 27 29 31 34 38, with internal keys 12, 19, 24, 31.]
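The idea of replacing child pointers by offsets can be sketched as follows. This is not the exact CSS-tree construction, which keeps the sorted array as the leaf level and builds a key directory on top of it. It is a simplified implicit search tree, laid out in one flat array, in which child i of node b is node b*(m+1)+i+1. The names `build_implicit_tree` and `contains` and the parameter `m` (keys per node) are assumptions made for this illustration.

```python
from bisect import bisect_left

INF = float("inf")   # padding for unused key slots

def build_implicit_tree(sorted_keys, m):
    """Lay out sorted keys as an implicit (m+1)-ary search tree in one flat list.

    Node b occupies slots [b*m, (b+1)*m); its i-th child is node b*(m+1)+i+1,
    so no child pointers are stored.
    """
    n_nodes = (len(sorted_keys) + m - 1) // m
    flat = [INF] * (n_nodes * m)
    it = iter(sorted_keys)

    def fill(node):
        # In-order traversal: fill child i, then key slot i, and so on.
        if node >= n_nodes:
            return
        for i in range(m):
            fill(node * (m + 1) + i + 1)
            flat[node * m + i] = next(it, INF)
        fill(node * (m + 1) + m + 1)

    fill(0)
    return flat

def contains(flat, m, key):
    """Search by computing child positions from offsets only."""
    n_nodes = len(flat) // m
    node = 0
    while node < n_nodes:
        start = node * m
        i = bisect_left(flat, key, start, start + m) - start  # first slot >= key
        if i < m and flat[start + i] == key:
            return True
        node = node * (m + 1) + i + 1                         # offset, not pointer
    return False

keys = [4, 8, 9, 12, 13, 17, 19, 21, 23, 24, 27, 29, 31, 34, 38]
tree = build_implicit_tree(keys, 2)
print(contains(tree, 2, 23), contains(tree, 2, 20))
```

Because every slot of a node holds a key, a node of a given size fits more keys than a node that must also store one pointer per key.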

The Cache Sensitive Search (CSS) tree (cont'd)
The CSS-tree is suitable only for static data.
Because no space is spent on pointers, the capacity of each node is double the capacity of a B+-tree node of the same size (assuming keys and pointers of equal size). Thus the height (and search cost) of the tree is reduced.
Another trick used by the CSS-tree is hard-coding the binary search within a node as nested if-else statements.

Hard-coding binary search

Normal binary search:
  Binsearch(key, C, start, end) =
    Binsearch(key, C, mid, end)     if key > C[mid]
    Binsearch(key, C, start, mid)   if key < C[mid]
    follow C[mid]                   if key = C[mid]

Augmented binary search:
  if (key < C[mid]) then
    if (key < C[mid/2]) then ...
    else if (key > C[mid/2]) then ...
    else follow C[mid/2]
  else if (key > C[mid]) then
    if (key < C[3mid/2]) then ...
    else if (key > C[3mid/2]) then ...
    else follow C[3mid/2]
  else follow C[mid]
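To make the unrolling concrete, here is a sketch for a node with exactly 7 keys. Both the node size and the function name `slot_unrolled_7` are choices made for this example. Unlike the three-way comparisons in the pseudocode above, it uses two-way comparisons and returns the position of the first key that is greater than or equal to the search key, which is the child slot to follow. Python will not show the speed benefit. The point is only that the comparison sequence is fixed at coding time, with no loop counter or midpoint arithmetic.

```python
from bisect import bisect_left

def slot_unrolled_7(c, key):
    """Index of the first key in sorted 7-key node c that is >= key (7 if none).

    Equivalent to bisect_left(c, key), but written as fixed nested if-else
    statements: exactly three comparisons, no loop.
    """
    if key <= c[3]:
        if key <= c[1]:
            return 0 if key <= c[0] else 1
        return 2 if key <= c[2] else 3
    if key <= c[5]:
        return 4 if key <= c[4] else 5
    return 6 if key <= c[6] else 7

node = [12, 19, 24, 31, 40, 52, 60]
for probe in (5, 24, 25, 60, 99):
    print(probe, slot_unrolled_7(node, probe), bisect_left(node, probe))
```

A version like this must be written (or generated) once for each node size in use, which fits a static structure whose node size is fixed in advance.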

Presentation material
A dynamic version of the CSS-tree: the cache conscious B+-tree
An improved version of the cache conscious B+-tree (optional reading)
Cache conscious R-trees