Presentation transcript:

Fractal Prefetching B+-Trees: Optimizing Both Cache and Disk Performance Authors: Shimin Chen, Phillip B. Gibbons, Todd C. Mowry, Gary Valentin Members: Iris Zhang, Grace Yung, Kara Kwon, Jessica Wong

Outline 1. Introduction 2. Optimizing I/O Performance a. Searches b. Range Scans 3. Optimizing Cache Performance a. Disk-First fpB+-Trees b. Cache-First fpB+-Trees 4. Conclusion

Introduction Traditional B+-Trees –Optimized for I/O performance –Tree nodes = disk pages Newer B+-Tree variants –Optimized for CPU cache performance –Tree node sizes = one or a few cache lines –Introduce the concept of prefetching

Introduction (cont’d) Figure 1: Traditional B+-Trees (each page holds page control info and an array of index entries, i.e. key and page/tuple ID)

Introduction (cont’d) Problem (due to the large discrepancy in optimal node sizes): 1. Disk-optimized B+-Trees suffer from poor cache performance 2. Cache-optimized B+-Trees suffer from poor disk performance

Introduction (cont’d) Proposal: Fractal Prefetching B+-Trees (fpB+-Trees) 1. Embed “cache-optimized” trees within “disk-optimized” trees 2. Optimize both cache and I/O performance 3. Two approaches: disk-first and cache-first

Introduction (cont’d) Figure 2: Self-similar “tree within a tree” structure

Introduction (cont’d) For both the disk-first and cache-first approaches: what is done to optimize performance, and how operations are processed efficiently –Bulkload –Search –Insertion –Deletion

Optimizing I/O Performance fpB+-Trees combine features of disk- and cache-optimized B+-Trees to achieve the best of both structures Consider two concepts from pB+-Trees –Searches: prefetching and node sizes –Range scans: prefetching via jump-pointer arrays

Optimizing I/O Performance (cont’d) Prefetching: –Modern database servers have multiple disks per processor –Goal: effectively exploit I/O parallelism by explicitly prefetching disk pages, even when access patterns are not sequential

Searches: Prefetching and Node Sizes (cont’d) For disk-resident data –Increase the B+-Tree node size to be a multiple of the disk page size –Prefetch all pages of a node when accessing it Pages are placed on different disks so that requests can be serviced in parallel Result: faster search
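
A minimal sketch of this prefetching idea, assuming the pages of a wide node are stored contiguously in a single file and using the standard POSIX readahead hint; PAGE_SIZE, PAGES_PER_NODE, and prefetch_node_pages are illustrative assumptions, not the paper's code:

```c
#include <fcntl.h>
#include <sys/types.h>

#define PAGE_SIZE      8192   /* assumed disk page size */
#define PAGES_PER_NODE 4      /* assumed width of a multi-page node */

/* Hint the OS to start reading every page of a multi-page node before we
 * search it; if the pages sit on different disks (e.g., striped), the reads
 * can be serviced in parallel and later accesses hit the page cache. */
static void prefetch_node_pages(int fd, off_t node_offset)
{
    for (int i = 0; i < PAGES_PER_NODE; i++)
        posix_fadvise(fd, node_offset + (off_t)i * PAGE_SIZE,
                      PAGE_SIZE, POSIX_FADV_WILLNEED);
}
```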

Searches: Prefetching and Node Sizes (cont’d) Problem –I/O latency improves for a single search, but a multi-page node may require extra seeks –These additional seeks may degrade overall performance Conclusion: the target node size for fpB+-Trees is a single disk page

Range Scans: Prefetching via Jump-Pointer Arrays Range scan –Search for the starting key of the range, then read consecutive leaf nodes in the tree A jump-pointer array lets the leaves be prefetched effectively One implementation: add sibling pointers to each node that is a parent of leaves

Range Scans: Prefetching via Jump-Pointer Arrays (cont’d) Figure 3: Internal jump-pointer array (leaf-parent nodes linked by sibling pointers)

Range Scans: Prefetching via Jump-Pointer Arrays (cont’d) This technique can be applied to fpB+-Trees Enhancement to avoid overshooting: –fpB+-Trees begin by searching for both the start and end keys, in order to remember the page containing the end of the range –This enhancement does not decrease throughput
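
A rough sketch of how a range scan might use these sibling links to prefetch leaf pages ahead of the scan; every type and helper here (LeafParent, prefetch_page, scan_leaf_page, the fan-out of 256) is an assumption for illustration, not the paper's implementation:

```c
/* A leaf-parent node: its children are leaf pages, and sibling links between
 * leaf parents form the internal jump-pointer array. */
typedef struct LeafParent {
    int                nchildren;
    unsigned int       child_page[256];   /* page IDs of its leaf children (assumed fan-out) */
    struct LeafParent *sibling;           /* jump-pointer to the next leaf parent */
} LeafParent;

/* Assumed lower-level helpers. */
extern void prefetch_page(unsigned int page_id);   /* asynchronous read hint */
extern void scan_leaf_page(unsigned int page_id);  /* process one leaf page  */

/* Scan from the leaf holding the start key up to the remembered end page. */
void range_scan(LeafParent *p, int first_child, unsigned int end_page)
{
    while (p != NULL) {
        /* Issue prefetches for all leaf pages under this parent ... */
        for (int i = first_child; i < p->nchildren; i++)
            prefetch_page(p->child_page[i]);
        /* ... then scan them while later requests are still in flight. */
        for (int i = first_child; i < p->nchildren; i++) {
            scan_leaf_page(p->child_page[i]);
            if (p->child_page[i] == end_page)   /* remembered end page: stop here */
                return;                         /* avoids overshooting the range */
        }
        first_child = 0;
        p = p->sibling;
    }
}
```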

Optimizing Cache Performance The search operation of disk-optimized B+-Trees suffers from poor cache performance –During a search, each page on the path to a key is visited –In each page, a binary search is performed on a large contiguous array –This is costly in terms of cache misses

Optimizing Cache Performance (cont’d) Example: –Keys, page IDs and tuple IDs are all 4 bytes –An 8KB page can therefore hold over 1000 entries –A cache line is 64 bytes => holds 8 entries –Suppose a page has 1023 entries with keys 1 to 1023 –Locating the entry with key 71 requires 10 probes with binary search: 512, 256, 128, 64, 96, 80, 72, 68, 70, 71 –The early probes fall on widely separated cache lines, so most probes incur a cache miss
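
The following self-contained sketch reproduces the probe sequence above and counts how often the search moves to a different 64-byte cache line; the flat array of 8-byte entries mirrors the example's assumptions:

```c
#include <stdio.h>

#define N          1023   /* entries in the page, keys 1..1023 */
#define ENTRY_SIZE 8      /* 4-byte key + 4-byte page/tuple ID */
#define LINE_SIZE  64     /* one cache line holds 8 entries */

int main(void)
{
    int keys[N];
    for (int i = 0; i < N; i++) keys[i] = i + 1;

    int lo = 0, hi = N - 1, probes = 0, line_changes = 0, last_line = -1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        int line = (mid * ENTRY_SIZE) / LINE_SIZE;   /* cache line of this probe */
        probes++;
        if (line != last_line) { line_changes++; last_line = line; }
        printf("probe %2d: key %4d (cache line %2d)\n", probes, keys[mid], line);
        if (keys[mid] == 71) break;
        if (keys[mid] < 71) lo = mid + 1; else hi = mid - 1;
    }
    /* Prints the 10 probes 512, 256, ..., 71 and 7 cache-line changes,
     * i.e. most probes land on a line not touched by the previous probe. */
    printf("%d probes, %d cache-line changes\n", probes, line_changes);
    return 0;
}
```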

Optimizing Cache Performance (cont’d) The update operations of B+-Trees are also costly –Insertion and deletion both begin with a search –To insert an entry into a sorted array, on average half of the page must be copied to make room for the new entry
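
A minimal sketch of that copying cost: an in-place insert into a page-sized sorted array must shift the tail of the array, on average about half the entries. The Entry layout and function name are assumptions for illustration.

```c
#include <string.h>

typedef struct { int key; unsigned int ptr; } Entry;   /* 4-byte key + 4-byte page/tuple ID */

/* Insert e into the sorted array a[0..*n-1], keeping it sorted. */
static void insert_sorted(Entry *a, int *n, Entry e)
{
    int pos = 0;
    while (pos < *n && a[pos].key < e.key)
        pos++;                                   /* find the insertion slot */
    /* Shift the tail up by one entry; for a ~1000-entry page this copies
     * roughly 500 entries (about 4KB) on average. */
    memmove(&a[pos + 1], &a[pos], (size_t)(*n - pos) * sizeof(Entry));
    a[pos] = e;
    (*n)++;
}
```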

Disk-First fpB+-Trees Start with disk-optimized B+-Trees Organize the keys and pointers in each page-sized node into a cache-optimized tree Each node thus contains a small cache-optimized tree: the in-page tree –Modeled after pB+-Trees, which have been shown to have the best cache performance

Disk-First fpB+-Trees (cont’d) Figure 4: Disk-first fpB+-Trees: a cache-optimized tree inside each page (each page also holds page control info)

Disk-First fpB+-Trees (cont’d) In-page trees have their nodes aligned on cache line boundaries Each node is several cache lines wide –When a node is visited as part of a search, all cache lines in the node are prefetched This increases the fan-out of the node and reduces the height of the in-page tree Result: better overall performance
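
A minimal sketch of that per-node prefetch, assuming a GCC/Clang compiler (for __builtin_prefetch) and illustrative values for the node and cache-line sizes:

```c
#define LINE_SIZE  64   /* assumed cache line size in bytes */
#define NODE_LINES 4    /* assumed in-page node width, e.g. a 256-byte node */

/* Prefetch every cache line of an in-page node before binary-searching it,
 * so the misses overlap instead of being paid one probe at a time. */
static inline void prefetch_in_page_node(const void *node)
{
    const char *p = (const char *)node;
    for (int i = 0; i < NODE_LINES; i++)
        __builtin_prefetch(p + i * LINE_SIZE, 0 /* read */, 3 /* keep in cache */);
}
```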

Disk-First fpB+-Trees (cont’d) Non-leaf nodes –Contain pointers to other in-page nodes within the same page –To pack more entries into each node, use short in-page offsets instead of full pointers Leaf nodes –Contain pointers to nodes external to their in-page tree
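
A hypothetical C layout illustrating the difference: non-leaf in-page nodes use 2-byte in-page offsets to reach their children (raising fan-out), while in-page leaf nodes hold pointers that leave the page. Field names, widths, and fan-outs are assumptions, not the paper's exact format.

```c
#include <stdint.h>

#define NONLEAF_FANOUT 15   /* assumed children per non-leaf in-page node */
#define LEAF_ENTRIES   7    /* assumed entries per in-page leaf node */

/* Non-leaf in-page node: its children live in the same disk page, so a short
 * byte offset within the page is enough to address them. */
typedef struct {
    uint16_t nkeys;
    int32_t  key[NONLEAF_FANOUT - 1];
    uint16_t child_offset[NONLEAF_FANOUT];   /* in-page offsets, not full pointers */
} InPageNonLeaf;

/* In-page leaf node: its pointers leave the in-page tree, e.g. child page IDs
 * (in non-leaf pages) or tuple IDs (in leaf pages). */
typedef struct {
    uint16_t nkeys;
    int32_t  key[LEAF_ENTRIES];
    uint32_t external_ptr[LEAF_ENTRIES];
} InPageLeaf;
```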

Disk-First fpB+-Trees (cont’d) The optimal in-page node size is determined by memory-system parameters and by key and pointer sizes The optimal page size is determined by I/O parameters and by disk and memory prices When the two sizes do not match, the in-page tree may overflow or underflow the page

Disk-First fpB+-Trees (cont’d) Figure 5: Overflow and underflow (an in-page tree may spill past the page boundary or leave unused space in the page)

Disk-First fpB+-Trees (cont’d) Figure 6: Fitting cache-optimized trees in a page –Use smaller in-page nodes when the tree would overflow –Use larger in-page nodes when it would underflow

Disk-First fpB+-Trees: Operations Bulkload: operations at two granularities –At the page granularity, follow the common B+-Tree bulkload algorithm –For the in-page trees of non-leaf pages, pack entries into one in-page leaf node after another –For the in-page trees of leaf pages, try to distribute entries across all in-page leaf nodes Maintain a linked list of all in-page leaf nodes

Disk-First fpB+-Trees: Operations (cont’d) Search –A straightforward search is performed at each granularity

Disk-First fpB+-Trees: Operations (cont’d) Insertion: operations at two granularities –If there are empty slots in the in-page leaf node, insert the entry into the node’s sorted array

Disk-First fpB+-Trees: Operations (cont’d) Insertion: operations at two granularities –Otherwise, split the leaf node into two: a. If possible, allocate the new node in the same page b. Otherwise, if the number of entries is fewer than the page’s maximum fan-out, reorganize the in-page tree c. Otherwise, split the page by copying half of the in-page leaf nodes to a new page, and rebuild the two in-page trees in their respective pages

Disk-First fpB+-Trees: Operations (cont’d) Deletion –A search for the entry –Followed by a lazy deletion of the entry from its leaf node –Leaf nodes that become half empty are not merged

Cache-First fpB+-Trees Start with cache-optimized B+-Trees, ignoring page boundaries Then try to place the cache-optimized nodes intelligently into disk pages

Cache-First fpB+-Trees (cont’d) Non-leaf nodes –Contain an array of keys and pointers –A pointer is a combination of a page ID and an offset within the page Use the page ID to retrieve the disk page Visit the node within the page via the offset Leaf nodes –Contain an array of keys and tuple IDs

Cache-First fpB+-Trees: Node Placement Goal 1: group sibling leaf nodes into the same page to reduce disk operations during range scans Approach: designate certain pages as leaf pages that contain only leaf nodes –Leaf nodes in the same page are siblings

Cache-First fpB+-Trees: Node Placement (cont’d) Goal 2: group a parent node and its children into the same page, so that a search needs only one disk operation for a parent and its child Problems: –Not possible for all nodes –Node size mismatch (overflow and underflow)

Cache-First fpB+-Trees: Node Placement (cont’d) For underflow (i.e., “not enough” children) –Place grandchildren, great-grandchildren, etc. in the same page For overflow, two approaches: a. Place the overflowed child in its own page as a top-level node, together with its own children b. Store the overflowed child in special overflow pages

Cache-First fpB+-Trees: Node Placement (cont’d) Figure 8: Cache-first fpB+-Tree design (aggressive placement of non-leaf nodes, with overflow pages for leaf-node parents)

Cache-First fpB+-Trees: Operations Bulkload: leaf nodes –Placed consecutively in leaf pages and linked together with sibling links

Cache-First fpB+-Trees: Operations Bulkload: non-leaf nodes –Determine whether there is space for the node to fit into the same page as its parent –If not, then: if the non-leaf node is a parent of a leaf node, place it in an overflow page; otherwise, allocate it as the top-level node of a new page

Cache-First fpB+-Trees: Operations (cont’d) Search –Straightforward, with one thing to note –When proceeding from a parent to one of its children, compare the child’s page ID with the current page’s ID –The same page ID indicates that the parent and child are in the same page, so the child node can be accessed directly without retrieving the page from the buffer manager
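
A sketch of that comparison in the search loop, under the assumption of a simple buffer-manager interface (get_page/unpin_page) and a pointer that packs a page ID with an in-page offset; all of these names are illustrative, not the paper's API.

```c
#include <stdint.h>

typedef struct {
    uint32_t page_id;   /* disk page holding the child node */
    uint16_t offset;    /* byte offset of the child node within that page */
} NodePtr;

/* Assumed buffer-manager interface. */
extern char *get_page(uint32_t page_id);    /* pin a page and return its frame */
extern void  unpin_page(uint32_t page_id);

/* Follow a child pointer during search.  Only when the child lives in a
 * different page do we go back to the buffer manager; otherwise the child is
 * reached by a simple offset computation within the page we already hold. */
static char *follow_child(NodePtr child, uint32_t *cur_page_id, char **cur_page)
{
    if (child.page_id != *cur_page_id) {
        unpin_page(*cur_page_id);
        *cur_page    = get_page(child.page_id);
        *cur_page_id = child.page_id;
    }
    return *cur_page + child.offset;
}
```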

Cache-First fpB+-Trees: Operations (cont’d) Insertion –If there are empty slots in the leaf node, simply insert the entry; otherwise the node must be split in two –If the leaf page has space, it accommodates the new node; otherwise the leaf page must be split: Move the second half of the leaf nodes to a new page Update the corresponding child pointers in their parents

Cache-First fpB+-Trees: Operations (cont’d) Insertion –After a leaf node split, an entry must be inserted into the parent node –If the parent node is full, it too must be split: For a leaf-parent node, the new node may be allocated from the overflow pages If further splits are needed up the tree, the new nodes are allocated as described for bulkload

Cache-First fpB+-Trees: Operations (cont’d) Deletion –Similar to disk-first fpB+-Trees

Conclusion 1. Problems of traditional B+-Trees 2. To optimize I/O performance, two concepts from pB+-Trees were considered: searches (prefetching and node sizes) and range scans (jump-pointer arrays) 3. How disk-first and cache-first fpB+-Trees perform better than traditional B+-Trees 4. Operations (bulkload, search, insertion, deletion)