B-Tree Michael Tsai 2017/06/06.

Presentation transcript:

B-Tree Michael Tsai 2017/06/06

B-Tree Overview
A B-tree is a balanced search tree with a very large “branching factor”. Its height is O(log n), but much smaller than that of a red-black (RB) tree.
Usage: storing a large amount of data that resides partially in memory and partially in secondary storage (e.g., a hard drive).
Goals: minimize disk I/O operations and minimize CPU time.

Typical Storage Speed / Capacity
Device       Read speed                             Capacity
Hard drive   Typically ~100 MB/s                    Up to 10 TB
SSD          ~500 MB/s                              Up to 1 TB
SD           30 MB/s (UHS-3), 10 MB/s (class 10)    Typically 32 GB or 64 GB
Memory       6400 MB/s (DDR3)                       4~16 GB in a desktop/laptop

Typical B-Tree (keys)
Internal node x has x.n keys (here, 3). The keys in x separate the ranges of keys in its subtrees. Internal node x has x.n+1 children (here, 4).

Typical B-Tree (search)
Internal node x has x.n keys (here, 3). The keys in x separate the ranges of keys in its subtrees. Internal node x has x.n+1 children (here, 4).

A more realistic B-tree
Usually only one node’s keys/data are read from the disk at a time. The root is always kept in memory.

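B-Tree Definition (1)
A B-tree is a rooted tree. Each node x has the following attributes:
x.n, the number of keys in x.
The keys themselves, stored in non-decreasing order: $x.\mathit{key}_1 \le x.\mathit{key}_2 \le \dots \le x.\mathit{key}_{x.n}$.
x.leaf, which is TRUE if x is a leaf and FALSE otherwise.
Each internal node x also contains x.n+1 pointers $x.c_1, x.c_2, \dots, x.c_{x.n+1}$ to its children. Leaves have these undefined.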

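B-Tree Definitions (2)
The keys $x.\mathit{key}_i$ separate the ranges of keys stored in each subtree: if $k_i$ is any key stored in the subtree with root $x.c_i$, then $k_1 \le x.\mathit{key}_1 \le k_2 \le x.\mathit{key}_2 \le \dots \le x.\mathit{key}_{x.n} \le k_{x.n+1}$.
All leaves have the same depth, which is the tree’s height h.
Minimum degree of the B-tree: an integer $t \ge 2$.
Every node other than the root has at least t-1 keys (thus every internal node other than the root has at least t children).
Every node can have at most 2t-1 keys (thus an internal node has at most 2t children). A node with exactly 2t-1 keys is full.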
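The per-node rules in the two definition slides can be sketched in Python as follows. This is only an illustration under assumptions: the names `BTreeNode` and `check_node` are hypothetical, keys are assumed to be integers, and `leaf` is derived from the child list instead of being stored as a separate attribute as on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BTreeNode:
    # x.key_1 .. x.key_{x.n}; x.n is len(keys)
    keys: List[int] = field(default_factory=list)
    # x.c_1 .. x.c_{x.n+1}; left empty for a leaf
    children: List["BTreeNode"] = field(default_factory=list)

    @property
    def leaf(self) -> bool:
        # Assumption: a node with no children is treated as a leaf.
        return not self.children


def check_node(x: BTreeNode, t: int, is_root: bool = False) -> bool:
    """Return True if node x, taken alone, satisfies the per-node rules."""
    n = len(x.keys)
    if n > 2 * t - 1:                      # at most 2t-1 keys
        return False
    if not is_root and n < t - 1:          # non-root: at least t-1 keys
        return False
    if any(a > b for a, b in zip(x.keys, x.keys[1:])):
        return False                       # keys must be non-decreasing
    if not x.leaf and len(x.children) != n + 1:
        return False                       # internal node: n+1 children
    return True
```

The check covers only one node; the "all leaves have the same depth" rule and the key-range rule between a node and its subtrees would need a traversal of the whole tree.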

Proof: B-Tree Height
If $n \ge 1$, then for any n-key B-tree T of height h and minimum degree $t \ge 2$, $h \le \log_t \frac{n+1}{2}$.
Proof: Consider the case in which each node has the least possible number of keys.
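To get a feel for how small the height is, the standard bound h ≤ log_t((n+1)/2) can be evaluated for concrete numbers. The helper below is a hypothetical sketch (the name `max_btree_height` is not from the slides); it uses integer arithmetic, finding the largest h with 2t^h - 1 ≤ n, which is the same condition rearranged.

```python
def max_btree_height(n: int, t: int) -> int:
    """Largest h with 2*t**h - 1 <= n, i.e. h <= log_t((n+1)/2)."""
    if n < 1 or t < 2:
        raise ValueError("need n >= 1 and t >= 2")
    h = 0
    # Grow h while a tree of height h+1 with minimum-size nodes still fits in n keys.
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


print(max_btree_height(1_000_000_000, 1000))  # 2
print(max_btree_height(1_000_000_000, 2))     # 28
```

With a billion keys, a minimum degree of 1000 keeps the height at 2 or less, while t = 2 allows a height of up to 28.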

Proof: B-Tree Height
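The slide's equations did not survive in this transcript. The following is a reconstruction of the usual counting argument, not necessarily the exact derivation shown in the lecture. The root holds at least one key; there are at least 2 nodes at depth 1, at least 2t at depth 2, and in general at least $2t^{i-1}$ at depth i, each holding at least t-1 keys.

```latex
\begin{align*}
n &\ge 1 + (t-1)\sum_{i=1}^{h} 2t^{i-1}
   = 1 + 2(t-1)\cdot\frac{t^{h}-1}{t-1}
   = 2t^{h} - 1, \\
t^{h} &\le \frac{n+1}{2}
  \quad\Longrightarrow\quad
  h \le \log_t \frac{n+1}{2}.
\end{align*}
```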

Disk Operation
DISK-READ(x): required before accessing x if x is not in memory; a “no-op” if x is already in memory.
DISK-WRITE(x): required to put any changes to x back on the disk.
The root is always stored in memory.
Typical workflow:
x = pointer to an object
DISK-READ(x)
(operations that modify x)
DISK-WRITE(x)
(operations that access x but do not modify it)
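The read/write discipline above can be mimicked with a toy in-memory model. Everything here is an assumption made for illustration: `PagedStore`, its two dictionaries standing in for disk and main memory, and the counters are hypothetical and do not perform real file I/O.

```python
class PagedStore:
    """Toy model: 'disk' and 'memory' are plain dicts keyed by page id."""

    def __init__(self):
        self.disk = {}      # page id -> list of keys stored on "disk"
        self.memory = {}    # page id -> in-memory copy of that page
        self.reads = 0      # number of simulated disk reads
        self.writes = 0     # number of simulated disk writes

    def disk_read(self, page_id):
        # No-op if the page is already in memory, as on the slide.
        if page_id not in self.memory:
            self.memory[page_id] = list(self.disk[page_id])
            self.reads += 1
        return self.memory[page_id]

    def disk_write(self, page_id):
        # Copy the (possibly modified) in-memory page back to "disk".
        self.disk[page_id] = list(self.memory[page_id])
        self.writes += 1


store = PagedStore()
store.disk["x"] = [3, 5, 6]
x = store.disk_read("x")   # first access: one simulated disk read
x.append(8)                # modify x in memory
store.disk_write("x")      # changes reach the disk only after this call
```

Counting `reads` and `writes` separately from CPU work mirrors how the slides report "Disk I/O" and "CPU time" as two different costs.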

Search in B-Tree
Input: x, the node to search from; k, the key to search for.
Return value: (x, i), meaning key k is found as the i-th key of node x.
CPU time: $O(t \log_t n)$. Disk I/O: $O(h) = O(\log_t n)$.
Example node:
i        = 1 2 3 4 5
x.key_i  = 3 5 6 8 10
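A minimal Python sketch of the search follows, assuming a linear scan inside each node and nodes held in memory. The `Node` class and `btree_search` name are hypothetical; returning `None` for a missing key is also an assumption, since the slide only describes the found case. The returned index is 1-based to match the slide's (x, i) convention.

```python
from typing import List, Optional, Tuple


class Node:
    def __init__(self, keys: List[int], children: Optional[List["Node"]] = None):
        self.keys = keys                  # sorted keys of this node
        self.children = children or []    # empty list means leaf


def btree_search(x: Node, k: int) -> Optional[Tuple[Node, int]]:
    """Return (node, i) with k as the node's i-th key (1-based), or None."""
    i = 0
    # Linear scan: find the first key that is >= k.
    while i < len(x.keys) and k > x.keys[i]:
        i += 1
    if i < len(x.keys) and k == x.keys[i]:
        return x, i + 1
    if not x.children:                    # reached a leaf without finding k
        return None
    # In a disk-based tree, DISK-READ(x.children[i]) would happen here.
    return btree_search(x.children[i], k)


# The left leaf uses the keys from the slide's example node.
left = Node([3, 5, 6, 8, 10])
root = Node([20], [left, Node([25, 30])])
found = btree_search(root, 8)
print(found is not None and found[1])   # 4
```

Each recursive call descends one level, so the number of nodes touched is at most the height plus one, while the scan inside a node costs O(t).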

Create an empty B-Tree
Allocate-Node() is an O(1) operation that allocates a disk page to store a new node.
CPU time: O(1). Disk I/O: O(1).

B-Tree Insertion: Overview
We cannot simply create a new leaf node and insert it: this would violate the B-tree definition.
Solution: insert into an existing leaf node.
Problem: what if that leaf node is already FULL? (FULL: having 2t-1 keys and, for an internal node, 2t children.)
Solution: split a full node y around its median key into two nodes, one holding the t-1 keys smaller than the median and one holding the t-1 keys larger than the median. The median key then moves up into y’s parent node.
What if y’s parent is also full? We split it too.
Workflow: start from the root (as in search) and split every full node encountered on the way down.

B-Tree Insertion: Overview

B-Tree: Split Full Child
Split node x’s i-th child, which is full.
CPU time: O(t). Disk I/O: O(1).
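The split can be sketched in Python as below. This is an illustrative version under assumptions: `Node` and `split_child` are hypothetical names, the child index i is 0-based (the slide counts from 1), and disk writes are omitted.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []            # sorted keys
        self.children = children or []    # empty list means leaf


def split_child(x: Node, i: int, t: int) -> None:
    """Split the full child x.children[i] around its median key (i is 0-based)."""
    y = x.children[i]
    assert len(y.keys) == 2 * t - 1, "child must be full"
    median = y.keys[t - 1]
    # New right sibling z takes the t-1 largest keys (and the t rightmost children).
    z = Node(keys=y.keys[t:], children=y.children[t:])
    # y keeps the t-1 smallest keys (and the t leftmost children).
    y.keys = y.keys[:t - 1]
    y.children = y.children[:t]
    # The median moves up into x, with z placed just right of y.
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


parent = Node(keys=[50], children=[Node([10, 20, 30]), Node([60])])
split_child(parent, 0, t=2)
print(parent.keys)                          # [20, 50]
print([c.keys for c in parent.children])    # [[10], [30], [60]]
```

Only a constant number of nodes (x, y and z) are touched, which matches the O(1) disk I/O on the slide, while copying up to t keys and children accounts for the O(t) CPU time.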

B-Tree: Split the Root
Splitting the root is the only way to increase the height of a B-tree. The height increases at the top, not at the bottom.

B-Tree: Split the Root Please review the pseudo code below yourself.

Insertion Example

B-Tree Insertion: pseudo code
If x is a leaf, insert k at the right location.
If x is not a leaf:
Find the right child node.
If the child node is full, split it first (its median key moves up into this node).
Finally, make a recursive call to continue in the child node.
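Since the slide's pseudo code itself is not in this transcript, here is a Python sketch of the steps it describes, combined with the root split. It is an assumption-laden illustration, not the lecture's code: `Node`, `split_child`, `insert_nonfull`, `insert` and `inorder` are hypothetical names, indices are 0-based, and all nodes live in memory.

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []            # sorted keys
        self.children = children or []    # empty list means leaf


def split_child(x, i, t):
    """Split the full child x.children[i]; its median moves up into x."""
    y = x.children[i]
    z = Node(keys=y.keys[t:], children=y.children[t:])
    median = y.keys[t - 1]
    y.keys, y.children = y.keys[:t - 1], y.children[:t]
    x.keys.insert(i, median)
    x.children.insert(i + 1, z)


def insert_nonfull(x, k, t):
    """Insert k into the subtree rooted at x, which must not be full."""
    if not x.children:                       # leaf: insert at the right location
        bisect.insort(x.keys, k)
        return
    i = bisect.bisect_right(x.keys, k)       # find the right child
    if len(x.children[i].keys) == 2 * t - 1:
        split_child(x, i, t)                 # split a full child first
        if k > x.keys[i]:                    # median moved up: pick a side
            i += 1
    insert_nonfull(x.children[i], k, t)      # continue in the child


def insert(root, k, t):
    """Insert k and return the (possibly new) root."""
    if len(root.keys) == 2 * t - 1:          # full root: the tree grows at the top
        new_root = Node(children=[root])
        split_child(new_root, 0, t)
        root = new_root
    insert_nonfull(root, k, t)
    return root


def inorder(x):
    """Return all keys in sorted order (used here only to check the result)."""
    if not x.children:
        return list(x.keys)
    out = []
    for i, child in enumerate(x.children):
        out += inorder(child)
        if i < len(x.keys):
            out.append(x.keys[i])
    return out


root = Node()
for key in [10, 20, 5, 6, 12, 30, 7, 17, 3, 1]:
    root = insert(root, key, t=2)
print(inorder(root))   # [1, 3, 5, 6, 7, 10, 12, 17, 20, 30]
```

Because every full node is split on the way down, the recursion never has to go back up the tree: the parent of any node being split is guaranteed to have room for the median.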

B-Tree Insertion: running time
CPU time: $O(th) = O(t \log_t n)$. Disk I/O: $O(h)$.

Reading Assignment (Real)
Chapter 18.3: Deleting a key from a B-tree