CSE373: Data Structures & Algorithms Lecture 15: B-Trees Linda Shapiro Winter 2015.


Announcements
HW05 will be out soon on hash tables. Evan is improving it.
I am grading exams.
HW03 is graded and posted.
HW04 will be posted by Friday.

B-Trees Introduction
B-Trees (and B+-Trees) are used heavily in databases.
They are NOT binary trees.
They are multi-way search trees that are kept somewhat shallow to limit disk accesses.

Example (Just the Idea)
[Figure: example tree]

Relational Databases
A relational database is conceptually a set of 2D tables.
The columns of a table are called attributes; they are the keys.
Each table has at least one primary key by which it can be accessed rapidly.
The rows are the different data records, each having a unique primary key.
B+ trees are one very common implementation for these tables.
In B+ trees, the data are stored only in the leaf nodes.

Creating a table in SQL

    create table Company
      (cname varchar(20) primary key,
       country varchar(20),
       no_employees int,
       for_profit char(1));

    insert into Company values ('GizmoWorks', 'USA', 20000, 'y');
    insert into Company values ('Canon', 'Japan', 50000, 'y');
    insert into Company values ('Hitachi', 'Japan', 30000, 'y');
    insert into Company values ('Charity', 'Canada', 500, 'n');

Company Table

    cname (primary key)   country   no_employees   for_profit
    GizmoWorks            USA       20000          y
    Canon                 Japan     50000          y
    Hitachi               Japan     30000          y
    Charity               Canada    500            n

Queries

    select * from Company;

    select cname from Company where no_employees = 500;

    select cname, country from Company
      where no_employees > AND no_employees < 50000;

Table definition, for reference:

    create table Company
      (cname varchar(20) primary key,
       country varchar(20),
       no_employees int,
       for_profit char(1));
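The table and queries above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the in-memory database and the variable names are illustrative, and because the lower bound of the range query did not survive in the transcript, `LOWER_BOUND = 10000` is an assumed value rather than the slide's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "create table Company (cname varchar(20) primary key, country varchar(20), "
    "no_employees int, for_profit char(1))"
)
rows = [
    ("GizmoWorks", "USA", 20000, "y"),
    ("Canon", "Japan", 50000, "y"),
    ("Hitachi", "Japan", 30000, "y"),
    ("Charity", "Canada", 500, "n"),
]
conn.executemany("insert into Company values (?, ?, ?, ?)", rows)

# Equality query from the slide.
exact = conn.execute(
    "select cname from Company where no_employees = 500"
).fetchall()

# Range query. LOWER_BOUND is an assumption, not the slide's value.
LOWER_BOUND = 10000
in_range = conn.execute(
    "select cname, country from Company "
    "where no_employees > ? and no_employees < 50000 order by cname",
    (LOWER_BOUND,),
).fetchall()

print(exact)
print(in_range)
```

Equality and range lookups like these are the access patterns that a B+-tree index is meant to serve.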

B+-Trees
B+-Trees are multi-way search trees commonly used in database systems or other applications where data is stored externally on disks and keeping the tree shallow is important.
A B+-Tree of order M has the following properties:
1. The root is either a leaf or has between 2 and M children.
2. All nonleaf nodes (except the root) have between ⌈M/2⌉ and M children.
3. All leaves are at the same depth.
All data records are stored at the leaves. Internal nodes have "keys" guiding to the leaves.
Leaves store between ⌈L/2⌉ and L data records, where L can be equal to M (default) or can be different.
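To make the occupancy rules concrete, here is a minimal sketch that computes the limits for a given M and L. The function name `bplus_bounds` and the dictionary keys are made up for illustration.

```python
import math

def bplus_bounds(M, L=None):
    """Occupancy limits for a B+-tree of order M with leaf capacity L."""
    if L is None:
        L = M  # the default stated above: L equals M
    return {
        "root_children": (2, M),  # applies when the root is not a leaf
        "internal_children": (math.ceil(M / 2), M),
        "internal_keys_max": M - 1,
        "leaf_records": (math.ceil(L / 2), L),
    }

print(bplus_bounds(3))       # order 3, i.e. a 2-3 tree
print(bplus_bounds(3, L=2))  # the M=3, L=2 setting
```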

B+-Tree Details
Each (non-leaf) internal node of a B-tree has:
› between ⌈M/2⌉ and M children
› up to M-1 keys k_1 < k_2 < ... < k_{M-1}
[Figure: internal node holding keys k_1 ... k_{i-1} k_i ... k_{M-1}]

Properties of B+-Trees
Children of each internal node are "between" the keys in that node.
Suppose subtree T_i is the ith child of the node:
› all keys in T_i must be between keys k_{i-1} and k_i, i.e. k_{i-1} ≤ T_i < k_i
› k_{i-1} is the smallest key in T_i
› all keys in the first subtree T_1 < k_1
› all keys in the last subtree T_M ≥ k_{M-1}
[Figure: node with keys k_1 ... k_{i-1} k_i ... k_{M-1} above subtrees T_1 ... T_i ... T_M]

B-Tree Nonleaf Node in More Detail
P[1] K[1] ... K[i-1] P[i] K[i] ... K[q-1] P[q]
The Ks are keys. The Ps are pointers to subtrees.
› P[1] leads to keys x with x < K[1]
› P[i] leads to keys y with K[i-1] ≤ y < K[i]
› P[q] leads to keys z with K[q-1] ≤ z
Example: a node with keys | 4 | 8 | has three subtrees, holding x < 4, 4 ≤ x < 8, and 8 ≤ x.
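The "between the keys" rule maps directly onto a binary search within the node. The sketch below uses Python's `bisect_right` on the node's sorted key list. The helper name `child_index` is an assumption for illustration, and indices are 0-based, so index 0 corresponds to P[1].

```python
from bisect import bisect_right

def child_index(keys, x):
    """0-based index of the subtree pointer to follow for search key x.

    keys is the node's sorted key list. A key equal to x sends the search
    to the right of that key, matching K[i-1] <= y < K[i].
    """
    return bisect_right(keys, x)

keys = [4, 8]  # the example node | 4 | 8 |
for x in (3, 4, 7, 8, 20):
    print(x, "-> child", child_index(keys, x))
```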

Detailed Leaf Node Structure (B+ Tree)
K[1] R[1] ... K[q-1] R[q-1] Next
The Ks are keys (assume unique).
The Rs are pointers to records with those keys.
The Next link points to the next leaf in key order (B+-tree).
Example data record: Jones Mark 19 4
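One thing the Next link allows is a range scan that walks along the leaves without going back up the tree. The following is a minimal sketch under assumed names: the `Leaf` class, `scan_range`, and the sample keys and records are all hypothetical.

```python
class Leaf:
    """A leaf with parallel key/record lists and a link to the next leaf."""
    def __init__(self, keys, records):
        self.keys = keys        # sorted, unique keys
        self.records = records  # records[i] belongs to keys[i]
        self.next = None        # next leaf in key order

def scan_range(leaf, lo, hi):
    """Yield (key, record) pairs with lo <= key <= hi, starting at `leaf`."""
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > hi:
                return          # keys are in order, so nothing later qualifies
            if k >= lo:
                yield k, r
        leaf = leaf.next        # follow the Next link

# Three chained leaves with made-up contents.
a = Leaf([1, 3], ["r1", "r3"])
b = Leaf([5, 7], ["r5", "r7"])
c = Leaf([9, 12], ["r9", "r12"])
a.next, b.next = b, c

print(list(scan_range(a, 3, 9)))
```

In a full tree the scan would start at the leaf reached by searching for `lo` instead of at the leftmost leaf.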

Searching in B-trees
B-tree of order 3: also known as a 2-3 tree (2 to 3 children per node).
Examples: search for 9, 14, 12.
"-" means empty slot.
[Figure: example B-tree of order 3, with node labels such as 13:-]

Searching a B-Tree T for a Key Value K (from a database book)

    Find(ElementType K, Btree T)
    {
        B = T;
        while (B is not a leaf)
        {
            find the Pi in node B that points to the proper subtree that K will be in;
            B = Pi;
        }
        /* Now we're at a leaf */
        if key K is the jth key in leaf B,
            use the jth record pointer to find the associated record;
        else /* K is not in leaf B */
            report failure;
    }

How would you search for a key in a node?
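A runnable sketch of the Find pseudocode follows. It makes several assumptions: the `Internal` and `Leaf` classes are invented for illustration, each node is searched by binary search with `bisect` (one possible answer to the question above), and the sample tree is made up, since the figure for the search example did not survive in the transcript.

```python
from bisect import bisect_left, bisect_right

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # sorted keys
        self.children = children  # len(keys) + 1 subtrees

class Leaf:
    def __init__(self, keys, records):
        self.keys = keys          # sorted, unique keys
        self.records = records    # records[j] belongs to keys[j]

def find(tree, k):
    """Return the record stored under key k, or None if k is absent."""
    node = tree
    while isinstance(node, Internal):
        # Pick the subtree whose key range contains k.
        node = node.children[bisect_right(node.keys, k)]
    # Now at a leaf: binary search for the exact key.
    j = bisect_left(node.keys, k)
    if j < len(node.keys) and node.keys[j] == k:
        return node.records[j]
    return None  # report failure

def leaf(*keys):
    """Helper that builds a leaf with placeholder records."""
    return Leaf(list(keys), [f"rec{k}" for k in keys])

# Hypothetical order-3 tree (M = L = 3).
root = Internal([13], [
    Internal([6, 11], [leaf(3, 4), leaf(6, 7, 8), leaf(11, 12)]),
    Internal([17], [leaf(13, 14), leaf(17, 18)]),
])

for k in (9, 14, 12):
    print(k, "->", find(root, k))
```

In this sample tree, 9 is absent while 14 and 12 are found in their leaves.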

Inserting into B-Trees: The Idea
Insert X: do a Find on X and find the appropriate leaf node.
› If the leaf node is not full, fill in an empty slot with X. E.g. insert 5.
› If the leaf node is full, split the leaf node and adjust parents up to the root node. E.g. insert 9.
Assume M=L=3, so (6 7 8) is full.
[Figure: example tree]

Inserting into B-Trees: The Idea
[Figure: insertion example, continued]

Inserting a New Key in a B-Tree of Order M (and L=M) (from a database book)

    Insert(ElementType K, Btree B)
    {
        find the leaf node LB of B in which K belongs;
        if notfull(LB)
            insert K into LB;
        else
        {
            split LB into two nodes LB and LB2, with j = ⌈(M+1)/2⌉ keys in LB and the rest in LB2;
            if ( IsNull(Parent(LB)) )
                CreateNewRoot(LB, K[j+1], LB2);
            else
                InsertInternal(Parent(LB), K[j+1], LB2);
        }
    }

    LB:  K[1] R[1] ... K[j] R[j]
    LB2: K[j+1] R[j+1] ... K[M+1] R[M+1]

Inserting a (Key,Ptr) Pair into an Internal Node
If the node is not full, insert the pair in the proper place and return.
If the node is already full (M pointers, M-1 keys), find the place for the new pair and split the adjusted (Key,Ptr) sequence into two internal nodes: the first gets j = ⌈(M+1)/2⌉ pointers and j-1 keys, the next key is inserted in the node's parent, and the rest go in the second node.
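The leaf split and internal-node split described above can be combined into one recursive insert. What follows is a sketch under stated assumptions, not the lecture's implementation: the class names are invented, leaves hold keys only (no record pointers), duplicates are ignored, and the leaf split uses j = ⌈(L+1)/2⌉ so that it also covers L different from M, which is my generalization of the L=M rule.

```python
from bisect import bisect_left, bisect_right
import math

class Leaf:
    def __init__(self, keys=None):
        self.keys = keys or []
        self.next = None          # next leaf in key order

class Internal:
    def __init__(self, keys, children):
        self.keys = keys
        self.children = children  # len(keys) + 1 subtrees

class BPlusTree:
    def __init__(self, M=3, L=3):
        self.M, self.L = M, L
        self.root = Leaf()

    def insert(self, k):
        split = self._insert(self.root, k)
        if split is not None:     # the root itself split: grow a new root
            sep, right = split
            self.root = Internal([sep], [self.root, right])

    def _insert(self, node, k):
        """Insert k below node. Return (separator, new_right) on a split."""
        if isinstance(node, Leaf):
            i = bisect_left(node.keys, k)
            if i < len(node.keys) and node.keys[i] == k:
                return None       # keys are assumed unique
            node.keys.insert(i, k)
            if len(node.keys) <= self.L:
                return None
            j = math.ceil((self.L + 1) / 2)   # keys that stay in the left leaf
            right = Leaf(node.keys[j:])
            node.keys = node.keys[:j]
            right.next, node.next = node.next, right
            return right.keys[0], right       # first key of right is copied up
        i = bisect_right(node.keys, k)
        split = self._insert(node.children[i], k)
        if split is None:
            return None
        sep, right = split
        node.keys.insert(i, sep)
        node.children.insert(i + 1, right)
        if len(node.children) <= self.M:
            return None
        j = math.ceil((self.M + 1) / 2)       # pointers that stay on the left
        up = node.keys[j - 1]                 # this key moves to the parent
        new_right = Internal(node.keys[j:], node.children[j:])
        node.keys, node.children = node.keys[:j - 1], node.children[:j]
        return up, new_right

    def leaf_keys(self):
        """All keys in order, read by walking the leaf chain."""
        node = self.root
        while isinstance(node, Internal):
            node = node.children[0]
        out = []
        while node is not None:
            out.extend(node.keys)
            node = node.next
        return out

t = BPlusTree(M=3, L=2)
for key in (9, 5, 1, 7, 3, 12):
    t.insert(key)
print(t.root.keys, t.leaf_keys())
```

Returning the separator and new sibling from the recursive call avoids parent pointers: each level absorbs a split from below or passes its own split upward, and only `insert` creates a new root.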

Inserting into B-Trees: The Idea
[Figure: insertion example, continued]

Inserting into B-Trees: The Idea
[Figure: insertion example, continued]

Inserting into B-Trees: The Idea
[Figure: insertion example, final tree, including a node 8:13]

Example of Insertions into a B+ tree with M=3, L=2
Insertion Sequence: 9, 5, 1, 7, 3, 12
[Figure: the tree after each insertion]

Deleting From B-Trees (NOT THE FULL ALGORITHM)
Delete X: do a find and remove from leaf.
› Leaf underflows: borrow from a neighbor. E.g. 11
› Leaf underflows and can't borrow: merge nodes, delete parent.
[Figure: example tree]

Run Time Analysis of B-Tree Operations
For a B-Tree of order M:
› Each internal node has up to M-1 keys to search.
› Each internal node has between ⌈M/2⌉ and M children.
› Depth of a B-Tree storing N items is O(log_⌈M/2⌉ N).
Find: run time is:
› O(log M) to binary search which branch to take at each node. But M is small compared to N.
› Total time to find an item is O(depth * log M) = O(log N), because log_⌈M/2⌉ N = log N / log ⌈M/2⌉ and the ratio log M / log ⌈M/2⌉ is bounded by a small constant.
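To get a feel for how the depth bound shrinks as M grows, the sketch below evaluates ⌈log_⌈M/2⌉ N⌉ with integer arithmetic. The function name `depth_bound` is made up, and the result only evaluates the big-O expression. It is not an exact tree height, since it ignores the leaf capacity L and constant factors.

```python
def depth_bound(n_items, M):
    """Smallest d with ceil(M/2)**d >= n_items, i.e. ceil(log_{ceil(M/2)} N)."""
    if M < 3:
        raise ValueError("need M >= 3 so that ceil(M/2) >= 2")
    branching = -(-M // 2)  # ceil(M/2) without floating point
    depth, capacity = 0, 1
    while capacity < n_items:
        capacity *= branching
        depth += 1
    return depth

for M in (3, 128, 256):
    print("M =", M, "-> depth bound for 1,000,000 items:", depth_bound(1_000_000, M))
```

For one million items this gives 20 levels at M = 3 but only 4 at M = 128 and 3 at M = 256, which is why large orders suit disk-based indexes.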

How Do We Select the Order M?
- In internal memory, small orders, like 3 or 4, are fine.
- On disk, we have to worry about the number of disk accesses to search the index and get to the proper leaf.
Rule: Choose the largest M so that an internal node can fit into one physical block of the disk.
This leads to typical M's between 32 and 256 and keeps the trees as shallow as possible.
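The rule can be turned into a small calculation if an internal node is modeled as M pointers plus M-1 keys, with no other per-node overhead. That model, the function name `largest_order`, and the block, key, and pointer sizes below are all assumptions for illustration.

```python
def largest_order(block_bytes, key_bytes, pointer_bytes):
    """Largest M such that M pointers plus M-1 keys fit in one block.

    M*pointer + (M-1)*key <= block  implies  M <= (block + key) / (pointer + key).
    Node headers and other overhead are ignored in this simplified model.
    """
    return (block_bytes + key_bytes) // (pointer_bytes + key_bytes)

# Hypothetical sizes, in bytes.
print(largest_order(4096, 8, 8))   # 4 KB block, 8-byte keys and pointers
print(largest_order(4096, 32, 8))  # same block, 32-byte keys
print(largest_order(1024, 24, 8))  # 1 KB block, 24-byte keys
```

With these assumed sizes the results are 256, 103, and 32, which fall within the typical range quoted above.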

Summary of B+-Trees
Problem with Binary Search Trees: must keep the tree balanced to allow fast access to stored items.
Multi-way search trees (e.g. B-Trees and B+-Trees):
› More than two children per node allows shallow trees; all leaves are at the same depth.
› The tree is kept balanced at all times.
› Excellent for indexes in database systems.