1
B+-tree Implementation
Donghui Zhang COM 3315 lecture slides CCIS, Northeastern Univ.
2
Goal
Combine the following knowledge (chapters 9 and 10) with practice: B+-tree; file organization: paginated, some pages may be empty; buffer management; disk page layout: containing fixed-length or variable-length records; C++.
3
Problem Statement
Build a B+-tree on top of a paginated file using alternative 1, i.e. the data records are stored in the index itself. Each data record contains: int key, string value. For simplicity, assume no two records have the same key. Index pages use a fixed-length layout; leaf pages use a variable-length layout.
4
Example B+ Tree
[Figure: example B+-tree. Root: 17. Index level: (5, 13) and (24, 30). Leaf level: 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*.]
Each tree node should map to a disk page in a file. Besides index pages and leaf pages, the file needs a header page and some empty pages (to avoid compacting the file too often).
5
B+-tree File Organization
[Figure: file layout as a sequence of pages: Header Page, Index Page, Leaf Page, Empty Page, Leaf Page, Empty Page; a "root" label marks the root page.]
Header page: points to the root page and the first empty page.
Index pages.
Leaf pages: form a doubly linked list.
Empty pages: form a linked list.
6
Implementation of Pages
    class Page {
    public:
        int type; // 1: header, 2: empty, 3: index, 4: leaf
    };

    class HeaderPage : public Page {
    public:
        int rootPage;
        int firstEmptyPage;
        ... // possibly num of records, level of tree
        char dummy[ PageSize - sizeof(int) * 3 ]; // pad to a full page
        ... // functions. Constructor: set type=1
    };
7
Implementation of Pages (Cont.)
Every page has the same size!

    const int PageSize = 8192; // 8 KB

    class EmptyPage : public Page {
    public:
        int next; // -1 means no next
        char dummy[ PageSize - sizeof(int) * 2 ]; // pad to a full page
        ... // functions. Constructor: set type=2
    };

The in-memory page objects need to be linked with the disk pages in the file!
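The padding arithmetic on these two slides can be checked at compile time. The sketch below is not part of the original slides: it assumes a 4-byte int, uses struct so that members are public, fills in the elided constructors with guessed initial values (-1 for "none"), and adds static_assert lines as an illustration.

```cpp
#include <cstddef>

const int PageSize = 8192; // 8 KB, as in the slides

struct Page {
    int type; // 1: header, 2: empty, 3: index, 4: leaf
};

struct HeaderPage : Page {
    int rootPage;
    int firstEmptyPage;
    char dummy[PageSize - sizeof(int) * 3]; // pad to a full page
    // Assumption: -1 marks "no root yet" and "no empty page".
    HeaderPage() : rootPage(-1), firstEmptyPage(-1) { type = 1; }
};

struct EmptyPage : Page {
    int next; // -1 means no next empty page
    char dummy[PageSize - sizeof(int) * 2]; // pad to a full page
    EmptyPage() : next(-1) { type = 2; }
};

// Compilation fails if a page class does not occupy exactly one disk page.
static_assert(sizeof(HeaderPage) == PageSize, "HeaderPage must fill one page");
static_assert(sizeof(EmptyPage) == PageSize, "EmptyPage must fill one page");
```

If a field is later added to a page class without shrinking dummy, the build breaks instead of the file layout silently shifting.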
8
Buffer Management in a DBMS
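[Figure: page requests from higher levels arrive at the buffer pool in main memory; each frame holds a disk page or is free; pages are transferred between the buffer pool and the DB on disk; the choice of frame is dictated by the replacement policy.]
Data must be in RAM for the DBMS to operate on it!
A table of <frame#, pageid> pairs is maintained.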
9
Buffer Management Implementation
    const int BufferSize = 128; // 128 * 8 KB = 1 MB buffer

    class BufferEntry {
    public:
        int pageid;
        Page* page;
        bool dirty;
        int pinCount;
    };

    class Buffer {
    public:
        BufferEntry entries[ BufferSize ];
        int num; // number of entries currently in use
    };
10
    Page* BTree::ReadPage( int pageid ) {
        if ( pageid is in buffer )
            return the page pointer;
        else {
            if ( buffer.num == BufferSize ) {
                choose a victim page to evict;
                if the page is dirty, write it to the file;
            }
            // read the page from file
            EmptyPage* page = new EmptyPage(); // any page type works: all pages have the same size
            fseek( file, pageid*PageSize, SEEK_SET );
            fread( page, PageSize, 1, file );
            insert (pageid, page) into buffer;
            return page;
        }
    }
11
    void BTree::WritePage( int pageid, Page* page ) {
        if ( pageid is in buffer )
            mark as dirty;
        else {
            if ( buffer.num == BufferSize ) {
                choose a victim page to evict;
                if the page is dirty, write it to the file;
            }
            insert pageid and page into buffer;
            mark as dirty;
        }
    }
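The ReadPage/WritePage pseudocode can be made concrete in several ways. The following is only a sketch under stated assumptions, not the lecture's implementation: MiniBuffer, Frame, fetch, unpin and flushAll are hypothetical names, frames hold raw page bytes instead of Page objects, and the replacement policy is simply "first unpinned frame" rather than LRU or clock.

```cpp
#include <cstdio>
#include <cstring>
#include <vector>

const int PageSize = 8192;

struct Frame {
    int pageid = -1;      // -1 marks a free frame
    bool dirty = false;
    int pinCount = 0;
    char bytes[PageSize]; // raw page image
};

class MiniBuffer {
    std::FILE* file;
    std::vector<Frame> frames;

    // Write a dirty frame back to its position in the file.
    void flush(Frame& f) {
        if (f.pageid >= 0 && f.dirty) {
            std::fseek(file, static_cast<long>(f.pageid) * PageSize, SEEK_SET);
            std::fwrite(f.bytes, PageSize, 1, file);
            f.dirty = false;
        }
    }

public:
    MiniBuffer(std::FILE* f, int numFrames) : file(f), frames(numFrames) {}

    // Returns the frame holding pageid, pinned; nullptr if every frame is pinned.
    Frame* fetch(int pageid) {
        Frame* victim = nullptr;
        for (Frame& f : frames) {
            if (f.pageid == pageid) { ++f.pinCount; return &f; } // buffer hit
            if (!victim && f.pinCount == 0) victim = &f;         // first unpinned frame
        }
        if (!victim) return nullptr;
        flush(*victim); // write back the old page before reusing the frame
        std::fseek(file, static_cast<long>(pageid) * PageSize, SEEK_SET);
        if (std::fread(victim->bytes, PageSize, 1, file) != 1)
            std::memset(victim->bytes, 0, PageSize); // page not on disk yet: start zeroed
        victim->pageid = pageid;
        victim->pinCount = 1;
        return victim;
    }

    // Release one pin; pass modified=true if the caller changed the page.
    void unpin(Frame* f, bool modified) {
        if (modified) f->dirty = true;
        if (f->pinCount > 0) --f->pinCount;
    }

    // Write all dirty frames to the file (what the destructor slide calls for).
    void flushAll() {
        for (Frame& f : frames) flush(f);
        std::fflush(file);
    }
};
```

A caller would pair every fetch with an unpin; a frame whose pinCount is above zero is never chosen as a victim, which is what keeps pinned pages in memory.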
12
B+-tree Class

    class BTree {
    public:
        Buffer buffer;
        FILE* file;
        HeaderPage* header;

        BTree( char* filename, bool exists );
        ~BTree();
        Page* ReadPage( int pageid );
        void WritePage( int pageid, Page* );
        void Insert( int key, string s );
        void Delete( int key );
        string Search( int key );
    };
13
Constructor & destructor
    BTree::BTree( char* filename, bool exists ) {
        if ( exists ) {
            file = fopen( filename, "rb+" ); // open existing file (binary mode)
            header = (HeaderPage*) ReadPage( 0 );
        }
        else {
            file = fopen( filename, "wb+" ); // create file (binary mode)
            header = new HeaderPage;
            WritePage( 0, header );
        }
    }

    BTree::~BTree() {
        write dirty pages in buffer to file;
        fclose( file );
    }
14
Index Page: Fixed Length Records
    typedef struct { int pageid; int router; } Entry;
    const int MaxEntries = 998;

    class IndexPage : public Page {
    public:
        int N; // number of entries in use
        Entry entries[MaxEntries];
        char dummy[ PageSize - sizeof(Entry)*MaxEntries - sizeof(int)*2 ];
        ...
    };

[Figure: packed fixed-length page layout: Slot 1, Slot 2, ..., Slot N, followed by free space; N is the number of records.]
15
Insertion into Index Page
Currently three entries. Insert 20? Assume there is no overflow.

    void IndexPage::Insert( int pageid, int router ) {
        int i = 0;
        while ( i < N && entries[i].router < router ) i++; // stop at the end of the used entries
        shift entries i .. N-1 one slot to the right;
        entries[i].router = router;
        entries[i].pageid = pageid;
        N++;
    }

Search in an index page?
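One possible answer to the search question, as a binary-search sketch. The slides do not state which key range each entry's pageid covers, so the convention below is an assumption: entry i covers keys from its router up to (but excluding) the next router, and keys smaller than the first router also go to the first entry. SearchIndexEntries is a hypothetical free function, not a member from the slides.

```cpp
#include <algorithm>

struct Entry { int pageid; int router; };

// Assumed convention: entries[0..n) are sorted by router, n >= 1, and
// entries[i].pageid covers keys k with entries[i].router <= k < entries[i+1].router.
// Keys below entries[0].router are also sent to entries[0].
int SearchIndexEntries(const Entry* entries, int n, int key) {
    // upper_bound returns the first entry whose router is greater than key.
    const Entry* it = std::upper_bound(
        entries, entries + n, key,
        [](int k, const Entry& e) { return k < e.router; });
    if (it == entries) return entries[0].pageid; // key is below every router
    return (it - 1)->pageid;                     // last entry with router <= key
}
```

With routers 5, 13 and 24, a search for key 20 follows the entry with router 13, using O(log N) comparisons instead of a linear scan.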
16
Leaf Page: Variable Length Records
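[Figure: variable-length page layout for page i. Records are addressed by Rid = (i,1), (i,2), ..., (i,N). The slot directory holds N slots (example entries 20, 16, 24), the number of slots N, and a pointer to the start of free space.]
Every record: int key, int valueSize, string value.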
17
Leaf Page Implementation
    const int MaxRecords = 50;
    typedef struct { int key; int valueSize; char value[1]; } Record;

    class LeafPage : public Page {
    public:
        int N;
        int startFree;
        int prev, next;
        int offsets[ MaxRecords ]; // -1 means the slot is not occupied
        char data[ PageSize - sizeof(int) * (MaxRecords+5) ];
    };

To access the record at slot k:

    Record* rec = (Record*)(data + offsets[k]);
    rec->key = 5;
18
Insert into Leaf Page
Assume there is enough space.

    void LeafPage::Insert( int key, string value ) {
        int k = 0;
        while ( offsets[k] != -1 ) k++; // first unoccupied slot
        offsets[k] = startFree;
        startFree += sizeof(int)*2 + value.size();
        Record* rec = (Record*)(data + offsets[k]);
        rec->key = key;
        rec->valueSize = value.size();
        strncpy( rec->value, value.c_str(), value.size() );
        N++;
    }

Search in a leaf page?
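One possible answer to the leaf search question, as a self-contained sketch. It is not from the slides: the Search signature (bool plus an output string) is hypothetical, the scan is linear because the Insert above fills the first free slot and so leaves slots unordered by key, and fields are copied with memcpy to avoid unaligned int access when value lengths are not multiples of 4.

```cpp
#include <cstring>
#include <string>

const int PageSize = 8192;
const int MaxRecords = 50;

struct LeafPage {
    int type = 4;
    int N = 0;
    int startFree = 0;
    int prev = -1, next = -1;
    int offsets[MaxRecords]; // -1 means the slot is not occupied
    char data[PageSize - sizeof(int) * (MaxRecords + 5)];

    LeafPage() { for (int& o : offsets) o = -1; }

    // Mirrors the slide's Insert; the caller must ensure a free slot and enough space.
    void Insert(int key, const std::string& value) {
        int k = 0;
        while (offsets[k] != -1) k++;
        offsets[k] = startFree;
        int size = static_cast<int>(value.size());
        char* p = data + startFree;
        std::memcpy(p, &key, sizeof(int));
        std::memcpy(p + sizeof(int), &size, sizeof(int));
        std::memcpy(p + 2 * sizeof(int), value.data(), value.size());
        startFree += static_cast<int>(2 * sizeof(int)) + size;
        N++;
    }

    // Linear scan over the slot directory; returns false if key is absent.
    bool Search(int key, std::string& out) const {
        for (int k = 0; k < MaxRecords; k++) {
            if (offsets[k] == -1) continue; // skip unoccupied slots
            const char* p = data + offsets[k];
            int recKey, size;
            std::memcpy(&recKey, p, sizeof(int));
            if (recKey != key) continue;
            std::memcpy(&size, p + sizeof(int), sizeof(int));
            out.assign(p + 2 * sizeof(int), size);
            return true;
        }
        return false;
    }
};

static_assert(sizeof(LeafPage) == PageSize, "LeafPage must fill one page");
```

Binary search would additionally require keeping the slot directory sorted by key on every insert and delete.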
19
Search in B+-tree

    string BTree::Search( int key ) {
        int pageid = header->rootPage;
        Page* page = ReadPage( pageid );
        while ( page->type == 3 ) { // index page
            pageid = ((IndexPage*)page)->Search( key );
            page = ReadPage( pageid );
        }
        return ((LeafPage*)page)->Search( key );
    }
20
Some Issues
Search in a page (leaf or index) should be binary search.
In the B+-tree insertion/deletion algorithm, all pages along the update path should be pinned while browsing down. Reason?
The header page and the root page should be pinned in memory.
To free a page (which occurs when merging two sibling pages during deletion), insert it into the empty page list.
To allocate a new page, try to use an empty page first. If no empty page is present, allocate at the end of the file.
Other types of key, value.
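The allocate/free rules above can be sketched with an in-memory model of the empty page list. This is an illustration under assumptions, not code from the lecture: FreeList is a hypothetical name, the nextEmpty vector stands in for the next field stored inside each EmptyPage on disk, and page 0 is taken to be the header page.

```cpp
#include <vector>

struct FreeList {
    int firstEmptyPage;         // kept in the header page in the slides
    int numPages;               // pages currently in the file
    std::vector<int> nextEmpty; // stand-in for EmptyPage::next, indexed by pageid

    // The file starts with only the header page (page 0).
    FreeList() : firstEmptyPage(-1), numPages(1), nextEmpty(1, -1) {}

    // Reuse an empty page if one exists; otherwise extend the file.
    int Allocate() {
        if (firstEmptyPage != -1) {
            int pageid = firstEmptyPage;
            firstEmptyPage = nextEmpty[pageid];
            nextEmpty[pageid] = -1;
            return pageid;
        }
        nextEmpty.push_back(-1);
        return numPages++;
    }

    // Push the freed page onto the head of the empty page list.
    void Free(int pageid) {
        nextEmpty[pageid] = firstEmptyPage;
        firstEmptyPage = pageid;
    }
};
```

Because freed pages are pushed onto and popped from the head of the list, both operations touch only the header page and one empty page.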