B-Trees
B-trees are characterized in the following way:
- they are search trees
- they have large branching factors
- they are good for organizing data on external media
B-Trees and Disk Drives - 1
- An internal node x contains n[x] keys and has n[x]+1 children
- Branching factors are typically between 50 and 2000
- The height is very shallow, which minimizes disk accesses
- The node size is usually chosen to match a whole disk page (the unit of disk transfer)
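To make the link between node size and branching factor concrete, here is a minimal sketch that estimates how many children fit in one node when the node must fit in a single disk block. The function name `max_branching_factor` and the byte sizes used in the example (4096-byte block, 8-byte keys, 8-byte child pointers, 16-byte header) are illustrative assumptions, not values taken from the slides.

```python
def max_branching_factor(block_size: int, key_size: int,
                         pointer_size: int, header_size: int = 0) -> int:
    """Largest number of children m such that one node fits in a block.

    A node with m children stores m - 1 keys and m child pointers, so it
    needs header_size + (m - 1) * key_size + m * pointer_size bytes.
    Solving that inequality for m gives the expression below.
    """
    usable = block_size - header_size
    return (usable + key_size) // (key_size + pointer_size)


# Hypothetical layout: 4096-byte block, 8-byte keys, 8-byte pointers,
# and 16 bytes reserved for n[x] and leaf[x].
print(max_branching_factor(4096, 8, 8, header_size=16))  # 255
```

Under these assumed sizes the result falls inside the typical range quoted above; larger blocks or smaller keys push it higher.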
B-Trees and Disk Drives - 2
- Assume each node holds 1000 keys (a branching factor of 1001)
- Only the root node is kept in memory
- With one disk access, over 1,000,000 keys can be reached; with two disk accesses, over one billion keys can be reached
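The arithmetic behind these counts can be checked with a short sketch. It assumes every node is full and holds 1000 keys, so each node has 1001 children; that figure is an assumption chosen so that the totals exceed the thresholds quoted above. The function name `keys_reachable` is hypothetical.

```python
def keys_reachable(keys_per_node: int, disk_accesses: int) -> int:
    """Count the keys stored at depths 0..disk_accesses of a full B-tree.

    The root (depth 0) is assumed to be in memory, so reaching a node at
    depth d costs d disk accesses.
    """
    branching = keys_per_node + 1          # children per full node
    total = 0
    for depth in range(disk_accesses + 1):
        nodes_at_depth = branching ** depth
        total += nodes_at_depth * keys_per_node
    return total


for accesses in range(3):
    print(accesses, keys_reachable(1000, accesses))
# 0 1000
# 1 1002000
# 2 1003003000
```

The total always equals (keys_per_node + 1) ** (disk_accesses + 1) - 1, which is why each extra disk access multiplies the reachable key count by roughly the branching factor.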
B-Tree Definitions
- Every node x contains the number of keys (n[x]), the keys in nondecreasing order, and a boolean (leaf[x]) indicating whether x is a leaf
- An internal node contains n[x]+1 pointers to its children
- The keys of a node separate the ranges of keys stored in its subtrees
- Every leaf is at the same depth, which is the height of the tree
- t (>= 2) is the minimum degree of the tree
  - every node (but the root) must have at least t-1 keys
  - every node has at most 2t-1 keys and 2t children
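The definitions above can be expressed as a small in-memory sketch: a node type holding the keys, the leaf flag, and the child pointers, plus a checker that verifies each listed property. The names `BTreeNode` and `check_btree` are hypothetical, and a real implementation would store nodes on disk rather than as Python objects. The checker does not require a minimum number of keys in the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BTreeNode:
    keys: List[int] = field(default_factory=list)       # n[x] == len(keys)
    leaf: bool = True                                    # leaf[x]
    children: List["BTreeNode"] = field(default_factory=list)


def check_btree(x: BTreeNode, t: int, is_root: bool = True,
                lo: Optional[int] = None, hi: Optional[int] = None) -> int:
    """Return the height of the subtree rooted at x, or raise ValueError
    if any of the B-tree properties for minimum degree t is violated."""
    n = len(x.keys)
    if n > 2 * t - 1:
        raise ValueError("node has more than 2t-1 keys")
    if not is_root and n < t - 1:
        raise ValueError("non-root node has fewer than t-1 keys")
    if any(a > b for a, b in zip(x.keys, x.keys[1:])):
        raise ValueError("keys are not in nondecreasing order")
    # Keys must lie in the range set by the separating keys of the parent.
    if any((lo is not None and k < lo) or (hi is not None and k > hi)
           for k in x.keys):
        raise ValueError("key outside the range set by the parent")
    if x.leaf:
        if x.children:
            raise ValueError("leaf node has children")
        return 0
    if len(x.children) != n + 1:
        raise ValueError("internal node must have n[x]+1 children")
    bounds = [lo] + x.keys + [hi]
    heights = {
        check_btree(child, t, False, bounds[i], bounds[i + 1])
        for i, child in enumerate(x.children)
    }
    if len(heights) != 1:
        raise ValueError("leaves are not all at the same depth")
    return heights.pop() + 1


# A valid tree for t = 2: each non-root node has between 1 and 3 keys.
root = BTreeNode(
    keys=[10, 20],
    leaf=False,
    children=[BTreeNode([1, 5]), BTreeNode([12]), BTreeNode([25, 30, 40])],
)
print(check_btree(root, t=2))  # 1
```

The same tree fails the check for t = 3, because the middle leaf holds only one key while t-1 = 2 are required.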
Height of a B-Tree
If $n \ge 1$, then for any $n$-key B-tree of height $h$ and minimum degree $t \ge 2$,
$h \le \log_t \frac{n+1}{2}$
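A sketch of the counting argument behind the height bound $h \le \log_t \frac{n+1}{2}$, following the standard textbook proof. It assumes the root holds at least one key and every other node holds at least $t-1$ keys. A tree of height $h$ then has at least 2 nodes at depth 1, at least $2t$ at depth 2, and in general at least $2t^{i-1}$ nodes at depth $i$:

```latex
\[
\begin{aligned}
n &\ge 1 + (t-1)\sum_{i=1}^{h} 2t^{\,i-1} \\
  &= 1 + 2(t-1)\,\frac{t^{h}-1}{t-1} \\
  &= 2t^{h} - 1 ,
\end{aligned}
\qquad\text{so}\qquad
t^{h} \le \frac{n+1}{2}
\quad\Longrightarrow\quad
h \le \log_t \frac{n+1}{2}.
\]
```

For example, with $t = 2$ and $n = 7$ the bound gives $h \le \log_2 4 = 2$.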