Published by Shon Morrison. Modified over 9 years ago.
1
Index tuning-- B+tree
2
overview
3
B+-Tree Locking
Tree Traversal
– Update, Read
– Insert, Delete
Phantom problem: need for range locking
ARIES KVL (implemented in DB2)
Tree Traversal (next page)
Lock on tuples
Lock on key values
Range locking:
– Next key lock
[Figure labels: 4, 24]
© Dennis Shasha, Philippe Bonnet 2001
4
B+-Tree Locking
[Figure: tree with nodes A, B, C, D, E, F; labels: T1, lock]
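The next-key idea named on the locking slide can be illustrated with a toy model. This is a heavily simplified sketch, not ARIES/KVL itself: the class `NextKeyLocks`, its methods, and the `END` sentinel are hypothetical names, only shared locks taken by range scans are tracked, and lock durations and lock release are ignored. A scan locks every key it finds plus the first key beyond the range; an insert must check the key that would follow the new entry, so it is refused while another transaction's scan covers that gap.

```python
import bisect

END = float("inf")  # hypothetical sentinel standing for "end of index"


class NextKeyLocks:
    """Toy lock table over the key values of one index (illustrative only)."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.shared = {}  # key value -> set of transaction ids holding a shared lock

    def next_key(self, k):
        """Smallest existing key strictly greater than k, or END."""
        i = bisect.bisect_right(self.keys, k)
        return self.keys[i] if i < len(self.keys) else END

    def scan(self, txn, lo, hi):
        """Read keys in [lo, hi]; lock each of them and the next key after hi."""
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        found = self.keys[i:j]
        for k in found + [self.next_key(hi)]:
            self.shared.setdefault(k, set()).add(txn)
        return found

    def try_insert(self, txn, k):
        """Insert k unless another transaction holds a lock on k's next key."""
        holders = self.shared.get(self.next_key(k), set()) - {txn}
        if holders:
            return False  # inserting k would create a phantom for a scanner
        bisect.insort(self.keys, k)
        return True


index = NextKeyLocks([10, 20, 30, 40])
index.scan("T1", 15, 25)                   # T1 locks key 20 and next key 30
blocked = not index.try_insert("T2", 22)   # 22 falls in the gap guarded by 30
allowed = index.try_insert("T2", 35)       # next key is 40, which T1 did not lock
```

Note that the scheme locks slightly more than the scanned range: an insert of 12 would also be refused here, because its next key (20) is locked.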
5
Bulk Loading of a B+ Tree
If we have a large collection of records and want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. Bulk loading can be done much more efficiently.
Initialization: Sort all data entries, then insert a pointer to the first (leaf) page in a new (root) page.
6
Bulk Loading (Contd.)
Add entries for the following leaf pages to the root page.
7
Bulk Loading (Contd.)
Split the root and create a new root page.
8
Bulk Loading (Contd.)
Index entries for leaf pages are always entered into the rightmost index page just above the leaf level. When this page fills up, it splits. (The split may propagate up the rightmost path to the root.)
9
Much faster than repeated inserts, especially when one considers locking!
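The bulk-loading steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not a disk-based implementation: `Node`, `bulk_load`, `contains`, and the parameters `leaf_capacity` and `max_children` are hypothetical names, keys are assumed distinct, and a full page is split in the middle. The list `path` holds the rightmost path from the root down to the index page just above the leaves, which is the only place new entries are ever added.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)      # leaf: data entries; index page: separators
    children: list = field(default_factory=list)  # empty for leaf pages

    @property
    def is_leaf(self):
        return not self.children


def _append_rightmost(path, level, key, child, max_children):
    """Add (key, child) to the rightmost page at `level`; split upward on overflow."""
    node = path[level]
    node.keys.append(key)
    node.children.append(child)
    if len(node.children) <= max_children:
        return
    mid = len(node.keys) // 2
    up_key = node.keys[mid]  # separator pushed up to the parent
    right = Node(keys=node.keys[mid + 1:], children=node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    path[level] = right  # the new right sibling is now on the rightmost path
    if level == 0:
        # The root itself split: create a new root above it.
        path.insert(0, Node(keys=[up_key], children=[node, right]))
    else:
        _append_rightmost(path, level - 1, up_key, right, max_children)


def bulk_load(entries, leaf_capacity=4, max_children=4):
    """Sort the entries, pack them into leaves, then feed leaves in from the right."""
    data = sorted(entries)
    leaves = [Node(keys=data[i:i + leaf_capacity])
              for i in range(0, len(data), leaf_capacity)]
    if not leaves:
        return Node()
    path = [Node(children=[leaves[0]])]  # new root pointing at the first leaf page
    for leaf in leaves[1:]:
        _append_rightmost(path, len(path) - 1, leaf.keys[0], leaf, max_children)
    return path[0]


def contains(node, key):
    """Descend from the root to a leaf and test membership."""
    while not node.is_leaf:
        i = 0
        while i < len(node.keys) and key >= node.keys[i]:
            i += 1
        node = node.children[i]
    return key in node.keys


root = bulk_load(range(1, 41), leaf_capacity=2, max_children=3)
```

Because every insertion touches only the rightmost path, no search from the root is needed per entry, which is where the saving over repeated inserts comes from.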
10
Comparison: B-trees vs. static indexed sequential file
Ref. #1: Held & Stonebraker, “B-Trees Re-examined”, CACM, Feb. 1978
Ref. #1 claims:
- Concurrency control is harder in B-trees
- B-tree consumes more space
For their comparison:
block = 512 bytes
key = pointer = 4 bytes
4 data records per block
11
Example: 1 block static index
127 keys: (127+1) x 4 = 512 bytes
-> pointers in index implicit! Up to 127 blocks.
[Figure: index block with keys k1, k2, k3, ... above contiguous data blocks; 1 data block shown]
12
Example: 1 block B-tree
63 keys: 63 x (4+4) + 8 = 512 bytes
-> pointers needed in B-tree because index blocks are not contiguous. Up to 63 blocks.
[Figure: index block with keys k1, k2, ..., k63 and a next pointer, above data blocks; 1 data block shown]
13
Size comparison (Ref. #1)

Static index                      B-tree
# data blocks            height   # data blocks             height
2 -> 127                 2        2 -> 63                   2
128 -> 16,129            3        64 -> 3,969               3
16,130 -> 2,048,383      4        3,970 -> 250,047          4
                                  250,048 -> 15,752,961     5
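The block arithmetic and the size ranges follow from the stated comparison parameters (512-byte blocks, 4-byte keys and pointers). The sketch below recomputes them from powers of the fanout, for example 127^2 = 16,129 and 63^2 = 3,969. The names `height_ranges`, `static_fanout`, and `btree_fanout` are illustrative, and the 8 bytes of per-block overhead are taken from the "+8" in the B-tree example without assuming what they hold.

```python
BLOCK_BYTES = 512
KEY_BYTES = 4
POINTER_BYTES = 4
BTREE_OVERHEAD_BYTES = 8  # the "+8" in the B-tree block arithmetic

# Static index: keys only, one 4-byte slot left over: (127 + 1) * 4 = 512.
static_fanout = BLOCK_BYTES // KEY_BYTES - 1
# B-tree: each key needs an explicit pointer: 63 * (4 + 4) + 8 = 512.
btree_fanout = (BLOCK_BYTES - BTREE_OVERHEAD_BYTES) // (KEY_BYTES + POINTER_BYTES)


def height_ranges(fanout, max_height):
    """Return (height, fewest data blocks, most data blocks) for each height."""
    rows = []
    for height in range(2, max_height + 1):
        low = 2 if height == 2 else fanout ** (height - 2) + 1
        high = fanout ** (height - 1)
        rows.append((height, low, high))
    return rows


for name, fanout, max_height in [("static index", static_fanout, 4),
                                 ("B-tree", btree_fanout, 5)]:
    for height, low, high in height_ranges(fanout, max_height):
        print(f"{name:12s} height {height}: {low:,} -> {high:,} data blocks")
```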
14
Ref. #1 analysis claims
For an 8,000 block file,
after 32,000 inserts
after 16,000 lookups
Static index saves enough accesses to allow for reorganization
Ref. #1 conclusion
Static index better!!
15
Ref. #2: M. Stonebraker, “Retrospective on a database system,” TODS, June 1980
Ref. #2 conclusion
B-trees better!!
DBA does not know when to reorganize
DBA does not know how full to load pages of new index
Buffering
– B-tree: has fixed buffer requirements
– Static index: must read several overflow blocks to be efficient (large & variable size buffers needed for this)
16
Speaking of buffering …
Is LRU a good policy for B+tree buffers?
Of course not!
Should try to keep the root in memory at all times (and perhaps some nodes from the second level)
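One way to see the point about LRU is a toy buffer pool in which some pages can be pinned. This is only a sketch: `PinnedLRU` and the page names are hypothetical, and the access trace is an assumed workload in which each lookup is followed by a scan across several leaves, which is enough to push the root out of a small plain-LRU pool.

```python
from collections import OrderedDict


class PinnedLRU:
    """LRU buffer pool that never evicts pages listed in `pinned`."""

    def __init__(self, capacity, pinned=()):
        self.capacity = capacity
        self.pinned = set(pinned)
        self.pages = OrderedDict()  # page id -> True, least recently used first
        self.misses = 0

    def access(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return
        self.misses += 1  # page must be read from disk
        if len(self.pages) >= self.capacity:
            victim = next((p for p in self.pages if p not in self.pinned), None)
            if victim is None:
                raise RuntimeError("every buffered page is pinned")
            del self.pages[victim]
        self.pages[page_id] = True


# Assumed trace: root, one inner node, then four leaves, repeated three times.
trace = ["root", "n1", "l1", "l2", "l3", "l4"] * 3

plain = PinnedLRU(capacity=3)
pinned = PinnedLRU(capacity=3, pinned={"root"})
for page in trace:
    plain.access(page)
    pinned.access(page)

print("plain LRU misses:  ", plain.misses)
print("root pinned misses:", pinned.misses)
```

With the root pinned, every traversal after the first finds the root in memory, while plain LRU re-reads it each time under this trace.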
17
Interesting problem:
For a B+tree, how large should n be?
(n is the number of keys per node)
18
Sample assumptions:
(1) Time to read a node from disk is (S + Tn) msec.
(2) Once the block is in memory, use binary search to locate the key: (a + b log_2 n) msec, for some constants a, b; assume a << S.
(3) Assume the B+tree is full, i.e., the number of nodes to examine is log_n N, where N = # records.
19
Can get: f(n) = time to find a record
[Figure: plot of f(n) against n, with the minimum at n_opt]
20
Find n_opt by solving f′(n) = 0.
Answer: n_opt = “few hundred” (see homework for details).
What happens to n_opt as:
- the disk gets faster?
- the CPU gets faster?
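Combining the three sample assumptions gives f(n) = log_n N x (S + Tn + a + b log_2 n): the number of nodes examined times the cost per node. The sketch below locates the minimum by brute force instead of solving f′(n) = 0. All constant values (N, S, T, a, b) are hypothetical placeholders chosen for illustration, not figures from the slides or the homework, and `lookup_time` and `find_n_opt` are illustrative names.

```python
import math


def lookup_time(n, N=10_000_000, S=14.0, T=0.01, a=0.0, b=0.002):
    """f(n) in msec, using hypothetical constants for N, S, T, a, b."""
    nodes_examined = math.log(N) / math.log(n)          # log_n N
    cost_per_node = (S + T * n) + (a + b * math.log2(n))  # disk read + binary search
    return nodes_examined * cost_per_node


def find_n_opt(max_n=5000, **constants):
    """Integer n in [2, max_n] that minimizes f(n)."""
    return min(range(2, max_n + 1), key=lambda n: lookup_time(n, **constants))


n_opt = find_n_opt()
print("n_opt with the placeholder constants:", n_opt)
print("n_opt with a faster disk (S halved): ", find_n_opt(S=7.0))
print("n_opt with a faster CPU (b halved):  ", find_n_opt(b=0.001))
```

Re-running `find_n_opt` with different values of S, T, and b is a quick way to explore the two questions above before working through the derivative by hand.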