Published by Irma Ellis. Modified over 8 years ago.
1
CENG 351 Indexed File Organization
2
Indexing-1
Indexing allows access to records based on a key, on which the file is stored and accessed.
– The address of a record is some function of the key.
– Student ID, social security ID, citizen ID, etc. are good candidates for an indexed file organization.
Simple indexing (content-based access to data) maintains a separate index file in addition to the data file.
– In this case, the index file is generally assumed to be small enough to fit in the available memory.
– The index file is handled as a sorted file of fixed-length records (or as an array).
3
Indexing-2
The data file is kept unsorted, possibly with variable-length records.
Simple indexing becomes cumbersome when the index file is too big to fit in memory and when too many updates are needed:
– Deleting records from, and inserting records into, the index file becomes costly.
– The problem is how to find a search method that does better than simple indexing.
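The simple indexing scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the records, the use of list positions as record addresses, and the names `data_file`, `index` and `lookup` are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical data file: unsorted records, addressed here by list position.
data_file = [
    (305, "Cage"), (101, "Adams"), (207, "Bolen"), (150, "Berne"),
]

# Simple index: fixed-length (key, address) pairs kept sorted by key in memory.
index = sorted((key, addr) for addr, (key, _) in enumerate(data_file))
keys = [k for k, _ in index]

def lookup(key):
    """Binary-search the in-memory index, then fetch the record from the data file."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return data_file[index[i][1]]
    return None

print(lookup(207))
print(lookup(999))
```

Every insertion or deletion has to keep `index` sorted, which is the update cost the slide refers to.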
4
Indexing-3
Techniques such as binary search trees or AVL trees can also be used; they take on the order of log N steps per search. For a large file, this may still be considered unaffordable…
As an alternative to binary-tree or AVL-based processing, multilevel indexing or hashing is suggested.
– In this category, three potential file organization methods for very large files can be mentioned: ISAM, B Trees, and hashing.
5
ISAM (Indexed Sequential Access Method)-1
Historically, an indexing method known as ISAM (Indexed Sequential Access Method) was used by well-known vendors of database management systems, such as IBM.
ISAM is generally based on a cylinder index and a block index.
– The cylinder index contains the highest record key in each cylinder.
– The block index contains the highest record key in each block.
6
ISAM (Indexed Sequential Access Method)-2
To access a record:
– First, the cylinder index is accessed, generally once for each disk, to find the cylinder on which the record is stored.
– Second, the index block on that cylinder (containing pointers to the data blocks) is accessed to find the address of the block on which the target record is located.
– Then, that block is accessed to retrieve the record.
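The three access steps can be sketched as follows. This is a minimal in-memory illustration, not an actual ISAM implementation: the nested-list layout, the sample keys and the function name `isam_find` are assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical layout: each cylinder is a list of blocks; each block is a sorted list of keys.
cylinders = [
    [[3, 8, 12], [15, 21, 27]],
    [[31, 36, 40], [44, 52, 59]],
]

# Cylinder index: highest key in each cylinder. Block index: highest key in each block.
cylinder_index = [cyl[-1][-1] for cyl in cylinders]
block_indexes = [[blk[-1] for blk in cyl] for cyl in cylinders]

def isam_find(key):
    """Return (cylinder, block, slot) for key, or None if the key is absent."""
    c = bisect_left(cylinder_index, key)       # step 1: consult the cylinder index
    if c == len(cylinders):
        return None                            # key is larger than every stored key
    b = bisect_left(block_indexes[c], key)     # step 2: consult the block index
    block = cylinders[c][b]                    # step 3: read the data block
    s = bisect_left(block, key)
    if s < len(block) and block[s] == key:
        return (c, b, s)
    return None

print(isam_find(36))
print(isam_find(13))
```

Because each index entry is the highest key of its cylinder or block, the first entry that is not smaller than the search key identifies where the record must be if it exists.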
7
ISAM (Indexed Sequential Access Method)-3
Accessing a single record requires one seek to the target cylinder containing the data, one r+btt for the index block, and one r+btt for the data block (s: seek time, r: rotational latency, btt: block transfer time):
$T_F = s + r + btt + r + btt = s + 2r + 2\,btt$
An overflow area is allocated at the end of each cylinder, to be used when needed.
– When a new record is added, the existing records are shifted to open up space.
– If the block has no room, the record with the largest key in that block is moved to the overflow area, and a pointer to the overflow block is placed in the current block.
8
ISAM (Indexed Sequential Access Method)-4
If there is no space in the primary area, new records cause the number of overflow records to grow.
After some time, with frequent insertions, there will be a long list of linked records in the overflow area.
Thus, ISAM performance degrades as new records are added to the overflow area, and the ISAM file eventually has to be reorganized, which is a costly process…
9
B+ Trees as an Indexed File Organization Method-1
A B+ Tree is a multilevel index organization. In a B+ Tree:
– Each node may have a variable number of children (within the limits set by the order of the tree).
– All leaves are on the same level.
Related variants are called B Trees and B* Trees, with some differences. The variant with data in the leaves and index entries in the internal nodes, called the B+ Tree, seems to be the most common.
B+ Trees are probably the most common file organization method currently in use.
10
B+ Tree example
Index set (root) keys: BOLEN | CAMP | EMBRY | FABER | FOLKS
Leaf blocks: 1: ADAMS-BERNE | 2: BOLEN-CAGE | 3: CAMP-DUTTON | 4: EMBRY-EVANS | 5: FABER-FOLK | 6: FOLKS-GADDIS
11
B+ Trees as an Indexed File Organization Method-2
Properties of B+ Trees:
– The order v of a B+ Tree is the minimum number of keys an internal node has. (Note that different authors may define order differently!)
– The root is an exception: it must have at least 2 children, unless it is a leaf.
– No internal node can have more than 2v keys.
– All the leaves are on the same level.
– Leaves contain the data records (or the addresses of the data records, in the case of a secondary index).
12
B+ tree: internal/root node structure
Node layout: $P_0\ K_1\ P_1\ K_2\ \dots\ P_{n-1}\ K_n\ P_n$
– Each $P_i$ is a pointer to a child node; each $K_i$ is a search key value.
– Number of search key values = n; number of pointers = n+1.
Requirements:
– $K_1 < K_2 < \dots < K_n$
– For any search key value K in the subtree pointed to by $P_i$:
  – if $P_i = P_0$, we require $K < K_1$;
  – if $P_i = P_n$, we require $K_n \le K$;
  – if $P_i = P_1, \dots, P_{n-1}$, we require $K_i \le K < K_{i+1}$.
13
B+ tree: leaf node structure
Node layout: $L\ K_1\ r_1\ K_2\ r_2\ \dots\ K_n\ r_n\ R$
– Pointer L points to the left neighbor; R points to the right neighbor.
– $K_1 < K_2 < \dots < K_n$
– $v \le n \le 2v$ (v is the order of this B+ tree)
– We will write $K_i$* for the pair $(K_i, r_i)$ and omit L and R for simplicity.
14
Example: B+ tree of order 1
Each node must hold at least 1 entry and at most 2 entries.
Root: 40
Internal nodes: 20 33 | 51 63
Leaves: 10* 15* | 20* 27* | 33* 37* | 40* 46* | 51* 55* | 63* 97*
15
Example: search in a B+ tree of order 2
Search: how do we find the records with a given search key value?
– Begin at the root, and use key comparisons to go down to a leaf.
Examples: search for 5*, 16*, all data entries >= 24*...
– The last one is a range search: we need to do a sequential scan, starting from the first leaf containing a value >= 24.
Root: 13 17 24 30
Leaves: 2* 3* 5* 7* | 14* 15* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*
16
Cost of searching for a record in a B+ Tree
Typically, a node is a page (block or cluster).
Let d be the height of the B+ tree: we need to read d+1 pages to reach a leaf node.
Let F be the (average) number of pointers in a node (for an internal node, called the fanout).
– Level 1 = 1 page = $F^0$ pages
– Level 2 = F pages = $F^1$ pages
– Level 3 = F * F pages = $F^2$ pages
– Level d+1 = … = $F^d$ pages (i.e., the leaf nodes)
– Suppose there are D data entries, so there are D/F leaf nodes.
– $D/F = F^d$, that is, $d = \log_F(D/F)$.
(Figure labels: level 1, 2; height 0, 1.)
17
B+ Trees: secondary key
Any secondary key can be indexed, but a separate B+ Tree needs to be maintained for each such key.
Properties of a B+ Tree:
– Leaves of a B+ Tree may also contain the address of the next leaf, for fast sequential access.
– The keys in a node are sorted.
– A given key in an internal node is actually the largest or the smallest key in the corresponding child node.
18
Searching a B+ Tree for a record
Note that if there are k keys in a node ($c_1, \dots, c_k$), there are k+1 pointers ($p_0, p_1, \dots, p_k$) in the same node, one for each descendant node.
Given a key x, start from the root and do the following until the corresponding record is reached at a leaf:
– For the smallest i in 1, …, k with $x < c_i$, take $p_{i-1}$.
– If $x \ge c_k$, take $p_k$.
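The pointer-selection rule maps directly onto a binary search inside the node. The sketch below is only illustrative: it reuses the keys of the order-2 search example from the earlier slide, keeps the leaves in a plain Python list, and the function names are assumptions.

```python
from bisect import bisect_right

# Root keys and leaves of the order-2 search example; the flat list layout is an assumption.
root_keys = [13, 17, 24, 30]
leaves = [[2, 3, 5, 7], [14, 15], [19, 20, 22], [24, 27, 29], [33, 34, 38, 39]]

def child_index(keys, x):
    """Index of the pointer to follow: p_(i-1) for the first c_i > x, or p_k if x >= c_k."""
    return bisect_right(keys, x)

def search(x):
    """Equality search: pick a pointer at the root, then look inside that leaf."""
    return x in leaves[child_index(root_keys, x)]

def range_search(low):
    """Range search: find the first candidate leaf, then scan the leaves left to right."""
    start = child_index(root_keys, low)
    return [k for leaf in leaves[start:] for k in leaf if k >= low]

print(search(5), search(16))
print(range_search(24))
```

`bisect_right` returns the number of keys that are less than or equal to x, which is exactly the pointer index the rule asks for.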
19
B+ Trees: timing computations
$T_F$ = index access time + data access time
If index access time = s + r + btt and data access time = s + r + dtt, then
$T_F = 2s + 2r + btt + dtt$
Note that dtt (data transfer time) implies a cluster (or bucket) holding the data, which is generally several blocks.
This computation assumes that the B+ Tree (the index) is kept in memory, except for the leaf nodes and the level just above them…
20
B+ Trees: size
Generally, B+ Tree nodes (internal and leaf) are expected to be about ln 2 (roughly 0.7) full; e.g., if the maximum number of keys is 200, the average occupancy would be 140 (= 0.7 x 200).
An example for forming a B+ Tree:
– Assuming that there are k data clusters (or blocks or buckets), the total number of internal nodes (blocks), excluding the bottom-most two levels, is $\sum_{i=2}^{\log_p k} k/140^i$, where the number of pointers is p = 140.
– Suppose the memory size is limited to b blocks and the target is only two disk accesses. For $\log_p k = 3$, there will be three levels above the bottom-most two levels. Thus $b = k/140^3 + k/140^2 + 1$, and there will be k/140 blocks in the level above the leaves, assumed to be on the disk.
21
B+ Trees and secondary key
For each secondary key, a separate B+ Tree is formed.
In this case, the bottom-most level contains pointers to the data records, rather than the data clusters themselves.
With respect to the secondary key, each record could be on any block or cluster.
22
B+ Trees and time considerations-1
The time to read the whole file (exhaustive read) in primary-key order, ignoring in-memory processing and assuming the entire index is in memory:
$T_{xp} = b \cdot (s + r + dtt)$
where $b = n / (\ln 2 \cdot m)$, n is the number of records, and m is the number of entries per node.
Assume that leaf nodes have links to the next node, but that the next node is not located contiguously.
All blocks are accessible once the first block is accessed.
23
B+ Trees and time considerations-2
The time to read the whole file in secondary-key order, assuming the leaves of the secondary B+ Tree are on the disk:
$T_{xs} = n \cdot (s + r + dtt) + b \cdot (s + r + btt)$
– The first term is reading the file record by record; each record may be in any cluster.
– The second term is reading the secondary key's B+ Tree, which has b blocks as leaf nodes.
This is too slow!
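To see why the secondary-key order is so much slower, the two exhaustive-read formulas can be evaluated side by side. The disk parameters and file size below are assumed, illustrative values (in milliseconds) and do not come from the slides; the function name is hypothetical.

```python
import math

def exhaustive_read_times(n, m, s, r, btt, dtt):
    """Return (T_xp, T_xs) in the same time unit as s, r, btt and dtt."""
    b = n / (math.log(2) * m)                       # leaf blocks at ln 2 occupancy
    t_xp = b * (s + r + dtt)                        # primary order: one access per leaf cluster
    t_xs = n * (s + r + dtt) + b * (s + r + btt)    # secondary order: one access per record
    return t_xp, t_xs

# Assumed values in milliseconds; they are not taken from the slides.
t_xp, t_xs = exhaustive_read_times(n=1_000_000, m=100, s=16.0, r=8.3, btt=0.8, dtt=3.2)
print(f"primary-key order:   {t_xp / 1000:.0f} s")
print(f"secondary-key order: {t_xs / 1000:.0f} s")
print(f"ratio: {t_xs / t_xp:.1f}")
```

The n-fold first term dominates: in secondary-key order nearly every record costs its own seek and rotation, whereas in primary-key order that cost is shared by all records in a cluster.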
24
B+ Trees and time considerations-3
Accessing the next record is fast in the primary-key case:
$T_N = \frac{1}{\ln 2 \cdot m} (s + r + dtt)$
where the first factor, $1/(\ln 2 \cdot m)$, is the probability that the next record is not in the current block (cluster).
25
Insertion-1
Search top-down to find the leaf node in which the new record should be inserted.
If there is room in the leaf node, insert the record and terminate.
If there is no room in the leaf:
– Allocate a new leaf node and split the records in the middle.
– Place the first half (rounded up) in the first leaf and the rest in the second leaf.
26
Insertion-2
Place the smallest key value of the second (new) leaf into the immediate internal parent node.
– If the internal parent node is already full, split it into two internal nodes, each with half of the keys.
– Carry the middle key value to the next level up (the parent).
– If no parent exists as this bottom-up process continues, create a new node (the new root in this case)!
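The leaf-level part of this procedure can be sketched as below. It is a simplified illustration that keeps only keys in a sorted Python list; the function name `insert_into_leaf` and the `capacity` parameter are assumptions, and updating the parent is left out.

```python
from bisect import insort

def insert_into_leaf(leaf, key, capacity):
    """Insert key into a sorted leaf; on overflow, split and return (new_leaf, separator)."""
    insort(leaf, key)
    if len(leaf) <= capacity:
        return None, None                # there was room: insert and terminate
    half = (len(leaf) + 1) // 2          # first half (rounded up) stays in the old leaf
    new_leaf = leaf[half:]
    del leaf[half:]
    return new_leaf, new_leaf[0]         # smallest key of the new leaf goes to the parent

leaf = [2, 3, 5, 7]
new_leaf, separator = insert_into_leaf(leaf, 8, capacity=4)
print(leaf, new_leaf, separator)
```

Note that the worked example later in the deck splits the same leaf as 2* 3* | 5* 7* 8*, i.e., it keeps the smaller half in the old leaf; both choices respect the minimum occupancy.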
27
Primary key insertion: time considerations-1
If the new record fits into the data block, the insertion time is the sum of the fetch and update times:
$T_{Ip} = T_F + 2r$
If the record does not fit in the data block, a data block split is required. An internal node split may be required as well.
The insertion time has to be formulated to account for how often these splits are expected to happen:
28
Primary key insertion: time considerations-2
The expected total time for insertion becomes:
$T_{Ip} = T_F + 2r + \frac{1}{m/2}\left[(s + r + dtt) + (s + r + btt) + \frac{1}{v}(s + r + btt)\right]$
– 1/(m/2) is the probability that the data block is full, since a block has to be at least half full anyway and m is the blocking factor; 1/v (= 1/(2v/2)) is the probability that the parent of the leaf is full.
– The first term (s+r+dtt) is writing the second data block, the second one (s+r+btt) is reading the internal parent node, and the third one is writing the second parent node.
– Assumptions: the leaves (data blocks) and the nodes just above the leaves are assumed to be in external storage; thus, only overflow of the two bottom-most levels is considered…
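A small calculator makes the weight of the split term concrete. It uses $T_F = 2s + 2r + btt + dtt$ from the earlier timing slide; the numeric values are assumed for illustration (milliseconds) and the function names are hypothetical.

```python
def fetch_time(s, r, btt, dtt):
    """T_F = 2s + 2r + btt + dtt (index access plus data access)."""
    return 2 * s + 2 * r + btt + dtt

def expected_insert_time(s, r, btt, dtt, m, v):
    """T_Ip = T_F + 2r + (2/m) * [(s+r+dtt) + (s+r+btt) + (1/v)(s+r+btt)]."""
    split_cost = (s + r + dtt) + (s + r + btt) + (1 / v) * (s + r + btt)
    return fetch_time(s, r, btt, dtt) + 2 * r + (2 / m) * split_cost

# Assumed values in milliseconds; m and v are assumed blocking factors.
params = dict(s=16.0, r=8.3, btt=0.8, dtt=3.2)
no_split = fetch_time(**params) + 2 * params["r"]
with_split = expected_insert_time(**params, m=100, v=100)
print(f"without splits: {no_split:.2f} ms, expected: {with_split:.2f} ms")
```

With a blocking factor around 100, the split term adds only a small fraction to the no-split cost, which is why large blocks make insertions cheap on average.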
29
Primary key insertion: time considerations-3
Meaning of each term:
– $T_F + 2r$: fetch and write the original data cluster.
– (s+r+dtt): write the new cluster created by the split.
– (s+r+btt): write the parent internal node block.
– (1/v)(s+r+btt): write the split parent internal node.
Notes:
– The minimum data block occupancy is 50%, i.e., m/2. So the insertion will be in positions from m/2+1 to m in the data block; this is the reason for the factor 2/m.
– m is assumed to be the maximum blocking factor for the leaves; 2v is assumed to be the maximum blocking factor for the internal nodes.
30
Secondary key insertion: time considerations-1
Whenever a data record is inserted into the primary B+ Tree, the secondary-key B+ Tree needs to be updated as well.
Note that in this case the maximum blocking factors for leaves and internal nodes can practically be the same, say m.
To lessen the update problem, the secondary keys may be associated with (pinned to) the primary keys rather than the record addresses. This means that each secondary key in a leaf is stored together with its associated primary key.
31
Secondary key insertion: time considerations-2
Assume that all the internal nodes, including the parents of the leaves, are in memory. The total time to insert a secondary key is:
$T_{Is} = T_F + 2r + \frac{2}{m}(s + r + btt)$
– The first term is the time to read, the second term is the time to write back after modification, and the third term is the time required for a split, which happens with probability 2/m.
32
Deletion of records-1
As long as the minimum occupancy criterion is still met after the deletion, deletion is no problem: just remove the related entry from the node, in both the primary and secondary key cases.
33
Deletion of records-2
Record deletion algorithm:
– Find the block containing the record, say X.
– Delete the record and terminate if the block occupancy limits are still met.
– Otherwise, if a sibling block has more than the minimum number of entries (prefer the sibling that exceeds the minimum the most), redistribute the entries between the two blocks and change the parent accordingly to record the correct key.
– If neither sibling has more than the minimum, coalesce (combine) X with a sibling and modify the parent to reflect the change.
34
Deletion of records-3
Deletion algorithm (cont.):
If an internal node has fewer than the minimum number of entries after the modifications:
– If the combined total of the node and a sibling is more than the maximum, redistribute their contents and modify the parent…
– If the combined total is less than the maximum, coalesce the node with the sibling and modify the parent.
– After the modification, if the parent is too sparse, repeat this step for the parent.
35
Deletion of records: timing considerations-1
In the most usual case, deletion does not cause a change in the tree structure:
$T_{Dp} = T_F + 2r$
If the block falls below the limit and the deletion requires redistribution of values with the siblings, the two adjacent siblings need to be read, and one of them written, as well as the immediate parent node:
$T_{Dp} = T_F + 4(s + r + dtt) + s + r + btt$
If only one sibling is involved:
$T_{Dp} = T_F + 2(s + r + dtt) + s + r + btt$
36
Deletion of records: timing considerations-2
With probability 1/(m/2) = 2/m we may have to read two siblings and write a parent and a sibling. Three reads and a write. This is approximated by:
$T_{Dp} = T_F + 2r + 2 \cdot \frac{2}{m}(s + r + dtt)$
For large m, $T_{Dp} = T_F + 2r$.
For secondary key deletion, if the probability term is ignored for a large blocking factor, we have $T_{Ds} = T_F + 2r$.
37
Construction of a B+ Tree for an existing file
– Sort the file on the disk, with clusters ln 2 full.
– Read in the sorted file cluster by cluster, and enter the addresses in the parent node until it is ln 2 full.
– When the index node is ln 2 full, create a new entry in its parent node, if the parent is not full.
– Note that a new root may be created if the old one is ln 2 full.
– The process continues until all the sorted leaves are consumed.
– There may be sparse nodes on the right-most side of the tree, which need to be fixed.
Note that a B+ Tree can also be constructed by successive insertions, but this would be very inefficient… Why?
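The bottom-up construction can be sketched as follows. This is a simplified illustration, not the exact procedure from the slides: each parent entry is represented only by the lowest key of its child (standing in for a key and address pair), the same capacity is assumed at every level, and the name `bulk_load` is hypothetical.

```python
import math

def bulk_load(sorted_keys, capacity, fill=math.log(2)):
    """Build B+ tree levels bottom-up from sorted keys; returns levels, leaves first."""
    per_node = max(2, int(capacity * fill))      # entries per node at ln 2 occupancy
    leaves = [sorted_keys[i:i + per_node] for i in range(0, len(sorted_keys), per_node)]
    levels = [leaves]
    lows = [leaf[0] for leaf in leaves]          # one entry per child: its lowest key
    while len(lows) > 1:
        nodes = [lows[i:i + per_node] for i in range(0, len(lows), per_node)]
        levels.append(nodes)
        lows = [node[0] for node in nodes]
    return levels

levels = bulk_load(list(range(1, 11)), capacity=4)
for depth, level in enumerate(reversed(levels)):
    print(depth, level)
```

Each key is touched once per level, so the file is read sequentially instead of paying a root-to-leaf search per record. The output also shows the sparse right-most nodes that the slide says need to be fixed afterwards.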
38
B+ Trees in Practice
39
B+ Trees example
Typical order: 100 (= v). Typical fill factor: 70%.
– Average fanout = 140 + 1 (i.e., the number of pointers in an internal node).
The top levels can often be held in the buffer pool, i.e., in memory.
Suppose there are 1,000,000,000 data records. Compute the depth of the tree:
– Number of bottom-most internal nodes: x = 1,000,000,000 / 140
– $d = \log_{140}(x)$
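The depth computation asked for above can be checked numerically. The sketch follows the slide's own simplification of using 140 as the fanout; the function name `internal_levels` is hypothetical.

```python
import math

def internal_levels(records, fanout):
    """Return (x, d): bottom-most internal nodes x and d = log_fanout(x)."""
    x = records / fanout              # number of bottom-most internal nodes
    return x, math.log(x, fanout)     # levels needed above them

x, d = internal_levels(1_000_000_000, 140)
print(f"x = {x:,.0f} nodes, d = {d:.2f}, rounded up: {math.ceil(d)}")
```

With these numbers d comes out a little above 3, so it rounds up to 4 levels, consistent with the claim that high fanout keeps B+ trees very shallow even for a billion records.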
40
How to Insert Data into a B+ Tree?
41
Inserting two records: 16*, 8*
Root: 13 17 24 30
– 16* fits in its leaf, which becomes 14* 15* 16*.
– 8* belongs in the leaf 2* 3* 5* 7*, which is full, so the leaf overflows.
– One new child (leaf node) is generated; one more pointer must be added to its parent, and thus one more key value as well.
42
Inserting 8* (cont.): split the leaf
– Copy up the middle value (leaf split): the leaf splits into 2* 3* and 5* 7* 8*, and 5 is the entry to be inserted in the parent node. (Note that 5 is copied up and continues to appear in the leaf.)
– The parent 13 17 24 30 is already full, so inserting 5 (giving 5 13 17 24 30) makes it overflow!
43
Inserting 8* (cont.): split the internal index node
– We split the overflowing node 5 13 17 24 30, redistribute the entries evenly, and push up the middle key: 5 13 | 17 | 24 30.
– 17 is the entry to be inserted in the parent node. (Note that 17 is pushed up and only appears once in the index. Contrast this with a leaf split.)
– There is a difference between copy-up and push-up!
– Observe how minimum occupancy is guaranteed in both leaf and index page splits.
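The push-up split of an index node can be sketched as below, using the keys from the worked example. The child labels `c0` to `c5` are hypothetical placeholders for the six child pointers, and the function name is an assumption.

```python
def split_internal(keys, children):
    """Split an overflowing internal node; the middle key is pushed up, not copied."""
    mid = len(keys) // 2
    left = (keys[:mid], children[:mid + 1])
    right = (keys[mid + 1:], children[mid + 1:])
    return left, keys[mid], right

# Keys from the worked example; child labels are hypothetical placeholders.
left, pushed_up, right = split_internal(
    [5, 13, 17, 24, 30], ["c0", "c1", "c2", "c3", "c4", "c5"]
)
print(left, pushed_up, right)
```

The middle key appears in neither half after the split, whereas in a leaf split the separator key stays in the new leaf because the data entry itself must remain reachable.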
44
B+ Tree after inserting 8*
Notice that the root was split, leading to an increase in height.
Root: 17
Internal nodes: 5 13 | 24 30
Leaves: 2* 3* | 5* 7* 8* | 14* 15* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*
45
Inserting a data entry into a B+ Tree: summary
Find the correct leaf L. Put the data entry onto L.
– If L has enough space, done!
– Else, must split L (into L and a new node L2):
  – Redistribute entries evenly, put the middle key in L2, and copy up the middle key.
  – Insert an index entry pointing to L2 into the parent of L.
This can happen recursively.
– To split an index node, redistribute entries evenly, but push up the middle key. (Contrast with leaf splits.)
Splits "grow" the tree; a root split increases the height.
– Tree growth: the tree gets wider, or one level taller at the top.
46
Deleting a Data Entry from a B+ Tree
47
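Delete 19* and 20*
Tree before the deletions:
Root: 17
Internal nodes: 5 13 | 24 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*
– Deleting 19* and 20* leaves only 22* in its leaf: the leaf underflows.
– Redistributing with the right sibling gives the leaves 22* 24* and 27* 29*.
Have we forgotten something?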
48
Deleting 19* and 20* (cont.)
Root: 17
Internal nodes: 5 13 | 27 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*
– Notice how 27 is copied up. But can we move it up?
– Now we want to delete 24*. Underflow again! But can we redistribute this time?
49
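Deleting 24*
– The leaf underflows and is merged with its sibling. Observe that the two leaf nodes are merged into 22* 27* 29*, and 27 is discarded from their parent, but…
– The parent is left with the single key 30 (leaves 22* 27* 29* | 33* 34* 38* 39*), so it underflows too. Observe the "pull down" of the index entry (below).
Tree after the merge:
New root: 5 13 17 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*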
50
Deleting a data entry from a B+ Tree: summary
Start at the root and find the leaf L where the entry belongs. Remove the entry.
– If L is at least half full, done!
– If L has only d-1 entries (d being the minimum number of entries per node):
  – Try to redistribute, borrowing from a sibling (an adjacent node with the same parent as L).
  – If redistribution fails, merge L and the sibling.
If a merge occurred, the entry (pointing to L or the sibling) must be deleted from the parent of L.
A merge could propagate to the root, decreasing the height.
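The leaf-level decision between redistribution and merging can be sketched as follows, replaying the deletions of 19*, 20* and 24* from the worked example. This is a simplified illustration: it only considers the right sibling, leaves parent updates to the caller, and the function name `delete_from_leaf` is an assumption.

```python
def delete_from_leaf(leaf, right_sibling, key, min_entries):
    """Delete key from leaf and repair an underflow using the right sibling.

    Returns (action, separator): action is "done", "redistribute" or "merge";
    separator is the key to copy up into the parent after a redistribution.
    """
    leaf.remove(key)
    if len(leaf) >= min_entries:
        return "done", None
    if len(right_sibling) > min_entries:
        merged = leaf + right_sibling
        half = len(merged) // 2
        leaf[:] = merged[:half]                  # spread the entries evenly
        right_sibling[:] = merged[half:]
        return "redistribute", right_sibling[0]
    leaf.extend(right_sibling)                   # merge; the parent entry must be deleted
    right_sibling.clear()
    return "merge", None

leaf, sibling = [19, 20, 22], [24, 27, 29]
steps = [delete_from_leaf(leaf, sibling, k, min_entries=2) for k in (19, 20, 24)]
print(steps)
print(leaf, sibling)
```

The second deletion borrows 24* from the sibling and yields 27 as the new separator, and the third one forces a merge, matching the trees shown in the example slides.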
51
Example of non-leaf redistribution
Assume the following tree after the deletion of 24*. In this case, one can redistribute entries from the left child of the root to the right child of the root.
Root: 22
Internal nodes: 5 13 17 20 | 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*
52
After redistribution
Intuitively, entries are redistributed by "pushing through" the splitting entry in the parent node. The index entries with keys 20 and 17 are redistributed.
Root: 17
Internal nodes: 5 13 | 20 22 30
Leaves: 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*
53
Terminology
– Bucket factor: the number of records that can fit in a leaf node.
– Fan-out: the average number of children of an internal node.
A B+ tree index can be used either as a primary index or as a secondary index.
– Primary index: determines the way the records are actually stored (also called a sparse index).
– Secondary index: the records in the file are not grouped in buckets according to the keys of the secondary index (also called a dense index).
54
Summary
Tree-structured indexes are ideal for range searches, and also good for equality searches.
The B+ tree is a dynamic structure.
– Inserts and deletes leave the tree height-balanced; high fanout (F) means the depth is rarely more than 3 or 4.
– Almost always better than maintaining a sorted file.
– Typically, 67% occupancy on average.
The B+ tree is the most widely used index in database management systems because of its versatility. It is one of the most optimized components of a DBMS.