15.6 Index-based Algorithms Jindou Jiao 101
Index-based algorithms are especially useful for the selection operator Algorithms for join and other binary operators also use indexes
Clustering and Nonclusteing Index Clustering indexes are on an attribute or attributes such that all the tuples with a fixed value for the search key of this index appear on roughly as few blocks as can hold them. Example:
Index-Based Selection Implement a selection σ C (R) C is a=v If there are no indexes of T, the number of disk I/O is B(R), or even T(R). If a is an attribute for which an index exists, then one can search the index with value v, and get pointers to exactly those tuples of R.
Example: Suppose B(R) = 1000, and T(R) = 20,000 operation σ a=0 (R). –If R is clustered, but don’t use index, cost is 1000 disk I/O. –If R is not clustered and don’t use index, the cost is 20,000 –If V(R,a) = 100 and the index is clustering, the the index-based algorithm uses 1000/100=10 –If V(R,a) = 10, and index is nonclustering, then 20,000/10 = 2000 –If V(R,a) = 20,000(a is a key), takes 1 disk I/O regardless of whether the index is clustering or not.
Joining by Using an Index Suppose S has an index on the attribute Y. –Examine each block of R, and within each block consider each tuple t. Let tY be the component of t corresponding to the attribute Y. Use the index to find all those tuples of S that have tY in their Y-component. –There are exactly the tuples of S that join with tuple t of R, so we output the join of each of these tuples with t.
Joins Using a Sorted Index Used when the index is a B-tree, or structure from which we easily can extract the tuples of a relation in sorted order.