Indexes

[Figure: a query with "… WHERE key = 22" finds key 22 in the index and follows its row pointer to the row in the table. Labels: Table, Index, Key, Row pointer.]

Indexes are optional structures associated with tables. They can be created to improve the performance of data update and retrieval. An Oracle index provides a direct access path to a row of data.

Indexes can be created on one or more columns of a table. After an index is created, it is automatically maintained and used by the Oracle server. Changes to a table’s data, such as adding, updating, or deleting rows, are automatically propagated to all relevant indexes with complete transparency to users.
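As a minimal sketch of the point above, the statements below create an index and then run a query that can use it. The table customers and its column cust_id are hypothetical names chosen for illustration; they do not come from the course material.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE INDEX cust_id_ix ON customers (cust_id);

-- No index name appears in the query; the optimizer decides
-- whether to use cust_id_ix to reach the row directly.
SELECT *
FROM   customers
WHERE  cust_id = 22;

-- The index is maintained automatically by ordinary DML.
INSERT INTO customers (cust_id) VALUES (23);
```

The application never references the index by name, which is what "transparent to users" means in practice.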

Types of Indexes

Several types of index structures are available, depending on the need. The following are the most common forms:

- B-tree: the default index type. Its key values are stored in a balanced tree (B-tree), which allows fast searches.
- Bitmap: has a bitmap for each distinct key value being indexed. Within each bitmap, there is a bit set aside for each row in the table, and the bit indicates whether that row contains the indexed value.

A bitmap index allows fast lookups when there are few distinct values, that is, when the indexed column has low cardinality. An example is a gender indicator that can have only the values “M” and “F”, so there are only two bitmaps to search. By contrast, if a bitmap index were used for a phone_number column, there would be so many bitmaps to manage and search that it would be very inefficient. Use bitmap indexes for low-cardinality columns.

B-Tree Index

[Figure: a B-tree index with root, branch, and leaf levels. Each leaf index entry consists of an index entry header, key column length, key column value, and ROWID.]

Structure of a B-tree index

At the top of the index is the root, which contains entries that point to the next level in the index. At the next level are branch blocks, which in turn point to blocks at the next level in the index. At the lowest level are the leaf nodes, which contain the index entries that point to rows in the table. The leaf blocks are doubly linked so that the index can be scanned in both ascending and descending order of key values.

Format of index leaf entries

An index entry is made up of the following components:

- An entry header, which stores the number of columns and locking information
- Key column length-value pairs, each of which gives the size of a column in the key followed by the value for that column (there are at most as many pairs as there are columns in the index)
- The ROWID of a row that contains the key values

Bitmap Indexes

[Figure: a table stored in blocks 10, 11, and 12 of file 3, and its bitmap index. Each index entry is shown as <Key, Start ROWID, End ROWID, Bitmap>:]

    <Blue,   10.0.3, 12.8.3, 1000100100010010100>
    <Green,  10.0.3, 12.8.3, 0001010000100100000>
    <Red,    10.0.3, 12.8.3, 0100000011000001001>
    <Yellow, 10.0.3, 12.8.3, 0010001000001000010>

Bitmap indexes are more advantageous than B-tree indexes in certain situations:

- When a table has millions of rows and the key columns have low cardinality, that is, there are very few distinct values for the column. For example, bitmap indexes may be preferable to B-tree indexes for the gender and marital status columns of a table containing passport records.
- When queries often use a combination of multiple WHERE conditions involving the OR operator
- When there is read-only or low update activity on the key columns

Structure of a bitmap index

A bitmap index is also organized as a B-tree, but the leaf node stores a bitmap for each key value instead of a list of ROWIDs. Each bit in the bitmap corresponds to a possible ROWID. If the bit is set, the row with the corresponding ROWID contains the key value.

As shown in the diagram, the leaf node of a bitmap index contains the following:

- An entry header that contains the number of columns and lock information
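The passport-records scenario described on this slide might look like the following sketch. The table name passport_records, its columns, and the literal values are assumptions made for illustration; only the CREATE BITMAP INDEX syntax itself is standard Oracle SQL.

```sql
-- Hypothetical table: passport_records(gender, marital_status, ...).
CREATE BITMAP INDEX passport_gender_bix
  ON passport_records (gender);

CREATE BITMAP INDEX passport_marital_bix
  ON passport_records (marital_status);

-- A query that combines low-cardinality conditions with OR.
-- The optimizer may choose to merge the two bitmaps rather
-- than scan the table; whether it does depends on statistics.
SELECT COUNT(*)
FROM   passport_records
WHERE  gender = 'F'
   OR  marital_status = 'SINGLE';
```

Because each condition maps to a bitmap, an OR across conditions can be answered by combining bitmaps bit by bit, which is why this workload suits bitmap indexes.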

Index Options

- A unique index ensures that every indexed value is unique.
- An index can have its key values stored in ascending or descending order.
- A reverse key index has its key value bytes stored in reverse order.
- A composite index is one that is based on more than one column.
- A function-based index is an index based on a function’s return value.
- A compressed index has repeated key values removed.

For efficiency of retrieval, it may be advantageous to have an index store the keys in descending order. This decision is made on the basis of how the data is accessed most frequently.

A reverse key index has the bytes of the indexed value stored in reverse order. This can reduce activity in a particular hot spot in the index. If many users are processing data in the same order, then the prefix portions of the key values currently being processed are close in value at any given instant. Consequently, there is a lot of activity in that area of the index structure. A reverse key index spreads that activity out across the index structure by indexing a reversed-byte version of the key values.

An index created on a combination of more than one column is called a composite index. For example, you can create an index based on a person’s last name and first name:

    CREATE INDEX name_ix ON employees (last_name, first_name);
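The remaining options in the list can be sketched in the same way. The index names, the orders table, and the columns email, hire_date, order_id, department_id, and job_id are assumptions for illustration; only employees, last_name, and first_name appear in the example above.

```sql
-- All names other than employees/last_name are hypothetical.

-- Unique index: rejects duplicate key values.
CREATE UNIQUE INDEX emp_email_uix ON employees (email);

-- Descending index: keys stored from highest to lowest.
CREATE INDEX emp_hire_desc_ix ON employees (hire_date DESC);

-- Reverse key index: key bytes are reversed before storage,
-- so sequential order_id values land in different leaf blocks.
CREATE INDEX ord_id_rix ON orders (order_id) REVERSE;

-- Function-based index: indexes the result of UPPER(last_name).
CREATE INDEX emp_upper_name_ix ON employees (UPPER(last_name));

-- Compressed index: the repeated leading column (department_id)
-- is stored once per leaf block instead of once per entry.
CREATE INDEX emp_dept_job_ix
  ON employees (department_id, job_id) COMPRESS 1;
```

A reverse key index helps with equality lookups but generally cannot serve range scans on the original key order, so it is a trade-off rather than a default choice.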

Table Types

Type                             Description
-------------------------------  ---------------------------------------------------------
Ordinary (heap-organized) table  Data is stored as an unordered collection (heap).
Partitioned table                Data is divided into smaller, more manageable pieces.
Index-organized table (IOT)      Data (including non-key values) is sorted and stored in a
                                 B-tree index structure.
Clustered table                  Related data from more than one table is stored together.

[Figure: the four table types: Heap, Partitioned, IOT, Clustered.]

Ordinary “heap-organized” tables are introduced in the Oracle Database 10g: Administration Workshop I course.

Partitions are pieces of a table or an index, created to facilitate management of a very large database (VLDB), which could contain several terabytes of data.

Unlike a heap-organized table, whose data is stored as an unordered collection (heap), an index-organized table (IOT) stores its data in a B-tree index structure, sorted by primary key.

A cluster is a group of tables that share the same data blocks because they share common columns and are often used together.

What Is a Partition and Why Use It?

A partition is:

- A piece of a “very large” table or index
- Stored in its own segment
- Used for improved performance and manageability

A partition is a piece of a “very large” table or index, stored in its own segment, so that it can be managed individually. An example of a “very large” table is a data warehouse table holding several hundred gigabytes of data. Partitions can be further broken down into subpartitions for finer levels of manageability and improved performance.

Partitioning can also bring better performance because many queries can ignore partitions that, according to the WHERE clause, do not have the requested rows, thereby reducing the amount of data to be scanned to produce a result set.

Operations on partitioned tables and indexes can be performed in parallel by assigning different parallel execution servers to different partitions of the table or index.
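A minimal sketch of a range-partitioned table, and of a query whose WHERE clause lets the server skip partitions, is shown below. The table sales_hist, its columns, and the partition boundaries are assumptions chosen for illustration.

```sql
-- Hypothetical table; each partition is stored in its own segment.
CREATE TABLE sales_hist (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER(10,2)
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2004 VALUES LESS THAN (TO_DATE('2005-01-01', 'YYYY-MM-DD')),
  PARTITION p2005 VALUES LESS THAN (TO_DATE('2006-01-01', 'YYYY-MM-DD')),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- The predicate on sale_date matches only partition p2005,
-- so p2004 and pmax do not need to be scanned.
SELECT SUM(amount)
FROM   sales_hist
WHERE  sale_date >= TO_DATE('2005-01-01', 'YYYY-MM-DD')
AND    sale_date <  TO_DATE('2006-01-01', 'YYYY-MM-DD');
```

Pruning works here because the predicate is on the partitioning column; a filter on amount alone would still have to visit every partition.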

Index-Organized Tables

[Figure: regular table access (an index lookup followed by table access by ROWID) compared with IOT access, where each index entry holds the row header, key column, and non-key columns.]

Unlike an ordinary (heap-organized) table, whose data is stored as an unordered collection (heap), an index-organized table (IOT) stores its data in a B-tree index structure, sorted by primary key. Besides storing the primary key column values, each index entry in the IOT B-tree stores the non-key column values as well.

Index-organized tables have full table functionality. They support features such as constraints, triggers, LOB and object columns, partitioning, parallel operations, online reorganization, and replication. You can even create indexes on an index-organized table.

Index-organized tables are ideal for OLTP applications, which require fast primary key access and high availability. For example, queries and DML on an orders table used in online order processing are predominantly primary-key based, and a heavy volume of DML causes fragmentation that results in a frequent need to reorganize. Because an index-organized table can be reorganized online and without invalidating its secondary indexes, the window of unavailability is greatly reduced or eliminated.

An index-organized table is an alternative to:

- A table indexed on the primary key by using the CREATE INDEX statement
- A cluster table stored in an indexed cluster that has been created using the CREATE CLUSTER statement, which maps the primary key for the table to the cluster key
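An index-organized table is declared with the ORGANIZATION INDEX clause and must have a primary key. The sketch below assumes a hypothetical orders_iot table and column set; it is an illustration, not a schema from the course.

```sql
-- Hypothetical IOT: rows are stored inside the primary key
-- B-tree, so there is no separate heap segment.
CREATE TABLE orders_iot (
  order_id   NUMBER       NOT NULL,
  order_date DATE,
  status     VARCHAR2(20),
  CONSTRAINT orders_iot_pk PRIMARY KEY (order_id)
)
ORGANIZATION INDEX;

-- A secondary index on a non-key column of the IOT.
CREATE INDEX orders_iot_status_ix ON orders_iot (status);

-- Primary key access reads the row directly from the B-tree.
SELECT order_date, status
FROM   orders_iot
WHERE  order_id = 1001;
```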

Index-Organized Tables and Heap Tables

Compared to heap tables, IOTs:

- Have faster key-based access to table data
- Do not duplicate the storage of primary key values
- Require less storage
- Use secondary indexes and logical row IDs
- Have higher availability because table reorganization does not invalidate secondary indexes

IOTs have the following restrictions:

- Must have a primary key that is not DEFERRABLE
- Cannot be clustered
- Cannot use composite partitioning
- Cannot contain a column of type ROWID or LONG

Index-organized tables do not have regular (physical) row IDs, but use logical row IDs instead. Logical row IDs give the fastest possible access to rows in IOTs by using two methods:

- A physical guess, whose access time is equal to that of physical row IDs
- Access without the guess (or after an incorrect guess), which performs a primary key access of the IOT

The guess is based on knowledge of the file and block in which a row resides. This information is accurate when the index is created, but changes if the leaf block splits. If the guess is wrong and the row no longer resides in the specified block, then the remaining portion of the logical row ID entry, the primary key, is used to get the row.

The Oracle database constructs secondary indexes on index-organized tables by using logical row IDs that are based on the table’s primary key. Because rows in index-organized tables do not have permanent physical addresses, the physical guesses can become stale when rows are moved to new blocks. To obtain fresh guesses, you can rebuild the secondary index. Note that rebuilding a secondary index on an index-organized table involves reading the base table, unlike rebuilding an index on an ordinary table.
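Refreshing stale guesses might look like the following sketch. It assumes a secondary index named orders_iot_status_ix already exists on an index-organized table; that name is hypothetical.

```sql
-- Rebuild the secondary index so that its physical guesses
-- point at the blocks where the IOT rows currently reside.
ALTER INDEX orders_iot_status_ix REBUILD;

-- A lighter-weight alternative that refreshes only the stale
-- guesses, without rebuilding the whole index.
ALTER INDEX orders_iot_status_ix UPDATE BLOCK REFERENCES;
```

Either statement has to read the base table to find the current block of each row, which is the extra cost the note above refers to.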

Clusters

Unclustered orders and order_item tables

orders:

    ORD_NO  ORD_DT     CUST_CD
    ------  ---------  -------
    101     05-JAN-97  R01
    102     07-JAN-97  N45

order_item:

    ORD_NO  PROD   QTY  ...
    ------  -----  ---
    101     A4102  20
    102     A2091  11
    102     G7830  20
    102     N9587  26
    101     A5675  19
    101     W0824  10

Clustered orders and order_item tables

    Cluster Key (ORD_NO)
    101   ORD_DT     CUST_CD
          05-JAN-97  R01
          PROD   QTY
          A4102  20
          A5675  19
          W0824  10
    102   ORD_DT     CUST_CD
          07-JAN-97  N45
          PROD   QTY
          A2091  11
          G7830  20
          N9587  26

Definition of Clusters

A cluster is a group of one or more tables that share the same data blocks because they share common columns and are often used together in join queries. Storing tables in clusters offers the DBA a method to denormalize data. If you implement clustered tables in your database, you do not need to change any application code that accesses the tables. Clusters are transparent to the end user and programmer.

Performance Benefits of Clusters

- Disk I/O is reduced and access time improved for joins of clustered tables.
- Each cluster key value is stored only once for all the rows with the same key value; therefore, it uses less storage space.

Performance Consideration

Full table scans are generally slower on clustered tables than on nonclustered tables.
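The clustered layout shown on this slide could be created along the following lines. The table and column names come from the slide; the cluster name ord_clu, the data types, and the SIZE value are assumptions for illustration.

```sql
-- Index cluster keyed on ORD_NO. SIZE is an assumed estimate of
-- the space needed for all rows sharing one cluster key value.
CREATE CLUSTER ord_clu (ord_no NUMBER(6)) SIZE 512;

-- An index cluster needs a cluster index before rows are inserted.
CREATE INDEX ord_clu_ix ON CLUSTER ord_clu;

-- Both tables store their rows in the cluster, grouped by ord_no.
CREATE TABLE orders (
  ord_no  NUMBER(6) PRIMARY KEY,
  ord_dt  DATE,
  cust_cd VARCHAR2(3)
) CLUSTER ord_clu (ord_no);

CREATE TABLE order_item (
  ord_no NUMBER(6),
  prod   VARCHAR2(5),
  qty    NUMBER(3)
) CLUSTER ord_clu (ord_no);

-- The join is written exactly as it would be for unclustered tables.
SELECT o.ord_no, o.ord_dt, i.prod, i.qty
FROM   orders o
JOIN   order_item i ON i.ord_no = o.ord_no
WHERE  o.ord_no = 101;
```

The query text does not mention the cluster at all, which illustrates why no application code has to change.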

Cluster Types

- Index cluster
- Hash cluster

[Figure: an index cluster and a hash cluster; the hash cluster is shown with a hash function.]
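For contrast with an index cluster, a hash cluster is declared with the HASHKEYS clause and needs no cluster index, because a hash function applied to the cluster key determines where a row is stored. The names, data types, and the SIZE and HASHKEYS values below are assumptions for illustration.

```sql
-- Hash cluster: HASHKEYS is an assumed estimate of the number
-- of distinct key values; Oracle sizes the hash table from it.
CREATE CLUSTER ord_hash_clu (ord_no NUMBER(6))
  SIZE 512
  HASHKEYS 1000;

-- Hypothetical table stored in the hash cluster.
CREATE TABLE orders_h (
  ord_no  NUMBER(6) PRIMARY KEY,
  ord_dt  DATE,
  cust_cd VARCHAR2(3)
) CLUSTER ord_hash_clu (ord_no);

-- An equality predicate on the cluster key lets the server hash
-- the value and go to the block where the row should reside.
SELECT ord_dt, cust_cd
FROM   orders_h
WHERE  ord_no = 101;
```

Because HASHKEYS is fixed when the cluster is created, this design fits best when the number of key values is predictable.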

Situations Where Clusters Are Useful

Criteria for choosing between an index cluster and a hash cluster:

- Uniform key distribution
- Evenly spread key values
- Rarely updated key
- Often joined master-detail tables
- Predictable number of key values
- Queries using equality predicate on key