15.6 Index Based Algorithms

Huili Tang

Contents
Clustering and non-clustering indexes
Index-based selection
Joining using an index
Joining using a sorted index

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space.

Clustered vs. Unclustered Index
[Figure: clustered vs. unclustered index. Index entries (index file) allow a direct search for data entries, which point to the data records (data file).]

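USE db1;
CREATE CLUSTERED INDEX IX__shipments_QTY ON dbo.shipments(QTY);

-- Alternative choice of key: a table has only one clustered index,
-- so the index above would have to be dropped before creating this one.
USE db1;
CREATE CLUSTERED INDEX IX__shipments_SNUM ON dbo.shipments(SNUM);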

Clustered Index Architecture
Adding a clustered index to a table physically reorders the data pages, putting them in physical order based on the indexed column. For this reason a table can have only one clustered index. In a clustered index, all tuples with the same value of the key are clustered on as few blocks as possible.

Non-Clustered Index Architecture
A non-clustered index does not correspond to the physical order of the data, and the data rows are not automatically sorted. A non-clustered index holds the indexed columns and a pointer or bookmark to the actual row. A table can have multiple non-clustered indexes.

Index-based algorithms are especially useful for the selection operator.
In a clustered relation, the tuples are packed into roughly as few blocks as can possibly hold them.
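The effect of clustering on block accesses can be illustrated with a small sketch. The layout model (tuple i stored in block i // tuples_per_block), the helper name blocks_touched, and the sample numbers are assumptions made for illustration only.

```python
def blocks_touched(keys, value, tuples_per_block):
    """Count the distinct blocks that hold tuples whose key equals `value`.

    `keys` lists the key of each tuple in physical storage order; tuple i
    is assumed to live in block i // tuples_per_block.
    """
    return len({i // tuples_per_block
                for i, k in enumerate(keys) if k == value})


# 16 tuples, 4 distinct key values, 4 tuples per block (made-up numbers).
unclustered = [i % 4 for i in range(16)]  # equal keys scattered over the file
clustered = sorted(unclustered)           # equal keys stored next to each other

print(blocks_touched(unclustered, 0, 4))  # every block holds one matching tuple
print(blocks_touched(clustered, 0, 4))    # all matching tuples share one block
```

With the scattered layout a lookup for one key value touches every block, while the clustered layout touches a single block.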

Index-based Selection
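For a selection σ_C(R), suppose C is of the form a = v, where a is an attribute with an index. Here B(R) is the number of blocks holding R, T(R) is the number of tuples of R, and V(R,a) is the number of distinct values of a. For a clustering index on R.a, the number of disk I/O's will be about B(R)/V(R,a). For a nonclustering index, each matching tuple may lie on a different block, so the estimate is T(R)/V(R,a).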

Index-based Selection
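The actual number may be higher:
1. The index is not kept entirely in main memory, so reading it costs additional I/O's.
2. The matching tuples may spread over more blocks than the estimate assumes.
3. The tuples of R may not be packed as tightly as possible into blocks.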

Example: B(R) = 1000, T(R) = 20,000. Number of I/O's required:
Table scan algorithm:
1. R clustered, no index: 1000
2. R not clustered, no index: 20,000
Index-based algorithm:
3. V(R,a) = 100, index is clustering: 1000/100 = 10
4. V(R,a) = 10, index is nonclustering: 20,000/10 = 2,000
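The four cases of this example can be checked with a short sketch. The function names are hypothetical, and the sketch assumes the usual estimates B(R)/V(R,a) for a clustering index and T(R)/V(R,a) for a nonclustering index, ignoring the cost of reading the index itself.

```python
import math


def table_scan_cost(blocks, tuples, clustered):
    """I/O's for a full scan: B(R) if R is clustered, otherwise T(R)."""
    return blocks if clustered else tuples


def index_selection_cost(blocks, tuples, distinct_values, clustering_index):
    """Estimated I/O's for a = v using an index on a.

    Clustering index: B(R)/V(R,a); nonclustering index: T(R)/V(R,a).
    Index lookup I/O's are ignored (a simplifying assumption).
    """
    size = blocks if clustering_index else tuples
    return math.ceil(size / distinct_values)


B_R, T_R = 1000, 20_000

print(table_scan_cost(B_R, T_R, clustered=True))
print(table_scan_cost(B_R, T_R, clustered=False))
print(index_selection_cost(B_R, T_R, 100, clustering_index=True))
print(index_selection_cost(B_R, T_R, 10, clustering_index=False))
```

The ceiling is used because a fraction of a block still costs one full I/O.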

Joining by using an index
Natural join of R(X, Y) and S(Y, Z), assuming S has an index on Y.
Number of I/O's to read R:
Clustered: B(R)
Not clustered: T(R)
Number of I/O's to fetch the matching tuples of S for every tuple t of R:
Clustered: T(R)B(S)/V(S,Y)
Not clustered: T(R)T(S)/V(S,Y)

Example: R(X,Y) occupies 1000 blocks and S(Y,Z) occupies 500 blocks.
Assume 10 tuples in each block, so T(R) = 10,000 and T(S) = 5000, and let V(S,Y) = 100. If R is clustered and there is a clustering index on Y for S:
the number of I/O's for R is 1000
the number of I/O's for S is 10,000 * 500 / 100 = 50,000
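The arithmetic of this example can be reproduced with a minimal sketch. The function index_join_cost and its parameter names are hypothetical; the sketch assumes the index on Y belongs to S and ignores the I/O's needed to read the index itself.

```python
def index_join_cost(b_r, t_r, b_s, t_s, v_s_y, r_clustered, s_index_clustering):
    """Return (I/O's to read R, I/O's to fetch matching S tuples).

    Every tuple of R probes the index on S.Y. With a clustering index a
    probe costs about B(S)/V(S,Y) I/O's, otherwise about T(S)/V(S,Y).
    """
    read_r = b_r if r_clustered else t_r
    s_size = b_s if s_index_clustering else t_s
    probe_s = t_r * s_size // v_s_y
    return read_r, probe_s


# Numbers from the example: 10 tuples per block, V(S,Y) = 100.
read_r, probe_s = index_join_cost(
    b_r=1000, t_r=10_000, b_s=500, t_s=5000, v_s_y=100,
    r_clustered=True, s_index_clustering=True,
)
print(read_r, probe_s, read_r + probe_s)
```

The probe cost dominates, which is why this join pays off mainly when R is small compared with S or V(S,Y) is large.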

Joining Using a Sorted index
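Natural join of R(X, Y) and S(Y, Z) with a sorted index on Y for either R or S.
Example: relations R(X,Y) and S(Y,Z) with an index on Y for both relations.
Search keys (Y-values) for R: 1, 3, 4, 4, 5, 6
Search keys (Y-values) for S: 2, 2, 4, 6, 7, 8
Only the Y-values 4 and 6 appear in both lists, so only tuples with those values need to be retrieved.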

Joining using a sorted index
Used when the index is a B-tree, or any other structure from which we can easily extract the tuples of a relation in sorted order.
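When both relations have a sorted index on Y, the join can walk the two key sequences in step, a technique often called a zig-zag join. The sketch below is one possible illustration, not a definitive implementation: it models each index as a Python list of (Y-value, tuple id) pairs, and the tuple ids r0..r5 and s0..s5 are made up.

```python
def zigzag_join(r_entries, s_entries):
    """Join two lists of (y, tuple_id) pairs that are sorted by y.

    Returns (r_id, y, s_id) triples. A data tuple would be fetched only
    for the ids that appear in the result.
    """
    result = []
    i = j = 0
    while i < len(r_entries) and j < len(s_entries):
        r_y, s_y = r_entries[i][0], s_entries[j][0]
        if r_y < s_y:
            i += 1          # R is behind: advance in R's index
        elif r_y > s_y:
            j += 1          # S is behind: advance in S's index
        else:
            # Collect the run of equal keys on both sides and pair them up.
            i_end = i
            while i_end < len(r_entries) and r_entries[i_end][0] == r_y:
                i_end += 1
            j_end = j
            while j_end < len(s_entries) and s_entries[j_end][0] == r_y:
                j_end += 1
            for _, r_id in r_entries[i:i_end]:
                for _, s_id in s_entries[j:j_end]:
                    result.append((r_id, r_y, s_id))
            i, j = i_end, j_end
    return result


r_index = [(1, "r0"), (3, "r1"), (4, "r2"), (4, "r3"), (5, "r4"), (6, "r5")]
s_index = [(2, "s0"), (2, "s1"), (4, "s2"), (6, "s3"), (7, "s4"), (8, "s5")]
print(zigzag_join(r_index, s_index))
```

For the search keys of the example, only the entries with Y = 4 and Y = 6 produce output, so the tuples with the other Y-values never have to be read from the data files.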
