1 Notes on: Clusters Index and Cluster Creation in SQL Elisa Bertino CS Department and CERIAS Purdue University.

2 Employees

Emp#  Name     Job         HiringD   Salary   Bonus   Dept#
7369  Red      Engineer    17/12/80  1600.00  500.00  20
7499  Andrews  Technician  20/02/81   800.00  ?       30
7521  White    Technician  20/02/81   800.00  100.00  30
7566  Pink     Manager     02/04/81  2975.00  ?       20
7654  Martin   Secretary   28/09/81   800.00  ?       30
7698  Black    Manager     01/05/81  2850.00  ?       30
7782  Neri     Engineer    01/06/81  2450.00  200.00  10
7788  Scott    Secretary   09/11/81   800.00  ?       20
7839  Dare     Engineer    17/11/81  2600.00  300.00  10
7844  Turni    Technician  08/09/81  1500.00  ?       30
7876  Adams    Engineer    23/09/81  1100.00  500.00  20
7900  Gianni   Engineer    03/12/81  1950.00  ?       30
7902  Ford     Secretary   03/12/81  1000.00  ?       20
7934  Mill     Engineer    23/01/82  1300.00  150.00  10
7977  Green    Manager     10/12/80  3000.00  ?       10

3 Departments

Dept#  DeptName               Office#  Division  Manager
10     Building Construction  1100     D1        7977
20     Research               2200     D1        7566
30     Road Maintenance       5100     D2        7698

4 Clustering

Consider the following query:

    SELECT Emp#, Name, Office#
    FROM Employees, Departments
    WHERE Employees.Dept# = Departments.Dept#;

– An efficient storage strategy is based on clustering (that is, grouping) the tuples of the two tables that have the same value for the join attribute.
– Clustering may make the execution of other queries inefficient, for example:

    SELECT * FROM Departments;

5 Clustering

10  Building Construction  1100  D1  7977
    7782  Neri     Engineer    01/06/81  2450.00  200.00  10
    7839  Dare     Engineer    17/11/81  2600.00  300.00  10
    7934  Mill     Engineer    23/01/82  1300.00  150.00  10
    7977  Green    Manager     10/12/80  3000.00  ?       10

20  Research  2200  D1  7566
    7369  Red      Engineer    17/12/80  1600.00  500.00  20
    7566  Pink     Manager     02/04/81  2975.00  ?       20
    7788  Scott    Secretary   09/11/81   800.00  ?       20
    7876  Adams    Engineer    23/09/81  1100.00  500.00  20
    7902  Ford     Secretary   03/12/81  1000.00  ?       20

30  Road Maintenance  5100  D2  7698
    7499  Andrews  Technician  20/02/81   800.00  ?       30
    7521  White    Technician  20/02/81   800.00  100.00  30
    7698  Black    Manager     01/05/81  2850.00  ?       30
    7900  Gianni   Engineer    03/12/81  1950.00  ?       30
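
The layout above can be imitated in a few lines of Python. This is only a sketch of the idea, not of how a DBMS lays out pages: the dictionary keyed by Dept# stands in for contiguous storage, the function names are made up for illustration, and only a few rows and columns from the Employees and Departments slides are used.

```python
from collections import defaultdict

# A few rows from the slides: (Dept#, DeptName, Office#) and (Emp#, Name, Dept#).
departments = [
    (10, "Building Construction", 1100),
    (20, "Research", 2200),
    (30, "Road Maintenance", 5100),
]
employees = [
    (7782, "Neri", 10),
    (7977, "Green", 10),
    (7369, "Red", 20),
    (7566, "Pink", 20),
    (7499, "Andrews", 30),
]

def build_cluster(departments, employees):
    """Group the tuples of both relations by the join attribute Dept#."""
    cluster = defaultdict(lambda: {"department": None, "employees": []})
    for dept in departments:
        cluster[dept[0]]["department"] = dept
    for emp in employees:
        cluster[emp[2]]["employees"].append(emp)
    return dict(cluster)

def join_from_cluster(cluster):
    """Return (Emp#, Name, Office#) by visiting each group exactly once."""
    result = []
    for dept_no in sorted(cluster):
        group = cluster[dept_no]
        if group["department"] is None:
            continue  # employees whose department is missing do not join
        office_no = group["department"][2]
        for emp_no, name, _ in group["employees"]:
            result.append((emp_no, name, office_no))
    return result

cluster = build_cluster(departments, employees)
joined = join_from_cluster(cluster)
print(joined)
```

In this representation the join never has to search for matching tuples, because they already sit in the same group. Listing only the departments, however, means walking past every employee tuple stored between them, which is the inefficiency mentioned on the previous slide.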

6 Definition of clusters and indexes in SQL

The most relevant commands are:
– the command for creating indexes on one or more columns of a relation
– the command for creating clusters

A cluster stores in contiguous storage locations the tuples, of one or more relations, that have the same value for one or more columns, called cluster columns.

7 Definition of clusters and indexes in SQL

The command for creating an index in SQL has the following format:

    CREATE INDEX IndexName
    ON {RelationName (ColumnNameList) | CLUSTER ClusterName}
    [ASC | DESC];

where
– IndexName is the name of the index being created
– the ON clause specifies the object on which the index is allocated

An index created on more than one column is called a composite index.

8 Definition of clusters and indexes in SQL

The object on which an index is allocated can be:
– a relation: one must specify the names of the columns on which the index is allocated
– a cluster: the index is automatically allocated on all columns of the cluster

An index can be allocated on several columns.

The ASC and DESC options specify whether the values of the index key are ordered in increasing or decreasing order.
– ASC is the default

9 Definition of clusters and indexes in SQL – Example

Indexes are generally implemented as a B+-tree or a variation of it.

Suppose we want to create an index on the salary column of the Employees table:

    CREATE INDEX idxsalary ON Employees (salary);

10 ASC and DESC Options

If the index is created on a single column, the two options make no difference: the index can be traversed in both directions, so it can serve queries that sort the results in either ascending or descending order of the sort column.

If the index is composite, the ASC and DESC keywords might be required. For example, to use an index for a SELECT statement whose ORDER BY clause sorts on multiple columns, each in a different order, we need a composite index that corresponds to the ORDER BY columns.

Example:

    SELECT salary, dept#
    FROM Employees
    ORDER BY salary ASC, dept# DESC;

To use an index for this query, we create an index that matches its ordering requirements:

    CREATE INDEX sal_d_idx ON Employees (salary ASC, dept# DESC);
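
The composite index can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for the DBMS discussed in these notes, the columns are renamed emp_no and dept_no to avoid quoting the # character, and the four rows are a small sample from the Employees slide.

```python
import sqlite3

def ordered_salaries():
    """Create the composite index and run the ORDER BY query against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "emp_no INTEGER PRIMARY KEY, salary REAL, dept_no INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [
            (7499, 800.00, 30),
            (7788, 800.00, 20),
            (7934, 1300.00, 10),
            (7369, 1600.00, 20),
        ],
    )
    # Same column order and directions as the ORDER BY clause.
    conn.execute(
        "CREATE INDEX sal_d_idx ON employees (salary ASC, dept_no DESC)"
    )
    query = (
        "SELECT salary, dept_no FROM employees "
        "ORDER BY salary ASC, dept_no DESC"
    )
    rows = conn.execute(query).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
    conn.close()
    return rows, plan

rows, plan = ordered_salaries()
print(rows)
print(plan)
```

Printing the plan lets one check whether the optimizer reads the rows through sal_d_idx or falls back to a separate sorting step; the exact wording of the plan depends on the SQLite version.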

11 Definition of clusters and indexes in SQL

Definition of clustered indexes (DB2):

    CREATE INDEX IndexName
    ON RelationName (ColumnNameList) CLUSTER;

– Only a single clustered index can be allocated on a given table.
– If the table is not empty when the clustered index is created, the data are not automatically regrouped; a special utility called REORG must be used.

12 Definition of clusters and indexes in SQL

Command for the creation of a cluster:

    CREATE CLUSTER ClusterName
    (ColName_1 Domain_1, ..., ColName_n Domain_n)
    [INDEX | HASH IS Expr | HASHKEYS n];

– ClusterName is the name of the cluster being defined
– (ColName_1 Domain_1, ..., ColName_n Domain_n), with n >= 1, is the specification of the cluster columns; this set of columns is called the cluster key

13 Definition of clusters and indexes in SQL

An auxiliary access structure is always associated with each cluster:
– Index (index cluster): the tuples with the same value for the cluster key are clustered and indexed by a B+-tree (default). The index is convenient if there are frequent queries with range predicates on the cluster key, or if the relations may frequently change size.
– Hash (hash cluster): the tuples with the same hash value for the cluster key are clustered and located through a hash function. The hash function is convenient if there are frequent queries with equality predicates on all cluster key columns and the relations are static.
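
The difference between the two access structures can be illustrated with a toy Python sketch. It rests on loud assumptions: a sorted list searched with bisect stands in for the B+-tree, a fixed list of buckets stands in for the hash cluster, and the key values and function names are invented for the example.

```python
import bisect

# Hypothetical cluster key values, kept in sorted order for the index case.
cluster_keys = [10, 20, 30, 40, 50]

def range_lookup(sorted_keys, low, high):
    """Index-style access: locate the first key >= low, then read in order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]

def build_hash_buckets(keys, n_buckets):
    """Hash-style storage: each key is placed in bucket key mod n_buckets."""
    buckets = [[] for _ in range(n_buckets)]
    for key in keys:
        buckets[key % n_buckets].append(key)
    return buckets

def equality_lookup(buckets, key):
    """Hash-style access: only the one bucket the key hashes to is examined."""
    return key in buckets[key % len(buckets)]

buckets = build_hash_buckets(cluster_keys, 7)
print(range_lookup(cluster_keys, 15, 40))
print(equality_lookup(buckets, 30))
```

The ordered structure answers a range predicate with one search followed by a sequential read, while the buckets answer an equality predicate by looking in a single place. The buckets give no help for ranges, since neighbouring key values land in unrelated buckets, and their number is fixed in advance, which is why the slide recommends hashing for static relations.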

14 Definition of clusters and indexes in SQL – hash clusters

– The DBMS always provides an internal hash function that is used as the default (very often based on the division method).
– The HASHKEYS option specifies the number of values for the hash function.
– If this value v is not a prime number, the system replaces it with the first prime number greater than v.
– The resulting value is the divisor of the integer remainder function that the system uses to generate the values of the hash function.

15 Definition of clusters and indexes in SQL – Example

Index cluster:

    CREATE CLUSTER Personnel (D# NUMBER);
    CREATE INDEX idxpersonnel ON CLUSTER Personnel;

Hash cluster:

    CREATE CLUSTER Personnel (D# NUMBER) HASHKEYS 10;

Since the HASHKEYS option is set to 10, the number of values generated by the hash function is 11 (the first prime number greater than 10).
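
The rounding of HASHKEYS to a prime number, and the division method that uses it, can be sketched in Python as follows. The function names are hypothetical, and the sketch follows only the rule stated in these notes; the internal hash function of an actual DBMS may differ.

```python
def is_prime(n):
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def effective_hashkeys(v):
    """Return v if it is prime, otherwise the first prime greater than v."""
    candidate = v
    while not is_prime(candidate):
        candidate += 1
    return candidate

def hash_value(key, hashkeys):
    """Division method: remainder of the key divided by the prime divisor."""
    return key % effective_hashkeys(hashkeys)

print(effective_hashkeys(10))  # 11, as in the Personnel example
print(hash_value(30, 10))      # 30 mod 11 = 8
```

With HASHKEYS 10, the department numbers 10, 20 and 30 hash to 10, 9 and 8 respectively, so each lands in a different group.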

16 Definition of clusters and indexes in SQL – hash clusters

The hash function can be replaced through the HASH IS option. This option can however only be used if:
– the cluster key only includes columns containing integer values
– the expression returns only positive values
– other conditions are also satisfied

17 Definition of clusters and indexes in SQL – index clusters

If the cluster is of type index, an index on the cluster must be created through the CREATE INDEX command before any query or modification is executed.

A cluster may include one or more relations:
– Single relation: the cluster is used to group the tuples of the relation that have the same value for the columns in the cluster key
– Multiple relations: the cluster is used to group the tuples of all relations that have the same value for the columns in the cluster key (joins on the columns that are part of the cluster key are efficient)

A relation must be placed in the cluster at the time the relation is created.

18 Definition of clusters and indexes in SQL – Example

Suppose we want to place the Employees and Departments relations in the Personnel cluster:

    CREATE TABLE Employees
    (Emp# Decimal(4) NOT NULL,
     Dept# Decimal(2))
    CLUSTER Personnel (Dept#);

    CREATE TABLE Departments
    (Dept# Decimal(2) NOT NULL)
    CLUSTER Personnel (Dept#);

19 Definition of clusters and indexes in SQL – Example

The columns on which the relations are clustered need not have the same names as the cluster columns; they must, however, have the same types.

