Index tuning Performance Tuning.

Slides:



Advertisements
Similar presentations
Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Advertisements

Index Dennis Shasha and Philippe Bonnet, 2013.
Hashing and Indexing John Ortiz.
Lock Tuning. overview © Dennis Shasha, Philippe Bonnet 2001 Sacrificing Isolation for Performance A transaction that holds locks during a screen interaction.
Dr. Kalpakis CMSC 661, Principles of Database Systems Index Structures [13]
Query Evaluation. An SQL query and its RA equiv. Employees (sin INT, ename VARCHAR(20), rating INT, age REAL) Maintenances (sin INT, planeId INT, day.
Query Evaluation. SQL to ERA SQL queries are translated into extended relational algebra. Query evaluation plans are represented as trees of relational.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8 “How index-learning turns no student pale Yet.
Chapter 8 File organization and Indices.
IS 4420 Database Fundamentals Chapter 6: Physical Database Design and Performance Leon Chen.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part A Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
1 Overview of Storage and Indexing Yanlei Diao UMass Amherst Feb 13, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8 “How index-learning turns no student pale Yet.
H.Lu/HKUST L07: Lock Tuning. L07-Lock Tuning - 2 H.Lu/HKUST Lock Tuning  The key is to combine the theory of concurrency control with practical DBMS.
1 Database Tuning Rasmus Pagh and S. Srinivasa Rao IT University of Copenhagen Spring 2007 February 8, 2007 Tree Indexes Lecture based on [RG, Chapter.
Homework #3 Due Thursday, April 17 Problems: –Chapter 11: 11.6, –Chapter 12: 12.1, 12.2, 12.3, 12.4, 12.5, 12.7.
1 CS143: Index. 2 Topics to Learn Important concepts –Dense index vs. sparse index –Primary index vs. secondary index (= clustering index vs. non-clustering.
AOBD07/08H. Galhardas Index Tuning. AOBD2007/08 H. Galhardas Index An index is a data structure that supports efficient access to data Set of Records.
DBMS Internals: Storage February 27th, Representing Data Elements Relational database elements: A tuple is represented as a record CREATE TABLE.
Lecture 6 Indexing Part 2 Column Stores. Indexes Recap Heap FileBitmapHash FileB+Tree InsertO(1) O( log B n ) DeleteO(P)O(1) O( log B n ) Range Scan O(P)--
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8.
1 Storage Refinement. Outline Disk failures To attack Intermittent failures To attack Media Decay and Write failure –Checksum To attack Disk crash –RAID.
Database Management 8. course. Query types Equality query – Each field has to be equal to a constant Range query – Not all the fields have to be equal.
Practical Database Design and Tuning. Outline  Practical Database Design and Tuning Physical Database Design in Relational Databases An Overview of Database.
Oracle Data Block Oracle Concepts Manual. Oracle Rows Oracle Concepts Manual.
1 Physical Data Organization and Indexing Lecture 14.
Index tuning-- B+tree. overview © Dennis Shasha, Philippe Bonnet 2001 B+-Tree Locking Tree Traversal –Update, Read –Insert, Delete phantom problem: need.
Physical Database Design & Performance. Optimizing for Query Performance For DBs with high retrieval traffic as compared to maintenance traffic, optimizing.
Database Tuning Prerequisite Cluster Index B+Tree Indexing Hash Indexing ISAM (indexed Sequential access)
Index tuning Performance Tuning. Overview Index An index is a data structure that supports efficient access to data Set of Records index Condition on.
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
Chapter 11 Indexing & Hashing. 2 n Sophisticated database access methods n Basic concerns: access/insertion/deletion time, space overhead n Indexing 
Physical Database Design I, Ch. Eick 1 Physical Database Design I About 25% of Chapter 20 Simple queries:= no joins, no complex aggregate functions Focus.
1 Performance Tuning Next, we focus on lock-based concurrency control, and look at optimising lock contention. The key is to combine the theory of concurrency.
CS Hardware Tuning Xiaofang Zhou School of Computing, NUS Office: S URL:
CS Operating System & Database Performance Tuning Xiaofang Zhou School of Computing, NUS Office: S URL:
CS 405G: Introduction to Database Systems 21 Storage Chen Qian University of Kentucky.
Lecture 5 Cost Estimation and Data Access Methods.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8.
1 Overview of Storage and Indexing Chapter 8. 2 Data on External Storage  Disks: Can retrieve random page at fixed cost  But reading several consecutive.
Overview of Storage and Indexing Content based on Chapter 4 Database Management Systems, (Third Edition), by Raghu Ramakrishnan and Johannes Gehrke. McGraw.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8 “How index-learning turns no student pale Yet.
Indexing and hashing Azita Keshmiri CS 157B. Basic concept An index for a file in a database system works the same way as the index in text book. For.
Indexing and Hashing By Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany) DIRECTOR ARUNAI ENGINEERING COLLEGE TIRUVANNAMALAI.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8 “If you don’t find it in the index, look very.
Marwan Al-Namari Hassan Al-Mathami. Indexing What is Indexing? Indexing is a mechanisms. Why we need to use Indexing? We used indexing to speed up access.
SQL/Lesson 7/Slide 1 of 32 Implementing Indexes Objectives In this lesson, you will learn to: * Create a clustered index * Create a nonclustered index.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Physical Database Design I, Ch. Eick 1 Physical Database Design I Chapter 16 Simple queries:= no joins, no complex aggregate functions Focus of this Lecture:
Chapter 5 Index and Clustering
B+ Trees: An IO-Aware Index Structure Lecture 13.
CS 440 Database Management Systems Lecture 6: Data storage & access methods 1.
Lock Tuning. Overview Data definition language (DDL) statements are considered harmful DDL is the language used to access and manipulate catalog or metadata.
Table Structures and Indexing. The concept of indexing If you were asked to search for the name “Adam Wilbert” in a phonebook, you would go directly to.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Overview of Storage and Indexing Chapter 8.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2007.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 File Organizations and Indexing Chapter 8 Jianping Fan Dept of Computer Science UNC-Charlotte.
1 Overview of Storage and Indexing Chapter 8. 2 Review: Architecture of a DBMS  A typical DBMS has a layered architecture.  The figure does not show.
1 Chapter 3. Tuning Indexes May 2002 Prof. Sang Ho Lee School of Computing, Soongsil University
SQL IMPLEMENTATION & ADMINISTRATION Indexing & Views.
Database System Architecture and Implementation
Indexes By Adrienne Watt.
Index An index is a performance-tuning method of allowing faster retrieval of records. An index creates an entry for each value that appears in the indexed.
Record Storage, File Organization, and Indexes
CS 540 Database Management Systems
Indexing and hashing.
COMP 430 Intro. to Database Systems
File organization and Indexing
Index Tuning Additional knowledge.
Indexing 4/11/2019.
Presentation transcript:

Index tuning Performance Tuning

Condition on attribute value Index An index is a data structure that supports efficient access to data Set of Records Matching records Condition on attribute value index (search key)

Index Implementations in some major DBMS SQL Server B+-Tree data structure Clustered indexes are sparse Indexes maintained as updates/insertions/deletes are performed DB2 B+-Tree data structure, spatial extender for R-tree Clustered indexes are dense Explicit command for index reorganization Oracle B+-tree, hash, bitmap, spatial extender for R-Tree No clustered index until 10g Index organized table (unique/clustered) Clusters used when creating tables. MySQL B+-Tree, R-Tree (geometry and pairs of integers) Indexes maintained as updates/insertions/deletes are performed

Types of Queries Point Query SELECT balance FROM accounts WHERE number = 1023; Multipoint Query SELECT balance FROM accounts WHERE branchnum = 100; Range Query SELECT number FROM accounts WHERE balance > 10000; Prefix Match Query SELECT * FROM employees WHERE name = ‘Jensen’ and firstname = ‘Carl’ and age < 30;

Types of Queries Extremal Query SELECT * FROM accounts WHERE balance = max(select balance from accounts) Ordering Query SELECT * FROM accounts ORDER BY balance; Grouping Query SELECT branchnum, avg(balance) FROM accounts GROUP BY branchnum; Join Query SELECT distinct branch.adresse FROM accounts, branch WHERE accounts.branchnum = branch.number and accounts.balance > 10000;

Benefits of Clustered Index Benefits of a clustered index: A sparse clustered index stores fewer pointers than a dense index. This might save up to one level in the B-tree index. A clustered index is good for multipoint queries White pages in a paper telephone book A clustered index based on a B-Tree supports range, prefix, extremal and ordering queries well. A clustered index (on attribute X) can reduce lock contention: Retrieval of records or update operations using an equality, a prefix match or a range condition based on X will access and lock only a few consecutive pages of data

Advantage of Clustered Index Multipoint query that returns 100 records out of 1000000. Cold buffer Clustered index is twice as fast as non-clustered index and orders of magnitude faster than a scan. 7

Disvantage of Clustered Index Cost of a clustered index Cost of overflow pages Due to insertions Due to updates (e.g., replace a NULL value by a long string)

Index “Face Lifts” Index is created with fillfactor = 100. Insertions cause page splits and extra I/O for each query Maintenance consists in dropping and recreating the index With maintenance performance is constant while performance degrades significantly if no maintenance is performed. 9

Index “Face Lifts” Index is created with pctfree = 0 Insertions cause records to be appended at the end of the table Each query thus traverses the index structure and scans the tail of the table. Performances degrade slowly when no maintenance is performed.

Index “Face lifts” In Oracle, clustered index are approximated by an index defined on a clustered table No automatic physical reorganization Index defined with pctfree = 0 Overflow pages cause performance degradation

Clustered Index Because there is only one clustered index per table, it might be a good idea to replicate a table in order to use a clustered index on two different attributes Yellow and white pages in a paper telephone book Which is feasible for Low insertion/update rate

Non-Clustered Index Benefits of non-clustered indexes A non-clustered index can eliminate the need to access the underlying table through covering. It might be worth creating several indexes to increase the likelihood that the optimizer can find a covering index A non-clustered index is good if each query retrieves significantly fewer records than there are pages in the table. Point queries Multipoint queries: number of distinct key values > c * number of records per page Where c is the number of pages can be prefetched in each disk read

Example Non-clustering index on attribute A, which has 20 different values, each equality query will retrieve approximately 1/20 records If each page contains 80 record, then nearly every page will have almost every distinct values of A If each page contains 2 record, a query will touch only every tenth page on the average

Scan Can Sometimes Win IBM DB2 v7.1 on Windows 2000 Range Query If a query retrieves 10% of the records or more, scanning is often better than using a non-clustering non-covering index. Crossover > 10% when records are large or table is fragmented on disk – scan cost increases.

Covering Index - defined Select name from employee where department = “marketing” Good covering index would be on (department, name) Index on (name, department) less useful. Index on department alone moderately useful.

Covering Index - impact Covering index performs better than clustering index when first attributes of index are in the where clause and last attributes in the select. Select B,C from R where A=5, there exists a non-clustered index on (A,B,C) When attributes are not in order then performance is much worse.

Index on Small Tables Tuning manuals suggest to avoid indexes on small tables If all data from a relation fits in one page then an index page adds an I/O However, in following cases, index is preferred If each record fits in a page then an index helps performance, since retrieving each page needs a page I/O Allowing row locking

Index on Small Tables Small table: 100 records Two concurrent processes perform updates (each process works for 10ms before it commits) No index: the table is scanned for each update. The whole table is locked. No concurrent updates. A clustered index allow to take advantage of row locking.

Multipoint query: B-Tree, Hash Tree, Bitmap There is an overflow chain in a hash index In a clustered B-Tree index records are on contiguous pages. Bitmap is proportional to size of table and non-clustered for record access.

B-Tree, Hash Tree, Bitmap Hash indexes don’t help when evaluating range queries Hash index outperforms B-tree on point queries

Key Compression Use key compression If you are using a B-tree Compressing the key will reduce the number of levels in the tree The system is not CPU-bound Updates are relatively rare

Summary Use a hash index for point queries only. Use a B-tree if multipoint queries or range queries are used Use clustering if your queries need all or most of the fields of each records returned (compared to index-only scan) if multipoint or range queries are asked Use a dense index to cover critical queries Don’t use an index if the time lost when inserting and updating overwhelms the time saved when querying

Index Tuning Wizard The index wizard MS SQL Server 7 and above A database (schema + data + existing indexes) Trace representative of the workload Out: Evaluation of existing indexes Recommendations on index creation and deletion The index wizard Enumerates possible indexes on one attribute, then several attributes Traverses this search space using the query optimizer to associate a cost to each index

Index Tuning -- data Settings: employees(ssnum, name, lat, long, hundreds1, hundreds2); clustered index c on employees(hundreds1) with fillfactor = 100; nonclustered index nc on employees (hundreds2); index nc3 on employees (ssnum, name, lat); index nc4 on employees (lat, ssnum, name); 1000000 rows ; Cold buffer Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.

Index Tuning -- operations Update: update employees set name = ‘XXX’ where ssnum = ?; Insert: insert into employees values (1003505,'polo94064',97.48,84.03,4700.55,3987.2); Multipoint query: select * from employees where hundreds1= ?; select * from employees where hundreds2= ?; Covered query: select ssnum, name, lat from employees; Range Query: select * from employees where long between ? and ?; Point Query: select * from employees where ssnum = ? © Dennis Shasha, Philippe Bonnet 2001

Bitmap vs. Hash vs. B+-Tree Settings: employees(ssnum, name, lat, long, hundreds1, hundreds2); create cluster c_hundreds (hundreds2 number(8)) PCTFREE 0; create cluster c_ssnum(ssnum integer) PCTFREE 0 size 60; create cluster c_hundreds(hundreds2 number(8)) PCTFREE 0 HASHKEYS 1000 size 600; create cluster c_ssnum(ssnum integer) PCTFREE 0 HASHKEYS 1000000 SIZE 60; create bitmap index b on employees (hundreds2); create bitmap index b2 on employees (ssnum); 1000000 rows ; Cold buffer Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000. 27