Indexes and Performance

Indexes and Performance BCHB697

Outline
- Constraints
  - ...on primary key, foreign key
- Indexes
  - Single value index strategies, balanced tree implementation
  - Unique, sorting
  - Impact of column data-type, values
  - Cost of insertion
  - As in-memory column subset
- Query optimization
  - Logically similar, but slower?
  - Explain (SQL query execution strategy)
  - Sorted results (order by, group by)
BCHB697 - Edwards

Constraints
Guarantees enforced by the DBMS whenever the data is changed:
- Primary key uniqueness
- Foreign key referential integrity
- Valid values (for enumerated data-types, esp.)
Typically implemented using indexes!
- Adds to insert/delete cost
- Turn off, for bulk loads
Usually, an index for the primary key is automatically made for each table.

Linear Search
Unsorted, O(n)
- Have to examine every element
- Quick to insert (append)
Sorted, O(n)
- Can stop early (element is not present)
- Slow to insert, O(n)
Can we do better?
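The two linear scans can be sketched as follows. This is a minimal illustration rather than anything from the lecture; the function names and the sample ids are made up for the example.

```python
from typing import Optional, Sequence


def find_unsorted(values: Sequence[int], target: int) -> Optional[int]:
    """Scan every element; a miss always costs a full pass, O(n)."""
    for position, value in enumerate(values):
        if value == target:
            return position
    return None


def find_sorted(values: Sequence[int], target: int) -> Optional[int]:
    """Scan a sorted sequence, stopping as soon as values exceed the target."""
    for position, value in enumerate(values):
        if value == target:
            return position
        if value > target:
            # Everything further right is larger, so the target is absent.
            return None
    return None


unsorted_ids = [9606, 10090, 7227, 4932, 562]   # hypothetical ids
sorted_ids = sorted(unsorted_ids)

print(find_unsorted(unsorted_ids, 7227))
print(find_sorted(sorted_ids, 7227))
print(find_sorted(sorted_ids, 5000))
```

The early exit makes a miss cheaper on average, but the worst case is still a full pass, so both versions remain O(n).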

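Binary Search
Sorted, O(log n)
- Check middle value, element is to left or right
- Insert: O(log n) to find the position (a sorted array must still shift elements, O(n); a balanced tree keeps the whole insert at O(log n))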
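A minimal sketch of binary search over a sorted Python list, with the sample ids invented for illustration. `insort` from the standard `bisect` module finds the insertion point in O(log n), although the list insert that follows still shifts elements.

```python
from bisect import insort
from typing import Optional, Sequence


def binary_search(sorted_values: Sequence[int], target: int) -> Optional[int]:
    """Return the position of target, halving the search range each step."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            return middle
        if sorted_values[middle] < target:
            low = middle + 1      # target can only be to the right
        else:
            high = middle - 1     # target can only be to the left
    return None


ids = [562, 4932, 7227, 9606, 10090]   # hypothetical, already sorted
print(binary_search(ids, 9606))
print(binary_search(ids, 5000))

# Keep the list sorted while inserting a new value.
insort(ids, 6239)
print(ids)
```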

Binary Search
Conceptually, we can represent this algorithm as a tree.

Balanced tree index
- No longer single values, disk blocks instead
- For good performance, require balanced tree
(Figure credit: Data Modeling Essentials)

Primary Key Index
- Index is a separate "table"
- Identifies the disk block containing the record
- (Also) sorted, (much) smaller, keep in memory
- Fast access to a few records
(Figure credit: Fundamentals of Database Systems)

Types of Indexes
- Unique: primary key indexes
  - Sometimes algorithms can take advantage
- Sorting index
  - The order of records in data-blocks of table
  - Usually primary key order
  - Access records on the same disk block cheaply
- Numeric index has obvious sort order
- Strings sort lexicographically

Indexes and Data-Type
- Indexes on integer columns are fastest
  - Widest variety of data-structures and algorithms
- Indexes are most useful when columns have lots of distinct values
  - "Information content" of a column: how many rows have each value
- Strings are indexed from the beginning of the string
  - like 'w%' can use the index, but like '%w' cannot
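Why a leading-prefix pattern can use a sorted index while a trailing one cannot is easier to see on a plain sorted list. The sketch below is an analogy in Python, not how any particular DBMS implements it; the names and helper functions are hypothetical.

```python
from bisect import bisect_left
from typing import List


def prefix_matches(sorted_names: List[str], prefix: str) -> List[str]:
    """Like "name like 'prefix%'": all matches sit in one contiguous range."""
    if not prefix:
        return list(sorted_names)
    start = bisect_left(sorted_names, prefix)
    # Smallest string that sorts after every string starting with prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    end = bisect_left(sorted_names, upper)
    return sorted_names[start:end]


def suffix_matches(sorted_names: List[str], suffix: str) -> List[str]:
    """Like "name like '%suffix'": sort order does not help, scan everything."""
    return [name for name in sorted_names if name.endswith(suffix)]


names = sorted(["wolf", "walrus", "cow", "sparrow", "zebra", "wasp"])
print(prefix_matches(names, "w"))
print(suffix_matches(names, "w"))
```

The prefix lookup needs two binary searches and a slice; the suffix lookup has to touch every entry, which is the full-scan behaviour described above.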

When to use an index
Why not add an index on every column?
- Slow down inserts, updates, deletes
- Uses memory, disk-space
- Index won't necessarily speed up specific queries
Create indexes for:
- Foreign keys (+ primary keys, of course)
- Columns with good information content used often in where constraints
- Columns used for sorting, grouping, distinct
- Columns frequently shown in results

Indexes: Cost of insertion
- Each index increases the amount of work to insert a row into the table
  - Trade off: time for insert vs time for retrieve
  - Depending on complexity of data-structure, may need very expensive "rebalancing" every so often
- For bulk inserts:
  - Turn off indexes, bulk load, turn on indexes
  - Create unsorted table (append), sort once
  - Turn off constraints too (implemented using indexes)
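The bulk-load pattern (drop the index, load, rebuild the index once) can be sketched with Python's built-in `sqlite3` module. This is an assumption-laden stand-in: the lecture uses MySQL, and the table layout, the index name `idx_parent`, and the generated rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # SQLite stand-in for the MySQL server
conn.execute(
    "create table taxonomy ("
    "tax_id integer primary key, parent_id integer, scientific_name text)"
)
conn.execute("create index idx_parent on taxonomy (parent_id)")

# Hypothetical rows standing in for a large dump file.
rows = [(i, i // 10, f"species {i}") for i in range(1, 10001)]

# 1. Turn off (drop) the secondary index so inserts do not maintain it.
conn.execute("drop index idx_parent")

# 2. Bulk load inside a single transaction.
with conn:
    conn.executemany("insert into taxonomy values (?, ?, ?)", rows)

# 3. Rebuild the index once, over the fully loaded table.
conn.execute("create index idx_parent on taxonomy (parent_id)")

row_count = conn.execute("select count(*) from taxonomy").fetchone()[0]
index_names = [row[1] for row in conn.execute("pragma index_list('taxonomy')")]
print(row_count, index_names)
```

Whether this is actually faster depends on the DBMS, the data volume, and the number of indexes, so it is worth timing both variants on the real data.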

Indexes: In-Memory Column Cache
- Indexes are (usually) smaller than the corresponding tables
- Indexes are held in memory (frequently accessed)
- Each index has the values of some columns
- If the query can be satisfied entirely from an index: FAST!
- Existence check using primary key: FAST!
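A query that can be answered entirely from an index can be observed with SQLite's `explain query plan`, used here through Python's `sqlite3` module as a stand-in for MySQL. The table layout, index name, and rows are hypothetical, and the exact plan wording is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # SQLite stand-in, not the lecture's MySQL
conn.execute(
    "create table taxonomy ("
    "tax_id integer primary key, parent_id integer, scientific_name text)"
)
conn.executemany(
    "insert into taxonomy values (?, ?, ?)",
    [(i, i // 10, f"species {i}") for i in range(1, 1001)],
)
conn.execute("create index idx_parent on taxonomy (parent_id)")


def plan(query: str) -> str:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("explain query plan " + query).fetchall()
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are requested: the table itself is never read.
covered = plan("select parent_id from taxonomy where parent_id = 42")
# scientific_name is not in the index: each match needs a table lookup.
not_covered = plan("select scientific_name from taxonomy where parent_id = 42")

print(covered)
print(not_covered)
```

In SQLite the first plan mentions a covering index, while the second uses the same index but then has to visit the table for the missing column.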

Query optimization
- DBMS is responsible for deciding to use an index to satisfy each query
  - Scanning entire table(s) is always correct, but slow
  - An index may not be used, or may not help
  - DBMS keeps statistics on each column!
- Some conditions "fake out" DBMS heuristics
  - like '%w', !=, arithmetic
- order by, group by require sorting the entire result
  - max vs order by limit 1
- Some queries return too many rows

Query optimization
Load entire taxa database into bigtaxa:
  cd /opt/lampp
  ./bin/mysql -u root < ~/bigtaxa.sql
15 minutes later…

Query optimization
After loading the entire taxa database into bigtaxa:
  select * from taxonomy                             (0.0023 seconds)
  select * from taxonomy order by scientific_name    (4.0637 seconds)

Query optimization
  select * from taxonomy where scientific_name like 'w%'    (0.0395 seconds)
  select * from taxonomy where scientific_name like '%w'    (0.6636 seconds)

Query optimization
Add index on scientific_name column
0.0037 seconds
  select * from taxonomy where scientific_name like 'w%'
  select * from taxonomy where scientific_name like '%w'

Explain
The SQL keyword explain in front of a query will report on the query execution strategy:
  explain select * from taxonomy where scientific_name like 'w%'
  explain select * from taxonomy where scientific_name like '%w'

Query optimization
Avoid order by, group by, distinct unless you need it
  select scientific_name from taxonomy                             (0.0016 seconds)
  select distinct scientific_name from taxonomy                    (0.0107 seconds)
  select scientific_name from taxonomy order by scientific_name    (3.0367 seconds)

Query optimization
  select * from taxonomy where parent_id = 9606;       (2.9806 seconds)
  select * from taxonomy where parent_id + 1 = 9607    (2.9581 seconds)

Query optimization
Add index to parent_id…
0.0025 seconds
  select * from taxonomy where parent_id = 9606;
  select * from taxonomy where parent_id + 1 = 9607
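The effect of wrapping an indexed column in arithmetic can be checked with a query plan. The sketch below uses Python's `sqlite3` module and SQLite's `explain query plan` as a stand-in for MySQL's explain; the schema, the index name `idx_parent`, and the generated rows are hypothetical, and plan wording differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # SQLite stand-in, not the lecture's MySQL
conn.execute(
    "create table taxonomy ("
    "tax_id integer primary key, parent_id integer, scientific_name text)"
)
conn.executemany(
    "insert into taxonomy values (?, ?, ?)",
    [(i, 9606 if i % 100 == 0 else i // 10, f"species {i}") for i in range(1, 5001)],
)
conn.execute("create index idx_parent on taxonomy (parent_id)")


def plan(query: str) -> str:
    """Return the detail text of SQLite's plan for the given query."""
    rows = conn.execute("explain query plan " + query).fetchall()
    return " | ".join(row[-1] for row in rows)


direct = "select * from taxonomy where parent_id = 9606"
arithmetic = "select * from taxonomy where parent_id + 1 = 9607"

direct_plan = plan(direct)           # index lookup on parent_id
arithmetic_plan = plan(arithmetic)   # expression hides the column: full scan

print(direct_plan)
print(arithmetic_plan)

# Both queries are logically the same and return the same rows.
same_rows = sorted(conn.execute(direct)) == sorted(conn.execute(arithmetic))
print(same_rows)
```

Rewriting the condition so the bare column stands alone on one side of the comparison lets the optimizer match it against the index.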

Query optimization
  select * from taxonomy where parent_id = 9606;

Query optimization
  select customer.first_name, customer.last_name,
         address.address, city.city, country.country
  from customer
    join address on customer.address_id = address.address_id
    join city on address.city_id = city.city_id
    join country on city.country_id = country.country_id
  order by country.country, city.city, customer.last_name

Query optimization
  select film.film_id, film.title,
         group_concat(concat(actor.first_name," ", actor.last_name)
                      order by actor.last_name separator "; ")
  from film
    join film_actor on film.film_id = film_actor.film_id
    join actor on film_actor.actor_id = actor.actor_id
  group by film.film_id, film.title

Exercise
- Load the bigtaxa database, experiment!
  - Reproduce the examples in the lecture.
- Look at the indexes in the sakila database
  - Try the queries from lecture 6 and 7
  - Use explain to figure out which indexes are used.
- Delete (or add) some indexes
  - Try the queries again
  - Which queries get slower (faster)?