Optimizing MySQL Joins and Subqueries

Slides:



Advertisements
Similar presentations
Advanced SQL (part 1) CS263 Lecture 7.
Advertisements

© 2007 by Prentice Hall (Hoffer, Prescott & McFadden) 1 Joins and Sub-queries in SQL.
Drop in replacement of MySQL. Agenda MySQL branch GPL licence Maria storage engine Virtual columns FederatedX storage engine PBXT storage engine XtraDB.
Query Optimization CS634 Lecture 12, Mar 12, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Introduction to Structured Query Language (SQL)
Nov 24, 2003Murali Mani SQL B term 2004: lecture 12.
Introduction to Structured Query Language (SQL)
Introduction to SQL Structured Query Language Martin Egerhill.
Physical Database Design & Performance. Optimizing for Query Performance For DBs with high retrieval traffic as compared to maintenance traffic, optimizing.
MySQL. Dept. of Computing Science, University of Aberdeen2 In this lecture you will learn The main subsystems in MySQL architecture The different storage.
Chapter 7 © 2013 Pearson Education, Inc. Publishing as Prentice Hall 1 Modern Database Management 11 th Edition Jeffrey A. Hoffer, V. Ramesh, Heikki Topi.
SQL 101 for Web Developers 14 November What is a database and why have one? Tables, relationships, normalization SQL – What SQL is and isn’t – CRUD:
CS146 References: ORACLE 9i PROGRAMMING A Primer Rajshekhar Sunderraman
6 1 Lecture 8: Introduction to Structured Query Language (SQL) J. S. Chou, P.E., Ph.D.
SQL Performance and Optimization l SQL Overview l Performance Tuning Process l SQL-Tuning –EXPLAIN PLANs –Tuning Tools –Optimizing Table Scans –Optimizing.
Be a MySQL Ninja Sheeri Cabral Senior DB Admin/Architect
Proactively Optimizing Queries with EXPLAIN and mk-query-digest Presented by: Sheeri K. Cabral Database Operations Manager MySQL Sunday at Oracle OpenWorld.
Introduction MySQL won't actually execute the query, just analyse it EXPLAIN helps us understand how and when MySQL will use indexes EXPLAIN returns a.
Database Fundamental & Design by A.Surasit Samaisut Copyrights : All Rights Reserved.
Views, Algebra Temporary Tables. Definition of a view A view is a virtual table which does not physically hold data but instead acts like a window into.
SqlExam1Review.ppt EXAM - 1. SQL stands for -- Structured Query Language Putting a manual database on a computer ensures? Data is more current Data is.
In this session, you will learn to: Query data by using joins Query data by using subqueries Objectives.
Query Processing – Implementing Set Operations and Joins Chap. 19.
Manipulating Data Lesson 3. Objectives Queries The SELECT query to retrieve or extract data from one table, how to retrieve or extract data by using.
Thinking in Sets and SQL Query Logical Processing.
7 1 Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7.6 Advanced Select Queries SQL provides useful functions that.
There’s a particular style to it… Rob Hatton
LM 5 Introduction to SQL MISM 4135 Instructor: Dr. Lei Li.
How to kill SQL Server Performance Håkan Winther.
LAB: Web-scale Data Management on a Cloud Lab 11. Query Execution Plan 2011/05/27.
CSC314 DAY 9 Intermediate SQL 1. Chapter 6 © 2013 Pearson Education, Inc. Publishing as Prentice Hall USING AND DEFINING VIEWS  Views provide users controlled.
LEC-8 SQL. Indexes The CREATE INDEX statement is used to create indexes in tables. Indexes allow the database application to find data fast; without reading.
Bend SQL to Your Will With EXPLAIN Ligaya Turmelle MySQL Support Engineer
1 Copyright 2009 Sun Microsystems Inc. The World’s Most Popular Open Source Database How MySQL.com Improved their Database Performance with Query Analyzer.
Join-fu: The Art of SQL Jay Pipes Community Relations Manager MySQL These slides released.
SQL IMPLEMENTATION & ADMINISTRATION Indexing & Views.
Join-fu: The Art of SQL Tuning Jay Pipes Community Relations Manager MySQL These slides.
Partitioning Sheeri K. Cabral Database Administrator The Pythian Group, January 12, 2009.
MySQL Performance Tuning Tips, tricks, and techniques Jay Pipes Community Relations Manager MySQL
Join-fu: The Art of SQL Part I – Basic Join-Fu
Tuning Transact-SQL Queries
Chapter 12 Subqueries and MERGE Oracle 10g: SQL
Prepared by : Moshira M. Ali CS490 Coordinator Arab Open University
Mailboxes and MySQL at Zimbra
Indices.
02 | Advanced SELECT Statements
Database Systems: Design, Implementation, and Management Tenth Edition
Are You Getting the Best Out of Your MySQL Indexes?
© 2010, Mike Murach & Associates, Inc.
MySQL Explain examples
Introduction to Execution Plans
David M. Kroenke and David J
Lecture 12 Lecture 12: Indexing.
Introduction To Structured Query Language (SQL)
CMPT 354: Database System I
Structured Query Language – The Fundamentals
Geo/Spatial Search with MySQL
Advance Database Systems
CMPT 354: Database System I
Introduction To Structured Query Language (SQL)
Introduction to Execution Plans
Contents Preface I Introduction Lesson Objectives I-2
Chapter 8 Advanced SQL.
CS222P: Principles of Data Management Notes #13 Set operations, Aggregation, Query Plans Instructor: Chen Li.
Database Systems: Design, Implementation, and Management Tenth Edition
Database Administration
Introduction to Execution Plans
Introduction to Execution Plans
CSC 453 Database Systems Lecture
CS/SE 157B Database Management Systems II April 24 Class Meeting
Presentation transcript:

Optimizing MySQL Joins and Subqueries Sheeri Cabral Senior DB Admin/Architect, Mozilla @sheeri www.sheeri.com Northeast PHP 2012 http://bit.ly/2012optmysql

EXPLAIN SQL extension SELECT only Can modify other statements: UPDATE tbl SET fld1=“foo” WHERE fld2=“bar”; can be changed to: EXPLAIN SELECT fld1 FROM tbl WHERE fld2=”bar”;

What EXPLAIN Shows How many tables How tables are joined How data is looked up If there are subqueries, unions, sorts

What EXPLAIN Shows If WHERE, DISTINCT are used Possible and actual indexes used Length of index used Approx # of records examined

Metadata Optimizer uses metadata: cardinality, # rows, etc. InnoDB - approx stats InnoDB - one method of doing dives into the data MyISAM has better/more accurate metadata

EXPLAIN Output EXPLAIN returns 10 fields: mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 select_type: SIMPLE table: rental type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: 1 row in set (0.00 sec) One row per table.

Id Id = sequential identifier One per table, subquery, derived table mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 Id = sequential identifier One per table, subquery, derived table No row returned for a view Because it is virtual Underlying tables are represented One row per table.

select_type SIMPLE – one table, or JOINs PRIMARY mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 select_type: SIMPLE SIMPLE – one table, or JOINs PRIMARY First SELECT in a UNION Outer query of a subquery UNION, UNION RESULT One row per table.

Other select_type output Used in subqueries More on subqueries later DEPENDENT UNION DEPENDENT SUBQUERY DERIVED UNCACHEABLE SUBQUERY One row per table.

table - One per table/alias - NULL mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 select_type: SIMPLE table: rental - One per table/alias - NULL One row per table.

NULL table EXPLAIN SELECT 1+2\G EXPLAIN SELECT return_date FROM rental WHERE rental_id=0\G One row per table.

type “Data access method” Get this as good as possible mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 select_type: SIMPLE table: rental type: const “Data access method” Get this as good as possible One row per table.

type ALL = full table scan Everything else uses an index index = full index scan Scanning the entire data set? full index scan > full table scan (covering index) range = partial index scan <, <=, >, >= IS NULL, BETWEEN, IN One row per table.

type index_subquery using a non-unique index of one table unique subquery using a PRIMARY/UNIQUE KEY of one table More about subqueries later One row per table.

type index_merge Use more than one index Extra field shows more information sort_union, intersection, union One row per table.

ref_or_null Joining/looking up non-unique index values JOIN uses a non-unique index or key prefix Indexed fields compared with = != <=> Extra pass for possible NULL values One row per table.

ref Joining/looking up non-unique index values JOIN uses a non-unique index or key prefix Indexed fields compared with = != <=> No NULL value possibilities Best data access strategy for non-unique values One row per table.

eq_ref Joining/looking up unique index values JOIN uses a unique index or key prefix Indexed fields compared with = One row per table.

Fastest Data Access Joining/looking up unique index values SELECT return_date FROM rental WHERE rental_id=13534; System – system table, one value One row per table.

EXPLAIN Plan indexes possible_keys key key_len – longer keys = longer look up/ compare ref – shows what is compared, field or “const” Look closely if an index is not considered One row per table.

Approx # rows examined mysql> EXPLAIN SELECT return_date -> FROM rental WHERE rental_id = 13534\G ******************* 1. row ******************* id: 1 select_type: SIMPLE table: rental type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: 1 row in set (0.00 sec) One row per table.

Approx # rows examined mysql> EXPLAIN SELECT first_name,last_name FROM customer LIMIT 10\G *************** 1. row ***************** id: 1 select_type: SIMPLE table: customer type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 541 Extra: 1 row in set (0.00 sec) LIMIT does not change rows, even though it affects # rows examined. One row per table.

Extra Can be good, bad, neutral Sometimes you cannot avoid the bad Distinct – stops after first row match Full scan on NULL key – subquery, no index (bad) Impossible WHERE noticed after reading const tables One row per table.

Extra Not exists – stops after first row match for each row set from previous tables Select tables optimized away – Aggregate functions resolved by index or metadata (good) Range checked for each record (index map: N) No good index; may be one after values from previous tables are known One row per table.

Extra: Using (...) Extra: Using filesort – does an extra pass to sort the data. Worse than using an index for sort order. Index – uses index only, no table read Covering index Index for group-by GROUP BY/DISTINCT resolved by index/metadata Temporary Intermediate temporary table used One row per table.

Sample Subquery EXPLAIN mysql> EXPLAIN SELECT first_name,last_name,email -> FROM customer AS customer_outer -> WHERE customer_outer.customer_id -> IN (SELECT customer_id FROM rental AS rental_subquery WHERE return_date IS NULL)\G ********* 1. row ******** id: 1 select_type: PRIMARY table: customer_outer type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 541 Extra: ********** 2. row ********** id: 2 select_type: DEPENDENT SUBQUERY table: rental_subquery type: index_subquery possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: func rows: 13 Extra: Using where; Full scan on NULL key 2 rows in set (0.00 sec) One row per table.

MySQL and Subqueries Avoid unoptimized subqueries Not all subqueries...any more Derived tables → views or intermediate temp tbls Subqueries → joins in some cases Getting better all the time Optimized in MariaDB 5.3 One row per table.

MySQL Does Not Have Materialized views Materialized derived tables Functional indexes (e.g. WHERE date(ts)=2012_05_30)

Convert a Subquery to a JOIN SELECT first_name,last_name,email IN (SELECT customer_id FROM rental AS rental_subquery WHERE return_date IS NULL) FROM customer AS customer_outer\G One row per table.

Convert a Subquery to a JOIN SELECT first_name,last_name,email IN (SELECT customer_id FROM rental AS rental_subquery WHERE return_date IS NULL) FROM customer AS customer_outer\G Think in data sets One row per table.

Convert a Subquery to a JOIN SELECT first_name,last_name,email IN (SELECT customer_id FROM rental AS rental_subquery WHERE return_date IS NULL) FROM customer AS customer_outer\G Think in data sets SELECT first_name,last_name, email FROM rental INNER JOIN customer ON (customer.id=rental.customer_id) WHERE return_date IS NULL One row per table.

Convert a Subquery to a JOIN SELECT first_name,last_name,email IN (SELECT customer_id FROM rental AS rental_subquery WHERE return_date IS NULL) FROM customer AS customer_outer\G Think in data sets SELECT first_name,last_name, email FROM rental INNER JOIN customer ON (customer.id=rental.customer_id) WHERE return_date IS NULL Note the ANSI-style JOIN clause Explicit declaration of JOIN conditions Do not use theta-style implicit JOIN conditions in WHERE One row per table.

ANSI vs. Theta JOINs INNER JOIN, CROSS JOIN, JOIN are the same SELECT first_name,last_name, email FROM rental INNER JOIN customer ON (customer.id=rental.customer_id) WHERE return_date IS NULL AND customer.id=rental.customer_id INNER JOIN, CROSS JOIN, JOIN are the same Don't use a comma join (FROM rental,customer) One row per table.

A Correlated Subquery Show the last payment info for each customer: For each customer, find the max payment date, then get that info SELECT pay_outer.* FROM payment pay_outer WHERE pay_outer.payment_date = (SELECT MAX(payment_date) FROM payment pay_inner WHERE pay_inner.customer_id=pay_outer.customer_id)

EXPLAIN SELECT pay_outer.* FROM payment pay_outer WHERE pay_outer.payment_date = (SELECT MAX(payment_date) FROM payment pay_inner WHERE pay_inner.customer_id=pay_outer.customer_id) ********************** 1. row ********************** id: 1 select_type: PRIMARY table: pay_outer type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 16374 Extra: Using where ******************** 2. row ******************** id: 2 select_type: DEPENDENT SUBQUERY table: pay_inner type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.pay_outer.customer_id rows: 14 Extra: 2 rows in set (0.00 sec)

Think in Terms of Sets Show the last payment info for each customer: Set of last payment dates, set of all payment info, join the sets SELECT payment.* FROM (SELECT customer_id, MAX(payment_date) as last_order FROM payment GROUP BY customer_id) AS last_orders INNER JOIN payment ON payment.customer_id = last_orders.customer_id AND payment.payment_date = last_orders.last_order\G

EXPLAIN EXPLAIN SELECT payment.* FROM (SELECT customer_id, MAX(payment_date) as last_order FROM payment GROUP BY customer_id) AS last_orders INNER JOIN payment ON payment.customer_id = last_orders.customer_id AND payment.payment_date = last_orders.last_order\G ************ 3. row ************ id: 2 select_type: DERIVED table: payment type: range possible_keys: NULL key: customer_id key_len: 2 ref: NULL rows: 1301 Extra: Using index for group-by 3 rows in set (0.01 sec) ********** 1. row ********** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 599 Extra: ************** 2. row ************** id: 1 select_type: PRIMARY table: payment type: ref possible_keys: idx_fk_customer_id,customer_id_pay key: customer_id_pay key_len: 10 ref: last_orders.customer_id, last_orders.last_order rows: 1 Extra:

************ 1. row ************ id: 1 select_type: PRIMARY table: pay_outer type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 16374 Extra: Using where ******************** 2. row ******************** id: 2 select_type: DEPENDENT SUBQUERY table: pay_inner type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.pay_outer.customer_id rows: 14 Extra: 2 rows in set (0.00 sec) ************ 3. row ************ id: 2 select_type: DERIVED table: payment type: range possible_keys: NULL key: customer_id key_len: 2 ref: NULL rows: 1301 Extra: Using index for group-by 3 rows in set (0.01 sec) ********** 1. row ********** id: 1 select_type: PRIMARY table: <derived2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 599 Extra: ************** 2. row ************** id: 1 select_type: PRIMARY table: payment type: ref possible_keys: idx_fk_customer_id,customer_id_pay key: customer_id_pay key_len: 10 ref: last_orders.customer_id, last_orders.last_order rows: 1 Extra:

Join-fu http://joinfu.com/presentations/joinfu/joinfu_part_one.pdf - p 22, mapping tables http://joinfu.com/presentations/joinfu/joinfu_part_two.pdf - heirarchies/graphs/nested sets - GIS calculations - reporting/aggregates/ranks With thanks to Jay Pipes!

Questions? Comments? OurSQL Podcast www.oursql.com MySQL Administrator's Bible - tinyurl.com/mysqlbible kimtag.com/mysql planet.mysql.com