InnoDB Performance and Usability Patches MySQL CE 2009 Vadim Tkachenko, Ewen Fortune Percona Inc MySQLPerformanceBlog.com.

Presentation transcript:

InnoDB Performance and Usability Patches MySQL CE 2009 Vadim Tkachenko, Ewen Fortune Percona Inc MySQLPerformanceBlog.com

Who are we?
Vadim Tkachenko
  – Co-founder of Percona Inc, lead of the R&D department
  – Co-author of MySQLPerformanceBlog.com
  – Co-author of the book “High Performance MySQL”, 2nd edition
Ewen Fortune
  – Consultant, Percona Inc
Special thanks: Yasufumi Kinoshita
  – Not here, but the author of most of the patches

What is this talk about?
Patches made by Percona for the InnoDB storage engine
Two main focuses:
  – Performance improvement patches
  – “Usability” patches that make InnoDB a bit friendlier
The world has changed since the days of the 100MHz Pentium and 8MB of RAM
  – But many assumptions from that era are still in the InnoDB code

Why we do it
Most requirements and changes come from practical work with customers
We need InnoDB to fully utilize today’s hardware:
  – 16 cores
  – RAID arrays
  – SSD / FusionIO / other storage technologies
The InnoDB team is “conservative” about making improvements in this area

Future
Why patches? Why can’t they be included in InnoDB?
  – We are often asked about this, but the question is really one for the InnoDB team
(empty space due to the uncertainty of MySQL’s future at Oracle)
Anyway, we will continue our work

Versions
5.0
  – Set of patches
  – SHOW PATCHES shows the full list
5.1
  – XtraDB storage engine
  – Based on InnoDB + patches; not a real competitor to InnoDB, but an enhanced drop-in version

Performance Patches

Scalability
Enhanced read_write locks
  – Improve InnoDB scalability on systems with 8-16 cores
  – Similar to the Google implementation and to InnoDB-plugin
  – Our implementation is an alternative
    – Which one is better is a topic for research
    – InnoDB-plugin may be preferred: the InnoDB team did the hard job of porting it to many platforms
  – And now in 5.4
Split buffer_pool mutex even more
  – Additional split of buffer_pool mutex to

IO patches
InnoDB IO patches
  – Partly similar to Google’s InnoDB IO patches, but again an alternative implementation
  – Several parts, some of which are now in 5.4

IO – multiple threads
Read_io_threads
  – Number of threads for read requests (default 1)
  – Not really useful, as it is used only for read-ahead requests
Write_io_threads
  – Number of threads for write requests (default 1)
  – This is the one you may want to raise on a system with multiple disks
Io_capacity
  – Number of IO operations per second InnoDB assumes the server can do (default 100, which is not the right assumption for modern systems)

IO – Adaptive checkpoint
InnoDB flushing of dirty buffer_pool pages may be intensive
Flushing caused by a lack of free pages can be controlled with innodb_max_dirty_pages_pct
Flushing at the moment of a checkpoint is not controllable, is intensive, and may hurt

Adaptive checkpointing
[Figure: InnoDB default behavior, hiccups during buffer_pool flushing]

Adaptive checkpoint
What we do: flush pages more intensively
  – The closer the checkpoint, the more intensive the flushing
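
The idea on this slide can be illustrated with a small Python sketch. It is not the patch's actual algorithm: the function name, the 50% starting threshold, and the linear ramp are all assumptions, chosen only to show flushing intensity growing as the checkpoint age approaches its limit instead of arriving in one burst.

```python
def flush_rate(checkpoint_age, max_checkpoint_age, io_capacity, start_fraction=0.5):
    """Pages per second to flush under a hypothetical adaptive policy.

    checkpoint_age and max_checkpoint_age are in the same unit (e.g. bytes
    of redo log); io_capacity is the IO rate the server is assumed to sustain.
    """
    if max_checkpoint_age <= 0:
        raise ValueError("max_checkpoint_age must be positive")
    # How close we are to a forced checkpoint, clamped to [0, 1].
    fraction = min(max(checkpoint_age / max_checkpoint_age, 0.0), 1.0)
    if fraction <= start_fraction:
        return 0  # far from the checkpoint: no extra flushing
    # Assumed linear ramp from 0 at start_fraction to io_capacity at the limit.
    ramp = (fraction - start_fraction) / (1.0 - start_fraction)
    return int(round(ramp * io_capacity))
```

With these assumed parameters the rate rises smoothly, so dirty pages are written out gradually before the checkpoint forces them out.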

Adaptive_checkpoint
[Figure: Adaptive_checkpoint=1]

IO Control of Insert buffer
Ibuf_max_size – maximum size of the insert buffer (by default it can be half of buffer_pool)
Ibuf_accel_rate – IO rate for the background thread; works together with io_capacity

IO – multiple pages
Read_ahead = (both | linear | random)
  – Controls whether InnoDB’s internal read-ahead logic is used
Flush_neighbor_pages = (yes|no)
  – By default InnoDB also writes the neighbors of the pages being flushed
All these operations were designed for disks with expensive (in time) random reads; they may not be needed for SSD / FusionIO / other devices with cheap random reads
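
A rough Python sketch of what flushing neighbors means. The function name and the aligned 64-page area are assumptions for illustration, not taken from the slides or from the InnoDB source: when a dirty page is flushed, other dirty pages in the same area are written with it, so a spinning disk sees one mostly sequential write.

```python
def pages_to_flush(page_no, dirty_pages, flush_neighbors=True, area_size=64):
    """Return the sorted page numbers written when page_no is flushed.

    dirty_pages is any iterable of dirty page numbers in the same tablespace.
    area_size is an assumed neighborhood size in pages.
    """
    if not flush_neighbors:
        return [page_no]  # cheap random IO: write only the requested page
    # Aligned area [low, high) that contains page_no.
    low = (page_no // area_size) * area_size
    high = low + area_size
    candidates = set(dirty_pages) | {page_no}
    return sorted(p for p in candidates if low <= p < high)
```

On devices with cheap random IO the extra neighbor writes buy little, which is the case for turning the option off.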

Extra rollback segments
By default InnoDB uses a single segment protected by a mutex
This is a point of contention under intensive parallel insert load

Fix group commit
“Broken” in 5.0
  – The problem appears on slow disks with binary logs enabled

Benchmark
Tpcc-like workload, 100 warehouses (about 10GB of data)
Buffer_pool=5GB
System: Dell PowerEdge R900, RAID 10 on 8 disks, 32GB RAM
  – O_DIRECT for InnoDB, xfs filesystem, mounted with nobarrier
vs percona
  – Had no chance to test 5.4 yet

Benchmark

Usability patches

Microslow InnoDB part
InnoDB_IO_r_ops: 1 InnoDB_IO_r_bytes: InnoDB_IO_r_wait:
# InnoDB_rec_lock_wait: InnoDB_queue_wait:
# InnoDB_pages_distinct: 5

Limit data dictionary
Problem:
  – The data dictionary entry of a table, once opened, is kept in memory forever (or until the table is dropped)
  – Not a problem for regular usage ( tables)
  – A problem for instances with 10K+ tables
    – 10GB+ of memory allocated just for data dictionary entries
Our solution:
  – LRU-based data dictionary entries
  – Remove the oldest entries from memory when the limit is reached
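
The LRU approach can be sketched in a few lines of Python. This is an illustration only: the class and method names are hypothetical, and the sketch limits the number of entries, whereas the slides do not say how the limit is measured. It also ignores the question of tables that are currently in use.

```python
from collections import OrderedDict


class DictionaryCache:
    """Toy LRU cache of table definitions (hypothetical, not the patch's code)."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self._entries = OrderedDict()  # oldest entry first, newest last

    def open_table(self, name, load_definition):
        """Return the cached definition, loading and caching it on a miss."""
        if name in self._entries:
            self._entries.move_to_end(name)  # mark as most recently used
            return self._entries[name]
        definition = load_definition(name)
        self._entries[name] = definition
        # Evict the least recently used entries once over the limit.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return definition

    def cached_tables(self):
        """Table names from least to most recently used."""
        return list(self._entries)
```

An evicted table is simply loaded again the next time it is opened, so memory stays bounded at the cost of an occasional reload.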

IO access pattern
Shows the pattern of on-disk page accesses
mysql> select INDEX_ID,TABLE_NAME,INDEX_NAME,sum(N_READ),sum(N_WRITE) from INFORMATION_SCHEMA.INNODB_ALL_PAGE_IO group by INDEX_ID;
+----------+-----------------+-------------------+-------------+--------------+
| INDEX_ID | TABLE_NAME      | INDEX_NAME        | sum(N_READ) | sum(N_WRITE) |
+----------+-----------------+-------------------+-------------+--------------+
|       30 | tpcc/item       | PRIMARY           |         547 |            0 |
|       32 | tpcc/district   | PRIMARY           |           1 |            1 |
|       36 | tpcc/history    | GEN_CLUST_INDEX   |          11 |            5 |
|       37 | tpcc/history    | fkey_history_1    |         166 |          163 |
|       38 | tpcc/history    | fkey_history_2    |          37 |           30 |
|       39 | tpcc/new_orders | PRIMARY           |          76 |           76 |
|       43 | tpcc/order_line | PRIMARY           |         218 |          189 |
|       44 | tpcc/order_line | fkey_order_line_2 |        1040 |         1040 |
|       46 | tpcc/stock      | PRIMARY           |        3137 |         1764 |
|       47 | tpcc/stock      | fkey_stock_2      |         269 |            0 |
|       48 | tpcc/customer   | PRIMARY           |         960 |          580 |
|       49 | tpcc/customer   | idx_customer      |         171 |            0 |
|       50 | tpcc/orders     | PRIMARY           |          94 |           70 |
|       51 | tpcc/orders     | idx_orders        |         142 |          129 |
+----------+-----------------+-------------------+-------------+--------------+

Show buffer pool content
What is in buffer_pool
select space,offset, RECORDS, DATASIZE, INDEX_NAME,TABLE_SCHEMA,TABLE_NAME from information_schema.INNODB_BUFFER_POOL_CONTENT limit 10;
+-------+--------+---------+----------+------------+--------------+-------------+
| space | offset | RECORDS | DATASIZE | INDEX_NAME | TABLE_SCHEMA | TABLE_NAME  |
+-------+--------+---------+----------+------------+--------------+-------------+
|  1584 |        |       9 |          | PRIMARY    | art104       | article104  |
|  1648 |   2100 |     135 |          | PRIMARY    | art114       | author114   |
|  1492 |   4507 |     158 |          | PRIMARY    | art87        | author87    |
|  1406 |        |     141 |          | img_status | art52        | img_out52   |
|  1466 |        |      49 |          | PRIMARY    | art62        | img_out62   |
|  1470 |        |      24 |          | PRIMARY    | art84        | article84   |
|  1460 |        |      62 |          | PRIMARY    | art61        | img_out61   |
|  1458 |        |      20 |          | PRIMARY    | art61        | article61   |
|  1466 |        |      56 |          | PRIMARY    | art62        | img_out62   |
|  1621 |        |      46 |          | PRIMARY    | art110       | link_out110 |
+-------+--------+---------+----------+------------+--------------+-------------+

Show memory usage
Extended information about memory consumption
BUFFER POOL AND MEMORY
Total memory allocated ; in additional pool allocated
Internal hash tables (constant factor + variable factor)
    Adaptive hash index ( + )
    Page hash
    Dictionary cache ( + )
    File system ( + )
    Lock system ( + )
    Recovery system 0 (0 + 0)
    Threads ( + )
Buffer pool size
Buffer pool size, bytes
Free buffers 12396

Show locks held
---TRANSACTION , ACTIVE 0 sec, process no 15571, OS thread id inserting
mysql tables in use 1, locked 1
7 lock struct(s), heap size 1216, undo log entries 4
MySQL thread id 15, query id root update
INSERT INTO history(h_c_d_id, h_c_w_id, h_c_id, h_d_id, h_w_id, h_date, h_amount, h_data) VALUES(?, ?, ?, ?, ?, ?, ?, ?)
Trx read view will not see trx with id >= , sees <
TABLE LOCK table `test/warehouse` trx id lock mode IX
RECORD LOCKS space id 10 page no 3 n bits 168 index `PRIMARY` of table `test/warehouse` trx id lock_mode X locks rec but not gap
TABLE LOCK table `test/district` trx id lock mode IX
RECORD LOCKS space id 18 page no 7 n bits 216 index `PRIMARY` of table `test/district` trx id lock_mode X locks rec but not gap
TABLE LOCK table `test/customer` trx id lock mode IX
RECORD LOCKS space id 19 page no n bits 96 index `PRIMARY` of table `test/customer` trx id lock_mode X locks rec but not gap
TABLE LOCK table `test/history` trx id lock mode IX

Extra undo slots
By default there are 1024 slots for storing transaction undo information, which may limit the number of concurrent transactions to 512
We increase this to 4072
  – Only in 5.1 XtraDB
  – Use it only if you need it: it breaks compatibility with InnoDB
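
The 512 figure follows if each transaction can take up to two undo slots, one for insert undo and one for update undo. That reading is an explanation added here, not something stated on the slide. A small Python sketch of the arithmetic, with a hypothetical function name:

```python
def max_concurrent_transactions(undo_slots, slots_per_transaction=2):
    """Worst-case number of transactions that fit in the undo slots.

    slots_per_transaction=2 assumes every transaction needs both an
    insert undo slot and an update undo slot.
    """
    if slots_per_transaction < 1:
        raise ValueError("slots_per_transaction must be at least 1")
    return undo_slots // slots_per_transaction
```

Under the same assumption, 4072 slots would allow 2036 concurrent transactions in the worst case.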

Transactional Replication
Similar to Google’s patch
The information in relay-log.info is not consistent with the InnoDB state
  – When the server crashes, MySQL will repeat several transactions
    – You are lucky if replication fails with a “Duplicate key” error
    – In the worst case you will have several transactions executed twice
Our solution: store the binary log name and position and the relay log name and position in the InnoDB transactional log file
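
The principle behind the solution is that the replication position must commit atomically with the data it describes. The Python sketch below demonstrates this with SQLite from the standard library. It is only an analogy: the patch writes the position into the InnoDB transactional log file, not into a table, and the table names, function names, and simulated crash are all hypothetical.

```python
import sqlite3


class SimulatedCrash(Exception):
    """Stands in for a server crash before the transaction commits."""


def new_slave():
    """Create an in-memory database with one data row and one position row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (balance INTEGER)")
    conn.execute("CREATE TABLE slave_position (pos INTEGER)")
    conn.execute("INSERT INTO account VALUES (0)")
    conn.execute("INSERT INTO slave_position VALUES (0)")
    conn.commit()
    return conn


def apply_event(conn, position, amount, crash_before_commit=False):
    """Apply one replicated change and record its position in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ?", (amount,))
            conn.execute("UPDATE slave_position SET pos = ?", (position,))
            if crash_before_commit:
                raise SimulatedCrash()
    except SimulatedCrash:
        pass  # both updates were rolled back together


def state(conn):
    """Return (balance, last applied position)."""
    balance = conn.execute("SELECT balance FROM account").fetchone()[0]
    pos = conn.execute("SELECT pos FROM slave_position").fetchone()[0]
    return balance, pos
```

Because the data change and the position roll back together, replaying the event after the simulated crash applies it exactly once. With the position kept in a separate file such as relay-log.info, the two can diverge, which is the failure the slide describes.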

Plans
Still hunting for performance improvements
Operational tasks:
  – Fast recovery
    – There is a reported bug
  – Preload a table / index into buffer_pool
  – Copy a single .ibd table from one server to another
  – Open InnoDB tables in parallel
    – Currently serialized
  – Various improvements to statistics
    – Some patches already published (not by us)

To finalize
Most of the patches are not rocket science
  – They could have been developed or included in the official tree a long time ago
Moreover, for some patches we only uncommented a few lines of code
  – Expect most of them in MariaDB 5.1

Questions?
Thank you for coming!