Transaction Log Performance Tuning
Chirag Roy – Senior SQL DBA
MCITP: Database Developer 2005/2008
MCITP: Database Administrator 2005/2008
http://sqlking.wordpress.com
http://www.twitter.com/chiragroy
Agenda
- Transaction Log Architecture
- Design Options for Performance
- Hardware Options for Performance
- Transaction Log Troubleshooting
- Summary
Transaction Log Architecture
Transaction Log Architecture
Physical/Logical Architecture
- The transaction log is used to guarantee the data integrity of the database and for data recovery.
- [Diagram: a logical log file divided into virtual log files VLF1 to VLF5]
* http://msdn.microsoft.com/en-us/library/ms179355.aspx
Transaction Log Architecture
[Diagram: Buffer Pool (Data Cache with dirty pages, Plan Cache), Lazy Writer, Checkpoint, Transaction Log and Data File, shown for Simple Recovery Mode and Full Recovery Mode]
Notes:
The buffer manager tells the transaction manager to log an update. The transaction manager passes it down to the log manager, and the log manager writes the change to the log. Once the change is committed to disk, the transaction manager tells the buffer manager that the transaction is persisted, and the buffer manager then changes the pages in memory (a changed page is a dirty page). Because the change is always made to the log first, this is called a write-ahead log.
The checkpoint process, running roughly every minute, scans through the buffer pool and flushes dirty pages to disk. It then writes a time stamp to the transaction log; anything before this point can then be dropped. Once on disk, the page is added to the free list.
An additional process in buffer management is the lazy writer, which checks through the buffer pool and ages pages. Once a page's age reaches 0, it is added to the free list and can be reused.
Note that checkpoint timing is not fixed. Checkpoints occur periodically based on the number of log records generated by data modifications, when requested by a user (by issuing the CHECKPOINT command), or at system shutdown. The time between checkpoints is calculated to attempt to ensure that the database is recovered within the recovery interval specified by sp_configure.
When a database is recovered at startup, transactions are rolled forward (i.e., changes that made it to the log but not to the data file are reapplied) and rolled back (uncommitted transactions are undone).
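The checkpoint settings described in the notes can be inspected as sketched below. This is a minimal example, assuming sysadmin-level permissions; 'recovery interval (min)' is an advanced option, so advanced options have to be made visible first.

```sql
-- Make advanced options visible so the recovery interval can be displayed.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Show the configured and running recovery interval in minutes
-- (0 means the server chooses the interval itself).
EXEC sp_configure 'recovery interval (min)';

-- Request a manual checkpoint in the current database.
CHECKPOINT;
```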
Transaction Log Architecture
Recovery Type Considerations
- Simple Recovery: log file cleared on checkpoint
- Full/Bulk Logged Recovery: log file cleared on log backup
- Bulk Logged Recovery: potentially larger log backups when running
  - ALTER INDEX REORGANIZE
  - DBCC INDEXDEFRAG
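To see which of these rules applies to each database, the recovery model can be read from sys.databases. The sketch below also shows a log backup; the database name SalesDb and the backup path are hypothetical placeholders.

```sql
-- List the recovery model of every database on the instance.
SELECT name, recovery_model_desc
FROM sys.databases
ORDER BY name;

-- In FULL or BULK_LOGGED recovery the log is only cleared by a log backup.
-- SalesDb and the path below are hypothetical.
BACKUP LOG SalesDb
TO DISK = N'X:\Backups\SalesDb_log.trn';
```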
Transaction Log Architecture
Tools to Check T-LOG
- DBCC LOGINFO
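DBCC LOGINFO returns one row per VLF, so counting the rows gives the VLF count for the current database. The sketch below assumes the SQL Server 2005/2008 output layout; the command is undocumented, and later versions add a leading RecoveryUnitId column, so the temp table definition would need adjusting there.

```sql
-- Column list assumes the DBCC LOGINFO output of SQL Server 2005/2008.
CREATE TABLE #vlf (
    FileId      int,
    FileSize    bigint,
    StartOffset bigint,
    FSeqNo      int,
    Status      int,
    Parity      int,
    CreateLSN   numeric(25, 0)
);

-- Capture one row per VLF of the current database.
INSERT INTO #vlf
EXEC ('DBCC LOGINFO');

-- Status = 2 marks VLFs that are still active and cannot be reused yet.
SELECT COUNT(*) AS vlf_count,
       SUM(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS active_vlfs
FROM #vlf;

DROP TABLE #vlf;
```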
Transaction Log Architecture
Tools to Check T-LOG
- DBCC SQLPERF(LOGSPACE)
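DBCC SQLPERF(LOGSPACE) reports log size and percentage used for every database. A minimal sketch for sorting that output, assuming the four-column result set (database name, log size in MB, log space used in %, status); the temp table and column names are my own:

```sql
-- Temp table mirrors the four columns returned by DBCC SQLPERF(LOGSPACE).
CREATE TABLE #logspace (
    database_name      sysname,
    log_size_mb        float,
    log_space_used_pct float,
    status             int
);

INSERT INTO #logspace
EXEC ('DBCC SQLPERF(LOGSPACE)');

-- Fullest logs first.
SELECT database_name, log_size_mb, log_space_used_pct
FROM #logspace
ORDER BY log_space_used_pct DESC;

DROP TABLE #logspace;
```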
Transaction Log Architecture
Tools to Check T-LOG
- Disk Usage Report
- TRACE FLAG 3004
Design Options for Performance
Design options for performance
VLF Design
- Too few large VLFs due to poor design
- Too many small VLFs in case of autogrow
- Smallest log file size can be 512KB on creation
- VLF sizing should be carefully planned according to environment needs
Notes: Talk about Checkpoint process, Lazy writer
Design options for performance
VLF Design

Chunk Size              Number of VLFs
< 1MB                   2
>= 1MB and < 64MB       4
>= 64MB and < 1GB       8
1GB and larger          16
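The chunk-size rule can be written as a CASE expression to estimate how many VLFs one growth increment adds. This is only a sketch that restates the slide's thresholds; the variable name and the 512MB sample value are hypothetical.

```sql
-- Hypothetical growth increment in MB.
DECLARE @growth_mb decimal(10, 2);
SET @growth_mb = 512;

-- Thresholds restate the chunk-size table from the slide.
SELECT @growth_mb AS growth_mb,
       CASE
           WHEN @growth_mb < 1    THEN 2
           WHEN @growth_mb < 64   THEN 4
           WHEN @growth_mb < 1024 THEN 8
           ELSE 16
       END AS vlfs_added;
```

A 512MB increment falls in the 64MB to 1GB band, so it adds 8 VLFs of 64MB each.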
Design options for performance
VLF Design
- If the log file is designed for a VLDB (> 8GB), expand the log file in increments of 8GB on DB creation to create 512MB VLFs (each 8GB increment is split into 16 VLFs, and 8GB / 16 = 512MB)
- If the log file is designed to be < 8GB, size the log file as per requirements
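Growing a log in 8GB steps might look like the sketch below. The database name SalesDb and the logical file name SalesDb_log are hypothetical, and the log is assumed to start out smaller than 8GB.

```sql
-- Each statement grows the log by 8GB, adding 16 VLFs of 512MB.
-- SalesDb and SalesDb_log are hypothetical names.
ALTER DATABASE SalesDb MODIFY FILE (NAME = SalesDb_log, SIZE = 8GB);
ALTER DATABASE SalesDb MODIFY FILE (NAME = SalesDb_log, SIZE = 16GB);
ALTER DATABASE SalesDb MODIFY FILE (NAME = SalesDb_log, SIZE = 24GB);
```

Some SQL Server 2005/2008 builds were reported to mishandle log growth in exact multiples of 4GB; where that applies, 8000MB increments are a commonly suggested alternative. Re-check the VLF layout with DBCC LOGINFO after growing.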
Design options for performance
Considerations
- Autoshrink is Evil: switch it OFF
- Autogrowth by % is Evil'er: it causes VLF fragmentation
- VLF fragmentation
  - Leads to I/O overhead
  - Affects Redo/Undo phase performance
  - Increases database recovery/restore time
  - Affects cluster failover timing
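A sketch for finding the settings this slide warns about and correcting them, using sys.master_files and sys.databases. SalesDb, SalesDb_log and the 512MB growth value are hypothetical; pick a fixed increment that suits the environment.

```sql
-- Log files that grow by percentage, or databases with autoshrink on.
SELECT d.name  AS database_name,
       mf.name AS log_file,
       mf.is_percent_growth,
       d.is_auto_shrink_on
FROM sys.master_files AS mf
JOIN sys.databases AS d
    ON d.database_id = mf.database_id
WHERE mf.type_desc = 'LOG'
  AND (mf.is_percent_growth = 1 OR d.is_auto_shrink_on = 1);

-- Switch autoshrink off and use a fixed growth increment instead of a percentage.
-- SalesDb and SalesDb_log are hypothetical names.
ALTER DATABASE SalesDb SET AUTO_SHRINK OFF;
ALTER DATABASE SalesDb MODIFY FILE (NAME = SalesDb_log, FILEGROWTH = 512MB);
```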
Design options for performance
Considerations
- Place data and log files on separate LUNs to distribute I/O
  - Data files experience random reads/writes
  - Log files experience sequential reads/writes
- SAN admins need to provision LUNs optimized for the type of load
Design options for performance
Considerations
- Change the model database recovery mode to Simple
- A Full Recovery database stays in pseudo-simple mode until the first full backup
- Runaway log file if no log backups are taken after that
- Instant File Initialization does not work with log files
- When restoring a database, create the database first with properly sized data and log files
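The first two points can be sketched in T-SQL. The second query is one possible way to spot databases that are set to Full but have no backup chain yet, assuming that a NULL last_log_backup_lsn in sys.database_recovery_status indicates this state.

```sql
-- New databases inherit their recovery model from model.
ALTER DATABASE model SET RECOVERY SIMPLE;

-- Databases in FULL recovery with no log backup chain yet (pseudo-simple).
SELECT d.name
FROM sys.databases AS d
JOIN sys.database_recovery_status AS rs
    ON rs.database_id = d.database_id
WHERE d.recovery_model_desc = 'FULL'
  AND rs.last_log_backup_lsn IS NULL;
```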
Design options for performance
Considerations
- Log clearing can be affected by:
  - Recovery model
  - Replication
  - Database mirroring
- Switch on Backup Compression in SQL 2008/R2
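Backup compression can be enabled as an instance default or per backup, as sketched below. Whether it is available depends on the SQL Server edition, and SalesDb and the backup path are hypothetical.

```sql
-- Make compression the default for all backups on the instance.
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- Or request it explicitly for a single log backup.
-- SalesDb and the path below are hypothetical.
BACKUP LOG SalesDb
TO DISK = N'X:\Backups\SalesDb_log.trn'
WITH COMPRESSION;
```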
Design options for performance
TempDB: Special Case
- In a large OLTP environment, size the tempdb data and log files appropriately
- Test the autogrow size before going into production
- Checkpoint occurs when the log file is 70% full
- Slow disk I/O can cause a delayed checkpoint
- Mitigate by using alerts to notify
- A manual checkpoint takes precedence over a system checkpoint
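Pre-sizing tempdb might look like the following sketch. tempdev and templog are the default logical file names of a standard installation, and the sizes are hypothetical values that should come from testing.

```sql
-- Pre-size the tempdb data and log files (sizes are hypothetical).
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 4GB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 1GB);

-- Use fixed growth increments rather than percentages.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILEGROWTH = 512MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILEGROWTH = 256MB);
```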
Hardware Options for Performance
Hardware Options
RAID 1
- Good read, slower write performance
- Good redundancy
- Data availability
- Expensive
* http://support.dell.com/support/edocs/software/svradmin/5.1/en/omss_ug/html/strcnpts.html
Hardware Options
RAID 10
- Good read/write performance
- Good redundancy
- Data availability
- More expensive
* http://support.dell.com/support/edocs/software/svradmin/5.1/en/omss_ug/html/strcnpts.html
Hardware Options
SSD
- Extremely good read and good write performance
- Good redundancy
- Data availability
- Very expensive
* http://www.fusionio.com/load/media-imagesMediakit/gsyhv/image6_orig.jpg?attach=1
Hardware Options
Disk Sector Alignment
- If still on Windows 2003, make sure to align disk partitions
- Read Jimmy May's blogs or whitepaper
  http://blogs.msdn.com/jimmymay/archive/tags/Disk+Partition+Alignment/default.aspx
  http://msdn.microsoft.com/en-us/library/dd758814.aspx
- In Windows 2008, partitions are aligned to 1MB by default for disks larger than 4GB
Transaction Log Troubleshooting
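Hardware utilisation and performance
Storage
- Check the file latency within SQL Server using sys.dm_io_virtual_file_stats (database_id, file_id)
- Use this script to get the average latency for each file (pass a specific database_id and file_id, e.g. (2, 1), to check a single file):

    -- nullif avoids a divide-by-zero error for files with no reads or writes yet
    select db_name(database_id) AS 'Database',
           file_id,
           io_stall_read_ms / nullif(num_of_reads, 0) AS 'Avg Read Latency (ms)',
           io_stall_write_ms / nullif(num_of_writes, 0) AS 'Avg Write Latency (ms)'
    from sys.dm_io_virtual_file_stats (NULL, NULL)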
Dynamic management views
sys.dm_os_waiting_tasks
- Wait information
- Task level
- Very accurate
- Transient data
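A minimal sketch of querying this view. Because the data is transient, it only shows what is waiting at the moment the query runs; the filter on session_id > 50 is a common convention for skipping system sessions, not a guarantee.

```sql
-- Tasks that are waiting right now, longest waits first.
SELECT session_id,
       wait_type,
       wait_duration_ms,
       blocking_session_id,
       resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id > 50
ORDER BY wait_duration_ms DESC;
```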
Dynamic management views
sys.dm_os_wait_stats
- Wait information
- Cumulative by wait type
- Persistent data (accumulated since the last restart or since the statistics were cleared)
Dynamic management views
log_reuse_wait_desc in sys.databases
- NOTHING
- CHECKPOINT
- LOG_BACKUP
- ACTIVE_BACKUP_OR_RESTORE
- ACTIVE_TRANSACTION
- DATABASE_MIRRORING
- REPLICATION
- DATABASE_SNAPSHOT_CREATION
- LOG_SCAN
- OTHER_TRANSIENT
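To see which of these reasons is currently holding up log reuse, a query along these lines can be used:

```sql
-- Databases whose log cannot currently be reused, with the reason.
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc <> 'NOTHING'
ORDER BY name;
```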
Common wait types
- ASYNC_IO_COMPLETION: can be for "zeroing" out a transaction log file during log creation or growth
- WRITELOG: writing transaction log to disk
- LOGBUFFER: indicates a worker thread is waiting for a log buffer to write log blocks for a transaction
* http://blogs.msdn.com/psssql/archive/2009/11/03/the-sql-server-wait-type-repository.aspx
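The cumulative figures for these three wait types can be pulled from sys.dm_os_wait_stats. A sketch follows; the avg_wait_ms column is a derived value added here for convenience.

```sql
-- Cumulative log-related waits since the last restart or stats clear.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'ASYNC_IO_COMPLETION', N'WRITELOG', N'LOGBUFFER')
ORDER BY wait_time_ms DESC;
```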
Demo
Summary
SUMMARY
- VLF design
- Switch off AutoShrink
- Use Autogrow as a last resort
- Enable compression in SQL 2008/R2
- Log files on faster, dedicated disks
- Significant resource waits
  - ASYNC_IO_COMPLETION
  - WRITELOG
  - LOGBUFFER
THANK YOU & Questions