Log Tuning
AOBD 2007/08 H. Galhardas Atomicity and Durability Every transaction either commits or aborts. It cannot change its mind Even in the face of failures: Effects of committed transactions should be permanent; Effects of aborted transactions should leave no trace. ACTIVE (running, waiting) ABORTED COMMITTED COMMIT ROLLBACK Ø BEGIN TRANS
AOBD 2007/08 H. Galhardas Outages Environment Fire in the machine room (Credit Lyonnais, 1996) Operations Problem during regular system administration, configuration and operation. Maintenance Problem during system repair and maintenance Hardware Fault in the physical devices: CPU, RAM, disks, network. Software 99% are Heisenbugs: transient software error related to timing or overload. Heisenbugs do not appear when the system is re- started.
AOBD 2007/08 H. Galhardas Outages A fault tolerant system must provision for all causes of outages Software is the problem Hardware failures cause under 10% of outages Heisenbugs stop the system without damaging the data. Database systems protect integrity against single hardware failure and some software failures. From J.Gray and A.Reuters Transaction Processing: Concepts and Techniques
AOBD 2007/08 H. Galhardas DB Architecture Hardware [Processor(s), Disk(s), Memory] Operating System Concurrency ControlRecovery Storage Subsystem Buffer Manager
AOBD 2007/08 H. Galhardas Assumptions Concurrency control is in effect. Updates are happening “in place”. i.e. data is overwritten on (deleted from) the disk. A simple scheme to guarantee Atomicity & Durability?
AOBD 2007/08 H. Galhardas Buffer: Stealing frames and forcing pages Steal approach: the changes made to an object O in the buffer pool by a transaction T are written to disk before T commits No steal approach:pages updated by a transaction are only written to disk after the transaction commits Force approach: when a transaction commits, all the changes it has made to objects in the buffer pool are immediately forced to disk No force approach: the in-memory copy of a page that is being updated by several committed transactions is written to disk only when the page is eventually replaced in the buffer pool.
AOBD 2007/08 H. Galhardas Handling the Buffer Pool Force No Force No StealSteal Trivial Desired No Steal/Force – simplest to use changes of an aborted transaction do not have to be undone, because these changes have not been written to disk the changes of a committed transaction do not have to be redone if there is a subsequent crash, because all these changes are guaranteed to have been written to disk at commit time. But has drawbacks: the no-steal app. Assumes that all pages modified by ongoing transactions can be accomodated in the buffer pool, but this is not realistic in the presence of large transactions. the force approach results in excessive Page I/Os Steal/No force usually used if a page is dirty and chosen for replacement, the page it contains is written to disk even if the modifying transaction is still active Pages in the buffer pool that are modified by a transaction are not forced to disk when the transaction commits
AOBD 2007/08 H. Galhardas Handling the Buffer Pool Force No Force No Steal Steal Undo Undo Redo Redo
AOBD 2007/08 H. Galhardas Logging REDO and UNDO information in a log. Sequential writes to log (put it on a separate disk). Minimal info written to log, so multiple updates fit in a single log page. Log: An ordered list of REDO/UNDO actions Log record contains: and additional control info Current database state = current state of data on disks + log
AOBD 2007/08 H. Galhardas Write-Ahead Logging (WAL) The Write-Ahead Logging Protocol: Must force the log record for an update before the corresponding data page gets to disk. Guarantees Atomicity ‚ Must write all log records for a Xact before commit. Guarantees Durability ARIES algorithms, developed by C.Mohan at IBM Almaden in the early 90’s
AOBD 2007/08 H. Galhardas LOGDATA STABLE STORAGE UNSTABLE STORAGE WRITE log records before commit WRITE modified pages after commit RECOVERY Pi Pj DATABASE BUFFER LOG BUFFER lrilrj
AOBD 2007/08 H. Galhardas Tuning the recovery system Put the log on a separate disk Tuning database writes: delay writing updates to DB disks as long as possible Setting intervals for database dumps and checkpoints Reduce the size of large update transactions: from batch to mini-batch
AOBD 2007/08 H. Galhardas Put the Log on a Separate Disk Writes to log occur sequentially Writes to disk occur (at least) 100 times faster when they occur sequentially than when they occur randomly A disk that has the log should have no other data + sequential I/O + log failure independent of database failure
AOBD 2007/08 H. Galhardas Put the Log on a Separate Disk transactions. Each contains an insert statement. DB2 UDB v7.1 5 % performance improvement if log is located on a different disk Controller cache hides negative impact mid-range server, with Adaptec RAID controller (80Mb RAM) and 2x18Gb disk drives.
AOBD 2007/08 H. Galhardas Group Commits transactions. Each contains an insert statement. DB2 UDB v7.1 Log records of many transactions are written together Increases throughput by reducing the number of writes at the cost of increased mean response time.
AOBD 2007/08 H. Galhardas 17 Tuning Database Writes Dirty data is written to disk When the number of dirty pages is greater than a given parameter (Oracle 8) When the number of pages in the free list crosses a given threshold (less than 3% of buffer size for SQL Server 7) When a checkpoint is performed At regular intervals When the log is full (Oracle 8).
AOBD 2007/08 H. Galhardas Tune Checkpoint Intervals A checkpoint (partial flush of dirty pages to disk) occurs at regular intervals or when the log is full: Impacts the performance of on-line processing + Reduces the size of log + Reduces time to recover from a crash transactions. Each contains an insert statement. Oracle 8i for Windows 2000
AOBD 2007/08 H. Galhardas Reduce the Size of Large Update Transactions Consider an update-intensive batch transaction (concurrent access is not an issue): It can be broken up in short transactions (mini-batch): + Easy to recover + Does not overfill the log buffers Example: Transaction that updates, in sorted order, all accounts that had activity on them, in a given day. Break-up to mini-batches each of which access 10,000 accounts and then updates a global counter.
AOBD 2007/08 H. Galhardas Logging in SQL Server 2000 Free Log caches Flush Log caches Current Log caches Flush Log caches db writer Flush queue Waiting processes LOG DATA fre e Pifre e Pj Log entries: - LSN - before and after images or logical log Lazy- writer Synchronous I/OAsynchronous I/O DB2 UDB v7 uses a similar scheme DATABASE BUFFER
AOBD 2007/08 H. Galhardas Logging in Oracle 8i Rollback segments (fixed size) After images (redo entries) Log buffer (default 32 Kb) Pi Pj Free list DATABASE BUFFER LOG DATA Rollback Segments Log File #1 Log File #2 LGWR (log writer) DBWR (database writer) Before images