CMPT 454, Simon Fraser University, Fall 2009, Martin Ester
Database Systems II
Coping With System Failures
Introduction

System failures are events that cause the state of a transaction to be lost. Potential causes are power loss, software errors, and media failures. Power loss destroys main memory state, a media failure destroys disk state, and a software error can destroy either.

Recovery from system failures is based on the concept of transactions.

We distinguish two types of system failures:
- Temporary / local system failures: main memory content or the content of a few disk blocks is lost. A log of database modifications is used to recover from such failures.
- Permanent / global system failures: the entire database content is lost. Archiving is employed to recover from such failures.
Transactions

A user’s program may carry out many operations on the data retrieved from the database, but the DBMS is concerned only with what data is read from and written to the database. A transaction is the DBMS’s abstract view of a user program: a sequence of reads and writes.

Requirements for transactions:
- Atomicity: “all or nothing”.
- Consistency: a transaction transforms a consistent DB state into another consistent DB state.
- Isolation (independence): a transaction executes as if it were independent of all other transactions.
- Durability: the effects of a committed transaction survive any system failure.
Users submit transactions and can think of each transaction as executing by itself. Concurrency is achieved by the DBMS, which interleaves the actions (reads/writes of DB objects) of various transactions. Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.

The DBMS enforces some integrity constraints (ICs), depending on the ICs declared in CREATE TABLE statements, in triggers, etc. Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).

Issues: the effect of interleaving (concurrent) transactions (next chapter), and system failures (this chapter).

A transaction can end in two different ways:
- commit: successful end, all actions completed,
- abort: unsuccessful end, only some actions executed.
A transaction can also be aborted by the DBMS.

The DBMS guarantees that a transaction is atomic. That is, a user can think of a transaction as always executing all of its actions, or not executing any actions at all.

The DBMS logs all actions so that it can undo the actions of aborted transactions. This ensures the atomicity of transactions. The log is also used to redo the actions of committed transactions if a system failure occurs. This ensures the durability of transactions.
Primitive Operations

Database modifications are initially performed in the (main memory) buffer. To reduce the number of IO operations, the buffer manager writes buffer blocks back to disk only when necessary. To study failure recovery, we need to consider four primitive operations that read and modify disk blocks and buffer blocks. In the following, we assume a database element X that is no larger than a single block.

- Input(X): transfer the block containing X from disk to memory (buffer).
- Output(X): transfer the block containing X from the buffer to disk.
- Read(X, t): do Input(X) if necessary, then assign the value of X in the buffer block to local variable t.
- Write(X, t): do Input(X) if necessary, then assign the value of local variable t to X in the buffer.

Read and Write are issued by transactions; Input and Output are issued by the buffer manager.
The key problem is unfinished transactions.

Example
Constraint: A = B
T1: A ← A × 2
    B ← B × 2
Initially, A = B = 8.

T1: Read(A, t); t ← t × 2; Write(A, t);
    Read(B, t); t ← t × 2; Write(B, t);
    Output(A);
    Output(B);

[Figure: memory and disk both start with A: 8, B: 8. After the two writes, memory holds A: 16, B: 16. A failure after Output(A) but before Output(B) leaves the disk with A: 16, B: 8, which violates the constraint A = B.]
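The four primitives and the failure scenario above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name `BufferManager` is hypothetical, and each database element is assumed to occupy its own block, so a block is identified by the element name.

```python
class BufferManager:
    """Toy model: one database element per block (an assumption of this sketch)."""

    def __init__(self, disk):
        self.disk = disk      # element name -> value stored on disk
        self.buffer = {}      # element name -> value held in main memory

    def input(self, x):
        """Input(X): copy the block containing X from disk into the buffer."""
        self.buffer[x] = self.disk[x]

    def output(self, x):
        """Output(X): copy the block containing X from the buffer to disk."""
        self.disk[x] = self.buffer[x]

    def read(self, x):
        """Read(X, t): Input(X) if necessary, then return X's buffered value as t."""
        if x not in self.buffer:
            self.input(x)
        return self.buffer[x]

    def write(self, x, t):
        """Write(X, t): Input(X) if necessary, then set X in the buffer to t."""
        if x not in self.buffer:
            self.input(x)
        self.buffer[x] = t


disk = {"A": 8, "B": 8}
bm = BufferManager(disk)

t = bm.read("A")
bm.write("A", t * 2)
t = bm.read("B")
bm.write("B", t * 2)
bm.output("A")
# A failure at this point loses the buffer before Output(B) runs,
# so only the disk contents survive.
```

After the last line, `disk` holds A = 16 and B = 8, which is exactly the inconsistent state that logging is meant to repair.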
[Figure: the same execution of T1, with memory (A: 8, B: 8), disk (A: 8, B: 8), and a log added, introducing undo logging.]
Logging

Questions:
- What content should log records have?
- When should log records be written to disk?
- How do we deal with system failures during logging?

Different types of logging:
- undo logging,
- redo logging,
- undo/redo logging.

The log manager (a DBMS component) records (logs) relevant events and manages the corresponding log file.

Log records are first kept in the buffer. Log blocks are written to disk as soon as feasible.
FLUSH LOG: copy to disk all log blocks that are new or have changed since the last flush.

Generic log records used in each logging type:
- <START T>: start of transaction T.
- <COMMIT T>: transaction T completed successfully.
- <ABORT T>: transaction T was terminated unsuccessfully.
Undo Logging

Undo logging supports the undo of transactions that were incomplete at the time of a system failure. In addition to the generic log records, undo logging keeps update records <T, X, v>:
- T: transaction,
- X: database element (tuple, attribute),
- v: former value of X (before the modification).
The log is first written in memory. It is not written to disk on every action.

[Figure: memory holds A: 8 → 16, B: 8 → 16, and the log records; the DB on disk holds A: 8 → 16, B: 8.]

BAD STATE #1: the old value of A is lost if a system failure occurs (the new value of A reaches disk before the log record holding its old value).
BAD STATE #2: the new value of B is lost if a system failure occurs (the COMMIT record reaches disk before the new value of B).
To avoid these bad states, the log manager and the buffer manager must obey the following rules:
- U1: If transaction T modifies database element X, then the log record <T, X, v> must be written to disk before the new value of X is written to disk (write-ahead logging).
- U2: If a transaction commits, then its COMMIT log record must be written to disk only after all database elements changed by the transaction have been written to disk.
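One way to see how U1 and U2 constrain the order of disk writes is the following Python sketch. It is not taken from the slides: the class `UndoLoggedDB`, the tuple encoding of log records, and the choice to flush the whole log before every Output are assumptions made to keep the example short.

```python
class UndoLoggedDB:
    """Toy undo-logging model; log records are tuples (hypothetical encoding)."""

    def __init__(self, disk):
        self.disk = disk          # element -> value on disk
        self.buffer = {}          # element -> value in main memory
        self.log_buffer = []      # log records not yet on disk
        self.log_disk = []        # log records already on disk
        self.changed = {}         # transaction -> elements it has modified

    def flush_log(self):
        """FLUSH LOG: move all buffered log records to disk."""
        self.log_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def start(self, t):
        self.log_buffer.append(("START", t))
        self.changed[t] = set()

    def write(self, t, x, value):
        old = self.buffer.get(x, self.disk[x])
        self.log_buffer.append(("UPDATE", t, x, old))  # record the OLD value
        self.buffer[x] = value
        self.changed[t].add(x)

    def output(self, x):
        # U1: the update record for X must be on disk before X itself.
        self.flush_log()
        self.disk[x] = self.buffer[x]

    def commit(self, t):
        # U2: all elements changed by T go to disk before the COMMIT record.
        for x in sorted(self.changed.pop(t)):
            self.output(x)
        self.log_buffer.append(("COMMIT", t))
        self.flush_log()


db = UndoLoggedDB({"A": 8, "B": 8})
db.start("T1")
db.write("T1", "A", 16)
db.write("T1", "B", 16)
db.commit("T1")
```

In this run the on-disk log ends up as START, the two update records with old value 8, and finally COMMIT, and the COMMIT record is written only after both A and B are on disk.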
Upon a system failure, the recovery manager (a DBMS component) uses the log to restore a consistent database state. It distinguishes committed from uncommitted transactions based on the COMMIT log records.

- Committed transactions cannot have created an inconsistent state, because all of their modifications have been written to disk (U2).
- Modifications by aborted transactions are also unproblematic, since they have already been undone.
- An uncommitted transaction T may have created an inconsistent DB state. For each modification of T that was written to disk, the corresponding log record must be on disk (U1). To undo this action of T, restore database element X to its old value v as given by the log record.
Consider all uncommitted transactions, starting with the most recent one and going backward. Undo all actions of these transactions.

Why go backward, not forward?

Example
T1, T2 and T3 all write A. T1 executed before T2, and T2 before T3. T1 committed; T2 and T3 are incomplete.

time/log: T1 write A, T1 commit, T2 write A, T3 write A, system failure

Going backward, undoing T3 restores the value written by T2, and undoing T2 then restores the value written by the committed T1. Going forward would leave A with the uncommitted value written by T2.
Recovery algorithm
(1) Let S = set of transactions with <START Ti> in the log, but no <COMMIT Ti> or <ABORT Ti> record in the log.
(2) For each <Ti, X, v> in the log, in reverse order (from latest to earliest) do:
    if Ti ∈ S then
    - Write(X, v)
    - Output(X).
(3) For each Ti ∈ S do
    - write <ABORT Ti> to the log.
(4) Flush the log.
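The recovery steps above can be sketched in Python as follows. The function name `undo_recover`, the tuple encoding of log records, and the concrete values of A are assumptions for illustration. The log is modelled as a list that is already on disk, so appending to it stands in for "write and flush".

```python
def undo_recover(log, disk):
    """Undo-logging recovery over an on-disk log (list of tuples)."""
    started = {rec[1] for rec in log if rec[0] == "START"}
    finished = {rec[1] for rec in log if rec[0] in ("COMMIT", "ABORT")}
    incomplete = started - finished                # step (1): the set S

    for rec in reversed(list(log)):                # step (2): latest to earliest
        if rec[0] == "UPDATE" and rec[1] in incomplete:
            _, _, x, old = rec
            disk[x] = old                          # Write(X, v) and Output(X)

    for t in sorted(incomplete):                   # steps (3) and (4)
        log.append(("ABORT", t))
    return incomplete


# Hypothetical values: T1 changes A from 1 to 2 and commits,
# T2 changes A from 2 to 3, T3 changes A from 3 to 4, then the system fails.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 1), ("COMMIT", "T1"),
    ("START", "T2"), ("UPDATE", "T2", "A", 2),
    ("START", "T3"), ("UPDATE", "T3", "A", 3),
]
disk = {"A": 4}

undone = undo_recover(log, disk)
after_first_run = dict(disk)

# Running recovery again (e.g. after a failure during recovery) changes nothing.
undone_again = undo_recover(log, disk)
```

Processing the update records backward restores A to 2, the value written by the committed T1. The second call finds no incomplete transactions, because the first call already logged ABORT records for T2 and T3.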
What if a system failure happens during recovery? We simply repeat the undo from scratch. This is no problem, since the recovery algorithm is idempotent: multiple repetitions are equivalent to a single execution.

In principle, we need to examine the entire log. Checkpointing limits the part of the log that must be considered during recovery to the records written after a certain point (the checkpoint).

To create a checkpoint:
- stop accepting new transactions,
- wait until all current transactions commit or abort and have written the corresponding log records,
- flush the log to disk,
- write a <CKPT> (checkpoint) log record and flush the log,
- resume accepting new transactions.

When recovery encounters a checkpoint record, we know that there are no incomplete transactions before it, so there is no need to go backward beyond the checkpoint.
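A backward scan that stops at the most recent checkpoint might look like the sketch below. The function name, the `("CKPT",)` record encoding, and the example values are assumptions. The sketch relies on the fact that a transaction's COMMIT or ABORT record is always later in the log than its update records, so a single backward pass sees it first.

```python
def undo_recover_with_checkpoint(log, disk):
    """Undo recovery that scans backward only as far as the latest checkpoint."""
    finished = set()      # transactions whose COMMIT/ABORT was seen in the scan
    incomplete = set()    # transactions that must be undone

    for rec in reversed(list(log)):
        kind = rec[0]
        if kind == "CKPT":
            break                       # nothing before this point is incomplete
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "START" and rec[1] not in finished:
            incomplete.add(rec[1])
        elif kind == "UPDATE" and rec[1] not in finished:
            _, t, x, old = rec
            disk[x] = old               # restore the old value on disk
            incomplete.add(t)

    for t in sorted(incomplete):
        log.append(("ABORT", t))        # the list models an already-flushed log
    return incomplete


# Hypothetical log: T1 finished before the checkpoint, T2 started after it.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 8), ("COMMIT", "T1"),
    ("CKPT",),
    ("START", "T2"), ("UPDATE", "T2", "A", 16),
]
disk = {"A": 32}

undone = undo_recover_with_checkpoint(log, disk)
```

Here only the two records after the checkpoint are examined; T2 is undone and A returns to 16, the value left by T1.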
Redo Logging

In undo logging, we need to write all modified data to disk before committing a transaction. This may require an unnecessarily large number of block IOs. With redo logging, DB modifications can be written to disk later than commit time. No undo is necessary, since DB modifications are written to disk only after commit. Update records store the new (not the old) value of X, i.e. the value after the modification.

Example
T1: Read(A, t); t = t × 2; Write(A, t);
    Read(B, t); t = t × 2; Write(B, t);
    Output(A); Output(B)

[Figure: memory and DB both start with A: 8, B: 8; the new values 16 are output to the DB, with the LOG shown alongside.]
Before any database element is modified on disk, the corresponding log records (update and COMMIT) must be written to disk.

Redo logging works as follows:
(1) For every action, generate a redo log record.
(2) Before X is modified on disk, all log records of the transaction that modified X (including COMMIT) must be on disk.
(3) Flush the log at commit.
(4) Write an END log record after the DB modifications have been flushed to disk.

During recovery, we need to redo the modifications of committed transactions that have not yet been flushed to disk.

Recovery algorithm
(1) Let S = set of transactions with <COMMIT Ti> and no <END Ti> in the log.
(2) For each <Ti, X, v> in the log, in forward order (from earliest to latest) do:
    if Ti ∈ S then
    - Write(X, v)
    - Output(X).
(3) For each Ti ∈ S, write <END Ti> to the log.
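A minimal Python sketch of this redo recovery follows. The function name `redo_recover`, the tuple encoding of records, and the element C with its values are assumptions made for the example. As before, the log list models records that are already on disk.

```python
def redo_recover(log, disk):
    """Redo-logging recovery; update records carry the NEW value."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    ended = {rec[1] for rec in log if rec[0] == "END"}
    to_redo = committed - ended                    # step (1): the set S

    for rec in list(log):                          # step (2): earliest to latest
        if rec[0] == "UPDATE" and rec[1] in to_redo:
            _, _, x, new = rec
            disk[x] = new                          # Write(X, v) and Output(X)

    for t in sorted(to_redo):                      # step (3)
        log.append(("END", t))
    return to_redo


# T1 committed, but only A reached disk before the failure.
# T2 (hypothetical) never committed, so its change to C never reached disk.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 16), ("UPDATE", "T1", "B", 16),
    ("COMMIT", "T1"),
    ("START", "T2"), ("UPDATE", "T2", "C", 7),
]
disk = {"A": 16, "B": 8, "C": 5}

redone = redo_recover(log, disk)
```

Recovery rewrites both A and B to 16 and ignores T2 entirely, since an uncommitted transaction cannot have modified the disk under redo logging.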
The END log records allow us to limit the number of transactions that need to be considered in a recovery. Alternatively, we can set a checkpoint:
(1) Do not accept new transactions.
(2) Wait until all transactions finish.
(3) Flush all log records to disk.
(4) Flush all buffers to disk (do not discard buffers).
(5) Write a “checkpoint” log record on disk.
(6) Resume transaction processing.

[Figure: example redo log on disk containing a checkpoint record, followed by a system failure. Recovery does not need to go back beyond the checkpoint.]
Undo/Redo Logging

Undo logging requires writing modifications to disk before the transaction commits, which can lead to an unnecessarily large number of IOs. Redo logging requires keeping all modified blocks in the buffer until the transaction commits and the log records have been flushed, which increases the buffer size requirement.

Undo/redo logging combines undo and redo logging. It provides more flexibility in flushing modified blocks, at the expense of maintaining more information in the log.
Update records store both the old and the new value of X. Undo/redo logging imposes only the constraint that undo logging and redo logging have in common. The only undo/redo logging rule is:
- UR1: The log record must be flushed before the corresponding modified block (write-ahead logging).

The block of X can be flushed before or after T commits, i.e. before or after the COMMIT log record. Flush the log at commit.
Because X may be flushed before or after the COMMIT record, we can have uncommitted transactions with modifications on disk and committed transactions with modifications not yet on disk.

The undo/redo recovery policy is as follows:
- Redo committed transactions.
- Undo uncommitted transactions.

More details on the recovery procedure:
- Backward pass: from the end of the log back to the latest valid checkpoint, construct the set S of committed transactions. Undo the actions of transactions not in S.
- Forward pass: from the latest checkpoint forward to the end of the log, redo the actions of transactions in S.
Alternatively, the redos can be performed before the undos.
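The two passes can be sketched in Python as shown below. This is a simplified illustration: the function name `undo_redo_recover` and the record layout `("UPDATE", T, X, old, new)` are assumptions, and the sketch scans the whole log instead of stopping at a checkpoint.

```python
def undo_redo_recover(log, disk):
    """Undo/redo recovery; update records carry both old and new values."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}   # the set S

    # Backward pass: undo the actions of transactions not in S.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] not in committed:
            _, _, x, old, _new = rec
            disk[x] = old

    # Forward pass: redo the actions of transactions in S.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, x, _old, new = rec
            disk[x] = new
    return committed


# Hypothetical state at failure: committed T1's change to A is not yet on disk,
# while uncommitted T2's change to B has already been flushed.
log = [
    ("START", "T1"), ("UPDATE", "T1", "A", 8, 16),
    ("START", "T2"), ("UPDATE", "T2", "B", 8, 20),
    ("COMMIT", "T1"),
]
disk = {"A": 8, "B": 20}

committed = undo_redo_recover(log, disk)
```

After recovery the disk holds A = 16 (T1 redone) and B = 8 (T2 undone), covering both cases that the flexible flushing rule makes possible.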
In either case, the following can happen. Transaction T1 has committed and is redone. However, T1 has read a value of X written by transaction T2, which has not committed and is undone. This situation must be avoided, since the resulting database state is inconsistent (not serializable). Concurrency control ensures that it cannot occur.
Protecting Against Media Failures

Logging protects against local loss of main memory and disk content, but not against global loss of secondary storage content (media failure). To protect against media failures, we employ archiving: maintaining a copy of the database on a separate, secure storage device. The log also needs to be archived in the same manner.

Two levels of archiving: full dump vs. incremental dump.

Typically, the database cannot be shut down for the period of time needed to make a backup copy (dump). We therefore need nonquiescent archiving, i.e. creating a dump while the DBMS continues to process transactions. The goal is to make a copy of the database as it was when the dump began, but transactions may change the database content during the dump. Logging continues during the dump, and discrepancies can be corrected from the log.
We assume undo/redo (or redo) logging. The archiving procedure is as follows:
- Write a <START DUMP> log record.
- Perform a checkpoint for the log.
- Perform a (full / incremental) dump on the secure storage device.
- Make sure that enough of the log has been copied to the secure storage device so that at least the log up to the checkpoint will survive a media failure.
- Write an <END DUMP> log record.
After a media failure, we can restore the DB from the archived DB and the archived log as follows:
- Copy the latest full dump (archive) back to the DB.
- Starting with the earliest one, apply the modifications recorded in the incremental dump(s) in increasing order of time.
- Further modify the DB using the archived log. Use the recovery method corresponding to the chosen type of logging.