Recovery from Crashes
ACID
- Atomicity: a transaction is all-or-nothing. If it executes only partly, an invalid state is likely to result.
- Consistency: a transaction may only change the DB from one consistent state to another consistent state. Otherwise it is rejected (aborted).
- Isolation: concurrent execution of transactions may lead to inconsistency, so each transaction must appear to be executed in isolation.
- Durability: the effect of a committed transaction on the DB must never be lost once the transaction has completed.
ACID: properties of a transaction: Atomicity, Consistency, Isolation, and Durability.
Database elements
Note: in our discussion, the notion of “DB element” will not be made specific. A DB element could be a tuple, a block, a whole relation, etc.
- A block is the unit of a disk read or write, so it is better to consider blocks to be the elements.
Primitive DB Operations of Transactions
- INPUT(X) ≡ copy the disk block containing the database element X to a memory buffer
- READ(X,t) ≡ assign the value of buffer X to local variable t
- WRITE(X,t) ≡ copy the value of local variable t to buffer X
- OUTPUT(X) ≡ copy the block containing X from its buffer (in main memory) to disk
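The four primitives can be sketched as a toy in-memory model. This is only an illustration: the class name `Storage` and its attributes `disk`, `buffers`, and `local` are assumptions made here, with each "block" reduced to a single value, and READ is assumed to perform INPUT first when the block is not yet buffered.

```python
class Storage:
    """Toy model: a 'disk' and a 'buffer pool', both plain dicts (hypothetical)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # element name -> value on disk
        self.buffers = {}        # element name -> value in main memory
        self.local = {}          # transaction-local variables such as t

    def INPUT(self, x):
        # Copy the disk block holding X into a memory buffer.
        self.buffers[x] = self.disk[x]

    def READ(self, x, t):
        # Assign the buffered value of X to local variable t.
        if x not in self.buffers:
            self.INPUT(x)
        self.local[t] = self.buffers[x]

    def WRITE(self, x, t):
        # Copy local variable t into the buffer for X (disk is untouched).
        self.buffers[x] = self.local[t]

    def OUTPUT(self, x):
        # Copy the buffer holding X back to disk.
        self.disk[x] = self.buffers[x]
```

Note that WRITE changes only the buffer. The disk copy changes only when OUTPUT is called, which is what makes crashes between the two dangerous.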
Example
Consider the database elements A and B such that the constraint A = B must hold.
Suppose transaction T doubles A and B:
A := A*2; B := B*2;
Execution of T involves:
- reading A and B from disk,
- performing arithmetic in main memory, and
- writing the new values for A and B back to disk.
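Example (Cont’d)
Action | t | Buff A | Buff B | A in HD | B in HD
READ(A,t) | 8 | 8 | | 8 | 8
t := t*2 | 16 | 8 | | 8 | 8
WRITE(A,t) | 16 | 16 | | 8 | 8
READ(B,t) | 8 | 16 | 8 | 8 | 8
t := t*2 | 16 | 16 | 8 | 8 | 8
WRITE(B,t) | 16 | 16 | 16 | 8 | 8
OUTPUT(A) | 16 | 16 | 16 | 16 | 8
OUTPUT(B) | 16 | 16 | 16 | 16 | 16
Problem: what happens if there is a system failure just before OUTPUT(B)?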
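A small simulation makes the problem concrete. It is a sketch under simplifying assumptions: the function name `run_doubling` and its flag are invented here, the disk is a plain dict, and a crash is modeled by returning early so that only the disk contents survive.

```python
def run_doubling(disk, crash_before_output_b=False):
    """Run T (double A and B) against a dict-based 'disk' (hypothetical model)."""
    buffers = {}
    for x in ("A", "B"):
        buffers[x] = disk[x]   # INPUT(X)
        t = buffers[x]         # READ(X,t)
        t = t * 2              # t := t*2
        buffers[x] = t         # WRITE(X,t)
    disk["A"] = buffers["A"]   # OUTPUT(A)
    if crash_before_output_b:
        return disk            # buffers are lost; only the disk state remains
    disk["B"] = buffers["B"]   # OUTPUT(B)
    return disk
```

With the crash flag set and A = B = 8 initially, the disk ends with A = 16 and B = 8, so the constraint A = B no longer holds and nothing on disk says how to repair it.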
Undo Logging
Create a log of all “important actions.” A log is a sequential file opened for appending only.
Entries that can occur in a log are:
- <START T>: transaction T started.
- <T, X, Old X>: database element X was modified by T; it used to have the value Old X.
- <COMMIT T>: transaction T has completed.
- <ABORT T>: transaction T couldn’t complete successfully.
Intention for undo logging:
- If there is a crash before a transaction finishes, the log will tell us how to restore the old value of any DB element X changed on disk.
Undo Logging (Cont’d)
Two rules of Undo Logging:
- U1: The log record <T, X, Old X> for a DB element X must be on disk before any database modification to X appears on disk.
- U2: If a transaction T commits, then the <COMMIT T> log record must be written to disk only after all the database elements changed by T are written to disk.
In order to force log records to disk, the Log Manager needs a FLUSH LOG command that tells the buffer manager to copy to disk any log blocks that haven’t previously been copied to disk or that have been changed since they were last copied.
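The ordering imposed by U1 and U2 can be sketched as follows. Everything named here (`UndoLog`, `run_doubling_with_undo_log`, the tuple encoding of records such as `("UPDATE", T, X, old)`) is an assumption for illustration, not notation from the slides.

```python
class UndoLog:
    """Toy log: records sit in memory until flush() copies them to 'disk'."""

    def __init__(self):
        self.memory = []    # log records not yet on disk
        self.on_disk = []   # log records that have reached disk

    def append(self, record):
        self.memory.append(record)

    def flush(self):
        # FLUSH LOG: copy every pending log record to disk, in order.
        self.on_disk.extend(self.memory)
        self.memory.clear()


def run_doubling_with_undo_log(disk, log):
    buffers = {}
    log.append(("START", "T"))
    for x in ("A", "B"):
        buffers[x] = disk[x]                        # INPUT + READ
        log.append(("UPDATE", "T", x, buffers[x]))  # record the OLD value
        buffers[x] = buffers[x] * 2                 # WRITE
    log.flush()                # U1: update records on disk before any OUTPUT
    for x in ("A", "B"):
        disk[x] = buffers[x]   # OUTPUT
    log.append(("COMMIT", "T"))
    log.flush()                # U2: COMMIT reaches disk only after all OUTPUTs
```

The first flush is what allows recovery to restore old values, and the second is what makes the commit durable.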
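Example:
Action | t | Buff A | Buff B | A in HD | B in HD | Log
 | | | | | | <START T>
READ(A,t) | 8 | 8 | | 8 | 8 |
t := t*2 | 16 | 8 | | 8 | 8 |
WRITE(A,t) | 16 | 16 | | 8 | 8 | <T, A, 8>
READ(B,t) | 8 | 16 | 8 | 8 | 8 |
t := t*2 | 16 | 16 | 8 | 8 | 8 |
WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T, B, 8>
FLUSH LOG | | | | | |
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 |
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |
 | | | | | | <COMMIT T>
FLUSH LOG | | | | | |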
Recovery With Undo Logging
1. Examine the log to identify all transactions T such that <START T> appears in the log, but neither <COMMIT T> nor <ABORT T> does.
- Call such transactions incomplete.
2. Examine each log entry <T, X, v>:
a) If T isn’t an incomplete transaction, do nothing.
b) If T is incomplete, restore the old value v of X.
- In what order? From most recent to earliest.
3. For each incomplete transaction T, add <ABORT T> to the log, and flush the log.
- What about the transactions that already had <ABORT T> in the log?
- We do nothing about them. If T aborted, then its effect on the DB should have been undone anyway.
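The three recovery steps can be sketched in a few lines. The tuple encoding of log records (`("START", T)`, `("UPDATE", T, X, old)`, `("COMMIT", T)`, `("ABORT", T)`) and the dict-based disk are assumptions made for this sketch, and flushing the log is not modeled.

```python
def undo_recover(log, disk):
    """Undo-log recovery over a list of record tuples (hypothetical encoding)."""
    # Step 1: incomplete = started, but neither committed nor aborted.
    started = {rec[1] for rec in log if rec[0] == "START"}
    finished = {rec[1] for rec in log if rec[0] in ("COMMIT", "ABORT")}
    incomplete = started - finished

    # Step 2: scan from most recent to earliest, restoring old values.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] in incomplete:
            _, _, x, old_value = rec
            disk[x] = old_value

    # Step 3: record that each incomplete transaction has been rolled back.
    for t in sorted(incomplete):
        log.append(("ABORT", t))
    return incomplete
```

Running the function a second time on the same log changes nothing on disk, because the rolled-back transactions now carry an ABORT record.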
Example
In the undo-logging example, if there is a crash before OUTPUT(B), then T is identified as incomplete.
- We would find <T, A, 8> in the log and write A = 8 to the DB.
- We would also find <T, B, 8> in the log and “restore” B to value 8, although B already has this value.
Problem: what would happen if there were another system error during recovery?
- Not really a problem. Recovery steps are idempotent, i.e. repeating them many times has exactly the same effect as performing them once.
Checkpointing
Problem: in principle, recovery requires looking at the entire log!
Simple solution: an occasional checkpoint operation, during which we:
1. Stop accepting new transactions.
2. Wait until all current transactions commit or abort and have written a COMMIT or ABORT log record.
3. Flush the log to disk.
4. Enter a <CKPT> record in the log and flush the log again.
5. Resume accepting transactions.
If recovery is necessary, we know that all transactions prior to a <CKPT> record have committed or aborted and need not be undone.
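With a quiescent checkpoint in the log, recovery can scan backward and stop at the first <CKPT> record it meets. The sketch below assumes the same hypothetical tuple encoding as elsewhere (`("CKPT",)` for the checkpoint record) and relies on the undo rule that a COMMIT record always follows the transaction's update records, so a single backward pass is enough.

```python
def undo_recover_since_ckpt(log, disk):
    """Backward scan that stops at the most recent <CKPT> (hypothetical encoding)."""
    finished = set()     # transactions whose COMMIT/ABORT was seen while scanning
    incomplete = set()
    for rec in reversed(log):
        kind = rec[0]
        if kind == "CKPT":
            break        # everything earlier belongs to finished transactions
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif rec[1] not in finished:   # a START or UPDATE of an unfinished T
            incomplete.add(rec[1])
            if kind == "UPDATE":
                disk[rec[2]] = rec[3]  # restore the old value
    for t in sorted(incomplete):
        log.append(("ABORT", t))
    return incomplete
```

In the test log used for this sketch (values invented for illustration), T1 finishes before the checkpoint and is never examined, while T3 starts after it and is rolled back.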
Example of an Undo log with CKPT
[Figure: undo log, annotated with the point where we decide to do a checkpoint and the point where we may write the <CKPT> record.]
What happens if a crash occurs at this point?
Nonquiescent Checkpoint (NQ CKPT)
Problem: we may not want to stop transactions from entering the system.
Solution:
1. Write a <START CKPT(T1,…,Tk)> record to the log and flush it to disk, where the Ti’s are all current “active” transactions.
2. Wait until all Ti’s commit or abort, but do not prohibit new transactions.
3. When all of T1…Tk are “done”, write the record <END CKPT> to the log and flush.
Recovery with NQ CKPT
First case: if the crash follows <END CKPT>, then we can restrict recovery to transactions that started after the <START CKPT>.
Second case: if the crash occurs between <START CKPT> and <END CKPT>, we need to undo:
1. All transactions T on the list associated with <START CKPT> with no <COMMIT T>.
2. All transactions T with <START T> after the <START CKPT> but with no <COMMIT T>.
i.e. 1+2: undo any incomplete transaction that is on the CKPT list or started after <START CKPT>.
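Both cases can be handled by one backward scan, sketched below. The record encoding (`("START_CKPT", (T1, ..., Tk))`, `("END_CKPT",)`, and so on) is an assumption of this sketch, checkpoints are assumed not to overlap, and writing ABORT records is omitted for brevity.

```python
def undo_recover_nq(log, disk):
    """Undo recovery with a nonquiescent checkpoint (hypothetical encoding)."""
    finished = set()     # transactions with COMMIT/ABORT seen so far
    incomplete = set()
    seen_end = False
    pending = None       # case 2: listed transactions whose START is still to be reached
    for rec in reversed(log):
        kind = rec[0]
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind == "UPDATE":
            if rec[1] not in finished:
                incomplete.add(rec[1])
                disk[rec[2]] = rec[3]        # restore the old value
        elif kind == "START":
            if rec[1] not in finished:
                incomplete.add(rec[1])
            if pending is not None:
                pending.discard(rec[1])
                if not pending:
                    break                    # reached the START of every listed T
        elif pending is None and kind == "END_CKPT":
            seen_end = True
        elif pending is None and kind == "START_CKPT":
            if seen_end:
                break                        # case 1: checkpoint completed
            pending = set(rec[1]) - finished # case 2: keep scanning for these
            incomplete |= pending
            if not pending:
                break
    return incomplete
```

In case 2 the scan continues past <START CKPT> only as far as the earliest <START T> of a listed transaction that never finished.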
Example of NQ Undo Log
[Figure: undo log with a nonquiescent checkpoint; a crash point is marked.]
What if we have a crash right after ?
Undo Drawback We cannot commit a transaction without first writing all its changed data to disk. Sometimes we can save disk I/O if we let changes to the DB reside only in main memory for a while; …as long as we can fix things up in the event of a crash…
Redo Logging
Idea: commit (the <COMMIT T> log record appears on disk) before writing data to disk.
Redo log entries contain the new values:
- <T, X, New X> = “transaction T modified X and the new value is New X”
Redo logging rule:
- R1: Before modifying DB element X on disk, all log entries for the modification (including <COMMIT T>) must be written to the log on disk.
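Example:
Action | t | Buff A | Buff B | A in HD | B in HD | Log
 | | | | | | <START T>
READ(A,t) | 8 | 8 | | 8 | 8 |
t := t*2 | 16 | 8 | | 8 | 8 |
WRITE(A,t) | 16 | 16 | | 8 | 8 | <T, A, 16>
READ(B,t) | 8 | 16 | 8 | 8 | 8 |
t := t*2 | 16 | 16 | 8 | 8 | 8 |
WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T, B, 16>
 | | | | | | <COMMIT T>
FLUSH LOG | | | | | |
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 |
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |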
Recovery for Redo Logging
1. Identify the committed transactions.
2. Examine the log forward, from earliest to latest.
- Consider only the committed transactions T.
- For each <T, X, v> in the log do: WRITE(X,v); OUTPUT(X);
Note 1: Uncommitted transactions will have no effect on the DB (unlike in undo logging).
- This is because none of the changes of an uncommitted T have reached the disk.
Note 2: “Redoing” starts from the head of the log.
- In effect, each data item X will have the value written by the last committed transaction in the log that changed X.
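Redo recovery is a forward pass over the log. As before, the tuple encoding (`("UPDATE", T, X, new)` now carries the new value) and the dict-based disk are assumptions for illustration, and WRITE followed by OUTPUT is collapsed into a single assignment.

```python
def redo_recover(log, disk):
    """Redo-log recovery over a list of record tuples (hypothetical encoding)."""
    # Step 1: identify committed transactions.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}

    # Step 2: scan forward and reapply the new values of committed transactions.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, x, new_value = rec
            disk[x] = new_value   # WRITE(X,v); OUTPUT(X)
    return committed
```

Updates of uncommitted transactions are simply skipped, since rule R1 guarantees none of them reached disk.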
Checkpointing for Redo Logging
The key action that we must take between the start and end of a checkpoint is to write to disk all the “dirty buffers.”
- Dirty buffers are those that have been changed by committed transactions but not yet written to disk.
Unlike in the undo case, we don’t need to wait for active transactions to finish in order to write <END CKPT>.
However, we do wait until the dirty buffers of the committed transactions have been copied to disk.
Checkpointing for Redo (Cont’d)
1. Write a <START CKPT(T1,…,Tk)> record to the log, where the Ti’s are all the active transactions.
2. Write to disk all the dirty buffers of transactions that had already committed when the START CKPT was written to the log.
3. Write an <END CKPT> record to the log.
Checkpointing for Redo (Cont’d)
[Figure: redo log with a checkpoint.]
The buffer containing value A might be dirty. If so, copy it to disk. Then write <END CKPT>.
During this period three other actions took place.
Recovery with Ckpt. Redo
Two cases:
1. If the crash follows <END CKPT>, we can restrict ourselves to transactions that began after <START CKPT(…)> and those in the START list.
- This is because we know that, in this case, every value written by transactions that committed before START CKPT(…) is now on disk.
2. If the crash occurs between <START CKPT(…)> and <END CKPT>, then go back and find the previous <END CKPT> and do the same as in the first case.
- This is because we are not sure that transactions committed before START CKPT(…) have their changes on disk.
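Both cases reduce to "find the last <END CKPT>, then work from its matching <START CKPT>". The sketch below assumes the hypothetical tuple encoding used for illustration (`("START_CKPT", (T1, ..., Tk))`, `("END_CKPT",)`), unique transaction names, and non-overlapping checkpoints. The sample log in the usage note is invented for this sketch.

```python
def redo_recover_ckpt(log, disk):
    """Redo recovery that uses the last completed checkpoint (hypothetical encoding)."""
    ends = [i for i, r in enumerate(log) if r[0] == "END_CKPT"]
    if not ends:
        # No completed checkpoint: every transaction is a candidate.
        first = 0
        candidates = {r[1] for r in log if r[0] == "START"}
    else:
        # START_CKPT matching the last END_CKPT (also covers case 2).
        ckpt = max(i for i, r in enumerate(log[:ends[-1]]) if r[0] == "START_CKPT")
        active = set(log[ckpt][1])
        candidates = active | {r[1] for r in log[ckpt:] if r[0] == "START"}
        # The scan must begin at the earliest START of a listed transaction.
        starts = [i for i, r in enumerate(log) if r[0] == "START" and r[1] in active]
        first = min(starts + [ckpt])
    committed = {r[1] for r in log if r[0] == "COMMIT"} & candidates
    for r in log[first:]:
        if r[0] == "UPDATE" and r[1] in committed:
            disk[r[2]] = r[3]   # reapply the new value
    return committed
```

For a log in which T1 commits before the checkpoint starts, T2 is active at <START CKPT(T2)>, and T3 starts afterwards, only T2 and T3 are redone. T1's changes are already on disk by the time <END CKPT> is written.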
Undo/Redo Logging Problem: Both previous methods have some drawbacks: Undo requires data to be written to disk in order to commit a transaction – this increases the # of disk I/O’s Redo requires keeping all modified blocks buffered until after transaction commits – this increases the average # of buffers needed by transactions
Undo/Redo Logging Scheme
Log entries are now:
- <T, X, o, n>, which means that transaction T updated DB element X from old value o to new value n.
Undo/Redo Rules
- UR1: The log record <T, X, o, n> for a DB element X must be on disk before any database modification to X appears on disk.
Note: there is no constraint here about whether DB elements are sent to disk before or after the commit point.
- This scheme shares one characteristic with both the UNDO and REDO schemes: it writes the update log records first.
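Example:
Action | t | Buff A | Buff B | A in HD | B in HD | Log
 | | | | | | <START T>
READ(A,t) | 8 | 8 | | 8 | 8 |
t := t*2 | 16 | 8 | | 8 | 8 |
WRITE(A,t) | 16 | 16 | | 8 | 8 | <T, A, 8, 16>
READ(B,t) | 8 | 16 | 8 | 8 | 8 |
t := t*2 | 16 | 16 | 8 | 8 | 8 |
WRITE(B,t) | 16 | 16 | 16 | 8 | 8 | <T, B, 8, 16>
FLUSH LOG | | | | | |
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 |
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |
 | | | | | | <COMMIT T>
FLUSH LOG | | | | | |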
Undo/Redo Recovery The undo/redo recovery scheme: 1.Undo all incomplete transactions in the order latest-first. 2.Redo all committed transactions in the order earliest-first.
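The two passes can be sketched as below. The encoding `("UPDATE", T, X, old, new)` and the dict-based disk are assumptions for illustration. Transactions that already have an ABORT record are assumed to have been rolled back and are left alone.

```python
def undo_redo_recover(log, disk):
    """Undo/redo recovery over a list of record tuples (hypothetical encoding)."""
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    aborted = {r[1] for r in log if r[0] == "ABORT"}
    started = {r[1] for r in log if r[0] == "START"}
    incomplete = started - committed - aborted

    # Pass 1: undo incomplete transactions, latest record first.
    for r in reversed(log):
        if r[0] == "UPDATE" and r[1] in incomplete:
            disk[r[2]] = r[3]   # restore the old value
    # Pass 2: redo committed transactions, earliest record first.
    for r in log:
        if r[0] == "UPDATE" and r[1] in committed:
            disk[r[2]] = r[4]   # reapply the new value
    return committed, incomplete
```

Because the disk may hold any mix of old and new values at crash time, both passes are needed: neither the undo pass nor the redo pass alone restores a consistent state.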
Undo/Redo Checkpointing
1. Write a <START CKPT(T1,…,Tk)> record to the log, where the Ti’s are all active transactions.
2. Write to disk “all” the dirty buffers, NOT ONLY THOSE OF THE COMMITTED TRANSACTIONS.
3. Write an <END CKPT> record to the log.
Undo/Redo Recovery 1. Find problematic transactions: – Analysis phase: Scan the log backward back to previous checkpoint (pair of START CKPT, END CKPT); include every transaction T that either started after the checkpoint began or is in the “active” list at START CKPT. 2. If a transaction has no COMMIT record in the log, undo it. – Must proceed from the end to the front 3. If the transaction has a COMMIT record, redo it. – Must proceed from the earliest (front) to end
Example
[Figure: undo/redo log with a checkpoint; a crash point is marked.]
Suppose the crash occurs just before <COMMIT T3>.
- We identify T2 as committed but T3 as incomplete.
- It is not necessary to set B to 10, since we know that this change reached disk before <END CKPT>.
- However, we need to REDO E: set E = 21.
- Also, we need to UNDO T3: set D = 19.