Recovery technique
Recovery concept
Recovery from transaction failure means that the database is restored to the most recent consistent state just before the time of failure.
How to recover: the system keeps track of information in the system log and uses this information for recovery.
Strategy for recovery
Disk crash: the recovery method restores a past copy of the database that was backed up to archival storage (tape, etc.) and reconstructs a more current state by reapplying (redoing) the operations of committed transactions from the backed-up log, up to the time of failure.
Not physically damaged, but inconsistent because of some failure: the strategy is to reverse any changes that caused the inconsistency by undoing some operations. It may also be necessary to redo some operations.
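Techniques for recovery
Deferred update: the database is updated only after the transaction completes (reaches its commit point).
Immediate update: the database may be updated as soon as the transaction makes a change.
Shadow paging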
Deferred update
Do not physically update the database on disk until after a transaction reaches its commit point.
Before reaching the commit point, the transaction's updates are recorded in the local transaction workspace (buffers).
During commit, the updates are first recorded persistently in the log and then written to the database.
Transaction failure
If a transaction fails before reaching its commit point, it will not have changed the database, so no UNDO is needed.
It may be necessary to REDO the effect of the operations of a committed transaction from the log, because their effect may not yet have been recorded in the database.
Deferred update is therefore known as the “NO-UNDO/REDO algorithm”.
Recovery based on deferred update
1. This technique postpones any actual update to the database until the transaction completes and reaches its commit point.
2. During transaction execution, updates are recorded only in the log file and in the cache buffers. After the transaction reaches its commit point and the log is force-written to disk, the updates are recorded in the database.
Figure: deferred update. Updated data is written to the log file; at COMMIT the log is force-written to disk, and only then is the database updated.
If a transaction fails before commit, there is no need to undo anything.
This simplifies recovery, but it cannot be used in practice unless transactions are short and each transaction changes few items.
Otherwise the system may run out of buffer space, because the transaction's changes must be held in buffers until commit.
Deferred update protocol
A transaction cannot change the database on disk until it reaches its commit point.
A transaction does not reach its commit point until all its update operations are recorded in the log and the log is force-written to disk.
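The two rules above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class names `DeferredUpdateDB` and `Transaction`, the dictionary standing in for the disk, and the list standing in for the force-written log are all hypothetical and do not come from the slides.

```python
class DeferredUpdateDB:
    """Toy store illustrating deferred update (NO-UNDO/REDO)."""

    def __init__(self, initial):
        self.disk = dict(initial)  # stands in for the database on disk
        self.log = []              # stands in for the log after a force-write

    def begin(self, tid):
        return Transaction(self, tid)


class Transaction:
    def __init__(self, db, tid):
        self.db = db
        self.tid = tid
        self.workspace = {}  # local transaction workspace: item -> new value
        self.pending_log = [("start_transaction", tid)]

    def read_item(self, item):
        # A transaction sees its own uncommitted writes first.
        return self.workspace.get(item, self.db.disk.get(item))

    def write_item(self, item, value):
        # Rule 1: the database on disk is not touched before commit.
        self.workspace[item] = value
        self.pending_log.append(("write_item", self.tid, item, value))

    def commit(self):
        # Rule 2: all log records are forced out before the commit point.
        self.pending_log.append(("commit", self.tid))
        self.db.log.extend(self.pending_log)
        # Only now are the updates written to the database.
        self.db.disk.update(self.workspace)

    def abort(self):
        # Nothing reached the disk, so there is nothing to undo.
        self.workspace.clear()
        self.pending_log.clear()


db = DeferredUpdateDB({"D": 5})
t1 = db.begin("T1")
t1.write_item("D", 20)
print(db.disk["D"])  # 5, the disk is unchanged before commit
t1.commit()
print(db.disk["D"])  # 20
```

Because an aborted transaction only discards its workspace, the sketch never needs an undo step, which is the point of the NO-UNDO half of the name.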
Recovery using deferred update in a single-user environment (RDU_S)
RDU_S uses 2 lists of transactions: the committed transactions since the last checkpoint, and the active transactions (at most one, because the system is single-user).
Apply the REDO operation to all the write_item operations of the committed transactions from the log, in the order in which they were written to the log.
Restart the active transactions.
RDU_S procedure
REDO(write_op): redoing a write_item operation write_op consists of examining its log entry [write_item, T, X, new_value] and setting the value of item X in the database to new_value, which is the after image (AFIM).
Example
The read and write operations of 2 transactions:
T1: read_item(A), read_item(D), write_item(D)
T2: read_item(B), write_item(B), read_item(D), write_item(D)
LOG:
[start_transaction, T1]
[write_item, T1, D, 20]
[commit, T1]
[start_transaction, T2]
[write_item, T2, B, 10]
[write_item, T2, D, 25]
<- system crash
The [write_item, T1, D, 20] operation of T1 is redone. The log entries of T2 are ignored by the recovery process.
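The example above can be replayed with a minimal sketch of RDU_S. The tuple encoding of log records, the function names, and the initial database values are assumptions made for illustration; only the log contents come from the slide.

```python
def redo(write_op, database):
    """REDO: set item X to the after image recorded in the log entry."""
    _op, _tid, item, new_value = write_op
    database[item] = new_value


def recover_deferred_single_user(log, database):
    """Redo committed writes in log order; report transactions to restart."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    started = [rec[1] for rec in log if rec[0] == "start_transaction"]
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            redo(rec, database)
    to_restart = [tid for tid in started if tid not in committed]
    return database, to_restart


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "D", 20),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 10),
    ("write_item", "T2", "D", 25),
    # system crash here
]
database = {"A": 1, "B": 2, "D": 3}  # assumed values on disk at the crash
database, to_restart = recover_deferred_single_user(log, database)
print(database)    # {'A': 1, 'B': 2, 'D': 20}
print(to_restart)  # ['T2']
```

Redo is safe to repeat: running the function a second time on the same log leaves the database unchanged.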
Deferred update with concurrent execution in a multi-user environment (RDU_M)
The recovery process depends on the protocol used for concurrency control.
Under two-phase locking, the locks on items remain in effect until the transaction reaches its commit point; after that, the locks can be released.
Assume that [checkpoint] entries are included in the log.
Algorithm: see the procedure that follows.
Procedure RDU_M (with checkpoints)
Use 2 lists of transactions maintained by the system:
commit list: the committed transactions T since the last checkpoint
active list: the active transactions T'
Redo all the write operations of the committed transactions from the log, in the order in which they were written into the log.
The transactions that are active and did not commit are effectively canceled and must be resubmitted.
Figure: timeline of transactions T1 to T5, with a checkpoint at time t1 and a system crash at time t2. T2 and T3 are redone; T4 and T5 are ignored.
Example
The read and write operations of 4 transactions:
T1: read(A), read(D), write(D)
T2: read(B), write(B), read(D), write(D)
T3: read(A), write(A), read(C), write(C)
T4: read(B), write(B), read(A), write(A)
LOG:
[start_transaction, T1]
[write_item, T1, D, 20]
[commit, T1]
[checkpoint]
[start_transaction, T4]
[write_item, T4, B, 15]
[write_item, T4, A, 20]
[commit, T4]
[start_transaction, T2]
[write_item, T2, B, 12]
[start_transaction, T3]
[write_item, T3, A, 30]
[write_item, T2, D, 25]
<- system crash
RECOVERY: T2 and T3 are ignored because they did not reach their commit points. T4 is redone because its commit point is after the last system checkpoint.
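A minimal sketch of RDU_M with checkpoints, replayed on the log above. The tuple encoding, the function name, and the on-disk values at the time of the crash are assumptions for illustration; in particular, D = 20 is assumed to be on disk already because T1 committed before the checkpoint.

```python
def recover_deferred_multi_user(log, database):
    """Redo writes of transactions committed after the last checkpoint."""
    # Position of the last [checkpoint] record, or -1 if there is none.
    last_cp = max(
        (i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1
    )
    # Commit list: transactions whose commit record follows the checkpoint.
    commit_list = {rec[1] for rec in log[last_cp + 1:] if rec[0] == "commit"}
    all_committed = {rec[1] for rec in log if rec[0] == "commit"}
    started = [rec[1] for rec in log if rec[0] == "start_transaction"]
    # Active list: started but never committed; these must be resubmitted.
    active_list = [tid for tid in started if tid not in all_committed]
    # Redo in the order the writes appear in the log.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in commit_list:
            _op, _tid, item, new_value = rec
            database[item] = new_value
    return database, active_list


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "D", 20),
    ("commit", "T1"),
    ("checkpoint",),
    ("start_transaction", "T4"),
    ("write_item", "T4", "B", 15),
    ("write_item", "T4", "A", 20),
    ("commit", "T4"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 12),
    ("start_transaction", "T3"),
    ("write_item", "T3", "A", 30),
    ("write_item", "T2", "D", 25),
    # system crash here
]
database = {"A": 1, "B": 2, "C": 3, "D": 20}  # assumed disk state at the crash
database, resubmit = recover_deferred_multi_user(log, database)
print(database)  # {'A': 20, 'B': 15, 'C': 3, 'D': 20}
print(resubmit)  # ['T2', 'T3']
```

T1 is skipped entirely: its commit precedes the checkpoint, so its writes are assumed to be on disk already.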
Recovery based on immediate update
The database may be updated by some operations of a transaction before the transaction reaches its commit point.
These operations are typically recorded in the log on disk by force-writing before they are applied to the database.
Transaction failure
If a transaction fails after recording some changes in the database but before reaching its commit point, the effect of its operations on the database must be undone (the transaction must be rolled back).
In the general case, recovery needs both undo and redo, so immediate update is known as the “UNDO/REDO algorithm”.
Undo/redo recovery based on immediate update in a single-user environment
If a failure occurs, the executing (active) transaction at the time of failure may have recorded some changes in the database.
The effect of these changes must be undone.
The recovery algorithm is RIU_S.
RIU_S
Use 2 lists of transactions maintained by the system:
commit list: the transactions committed since the last checkpoint
active list: the active transactions
Undo all the write_item operations of the active transactions from the log, using the UNDO procedure.
Redo the write_item operations of the committed transactions from the log, in the order in which they were written to the log, using the REDO procedure.
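UNDO procedure for RIU_S
UNDO(write_op): undoing a write_item operation write_op consists of examining its log entry [write_item, T, X, old_value, new_value] and setting the value of item X in the database to old_value, which is the before image (BFIM).
Undoing a number of write_item operations from one or more transactions must proceed in the reverse of the order in which the operations were written to the log.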
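The UNDO and REDO procedures for immediate update can be sketched as follows. The tuple encoding, the function names, and the sample log and disk values are hypothetical; the sample is constructed so that T2 writes B twice, which shows why undo has to run in reverse log order.

```python
def undo(write_op, database):
    """UNDO: restore the before image (BFIM) recorded in the log entry."""
    _op, _tid, item, old_value, _new_value = write_op
    database[item] = old_value


def redo(write_op, database):
    """REDO: install the after image (AFIM) recorded in the log entry."""
    _op, _tid, item, _old_value, new_value = write_op
    database[item] = new_value


def recover_immediate_single_user(log, database):
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Step 1: undo the writes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            undo(rec, database)
    # Step 2: redo the writes of committed transactions in log order.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            redo(rec, database)
    return database


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "D", 5, 20),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 2, 10),
    ("write_item", "T2", "B", 10, 12),
    ("write_item", "T2", "D", 20, 25),
    # system crash here
]
# Assumed disk state at the crash: T2's uncommitted writes already reached disk.
database = {"B": 12, "D": 25}
print(recover_immediate_single_user(log, database))  # {'B': 2, 'D': 20}
```

Undoing T2's two writes to B in forward order would leave B at 10 instead of its original value 2, which is why the first loop walks the log backwards.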
Undo/redo recovery based on immediate update with concurrent execution
The recovery process depends on the protocol used for concurrency control.
Assume that the log includes checkpoints and that schedules are strict, as produced by the strict two-phase locking protocol.
A strict schedule does not allow a transaction to read or write an item unless the transaction that last wrote the item has committed (or aborted and rolled back).
Procedure RIU_M
Use 2 lists of transactions: the committed transactions since the last checkpoint, and the active transactions.
Undo all the write_item operations of the active (uncommitted) transactions using the UNDO procedure, in the reverse of the order in which they were written into the log.
Redo all the write_item operations of the committed transactions from the log, in the order in which they were written into the log.
SHADOW PAGING
This technique does not require a log in a single-user environment.
In a multi-user environment, a log may be needed for the concurrency control method.
Shadow paging considers the database to be partitioned into fixed-length blocks referred to as pages; assume n pages, numbered 1 to n.
The page table has n entries, one for each database page. Each entry contains a pointer to a page on disk (entry 1 points to the 1st database page, and so on).
The idea is to maintain 2 page tables during the life of a transaction: the current page table and the shadow page table.
When the transaction starts, both page tables are identical.
The shadow page table is never changed over the duration of the transaction.
The current page table may be changed when a transaction performs a write operation.
All input and output operations use the current page table to locate database pages on disk.
Suppose transaction Tj performs write(X), and X resides on the i-th page:
1. If the i-th page is not already in main memory, the system issues input(X).
2. If this is the first write to the i-th page by this transaction, the system modifies the current page table as follows:
a. It finds an unused page on disk.
b. It deletes the page found in step 2a from the list of free page frames, and copies the contents of the i-th page to the page found in step 2a.
c. It modifies the current page table so that the i-th entry points to the page found in step 2a.
3. It assigns the value of xj to X in the buffer page.
Figure: shadow paging after updating pages 2 and 5. The shadow directory (shadow page table, not updated) still points to Page 2 (old) and Page 5 (old); the current directory (current page table) points to Page 2 (new) and Page 5 (new). Pages 1, 3, 4, and 6 on disk are unchanged and shared by both directories.
The directory is kept in main memory if it is not too large.
When a transaction begins executing, the current directory (whose entries point to the most recent, or current, database pages on disk) is copied into the shadow directory.
The shadow directory is then saved on disk, while the current directory is used by the transaction.
During transaction execution, the shadow directory is never modified.
When data is written, a new copy of the page is created and the old copy is kept.
The current directory is modified to point to the new page.
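The copy-on-write behavior described above can be sketched with a small in-memory model. Everything here is an assumption made for illustration: the class name `ShadowPagedDB`, pages simplified to single values, and 0-based page numbers. The `commit` and `abort` methods follow the standard textbook treatment (commit discards the shadow directory, recovery reinstates it) and are not spelled out on the slide.

```python
class ShadowPagedDB:
    """Toy single-user shadow paging; each page holds one value."""

    def __init__(self, pages):
        self.disk = list(pages)                # physical pages on disk
        self.free = []                         # unused physical page numbers
        self.shadow = list(range(len(pages)))  # shadow directory (saved on disk)
        self.current = list(self.shadow)       # current directory
        self.written = set()                   # logical pages written so far

    def begin(self):
        # Both directories are identical when the transaction starts.
        self.current = list(self.shadow)
        self.written = set()

    def read(self, i):
        return self.disk[self.current[i]]

    def write(self, i, value):
        if i not in self.written:
            # First write to page i: take an unused page and copy the old
            # contents into it, leaving the old copy untouched.
            if self.free:
                new_page = self.free.pop()
            else:
                self.disk.append(None)
                new_page = len(self.disk) - 1
            self.disk[new_page] = self.disk[self.current[i]]
            self.current[i] = new_page
            self.written.add(i)
        self.disk[self.current[i]] = value

    def commit(self):
        # The old copies become free and the current directory takes over.
        for i in self.written:
            self.free.append(self.shadow[i])
        self.shadow = list(self.current)
        self.written = set()

    def abort(self):
        # Recovery: free the new copies and reinstate the shadow directory.
        for i in self.written:
            self.free.append(self.current[i])
        self.current = list(self.shadow)
        self.written = set()


db = ShadowPagedDB(["a", "b", "c"])
db.begin()
db.write(1, "B")
print(db.read(1), db.disk[db.shadow[1]])  # B b
db.abort()
print(db.read(1))  # b
```

No log is consulted in `abort`: the old pages were never overwritten, so reinstating the shadow directory is enough.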
ARIES: Algorithms for Recovery and Isolation Exploiting Semantics
Most systems today use a scheme called “ARIES” or something very close to that scheme.
C. Mohan, Don Haderle, Bruce Lindsay, Hamid Pirahesh, and Peter Schwarz: “ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging”, ACM TODS 17, No. 1 (March 1992).
ARIES is based on 3 concepts:
Write-ahead logging
Repeating history during redo
Logging changes during undo
Write-ahead logging: the BFIM of a data item is recorded in the appropriate log entry, and that log entry is flushed to disk, before the BFIM is overwritten with the AFIM in the database on disk.
Repeating history during redo: ARIES retraces all actions of the database system prior to the crash to reconstruct the database state at the time of the crash. Transactions that were uncommitted at that time are then undone.
Logging changes during undo: this prevents ARIES from repeating completed undo operations if a failure occurs during recovery and causes the recovery process to restart.
ARIES recovery algorithm
There are 3 main phases:
Analysis: build the REDO and UNDO lists.
Redo: start from a position in the log determined in the analysis phase and restore the database to the state it was in at the time of the crash.
Undo: undo the effects of transactions that failed to commit.
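The three phases can be illustrated with a heavily simplified sketch. It is not the full ARIES algorithm: it assumes no checkpoints, uses the list index as the log sequence number (LSN), stores one value per page, and replaces the real undo chain with an `undoes` field on each compensation record. The record layout, the function name `aries_recover`, and the sample data are all hypothetical.

```python
def aries_recover(log, pages):
    """log: list of dicts, LSN = index. pages: page -> {"value", "page_lsn"}."""
    # Analysis: find loser transactions and the first LSN that dirtied each page.
    losers, dirty_pages = set(), {}
    for lsn, rec in enumerate(log):
        if rec["type"] in ("update", "clr"):
            losers.add(rec["tid"])
            dirty_pages.setdefault(rec["page"], lsn)
        elif rec["type"] in ("commit", "end"):
            losers.discard(rec["tid"])

    # Redo: repeat history for every logged change, including those of losers,
    # skipping changes that the page on disk already reflects.
    if dirty_pages:
        for lsn in range(min(dirty_pages.values()), len(log)):
            rec = log[lsn]
            if rec["type"] in ("update", "clr") and pages[rec["page"]]["page_lsn"] < lsn:
                pages[rec["page"]] = {"value": rec["after"], "page_lsn": lsn}

    # Undo: roll back losers newest first, logging each undo as a compensation
    # record so that a restarted recovery does not undo it again.
    compensated = {rec["undoes"] for rec in log if rec["type"] == "clr"}
    for lsn in range(len(log) - 1, -1, -1):
        rec = log[lsn]
        if rec["type"] == "update" and rec["tid"] in losers and lsn not in compensated:
            log.append({"type": "clr", "tid": rec["tid"], "page": rec["page"],
                        "after": rec["before"], "undoes": lsn})
            pages[rec["page"]] = {"value": rec["before"], "page_lsn": len(log) - 1}
    for tid in sorted(losers):
        log.append({"type": "end", "tid": tid})
    return sorted(losers)


log = [
    {"type": "update", "tid": "T1", "page": "P1", "before": 10, "after": 11},
    {"type": "update", "tid": "T2", "page": "P2", "before": 7, "after": 8},
    {"type": "commit", "tid": "T1"},
    {"type": "update", "tid": "T2", "page": "P1", "before": 11, "after": 12},
    # system crash here
]
# Assumed disk state: P1 was flushed after LSN 0, P2 was never flushed.
pages = {"P1": {"value": 11, "page_lsn": 0}, "P2": {"value": 7, "page_lsn": -1}}
print(aries_recover(log, pages))                         # ['T2']
print({p: state["value"] for p, state in pages.items()})  # {'P1': 11, 'P2': 7}
```

Running the function again on the extended log changes nothing, because T2 now has an end record and every page LSN is already current. A real implementation would also track checkpoints, a transaction table with last-LSN pointers, and an undo-next pointer in each compensation record.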
Media recovery
A media failure, such as a disk crash or disk controller failure, is one in which some portion of the database has been physically destroyed.
Recovery from such a failure basically involves reloading (restoring) the database from a backup copy and then using the log, both its active and archive portions, to redo all transactions that completed since that backup copy was taken (forward recovery).
There is no need to undo the transactions that were still in progress at the time of failure, since by definition all updates of such transactions have been “undone” (actually lost) anyway.
Two-phase commit
Two-phase commit is important whenever a transaction can interact with several independent “resource managers”, each managing its own set of resources and maintaining its own recovery log.
Assume the transaction has completed successfully, so the system-wide instruction it issues is COMMIT, not ROLLBACK.
On receiving the commit request, the coordinator goes through the following 2-phase process:
Prepare: first, it instructs all resource managers to get ready to “go either way” on the transaction. This means each resource manager must force all log records for local resources used by the transaction out to its own physical log. If the forced write succeeds, the resource manager replies OK.
Commit: after receiving OK replies from all participants, the coordinator decides to commit and informs every participant, which then completes the commit locally. If any reply is not OK, the decision is rollback.
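The prepare and commit phases can be sketched as a single-process simulation. The class `ResourceManager`, the function `two_phase_commit`, and the `can_prepare` flag used to simulate a failed prepare are hypothetical names chosen for illustration. The coordinator's own decision log is an assumption taken from the usual description of the protocol rather than from the slide, and the sketch ignores timeouts, message loss, and coordinator failure.

```python
class ResourceManager:
    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # False simulates a failed forced write
        self.log = []                   # stands in for the manager's own log
        self.state = "active"

    def prepare(self, tid):
        """Phase 1: force local log records, then vote OK (True) or not."""
        if not self.can_prepare:
            return False
        self.log.append(("prepared", tid))
        self.state = "prepared"
        return True

    def commit(self, tid):
        self.log.append(("commit", tid))
        self.state = "committed"

    def rollback(self, tid):
        self.log.append(("rollback", tid))
        self.state = "rolled_back"


def two_phase_commit(tid, managers, coordinator_log):
    # Phase 1 (prepare): ask every participant to get ready to go either way.
    votes = [rm.prepare(tid) for rm in managers]
    # Phase 2 (commit): commit only if every participant replied OK.
    decision = "commit" if all(votes) else "rollback"
    coordinator_log.append((decision, tid))  # record the decision first
    for rm in managers:
        if decision == "commit":
            rm.commit(tid)
        else:
            rm.rollback(tid)
    return decision


coordinator_log = []
managers = [ResourceManager("orders"), ResourceManager("billing")]
print(two_phase_commit("T1", managers, coordinator_log))  # commit

managers = [ResourceManager("orders"), ResourceManager("billing", can_prepare=False)]
print(two_phase_commit("T2", managers, coordinator_log))  # rollback
```

Recording the decision before informing participants is what lets a restarted coordinator tell prepared participants how the transaction ended.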