CM20145 Recovery + Intro. to Concurrency


CM20145 Recovery + Intro. to Concurrency. Dr Alwyn Barry, Dr Joanna Bryson

Last Time… Transaction Concepts: ACID, Possible States. Schedules. Serializability: Conflict, View, Others. Testing for Serializability: Precedence Graphs. Now: Recovery & Some Concurrency

Overview Recovery: Cascading Rollbacks; Storage & Data Access; Algorithms for: Shadow paging, Log-based recovery, Deferred & immediate DB modifications; Checkpoints. Intro. to Concurrency: Introduction to Locking; Pitfalls of Locking; The Two-Phase Locking Protocol; Weaker Levels of Consistency

Recovery Algorithms Recovery algorithms are techniques to ensure database consistency and transaction atomicity and durability despite failures. Recovery algorithms have two parts: (1) actions taken during normal transaction processing to ensure enough information exists to recover from failures; (2) actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency and durability.

Failure – Classifications Transaction failure: Logical errors: transaction cannot complete due to some internal error condition. System errors: the database system must terminate an active transaction due to an error condition (e.g., deadlock – covered in Lecture 15). System crash: a power failure or other hardware or software failure causes the system to crash. Fail-stop assumption: non-volatile storage contents are assumed to not be corrupted by system crash. Database systems have numerous integrity checks to prevent corruption of disk data. Disk failure: a head crash or similar disk failure destroys all or part of disk storage. Destruction is assumed to be detectable. Disk drives use checksums to detect failures.

Recoverability How do we address failures when we are running concurrent transactions? Recoverable schedule: if a transaction Tj reads a data item previously written by a transaction Ti, the commit operation of Ti appears before the commit operation of Tj. The schedule shown on the slide (figure not reproduced in this transcript) is not recoverable if T9 commits immediately after its read. If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence the database must ensure that schedules are recoverable. ©Silberschatz, Korth and Sudarshan Modifications & additions by J Bryson
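The recoverability condition can be checked mechanically. The sketch below is illustrative only: the tuple encoding of a schedule and the function names are assumptions, and since the slide's figure is lost, the two example schedules are an assumed reconstruction in which T9 reads A after T8 has written it. Aborts are ignored for simplicity.

```python
# Each step is (transaction, operation, item); operation is "r", "w" or "c" (commit).
def reads_from(schedule):
    """Return (reader, writer) pairs where reader read an item last written by writer."""
    last_writer = {}
    pairs = []
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r" and last_writer.get(item) not in (None, txn):
            pairs.append((txn, last_writer[item]))
    return pairs


def is_recoverable(schedule):
    """True if every committed reader commits after the writer it read from."""
    commit_at = {txn: i for i, (txn, op, _) in enumerate(schedule) if op == "c"}
    for reader, writer in reads_from(schedule):
        if reader in commit_at:
            if writer not in commit_at or commit_at[writer] > commit_at[reader]:
                return False
    return True


# Assumed reconstruction of the slide's example (the figure is not in the transcript).
t9_commits_early = [
    ("T8", "r", "A"), ("T8", "w", "A"),
    ("T9", "r", "A"), ("T9", "c", None),
    ("T8", "r", "B"),
]
t8_commits_first = [
    ("T8", "r", "A"), ("T8", "w", "A"),
    ("T9", "r", "A"),
    ("T8", "r", "B"), ("T8", "c", None), ("T9", "c", None),
]
print(is_recoverable(t9_commits_early))  # False
print(is_recoverable(t8_commits_first))  # True
```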

Cascading Rollbacks Cascading rollback: a single transaction failure leads to a series of transaction rollbacks. Consider the schedule shown on the slide (figure not reproduced in this transcript), in which none of the transactions has yet committed (so the schedule is recoverable). If T10 fails, T11 and T12 must also be rolled back. This can lead to the undoing of a significant amount of work.

Cascadeless Schedules Cascadeless schedules: cascading rollbacks cannot occur. For each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj. Every cascadeless schedule is also recoverable.
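A cascadeless schedule never lets a transaction read a value whose writer has not yet committed. A minimal sketch of that test follows; the schedule encoding and function name are assumptions for illustration, and the T10/T11/T12 schedule is an assumed reconstruction of the lost cascading-rollback figure.

```python
# Each step is (transaction, operation, item); operation is "r", "w" or "c" (commit).
def is_cascadeless(schedule):
    """True if no transaction reads an item written by a still-uncommitted transaction."""
    last_writer = {}
    committed = set()
    for txn, op, item in schedule:
        if op == "c":
            committed.add(txn)
        elif op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False  # a dirty read: an abort of writer would cascade to txn
    return True


# Assumed reconstruction: T11 and T12 read A before its writer has committed.
cascading = [
    ("T10", "r", "A"), ("T10", "r", "B"), ("T10", "w", "A"),
    ("T11", "r", "A"), ("T11", "w", "A"),
    ("T12", "r", "A"),
]
# Same work, but each writer commits before the next transaction reads A.
cascadeless = [
    ("T10", "r", "A"), ("T10", "r", "B"), ("T10", "w", "A"), ("T10", "c", None),
    ("T11", "r", "A"), ("T11", "w", "A"), ("T11", "c", None),
    ("T12", "r", "A"), ("T12", "c", None),
]
print(is_cascadeless(cascading))    # False
print(is_cascadeless(cascadeless))  # True
```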

Storage Hierarchy (Lecture 9)

Storage Structure Volatile storage: Does not survive system crashes. Examples: main memory, cache memory. Nonvolatile storage: Survives system crashes. Examples: disk, tape, flash memory, non-volatile (battery backed up) RAM. Stable storage: A mythical form of storage that survives all failures. Approximated by maintaining multiple copies on distinct nonvolatile media.

Stable-Storage Implementation Maintain multiple copies of each block on separate disks (& locations…). Failure during data transfer can still result in inconsistent copies. Block transfer can result in: successful completion; partial failure (destination block has incorrect information); or total failure (destination block was never updated).

Data Access Physical blocks: blocks residing on the disk. Buffer blocks: blocks residing temporarily in main memory. Block movements between disk and main memory are initiated through the following two operations: input(B) transfers the physical block B to main memory. output(B) transfers the buffer block B to the disk, and replaces the appropriate physical block there. Each transaction Ti has its private work-area in which local copies of all data items accessed and updated by it are kept. Ti's local copy of a data item X is called xi. We assume, for simplicity, that each data item fits in, and is stored inside, a single block.

Sample Data Access Diagram [Figure not reproduced: disk holding blocks A and B; buffer blocks A and B in main memory, transferred by input(A) and output(B); private work areas of T1 (x1, y1) and T2 (x2), filled and emptied by read(X) and write(Y).]

Data Access (Cont.) A transaction transfers data items between system buffer blocks and its private work-area. Transactions perform read(X) while accessing X for the first time; all subsequent accesses are to the local copy. After the last access, the transaction executes write(X). output(BX), where BX is the block holding X, need not immediately follow write(X). The system can perform the output operation when it deems fit. Reminder: Volatile memory is faster, but more vulnerable!
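The four operations input(B), output(B), read(X) and write(X) can be pictured with a toy model. This is only a sketch: the class names, the dictionary layout and the block name "BX" are assumptions made for illustration, and each data item lives in exactly one block as the slides assume.

```python
class Storage:
    """Toy model of disk blocks plus the buffer blocks held in main memory."""

    def __init__(self, disk, block_of):
        self.disk = disk          # block name -> {item: value}
        self.block_of = block_of  # item -> name of the block holding it
        self.buffer = {}          # buffer blocks currently in main memory

    def input(self, block):
        """Copy a physical block from disk into the buffer."""
        self.buffer[block] = dict(self.disk[block])

    def output(self, block):
        """Copy a buffer block back to disk, replacing the physical block."""
        self.disk[block] = dict(self.buffer[block])


class Transaction:
    """A transaction with a private work area of local copies."""

    def __init__(self, storage):
        self.storage = storage
        self.local = {}

    def _buffer_block(self, item):
        block = self.storage.block_of[item]
        if block not in self.storage.buffer:
            self.storage.input(block)  # bring the block in on first use
        return self.storage.buffer[block]

    def read(self, item):
        """Copy the item from its buffer block into the work area."""
        self.local[item] = self._buffer_block(item)[item]
        return self.local[item]

    def write(self, item):
        """Copy the local value into the buffer block (not yet to disk)."""
        self._buffer_block(item)[item] = self.local[item]


storage = Storage(disk={"BX": {"X": 100}}, block_of={"X": "BX"})
t1 = Transaction(storage)
t1.read("X")
t1.local["X"] -= 50
t1.write("X")
print(storage.disk["BX"]["X"])  # 100: write(X) only changed the buffer block
storage.output("BX")
print(storage.disk["BX"]["X"])  # 50: the change reaches disk at output(BX)
```

The gap between write(X) and output(BX) is exactly the window in which a crash loses the update, which is why the recovery schemes that follow log before they output.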

Protecting Storage (FYI, not for exam) Keep two copies of each block. During data transfer: (1) write the information onto the first physical block; (2) when the first write successfully completes, write the same information onto the second physical block; (3) the output is completed only after the second write successfully completes. To recover from failure, first find inconsistent blocks. Expensive solution: compare the 2 copies of every disk block. Better solution: record in-progress disk writes on non-volatile storage (non-volatile RAM or a special area of disk); use this information during recovery to find blocks that may be inconsistent, and only compare copies of these. Used in hardware RAID systems. If either copy of an inconsistent block is detected to have an error (bad checksum), overwrite it by the other copy. If both have no error, but are different, overwrite the second block by the first block.
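The two-copy write and its recovery rule can be sketched as follows. Everything here is an illustrative assumption: blocks are small dictionaries, the checksum is CRC-32 from the standard zlib module, and the fail_after parameter is a hypothetical hook that simulates a crash partway through the output. The case where both copies are corrupt is not covered by the slide and is left untouched.

```python
import zlib


def make_block(data):
    """Store data together with a checksum so corruption can be detected."""
    return {"data": data, "checksum": zlib.crc32(data)}


def is_valid(block):
    return zlib.crc32(block["data"]) == block["checksum"]


def stable_write(copies, data, fail_after=None):
    """Write the first copy, then the second. Returns True only if both completed.

    fail_after (hypothetical test hook) simulates a crash after that many copies.
    """
    for i in range(2):
        if fail_after is not None and i >= fail_after:
            return False
        copies[i] = make_block(data)
    return True


def recover(copies):
    """Repair a pair of copies after a crash, following the rule on the slide."""
    first, second = copies
    if is_valid(first) and not is_valid(second):
        copies[1] = dict(first)
    elif is_valid(second) and not is_valid(first):
        copies[0] = dict(second)
    elif is_valid(first) and is_valid(second) and first["data"] != second["data"]:
        copies[1] = dict(first)  # both fine but different: first copy wins


copies = [make_block(b"old"), make_block(b"old")]
stable_write(copies, b"new", fail_after=1)  # crash between the two writes
recover(copies)
print(copies[0]["data"], copies[1]["data"])  # b'new' b'new'
```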

Recovery and Atomicity Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state. To ensure atomicity despite failures, we first output information describing the modifications to stable storage without modifying the database itself. Two approaches shown here: shadow-paging (naïve), and log-based recovery. We’ll assume that transactions run serially (book goes further if you’re curious).

Shadow Database Assume only one transaction is active at a time. db_pointer always points to the current consistent copy of the database. Updates are made on a copy of the database. The pointer is moved to the updated copy after the transaction reaches partial commit and its pages have been written. On transaction failure, the old consistent copy pointed to by db_pointer is used, and the updated copy is deleted. Assumes disks don’t fail. Useful for text editors, but extremely inefficient for large databases: executing a single transaction requires copying the entire database!
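A minimal in-memory sketch of the shadow-database idea is given below. The class and method names are assumptions for illustration; only db_pointer comes from the slide. Real systems would make the pointer update atomic on disk, which a Python attribute assignment merely stands in for here.

```python
import copy


class ShadowDatabase:
    """Toy shadow-copy scheme: update a full copy, then switch db_pointer."""

    def __init__(self, initial):
        self.copies = {0: dict(initial)}
        self.db_pointer = 0  # always names the current consistent copy

    def current(self):
        return self.copies[self.db_pointer]

    def run(self, transaction):
        """Run transaction(db) on a fresh copy; commit by moving the pointer."""
        new_id = self.db_pointer + 1
        self.copies[new_id] = copy.deepcopy(self.current())  # copy everything
        try:
            transaction(self.copies[new_id])
        except Exception:
            del self.copies[new_id]  # failure: discard the updated copy
            return False
        old_id = self.db_pointer
        self.db_pointer = new_id     # the single step that commits
        del self.copies[old_id]
        return True


def transfer(db):
    db["A"] -= 50
    db["B"] += 50


def broken_transfer(db):
    db["A"] -= 50
    raise RuntimeError("simulated failure before B is updated")


db = ShadowDatabase({"A": 100, "B": 200})
print(db.run(transfer), db.current())         # True {'A': 50, 'B': 250}
print(db.run(broken_transfer), db.current())  # False {'A': 50, 'B': 250}
```

The deepcopy on every transaction is the inefficiency the slide warns about.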

Log-Based Recovery A log is kept on stable storage. A log is a sequence of log records that record the update activities on the database. When transaction Ti starts, it registers itself by writing a <Ti start> log record. Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the value of X before the write, and V2 is the value to be written to X. When Ti finishes its last statement, the log record <Ti commit> is written. Assume here that log records are written directly to stable storage (that is, they are not buffered). Two approaches using logs: deferred database modification and immediate database modification.

Deferred Database Modification The deferred database modification scheme records all modifications to the log, but defers all writes until after partial commit. A transaction starts by writing a <Ti start> record to the log. A write(X) operation results in a log record <Ti, X, V> being written, where V is the new value for X. Note: the old value is not needed for this scheme. The real write is not performed on X at this time, but is deferred. When Ti partially commits, <Ti commit> is written to the log. Finally, the log records are used to actually execute the previously deferred writes. Assumes that transactions execute serially.

Deferred DB Modification (2) During recovery, a transaction needs to be redone if and only if both <Ti start> and <Ti commit> are in the log. Redoing a transaction Ti (redo(Ti)) sets the value of all data items updated by the transaction to the new values. Crashes can occur while the transaction is executing the original updates, or while recovery action is being taken. Example: T0 and T1 (T0 executes before T1). T0: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B). T1: read(C); C := C - 100; write(C).

Deferred DB Modification (3) Consider the log as it appears at three instants of time, labelled (a), (b) and (c) on the slide (figure not reproduced in this transcript). If the log on stable storage at the time of the crash is as in case: (a) No redo actions need to be taken. (b) redo(T0) must be performed since <T0 commit> is present. (c) redo(T0) must be performed followed by redo(T1) since <T0 commit> and <T1 commit> are present.
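Redo-only recovery under deferred modification can be sketched in a few lines. The tuple encoding of log records and the function name are assumptions, and the starting balances A = 1000, B = 2000, C = 700 are assumed values chosen to fit the T0/T1 example (the slide's log figure is lost). The log below corresponds to a crash after T0 has committed but before T1 has.

```python
def recover_deferred(log, db):
    """Redo every transaction that has both a start and a commit record.

    Log records: ("start", T), ("write", T, item, new_value), ("commit", T).
    """
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_redo = started & committed
    for rec in log:  # forward scan, so later writes win
        if rec[0] == "write" and rec[1] in to_redo:
            _, _, item, new_value = rec
            db[item] = new_value
    return db


# Assumed initial balances; they are not given in the transcript.
db_at_crash = {"A": 1000, "B": 2000, "C": 700}
log_at_crash = [
    ("start", "T0"),
    ("write", "T0", "A", 950),
    ("write", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("write", "T1", "C", 600),  # T1 never committed, so this is ignored
]
print(recover_deferred(log_at_crash, db_at_crash))
# {'A': 950, 'B': 2050, 'C': 700}
```

Because redo only assigns new values, running it again after a crash during recovery gives the same result, which is why a second crash is harmless.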

Immediate DB Modification The immediate database modification scheme allows database updates of an uncommitted transaction to be made as the writes are issued. Since undoing may be needed, update log records must have both the old value and the new value. The update log record must be written before the database item is written. We assume that the log record is output directly to stable storage. Log record output can be postponed, so long as, prior to execution of an output(B) operation, all log records corresponding to items in B are flushed to stable storage. Output of updated blocks can take place at any time before or after transaction commit. The order in which blocks are output can be different from the order in which they are written.
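With immediate modification the log record carries both values so that recovery can undo as well as redo. The sketch below is an assumption-laden illustration: the record encoding, function names and starting balances are invented for the example, and the order "undo incomplete transactions backwards, then redo committed ones forwards" follows the usual textbook treatment rather than anything stated on this slide.

```python
def logged_write(log, db, txn, item, new_value):
    """Write-ahead rule: append the log record before touching the database item."""
    log.append(("write", txn, item, db[item], new_value))  # old and new value
    db[item] = new_value


def recover_immediate(log, db):
    """Undo transactions without a commit record, then redo committed ones."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    incomplete = started - committed
    for rec in reversed(log):  # undo: restore old values, newest first
        if rec[0] == "write" and rec[1] in incomplete:
            db[rec[2]] = rec[3]
    for rec in log:            # redo: reapply new values, oldest first
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[4]
    return db


# Assumed initial balances; db stands for the database contents on disk.
db = {"A": 1000, "B": 2000, "C": 700}
log = [("start", "T0")]
logged_write(log, db, "T0", "A", 950)
logged_write(log, db, "T0", "B", 2050)
log.append(("commit", "T0"))
log.append(("start", "T1"))
logged_write(log, db, "T1", "C", 600)  # crash happens before T1 commits
print(recover_immediate(log, db))
# {'A': 950, 'B': 2050, 'C': 700}
```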

Checkpoints Problems with the log-based recovery procedure: searching the entire log is time-consuming, and we might unnecessarily redo transactions which have already output their updates to the database. We can streamline the recovery procedure by periodically performing checkpointing: (1) output all log records currently residing in main memory onto stable storage; (2) output all modified buffer blocks to the disk; (3) write a log record <checkpoint> onto stable storage.

Checkpoints & Recovery We need consider only transactions that didn’t commit before the checkpoint. A simple algorithm, if transactions run serially: (1) Scan backwards from the end of the log to find the most recent <checkpoint> record. (2) Continue scanning backwards till a record <Ti start> is found. Only the part of the log following this start record need be considered; the earlier part can be ignored during recovery, and can be erased whenever desired. (3) For all transactions with no <Ti commit>, execute undo(Ti). (4) Scanning forward in the log, for all transactions starting from Ti or later with a <Ti commit>, execute redo(Ti).
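The checkpoint-based procedure for serial transactions can be sketched as below. The record encoding and function names are assumptions for illustration, and the sample log is a made-up serial history (T1 finishes before the checkpoint, T2 straddles it, T3 commits after it, T4 is cut off by the crash), not data from the slides.

```python
def relevant_suffix(log):
    """Return the part of the log from the last <start> before the last checkpoint."""
    checkpoint_at = None
    for i in range(len(log) - 1, -1, -1):      # scan backwards for <checkpoint>
        if log[i][0] == "checkpoint":
            checkpoint_at = i
            break
    if checkpoint_at is None:
        return log                              # no checkpoint: use the whole log
    for i in range(checkpoint_at - 1, -1, -1):  # keep going back to a <Ti start>
        if log[i][0] == "start":
            return log[i:]
    return log[checkpoint_at:]


def recover_with_checkpoint(log, db):
    """Undo uncommitted and redo committed transactions in the relevant suffix."""
    tail = relevant_suffix(log)
    started = [rec[1] for rec in tail if rec[0] == "start"]
    committed = {rec[1] for rec in tail if rec[0] == "commit"}
    undo = [t for t in started if t not in committed]
    redo = [t for t in started if t in committed]
    for rec in reversed(tail):
        if rec[0] == "write" and rec[1] in undo:
            db[rec[2]] = rec[3]                 # restore the old value
    for rec in tail:
        if rec[0] == "write" and rec[1] in redo:
            db[rec[2]] = rec[4]                 # reapply the new value
    return undo, redo


# Write records are ("write", T, item, old_value, new_value).
log = [
    ("start", "T1"), ("write", "T1", "A", 0, 1), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "B", 0, 2),
    ("checkpoint",),
    ("commit", "T2"),
    ("start", "T3"), ("write", "T3", "C", 0, 3), ("commit", "T3"),
    ("start", "T4"), ("write", "T4", "D", 0, 4),
]
disk = {"A": 1, "B": 2, "C": 0, "D": 4}  # assumed disk state at the crash
print(recover_with_checkpoint(log, disk))  # (['T4'], ['T2', 'T3'])
print(disk)                                # {'A': 1, 'B': 2, 'C': 3, 'D': 0}
```

T1 never appears in the suffix, so its records are not examined at all.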

Example of Checkpoints [Figure not reproduced: timeline with a checkpoint at time Tc and a system failure at time Tf, showing transactions T1 to T4.] T1 can be ignored (updates already output to disk due to checkpoint). T2 and T3 are redone. T4 is undone.

Concurrency Goal – to develop concurrency control protocols that will ensure serializability. These protocols will impose a discipline that avoids nonserializable schedules. A common concurrency control protocol uses locks. While one transaction is accessing a data item, no other transaction can modify it. Require a transaction to lock the item before accessing it. Topic of Lecture 15! But we’ll introduce locking now.

Lock-Based Protocols A lock is a mechanism to control concurrent access to a data item. Lock requests are made to concurrency-control manager. Transaction can proceed only after request is granted. Data items can be locked in two modes: exclusive (X) mode. Data item can be both read and written. X-lock is requested using the lock-X instruction. shared (S) mode. Data item can only be read. S-lock is requested using lock-S.

Lock-Based Protocols (2) Lock-compatibility matrix (not reproduced in this transcript): S is compatible only with S; X is compatible with nothing. A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions. Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item no other transaction may hold any lock on the item. If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. The lock is then granted.
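The compatibility rule translates directly into a lookup table and a grant test. The LockManager class below is a hypothetical sketch, not the design of any real concurrency-control manager: instead of queueing a blocked request it simply returns False, leaving the waiting to the caller.

```python
# (mode already held, mode requested) -> may both be held at once?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockManager:
    """Toy lock table: item -> {transaction: mode held}."""

    def __init__(self):
        self.held = {}

    def can_grant(self, txn, item, mode):
        """A request is grantable if it is compatible with every other holder."""
        return all(
            COMPATIBLE[(held_mode, mode)]
            for other, held_mode in self.held.get(item, {}).items()
            if other != txn
        )

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if txn must wait."""
        if not self.can_grant(txn, item, mode):
            return False
        locks = self.held.setdefault(item, {})
        if locks.get(txn) != "X":  # an X-lock already held covers an S request
            locks[txn] = mode
        return True

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


manager = LockManager()
print(manager.request("T1", "A", "S"))  # True
print(manager.request("T2", "A", "S"))  # True: shared locks coexist
print(manager.request("T3", "A", "X"))  # False: T3 must wait
manager.release("T1", "A")
manager.release("T2", "A")
print(manager.request("T3", "A", "X"))  # True once the S-locks are gone
```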

Lock-Based Protocols (3) Example of a transaction performing locking: T2: lock-S(A); read (A); unlock(A); lock-S(B); read (B); unlock(B); display(A+B) Locking as above is not sufficient to guarantee serializability — if A and B get updated between the read of A and B, and the display of their sum, that sum would be out of date. A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules.

Pitfalls of Lock-Based Protocols Consider the partial schedule shown on the slide (figure not reproduced in this transcript), in which neither T3 nor T4 can make progress. Executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A. Such a situation is called a deadlock. To handle a deadlock one of T3 or T4 must be rolled back and its locks released.
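One common way to spot such a situation, not described on this slide and offered here only as an illustration, is to record who waits for whom and look for a cycle. The function name and the dictionary encoding of the wait-for relation are assumptions.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a wait cycle, or None.

    waits_for maps each transaction to the set of transactions it waits for.
    """
    visiting, done = set(), set()

    def visit(txn, path):
        visiting.add(txn)
        path.append(txn)
        for other in waits_for.get(txn, ()):
            if other in visiting:                 # back edge: a cycle
                return path[path.index(other):]
            if other not in done:
                cycle = visit(other, path)
                if cycle:
                    return cycle
        visiting.discard(txn)
        done.add(txn)
        path.pop()
        return None

    for txn in list(waits_for):
        if txn not in done:
            cycle = visit(txn, [])
            if cycle:
                return cycle
    return None


# The slide's situation: T4 waits for T3 (lock on B), T3 waits for T4 (lock on A).
print(find_deadlock({"T3": {"T4"}, "T4": {"T3"}}))  # ['T3', 'T4']
print(find_deadlock({"T3": {"T4"}, "T4": set()}))   # None
```

Rolling back any transaction on the returned cycle and releasing its locks breaks the deadlock.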

Pitfalls of Locking (2) The potential for deadlock exists in most locking protocols. Deadlocks are a necessary evil. Starvation is also possible if the concurrency-control manager is badly designed. For example: (1) a transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item; (2) the same transaction is repeatedly rolled back due to deadlocks. Concurrency-control managers can be designed to prevent starvation.

The Two-Phase Locking Protocol This is a protocol which ensures conflict-serializable schedules. Phase 1, Growing Phase: the transaction may obtain locks; the transaction may not release locks. Phase 2, Shrinking Phase: the transaction may release locks; the transaction may not obtain locks. The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e. the point where a transaction acquired its final lock).
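Whether a single transaction obeys the two-phase rule is easy to test: once it has unlocked anything, it may not lock again. The sketch below assumes a list-of-pairs encoding of a transaction's steps and a hypothetical function name; the first example is T2 from the earlier locking slide, which is not two-phase.

```python
def check_two_phase(steps):
    """Return (is_two_phase, index_of_lock_point) for one transaction.

    steps is a list of (action, item); actions other than "lock-S",
    "lock-X" and "unlock" (for example "read") are ignored.
    """
    shrinking = False
    lock_point = None
    for i, (action, _item) in enumerate(steps):
        if action == "unlock":
            shrinking = True
        elif action in ("lock-S", "lock-X"):
            if shrinking:
                return False, lock_point  # a lock after an unlock breaks the rule
            lock_point = i                # latest lock so far
    return True, lock_point


# T2 from the earlier slide: unlocks A before locking B.
t2 = [
    ("lock-S", "A"), ("read", "A"), ("unlock", "A"),
    ("lock-S", "B"), ("read", "B"), ("unlock", "B"),
]
# The same reads with all locks taken before any is released.
t2_two_phase = [
    ("lock-S", "A"), ("read", "A"),
    ("lock-S", "B"), ("read", "B"),
    ("unlock", "A"), ("unlock", "B"),
]
print(check_two_phase(t2))            # (False, 0)
print(check_two_phase(t2_two_phase))  # (True, 2)
```

In the second version the lock point is step 2, the moment lock-S(B) is granted.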

The Two-Phase Locking Protocol Two-phase locking does not ensure freedom from deadlocks. Cascading roll-back is possible under two-phase locking. It is avoided with strict two-phase locking: a transaction must hold all its exclusive locks till it commits/aborts. Rigorous two-phase locking is even stricter: all locks are held till commit/abort. Under this protocol, transactions can be serialized in the order in which they commit.

Lock Conversions Two-phase locking with lock conversions. First Phase: can acquire a lock-S on an item; can acquire a lock-X on an item; can convert a lock-S to a lock-X (upgrade). Second Phase: can release a lock-S; can release a lock-X; can convert a lock-X to a lock-S (downgrade). This protocol assures serializability. But it still relies on the programmer to insert the various locking instructions.

Weak Levels of Consistency Degree-two consistency: differs from two-phase locking in that S-locks may be released at any time, and locks may be acquired at any time. X-locks must be held till end of transaction. Serializability is not guaranteed; the programmer must ensure that no erroneous database state will occur. Cursor stability: for reads, each tuple is locked, read, and the lock is immediately released. X-locks are held till end of transaction. Special case of degree-two consistency. FYI only – you aren’t responsible for this slide’s content.

Weak Consistency in SQL SQL allows non-serializable executions. Serializable: the default. Repeatable read: allows only committed records to be read, and repeating a read should return the same value (so read locks should be retained). However, the phantom phenomenon need not be prevented: T1 may see some records inserted by T2, but may not see others inserted by T2. Read committed: same as degree-two consistency, but most systems implement it as cursor stability. Read uncommitted: allows even uncommitted data to be read. FYI only – you aren’t responsible for this slide’s content.

Summary Recovery: Cascading & Its Avoidance; Storage & Data Access; Algorithms for: Shadow paging, Log-based recovery, Deferred & immediate DB modifications; Checkpoints. Intro. to Concurrency: Introduction to Locking; Pitfalls of Locking; The Two-Phase Locking Protocol; Weaker Levels of Consistency (for interest only). Next: Concurrency Control

Reading & Exercises Reading: Silberschatz Ch. 17.1-6; Connolly & Begg 20.3. You will need the rest of 20.2 for next week, so if you want to stay in order go ahead and read that. Exercises: Silberschatz 17.1-7; Connolly & Begg 20.13-15, 20.27