PMIT-6102 Advanced Database Systems

By: Jesmin Akhter, Assistant Professor, IIT, Jahangirnagar University

Lecture 10: Concurrency Control

Outline
Transaction: transaction state, schedules, serializability, time-stamping, and the shadow-database scheme.
Concurrency control: concurrency, three classic problems with example solutions, and deadlock.

Transaction State
Active: the initial state; the transaction stays in this state while it is executing.
Partially committed: after the final statement has been executed.
Failed: after the discovery that normal execution can no longer proceed.
Aborted: after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. There are two options after a transaction has been aborted: restart the transaction (only if there was no internal logical error), or kill the transaction.
Committed: after successful completion.
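
The states above can be read as a small state machine. The sketch below is illustrative only: the names TxState, TRANSITIONS and can_move are hypothetical, and the edge from partially committed to failed is an assumption taken from the usual textbook state diagram rather than from the text above. A restart after an abort is treated as a new transaction, so aborted has no outgoing edge.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"


# Allowed transitions (assumed from the standard state diagram).
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.ABORTED: set(),     # terminal: a restart is a new transaction
    TxState.COMMITTED: set(),   # terminal
}


def can_move(src: TxState, dst: TxState) -> bool:
    """Return True if a transaction may go directly from src to dst."""
    return dst in TRANSITIONS[src]
```

For example, can_move(TxState.ACTIVE, TxState.COMMITTED) is False, because a transaction must pass through the partially committed state first.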

Transaction State (Cont.)

Shadow-database scheme
The recovery-management component of a database system implements the support for atomicity and durability.
The shadow-database scheme:
Assume that only one transaction is active at a time.
A pointer called db_pointer always points to the current consistent copy of the database.
All updates are made on a shadow copy of the database, and db_pointer is made to point to the updated shadow copy only after the transaction reaches partial commit and all updated pages have been flushed to disk.
If the transaction fails, the old consistent copy pointed to by db_pointer can be used, and the shadow copy can be deleted.

The shadow-database scheme:
Assumes that disks do not fail.
Is useful for text editors, but extremely inefficient for large databases: executing a single transaction requires copying the entire database.
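
A minimal in-memory sketch of the pointer switch, under stated assumptions: the class ShadowDatabase and the helper transactions transfer and broken are hypothetical, the database is a plain dict, and nothing is flushed to disk, so durability is not modelled. Only db_pointer comes from the slides.

```python
import copy


class ShadowDatabase:
    def __init__(self, initial):
        # db_pointer always refers to the current consistent copy.
        self.db_pointer = dict(initial)

    def run(self, transaction):
        """Run one transaction on a shadow copy; switch the pointer on success."""
        shadow = copy.deepcopy(self.db_pointer)  # copies the whole database
        try:
            transaction(shadow)
        except Exception:
            return False          # shadow copy is discarded, old copy stays current
        self.db_pointer = shadow  # the pointer switch acts as the commit
        return True


def transfer(db):
    db["A"] -= 50
    db["B"] += 50


def broken(db):
    db["A"] -= 50
    raise RuntimeError("simulated failure before B is credited")


db = ShadowDatabase({"A": 100, "B": 200})
ok_first = db.run(transfer)
ok_second = db.run(broken)
```

The deepcopy on every call is the cost the slide points out: each transaction copies the entire database, which is why the scheme does not scale to large databases.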

Schedules
Schedules are sequences that indicate the chronological order in which instructions of concurrent transactions are executed. A schedule for a set of transactions must consist of all instructions of those transactions, and must preserve the order in which the instructions appear in each individual transaction.
Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. The schedule in the figure is a serial schedule (Schedule 1 in the text), in which T1 is followed by T2.

Schedules (Cont.)
Let T1 and T2 be the transactions defined previously. The following schedule (Schedule 3 in the text) is not a serial schedule, but it is equivalent to Schedule 1. In both Schedule 1 and Schedule 3, the sum A + B is preserved.

Schedules (Cont.)
The following concurrent schedule (Schedule 4 in the text) does not preserve the value of the sum A + B.
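
To see how an interleaving can break the sum A + B, the sketch below runs T1 and T2 serially and then in one bad interleaving. The starting balances (A = 1000, B = 2000) and the exact interleaving are assumptions chosen for illustration, since the schedule figures are not reproduced here; integer division stands in for the 10% calculation.

```python
def serial_t1_then_t2(a, b):
    """T1 (move 50 from A to B) runs to completion, then T2 (move 10% of A)."""
    a -= 50
    b += 50
    temp = a // 10
    a -= temp
    b += temp
    return a, b


def bad_interleaving(a, b):
    """Each transaction works on local copies and writes back at different times."""
    db = {"A": a, "B": b}
    t1_a = db["A"] - 50      # T1: read(A); A := A - 50 (not yet written)
    t2_a = db["A"]           # T2: read(A) still sees the old value
    temp = t2_a // 10        # T2: temp := A * 0.1
    db["A"] = t2_a - temp    # T2: write(A)
    t2_b = db["B"]           # T2: read(B)
    db["A"] = t1_a           # T1: write(A) overwrites T2's write
    db["B"] = db["B"] + 50   # T1: read(B); B := B + 50; write(B)
    db["B"] = t2_b + temp    # T2: write(B) overwrites T1's write
    return db["A"], db["B"]


serial_result = serial_t1_then_t2(1000, 2000)
bad_result = bad_interleaving(1000, 2000)
```

With these assumed balances the serial run ends with A = 855 and B = 2145 (sum 3000), while the interleaved run ends with A = 950 and B = 2100 (sum 3050), so 50 has appeared from nowhere.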

Serializability
A schedule is serializable if it is equivalent to a serial schedule. Different forms of schedule equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
Our simplified schedules consist of only read and write instructions.

Conflict Serializability
Instructions li and lj of transactions Ti and Tj respectively conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj do not conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.

Conflict Serializability (Cont.)
If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent. We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
Example of a schedule that is not conflict serializable:

T3          T4
read(Q)
            write(Q)
write(Q)

We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3, T4 > or the serial schedule < T4, T3 >.

Conflict Serializability (Cont.)
Schedule 3 can be transformed into Schedule 1, a serial schedule where T2 follows T1, by a series of swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable.
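
Trying swaps by hand gets tedious. A standard alternative, not covered on these slides, is the precedence-graph test: add an edge Ti -> Tj whenever an instruction of Ti conflicts with a later instruction of Tj, and the schedule is conflict serializable exactly when the graph has no cycle. The sketch below assumes a schedule is a list of (transaction, op, item) tuples with op "r" or "w"; the function name is hypothetical.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'r', 'w'}, in execution order."""
    # Build precedence edges from every pair of conflicting instructions.
    graph = {}
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            if ti != tj and q_i == q_j and "w" in (op_i, op_j):
                graph.setdefault(ti, set()).add(tj)

    # Depth-first search for a cycle (0 = unvisited, 1 = in progress, 2 = done).
    state = {}

    def has_cycle(node):
        state[node] = 1
        for nxt in graph.get(node, ()):
            if state.get(nxt, 0) == 1:
                return True
            if state.get(nxt, 0) == 0 and has_cycle(nxt):
                return True
        state[node] = 2
        return False

    txns = {t for t, _, _ in schedule}
    return not any(state.get(t, 0) == 0 and has_cycle(t) for t in txns)


# The T3/T4 example: read(Q) by T3, write(Q) by T4, write(Q) by T3.
t3_t4 = [("T3", "r", "Q"), ("T4", "w", "Q"), ("T3", "w", "Q")]
serial = [("T3", "r", "Q"), ("T3", "w", "Q"), ("T4", "w", "Q")]
```

For the T3/T4 example the graph contains both T3 -> T4 and T4 -> T3, so the test reports that the schedule is not conflict serializable, in agreement with the swap argument.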

View Serializability
Let S and S´ be two schedules with the same set of transactions. S and S´ are view equivalent if the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S´, also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and that value was produced by transaction Tj (if any), then transaction Ti must in schedule S´ also read the value of Q that was produced by transaction Tj.
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S´.

View Serializability (Cont.)
A schedule S is view serializable if it is view equivalent to a serial schedule. Every conflict serializable schedule is also view serializable. Schedule 9 (from text) — a schedule which is view-serializable but not conflict serializable. Every view serializable schedule that is not conflict serializable has blind writes.

Recoverable schedule
A schedule is recoverable if, whenever a transaction Tj reads a data item previously written by a transaction Ti, the commit operation of Ti appears before the commit operation of Tj.
The following schedule (Schedule 11) is not recoverable if T9 commits immediately after the read. If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence the database must ensure that schedules are recoverable.

Recoverable schedule
Cascading rollback: a single transaction failure leads to a series of transaction rollbacks. Consider the following schedule, where none of the transactions has yet committed (so the schedule is recoverable). If T10 fails, T11 and T12 must also be rolled back. This can lead to the undoing of a significant amount of work.

Recoverable schedule
Cascadeless schedules: cascading rollbacks cannot occur; for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj. Every cascadeless schedule is also recoverable.
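
The two definitions can be checked mechanically. The sketch below is a simplification under stated assumptions: a schedule is a list of (transaction, op, item) tuples with op "r", "w" or "c" (commit, item None), aborts are not modelled, and a read is taken to read from the most recent writer of the item. The function names and the T8/T9 operations used in the example are illustrative, modelled on the description above.

```python
def reads_from(schedule):
    """Yield (reader, writer, read_index) for each read of another transaction's write."""
    last_writer = {}
    for idx, (txn, op, item) in enumerate(schedule):
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                yield txn, writer, idx


def commit_positions(schedule):
    return {txn: idx for idx, (txn, op, _) in enumerate(schedule) if op == "c"}


def is_recoverable(schedule):
    """Writer must commit before any committed reader commits."""
    commits = commit_positions(schedule)
    for reader, writer, _ in reads_from(schedule):
        if reader in commits:
            if writer not in commits or commits[writer] > commits[reader]:
                return False
    return True


def is_cascadeless(schedule):
    """Writer must commit before the reader's read."""
    commits = commit_positions(schedule)
    for _, writer, read_idx in reads_from(schedule):
        if writer not in commits or commits[writer] > read_idx:
            return False
    return True


# T9 reads A written by T8 and commits while T8 is still running.
early_commit = [("T8", "r", "A"), ("T8", "w", "A"),
                ("T9", "r", "A"), ("T9", "c", None), ("T8", "r", "B")]
# Same reads, but nobody has committed yet.
no_commit = early_commit[:3]
# T8 commits before T9 reads.
safe = [("T8", "w", "A"), ("T8", "c", None), ("T9", "r", "A"), ("T9", "c", None)]
```

The second example shows the gap between the two properties: with no commits yet the schedule is still recoverable, but it is not cascadeless, because a failure of T8 would force T9 to roll back too.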

Concurrency Control
When several transactions execute concurrently, the isolation property may no longer be preserved. To ensure that it is, the system must control the interaction among the concurrent transactions. This control is achieved through one of a variety of mechanisms called concurrency-control schemes. The most frequently used schemes are two-phase locking and snapshot isolation.

Lock-Based Protocols One way to ensure isolation is to require that data items be accessed in a mutually exclusive manner; that is, while one transaction is accessing a data item, no other transaction can modify that data item. The most common method used to implement this requirement is to allow a transaction to access a data item only if it is currently holding a lock on that item.

Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes:
1. Exclusive (X) mode. The data item can be both read and written. An X-lock is requested using the lock-X instruction.
2. Shared (S) mode. The data item can only be read. An S-lock is requested using the lock-S instruction.
Lock requests are made to the concurrency-control manager. A transaction can proceed only after its request is granted.

Lock-Based Protocols (Cont.)
Lock-compatibility matrix
A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions. Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item. If a lock cannot be granted, the requesting transaction is made to wait until all incompatible locks held by other transactions have been released. The lock is then granted.
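
The compatibility rule described above fits in a four-entry table. The sketch below is a minimal illustration; the names COMPATIBLE and can_grant are hypothetical.

```python
# (mode already held, mode requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock other transactions hold."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

For example, can_grant("S", ["S", "S"]) is True, while can_grant("X", ["S"]) and can_grant("S", ["X"]) are both False. An item with no locks held accepts either mode.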

Lock-Based Protocols (Cont.)
Example of a transaction performing locking:
T2: lock-S(A); read(A); unlock(A); lock-S(B); read(B); unlock(B); display(A+B)
Locking as above is not sufficient to guarantee serializability: if A and B are updated between the read of A and the read of B, the displayed sum would be wrong.
A locking protocol is a set of rules followed by all transactions while requesting and releasing locks, indicating when a transaction may lock and unlock each of the data items. Locking protocols restrict the set of possible schedules: the set of schedules a protocol allows is a proper subset of all possible serializable schedules.

Pitfalls of Lock-Based Protocols
If we do not use locking, or if we unlock data items too soon after reading or writing them, we may get inconsistent states. On the other hand, if we do not unlock a data item before requesting a lock on another data item, deadlocks may occur.
Consider the partial schedule in the figure. Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A. Such a situation is called a deadlock. To handle a deadlock, one of T3 or T4 must be rolled back and its locks released.

Pitfalls of Lock-Based Protocols (Cont.)
Starvation is also possible if the concurrency-control manager is badly designed. For example, a transaction may be waiting for an X-lock on an item while a sequence of other transactions request and are granted an S-lock on the same item. The concurrency-control manager can be designed to prevent starvation.

The Two-Phase Locking Protocol
One protocol that ensures serializability is the two-phase locking protocol. This protocol requires that each transaction issue lock and unlock requests in two phases:
Phase 1, growing phase: the transaction may obtain locks; the transaction may not release locks.
Phase 2, shrinking phase: the transaction may release locks; the transaction may not obtain locks.
The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e. the point where a transaction acquired its final lock).

The Two-Phase Locking Protocol (Cont.)
Initially, a transaction is in the growing phase. The transaction acquires locks as needed. Once the transaction releases a lock, it enters the shrinking phase, and it can issue no more lock requests.
Two-phase locking does not ensure freedom from deadlocks: transactions T3 and T4 are two-phase, but in schedule 2 (Figure 15.7) they are deadlocked.
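
Whether a single transaction obeys the two-phase rule can be checked by scanning its lock and unlock requests in order. The sketch below assumes a transaction is given as a list of ("lock" or "unlock", item) pairs; the function name is hypothetical.

```python
def is_two_phase(ops):
    """ops: list of ('lock' | 'unlock', item) issued by one transaction, in order."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True      # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False          # a lock request after an unlock breaks the rule
    return True


# Transaction T2 from the earlier locking example: it unlocks A before locking B.
t2_early_unlock = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
# A two-phase variant: acquire both locks, then release both.
t2_two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
```

The first sequence fails the check, which matches the earlier observation that unlocking A before locking B does not guarantee serializability. The second passes, and its lock point is the request for B.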

Cascading roll-back is possible under two-phase locking. Each transaction observes the two-phase locking protocol, but the failure of T5 after the read(A) step of T7 leads to cascading rollback of T6 and T7.

To avoid cascading rollback, follow a modified protocol called strict two-phase locking. Here a transaction must hold all its exclusive locks until it commits or aborts. Rigorous two-phase locking is even stricter: all locks are held until commit or abort. In this protocol transactions can be serialized in the order in which they commit.

Review: problems with examples, and their solution with two-phase locking

Three classic problems
The problems arise from two or more transactions reading or writing the same part of the same database. Although each transaction executes correctly, the results may interleave in different ways to produce three classic problems:
Lost Update
Uncommitted Dependency
Inconsistent Analysis
It is the logical possibility of each of these that creates the need for concurrency control.

Lost Update problem

Time  User 1 (Trans A)   User 2 (Trans B)
1     Retrieve t
2                        Retrieve t
3     Update t
4                        Update t

t is a tuple in a table retrieved by both users in the course of both transactions. Transaction A loses an update at time 4: the update made at t3 by transaction A is lost (overwritten) at t4 by B.

Uncommitted Dependency
Time  User 1 (Trans A)   User 2 (Trans B)
1                        Update t
2     Retrieve t
3                        Rollback

There are two versions of this problem (times 1-3 and 6-8); the first is shown. The problem arises if one transaction is allowed to retrieve (or update) a tuple that has been updated by another transaction but not yet committed by it. The change may never be committed: if it is rolled back, transaction A has seen data that no longer exists. Here transaction A is dependent at time t2 on an uncommitted change made by transaction B, which is lost on Rollback.

Inconsistent Analysis
Initially: Acc1 = 40; Acc2 = 50; Acc3 = 30.

Time  User 1 (Trans A)             User 2 (Trans B)
1     Retrieve Acc1: Sum = 40
2     Retrieve Acc2: Sum = 90
3                                  Retrieve Acc3
4                                  Update Acc3: 30 → 20
5                                  Retrieve Acc1
6                                  Update Acc1: 40 → 50
7                                  Commit
8     Retrieve Acc3: Sum = 110 (not 120)

Transaction A sees an inconsistent state of the database (Acc1 from before transaction B's updates, Acc3 from after them) and therefore performs an inconsistent analysis.
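
The arithmetic of this example can be replayed step by step. The sketch below uses the balances given above; the function name and the dict representation are assumptions made for illustration.

```python
def replay_inconsistent_analysis():
    db = {"Acc1": 40, "Acc2": 50, "Acc3": 30}
    total = 0
    total += db["Acc1"]   # time 1: A retrieves Acc1, Sum = 40
    total += db["Acc2"]   # time 2: A retrieves Acc2, Sum = 90
    db["Acc3"] -= 10      # times 3-4: B retrieves and updates Acc3, 30 -> 20
    db["Acc1"] += 10      # times 5-6: B retrieves and updates Acc1, 40 -> 50
    # time 7: B commits
    total += db["Acc3"]   # time 8: A retrieves Acc3, which is now 20
    return total, sum(db.values())


seen_by_a, true_total = replay_inconsistent_analysis()
```

Transaction A ends with a sum of 110, although the three balances add up to 120 both before and after transaction B runs.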

Why these problems?
Regard Retrieve as a ‘read’ (R) and Update as a ‘write’ (W). The three problems correspond to the three logical possibilities of inconsistency arising from interleaving two transactions. Consider successive actions:
RR: no problem
WW: lost update
WR: uncommitted dependency
RW: inconsistent analysis

How to prevent such problems?
One way to manage concurrency and avoid such problems is a locking protocol. Other approaches include serializability, time-stamping, and shadow-paging. Where there is low risk of interference, two-phase locking is a common approach, although it usually requires deadlock avoidance. More details in lecture and in books.
A transaction can lock a database item to prevent it from being updated by another transaction while it is in use. It acquires a lock on the item of interest and releases it when it has finished. A lock applies to a particular tuple and may be exclusive (X) or shared (S).
Exclusive locks (X-locks) are also called write-locks and are used when data will be changed by the transaction.
Shared locks (S-locks) are also called read-locks and are used for data that will be read by a transaction but not written.

Lost Update ‘solved’

Time  User 1 (Trans A)                  User 2 (Trans B)
1     Retrieve t (get S-lock on t)
2                                       Retrieve t (get S-lock on t)
3     Update t (request X-lock on t)
4     wait                              Update t (request X-lock on t)

No update is lost, but the result is deadlock: each transaction waits for the other to release its S-lock on t. Both transactions cannot win; one must be told to roll back (the victim). See later how to deal with this.

Uncommitted Dependency solved
Time  User 1 (Trans A)                          User 2 (Trans B)
1                                               Update t (get X-lock on t)
2     Retrieve t (request S-lock on t)
3     wait
4     wait
5                                               Commit / Rollback (releases X-lock on t)
6     Resume: Retrieve t (get S-lock on t)

Because transaction B acquires an X-lock on t at time t1, transaction A has to wait until transaction B has either committed or rolled back. Transaction A is therefore prevented from seeing an uncommitted change at time t2. This resolves the first version of the problem. The second version is similar (check this!).

Inconsistent Analysis ‘solved’
Time  User 1 (Trans A)                        User 2 (Trans B)
1     Retrieve Acc1 (get S-lock): Sum = 40
2     Retrieve Acc2 (get S-lock): Sum = 90
3                                             Retrieve Acc3 (get S-lock)
4                                             Update Acc3 (get X-lock): 30 → 20
5                                             Retrieve Acc1 (get S-lock)
6                                             Update Acc1 (request X-lock): wait
7     Retrieve Acc3 (request S-lock)

Transaction B's update at t6 is not accepted because transaction A holds an S-lock on Acc1. Inconsistent analysis is prevented, but deadlock occurs at time t7: transaction A's request for an S-lock on Acc3 must wait for transaction B's X-lock, while B is already waiting for A.

Deadlock
As we have seen, deadlock occurs when two or more transactions are in a simultaneous wait state. It is desirable to conceal deadlocks from the user.

Deadlock Resolution
The system must detect and break deadlocks by:
Choosing one of the transactions as a victim and rolling it back.
Timing out the transaction and returning an error (or automatically restarting the transaction, hoping not to get deadlock again).
Returning an error code to the victim (e.g., DEADLOCK VICTIM) and leaving it up to the program to handle the situation.
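
The slides do not say how a deadlock is detected. One common technique, assumed here, is a wait-for graph: there is an edge Ti -> Tj when Ti is waiting for a lock held by Tj, and a cycle in the graph means deadlock. The function name and the dict-of-sets representation are hypothetical.

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions it waits on.

    Returns a list of transactions forming a cycle, or None if there is no deadlock.
    """
    visited = set()

    def dfs(node, path, on_path):
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in sorted(waits_for.get(node, ())):
            if nxt in on_path:
                return path[path.index(nxt):]   # cycle found
            if nxt not in visited:
                cycle = dfs(nxt, path, on_path)
                if cycle:
                    return cycle
        path.pop()
        on_path.discard(node)
        return None

    for start in sorted(waits_for):
        if start not in visited:
            cycle = dfs(start, [], set())
            if cycle:
                return cycle
    return None


# T3 waits for T4 and T4 waits for T3, as in the earlier locking pitfall.
deadlocked = find_deadlock({"T3": {"T4"}, "T4": {"T3"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
```

Any transaction on the returned cycle is a candidate victim; which one to roll back is a policy choice that this sketch leaves open.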

Implementation of Locking
A lock manager can be implemented as a separate process to which transactions send lock and unlock requests.
The lock manager replies to a lock request by sending a lock-grant message (or a message asking the transaction to roll back, in case of a deadlock).
The requesting transaction waits until its request is answered.
The lock manager maintains a data structure called a lock table to record granted locks and pending requests.
The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked.
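
A minimal sketch of such a lock table, under stated assumptions: the class and method names are hypothetical, requests for each item are kept in arrival order, a request is granted only if it is compatible with every earlier request (which also prevents the starvation case described earlier), and lock upgrades and deadlock rollback messages are not modelled.

```python
from collections import defaultdict


class LockManager:
    def __init__(self):
        # Hash table: item name -> list of requests in arrival order.
        self.table = defaultdict(list)

    def lock(self, txn, item, mode):
        """Record a request for mode 'S' or 'X'; return True if granted at once."""
        queue = self.table[item]
        # Grant only if every earlier request is a granted S-lock and this is an S request.
        granted = all(r["granted"] and r["mode"] == "S" and mode == "S" for r in queue)
        queue.append({"txn": txn, "mode": mode, "granted": granted})
        return granted

    def unlock(self, txn, item):
        """Remove txn's requests on item; return transactions granted as a result."""
        queue = self.table[item]
        queue[:] = [r for r in queue if r["txn"] != txn]
        newly_granted = []
        for i, req in enumerate(queue):
            if req["granted"]:
                continue
            ok = all(p["granted"] and p["mode"] == "S" and req["mode"] == "S"
                     for p in queue[:i])
            if not ok:
                break                      # keep first-come, first-served order
            req["granted"] = True
            newly_granted.append(req["txn"])
        if not queue:
            del self.table[item]
        return newly_granted


lm = LockManager()
grants = [lm.lock("T1", "A", "S"), lm.lock("T2", "A", "S"),
          lm.lock("T3", "A", "X"), lm.lock("T4", "A", "S")]
after_t1 = lm.unlock("T1", "A")
after_t2 = lm.unlock("T2", "A")
after_t3 = lm.unlock("T3", "A")
```

In this run T4's shared request waits behind T3's exclusive request even though shared locks are currently held, so T3 cannot be starved by a stream of later readers.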

Thank you
