Lecture 4: Transaction Serialization and Concurrency Control
Advanced Databases CG096
Nick Rossiter [Emma-Jane Phillips-Tait]
Content
1 Concurrent Transactions and Parallel Execution of Operations
2 Problems with Concurrency
3 Scheduling of Transaction Execution
4 Locking Techniques for Concurrency Control
5 Optimistic Strategy for Transaction Management
1. Concurrent Transactions
Transaction resources
- Transactions consist of operations, which are executed in a fixed sequence (serially)
- Operations have a starting point and an end point (duration)
- Operations manipulate data (parameters)
Sources of concurrency during transaction execution
- Operations from different transactions can overlap (parallel execution)
- Data can be visible to more than one transaction (shared data)
Example: Flight Reservation (figure: seats)
1.1 Typical situations requiring concurrency control
- Exclusive access to an external device or shared service (e.g. managing printer queues)
- Coordination of applications which process data in parallel (e.g. parallel DB servers)
- Disabling or enabling execution of client programs at a specific moment (typically for database administration, e.g. database backups, enforcing resource occupation, etc.)
- Detection of transaction ends when managing multiple sessions connected to the database (client/server architectures, Web access)
1.2 Transaction Properties and Transaction Management
- ACID properties as implemented by the DBMS guarantee correct behaviour for transactions only to a certain extent, namely when:
  - operations are independent
  - the effect of executing the operations does not change if operations from other transactions are mixed with them
- In other cases the application should incorporate an explicit control mechanism for preserving the original logic of the transaction operations:
  - using DBMS utilities for programming the application (e.g. Oracle DBMS_TRANSACTION package)
  - using specialised transaction servers between the application and the DB (e.g. Microsoft MTS, Java JTS)
2. Problems with Concurrency (in absence of locking)
- Lost Update problem: a value is lost because a write operation from another overlapping transaction overwrites it
- Temporary Update problem: an overlapping transaction uses changes that are later discarded by a rollback
- Incorrect Summary problem: values used in a calculation are overwritten by write operations from other transactions while the calculation is in progress
2.1 Lost Update Problem
Time  Transaction A     Transaction B     Value
T0    Start A                             6
T1    Read Value (6)    Start B           6
T2    Add 2 (6+2=8)     Read Value (6)    6
T3    Write Value (8)   Add 3 (6+3=9)     8
T4    End A             Write Value (9)   9
T5                      End B             9
What should the final Order Value be? Which update has been lost?
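The interleaving in the lost update table can be replayed step by step. The following is a minimal sketch, not part of the original slides; the function names `lost_update` and `serial_execution` are made up for illustration, and local variables stand in for each transaction's private workspace.

```python
def lost_update(initial=6):
    """Replay the interleaved schedule from the table (no locking)."""
    value = initial        # shared database item
    a_local = value        # T1: A reads 6
    b_local = value        # T2: B reads 6, before A has written
    a_local += 2           # T2: A computes 6 + 2 = 8
    value = a_local        # T3: A writes 8
    b_local += 3           # T3: B computes 6 + 3 = 9 from its stale copy
    value = b_local        # T4: B writes 9 and overwrites A's update
    return value


def serial_execution(initial=6):
    """Run A to completion, then B: both updates survive."""
    value = initial
    value += 2             # transaction A
    value += 3             # transaction B
    return value


print(lost_update())       # 9: A's "+2" has been lost
print(serial_execution())  # 11
```

Comparing the two results answers the questions on the slide: the final value should be 11, and the update lost is A's addition of 2.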
2.2 Temporary Update Problem
Time  Transaction A        Transaction B       Value
T0    Start A                                  6
T1    Read Value (6)                           6
T2    Add 2 (8)                                6
T3    Write Value (8)      Start B             8
T4    Failure: Rollback!   Read Value (8)      8
T5    Write Value (6)      Add 3 (8+3=11)      6
T6    End A                Write Value (11)    11
T7                         End B               11
What should the final Order Value be? Where is the temporary update?
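The same kind of replay works for the temporary update problem. This sketch is illustrative only; the name `temporary_update` and the `before_image` variable (the old value that A keeps so it can roll back) are assumptions, not something the slides define.

```python
def temporary_update(initial=6):
    """Replay the schedule in which B reads a value that A later rolls back."""
    value = initial
    before_image = value       # A remembers the old value for a possible rollback
    value = before_image + 2   # A writes 8, but has not committed
    b_local = value            # B reads the uncommitted (temporary) value 8
    value = before_image       # A fails and rolls back: the item is 6 again
    value = b_local + 3        # B writes 8 + 3 = 11, based on a value that never committed
    return value


print(temporary_update())      # 11, although only B's "+3" should have taken effect (9)
```

The temporary update is the value 8: it existed only between A's write and A's rollback, yet B's result depends on it.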
2.3 Incorrect Summary Problem
Time  Transaction A          Transaction B         Values
T1    Read 1st Value (6)                           6 3
T2    Add 2 (6+2=8)                                6 3
T3    Write 1st Value (8)                          8 3
T4    Read 2nd Value (3)     Read 1st Value (8)    8 3
T5    Add 2 (3+2=5)          Read 2nd Value (3)    8 3
T6    Write 2nd Value (5)    Total Sum = 11        8 5
What should the total Order Value be? Which order was accumulated before update, and which after?
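A short replay makes the inconsistency of the summary visible. This is a sketch under the assumption that B reads the first record after A's update and the second record before it, as the values in the table suggest; `incorrect_summary` is a hypothetical name.

```python
def incorrect_summary():
    """B sums two records while A is halfway through updating both."""
    records = [6, 3]
    before = sum(records)    # consistent total if B ran entirely before A

    records[0] += 2          # A updates the 1st record: 6 -> 8
    total = records[0]       # B reads the 1st record after A's update (8)
    total += records[1]      # B reads the 2nd record before A's update (3)
    records[1] += 2          # A updates the 2nd record: 3 -> 5

    after = sum(records)     # consistent total if B ran entirely after A
    return before, total, after


print(incorrect_summary())   # (9, 11, 13)
```

The total of 11 matches neither serial outcome: the first order was accumulated after its update, the second before.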
3. Scheduling of Transaction Execution
- A schedule S of n transactions is a sequential ordering of the operations of the n transactions; the transactions are interleaved.
- A schedule maintains the order of operations within each individual transaction: for each transaction T, if operation a is performed in T before operation b, then a is also performed before b in S. The operations are in the same order as they were before the transactions were interleaved.
- Two operations conflict if they belong to different transactions AND access the same data item AND at least one of them is a write.
(figure: the read x and write x operations of T1 and T2 interleaved into schedule S)
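The two definitions on this slide (order preservation and conflicting operations) translate almost directly into code. The sketch below is illustrative; the `Op` tuple and the function names are assumptions introduced here, with `"r"` and `"w"` standing for read and write.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str    # transaction identifier, e.g. "T1"
    kind: str   # "r" for read, "w" for write
    item: str   # data item accessed, e.g. "x"


def conflicts(a: Op, b: Op) -> bool:
    """Different transactions, same data item, at least one write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


def preserves_order(schedule, transaction) -> bool:
    """True if the schedule keeps the transaction's operations in their original order."""
    name = transaction[0].txn
    return [op for op in schedule if op.txn == name] == list(transaction)


t1 = [Op("T1", "r", "x"), Op("T1", "w", "x")]
t2 = [Op("T2", "r", "x"), Op("T2", "w", "x")]
s = [t1[0], t2[0], t1[1], t2[1]]      # one possible interleaving

print(preserves_order(s, t1), preserves_order(s, t2))  # True True
print(conflicts(t1[0], t2[0]))        # False: two reads never conflict
print(conflicts(t1[0], t2[1]))        # True: a read and a write on x
```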
Serial and Non-serial Schedules
- A schedule S is serial if, for every transaction T participating in the schedule, all of T's operations are executed consecutively in the schedule; otherwise it is called non-serial.
- Non-serial schedules mean that transactions are interleaved. There are many possible orders of operations in alternative schedules.
- A schedule S consisting of n transactions is serialisable if it is equivalent to some serial schedule of the same n transactions.
- The results from serial schedules:
  - always leave the database in a consistent state
  - never suffer from interference by one transaction with another
  - vary according to the order in which the transactions are performed
Example of Serial Schedules (figure: Schedule A, Schedule B)
Example of Non-serial Schedules (figure: Schedule C, Schedule D)
We have to determine whether a schedule is equivalent to a serial schedule, i.e. whether its conflicting reads and writes occur in an order consistent with some serial schedule. To check this, draw a precedence graph.
Precedence Graphs (figure: precedence graphs for Schedules E, F, G and H; one schedule is not conflict serialisable, the other three are conflict serialisable)
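The precedence graph test can be sketched in a few lines: add an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj, then look for a cycle. The schedules used below are made-up examples (not Schedules E to H from the slide), and the representation of an operation as a `(transaction, kind, item)` triple is an assumption.

```python
def precedence_graph(schedule):
    """Edges (Ti, Tj): some operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for i, (ti, kind_i, item_i) in enumerate(schedule):
        for tj, kind_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "w" in (kind_i, kind_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True               # reached a node on the current path again
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


def conflict_serialisable(schedule):
    return not has_cycle(precedence_graph(schedule))


interleaved = [("A", "r", "x"), ("B", "r", "x"), ("A", "w", "x"), ("B", "w", "x")]
serial = [("A", "r", "x"), ("A", "w", "x"), ("B", "r", "x"), ("B", "w", "x")]
print(conflict_serialisable(interleaved))  # False: edges A -> B and B -> A form a cycle
print(conflict_serialisable(serial))       # True
```

The interleaved example is the lost update schedule from section 2.1, which is why it fails the test.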
Transaction Serialisability
The effect on a database of any number of transactions executing in parallel must be the same as if they were executed one after another (this guarantees the Isolation property, the I in ACID).
Syntactic (View) Serialisability
Equivalence
- As long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction must produce the same results
- The read operations are said to see the same view of the data in both schedules
- The final write operation on each data item is the same in both schedules, so the database state should be the same at the end of both schedules
View serialisability
- A schedule S is view serialisable if it is view equivalent to a serial schedule
- Testing for view serialisability is NP-complete: it is highly improbable that an efficient algorithm can be found
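View equivalence can be checked by comparing, for two schedules, which write each read sees and which transaction writes each item last. The sketch below then tests view serialisability by brute force over all serial orders, which is exponential in the number of transactions, in line with the NP-completeness remark. The function names and the `(transaction, kind, item)` triples are assumptions, and the reads of a transaction are compared as a multiset, which is a simplification.

```python
from collections import Counter
from itertools import permutations


def view(schedule):
    """Return (reads_from, final_writer) for a schedule of (txn, kind, item) triples."""
    last_writer = {}         # item -> transaction that wrote it most recently
    reads_from = Counter()   # (reader, item, writer); writer None = initial value
    for txn, kind, item in schedule:
        if kind == "r":
            reads_from[(txn, item, last_writer.get(item))] += 1
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_serialisable(schedule):
    """Try every serial order of the transactions (exponential brute force)."""
    txns = list(dict.fromkeys(txn for txn, _, _ in schedule))
    for order in permutations(txns):
        serial = [op for t in order for op in schedule if op[0] == t]
        if view(serial) == view(schedule):
            return True
    return False


# T2 and T3 write x without reading it ("blind" writes).
blind = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x"), ("T3", "w", "x")]
lost = [("A", "r", "x"), ("B", "r", "x"), ("A", "w", "x"), ("B", "w", "x")]
print(view_serialisable(blind))  # True: view equivalent to the serial order T1, T2, T3
print(view_serialisable(lost))   # False
```

The `blind` schedule is view serialisable even though its conflicts form a cycle between T1 and T2, so view serialisability is the weaker (more permissive) condition.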
Methods for Transaction Serialisation
Timestamps
- unique identifiers for each transaction, generated by the system
- transactions are ordered by their timestamps to ensure a particular serial order
- used extensively in databases, including mirroring and distributed applications
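The slide does not say how timestamps are enforced. One common scheme is basic timestamp ordering, sketched below as an assumption rather than as the method the lecture has in mind: every item remembers the largest timestamp that has read it and written it, and an operation that arrives "too late" is rejected so that its transaction can be restarted. The class and function names are illustrative.

```python
class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # largest timestamp of any transaction that read the item
        self.write_ts = 0   # largest timestamp of any transaction that wrote the item


def read(item, ts):
    """Return True if the read is allowed, False if the transaction must restart."""
    if ts < item.write_ts:
        return False        # a younger transaction has already overwritten the item
    item.read_ts = max(item.read_ts, ts)
    return True


def write(item, ts, value):
    """Return True if the write is allowed, False if the transaction must restart."""
    if ts < item.read_ts or ts < item.write_ts:
        return False        # a younger transaction has already read or written the item
    item.value = value
    item.write_ts = ts
    return True


x = Item(6)
print(read(x, 1), read(x, 2))  # True True: both transactions read 6
print(write(x, 1, 8))          # False: transaction 2 (younger) already read x
print(write(x, 2, 9))          # True
```

Applied to the lost update schedule, the older transaction is rejected instead of being silently overwritten; after a restart with a new timestamp it would read 9 and write 11.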
4. Locking Techniques
- The concept of locking data items is one of the main techniques for controlling the concurrent execution of transactions.
- A lock is a variable associated with a data item in the database. Generally there is a lock for each data item in the database.
- A lock describes the status of the data item with respect to the operations that can be applied to that item. It is used for synchronising the access of concurrent transactions to the database items.
- A transaction locks an object before using it.
- When an object is locked by another transaction, the requesting transaction must wait.
Types of Locks
- Binary locks have two possible states:
  1. locked (lock_item(X) operation)
  2. unlocked (unlock(X) operation)
- Multiple-mode locks allow concurrent access to the same item by several transactions. Three possible states:
  1. read locked or shared locked (other transactions are allowed to read the item)
  2. write locked or exclusive locked (a single transaction exclusively holds the lock on the item)
  3. unlocked
- Locks are held in a lock table.
- Upgrade lock: read lock to write lock
- Downgrade lock: write lock to read lock
Locking Granularity
A database item which can be locked could be:
- a database record
- a field value of a database record
- a disk block
- the whole database
Trade-offs
- coarse granularity: the larger the data item size, the lower the degree of concurrency
- fine granularity: the smaller the data item size, the more locks to be managed and stored, and the more lock/unlock operations needed
Record Locking
Every record has a lock. The lock may have 3 states:
- Unlocked = U
- Read Locked = (R, n)
- Write Locked = W
Note: n is the number of transactions which have put a read lock on the record.
Decision Table for Lock Management: Record Locking Protocol
The lock must be checked, then set, before the record is accessed.
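The decision table itself did not survive in this transcript. The sketch below assumes the usual shared/exclusive rules for the three states U, (R, n) and W: any number of readers may share a record, while a writer needs it unlocked. The class `RecordLock` and its method names are hypothetical, and a `False` return stands for "the requester must wait".

```python
class RecordLock:
    """One lock per record with states U, (R, n) and W."""

    def __init__(self):
        self.state = "U"    # "U", "R" or "W"
        self.readers = 0    # the n in (R, n)

    def request_read(self):
        if self.state == "W":
            return False    # write locked: the reader must wait
        self.state = "R"
        self.readers += 1
        return True

    def request_write(self):
        if self.state != "U":
            return False    # read or write locked: the writer must wait
        self.state = "W"
        return True

    def release_read(self):
        self.readers -= 1
        if self.readers == 0:
            self.state = "U"

    def release_write(self):
        self.state = "U"


lock = RecordLock()
print(lock.request_read(), lock.request_read())  # True True -> state (R, 2)
print(lock.request_write())                      # False: two readers hold the record
```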
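Example: Prevention of Lost Update
- Transaction A sets a write lock on the record, reads the value (6), adds 2 (6+2=8), writes the value and then releases the lock (8).
- Transaction B requests a write lock while A holds it and must wait; it sets its own write lock and reads the value only after A has released the lock, and then continues in the same way.
- Record value and lock state from T0 to T7: 6, 6, 6(W), 6(W), 6(W), 8(W), 8, 8(W).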
Example: Locking with Lost Update (figure)
Ensuring Serialisability: Two-Phase Locking
- All locking operations (read_lock, write_lock) should precede the first unlock operation in the transaction.
- Two phases:
  - expanding phase: new locks on items can be acquired but none can be released
  - shrinking phase: existing locks can be released but no new ones can be acquired
- The two phases are completely disjoint; they do not overlap.
- Record access occurs during or after the expanding phase but must be complete before the shrinking phase starts.
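Whether a transaction obeys the two-phase rule can be checked mechanically: once the first unlock has been seen, no further lock request may appear. The sketch below assumes a transaction is given as a list of `(action, item)` pairs using the operation names from the slide; `is_two_phase` is an illustrative name.

```python
def is_two_phase(operations):
    """True if every read_lock/write_lock precedes the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True                  # the shrinking phase has begun
        elif action in ("read_lock", "write_lock") and shrinking:
            return False                      # a lock acquired after a release
    return True


good = [("read_lock", "x"), ("write_lock", "y"), ("unlock", "x"), ("unlock", "y")]
bad = [("read_lock", "x"), ("unlock", "x"), ("write_lock", "y"), ("unlock", "y")]
print(is_two_phase(good))  # True
print(is_two_phase(bad))   # False: y is locked after x was released
```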
Example: Prevention of Incorrect Summary
- The summing transaction read-locks record 1 and record 2 (expanding phase), reads the 1st value (6) and the 2nd value (3), computes the total value (6+3=9) and only then releases its locks (shrinking phase).
- The updating transaction requests a write lock on record 1 and has to wait; it obtains its write lock only once the read locks have been released.
- Record states while the summary runs: 6(R,1) 3, then 6(R,1) 3(R,1).
Locking Problems: Deadlock
- One transaction write-locks record 1, reads the 1st value (6) and then requests a write lock on record 2, which is read-locked by the other transaction: it waits.
- The other transaction read-locks record 2, reads the 2nd value (3) and then requests a read lock on record 1, which is write-locked by the first transaction: it waits.
- Record states: 6(W) 3, then 6(W) 3(R,1) from then on.
- Each process is waiting for the other to release a lock!
Deadlock Prevention
- Killing processes: victim selection
- Explicit timestamping of the operations
- Enforcing timeouts of the transactions
- Detection of "waiting for" loops
Diagrammatically (figure: Process 1 and Process 2, Object 1 and Object 2, connected by "locked by" and "waiting for" arrows)
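Detecting a "waiting for" loop amounts to following the chain of waiting transactions until it either ends or returns to a transaction already seen. The sketch below makes the simplifying assumption that each blocked transaction waits for exactly one other transaction; `find_deadlock` and the dictionary layout are illustrative.

```python
def find_deadlock(waits_for):
    """waits_for maps each blocked transaction to the transaction it is waiting for.

    Returns the transactions on a waiting loop, or None if there is no loop.
    """
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:                 # the chain came back to a seen transaction
            return seen[seen.index(current):]
    return None


print(find_deadlock({"A": "B", "B": "A"}))  # ['A', 'B']: each waits for the other
print(find_deadlock({"A": "B"}))            # None: B is not waiting for anyone
```

Once a loop is found, one of the transactions on it can be chosen as the victim and rolled back so that the others can proceed.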
5. Optimistic Strategy for Concurrent Transaction Management
- No checking while the transaction is executing; check for conflicts after the transaction.
- Checks are all made at once, so transaction execution overhead is low
- Relies on little interference between transactions
- Updates are not applied until the end of the transaction
- Updates are applied to local copies in transaction space
Phases in optimistic strategy
1. read phase: read from the database, but updates are applied only to local copies
2. validation phase: check to ensure serialisability will not be violated if the transaction updates are actually applied to the database
3. write phase: if validation is successful, transaction updates are applied to the database; otherwise updates are discarded and the transaction is aborted and restarted
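The three phases can be sketched with a per-item version counter. This is one possible validation rule, chosen here as an assumption: a transaction passes validation only if none of the items it read has been changed by another committed transaction in the meantime. The classes `OptimisticDB` and `Txn` are hypothetical.

```python
class OptimisticDB:
    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}   # bumped on every committed write


class Txn:
    def __init__(self, db):
        self.db = db
        self.read_versions = {}   # item -> version seen during the read phase
        self.local = {}           # local copies of updated items

    def read(self, key):
        if key in self.local:
            return self.local[key]
        self.read_versions.setdefault(key, self.db.version[key])
        return self.db.data[key]

    def write(self, key, value):
        self.local[key] = value   # read phase: only the local copy changes

    def commit(self):
        # Validation phase: has anything we read been overwritten since?
        for key, seen in self.read_versions.items():
            if self.db.version[key] != seen:
                self.local.clear()                # discard updates; caller restarts
                return False
        # Write phase: apply the local copies to the database.
        for key, value in self.local.items():
            self.db.data[key] = value
            self.db.version[key] = self.db.version.get(key, 0) + 1
        return True


db = OptimisticDB({"x": 6})
a, b = Txn(db), Txn(db)
a.write("x", a.read("x") + 2)
b.write("x", b.read("x") + 3)
print(a.commit())      # True: x becomes 8
print(b.commit())      # False: b read x before a's update, so it is aborted
retry = Txn(db)
retry.write("x", retry.read("x") + 3)
print(retry.commit(), db.data["x"])   # True 11
```

With little interference most transactions pass validation on the first attempt, which is where the low overhead of the strategy comes from; under heavy contention the restarts dominate.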