The Concept of Transaction Processing
Dr. Mohamed Osman Hegazi
- A transaction: a logical unit of database processing that includes one or more access operations (read: retrieval; write: insert or update; delete).
- A transaction (a set of operations) may be stand-alone, specified in a high-level language like SQL, or may be embedded within a program.
- Transaction boundaries: Begin transaction and End transaction.
- An application program may contain several transactions separated by the Begin and End transaction boundaries.
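As a rough illustration of transaction boundaries inside a program, the sketch below runs two transactions against an in-memory SQLite database using Python's standard sqlite3 module. The account table, the amounts, and the choice of SQLite are assumptions made for this example; SQLite spells the boundaries BEGIN and COMMIT (or ROLLBACK).

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are issued explicitly as SQL statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")

# Transaction 1: a transfer that completes and is committed.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
conn.execute("COMMIT")

# Transaction 2: an update that is abandoned and rolled back,
# so none of its changes reach the database.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
conn.execute("ROLLBACK")

# Collect the final balances as {id: balance}.
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(balances)
```

Only the committed transfer is visible afterwards; the rolled-back update leaves no trace.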
Properties of Transactions
ACID properties:
- Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.
- Consistency preservation: A correct execution of the transaction must take the database from one consistent state to another.
- Isolation: A transaction should not make its updates visible to other transactions until it is committed.
- Durability or permanency: Once a transaction changes the database and the changes are committed, these changes must never be lost because of subsequent failure.
Basic operations are read and write
- read_item(X): Reads a database item named X into a program variable X.
- write_item(X): Writes the value of program variable X into the database item named X.
The read_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
3. Copy item X from the buffer to the program variable named X.
The write_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if that disk block is not already in some main memory buffer).
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
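The steps above can be sketched with plain dictionaries standing in for the disk and the main-memory buffer. The names disk, directory, buffer_pool, fetch_block, and flush are hypothetical helpers for this illustration, and the block layout is an assumption.

```python
# Simulated disk: block address -> block contents (item name -> value).
disk = {0: {"X": 20, "Y": 30}, 1: {"Z": 5}}
# Hypothetical directory mapping each item to the block that holds it.
directory = {"X": 0, "Y": 0, "Z": 1}
# Blocks currently held in main-memory buffers.
buffer_pool = {}


def fetch_block(item):
    """Steps 1 and 2: find the block address and buffer the block if needed."""
    addr = directory[item]
    if addr not in buffer_pool:
        buffer_pool[addr] = dict(disk[addr])  # copy disk block into a buffer
    return addr


def read_item(item):
    """Step 3 of read_item: copy the item from the buffer to the caller."""
    addr = fetch_block(item)
    return buffer_pool[addr][item]


def flush(addr):
    """Step 4 of write_item: store the buffered block back to disk."""
    disk[addr] = dict(buffer_pool[addr])


def write_item(item, value, flush_now=False):
    """Step 3 of write_item: update the buffer; optionally flush immediately."""
    addr = fetch_block(item)
    buffer_pool[addr][item] = value
    if flush_now:
        flush(addr)


x_value = read_item("X")          # 20, read through the buffer
write_item("X", x_value + 10)     # the buffer now holds 30
disk_before_flush = disk[0]["X"]  # the disk still holds the old value
flush(0)                          # the deferred write reaches the disk
```

The gap between updating the buffer and flushing the block is what makes the "immediately or at some later point in time" wording matter for recovery.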
Concurrency control
- Single-User System: At most one user at a time can use the system.
- Multiuser System: Many users can access the system concurrently.
Concurrency
- Interleaved processing: concurrent execution of processes is interleaved in a single CPU.
- Parallel processing: processes are concurrently executed in multiple CPUs.
Why Concurrency Control is needed:
a) The Lost Update Problem. This occurs when two transactions that access the same database items have their operations interleaved in a way that makes the value of some database item incorrect.
b) The Temporary Update (or Dirty Read) Problem. This occurs when one transaction updates a database item and then the transaction fails for some reason. The updated item is accessed by another transaction before it is changed back to its original value.
c) The Incorrect Summary Problem. If one transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.
(a) The lost update problem.
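A minimal sketch of one interleaving that loses an update. The starting value X = 80 and the two changes (T1 subtracts 5, T2 adds 4) are assumed for illustration only.

```python
db = {"X": 80}

# T1 intends X := X - 5 and T2 intends X := X + 4.
t1_x = db["X"]   # T1: read_item(X) sees 80
t1_x -= 5        # T1 computes 75 in its local variable
t2_x = db["X"]   # T2: read_item(X) also sees 80, before T1 has written
t2_x += 4        # T2 computes 84 in its local variable
db["X"] = t1_x   # T1: write_item(X) stores 75
db["X"] = t2_x   # T2: write_item(X) stores 84 and overwrites T1's update

interleaved = db["X"]
serial = 80 - 5 + 4  # result of running T1 and T2 one after the other
print(interleaved, serial)
```

The interleaved run ends with 84 while either serial order gives 79, so T1's update is lost.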
(b) The temporary update problem (Dirty Read).
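A minimal sketch of a dirty read, again with assumed values: T1 lowers X from 80 to 75 and then fails, while T2 reads X in between.

```python
db = {"X": 80}

before_image = db["X"]   # value of X before T1 touches it
db["X"] = db["X"] - 5    # T1: write_item(X) stores 75, not yet committed
t2_seen = db["X"]        # T2: read_item(X) sees the uncommitted value 75
db["X"] = before_image   # T1 fails, so X is changed back to 80

print(t2_seen, db["X"])
```

T2 continues with the value 75, which never existed in any committed state of the database.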
(c) The incorrect summary problem.
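A minimal sketch of an incorrect summary. The two items A and B, their starting values, and the transfer of 10 are assumptions for this example: T1 moves 10 from A to B while T3 sums both items.

```python
db = {"A": 100, "B": 100}

db["A"] = db["A"] - 10   # T1: subtracts 10 from A (first half of a transfer)
t3_sum = db["A"]         # T3: reads A after T1's update (90)
t3_sum += db["B"]        # T3: reads B before T1's update (100)
db["B"] = db["B"] + 10   # T1: adds 10 to B (second half of the transfer)

print(t3_sum, db["A"] + db["B"])
```

T3 reports 190 although the true total is 200 both before and after the transfer.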
Recovery
Why recovery is needed (what causes a transaction to fail):
1. A computer failure (system crash): A hardware or software error occurs in the computer system during transaction execution.
2. A transaction or system error: Some operation in the transaction may cause it to fail, such as integer overflow or division by zero. Transaction failure may also occur because of erroneous parameter values or because of a logical programming error. In addition, the user may interrupt the transaction during its execution.
3. Local errors or exception conditions detected by the transaction:
   - Certain conditions necessitate cancellation of the transaction. For example, data for the transaction may not be found, or a condition such as insufficient account balance in a banking database may arise.
   - A programmed abort in the transaction causes it to fail.
4. Concurrency control enforcement: The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read or write malfunction or because of a disk read/write head crash.
6. Physical problems: This refers to an endless list of problems that includes power or air-conditioning failure, fire, theft, sabotage, and overwriting disks by mistake.
Transaction states.
Transaction and System Concepts
The recovery manager keeps track of the following operations:
- begin_transaction: This marks the beginning of transaction execution.
- read or write: These specify read or write operations on the database items that are executed as part of a transaction.
- end_transaction: This specifies that read and write transaction operations have ended and marks the end limit of transaction execution. At this point it may be necessary to check whether the changes introduced by the transaction can be permanently applied to the database or whether the transaction has to be aborted because it violates concurrency control or for some other reason.
- commit_transaction: This signals a successful end of the transaction so that any changes (updates) executed by the transaction can be safely committed to the database and will not be undone.
- rollback (or abort): This signals that the transaction has ended unsuccessfully, so that any changes or effects that the transaction may have applied to the database must be undone.
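One way to picture how these operations drive a transaction through its states is a small transition table. The state names used here (active, partially committed, committed, failed, terminated) follow the common textbook state diagram and are an assumption, since the slides do not list them; the "terminate" event is a hypothetical name for leaving the system.

```python
# (current state, operation) -> next state
TRANSITIONS = {
    ("active", "read"): "active",
    ("active", "write"): "active",
    ("active", "end_transaction"): "partially committed",
    ("active", "abort"): "failed",
    ("partially committed", "commit_transaction"): "committed",
    ("partially committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def run(operations):
    """Apply operations in order and return the final state."""
    state = "active"  # begin_transaction puts the transaction in this state
    for op in operations:
        key = (state, op)
        if key not in TRANSITIONS:
            raise ValueError(f"operation {op!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run(["read", "write", "end_transaction", "commit_transaction"]))
print(run(["read", "write", "abort"]))
```

Note that an abort is still possible after end_transaction, which matches the check described for that operation above.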
Recovery techniques
Recovery techniques use the following operators:
- undo: Similar to rollback except that it applies to a single operation rather than to a whole transaction.
- redo: This specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database.
The System Log
- The log is kept on disk, so it is not affected by any type of failure except for disk or catastrophic failure.
- In addition, the log is periodically backed up to archival storage (tape) to guard against such catastrophic failures.
Transaction Log
- For recovery from any type of failure, the data values prior to modification (BFIM: BeFore IMage) and the new values after modification (AFIM: AFter IMage) are required.
- These values and other information are stored in a sequential file called the transaction log.
Commit Point of a Transaction
- Definition: A transaction T reaches its commit point when all its operations that access the database have been executed successfully and the effect of all the transaction operations on the database has been recorded in the log. Beyond the commit point, the transaction is said to be committed, and its effect is assumed to be permanently recorded in the database. The transaction then writes an entry [commit,T] into the log.
- Rollback of transactions: Needed for transactions that have a [start_transaction,T] entry in the log but no commit entry [commit,T] in the log.
- Redoing transactions: Transactions that have written their commit entry in the log must also have recorded all their write operations in the log; otherwise they would not be committed, so their effect on the database can be redone from the log entries.
- Force-writing the log: Before a transaction reaches its commit point, any portion of the log that has not been written to the disk yet must now be written to the disk. This process is called force-writing the log file before committing a transaction.
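The rollback and redo rules can be sketched with a log held as a Python list. The tuple layout for write entries (operation, transaction, item, BFIM, AFIM), the sample values, and the recover function are assumptions for this illustration; only the [start_transaction,T] and [commit,T] entries are named in the slides.

```python
# A log as it might look at the moment of a crash: T1 committed, T2 did not.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 20, 50),   # BFIM = 20, AFIM = 50
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 30, 80),   # BFIM = 30, AFIM = 80
    ("commit", "T1"),
]


def recover(log, db):
    """Undo writes of uncommitted transactions, then redo committed ones."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}

    # Undo: scan the log backward and restore before-images.
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] not in committed:
            db[entry[2]] = entry[3]

    # Redo: scan the log forward and reapply after-images.
    for entry in log:
        if entry[0] == "write_item" and entry[1] in committed:
            db[entry[2]] = entry[4]
    return db


# Assumed disk state after the crash: T1's write never reached the disk,
# while T2's uncommitted write did.
db_after_crash = {"X": 20, "Y": 80}
recovered = recover(log, dict(db_after_crash))
print(recovered)
```

After recovery X carries T1's committed value and Y is back at its before-image, which is only possible because the log was force-written before T1 committed.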
Database Concurrency Control
1 Purpose of Concurrency Control
- To enforce isolation (through mutual exclusion) among conflicting transactions.
- To preserve database consistency through consistency-preserving execution of transactions.
- To resolve read-write and write-write conflicts.
Example: In a concurrent execution environment, if T1 conflicts with T2 over a data item A, then the existing concurrency control decides whether T1 or T2 should get A and whether the other transaction is rolled back or waits.
Database Concurrency Control
Two-Phase Locking Techniques
- Locking is an operation that secures (a) permission to read or (b) permission to write a data item for a transaction. Example: Lock (X). Data item X is locked on behalf of the requesting transaction.
- Unlocking is an operation that removes these permissions from the data item. Example: Unlock (X). Data item X is made available to all other transactions.
- Lock and Unlock are atomic operations.
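A minimal sketch of a lock table with read and write permissions. The LockTable class and its method names are hypothetical, and the policy shown (many readers or one writer per item) is the usual shared/exclusive scheme, assumed here. A False return means the requester would have to wait; a real lock manager would also make each call atomic, which this single-threaded sketch gets for free.

```python
class LockTable:
    """Tracks which transactions hold read or write locks on each item."""

    def __init__(self):
        self.readers = {}  # item -> set of transactions holding a read lock
        self.writer = {}   # item -> transaction holding the write lock

    def read_lock(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another transaction is writing the item
        self.readers.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        holder = self.writer.get(item)
        other_readers = self.readers.get(item, set()) - {txn}
        if (holder is not None and holder != txn) or other_readers:
            return False  # the item is in use by another transaction
        self.writer[item] = txn
        return True

    def unlock(self, txn, item):
        self.readers.get(item, set()).discard(txn)
        if self.writer.get(item) == txn:
            del self.writer[item]


table = LockTable()
r1 = table.read_lock("T1", "X")    # granted
r2 = table.read_lock("T2", "X")    # granted: read locks can be shared
w1 = table.write_lock("T1", "X")   # refused: T2 still reads X
table.unlock("T2", "X")
w2 = table.write_lock("T1", "X")   # granted: T1 is now the only user of X
r3 = table.read_lock("T2", "X")    # refused: T1 holds the write lock
print(r1, r2, w1, w2, r3)
```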
Database Concurrency Control
Two-Phase Locking Techniques: The algorithm

T1                 T2                 Result
read_lock (Y);     read_lock (X);     Initial values: X=20; Y=30
read_item (Y);     read_item (X);     Result of serial execution
unlock (Y);        unlock (X);        T1 followed by T2:
write_lock (X);    write_lock (Y);    X=50, Y=80.
read_item (X);     read_item (Y);     Result of serial execution
X:=X+Y;            Y:=X+Y;            T2 followed by T1:
write_item (X);    write_item (Y);    X=70, Y=50
unlock (X);        unlock (Y);
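The two serial results in the table can be checked with a few lines of Python. The functions t1, t2, and run_serial are names chosen for this sketch; locking is left out because a serial execution has no interleaving to protect against.

```python
def t1(db):
    """T1: reads Y, then sets X := X + Y."""
    db["X"] = db["X"] + db["Y"]


def t2(db):
    """T2: reads X, then sets Y := X + Y."""
    db["Y"] = db["X"] + db["Y"]


def run_serial(order):
    """Run the transactions one after another from the initial values."""
    db = {"X": 20, "Y": 30}
    for txn in order:
        txn(db)
    return db


t1_then_t2 = run_serial([t1, t2])
t2_then_t1 = run_serial([t2, t1])
print(t1_then_t2, t2_then_t1)
```

Both outcomes are acceptable results for a serializable schedule; any other final state signals a nonserializable interleaving.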
Database Concurrency Control
Two-Phase Locking Techniques: The algorithm
(time runs from top to bottom)

T1                 T2                 Result
read_lock (Y);                        X=50; Y=50
read_item (Y);                        Nonserializable because it
unlock (Y);                           violated two-phase policy.
                   read_lock (X);
                   read_item (X);
                   unlock (X);
                   write_lock (Y);
                   read_item (Y);
                   Y:=X+Y;
                   write_item (Y);
                   unlock (Y);
write_lock (X);
read_item (X);
X:=X+Y;
write_item (X);
unlock (X);
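A minimal sketch of what "violated two-phase policy" means, assuming the usual definition that a transaction may not request any lock after it has released one. The function is_two_phase and the reordered variant T1_2PL are illustrations, not part of the slides. The second half replays the interleaved schedule above to confirm the result X=50, Y=50.

```python
def is_two_phase(ops):
    """Return True if no lock request follows the first unlock."""
    unlocked = False
    for op, _item in ops:
        if op == "unlock":
            unlocked = True
        elif op in ("read_lock", "write_lock") and unlocked:
            return False
    return True


# T1 exactly as written on the slide: it unlocks Y before locking X.
T1 = [("read_lock", "Y"), ("read_item", "Y"), ("unlock", "Y"),
      ("write_lock", "X"), ("read_item", "X"), ("write_item", "X"),
      ("unlock", "X")]

# A reordered T1 that acquires both locks before releasing either.
T1_2PL = [("read_lock", "Y"), ("read_item", "Y"), ("write_lock", "X"),
          ("unlock", "Y"), ("read_item", "X"), ("write_item", "X"),
          ("unlock", "X")]

# Replay of the interleaved schedule from the slide.
db = {"X": 20, "Y": 30}
t1_y = db["Y"]           # T1: read_item(Y) = 30, then unlock(Y)
t2_x = db["X"]           # T2: read_item(X) = 20, then unlock(X)
t2_y = db["Y"]           # T2: read_item(Y) = 30 under its write lock
db["Y"] = t2_x + t2_y    # T2: Y := X + Y, using the X it read earlier
t1_x = db["X"]           # T1: read_item(X) = 20 under its write lock
db["X"] = t1_x + t1_y    # T1: X := X + Y, using the stale Y it read earlier

print(is_two_phase(T1), is_two_phase(T1_2PL), db)
```

The final state matches neither serial result (X=50, Y=80 or X=70, Y=50), which is why the schedule is nonserializable.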
Database Concurrency Control
Dealing with Deadlock
Deadlock

T’1                              T’2
read_lock (Y);
read_item (Y);
                                 read_lock (X);
                                 read_item (X);
write_lock (X); (waits for X)
                                 write_lock (Y); (waits for Y)

T’1 and T’2 followed the two-phase policy, but they are deadlocked.
Deadlock (T’1 and T’2)
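The schedule above can be replayed against a tiny lock table to see who ends up waiting for whom. The request function and the held and waits dictionaries are hypothetical names for this sketch, and the conflict rule (a write request conflicts with any other holder, a read request conflicts with a writer) is the usual shared/exclusive assumption.

```python
held = {}    # item -> {transaction: lock mode}
waits = {}   # blocked transaction -> set of transactions it waits for


def request(txn, mode, item):
    """Grant the lock if compatible; otherwise record whom txn waits for."""
    holders = held.setdefault(item, {})
    conflicting = {
        other for other, other_mode in holders.items()
        if other != txn and (mode == "write" or other_mode == "write")
    }
    if conflicting:
        waits[txn] = conflicting
        return False
    holders[txn] = mode
    return True


schedule = [
    ("T'1", "read", "Y"),
    ("T'2", "read", "X"),
    ("T'1", "write", "X"),   # blocked: T'2 holds a read lock on X
    ("T'2", "write", "Y"),   # blocked: T'1 holds a read lock on Y
]
results = [request(*step) for step in schedule]
print(results, waits)
```

Each transaction is blocked on a lock the other holds, and neither will release it, so both wait forever.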
Database Concurrency Control (Dealing with Deadlock)
Deadlock prevention
- A transaction locks all data items it refers to before it begins execution.
- This way of locking prevents deadlock since a transaction never waits for a data item.
- The conservative two-phase locking uses this approach.
Deadlock detection and resolution
- The scheduler maintains a wait-for graph to detect cycles. If a cycle exists, then one transaction involved in the cycle is selected (the victim) and rolled back.
- A wait-for graph is created using the lock table. As soon as a transaction is blocked, it is added to the graph.
Deadlock avoidance
- There are many variations of the two-phase locking algorithm. Some avoid deadlock by not letting the cycle complete. That is, as soon as the algorithm discovers that blocking a transaction is likely to create a cycle, it rolls back the transaction.
- The Wound-Wait and Wait-Die algorithms use timestamps to avoid deadlocks by rolling back a victim.
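A minimal sketch of both ideas, under stated assumptions. find_cycle is a plain depth-first search over a wait-for graph given as a dictionary (blocked transaction -> transactions it waits for). wait_die and wound_wait encode the usual textbook rules, which the slides name but do not spell out: a smaller timestamp means an older transaction, and the returned strings are labels invented for this example.

```python
def find_cycle(wait_for):
    """Return the transactions on one cycle of the graph, or None."""
    path, finished = [], set()

    def visit(txn):
        if txn in finished:
            return None
        if txn in path:
            return path[path.index(txn):]  # the cycle closes at txn
        path.append(txn)
        for holder in wait_for.get(txn, ()):
            cycle = visit(holder)
            if cycle:
                return cycle
        path.pop()
        finished.add(txn)
        return None

    for txn in list(wait_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


def wait_die(ts_requester, ts_holder):
    """Wait-Die: an older requester waits; a younger requester is aborted."""
    return "wait" if ts_requester < ts_holder else "abort requester"


def wound_wait(ts_requester, ts_holder):
    """Wound-Wait: an older requester aborts the holder; a younger one waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"


deadlocked = find_cycle({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_cycle({"T1": {"T2"}, "T2": set()})
print(deadlocked, no_deadlock)
print(wait_die(1, 2), wait_die(2, 1))
print(wound_wait(1, 2), wound_wait(2, 1))
```

In both timestamp schemes it is always the younger transaction that gets rolled back, so a cycle of mutual waiting can never form.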
Database Concurrency Control
Timestamp-based concurrency control algorithm
- Timestamp: A monotonically increasing variable (integer) indicating the age of an operation or a transaction. A larger timestamp value indicates a more recent event or operation.
- The timestamp-based algorithm uses timestamps to serialize the execution of concurrent transactions.
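As one possible illustration, the sketch below follows the basic timestamp ordering rules found in most textbooks; the slides do not give the rules, so the read_ts and write_ts fields, the Abort exception, and the functions to_read and to_write are all assumptions. Each item remembers the largest timestamp that has read it and the timestamp that last wrote it, and an operation that arrives "too late" aborts its transaction.

```python
class Abort(Exception):
    """Raised when an operation would violate timestamp order."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0    # largest timestamp that has read the item
        self.write_ts = 0   # timestamp of the last writer


def to_read(item, ts):
    """Read is rejected if a younger transaction already wrote the item."""
    if item.write_ts > ts:
        raise Abort(f"read by timestamp {ts} arrives too late")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def to_write(item, ts, value):
    """Write is rejected if a younger transaction already read or wrote it."""
    if item.read_ts > ts or item.write_ts > ts:
        raise Abort(f"write by timestamp {ts} arrives too late")
    item.value = value
    item.write_ts = ts


x = Item(20)
seen = to_read(x, ts=2)          # transaction with timestamp 2 reads X
try:
    to_write(x, ts=1, value=99)  # the older transaction tries to write afterwards
    aborted = False
except Abort:
    aborted = True               # rejected: X was already read at timestamp 2
to_write(x, ts=3, value=50)      # a younger transaction may still write
print(seen, aborted, x.value, x.read_ts, x.write_ts)
```

Unlike locking, nothing ever waits in this scheme, so deadlock cannot occur; the price is that conflicting transactions are aborted and restarted.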
Assignment no. 5
Give a brief comparison between Two-Phase Locking and Timestamp-based concurrency control.