Chapter 9 Transaction Management and Concurrency Control


1 Chapter 9 Transaction Management and Concurrency Control
Database Systems: Design, Implementation and Management, 6th Edition, by Peter Rob & Carlos Coronel

2 What Is a Transaction? A transaction is a logical unit of work that must be either entirely completed or aborted; no intermediate states are acceptable. Most real-world database transactions are formed by two or more database requests. A database request is the equivalent of a single SQL statement in an application program or transaction. A database request involving an update actually involves at least one Read and at least one Write operation. A transaction that changes the contents of the database must alter the database from one consistent database state to another. To ensure consistency of the database, every transaction must begin with the database in a known consistent state.

3 Transaction Examples A transaction includes read and write operations to access the database.
T2 ("Deposit"): read_item(X); X = X + M; write_item(X);
T1 ("Transfer"): read_item(X); X = X - N; write_item(X); read_item(Y); Y = Y + N; write_item(Y);
A bit of terminology: the Read Set is the set of all items a transaction reads ({X} for T2; {X, Y} for T1), and the Write Set is the set of all items a transaction writes ({X} for T2; {X, Y} for T1). A read brings an item from disk into memory; a write updates the disk from memory. Items being read and written can be at various levels of granularity (field, record, disk block); most of what we will talk about is the same regardless of granularity. Here the granularity is one field in one record. Each of these transactions must be all or nothing (more important for T1), and they must not be interfered with by other transactions! Note that a write does not necessarily mean an immediate disk update, for transaction or O.S. reasons; here in T1 we do not want the disk update for item X to happen until we are ready for the disk update for item Y too.
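A minimal Python sketch of these two transactions, assuming an in-memory dict stands in for the stored database and read_item/write_item copy values between "disk" and local variables (the starting balances are made up for illustration):

```python
# Toy model of T1 ("Transfer") and T2 ("Deposit") from the slide.
# The dict `db` stands in for the stored database items X and Y.

db = {"X": 150, "Y": 300}   # hypothetical starting balances

def read_item(name):
    """Bring an item from 'disk' (the db dict) into memory."""
    return db[name]

def write_item(name, value):
    """Write an in-memory value back to 'disk'."""
    db[name] = value

def t1_transfer(n):
    """T1: move N from X to Y. Read set {X, Y}, write set {X, Y}."""
    x = read_item("X")
    x = x - n
    write_item("X", x)
    y = read_item("Y")
    y = y + n
    write_item("Y", y)

def t2_deposit(m):
    """T2: add M to X. Read set {X}, write set {X}."""
    x = read_item("X")
    x = x + m
    write_item("X", x)

t2_deposit(50)
t1_transfer(100)
print(db)   # {'X': 100, 'Y': 400} when the two run one after the other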

4 What Is a Transaction? Evaluating Transaction Results
An accountant wishes to register the credit sale of 100 units of product X to customer Y in the amount of $500.00. This requires two steps: reducing product X's quantity on hand by 100, and adding $500.00 to customer Y's accounts receivable.
UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X';
UPDATE ACCREC SET AR_BALANCE = AR_BALANCE + 500 WHERE AR_NUM = 'Y';
If the above two updates are not both completely executed, the transaction yields an inconsistent database, and garbage is worse than no data. The same applies to a bank account balance transfer:
UPDATE account SET Balance = Balance + 50 WHERE acctID = 'ABC123UME';
UPDATE account SET Balance = Balance - 50 WHERE acctID = 'ABC123UMESAV';

5 What Is a Transaction? Evaluating Transaction Results
The DBMS does not guarantee that the semantic meaning of the transaction truly represents the real-world event. If we define the transaction to be just this one statement, then the DBMS doesn't know that we have messed up:
UPDATE PRODUCT SET PROD_QOH = PROD_QOH - 100 WHERE PROD_CODE = 'X';

6 What Is a Transaction? Transaction Properties (ACID plus)
Atomicity requires that all operations of a transaction be completed; if not, the transaction is aborted. A transaction is an "atomic unit" that cannot be broken up: it is all or nothing. The DBMS transaction and recovery facilities are used to ensure atomicity.
Consistency preservation: complete execution of a transaction takes the DB from one consistent state to another. Developers must ensure that their transactions are consistency preserving, aided by DBMS facilities for enforcing referential integrity, entity integrity, domain constraints, etc.
Isolation: a transaction should appear as though it is running in isolation; transactions shouldn't interfere with each other. The DBMS concurrency facilities are responsible for preserving isolation, which generally means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed (assuming at least one of the transactions writes the data).
Durability (or permanency): changes made by committed transactions must persist and cannot be lost; this indicates the permanence of the database's consistent state. The DBMS recovery facilities are responsible for ensuring durability.
Serializability: concurrent transactions are treated as if they were executed in serial order (one after another). This property describes the result of the concurrent execution of several transactions and is important in multi-user and distributed databases. Personally, I think the author is mistaken in putting serializability here; other authors just have ACID, and serializability is a means of ensuring isolation.

7 Transaction Management with SQL
ANSI has defined standards that govern SQL database transactions. Transaction support is provided by two SQL statements: COMMIT and ROLLBACK. When a transaction sequence is initiated, it must continue through all succeeding SQL statements until one of the following four events occurs:
A COMMIT statement is reached.
A ROLLBACK statement is reached.
The end of the program is successfully reached (equivalent to COMMIT).
The program is abnormally terminated (equivalent to ROLLBACK).
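From application code the same idea looks like this; a sketch using Python's sqlite3 module, where conn.commit() plays the role of COMMIT and conn.rollback() of ROLLBACK (the table and account IDs are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acct_id TEXT PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES ('CHK', 500.0), ('SAV', 200.0)")
conn.commit()

try:
    # Both updates form one logical unit of work: a $50 transfer.
    conn.execute("UPDATE account SET balance = balance - 50 WHERE acct_id = 'CHK'")
    conn.execute("UPDATE account SET balance = balance + 50 WHERE acct_id = 'SAV'")
    conn.commit()      # make both changes permanent (COMMIT)
except sqlite3.Error:
    conn.rollback()    # undo everything since the last COMMIT (ROLLBACK)
```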

8 The Transaction Log A transaction log keeps track of all transactions that update the database. The information stored in the log is used by the DBMS for a recovery triggered by a ROLLBACK statement, a program crash, or a system failure. The transaction log stores before-and-after data about the database and any of the tables, rows, and attribute values that participated in the transaction; the start of the transaction, reads, writes, commits, and aborts are all noted. The transaction log is itself a database, and it is managed by the DBMS like any other database.

9 The Transaction Log Stores:
A record for the beginning of the transaction
For each transaction component (SQL statement): the type of operation being performed (update, delete, insert); the names of the objects affected by the transaction (the name of the table); the "before" and "after" values for updated fields; and pointers to the previous and next transaction log entries for the same transaction
A record for the ending (COMMIT) of the transaction
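One way to picture a single log entry is as a record carrying exactly the fields listed above. The Python dataclass below is only a sketch of that structure; the field names and sample values are invented for illustration and do not reflect any particular DBMS's log format:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    trx_id: int                      # which transaction this entry belongs to
    operation: str                   # 'START', 'UPDATE', 'DELETE', 'INSERT', 'COMMIT', ...
    table: Optional[str] = None      # object affected, if any
    row_id: Optional[str] = None
    attribute: Optional[str] = None
    before: Optional[str] = None     # value before the change (used to undo)
    after: Optional[str] = None      # value after the change (used to redo)
    prev_ptr: Optional[int] = None   # previous log entry for this transaction
    next_ptr: Optional[int] = None   # next log entry for this transaction

# A hypothetical three-entry log for one committed transaction.
log = [
    LogRecord(101, "START"),
    LogRecord(101, "UPDATE", "PRODUCT", "X", "PROD_QOH", before="125", after="25", prev_ptr=0),
    LogRecord(101, "COMMIT", prev_ptr=1),
]
```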

10 A Snippet of a Transaction Log
Unrealistically simple, because it only contains information from one transaction; but note that it records the table, row, attribute, and the before and after values.

11 Concurrency Control Concurrency control coordinates the simultaneous execution of transactions in a multiprocessing database. The objective of concurrency control is to preserve the isolation of transactions, generally by ensuring the serializability of transactions in a multi-user database environment. Important: simultaneous execution of transactions over a shared database can create several data integrity and consistency problems: lost updates, uncommitted data (dirty reads), and inconsistent retrievals (incorrect summaries).

12 Normal Sequential Execution of Two Transactions
T2 ("Deposit"): read_item(X); X = X + M; write_item(X);
then T1 ("Transfer"): read_item(X); X = X - N; write_item(X); read_item(Y); Y = Y + N; write_item(Y);
Let's suppose that the two transactions happen as pictured here. Suppose account X has a balance of $150, T2 is going to deposit $50, and T1 is going to transfer $100.
T2 reads the amount ($150) and adds $50 (giving $200). T2 writes its value of X ($200) to disk.
T1 reads the NEW amount ($200) and subtracts $100 (giving $100). T1 writes its value of X ($100) to disk, then starts working on account Y.
Account X has a proper balance. Account Y has no potential conflict, so it will be correct: $100 more than it was before.

13 The Lost Update Problem
T1: read_item(X); X = X - N;
T2: read_item(X); X = X + M;
T1: write_item(X); read_item(Y);
T2: write_item(X);
T1: Y = Y + N; write_item(Y);
Let's suppose that there is no transaction handling: everything is left up to the O.S. to interleave processes as it sees fit, and the DBMS does nothing to preserve a transaction's atomicity. Suppose then that things happen as pictured here, with account X having a balance of $150, T2 going to deposit $50, and T1 going to transfer $100.
T1 reads the amount ($150) and subtracts $100 (leaving $50).
T2 reads the OLD amount ($150) and adds $50 (giving $200).
T1 writes its value of X ($50) to disk, then starts working on account Y.
T2 writes ITS value of X ($200) to disk.
The result is account X having $100 too much money: the transfer out has been lost (and with the transfer in working successfully, $100 will have been created out of thin air!). This may seem fluky and hard to believe, but it could happen, and it is something a bank never wants to happen. IT MUST NOT HAPPEN: concurrency must be controlled.
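The interleaving can be replayed step by step in plain Python. The amounts follow the notes above (X starts at $150, T1 transfers $100 out, T2 deposits $50); without any concurrency control the final balance is $200 instead of the correct $100:

```python
db = {"X": 150}          # shared stored balance

# Step 1: T1 reads X and computes its new value in a local variable.
t1_x = db["X"] - 100     # T1 plans to leave 50

# Step 2: T2 reads the OLD value of X before T1 has written.
t2_x = db["X"] + 50      # T2 plans to leave 200

# Step 3: T1 writes its result.
db["X"] = t1_x           # X is now 50

# Step 4: T2 writes its result, overwriting T1's update.
db["X"] = t2_x           # X is now 200 -- T1's withdrawal is lost

print(db["X"])           # 200, but the correct serial result is 100
```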

14 The Dirty Read (Uncommitted Data) Problem
T1: read_item(X); X = X - N; write_item(X);
T2: read_item(X); X = X + M;
T1: read_item(Y); <CRASH> and recovery sets X back to its original value
T2: write_item(X);
Also known as the "temporary update" problem. Again, let's suppose that there is no handling of concurrency (but some handling of recovery): everything is left up to the O.S. to interleave processes as it sees fit, and the DBMS only makes sure an entire transaction is executed successfully. Suppose then that things happen as pictured here, with account X having a balance of $150, T2 going to deposit $50, and T1 going to transfer $100.
T1 reads the amount ($150) and subtracts $100 (leaving $50). T1 writes its value of X ($50) to disk.
T2 reads the amount ($50) and adds $50 (giving $100).
T1 then starts working on account Y, but the program crashes. Recovery, seeing that the whole transaction wasn't completed, backs out the part that was done: it adds $100 back to the stored value of X ($50), giving $150.
T2 writes ITS value of X ($100) to disk.
The result is account X having $100 too little money: the transfer out has taken effect despite its failure (and with the transfer in not occurring, $100 has been lost into thin air!). This is because T2 used "uncommitted data" or "dirty data", data that isn't really real. The update done by T1 was only temporary; when it was backed out, T2 was in the dark and didn't reflect it. While the bank might like this scenario, it can't expect to get away with it: the customer will notice the debit in, say, checking without a corresponding credit in, say, savings. This problem, too, must be avoided.

15 Concurrency Control Inconsistent Retrievals (or Incorrect Summary)
Inconsistent retrievals occur when a transaction calculates some summary (aggregate) functions over a set of data while other transactions are updating the data. Example: T1 calculates the total quantity on hand of the products stored in the PRODUCT table. At the same time, T2 updates the quantity on hand (PROD_QOH) for two of the PRODUCT table's products. The problem is that the summary transaction might read some data before it is changed and other data after it is changed, giving an inconsistent answer.
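A tiny illustration of the effect, with made-up quantities: T2 corrects a data-entry error by moving 10 units from product A to product B (the grand total never changes), but T1's summary reads A before the correction and B after it, so the total matches neither state of the database:

```python
products = {"A": 20, "B": 10}     # correct grand total is 30 at all times

# T1 (summary) reads product A before T2's correction...
total = products["A"]             # reads 20

# ...meanwhile T2 fixes the data-entry error: 10 units recorded
# against A actually belong to B. The grand total is unchanged.
products["A"] -= 10
products["B"] += 10

# ...then T1 reads product B after the correction.
total += products["B"]            # reads 20

print(total)   # 40: matches neither the before state nor the after state (both 30)
```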

16 Retrieval During Update
T2 represents the correction of a typing error. T1 is calculating the total on-hand inventory for all products combined.

17 Transaction Results: Data Entry Correction
This is an example of how things should work: the correct total is 92.

18 Inconsistent Retrievals
This is really BAD luck! But we don't want our managers making decisions based on data that is incorrect, whether it is just because of bad luck or not. We want them making decisions based on correct data. So this needs to be stopped.

19 Concurrency Control The Scheduler
The scheduler (part of the DBMS) establishes the order in which the operations within concurrent transactions are executed. The scheduler interleaves the execution of database operations to make sure that the computer's CPU is used efficiently, while also ensuring serializability and isolation. To determine the appropriate order, the scheduler bases its actions on concurrency control algorithms, such as locking, timestamping, or optimistic methods. Just doing transactions serially, one transaction all of the way through and then another, does not use the CPU efficiently, because whenever a transaction does a disk I/O it has to wait a long time (in comparison to the time needed for calculations). Some transactions can be interleaved easily, if they access different data or only do reads; others need more care to avoid problems such as those we've discussed.

20 Read/Write Conflict Scenarios: Conflicting Database Operations Matrix
If we are dealing with the same data, two transactions that only READ the data do not conflict; but if EITHER one (or both) does a WRITE, then there is a conflict.
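The matrix boils down to one rule, sketched here in Python: two operations conflict when they come from different transactions, touch the same item, and at least one of them is a write:

```python
def conflicts(op1, op2):
    """Each op is a (transaction_id, action, item) tuple, action in {'read', 'write'}."""
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and ("write" in (a1, a2))

print(conflicts((1, "read",  "X"), (2, "read",  "X")))   # False: read/read never conflicts
print(conflicts((1, "read",  "X"), (2, "write", "X")))   # True:  read/write on the same item
print(conflicts((1, "write", "X"), (2, "write", "Y")))   # False: different items
```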

21 Concurrency Control with Locking Methods
Concurrency can be controlled using locks. A lock guarantees exclusive use of a data item to the current transaction. A transaction acquires a lock prior to data access; the lock is released (unlocked) when the transaction is completed. All locking of information is managed by a lock manager. The lock ensures that another transaction does not access the data while it is in an inconsistent state. Locks are taken care of by the DBMS; the user does not have to ask for a lock.

22 Concurrency Control with Locking Methods
Lock Granularity: lock granularity indicates the level of lock use.
Database level (see Figure 9.3)
Table level (see Figure 9.4)
Page level (see Figure 9.5)
Row level (see Figure 9.6)
Field (attribute) level

23 A Database-Level Locking Sequence
The whole database is locked by a transaction; other transactions must wait to use the DB. This is too tight a lock: it results in a serial schedule, one whole transaction after another, with no concurrency.

24 An Example Of A Table-Level Lock
The entire table is locked by a transaction, which is still too tight. The transaction accessing Table B can go on, but a transaction accessing the same table must wait, even if it is accessing different records.

25 An Example Of A Page-Level Lock
A page is a section of a disk (fixed size: 4K, 8K, or more). A page may contain records from more than one table, but probably not all of the records from one table. This is convenient, because data is usually brought into memory a page at a time and written to disk a page at a time. So locking a page means there is a block of data you are working with, and you will probably write that block out to disk before unlocking. This is a commonly used approach in DBMSs today. Note here that T2 is able to go on until it tries to use the same page that T1 is using.

26 An Example Of A Row-Level Lock
If transactions are accessing different rows, they do not conflict with each other and can execute concurrently. A row-level lock only blocks access to the same row. This allows a lot of concurrency, but with a lot of overhead: locks must be managed for each record in each table in the DB. Field-level locks would require even more overhead.

27 Concurrency Control with Locking Methods
Binary Locks A binary lock has only two states: locked (1) or unlocked (0). If an object is locked by a transaction, no other transaction can use that object. If an object is unlocked, any transaction can lock the object for its use. A transaction must unlock the object after its termination. Every transaction requires a lock and an unlock operation for each data item that is accessed (which could be at any level of granularity: table, page, row, and so on). Regardless of the level of locking, a DBMS may use different lock types: binary (discussed now) or shared/exclusive.

28 An Example Of A Binary Lock
Here, if T2 were to start at time 2.5, it would try to lock PRODUCT, be rejected, and have to wait. It can't go on until T1 unlocks PRODUCT. This eliminates the lost update problem, but with a loss in the amount of concurrency. Here we may have to live with that, as it is necessary to ensure consistency, but this approach will also force two transactions that only READ the data to wait on each other.
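A minimal sketch of a binary lock table in Python: each item is simply locked or unlocked, and a second transaction's lock request on a locked item is refused (a real lock manager would place the requester in a wait queue rather than raise an error):

```python
class BinaryLockManager:
    def __init__(self):
        self.locks = {}            # item -> transaction id currently holding the lock

    def lock(self, trx, item):
        if item in self.locks:
            # In a real DBMS the transaction would be enqueued to wait.
            raise RuntimeError(f"T{trx} must wait: {item} is locked by T{self.locks[item]}")
        self.locks[item] = trx

    def unlock(self, trx, item):
        if self.locks.get(item) == trx:
            del self.locks[item]

mgr = BinaryLockManager()
mgr.lock(1, "PRODUCT")        # T1 locks PRODUCT
# mgr.lock(2, "PRODUCT")      # T2 would be refused here and have to wait
mgr.unlock(1, "PRODUCT")      # now T2 could acquire the lock
```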

29 Concurrency Control with Locking Methods
Exclusive Locks An exclusive lock exists when access is specifically reserved for the transaction that locked the object. The exclusive lock must be used when the potential for conflict exists. An exclusive lock is issued when a transaction wants to write (update) a data item and no locks are currently held on that data item. It works like a binary lock in that it keeps all other accesses out.

30 Concurrency Control with Locking Methods
Shared Locks A shared lock exists when concurrent transactions are granted READ access on the basis of a common lock. A shared lock produces no conflict as long as the concurrent transactions are read only. A shared lock is issued when a transaction wants to read data from the database and no exclusive lock is held on that data item. With shared/exclusive locks, a lock can have one of three states: unlocked, shared (READ), or exclusive (WRITE). In some schemes, a lock can be upgraded from shared to exclusive or downgraded from exclusive to shared.
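The three lock states and their compatibility rule can be sketched like this: shared requests are compatible with other shared locks, while an exclusive request is compatible with nothing (upgrades and wait queues are left out of this illustration):

```python
class SharedExclusiveLock:
    def __init__(self):
        self.mode = None       # None (unlocked), 'S' (shared), or 'X' (exclusive)
        self.holders = set()   # transactions currently holding the lock

    def can_grant(self, mode):
        if self.mode is None:
            return True                      # unlocked: grant anything
        if self.mode == "S" and mode == "S":
            return True                      # read/read does not conflict
        return False                         # any combination involving 'X' conflicts

    def acquire(self, trx, mode):
        if not self.can_grant(mode):
            return False                     # caller must wait
        self.mode = mode
        self.holders.add(trx)
        return True

lock_x = SharedExclusiveLock()
print(lock_x.acquire(1, "S"))   # True:  T1 reads X
print(lock_x.acquire(2, "S"))   # True:  T2 may also read X
print(lock_x.acquire(3, "X"))   # False: T3 must wait to write X
```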

31 Concurrency Control with Locking Methods
Potential Problems with Locks: The resulting transaction schedule may not be serializable. The schedule may create deadlocks. Solutions: two-phase locking for the serializability problem, and deadlock detection and prevention techniques for the deadlock problem. (A deadlock arises when one transaction is waiting on a lock held by another transaction, and that transaction in turn stalls because it is waiting on a lock held by the first transaction.)

32 Concurrency Control with Locking Methods
Two-Phase Locking The two-phase locking protocol defines how transactions acquire and relinquish locks. It guarantees serializability, but it does not prevent deadlocks. In the growing phase, a transaction acquires all the required locks without unlocking any data. Once all locks have been acquired, the transaction is at its locked point. In the shrinking phase, a transaction releases its locks and cannot obtain any new locks. All locks that are going to be acquired must be acquired before any are released. No data can be affected until the locked point (to avoid dirty reads). This reduces concurrency, because locks are held when they are not needed: transactions must lock earlier and/or release later than under less conservative approaches. Even if a transaction is finished with item Z, it must keep Z locked if it needs to lock another item later (or lock the later item now so that Z can be released now). Either way, items are locked when not needed, keeping other transactions from accessing them.
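A sketch of the rule from the transaction's point of view: once any lock has been released, the transaction may not acquire another. This class only tracks the growing/shrinking phases for illustration; it is not any particular DBMS's lock manager interface:

```python
class TwoPhaseLockingTransaction:
    """Tracks the growing and shrinking phases for one transaction (illustrative only)."""
    def __init__(self, trx_id):
        self.trx_id = trx_id
        self.held = set()
        self.shrinking = False    # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            # Growing phase is over: taking a new lock would violate 2PL.
            raise RuntimeError("2PL violation: cannot acquire a lock after releasing one")
        self.held.add(item)       # a real DBMS would call the lock manager here

    def unlock(self, item):
        self.shrinking = True     # first release means the locked point has passed
        self.held.discard(item)

t = TwoPhaseLockingTransaction(1)
t.lock("X"); t.lock("Y")     # growing phase: acquire everything needed
t.unlock("X")                # shrinking phase begins
# t.lock("Z")                # would raise: new locks are no longer allowed
t.unlock("Y")
```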

33 How A Deadlock Condition Is Created
Deadlocks (Deadly Embrace): two transactions are waiting for each other to unlock data. Deadlocks exist when two transactions T1 and T2 exist in the following mode: T1 accesses data items X and Y; T2 accesses data items Y and X. If T1 has not unlocked data item Y, T2 cannot continue; and if T2 has not unlocked data item X, T1 cannot continue. (See Table 9.11)

34 Concurrency Control with Locking Methods
Three Techniques to Control Deadlocks:
Deadlock prevention: a transaction requesting a new lock is aborted if there is a possibility that a deadlock can occur; it is restarted later.
Deadlock detection: the DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the transactions (the "victim") is aborted, and the other transaction continues.
Deadlock avoidance: the transaction must obtain all the locks it needs before it can be executed. This essentially means that transactions working with the same data must be executed serially, decreasing concurrency.
Which to use depends on the environment and the system requirements (how fast a response time is needed).
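Deadlock detection is often described in terms of a wait-for graph: one node per transaction, an edge from the waiting transaction to the lock holder, and a deadlock whenever the graph contains a cycle. A small sketch of that check:

```python
def has_deadlock(wait_for):
    """wait_for maps a transaction to the set of transactions it is waiting on.
    Returns True if the wait-for graph contains a cycle (a deadlock)."""
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wait_for.get(node, ()):
            if nxt in on_stack:
                return True                     # found a cycle back into the path
            if nxt not in visited and dfs(nxt):
                return True
        on_stack.discard(node)
        return False

    return any(dfs(t) for t in wait_for if t not in visited)

# T1 waits for T2 (which holds Y), and T2 waits for T1 (which holds X): deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False
```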

35 Database Recovery Management
Recovery restores a database from a given state, usually inconsistent, to a previously consistent state. Recovery techniques are based on the atomic transaction property: all portions of the transaction must be treated as a single logical unit of work, and all operations must be applied and completed to produce a consistent database. If, for some reason, any transaction operation cannot be completed, the transaction must be aborted, and any changes to the database must be rolled back. Recovery reverses the changes made up to the time the transaction was aborted.

36 Database Recovery Management
Levels of Backup:
Full backup of the database: backs up or dumps the whole database.
Differential backup of the database: only the modifications made since the last backup are copied.
Backup of the transaction log only: backs up all the transaction log operations that are not reflected in a previous backup copy of the database.
The backup is kept in a secure place, hopefully in a different building.

37 Database Recovery Management
Database Failures:
Software: operating system, DBMS, application programs, viruses
Hardware: memory chip errors, disk crashes, bad disk sectors, disk-full errors
Programming exemptions: application programs, end users
Transactions: deadlocks
External: fire, earthquake, flood

38 Transaction Recovery Makes use of deferred-write and write-through
Deferred write (or deferred update):
Transaction operations do not immediately update the physical database.
Only the transaction log is updated.
The database is physically updated only after the transaction reaches its commit point, using the transaction log information.

39 Transaction Recovery (continued)
Write-through:
The database is immediately updated by transaction operations during the transaction's execution, even before the transaction reaches its commit point.
The transaction log is also updated.
If a transaction fails, the DBMS uses the log information to roll back the database.
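Under write-through, undoing a failed transaction means walking its log entries in reverse and restoring each "before" value. A minimal sketch of that idea, with invented items and amounts (the $150/$50 figures echo the earlier transfer example):

```python
db = {"X": 50, "Y": 300}   # X was already written through by the failed transaction

# Log entries for the failed transaction, oldest first: (item, before, after).
log = [
    ("X", 150, 50),        # the transaction wrote X = 50 before crashing
]

# ROLLBACK: apply the 'before' images in reverse order.
for item, before, _after in reversed(log):
    db[item] = before

print(db)   # {'X': 150, 'Y': 300} -- back to the last consistent state
```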

40 A Transaction Log for Transaction Recovery Examples
< IF it looks like we’re going to get this far, need to fill in explanation from p460-2 >

41 End Chapter 9

