Lecture 11 Distributed Transaction Management
Concurrency Control The problem of synchronizing concurrent transactions such that the consistency of the database is maintained while, at the same time, the maximum degree of concurrency is achieved. Anomalies: –Lost updates: the effects of some transactions are not reflected in the database. –Inconsistent retrievals: a transaction that reads the same data item more than once may see different values, although it should always read the same value.
Execution History (or Schedule) An order in which the operations of a set of transactions are executed. A history (schedule) can be defined as a partial order over the operations of a set of transactions.
H1 = {W2(x), R1(x), R3(x), W1(x), C1, W2(y), R3(y), R2(z), C2, R3(z), C3}
T1: Read(x), Write(x), Commit
T2: Write(x), Write(y), Read(z), Commit
T3: Read(x), Read(y), Read(z), Commit
Formalization of History A complete history over a set of transactions T = {T1, …, Tn} is a partial order Hc(T) = {ΣT, ≺H} where
–ΣT = ∪i Σi, for i = 1, 2, …, n
–≺H ⊇ ∪i ≺Ti, for i = 1, 2, …, n
–For any two conflicting operations Oij, Okl ∈ ΣT, either Oij ≺H Okl or Okl ≺H Oij
Example Consider the two transactions:
Example Cont. which can be specified as a DAG
History as a List It is quite common to specify a history as a listing of the operations in ΣT, where their execution order is given by their order in this list.
History (or Schedule) A history is defined as a prefix of a complete history. A prefix of a partial order can be defined as follows. Given a partial order P = {Σ, ≺}, P′ = {Σ′, ≺′} is a prefix of P if:
–Σ′ ⊆ Σ
–for all ei, ej ∈ Σ′, ei ≺′ ej if and only if ei ≺ ej
–for all ei ∈ Σ′, if there exists ej ∈ Σ with ej ≺ ei, then ej ∈ Σ′
History (or Schedule) Cont. The first two conditions define P′ as a restriction of P to the domain Σ′, whereby the ordering relations in P are maintained in P′. The last condition indicates that, for any element of Σ′, all its predecessors in Σ have to be included in Σ′ as well.
History Vs. Complete History We can now deal with incomplete histories. This is useful because: From the perspective of the serializability theory, we deal only with conflicting operations of transactions rather than with all operations. Furthermore, and perhaps more important, when we introduce failures, we need to be able to deal with incomplete histories, which is what a prefix enables us to do.
Example Consider the following three transactions:
Serial History All the actions of a transaction occur consecutively. No interleaving of transaction operations. If each transaction is consistent (obeys integrity rules), then the database is guaranteed to be consistent at the end of executing a serial history.
T1: Read(x), Write(x), Commit
T2: Write(x), Write(y), Read(z), Commit
T3: Read(x), Read(y), Read(z), Commit
Serializable History Transactions execute concurrently, but the net effect of the resulting history upon the database is equivalent to that of some serial history. Equivalent with respect to what? –Conflict equivalence: the relative order of execution of the conflicting operations belonging to unaborted transactions is the same in both histories. –Conflicting operations: two incompatible operations (e.g., a Read and a Write) conflict if they both access the same data item. Incompatible operations of the same transaction are also assumed to conflict; their execution order is not changed. If two operations from two different transactions conflict, the corresponding transactions are also said to conflict.
Serializable History The following are not conflict equivalent:
Hs = {W2(x), W2(y), R2(z), R1(x), W1(x), R3(x), R3(y), R3(z)}
H1 = {W2(x), R1(x), R3(x), W1(x), W2(y), R3(y), R2(z), R3(z)}
The following are conflict equivalent; therefore H2 is serializable:
Hs = {W2(x), W2(y), R2(z), R1(x), W1(x), R3(x), R3(y), R3(z)}
H2 = {W2(x), R1(x), W1(x), R3(x), W2(y), R3(y), R2(z), R3(z)}
T1: Read(x), Write(x), Commit
T2: Write(x), Write(y), Read(z), Commit
T3: Read(x), Read(y), Read(z), Commit
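Conflict serializability can be checked mechanically by building a precedence graph over the conflicting operations of a history and testing it for cycles. The following is a minimal Python sketch of that test; the history encoding and the function names are ours, not part of the lecture.

from collections import defaultdict

def conflict_graph(history):
    """Build a precedence graph from a history given as (txn, op, item) triples."""
    edges = defaultdict(set)
    for i, (ti, oi, xi) in enumerate(history):
        for tj, oj, xj in history[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # access the same item, and at least one of them is a write.
            if ti != tj and xi == xj and 'W' in (oi, oj):
                edges[ti].add(tj)      # ti's operation precedes tj's conflicting one
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the precedence graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def visit(node):
        color[node] = GREY
        for succ in edges[node]:
            if color[succ] == GREY or (color[succ] == WHITE and visit(succ)):
                return True            # back edge, or a cycle further down
        color[node] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in list(edges))

# H2 from this slide: no cycle, hence conflict serializable (order T2 -> T1 -> T3).
H2 = [(2, 'W', 'x'), (1, 'R', 'x'), (1, 'W', 'x'), (3, 'R', 'x'),
      (2, 'W', 'y'), (3, 'R', 'y'), (2, 'R', 'z'), (3, 'R', 'z')]
print(has_cycle(conflict_graph(H2)))   # False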
Serializability in Distributed DBMS Serializability theory extends in a straightforward manner to the non-replicated (or partitioned) distributed databases. The history of transaction execution at each site is called a local history. If the database is not replicated and each local history is serializable, their union (called the global history) is also serializable as long as local serialization orders are identical.
Example Consider two bank accounts, x (stored at Site 1) and y (stored at Site 2), and the following two transactions, where T1 transfers $100 from x to y, while T2 simply reads the balances of x and y:
T1: Read(x), x ← x − 100, Write(x), Read(y), y ← y + 100, Write(y), Commit
T2: Read(x), Read(y), Commit
Example Cont. Obviously, both of these transactions need to run at both sites. Consider the following two histories that may be generated locally at the two sites (Hi is the history at Site i):
H1 = {R1(x), W1(x), R2(x)}
H2 = {R1(y), W1(y), R2(y)}
Example Cont. Both of these histories are serializable; indeed, they are serial. Therefore, each represents a correct execution order. Furthermore, the serialization order is the same for both: T1 → T2. Therefore, the global history that is obtained is also serializable with the serialization order T1 → T2.
Example Cont. However, if the histories generated at the two sites are as follows, there is a problem:
H′1 = {R1(x), W1(x), R2(x)}
H′2 = {R2(y), R1(y), W1(y)}
Although each local history is still serializable, the serialization orders are different: H′1 serializes T1 before T2 while H′2 serializes T2 before T1. Therefore, there can be no global history that is serializable.
Concurrency Control Algorithms
Pessimistic
–Two-Phase Locking-based (2PL)
 Centralized (primary site) 2PL
 Primary copy 2PL
 Distributed 2PL
–Timestamp Ordering (TO)
 Basic TO
 Multiversion TO
 Conservative TO
–Hybrid
Optimistic
–Locking-based
–Timestamp ordering-based
Locking-Based Algorithms Transactions indicate their intentions by requesting locks from the scheduler (called the lock manager). Locks are either read locks (rl) [also called shared locks] or write locks (wl) [also called exclusive locks]. Read locks and write locks conflict (because Read and Write operations are incompatible):
     rl   wl
rl   yes  no
wl   no   no
Locking works nicely to allow concurrent processing of transactions.
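As an illustration, the compatibility matrix above translates directly into a grant-or-wait decision inside a lock manager. The following Python sketch is ours (class, method, and transaction names are illustrative), not the lecture's implementation.

# Compatibility of a requested lock mode (second) with a held mode (first).
COMPATIBLE = {
    ('rl', 'rl'): True,  ('rl', 'wl'): False,
    ('wl', 'rl'): False, ('wl', 'wl'): False,
}

class LockManager:
    def __init__(self):
        self.locks = {}                  # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock only if it is compatible with every lock on the item
        held by another transaction; otherwise the requester must wait."""
        held = self.locks.setdefault(item, [])
        if all(t == txn or COMPATIBLE[(m, mode)] for t, m in held):
            held.append((txn, mode))
            return True                  # lock granted
        return False                     # conflict: requester waits

    def release(self, txn, item):
        self.locks[item] = [(t, m) for t, m in self.locks.get(item, []) if t != txn]

lm = LockManager()
print(lm.request('T1', 'x', 'rl'))       # True: no other locks on x
print(lm.request('T2', 'x', 'rl'))       # True: read locks are compatible
print(lm.request('T3', 'x', 'wl'))       # False: write lock conflicts with the read locks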
Locking Problem The locking algorithm as described above will not, unfortunately, properly synchronize transaction executions. This is because to generate serializable histories, the locking and releasing operations of transactions also need to be coordinated. We demonstrate this by an example.
Example Consider the following two transactions:
T1: Read(x), x ← x + 1, Write(x), Read(y), y ← y − 1, Write(y), Commit
T2: Read(x), x ← x × 2, Write(x), Read(y), y ← y × 2, Write(y), Commit
The following is a valid history that a lock manager employing the locking algorithm may generate:
H = {wl1(x), R1(x), W1(x), lr1(x), wl2(x), R2(x), W2(x), lr2(x), wl2(y), R2(y), W2(y), lr2(y), wl1(y), R1(y), W1(y), lr1(y)}
Example Cont. where lri(z) indicates the release of the lock on z that transaction Ti holds. Note that H is not a serializable history. For example, if prior to the execution of these transactions the values of x and y are 50 and 20, respectively, one would expect their values following execution to be, respectively, either 102 and 38 if T1 executes before T2, or 101 and 39 if T2 executes before T1. However, the result of executing H would give x and y the values 102 and 39. Obviously, H is not serializable.
Discussion The problem with history H in the example is the following. The locking algorithm releases the locks that are held by a transaction (say, Ti) as soon as the associated database command (read or write) is executed and that data item (say, x) no longer needs to be accessed. However, the transaction itself may lock other items (say, y) after it releases its lock on x. Even though this may seem advantageous from the viewpoint of increased concurrency, it permits transactions to interfere with one another, resulting in the loss of isolation and atomicity. Hence the argument for two-phase locking (2PL).
Two-Phase Locking (2PL) The two-phase locking rule simply states that no transaction should request a lock after it releases one of its locks. Alternatively, a transaction should not release a lock until it is certain that it will not request another lock. 2PL algorithms execute transactions in two phases.
Two-Phase Locking (2PL) Each transaction has a growing phase, where it obtains locks and accesses data items, and a shrinking phase, during which it releases locks. The lock point is the moment when the transaction has obtained all its locks but has not yet started to release any of them. Thus the lock point determines the end of the growing phase and the beginning of the shrinking phase of a transaction. It has been proven that any history generated by a concurrency control algorithm that obeys the 2PL rule is serializable.
Two-Phase Locking (2PL) A transaction locks an object before using it. When an object is locked by another transaction, the requesting transaction must wait. Once a transaction releases a lock, it may not request another lock.
(Figure: number of locks from BEGIN to END — the curve rises during Phase 1 (obtain locks) up to the lock point, then falls during Phase 2 (release locks).)
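To make the two-phase discipline concrete, here is a small sketch of a transaction-side wrapper that refuses any lock request issued after the first release. It reuses the LockManager sketched earlier; all names are illustrative.

class TwoPhaseTransaction:
    """Enforces the 2PL rule: no lock may be requested after the first release."""

    def __init__(self, lock_manager, name):
        self.lm = lock_manager
        self.name = name
        self.held = set()
        self.shrinking = False           # becomes True at the first release

    def lock(self, item, mode):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        if not self.lm.request(self.name, item, mode):
            raise RuntimeError(f"{self.name} must wait for {item}")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True            # the lock point has passed
        self.lm.release(self.name, item)
        self.held.discard(item)

    def commit(self):
        # Strict 2PL keeps all locks until this point, avoiding cascading aborts.
        for item in list(self.held):
            self.unlock(item)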
2PL Problems 2PL indicates that the lock manager releases locks as soon as access to that data item has been completed. This permits other transactions awaiting access to go ahead and lock it, thereby increasing the degree of concurrency. However, this is difficult to implement since the lock manager has to know that the transaction has obtained all its locks and will not need to lock another data item. The lock manager also needs to know that the transaction no longer needs to access the data item in question, so that the lock can be released.
2PL Problems Cont. Finally, if the transaction aborts after it releases a lock, it may cause other transactions that may have accessed the unlocked data item to abort as well. This is known as cascading aborts.
Strict 2PL These problems may be overcome by strict two-phase locking, which releases all the locks together when the transaction terminates (commits or aborts).
Strict 2PL Hold locks until the end.
(Figure: number of locks over the transaction duration — locks are obtained as data items are used and all of them are released together at END.)
2PL Issue We should note that even though a 2PL algorithm enforces conflict serializability, it does not allow all histories that are conflict serializable. Consider the following history: it is not allowed by a 2PL algorithm, since T1 would need to obtain a write lock on y after it releases its write lock on x. However, this history is serializable in the order T3 → T1 → T2. The order of locking can be exploited to design locking algorithms that allow histories such as these.
2PL Issue Cont. The main idea is to observe that in serializability theory, the order of serialization of conflicting operations is as important as detecting the conflict in the first place and this can be exploited in defining locking modes. Consequently, in addition to read (shared) and write (exclusive) locks, a third lock mode is defined: ordered shared.
Centralized 2PL There is only one 2PL scheduler in the distributed system. Lock requests are issued to the central scheduler.
(Figure: message flow between the coordinating TM, the central site LM, and the data processors at the participating sites — 1 Lock Request, 2 Lock Granted, 3 Operation, 4 End of Operation, 5 Release Locks.)
Distributed 2PL 2PL schedulers are placed at each site. Each scheduler handles lock requests for data at that site. A transaction may read any of the replicated copies of item x, by obtaining a read lock on one of the copies of x. Writing into x requires obtaining write locks for all copies of x.
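A tiny sketch of the read-one/write-all rule just described; the replica catalogue and site names are invented for illustration.

# Which sites hold a copy of each item (illustrative data).
REPLICAS = {'x': ['site1', 'site2', 'site3'], 'y': ['site2']}

def sites_to_lock(item, operation):
    """Read-one/write-all: a read needs a read lock on any single copy,
    a write needs write locks on every copy of the item."""
    copies = REPLICAS[item]
    if operation == 'read':
        return [copies[0]]               # any one copy suffices; pick the first here
    if operation == 'write':
        return list(copies)              # every copy must be write-locked
    raise ValueError(operation)

print(sites_to_lock('x', 'read'))        # ['site1']
print(sites_to_lock('x', 'write'))       # ['site1', 'site2', 'site3']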
Distributed 2PL Execution
(Figure: message flow — 1 Lock Request from the coordinating TM to the participating LMs, 2 Operation from the LMs to the participating DPs, 3 End of Operation from the DPs back to the coordinating TM, 4 Release Locks from the coordinating TM to the participating LMs.)
Timestamp Ordering Each transaction Ti is assigned a globally unique timestamp ts(Ti). The transaction manager attaches the timestamp to all operations issued by the transaction. Each data item is assigned a write timestamp (wts) and a read timestamp (rts):
–rts(x) = largest timestamp of any read on x
–wts(x) = largest timestamp of any write on x
Conflicting operations are resolved by timestamp order. Basic TO:
for Ri(x): if ts(Ti) < wts(x) then reject Ri(x), else accept Ri(x) and set rts(x) ← max(rts(x), ts(Ti))
for Wi(x): if ts(Ti) < rts(x) or ts(Ti) < wts(x) then reject Wi(x), else accept Wi(x) and set wts(x) ← ts(Ti)
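A minimal sketch of a Basic TO scheduler implementing exactly these accept/reject rules (the class and method names are ours); a rejected operation would cause its transaction to be restarted with a new timestamp.

class BasicTO:
    def __init__(self):
        self.rts = {}                    # item -> largest timestamp of an accepted read
        self.wts = {}                    # item -> largest timestamp of an accepted write

    def read(self, ts, item):
        if ts < self.wts.get(item, 0):
            return 'reject'              # a younger transaction already wrote the item
        self.rts[item] = max(self.rts.get(item, 0), ts)
        return 'accept'

    def write(self, ts, item):
        if ts < self.rts.get(item, 0) or ts < self.wts.get(item, 0):
            return 'reject'              # a younger transaction already read or wrote it
        self.wts[item] = ts
        return 'accept'

sched = BasicTO()
print(sched.write(ts=2, item='x'))       # accept: wts(x) = 2
print(sched.read(ts=1, item='x'))        # reject: older than the last write on x
print(sched.read(ts=3, item='x'))        # accept: rts(x) = 3
print(sched.write(ts=2, item='x'))       # reject: a younger transaction has read x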
Conservative Timestamp Ordering Basic timestamp ordering tries to execute an operation as soon as it receives it –progressive –too many restarts since there is no delaying Conservative timestamping delays each operation until there is an assurance that it will not be restarted Assurance? –No other operation with a smaller timestamp can arrive at the scheduler –Note that the delay may result in the formation of deadlocks
Multiversion Timestamp Ordering Do not modify the values in the database; create new versions. An Ri(x) is translated into a read on one version of x. –Find a version of x (say xv) such that ts(xv) is the largest timestamp less than ts(Ti). A Wi(x) is translated into Wi(xw) and accepted if the scheduler has not yet processed any Rj(xr) such that ts(xr) < ts(Ti) < ts(Tj)
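The following sketch shows one way to implement the version-selection rule for reads and the acceptance test for writes; the data structures are illustrative, and transaction timestamps are assumed to be positive.

import bisect

class MVTOItem:
    """Versions of one data item, each tagged with the timestamp of its writer."""

    def __init__(self, initial_value):
        self.versions = [(0, initial_value)]   # sorted list of (wts, value)
        self.reads = []                        # (version_ts, reader_ts) pairs

    def read(self, ts):
        # Read the version with the largest write timestamp less than ts.
        idx = bisect.bisect_left(self.versions, (ts,)) - 1
        version_ts, value = self.versions[idx]
        self.reads.append((version_ts, ts))
        return value

    def write(self, ts, value):
        # Reject if a younger reader has already read a version older than this
        # write: that reader should have seen the version we are about to create.
        for version_ts, reader_ts in self.reads:
            if version_ts < ts < reader_ts:
                return False                   # reject Wi(x); Ti restarts
        bisect.insort(self.versions, (ts, value))
        return True

x = MVTOItem(10)
print(x.read(ts=5))               # 10 (initial version)
print(x.write(ts=3, value=7))     # False: the reader with ts 5 would be invalidated
print(x.write(ts=6, value=8))     # True: no younger reader is affected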
Optimistic Concurrency Control Algorithms
Pessimistic execution: Validate → Read → Compute → Write
Optimistic execution: Read → Compute → Validate → Write
Optimistic Concurrency Control Algorithms Transaction execution model: divide a transaction into subtransactions, each of which executes at a site –Tij: subtransaction of Ti that executes at site j. Transactions run independently at each site until they reach the end of their read phases. All subtransactions are assigned a timestamp at the end of their read phase. A validation test is performed during the validation phase; if one subtransaction fails the test, all of them are rejected.
Optimistic CC Validation Test If all transactions Tk where ts(Tk) < ts(Tij) have completed their write phase before Tij has started its read phase, then validation succeeds –Transaction executions in serial order
(Figure: Tk's R–V–W phases complete before Tij's R–V–W phases begin.)
Optimistic CC Validation Test If there is any transaction Tk such that ts(Tk) < ts(Tij) and which completes its write phase while Tij is in its read phase, then validation succeeds if WS(Tk) ∩ RS(Tij) = Ø –Read and write phases overlap, but Tij does not read data items written by Tk
(Figure: Tk's write phase overlaps Tij's read phase.)
Optimistic CC Validation Test If there is any transaction Tk such that ts(Tk) < ts(Tij) and which completes its read phase before Tij completes its read phase, then validation succeeds if WS(Tk) ∩ RS(Tij) = Ø and WS(Tk) ∩ WS(Tij) = Ø –They overlap, but don't access any common data items.
(Figure: Tk's and Tij's read phases overlap.)
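The three validation rules can be combined into a single backward-validation check. The sketch below assumes each subtransaction is represented as a dict holding its timestamp, read set rs, write set ws, and phase-boundary times; these field names are hypothetical, not from the lecture.

def validate(t_new, validated):
    """Validate t_new against every already-validated transaction with a smaller
    timestamp, applying the three rules from the slides in order."""
    for t_k in validated:
        if t_k['ts'] >= t_new['ts']:
            continue
        if t_k['write_end'] < t_new['read_start']:
            continue                               # rule 1: strictly serial, OK
        if t_k['write_end'] < t_new['read_end']:
            # rule 2: t_k finished writing while t_new was still in its read phase
            if t_k['ws'] & t_new['rs']:
                return False
            continue
        if t_k['read_end'] < t_new['read_end']:
            # rule 3: the read phases overlap
            if (t_k['ws'] & t_new['rs']) or (t_k['ws'] & t_new['ws']):
                return False
            continue
        return False                               # none of the rules apply: reject
    return True

tk  = {'ts': 1, 'rs': {'x'}, 'ws': {'x'}, 'read_start': 0, 'read_end': 2, 'write_end': 5}
tij = {'ts': 2, 'rs': {'y'}, 'ws': {'y'}, 'read_start': 3, 'read_end': 6, 'write_end': 8}
print(validate(tij, [tk]))   # True: rule 2 applies and the write/read sets are disjoint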
Deadlock A transaction is deadlocked if it is blocked and will remain blocked until there is intervention. Locking-based CC algorithms may cause deadlocks. TO-based algorithms that involve waiting may cause deadlocks. Wait-for graph (WFG) –If transaction Ti waits for another transaction Tj to release a lock on an entity, then Ti → Tj in the WFG.
Local versus Global WFG Assume T1 and T2 run at Site 1, and T3 and T4 run at Site 2. Also assume T3 waits for a lock held by T4, which waits for a lock held by T1, which waits for a lock held by T2, which, in turn, waits for a lock held by T3.
(Figure: the local WFGs at Site 1 (T1 → T2) and Site 2 (T3 → T4) contain no cycle; the global WFG contains the cycle T1 → T2 → T3 → T4 → T1.)
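Detection then amounts to looking for a cycle in the (global) WFG. A small depth-first-search sketch, applied to the example above:

def find_cycle(wfg):
    """Return a cycle in a wait-for graph {txn: [txns it waits for]}, or None."""
    visiting, done = set(), set()

    def dfs(node, path):
        visiting.add(node)
        for succ in wfg.get(node, []):
            if succ in visiting:
                return path[path.index(succ):] + [succ]     # cycle found
            if succ not in done:
                cycle = dfs(succ, path + [succ])
                if cycle:
                    return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for node in wfg:
        if node not in done:
            cycle = dfs(node, [node])
            if cycle:
                return cycle
    return None

# Global WFG of the example: T1 -> T2 -> T3 -> T4 -> T1.
wfg = {'T1': ['T2'], 'T2': ['T3'], 'T3': ['T4'], 'T4': ['T1']}
print(find_cycle(wfg))        # ['T1', 'T2', 'T3', 'T4', 'T1']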
Deadlock Management Ignore –Let the application programmer deal with it, or restart the system. Prevention –Guaranteeing that deadlocks can never occur in the first place. Check the transaction when it is initiated. Requires no run-time support. Avoidance –Detecting potential deadlocks in advance and taking action to ensure that deadlock will not occur. Requires run-time support. Detection and Recovery –Allowing deadlocks to form and then finding and breaking them. As in the avoidance scheme, this requires run-time support.
Deadlock Prevention All resources which may be needed by a transaction must be predeclared.
–The system must guarantee that none of the resources will be needed by an ongoing transaction.
–Resources must only be reserved, but not necessarily allocated a priori.
–The scheme is unsuitable in a database environment.
–Suitable for systems that have no provisions for undoing processes.
Evaluation:
–Reduced concurrency due to preallocation
–Evaluating whether an allocation is safe leads to added overhead
–Difficult to determine (partial order)
+No transaction rollback or restart is involved
Deadlock Avoidance Transactions are not required to request resources a priori. Transactions are allowed to proceed unless a requested resource is unavailable. In case of conflict, transactions may be allowed to wait for a fixed time interval. Order either the data items or the sites and always request locks in that order. More attractive than prevention in a database environment.
Deadlock Avoidance – Wait-Die Algorithm If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) < ts(Tj). If ts(Ti) > ts(Tj), then Ti is aborted and restarted with the same timestamp.
–if ts(Ti) < ts(Tj) then Ti waits else Ti dies
–non-preemptive: Ti never preempts Tj
–prefers younger transactions
Deadlock Avoidance – Wound-Wait Algorithm If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) > ts(Tj). If ts(Ti) < ts(Tj), then Tj is aborted and the lock is granted to Ti.
–if ts(Ti) < ts(Tj) then Tj is wounded else Ti waits
–preemptive: Ti preempts Tj if Tj is younger
–prefers older transactions
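The two schemes differ only in who is sacrificed when a lock request conflicts. A small sketch of the decision logic (the function and scheme names are ours, and smaller timestamps mean older transactions):

def on_lock_conflict(requester_ts, holder_ts, scheme):
    """Decide the outcome when the requester conflicts with the current holder."""
    if scheme == 'wait-die':
        # Older requester waits; younger requester dies and is restarted later
        # with its original timestamp.
        return 'wait' if requester_ts < holder_ts else 'abort requester'
    if scheme == 'wound-wait':
        # Older requester wounds (aborts) the younger holder; younger requester waits.
        return 'abort holder' if requester_ts < holder_ts else 'wait'
    raise ValueError(scheme)

print(on_lock_conflict(5, 9, 'wait-die'))      # wait: the requester is older
print(on_lock_conflict(9, 5, 'wait-die'))      # abort requester: it is younger
print(on_lock_conflict(5, 9, 'wound-wait'))    # abort holder: the requester is older
print(on_lock_conflict(9, 5, 'wound-wait'))    # wait: the requester is younger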
Deadlock Detection Transactions are allowed to wait freely. Wait-for graphs and cycles. Topologies for deadlock detection algorithms –Centralized –Distributed –Hierarchical
Centralized Deadlock Detection One site is designated as the deadlock detector for the system. Each scheduler periodically sends its local WFG to the central site, which merges them into a global WFG to determine cycles. How often to transmit? –Too often ⇒ higher communication cost but lower delays due to undetected deadlocks –Too seldom ⇒ higher delays due to deadlocks, but lower communication cost. Would be a reasonable choice if the concurrency control algorithm is also centralized. Proposed for Distributed INGRES.
Hierarchical Deadlock Detection Build a hierarchy of detectors.
(Figure: deadlock detectors DD21, DD22, DD23, DD24 at Sites 1–4, grouped under intermediate detectors DD11 and DD14, with DD0x at the root of the hierarchy.)
Distributed Deadlock Detection Sites cooperate in the detection of deadlocks. One example:
–The local WFGs are formed at each site and passed on to other sites. Each local WFG is modified as follows:
 Since each site receives the potential deadlock cycles from other sites, these edges are added to the local WFG.
 The edges in the local WFG which show that local transactions are waiting for transactions at other sites are joined with edges in the local WFG which show that remote transactions are waiting for local ones.
–Each local deadlock detector:
 looks for a cycle that does not involve the external edge. If it exists, there is a local deadlock which can be handled locally.
 looks for a cycle involving the external edge. If it exists, it indicates a potential global deadlock; the information is passed on to the next site.
“Relaxed” Concurrency Control Non-serializable histories –E.g., ordered shared locks –Semantics of transactions can be used Look at semantic compatibility of operations rather than simply looking at reads and writes Nested distributed transactions –Closed nested transactions –Open nested transactions –Multilevel transactions
Multilevel Transactions Consider two transactions:
T1: Withdraw(o, x), Deposit(p, x)
T2: Withdraw(o, y), Deposit(p, y)
Questions? These slides are from Principles of Distributed Database Systems, M. Tamer Özsu and Patrick Valduriez, Springer (2011).