Concurrency Control II


General Overview
  Relational model - SQL
  Formal & commercial query languages
  Functional Dependencies
  Normalization
  Physical Design
  Indexing
  Query Processing and Optimization
  Transaction Processing and CC

Review: AC[I]D
Isolation: concurrent xctions are unaware of each other.
How?
  Serial execution of transactions: poor throughput and response time.
  Instead, allow concurrency, but prevent “bad” concurrency and allow only “good” concurrency through analysis of “schedules”.
  Allow only “conflict serializable” schedules: schedules that are conflict equivalent to some serial schedule.
  Precedence graph P(S): if P(S) is acyclic, then S is a conflict serializable schedule.
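
The precedence-graph test can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: the schedule encoding as `(txn, op, item)` tuples and the function names are assumptions made for the example.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """schedule: list of (txn, op, item) in execution order, op in {'R', 'W'}."""
    edges = defaultdict(set)
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            # Two actions conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and x == y and 'W' in (op_i, op_j):
                edges[ti].add(tj)  # Ti's action comes first: edge Ti -> Tj
    return edges

def has_cycle(edges):
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, done
    color = defaultdict(int)

    def visit(u):
        color[u] = GREY
        for v in edges.get(u, ()):
            if color[v] == GREY or (color[v] == WHITE and visit(v)):
                return True
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and visit(u) for u in list(edges))

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))
```

For instance, the interleaving R1(A), W2(A), W1(A) yields the edges T1 -> T2 and T2 -> T1, so the check reports that it is not conflict serializable.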

How to enforce serializable schedules?
Prevent P(S) cycles from occurring using a concurrency control manager, which ensures that the interleaving of operations among concurrent xctions results only in serializable schedules.
[Figure: transactions T1, T2, ..., Tn send their operations to the CC scheduler, which sits in front of the DB.]

Anomalies with Interleaved Execution
Reading Uncommitted Data (WR Conflicts, “dirty reads”):
  T1: R(A), W(A), R(B), W(B), Abort
  T2: R(A), W(A), C
Unrepeatable Reads (RW Conflicts):
  T1: R(A), R(A), W(A), C
  T2: R(A), W(A), C

Anomalies (Continued)
Overwriting Uncommitted Data (WW Conflicts):
  T1: W(A), W(B), C
  T2: W(A), W(B), C
Solution: use appropriate CC protocols to achieve serializable schedules.

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

The Two-Phase Locking Protocol
This protocol ensures conflict-serializable schedules.
Phase 1: Growing Phase
  transaction may obtain locks
  transaction may not release locks
Phase 2: Shrinking Phase
  transaction may release locks
  transaction may not obtain locks
The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e., the point where a transaction acquired its final lock).
Locks can be either X only, or S/X.

Lock-Based Concurrency Control
Strict Two-phase Locking (Strict 2PL) Protocol:
  Each Xact must obtain an S (shared) lock on an object before reading it, and an X (exclusive) lock on an object before writing it.
  All locks held by a transaction are released when the transaction completes.
  If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.
Strict 2PL allows only serializable schedules.
It has no cascading rollbacks, because locks are released only when the transaction completes.
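
As a rough sketch of how a lock table might enforce the S/X rules under Strict 2PL, consider the following. The class and method names are hypothetical, and a real lock manager would queue waiting requests instead of simply returning `False`.

```python
class LockTable:
    """Toy S/X lock table; no wait queues, no deadlock handling."""

    def __init__(self):
        self.holders = {}  # object -> (mode, set of transaction ids)

    def acquire(self, txn, obj, mode):
        """Try to lock obj in mode 'S' or 'X'; False means txn would have to wait."""
        held = self.holders.get(obj)
        if held is None:
            self.holders[obj] = (mode, {txn})
            return True
        held_mode, txns = held
        if txns == {txn}:
            # Sole holder: keep the lock, upgrading S to X if requested.
            new_mode = 'X' if 'X' in (held_mode, mode) else 'S'
            self.holders[obj] = (new_mode, txns)
            return True
        if held_mode == 'S' and mode == 'S':
            txns.add(txn)  # shared locks are compatible with each other
            return True
        return False

    def release_all(self, txn):
        """Strict 2PL: every lock is released together, at commit or abort."""
        for obj in list(self.holders):
            _, txns = self.holders[obj]
            txns.discard(txn)
            if not txns:
                del self.holders[obj]
```

Because `release_all` is the only way to give up a lock, a transaction using this table never releases anything before it completes, which is the property that rules out cascading rollbacks.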

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

Timestamp-Based Protocols
Idea:
  Decide the ordering of xctions in advance.
  Ensure that the concurrent schedule is equivalent to the serial order decided.
Timestamps:
  TS(Ti) is the time Ti entered the system.
  Data item timestamps:
    W-TS(Q): largest timestamp of any xction that wrote Q
    R-TS(Q): largest timestamp of any xction that read Q
Timestamps -> serializability order

Timestamp CC Idea: If action pi of Xact Ti conflicts with action qj of Xact Tj, and TS(Ti) < TS(Tj), then pi must occur before qj. Otherwise, restart violating Xact.

When Xact T wants to Read Object O
If TS(T) < W-TS(O), this violates the timestamp order of T w.r.t. the writer of O.
  So, abort T and restart it with a new, larger TS. (If restarted with the same TS, T will fail again!)
If TS(T) > W-TS(O):
  Allow T to read O.
  Reset R-TS(O) to max(R-TS(O), TS(T)).
The change to R-TS(O) on reads must be written to disk! This and restarts represent overheads.
[Figure: timeline with the events T start, U start, U writes O, T reads O.]

When Xact T wants to Write Object Q
If TS(T) < R-TS(Q), then the value of Q that T is producing was needed previously, and the system assumed that the value would never be produced. The write is rejected, and T is rolled back.
If TS(T) < W-TS(Q), then T is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and T is rolled back.
Otherwise, the write operation is executed, and W-TS(Q) is set to TS(T).
[Figure: timeline with the events T start, U start, U reads Q, T writes Q.]

When Xact T wants to Write Object Q
If TS(T) < R-TS(Q), this violates the timestamp order of T w.r.t. the reader of Q; abort and restart T.
If TS(T) < W-TS(Q), this violates the timestamp order of T w.r.t. the writer of Q.
  Thomas Write Rule: we can safely ignore such outdated writes; need not restart T! (T’s write is effectively followed by another write, with no intervening reads.)
  This allows some serializable but non-conflict-serializable schedules.
Else, allow T to write Q.
[Figure: example schedule of T1 and T2 with the actions R(A), W(A), Commit; it is allowed although it is not conflict serializable.]
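
The read and write rules of timestamp ordering, with the Thomas Write Rule as an option, might look like this in Python. The `Item` class, the `Abort` exception, and the `thomas` flag are illustrative assumptions; persistence of R-TS and the restart logic are left out.

```python
class Abort(Exception):
    """The transaction must be rolled back and restarted with a new, larger timestamp."""

class Item:
    def __init__(self, value=None):
        self.value = value
        self.rts = 0  # R-TS: largest timestamp of any transaction that read the item
        self.wts = 0  # W-TS: largest timestamp of any transaction that wrote the item

def read(ts, item):
    if ts < item.wts:
        raise Abort("read too late: a younger transaction already wrote the item")
    item.rts = max(item.rts, ts)
    return item.value

def write(ts, item, value, thomas=False):
    if ts < item.rts:
        raise Abort("write too late: a younger transaction already read the item")
    if ts < item.wts:
        if thomas:
            return False  # Thomas Write Rule: skip the obsolete write, no restart
        raise Abort("write too late: a younger transaction already wrote the item")
    item.value, item.wts = value, ts
    return True
```

With `thomas=False` an outdated write aborts the writer; with `thomas=True` it is silently dropped, which is what admits some schedules that are serializable but not conflict serializable.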

How Locking works in practice
[Figure: transactions issue Read(A), Write(B); scheduler part I turns this into l(A), Read(A), l(B), Write(B), ...; scheduler part II uses the lock table and forwards operations to the DB.]

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

Optimistic CC (Kung-Robinson)
Locking is a conservative approach in which conflicts are prevented. Disadvantages:
  Lock management overhead.
  Deadlock detection/resolution.
  Lock contention for heavily used objects.
If conflicts are rare, we might be able to gain concurrency by not locking, and instead checking for conflicts before Xacts commit.

Optimistic CC: Kung-Robinson Model
Xacts have three phases:
  READ: Xacts read from the database, but make changes to private copies of objects.
  VALIDATE: Check for conflicts.
  WRITE: Make local copies of changes public.
[Figure: a ROOT pointer with old objects and new, modified objects.]

Validation
Test conditions that are sufficient to ensure that no conflict occurred.
Each Xact is assigned a numeric id; just use a timestamp.
Xact ids are assigned at the end of the READ phase, just before validation begins. (Why then?)
ReadSet(Ti): set of objects read by Xact Ti.
WriteSet(Ti): set of objects modified by Ti.

Test 1
For all i and j such that Ti < Tj, check that Ti completes before Tj begins.
[Figure: the R, V, W phases of Ti all come before the R, V, W phases of Tj.]


Test 2
For all i and j such that Ti < Tj, check that:
  Ti completes before Tj begins its Write phase, and
  WriteSet(Ti) ∩ ReadSet(Tj) is empty.
[Figure: the R, V, W phases of Ti and Tj, with Ti finishing before Tj's W phase starts.]
Does Tj read dirty data? Does Ti overwrite Tj’s writes?

Test 3
For all i and j such that Ti < Tj, check that:
  Ti completes its Read phase before Tj does, and
  WriteSet(Ti) ∩ ReadSet(Tj) is empty, and
  WriteSet(Ti) ∩ WriteSet(Tj) is empty.
[Figure: the R, V, W phases of Ti and Tj, with Ti's R phase ending before Tj's R phase ends.]
Does Tj read dirty data? Does Ti overwrite Tj’s writes?
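
The three validation tests can be combined into one predicate. The sketch below is an illustration under stated assumptions: the `Txn` record and its field names are invented here, phase boundaries are plain integers, validation is treated as instantaneous (so Tj's Write phase begins at `tj.validation`), and a transaction that is still writing can be modelled with `finish=float('inf')`.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int        # time the Read phase began
    validation: int   # time the Read phase ended; also serves as the Xact id
    finish: int       # time the Write phase ended
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def can_precede(ti, tj):
    """True if Ti (smaller id) and Tj satisfy at least one of Tests 1-3."""
    # Test 1: Ti completes before Tj begins.
    if ti.finish < tj.start:
        return True
    # Test 2: Ti completes before Tj begins its Write phase,
    # and Ti wrote nothing that Tj read.
    if ti.finish < tj.validation and not (ti.write_set & tj.read_set):
        return True
    # Test 3: Ti completes its Read phase before Tj does, and Ti's writes
    # overlap neither Tj's reads nor Tj's writes.
    return (ti.validation < tj.validation
            and not (ti.write_set & tj.read_set)
            and not (ti.write_set & tj.write_set))
```

To validate Tj, one would call `can_precede(ti, tj)` for every Ti with a smaller id and restart Tj if any call returns `False`.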

Example of what validation must prevent:
  RS(T2)={B}, WS(T2)={B,D}
  RS(T3)={A,B}, WS(T3)={C}
  WS(T2) ∩ RS(T3) = {B} ≠ ∅
[Timeline: T2 start, T3 start, T2 validated, T3 validated.]

Example of what validation must allow:
  RS(T2)={B}, WS(T2)={B,D}
  RS(T3)={A,B}, WS(T3)={C}
  WS(T2) ∩ RS(T3) = {B} ≠ ∅
[Timeline: T2 start, T2 validated, T2 finish phase 3, T3 start, T3 validated.]

Another thing validation must prevent:
  RS(T2)={A}, WS(T2)={D,E}
  RS(T3)={A,B}, WS(T3)={C,D}
  BAD: w3(D) followed by w2(D)
[Timeline: T2 validated, T3 validated, finish T2.]

Another thing validation must allow:
  RS(T2)={A}, WS(T2)={D,E}
  RS(T3)={A,B}, WS(T3)={C,D}
[Timeline: T2 validated, finish T2, T3 validated.]

Comments on Serial Validation Assignment of Xact id, validation, and the Write phase are inside a critical section! I.e., Nothing else goes on concurrently. If Write phase is long, major drawback. Optimization for Read-only Xacts: Don’t need critical section (because there is no Write phase).

Overheads in Optimistic CC
Must record read/write activity in ReadSet and WriteSet per Xact.
  Must create and destroy these sets as needed.
Must check for conflicts during validation, and must make validated writes “global”.
  Critical section can reduce concurrency.
  Scheme for making writes global can reduce clustering of objects.
Optimistic CC restarts Xacts that fail validation.
  Work done so far is wasted; requires clean-up.

“Optimistic” 2PL
If desired, we can do the following:
  Set S locks as usual.
  Make changes to private copies of objects.
  Obtain all X locks at end of Xact, make writes global, then release all locks.
In contrast to Optimistic CC as in Kung-Robinson, this scheme results in Xacts being blocked, waiting for locks.
However, there is no validation phase and there are no restarts (modulo deadlocks).

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

Multiple Granularity
Allow data items to be of various sizes and define a hierarchy of data granularities, where the small granularities are nested within larger ones.
When a transaction locks a node in the hierarchy explicitly, it implicitly locks all the node's descendants in the same mode.
[Figure: Database contains Tables, which contain Pages, which contain Tuples.]

Multiple Granularity
If we lock large objects (e.g., relations):
  Need few locks
  Low concurrency
If we lock small objects (e.g., tuples, fields):
  Need more locks
  More concurrency

Example of Granularity Hierarchy The highest level in the example hierarchy is the entire database. The levels below are of type area, file or relation and record in that order.

Multiple-Granularity Locks
Hard to decide what granularity to lock (tuples vs. pages vs. tables).
Shouldn’t have to decide!
Data “containers” are nested.
[Figure: Database contains Tables, which contain Pages, which contain Tuples.]

Solution: New Lock Modes, Protocol
Allow Xacts to lock at each level, but with a special protocol using new “intention” locks:
  Before locking an item, Xact must set “intention locks” on all its ancestors.
  For unlock, go from specific to general (i.e., bottom-up).
  SIX mode: like S & IX at the same time (e.g., scanning the table but updating a few rows).
[Table: compatibility matrix over the modes --, IS, IX, S, X; the entries did not survive extraction.]

Multiple Granularity Lock Protocol
Each Xact starts from the root of the hierarchy.
To get an S or IS lock on a node, must hold IS or IX on the parent node.
  What if Xact holds SIX on parent? S on parent?
To get X or IX or SIX on a node, must hold IX or SIX on the parent node.
Must release locks in bottom-up order.
Protocol is correct in that it is equivalent to directly setting locks at the leaf levels of the hierarchy.

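Compatibility Matrix with Intention Lock Modes
The compatibility matrix for all lock modes (rows: mode held by the holder; columns: mode asked for by the requestor):
          IS    IX    S     SIX   X
    IS    yes   yes   yes   yes   no
    IX    yes   yes   no    no    no
    S     yes   no    yes   no    no
    SIX   yes   no    no    no    no
    X     no    no    no    no    no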
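
The standard compatibility matrix for the five modes can be encoded as a lookup table. This is only a sketch; the names `COMPATIBLE` and `can_grant` are made up for the example.

```python
# COMPATIBLE[held] is the set of modes that a different transaction may be
# granted on the same node while `held` is in place.
COMPATIBLE = {
    'IS':  {'IS', 'IX', 'S', 'SIX'},
    'IX':  {'IS', 'IX'},
    'S':   {'IS', 'S'},
    'SIX': {'IS'},
    'X':   set(),
}

def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every mode other transactions hold."""
    return all(requested in COMPATIBLE[held] for held in held_by_others)
```

For example, `can_grant('IX', ['IS', 'IX'])` is true, while `can_grant('S', ['IX'])` is false because another transaction intends to write somewhere below the node.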

Parent locked in    Child can be locked in
    IS              IS, S
    IX              IS, S, IX, X, SIX
    S               [S, IS] not necessary
    SIX             X, IX, [SIX]
    X               none

Example
[Figure: relation R1 with tuples t1, t2, t3, t4. R1 is locked by T1(IS) and T2(IX); one tuple is locked by T1(S) and another by T2(X).]

Multiple Granularity Locking Scheme
Transaction Ti can lock a node Q, using the following rules:
(1) Follow the multiple granularity compatibility function.
  Lock the root of the tree first, in any mode.
  Node Q can be locked by Ti in S or IS only if parent(Q) is locked by Ti in IX or IS.
  Node Q can be locked by Ti in X, SIX, or IX only if parent(Q) is locked by Ti in IX or SIX.
(2) Ti is two-phase (2PL).
(3) Ti can unlock node Q only if none of Q’s children are locked by Ti.
Observe that locks are acquired in root-to-leaf order, whereas they are released in leaf-to-root order.
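
One way to picture the root-to-leaf acquisition order is to compute the list of lock requests for a single target node. The helper names below (`lock_plan`, `unlock_order`, `may_lock`, `REQUIRED_PARENT`) are hypothetical, and the sketch only handles a plain S or X request on the target.

```python
# Parent modes that permit locking a child in the given mode (rules above).
REQUIRED_PARENT = {
    'IS': {'IS', 'IX'}, 'S': {'IS', 'IX'},
    'IX': {'IX', 'SIX'}, 'SIX': {'IX', 'SIX'}, 'X': {'IX', 'SIX'},
}

def may_lock(child_mode, parent_mode):
    return parent_mode in REQUIRED_PARENT[child_mode]

def lock_plan(path, mode):
    """path: nodes from the root down to the target, e.g. ['DB', 'R', 't2'].
    mode: 'S' or 'X' wanted on the last node.
    Returns the (node, mode) requests in the order they must be issued."""
    intention = 'IS' if mode == 'S' else 'IX'
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]

def unlock_order(plan):
    """Locks are released leaf-to-root, the reverse of acquisition."""
    return [node for node, _ in reversed(plan)]
```

A write to tuple `t2` of relation `R` would thus request IX on the database, IX on `R`, and X on `t2`, and later release them in the opposite order.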

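Examples
[Figures: three lock trees over relation R with tuples t1, t2, t3, t4 and fields such as f2.1 and f2.2. In the first, T1 holds IX on R, IX on a tuple, and X on a field; in the second, T1 holds IS on R and S on a tuple; in the third, T1 holds SIX on R, IX on a tuple, and X on a field.]
Can T2 access object f2.2 in X mode? What locks will T2 get?
Parent/child rule: parent IS allows child IS, S; parent IX allows IS, S, IX, X, SIX; under parent S, [S, IS] are not necessary; parent SIX allows X, IX, [SIX]; parent X allows none.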

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

Multiversion Schemes
Multiversion schemes keep old versions of a data item to increase concurrency.
  Multiversion Timestamp Ordering
  Multiversion Two-Phase Locking
Each successful write results in the creation of a new version of the data item written.
Use timestamps to label versions.
When a read(Q) operation is issued, select an appropriate version of Q based on the timestamp of the transaction, and return the value of the selected version.
Reads never have to wait, as an appropriate version is returned immediately.
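
Version selection on read can be illustrated with a sorted list of write timestamps. This is a deliberately simplified sketch: the class name is an assumption, timestamps are taken to be positive integers, and the read-timestamp checks that a full multiversion timestamp-ordering scheme applies before accepting a write are omitted.

```python
import bisect

class VersionedItem:
    def __init__(self, initial):
        self.wts = [0]            # write timestamps of the versions, kept sorted
        self.values = [initial]   # values[i] is the version written at wts[i]

    def read(self, ts):
        # Return the version with the largest write timestamp <= ts.
        i = bisect.bisect_right(self.wts, ts) - 1
        return self.values[i]

    def write(self, ts, value):
        # Each successful write creates a new version labelled with ts.
        i = bisect.bisect_right(self.wts, ts)
        self.wts.insert(i, ts)
        self.values.insert(i, value)
```

A reader with timestamp 7 sees the version written at timestamp 5 even if a version with timestamp 10 already exists, so it never has to wait for the later writer.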

More on Consistency
We have seen thus far: serializability, via 2PL and timestamps.
Weaker levels of consistency:
  Degree-two consistency: differs from two-phase locking in that S-locks may be released at any time, and locks may be acquired at any time.
    X-locks must be held till end of transaction.
    Serializability is not guaranteed; the programmer must ensure that no erroneous database state will occur.
  Cursor stability:
    For reads, each tuple is locked, read, and the lock is immediately released.
    X-locks are held till end of transaction.
    Special case of degree-two consistency.

Weak Levels of Consistency in SQL
SQL allows non-serializable executions:
  Serializable: is the default in the SQL standard.
  Repeatable read: allows only committed records to be read, and repeating a read should return the same value (so read locks should be retained).
    However, the phantom phenomenon need not be prevented: T1 may see some records inserted by T2, but may not see others inserted by T2.
  Read committed: same as degree-two consistency, but most systems implement it as cursor stability.
  Read uncommitted: allows even uncommitted data to be read.
In many database systems (e.g., Oracle), Read Committed is the default consistency level.
  It has to be explicitly changed to serializable when required:
    set isolation level serializable

Agenda 2PL and variants Timestamp-based Optimistic CC: Validation-based protocols Multiple granularity Multi-version Weaker Consistency (other than serializability) Dealing with Deadlocks

Dealing with Deadlocks
Deadlock prevention (read from the book)
Deadlock detection: how do you detect a deadlock?
  Wait-for graph: a directed edge from Ti to Tj means Ti is waiting for Tj.
[Figure: a lock schedule for T1, T2, T3, T4 with the requests S(V), X(V), S(W), X(Z), X(W), next to the wait-for graph over T1, T2, T3, T4.]
Suppose T4 requests lock-S(Z)...

Detecting Deadlocks
A cycle in the wait-for graph means deadlock.
[Figure: wait-for graph over T1, T2, T3, T4.] T2, T3, T4 are deadlocked.
Build the wait-for graph and check for a cycle.
How often? Tunable:
  If you expect many deadlocks or many xctions involved, run often to avoid aborts.
  Else run less often to reduce overhead.
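
Cycle detection on a wait-for graph can be sketched with a depth-first search. The dictionary encoding and the function name are assumptions for the example, and the edges in the usage note are hypothetical, since the figure's exact edges are not recoverable from the transcript.

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions it waits for.
    Returns one cycle as a list of transactions, or None if there is no deadlock."""
    visited = set()
    on_path = []  # transactions on the current DFS path, in order

    def dfs(t):
        if t in on_path:
            return on_path[on_path.index(t):]  # back edge closes a cycle
        if t in visited:
            return None
        visited.add(t)
        on_path.append(t)
        for u in waits_for.get(t, ()):
            cycle = dfs(u)
            if cycle:
                return cycle
        on_path.pop()
        return None

    for t in waits_for:
        cycle = dfs(t)
        if cycle:
            return cycle
    return None
```

With the hypothetical edges T1 -> T2, T2 -> T3, T3 -> T4, and T4 -> T2, the function reports the cycle T2, T3, T4, while T1 is merely blocked behind it. The returned cycle is also the natural place to pick a victim for rollback.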

Recovering from Deadlocks
Roll back one or more xctions.
  Which one? Roll back the cheapest ones.
  “Cheapest” is ill-defined:
    Was it almost done?
    How much will it have to redo?
    Will it cause other rollbacks?
  How far? May only need a partial rollback.
Avoid starvation: ensure the same xction is not always chosen to break the deadlock.
Simplest mechanism: timeout; abort after a lengthy wait time.

Concurrency Control: Summary
2PL, Strict 2PL, ...: ensure conflict-serializable schedules.
Timestamp-based CC: with the Thomas Write Rule, can give non-conflict-serializable schedules.
Optimistic CC: no locking overheads, but critical-section overheads.
Multiple granularity: additional intention locks to lock ancestors before we lock leaves in S or X modes; release is always from leaf to root.
Multi-version: fast for reads.
Weaker versions: Oracle default is Read Committed + multi-version => high throughput (patented technology).