Temple University – CIS Dept. CIS661 – Principles of Data Management V. Megalooikonomou Concurrency control (based on slides by C. Faloutsos at CMU and on notes by Silberchatz,Korth, and Sudarshan)
General Overview Relational model - SQL Functional Dependencies & Normalization Physical Design &Indexing Query optimization Transaction processing concurrency control recovery
Transactions - dfn = unit of work, eg. Atomicity (all or none) move $10 from savings to checking Atomicity (all or none) Consistency Isolation (as if alone) Durability recovery concurrency control
Concurrency – overview why we want it? what does it mean ‘correct’ interleaving? precedence graph how to achieve correct interleavings automatically? concurrency control
Problem concurrent access to data (consider ‘lost update’ problem) how to solve it?
Lost update problem – no locks Read(N) time
Solution – part 1 locks! (most popular solution) lock manager: grants/denies lock requests
Lost update problem – with locks lock manager lock(N) Read(N) N=N-1 Write(N) Unlock(N) grants lock denies lock lock(N) time T2: waits grants lock to T2 Read(N) ...
Locks but, what if we all just want to read ‘N’?
Solution – part 1 Locks and their flavors compatibility matrix X-locks: exclusive (or write-) locks S-locks: shared (or read-) locks <and more ... > compatibility matrix T2 wants T1 has S X T F
Solution – part 1 A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive on the item no other transaction may hold any lock on the item. If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. The lock is then granted.
Solution – part 1 transactions request locks (or upgrades) lock manager grants or blocks requests transactions release locks lock manager updates lock-table
Solution – part 2 locks are not enough – eg., ‘inconsistent analysis’
‘Inconsistent analysis’ time Precedence graph?
‘Inconsistent analysis’ – w/ locks time T1 L(A) Read(A) ... U(A) T2 L(A) .... L(B) the problem remains! Solution??
General solution: Protocol(s) Most popular protocol: A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules. Most popular protocol: 2 Phase Locking (2PL)
2PL Phase 1: Growing Phase Phase 2: Shrinking Phase This is a protocol which ensures conflict-serializable schedules. Phase 1: Growing Phase transaction may obtain locks transaction may not release locks Phase 2: Shrinking Phase transaction may release locks transaction may not obtain locks The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e. the point where a transaction acquired its final lock).
2PL X-lock version: transactions issue no lock requests, after the first ‘unlock’ THEOREM: if all transactions obey 2PL -> all schedules are serializable
2PL – example ‘inconsistent analysis’ – why not 2PL? how would it be under 2PL?
2PL – X/S lock version transactions issue no lock/upgrade request, after the first unlock/downgrade In general: ‘growing’ and ‘shrinking’ phase
2PL – observations limits concurrency may lead to deadlocks 2PLC (keep locks until ‘commit’) strict two-phase locking. Here a transaction must hold all its exclusive locks till it commits/aborts. Rigorous two-phase locking is even stricter: here all locks are held till commit/abort.
Concurrency – overview what does it mean ‘correct’ interleaving? precedence graph how to achieve correct interleavings automatically? concurrency control locks + protocols 2PL, 2PLC graph protocols multiple granularity locks <cc without locks: optimistic cc>
Other protocols than 2-PL – graph-based Assumption: we have prior knowledge about the order in which data items will be accessed (hierarchical) ordering on the data items, like, eg., pages of a B-tree A C B
Other protocols than 2-PL – graph-based Graph-based protocols are an alternative to two-phase locking Impose a partial ordering on the set D = {d1, d2 ,..., dh} of all data items. If di dj then any transaction accessing both di and dj must access di before accessing dj. Implies that the set D may now be viewed as a directed acyclic graph, called a database graph. The tree-protocol is a simple kind of graph protocol.
Eg., tree protocol (X-lock version) an xact can request any item, on its first lock request from then on, it can only request items for which it holds the parent lock it can release locks at any time it can NOT request an item twice
Tree protocol - example follows tree protocol? ‘correct’? T1 T2 L(B) L(D) L(H) U(D) L(E) U(E) L(D) U(B) U(H) L(G) U(G) B C D E F G H I
Tree protocol equivalent to 2PL? deadlocks? Pros and cons
Tree protocol The tree protocol ensures conflict serializability as well as freedom from deadlock. Unlocking may occur earlier in the tree-locking protocol than in the two-phase locking protocol. shorter waiting times, increase in concurrency protocol is deadlock-free, no rollbacks are required the abort of a transaction can still lead to cascading rollbacks. However, in the tree-locking protocol, a transaction may have to lock data items that it does not access. increased locking overhead, and additional waiting time potential decrease in concurrency Schedules not possible under two-phase locking are possible under tree protocol, and vice versa.
More protocols lock granularity – field? record? page? table? Pros and cons? (Ideally, each transaction should obtain a few locks)
Multiple granularity Eg: DB Table1 Table2 record1 record2 record-n attr2 attr1 attr1
Multiple granularity Allow data items to be of various sizes and define a hierarchy of data granularities Can be represented graphically as a tree (but don't confuse with tree-locking protocol) When a transaction locks a node in the tree explicitly, it implicitly locks all the node's descendents in the same mode. Locking granularity (level in tree where locking is done): fine granularity (lower in tree): high concurrency, high locking overhead coarse granularity (higher in tree): low locking overhead, low concurrency
what types of locks? X/S locks for leaf level + ‘intent’ locks, for higher levels IS: intent to obtain S-lock underneath IX: intent to obtain X-lock underneath S: shared lock for this level X: ex- lock for this level SIX: shared lock here; + IX
Protocol each xact obtains appropriate lock at highest level proceeds to desirable lower levels intention locks allow a higher level node to be locked in S or X mode without having to check all descendent nodes.
Protocol Transaction Ti can lock a node Q, using the following rules: 1. The lock compatibility matrix must be observed. 2. The root of the tree must be locked first, and may be locked in any mode. 3. A node Q can be locked by Ti in S or IS mode only if the parent of Q is currently locked by Ti in either IX or IS mode. 4. A node Q can be locked by Ti in X, SIX, or IX mode only if the parent of Q is currently locked by Ti in either IX or SIX mode. 5. Ti can lock a node only if it has not previously unlocked any node (that is, Ti is two-phase). 6. Ti can unlock a node Q only if none of the children of Q are currently locked by Ti. Observe that locks are acquired in root-to-leaf order, whereas they are released in leaf-to-root order.
Compatibility matrix T2 wants T1 has IS IX S SIX X t f
Conclusions ‘ACID’ for transactions concurrency: serializability (precedence graph) one (popular) solution: locks + 2PL(C) protocol graph protocols; multiple granularity