Concurrency Control Nate Nystrom CS 632 February 6, 2001.

Papers  Berenson, Bernstein, Gray, et al., "A Critique of ANSI SQL Isolation Levels", SIGMOD'95  Kung and Robinson, "On Optimistic Methods for Concurrency Control", TODS June 1981  Agrawal, Carey, and Livny, "Models for Studying Concurrency Control Performance: Alternatives and Implications", SIGMOD'85

Concurrency control methods  Locking: by far the most popular method; prone to deadlock and starvation  Optimistic: high abort rates  Immediate restart

Isolation Levels  Serializability is expensive to enforce  Trade correctness for performance  Transactions can run at lower isolation levels  Repeatable read  Read committed  Read uncommitted

Basics  History: a sequence of operations  Ex: r1(x) r2(y) w1(y) c1 w2(x) a2  Dependencies: wr (true), rw (anti), ww (output)  H and H' are equivalent if H' is a reordering of H and H' has the same dependencies as H  H is serializable if ∃ a serial H' s.t. H ≡ H'  Concurrent T and T' conflict if both access the same item and at least one writes
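
The definitions above can be checked mechanically. The following is a minimal sketch, not taken from the slides: it assumes histories are written as strings in the r1(x)/w1(x)/c1/a1 notation, considers only committed transactions, and tests serializability by looking for a cycle in the dependency graph. The function names are hypothetical.

```python
import re
from itertools import combinations

def parse_history(text):
    """Parse 'r1(x) w2(y) c1 a2' into (op, txn, item) tuples; item is None for c/a."""
    return [(op, int(txn), item or None)
            for op, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", text)]

def dependencies(history):
    """Return (Ti, Tj, kind) edges, kind in {'wr', 'rw', 'ww'}, between committed txns."""
    committed = {t for op, t, _ in history if op == "c"}
    data_ops = [(op, t, x) for op, t, x in history if op in "rw" and t in committed]
    edges = set()
    # combinations() keeps history order, so the first op of each pair came earlier.
    for (op1, t1, x1), (op2, t2, x2) in combinations(data_ops, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((t1, t2, op1 + op2))
    return edges

def is_conflict_serializable(history):
    """True if the dependency graph is acyclic, i.e. some serial order keeps all edges."""
    successors = {}
    for t1, t2, _ in dependencies(history):
        successors.setdefault(t1, set()).add(t2)
        successors.setdefault(t2, set())
    remaining = dict(successors)
    while remaining:
        # A node with no incoming edge from a remaining node can go first.
        sources = [n for n in remaining
                   if not any(n in succ for m, succ in remaining.items() if m != n)]
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        for n in sources:
            del remaining[n]
    return True

if __name__ == "__main__":
    h = parse_history("r1(x) r2(y) w1(y) c1 w2(x) c2")
    print(sorted(dependencies(h)), is_conflict_serializable(h))
```

In the slide's example history T2 aborts, so only T1 contributes operations and the check passes trivially; replacing a2 with c2 yields rw edges in both directions and the check fails.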

ANSI SQL Isolation Levels  Defined in terms of the anomalies each level proscribes  Read Uncommitted - everything allowed  Read Committed - prohibits dirty reads  Repeatable Read - prohibits dirty reads, fuzzy reads  Serializable - prohibits dirty reads, fuzzy reads, phantoms

Problems  Anomalies are ambiguous  w1(x) ... r2(x) ... (a1 & c2 in any order)  w1(x) ... r2(x) ... ((c1 | a1) & (c2 | a2) in any order)  First case is the strict interpretation (an anomaly), second is the loose interpretation (a phenomenon)  Anomalies don't prevent some undesirable behavior  Ex: Phantom defined to include inserts and updates, but not deletes

Locking  T has well-formed writes (reads) if it requests a write (read) lock before each write (read)  T is two-phase if it does not request any lock after releasing a lock  Locks are long duration if held until commit or abort, else short duration  Theorem: well-formed two-phase locking guarantees serializability
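
As an illustration of the two properties, here is a minimal sketch that checks a single transaction's action trace. The trace format (pairs such as ("rlock", "x") or ("write", "y")) and the function names are assumptions made for this example, not part of the slides.

```python
def is_well_formed(trace):
    """True if every read holds some lock and every write holds a write lock."""
    held = {}  # item -> strongest lock mode currently held: 'r' or 'w'
    for action, item in trace:
        if action == "rlock":
            held.setdefault(item, "r")      # never downgrade an existing write lock
        elif action == "wlock":
            held[item] = "w"
        elif action == "unlock":
            held.pop(item, None)
        elif action == "read" and item not in held:
            return False
        elif action == "write" and held.get(item) != "w":
            return False
    return True

def is_two_phase(trace):
    """True if no lock is requested after the first unlock."""
    released = False
    for action, _ in trace:
        if action == "unlock":
            released = True
        elif action in ("rlock", "wlock") and released:
            return False
    return True

if __name__ == "__main__":
    trace = [("rlock", "x"), ("read", "x"), ("unlock", "x"),
             ("wlock", "y"), ("write", "y"), ("unlock", "y")]
    # Well-formed, but not two-phase: y is locked after x was released.
    print(is_well_formed(trace), is_two_phase(trace))
```

A trace can be well-formed without being two-phase, as in the usage example; the theorem needs both properties.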

Locking Isolation Levels  Degree 0 - well-formed writes (short write locks)  Degree 1 (read uncommitted) - long duration write locks  Degree 2 (read committed) - short read locks, long write locks  Repeatable read - short predicate read locks, long item read locks, long write locks  Degree 3 (serializable) - long read locks, long write locks

Dirty Writes  ANSI definitions lack a prohibition of dirty writes  w1(x) ... w2(x) ... ((c1 | a1) & (c2 | a2) in any order)  With dirty writes allowed, rollback is difficult to implement (with locking CC)  Prohibiting dirty writes serializes txns in write order (all ww dependencies go forward)

New Definitions  Use the loose interpretation  Fix the definition of phantom to cover deletes  Prohibit dirty writes  Read Uncommitted - prohibits dirty writes  Read Committed - prohibits dirty writes, dirty reads  Repeatable Read - prohibits dirty writes, dirty reads, fuzzy reads  Serializable - prohibits dirty writes, dirty reads, fuzzy reads, phantoms

More Problems  New definitions are too strong  Prohibits some serializable histories  r1(x) w1(x) r1(y) w1(y) r2(x) r2(y) c1 c2  T2 has dirty reads according to the proposed new definitions  Prohibiting dirty writes useful for recovery with locking CC, but not helpful for optimistic CC
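
To see why the history above is flagged, the loose interpretation of a dirty read (a read of x by one transaction after another has written x and before the writer commits or aborts) can be sketched as a scan over the history. This is an illustrative sketch with hypothetical function names, assuming the same string notation as the slides.

```python
import re

def parse_history(text):
    """Parse 'r1(x) w2(y) c1 a2' into (op, txn, item) tuples; item is None for c/a."""
    return [(op, int(txn), item or None)
            for op, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", text)]

def dirty_reads(history):
    """Return (writer, reader, item) for each read of an item written by a still-active txn."""
    active_writers = {}  # item -> set of txns that wrote it and have not yet ended
    found = []
    for op, txn, item in history:
        if op == "w":
            active_writers.setdefault(item, set()).add(txn)
        elif op == "r":
            for writer in sorted(active_writers.get(item, ())):
                if writer != txn:
                    found.append((writer, txn, item))
        else:  # commit or abort ends the transaction; its writes are no longer dirty
            for writers in active_writers.values():
                writers.discard(txn)
    return found

if __name__ == "__main__":
    h = parse_history("r1(x) w1(x) r1(y) w1(y) r2(x) r2(y) c1 c2")
    print(dirty_reads(h))
```

On the slide's history the scan reports that T2 read both x and y while T1 was still active, even though the history is equivalent to the serial order T1, T2.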

Other Isolation Levels  Cursor stability  Prevent lost updates by adding cursor reads  Stronger than read committed  Weaker than repeatable read  Snapshot isolation  Read from/write to a snapshot of the committed data as of the time the transaction started  Stronger than read committed  Incomparable to repeatable read
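
A minimal single-threaded sketch of snapshot isolation follows. It assumes a multiversion in-memory store and the first-committer-wins rule described in the Berenson et al. paper (a transaction aborts at commit if another transaction committed a write to the same item after its snapshot was taken); the class and method names are hypothetical.

```python
class SnapshotStore:
    """In-memory multiversion store: each key maps to a list of (commit_ts, value)."""
    def __init__(self):
        self.versions = {}
        self.clock = 0  # timestamp of the most recent commit

    def begin(self):
        return SnapshotTxn(self, self.clock)

class SnapshotTxn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts  # the snapshot is all data committed at or before this
        self.writes = {}          # private write buffer, invisible to other txns

    def read(self, key):
        if key in self.writes:    # a transaction sees its own writes
            return self.writes[key]
        for ts, value in reversed(self.store.versions.get(key, [])):
            if ts <= self.start_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        """Apply buffered writes; return False (abort) under first-committer-wins."""
        for key in self.writes:
            versions = self.store.versions.get(key, [])
            if versions and versions[-1][0] > self.start_ts:
                return False      # someone committed this key after our snapshot
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
        return True

if __name__ == "__main__":
    store = SnapshotStore()
    t1, t2 = store.begin(), store.begin()
    t1.write("x", 1)
    t2.write("x", 2)
    print(t1.commit(), t2.commit())  # the second committer loses
```

Reads never block and never see writes committed after the snapshot, which is why this level is stronger than read committed; the commit check covers only write sets, so read-write conflicts go undetected.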

Optimistic Concurrency Control  Divide transaction into read, validate, and write phases  Validation checks if transaction can be inserted into a serializable history  Why: lower message cost, little blocking in low contention environments, no deadlock  Why not: abort rates can be high, not suitable for interactive, non-restartable, transactions

Validation  Assign transaction i a unique number t(i)  Validation condition: for all i and for all j with t(i) < t(j), one of the following must hold:  (1) i completes its write phase before j starts its read phase  (2) i completes its write phase before j starts its write phase, and WS(i) ∩ RS(j) = ∅  (3) i completes its read phase before j completes its read phase, and WS(i) ∩ (RS(j) ∪ WS(j)) = ∅

Validation  [Figure: read/validate/write phase timelines for transactions i and j illustrating conditions 1-3; condition 2 is annotated WS(i) ∩ RS(j) = ∅ and condition 3 is annotated WS(i) ∩ (RS(j) ∪ WS(j)) = ∅]

Transaction numbers  What should t(i) be?  Unique timestamp assigned at beginning of validation phase  Guarantees that i completes read phase before j completes read phase if t(i) < t(j)

Serial Implementation  Ensure one of conditions (1) or (2) holds  At transaction begin, record start tn  At transaction end, record finish tn  Validate against all t in [start tn+1, finish tn] by checking if RS intersects WS(t)  (2) requires that concurrent transactions' write phases be serial: put validation, assignment of tn, and write phase in a critical section  Various optimizations to reduce the size of the critical section
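
The steps above can be sketched as follows. This is a minimal illustration rather than the paper's code: it assumes read and write sets are Python sets, uses a single `threading.Lock` as the critical section, and never discards old write sets. The class name and the `apply_writes` callback are hypothetical.

```python
import threading

class SerialValidator:
    """Serial validation: validate, write, and assign tn inside one critical section."""
    def __init__(self):
        self.tnc = 0            # transaction number counter (last tn assigned)
        self.write_sets = {}    # tn -> write set of the committed transaction
        self.lock = threading.Lock()

    def begin(self):
        """Record start tn: the highest tn assigned when the read phase begins."""
        return self.tnc

    def commit(self, start_tn, read_set, write_set, apply_writes):
        """Return the assigned tn on success, or None if validation fails."""
        read_set = set(read_set)
        with self.lock:                       # critical section
            finish_tn = self.tnc
            # Validate against every transaction that committed during our read phase.
            for tn in range(start_tn + 1, finish_tn + 1):
                if self.write_sets[tn] & read_set:
                    return None               # caller should restart the transaction
            apply_writes()                    # write phase, serialized by the lock
            self.tnc += 1
            self.write_sets[self.tnc] = frozenset(write_set)
            return self.tnc

if __name__ == "__main__":
    db = {}
    v = SerialValidator()
    s1, s2 = v.begin(), v.begin()
    print(v.commit(s1, {"x"}, {"x"}, lambda: db.update(x=1)))
    print(v.commit(s2, {"x"}, {"x"}, lambda: db.update(x=2)))  # conflicts with the first
```

Holding the lock through the write phase is what makes the write phases serial; the optimizations mentioned above aim to shrink that window.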

Parallel Implementation  Ensure one of (1), (2), or (3) holds  At transaction end, take a snapshot of the active set, then add tid to the active set  Validate outside CS against:  All t in [start tn+1, finish tn] by checking if RS intersects WS(t)  All t in our snapshot of active by checking if RS or WS intersects WS(t)  If valid, perform writes outside CS, assign tn, and remove from active set

Performance  Agrawal: previous studies flawed  Different performance models → contradictions  Flawed assumptions  Infinite resources  Transactions progress at a rate independent of number of concurrent transactions  Need a more complete, more realistic model

Logical Queuing Model  [Figure: logical queuing model with terminals, ready queue, blocked queue, object queue, and update queue; transitions labeled CC, ACCESS, BLOCK, RESTART, UPDATE, and COMMIT; think and update delay states]

Experiments  Compare locking, optimistic, and immediate-restart CC  Low contention (large database)  Infinite resources  Limited resources (small database)  Multiple resources  Interactive workloads

Throughput

Limited Resources  Under low contention, throughput tracks disk utilization  Under high contention, throughput tracks useful disk utilization  High contention → aborts and restarts

Response Time

Multiple Resources  As resources increase, non-blocking CC scales better than blocking  Blocking CC thrashes waiting for locks  Optimistic CC thrashes on restarts  Immediate-restart CC reaches a plateau due to adaptive restart delay

Conclusions  Locking has better throughput for medium to high contention environments  If resource utilization is low enough that waste can be tolerated, immediate-restart and optimistic CC have better throughput  Limit multiprogramming level to avoid thrashing due to blocking and restarts
