Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden.

Presentation transcript:

Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden

What we’ve learnt so far…
All-or-nothing atomicity in transactions
–So that failures do not result in (undesirable) intermediate state
How to realize all-or-nothing atomicity?
–Logging
–REDO/UNDO logging vs. REDO-only logging

Concurrency control
Many application clients are concurrently accessing a storage system.
Concurrent transactions might interfere
–Similar to data races in parallel programs

An example
Three transactions run against the same storage system.
Transfer $100 from A to B:
    A_bal = READ(A)
    If (A_bal > 100) {
        B_bal = READ(B)
        B_bal += 100
        A_bal -= 100
        WRITE(A, A_bal)
        WRITE(B, B_bal)
    }
Withdraw $50 from A:
    A_bal = READ(A)
    If (A_bal > 50) {
        A_bal -= 50
        WRITE(A, A_bal)
        dispense $50 to user
    }
Report sum of money:
    A_bal = READ(A)
    B_bal = READ(B)
    Print(A_bal+B_bal)

Yfs lab example
Two yfs clients issue CREATE against the same extent server.
yfs client 1 (CREATE):
    d = GET(dir)
    newd = append “x” to d
    PUT(newd)
yfs client 2 (CREATE):
    d = GET(dir)
    newd = append “y” to d
    PUT(newd)
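The interference between the two CREATE operations can be replayed deterministically. The sketch below is only an illustration: `store`, `get` and `put` are hypothetical in-memory stand-ins for the extent server, not the lab's real interface. It runs one bad interleaving (both clients GET before either PUTs) and then the serial order.

```python
# Hypothetical in-memory stand-in for the extent server (an assumption for
# illustration, not the yfs lab API).
store = {"dir": []}

def get(key):
    # Return a copy, as a remote GET would hand back its own copy.
    return list(store[key])

def put(key, value):
    store[key] = list(value)

# Bad interleaving: both clients read the directory before either writes.
d1 = get("dir")              # client 1 sees []
d2 = get("dir")              # client 2 sees []
put("dir", d1 + ["x"])       # client 1 writes ["x"]
put("dir", d2 + ["y"])       # client 2 overwrites it with ["y"]
racy_result = get("dir")

# Serial execution: client 1 finishes before client 2 starts.
store["dir"] = []
d1 = get("dir")
put("dir", d1 + ["x"])
d2 = get("dir")
put("dir", d2 + ["y"])
serial_result = get("dir")

print("interleaved:", racy_result)
print("serial:     ", serial_result)
```

In the interleaved run the entry created by client 1 is lost, which is the kind of outcome concurrency control has to rule out.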

Solutions?
1. Leave it to application programmers
–Application programmers are responsible for locking data items
2. Storage system performs automatic concurrency control
–Concurrent transactions execute as if serially

ACID transactions
A (Atomicity)
–All-or-nothing w.r.t. failures
C (Consistency)
–Transactions maintain any internal storage state invariants
I (Isolation)
–Concurrently executing transactions do not interfere
D (Durability)
–Effects of transactions survive failures

What’s ACID? An example
T1: Transfer $100 from A to B
T2: Transfer $50 from B to A
A: T1 completes or nothing (ditto for T2)
C: preserves invariants, e.g. account balance > 0
I: no races, as if T1 happens either before or after T2
D: once T1/T2 commits, stays done, no updates lost

Ideal isolation semantics: serializability
Definition: execution of a set of transactions is equivalent to some serial order
–Two executions are equivalent if they have the same effect on the database and produce the same output.

Conflict serializability
An execution schedule is the ordering of read/write/commit/abort operations
Transfer:
    A_bal = READ(A)
    B_bal = READ(B)
    B_bal += 100
    A_bal -= 100
    WRITE(A, A_bal)
    WRITE(B, B_bal)
    Commit
Report sum:
    A_bal = READ(A)
    B_bal = READ(B)
    Print(A_bal+B_bal)
    Commit
A (serial) schedule: R(A),R(B),W(A),W(B),C,R(A),R(B),C

Conflict serializability
Two ops conflict if they belong to different transactions, access the same data item, and at least one is a write.
Two schedules are equivalent if they:
–contain the same operations
–order conflicting operations the same way
A schedule is serializable if it is equivalent to some serial schedule

Examples
Report sum:
    A_bal = READ(A)
    B_bal = READ(B)
    Print(A_bal+B_bal)
Transfer:
    A_bal = READ(A)
    B_bal = READ(B)
    B_bal += 100
    A_bal -= 100
    WRITE(A, A_bal)
    WRITE(B, B_bal)
Serializable? R(A),R(B),R(A),R(B),C, W(A),W(B),C
–Equivalent serial schedule: R(A),R(B),C,R(A),R(B),W(A),W(B),C
Serializable? R(A),R(B), W(A), R(A),R(B),C, W(B),C
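One standard way to test conflict serializability is to build a precedence graph (an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj) and check that it has no cycle. The sketch below applies that test to the two example schedules. The names `T1` (transfer) and `T2` (report sum), the tuple encoding, and the function names are assumptions made for this illustration; commits are left out because they do not conflict with anything.

```python
from itertools import combinations

def precedence_edges(schedule):
    """schedule: list of (txn, op, item) tuples with op in {"R", "W"}.
    Returns the set of (earlier_txn, later_txn) pairs for conflicting ops."""
    edges = set()
    # combinations() over an ordered list yields pairs in schedule order.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))
    return edges

def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by repeatedly
    removing transactions that have no incoming edge)."""
    edges = precedence_edges(schedule)
    remaining = {txn for txn, _, _ in schedule}
    while remaining:
        free = {n for n in remaining
                if not any(dst == n and src in remaining for src, dst in edges)}
        if not free:
            return False  # every remaining transaction sits on a cycle
        remaining -= free
    return True

# First example: the report (T2) reads both items before the transfer (T1) writes.
s1 = [("T2", "R", "A"), ("T2", "R", "B"), ("T1", "R", "A"), ("T1", "R", "B"),
      ("T1", "W", "A"), ("T1", "W", "B")]
# Second example: T2 reads A after T1 wrote it, but reads B before T1 writes it.
s2 = [("T1", "R", "A"), ("T1", "R", "B"), ("T1", "W", "A"),
      ("T2", "R", "A"), ("T2", "R", "B"), ("T1", "W", "B")]

print(is_conflict_serializable(s1), precedence_edges(s1))
print(is_conflict_serializable(s2), precedence_edges(s2))
```

The second schedule produces edges in both directions between the two transactions, so no serial order can respect all of its conflicts.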

Realize a serializable schedule
Locking-based approach
Strawman solution 1:
–Grab global lock before transaction starts
–Release global lock after transaction commits
Strawman solution 2:
–Grab lock on item X before reading/writing X
–Release lock on X after reading/writing X

Strawman 2’s problem
Report sum:
    A_bal = READ(A)
    B_bal = READ(B)
    Print(A_bal+B_bal)
Transfer:
    A_bal = READ(A)
    B_bal = READ(B)
    B_bal += 100
    A_bal -= 100
    WRITE(A, A_bal)
    WRITE(B, B_bal)
Possible with strawman 2? (short-duration locks on writes)
R(A),R(B), W(A), R(A),R(B),C, W(B),C
–Read an uncommitted value
–Locks on writes should be held till end of transaction

More strawmen
Strawman 3
–Grab lock on item X before writing X
–Release write locks at the end of transaction
–Grab lock on item X before reading X
–Release read locks immediately after reading

Strawman 3’s problem
Report sum:
    A_bal = READ(A)
    B_bal = READ(B)
    Print(A_bal+B_bal)
Transfer:
    A_bal = READ(A)
    B_bal = READ(B)
    B_bal += 100
    A_bal -= 100
    WRITE(A, A_bal)
    WRITE(B, B_bal)
Possible with strawman 3? (short-duration locks on reads)
R(A),R(B), W(A), R(A),R(B),C, W(B),C
R(A), R(A),R(B), W(A), W(B),C, R(B), C
–Non-repeatable reads
–Read locks must be held till commit time

Realize a serializable schedule: 2 phase locking (2PL)
–A growing phase in which the transaction is acquiring locks
–A shrinking phase in which locks are released
In practice,
–The growing phase is the entire transaction
–The shrinking phase is at the commit time
Optimization:
–Use read/write locks instead of exclusive locks

2PL: an example
Transfer:
    R_lock(A)
    A_bal = READ(A)
    R_lock(B)
    B_bal = READ(B)
    B_bal += 100
    A_bal -= 100
    W_lock(A)
    WRITE(A, A_bal)
    W_lock(B)
    WRITE(B, B_bal)
    W_unlock(A)
    W_unlock(B)
Report sum:
    R_lock(A)
    A_bal = READ(A)
    R_lock(B)
    B_bal = READ(B)
    Print(A_bal+B_bal)
    R_unlock(A)
    R_unlock(B)
Possible? R(A),R(B), W(A), R(A),R(B),C, W(B),C
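To explore which interleavings 2PL admits, a schedule can be replayed against a small lock table that grants read/write locks and releases them only at commit. This is a minimal sketch under stated assumptions: `LockTable`, `first_blocked`, the transaction names `T1` (transfer) and `T2` (report sum), and the schedule encoding are all hypothetical, and a real lock manager would make the requester wait instead of returning `False`.

```python
class LockTable:
    """Read/write locks held until commit (hypothetical sketch)."""

    def __init__(self):
        self.readers = {}  # item -> set of txns holding a read lock
        self.writer = {}   # item -> txn holding the write lock

    def try_read(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another txn holds the write lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def try_write(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another txn holds the write lock
        if self.readers.get(item, set()) - {txn}:
            return False  # another txn holds a read lock
        self.writer[item] = txn  # also covers upgrading our own read lock
        return True

    def release_all(self, txn):
        # Shrinking phase: everything is released at commit time.
        for holders in self.readers.values():
            holders.discard(txn)
        for item in [i for i, w in self.writer.items() if w == txn]:
            del self.writer[item]

def first_blocked(schedule):
    """Replay (txn, op, item) steps; op is "R", "W" or "C" (commit).
    Returns the first step whose lock cannot be granted, else None."""
    locks = LockTable()
    for txn, op, item in schedule:
        if op == "C":
            locks.release_all(txn)
        elif op == "R" and not locks.try_read(txn, item):
            return (txn, op, item)
        elif op == "W" and not locks.try_write(txn, item):
            return (txn, op, item)
    return None

# The interleaving asked about above: T2 reads A between T1's two writes.
interleaved = [("T1", "R", "A"), ("T1", "R", "B"), ("T1", "W", "A"),
               ("T2", "R", "A"), ("T2", "R", "B"), ("T2", "C", None),
               ("T1", "W", "B"), ("T1", "C", None)]
# T2 reads and commits before T1 writes anything.
reads_first = [("T2", "R", "A"), ("T2", "R", "B"), ("T1", "R", "A"),
               ("T1", "R", "B"), ("T2", "C", None), ("T1", "W", "A"),
               ("T1", "W", "B"), ("T1", "C", None)]

print(first_blocked(interleaved))
print(first_blocked(reads_first))
```

In the first replay T2's read of A cannot be granted while T1 still holds the write lock on A, so that interleaving cannot occur as written; the second replay goes through because shared read locks are compatible with each other.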

More on 2PL
What if a lock is unavailable?
–Wait
Deadlocks possible?
How to cope with deadlock?
–Grab locks in order? Not always possible
–Transaction manager detects deadlock cycles and aborts affected transactions
–Alternative: timeout and abort yourself
Use crash recovery mechanism (e.g. UNDO records) to clean up
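Detecting deadlock cycles is commonly done on a waits-for graph, where an edge T -> U means T is waiting for a lock that U holds. The sketch below is one simple way to find a cycle with depth-first search; the function name and the dictionary representation are assumptions for illustration, and the choice of which transaction on the cycle to abort is left open.

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping a txn to the set of txns it is waiting on.
    Returns a list of txns forming a cycle, or None if there is none."""
    finished = set()  # txns whose outgoing paths were fully explored

    def dfs(node, path):
        if node in path:
            return path[path.index(node):]  # walked back onto the current path
        if node in finished:
            return None
        finished.add(node)
        for nxt in waits_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for txn in waits_for:
        cycle = dfs(txn, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2 and T2 waits for T1; T3 is merely queued behind T1.
deadlocked = {"T1": {"T2"}, "T2": {"T1"}, "T3": {"T1"}}
# A plain waiting chain with no cycle.
waiting = {"T1": {"T2"}, "T2": {"T3"}, "T3": set()}

print(find_deadlock(deadlocked))
print(find_deadlock(waiting))
```

Aborting any one transaction on the returned cycle and undoing its writes breaks the deadlock; T3 in the first example is not part of the cycle and only needs to keep waiting.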

The Phantom problem
T1:
    begin_tx
    update emp (set salary = 1.1 * salary) where dept = ‘CS’
    end_tx
T2:
    begin_tx
    update emp (set dept = ‘CS’) where emp.name = ‘jinyang’
    end_tx
Execution:
    T1: update Bob
    T1: update Herbert
    T2: move Jinyang
    T2: move Lakshmi
    T2 commits and releases all locks
    T1: update Lakshmi (!!!!!)
    T1: update Michael
    T1: commits

The Phantom problem
Issue: a transaction locks the things it touches, but it must also guarantee the non-existence of any new ones!
Solutions:
–Predicate locking
–Range locks in a B-tree index (assumes an index on dept). Otherwise, lock entire table
–Often ignored in practice

Why less than serializability?
T1: begin_tx Select avg (sal) from emp end_tx
T2: begin_tx Update emp … end_tx
Performance of locking?
What to do:
–run analytics on a companion data warehouse (common practice)
–take the stall (rarely done)
–run less than serializability

Degrees of isolation
Degree 0 (read uncommitted)
–no locks (guarantees nothing)
Degree 1 (read committed)
–Short read locks, long write locks
Degree 2
–Long read/write locks. (serializable unless you squint)
Degree 3
–Long read/write locks plus solving the phantom problem

Disadvantages of locking-based approach
Need to detect deadlocks
–Distributed implementation needs distributed deadlock detection (bad)
Read-only transactions can block update transactions
–Big performance hit if there are long-running read-only transactions

Multi-version concurrency control
No locking during transaction execution!
Each data item is associated with multiple versions
Multi-version transactions:
–Buffer writes during execution
–Reads choose the appropriate version
–At commit time, system validates if okay to make writes visible (by generating new versions)

Snapshot isolation
A popular multi-version concurrency control scheme
A transaction:
–Reads a “snapshot” of the database image
–Can commit only if there are no write-write conflicts with concurrent transactions

Implementing snapshot isolation
–T is assigned a start timestamp, T.sts
–R(A): T reads the biggest version of A, A(i), such that i <= T.sts
–W(B): T buffers writes to B and adds B to its writeset, T.wset += {B}
–At commit, T is assigned a commit timestamp, T.cts
–System checks: forall T’ s.t. T’.cts > T.sts && T’.cts < T.cts, T’.wset and T.wset do not overlap
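The steps above can be put together in a few lines. The sketch below is a single-threaded, in-memory illustration only: the class names `Store` and `Txn`, the shared counter used for timestamps, and the list-based version store are assumptions, and a real system would also need to garbage-collect old versions and make the commit check atomic.

```python
class Store:
    """Multi-version store with a logical clock (hypothetical sketch)."""

    def __init__(self):
        self.clock = 0
        self.versions = {}   # item -> list of (commit_ts, value), oldest first
        self.committed = []  # list of (commit_ts, write set)

    def begin(self):
        self.clock += 1
        return Txn(self, self.clock)

class Txn:
    def __init__(self, store, sts):
        self.store = store
        self.sts = sts    # start timestamp, T.sts
        self.writes = {}  # buffered writes; its keys are T.wset

    def read(self, item):
        if item in self.writes:
            return self.writes[item]  # read your own buffered write
        visible = [value for ts, value in self.store.versions.get(item, [])
                   if ts <= self.sts]
        return visible[-1] if visible else None  # biggest version <= T.sts

    def write(self, item, value):
        self.writes[item] = value

    def commit(self):
        store = self.store
        store.clock += 1
        cts = store.clock  # commit timestamp, T.cts
        for other_cts, other_wset in store.committed:
            # Abort if a txn that committed during our lifetime wrote the same item.
            if self.sts < other_cts < cts and other_wset & set(self.writes):
                return False
        for item, value in self.writes.items():
            store.versions.setdefault(item, []).append((cts, value))
        store.committed.append((cts, set(self.writes)))
        return True

store = Store()
setup = store.begin()
setup.write("A", 100)
setup.write("B", 100)
setup_ok = setup.commit()

t1 = store.begin()
t2 = store.begin()
t1.write("A", 50)
t1_ok = t1.commit()
t2_sees = t2.read("A")   # t2 still reads from its snapshot
t2.write("A", 0)
t2_ok = t2.commit()      # write-write conflict with t1

print(setup_ok, t1_ok, t2_sees, t2_ok)
```

Here `t2` keeps reading the value of A from its own snapshot even after `t1` commits, and its commit is refused because `t1` wrote A in the interval between `t2`'s start and commit timestamps.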

An example R(A) R(B) W(B)W(A) R(A) R(B) No read-uncommitted No unrepeatable-read

Snapshot isolation < serializability
The write-skew problem
Possible under SI? R(A),R(B),W(B), W(A),C,C
Serializable?
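Write skew can be reproduced with two transactions that read the same snapshot and write disjoint items. The scenario below is an assumed example, not taken from the slides: two accounts hold $50 each, the application rule is that a withdrawal is allowed only if the combined balance covers it, and `withdraw` is a hypothetical helper that returns a transaction's buffered writes.

```python
def withdraw(snapshot, account, amount):
    """Return the buffered write set for withdrawing `amount` from `account`,
    allowed only if the combined balance seen in the snapshot covers it."""
    if snapshot["A"] + snapshot["B"] >= amount:
        return {account: snapshot[account] - amount}
    return {}

# Under snapshot isolation: both transactions start before either commits.
db = {"A": 50, "B": 50}
snap1 = dict(db)
snap2 = dict(db)
w1 = withdraw(snap1, "A", 100)   # T1 writes only A
w2 = withdraw(snap2, "B", 100)   # T2 writes only B
write_write_conflict = bool(w1.keys() & w2.keys())
if not write_write_conflict:     # the SI commit check passes for both
    db.update(w1)
    db.update(w2)
si_total = db["A"] + db["B"]

# Serial execution: T2 starts after T1 has committed.
serial = {"A": 50, "B": 50}
serial.update(withdraw(dict(serial), "A", 100))
serial.update(withdraw(dict(serial), "B", 100))
serial_total = serial["A"] + serial["B"]

print("snapshot isolation total:", si_total)
print("serial total:            ", serial_total)
```

Neither transaction overwrites the other's item, so the write-set check lets both commit, yet the combined balance ends up negative, which no serial order of the two withdrawals would produce.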