Atomic Commit and Concurrency Control

Slides:



Advertisements
Similar presentations
Database Systems (資料庫系統)
Advertisements

1 Concurrency Control Chapter Conflict Serializable Schedules  Two actions are in conflict if  they operate on the same DB item,  they belong.
Transaction Management: Concurrency Control CS634 Class 17, Apr 7, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of concurrency control Performance tuning.
Concurrency Control Enforcing Serializability by Locks
1 CS 194: Elections, Exclusion and Transactions Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
ICS 421 Spring 2010 Transactions & Concurrency Control (i) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa.
Transaction Processing Lecture ACID 2 phase commit.
Transaction Management and Concurrency Control
1 Transaction Management Overview Yanlei Diao UMass Amherst March 15, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke.
Transaction Processing: Concurrency and Serializability 10/4/05.
Transaction Management
Transactions or Concurrency Control. Introduction A program which operates on a DB performs 2 kinds of operations: –Access to the Database (Read/Write)
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
Concurrency Control. General Overview Relational model - SQL  Formal & commercial query languages Functional Dependencies Normalization Transaction Processing.
CS162 Section Lecture 10 Slides based from Lecture and
Concurrency control in transactional systems Jinyang Li Some slides adapted from Stonebraker and Madden.
08_Transactions_LECTURE2 DBMSs should guarantee ACID properties (Atomicity, Consistency, Isolation, Durability). This is typically done by guaranteeing.
BIS Database Systems School of Management, Business Information Systems, Assumption University A.Thanop Somprasong Chapter # 10 Transaction Management.
Transactions CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
TRANSACTION MANAGEMENT R.SARAVANAKUAMR. S.NAVEEN..
1 Transactions Chapter Transactions A transaction is: a logical unit of work a sequence of steps to accomplish a single task Can have multiple.
Concurrency Control in Database Operating Systems.
II.I Selected Database Issues: 2 - Transaction ManagementSlide 1/20 1 II. Selected Database Issues Part 2: Transaction Management Lecture 4 Lecturer: Chris.
Transactions and Concurrency Control. Concurrent Accesses to an Object Multiple threads Atomic operations Thread communication Fairness.
Page 1 Concurrency Control Paul Krzyzanowski Distributed Systems Except as otherwise noted, the content of this presentation.
1 Concurrency Control Lecture 22 Ramakrishnan - Chapter 19.
IM NTU Distributed Information Systems 2004 Distributed Transactions -- 1 Distributed Transactions Yih-Kuen Tsay Dept. of Information Management National.
1 Database Systems ( 資料庫系統 ) December 27, 2004 Chapter 17 By Hao-hua Chu ( 朱浩華 )
Transaction Management Exercises I/O and CPU activities can be and are overlapped to minimize (disk and processor) idle time and to maximize throughput.
Jinze Liu. ACID Atomicity: TX’s are either completely done or not done at all Consistency: TX’s should leave the database in a consistent state Isolation:
MULTIUSER DATABASES : Concurrency and Transaction Management.
CS 440 Database Management Systems
Concurrency Control, Locking, and Recovery
Transaction Management
CPS 512 midterm exam #1, 10/7/2016 Your name please: ___________________ NetID:___________ /60 /40 /10.
Concurrency control in transactional systems
Part- A Transaction Management
Two phase commit.
Transaction Management
COS 418: Advanced Computer Systems Lecture 5 Michael Freedman
Transaction Management Overview
Anthony D. Joseph and John Canny
IS 651: Distributed Systems Consensus
March 21st – Transactions
April 4, 2011 Ion Stoica CS162 Operating Systems and Systems Programming Lecture 18 Transactions April 4, 2011 Ion.
Anthony D. Joseph and Ion Stoica
Transaction Management
Concurrency Control II (OCC, MVCC)
March 9th – Transactions
Chapter 10 Transaction Management and Concurrency Control
Lecture 21: Concurrency & Locking
CS162 Operating Systems and Systems Programming Review (II)
Basic Two Phase Locking Protocol
ENFORCING SERIALIZABILITY BY LOCKS
Distributed Transactions
Lecture 21: Replication Control
Lecture 21: Intro to Transactions & Logging III
Concurrency Control II and Distributed Transactions
Lecture 22: Intro to Transactions & Logging IV
Transaction Management
Temple University – CIS Dept. CIS661 – Principles of Data Management
Transaction Management Overview
Lecture 21: Replication Control
Concurrency control (OCC and MVCC)
Database Systems (資料庫系統)
Lecture 18: Concurrency Control
Database Systems (資料庫系統)
Transactions, Properties of Transactions
Presentation transcript:

Atomic Commit and Concurrency Control COS 418: Distributed Systems Lecture 15 Wyatt Lloyd

Lets Scale Strong Consistency! Atomic Commit Two-phase commit (2PC) Serializability Strict serializability Concurrency Control: Two-phase locking (2PL) Optimistic concurrency control (OCC)

Atomic Commit Atomic: All or nothing Either all participants do something (commit) or no participant does anything (abort) Common use: commit a transaction that updates data on different shards

Transaction Examples Bank account transfer Turing -= $100 Lovelace += $100 Maintaining symmetric relationships Lovelace FriendOf Turing Turing FriendOf Lovelace Order product Charge customer card Decrement stock Ship stock

Relationship with Replication Replication (e.g., RAFT) is about doing the same thing multiple places to provide fault tolerance Sharding is about doing different things multiple places for scalability Atomic commit is about doing different things in different places together

Relationship with Replication Replication Dimension A-F G-L M-R S-Z A-F G-L M-R S-Z A-F G-L M-R S-Z Sharding Dimension

Focus on Sharding for Today Replication Dimension A-F G-L M-R S-Z A-F G-L M-R S-Z A-F G-L M-R S-Z Sharding Dimension

Atomic Commit Atomic: All or nothing Either all participants do something (commit) or no participant does anything (abort) Atomic commit is accomplished with the Two-phase commit protocol (2PC)

Two-Phase Commit Phase 1 Coordinator inspects all votes Phase 2 Coordinator sends Prepare requests to all participants Each participant votes yes or no Sends yes vote or no vote back to coordinator Typically acquires locks if they vote yes Coordinator inspects all votes If all yes, then commit If any no, then abort Phase 2 Coordinator sends Commit or Abort to all participants If commit, each participant does something Each participant releases locks Each participant sends an Ack back to the coordinator

Unilateral Abort Any participant can cause an abort With 100 participants, if 99 vote yes and 1 votes no => abort! Common reasons to abort: Cannot acquire required lock No memory or disk space available to do write Transaction constraint fails (e.g., Alan does not have $100) Q: Why do we want unilateral abort for atomic commit?

Atomic Commit All-or-nothing Unilateral abort Two-phase commit Prepare -> Commit/abort

Lets Scale Strong Consistency! Atomic Commit Two-phase commit (2PC) Serializability Strict serializability Concurrency Control: Two-phase locking (2PL) Optimistic concurrency control (OCC)

Two Concurrent Transactions transaction transfer(A, B): begin_tx a  read(A) if a < 10 then abort_tx else write(A, a−10) b  read(B) write(B, b+10) commit_tx transaction sum(A, B): begin_tx a  read(A) b  read(B) print a + b commit_tx

Isolation Between Transactions Isolation: sum appears to happen either completely before or completely after transfer i.e., it appears that all operations of a transaction happened together Schedule for transactions is an ordering of the operations performed by those transactions

Problem from Concurrent Execution Serial execution of transactions—transfer then sum: transfer: rA wA rB wB © sum: rA rB © Concurrent execution can result in state that differs from any serial execution: transfer: rA wA rB wB © Time  © = commit debit credit debit credit

Isolation Between Transactions Isolation: sum appears to happen either completely before or completely after transfer i.e., it appears that all operations of a transaction happened together Given a schedule of operations: Is that schedule in some way “equivalent” to a serial execution of transactions?

Equivalence of Schedules Two operations from different transactions are conflicting if: They read and write to the same data item The write and write to the same data item Two schedules are equivalent if: They contain the same transactions and operations They order all conflicting operations of non-aborting transactions in the same way Read-write conflict because changing their order changes the result Write-write conflict because changing their order changes the result Read-read do not conflict because changing their order does not change the result

Serializability A schedule is serializable if it is equivalent to some serial schedule i.e., non-conflicting operations can be reordered to get a serial schedule

A Serializable Schedule A schedule is serializable if it is equivalent to some serial schedule i.e., non-conflicting operations can be reordered to get a serial schedule transfer: rA wA rB wB © sum: rA rB © Time  © = commit rA Serial schedule Conflict-free!

A Non-Serializable Schedule A schedule is serializable if it is equivalent to some serial schedule i.e., non-conflicting operations can be reordered to get a serial schedule transfer: rA wA rB wB © sum: rA rB © Time  © = commit But in a serial schedule, sum’s reads either both before wA or both after wB Conflicting ops Conflicting ops

Linearizability vs. Serializability Linearizability: a guarantee about single operations on single objects Once write completes, all reads that begin later should reflect that write Serializability is a guarantee about transactions over one or more objects Doesn’t impose real- time constraints Strict Serializability = Serializability + real-time ordering Intuitively Serializability + Linearizability We’ll stick with only Strict Serializability for this class

Consistency Hierarchy Strict Serializability e.g., Spanner Linearizability e.g., RAFT Sequential Consistency CAP PRAM 1988 (Princeton) Causal+ Consistency e.g., Bayou Eventual Consistency e.g., Dynamo

Lets Scale Strong Consistency! Atomic Commit Two-phase commit (2PC) Serializability Strict serializability Concurrency Control: Two-phase locking (2PL) Optimistic concurrency control (OCC)

Concurrency Control Concurrent execution can violate serializability We need to control that concurrent execution so we do things a single machine executing transactions one at a time would Concurrency control

Concurrency Control Strawman #1 Big global lock Acquire the lock when transaction starts Release the lock when transaction ends Provides strict serializability Just like executing transaction one by one because we are doing exactly that No concurrency at all Terrible for performance: one transaction at a time

Locking Locks maintained on each shard Lock types Transaction requests lock for a data item Shard grants or denies lock Lock types Shared: Need to have before read object Exclusive: Need to have before write object Shared (S) Exclusive (X) Yes No

Concurrency Control Strawman #2 Grab locks independently, for each data item (e.g., bank accounts A and B) transfer: ◢A rA wA ◣A ◢B rB wB ◣B © sum: ◿A rA ◺A ◿B rB ◺B © Time  © = commit ◢ /◿ = eXclusive- / Shared-lock; ◣ / ◺ = X- / S-unlock Permits this non-serializable interleaving

Two-Phase Locking (2PL) 2PL rule: Once a transaction has released a lock it is not allowed to obtain any other locks Growing phase: transaction acquires locks Shrinking phase: transaction releases locks In practice: Growing phase is the entire transaction Shrinking phase is during commit

2PL Provide Strict Serializability 2PL rule: Once a transaction has released a lock it is not allowed to obtain any other locks transfer: ◢A rA wA ◣A ◢B rB wB ◣B © sum: ◿A rA ◺A ◿B rB ◺B © Time  © = commit ◢ /◿ = X- / S-lock; ◣ / ◺ = X- / S-unlock 2PL precludes this non-serializable interleaving Q: What is not allowed by 2pl here? Sum can’t grab a new lock on b once it releases its lock on A Transfer can’t grab a new lock on B once it released its lock on A

2PL and Transaction Concurrency 2PL rule: Once a transaction has released a lock it is not allowed to obtain any other locks transfer: ◿A rA ◢A wA ◿B rB ◢B wB✻© sum: ◿A rA ◿B rB✻© Time  © = commit ◢ /◿ = X- / S-lock; ◣ / ◺ = X- / S-unlock; ✻ = release all locks 2PL permits this serializable, interleaved schedule

2PL Doesn’t Exploit All Opportunities For Concurrency 2PL rule: Once a transaction has released a lock it is not allowed to obtain any other locks transfer: rA wA rB wB © sum: rA rB © Time  © = commit (locking not shown) 2PL precludes this serializable, interleaved schedule

Issues with 2PL What do we do if a lock is unavailable? Give up immediately? Wait forever? Waiting for a lock can result in deadlock Transfer has A locked, waiting on B Sum has B locked, waiting on A Many different ways to detect and deal with deadlocks

Lets Scale Strong Consistency! Atomic Commit Two-phase commit (2PC) Serializability Strict serializability Concurrency Control: Two-phase locking (2PL) Optimistic concurrency control (OCC)

2PL is Pessimistic Acquires locks to prevent all possible violations of serializability But leaves a lot of concurrency on the table that is okay

Be Optimistic! Goal: Low overhead for non-conflicting txns Assume success! Process transaction as if would succeed Check for serializability only at commit time If fails, abort transaction Optimistic Concurrency Control (OCC) Higher performance when few conflicts vs. locking Lower performance when many conflicts vs. locking

2PL vs OCC Conflict Rate From Rococo paper in OSDI 2014. Focus on 2PL vs. OCC. Observe OCC better when write rate lower (fewer conflicts), worse than 2PL with write rate higher (more conflicts)

Optimistic Concurrency Control Optimistic Execution: Execute reads against shards Buffer writes locally Validation and Commit: Validate that data is still the same as previously observed (i.e., reading now would give the same result) Commit the transaction by applying all buffered writes Need this to all happen together, how? Time to check t Time to use

OCC Uses 2PC Validation and Commit use Two-Phase Commit Client sends each shard a prepare Prepare includes read values and buffered writes for each shard Each shard acquires shared locks on read locations and exclusive locks on write locks Each shard checks if read values validate Each shard sends vote to client If all locks acquired and reads validate => Vote Yes Otherwise => Vote No Client collects all votes, if all yes then commit Client sends commit/abort to all shards If commit: shards apply buffered writes Shards release all locks Time to check t Time to use

Lets Scale Strong Consistency! Atomic Commit Two-phase commit (2PC) Serializability Strict serializability Concurrency Control: Two-phase locking (2PL) Optimistic concurrency control (OCC) Uses 2PC 2PL also uses 2PC, but we’ll leave discussing that to the Spanner lecture