Lecture 09 – Distributed Concurrency Management

Slides:



Advertisements
Similar presentations
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Transaction Management Overview Chapter 16.
Advertisements

TRANSACTION PROCESSING SYSTEM ROHIT KHOKHER. TRANSACTION RECOVERY TRANSACTION RECOVERY TRANSACTION STATES SERIALIZABILITY CONFLICT SERIALIZABILITY VIEW.
Lock-Based Concurrency Control
(c) Oded Shmueli Distributed Recovery, Lecture 7 (BHG, Chap.7)
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture X: Transactions.
Computer Science Lecture 18, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
OCT Distributed Transaction1 Lecture 13: Distributed Transactions Notes adapted from Tanenbaum’s “Distributed Systems Principles and Paradigms”
Distributed Systems Fall 2010 Transactions and concurrency control.
Computer Science Lecture 17, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
Persistent State Service 1 Distributed Object Transactions  Transaction principles  Concurrency control  The two-phase commit protocol  Services for.
Transaction Management
©Silberschatz, Korth and Sudarshan19.1Database System Concepts Distributed Transactions Transaction may access data at several sites. Each site has a local.
1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Distributed Systems Fall 2009 Distributed transactions.
TRANSACTION PROCESSING TECHNIQUES BY SON NGUYEN VIJAY RAO.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Distributed Commit Dr. Yingwu Zhu. Failures in a distributed system Consistency requires agreement among multiple servers – Is transaction X committed?
CS162 Section Lecture 10 Slides based from Lecture and
Distributed Transactions March 15, Transactions What is a Distributed Transaction?  A transaction that involves more than one server  Network.
Transaction Communications Yi Sun. Outline Transaction ACID Property Distributed transaction Two phase commit protocol Nested transaction.
Distributed Transactions Chapter 13
CSE 486/586 CSE 486/586 Distributed Systems Concurrency Control Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Concurrency Control Steve Ko Computer Sciences and Engineering University at Buffalo.
Operating Systems Distributed Coordination. Topics –Event Ordering –Mutual Exclusion –Atomicity –Concurrency Control Topics –Event Ordering –Mutual Exclusion.
Transactions and Concurrency Control Distribuerade Informationssystem, 1DT060, HT 2013 Adapted from, Copyright, Frederik Hermans.
Fault Tolerance CSCI 4780/6780. Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!!
Lecture 11 – Distributed Concurrency Management Tuesday Oct 5 th, Distributed Systems.
Distributed Transactions Chapter – Vidya Satyanarayanan.
Transactions and Concurrency Control. Concurrent Accesses to an Object Multiple threads Atomic operations Thread communication Fairness.
Two-Phase Commit Brad Karp UCL Computer Science CS GZ03 / M th October, 2008.
IM NTU Distributed Information Systems 2004 Distributed Transactions -- 1 Distributed Transactions Yih-Kuen Tsay Dept. of Information Management National.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
Multidatabase Transaction Management COP5711. Multidatabase Transaction Management Outline Review - Transaction Processing Multidatabase Transaction Management.
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
Distributed Transactions What is a transaction? (A sequence of server operations that must be carried out atomically ) ACID properties - what are these.
Distributed Databases – Advanced Concepts Chapter 25 in Textbook.
Atomic Transactions in Distributed Systems
Transaction Management
Outline Introduction Background Distributed DBMS Architecture
CSC 8320 Advanced Operating Systems Xueting Liao
Two phase commit.
COS 418: Distributed Systems Lecture 6 Daniel Suo
Transaction Management Overview
IS 651: Distributed Systems Consensus
Commit Protocols CS60002: Distributed Systems
Outline Announcements Fault Tolerance.
Lecture 21: Concurrency & Locking
CSE 486/586 Distributed Systems Concurrency Control --- 3
Distributed Transactions
Lecture 21: Replication Control
Atomic Commit and Concurrency Control
Causal Consistency and Two-Phase Commit
Distributed Systems Lecture 09 – Distributed Concurrency Management Tuesday Sept 29th, 2016.
Distributed Transactions
Exercises for Chapter 14: Distributed Transactions
Distributed Databases Recovery
Distributed Transactions
Transaction Management
UNIVERSITAS GUNADARMA
Transactions in Distributed Systems
Distributed Transactions
Transaction Management Overview
Distributed Transactions
Lecture 21: Replication Control
CSE 486/586 Distributed Systems Concurrency Control --- 3
Last Class: Fault Tolerance
Transaction Communication
Transactions, Properties of Transactions
Presentation transcript:

Lecture 09 – Distributed Concurrency Management 15-440 Distributed Systems Lecture 09 – Distributed Concurrency Management The Two Phase Commit Protocol Thursday, September 27th, 2018

Logistics Updates P1 Part A checkpoint ( was due 9/25) Part A due: Saturday 10/6 (6-week drop deadline 10/8) Part B due: Tuesday 10/16 HW2 will be released 10/01 HW 2 due: Friday, 10/10 (tentative, maybe 10/12) (*No Late Days*) => time to prepare for Mid term We’re currently grading HW1 HW1 solutions should come online soon

Today's Lecture Outline Transactions and Consistency Database terminology Part I: Single Server Case (not covered well in book) Two Phase Locking Part II: Distributed Transactions Two Phase Commit (Tanenbaum 8.5)

Assumptions for Today Ignore failures in first half Concurrency is our main concern To deal with failures Assume a form of logging, where every machine writes information down *before* operating on it, to recover from simple failures. Recover after failure. Next lecture: Logging and Crash Recovery

Database transactions Background: Database Researchers Defined: “Transactions” Collections of Reads + Writes to Global State Appear as a single, “indivisible” operation Standard Models for Reliable Storage (visit later) Desirable Characteristics of Transactions Atomicity, Consistency, Isolation, Durability Also referred to as the “ACID” Acronym! Database researchers laid the background for reasoning about how to process a number of concurrent events that update some shared global state. They invented the concept of a "transaction" in which a collection of reads and writes of a global state are bundled such that they appear to be single, indivisible operation. (They also defined standard models for reliable storage and for robust operation. We'll visit those later.)

Transactions: ACID Properties Atomicity: Each transaction completes in its entirely, or is aborted. If aborted, should not have have effect on the shared global state. Example: Update account balance on multiple servers Consistency: Each transaction preserves a set of invariants about global state. (Nature of invariants is system dependent). Example: in a bank system, law of conservation of $$ Atomicity: Each transaction is either completed in its entirety, or aborted. In the latter case, it should have no effect on the global state. Example: Update account balance on multiple servers  So either the transaction happens as an single transaction on all servers, or not at all. Consistency: Each transaction preserves a set of invariants about the global state. The exact nature of these is system dependent. *Warning*: The "C" here makes a great acronym, but the term "consistency" is used differently in distributed systems when referring to operations across multiple processors. Beware this gotcha -- this "C" is an application or database-specific idea of maintaining the self-consistency of the data. Example: For the bank case, the invariant that must be true before and after the transaction is that the total amount of money must remain the same.

Transactions: ACID Properties Isolation: Also means serializability. Each transaction executes as if it were the only one with the ability to RD/WR shared global state. Durability: Once a transaction has been completed, or “committed” there is no going back. In other words there is no “undo”. Transactions can also be nested “Atomic Operations” => Atomicity + Isolation Isolation: Also means that if there are two of more transactions running at the same time, to each of them and to other processes, the final result looks like the transactions ran in some sequential order (the order itself may be system dependent). Durability: Also implies that no failure after the commit can undo the results or cause the results to be lost. Interestingly, transactions be be deeply nested (subtansactions within transactions) but the semantics are clear. When a transactions or subtransaction starts its given a private copy of the all data in the system for it to manipulate. However, if it aborts its private data and changes vanishes. If it commits, then its private universe replaces the parents universe and the changes are made durable.

A Transaction Example: Bank Array Bal[i] stores balance of Account “i” Implement: xfer, withdraw, deposit withdraw(i, v): b = Bal[i] // Read if b >= v // Test Bal[i] = b-v // Write return true else return false xfer(i, j, v): if withdraw(i, v): deposit(j, v) else abort deposit(j, v): Bal[j] += v

A Transaction Example: Bank xfer(i, j, v): if withdraw(i, v): deposit(j, v) else abort withdraw(i, v): b = Bal[i] // Read if b >= v // Test Bal[i] = b-v // Write return true else return false deposit(j, v): Bal[j] += v Question: So, what will happen if they follow ACID properties and happen in some serial order? Question: What if we didn’t take care? Is there a race condition? Answer: Race condition when updating value of Bal[x]. If the transactions interleave between Read and Writes of each action? Bal[x] could be either 30 or 40 but both transactions could happen! ACID Violation: Not Isolated, and not Consistent Imagine: Bal[x] = 100, Bal[y]=Bal[z]=0 Two transactions => T1:xfer(x,y,60), T2: xfer(x,z,70) ACID Properties: T1 or T2 in some serial order T1; T2: T1 succeeds; T2 Fails. Bal[x]=40, Bal[y]=60 T2; T1: T2 succeeds; T1 Fails. Bal[x]=30, Bal[z]=70 What if we didn’t take care? Is there a race condition? Updating Bal[x] with Read/Write interleaving of T1,T2

A Transaction Example: Bank xfer(i, j, v): if withdraw(i, v): deposit(j, v) else abort withdraw(i, v): b = Bal[i] // Read if b >= v // Test Bal[i] = b-v // Write return true else return false sumbalance(i, j, k): return Bal[i]+Bal[j]+ Bal[k] deposit(j, v): Bal[j] += v Question: So, what will happen if they follow ACID properties and happen in some serial order? Question: What if we didn’t take care? Is there a race condition? Answer: Race condition when updating value of Bal[x]. If the transactions interleave between Read and Writes of each action? Bal[x] could be either 30 or 40 but both transactions could happen! ACID Violation: Not Isolated, and not Consistent (create more money than we had!) Imagine: Bal[x] = 100, Bal[y]=Bal[z]=0 Two transactions => T1:xfer(x,y,60), T2: xfer(x,z,70) ACID violation: Not Isolated, Not Consistent Updating Bal[x] with Read/Write interleaving of T1,T2 Bal[x] = 30 or 40; Bal[y] = 60; Bal [z] = 70 For Consistency, implemented sumbalance() State invariant sumbalance=100 violated! We created $$

Implement transactions with locks Use locks to wrap xfer xfer(i, j, v): lock() if withdraw(i, v): deposit(j, v) else abort unlock() xfer(i, j, v): lock(i) if withdraw(i, v): unlock(i) lock(j) deposit(j, v) unlock(j) else unlock(i) abort Is this the right approach? No, sequential bottleneck since global lock across all acounts! No, consistency violation. sumbalance() after unlock(i) WHY? Releasing lock i early can give consistency violation. Some other transaction (e.g., sumbalance) could see decremented value of account i, but unincremented value of count j. However, is this the correct approach? (Hint: efficiency) Is this fixed then? Sequential bottleneck due to global lock. Solution? No, consistency violation. sumbalance() after unlock(i)

Implement transactions with locks Fix: Release locks when update of all state variables complete. xfer(i, j, v): lock(i) if withdraw(i, v): lock(j) deposit(j, v) unlock(i); unlock(j) else unlock(i) abort Are we done then? Nope, deadlock. Bal[x]=Bal[y]=100 xfer(x,y,40) and xfer (y, x, 30)

Implement transactions with locks Insight: Need unique global order for acquiring locks. xfer(i, j, v): lock(min(i,j); lock(max (i,j)) if withdraw(i, v): deposit(j, v) unlock(i); unlock(j) else unlock(i); unlock(j) abort This works. :) Motivation for 2-Phase Locking

Acquiring Locks in a Unique Order Consider “Wait-for” graph for state of locks Vertices represent transactions Edge from vertex i to vertex j if transaction i is waiting for lock held by transaction j. What does a cycle mean? Can a cycle occur if we acquire locks in unique order? No. Label edges with its lock ID. For any cycle, there must be some pair of edges (i, j), (j, k) labeled with values x & y. As k holds y, but waits for x: y<x. Transaction j is holding lock x and it wants lock y, so y > x. Implies that j is not acquiring its lock in proper order. General scheme: 2-phase locking More precisely: strong strict two phase locking Recall, we wanted the locks with the lower ID to be acquired first.

2-Phase Locking Variant General 2-phase locking Phase 1: Acquire or Escalate Locks (e.g. read => write) Phase 2: Release or de-escalate lock Strict 2-phase locking Phase 1: (same as before) Phase 2: Release WRITE lock at end of transaction only Strong Strict 2-phase locking Phase 2: Release ALL locks at end of transaction only. Most common version, required for ACID properties General 2-phase locking Phase 1. Acquire or escalate locks (e.g., read lock to write lock) Phase 2. Release or deescalate locks Strict 2-phase locking During Phase 2. Release WRITE locks only at end of transactions Strong strict 2-phase locking During Phase 2. Release ALL locks only at end of transactions. This is the most common version. Required to provide ACID properties.

2-Phase Locking Why not always use strong-strict 2-phase locking? A transaction may not know the locks it needs in advance if Bal(yuvraj) < 100: x = find_richest_prof() transfer_from(x, yuvraj) Other ways to handle deadlocks Lock manager builds a “waits-for” graph. On finding a cycle, choose offending transaction and force abort Use timeouts: Transactions should be short. If hit time limit, find transaction waiting for a lock and force abort.

Transactions – split into 2 phases Phase 1: Preparation: Determine what has to be done, how it will change state, without actually altering it. Generate Lock set “L” Generate List of Updates “U” Phase 2: Commit or Abort Everything OK, then update global state Transaction cannot be completed, leave global state as is In either case, RELEASE ALL LOCKS Preparation. Figure out what to do and how it will change state, without altering state. Generate L: Set of locks, and U: List of updates 2. Commit or abort. If everything OK, then update global state. If transaction cannot be completed, leave global state unchanged. In either case, release all locks

Example Question: So, what would “commit” and ”abort” look like? xfer(i, j, v): L={i,j} // Locks U=[] //List of Updates begin(L) //Begin transaction, Acquire locks bi = Bal[i] bj = Bal[j] if bi >= v: Append(U,Bal[i] <- bi – v) Append(U, Bal[j] <- bj + v) commit(U,L) else abort(L) Question: what would the two functions commit() and abort() look like? Question: So, what would “commit” and ”abort” look like? commit(U,L): Perform all updates in U Release all locks in L abort(L): Release all locks in L

Today's Lecture Outline Consistency for multiple-objects, multiple-servers Part I: Single Server Case (not covered well in book) Two Phase Locking Part II: Distributed Transactions Two Phase Commit (Tanenbaum 8.6)

Distributed Transactions? Partition databases across multiple machines for scalability (E.g., machine 1 responsible for account i, machine 2 responsible for account j) Transaction often touch more than one partition How do we guarantee that all of the partitions commit the transactions or none commit the transactions? Transferring money from i to j. Requirement: both banks/machines do it, or neither

Enabling Distributed Transactions Similar idea as before, but: State spread across servers (maybe even WAN) Failures Overall Idea: Client initiates transaction. Makes use of “coordinator” All other relevant servers operate as “participants” Coordinator assigns unique transaction ID (TID) Strawman solution 2-phase commit protocol Note, co-ordinator could even be the client itself.

Strawman solution Even without failures, a lot can go wrong Account j on Srv 2 has only $90 Account j doesn’t exist! Violates which part of ACID

2-Phase Commit Phase 1: Prepare & Vote Phase 2: Commit Participants figure out all state changes Each determines if it can complete the transaction Communicate with coordinator Phase 2: Commit Coordinator broadcasts to participants: COMMIT / ABORT If COMMIT, participants make respective state changes

Implementing 2-Phase Commit Implemented as a set of messages Note: This requires that nodes be able to abort or carry forward their changes. This is accomplished with _logging_. (next lecture)

Implementing 2-Phase Commit Implemented as a set of messages Messages in first phase A: Coordinator sends “CanCommit?” to participants Note: This requires that nodes be able to abort or carry forward their changes. This is accomplished with _logging_. (next lecture)

Implementing 2-Phase Commit Implemented as a set of messages Messages in first phase A: Coordinator sends “CanCommit?” to participants B: Participants respond: “VoteCommit” or “VoteAbort” Note: This requires that nodes be able to abort or carry forward their changes. This is accomplished with _logging_. (next lecture)

Implementing 2-Phase Commit Implemented as a set of messages Messages in first phase A: Coordinator sends “CanCommit?” to participants B: Participants respond: “VoteCommit” or “VoteAbort” Note: This requires that nodes be able to abort or carry forward their changes. This is accomplished with _logging_. (next lecture) Messages in the second phase A: All “VoteCommit”: , Coord sends “DoCommit” If any “VoteAbort”: abort transaction. Coordinator sends “DoAbort” to everyone => release locks

Implementing 2-Phase Commit Implemented as a set of messages Messages in first phase A: Coordinator sends “CanCommit?” to participants B: Participants respond: “VoteCommit” or “VoteAbort” Note: This requires that nodes be able to abort or carry forward their changes. This is accomplished with _logging_. (next lecture) Messages in the second phase A: All “VotedCommit”: , Coord sends “DoCommit” If any “VoteAbort”: abort transaction. Coordinator sends “DoAbort” to everyone => release locks

Example for 2PC Bank Account “i” at Server 1, “j” at Server 2. L={i} Begin(L) // Acq. Locks U=[] //List of Updates b=Bal[i] if b >= v: Append(U, Bal[i] <- b – v) vote commit else vote abort L={j} Begin(L) // Acq. Locks U=[] //List of Updates b=Bal[j] Append(U, Bal[j] <- b + v) vote commit Server 2 implements transaction Server 1 implements transaction Server 2 can assume that the account of “i” has enough money, otherwise whole transaction will abort. What about locking? Locks held by individual participants - Acquire lock at start of prep process, release at Commit/Abort

Properties of 2-Phase Commit Correctness Neither can commit unless both agreed to commit Performance 3N messages per transaction How to handle failure? Timeouts  performance bad in case of failure!

Deadlocks and Livelocks Distributed deadlock Cyclic dependency of locks by transactions across servers In 2PC this can happen if participants unable to respond to voting request (e.g. still waiting on a lock on its local resource) Handled with a timeout. Participants times out, then votes to abort. Retry transaction again. Addresses the deadlock concern However, danger of LIVELOCK – keep trying!

Timeout and Failure Cases 1 Coordinator times out after “CanCommit?” Hasn’t sent any commit messages, safely abort Conservative. Why? Preserve correctness, sacrifice performance Participant times out after “VoteAbort” Can safely abort unilaterally. Why?

Timeout and Failure Cases 2 Participant times out after “VoteCommit” Are unilateral decisions possible? Commit, Abort? Participant could wait forever Solution: ask another participant (gossip protocol) Learn coordinator’s decision: do the same Assumption: non-Byzantine failure model Other participant hasn’t voted: abort is safe. Why? Coordinator has not made decision No reply or other participant also “VoteCommit”: wait 2PC is “blocking protocol”  3PC in book.

2 Phase Commit in Practice 2PC widely used in practice Logging and Crash Recovery Crucial to handle crashes / reboots (next lecture) Very powerful and resilient when paired with RAID (3 lectures from now) NDB Cluster

Summary Distributed consistency management ACID Properties desirable Single Server case: use locks + 2-phase locking (strict 2PL, strong strict 2PL), transactional support for locks Multiple server distributed case: use 2-phase commit for distributed transactions. Need a coordinator to manage messages from participants 2PC can become a performance bottleneck

Additional Material Overview: 2PC notation from the Book Terminology used by messages different, but essentially the protocol is the same Pointers to 3PC (fully described in the book)

Two-Phase Commit (1) (a) The finite state machine for the coordinator in 2PC. (b) The finite state machine for a participant. Coordinator/Participant can be blocked in 3 states: Participant: Waiting in INIT state for VOTE_REQUEST Coordinator: Blocked in WAIT state, listening for votes Participant: blocked in READY state, waiting for global vote

Two-Phase Commit (2) What if a “READY” participant does not receive the global commit? Can’t just abort => figure out what message a co-ordinator may have sent. Approach: ask other partcipants Take actions on response on any of the participants E.g. P is in READY state, asks other “Q” participants If everyone is in ”READY” state then protocol blocks for co-ordinator (e.g. if the co-ordinator is unresponsive or has crashed!). What happens if everyone is in ”READY” state?

Two-Phase Commit (3) For recovery, must save state to persistent storage (e.g. log), to restart/recover after failure. Participant (INIT): Safe to local abort, inform Coordinator Participant (READY): Contact others Coordinator (WAIT): Retransmit VOTE_REQ Coordinator (WAIT/Decision): Retransmit VOTE_COMMIT Coordinator has to be careful about tracking two states: If it recorded that it entered wait state and not all responses came back, it can just resent the VOTE_REQ If it reconded that it had come to a decision in the 2nd phase already, then that decision must be recorded and just be retransmitted.

2PC: Actions by Coordinator Why do we have the ”write to LOG” statements? Ans: Needed for recovery as mentioned earlier Why do we have the ”write to LOG” statements?

2PC: Actions by Participant Wait for REQUESTS If Decision to COMMIT LOG=>Send=> Wait No response? Ask others If global decision, COMMIT OR ABORT Else, Local Decision LOG => send

2PC: Handling Decision Request What if everyone has received VOTE_REQ, and Co-ordinator crashes? DEADLOCK: Use 3PC Note, participant can only help others if it has reached a global decision and committed it to its log. What if everyone has received VOTE_REQ, and Co-ordinator crashes?