CS162 Section Lecture 10. Slides based on Lecture and www.news.cs.nyu.edu/~jinyang/fa08/


Problems with Interleaved Execution
T1: R(A),             R(A), W(A)
T2:       R(A), W(A)
What type of conflict is this?

Problems with Interleaved Execution
What type of conflict is this?
T1: R(A), W(A),          W(A)
T2:             R(A), …

Problems with Interleaved Execution
What type of conflict is this?
T1: W(A),             W(B)
T2:       W(A), W(B)

What is conflict equivalence?

Conflict Serializability
T1:R(A),W(A), R(B),W(B)
T2: R(A),W(A), R(B),W(B)
T1:R(A),W(A), R(B), W(B)
T2: R(A), W(A), R(B),W(B)
T1:R(A),W(A),R(B), W(B)
T2: R(A),W(A), R(B),W(B)

Dependency graph
T1:R(A),W(A), R(B),W(B)
T2: R(A),W(A), R(B),W(B)
[Figure: dependency graph over nodes T1 and T2, with edges labeled A and B]

Dependency graph
T1:R(A),W(A), R(B),W(B)
T2: R(A),W(A),R(B),W(B)
[Figure: dependency graph over nodes T1 and T2, with edges labeled A and B]
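The dependency-graph test for conflict serializability can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture: the tuple encoding of a schedule, the function names, and the two example interleavings are assumptions made here.

```python
from collections import defaultdict

def precedence_graph(schedule):
    """Build the dependency graph of a schedule.

    `schedule` is a list of (transaction, op, item) tuples in execution
    order, e.g. ("T1", "R", "A"). Two operations conflict when they come
    from different transactions, touch the same item, and at least one
    of them is a write.
    """
    edges = defaultdict(set)
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges[ti].add(tj)  # ti's operation comes first, so ti -> tj
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: we returned to a node on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(edges))

def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its dependency graph is acyclic."""
    return not has_cycle(precedence_graph(schedule))

# Assumed interleaving 1: T1 finishes each item before T2 touches it.
acyclic = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
           ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
# Assumed interleaving 2: T1 goes first on A, but T2 goes first on B.
cyclic = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
          ("T2", "R", "B"), ("T2", "W", "B"), ("T1", "R", "B"), ("T1", "W", "B")]

print(is_conflict_serializable(acyclic))
print(is_conflict_serializable(cyclic))
```

In the second interleaving the edge on A points from T1 to T2 while the edge on B points from T2 to T1, so the graph has a cycle and no serial order is conflict equivalent to it.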

2 Phase Locking
1) Each transaction must obtain:
– S (shared) or X (exclusive) lock on data before reading,
– X (exclusive) lock on data before writing
2) A transaction cannot request additional locks once it releases any locks
Thus, each transaction has a “growing phase” followed by a “shrinking phase”
[Figure: Growing Phase, Shrinking Phase]
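Rule 2 is easy to check mechanically. The sketch below assumes one transaction's lock operations are given as a list of ("lock" or "unlock", item) pairs; that encoding and the function name are made up for illustration.

```python
def obeys_two_phase_locking(lock_ops):
    """Return True if one transaction's lock operations follow 2PL.

    `lock_ops` lists ("lock", item) and ("unlock", item) pairs in the
    order the transaction issues them.
    """
    shrinking = False
    for action, _item in lock_ops:
        if action == "unlock":
            shrinking = True   # first release ends the growing phase
        elif action == "lock" and shrinking:
            return False       # acquiring after a release violates rule 2
    return True

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(obeys_two_phase_locking(two_phase))
print(obeys_two_phase_locking(not_two_phase))
```

The check only looks at the order of acquires and releases; it does not model lock modes (S vs. X) or conflicts between transactions.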

2PL Example
Assume each instruction (R, W, etc.) takes one time unit and lock ops take zero time units. What is the schedule with the minimum possible execution time?
Transaction 1: R(A); A = A + 100; W(A); R(B); B = B – 100; W(B);
Transaction 2: R(A); A = A – 50; W(A);

What if you used strict 2 PL?

Why 2PC?
– Most real-world transactions are distributed transactions
– A distributed transaction involves altering data on multiple databases
– The databases must coordinate the committing or rolling back of the changes in a transaction

Failures in a distributed system
Consistency requires agreement among multiple servers
– Is transaction X committed?
– Have all servers applied update X to a replica?
Achieving agreement w/ failures is hard
– Impossible to distinguish host vs. network failures

Two Phase commit (2PC)
It is a standard protocol for making commit and abort atomic
– Coordinator: the component that coordinates commitment at home(T)
– Participant: a resource manager accessed by T
A participant P is ready to commit T if all of T’s after-images at P are in stable storage

2PC contd…
TM: Transaction Manager
RM: Resource Manager
[Figure: client, TM, and RMs, labeled director and actors, exchanging Ready?, Ready, and Commit messages]

Do you remember the State Machine of Coordinator from the lecture?
States: INIT, WAIT, ABORT, COMMIT
– INIT → WAIT: Recv: START, Send: VOTE-REQ
– WAIT → ABORT: Recv: VOTE-ABORT, Send: GLOBAL-ABORT
– WAIT → COMMIT: Recv: VOTE-COMMIT, Send: GLOBAL-COMMIT
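As a rough sketch, the coordinator's states and messages can be written as a small Python class. The class and method names are assumptions made here, and so is the choice to wait for VOTE-COMMIT from every worker before sending GLOBAL-COMMIT. Timeouts, logging, and the network are left out.

```python
class Coordinator:
    """Toy coordinator with states INIT, WAIT, ABORT, COMMIT."""

    def __init__(self, workers):
        self.workers = set(workers)
        self.yes_votes = set()
        self.state = "INIT"

    def on_start(self):
        """Recv: START, Send: VOTE-REQ (INIT -> WAIT)."""
        if self.state != "INIT":
            raise RuntimeError("START is only valid in INIT")
        self.state = "WAIT"
        return "VOTE-REQ"

    def on_vote(self, worker, vote):
        """Handle one worker's vote; return the message to broadcast, if any."""
        if self.state != "WAIT":
            return None  # decision already made, ignore late votes
        if vote == "VOTE-ABORT":
            self.state = "ABORT"
            return "GLOBAL-ABORT"
        if vote == "VOTE-COMMIT" and worker in self.workers:
            self.yes_votes.add(worker)
            if self.yes_votes == self.workers:
                self.state = "COMMIT"
                return "GLOBAL-COMMIT"
        return None  # still waiting for more votes

tc = Coordinator(["A", "B"])
print(tc.on_start())
print(tc.on_vote("A", "VOTE-COMMIT"))
print(tc.on_vote("B", "VOTE-COMMIT"))
print(tc.state)
```

A single VOTE-ABORT is enough to move from WAIT to ABORT, while COMMIT needs a VOTE-COMMIT from every worker.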

State Machine of workers
States: INIT, READY, ABORT, COMMIT
– INIT → ABORT: Recv: VOTE-REQ, Send: VOTE-ABORT
– INIT → READY: Recv: VOTE-REQ, Send: VOTE-COMMIT
– READY → ABORT: Recv: GLOBAL-ABORT
– READY → COMMIT: Recv: GLOBAL-COMMIT
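The worker side can be sketched the same way. The `Worker` class and its `can_commit` flag are hypothetical; in a real system the vote would depend on whether the worker's changes are safely in stable storage.

```python
class Worker:
    """Toy worker with states INIT, READY, ABORT, COMMIT."""

    def __init__(self, can_commit):
        self.can_commit = can_commit  # assumed stand-in for "able to commit locally"
        self.state = "INIT"

    def on_message(self, msg):
        """Apply one incoming message; return the reply to send, if any."""
        if self.state == "INIT" and msg == "VOTE-REQ":
            if self.can_commit:
                self.state = "READY"
                return "VOTE-COMMIT"
            self.state = "ABORT"
            return "VOTE-ABORT"
        if self.state == "READY" and msg == "GLOBAL-COMMIT":
            self.state = "COMMIT"
        elif self.state == "READY" and msg == "GLOBAL-ABORT":
            self.state = "ABORT"
        return None  # no reply in this sketch (ACKs are omitted)

w = Worker(can_commit=True)
print(w.on_message("VOTE-REQ"))
w.on_message("GLOBAL-COMMIT")
print(w.state)
```

Once a worker is in READY it can no longer decide on its own; it only leaves that state when the coordinator's GLOBAL-COMMIT or GLOBAL-ABORT arrives.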

Example
Transfer $1000 From A:$3000 To B:$2000
Clients want all-or-nothing transactions
– Transfer either happens or not at all
[Figure: client, Bank A, Bank B]

Strawman solution
Transfer $1000 From A:$3000 To B:$2000
[Figure: client, transaction coordinator, Bank A, Bank B]

Strawman solution
What can go wrong?
– A does not have enough money
– B’s account no longer exists
– B has crashed
– Coordinator crashes
[Figure: message timeline among client, transaction coordinator, bank A, bank B: start, A=A-1000, B=B+1000, done]

Reasoning about correctness
TC, A, B each have a notion of committing
Correctness:
– If one commits, no one aborts
– If one aborts, no one commits
Performance:
– If there are no failures and A and B can commit, then commit
– If failures happen, find out outcome soon

Correctness first
If rA == yes && rB == yes
  outcome = “commit”
else
  outcome = “abort”
B commits upon receiving “commit”
[Figure: message timeline among client, transaction coordinator, bank A, bank B: start, prepare, rA, rB, outcome, result]
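The decision rule can be tried out on the bank example with a single-process toy. Everything here is an assumption for illustration: the `transfer` helper, the dict of balances, and the way each bank forms its vote (A says "yes" if it has enough money, B says "yes" if the account exists). No messages, crashes, or logs are modeled.

```python
def decide_outcome(votes):
    """Commit only if every participant voted "yes"; otherwise abort."""
    if votes and all(v == "yes" for v in votes.values()):
        return "commit"
    return "abort"

def transfer(balances, src, dst, amount):
    """Prepare phase: collect votes. Commit phase: apply only on "commit"."""
    votes = {
        src: "yes" if balances.get(src, 0) >= amount else "no",
        dst: "yes" if dst in balances else "no",
    }
    outcome = decide_outcome(votes)
    if outcome == "commit":
        balances[src] -= amount
        balances[dst] += amount
    return outcome

balances = {"A": 3000, "B": 2000}
print(transfer(balances, "A", "B", 1000), balances)
print(transfer(balances, "A", "B", 5000), balances)
```

Because nothing is applied until the outcome is known, a "no" from either bank leaves both balances untouched, which is the all-or-nothing behavior the client wants.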

Performance Overhead
TM Log            Message     RM Log
                  PREPARE
                              Prepare / abort
                  VOTE Y/N
Commit / abort
                  C/A
                              Commit / abort
                  ACK
End*

Performance Issues
What about timeouts?
– TC times out waiting for A’s response
– A times out waiting for TC’s outcome message
What about reboots?
– How does a participant clean up?

Handling timeout on A/B
TC times out waiting for A (or B)’s “yes/no” response
– Can TC unilaterally decide to commit?
– Can TC unilaterally decide to abort?

Handling timeout on TC
If B responded with “no” …
– Can it unilaterally abort?
If B responded with “yes” …
– Can it unilaterally abort?
– Can it unilaterally commit?

What would happen if a single GLOBAL-COMMIT message was lost?

When can the TM respond with a SUCCESS?
[Figure: message timeline among client, transaction coordinator, bank A, bank B: start, prepare, rA, rB, outcome, result]

Possible termination protocol
Execute termination protocol if B times out on TC and has voted “yes”
B sends “status” message to A
– If A has received “commit”/“abort” from TC …
– If A has not responded to TC, …
– If A has responded with “no”, …
– If A has responded with “yes”, …
Resolves most failure cases except sometimes when TC fails

Handling crash and reboot
Nodes cannot back out if commit is decided
TC crashes just after deciding “commit”
– Cannot forget about its decision after reboot
A/B crashes after sending “yes”
– Cannot forget about their response after reboot

Handling crash and reboot
All nodes must log protocol progress
– What and when does TC log to disk?
– What and when does A/B log to disk?

Recovery upon reboot
– If TC finds no “commit” on disk, abort
– If TC finds “commit”, commit
– If A/B finds no “yes” on disk, abort
– If A/B finds “yes”, run termination protocol to decide
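The four recovery rules translate directly into code. In this sketch the on-disk log is assumed to be a plain list of strings such as "commit" or "yes"; the function names and the "run-termination-protocol" return value are placeholders invented here.

```python
def recover_coordinator(log):
    """TC after reboot: commit only if a "commit" record reached disk."""
    return "commit" if "commit" in log else "abort"

def recover_participant(log):
    """A/B after reboot: with no "yes" on disk it is safe to abort.

    If "yes" was logged, the participant may not decide alone and has to
    run the termination protocol to learn the outcome.
    """
    return "run-termination-protocol" if "yes" in log else "abort"

print(recover_coordinator([]))
print(recover_coordinator(["commit"]))
print(recover_participant([]))
print(recover_participant(["yes"]))
```

These rules only work if the records are forced to disk before the matching message is sent: "commit" before TC announces the outcome, and "yes" before A/B replies to the prepare request.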

Summary: two-phase commit
1. All nodes that decide reach the same decision.
2. No commit unless everyone says "yes".
3. No failures and all "yes", then commit.
4. If failures, then repair, wait long enough for recovery, then some decision.