Two Phase Commit CSE593 Transaction Processing Philip A. Bernstein

Slides:



Advertisements
Similar presentations
Two phase commit. Failures in a distributed system Consistency requires agreement among multiple servers –Is transaction X committed? –Have all servers.
Advertisements

6.852: Distributed Algorithms Spring, 2008 Class 7.
CS542: Topics in Distributed Systems Distributed Transactions and Two Phase Commit Protocol.
4/26/ Two Phase Commit CSEP 545 Transaction Processing for E-Commerce Philip A. Bernstein Copyright ©2003 Philip A. Bernstein.
(c) Oded Shmueli Distributed Recovery, Lecture 7 (BHG, Chap.7)
CS 603 Handling Failure in Commit February 20, 2002.
COS 461 Fall 1997 Transaction Processing u normal systems lose their state when they crash u many applications need better behavior u today’s topic: how.
1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 12: Three-Phase Commits (3PC) Professor Chen Li.
Consensus Algorithms Willem Visser RW334. Why do we need consensus? Distributed Databases – Need to know others committed/aborted a transaction to avoid.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Computer Science Lecture 18, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
ICS 421 Spring 2010 Distributed Transactions Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/16/20101Lipyeow.
Distributed Transaction Processing Some of the slides have been borrowed from courses taught at Stanford, Berkeley, Washington, and earlier version of.
CS 582 / CMPE 481 Distributed Systems
1 ICS 214B: Transaction Processing and Distributed Data Management Replication Techniques.
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
Chapter 18: Distributed Coordination (Chapter 18.1 – 18.5)
CS 603 Three-Phase Commit February 22, Centralized vs. Decentralized Protocols What if we don’t want a coordinator? Decentralized: –Each site broadcasts.
©Silberschatz, Korth and Sudarshan19.1Database System Concepts Distributed Transactions Transaction may access data at several sites. Each site has a local.
1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems.
Distributed Commit. Example Consider a chain of stores and suppose a manager – wants to query all the stores, – find the inventory of toothbrushes at.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Distributed Deadlocks and Transaction Recovery.
Distributed Commit Dr. Yingwu Zhu. Failures in a distributed system Consistency requires agreement among multiple servers – Is transaction X committed?
CS162 Section Lecture 10 Slides based from Lecture and
Distributed Transactions March 15, Transactions What is a Distributed Transaction?  A transaction that involves more than one server  Network.
DISTRIBUTED SYSTEMS II AGREEMENT (2-3 PHASE COM.) Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Chapter 19 Recovery and Fault Tolerance Copyright © 2008.
Transaction Communications Yi Sun. Outline Transaction ACID Property Distributed transaction Two phase commit protocol Nested transaction.
Distributed Transactions Chapter 13
Distributed Transaction Recovery
CSE 486/586 CSE 486/586 Distributed Systems Concurrency Control Steve Ko Computer Sciences and Engineering University at Buffalo.
PAVANI REDDY KATHURI TRANSACTION COMMUNICATION. OUTLINE 0 P ART I : I NTRODUCTION 0 P ART II : C URRENT R ESEARCH 0 P ART III : F UTURE P OTENTIAL 0 R.
Operating Systems Distributed Coordination. Topics –Event Ordering –Mutual Exclusion –Atomicity –Concurrency Control Topics –Event Ordering –Mutual Exclusion.
Distributed Transaction Management, Fall 2002Lecture Distributed Commit Protocols Jyrki Nummenmaa
Fault Tolerance CSCI 4780/6780. Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!!
University of Tampere, CS Department Distributed Commit.
Distributed Transactions Chapter – Vidya Satyanarayanan.
Two-Phase Commit Brad Karp UCL Computer Science CS GZ03 / M th October, 2008.
Lampson and Lomet’s Paper: A New Presumed Commit Optimization for Two Phase Commit Doug Cha COEN 317 – SCU Spring 05.
SHUJAZ IBRAHIM CHAYLASY GNOPHANXAY FIT, KMUTNB JANUARY 05, 2010 Distributed Database Systems | Dr.Nawaporn Wisitpongphan | KMUTNB Based on article by :
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
Topics in Distributed Databases Database System Implementation CSE 507 Some slides adapted from Navathe et. Al and Silberchatz et. Al.
Distributed Databases – Advanced Concepts Chapter 25 in Textbook.
Recovery in Distributed Systems:
Chapter 19: Distributed Databases
Outline Introduction Background Distributed DBMS Architecture
Database System Implementation CSE 507
Two phase commit.
Commit Protocols CS60002: Distributed Systems
RELIABILITY.
Outline Introduction Background Distributed DBMS Architecture
Outline Announcements Fault Tolerance.
Distributed Commit Phases
Distributed Systems CS
CSE 486/586 Distributed Systems Concurrency Control --- 3
Assignment 5 - Solution Problem 1
Distributed Transactions
DISTRIBUTED DATABASES
Brahim Ayari, Abdelmajid Khelil, Neeraj Suri and Eugen Bleim
Exercises for Chapter 14: Distributed Transactions
Distributed Databases Recovery
Chapter 19: Distributed Transaction Recovery
UNIVERSITAS GUNADARMA
Transactions in Distributed Systems
CIS 720 Concurrency Control.
CSE 486/586 Distributed Systems Concurrency Control --- 3
Last Class: Fault Tolerance
Transactions, Properties of Transactions
Presentation transcript:

Two Phase Commit CSE593 Transaction Processing Philip A. Bernstein Copyright ©2001 Philip A. Bernstein 2/14/01

Outline 1. Introduction 2. The Two-Phase Commit (2PC) Protocol 3. 2PC Failure Handling 4. 2PC Optimizations 5. Process Structuring 6. Three Phase Commit 2/14/01

1. Introduction Goal - ensure the atomicity of a transaction that accesses multiple resource managers (Recall, resource abstracts data, messages, and other items that are shared by transactions.) Why is this hard? What if resource manager RMi fails after a transaction commits at RMk? What if other resource managers are down when RMi recovers? What if a transaction thinks a resource manager failed and therefore aborted, when it actually is still running? 2/14/01

Assumptions Each resource manager independently commits or aborts a transaction atomically on its resources. Home(T) decides when to start committing T Home(T) doesn’t start committing T until T terminates at all nodes (hard) Resource managers fail by stopping no Byzantine failures 2/14/01

Problem Statement Transaction T accessed data at resource managers R1, … Rn. The goal is to either commit T at all of R1, … Rn, or abort T at all of R1, … Rn even if resource managers, nodes and communications links fail during the commit or abort activity That is, never commit at Ri but abort at Rk. 2/14/01

2. Two-Phase Commit Two phase commit (2PC) is the standard protocol for making commit and abort atomic Coordinator - the component that coordinates commitment at home(T) Participant - a resource manager accessed by T A participant P is ready to commit T if all of T’s after-images at P are in stable storage The coordinator must not commit T until all participants are ready If P isn’t ready, T commits, and P fails, then P can’t commit when it recovers. 2/14/01

The Protocol (Begin Phase 1) The coordinator sends a Request-to-Prepare message to each participant The coordinator waits for all participants to vote Each participant votes Prepared if it’s ready to commit may vote No for any reason may delay voting indefinitely (Begin Phase 2) If coordinator receives Prepared from all participants, it decides to commit. (The transaction is now committed.) Otherwise, it decides to abort. 2/14/01

The Protocol (cont’d) The coordinator sends its decision to all participants (i.e. Commit or Abort) Participants acknowledge receipt of Commit or Abort by replying Done. 2/14/01

Case 1: Commit Coordinator Participant Request-to-Prepare Prepared Done 2/14/01

Case 2: Abort Coordinator Participant Request-to-Prepare No Abort Done 2/14/01

Performance In the absence of failures, 2PC requires 3 rounds of messages before the decision is made Request-to-prepare Votes Decision Done messages are just for bookkeeping they don’t affect response time they can be batched 2/14/01

Uncertainty Before it votes, a participant can abort unilaterally After a participant votes Prepared and before it receives the coordinator’s decision, it is uncertain. It can’t unilaterally commit or abort during its uncertainty period. Coordinator Participant Request-to-Prepare Prepared Uncertainty Period Commit Done 2/14/01

Uncertainty (cont’d) The coordinator is never uncertain If a participant fails or is disconnected from the coordinator while it’s uncertain, at recovery it must find out the decision 2/14/01

The Bad News Theorems Uncertainty periods are unavoidable Blocking - a participant must await a repair before continuing. Blocking is bad. Theorem 1 - For every possible commit protocol (not just 2PC), a communications failure can cause a participant to become blocked. Independent recovery - a recovered participant can decide to commit or abort without communicating with other nodes Theorem 2 - No commit protocol can guarantee independent recovery of failed participants 2/14/01

3. 2PC Failure Handling Failure handling - what to do if the coordinator or a participant times out waiting for a message. Remember, all failures are detected by timeout A participant times out waiting for coordinator’s Request-to-prepare. It decides to abort. The coordinator times out waiting for a participant’s vote It decides to abort 2/14/01

2PC Failure Handling (cont’d) A participant that voted Prepared times out waiting for the coordinator’s decision It’s blocked. Use a termination protocol to decide what to do. Naïve termination protocol - wait till the coordinator recovers The coordinator times out waiting for Done it must resolicit them, so it can forget the decision 2/14/01

Forgetting Transactions After a participant receives the decision, it may forget the transaction After the coordinator receives Done from all participants, it may forget the transaction A participant must not reply Done until its commit or abort log record is stable Else, if it fails, then recovers, then asks the coordinator for a decision, the coordinator may not know 2/14/01

Logging 2PC State Changes Logging may be eager meaning it’s flushed to disk before the next Send Message Or it may be lazy = not eager Coordinator Log Start2PC (eager) Participant Request-to-Prepare Log prepared (eager) Prepared Log commit (eager) Commit Done Log commit (eager) Log commit (lazy) 2/14/01

Coordinator Recovery If the coordinator fails and later recovers, it must know the decision. It must therefore log the fact that it began T’s 2PC protocol, including the list of participants, and Commit or Abort, before sending Commit or Abort to any participant (so it knows whether to commit or abort after it recovers). If the coordinator fails and recovers, it resends the decision to participants from whom it doesn’t remember getting Done If the participant forgot the transaction, it replies Done The coordinator should therefore log Done after it has received them all. 2/14/01

Participant Recovery If a participant P fails and later recovers, it first performs centralized recovery (Restart) For each distributed transaction T that was active at the time of failure If P is not uncertain about T, then it unilaterally aborts T If P is uncertain, it runs the termination protocol (which may leave P blocked) To ensure it can tell whether it’s uncertain, P must log its vote before sending it to the coordinator To avoid becoming totally blocked due to one blocked transaction, P should reacquire T’s locks during Restart and allow Restart to finish before T is resolved. 2/14/01

Heuristic Commit Suppose a participant recovers, but the termination protocol leaves T blocked. Operator can guess whether to commit or abort Must detect wrong guesses when coordinator recovers Must run compensations for wrong guesses Heuristic commit If T is blocked, the local resource manager (actually, transaction manager) guesses At coordinator recovery, the transaction managers jointly detect wrong guesses. 2/14/01

4. 2PC Optimizations Read-only transaction Presumed Abort Transfer of coordination Re-infection Cooperative termination protocol 2/14/01

Read-only Transaction A read-only participant need only respond to phase one. It doesn’t care what the decision is. It responds Prepared-Read-Only to Request-to-Prepare, to tell the coordinator not to send the decision Limitation - All other participants must be fully terminated, since the read-only participant will release locks after voting. No more testing of SQL integrity constraints No more evaluation of SQL triggers 2/14/01

Presumed Abort After a coordinator decides Abort and sends Abort to participants, it forgets about T immediately. Participants don’t acknowledge Abort (with Done) Coordinator Participant Log Start2PC Request-to-Prepare Log prepared Prepared Log abort (forget T) Commit Log abort (forget T) If a participant times out waiting for the decision, it asks the coordinator to retry. If the coordinator has no info for T, it replies Abort. 2/14/01

Transfer of Coordination If there is one participant, you can save a round of messages 1. Coordinator asks participant to prepare and become the coordinator. 2. The participant (now coordinator) prepares, commits, and tells the former coordinator to commit. 3. The coordinator commits and replies Done. Coordinator Participant Log prepared Request-to-Prepare-and -transfer-coordination Log committed Log commit Commit Done Supported by some app servers, but not in any standards. 2/14/01

Reinfection Suppose A is coordinator and B and C are participants A asks B and C to prepare B votes prepared Now calls B to do some work. (B is reinfected.) B does the work and tells C it had prepared, but now it expects C to be its coordinator. When A asks C to prepare, C propagates the request to B and votes prepared only if both B and C are prepared. (See Tree of Processes discussion later.) Can be used to implement integrity constraint checking, triggers, and other commit-time processing, without requiring an extra phase (between phases 1 and 2 of 2PC). 2/14/01

Cooperative Termination Protocol (CTP) Assume coordinator includes a list of participants in Request-to-Prepare. If a participant times-out waiting for the decision, it runs the following protocol. 1. Participant P sends Decision-Req to other participants 2. If participant Q voted No or hasn’t voted or received Abort from the coordinator, it responds Abort 3. If participant Q received Commit from the coordinator, it responds Commit. 4. If participant Q is uncertain, it responds Uncertain (or doesn’t respond at all). If all participants are uncertain, then P remains blocked. 2/14/01

Cooperative Termination Issues Participants don’t know when to forget T, since other participants may require CTP Solution 1 - After receiving Done from all participants, coordinator sends End to all participants Solution 2 - After receiving a decision, a participant may forget T any time. To ensure it can run CTP, a participant should include the list of participants in the vote log record. 2/14/01

5. Process Structuring To support multiple RMs on multiple nodes, and minimize communication, use one transaction manager (TM) per node TM may be in the OS (VAX/VMS, Win2K), the app server (IBM CICS), DBMS, or a separate product (early Tandem). TM performs coordinator and participant roles for all transactions at its node. TM communicates with local RMs and remote TMs. Application RM ops StartTransaction, Commit, Rollback TX Resource Manager XA Other TMs 2PC ops Enlist and 2PC ops Transaction Manager 2/14/01

Enlisting in a Transaction When an Application in a transaction T first calls an RM, the RM must tell the TM it is part of T. Called enlisting or joining the transaction Application 2. Write(X, T) 1. StartTransaction (returns Tranction ID) Resource Manager 3. Enlist(T) Transaction Manager 2/14/01

Enlisting in a Transaction (cont’d) When an application A in a transaction T first calls an application B at another node, B must tell its local TM that the transaction has arrived. Application B Application A 5. Call(AP-B, T) 1. Call(AP-B, T) Communications Manager Communications Manager 3. Send Call(AP-B, T) 2. AddBranch(N, T) 4. StartBranch(N, T) Transaction Manager Transaction Manager Node N Node M 2/14/01

Tree of Processes Application calls to RMs and other applications induces a tree of processes Each internal node is the coordinator for its descendants, and a participant to its parents. This adds delay to two-phase commit TM1 Different Nodes RM1 TM2 TM3 TM4 RM2 RM3 RM4 RM5 2/14/01

Handling Multiple Protocols Communication managers solve the problem of handling multiple 2PC protocols by providing a model for communication between address spaces a wire protocol for two-phase commit But, expect restrictions on multi-protocol interoperation. The RM only talks to the TM-RM interface. The multi-protocol problem is solved by the TM vendor. Application Send/receive msg RM ops TX Communication Manager 2PC ops Resource Manager Other TMs XA XA+ Enlist and 2PC ops Transaction Manager 2/14/01

Complete Walkthrough Application: Start-trans Call DBMS Call remote app Commit 5. Call Application 6. Start-branch 2. Call DBMS 1. Start Tran 4. Add-branch 7. Commit Database System Comm Manager Comm Mgr 3. Enlist DBMS 8. Req-prepare 9. Prepared 10. Commit 11. Done Transaction Manager Txn Manager 2/14/01

Customer Checklist Does your DBMS support 2PC? Does your execution environment support it? If so, with what DBMSs? Using what protocol(s)? Do these protocols meet your interoperation needs? Is the TM-DBMS interface open (for home-grown DBMSs)? Can an operator commit/abort a blocked txn? If so, is there automated support for reconciling mistakes? Is there automated heuristic commit? 2/14/01

6.Three Phase Commit- The Idea 3PC prevents blocking in the absence of communications failures (unrealistic, but …). It can be made resilient to communications failures, but then it may block 3PC is much more complex than 2PC, but only marginally improves reliability — prevents some blocking situations. 3PC therefore is not used much in practice Main idea: becoming certain and deciding to commit are separate steps. 3PC ensures that if any operational process is uncertain, then no (failed or operational) process has committed. So, in the termination protocol, if the operational processes are all uncertain, they can decide to abort (avoids blocking). 2/14/01

Three Phase Commit- The Protocol 1. (Begin phase 1) Coordinator C sends Request-to-prepare to all participants 2. Participants vote Prepared or No, just like 2PC. 3. If C receives Prepared from all participants, then (begin phase 2) it sends Pre-Commit to all participants. 4. Participants wait for Abort or Pre-Commit. Participant acknowledges Pre-commit. 5. After C receives acks from all participants, or times out on some of them, it (begin third phase) sends Commit to all participants (that are up) 2/14/01

3PC Failure Handling If coordinator times out before receiving Prepared from all participants, it decides to abort. Coordinator ignores participants that don’t ack its Pre-Commit. Participants that voted Prepared and timed out waiting for Pre-Commit or Commit use the termination protocol. The termination protocol is where the complexity lies. (E.g. see [Bernstein, Hadzilacos, Goodman 87], Section 7.4) 2/14/01