Presentation is loading. Please wait.

Presentation is loading. Please wait.

Fault Tolerance CSCI 4780/6780. Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!!

Similar presentations


Presentation on theme: "Fault Tolerance CSCI 4780/6780. Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!!"— Presentation transcript:

1 Fault Tolerance CSCI 4780/6780

2 Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!! Two phase commit & three phase commit Two phase commit –Coordinator sends a VOTE_REQUEST –Participant sends a VOTE_COMMIT or VOTE_ABORT –Coordinator collects all votes and sends GLOBAL_COMMIT or GLOBAL_ABORT to all –Processes commit or abort the transaction

3 Two-Phase Commit (1) a)The finite state machine for the coordinator in 2PC. b)The finite state machine for a participant.

4 2 Phase Commit with Failures Process failures can lead to indefinite blocking Timeout mechanisms Wait states –INIT of a participant: Abort and send VOTE_ABORT –WAIT of coordinator: Send VOTE_ABORT –READY of participant When participant P is ready it can ask other participant Q –If Q is in INIT, Abort the transaction –If Q has received commit or Abort act accordingly –If Q has in WAIT, BLOCK

5 Two-Phase Commit (2) Actions taken by a participant P when residing in state READY and having contacted another participant Q. State of QAction by P COMMITMake transition to COMMIT ABORTMake transition to ABORT INITMake transition to ABORT READYContact another participant

6 Coordinator Actions Record WAIT and then multicast VOTE_REQUEST to everyone After all decisions have been received, record the decision and then multicast

7 Participant Actions Waits for a vote request Upon receiving a request, the participant decides the vote Records the vote and replies Logs the global decision and then executes DECISION_REQUEST if timeout

8 Two-Phase Commit (3) Outline of the steps taken by the coordinator in a two phase commit protocol actions by coordinator: while START _2PC to local log; multicast VOTE_REQUEST to all participants; while not all votes have been collected { wait for any incoming vote; if timeout { while GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants; exit; } record vote; } if all participants sent VOTE_COMMIT and coordinator votes COMMIT{ write GLOBAL_COMMIT to local log; multicast GLOBAL_COMMIT to all participants; } else { write GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants; }

9 Two-Phase Commit (4) Steps taken by participant process in 2PC. actions by participant: write INIT to local log; wait for VOTE_REQUEST from coordinator; if timeout { write VOTE_ABORT to local log; exit; } if participant votes COMMIT { write VOTE_COMMIT to local log; send VOTE_COMMIT to coordinator; wait for DECISION from coordinator; if timeout { multicast DECISION_REQUEST to other participants; wait until DECISION is received; /* remain blocked */ write DECISION to local log; } if DECISION == GLOBAL_COMMIT write GLOBAL_COMMIT to local log; else if DECISION == GLOBAL_ABORT write GLOBAL_ABORT to local log; } else { write VOTE_ABORT to local log; send VOTE ABORT to coordinator; }

10 Two-Phase Commit (5) Steps taken for handling incoming decision requests. actions for handling decision requests: /* executed by separate thread */ while true { wait until any incoming DECISION_REQUEST is received; /* remain blocked */ read most recently recorded STATE from the local log; if STATE == GLOBAL_COMMIT send GLOBAL_COMMIT to requesting participant; else if STATE == INIT or STATE == GLOBAL_ABORT send GLOBAL_ABORT to requesting participant; else skip; /* participant remains blocked */

11 Three-Phase Commit The states of the coordinator and each participant satisfy the following two conditions: 1.There is no single state from which it is possible to make a transition directly to either a COMMIT or an ABORT state. 2.There is no state in which it is not possible to make a final decision, and from which a transition to a COMMIT state can be made.

12 FSM of Three-Phase Commit

13 Checkpointing Saving the state of the system to stable storage The recorded state should be globally consistent If the delivery of a message is recorded, its sending must be recorded too Each process records its state locally System should recover to most recent distributed snapshot – Recovery line

14 Recovery Line

15 Independent Checkpointing Each process records its state independently On failure and recovery, each process recovers to its most recently saved state If the states do not form global snapshot, process have to roll back further – Domino effect Computing dependencies is hard Garbage collection is hard too

16 Domino Effect

17 Coordinated Checkpointing Processes synchronize when checkpointing Coordinator sends CHECKPOINT_REQUEST Processes take local checkpoint, queues subsequent messages and send out CHECKPOINT_DONE message Coordinator collects all CHECKPOINT_DONE messages and multicasts a CHECKPOINT_DONE message No messages sent until processes receive CHECKPOINT_DONE message

18 Message-Logging Schemes


Download ppt "Fault Tolerance CSCI 4780/6780. Distributed Commit Commit – Making an operation permanent Transactions in databases One phase commit does not work !!!"

Similar presentations


Ads by Google