1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems.

2 ICS214B Notes 11. Distributed commit problem
[Figure: transaction T spans three groups of actions: a1, a2; a3; a4, a5]
Commit must be atomic

3 Distributed commit problem
Commit must be atomic despite:
– site failures
– communication failures
– network partitions
– timeout failures
Solution: atomic commit protocol (ACP)
– must ensure that, despite failures, once all failures are repaired the transaction commits or aborts at all sites
Most common ACP: two-phase commit (2PC)
– Centralized 2PC
– Distributed 2PC
– Linear 2PC
– Many other variants…

4 Terminology
Resource managers (RMs)
– Usually databases
Participants (also called cohorts in these notes)
– RMs that did work on behalf of the transaction
Coordinator
– Component that runs two-phase commit on behalf of the transaction

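5 [Message diagram: 2PC, commit case]
Coordinator to participant: REQUEST-TO-PREPARE
Participant to coordinator: PREPARED*
Coordinator to participant: COMMIT*
Participant to coordinator: DONE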

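6 [Message diagram: 2PC, abort case]
Coordinator to participant: REQUEST-TO-PREPARE
Participant to coordinator: NO
Coordinator to participant: ABORT
Participant to coordinator: DONE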

7 States of the transaction
At coordinator:
– Initiated (I): transaction known to system
– Preparing (P): prepare message sent to participants
– Committed (C): has committed
– Aborted (A): has aborted
At participant:
– Initiated (I)
– Prepared (P): prepared to commit, if the coordinator so desires
– Committed (C)
– Aborted (A)

8 Protocol database
The coordinator maintains a protocol database (in main memory) with an entry for each transaction
Protocol database
– enables the coordinator to execute 2PC
– answers inquiries by participants about the status of a transaction (cohorts may make such inquiries during recovery after a failure)
– the entry for a transaction is deleted when the coordinator is sure that no one will ever inquire about the transaction again (when it has been acked by all the participants)

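9 Two-phase commit (messages)
[State diagrams; each transition is labeled incoming message / outgoing message]
Coordinator (states I, P, C, A, F):
– I to P: commit-request / request-prepare*
– P to A: no / abort*
– P to C: prepared* / commit*
– to F: ack*
Participant (states I, P, C, A):
– I to P: request-prepare / prepared
– I to A: request-prepare / no
– P to C: commit / ack
– P to A: abort / ack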

10 Notation: incoming message / outgoing message (* = everyone)
When a participant enters the “P” state:
– it must have acquired all resources
– it can only abort or commit if so instructed by a coordinator
The coordinator only enters the “C” state if all participants are in “P”, i.e., it is certain that all will eventually commit

11 Two-phase commit: normal actions (coordinator)
– Make an entry in the protocol database for the transaction, marking its status as initiated, when the coordinator first learns about the transaction.
– Add each participant to the cohort list in the protocol database when the coordinator learns about the cohorts.
– Change the status of the transaction to preparing before sending the prepare message. (It is assumed that the coordinator knows about all the participants before this step.)
– On receipt of a PREPARED message from a cohort, mark the cohort as PREPARED. If all cohorts are PREPARED, change the status to COMMITTED and send the COMMIT message. The coordinator must force a commit log record to disk before sending the commit message.
– On receipt of an ACK message from a cohort, mark the cohort as ACKED. When all cohorts have acked, delete the transaction's entry from the protocol database. The coordinator must write a completed log record to disk before the deletion. There is no need to force the write, though.
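The coordinator actions above can be sketched as a small in-memory state machine. This is only an illustration under stated assumptions: the class name `Coordinator2PC`, the `outbox` list standing in for the network, the `(record, forced)` log tuples, and the `FORGOTTEN` status (standing in for deletion from the protocol database) are all hypothetical, and abort logging is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator2PC:
    """Hypothetical protocol-database entry for one transaction."""
    cohorts: set
    status: str = "INITIATED"
    prepared: set = field(default_factory=set)
    acked: set = field(default_factory=set)
    log: list = field(default_factory=list)     # (record, forced) pairs
    outbox: list = field(default_factory=list)  # (destination, message) pairs

    def _send_all(self, message):
        for cohort in sorted(self.cohorts):
            self.outbox.append((cohort, message))

    def start(self):
        # Status changes to PREPARING before any prepare message is sent.
        self.status = "PREPARING"
        self._send_all("REQUEST-TO-PREPARE")

    def on_vote(self, cohort, vote):
        if self.status != "PREPARING":
            return
        if vote == "NO":
            # A single NO vote aborts the transaction everywhere.
            self.status = "ABORTED"
            self._send_all("ABORT")
            return
        self.prepared.add(cohort)
        if self.prepared == self.cohorts:
            # Commit record is forced to disk before COMMIT is sent.
            self.log.append(("COMMIT", True))
            self.status = "COMMITTED"
            self._send_all("COMMIT")

    def on_ack(self, cohort):
        self.acked.add(cohort)
        if self.status == "COMMITTED" and self.acked == self.cohorts:
            # Completion record is written but need not be forced.
            self.log.append(("COMPLETE", False))
            self.status = "FORGOTTEN"  # entry may now be deleted
```

With two cohorts, the entry stays in PREPARING until the second PREPARED vote arrives, and only then is the forced commit record appended.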

12 Two-phase commit: normal actions (participant)
On receipt of the PREPARE message, write a PREPARED log record before sending the PREPARED message
– the record must be forced to disk, since the coordinator may now commit.
On receipt of the COMMIT message, write a COMMIT log record before sending the ACK to the coordinator
– the cohort must ensure the log is forced to disk before sending the ack, but there is no great urgency for doing so.
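A matching participant-side sketch, again only illustrative: the class name `Participant2PC`, the `can_commit` flag, and the `(record, forced)` log tuples are assumptions made for this example. Each handler returns the reply the participant would send to the coordinator.

```python
class Participant2PC:
    """Hypothetical participant (cohort) for one transaction."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit  # assumed local vote
        self.state = "INITIATED"
        self.log = []                 # (record, forced) pairs

    def on_request_to_prepare(self):
        if not self.can_commit:
            self.state = "ABORTED"
            return "NO"
        # PREPARED record is forced before the PREPARED reply goes out,
        # because the coordinator may commit as soon as it sees the reply.
        self.log.append(("PREPARED", True))
        self.state = "PREPARED"
        return "PREPARED"

    def on_decision(self, decision):
        """decision is "COMMIT" or "ABORT", as sent by the coordinator."""
        if self.state == "PREPARED":
            # The decision record must be on disk before the ACK is sent.
            self.log.append((decision, True))
            self.state = "COMMITTED" if decision == "COMMIT" else "ABORTED"
        return "ACK"
```

A prepared participant never changes state on its own here; only `on_decision` moves it out of PREPARED, which is exactly what makes it vulnerable to blocking.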

13 Timeout actions
At various stages of the protocol, the transaction waits for messages at both the coordinator and the participants. If a message is not received before the timeout, a timeout action is executed.
Coordinator timeout actions:
– Waiting for votes of participants: ABORT the transaction and send aborts to all.
– Waiting for an ack from some participant: forward the transaction to the recovery process, which periodically sends COMMIT to the participant. When the participant has recovered and all participants have sent an ACK, the coordinator writes a completion log record and deletes the entry from the protocol database.
Cohort timeout actions:
– Waiting for prepare: abort the transaction and send an abort message to the coordinator. Alternatively, the cohort could wait for the coordinator to ask for prepare.
– Waiting for decision: forward the transaction to the recovery process, which executes a status-transaction call to the coordinator. Such a transaction is blocked until the failure is repaired. The participant could instead use a different termination protocol, e.g., polling other participants (cooperative termination).
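The four timeout cases above fit in a lookup table. The sketch below is a summary aid only; the role names, wait-point names, and action labels are made up for this example.

```python
# Hypothetical labels for the four timeout cases on this slide.
TIMEOUT_ACTIONS = {
    ("coordinator", "votes"): "ABORT_AND_NOTIFY_ALL",
    ("coordinator", "ack"): "RECOVERY_PROCESS_RESENDS_COMMIT",
    ("cohort", "prepare"): "ABORT_AND_NOTIFY_COORDINATOR",
    ("cohort", "decision"): "RECOVERY_PROCESS_ASKS_COORDINATOR",
}


def timeout_action(role, waiting_for):
    """Return the action taken when `role` times out at `waiting_for`."""
    return TIMEOUT_ACTIONS[(role, waiting_for)]


def is_blocked(role, waiting_for):
    """Only a prepared cohort waiting for the decision is blocked."""
    return (role, waiting_for) == ("cohort", "decision")
```

The table makes the asymmetry visible: three of the four cases resolve locally or by retrying, and only the cohort waiting for a decision depends on someone else's recovery.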

14 2PC is blocking
Sample scenario: [Figure: coordinator with participants P1, P2, P3, P4; P2, P3, and P4 are in state W]

15 Case I: P1 → “W”; coordinator sent commits; P1 → “C”
Case II: P1 → NO; P1 → A
⇒ P2, P3, P4 (surviving participants) cannot safely abort or commit the transaction
[Figure: coord, P1, P2, P3, P4; P2, P3, and P4 are in state w]

16 Recovery actions (cohort)
All sites execute a REDO-UNDO pass
Detection: a site knows it is a cohort if it finds a prepared log record for a transaction
If the log does not contain a commit log record:
– reacquire all locks for the transaction
– ask the coordinator for the status of the transaction
If the log contains a commit log record:
– do nothing
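The cohort's recovery decision can be written as a function of the log records found for one transaction. This is a minimal sketch following the slide literally; the record names and the returned action labels are hypothetical.

```python
def cohort_recovery_action(log_records):
    """Decide what a recovering site does for one transaction.

    log_records: record types found in the local log for the
    transaction, e.g. ["PREPARED", "COMMIT"] (names are assumed).
    """
    if "PREPARED" not in log_records:
        # No prepared record: the site does not treat itself as a cohort.
        return "NOT_A_COHORT"
    if "COMMIT" in log_records:
        # Outcome already known locally.
        return "NOTHING"
    # Prepared but undecided: hold locks and ask the coordinator.
    return "REACQUIRE_LOCKS_AND_ASK_COORDINATOR"
```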

17 Recovery actions (coordinator)
If the protocol database was made fault-tolerant by logging every change, simply reconstruct the protocol database and restart 2PC from the point of failure.
However, since only the commit and completion transitions are logged:
– If the log does not contain a commit record, simply abort the transaction. If a cohort asks for the status in the future, the transaction is not in the protocol database and is considered aborted.
– If there is a commit log record but no completion log record, recreate the transaction's entry as committed in the protocol database; the recovery process will ask all the participants whether they are still waiting for a commit message. If no one is waiting, the completion entry is written.
– If there is both a commit log record and a completion log record, do nothing.
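The three coordinator recovery cases can be sketched the same way. The record names `COMMIT` and `COMPLETE` and the action labels are assumptions for illustration.

```python
def coordinator_recovery_action(log_records):
    """Decide what a recovering coordinator does for one transaction.

    log_records: record types found in the coordinator's log for the
    transaction (names are assumed for this sketch).
    """
    has_commit = "COMMIT" in log_records
    has_complete = "COMPLETE" in log_records
    if not has_commit:
        # Nothing to recreate; later inquiries find no entry and
        # therefore see the transaction as aborted.
        return "ABORT"
    if not has_complete:
        # Committed, but some cohort may still be waiting for COMMIT.
        return "RECREATE_COMMITTED_ENTRY"
    return "NOTHING"
```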

18 2PC analysis
Count the number of messages, log writes, and forced log writes.
Normal processing overhead:
– Coordinator: 2 log writes (commit/abort, complete), 1 of them forced, plus 2 messages per cohort
– Cohort: 2 log writes, both forced (prepared, committed/aborted), plus 2 messages to the coordinator
Presumed abort optimization:
– If there is no entry in the protocol database, the transaction is presumed to have aborted.
– If a transaction aborts, delete its entry from the protocol database. No log record is written and no ACKs are required from cohorts, since absence of the transaction from the protocol database is the same as abort.
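The per-site counts above add up as follows for a transaction with n cohorts. The sketch below simply totals the slide's numbers and shows the presumed-abort lookup; the function names and the dictionary-based protocol database are hypothetical.

```python
def two_pc_commit_cost(n_cohorts):
    """Total normal-case overhead of 2PC, using the per-site counts above."""
    coordinator = {"log_writes": 2, "forced": 1, "messages": 2 * n_cohorts}
    per_cohort = {"log_writes": 2, "forced": 2, "messages": 2}
    return {key: coordinator[key] + n_cohorts * per_cohort[key]
            for key in coordinator}


def status_inquiry(protocol_db, txn_id):
    """Presumed abort: a missing entry is answered as ABORTED."""
    return protocol_db.get(txn_id, "ABORTED")
```

For example, with three cohorts the totals are 4n = 12 messages, 2 + 2n = 8 log writes, and 1 + 2n = 7 forced log writes.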

19 Variants of 2PC
– Linear
– Hierarchical
[Figure: topologies for the two variants; surviving labels: Coord, ok, commit]

20 Variants of 2PC
Distributed
– Nodes broadcast all messages
– Every node knows when to commit

21 Cooperative termination protocol
Bad case
– Participant P recovers from failure
– Has a prepared record for transaction T
– No commit or abort record for T
– Coordinator is down
Participant P is blocked until the coordinator recovers

22 Cooperative termination protocol
But perhaps some other participant can help?
Requires that participants “know” each other!

23 Cooperative termination protocol
Participant P sends a DECISION-REQUEST message to the other participants
Alive participants respond with COMMIT, ABORT, or UNCERTAIN
If any participant replies with a decision (COMMIT or ABORT), P acts on the decision
– and sends the decision to the UNCERTAIN participants

24 Cooperative termination protocol
When P receives a DECISION-REQUEST
– If it knows the decision, it responds with COMMIT or ABORT
– If it has not prepared the transaction, it responds ABORT
– If it is prepared but does not know the decision, it responds UNCERTAIN

25 Cooperative termination
Sample scenario: [Figure: Coord with participants P1 in state C, P2 in state W, P3 in state W]

26 Cooperative termination
Sample scenario: [Figure: Coord with participants P1 in state W, P2 in state W, P3 in state A]

27 Cooperative termination
Sample scenario: [Figure: Coord with participants P1 in state W, P2 in state W, P3 in state W]
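The response rules and the three sample scenarios above can be checked with a short sketch. It assumes the scenario letters map to local states as C = committed, A = aborted, W = prepared and uncertain; the state names and both function names are hypothetical.

```python
def respond_to_decision_request(local_state):
    """Reply of a participant that receives a DECISION-REQUEST."""
    if local_state == "COMMITTED":
        return "COMMIT"
    if local_state == "ABORTED":
        return "ABORT"
    if local_state == "INITIATED":
        # Not yet prepared, so it is still free to abort.
        return "ABORT"
    return "UNCERTAIN"  # prepared, decision unknown


def resolve(replies):
    """Outcome for the requester given the replies it collected."""
    for reply in replies:
        if reply in ("COMMIT", "ABORT"):
            return reply
    # Every reachable participant is uncertain: still blocked.
    return "BLOCKED"
```

Under these assumptions, the scenario with P1 in C resolves to COMMIT, the one with P3 in A resolves to ABORT, and the all-W scenario stays blocked, which is why cooperative termination reduces blocking but does not remove it.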

28 Is there a non-blocking protocol?
Theorem: If communication failures or total site failures (i.e., all sites are down simultaneously) are possible, then every atomic commit protocol may cause processes to become blocked.
Two exceptions:
– If we ignore communication failures, it is possible to design such a protocol (Skeen et al. 83).
– If we impose some restrictions on transactions (i.e., what data they can read/write), such a protocol can also be designed (Mehrotra et al. 92).

29 Next…
Three-phase commit (3PC)
– Non-blocking if the network is reliable (no communication failures) and there are no total site failures
– Handling communication failures

30 Why does 2PC block?
Because an operational site that times out in the prepared state does not know whether the failed site(s) had committed or aborted the transaction.
Polling all operational sites does not work, since all the operational sites might be in doubt.

31 Approach to making an ACP non-blocking
For a given state S of a transaction T in the ACP, let the concurrency set of S be the set of states that other sites could be in while some site is in S. For example, in 2PC, the concurrency set of the PREPARE state is {PREPARE, ABORT, COMMIT}.
To develop a non-blocking protocol, we ensure that:
– no state's concurrency set contains both a commit and an abort;
– there exists no non-committable state whose concurrency set contains a commit. A state is committable if occupancy of the state by any site implies that everyone has voted to commit the transaction.
Necessity of these conditions is illustrated by considering a situation with only one site operational: if either condition is violated, there will be blocking.
Sufficiency is illustrated by designing a termination protocol that terminates the protocol correctly if the above conditions hold.
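The two conditions above can be checked mechanically once the concurrency sets are written down. In the sketch below, only the 2PC set for the prepared state (written W) comes from the slide; every other concurrency set, the state letters (I, W, PC, A, C), and the function name are assumptions made for illustration.

```python
def blocking_states(concurrency, committable):
    """Return the states that violate either non-blocking condition.

    concurrency: dict mapping a state to its concurrency set.
    committable: set of states that imply everyone voted to commit.
    """
    bad = []
    for state in sorted(concurrency):
        others = concurrency[state]
        both = "C" in others and "A" in others
        premature = state not in committable and "C" in others
        if both or premature:
            bad.append(state)
    return bad


# 2PC: the set for W is taken from the slide, the rest are assumed.
TWO_PC = {
    "I": {"I", "W", "A"},
    "W": {"W", "A", "C"},
    "A": {"I", "W", "A"},
    "C": {"W", "C"},
}

# 3PC with an extra precommitted state PC (all sets assumed).
THREE_PC = {
    "I": {"I", "W", "A"},
    "W": {"I", "W", "A", "PC"},
    "PC": {"W", "PC", "C"},
    "A": {"I", "W", "A"},
    "C": {"PC", "C"},
}
```

With these sets, `blocking_states(TWO_PC, {"C"})` flags only W, while `blocking_states(THREE_PC, {"PC", "C"})` flags nothing: the precommitted state keeps commit out of the concurrency set of the uncertain state.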

32 Three-phase commit
Sample scenario: [Figure: Coord with participants P1 in state W, P2 in state W, P3 in state W]

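33 [Message diagram: 2PC and the uncertainty period]
Coordinator to participant: REQUEST-TO-PREPARE
Participant to coordinator: PREPARED
Coordinator to participant: COMMIT/ABORT
Participant to coordinator: DONE
Uncertainty period: at the participant, from sending PREPARED until receiving COMMIT/ABORT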

34 3PC principle
If ANY operational site is in the “uncertain” state, NO site (operational or failed) could have decided to commit
Reminder: assume a reliable network

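35 [Message diagram: 3PC, commit case]
Coordinator to participant: REQUEST-TO-PREPARE
Participant to coordinator: PREPARED
Coordinator to participant: PRECOMMIT
Participant to coordinator: ACK
Coordinator to participant: COMMIT
Participant to coordinator: DONE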

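36 [Message diagram: 3PC, abort case]
Coordinator to participant: REQUEST-TO-PREPARE
Participant to coordinator: NO
Coordinator to participant: ABORT
Participant to coordinator: DONE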

37 [Message diagram: 3PC logging]
Messages in order: REQUEST-PREPARE, PREPARED, PRECOMMIT, ACK, COMMIT
Coordinator:
– Log start-3PC record (participant list)
– Log commit record (state C)
Participant:
– Log prepared record (state W)
– Log committed record (state C)

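38 [Message diagram: 3PC timeouts]
Messages in order: REQUEST-PREPARE, PREPARED, PRECOMMIT, ACK, COMMIT
Coordinator:
– 1. Timeout waiting for PREPARED: abort
– 2. Timeout waiting for ACK: ignore
Participant:
– 1. Timeout waiting for REQUEST-PREPARE: abort
– 2. Timeout waiting for PRECOMMIT: termination protocol
– 3. Timeout waiting for COMMIT: termination protocol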

39 Process categories
Three categories
– Operational: process has been up since the start of 3PC
– Failed: process has halted since the start of 3PC, or is recovering
– Recovered: process that failed and has completed recovery

40 Three-phase commit: termination protocol
Choose a backup coordinator from the remaining operational sites.
The backup coordinator sends messages to the other operational sites asking them to make the transition to its local state (or to find out that such a transition is not feasible) and waits for their responses.
Based on the responses as well as its local state, it goes on to commit or abort the transaction: it commits if its concurrency set includes a commit state; otherwise, it aborts.

41 Termination protocol
[Timeline: start 3PC, coordinator fails, decision reached, all sites learn decision]
Only operational processes participate in the termination protocol.
Recovered processes wait until a decision is reached and then learn the decision.

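42 [Message diagram: participant states in 3PC]
Messages in order: REQUEST-PREPARE, PREPARED, PRECOMMIT, ACK, COMMIT
Participant states:
– Abortable (A): before sending PREPARED
– Uncertain (U): after sending PREPARED, before receiving PRECOMMIT
– Precommitted (PC): after receiving PRECOMMIT, before receiving COMMIT
– Committed (C): after receiving COMMIT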

43 Termination protocol
Elect a new coordinator
– Use election protocol (coming soon…)
New coordinator sends STATE-REQUEST to participants
Makes a decision using the termination rules
Communicates the decision to participants

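44 [Message diagram: termination rule, some participant abortable]
Coordinator to participants: STATE-REQUEST*
Participant to coordinator: ABORTABLE
Coordinator to participants: ABORT*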

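45 [Message diagram: termination rule, some participant committed]
Coordinator to participants: STATE-REQUEST*
Participant to coordinator: COMMITTED
Coordinator to participants: COMMIT*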

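46 [Message diagram: termination rule, all participants uncertain]
Coordinator to participants: STATE-REQUEST*
Participants to coordinator: UNCERTAIN*
Coordinator to participants: ABORT*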

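47 [Message diagram: termination rule, some participant precommitted and none committed]
Coordinator to participants: STATE-REQUEST*
Participant to coordinator: PRECOMMITTED, no COMMITTED
Coordinator to participants: PRECOMMIT*
Participants to coordinator: ACK*
Coordinator to participants: COMMIT*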

48 Termination protocol
Sample scenario: [Figure: Coord with participants P1 in state W, P2 in state W, P3 in state W]

49 Termination protocol
Sample scenario: [Figure: Coord with participants P1 in state W, P2 in state W, P3 in state PC]
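The four termination rules and the two sample scenarios above can be sketched as one function over the states reported in response to STATE-REQUEST. The function name, the state strings, and the convention of returning the list of messages to broadcast are assumptions; the sketch also assumes the W in the scenarios denotes the uncertain state.

```python
def termination_decision(states):
    """Messages the new coordinator broadcasts, in order.

    states: states of the operational sites (including the new
    coordinator itself), each one of "ABORTABLE", "UNCERTAIN",
    "PRECOMMITTED", or "COMMITTED".
    """
    if "ABORTABLE" in states:
        return ["ABORT"]
    if "COMMITTED" in states:
        return ["COMMIT"]
    if all(s == "UNCERTAIN" for s in states):
        # No site can have committed, so aborting is safe.
        return ["ABORT"]
    # Some site is precommitted and none is committed: bring everyone
    # to precommitted first, collect ACKs, then commit.
    return ["PRECOMMIT", "COMMIT"]
```

Under these assumptions, the first scenario (all sites uncertain) ends in ABORT, and the second (one precommitted site) ends in PRECOMMIT followed by COMMIT.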

50 Note: 3PC is unsafe with communication failures!
[Figure: a partitioned network; the group of sites in state W (W, W, W) decides abort, while the group in state P (P, P) decides commit]

51 After the coordinator receives the DONE message, it can forget about the transaction
– E.g., clean up control structures
