Transactions ECEN 5053 Software Engineering of Distributed Systems University of Colorado Initially prepared by: David Leberknight.

Instructor’s Guide for Coulouris, Dollimore and Kindberg, Distributed Systems: Concepts and Design, Edn. 3 © Addison-Wesley Publishers 2000

Key Points so Far
- Goal: objects managed by a server must remain in a consistent state
- Objects must be recoverable
- Must deal with crash failures of processes and omission failures of communication
- Asynchronous systems
- Design objects for safe concurrent access
- Keep atomic operations free from interference from concurrent operations in other threads, some of which are cooperating threads
- The coordinator owns the responsibility for ensuring the ACID properties
- Ensure serially equivalent interleaving (with respect to conflicting operations)

Figure 13.11 A dirty read when transaction T aborts

Transaction T:                          Transaction U:
  a.getBalance()                          a.getBalance()
  a.setBalance(balance + 10)              a.setBalance(balance + 20)

balance = a.getBalance()         $100
a.setBalance(balance + 10)       $110
                                        balance = a.getBalance()         $110
                                        a.setBalance(balance + 20)       $130
                                        commit transaction
abort transaction

Recoverability of transactions
- If a transaction (like U) commits after seeing the effects of a transaction that subsequently aborted, the situation is not recoverable
  - e.g. U waits until T commits or aborts
  - if T aborts then U must also abort
- For recoverability: a commit is delayed until after the commitment of any other transaction whose state has been observed
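
The recoverability rule can be illustrated with a small sketch. This is not code from the course or the textbook: the class `RecoverableTx` and its methods are hypothetical names, and the sketch assumes each transaction simply remembers which other transactions it has read from.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a transaction may commit only after every
// transaction whose effects it has observed has committed.
class RecoverableTx {
    enum State { ACTIVE, COMMITTED, ABORTED }

    State state = State.ACTIVE;
    private final Set<RecoverableTx> readFrom = new HashSet<>();

    // Record that this transaction has seen a value written by `writer`.
    void observed(RecoverableTx writer) {
        if (writer != this) readFrom.add(writer);
    }

    void abort() {
        if (state == State.ACTIVE) state = State.ABORTED;
    }

    // Attempt to commit; returns the state after the attempt.
    // ACTIVE means "commit delayed": some observed transaction is undecided.
    State tryCommit() {
        if (state != State.ACTIVE) return state;
        for (RecoverableTx w : readFrom) {
            if (w.state == State.ABORTED) {   // saw effects of an aborted transaction
                state = State.ABORTED;
                return state;
            }
        }
        for (RecoverableTx w : readFrom) {
            if (w.state == State.ACTIVE) return state;  // delay the commit
        }
        state = State.COMMITTED;
        return state;
    }
}
```

In the dirty-read scenario, U would call `observed(T)` after reading the balance that T wrote; `U.tryCommit()` then leaves U active until T decides, and aborts U if T aborts.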

Cascading aborts
- Suppose that U delays committing until after T aborts.
  - then, U must abort as well.
  - if any other transactions have seen the effects due to U, they too must be aborted.
  - the aborting of these latter transactions may cause still further transactions to be aborted.
- Such situations are called cascading aborts.
- To avoid cascading aborts, transactions are only allowed to read objects written by committed transactions. To ensure this, any read operation must be delayed until other transactions that applied a write operation to the same object have committed or aborted.
- Avoidance of cascading aborts is a stronger condition than recoverability

Strict executions of transactions
- Executions of transactions are called strict if both read and write operations on an object are delayed until all transactions that previously wrote that object have either committed or aborted.
- The strict execution of transactions enforces the desired property of isolation
- Tentative versions are used during progress of a transaction
  - objects in tentative versions are stored in volatile memory

Figure 13.14 Transactions T and U with exclusive locks

Key Points continued
- Prevent dirty reads
- A commit is delayed until after the commitment of any other transaction whose state has been observed
- Avoid cascading aborts
- Strict execution of transactions enforces the I (isolation) in ACID
- Locking provides the mutual exclusion support needed to get serial equivalence and avoid dirty reads

Figure 12.13 Nested transactions
[Figure: T is the top-level transaction, with T1 = openSubTransaction and T2 = openSubTransaction. Subtransactions shown: T1, T2, T11, T12, T21, T211. Outcome labels in the figure: prov. commit, abort, commit.]

Nested transactions (12.3)
- To a parent, a subtransaction is atomic with respect to failures and concurrent access
- Transactions at the same level (e.g. T1 and T2) can run concurrently, but access to common objects is serialized
- A subtransaction can fail independently of its parent and other subtransactions
  - When it aborts, its parent decides what to do, e.g. start another subtransaction or give up

Advantages of nested transactions
- Subtransactions may run concurrently with other subtransactions at the same level.
  - this allows additional concurrency in a transaction.
  - when subtransactions run in different servers, they can work in parallel.
    - e.g. consider the branchTotal operation
    - it can be implemented by invoking getBalance at every account in the branch.
- Subtransactions can commit or abort independently.
  - This is potentially more robust
  - A parent can decide on different actions according to whether a subtransaction has aborted or not

Commitment of nested transactions
- A transaction may commit or abort only after its child transactions have completed.
- A subtransaction decides independently to commit provisionally or to abort. Its decision to abort is final.
- When a parent aborts, all of its subtransactions are aborted.
- When a subtransaction aborts, the parent can decide whether to abort or not.
- If the top-level transaction commits, then all of the subtransactions that have provisionally committed can commit too, provided that none of their ancestors has aborted.
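
The commitment rules above can be sketched as a small tree of transactions. The class `NestedTx`, its `Status` values and its method names are hypothetical, chosen only for illustration; the sketch assumes that aborting a parent immediately marks its whole subtree as aborted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a nested transaction tree.
class NestedTx {
    enum Status { ACTIVE, PROVISIONALLY_COMMITTED, COMMITTED, ABORTED }

    final String name;
    final NestedTx parent;                      // null for the top-level transaction
    final List<NestedTx> children = new ArrayList<>();
    Status status = Status.ACTIVE;

    NestedTx(String name, NestedTx parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }

    // A subtransaction decides independently to commit provisionally.
    void provisionalCommit() {
        if (status == Status.ACTIVE) status = Status.PROVISIONALLY_COMMITTED;
    }

    // An abort is final, and aborting a parent aborts all of its subtransactions.
    void abort() {
        status = Status.ABORTED;
        for (NestedTx child : children) child.abort();
    }

    // Top-level commit: provisionally committed descendants commit too,
    // but only along paths on which no ancestor has aborted.
    void commitTopLevel() {
        if (parent != null) throw new IllegalStateException("not a top-level transaction");
        if (status == Status.ABORTED) return;
        status = Status.COMMITTED;
        for (NestedTx child : children) child.commitIfProvisional();
    }

    private void commitIfProvisional() {
        if (status != Status.PROVISIONALLY_COMMITTED) return;  // aborted subtree is skipped
        status = Status.COMMITTED;
        for (NestedTx child : children) child.commitIfProvisional();
    }
}
```

For example, if T21 commits provisionally and its parent T2 later aborts, `commitTopLevel()` on T commits T1 but leaves T21 aborted.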

Summary on transactions
- We consider only transactions at a single server. They are:
- atomic in the presence of concurrent transactions
  - which can be achieved by serially equivalent executions
- atomic in the presence of server crashes
  - they save committed state in permanent storage (recovery: Ch. 14, 4th edition)
  - they use strict executions to allow for aborts
  - they use tentative versions to allow for commit/abort
- nested transactions are structured from sub-transactions
  - they allow concurrent execution of sub-transactions
  - they allow independent recovery of sub-transactions

Introduction to concurrency control
- Transactions must be scheduled so that their effect on shared objects is serially equivalent
- For serial equivalence:
  (a) all access by a transaction to a particular object must be serialized with respect to another transaction’s access;
  (b) all pairs of conflicting operations of two transactions should be executed in the same order.
- A server can achieve serial equivalence by serializing access to objects, e.g. by the use of locks
- Two-phase locking has a ‘growing’ and a ‘shrinking’ phase: to ensure (b), a transaction is not allowed any new locks after it has released a lock

Figure 13.15 Lock compatibility

For one object              Lock requested
                            read      write
Lock already set   none     OK        OK
                   read     OK        wait
                   write    wait      wait
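
The compatibility table can be written as a single function. This is a minimal sketch; `LockCompat` and `Mode` are hypothetical names, and the function only answers the question for a request made by a different transaction from the current holder.

```java
// Hypothetical encoding of the lock compatibility table for one object.
class LockCompat {
    enum Mode { NONE, READ, WRITE }

    // Returns true ("OK") if a lock in mode `requested` can be granted to
    // another transaction while the object is locked in mode `held`,
    // and false ("wait") otherwise.
    static boolean compatible(Mode held, Mode requested) {
        if (requested == Mode.NONE) {
            throw new IllegalArgumentException("a request must be READ or WRITE");
        }
        switch (held) {
            case NONE:
                return true;                       // nothing set: any request is OK
            case READ:
                return requested == Mode.READ;     // read locks are shared
            default:
                return false;                      // a write lock excludes everything
        }
    }
}
```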

Figure 13.16 Use of locks in strict two-phase locking
1. When an operation accesses an object within a transaction:
   (a) If the object is not already locked, it is locked and the operation proceeds.
   (b) If the object has a conflicting lock set by another transaction, the transaction must wait until it is unlocked.
   (c) If the object has a non-conflicting lock set by another transaction, the lock is shared and the operation proceeds.
   (d) If the object has already been locked in the same transaction, the lock will be promoted if necessary and the operation proceeds. (Where promotion is prevented by a conflicting lock, rule (b) is used.)
2. When a transaction is committed or aborted, the server unlocks all objects it locked for the transaction.

Figure 13.17 Lock class

    import java.util.Vector;

    public class Lock {
        private Object object;      // the object being protected by the lock
        private Vector holders;     // the TIDs of current holders
        private LockType lockType;  // the current type

        public synchronized void acquire(TransID trans, LockType aLockType) {
            while (/* another transaction holds the lock in conflicting mode */) {
                try {
                    wait();
                } catch (InterruptedException e) { /*...*/ }
            }
            if (holders.isEmpty()) {  // no TIDs hold lock
                holders.addElement(trans);
                lockType = aLockType;
            } else if (/* another transaction holds the lock, share it */) {
                if (/* this transaction not a holder */) holders.addElement(trans);
            } else if (/* this transaction is a holder but needs a more exclusive lock */)
                lockType.promote();
        }

Continues on next slide

Figure 13.17 continued

        public synchronized void release(TransID trans) {
            holders.removeElement(trans);  // remove this holder
            // set locktype to none
            notifyAll();                   // wake every transaction waiting on this lock
        }
    }

LockManager class

    import java.util.Enumeration;
    import java.util.Hashtable;

    public class LockManager {
        private Hashtable theLocks;

        public void setLock(Object object, TransID trans, LockType lockType) {
            Lock foundLock;
            synchronized (this) {
                // find the lock associated with object
                // if there isn’t one, create it and add to the hashtable
            }
            // acquire may block, so it is called outside the synchronized block
            foundLock.acquire(trans, lockType);
        }

        // synchronize this one because we want to remove all entries
        public synchronized void unLock(TransID trans) {
            Enumeration e = theLocks.elements();
            while (e.hasMoreElements()) {
                Lock aLock = (Lock) (e.nextElement());
                if (/* trans is a holder of this lock */) aLock.release(trans);
            }
        }
    }
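
The Lock and LockManager figures leave their conditions as comments. One way they might be filled in is sketched below. This is an illustration under stated assumptions, not the textbook's implementation: `SimpleLock`, `SimpleLockManager` and the helper `wouldWait` are hypothetical names, transaction identifiers are plain strings, and an unheld lock is represented by an empty holder set rather than a "none" lock type.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical read/write lock following rules 1(a)-(d) of strict two-phase locking.
class SimpleLock {
    enum Type { READ, WRITE }

    private final Set<String> holders = new HashSet<>();  // transaction ids
    private Type type = Type.READ;

    // True if `tid` would have to wait for a lock of type `wanted`.
    synchronized boolean wouldWait(String tid, Type wanted) {
        if (holders.isEmpty()) return false;                      // rule (a)
        boolean soleHolder = holders.size() == 1 && holders.contains(tid);
        if (soleHolder) return false;                             // rule (d): re-entry or promotion
        return wanted == Type.WRITE || type == Type.WRITE;        // rule (b) vs rule (c)
    }

    synchronized void acquire(String tid, Type wanted) {
        while (wouldWait(tid, wanted)) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for lock", e);
            }
        }
        if (holders.isEmpty()) {                 // no transaction holds the lock
            holders.add(tid);
            type = wanted;
        } else if (!holders.contains(tid)) {     // share an existing read lock
            holders.add(tid);
        } else if (wanted == Type.WRITE) {       // sole holder: promote to write
            type = Type.WRITE;
        }
    }

    synchronized boolean isHeldBy(String tid) {
        return holders.contains(tid);
    }

    synchronized void release(String tid) {
        holders.remove(tid);
        if (holders.isEmpty()) type = Type.READ; // stands in for "none"
        notifyAll();
    }
}

class SimpleLockManager {
    private final Map<Object, SimpleLock> locks = new HashMap<>();

    void setLock(Object object, String tid, SimpleLock.Type type) {
        SimpleLock lock;
        synchronized (this) {
            lock = locks.computeIfAbsent(object, k -> new SimpleLock());
        }
        lock.acquire(tid, type);                 // may block, so not inside the monitor
    }

    // Called at commit or abort: releases every lock held by the transaction.
    synchronized void unLock(String tid) {
        for (SimpleLock lock : locks.values()) {
            if (lock.isHeldBy(tid)) lock.release(tid);
        }
    }
}
```

Because every lock is released only in `unLock`, at commit or abort, a caller that uses the manager this way gets the strict variant of two-phase locking.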

Figure 13.19 Deadlock with write locks

Transaction T                                    Transaction U
Operations         Locks                         Operations         Locks
a.deposit(100);    write lock A
                                                 b.deposit(200)     write lock B
b.withdraw(100)    waits for U’s lock on B
                                                 a.withdraw(200);   waits for T’s lock on A

Figure 13.20 The wait-for graph for Figure 13.19
[Figure: T waits for B, which is held by U; U waits for A, which is held by T. In the wait-for graph, T waits for U and U waits for T.]

Figure 13.21 A cycle in a wait-for graph
[Figure: a cycle through the transactions T, U and V.]

Another wait-for graph (Figure 13.22)
- T, U and V share a read lock on C, and
- W holds a write lock on B (which V is waiting for)
- T and W then request write locks on C and deadlock occurs, e.g. V is in two cycles (see the left-hand side of the figure)
[Figure: C is held by T, U and V; B is held by W; V waits for B; T and W wait for C.]
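
Deadlock detection amounts to finding a cycle in the wait-for graph. The sketch below uses a depth-first search; the class `WaitForGraph` and its methods are hypothetical names, and transactions are identified by strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wait-for graph: an edge T -> U means "T waits for U".
class WaitForGraph {
    private final Map<String, Set<String>> waitsFor = new HashMap<>();

    void addEdge(String waiter, String holder) {
        waitsFor.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
    }

    // A deadlock exists if and only if the graph contains a cycle.
    boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String t : waitsFor.keySet()) {
            if (dfs(t, visited, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(String t, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(t)) return true;     // reached a node already on the current path
        if (!visited.add(t)) return false;       // fully explored earlier, no cycle through it
        onPath.add(t);
        for (String next : waitsFor.getOrDefault(t, Collections.emptySet())) {
            if (dfs(next, visited, onPath)) return true;
        }
        onPath.remove(t);
        return false;
    }
}
```

For the deadlock of Figure 13.19, adding the edges T -> U and U -> T makes `hasCycle()` return true. For the graph on this slide, the edges would be V -> W, T -> U, T -> V, W -> T, W -> U and W -> V.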

Figure 13.23 Resolution of the deadlock in Figure 13.19

Transaction T                                    Transaction U
Operations         Locks                         Operations         Locks
a.deposit(100);    write lock A
                                                 b.deposit(200)     write lock B
b.withdraw(100)    waits for U’s lock on B
                                                 a.withdraw(200);   waits for T’s lock on A
                   (timeout elapses)
                   T’s lock on A becomes
                   vulnerable, unlock A,
                   abort T
                                                 a.withdraw(200);   write locks A
                                                                    unlock A, B

Deadlock prevention is unrealistic
- e.g. lock all of the objects used by a transaction when it starts
  - unnecessarily restricts access to shared resources.
  - it is sometimes impossible to predict at the start of a transaction which objects will be used.
- Deadlock can also be prevented by requesting locks on objects in a predefined order
  - but this can result in premature locking and a reduction in concurrency

Timeouts on locks
- Lock timeouts can be used to resolve deadlocks
  - each lock is given a limited period in which it is invulnerable.
  - after this time, a lock becomes vulnerable.
  - provided that no other transaction is competing for the locked object, the vulnerable lock is allowed to remain.
  - but if any other transaction is waiting to access the object protected by a vulnerable lock, the lock is broken
  - the transaction whose lock has been broken is aborted
- Problems with lock timeouts
  - locks may be broken when there is no deadlock
  - if the system is overloaded, lock timeouts will happen more often and long transactions will be penalized
  - it is hard to select a suitable length for a timeout
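
The vulnerability rule can be stated in a few lines. In this sketch the class `VulnerableLock`, its fields and the use of millisecond timestamps are all assumptions made for illustration; the current time is passed in explicitly so the logic stays deterministic.

```java
// Hypothetical lock with a limited invulnerable period.
class VulnerableLock {
    private final long grantedAtMillis;      // when the lock was granted
    private final long invulnerableMillis;   // length of the invulnerable period
    private int waiters = 0;                 // transactions waiting for the object

    VulnerableLock(long grantedAtMillis, long invulnerableMillis) {
        this.grantedAtMillis = grantedAtMillis;
        this.invulnerableMillis = invulnerableMillis;
    }

    void addWaiter() {
        waiters++;
    }

    // After the invulnerable period has elapsed, the lock becomes vulnerable.
    boolean isVulnerable(long nowMillis) {
        return nowMillis - grantedAtMillis >= invulnerableMillis;
    }

    // A vulnerable lock is broken (and its holder aborted) only if some
    // other transaction is waiting; an uncontended lock is left alone.
    boolean shouldBreak(long nowMillis) {
        return isVulnerable(nowMillis) && waiters > 0;
    }
}
```

Note that `shouldBreak` can return true when there is no deadlock at all, which is the first of the problems listed above.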

Optimistic concurrency control
- the scheme is called optimistic because the likelihood of two transactions conflicting is low
- a transaction proceeds without restriction until the closeTransaction (no waiting, therefore no deadlock)
- it is then checked to see whether it has come into conflict with other transactions (validation)
- when a conflict arises, a transaction is aborted

Optimistic concurrency control
- Working phase
  - the transaction uses a tentative version of the objects it accesses (dirty reads can’t occur as we read from a committed version or a copy of it)
  - the coordinator records the read set and write set of each transaction
- Validation phase
  - at closeTransaction the coordinator validates the transaction (looks for conflicts)
  - if the validation is successful the transaction can commit.
  - if it fails, either the current transaction or one it conflicts with is aborted
- Update phase
  - if validated, the changes in its tentative versions are made permanent.
  - read-only transactions can commit immediately after passing validation.

Validation of transactions (page 498)

Tv       Ti       Rule
write    read     1. Ti must not read objects written by Tv
read     write    2. Tv must not read objects written by Ti
write    write    3. Ti must not write objects written by Tv, and Tv must not write objects written by Ti

- Validation can be simplified by omitting rule 3 (if there is no overlapping of validate and update phases)

Figure 13.28 Validation of transactions
[Figure: timeline divided into Working, Validation and Update phases. Earlier committed transactions: T1, T2, T3. Transaction being validated: Tv. Later active transactions: active1, active2.]

Backward Validation of Transactions
- startTn is the biggest transaction number assigned to some other committed transaction when Tv started its working phase
- finishTn is the biggest transaction number assigned to some other committed transaction when Tv started its validation phase
- In the figure, startTn + 1 = T2 and finishTn = T3. In backward validation, the read set of Tv must be compared with the write sets of T2 and T3.
- the only way to resolve a conflict is to abort Tv
- to carry out this algorithm, we must keep write sets of recently committed transactions

Backward validation of transaction Tv:

    boolean valid = true;
    for (int Ti = startTn + 1; Ti <= finishTn; Ti++) {
        if (read set of Tv intersects write set of Ti) valid = false;
    }
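
A runnable version of the backward validation loop might look like the following. The class name `BackwardValidator`, the representation of objects as strings, and the map from transaction number to write set are assumptions for this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical backward validator: compares the read set of Tv with the
// write sets of transactions that committed during Tv's working phase.
class BackwardValidator {
    // committedWriteSets maps a transaction number to the set of object
    // identifiers written by that committed transaction.
    static boolean validate(Set<String> readSetOfTv,
                            Map<Integer, Set<String>> committedWriteSets,
                            int startTn, int finishTn) {
        for (int ti = startTn + 1; ti <= finishTn; ti++) {
            Set<String> writeSet =
                committedWriteSets.getOrDefault(ti, Collections.emptySet());
            // Rule 2: Tv must not have read objects written by Ti.
            if (!Collections.disjoint(readSetOfTv, writeSet)) return false;
        }
        return true;   // no overlap found, Tv may commit
    }
}
```

With startTn = 1 and finishTn = 3, the loop checks the write sets of transactions 2 and 3, matching the T2 and T3 of the figure. A transaction with an empty read set always passes.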

Forward validation
- Rule 1. The write set of Tv is compared with the read sets of all overlapping active transactions
  - In Figure 13.28, the write set of Tv must be compared with the read sets of active1 and active2.
- Rule 2 (read Tv vs write Ti) is automatically fulfilled because the active transactions do not write until after Tv has completed.
- read-only transactions always pass validation
- as the other transactions are still active, we have a choice of aborting them or Tv
- if we abort Tv, it may be unnecessary, as an active one may abort anyway
- the scheme must allow for the fact that read sets of active transactions may change during validation
- Go back to the conflict rules and Figure 13.28

Forward validation of transaction Tv:

    boolean valid = true;
    for (int Tid = active1; Tid <= activeN; Tid++) {
        if (write set of Tv intersects read set of Tid) valid = false;
    }
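
Forward validation can be sketched in the same style. `ForwardValidator` is a hypothetical name, objects are identified by strings, and the sketch assumes the read sets of the active transactions are captured as a snapshot, which sidesteps the problem of read sets changing during validation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical forward validator: compares the write set of Tv with the
// read sets of the transactions that are still active.
class ForwardValidator {
    static boolean validate(Set<String> writeSetOfTv,
                            List<Set<String>> activeReadSets) {
        for (Set<String> readSet : activeReadSets) {
            // Rule 1: an active transaction must not have read an object
            // that Tv is about to write.
            if (!Collections.disjoint(writeSetOfTv, readSet)) return false;
        }
        return true;
    }
}
```

A read-only Tv has an empty write set, so the check always succeeds for it. When the check fails, the caller still has to choose between aborting Tv and aborting the conflicting active transactions.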

Comparison of methods for concurrency control
- pessimistic approach (detect conflicts as they arise)
  - timestamp ordering: serialization order decided statically
  - locking: serialization order decided dynamically
- optimistic methods
  - all transactions proceed, but may need to abort at the end
  - efficient operations when there are few conflicts, but aborts lead to repeating work
- the above methods are not always adequate, e.g.
  - in cooperative work there is a need for user notification
  - applications such as cooperative CAD need user involvement in conflict resolution

Slides for Chapter 14: Distributed transactions From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Addison-Wesley 2005

Commitment of distributed transactions: introduction
- A distributed transaction refers to a flat or nested transaction that accesses objects managed by multiple servers
- Atomicity must still be preserved
  - A process on one of the servers is the coordinator; it must ensure the same outcome at all of the servers.
  - The ‘two-phase commit protocol’ is the most commonly used protocol for achieving this

Fig. 14.1 Distributed transactions
[Figure: (a) Flat transaction: a client’s transaction T uses servers X, Y and Z. (b) Nested transactions: a client’s transaction T has subtransactions T1 and T2, which in turn have T11, T12, T21 and T22; the servers shown are X, Y, M, N and P.]

Fig. 14.2 Nested banking transaction

    T = openTransaction
        openSubTransaction
            a.withdraw(10);
        openSubTransaction
            b.withdraw(20);
        openSubTransaction
            c.deposit(10);
        openSubTransaction
            d.deposit(20);
    closeTransaction

[Figure: the client’s transaction T has subtransactions T1 to T4, which perform a.withdraw(10), b.withdraw(20), c.deposit(10) and d.deposit(20) on accounts A, B, C and D held at servers X, Y and Z.]

Fig. 14.3 A distributed banking transaction

    T = openTransaction
        a.withdraw(4);
        c.deposit(4);
        b.withdraw(3);
        d.deposit(3);
    closeTransaction

[Figure: the client issues openTransaction and closeTransaction; the servers BranchX, BranchY and BranchZ hold accounts A, B, C and D and act as participants, each of which joins the transaction. The call on account B is shown as b.withdraw(T, 3).]
- Note: the coordinator is in one of the servers, e.g. BranchX

Fig. 14.4 Operations for the two-phase commit protocol
- canCommit?(trans) -> Yes / No
  Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote.
- doCommit(trans)
  Call from coordinator to participant to tell participant to commit its part of a transaction.
- doAbort(trans)
  Call from coordinator to participant to tell participant to abort its part of a transaction.
- haveCommitted(trans, participant)
  Call from participant to coordinator to confirm that it has committed the transaction.
- getDecision(trans) -> Yes / No
  Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages.

Fig. 14.5 Two-phase commit protocol
Phase 1 (voting phase):
1. The coordinator sends a canCommit? request to each of the participants in the transaction.
2. When a participant receives a canCommit? request it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No the participant aborts immediately.

Fig. 14.5 Two-phase commit protocol (continued)
Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own).
   (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants.
   (b) Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes.
4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages it acts accordingly and, in the case of commit, makes a haveCommitted call as confirmation to the coordinator.
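
The coordinator's side of the two phases can be sketched as follows. This is a simplified, single-process illustration rather than a real implementation: `TwoPhaseCommitSketch`, the `Participant` interface and `FixedVoteParticipant` are hypothetical names, calls are local method invocations instead of messages, a participant failure during voting is modelled as a thrown exception, and the haveCommitted confirmation, getDecision and all timeouts are omitted.

```java
import java.util.ArrayList;
import java.util.List;

class TwoPhaseCommitSketch {
    // Participant-side operations, mirroring canCommit?, doCommit and doAbort.
    interface Participant {
        boolean canCommit(String trans);   // vote: true = Yes, false = No
        void doCommit(String trans);
        void doAbort(String trans);
    }

    // Runs both phases; returns true if the transaction committed.
    static boolean run(String trans, List<? extends Participant> participants) {
        // Phase 1: ask every participant for its vote.
        List<Participant> votedYes = new ArrayList<>();
        boolean allYes = true;
        for (Participant p : participants) {
            boolean vote;
            try {
                vote = p.canCommit(trans);
            } catch (RuntimeException failure) {
                vote = false;              // assumption: a failure counts as No
            }
            if (vote) votedYes.add(p); else allYes = false;
        }
        // Phase 2: commit only if there were no failures and all votes were Yes.
        if (allYes) {
            for (Participant p : participants) p.doCommit(trans);
            return true;
        }
        for (Participant p : votedYes) p.doAbort(trans);   // No voters already aborted
        return false;
    }

    // Test double that always votes the same way and records its outcome.
    static class FixedVoteParticipant implements Participant {
        private final boolean vote;
        String outcome = "active";

        FixedVoteParticipant(boolean vote) { this.vote = vote; }

        public boolean canCommit(String trans) {
            outcome = vote ? "prepared" : "aborted";   // a No voter aborts immediately
            return vote;
        }
        public void doCommit(String trans) { outcome = "committed"; }
        public void doAbort(String trans) { outcome = "aborted"; }
    }
}
```

Running the sketch with two Yes voters commits both; replacing one of them with a No voter leaves every participant aborted, which is the all-or-nothing outcome the protocol is meant to guarantee.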

Fig. 14.6 Communication in the two-phase commit protocol

Coordinator                                             Participant
step  status                   message                  step  status
1     prepared to commit       canCommit?  -->
      (waiting for votes)      <--  Yes                 2     prepared to commit
                                                              (uncertain)
3     committed                doCommit  -->
                               <--  haveCommitted       4     committed
      done

Summary of 2PC
- a distributed transaction involves several different servers.
  - A nested transaction structure allows
    - additional concurrency and
    - independent committing by the servers in a distributed transaction.
- atomicity requires that the servers participating in a distributed transaction either all commit it or all abort it.

Summary of 2PC (continued)
- atomic commit protocols are designed to achieve this effect, even if servers crash during their execution.
- the 2PC protocol allows a server to abort unilaterally.
  - it includes timeout actions to deal with delays due to servers crashing.
  - the 2PC protocol can take an unbounded amount of time to complete but is guaranteed to complete eventually.

14.5 Distributed deadlocks
- Single-server transactions can experience deadlocks
  - prevent, or detect and resolve
  - use of timeouts is clumsy; detection is preferable.
    - it uses wait-for graphs.
- Distributed transactions lead to distributed deadlocks
  - in theory, a global wait-for graph can be constructed from the local ones
  - a cycle in a global wait-for graph that is not in the local ones is a distributed deadlock

14.6 Transaction recovery
- Atomicity property of transactions
  - durability and failure atomicity
  - durability requires that objects are saved in permanent storage and will be available indefinitely
  - failure atomicity requires that effects of transactions are atomic even when the server crashes

14.6 Transaction recovery (continued)
- Recovery is concerned with
  - ensuring that a server’s objects are durable and
  - that the service provides failure atomicity.
  - for simplicity we assume that when a server is running, all of its objects are in volatile memory
  - and all of its committed objects are in a recovery file in permanent storage
  - recovery consists of restoring the server with the latest committed versions of all of its objects from its recovery file

Recovery manager
- The task of the Recovery Manager (RM) is:
  - to save objects in permanent storage (in a recovery file) for committed transactions;
  - to restore the server’s objects after a crash;
  - to reorganize the recovery file to improve performance;
  - to reclaim storage space (in the recovery file).
- media failures
  - i.e. disk failures affecting the recovery file
  - need another copy of the recovery file on an independent disk.

Fig. 14.18 Types of entry in a recovery file

Type of entry         Description of contents of entry
Object                A value of an object.
Transaction status    Transaction identifier, transaction status (prepared, committed, aborted) and other status values used for the two-phase commit protocol.
Intentions list       Transaction identifier and a sequence of intentions, each of which consists of <identifier of object>, <position in recovery file of value of object>.

Fig. 14.19 Log for banking service

Logging: reorganizing the recovery file
- The RM is responsible for reorganizing its recovery file
- Checkpointing
  - the process of writing to a new recovery file
    - the current committed values of a server’s objects,
    - transaction status entries and intentions lists of transactions not yet fully resolved,
    - including information related to 2PC (see later)
  - checkpointing makes recovery faster and saves disk space
    - done after recovery and from time to time

Recovery of the 2PC

Role: Coordinator. Status: prepared.
  Action of recovery manager: No decision had been reached before the server failed. It sends abortTransaction to all the servers in the participant list and adds the transaction status aborted to its recovery file. Same action for state aborted. If there is no participant list, the participants will eventually time out and abort the transaction.
Role: Coordinator. Status: committed.
  Action of recovery manager: A decision to commit had been reached before the server failed. It sends a doCommit to all the participants in its participant list (in case it had not done so before) and resumes the two-phase protocol at step 4 (Fig. 14.5).
Role: Participant. Status: committed.
  Action of recovery manager: The participant sends a haveCommitted message to the coordinator (in case this was not done before it failed). This will allow the coordinator to discard information about this transaction at the next checkpoint.
Role: Participant. Status: uncertain.
  Action of recovery manager: The participant failed before it knew the outcome of the transaction. It cannot determine the status of the transaction until the coordinator informs it of the decision. It will send a getDecision to the coordinator to determine the status of the transaction. When it receives the reply it will commit or abort accordingly.
Role: Participant. Status: prepared.
  Action of recovery manager: The participant has not yet voted and can abort the transaction.
Role: Coordinator. Status: done.
  Action of recovery manager: No action is required.
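
The table is effectively a lookup from (role, status) to a recovery action, which can be written down directly. The class `TwoPcRecovery`, its enums and the returned description strings are hypothetical; a real recovery manager would perform the actions rather than describe them.

```java
// Hypothetical lookup of the recovery manager's action after a crash.
class TwoPcRecovery {
    enum Role { COORDINATOR, PARTICIPANT }
    enum Status { PREPARED, ABORTED, COMMITTED, UNCERTAIN, DONE }

    static String action(Role role, Status status) {
        if (role == Role.COORDINATOR) {
            switch (status) {
                case PREPARED:
                case ABORTED:    // no commit decision was reached before the crash
                    return "send abortTransaction to participants; record aborted";
                case COMMITTED:  // decision was commit; make sure everyone hears it
                    return "send doCommit to participants; resume 2PC at step 4";
                case DONE:
                    return "no action";
                default:
                    throw new IllegalArgumentException("not a coordinator status: " + status);
            }
        }
        switch (status) {
            case COMMITTED:
                return "send haveCommitted to coordinator";
            case UNCERTAIN:      // voted Yes but never learned the outcome
                return "send getDecision to coordinator; then commit or abort";
            case PREPARED:       // has not voted yet, so it may abort unilaterally
                return "abort the transaction";
            default:
                throw new IllegalArgumentException("not a participant status: " + status);
        }
    }
}
```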

Recovery of the 2PC (summary)

Role          Status                 Action of RM
Coordinator   prepared or aborted    No decision reached before server failed. Sends abortTransaction to all participants on list and writes aborted to recovery file.
Coordinator   committed              A decision to commit had been reached before the server failed. Sends a doCommit to all participants in list and resumes 2PC at step 4.
Participant   committed              Participant sends haveCommitted message to coordinator. Allows coordinator to discard information about the transaction at next checkpoint.
Participant   uncertain              The participant failed before it knew the outcome of the transaction. Cannot determine status of transaction until coordinator informs it of the decision. Sends a getDecision to coordinator to determine status of transaction. When reply received, commits or aborts.
Participant   prepared               Participant has not voted; can abort.
Coordinator   done                   No action is required.
