Distributed Transactions
What is a transaction? A sequence of server operations that must be carried out atomically.
What are the ACID properties? Atomicity, Consistency, Isolation, Durability.
What is a distributed transaction? One that involves objects managed by multiple servers communicating with one another.
Transactions
[Figure. Labels: Permanent Record; Server operation (four times); Shared variables; Commit / Abort]
Concurrency control The goal of concurrency control is to guarantee that when multiple transactions are concurrently executed, the net effect should be equivalent to executing them in some serial order. This is the essence of the serializability property.
Example 1
T1 starts (20): W(x:=1) [OK], R(x) [OK], T1 commits
T2 starts (30): W(x:=2) [OK], T2 commits
T3 starts (40): W(x:=3) [OK], R(x), T3 commits
This is serializable. Think of other examples too.
Example 2
T1 starts (20): W(x:=1) [OK], R(x) [NO], T1 aborts
T2 starts (30): W(x:=2) [OK], R(x), T2 commits
T3 starts (40): W(x:=3) [OK], T3 commits
This is not serializable.
Question
Transaction 1: Raise the Q score of all GRE candidates from Iowa City by 10 points.
Transaction 2: Raise the Q score of all students whose id ends with 035 by 5 points.
Can we run these concurrently? Explain.
Pitfalls in concurrency control
Dirty read
Lost update
Premature write
Lost update
Initially, B = $1000.

Amy’s transaction          Bob’s transaction
1. Load B into local       4. Load B into local
2. Add $250 to local       5. Add $250 to local
3. Store local to B        6. Store local to B

What if the interleaving is ?
The final value of B is $1250, although it should have been $1500.
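The lost update can be reproduced with a minimal Python sketch that uses the step numbers 1 to 6 from the slide. The interleaving 1, 4, 2, 5, 3, 6 is an assumption chosen for illustration, since the slide's own sequence is not shown. The names `run`, `owner`, and `action` are hypothetical.

```python
def run(schedule, initial_balance=1000, deposit=250):
    """Execute the load/add/store steps in the given order; return the final B."""
    balance = {"B": initial_balance}      # the shared account
    local = {"Amy": 0, "Bob": 0}          # each transaction's private copy

    # Steps 1-3 belong to Amy and steps 4-6 to Bob, as numbered on the slide.
    owner = {1: "Amy", 2: "Amy", 3: "Amy", 4: "Bob", 5: "Bob", 6: "Bob"}
    action = {1: "load", 2: "add", 3: "store", 4: "load", 5: "add", 6: "store"}

    for step in schedule:
        who = owner[step]
        if action[step] == "load":
            local[who] = balance["B"]
        elif action[step] == "add":
            local[who] += deposit
        else:  # store
            balance["B"] = local[who]
    return balance["B"]


serial = run([1, 2, 3, 4, 5, 6])        # Amy finishes before Bob starts
interleaved = run([1, 4, 2, 5, 3, 6])   # assumed interleaving, not from the slides
print(serial, interleaved)              # prints: 1500 1250
```

In the assumed interleaving both transactions load B = $1000 before either stores, so Bob's store overwrites Amy's and one deposit is lost.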
Dirty read
Initially, B = $1000.

Amy’s transaction          Bob’s transaction
1. Load B into local       4. Load B into local
2. Add $250 to local       5. Add $250 to local
3. Store local to B        6. Store local to B
ABORT                      COMMIT

Execute the actions in the sequence
The final result is still $1500, although it should have been $1250.
Premature write
Initially, B = 0.

Amy’s transaction          Bob’s transaction
1. B := $500
                           2. B := $1000
                           3. COMMIT
4. ABORT

B changes to 0. This could have been avoided if the second transaction had postponed its commit until the first transaction committed or aborted.
Locks
Locks are commonly used to implement serializability of concurrent transactions. Two operations on a shared object conflict when at least one of them is a write; there is no conflict between two reads. Each transaction must acquire the corresponding exclusive lock before executing an action. Locks can be fine-grained.
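The conflict rule can be written down as a short Python sketch. The representation of an operation as a `(kind, object)` pair and the name `conflicts` are assumptions made for illustration.

```python
def conflicts(op_a, op_b):
    """Return True if two operations conflict.

    Each operation is a hypothetical (kind, object) pair, where kind is
    "R" for a read or "W" for a write.
    """
    kind_a, obj_a = op_a
    kind_b, obj_b = op_b
    same_object = obj_a == obj_b
    involves_write = "W" in (kind_a, kind_b)
    return same_object and involves_write


print(conflicts(("R", "x"), ("R", "x")))  # two reads: False
print(conflicts(("R", "x"), ("W", "x")))  # read and write on x: True
print(conflicts(("W", "x"), ("W", "y")))  # different objects: False
```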
Serializability
The serialization graph is a directed graph (V, E), where V is the set of transactions and E is the set of directed edges between transactions. A directed edge from a transaction Tj to a transaction Tk implies that Tk applied a lock only after Tj released the corresponding lock.
[Figure: Tj → Tk]
Serializability theorem
For a set of concurrent transactions, the serializability property holds if and only if the corresponding serialization graph is acyclic.
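By the theorem, testing serializability reduces to cycle detection on the serialization graph. The sketch below uses a depth-first search. The function name `is_serializable` and the edge-list representation are assumptions, not part of the slides.

```python
def is_serializable(transactions, edges):
    """Return True if the serialization graph (V, E) has no directed cycle.

    transactions: iterable of transaction names (V)
    edges: iterable of (Tj, Tk) pairs meaning Tk locked after Tj released (E)
    """
    successors = {t: [] for t in transactions}
    for src, dst in edges:
        successors[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    colour = {t: WHITE for t in successors}

    def has_cycle_from(node):
        colour[node] = GREY
        for nxt in successors[node]:
            if colour[nxt] == GREY:
                return True        # back edge closes a cycle
            if colour[nxt] == WHITE and has_cycle_from(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[t] == WHITE and has_cycle_from(t) for t in successors)


acyclic = is_serializable(["T1", "T2", "T3"], [("T1", "T2"), ("T2", "T3")])
cyclic = is_serializable(["T1", "T2", "T3"],
                         [("T1", "T2"), ("T2", "T3"), ("T3", "T1")])
print(acyclic, cyclic)   # prints: True False
```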
Two-phase locking (2PL)
Phase 1. Acquire all locks needed to execute the transaction. The locks are acquired one after another. This is called the growing phase or the acquisition phase.
Phase 2. Release all locks acquired so far. This is called the shrinking phase or the release phase.
Two-phase locking (2PL)
[Figure: growing phase (acquire) followed by shrinking phase (release)]
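The two-phase rule can be checked mechanically: once a transaction has released any lock, it may not acquire another. The sketch below assumes a transaction's lock activity is given as a list of `(kind, lock_name)` events. That event format and the name `follows_two_phase_locking` are hypothetical.

```python
def follows_two_phase_locking(events):
    """Return True if no lock is acquired after some lock has been released.

    events: list of ("acquire" | "release", lock_name) for one transaction.
    """
    released_any = False
    for kind, _lock in events:
        if kind == "release":
            released_any = True        # the shrinking phase has begun
        elif kind == "acquire" and released_any:
            return False               # growing again after shrinking
    return True


good = [("acquire", "x"), ("acquire", "y"), ("release", "x"), ("release", "y")]
bad = [("acquire", "x"), ("release", "x"), ("acquire", "y"), ("release", "y")]
print(follows_two_phase_locking(good), follows_two_phase_locking(bad))
# prints: True False
```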
2PL
Theorem. 2PL guarantees serializability.
Proof. Suppose that the theorem is not correct. Then the serialization graph must contain a cycle … Tj → Tk → … → Tm → Tj … This implies that Tj must have released a lock (that was later acquired by Tk) and then acquired a lock (that was released by Tm). However, this violates the condition of two-phase locking, which rules out acquiring any lock once a lock has been released.
Atomic Commit Protocols
The initiator of a transaction is called the coordinator, and the remaining servers are participants.
[Figure: network of servers S1, S2, S3. Servers may crash.]
Requirements of Atomic Commit Protocols
Termination. All non-faulty servers must eventually reach an irrevocable decision.
Agreement. If any server decides to commit, then every server must have voted to commit.
Validity. If all servers vote commit and there is no failure, then all servers must commit.
[Figure: network of servers S1, S2, S3. Servers may crash.]
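One decision rule consistent with the Agreement and Validity requirements is to commit only when every server has voted commit. The sketch below is an illustration under assumptions: the name `decide`, the vote encoding, and the use of `None` for a vote that never arrived are all hypothetical. It does not address Termination, which depends on how the protocol handles crashes and timeouts.

```python
def decide(votes):
    """Return "commit" only if every server voted commit, else "abort".

    votes: dict mapping server name -> "commit", "abort", or None.
    None is an assumed encoding for a vote that was never received,
    for example because the server crashed.
    """
    if all(vote == "commit" for vote in votes.values()):
        return "commit"
    return "abort"


print(decide({"S1": "commit", "S2": "commit", "S3": "commit"}))  # commit
print(decide({"S1": "commit", "S2": "abort", "S3": "commit"}))   # abort
print(decide({"S1": "commit", "S2": None, "S3": "commit"}))      # abort
```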
One-phase Commit
[Figure. Labels: client, server (coordinator), server (participant), Commit]
If a participant deadlocks or faces a local problem, the coordinator may never find out about it. Too simplistic.