Distributed Database Management Systems Lecture 29
In the previous lecture: Serializability Theory; Serializability Theory in DDBS.
In this lecture: Locking-based CC; Timestamp-ordering-based CC.
The TM sends operations and other information to the LM. The LM checks the status of the data item: already locked or not.
T1: Read(x); x = x+1; Write(x); Read(y); y = y-1; Write(y); Commit
T2: Read(x); x = x*2; Write(x); Read(y); y = y*2; Write(y); Commit
S = {wl1(x), R1(x), W1(x), lr1(x), wl2(x), R2(x), W2(x), lr2(x), wl2(y), R2(y), W2(y), lr2(y), C2, wl1(y), R1(y), W1(y), lr1(y), C1}
Problem: locks are released immediately after use, so S is not serializable. T1 precedes T2 on x, but T2 precedes T1 on y.
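To see why S cannot be serializable, one can build its precedence graph. The sketch below is illustrative only: it keeps just the read and write operations of S, and the helper name `precedence_edges` is an assumption, not part of the lecture.

```python
# Data operations of schedule S in order: (transaction, action, item).
S = [
    (1, "R", "x"), (1, "W", "x"),
    (2, "R", "x"), (2, "W", "x"),
    (2, "R", "y"), (2, "W", "y"),
    (1, "R", "y"), (1, "W", "y"),
]

def precedence_edges(schedule):
    """Return edges (Ti, Tk): an operation of Ti conflicts with a later one of Tk."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tk, ak, xk in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tk and xi == xk and "W" in (ai, ak):
                edges.add((ti, tk))
    return edges

edges = precedence_edges(S)
# With only two transactions, a cycle exists exactly when both directions appear.
is_serializable = not ((1, 2) in edges and (2, 1) in edges)
print(sorted(edges), is_serializable)  # [(1, 2), (2, 1)] False
```

Both edges appear, so the graph has a cycle and no equivalent serial order exists.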
Two-Phase Locking
A transaction must not acquire a lock once it has released a lock. Equivalently, it should not release any lock until it is sure it will not need another lock.
This creates a growing phase, a shrinking phase and a lock point. The lock point marks the end of the growing phase and the start of the shrinking phase.
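The two-phase rule can be checked mechanically on one transaction's lock and release sequence. This is a minimal sketch; the `("lock", item)` / `("unlock", item)` encoding and the function names are assumptions made for illustration.

```python
def obeys_2pl(ops):
    """True if no lock is acquired after the first release."""
    released = False
    for kind, _item in ops:
        if kind == "unlock":
            released = True
        elif released:
            # A "lock" request after some "unlock": the growing phase was over.
            return False
    return True

def lock_point(ops):
    """Index of the last lock acquisition, i.e. the end of the growing phase."""
    return max(i for i, (kind, _item) in enumerate(ops) if kind == "lock")

# Locks released immediately after use: not two-phase.
immediate = [("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]
# All locks acquired before any release: two-phase.
two_phase = [("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]

print(obeys_2pl(immediate), obeys_2pl(two_phase), lock_point(two_phase))
```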
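Any schedule generated by transactions that follow 2-PL is serializable.
This form of 2-PL is difficult to implement: the LM has to know that a transaction has acquired all its locks and that it is not going to need a released item again. So we use strict 2-PL, which releases all locks together when the transaction commits or aborts.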
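A minimal sketch of a lock manager under strict 2-PL follows. It assumes exclusive locks only and a single site; the class and method names are hypothetical. The point is that the LM never has to guess the lock point, because every lock is released in one step at the end of the transaction.

```python
class StrictLockManager:
    def __init__(self):
        self.owner = {}  # item -> id of the transaction holding its lock

    def lock(self, tid, item):
        """Grant the lock if the item is free or already held by tid."""
        holder = self.owner.get(item)
        if holder is None or holder == tid:
            self.owner[item] = tid
            return True
        return False  # already locked by another transaction: caller must wait

    def finish(self, tid):
        """Called at commit or abort: release every lock of tid at once."""
        released = [x for x, t in self.owner.items() if t == tid]
        for x in released:
            del self.owner[x]
        return sorted(released)

lm = StrictLockManager()
got_x = lm.lock(1, "x")       # T1 locks x
blocked = lm.lock(2, "x")     # T2 must wait
got_y = lm.lock(1, "y")       # T1 is still growing
released = lm.finish(1)       # T1 terminates and releases x and y together
retry = lm.lock(2, "x")       # T2 can now proceed
```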
Centralized 2-PL: The locking job is delegated to a single site, so only one site has a Lock Manager.
The central site may become a bottleneck when there are too many accesses. Primary Copy 2-PL is one solution.
Figure: message flow among the coordinating TM, the LM at the primary site, and the data processors at the participating sites.
Distributed 2-PL
That concludes the locking approach. Next we look at timestamp-based concurrency control.
Serializability is implemented by assigning a serialization order to transactions. Unique timestamps are issued by the TM at the initiation of a transaction.
Timestamps should be unique and monotonically increasing. Maintaining a system-wide monotonically increasing counter is difficult in a DDBS environment.
A timestamp can therefore be a two-tuple consisting of a local counter value and the site id. The system clock can also be used instead of the counter.
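A sketch of the two-tuple timestamp, assuming each site keeps its own local counter (the class name is hypothetical). Python compares tuples element by element, so placing the counter first means the site id only breaks ties.

```python
import itertools

class TimestampGenerator:
    """Issues (counter, site_id) timestamps that are unique across sites."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)  # local, monotonically increasing

    def next(self):
        return (next(self._counter), self.site_id)

site1 = TimestampGenerator(1)
site2 = TimestampGenerator(2)
a = site1.next()  # (1, 1)
b = site2.next()  # (1, 2): same counter value, different site id
c = site1.next()  # (2, 1)
print(a < b < c)  # True
```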
Timestamp Ordering Rule: Given two conflicting operations Oij and Okl of Ti and Tk, Oij is executed before Okl iff ts(Ti) < ts(Tk). Ti is then the older transaction and Tk the younger one.
The TO scheduler checks each new operation against the conflicting operations already scheduled. An operation from a younger transaction is accepted; one from an older transaction is rejected.
A TO scheduler is guaranteed to generate a serializable order. However, it needs to know all operations in advance.
In practice, operations arrive one at a time. To detect operations that arrive out of sequence, read and write timestamps are assigned to data items as well.
Each data item x is assigned rts(x) and wts(x): the largest timestamp of any transaction that has read or written x, respectively.
For a write request, if a younger transaction has already read or written the data item, the operation is rejected.
For a read request, if a younger transaction has already written the data item, the operation is rejected. TO does not generate deadlocks.
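The read and write checks of basic TO can be sketched as follows: an operation is rejected when its transaction timestamp is smaller than the relevant rts(x) or wts(x). The sketch assumes integer timestamps starting at 1 and a single site; the class name is hypothetical.

```python
class BasicTO:
    def __init__(self):
        self.rts = {}  # item -> largest timestamp that has read it
        self.wts = {}  # item -> largest timestamp that has written it

    def read(self, ts, x):
        if ts < self.wts.get(x, 0):
            return False  # x was already written by a transaction with a larger timestamp
        self.rts[x] = max(self.rts.get(x, 0), ts)
        return True

    def write(self, ts, x):
        if ts < self.rts.get(x, 0) or ts < self.wts.get(x, 0):
            return False  # x was already read or written with a larger timestamp
        self.wts[x] = ts
        return True

s = BasicTO()
r1 = s.read(2, "x")    # accepted, rts(x) becomes 2
w1 = s.write(1, "x")   # rejected: 1 < rts(x)
w2 = s.write(3, "x")   # accepted, wts(x) becomes 3
r2 = s.read(2, "x")    # rejected: 2 < wts(x)
```

A rejected operation means its transaction restarts with a new timestamp. No transaction ever waits for another, which is why no deadlock can form.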
Conservative TO: Basic TO generates too many restarts. For example, if a site is relatively inactive, its timestamps lag behind those of busier sites, so its transactions will be restarted again and again.
Synchronizing timestamps may be very costly. System clocks can be used if they run at comparable speeds.
In conservative TO, operations are not executed immediately; they are buffered. The scheduler maintains a queue for each TM.
Operations from a TM are placed in the relevant queue, ordered, and executed later. This reduces but does not eliminate restarts.
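The buffering idea can be sketched with one priority queue per TM. The lecture does not say when the scheduler drains its queues, so the explicit `flush` call here is an assumption, as are the class and method names.

```python
import heapq

class BufferingScheduler:
    def __init__(self, tm_ids):
        self.queues = {tm: [] for tm in tm_ids}  # one queue per TM

    def submit(self, tm, ts, op):
        # Buffer the operation; each queue is kept ordered by timestamp.
        heapq.heappush(self.queues[tm], (ts, op))

    def flush(self):
        """Drain all queues, returning operations in global timestamp order."""
        ordered = []
        while any(self.queues.values()):
            # Pick the TM whose queue head has the smallest timestamp.
            tm = min((q[0], tm) for tm, q in self.queues.items() if q)[1]
            ordered.append(heapq.heappop(self.queues[tm]))
        return ordered

sched = BufferingScheduler(["TM1", "TM2"])
sched.submit("TM1", 5, "W(x)")
sched.submit("TM2", 3, "R(x)")
sched.submit("TM1", 4, "R(y)")
order = sched.flush()
print(order)  # [(3, 'R(x)'), (4, 'R(y)'), (5, 'W(x)')]
```

In this sketch an operation with a smaller timestamp can still arrive after a flush, which is one way to see why buffering alone does not eliminate restarts.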
Multiversion TO: Another attempt to reduce restarts. Multiple versions of each data item are maintained, each with its own read and write timestamps.
A read operation is performed on the appropriate version. A write is rejected if a younger transaction has already read the version that the write would supersede.
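A rough sketch of multiversion TO, under these assumptions: integer timestamps starting at 1, an initial version with timestamp 0, at most one write per item per transaction, and "appropriate version" meaning the one with the largest write timestamp not exceeding the reader's timestamp. All names are hypothetical.

```python
class MultiversionTO:
    def __init__(self, initial):
        # versions[x] is a list of [wts, value, rts] entries.
        self.versions = {x: [[0, v, 0]] for x, v in initial.items()}

    def _visible(self, ts, x):
        # The version with the largest write timestamp not exceeding ts.
        return max((v for v in self.versions[x] if v[0] <= ts),
                   key=lambda v: v[0])

    def read(self, ts, x):
        v = self._visible(ts, x)
        v[2] = max(v[2], ts)  # remember the youngest reader of this version
        return v[1]           # reads are never rejected

    def write(self, ts, x, value):
        v = self._visible(ts, x)
        if v[2] > ts:
            return False  # a younger transaction already read the version being superseded
        self.versions[x].append([ts, value, ts])
        return True

db = MultiversionTO({"x": 10})
v1 = db.read(5, "x")        # 10, read from the initial version
w1 = db.write(3, "x", 99)   # rejected: the initial version was read at timestamp 5
w2 = db.write(6, "x", 20)   # accepted, creates a new version
v2 = db.read(5, "x")        # still 10: timestamp 5 sees the older version
v3 = db.read(7, "x")        # 20
```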
That covers pessimistic CC algorithms. Now we move to optimistic approaches.
Pessimistic algorithms assume that conflicts are likely to happen, so they are careful and follow the order: Validate, Read, Compute, Write.
Optimistic algorithms assume that conflicts are unlikely, so validation is delayed until just before the write: Read, Compute, Validate, Write.
Each transaction is divided into sub-transactions that execute independently. Tij = the sub-transaction of Ti that executes at site j. Sub-transactions run independently at each site until they reach the end of their read phases.
All sub-transactions are assigned timestamps at the end of their read phases. After that, the validation test is performed.
VT1: If every transaction Tk with ts(Tk) < ts(Tij) has completed its write phase before Tij started its read phase, validation succeeds.
VT2: If there is any Tk with ts(Tk) < ts(Tij) that completes its write phase while Tij is in its read phase, validation succeeds if WS(Tk) ∩ RS(Tij) = Ø.
VT3: If there is any Tk with ts(Tk) < ts(Tij) that completes its read phase before Tij completes its read phase, validation succeeds if WS(Tk) ∩ RS(Tij) = Ø and WS(Tk) ∩ WS(Tij) = Ø.
Figure: timelines of the Read, Validate and Write phases of Tk and Tij.
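The three validation tests can be sketched as one function. The representation is an assumption: phase boundaries are plain numbers on a shared clock, an unfinished write phase is `float("inf")`, and the `Txn` record and `validate` name are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    ts: int
    read_start: float
    read_end: float
    write_end: float  # float("inf") if the write phase has not finished
    rs: set = field(default_factory=set)  # read set RS
    ws: set = field(default_factory=set)  # write set WS

def validate(tij, others):
    for tk in others:
        if tk.ts >= tij.ts:
            continue  # only transactions with smaller timestamps are checked
        if tk.write_end <= tij.read_start:
            continue  # VT1: Tk finished writing before Tij started reading
        if tk.write_end <= tij.read_end:
            if tk.ws & tij.rs:  # VT2: Tk wrote during Tij's read phase
                return False
        elif tk.read_end <= tij.read_end:
            if tk.ws & tij.rs or tk.ws & tij.ws:  # VT3
                return False
        else:
            return False  # none of the three cases applies: reject conservatively
    return True

tk = Txn(1, 0, 2, 3, rs={"x"}, ws={"x"})
ok = validate(Txn(2, 1, 5, 6, rs={"y"}, ws={"x"}), [tk])   # VT2, disjoint sets
bad = validate(Txn(2, 1, 5, 6, rs={"x"}, ws=set()), [tk])  # VT2, Tij read what Tk wrote
slow = Txn(1, 0, 2, float("inf"), ws={"x"})
bad3 = validate(Txn(2, 1, 5, 6, rs={"y"}, ws={"x"}), [slow])  # VT3, write sets overlap
```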
Optimistic techniques allow more concurrency. They need more storage for the validation tests, and longer transactions may fail validation repeatedly.
This concludes our discussion of CC techniques. The next topic is deadlock management.
Locking-based CC can generate deadlocks: T1 waits for a data item held by T2, and T2 waits for one held by T1. Deadlocks are analyzed with a wait-for graph (WFG).
A WFG represents the relationship between transactions waiting for each other to release data items.
We can have local and global WFGs. Figure: WFG over T1, T2, T3 and T4, spanning Site 1 and Site 2.
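A deadlock corresponds to a cycle in the WFG. The sketch below uses hypothetical edges, since the figure's actual edges are not recoverable from the text. It shows that each local WFG can be cycle-free while the global WFG, which adds the cross-site waits, contains a cycle.

```python
def has_cycle(wfg):
    """Depth-first search for a cycle in a graph {T: set of Ts it waits for}."""
    visiting, done = set(), set()

    def visit(t):
        if t in done:
            return False
        if t in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(t)
        if any(visit(u) for u in wfg.get(t, ())):
            return True
        visiting.discard(t)
        done.add(t)
        return False

    return any(visit(t) for t in list(wfg))

def merge(*graphs):
    """Union of several wait-for graphs."""
    merged = {}
    for graph in graphs:
        for t, waits in graph.items():
            merged.setdefault(t, set()).update(waits)
    return merged

# Hypothetical edges: T1 and T2 run at Site 1, T3 and T4 at Site 2.
local_site1 = {"T1": {"T2"}}
local_site2 = {"T3": {"T4"}}
cross_site = {"T2": {"T3"}, "T4": {"T1"}}
global_wfg = merge(local_site1, local_site2, cross_site)

print(has_cycle(local_site1), has_cycle(local_site2), has_cycle(global_wfg))
```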
Deadlock management approaches: Ignore; Prevention; Avoidance; Detection and Recovery.