Distributed Systems and Concurrency: Concurrency Control in Local and Distributed Systems Majeed Kassis.

Transactions
Definition: A sequence of operations that performs a single logical function.
Examples:
• Withdrawing money from a bank account
• Making an airline reservation
• Making a credit-card purchase
Usually used in the context of databases.

Definition: Concurrency Control
Definition: Concurrency control is the activity of coordinating concurrent accesses to a database in a multi-user database management system (DBMS)

Local concurrency control
• Two-phase locking
• Timestamps
• Validation
Distributed concurrency control
• Two-phase commit
Concurrency control ensures that concurrent operations produce correct results, while getting those results as quickly as possible.

Concurrency in a DBMS
• Users submit transactions, and can think of each transaction as executing by itself.
• Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions.
• Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.
• Issues: the effect of interleaving transactions, and crashes.

Transaction Fundamental Principles
Atomicity
• A transaction must happen completely or not at all.
• A crash, power failure, error, or anything else must never leave the database in a state in which only some of the transaction was applied.
Consistency
• Data will stay consistent after handling the transaction.
• None of the constraints on the data will ever be violated.
• This does not guarantee correctness of the transaction.
• Consistent state (definition): any given database transaction must change affected data only in allowed ways.
• Inconsistent state example: the transaction was only partially applied to the database.

Transaction Fundamental Principles (continued)
Isolation
• Each transaction runs as if alone.
• If two transactions are executing concurrently, each one will see the world as if they were executing sequentially.
Durability
• Once a transaction is complete, it is guaranteed that all of the changes have been recorded to a durable medium (such as a hard disk).
• It cannot be undone once completed!

Problems to avoid
Lost updates
• Another transaction overwrites your change based on a previous value of some data.
Inconsistent retrievals
• You read data that can never occur in a consistent state:
• partial writes by other transactions
• writes by a transaction that later aborts

Lost update example
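A minimal sketch of a lost update, replaying one problematic interleaving by hand. The function name `run_interleaved`, the `store` dictionary, and the amounts are hypothetical and chosen only for illustration: two transactions read the same balance, and the second write silently discards the first.

```python
def run_interleaved(initial, deposit, withdrawal):
    """Replay one bad interleaving of two transactions on a single balance."""
    store = {"balance": initial}

    # Both transactions read the same starting value.
    t1_seen = store["balance"]
    t2_seen = store["balance"]

    # Each writes back a result computed from its own, now stale, read.
    store["balance"] = t1_seen + deposit      # T1 deposits
    store["balance"] = t2_seen - withdrawal   # T2 withdraws, overwriting T1

    return store["balance"]


final = run_interleaved(100, 50, 30)
serial = 100 + 50 - 30  # what either serial order (T1;T2 or T2;T1) yields
print(final, serial)
```

Here `final` is 70 while any serial execution gives 120: the deposit made by T1 is lost.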

Temporary update problem
One transaction updates a DB item and then the transaction fails for some reason. The updated item is accessed by another transaction before it is changed back to its original value.

The Incorrect Summary Problem
While one transaction is calculating an aggregate summary function over a number of records, other transactions are updating some of those records. The aggregate function may then read some values before they are updated and others after they are updated.

Transaction States
• Active: the initial state; the transaction stays in this state while it is executing.
• Partially committed: after the final statement has been executed.
• Failed: after the discovery that normal execution can no longer proceed.
• Aborted: after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted:
• restart the transaction (only if there was no internal logical error)
• kill the transaction
• Committed: after successful completion.

Transaction API
begin()
• Begins a transaction.
commit()
• Attempts to complete the transaction by changing the database.
• Note: the system might refuse the commit attempt and initiate the abort procedure. This means the transaction has failed.
rollback()/abort()
• Aborts the transaction.

Example
bool transfer(Account src, Account dest, long x) {
    Transaction t = begin();
    if (src.getBalance() >= x) {
        src.setBalance(src.getBalance() - x);
        dest.setBalance(dest.getBalance() + x);
        return t.commit();   // may fail if the system refuses the commit
    }
    t.abort();
    return FALSE;
}
Problems due to concurrency:
• Lost updates: your result was overwritten.
• Inconsistent retrievals: another transaction is modifying fields you are reading.

Solutions?
Poor: Global lock!
• Only let one transaction run at a time, isolated from all other transactions.
• Make changes permanent on commit, or undo changes on abort if necessary.
• Not efficient: we do wish to allow concurrent access.
Better: Lock objects as you need them
• Other transactions can execute concurrently.
• Easy to implement.

Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item. Data items can be locked in two modes:
• Exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
• Shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
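A minimal sketch of the two lock modes, assuming the usual compatibility rule (any number of S-locks may coexist on an item, while an X-lock excludes every other holder). The `LockTable` class and its methods are hypothetical names for illustration; requests are simply granted or refused, with no waiting queue.

```python
class LockTable:
    """Toy lock manager: grants or refuses S/X locks, never blocks."""

    def __init__(self):
        self.held = {}  # item -> list of (transaction, mode) pairs

    def request(self, txn, item, mode):
        """Return True if txn is granted `mode` ("S" or "X") on item."""
        holders = self.held.setdefault(item, [])
        other_modes = [m for t, m in holders if t != txn]
        if mode == "S":
            granted = all(m == "S" for m in other_modes)  # S is compatible with S
        else:
            granted = not other_modes                     # X tolerates no one else
        if granted:
            holders.append((txn, mode))
        return granted

    def release(self, txn, item):
        """Drop every lock txn holds on item."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]


table = LockTable()
print(table.request("T1", "A", "S"))  # first reader
print(table.request("T2", "A", "S"))  # second reader shares the item
print(table.request("T3", "A", "X"))  # writer is refused while readers hold A
```

A real lock manager would queue the refused request instead of returning immediately; that is omitted here to keep the compatibility check visible.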

Locks alone are insufficient
bool transfer(Account src, Account dest, long x) {
    lock(src);
    if (src.getBalance() >= x) {
        src.setBalance(src.getBalance() - x);
        unlock(src);   // src is released before dest is updated
        lock(dest);
        dest.setBalance(dest.getBalance() + x);
        unlock(dest);
        return TRUE;
    }
    unlock(src);
    return FALSE;
}
After line 5 (the first unlock(src)) is done, src is open for reads before we manage to write into dest. This might cause some other queries to return a wrong result, for example: what is the difference in value between the src account and the dest account?

Two-Phase Locking
Preserves serializability: transactions can be executed concurrently, but can logically be ordered as if they were executed one after the other.
Two phases in a transaction's life:
• Growing phase: acquire locks
• Shrinking phase: release locks
Rule: you cannot acquire any more locks after your first release. Locks are implicitly released upon commit/abort.
Issues:
• Deadlocks might happen!
• Resource ordering is important.

Example using two-phase locking
bool transfer(Account src, Account dest, long x) {
    Transaction t = begin();
    t.lock(src);
    if (src.getBalance() >= x) {
        src.setBalance(src.getBalance() - x);
        t.lock(dest);
        dest.setBalance(dest.getBalance() + x);
        return t.commit();   // src and dest are unlocked
    }
    t.abort();               // unlocks src
    return FALSE;
}

2PL might suffer deadlocks
t1.lock(foo);        t2.lock(bar);
t1.lock(bar);        t2.lock(foo);
• t1 might get the lock for foo, then t2 gets the lock for bar, then both transactions wait while trying to get the other lock.

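Problem with regular 2PL: Cascading Rollbacks
T1: begin; X_lock(X); W(X); rest of T1; unlock(X); unlock(Y); abort
T2: begin; X_lock(Z); S_lock(X); R(X); rest of T2; unlock(Z); unlock(X); unlock(Y); commit
T3: begin; S_lock(Z); R(Z); rest of T3; unlock(X); unlock(Y); unlock(Z); commit
If T2 reads X after T1 has released its lock and T1 then aborts, T2 has read a value that was never committed and must be rolled back as well. The same applies to T3 if it read Z after T2 released it.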

Strict vs Regular Two-Phase Locking

Preventing deadlock
• Each transaction can get all its locks at once.
• Each transaction can get all its locks in a predefined order.
Both of these strategies are impractical: transactions often do not know which locks they will need in the future.
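When the full lock set is known up front, as in a two-account transfer, acquiring locks in a predefined order can be sketched as follows. This is an illustration under assumptions, not the deck's implementation: the `Account` class, its `account_id` field, and the choice of ordering by id are all hypothetical.

```python
import threading


class Account:
    def __init__(self, account_id, balance):
        self.account_id = account_id      # assumed unique; used as the lock order
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dest, amount):
    """Move amount from src to dest, always locking the lower id first."""
    if src is dest:
        raise ValueError("src and dest must be different accounts")
    first, second = sorted((src, dest), key=lambda a: a.account_id)
    with first.lock, second.lock:         # same global order for every caller
        if src.balance < amount:
            return False
        src.balance -= amount
        dest.balance += amount
        return True
```

Because every caller locks the lower id first, two opposite transfers between the same pair of accounts can never each hold one lock while waiting for the other.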

Detecting deadlock
Construct a "waits-for" graph:
• Each vertex in the graph represents a transaction.
• There is an edge T1 → T2 if T1 is waiting for a lock that T2 holds.
• There is a deadlock iff the waits-for graph contains a cycle.
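A minimal sketch of the cycle check on a waits-for graph, using a depth-first search. The function name `has_deadlock` and the dictionary representation (transaction name mapped to the transactions it waits for) are assumptions made for illustration.

```python
def has_deadlock(waits_for):
    """Return True iff the waits-for graph contains a cycle.

    waits_for maps each transaction to the transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = {}

    def visit(txn):
        color[txn] = GREY
        for other in waits_for.get(txn, ()):
            state = color.get(other, WHITE)
            if state == GREY:              # back edge: we returned to our own path
                return True
            if state == WHITE and visit(other):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(waits_for))


print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))   # T1 and T2 wait on each other
print(has_deadlock({"T1": ["T2"], "T2": ["T3"]}))   # a chain, no cycle
```

Once a cycle is found, a system would typically pick one transaction on it as a victim and abort it; victim selection is not shown.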

"Ignoring" deadlock
Automatically abort all long-running transactions.
• Not a bad strategy, if you expect transactions to be short.
• A long-running "short" transaction is probably deadlocked.

Timestamp Concurrency Control Algorithms Approach
Timestamp Concurrency Control Algorithms
These algorithms use a transaction's timestamp to coordinate concurrent access to data.
A timestamp is a unique identifier given by the DBMS to a transaction that represents the relative start time of the transaction:
• Uniqueness: no two timestamp values are equal.
• Monotonicity: timestamp values always increase.
For each data item Q, the protocol maintains two values:
• W-timestamp(Q): the largest timestamp of any transaction that successfully wrote Q.
• R-timestamp(Q): the largest timestamp of any transaction that successfully read Q.
These solutions are lock-free!

Timestamp-based Transaction Execution Rules
• Access Rule: When two transactions try to access the same data item simultaneously, for conflicting operations, priority is given to the older transaction. This causes the younger transaction to wait for the older transaction to commit first.
• Late Transaction Rule: If a younger transaction has written a data item, then an older transaction is not allowed to read or write that data item. This rule prevents the older transaction from committing after the younger transaction has already committed.
• Younger Transaction Rule: A younger transaction can read or write a data item that has already been written by an older transaction.
Deadlocks?

When Xact T wants to Read Object O
• If TS(T) < WTS(O), this violates the timestamp order of T w.r.t. the writer of O. So, abort T and restart it with a new, larger TS. (If restarted with the same TS, T will fail again!)
• If TS(T) > WTS(O): allow T to read O, and reset RTS(O) to max(RTS(O), TS(T)).

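When Xact T wants to Write Object O
• If TS(T) < RTS(O), this violates the timestamp order of T w.r.t. the reader of O; abort and restart T.
• If TS(T) < WTS(O), this violates the timestamp order of T w.r.t. the writer of O. Under basic timestamp ordering T is aborted and restarted; the Thomas write rule instead ignores the outdated write.
• Else, allow T to write O and update WTS(O) to TS(T).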
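The read and write checks can be sketched as below. This is a minimal illustration, assuming that an outdated write (TS(T) < WTS(O)) aborts the transaction as in basic timestamp ordering; the names `Item`, `Abort`, `read`, and `write` are hypothetical, and restarting an aborted transaction with a new timestamp is left to the caller.

```python
class Abort(Exception):
    """Raised when an access would violate timestamp order."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.rts = 0   # largest timestamp that has read this item
        self.wts = 0   # largest timestamp that has written this item


def read(ts, item):
    if ts < item.wts:                      # a younger transaction already wrote it
        raise Abort("read too late")
    item.rts = max(item.rts, ts)
    return item.value


def write(ts, item, value):
    if ts < item.rts:                      # a younger transaction already read it
        raise Abort("write conflicts with a later read")
    if ts < item.wts:                      # assumption: abort (no Thomas write rule)
        raise Abort("write conflicts with a later write")
    item.value = value
    item.wts = ts


q = Item(10)
print(read(5, q))      # allowed; RTS(q) becomes 5
write(7, q, 20)        # allowed; WTS(q) becomes 7
try:
    read(6, q)         # TS 6 < WTS 7
except Abort as exc:
    print("aborted:", exc)
```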

Optimistic Algorithms for Concurrency Control
• Based on the assumption that conflicts between database operations are rare.
• Better to let transactions run to completion.
• Check for conflicts before they commit, not while executing.
• Does not require locking.
A transaction is executed without restrictions until it is committed, following these phases:
• Read phase
• Validation (or certification) phase
• Write phase

Read Phase: Optimistic Algorithms
Updates are prepared using private (or local) copies (or versions) of the data members. The transaction:
• reads values of committed data from the database,
• executes the needed computations,
• makes the updates to a private copy of the database values.
It is conventional to allocate a timestamp to each transaction at the end of its Read phase, to determine the set of transactions that must be examined by the validation procedure.

Validation Phase: Optimistic Algorithms
The transaction is validated to assure that the changes made will not affect the integrity and consistency of the database.
• On validation success, the transaction goes to the write phase.
• On validation failure, the transaction is restarted.
The list of data members is checked for conflicts. If conflicts are detected in this phase, the transaction is aborted and restarted.
The validation algorithm must check that the transaction has seen all modifications of transactions committed after it starts.

Write Phase: Optimistic Algorithms
The changes are permanently applied to the database. The updated data members are made public.

Test 1
For all i and j such that Ti < Tj, check that Ti completes before Tj begins.
[Timeline figure: phases R, V, W of Ti and Tj]

Test 2
For all i and j such that Ti < Tj, check that:
• Ti completes before Tj begins its Write phase, and
• WriteSet(Ti) ∩ ReadSet(Tj) is empty.
[Timeline figure: phases R, V, W of Ti and Tj]
Does Tj read dirty data? Does Ti overwrite Tj's writes?

Test 3
For all i and j such that Ti < Tj, check that:
• Ti completes its Read phase before Tj does,
• WriteSet(Ti) ∩ ReadSet(Tj) is empty, and
• WriteSet(Ti) ∩ WriteSet(Tj) is empty.
[Timeline figure: phases R, V, W of Ti and Tj]
Does Tj read dirty data? Does Ti overwrite Tj's writes?
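The three tests can be sketched as set and time comparisons. This is an illustration under assumptions: the slides list the tests separately, and the sketch assumes the usual formulation in which Tj validates against an earlier Ti if at least one of the three tests holds. The `Txn` record, its field names, and the integer clock values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start: int         # time the Read phase began
    read_end: int      # time the Read phase ended
    write_start: int   # time the Write phase began
    finish: int        # time the Write phase ended
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def validates(ti, tj):
    """Return True if tj passes at least one test against the earlier ti."""
    no_dirty_reads = not (ti.write_set & tj.read_set)
    no_overwrites = not (ti.write_set & tj.write_set)

    test1 = ti.finish < tj.start
    test2 = ti.finish < tj.write_start and no_dirty_reads
    test3 = ti.read_end < tj.read_end and no_dirty_reads and no_overwrites
    return test1 or test2 or test3


ti = Txn(0, 2, 3, 4, read_set={"a"}, write_set={"a"})
tj = Txn(1, 3, 5, 6, read_set={"b"}, write_set={"a"})
print(validates(ti, tj))   # Test 2 holds: Ti finished before Tj's Write phase
```

If `validates` returns False for some earlier Ti, Tj would be aborted and restarted, as described for the validation phase.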

Wait-die scheme
When transaction Ti requests a data item currently held by Tj with a conflicting lock, one of two things happens:
(1) If TS(Ti) < TS(Tj), that is, Ti is older than Tj, then Ti is allowed to wait until the data item is available.
(2) If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti is rolled back (dies). Ti is restarted later with a random delay but with the same timestamp.

Wound-wait scheme
It is a counterpart to the wait-die scheme. When transaction Ti requests a data item currently held by Tj with a conflicting lock, one of two things happens:
(1) If TS(Ti) < TS(Tj), that is, Ti is older than Tj, then Ti forces Tj to be rolled back (Ti wounds Tj). Tj is restarted later with a random delay but with the same timestamp.
(2) If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, then Ti is forced to wait until the resource is available.
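The two schemes differ only in what happens on each side of the timestamp comparison, which a small sketch makes explicit. The function names and the returned strings are hypothetical; a smaller timestamp means an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Older requester waits; younger requester is rolled back."""
    return "requester waits" if ts_requester < ts_holder else "requester dies"


def wound_wait(ts_requester, ts_holder):
    """Older requester wounds the holder; younger requester waits."""
    return "holder is wounded" if ts_requester < ts_holder else "requester waits"


# An old transaction (TS 1) meets a young one (TS 2), in both directions.
for requester, holder in [(1, 2), (2, 1)]:
    print(requester, holder, wait_die(requester, holder), wound_wait(requester, holder))
```

In both schemes it is always the younger transaction that is rolled back, so an older transaction is never restarted because of a younger one.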

Final Notes: Optimistic Algorithms
Advantages:
• Very efficient when conflicts are rare.
• The rollback involves only the local copy of the data.
Disadvantages:
• Conflicts are expensive to deal with, since the conflicting transaction must be rolled back.
• Longer transactions are more likely to have conflicts and may be repeatedly rolled back because of conflicts with short transactions.
Applications:
• Only suitable for environments where there are few conflicts.
• Suitable for systems in which the majority of transactions are short.
• Acceptable for mostly read or query database systems that require very few update transactions.

Distributed transactions
Distributed Database Management System
• A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.
• A distributed database management system is the software system that permits the management of the distributed database and makes the distribution transparent to the users.
• Data may be replicated and found at different sites!

Distributed transactions
Data is stored at distributed locations.
Failure model:
• messages might be delayed or lost
• servers might crash, but can recover from saved persistent storage

The coordinator
• Begins the transaction
• Assigns a unique transaction ID
• Responsible for commit/abort
• Many systems allow any client to be the coordinator for its own transactions
The participants
• The servers with the data used in the distributed transaction

Problems with simple commit
"One-phase commit"
• Coordinator broadcasts "commit!" to participants until all reply.
• What happens if one participant fails?
• Can the other participants then undo what they have already committed?

Two-phase commit (2PC)
The commit step itself has two phases:
• Phase 1, Voting: each participant prepares to commit by writing log records, and votes on whether or not it can commit.
• Phase 2, Committing: each participant actually commits or aborts.

2PC operations
• canCommit?(T) -> yes/no: Coordinator asks a participant if it can commit.
• doCommit(T): Coordinator tells a participant to actually commit.
• doAbort(T): Coordinator tells a participant to abort.
• haveCommitted(participant, T): Participant tells the coordinator it actually committed.
• getDecision(T) -> yes/no: Participant can ask the coordinator if T should be committed or aborted.

The voting phase
Coordinator asks each participant: canCommit?(T)
• Participants must prepare to commit using permanent storage before answering yes.
• Objects are still locked.
• Once a participant votes "yes", it is not allowed to cause an abort.
• The outcome of T is uncertain until doCommit or doAbort: other participants might still cause an abort.

The commit phase
The coordinator collects all votes:
• If unanimous "yes", causes commit.
• If any participant voted "no", causes abort.
The fate of the transaction is decided atomically at the coordinator, once all participants vote:
• Coordinator records the fate using permanent storage.
• Then broadcasts doCommit or doAbort to participants.
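The voting and commit phases can be sketched in a single process as below. This is a minimal illustration under assumptions, not a networked implementation: the `Participant` class, the `two_phase_commit` function, and the Python lists standing in for permanent storage are all hypothetical, and message loss, timeouts, and crash recovery are not modelled.

```python
class Participant:
    def __init__(self, name, vote_yes=True):
        self.name = name
        self.vote_yes = vote_yes
        self.log = []            # stands in for permanent storage
        self.state = "active"

    def can_commit(self, txn):
        """Phase 1: prepare (log first), then vote."""
        if self.vote_yes:
            self.log.append(("prepared", txn))
            self.state = "prepared"
        return self.vote_yes

    def do_commit(self, txn):
        self.log.append(("committed", txn))
        self.state = "committed"

    def do_abort(self, txn):
        self.log.append(("aborted", txn))
        self.state = "aborted"


def two_phase_commit(txn, participants, coordinator_log):
    # Phase 1: collect a vote from every participant.
    votes = [p.can_commit(txn) for p in participants]
    decision = "commit" if all(votes) else "abort"

    # The decision is recorded before anyone is told about it.
    coordinator_log.append((decision, txn))

    # Phase 2: broadcast the outcome.
    for p in participants:
        if decision == "commit":
            p.do_commit(txn)
        else:
            p.do_abort(txn)
    return decision


log = []
print(two_phase_commit("T1", [Participant("a"), Participant("b")], log))
print(two_phase_commit("T2", [Participant("a"), Participant("b", vote_yes=False)], log))
```

Recording the decision in `coordinator_log` before the broadcast mirrors the rule that the coordinator records the fate in permanent storage first, so that the outcome can be answered later through something like getDecision(T).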

2PC sequence of events

2PC continued
Problem: if the coordinator fails, then everyone is stuck.
• For instance, if everyone voted for commit but did not receive an answer, it is unknown whether the coordinator committed or aborted before failing.
Problem is solved in 3PC:
• Send your vote to everyone.
• Wait for everyone's vote, or until someone is suspected to have failed.
• If at least one node voted "no" or at least one node is suspected to have failed, invoke Consensus(Abort).
• Else, start Consensus(Commit).
