1 Advanced Database Topics Copyright © Ellis Cohen 2002-2005 Concurrency Control for Distributed Databases These slides are licensed under a Creative Commons.

Presentation transcript:

1 Advanced Database Topics Copyright © Ellis Cohen Concurrency Control for Distributed Databases These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. For more information on how you may use them, please see

2 Topics
Distributed Lock-Based Concurrency Control
Distributed Abort Protocol
Distributed Atomic Commit Protocols
Distributed Optimistic Concurrency Control

3 Distributed Lock-Based Concurrency Control

4 Sub-query Distribution
Suppose a coordinator wants to execute the query that lists the project managed by the highest-paid employee:
    SELECT * FROM Projs WHERE pmgr = (SELECT empno FROM Emps WHERE sal = (SELECT max(sal) FROM Emps))
If subordinate S1 holds the Projs table and subordinate S2 holds the Emps table, the coordinator will request S2 to execute the sub-query
    SELECT empno FROM Emps WHERE sal = (SELECT max(sal) FROM Emps)
It will get the result back (let's call it result), and then request S1 to execute (and return the results of) the sub-query
    SELECT * FROM Projs WHERE pmgr = result

5 Sub-transactions
Imagine a coordinator C has started a transaction TC, and is executing a query as part of TC.
–The coordinator divides the query up into sub-queries, which it sends to various subordinates.
–It labels each sub-query with TC, the identity of the main transaction.
When a subordinate S is passed a sub-query
–If it has not yet seen the label TC, it creates a local transaction TS (called a sub-transaction), and associates TS with TC.
–If it has seen TC before, it looks up the corresponding TS.
In either case, S runs the sub-query as part of the local sub-transaction TS.
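
The label lookup a subordinate performs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Subordinate`, the label strings such as "TC-17", and the generated sub-transaction ids are all hypothetical and do not come from the slides.

```python
import itertools

class Subordinate:
    """Hypothetical subordinate site: maps main-transaction labels to local sub-transactions."""

    def __init__(self, name):
        self.name = name
        self._ids = itertools.count(1)
        self.subtx_for = {}  # main transaction label (TC) -> local sub-transaction id (TS)
        self.work = {}       # TS -> list of sub-queries run as part of it

    def run_subquery(self, tc_label, subquery):
        ts = self.subtx_for.get(tc_label)
        if ts is None:
            # First time this site sees the label: create a local sub-transaction.
            ts = f"{self.name}-T{next(self._ids)}"
            self.subtx_for[tc_label] = ts
            self.work[ts] = []
        # In either case the sub-query runs as part of TS.
        self.work[ts].append(subquery)
        return ts

s2 = Subordinate("S2")
first = s2.run_subquery("TC-17", "SELECT max(sal) FROM Emps")
second = s2.run_subquery("TC-17", "SELECT empno FROM Emps WHERE sal = 5000")
other = s2.run_subquery("TC-18", "SELECT count(*) FROM Emps")
```

Both sub-queries labelled "TC-17" end up in the same local sub-transaction, while "TC-18" gets its own.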

6 Centralized Locking
Each query & commit is funneled through a central Lock Manager site, which maintains all locks.
Evaluation: Only supports table-level granularity (but predicate locks could achieve the effect of row-level granularity)
Cost Issue: Requires extra communication for each query
Reliability Issue: Single point of failure; a crash of the Lock Manager requires abort of all transactions + election of a new Lock Manager
Scalability Issue: Lock Manager is a bottleneck
Note: Depending upon the pattern of communication, both reliability & scalability can be addressed via a hierarchy of lock managers

7 Distributed Deadlock Prevention
Each subordinate
–Locks its own DB objects
–Can make WAIT/WOUND/DIE decisions locally (requires transaction properties, e.g. timestamp or priority, passed with each sub-query)
WOUND or DIE
–Aborts the local sub-transaction
–Notifies the coordinator, which aborts the main transaction (if not already aborted) & informs the other subordinates (and, if hierarchical, notifies its parent coordinator)
Consider two transactions, T1 and T2, managed by different coordinators C1 and C2, that both try to lock the same resource. If T1's clock is set a year in the past, will it ever be wounded or die?
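
A local WAIT/WOUND/DIE decision can be sketched as below, assuming the usual textbook wait-die and wound-wait rules and unique transaction timestamps where a smaller value means an older transaction. The function names are hypothetical, and the slides do not say which of the two schemes a given system uses.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits; a younger requester dies (aborts)."""
    return "WAIT" if requester_ts < holder_ts else "DIE"

def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester wounds (aborts) the holder; a younger requester waits."""
    return "WOUND" if requester_ts < holder_ts else "WAIT"

# A transaction whose timestamp lies far in the past is older than every competitor.
very_old = 0
others = [100, 250, 999]
as_requester = [wait_die(very_old, ts) for ts in others]
as_holder = [wound_wait(ts, very_old) for ts in others]
```

Under these rules the transaction with the very old timestamp always waits as a requester and is never wounded as a holder, which is worth keeping in mind when reading the question on the slide.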

8 Local & Global WFGs
Consider
T1 locks A at site S1, requests B at site S2
T2 locks B at site S2, requests A at site S1
Local WFGs (Wait-For Graphs)
S1 knows: T2 → T1
S2 knows: T1 → T2
Need to build the Global WFG to discover the cycle T1 → T2 → T1
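
The two-site example can be checked with a small sketch that merges local wait-for graphs and searches for a cycle. The representation (a dict from a waiting transaction to the set of transactions it waits for) and the function names are assumptions made for illustration.

```python
def merge_wfgs(*local_graphs):
    """Union the edges of several local wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged

def find_cycle(graph):
    """Return a list of transactions forming a cycle, or None if there is none."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in sorted(graph):
        cycle = visit(start)
        if cycle:
            return cycle
    return None

s1_knows = {"T2": {"T1"}}  # at S1, T2 waits for T1 (which holds A)
s2_knows = {"T1": {"T2"}}  # at S2, T1 waits for T2 (which holds B)
global_wfg = merge_wfgs(s1_knows, s2_knows)
```

Neither local graph contains a cycle on its own; only the merged graph reveals the deadlock.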

9 DDBMS Deadlock Detection
Timeout-based Deadlock Detection (Oracle)
–Subordinate detects local deadlocks via local WFG
–Use timeouts to detect global deadlocks
Centralized Deadlock Detection
–Each subordinate regularly sends its local WFG to a central site, which informs the coordinator of a deadlock
–Can also do this hierarchically
Phantom Deadlock Problem
–Suppose the central site detects a deadlock between T1 and T2, and chooses to tell T1's coordinator to abort
–In the meantime, T2 is aborted for some other reason (e.g. T2's coordinator crashes)
How could phantom deadlocks be avoided?

10 Distributed Deadlock Detection
Path Pushing Algorithm
–When the coordinator makes a subquery for transaction T, pass along the sites at which T has already acquired locks
–If the subquery causes a wait, and deadlock can't be detected locally, send (own & propagated) knowledge about the path to the sites at which T has acquired locks, as well as other [higher numbered] waiting sites you know about

11 Distributed Abort Protocol

12 Distributed System Failures
Site failures
–Site crashes or is unable to respond to messages
Link failures
–Messages may be undeliverable, lost, or garbled, so an understandable response is not received
–Link failures can cause network partition; some sites become unreachable from other sites
Failure detection
–Usually via timeouts (the time it takes for a remote site to respond to a message exceeds a threshold)
–If failure is suspected (a message timed out), a ping message can be sent to the site; if a ping response is received, the timeout period can be extended (but not indefinitely)

13 Distributed Algorithms
Because of failures, distributed algorithms are complicated. In designing distributed algorithms, we need to work out
–The messages that need to go back and forth between nodes, and how a node responds to each message, to accomplish the algorithm
–How to handle timeouts: what to do when a node expects a message, but doesn’t receive it in a reasonable time
–How to handle recovery: what a node does on recovering, if it crashed while it was participating in the distributed algorithm

14 Aborting Distributed Transactions
To explore distributed algorithms, we'll consider distributed abort: how a coordinator gets all the subordinates to abort a transaction.
[Diagram: the Coordinator sends ABORT to each Subordinate; each Subordinate replies with ABORT-ACK]
First, what could make a coordinator start an ABORT?

15 Causes of Distributed Abort
Subordinate
–Raises error in executing a sub-query
–Crashes (or appears to)
Coordinator
–Raises error in executing local sub-query
–Crashes (or appears to)
–Told to ROLLBACK (by application)
–Told to ABORT (e.g. deadlock detection)

16 Standard Abort Protocol
COORDINATOR (Abort) (when it decides / is told to abort)
–Force Abort to log (with list of subordinates)
–Send ABORT to each Subordinate
–Abort main transaction
SUBORDINATE (Abort) (when it receives an ABORT message)
–Force Abort to log (unless already aborted)
–Send ABORT-ACK to coordinator
–Abort own subtransaction (unless already aborted)
COORDINATOR (AbortComplete) (when it receives all ABORT-ACKs back)
–Write AllAbortsDone to log
Suppose it doesn't receive all ACKs back? Is ABORT-ACK even necessary?

17 Timeouts
SUBORDINATE (Waiting)
Subordinates at any time can send an INQUIRE message to the coordinator. If the response is
–ACTIVE → wait some more
–ABORT → Do standard Abort action
–none → decide whether to abort or to wait some more
COORDINATOR (waiting for ABORT-ACK)
Regularly keep sending ABORT & wait for ABORT-ACK
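
The coordinator side of the abort protocol, including the resend-on-timeout rule, might look roughly like the sketch below. The `deliver` callback stands in for the network and is purely hypothetical, as is the `max_rounds` cap (the slides simply say to keep sending); the log is modelled as a plain list.

```python
def run_abort(subordinates, deliver, max_rounds=5):
    """Coordinator side: log Abort, then resend ABORT until every ABORT-ACK arrives."""
    log = [("Abort", list(subordinates))]  # forced to the log before any message is sent
    pending = set(subordinates)
    for _ in range(max_rounds):
        # deliver() returns the reply, or None when the message or reply was lost.
        pending = {s for s in pending if deliver(s, "ABORT") != "ABORT-ACK"}
        if not pending:
            log.append(("AllAbortsDone",))
            break
    return log, pending

attempts = {}

def flaky_deliver(site, message):
    """Hypothetical transport that loses the first ABORT sent to S2."""
    attempts[site] = attempts.get(site, 0) + 1
    if site == "S2" and attempts[site] == 1:
        return None
    return "ABORT-ACK"

abort_log, still_pending = run_abort(["S1", "S2"], flaky_deliver)
```

Here S1 acknowledges at once, S2 acknowledges on the second attempt, and only then is AllAbortsDone written.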

18 Recovery
COORDINATOR
(on discovering Abort T in log, without corresponding AllAbortsDone)
–Send ABORTs to all subordinates (in Abort entry)
(on discovering Start T in log, without corresponding Commit or Abort)
–Subordinates are unknown: Answer INQUIREs.
SUBORDINATE
(on discovering Abort T in log)
–Send ABORT-ACK to coordinator
(on discovering Start T in log, but no corresponding Commit or Abort)
–Send ABORT to coordinator (directs coordinator to abort transaction)
–Force Abort to log
–Abort own subtransaction
Are ABORT-ACK & AllAbortsDone necessary?

19 ABORT-ACK & AllAbortsDone
The ABORT-ACK message and the AllAbortsDone log entry are not completely necessary. That's because subordinates can abort on their own (for any reason, but especially) if they don't hear from the coordinator. ACKs and completion log entries are much more crucial when we talk about commit.

20 Distributed Atomic Commit Protocols (ACP)

21 Atomic Commit Protocols
Distributed Atomic Commit Protocols ensure atomicity & durability in distributed environments
–A transaction which executes at multiple sites must either be committed at all sites or aborted at all sites
–Not acceptable to have a transaction committed at one site and aborted at another
2 Phase Commit (2PC): Industry Standard Protocol
3 Phase Commit (3PC): Extension of 2PC which reduces blocking when coordinator failures occur during the protocol

22 2PC Motivation
Suppose a transaction coordinator, with subordinates S1 and S2, is ready to commit (in particular, all subqueries have finished successfully).
The coordinator sends COMMIT messages for the transaction to S1 and S2.
S1 commits its local subtransaction.
S2 crashes just before receiving the COMMIT message (and before writing any local subtransaction state to stable storage) -- i.e. S2 aborts.
Problem: Need a way to ensure that once the coordinator has decided to commit & has started to send COMMIT messages, a subordinate crash does not cause that subtransaction to abort

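23 Simplified 2 Phase Commit
[Diagram: messages exchanged between Coordinator and Subordinate]
1a. Coordinator → Subordinate: PREPARE
1b. Subordinate → Coordinator: YES
2a. Coordinator → Subordinate: COMMIT
2b. Subordinate → Coordinator: COMMIT-ACK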

24 2PC Approach
PREPARE Phase:
–Coordinator sends PREPARE message to each subordinate
–Each subordinate prepares to commit by ensuring that the sub-transaction can be made locally durable (e.g. by forcing out log entries, including the Prepare log entry)
–Once the subordinate has prepared, it can commit even after it crashes, and it is not allowed to abort unless it knows the coordinator aborted the transaction
COMMIT Phase:
–Coordinator sends COMMIT only after all subordinates are prepared.
–The transaction is unalterably committed when the Commit entry is forced to the coordinator's log (because if it crashes, it can complete the commit on recovery)

25 Prepare Phase
COORDINATOR (Prepare) (when it decides / is told to commit)
–Force out log (with Prepare entry containing list of subordinates)
–Send PREPARE to each Subordinate (with list of subordinates)
SUBORDINATE (Prepare) (when it receives a PREPARE message)
Decides whether it can commit (NO only if it is already aborting or it uses optimistic concurrency and local validation fails)
NO →
–Force Abort to log (unless already aborted)
–Send NO to coordinator
–Abort own subtransaction (unless already aborted)
YES →
–Force out log with Prepare entry
–Send YES to coordinator

26 Period of Uncertainty
Once a subordinate answers YES to PREPARE
–The subordinate cannot unilaterally decide whether to commit or abort
–The subtransaction enters a period of uncertainty, not knowing whether the main transaction will ultimately commit or abort
–The subordinate must wait until the coordinator tells it which to do

27 Coordinator Commit Phase
The coordinator waits for all subordinates to respond.
If any subordinate responds NO, or does not respond within the timeout period (possibly after sending PREPARE again), the coordinator
–Forces Abort to the log
–Sends ABORT to each subordinate that did not respond with a NO
–Aborts the main transaction
If all subordinates respond YES within the timeout period, the coordinator
–Forces Commit to the log. This is the moment at which the transaction is durably committed
–Sends COMMIT to each subordinate
–Commits the main transaction

28 Subordinate Commit Phase
SUBORDINATE (receiving ABORT)
–Force Abort to log
–Abort own subtransaction
SUBORDINATE (receiving COMMIT)
–Force Commit to log
–Send COMMIT-ACK back to Coordinator
COORDINATOR (receiving all COMMIT-ACKs)
–Writes CommitComplete to log
–If it times out waiting for a COMMIT-ACK from a subordinate, it will keep sending COMMITs
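
The prepare and commit phases can be tied together in a small in-process simulation. This is a sketch under simplifying assumptions, not a full implementation: messages are plain method calls, logs are Python lists, a vote of None models a subordinate that never answers PREPARE, and the names `Sub` and `two_phase_commit` are hypothetical.

```python
class Sub:
    """Hypothetical subordinate with a fixed vote: "YES", "NO", or None (no response)."""

    def __init__(self, name, vote="YES"):
        self.name, self.vote = name, vote
        self.log, self.state = [], "ACTIVE"

    def on_prepare(self):
        if self.vote is None:          # crashed or unreachable: the coordinator times out
            return None
        if self.vote == "YES":
            self.log.append("Prepare")  # forced to the log before answering YES
            self.state = "PREPARED"
        else:
            self.log.append("Abort")
            self.state = "ABORTED"
        return self.vote

    def on_commit(self):
        self.log.append("Commit")
        self.state = "COMMITTED"
        return "COMMIT-ACK"

    def on_abort(self):
        if self.state != "ABORTED":
            self.log.append("Abort")
            self.state = "ABORTED"

def two_phase_commit(subs):
    log = ["Prepare"]
    votes = {s.name: s.on_prepare() for s in subs}
    if all(v == "YES" for v in votes.values()):
        log.append("Commit")           # the commit point
        acks = [s.on_commit() for s in subs]
        if all(a == "COMMIT-ACK" for a in acks):
            log.append("CommitComplete")
        return "COMMIT", log
    log.append("Abort")
    for s in subs:
        if votes[s.name] != "NO":      # NO voters have already aborted themselves
            s.on_abort()
    return "ABORT", log

ok_subs = [Sub("S1"), Sub("S2")]
ok_result = two_phase_commit(ok_subs)
bad_subs = [Sub("S1"), Sub("S2", vote="NO")]
bad_result = two_phase_commit(bad_subs)
```

In the second run S1 has already logged Prepare when the coordinator decides to abort, so its log ends with Prepare followed by Abort, while S2 logs only Abort.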

29 Subordinate Timeouts
SUBORDINATE (waiting for Prepare/Abort)
Send an INQUIRE message to the coordinator. If the response is
–ACTIVE → wait some more
–ABORT → Do standard Abort action
–PREPARING → Do standard Prepare action
–none → decide whether to abort or to wait some more
SUBORDINATE (after Prepare)
Send an INQUIRE message to the coordinator. If the response is
–PREPARING → continue to wait
–ABORT → Do standard Abort action
–COMMIT → Do standard Commit action
–none → Cannot make a unilateral decision! Must either wait or find out the transaction disposition in some other way (e.g. by using a Termination Protocol)

30 Recovery
COORDINATOR
(on discovering Commit T in log, without corresponding CommitComplete)
–Send COMMIT to all subordinates.
(on discovering Prepare T in log, without corresponding Commit or Abort)
–Send ABORTs to all subordinates
SUBORDINATE
(on discovering Commit T in log)
–Send COMMIT-ACK to coordinator
(on discovering Prepare T in log)
–Send YES to coordinator
(on discovering Start T in log, but no corresponding Commit or Abort)
–Send ABORT to coordinator (directs coordinator to abort transaction)
–Force Abort to log
–Abort own subtransaction

31 Termination Protocol Motivation
A subordinate can get stuck in a period of uncertainty if
–The subordinate has already prepared
–Either (a) the coordinator crashed or (b) the coordinator & subordinate became disconnected before the coordinator could send ABORT or COMMIT to the subordinate.
However,
–Maybe the coordinator did get an ABORT or COMMIT message off to another subordinate.
–The subordinate might be able to proceed if it could check with the other subordinates!

32 Termination Protocol
Along with the PREPARE message, each subordinate gets a list of the other subordinates.
If the coordinator does not respond to INQUIRE, the subordinate sends INQUIRE to (some or all of) the other subordinates.
Other subordinates respond
–COMMIT - if it received COMMIT from coordinator
–ABORT - if it aborted -- e.g. it received ABORT from coordinator, or it responded NO to PREPARE, or didn't receive PREPARE, and chooses to abort
–UNCERTAIN - otherwise
Subordinate commits or aborts if COMMIT or ABORT is received from any other subordinate, else it remains uncertain (occasionally keep trying INQUIREs to coordinator & other subordinates)
Blocking problem: If all responses are UNCERTAIN or time out, a subordinate may have to wait for coordinator recovery or network repair
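
The decision an uncertain subordinate takes from its peers' answers reduces to a short function. The name `resolve_from_peers` is hypothetical, and a reply of None is used here to model an INQUIRE that timed out.

```python
def resolve_from_peers(peer_responses):
    """Return "COMMIT" or "ABORT" if any peer knows the outcome, else "UNCERTAIN"."""
    for response in peer_responses:
        if response in ("COMMIT", "ABORT"):
            # Atomic commit guarantees peers never disagree, so the first answer is enough.
            return response
    return "UNCERTAIN"  # blocked: retry INQUIREs later

blocked = resolve_from_peers(["UNCERTAIN", None, "UNCERTAIN"])
unblocked = resolve_from_peers(["UNCERTAIN", "COMMIT", None])
```

The first call illustrates the blocking problem; the second shows how one informed peer is enough to end the period of uncertainty.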

33 3PC Motivation
If a subordinate is uncertain, and every subordinate it can communicate with is uncertain, they ALL MUST WAIT. With 3PC, if the group of communicating subordinates is a [weighted] majority of the participants, they can always proceed!

34 3PC
Extends 2PC to 3 phases: PREPARE, PRECOMMIT, COMMIT
A subordinate is uncertain after sending YES and before getting back PRECOMMIT
A [weighted] minority partition of subordinates must wait for network repair.
A [weighted] majority partition of the subordinates
–Aborts if all are uncertain
–Else if at least one has received PRECOMMIT, uses an election protocol to elect a new coordinator if necessary (e.g. the one with the highest IP address), who then continues with the protocol
A coordinator (original or elected) sends COMMIT when it gets PRECOMMIT-ACKs from a [weighted] majority of the subordinates
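
The partition rule can be sketched as a function over the sites that can still talk to each other. It assumes a strict weighted majority (more than half of the total weight) and only the two states named on the slide; the function name, the state strings, and the return values "WAIT", "ABORT", and "CONTINUE" are hypothetical labels.

```python
def partition_action(states, weights, total_weight):
    """Decide what a group of mutually reachable subordinates may do.

    states:  site name -> "UNCERTAIN" or "PRECOMMITTED", for reachable sites only
    weights: site name -> voting weight
    """
    reachable_weight = sum(weights[site] for site in states)
    if 2 * reachable_weight <= total_weight:
        return "WAIT"        # minority partition: wait for network repair
    if all(state == "UNCERTAIN" for state in states.values()):
        return "ABORT"       # majority, and nobody has seen PRECOMMIT
    return "CONTINUE"        # elect a coordinator if needed and carry on

weights = {"S1": 1, "S2": 1, "S3": 1, "S4": 1, "S5": 1}
minority = partition_action({"S1": "UNCERTAIN", "S2": "PRECOMMITTED"}, weights, 5)
all_uncertain = partition_action(
    {"S1": "UNCERTAIN", "S2": "UNCERTAIN", "S3": "UNCERTAIN"}, weights, 5)
one_precommitted = partition_action(
    {"S1": "UNCERTAIN", "S2": "PRECOMMITTED", "S3": "UNCERTAIN"}, weights, 5)
```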

35 Distributed Optimistic Concurrency Control

36 Optimistic Concurrency Control
Assumes (optimistically) that a transaction will not have conflicts with other transactions, avoiding the overhead of locks.
Cache-Based: Reads all possible data from and writes all data to its client cache.
Validation-Based: When the transaction commits, writes all changes back to the DB server, but only after validating that the data it used during the transaction is still up-to-date.

37 Distributed Validation
Consider a distributed DB which uses server-managed client caches.
[Diagram: transaction S uses TblA at site A and TblB at site B; A keeps A's cache for S, and B keeps B's cache for S]
When S commits, A & B will both receive PREPARE messages. They will each locally do validation for their respective subtransactions, and only respond YES if validation succeeds.
Note: With a client-side cache, S would need, as part of PREPARE, to pass back to A & B the timestamps of the data items read from A and B respectively. How can this be supported if cross-DB query processing (e.g. joins) is done at other nodes, and only the final results are passed back to S?

38 Distributed Ordering Problem
What if S and T want to commit at the same time, A receives S's PREPARE message first, and B receives T's PREPARE message first → result can be non-serializable
[Diagram: transactions S and T both use TblA at site A and TblB at site B; each site keeps a separate cache for S and for T]

39 Non-Serializable Result
S
1) UPDATE AT SET a2 = a1 + 100
2) UPDATE BT SET b2 = b1
3) COMMIT
T
1) UPDATE BT SET b2 = b1 + 100
2) UPDATE AT SET a2 = a1
3) COMMIT
Assume a1=1 a2=2 b1=3 b2=4
There are two possible serial schedules
S T → a2=1 b2=103
T S → a2=101 b2=3
But suppose S & T execute in parallel, and send PREPAREs to A and B in parallel
If A gets PREPAREs & validates T after S, no R/W conflicts and both validations succeed → a2=1
If B gets PREPAREs & validates S after T, no R/W conflicts and both validations succeed → b2=3
When using Distributed Optimistic Concurrency Control subordinates cannot independently order commits!
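
The arithmetic on this slide can be replayed in a few lines. The sketch assumes that S sets a2 = a1 + 100 and T sets b2 = b1 + 100, which is the reading consistent with the serial results a2=101 and b2=103 given on the slide; the dictionaries standing in for the tables at A and B are a simplification.

```python
initial = {"a1": 1, "a2": 2, "b1": 3, "b2": 4}

def writes_of_S(db):
    return {"a2": db["a1"] + 100, "b2": db["b1"]}

def writes_of_T(db):
    return {"b2": db["b1"] + 100, "a2": db["a1"]}

def serial(first, second):
    """Run two transactions one after the other and return the final (a2, b2)."""
    db = dict(initial)
    db.update(first(db))
    db.update(second(db))
    return db["a2"], db["b2"]

serial_outcomes = {serial(writes_of_S, writes_of_T), serial(writes_of_T, writes_of_S)}

# Parallel execution: both transactions compute their writes from the initial state.
s_writes, t_writes = writes_of_S(initial), writes_of_T(initial)
site_a, site_b = dict(initial), dict(initial)
for w in (s_writes, t_writes):   # site A validates and applies S, then T
    site_a["a2"] = w["a2"]
for w in (t_writes, s_writes):   # site B validates and applies T, then S
    site_b["b2"] = w["b2"]
parallel_outcome = (site_a["a2"], site_b["b2"])
```

The parallel outcome pairs a2=1 with b2=3, a combination that neither serial schedule produces.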

40 Timestamped Cache Checking
Suppose all sites have access to the same global clock, and when S and T want to commit, they pass the current global time as part of their PREPARE messages (the PrepareTime)
[Diagram: S and T both use TblA at site A, which keeps A's cache for S and A's cache for T]
Suppose T sends PREPARE to A after S does, and suppose A receives them in the same order. When A receives T's PREPARE, its PrepareTime is larger than every PrepareTime it already received, including S's.
A can do Timestamped Cache Checking: For every local data item A read that is in A's cache for T, check whether A's version is the latest one (compare its read timestamp in the cache to the local DB's timestamp for it)

41 Out of Order Prepares
[Diagram: S and T both use TblA at site A, which keeps A's cache for S and A's cache for T]
Suppose T sends PREPARE to A after S does, but A receives them in the opposite order. A receives PREPARE for T first, validates it, and responds YES, and then receives PREPARE for S, with an earlier PrepareTime.
Problem: If S wrote something that T read, T read the wrong version of it; T should have read the version that S wrote. Too late to fail validation for T, but we can fail validation for S.
Problem: If T wrote (and committed) something that S read, S read the wrong version of it. S should have read the data before T persisted it! Also fail validation for S in this case. T already committed, but S should have committed first.
These checks must be done in addition to Timestamped Cache Checking
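
The two extra checks for a late-arriving PREPARE can be sketched as follows. This covers only the out-of-order checks; Timestamped Cache Checking is deliberately left out, so a YES here is not a complete validation. The `Prepare` record, the `Site` class, and the use of integer PrepareTimes are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Prepare:
    tx: str
    prepare_time: int
    reads: frozenset
    writes: frozenset

class Site:
    """Hypothetical site that remembers the PREPAREs it has already validated."""

    def __init__(self):
        self.validated = []

    def validate(self, p):
        for q in self.validated:
            if q.prepare_time > p.prepare_time:   # p arrived out of order relative to q
                if p.writes & q.reads:            # q should have read p's version
                    return "NO"
                if q.writes & p.reads:            # p read data that q then overwrote
                    return "NO"
        self.validated.append(p)
        return "YES"

site_a = Site()
t = Prepare("T", 20, reads=frozenset({"x"}), writes=frozenset({"y"}))
s_conflicting = Prepare("S", 10, reads=frozenset(), writes=frozenset({"x"}))
s_harmless = Prepare("S2", 10, reads=frozenset({"z"}), writes=frozenset({"w"}))
answers = [site_a.validate(t), site_a.validate(s_conflicting), site_a.validate(s_harmless)]
```

T is validated first; the late PREPARE that wrote something T read is refused, while a late PREPARE touching unrelated items still passes these particular checks.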

42 Loosely Synchronized Clocks
In fact, distributed systems generally do not all have access to a global time. Instead, they use a Distributed Time Service, which sends time messages between sites, and ensures that all clocks stay reasonably close to one another.
Increasing clock skew
–Will, at worst, cause the algorithm described to fail more validations unnecessarily (since more PREPAREs will appear to be received out of order), but
–Will not cause validation to incorrectly succeed.
Are out of order PREPAREs a problem for Timestamp-Based or Read-Consistent concurrency control?

43 Timestamp-Based Concurrency
Ordering does not affect the Timestamp-Based Concurrency Control Algorithm
Ordering already taken into account
–Data items are marked with read times as well as their write times.
–Timestamp-based checks effectively already do the appropriate validation based on order.
Increasing clock skew
–Simply causes more timestamp-based checks to fail