CS6223: Distributed Systems

Slides:



Advertisements
Similar presentations
CM20145 Concurrency Control
Advertisements

Concurrency Control III. General Overview Relational model - SQL Formal & commercial query languages Functional Dependencies Normalization Physical Design.
Lecture plan Transaction processing Concurrency control
Optimistic Methods for Concurrency Control By : H.T. Kung & John T. Robinson Presenters: Munawer Saeed.
1 Concurrency Control III Dead Lock Time Stamp Ordering Validation Scheme.
Concurrency Control WXES 2103 Database. Content Concurrency Problems Concurrency Control Concurrency Control Approaches.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Transaction Management Overview Chapter 16.
1 Concurrency Control Chapter Conflict Serializable Schedules  Two actions are in conflict if  they operate on the same DB item,  they belong.
1 Lecture 11: Transactions: Concurrency. 2 Overview Transactions Concurrency Control Locking Transactions in SQL.
CS 542: Topics in Distributed Systems Transactions and Concurrency Control.
Transaction Management: Concurrency Control CS634 Class 17, Apr 7, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of concurrency control Performance tuning.
Concurrency Control II
Concurrency Control Amol Deshpande CMSC424. Approach, Assumptions etc.. Approach  Guarantee conflict-serializability by allowing certain types of concurrency.
Topic 6.3: Transactions and Concurrency Control Hari Uday.
CSC271 Database Systems Lecture # 32.
Lock-Based Concurrency Control
Lecture 11 Recoverability. 2 Serializability identifies schedules that maintain database consistency, assuming no transaction fails. Could also examine.
1 Chapter 3. Synchronization. STEMPusan National University STEM-PNU 2 Synchronization in Distributed Systems Synchronization in a single machine Same.
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture X: Transactions.
Database Systems, 8 th Edition Concurrency Control with Time Stamping Methods Assigns global unique time stamp to each transaction Produces explicit.
1 TRANSACTION & CONCURRENCY CONTROL Huỳnh Văn Quốc Phương Thái Thị Thu Thủy
Transactions and concurrency control
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
Concurrency Control and Recovery In real life: users access the database concurrently, and systems crash. Concurrent access to the database also improves.
Distributed Systems 2006 Styles of Client/Server Computing.
ACID A transaction is atomic -- all or none property. If it executes partly, an invalid state is likely to result. A transaction, may change the DB from.
Distributed Systems Fall 2010 Transactions and concurrency control.
CS 582 / CMPE 481 Distributed Systems Concurrency Control.
Transaction Management and Concurrency Control
Transaction Management and Concurrency Control
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
Transaction Management
1 Transaction Management Database recovery Concurrency control.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
Transaction. A transaction is an event which occurs on the database. Generally a transaction reads a value from the database or writes a value to the.
Transactions and concurrency control
Transaction Management Chapter 9. What is a Transaction? A logical unit of work on a database A logical unit of work on a database An entire program An.
Distributed Deadlocks and Transaction Recovery.
08_Transactions_LECTURE2 DBMSs should guarantee ACID properties (Atomicity, Consistency, Isolation, Durability). This is typically done by guaranteeing.
BIS Database Systems School of Management, Business Information Systems, Assumption University A.Thanop Somprasong Chapter # 10 Transaction Management.
Transaction Communications Yi Sun. Outline Transaction ACID Property Distributed transaction Two phase commit protocol Nested transaction.
Chapter 11 Concurrency Control. Lock-Based Protocols  A lock is a mechanism to control concurrent access to a data item  Data items can be locked in.
Transactions CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
Concurrency Server accesses data on behalf of client – series of operations is a transaction – transactions are atomic Several clients may invoke transactions.
TRANSACTION MANAGEMENT R.SARAVANAKUAMR. S.NAVEEN..
Computer Science Lecture 13, page 1 CS677: Distributed OS Last Class: Canonical Problems Election algorithms –Bully algorithm –Ring algorithm Distributed.
Distributed Processing Systems ( Concurrency Control ) 오 상 규 서강대학교 정보통신 대학원
Transactions and Concurrency Control. Concurrent Accesses to an Object Multiple threads Atomic operations Thread communication Fairness.
Transaction Management Transparencies. ©Pearson Education 2009 Chapter 14 - Objectives Function and importance of transactions. Properties of transactions.
9 1 Chapter 9_B Concurrency Control Database Systems: Design, Implementation, and Management, Rob and Coronel.
NOEA/IT - FEN: Databases/Transactions1 Transactions ACID Concurrency Control.
Multidatabase Transaction Management COP5711. Multidatabase Transaction Management Outline Review - Transaction Processing Multidatabase Transaction Management.
10 1 Chapter 10_B Concurrency Control Database Systems: Design, Implementation, and Management, Rob and Coronel.
Chapter 13 Managing Transactions and Concurrency Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition.
1 Concurrency Control. 2 Why Have Concurrent Processes? v Better transaction throughput, response time v Done via better utilization of resources: –While.
Concurrency Control Techniques
Transaction Management and Concurrency Control
Concurrency Control.
Transaction Management Transparencies
Transaction Properties
Chapter 10 Transaction Management and Concurrency Control
Section 18.8 : Concurrency Control – Timestamps -CS 257, Rahul Dalal - SJSUID: Edited by: Sri Alluri (313)
Concurrency Control WXES 2103 Database.
Replication and Recovery in Distributed Systems
Chapter 15 : Concurrency Control
Introduction of Week 13 Return assignment 11-1 and 3-1-5
Transaction management
Transaction Management
UNIVERSITAS GUNADARMA
Presentation transcript:

CS6223: Distributed Systems Transaction Processing Systems

File operations and Transaction operations client File server ……. read(fid, write(fid, …… stateless RPC files database server client tid=Topen; Tread(tid, fid, Twrite(tid,fid, ….. Tclose(tid); Tread(tid, fid1, Twrite(tid,fid1, Twrite(tid, fid2, keep all states of a trans RPC Draw Transaction Fig on board (for later use): distributed environment, 1) clients are concurrent; 2) each operation is an RPC Transaction is a sequence of operations that requires atomicity. Bank transaction (multiple operations)

Definitions of Atomic Transactions Atomic Transaction: a sequence of data access operations that are atomic in the sense of: All-or-Nothing Serializability (indivisibility and invisibility) The ACID properties of a transaction in RM-ODP: Atomicity Consistency Isolation Durability A transaction guarantees “the execution of a sequence of operations has the same effect as one single atomic operation”

What makes Database inconsistent? Concurrent accesses of transactions Failures of servers Two well-known problems caused by concurrency: Lost update problem Balance := B.Read() $200 B.Write( balance + 3) $203 Banlance := A.Read() $100 A.Write (balance-4) $96 Balance := B.Read() $200 B.Write ( balance +4 ) $204 Transaction U: deposit (B, 3); Transaction T: transfer(A, B, 4); Using the Fig on board! Basic operation provided by file system / dbms is read / write

Inconsistent Retrieval Problem Banlance := A.Read() $200 A.Write (balance-100) $100 Balance := B.Read() $200 B.Write ( balance +100) $300 Transaction U: TotalBalance() Transaction T: transfer(A, B, 100); Balance := A.Read() $100 Balance:= balance+B.Read() $300 Balance:= balance+C.Read() $300+ . This shows the order that trans operations executed at the server. See the white board how two transactions interleaving with each other. Note: each operation is atomic

Problem Caused by Server Failures Transaction: transfer(A, B, 100) bal = A.Read(); A.Write(bal - 100);  server crash point bal = B.Read(); B.Write(bal + 100);

Transactional File Operations Transactions provide a programming environment: Concurrency transparency Failure transparency Transactional file service operations: TID = OpenTrans () (Commit, Abort) = CloseTrans (TID) AbortTrans (TID) TWrite(TID, FID, i, Data) TRead(TID, FID, i, buf) FID = TCreate(TID, filename, type) TDelete(TID, FID) ...... Relating them to file operations…(they’re file operations. In FS, each file operation is independent; but now they’re all dependent).

An Example of Using Transactions Transaction: transfer (A, B, 100) tid = OpenTrans(); Tread(tid, A, bal) Twrite(tid, accountA, bal-100) Tread(tid, accountB, bal) Twrite(tid, accountB, bal+100) CloseTrans(tid); Application won’t go wrong. Explain concurrency / failure could make transaction go wrong ->

How to achieve atomicity of transactions? Failure Recovery (guarantee nothing-or-all) Intention list approach Shadow version approach Concurrency Control (guarantee serializability) 2-Phase Locking Timestamp Ordering Optimistic Method

Failure Recovery of Transactions An execution of a transaction has Two Phases: Tentative phase, the execution transaction body Commit phase, making tentative values permanent Making transactions failure recoverable: Keep the tentative values of data on disk that can survive failures Restore data items at the restart from a failure The commit phase repeatable (itempotent) See Fig. on board. During tentative phase, any failures would cause “abort” of the transaction, which doesn’t affect permanent database

Intention List Approach (failure recovery) Example of intention list: tid = OpenTrans; TWrite(tid, fd1, len1, data1); …… TWrite(tid, fd2, len2, data2); Close(tid) Intention List: tid, status, {"Twrite, fd, pos1, len1, data1", "Twrite, fd, pos2, len2, data2"} …..when failure occurs during the execution of this transaction, … how to do recovery (point is nothing-all for database) ->

Implementation of Intention List Transaction operations by using intention list: Twrite writes data to the intention list. Tread reads data from the intention list if it is present. CloseTrans performs the write operations in the intention list onto database files. Note: this operation takes much shorter time than the execution time of a transaction. Recovery manager: a program (part of the server) which is called when the server restarts from a failure. Recovery file: a file used by the recovery manager to restore the database to a consistency state. Each entry of the file, for one transaction, contains the information: Tid transaction status intention list At recovery time, it reads in the recovery file, simply abort all transactions that are in tentative phase, and commit transactions whose status are “committed”.

Example of recovery Transactions T and U: T: transfer (A, B, $20) U: transfer (C, B, $22) Recovery: remove tentative data created by U (p5 & p6) and commit the data updated by T (p1 & p2) to database. P1&p2 created by T, p5&p6 by U. Recovery: remove U and commit T. recovery should be idempotent (failure could occur during recovery)!

Shadow Version Approach (failure recovery) When a transaction modifies a file or data item, it creates a shadow (tentative) version of the file. The subsequent Twrite/Tread are performed on the shadow version. At closeTrans, it detects version conflict with other concurrent transactions (done by concurrency control). If no conflict, the shadow version is merged with other concurrent versions already committed; otherwise it is aborted. Vc Vc current version. When Vy commit (suppose Vx already committed), two options: if it’s mergeable with Vx (i.e., they read/write different parts of a file), it can commit; otherwise it aborts.

Implementation of Shadow Version Approach A copy of the file index-block is created at the first Twrite of a transaction. Each Twrite operation creates shadow pages. The index entries of modified pages are made pointing to the shadow pages. The original file index-block is replaced by the new one at the commit of the transaction (this operation is idempotent).

Serializability (Serially Equivalent) Two transactions may conflict with each other when they access the same data item(s). Only write operations may cause inconsistency. Def. Serializability: T  U (T completes before U starts): read operations in U must read the data written by T write operations in U must overwrite the data done by T T cannot see any effect of U See fig on board.

Absolute Sequential Execution T  U : T U read(a) write(a) read(b) write(b) read(c) U  T: read( c) Use Fig on board again. It’s easy for the server to enforce this execution order, it simply only allows on trans to open. Concurrency control is talking about how the server schedule trans operations! But, this sequential execution has poor performance: NO concurrency (the server is idle most of the time) -> transaction interleaving

Serial Equivalence T U: T U read(c) read(a) write(a) read(b) write(b) ?goal: let transactions as concurrent as possible, so long as they keep database consistent! Two trans can be concurrent, but still preserve serializability

Non-serial Equivalence T U read(a) write(a) read(b) read( c) write(b) T  U write(b) U  T Interleaving causes problem -> generalize the problem ->

Non-serial Equivalence of Lost Update and Inconsistent Retrievals Balance := B.Read() $200 B.Write( balance + 3) $203 Banlance := A.Read() $100 A.Write (balance-4) $96 B.Write ( balance +4 ) $204 Transaction T: transfer(A, B, 4) Transaction U: deposit(B, 3) TU UT Transaction T: transfer(A, B, 100); Balance := A.Read() $100 Balance:= balance+B.Read() $300 Balance:= balance+C.Read() $300+ . Banlance := A.Read() $200 A.Write (balance-100) $100 Balance := B.Read() $200 B.Write ( balance +100) $300 Transaction U: TotalBalance() TU UT

Serializability of Transactions Conflicting operations: Two operations access the same data item and one of them is write. Conflicting transactions: Two transactions that have conflicting operations in them. Serializability of transactionsT1 and T2: T1  T2 iff two conflicting operations op1i  T1 and op2j  T2: op1i  op2j. Serializability of Transactions: The execution of a set of transactions T1 T2, ..., Tm, is equivalent to the execution of them in a serial order, i.e., Ti1  Ti2  , ...,  Tim Note: Serializability is the minimum requirement for database consistency in general. Now, we only check if there’s cycle in execution order, i.e., T -> U and U -> T What’s the method that can achieve “serializability” ->

2-Phase Locking First phase: obtaining locks. Second phase: releasing locks. When a transaction begins to release a lock, it cannot apply for locks any more. Why two phases of locking? Server operations for T Server operations for U write(a); read(b); write(b); This inter-leaving of T and U is not ok. First explain “lock” mechanism

Simple locking won’t work (2-Phase Locking) First phase: obtaining locks. Second phase: releasing locks. When a transaction begins to release a lock, it cannot apply for locks any more. Think about by using simple locking: Server operations for T Server operations for U lock(a);write(a); unlock(a) lock(b); read(b); unlock(b) lock(b);write(b); unlock(b) Simple “lock” mechanism -> not serializable!

2-Phase Locking Use two phases of locking: First phase: obtaining locks. Second phase: releasing locks. When a transaction begins to release a lock, it cannot apply for locks any more. Use two phases of locking: Server operations for T Server operations for U lock(a); write(a); lock(b); read(b); lock(b) – wait!!! read(b); ……. write(b); write(b); unlock(a, b) Note: U cannot obtain lock on b until T completes. Note: there’s only one type of lock, so T no need to lock b again for write (see p25 read/write lock) In most implementations, it simply lets a transaction hold all the locks until commit. This’s “concurrency control” at the server!

Serializability of 2-Phase Locking Lock compatibility: Serializability: All transactions are serialized in the order of the time they obtain locks on data items. OK OK OK Wait Wait Wait Lock already set None Read Write For one data item Lock requested Read Write Distinguish read locks from write locks, see p24 example to see how read / write locks work!

Implementation of Locking Lock manager: a module of a server program. It maintains a table of locks for the data items of the server. Each entry in the table of locks has: transaction ID data-item ID lock type a condition variable (a queue for clients waiting unlock) Lock manager provides two operations: lock(trans, dataItem, lockType) unlock(trans), signal the condition variable The lock and unlock operations must be atomic

Deadlock in 2-Phase Locking Deadlock may occur in 2-Phase locking Deadlock prevention and handling: Lock all data items a transaction accesses at the start of the transaction. Set timeout for waiting a lock.

Discussions on 2-Phase Locking Advantages of locking: No abort and restart. Simple for understanding and implementation. Disadvantages: Pessimistic. Poor efficiency (cost of locking and clients waiting for locks). Deadlock. Pause, let students understand serializability of 2-phase locking

Timestamp Ordering Serializability of Timestamp Ordering: transactions are serialized in the order of their start time (openTrans). Important Timestamps: Each transaction is assigned an unique timestamp T (also used as TID) at the open. Each data item has a write-timestamp (wt) and a read-timestamp (rt). The basic idea of concurrency control is to serialize T and U into T –> U (i.e., U read or over-write results of T). The easy way is to serialize them according to a time order. 3 timestamps: tid, rt, wt -> give basic idea of the method…

Timestamp Ordering (Cont’d) Read/Write Operations: Compare timestamp T with the rt and wt of the data to decide if the operation can proceed. Read Rule: transaction T can read a data item only if the data item was last written by an earlier transaction. if a read of T is accepted, record T as a tentative rt of the data (if T > rt). Write Rule: transaction T can write the data only if the data was last read and written by earlier transactions. when a write of T is accepted, a tentative version of data is the created with timestamp T; Commitment: T’s tentative version of data and tentative rt become permanent; if T aborts, all tentative data and rts created by T are removed. data T rt wt Tid determines the ordering of transaction executions.

Rules for Write W1: T  rt and T > wt, a tentative value is created. W2: T < rt (including tentative rt) or T < wt, abort. T1 < T2 < T3 < T4 After T2 T3 T1 Time T4 before T3 Abort c) T3 Write d) T3 Write a) T3 Write b) 4 cases of write: a) no tentative version, b) with earlier tentative versions, c) with later tentative versions, d) arrives too late Compare the cases with locking method. A trans can be ABORTED by the server!!!

Rules for Read R1: T  wt if there is a tentative value made by itself, read this tentative value; otherwise: if there are tentative values whose timestamps are earlier than T, wait for the tentative values committed; otherwise: read data immediately. R2: T < wt, abort. T4 T2 T1 Time a) T3 Read d) T3 Read c) T3 Read Read Proceeds Wait T3 Abort Selected Again 4 cases:… Still has to wait…, same in locking method

Commitment of Transactions The commit of a tentative value (including a tentative rt of a data object) has to wait for the commit of tentative values of earlier transactions. A transaction waits for the earlier transactions only, which avoids deadlocks. Note: all transactions are serialized in the order of their open time (openTrans time). Note: at commit, need to wait for an early transaction to abort / commit if it left a tentative read-timestamp on an item. This allows a later rt always over-write an earlier rt!

Multiversion Timestamp Ordering The server keeps old committed versions (a history of versions) of the data to avoid unnecessary “abort” (particularly for read operations). A read that arrives too late need not be rejected. It returns the data whose version has the largest wt that is less than the transaction. A read still need to wait for a tentative version to commit/abort. There is no conflict between write operations, bcs each transaction writes on its own version of the data. But a write will be rejected if the data was read by a later transaction. A commit does not need to wait for earlier transactions (bcs multi-versions can co-exist) if it does not read data from earlier versions. T0 T0(T1) T1 T3 read: wait wt rt Example: Initially, both wt and rt of data are T0; After T1 read, it creates tentative rt in T0, T3 read will wait! After T1 commit, T3 create tentative rt in T1 and its own tentative version. ? If a T2-read arrives, which version to read? If T2-write comes?? T0 T1 T1(T3) T3 T1 commit; T3 R/W T0 T1 T3 T3 commit tentative rt

Multiversion Timestamp Ordering: late read / write In multiversion timestamp ordering, a read operation can always proceed (provided the old version is still kept) But a write operation arrived too late can still be rejected. Example: after T3 read & write…. (cont’d).. T3 read T3 write wt Will it cause “conflict of concurrent versions”? No, concurrent tentative versions from the same committed image still wait for sequential commit. There is no concurrent tentative versions created from two committed image!!! bcs a write arriving too late will be aborted bcs the data was read by a later transaction… Note: T3 creates a tentative rt in version T1. T3 write creates a tentative version. T1 T1(3) T3 rt T1 < T2 < T3 < T4 < T5

Multiversion Timestamp Ordering: late read/write (cont’d) After T5 reads the data and committed, if T4-write arrives, it’ll be aborted; otherwise you have conflict of T4 & T5. Rule: a write operation arrived too late will still be rejected. T3 read T3 commit T4 write: abort! T5 read & commit T4-write abort, bcs it invalidate T5’s read! If it goes ahead to make a version T4, T5 should read T4 instead of T3! T1 T3 T3 T5 T1 < T2 < T3 < T4 < T5

Discussions on Timestamp Ordering Advantages: No deadlocks. Disadvantage: Extra storage for timestamps. Possible abort and restart. A long transaction may block others. Timestamp ordering method is pessmistic (over-strong): T U t1=openTrans t2 = openTrans read(t2, b) write(t2 ,b) closeTrans(t2) read(t1, b)  T will abort (unnecessarily) T did nothing after opening…

Optimistic Method (Optimistic Timestamp Ordering) Observation: the possibility of conflicts of two transactions is low. Transactions are allowed to progress as if there were no conflict at all. A transaction has three phases: tentative, validation and commit. Tentative phase: read and write are returned immediately (write is to a tentative version). Read set (RS) and Write set (WS) of the trans are recorded. Validation phase: detect the conflicts with other concurrent transactions and decide commit/abort as the result. Commit phase: commit tentative values made by the transaction. Note: transactions are serialized in the order of their close time. openT closeT tentative validate commit Definition of RS and WS. And the use of RS and WS in validation of T -> U! Note: validation phase and write phase are very ……….short in time, compared with tentative phase!

Validation Condition When validating Ti, system checks each Tj which is concurrent with Ti (i.e. Ti starts in between the start and completion of Tj). 1. Sequential validation (overlap in tentative phase): WS(Tj)  RS(Ti) =  2. Concurrent validation (overlap in validation/commit phase): WS(Tj)  (RS(Ti)  WS(Ti)) = . (if WS(Tj)  WS(Ti)  , Tj may overwrite Ti’s data due to concurrent commit) In either case, Tj and Ti are equivalent to Tj  Ti. Tj Ti Tj Ti Sequential validation: otherwise Ti missed out data updated by Tj It’s easy for the DBMS to do sequential validation. It simply makes validation a critical section, no allow pre-emption. Sequential validation is, in fact, good enough for system performance, bcs the time taken for validation is very short! But for multi-threading DBMS system, concurrent validation is desired & possible: 2. Concurrent validation: why? “write of Ti must over-write data by Tj” Tj Ti

Implementation Details Three important time points of a transaction: start_t: time of “openTrans” finish_t: time of “closeTrans” commit_t: time of final completion start_t finish_t commit_t validation_set: a set of transactions being in validation.

Validation, Write, and Commit of Transaction Ti When Ti enters validation, record time finish_t, make a copy of validation_set and add Ti into validation_set. transactions whose commit_t is in between start_t and finish_t of Ti are concurrent in tentative phase but sequential in validation phase with Ti. transactions in validation_set are concurrent in validation phase with Ti. If Ti passes validation, it enters write-phase. Ti's commit_t is recorded and Ti is removed from validation_set. commit_t start_t finish_t Ti Ti Steps 1, 2, 3.

How long we need to keep information of Ti Information of committed transactions should be kept for other transactions validation: the information of a transaction can be removed when its commit_t < start_t of any uncommitted transaction in the system. start_ti commit_ti start_t Ti

Discussions on Optimistic Method Serializability Transactions are serialized in the order of their close time. Starvation problem a (long) transaction may be aborted again and again. Advantage optimistic (high concurrency). no deadlock. a transaction will not be blocked by others. Disadvantage: complicate. starvation. abort and restart.

Conclusion of Concurrency Control Methods Two phase locking: Lock data items before access. Serialized in the order of obtaining locks. No abort and restart. Timestamp ordering: Check timestamps of data items before access. Serialized in the order of start time. Abort during execution. Optimistic method: Validate at the close. Serialized in the order of close time. Abort at validation.