Presentation is loading. Please wait.

Presentation is loading. Please wait.

3. Transaction processing concepts

Similar presentations


Presentation on theme: "3. Transaction processing concepts"— Presentation transcript:

1 3. Transaction processing concepts
What is a transaction? Logical unit of work Execution of a program that accesses or changes the database Fundamental property of a database: The data are shared among several users and applications  concurrent processing, possibly conflicting interests. AdvDB J. Teuhola 2015

2 Transaction viewpoints
A transaction should lead the database from one consistent state to another. Partial execution not allowed (principle: all or nothing) Concurrent access to data by multiple transactions should be supported. Transactions may end prematurely, due to system (hardware/software) failure; recovery mechanisms are needed. AdvDB J. Teuhola 2015

3 Categories of database processing
OLTP = On-Line Transaction Processing Needed for the everyday functioning of the organization’s main activities, accessing the ’operative’ databases Short transactions and response times OLAP = On-Line Analytic Processing Decision support, data mining Long read-only transactions ’Data warehouse’ The data need not be exactly up-to-date; refreshed periodically AdvDB J. Teuhola 2015

4 Interleaved execution
Concurrent transactions may be executed in an interleaved fashion. Transactions consist of steps, between which the data may be inconsistent. Arbitrary interleaving of steps will cause problems (interference): Lost update problem Temporary update problem (‘dirty read’) Incorrect summary problem AdvDB J. Teuhola 2015

5 Lost update problem Example: Bank account, two parallel withdrawals
The first update is lost. Note that the transactions are not aware of each other’s internal program variables (bal). Time Transaction 1: Transaction 2: Withdraw 200 from account 123 Withdraw 100 from account bal = read_balance(123) = 1000 € bal = bal – 200 € bal = read_balance(123) = 1000 € bal = bal € write_balance(123, bal) = 800 € write_balance(123, bal) = 900 € AdvDB J. Teuhola 2015

6 Temporary update (‘dirty read’) problem
Example transactions on bank accounts: Time Transaction 1 Transaction 2 Transfer 100 € from account Withdraw 200 € from account to account 789 bal = read_balance(123)=1000€ bal = bal - 100€ write_balance(123, bal)=900€ bal = read_balance(123) = 900€ bal = read_balance(789)=2000€ abort; recover_balance(123)=1000€ bal = bal € write_balance(123, bal)=700 € // Should be 800 € AdvDB J. Teuhola 2015

7 Incorrect summary problem
Time Transaction 1 Transaction 2 Sum of balances Transfer 100 € from 789 to sum = 0 bal = read_balance(123) = 1000 € sum = sum + bal = 1000 € bal = read_balance(456) = 2000 € sum = sum + bal = 3000€ bal = read_balance(789) = 3000 € bal = bal – 100 € write_balance(789, bal) = 2900 € bal = read_balance(123) = 1000 € bal = bal + 100€ write_balance(123, bal) = 1100 € bal = read_balance(789) = 2900 € sum = sum € = 5900 € // Should be 6000 € AdvDB J. Teuhola 2015

8 Reasons for recovery DBMS must ensure that each transaction is either (1) executed in totality (all operations completed successfully; updates made permanent), or (2) has no effect on the database contents nor on other transactions. In case of failure, the DBMS must roll back the (partially executed) transaction so that (2) holds. AdvDB J. Teuhola 2015

9 Types of failures Hardware/software failure: contents of main memory may be lost. Transaction error: e.g. illegal operation or user interrupt. Logical error: Violation of database consistency constraints. Concurrency control error: Interference of transactions, e.g. deadlock. Disk failure, accidents, etc.: Some data are usually lost. AdvDB J. Teuhola 2015

10 Transaction events begin_transaction read/write
commit: successful end; fix the changes rollback (abort): unsuccessful ending undo effects of a failed transaction redo effects of a successful transaction end_transaction AdvDB J. Teuhola 2015

11 System log On disk (permanent storage) Enables recovery from failures
Stores data about transaction states and performed updates (before-/after-images) Transactions identified by a unique id. Log entries: start, [read,] write, commit, abort Undo needs before-image (old value) Redo needs after-image (new value) AdvDB J. Teuhola 2015

12 Commit point Database operations performed successfully
Effects recorded in the log. At commit the log page in buffer must be force-written to disk. Writing other buffer pages may be postponed. At failure: Uncommitted transactions are undone. Committed transactions may have to be redone. AdvDB J. Teuhola 2015

13 Checkpoint Effects of write operations of committed transactions are forced to disk. A transaction need not be redone at failure, if its commit-entry in the log is before the last checkpoint. Checkpoint is taken periodically. AdvDB J. Teuhola 2015

14 Checkpoint actions Suspend execution. Force-write committed updates.
Write the checkpoint info to the log: active transactions pointers to the first and last log entries of transactions (to speed-up recovery) Resume execution. AdvDB J. Teuhola 2015

15 Transaction processing scenario
Program 1 Program 2 DBMS manages ‘Page frames’ Log buffer DB Main memory Disk pages DB buffer Log pages Force-write at checkpoint Force-write at commit Before-/ after-images of changed tuples AdvDB J. Teuhola 2015

16 Desirable (ACID) properties of transactions
Atomicity: All or nothing. Consistency: From one consistent state of the database to another. Isolation: Updates are invisible before commit; no need for cascading rollbacks. Durability: Committed changes are not lost (responsibility of the recovery method), except in catastrophic failures. AdvDB J. Teuhola 2015

17 Levels of isolation 0: No overwrite of dirty reads.
1: No lost updates. 2. No dirty reads & no lost updates. 3. ‘True isolation’: No dirty reads & no lost updates & repeatable read. AdvDB J. Teuhola 2015

18 Schedule Assume a set of transactions {T1, ..., Tn} with ordered operations: Ti = <op1>i ; <op2>i ; ... A schedule S is an ordered set of operations, where each Ti-sequence is a subsequence of S. Notice: Operations of different transactions may be interleaved. AdvDB J. Teuhola 2015

19 Example schedule Operation sequences of three transactions:
T1: r1(x); r1(y); w1(x); c1; T2: r2(y); w2(y); c2; T3: r3(x); r3(y); w3(x); c3; One possible interleaved schedule: r1(x); r2(y); r1(y); w1(x); r3(x); r3(y); w2(y); w3(x); c2; c1; c3; AdvDB J. Teuhola 2015

20 Conflict of operations
Notation for operations: <op><trans-id>(<data item>) where <op> is one of {r, w, c, a} = {read, write, commit, abort} Conflict (= danger): Two operations <op1>i(X) and <op2>j(X) may cause trouble if i  j and (<op1> = w or <op2> = w) AdvDB J. Teuhola 2015

21 Example of conflicts Schedule: r1(x); r1(y); r2(x); w1(x); r2(y); w2(x); c1; c2; Conflicting pairs of operations: r1(x) and w2(x) r2(x) and w1(x) w1(x) and w2(x) Not in conflict: r1(y) and r2(y) AdvDB J. Teuhola 2015

22 Complete schedule S is a complete schedule of T1, T2, ..., Tn, if
(1) S contains operations of T1, T2, ..., Tn (but no others). (2) Operation order of each Ti is preserved. (3) Order of operations is fixed for each conflicting pair; otherwise ordering may be partial. (4) Commit/abort is the last operation of each Ti. AdvDB J. Teuhola 2015

23 Committed projection Committed projection of S = All operations of S belonging to committed transactions. Reason for this concept: New transactions appear all the time; difficult to find a moment when no active transactions exist. AdvDB J. Teuhola 2015

24 Recoverable schedule Transaction T2 that reads X does not commit until T1 that wrote X (before the read) has committed. Explanation: T2 can be rolled back as long as it has not committed. If T1 aborts, T2 should be rolled back to avoid consequences of dirty read. AdvDB J. Teuhola 2015

25 Cascading rollback Cascade phenomenon:
T2 has to be aborted because T1 aborted; T3 has to be aborted because T2 aborted; ... Prevention: Read an item only after the writing transaction has committed (cascadeless schedule). Strict schedule: Read/write X only after the last writer transaction of X has committed. Advantage: Simpler undo. Disadvantage: Lower level of concurrency AdvDB J. Teuhola 2015

26 Serializability Aim: Correct (but not necessarily serial) schedules for interfering transactions Means: Sufficient isolation Serial schedule: All operations of each transaction Ti are executed consecutively, without interleaving. AdvDB J. Teuhola 2015

27 Serializability (cont.)
Axiom: For independent transactions every serial schedule is correct (n! alternatives for n transactions). Problem: Limited concurrency; if one transaction waits for I/O, another cannot take over; the processor is idle. Serializable schedule: Equivalent to some serial schedule (same result, though interleaved execution); means logical correctness. AdvDB J. Teuhola 2015

28 Forms of schedule equivalence
Result equivalence: May be accidental; it should hold for all possible database states. Conflict equivalence: The order of any two conflicting operations is the same in both schedules. View equivalence: A read operation reads the result of the same write operation (or none) in both schedules. AdvDB J. Teuhola 2015

29 Forms of serializability
Corresponding to the forms of equivalence: Conflict serializability: Simple testing algorithm View serializability: Less restrictive than conflict serializability. Testing is an NP-complete problem. Note. Every conflict-serializable schedule is also view-serializable, but not vice versa. AdvDB J. Teuhola 2015

30 Testing conflict serializability
1. For each transaction, create a node in a precedence graph. 2. For each pair of conflicting operations <op>Ti(X) and <op>Tj(X), occurring in this order, create an edge from Ti to Tj. 3. The schedule is serializable if the precedence graph has no cycles. AdvDB J. Teuhola 2015

31 Serializable schedule  equivalent serial schedule
1. Derive the precedence graph of transactions as above. 2. Perform a topological sort of the graph nodes. 3. The resulting sequence is the solution (which may not be unique, in general). AdvDB J. Teuhola 2015

32 Example of a precedence graph
Schedule: r1(x), r2(y), w1(x), r3(x), r3(y), w3(x), w3(y), w2(y), c1, c2, c3 1 2 y x y The schedule is not serializable, due to the cycle. 3 AdvDB J. Teuhola 2015

33 Problems of serializability
Difficult to determine the order of operations beforehand. Scheduling must be done on-the-fly, based on waiting, priorities, system load, etc. Testing serializability afterwards may be too optimistic (cf. optimistic concurrency control). Serializability can be applied only to the committed projection. AdvDB J. Teuhola 2015

34 Solution to maintaining serializability
Control the interleaving of operations by obeying rules (a protocol) that guarantee serializability without checking it. (E.g. two-phase locking protocol, see later) Conclusion: Serializability theory helps to gain better understanding of protocols, to obtain correct interleaving and improved concurrency. AdvDB J. Teuhola 2015

35 Debit-credit transactions
Less restrictive form of equivalence Addition to and subtraction from an item can be done in any order (commutativity). A schedule is correct if <read, add/subtract, write> triples are not interrupted. AdvDB J. Teuhola 2015

36 Granularity To be decided: Unit of data considered in scheduling
A smaller unit enables a higher level of concurrency (but involves also more effort to manage) A (disk) page is often the simplest unit of operation. AdvDB J. Teuhola 2015

37 Transaction support in SQL
Implicit BEGIN_TRANSACTION Explicit COMMIT or ROLLBACK SET TRANSACTION characteristics: Access mode: READ ONLY or READ WRITE Diagnostic area size: Number of feedback conditions held simultaneously. Isolation level (from weakest to strongest): READ UNCOMMITTED (allows dirty reads) READ COMMITTED (allows non-repeatable reads) REPEATABLE READ (allows ‘phantoms’) SERIALIZABLE (actually stronger than defined earlier) AdvDB J. Teuhola 2015


Download ppt "3. Transaction processing concepts"

Similar presentations


Ads by Google