Download presentation
Published bySophia MacLean Modified over 10 years ago
1
CSL 771: Database Implementation Transaction Processing
Maya Ramanath All material (including figures) from: Concurrency Control and Recovery in Database Systems Phil Bernstein, Vassos Hadzilacos and Nathan Goodman (
2
A transaction is a unit of interaction
Transactions Interaction with the DBMS through SQL update Airlines set price = price - price*0.1, status = “cheap” where price < 5000 A transaction is a unit of interaction
3
Database system must ensure ACID properties
Atomicity Consistency Isolation Durability Database system must ensure ACID properties
4
Atomicity and Consistency
Single transaction Execution of a transaction: “all-or-nothing” Either a transaction completes in its entirety Or it “does not even start” As if the transaction never existed No partial effect must be visible 2 outcomes: A transaction COMMITs or ABORTs
5
Consistency and Isolation
Multiple transactions Concurrent execution can cause an inconsistent database state Each transaction executed as if isolated from the others
6
Durability If a transaction commits the effects are permanent
But, durability has a bigger scope Catastrophic failures (floods, fires, earthquakes)
7
What we will study… Concurrency Control Recovery
Ensuring atomicity, consistency and isolation when multiple transactions are executed concurrently Recovery Ensuring durability and consistency in case of software/hardware failures
8
Terminology Data item Read (x) Write (x, 5) Start (T) Commit (T)
A tuple, table, block Read (x) Write (x, 5) Start (T) Commit (T) Abort (T) Active Transaction A transaction which has neither committed nor aborted Commit: guarantees that the transaction will not be aborted! (Bank transaction, for example) Abort: can occur because the transaction itself encounters an error condition, or by system failures beyond its control. The DB itself may abort for some reason.
9
High level model Transaction Manager Scheduler Recovery Manager
Transaction n Transaction Manager Scheduler Disk Recovery Manager Cache Manager
10
Recoverability (1/2) Transaction T Aborts DBMS has to…
T wrote some data items T’ read items that T wrote DBMS has to… Undo the effects of T Undo effects of T’ But, T’ has already committed T T’ Read (x) Write (x, k) Read (y) Write (y, k’) Commit Abort
11
Ti commits after all Tk commit
Recoverability (2/2) Let T1,…,Tn be a set of transactions Ti reads a value written by Tk, k < i An execution of transactions is recoverable if Ti commits after all Tk commit T1 T2 Write (x,2) Read (x) Write (y,2) Commit T1 T2 Write (x,2) Read (x) Write (y,2) Commit
12
Cascading Aborts (1/2) Because T was aborted, T1,…, Tk also have to be aborted T T’ T’’ Read (x) Write (x, k) Read (y) Write (y, k’) Abort
13
Cascading Aborts (2/2) Recoverable executions do not prevent cascading aborts How can we prevent them then ? T1 T2 Write (x,2) Read (x) Write (y,2) Commit T1 T2 Write (x,2) Commit Read (x) Write (y,2)
14
What we learnt so far… Reading a value, committing a transaction
Recoverable with cascading aborts Recoverable without cascading aborts Not recoverable T1 T2 Write (x,2) Read (x) Write (y,2) Commit T1 T2 Write (x,2) Read (x) Write (y,2) Commit T1 T2 Write (x,2) Commit Read (x) Write (y,2)
15
Strict Schedule (1/2) “Undo”-ing the effects of a transaction
Restore the before image of the data item T1 T2 Write (x,1) Write (y,3) Write (y,1) Commit Read (x) Abort T1 T2 Write (x,1) Write (y,3) Commit Equivalent to Final value of y: 3
16
Strict Schedule (2/2) Initial value of x: 1 T1 T2 Write (x,2) Write (x,3) Abort T1 T2 Write (x,2) Write (x,3) Abort T1 T2 Write (x,2) Abort Write (x,3) Should x be restored to 1 or 3? T1 restores x to 3? T2 restores x to 2? Do not read or write a value which has been written by an active transaction until that transaction has committed or aborted
17
The Lost Update Problem
Read (x) Write (x, 200,000) Commit Write (x, 200) Assume x is your account balance
18
Serializable Schedules
Serial schedule Simply execute transactions one after the other A serializable schedule is one which equivalent to some serial schedule
19
Serializability Theory
20
Serializable Schedules
T1: op11, op12, op13 T2: op21, op22, op23, op24 Serial schedule Simply execute transactions one after the other op11, op12, op13 op21, op22, op23, op24 op21, op22, op23, op24 op11, op12, op13 Serializable schedule Interleave operations Ensure end result is equivalent to some serial schedule
21
Notation r1[x] = Transaction 1, Read (x) w1[x] = Transaction 1, Write (x) c1 = Transaction 1, Commit a1= Transaction 1, Abort r1[x], r1[y], w2[x], r2[y], c1, c2
22
Histories (1/3) Operations of transaction T can be represented by a partial order. r1[x] r1[y] w1[z] c1
23
Histories (2/3) Conflicting operations
Of two ops operating on the same data item, if one of them is a write, then the ops conflict An order has to be specified for conflicting operations
24
Histories (3/3) Complete History
25
Serializable Histories
The goal: Ensure that the interleaving operations guarantee a serializable history. The method When are two histories equivalent? When is a history serial?
26
Equivalence of Histories (1/2)
H ≅ H’ if they are defined over the same set of transactions and they have the same operations they order conflicting operations the same way
27
Equivalence of Histories (2/2)
y Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman
28
Serial History A complete history is serial if for every pair of transactions Ti and Tk, all operations of Ti occur before Tk OR all operations of Tk occur before Ti A history is serializable if its committed projection is equivalent to a serial history.
29
Serialization Graph T1 T3 T2
30
Serializability Theorem
A history H is serializable if its serialization graph SG(H) is acyclic On your own How do recoverability, strict schedules, cascading aborts fit into the big picture?
31
Locking
32
High level model Transaction Manager Scheduler Recovery Manager
Transaction n Transaction Manager Scheduler Disk Recovery Manager Cache Manager
33
Transaction Management
. Transaction n Transaction Manager Receives Transactions Sends operations to scheduler Read1(x) Write2(y,k) Read2(x) Commit1 Scheduler Execute op Reject op Delay op Disk
34
Locking is used by the scheduler to ensure serializability
Each data item x has a lock associated with it If T wants to access x Scheduler first acquires a lock on x Only one transaction can hold a lock on x T releases the lock after processing Locking is used by the scheduler to ensure serializability
35
Notation Read lock and write lock Obtaining read and write locks
rl[x], wl[x] Obtaining read and write locks rli[x], wli[x] Lock table Entries of the form [x, r, Ti] Conflicting locks pli[x], qlk[y], x = y and p,q conflict Unlock rui[x], wui[x]
36
Basic 2-Phase Locking (2PL)
RULE 1 RULE 2 pli[x] cannot be released until pi[x] is completed Receive pi[x] is qlk[x] set such that p and q conflict? NO Acquire pli[x] RULE 3 (2 Phase Rule) Once a lock is released no other locks may be obtained. YES pi[x] scheduled pi[x] delayed
37
The 2-phase rule Once a lock is released no other locks may be obtained. T1: r1[x] w1[y] c1 T2: w2[x] w2[y] c2 H = rl1[x] r1[x] ru1[x] wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2 wl1[y] w1[y] wu1[y] c1 T1 T2
38
Correctness of 2PL 2PL always produces serializable histories Proof outline STEP 1: Characterize properties of the scheduler STEP 2: Prove that any history with these properties is serializable (That is, SG(H) is acyclic)
39
Deadlocks (1/2) T1: r1[x] w1[y] c1 T2: w2[y] w2[x] c2 Scheduler rl1[x] wl2[y] r1[x] w2[y] <cannot proceed>
40
Deadlocks (2/2) Strategies to deal with deadlocks Timeouts
Leads to inefficiency Detecting deadlocks Maintain a wait-for graph, cycle indicates deadlock Once a deadlock is detected, break the cycle by aborting a transaction New problem: Starvation
41
Conservative 2PL Avoids deadlocks altogether
T declares its readset and writeset Scheduler tries to acquire all required locks If not all locks can be acquired, T waits in a queue T never “starts” until all locks are acquired Therefore, it can never be involved in a deadlock On your own Strict 2PL (2PL which ensures only strict schedules)
42
Extra Information Assumption: Data items are organized in a tree
Can we come up with a better (more efficient) protocol?
43
Tree Locking Protocol (1/3)
RULE 2 if x is an intermediate node, and y is a parent of x, the ali[x] is possible only if ali[y] RULE 1 Receive ai[x] is alk[x] ? NO RULE 2 RULE 3 ali[x] cannot be released until ai[x] is completed YES pi[x] scheduled ai[x] delayed RULE 4 Once a lock is released the same lock may not be re-obtained.
44
Tree Locking Protocol (2/3)
Proposition: If Ti locks x before Tk, then for every v which is a descendant of x, if both Ti and Tk lock v, then Ti locks v before Tk. Theorem: Tree Locking Protocol always produces Serializable Schedules
45
Tree Locking Protocol (3/3)
Tree Locking Protocol avoids deadlock Releases locks earlier than 2PL BUT Needs to know the access pattern to be effective Transactions should access nodes from root-to-leaf
46
Multi-granularity Locking (1/3)
Refers to the relative size of the data item Attribute, tuple, table, page, file, etc. Efficiency depends on granularity of locking Allow transactions to lock at different granularities
47
Multi-granularity Locking (2/3)
Lock Instance Graph Explicit and Implicit Locks Intention read and intention write locks Intention locks conflict with explicit read and write locks but not with other intention locks Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman
48
Multi-granularity Locking (3/3)
To set rli[x] or irli[x], first hold irli[y] or iwli[y], such that y is the parent of x. To set wli[x] or iwli[x], first hold iwli[y], such that y is the parent of x. To schedule ri[x] (or wi[x]), Ti must hold rli[y] (or wli[y]) where y = x, or y is an ancestor of x. To release irli[x] (or iwli[x]) no child of x can be locked by Ti
49
The Phantom Problem How to lock a tuple, which (currently) does not exist? T1: r1[x1], r1[x2], r1[X], c1 T2: w[x3], w[X], c2 rl1[x1], r1[x1], rl1[x2], r1[x2], wl2[x3], wl[X], w2[x3], wu2[x3,X], c2, rl1[X], ru1[x1,x2,X], c1
50
Non-lock-based schedulers
51
Timestamp Ordering (1/3)
Each transaction is associated with a timestamp Ti indicates Transaction T with timestamp i. Each operation in the transaction has the same timestamp
52
Timestamp Ordering (2/3)
TO Rule If pi[x] and qk[x] are conflicting operations, then pi[x] is processed before qk[x] iff i < k Theorem: If H is a history representing an execution produced by a TO scheduler, then H is serializable.
53
Timestamp Ordering (3/3)
For each data item x, maintain: max-rt(x), max-wt(x), c(x) Request ri[x] Grant request if TS (i) >= max-wt (x) and c(x), update max-rt (x) Delay if TS(i) > max-wt(x) and !c(x) Else abort and restart Ti Request wi[x] Grant request if TS (i) >= max-wt (x) and TS (i) >= max-rt (x), update max-wt (x), set c(x) = false ON YOUR OWN: Thomas write rule, actions taken when a transaction has to commit or abort
54
Validation Aggressively schedule all operations
Do not commit until the transaction is “validated” ON YOUR OWN
55
Summary Lock-based Schedulers Non-lock-based Schedulers
2-Phase Locking Tree Locking Protocol Multi-granularity Locking Locking in the presence of updates Non-lock-based Schedulers Timestamp Ordering Validation-based Concurrency Control (on your own)
56
SOURCE: Database System: The complete book
SOURCE: Database System: The complete book. Garcia-Molina, Ullman and Widom Recovery
57
Logging Log the operations in the transaction(s) Believe the log
Does the log say transaction T has committed? Or does it say aborted? Or has only a partial trace (implicit abort)? In case of failures, reconstruct the DB from its log
58
The basic setup Buffer Space for each transaction
Buffer Space for data and log Transactions LOG T1 The Disk T2 T3 Tk
59
Terminology Data item: an element which can be read or written
tuple, relation, B+-tree index, etc Input x: fetch x from the disk to buffer Read x,t: read x into variable local variable t Write x,t: write value of t into x Output x: write x to disk
60
Example update Airlines set price = price - price*0.1, status = “cheap” where price < 5000 Read P, x x -= x* 0.1 Write x,P Read S, y y = “CHEAP” Write y, S Output P Output S System fails here System fails here System fails here
61
Logs Sequence of log records Need to keep track of
Start of transaction Update operations (Write operations) End of transaction (COMMIT or ABORT) “Believe” the log, use the log to reconstruct a consistent DB state
62
All 3 logging styles ensure atomicity and durability
Types of logs Undo logs Ensure that uncommitted transactions are rolled back (or undone) Redo logs Ensure that committed transactions are redone Undo/Redo logs Both of the above All 3 logging styles ensure atomicity and durability
63
Undo Logging (1/3) <START T>: Start of transaction T
<COMMIT T> <ABORT T> <T, A, x>: Transaction T modified A whose before-image is x.
64
Undo Logging (2/3) Read P, x x -= x* 0.1 Write x,P Read S, y
y = “CHEAP” Write y, S FLUSH LOG Output P Output S <START T> U1: <T, X, v> should be flushed before Output X U2: <COMMIT T> should be flushed after all OUTPUTs <T, P, x> <T, S, y> <COMMIT T>
65
Undo Logging (3/3) Recovery with Undo log
If T has a <COMMIT T> entry, do nothing If T has a <START T> entry, but no <COMMIT T> T is incomplete and needs to be undone Restore old values from <T,X,v> records There may be multiple transactions Start scanning from the end of the log
66
Redo Logging (1/3) All incomplete transactions can be ignored
Redo all completed transactions <T, A, x>: Transaction T modified A whose after-image is x.
67
Redo Logging (2/3) Read P, x x -= x* 0.1 Write x,P Read S, y
y = “CHEAP” Write y, S FLUSH LOG Output P Output S <START T> R1: <T, X, v> and <COMMIT T> should be flushed before Output X <T, P, x> <T, S, y> <COMMIT T> Write-ahead Logging
68
Redo Logging (3/3) Recovery with Redo Logging
If T has a <COMMIT T> entry, redo T If T is incomplete, do nothing (add <ABORT T>) For multiple transactions Scan from the beginning of the log
69
Undo/Redo Logging (1/3) Undo logging: Cannot COMMIT T unless all updates are written to disk Redo logging: Cannot release memory unless transaction commits Undo/Redo logs attempt to strike a balance
70
Undo/Redo Logging (2/3) Read P, x x -= x* 0.1 Write x,P Read S, y
y = “CHEAP” Write y, S FLUSH LOG Output P Output S <START T> UR1: <T, X, a, b> should be flushed before Output X U1: <T, X, v> should be flushed before Output X U2: <COMMIT T> should be flushed after all OUTPUTs <T, P, x, a> <T, S, y, b> R1: <T, X, v> and <COMMIT T> should be flushed before Output X <COMMIT T>
71
Undo/Redo Logging (3/3) Recovery with Undo/Redo Logging
Redo all committed transactions (earliest-first) Undo all uncommitted transactions (latest-first) What happens if there is a crash when you are writing a log? What happens if there is a crash during recovery?
72
Checkpointing Logs can be huge…can we throw away portions of it?
Can we avoid processing all of it when there is a crash? ON YOUR OWN
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.