Published by Dylan Howard. Modified over 8 years ago.
1
Transaction Management: An Overview
Copyright © 2003 – 2013 by Curt Hill
2
Transaction
A transaction is any one execution of a user program
– A sequence of SQL statements
– A program that accesses the DBMS to accomplish a similar action
We are mostly interested in transactions that change the database
– Their interaction with queries is also interesting
A transaction is the unit of interest for concurrent execution and recovery
3
Examples
Buying a product
– Finding how many there are
– Removing the sold ones from stock
– The problem: two requests arrive simultaneously and cannot both be satisfied
Complicated queries
– Any query where a single table appears more than once in the FROM clause
– Every reference to that table should see the same contents, even with concurrent updates
4
The acronym ACID
Atomic
– The transaction is perceived to be indivisible
Consistent
– Transforms the database from one consistent state to another
Isolated
– Understandable without regard to any other agents or transactions
Durable
– Once committed, permanence is guaranteed even across system crashes
5
Atomic
Either all the actions are applied or none of them are applied
No incomplete actions are allowed
Since a transaction is made up of many smaller actions:
– Some of these may be done before a problem occurs
– The DBMS must be able to undo any of the smaller pieces of a transaction if the entire transaction is aborted
6
Errors
What causes a transaction to fail?
Expected problem
– The withdrawal amount exceeds what the ATM or the source account holds
– The transaction aborts itself
Transient problem
– For reasons seen later, the DBMS aborts the transaction and restarts it later
System error
– Disk failure and power failure, among others
7
Durability
A transaction may either commit or roll back
Committed transactions must be durable
Rolled-back transactions must be completely undone
– As if they were never executed at all
– Queries are no problem, updates are
Once a transaction executes a commit or rollback, it must take effect even if the system crashes
8
Consistency
Two senses:
– A database is consistent if all constraints are met
– A database is consistent if it correctly models some real-world situation
The first is enforced by the normal constraint checking of the DBMS
The second is the responsibility of the transaction
9
Consistency example
Transferring money from one account to another
– Removing the money from one account leaves the database inconsistent in the second sense
– This is corrected when the money is added to the second account
Both actions must be in the same transaction
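The transfer can be sketched in Python with the standard `sqlite3` module. This is only an illustration: the `account` table, its columns, and the `transfer` helper are assumptions made for the sketch, not part of the original slides. It shows both updates living in one transaction, and the transaction aborting itself when the source account cannot cover the amount.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            (bal,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if bal < 0:
                # Expected problem: the transaction aborts itself
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        return True
    except ValueError:
        return False

# Hypothetical schema and data, for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds: 70 and 80
failed = transfer(conn, 1, 2, 500)  # aborts: the withdrawal is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed call the first update has been rolled back, so the total amount of money in the two accounts is unchanged.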
10
Isolation
Interleaving of the actions within several different transactions may occur
The DBMS must guarantee that the result will be the same as if they were completely serialized
This interleaving comes from concurrent execution
Without interleaved execution, performance will be poor
11
Transaction Schedules
A transaction is a list of actions
The actions include:
– Reads and writes of tuples
– Commit or rollback commands
– Others: arithmetic, comparisons, etc.
Two transactions may interact only through their reads and writes
A transaction schedule is an ordering of all the actions from a group of transactions
12
Schedules
A schedule interleaves the actions of several transactions
A complete schedule has all the actions of all the transactions
A serial schedule has no interleaving
Isolation dictates that an interleaved schedule ends with the same result as some serial schedule
13
Concurrency
Can we afford a serial schedule?
– One transaction completely finished before the next one is begun
No
– The impact on performance is too large
Instead
– We have to handle the transactions concurrently
The issues are discussed in the concurrency presentation
14
Serializability
A serializable schedule has an equivalent effect to a serial schedule
The serializable schedule allows interleaved operations, while the serial schedule does not
The idea is that we get concurrent execution and consistent results
Consider some examples using two transactions
15
Notes on Examples
We have two transactions, T1 and T2
Each does two reads and two writes
These will be denoted by R(A) and W(A)
– R(A): read A (a page)
– W(A): write A (a page)
16
Example 1 – No Commonality
T1 T2 R(A) R(Y) R(B) R(X) W(A) W(B) W(X) W(Y)
17
No Commonality
An easy example
Since there are no pages in common, any interleaving works
– Provided the order is maintained within each transaction
Painless and free concurrency
Somewhat more difficult if the write is an insert into a B+Tree index that causes splits
– Then the pages could be different and still interfere
18
Example 2 – Commonality
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y)
19
Commonality
This serializes the same as T1 completed and then T2
The order could change and give a different equivalent serial schedule
Such as the following
20
Example 2 – Again
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y)
21
Again
This serializes the same as T2 completed and then T1
Either of these is acceptable
All we have to do is guarantee that our interleaved schedule is equivalent to some serial schedule, not a particular serial schedule
We do not care which customer gets the product as long as the game is fair
22
Non-Serializable Schedule
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y)
23
No Equivalent Serial Schedule
T2’s write of X is lost
T1’s write of Y is also lost
This assumes that the entire page is given to or received from the buffer pool
There are many other interleavings that also lose something
This lost update is one of several concurrent execution anomalies
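A lost update can be reproduced with a tiny schedule simulator. The sketch below is hypothetical and deliberately simpler than the slide's two-page example: it uses a single item X, invented update amounts, and a `run` helper that is an assumption of this sketch. Each read copies the item into the transaction's private workspace, and each write stores a value computed from that private copy.

```python
def run(schedule, db, updates):
    """Execute (txn, op, item) steps in order.

    "R" copies the item into the transaction's private workspace;
    "W" stores updates[txn] applied to that private copy.
    """
    db = dict(db)   # work on a copy so every run starts from the same state
    local = {}
    for txn, op, item in schedule:
        if op == "R":
            local[(txn, item)] = db[item]
        else:  # "W"
            db[item] = updates[txn](local[(txn, item)])
    return db

start = {"X": 100}
# Hypothetical updates: T1 adds 10, T2 adds 20
updates = {"T1": lambda v: v + 10, "T2": lambda v: v + 20}

serial_t1_t2 = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
serial_t2_t1 = [("T2", "R", "X"), ("T2", "W", "X"), ("T1", "R", "X"), ("T1", "W", "X")]
interleaved = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]

serial_results = [run(serial_t1_t2, start, updates), run(serial_t2_t1, start, updates)]
lost = run(interleaved, start, updates)
```

Both serial orders leave X at 130. The interleaved schedule leaves X at 120, because T2 overwrites T1's write using a value it read before T1 wrote, so it matches no serial schedule.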
24
Interleaved Execution Anomalies
The above is similar to updating a shared variable in memory
Databases allow both commit and rollback operations, which cause further problems
These are unlike the shared memory problem
25
Four combinations
RR
– Two separate transactions doing a read
– This is the only case that is always harmless
RW
– T1 wants to read and T2 wants to write the same page
WR
– T1 wants to write and T2 wants to read the same page
WW
– Both want to write the same page
– This was the first example seen
26
Reading Uncommitted Data
Uncommitted data is any data modified by a transaction that has not yet committed
The modification may be in the past or future
Reading uncommitted data is called a dirty read
27
An Aborted Transaction Problem
T1 T2 R(X) W(X) Rollback T1 T2 R(X) W(X) Rollback W(X)
28
Rollback Problems
Notice in the previous example that T1, which had the problem, was a query, not an update
This is an example of a WR problem
We do not need a rollback to cause the problem
Consider the next situation, where each record has $10,000
– T1 increases this by 10% and T2 increases it by $1000
29
A WR Problem
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y) Commit
30
Results of the WR anomaly
The X record gets increased by 2000
– 2000 = 10000*10% + 1000
The Y record gets increased by 2100
– 2100 = 1000 + (10000+1000)*10%
Not consistent or serializable
T2’s read of X and T1’s read of Y are dirty reads
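The arithmetic can be checked with a few lines of Python. The function names `t1` and `t2` are illustrative; they model T1's 10% increase and T2's $1000 increase applied to a record that starts at $10,000, in the two orders the anomaly produces.

```python
def t1(value):
    """T1: increase the record by 10% (integer dollars)."""
    return value * 110 // 100

def t2(value):
    """T2: increase the record by $1000."""
    return value + 1000

start = 10_000

# Interleaved schedule: X is updated by T1 then T2, Y by T2 then T1
x_interleaved = t2(t1(start))
y_interleaved = t1(t2(start))

# Serial schedules apply the same order to both records
serial_t1_then_t2 = (t2(t1(start)), t2(t1(start)))
serial_t2_then_t1 = (t1(t2(start)), t1(t2(start)))

x_increase = x_interleaved - start
y_increase = y_interleaved - start
```

In either serial order the two records end up equal. The interleaved schedule leaves X and Y different, which no serial schedule can produce.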
31
RW Problems
An RW problem can result in an unrepeatable read
– Two reads in a row that yield different results
32
An Unrepeatable Read
T1 T2 R(X) W(X) Commit
33
What’s the fix?
Lock an object in one of several ways to prevent this type of interleaving
The most common protocol is Strict Two-Phase Locking, also known as Strict 2PL
There are other protocols as well
34
Types of Locks
Shared lock
– Used for reading
Exclusive lock
– Used for writing but also allows reading
The lock may be on a:
– Tuple
– Page
– Relation
The size of the locked object is its granularity
The larger the granularity, the less concurrency but the easier the locks are to manage
35
Strict 2PL Rules
A transaction requests a lock when it desires to access the object
– Shared lock for reads and exclusive lock for writes
It holds all locks until complete
– Either a commit or a rollback
The lock requests are inserted by the DBMS and do not necessarily appear in the transaction itself
A transaction that cannot get the lock is suspended until the item is available
If both reading and writing are desired, get the exclusive lock
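The locking rules can be sketched as a small lock table. This is a minimal illustration under stated assumptions, not how any particular DBMS implements locking: the `LockTable` class and its method names are invented for the sketch, and a refused request simply returns `False` where a real system would suspend the transaction.

```python
class LockTable:
    """Tracks which transaction holds which lock mode ("S" or "X") on each item."""

    def __init__(self):
        self.holders = {}  # item -> {txn: mode}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if the caller must wait."""
        held = self.holders.setdefault(item, {})
        others = {t: m for t, m in held.items() if t != txn}
        if mode == "S":
            # A shared lock is compatible only with other shared locks
            compatible = all(m == "S" for m in others.values())
        else:
            # An exclusive lock is compatible with nothing held by others
            compatible = not others
        if not compatible:
            return False
        if held.get(txn) != "X":  # never downgrade an exclusive lock
            held[txn] = mode
        return True

    def release_all(self, txn):
        """Strict 2PL: every lock is released only at commit or rollback."""
        for held in self.holders.values():
            held.pop(txn, None)

locks = LockTable()
s1 = locks.request("T1", "A", "S")        # granted
s2 = locks.request("T2", "A", "S")        # granted: shared locks coexist
x2 = locks.request("T2", "A", "X")        # refused: T1 still holds a shared lock
locks.release_all("T1")                   # T1 commits
x2_retry = locks.request("T2", "A", "X")  # granted now
s1_late = locks.request("T1", "A", "S")   # refused: T2 holds an exclusive lock
```

Because locks are only released through `release_all`, a transaction in this sketch cannot give up a lock early, which is the "strict" part of the protocol.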
36
Some examples revisited
Two new commands are added
– S(A) gets a shared lock on A
– X(A) gets an exclusive lock on A
These locks are held until a commit or rollback
37
WW Problem with Locks
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y) X(X) Commit X(Y) Commit Suspended until Commit
38
WR Problem with locks
T1 T2 R(X) W(X) Rollback X(X) S(X) Shared lock causes suspension until rollback
39
Another WR
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y) Commit X(X) Exclusive lock suspended until commit occurs
40
Deadlock
T1 T2 R(X) R(Y) W(X) W(Y) X(X) X(Y) X(X) Suspended since T2 has Y Suspended since T1 has X Both are now deadlocked
41
What do we do?
The usual solution is timers
When a transaction's suspension for a lock exceeds a threshold value:
– Abort it
– Roll back all actions
– Restart the whole transaction
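The timer idea can be sketched with Python's `threading.Lock`, whose `acquire` accepts a `timeout`. This is only an analogy: a plain mutex stands in for the DBMS lock on X, the 50 ms threshold is arbitrary, and `acquire_or_abort` is a name invented for the sketch.

```python
import threading

def acquire_or_abort(lock, timeout_s):
    """Wait up to timeout_s for the lock.

    A False result tells the caller to roll back its actions and
    restart the whole transaction later.
    """
    return lock.acquire(timeout=timeout_s)

lock_x = threading.Lock()

lock_x.acquire()                           # stands in for T1 holding X
t2_got_x = acquire_or_abort(lock_x, 0.05)  # T2 waits 50 ms, then gives up
lock_x.release()                           # T1 commits and releases X
t2_retry = acquire_or_abort(lock_x, 0.05)  # the restarted T2 now succeeds
```

The first attempt times out and returns `False`, so the waiting transaction is aborted instead of staying suspended forever. That is enough to break a deadlock, at the cost of sometimes aborting a transaction that was merely slow.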
42
Performance
What does locking do to the performance of a DBMS?
It must slow the DBMS
This is better than incorrect results
It will slow serializable schedules that otherwise had no problems
– This disregards lock bookkeeping
Consider a previous example
43
A Serializable Schedule
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y) Commit
44
Commentary
This schedule was equivalent to the serial schedule of T1 followed by T2
It also had nice concurrency
Introducing locking preserves correctness but destroys concurrency
45
A Serialized Schedule
T1 T2 R(X) R(Y) R(X) W(X) W(Y) W(X) W(Y) Commit X(X) X(Y) Lock prevents any action until T1’s commit
46
SQL
The transaction statements of SQL are:
– Commit
– Rollback
– Begin
They are not needed for simple queries
Commit is the default for changes if neither Commit nor Rollback is given
47
Begin
Begin is used to show boundaries
The form is: BEGIN; or BEGIN TRANSACTION;
The transaction is then terminated by the Commit or Rollback
48
Example
BEGIN;
INSERT INTO course VALUES ('PHYS', 141, 4, 'College Physics');
INSERT INTO course VALUES ('CSCI', 242, 3, 'Data Structures');
COMMIT;
The more common usage is through a program which determines whether to commit or roll back
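A program that decides between commit and rollback might look like the following Python sketch, which uses the standard `sqlite3` module. The column names and the uniqueness constraint on `course`, the `add_courses` helper, and the MATH and duplicate PHYS rows are assumptions made for illustration; the slides give only the two INSERT statements.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN / COMMIT / ROLLBACK below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE course (dept TEXT, num INTEGER, hours INTEGER, title TEXT, "
    "UNIQUE (dept, num))"  # hypothetical schema
)

def add_courses(conn, rows):
    """Insert all rows in one transaction; roll back if any insert fails."""
    conn.execute("BEGIN")
    try:
        for row in rows:
            conn.execute("INSERT INTO course VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True

first = add_courses(conn, [("PHYS", 141, 4, "College Physics"),
                           ("CSCI", 242, 3, "Data Structures")])
# The second batch violates the uniqueness constraint on its last row,
# so the MATH row inserted just before it is rolled back as well.
second = add_courses(conn, [("MATH", 101, 3, "Algebra"),
                            ("PHYS", 141, 4, "Duplicate")])
count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```

Only the two rows from the first batch remain, which shows the all-or-nothing behavior described on the Atomic slide.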
49
Save Points
A save point is a named point at which a rollback can stop
Example:
SAVEPOINT xyz;
…
ROLLBACK TO SAVEPOINT xyz;
This avoids rolling all the way back to BEGIN
This was introduced in SQL:1999
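Save points can be tried out with SQLite through Python's `sqlite3` module. The `log` table and its contents are hypothetical; the save point name `xyz` follows the slide.

```python
import sqlite3

# Autocommit mode, so transaction boundaries are controlled explicitly
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")  # hypothetical table

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('kept')")
conn.execute("SAVEPOINT xyz")
conn.execute("INSERT INTO log VALUES ('discarded')")
conn.execute("ROLLBACK TO SAVEPOINT xyz")  # undoes only the work after xyz
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT msg FROM log")]
```

The row inserted before the save point survives the partial rollback and is committed; only the later insert is undone.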
50
Granularity
The size of the thing to lock
– Tuple
– Page
– Relation
Tradeoff
– Locking large things impedes concurrency
– Locking small things requires more overhead and allows bad things to happen, such as the phantom read problem
51
Phantom Read Problem
Two reads with different results
– T1 queries a table and finds all rows that match certain criteria
– It takes an exclusive lock on each of these rows
– T2 inserts a new record which matches the criteria
– T1 now re-queries with the same criteria and gets a different set
Putting a shared lock on the whole table solves the problem but limits concurrency
52
Recovery
The Recovery Manager is responsible for several important issues:
– Removing the effects of rolled-back transactions from the file system
– Guaranteeing that committed changes are preserved in the file system
– Defending against crashes and rebuilding the database when one occurs
Recovery management is covered in a subsequent presentation
53
Closing Thoughts
ACID is the ideal of database reliability
This is not always possible in distributed databases
– The CAP theorem precludes it
– There we settle for BASE