Transactions and Wrap-Up


Transactions and Wrap-Up Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 6, 2009 Some slide content derived from Ramakrishnan & Gehrke

Reminders
Please be sure you’re signed up for a project demo.
Due with the project demo: the project code (zipped, via email).
Due by 12/22: a 5-10 page report describing:
  what your project goals were
  what you implemented
  basic architecture and design
  division of labor
Also: please email me an assessment of your group, covering group members’ contributions and your own contributions.
Midterm on Wednesday OR final exam on 12/22.

Recall – Lack of Isolation: Dirty Reads
Dirty data is data written by an uncommitted transaction; a dirty read is a read of dirty data (WR conflict).
Sometimes we can tolerate dirty reads; other times we cannot: e.g., if we wished to ensure balances never went negative in the transfer example, we should test that there is enough money first!

“Bad” Dirty Read
EXEC SQL select balance into :bal from Accounts where account# = '1234';
if (bal > 100) {
  EXEC SQL update Accounts set balance = balance - $100 where account# = '1234';
  EXEC SQL update Accounts set balance = balance + $100 where account# = '5678';
}
EXEC SQL COMMIT;
If the initial read (the select into :bal, shown in italics on the slide) were dirty, the balance could become negative!

Acceptable Dirty Read
If we are just checking availability of an airline seat, a dirty read might be fine! (Why is that?)
Reservation transaction:
EXEC SQL select occupied into :occ from Flights where Num = '123' and date = '11-03-99' and seat = '23f';
if (!occ) {
  EXEC SQL update Flights set occupied = true where Num = '123' and date = '11-03-99' and seat = '23f';
} else {
  notify user that seat is unavailable
}

Other Undesirable Phenomena
Unrepeatable read: a transaction reads the same data item twice and gets different values (RW conflict).
Phantom problem: a transaction retrieves a collection of tuples twice and sees different results.

Phantom Problem Example
T1: “find the students with best grades who Take either cis550-f09 or cis570-f08”
T2: “insert new entries for student #1234 in the Takes relation, with grade A for cis570-f08 and cis550-f09”
Suppose that T1 consults all students in the Takes relation and finds the best grades for cis550-f09.
Then T2 executes, inserting the new student at the end of the relation, perhaps on a page not seen by T1.
T1 then completes, finding the students with best grades for cis570-f08 and now seeing student #1234.
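The interleaving in this example can be replayed on a toy in-memory relation. This is only a sketch: the rows for students s1 and s2, the helper `best_students`, and the use of a Python list for Takes are assumptions made for illustration, not part of the slides.

```python
# Toy Takes relation: (student, course, grade). Sample rows are made up.
def best_students(takes, course):
    """Students holding the best (alphabetically smallest) grade in `course`."""
    rows = [(grade, student) for student, c, grade in takes if c == course]
    if not rows:
        return []
    best = min(grade for grade, _ in rows)
    return sorted(student for grade, student in rows if grade == best)

def run_phantom_schedule():
    takes = [("s1", "cis550-f09", "B"), ("s2", "cis570-f08", "B")]
    # T1, first half: scan for cis550-f09.
    best_550 = best_students(takes, "cis550-f09")
    # T2 runs to completion in between, appending rows T1 has not seen.
    takes.append(("1234", "cis570-f08", "A"))
    takes.append(("1234", "cis550-f09", "A"))
    # T1, second half: scan for cis570-f08 now sees the new student.
    best_570 = best_students(takes, "cis570-f08")
    return best_550, best_570
```

In any serial order, T1 would see student 1234 as best in both courses or in neither; here T1 sees 1234 only for cis570-f08, which is the phantom.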

Isolation
The problems we’ve seen are all related to isolation.
General rules of thumb w.r.t. isolation:
  Fully serializable isolation is more expensive than “no isolation”: we can’t do as many things concurrently (or we have to undo them frequently).
  For performance, we generally want to specify the most relaxed isolation level that’s acceptable.
  Note that we’re “slightly” violating a correctness constraint to get performance!

Specifying Acceptable Isolation Levels
The default isolation level is SERIALIZABLE (as for the transfer example).
To signal to the system that a dirty read is acceptable:
  SET TRANSACTION READ WRITE ISOLATION LEVEL READ UNCOMMITTED;
In addition, there are:
  SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
  SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

READ COMMITTED
Forbids the reading of dirty (uncommitted) data, but allows a transaction T to issue the same query several times and get different answers.
No value written by T can be modified until T completes.
For example, the Reservation example could also be READ COMMITTED; the transaction could repeatedly poll to see if the seat was available, hoping for a cancellation.

REPEATABLE READ
What it is NOT: a guarantee that the same query will get the same answer!
However, if a tuple is retrieved once, it will be retrieved again if the query is repeated.
For example, suppose Reservation were modified to retrieve all available seats. If a tuple were retrieved once, it would be retrieved again (but additional seats may also become available).

Implementing Isolation Levels
One approach: use locking at some level (tuple, page, table, etc.):
  each data item is either locked (in some mode, e.g. shared or exclusive) or is available (no lock)
  an action on a data item can be executed if the transaction holds an appropriate lock
  consider granularity of locks: how big of an item to lock
  Larger granularity = fewer locking operations but more contention!
Appropriate locks:
  Before a read, a shared lock must be acquired
  Before a write, an exclusive lock must be acquired

Lock Compatibility Matrix
Locks on a data item are granted based on a lock compatibility matrix:
                    Mode of data item
Request mode    None    Shared    Exclusive
Shared          Y       Y         N
Exclusive       Y       N         N
When a transaction requests a lock, it must wait (block) until the lock is granted.
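The compatibility matrix can be written down directly as a lookup table. The sketch below assumes hypothetical names (`COMPATIBLE`, `can_grant`) and the mode codes "S" for shared and "X" for exclusive; only the Y/N values come from the slide.

```python
# (requested mode, mode currently held on the item) -> may the lock be granted?
# None means the data item is not locked at all.
COMPATIBLE = {
    ("S", None): True, ("S", "S"): True,  ("S", "X"): False,
    ("X", None): True, ("X", "S"): False, ("X", "X"): False,
}

def can_grant(requested, held_modes):
    """True if `requested` is compatible with every lock other transactions hold."""
    if not held_modes:
        return COMPATIBLE[(requested, None)]
    return all(COMPATIBLE[(requested, held)] for held in held_modes)
```

For example, `can_grant("S", ["S", "S"])` is true (readers share), while `can_grant("X", ["S"])` is false, so the requester would have to block.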

Locks Prevent “Bad” Execution
If the system used locking, the first “bad” execution could have been avoided:
Deposit 1                    Deposit 2
xlock(X)
read(X.bal)
                             {xlock(X) is not granted}
X.bal := X.bal + $50
write(X.bal)
release(X)
                             X.bal := X.bal + $10
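A deterministic replay of the two deposits shows the effect of the exclusive lock. This is a sketch under stated assumptions: the starting balance of 100, the round-robin scheduler, and the names `deposit` and `run_interleaved` are all hypothetical, and generators stand in for real concurrent transactions.

```python
def deposit(account, amount, use_lock, locks):
    """One deposit; each `yield` is a point where the scheduler may switch."""
    if use_lock:
        while locks.get("X") is not None:   # xlock(X) is not granted: wait
            yield "blocked"
        locks["X"] = True                   # xlock(X) granted
    bal = account["bal"]                    # read(X.bal)
    yield "read"
    account["bal"] = bal + amount           # write(X.bal)
    if use_lock:
        locks["X"] = None                   # release(X)
    yield "wrote"

def run_interleaved(use_lock):
    """Run Deposit 1 (+50) and Deposit 2 (+10) one step at a time, round-robin."""
    account, locks = {"bal": 100}, {}
    txns = [deposit(account, 50, use_lock, locks),
            deposit(account, 10, use_lock, locks)]
    while txns:
        for t in list(txns):
            try:
                next(t)
            except StopIteration:
                txns.remove(t)
    return account["bal"]
```

Without the lock both deposits read the same starting balance and one update is lost; with the lock, Deposit 2 waits until Deposit 1 releases X and both updates survive.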

Lock Types and Read/Write Modes
When we specify “read-only”, the system only uses shared-mode locks. Any transaction that attempts to update will be illegal.
When we specify “read-write”, the system may also acquire locks in exclusive mode. Obviously, we can still query in this mode.

Isolation Levels and Locking
READ UNCOMMITTED allows queries in the transaction to read data without acquiring any lock. For updates, exclusive locks must be obtained and held to end of transaction.
READ COMMITTED requires a read-lock to be obtained for all tuples touched by queries, but it releases the locks immediately after the read. Exclusive locks must be obtained for updates and held to end of transaction.

Isolation Levels and Locking, cont.
REPEATABLE READ places shared locks on tuples retrieved by queries and holds them until the end of the transaction. Exclusive locks must be obtained for updates and held to end of transaction.
SERIALIZABLE places shared locks on tuples retrieved by queries as well as on the index, and holds them until the end of the transaction.
Holding locks to the end of a transaction is called “strict” locking.

Summary of Isolation Levels
Level               Dirty Read    Unrepeatable Read    Phantoms
READ UNCOMMITTED    Maybe         Maybe                Maybe
READ COMMITTED      No            Maybe                Maybe
REPEATABLE READ     No            No                   Maybe
SERIALIZABLE        No            No                   No
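The summary table also suggests a mechanical way to follow the earlier rule of thumb (pick the most relaxed acceptable level). The sketch below is illustrative only; `ALLOWED_ANOMALIES`, `LEVELS`, and `most_relaxed_level` are hypothetical names, and the anomaly sets are transcribed from the "Maybe" entries of the table.

```python
# Anomalies each level may still exhibit ("Maybe" in the summary table).
ALLOWED_ANOMALIES = {
    "READ UNCOMMITTED": {"dirty read", "unrepeatable read", "phantom"},
    "READ COMMITTED": {"unrepeatable read", "phantom"},
    "REPEATABLE READ": {"phantom"},
    "SERIALIZABLE": set(),
}

# Ordered from most relaxed to strictest.
LEVELS = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

def most_relaxed_level(tolerated):
    """Weakest level whose possible anomalies are all in `tolerated`."""
    tolerated = set(tolerated)
    for level in LEVELS:
        # SERIALIZABLE allows no anomalies, so the loop always returns by then.
        if ALLOWED_ANOMALIES[level] <= tolerated:
            return level
```

Note that tolerating only dirty reads still yields SERIALIZABLE here, because READ UNCOMMITTED also admits unrepeatable reads and phantoms.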

Locking and Serializability
A transaction must hold all locks until it terminates (a condition called strict locking).
It turns out that this is crucial to guarantee serializability.
Note that the first (bad) example could have been produced if transactions acquired and immediately released locks.

Questions to Address
Given a schedule S, is it serializable?
How can we "restrict" transactions in progress to guarantee that only serializable schedules are produced?

Conflicting Actions
Consider a schedule S in which there are two consecutive actions Ii and Ij of transactions Ti and Tj respectively.
If Ii and Ij refer to different data items, then swapping Ii and Ij does not matter.
If Ii and Ij refer to the same data item Q, then swapping Ii and Ij matters if and only if at least one of the actions is a write.
For example, in Ri(Q) Wj(Q) transaction Ti reads a different value of Q than in Wj(Q) Ri(Q), so the outcome can differ.
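The conflict rule is small enough to state as a predicate. In this sketch the `Action` record and the `conflicts` function are hypothetical names; the rule itself (different transactions, same item, at least one write) is the one described above.

```python
from typing import NamedTuple

class Action(NamedTuple):
    txn: str    # transaction id, e.g. "T1"
    op: str     # "R" for read, "W" for write
    item: str   # data item, e.g. "Q"

def conflicts(a, b):
    """Two actions conflict if they come from different transactions,
    touch the same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "W" in (a.op, b.op)
```

Two reads of the same item never conflict, so adjacent reads can always be swapped without changing what either transaction observes.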

Testing for Serializability
Given a schedule S, we can construct a digraph G = (V, E) called a precedence graph:
  V: all transactions in S
  E: Ti → Tj whenever an action of Ti precedes and conflicts with an action of Tj in S
Theorem: A schedule S is conflict serializable if and only if its precedence graph contains no cycles.
Note that testing for a cycle in a digraph can be done in time O(|V|^2).
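The test can be sketched in a few lines: build the precedence graph from a schedule, then look for a cycle with a depth-first search. The schedule encoding (a list of `(transaction, op, item)` tuples) and all function names are assumptions made for this example.

```python
def precedence_graph(schedule):
    """schedule: list of (txn, op, item) with op in {"R", "W"}.
    Returns {Ti: set of Tj} with an edge Ti -> Tj whenever an action of Ti
    precedes and conflicts with an action of Tj."""
    edges = {txn: set() for txn, _, _ in schedule}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def has_cycle(edges):
    """Depth-first search; reaching a node that is still on the stack means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in edges}

    def visit(v):
        color[v] = GREY
        for w in edges[v]:
            if color[w] == GREY or (color[w] == WHITE and visit(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in edges)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))
```

For instance, R1(X) W2(X) W1(X) yields both T1 → T2 and T2 → T1, so it is rejected, while any serial schedule passes.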

An Example
[Figure: schedule of T1, T2, T3 with actions R(X,Y,Z), R(X), W(X), R(Y), W(Y), W(Z), and its precedence graph over T1, T2, T3]
Cyclic: not serializable.

Another Example
[Figure: schedule of T1, T2, T3 with actions R(X), W(X), R(Y), W(Y), and its precedence graph over T1, T2, T3]
Acyclic: serializable.

Producing the Equivalent Serial Schedule
If the precedence graph for a schedule is acyclic, then an equivalent serial schedule can be found by a topological sort of the graph.
For the second example, the equivalent serial schedule is:
R1(Y) W1(Y) R2(X) W2(X) R2(Y) W2(Y) R3(X) W3(X)
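The topological sort can be sketched with Python's standard `graphlib` module (Python 3.9+). The function name `serial_order` and the edge-dictionary encoding are hypothetical; the edges T1 → T2 and T2 → T3 used in the usage note are an assumption chosen to be consistent with the serial schedule given above, since the original figure is not available here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def serial_order(edges):
    """edges maps each transaction to the set of transactions that must follow it.
    Returns one transaction order compatible with every edge."""
    ts = TopologicalSorter()
    for before, afters in edges.items():
        ts.add(before)
        for after in afters:
            ts.add(after, before)      # `after` depends on `before`
    # static_order raises graphlib.CycleError if the graph has a cycle.
    return list(ts.static_order())
```

With the assumed edges, `serial_order({"T1": {"T2"}, "T2": {"T3"}, "T3": set()})` gives T1, T2, T3, matching the order of the serial schedule on the slide.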

Locking and Serializability
We said that a transaction must hold all locks until it terminates (a condition called strict locking).
It turns out that this is crucial to guarantee serializability.
Note that the first (bad) example could have been produced if transactions acquired and immediately released locks.

Well-Formed, Two-Phased Transactions
A transaction is well-formed if it acquires at least a shared lock on Q before reading Q or an exclusive lock on Q before writing Q, and doesn’t release the lock until the action is performed.
Locks are also released by the end of the transaction.
A transaction is two-phased if it never acquires a lock after unlocking one, i.e., there are two phases: a growing phase in which the transaction acquires locks, and a shrinking phase in which locks are released.
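Both definitions can be checked mechanically on a single transaction's sequence of operations. This is a sketch with an assumed encoding: each step is an `(op, item)` pair where "S" takes a shared lock, "X" takes an exclusive lock, "U" unlocks, and "R"/"W" read and write; the function names are hypothetical.

```python
def is_two_phased(ops):
    """True if no lock is acquired after the first unlock."""
    unlocked = False
    for op, _ in ops:
        if op == "U":
            unlocked = True            # shrinking phase has begun
        elif op in ("S", "X") and unlocked:
            return False               # acquiring again breaks two-phasedness
    return True

def is_well_formed(ops):
    """True if every read holds at least a shared lock, every write holds an
    exclusive lock, and all locks are released by the end."""
    held = {}                          # item -> "S" or "X"
    for op, item in ops:
        if op in ("S", "X"):
            held[item] = op
        elif op == "U":
            held.pop(item, None)
        elif op == "R" and item not in held:
            return False
        elif op == "W" and held.get(item) != "X":
            return False
    return not held
```

A transaction that locks A, writes it, unlocks A, and only then locks B is well-formed but not two-phased, which is exactly the case the theorem on the next slide warns about.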

Two-Phased Locking Theorem
If all transactions are well-formed and two-phased, then any schedule in which conflicting locks are never granted ensures serializability, i.e., there is a very simple scheduler!
However, if some transaction is not well-formed or two-phased, then there is some schedule in which conflicting locks are never granted but which fails to be serializable, i.e., one bad apple spoils the bunch.

Summary
Transactions are all-or-nothing units of work guaranteed despite concurrency or failures in the system.
Theoretically, the “correct” execution of transactions is serializable (i.e. equivalent to some serial execution).
Practically, this may adversely affect throughput → isolation levels.
With isolation levels, users can specify the level of “incorrectness” they are willing to tolerate.

What to Look for Down the Road
Well, no one really knows the answer to this… But here are some current directions:
  Sensors, networks, and streaming data
  Peer-to-peer meets databases and data integration
  “The Semantic Web”
  Security and privacy, especially as integration becomes more commonplace
  Uncertainty and ranked retrieval
  …
We have lots of research projects at Penn relating to these and other topics.

A Plug for Next Year
CIS 455/555 (Spring): Internet and Web Systems
Focus: building and interconnecting scalable Web servers and services; information retrieval; integration; cloud computing.
Emphasis on implementation, programming-in-the-large, experimentation; need substantial coding experience.
Meanwhile… Best of luck on your projects and exams, and have a wonderful break!
I hope you learned a lot in this course and that it was enjoyable!