CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Name: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Uploaded: 2017-12-16T05:24:52+00:00
Duration: PTM33S58
Channel: Dwayne Robbins
Description: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

CSE Database Concepts Dr. Andy Nagar Spring 2014

A transaction is a sequence of operations that must be executed as a whole.
SAVINGS User $1000 CHECKING User $0 Transfer $500 Debit savings Credit checking Either both (1) and (2) happen or neither! Every DB action takes place inside a transaction.

Transaction support Transaction Logical unit of work on the database.
Chapter Name September 98 Transaction support Transaction Action, or series of actions, carried out by user or application, which reads or updates contents of database. Logical unit of work on the database. Application program is series of transactions with non-database processing in between. Transforms database from one consistent state to another, although consistency may be violated during transaction. ©Pearson Education 2009

We abstract away most of the application code when thinking about transactions.
User’s point of view Transfer $500 Code writer’s point of view Debit savings Credit checking Concurrency control ‘s & recovery’s point of view Read Balance1 Write Balance1 Read Balance2 Write Balance2

Read data item A (typically a tuple)
Schedule: The order of execution of operations of two or more transactions. Schedule S1 Transaction1 Transaction2 R(A) R(C) W(A) R(B) W(C) W(B) Read data item A (typically a tuple) Time

Chapter Name September 98 Example transaction ©Pearson Education 2009

Why do we need transactions?
Transaction 1: Add $100 to account A Transaction 2: Add $200 to account A R(A) W(A) R(A) W(A) Time

What will be the final account balance?
Transaction 1: Add $100 to account A Transaction 2: Add $200 to account A R(A) W(A) R(A) W(A) Time The Lost Update Problem

Chapter Name September 98 Lost update problem Successfully completed update is overridden by another user. T1 withdrawing $10 from an account with balx, initially $100. T2 depositing $100 into same account. Serially, final balance would be $190. ©Pearson Education 2009

Chapter Name September 98 Lost update problem Loss of T2’s update avoided by preventing T1 from reading balx until after update. ©Pearson Education 2009

What will be the final account balance?
Transaction 1: Add $100 to account A Transaction 2: Add $200 to account A R(A) W(A) F A I L R(A) W(A) Time Dirty reads cause problems.

Abort or roll back are the official words for “fail”.
Commit All your writes will definitely absolutely be recorded and will not be undone, and all the values you read are committed too. Abort/rollback Undo all of your writes! We’ll give a more technical definition later on.

The concurrent execution of transactions must be such that each transaction appears to execute in isolation.

The ACID properties: Atomic Consistency Preservation Isolation
A transaction is performed entirely or not at all Consistency Preservation A transaction’s execution takes the database from one correct state to another Isolation The updates of a transaction must not be made visible to other transactions until it is committed Durability If a transaction changes the database and is committed, the changes must never be lost because of subsequent failure

Serial Schedule: the transactions are performed one after the other.
Read(A) Write(A) Read(B) Write(B) Read(C) Write(C) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(C) Write(C) Read(B)

A schedule is serializable if it is guaranteed to give the same final result as some serial schedule. Which of these are serializable? Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B)

Why is that last schedule ok?
Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) And this will never cause an interleaving problem Will always give the same result as

Concurrency control protocols
Chapter Name September 98 Concurrency control protocols Two basic concurrency control techniques: Locking, Timestamping. Both are conservative approaches: delay transactions in case they conflict with other transactions. Optimistic methods assume conflict is rare and only check for conflicts at commit. ©Pearson Education 2009

Chapter Name September 98 Locking methods Transaction uses locks to deny access to other transactions and so prevent incorrect updates. Most widely used approach to ensure serializability. Generally, a transaction must claim a shared (read) or exclusive (write) lock on a data item before read or write. Lock prevents another transaction from modifying item or even reading it, in the case of a write lock. ©Pearson Education 2009

Chapter Name September 98 Locking – basic rules If transaction has shared lock on item, can read but not update item. If transaction has exclusive lock on item, can both read and update item. Reads cannot conflict, so more than one transaction can hold shared locks simultaneously on same item. Exclusive lock gives transaction exclusive access to that item. ©Pearson Education 2009

Incorrect locking schedule
Chapter Name September 98 Incorrect locking schedule ©Pearson Education 2009

Incorrect locking schedule
Chapter Name September 98 Incorrect locking schedule If at start, balx = 100, baly = 400, result should be: balx = 220, baly = 330, if T7 executes before T8, or balx = 210, baly = 340, if T8 executes before T7. However, result gives balx = 220 and baly = 340. S is not a serializable schedule. Problem is that transactions release locks too soon, resulting in loss of total isolation and atomicity. To guarantee serializability, need an additional protocol concerning the positioning of lock and unlock operations in every transaction. ©Pearson Education 2009

Two- phase locking Two phases for transaction:
Chapter Name September 98 Two- phase locking Transaction follows 2PL protocol if all locking operations precede first unlock operation in the transaction. Two phases for transaction: Growing phase - acquires all locks but cannot release any locks. Shrinking phase - releases locks but cannot acquire any new locks. ©Pearson Education 2009

Chapter Name September 98 Incorrect Schedule T1 releases its lock too soon. For 2PL, it needs to continue to keep its lock until all operations are over. ©Pearson Education 2009

Conflicting Operations
Chapter Name September 98 Conflicting Operations ©Pearson Education 2009

Certain pairs of operations conflict with each other.
Write(A) Write(A) Write(A) Read(A) Write(A) Read(A) Write(A) Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B)

Let’s clean up by collapsing all the nodes for a transaction into one big node.
Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Blue Black Blue Black Blue Black

Let’s clean up by collapsing all the nodes for a transaction into one big node.
A schedule is conflict serializable iff its dependency graph is acyclic. Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Read(A) Write(A) Read(B) Write(B) Blue Black Blue Black Blue Black

Is this schedule conflict serializable?
T T T3 T T T3 R1(A) R2(C) W2(C) W1(A) R1(C) W1(C) R3(C) W3(C) R2(B) W2(B) R2(C) W2(C) R2(B) W2(B) R1(A) W1(A) R1(C) W1(C) R3(C) W3(C) Schedule S2 Schedule S1

What serial schedule is S1 equivalent to, if any?
R(A) W(A) R(C) W(C) R)C( W(B) R(B) W(B) R(A) W(A) R(B) R(C) W(C) R)C(

Nothing What serial schedule is S1 equivalent to, if any? S1 R(A) W(A)
R(C) W(C) R)A( Nothing

Locks are the basis of most protocols to guarantee serializability.
Before you can read (write) item A, you must get a read (write) lock on A from the lock manager, which you eventually give up. Shared lock = read lock: many transactions can hold a shared lock on the same item at the same time. Exclusive lock = write lock: only one transaction can hold an exclusive lock on an item at any given time.

The lock manager makes decisions based on its compatibility matrix
Process T1 X S X N N Process T2 S N Y

Lock management keeps a hash table of current locks
Entry in the lock table # trans who have it locked Lock type (S, X) Queue of requests Data item ID Trans 109 has lock Trans 448 wants X lock Trans 599 wants S lock Trans 47 has lock Queue to lock a particular data item Locking and unlocking must be atomic: ensured by semaphores

2-Phase Locking (2PL): no new locks once you’ve given one up.
Two phases: getting locks (growing), then releasing locks (shrinking). At least one end of each arrow is a Write T1 T2 R/W(A) R/W(B) R/W(C) T3

2PL doesn’t solve every potential problem.
T T2 We should never have let T1 commit. W(B) R(A) W(A) R(B) T1 commits Cascading rollback Now T2 aborts!

How do we deal with this? Commit trans T only after all transactions that wrote data that T read have committed Or only let a transaction read an item after the transaction that last wrote this item has committed Strict 2PL: 2PL + a transaction releases its locks only after it has committed. How does Strict 2PL prevent cascading rollback?

Locks can also lead to deadlocks
Deadlock = circular wait among transactions waiting for each other to release a lock. Prevent: every transaction must lock all items it needs in advance (ha!) Or detect Livelock: a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally. Fair waiting schemes (i.e., first-come-first-served) for locks T1 T2

Edge from Ti to Tj iff Ti waits for Tj to release a lock
Detect deadlocks by timeouts, or occasionally looking for cycles in a waits-for graph. T1: S(A) S(D)…………..S(B) T2: …………………X(B) ……………………………………X(C) T3: …………………………………. S(D) S(C) ………………... X(A) T4: …………………………………………………… X(B) T1 T2 T4 T3 Edge from Ti to Tj iff Ti waits for Tj to release a lock

Even S2PL does not solve the phantom problem.
Schedule Find the oldest CS undergrad Elaine Undergrad Major Age Alice CS 26 Bob ECE 36 Carla Math 45 David Psych 52 Elaine 67 Add Frank, a 71-year-old CS major; commit. Frank CS B-tree on Major Frank Non-repeatable read! Would never happen in serial execution.

Solution: lock the index used to reach the tuples.
Put a lock here. CS Else must lock the whole table to prevent new records with major = CS being added.

How do you lock an index anyway?
E.g., lock a B-tree node Different locking protocol (not 2PL, S2PL) designed specifically for traversing trees. Why?

Locks aren’t the only way to guarantee serializability
Locking is good when conflicts are likely Timestamp-based approaches are possible Nice if conflicts are rare (e.g., working in parallel as a team on separate data) Otherwise, lots of transactions will abort Details in the book

What exactly are we locking, anyway?
The most common data lock granularities: Record Page Table If you access a small number of records, the best locking granularity is one record. If you access many records of the same table, best to have coarser granularity locks. Because you need to rearrange it when you add/delete/expand/contract tuples

Real DBMS lock matrix: ~ 20 x 20
N N S N Y

Example extra lock mode: increment
Increments and decrements commute with each other. Very useful for banks, but requires atomic read+update. T1: x=R(A), W(A=x+1), z=R(A), W(z=z+1) T2: y=R(A), W(A=y-1) Why bother with this at all???

Intension locks are very common
Scan through the table, intending to modify a few tuples. What kind of lock to put on the table? Has to be X (if we only have S or X). But, blocks all other read requests!

Intention locks IS, IX increase concurrency.
E.g., the whole relation E.g., a tuple Before S locking an item, must IS lock the root. Before X locking an item, must IX lock the root. Should make sure: If T1 S-locks a node, no T2 can X-lock an ancestor. Achieved if S conflicts with IX. If T1 X-locks a node, no T2 can S- or X-lock an ancestor. Achieved if X conflicts with IS and IX.

Allowed Sharings -- IS IX S X -- Ö Ö Ö Ö Ö IS Ö Ö Ö Ö IX Ö Ö Ö S Ö Ö Ö

Preventing lost update problem
Chapter Name September 98 Preventing lost update problem 56 ©Pearson Education 2009 56

Preventing uncommitted dependency problem
Chapter Name September 98 Preventing uncommitted dependency problem 57 ©Pearson Education 2009 57

Preventing inconsistent analysis problem
Chapter Name September 98 Preventing inconsistent analysis problem 58 ©Pearson Education 2009 58

Chapter Name September 98 Deadlock An impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released. ©Pearson Education 2009

Chapter Name September 98 Deadlock Only one way to break deadlock: abort one or more of the transactions. Deadlock should be transparent to user, so DBMS should restart transaction(s). Three general techniques for handling deadlock: Timeouts. Deadlock prevention. Deadlock detection and recovery. ©Pearson Education 2009

Chapter Name September 98 Timestamp methods Transactions ordered globally so that older transactions, transactions with smaller timestamps, get priority in the event of conflict. Conflict is resolved by rolling back and restarting transaction. No locks so no deadlock. ©Pearson Education 2009

Timestamp methods Timestamp
Chapter Name September 98 Timestamp methods Timestamp A unique identifier created by DBMS that indicates relative starting time of a transaction. Can be generated by using system clock at time transaction started, or by incrementing a logical counter every time a new transaction starts. Read/write proceeds only if last update on that data item was carried out by an older transaction. Otherwise, transaction requesting read/write is restarted and given a new timestamp. 62 ©Pearson Education 2009 62

Chapter Name September 98 Optimistic methods Based on assumption that conflict is rare and more efficient to let transactions proceed without delays to ensure serializability. At commit, check is made to determine whether conflict has occurred. If there is a conflict, transaction must be rolled back and restarted. Potentially allows greater concurrency than traditional protocols. ©Pearson Education 2009

Optimistic methods Three phases:
Chapter Name September 98 Optimistic methods Three phases: Read: from start to just before commit; read from database into local variables and update local data. Validation: check to ensure serializability is not violated; if violated transaction aborted and restarted. Write (for update transactions): updates made to local variables then applied to database. ©Pearson Education 2009

Chapter Name September 98 Recovery Process of restoring database to a correct state in the event of a failure. Need for Recovery Control Two types of storage: volatile (main memory) and nonvolatile. Volatile storage does not survive system crashes. Stable storage represents information that has been replicated in several nonvolatile storage media with independent failure modes. ©Pearson Education 2009

Types of Failure System crashes, resulting in loss of main memory.
Chapter Name September 98 Types of Failure System crashes, resulting in loss of main memory. Media failures, resulting in loss of parts of secondary storage. Application software errors. Natural physical disasters. Carelessness or unintentional destruction of data or facilities. Sabotage. ©Pearson Education 2009

Transactions and recovery
Chapter Name September 98 Transactions and recovery Transactions represent basic unit of recovery. Recovery manager responsible for atomicity and durability. If failure occurs between commit and database buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) transaction’s updates. If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity. Partial undo - only one transaction has to be undone. Global undo - all transactions have to be undone. ©Pearson Education 2009

Transactions and recovery
Chapter Name September 98 Transactions and recovery DBMS starts at time t0, but fails at time tf. Assume data for transactions T2 and T3 have been written to secondary storage. T1 and T6 have to be undone. In absence of any other information, recovery manager has to redo T2, T3, T4, and T5. ©Pearson Education 2009

Chapter Name September 98 Recovery facilities DBMS should provide following facilities to assist with recovery: Backup mechanism, which makes periodic backup copies of database. Logging facilities, which keep track of current state of transactions and database changes. Checkpoint facility, which enables updates to database in progress to be made permanent. Recovery manager, which allows DBMS to restore database to consistent state following a failure. ©Pearson Education 2009

Log file Contains information about all updates to database:
Chapter Name September 98 Log file Contains information about all updates to database: Transaction records. Checkpoint records. Often used for other purposes (for example, auditing). ©Pearson Education 2009

Log file Transaction records contain: After-image of data item.
Chapter Name September 98 Log file Transaction records contain: Transaction identifier. Type of log record, (transaction start, insert, update, delete, abort, commit). Identifier of data item affected by database action (insert, delete, and update operations). Before-image of data item. After-image of data item. Log management information. 71 ©Pearson Education 2009 71

Log file Log file may be duplexed or triplexed.
Chapter Name September 98 Log file Log file may be duplexed or triplexed. Log file sometimes split into two separate random-access files. Potential bottleneck; critical in determining overall performance. 73 ©Pearson Education 2009 73

Checkpointing Checkpoint
Chapter Name September 98 Checkpointing Checkpoint Point of synchronization between database and log file. All buffers are force-written to secondary storage. Checkpoint record is created containing identifiers of all active transactions. When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash. 74 ©Pearson Education 2009 74

Chapter Name September 98 Checkpointing In previous example, with checkpoint at time tc, changes made by T2 and T3 have been written to secondary storage. Thus: only redo T4 and T5, undo transactions T1 and T6. 75 ©Pearson Education 2009 75

Recovery techniques If database has been damaged:
Chapter Name September 98 Recovery techniques If database has been damaged: Need to restore last backup copy of database and reapply updates of committed transactions using log file. If database is only inconsistent: Need to undo changes that caused inconsistency. May also need to redo some transactions to ensure updates reach secondary storage. Do not need backup, but can restore database using before- and after-images in the log file. 76 ©Pearson Education 2009 76

Immediate update Updates are applied to database as they occur.
Chapter Name September 98 Immediate update Updates are applied to database as they occur. Need to redo updates of committed transactions following a failure. May need to undo effects of transactions that had not committed at time of failure. Essential that log records are written before write to database. Write-ahead log protocol. 77 ©Pearson Education 2009 77

Chapter Name September 98 Immediate update If no “transaction commit” record in log, then that transaction was active at failure and must be undone. Undo operations are performed in reverse order in which they were written to log. 78 ©Pearson Education 2009 78

CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Similar presentations

Presentation on theme: "CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Similar presentations

Presentation on theme: "CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014."— Presentation transcript:

Similar presentations

About project

Feedback