Transactions -Fehily book - chap. 14 - Mannino book - chap 15 (up to 15.2) Prof. Yitz Rosenthal.


ACID There are 4 terms used in conjunction with Transaction –Atomic –Consistent –Isolated –Durable Acronym: ACID

ATOMICITY and DURABILITY

Atomicity Operations in a transaction will be processed as a single unit Either : –ALL of the operations will happen OR –NONE of the operations will happen

Durability Once a transaction is completed, you are GUARANTEED that the data will be stored in the underlying database files correctly EVEN IF THERE IS SOME UNFORESEEN CATASTROPHIC EVENT (e.g. a Power Outage)

EXAMPLES

Examples Travel Agent –Booking a departing and return flight as one purchase. –You don't want to book the departing flight if there is a problem booking the associated return flight at the same time. –EVEN MORE SO: You don't want to book the return flight if you can't book the departing flight at the same time. Banking ATM –Transferring money from a savings account to a checking account. This involves debiting the savings account and crediting the checking account. –You don't want to debit the savings account unless you can also credit the checking account –You also don't want to credit the checking account if there was a problem debiting the savings account.

What can go wrong WITHOUT TRANSACTIONS Imagine... Banking example: –Step 1: Person at ATM requests to do a transfer and presses the "OK" button on ATM. –Step 2: DBMS performs the debit of the savings account and writes the new amount to the database files. –Step 3: ***** POWER OUTAGE ***** (computer goes down) –Step 4: When computer is rebooted, the savings account was debited but the checking account was never credited.

Other types of failures
*** Power Outage *** is only one type of failure that can happen to a transaction. Other types of failures:
–Program-detected failure: after debiting savings, the program queries checking and notices that the balance in the checking account is somehow negative. The program voluntarily stops the transfer since something is fishy, and issues a ROLLBACK command (see next few slides) to undo the modifications made to the DB so far.
–Abnormal program termination: caused by a programming bug (e.g. division by zero might cause an unexpected crash of the program between debiting and crediting). The program never COMMITs, the transaction times out, and the DBMS automatically rolls back the transaction.
–System failure: e.g. a power glitch causes a reboot of the server.
–Device failure: e.g. the hard drive that contains the database files crashes. If the transaction log is kept on a different hard drive and an earlier copy of the database is backed up somewhere else, the current version of the database can be recreated from the log file.

Buffered Writes

Delayed (or buffered) writes –Writes to database tables are not written to disk immediately. –When an application writes to a DB table (e.g. insert, update, delete) the DBMS stores the information in memory buffers. –The information may actually be written to disk only much, much later.

Durability Data is DURABLE –Transactions GUARANTEE that if a system failure (e.g. power outage) occurs after a transaction is committed, the database will be able to be restored to reflect the changes made by the transaction even if the underlying table data was not written to the database file. (we'll see how soon).

Disks are SLOW, Memory is FAST Why are writes buffered? –Memory is MUCH, MUCH faster than a disk drive. –The reason for this has to do with how disk drives work (see next slides).

Tracks, Sectors & Clusters Disk drives are segmented into –tracks (concentric circles) and –sectors (pie slices) Each track and each sector has an identifying number. A cluster is a particular area of the disk corresponding to a specific track and a specific sector.

Cluster Size Every cluster on the disk stores the same amount of data. The amount of data stored in a cluster is known as the cluster size. Cluster sizes are usually powers of two: –Example cluster sizes for different disk drives: 512 bytes, 1024 bytes, 4096 bytes, etc.

Reads and Writes Every read and write to a disk drive will read or write an entire cluster at a time. There is NO WAY for a disk drive to read or write only part of a cluster. Therefore - PHYSICALLY, IT TAKES JUST AS MUCH TIME TO READ OR WRITE 512 BYTES ON A DISK DRIVE (if cluster size is 512 bytes) AS IT DOES TO READ OR WRITE JUST ONE BYTE.

Logical Records and Physical Records A physical record corresponds to the data in a single disk cluster. A logical record corresponds to the data from a single record in a particular table. Logical records for a specific table are all the same size. (VARCHAR and VARBINARY data are not stored in the logical record)

Storage of logical records in physical records If the logical record size is smaller than the physical record size (i.e. cluster size) then multiple logical records are stored in a single physical record. If the logical record size is LARGER than the physical record size then a single logical record will need to be split between 2 or more physical records.
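The arithmetic behind the slide above can be sketched as follows. This is a minimal illustration, not something from the slides: the record sizes (100 and 10000 bytes) are made-up values, and the sketch assumes that small records never straddle a cluster boundary and that a large record starts at the beginning of a cluster.

```python
def records_per_cluster(logical_size: int, cluster_size: int) -> int:
    """Whole logical records that fit in one physical record (cluster)."""
    return cluster_size // logical_size


def clusters_per_record(logical_size: int, cluster_size: int) -> int:
    """Physical records needed to hold one logical record (rounded up)."""
    return -(-logical_size // cluster_size)  # integer ceiling division


small = records_per_cluster(100, 4096)      # 100-byte rows, 4096-byte clusters
large = clusters_per_record(10000, 4096)    # one 10000-byte row
print(small, large)  # 40 3
```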

Not enough memory to hold everything The amount of memory available to the DBMS is generally NOT as large as the amount of disk space available. –Memory is much more expensive than disk space. –Hardware limitations limit amount of memory that can be placed on one machine.

Memory Buffers The DBMS creates memory buffers that are the same size as the disk clusters. When the DBMS reads information from a cluster, it copies that information to an in-memory buffer which is the same size as the cluster. This is known as a memory "page".

Paging What is "paging"? –There generally are NOT ENOUGH memory pages to store the whole database. –When the DBMS needs to access data that is not currently in memory, the DBMS picks an in-memory page that is not being used and writes it to disk. –The DBMS then reads the desired data from the disk into the now available memory buffer.

Checkpoints What is a "checkpoint"? –Once in a while, the DBMS ensures that the latest copy of every page is on the disk. –This is known as a "checkpoint". –Checkpoints are necessary for the log mechanism to work correctly.
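Paging and checkpoints, as described in the slides above, can be sketched with a toy buffer pool. Everything here is an assumption made for illustration: the class name BufferPool, the use of a dict as a stand-in for the disk, and the choice of "least recently used" as the meaning of "a page that is not being used". A real DBMS is far more involved.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool: a few in-memory pages in front of a pretend disk."""

    def __init__(self, disk: dict, capacity: int):
        self.disk = disk              # cluster number -> data (stand-in for the disk)
        self.capacity = capacity      # how many pages fit in memory
        self.pages = OrderedDict()    # cluster number -> [data, dirty], oldest first

    def _load(self, cluster):
        if cluster in self.pages:
            self.pages.move_to_end(cluster)          # mark as most recently used
            return self.pages[cluster]
        if len(self.pages) >= self.capacity:
            victim, (data, dirty) = self.pages.popitem(last=False)  # evict oldest page
            if dirty:
                self.disk[victim] = data             # write it back only if it changed
        page = [self.disk[cluster], False]
        self.pages[cluster] = page
        return page

    def read(self, cluster):
        return self._load(cluster)[0]

    def write(self, cluster, data):
        page = self._load(cluster)
        page[0], page[1] = data, True                # buffered: the disk is not touched yet

    def checkpoint(self):
        """Make sure the latest copy of every page is on the disk."""
        for cluster, page in self.pages.items():
            if page[1]:
                self.disk[cluster] = page[0]
                page[1] = False


disk = {1: "a", 2: "b", 3: "c"}
pool = BufferPool(disk, capacity=2)
pool.write(1, "A")
before_eviction = disk[1]     # still the old value: the write is only in memory
pool.read(2)
pool.read(3)                  # no free page left, so page 1 is written out and reused
after_eviction = disk[1]
pool.write(3, "C")
pool.checkpoint()
print(before_eviction, after_eviction, disk[3])  # a A C
```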

SQL Commands

BEGIN TRANSACTION –issued before any SQL statements that are part of the transaction COMMIT –issued after all SQL statements that are part of the transaction –Once the COMMIT statement is executed you are guaranteed that the data is permanently in the database even if unforeseen errors happen.

Example of transaction
BEGIN TRANSACTION
UPDATE SAVINGS_ACCOUNTS SET BALANCE = BALANCE - … WHERE ACCOUNT_NUMBER = 12345;
UPDATE CHECKING_ACCOUNTS SET BALANCE = BALANCE + … WHERE ACCOUNT_NUMBER = …;
COMMIT TRANSACTION
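A runnable version of the transfer might look like the sketch below. It uses Python's built-in sqlite3 module rather than the DBMS the slides target, so the statement spelling differs slightly (BEGIN / COMMIT). The transfer amount (100), the checking account number (67890), and the starting balances are made-up values.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and COMMIT below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE savings_accounts (account_number INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE checking_accounts (account_number INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO savings_accounts VALUES (12345, 500)")
conn.execute("INSERT INTO checking_accounts VALUES (67890, 50)")

conn.execute("BEGIN")
conn.execute("UPDATE savings_accounts SET balance = balance - 100 WHERE account_number = 12345")
conn.execute("UPDATE checking_accounts SET balance = balance + 100 WHERE account_number = 67890")
conn.execute("COMMIT")

savings = conn.execute("SELECT balance FROM savings_accounts").fetchone()[0]
checking = conn.execute("SELECT balance FROM checking_accounts").fetchone()[0]
print(savings, checking)  # 400 150
```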

Other SQL commands: ROLLBACK –A ROLLBACK command forces whatever was done in the transaction so far to become "undone". Similar to the "undo" command on your word processor. –This is used both with "stored procedures" and application programs that interact with the database. –When the program encounters a condition, after it started processing the transaction, that requires undoing the transaction, the "stored procedure" or the application can issue the ROLLBACK command. –Example: if there is a transmission error in a distributed database, the application program can ROLLBACK a transaction once it is started.
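A program-issued ROLLBACK can be sketched as below, again with Python's sqlite3 module. The table layout, the transfer function, and the failure condition (the savings balance going negative) are hypothetical choices for illustration only.

```python
import sqlite3


def transfer(conn, amount):
    """Move `amount` from savings to checking; undo everything if savings goes negative."""
    conn.execute("BEGIN")
    conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'savings'", (amount,))
    (savings,) = conn.execute("SELECT balance FROM accounts WHERE name = 'savings'").fetchone()
    if savings < 0:
        conn.execute("ROLLBACK")   # the debit above is undone
        return False
    conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'checking'", (amount,))
    conn.execute("COMMIT")
    return True


conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT/ROLLBACK only
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("savings", 500), ("checking", 50)])

ok = transfer(conn, 100)         # succeeds
rejected = transfer(conn, 1000)  # would overdraw savings, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, rejected, balances)
```

After the rejected transfer the balances are exactly what the first transfer left behind, as though the second one never started.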

SAVE POINTS Additional SQL commands –SAVE TRAN mysavepoint1 allows you to break up a long transaction into several parts. –You can create several savepoints. –Each savepoint is given a unique name (e.g. mysavepoint1, mysavepoint2, etc.) –At any point the program can issue a ROLLBACK TRAN mysavepoint1 command to roll back the transaction to the specified savepoint and then continue on from there.
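Savepoints can be tried out with Python's sqlite3 module. Note that SQLite spells the commands SAVEPOINT name and ROLLBACK TO name, not SAVE TRAN and ROLLBACK TRAN as on the slide; the table and row labels are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # we control the transaction ourselves
conn.execute("CREATE TABLE steps (label TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO steps VALUES ('part 1')")
conn.execute("SAVEPOINT mysavepoint1")
conn.execute("INSERT INTO steps VALUES ('part 2')")
conn.execute("ROLLBACK TO mysavepoint1")        # undoes only 'part 2'; 'part 1' is kept
conn.execute("INSERT INTO steps VALUES ('part 2, second attempt')")
conn.execute("COMMIT")

labels = [row[0] for row in conn.execute("SELECT label FROM steps ORDER BY rowid")]
print(labels)  # ['part 1', 'part 2, second attempt']
```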

LOG File

Transaction ID Many transactions can be executing simultaneously. Actions from Trans1 are often interspersed with actions from Trans2. Therefore, each transaction is assigned a unique ID by the DBMS.

LOG File All changes to the database are recorded both –in the underlying DB tables AND –in a TRANSACTION LOG FILE

Buffered Writes Information written to the DB tables can be buffered to enhance performance. Information is not necessarily written to permanent storage (i.e. the disk drive) when the

LOG File 3 types of records in the LOG file –begin record –commit record –detail record 4th type (we'll discuss later) –rollback record

Log BEGIN & COMMIT Records BEGIN record contains –transaction id COMMIT record contains –transaction id

Log DETAIL record There can be many DETAIL records for each transaction. Each DETAIL record contains –transaction id –action (insert, update, delete) –row id (used to uniquely identify the row in the table) –old & new values (AKA before image & after image)

LOG File Implemented as a Table The LOG FILE is often implemented as a special "hidden" database table, not available to users. In this case each row in the table needs a sequence number to indicate the order in which records were written to the LOG
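The log records described in the last few slides could be modelled roughly like this. The class name LogRecord and its field names are invented for the sketch; the slides only list what each record contains.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class LogRecord:
    seq: int                        # order in which the record was written to the log
    txn_id: int                     # transaction id (present on every record)
    kind: str                       # "BEGIN", "DETAIL", "COMMIT" or "ROLLBACK"
    action: Optional[str] = None    # "insert", "update" or "delete" (DETAIL only)
    row_id: Optional[int] = None    # identifies the row in the table (DETAIL only)
    before: Any = None              # before image (DETAIL only)
    after: Any = None               # after image (DETAIL only)


log = [
    LogRecord(1, 7, "BEGIN"),
    LogRecord(2, 7, "DETAIL", action="update", row_id=12345, before=500, after=400),
    LogRecord(3, 7, "COMMIT"),
]
details = [r for r in log if r.kind == "DETAIL"]
print(details[0].before, details[0].after)  # 500 400
```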

Database Recovery

Transactions to the rescue
–Step 1: Person at ATM requests to do a transfer and presses the "OK" button on ATM.
–Step 2: DBMS writes a record to the LOG file indicating the changes to be made to the savings_account table.
–Step 3: DBMS writes new amount to the savings_account record.
–Step 4: ***** POWER OUTAGE ***** (computer goes down)
–Step 5: When computer is rebooted and DBMS server software is restarted, the recovery subsystem in the DBMS software attempts to "recover" the database (this generally happens automatically: Recovery Transparency). The recovery subsystem looks through the LOG file and backs out any changes to the database made by any transaction for which there is no COMMIT record. To do so, the recovery subsystem must make sure that the value in the savings_account record is equal to the "before image" of the record.
–Step 6: Database is restored as though no transfer ever happened.
–Step 7: Database comes online for regular processing.
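The "back out" step of recovery can be sketched in a few lines. This is a simplified illustration under stated assumptions: the log is a plain list of tuples (txn id, kind, row id, before image, after image), the table is a dict, and the row ids and balances are made up.

```python
def recover(table: dict, log: list) -> dict:
    """Back out changes made by transactions that have no COMMIT record."""
    committed = {txn for (txn, kind, *_) in log if kind == "COMMIT"}
    for txn, kind, *rest in reversed(log):          # newest change first
        if kind == "DETAIL" and txn not in committed:
            row_id, before, _after = rest
            if before is None:
                table.pop(row_id, None)             # undo an insert
            else:
                table[row_id] = before              # restore the before image
    return table


log = [
    (1, "BEGIN"),
    (1, "DETAIL", "savings:12345", 500, 400),   # debit was logged and written...
    (2, "BEGIN"),
    (2, "DETAIL", "savings:777", 90, 80),
    (2, "COMMIT"),
    # ...then the power went out: transaction 1 has no COMMIT record
]
table_on_disk = {"savings:12345": 400, "savings:777": 80}
recovered = recover(table_on_disk, log)
print(recovered)  # {'savings:12345': 500, 'savings:777': 80}
```

Setting the row to its before image is harmless if the new value never reached the table file, so the same routine works whether the outage hit before or after the table write.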

Other scenarios for discussion... Outage happened before record was written to savings_account table file

Database BACKUPs

Backing up a DB DBAs should maintain backups of their entire database in case something catastrophic happens

Two types of backups 2 types of backups –FULL backup –INCREMENTAL backup

FULL backup COLD BACKUP –A FULL backup on a database that is not active requires backing up only the database files (tables, etc.) HOT BACKUP –For 24X7 applications it is often impossible to perform a COLD backup. –A HOT backup requires backup of BOTH database files (i.e. tables, etc) AND LOG files

INCREMENTAL BACKUPS In very large databases, it is often prohibitive to back up the entire set of DB files (i.e. tables, etc.) on a regular basis. Instead, a single backup of the DB files can be done at one time. After that, backups of the LOG files can be done.

ROLL FORWARD To restore a database that was backed up INCREMENTALLY, the DBA uses a tool to restore the DB. The log files are used to "ROLL FORWARD" the changes that were made to the underlying DB since the backup of the table files.
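Rolling forward can be sketched as replaying the after images of committed transactions on top of the older backup. As an assumption for this sketch, the log is a list of tuples (txn id, kind, row id, before image, after image) and the backup is a dict; the row ids and amounts are made up.

```python
def roll_forward(backup: dict, log: list) -> dict:
    """Rebuild the current database from an older backup plus the log written since."""
    committed = {txn for (txn, kind, *_) in log if kind == "COMMIT"}
    db = dict(backup)                                # leave the backup itself untouched
    for txn, kind, *rest in log:                     # oldest change first
        if kind == "DETAIL" and txn in committed:
            row_id, _before, after = rest
            if after is None:
                db.pop(row_id, None)                 # replay a delete
            else:
                db[row_id] = after                   # replay an insert or update
    return db


backup = {"savings:12345": 500, "checking:67890": 50}
log_since_backup = [
    (1, "BEGIN"),
    (1, "DETAIL", "savings:12345", 500, 400),
    (1, "DETAIL", "checking:67890", 50, 150),
    (1, "COMMIT"),
    (2, "BEGIN"),
    (2, "DETAIL", "savings:12345", 400, 0),          # never committed, so not replayed
]
restored = roll_forward(backup, log_since_backup)
print(restored)  # {'savings:12345': 400, 'checking:67890': 150}
```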

CONSISTENCY

Transactions always operate on a consistent view of the data and when they end always leave the data in a consistent state. Data may be said to be consistent as long as it conforms to a set of invariants, such as no two rows in the customer table have the same customer id and all orders have an associated customer row. While a transaction executes these invariants may be violated, but no other transaction will be allowed to see these inconsistencies, and all such inconsistencies will have been eliminated by the time the transaction ends.

ISOLATION (and concurrency)

Concurrency and Isolation Concurrency: –In a multi-user database, several programs are working against the database at the same time. Transactions must guarantee that each program "sees" a consistent view of the underlying data without interference from the other programs.

Types of problems if there is no Isolation
–lost updates: 2 transactions trying to update same value
–uncommitted dependency (AKA dirty read): 1 transaction reads data written by a 2nd transaction before the 2nd transaction commits, then the 2nd transaction does a ROLLBACK
–inconsistent retrievals:
  incorrect summary (includes some changed records and some unchanged records)
  phantom read: TR1 selects some records; TR2 writes some data that would have been retrieved by TR1's query; TR1 runs the same query again, expecting same results, but gets different results.
  nonrepeatable read: TR1 reads a value; TR2 changes the value; TR1 reads same value again
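The lost update problem is easy to reproduce by hand. The sketch below is not real concurrency: it simply writes out one bad interleaving of two made-up transactions (a deposit of 50 and a withdrawal of 30) step by step and compares it with running them one after the other.

```python
# Without isolation: both transactions read the balance before either writes.
balance = 100
t1_read = balance            # T1 reads 100
t2_read = balance            # T2 reads 100
balance = t1_read + 50       # T1 deposits 50 and writes 150
balance = t2_read - 30       # T2 withdraws 30 and writes 70, overwriting T1's update
lost_update_result = balance

# Sequential execution of the same two transactions.
balance = 100
balance = balance + 50       # T1 runs to completion
balance = balance - 30       # then T2 runs
serial_result = balance

print(lost_update_result, serial_result)  # 70 120
```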

What can go wrong: EXAMPLE Examples: –see diagrams on pages in Mannino

Concurrency Transparency Isolation is AUTOMATICALLY enforced by the DBMS. The application programmer and the DBA do not need to do anything other than start and commit the transactions. This is known as "Concurrency transparency".

How does DBMS enforce Isolation?

Simple method : Sequential Execution Isolation via Sequential Execution –DBMS can wait to perform a transaction until all other transactions in the system have been committed. –This would cause VERY BAD performance for a multi-user application. –The goal of isolation is to make it look to the user like the DBMS is doing sequential execution.

Term: Transaction Throughput Transaction Throughput –The number of transactions a DBMS can perform per unit time –(more is better)

Motivation for studying ISOLATION

Why Do DBAs need to understand concurrency? Concurrency control adds overhead to DBMS processing. Transactions can be structured to minimize or maximize the amount of work the DBMS needs to do.

Locking

Granularity of locks Coarsest to finest –database lock (entire database) –table locks –row locks –field locks Other types of locks –page locks (i.e. physical records on disk or pages in memory) –index locks

Locking and Efficiency Coarse locks improve overall performance but can cause individual transactions to wait a long time Fine locks improve perception among users but can decrease overall performance

Lock promotion DBMS "concurrency control manager" may automatically "promote" a lock to a coarser grained lock if it determines that would greatly improve efficiency.

DEADLOCK Deadlock –example: trying to reserve a seat on each leg of a two leg journey –(speak this out)

Deadlock recovery –DBMS chooses one of the deadlocked transactions and automatically does a ROLLBACK on it. –Other transaction(s) can then proceed. Deadlock detection vs. Timeouts –Deadlock detection algorithms are expensive to implement. –DBMS often uses timeouts to determine which transactions are deadlocked. –Timeout values should be chosen appropriately for the application. –In general, transactions should BE SHORT LIVED.
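One common way to detect a deadlock is to look for a cycle in a "wait-for" graph, where an edge from T1 to T2 means T1 is waiting for a lock that T2 holds. The slides do not name a specific algorithm, so treat this as one possible sketch; the function name and the transaction names are invented.

```python
def find_deadlock(waits_for: dict):
    """Return the transactions that form a cycle in the wait-for graph, or None."""

    def visit(txn, path):
        if txn in path:
            return path[path.index(txn):]        # came back to a transaction on the path
        for other in waits_for.get(txn, ()):
            cycle = visit(other, path + [txn])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# T1 waits for T2 and T2 waits for T1 (e.g. each holds one leg of a two-leg journey).
deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T1"], "T3": ["T1"]})
no_deadlock = find_deadlock({"T1": ["T2"], "T2": []})
print(deadlocked, no_deadlock)  # ['T1', 'T2'] None
```

The DBMS would then pick one transaction from the returned cycle and roll it back so the others can proceed.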

Types of Locks –Shared lock (AKA read lock) –Exclusive lock (AKA write lock)
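The usual compatibility rule for these two lock types is that many transactions may hold a shared lock on the same item at once, while an exclusive lock cannot coexist with any other lock. A small sketch, with "S" and "X" as assumed abbreviations and can_grant as a made-up helper name:

```python
# (lock already held by another transaction, lock being requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # readers can share
    ("S", "X"): False,   # a writer must wait for readers
    ("X", "S"): False,   # a reader must wait for the writer
    ("X", "X"): False,   # writers never share
}


def can_grant(requested: str, held_by_others: list) -> bool:
    """True if the requested lock is compatible with every lock other transactions hold."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("S", ["S", "S"]), can_grant("X", ["S"]), can_grant("X", []))  # True False True
```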

2 phase locking

–ALL transactions in the database must follow the following rule: A transaction must not acquire any new locks after releasing any lock –This will avoid the "lost updates" problem
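The two-phase rule can be enforced mechanically: once a transaction has released any lock, further lock requests are refused. The class below is a hypothetical sketch that tracks only this rule; it does not implement waiting or lock conflicts between transactions.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self):
        self.held = set()
        self.shrinking = False       # becomes True after the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("two-phase rule violated: lock requested after a release")
        self.held.add(item)

    def release(self, item):
        self.held.discard(item)
        self.shrinking = True


txn = TwoPhaseTransaction()
txn.acquire("savings:12345")         # growing phase
txn.acquire("checking:67890")
txn.release("savings:12345")         # shrinking phase starts here
try:
    txn.acquire("savings:777")
    violated = False
except RuntimeError:
    violated = True
print(violated)  # True
```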

Another modification Hold all exclusive (i.e. write) locks to end of transaction –This will avoid the "uncommitted dependency" problem.

One more modification Hold all shared (i.e. read) locks until end of transaction –This eliminates the following problems: incorrect summary nonrepeatable read phantom read

Optimistic concurrency control Check to see if there is a conflict and do a ROLLBACK if there is. Few conflicts = better performance than locking.
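One common way to "check for a conflict" is a version number on each row: a transaction remembers the version it read and its write is accepted only if the version is unchanged. The slide does not say how the check is done, so this is just one possible sketch with invented names.

```python
class VersionedRow:
    def __init__(self, value):
        self.value = value
        self.version = 0             # bumped on every successful update


def optimistic_update(row, read_version, new_value) -> bool:
    """Apply the update only if nobody changed the row since it was read."""
    if row.version != read_version:
        return False                 # conflict: the caller rolls back and retries
    row.value = new_value
    row.version += 1
    return True


row = VersionedRow(100)
t1_version = row.version             # T1 reads the row
t2_version = row.version             # T2 reads the same row
t1_ok = optimistic_update(row, t1_version, 150)   # T1 writes first
t2_ok = optimistic_update(row, t2_version, 70)    # T2's check fails
print(t1_ok, t2_ok, row.value)  # True False 150
```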

Isolation Levels See chart on p. 558 in Mannino –READ UNCOMMITTED –READ COMMITTED –REPEATABLE READ –SERIALIZABLE Example –SET TRANSACTION ISOLATION LEVEL READ COMMITTED
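As a rough guide, the sketch below records which read problems each level still permits according to the SQL standard's definitions. It is not a reproduction of the Mannino chart, and real products may behave more strictly than the standard requires. The helper name weakest_level_preventing is invented.

```python
LEVELS = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

# Problems each level still allows (SQL standard definitions).
ALLOWED = {
    "READ UNCOMMITTED": {"dirty read", "nonrepeatable read", "phantom read"},
    "READ COMMITTED": {"nonrepeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}


def weakest_level_preventing(problems) -> str:
    """Lowest isolation level at which none of the given problems can occur."""
    for level in LEVELS:
        if not (ALLOWED[level] & set(problems)):
            return level
    return "SERIALIZABLE"


print(weakest_level_preventing({"dirty read"}))    # READ COMMITTED
print(weakest_level_preventing({"phantom read"}))  # SERIALIZABLE
```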

Performance Issues

Store LOG File on different Hard drive

END OF PRESENTATION

CHECKPOINTS

Checkpoints Changes to the underlying tables are not always written out to permanent storage when they happen. Changes can reside in memory (volatile storage) until a "checkpoint" happens. The DBMS will occasionally ensure that all changes to the underlying tables (not the log file) are written out. This is called a checkpoint.

Immediate Update Immediate update –DB writes Table data AFTER log file