CSC 453 Database Systems Lecture


CSC 453 Database Systems Lecture Tanu Malik College of CDM DePaul University

Today Normalization-BCNF Transactions

Normalization Review
CourseID | SemesterID | Course Name
---------|------------|------------
IT101    | 2009-1     | Programming
IT101    | 2009-2     | Programming
IT102    | 2009-1     | Databases
IT102    | 2010-1     | Databases
IT103    | 2009-2     | Web Design

Primary key: (CourseID, SemesterID)
CourseID | SemesterID | TeacherID | TeacherName
---------|------------|-----------|------------
IT101    | 2009-1     | 332       | Mr Jones
IT101    | 2009-2     | 332       | Mr Jones
IT102    | 2009-1     | 495       | Mr Bentley
IT102    | 2010-1     | 332       | Mr Jones
IT103    | 2009-2     | 242       | Mrs Smith

Prime vs Non-Prime Attributes Prime Attribute: Any attribute that is part of some candidate key. Non-Prime Attribute: Any attribute that is not part of any candidate key.

3NF-Normalization Algorithm (3NF Normalization):
Input: Relation R with FDs Fc
Output: 3NF decomposition D of R
D = {}
For every X → Y in Fc, add sub-relation Q = (XY) to D, unless:
  some sub-relation in D already contains all of XY: don't add Q
  some sub-relation(s) S in D is contained in XY: replace S with Q(XY)
If no relation in D contains a key of R, then add a new relation Q(X) on some key X of R

3NF Decomposition R = {A, B, C, D, E, F} F = {AB → CD, C → EF, D → A} Note D → A: a non-superkey (D) determines a prime attribute (A)
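The first question when normalizing this example is which attributes are prime. The following Python sketch computes attribute closures and enumerates candidate keys by brute force; the function names `closure` and `candidate_keys` are illustrative and do not come from the slides, and the exhaustive search is only practical for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    attrs = sorted(relation)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            # A superset of a key already found cannot be minimal.
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys

R = set("ABCDEF")
F = [(set("AB"), set("CD")), (set("C"), set("EF")), (set("D"), set("A"))]
keys = candidate_keys(R, F)
prime = set().union(*keys)
print(sorted("".join(sorted(k)) for k in keys))  # ['AB', 'BD']
print(sorted(prime))                             # ['A', 'B', 'D']
```

With keys AB and BD, the prime attributes are A, B and D, so D → A is tolerated by 3NF while C → EF is not.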

3NF Pay(Employee, Grade, Salary) F = {E → G, E → S, G → S}
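As a rough illustration of the synthesis algorithm applied to Pay, the sketch below assumes the minimal cover has already been computed by hand as Fc = {E → G, G → S} (E → S is implied by the other two) and that E is supplied as a known candidate key. The name `synthesize_3nf` and its parameters are hypothetical, not part of the lecture.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def synthesize_3nf(relation, min_cover, key):
    """3NF synthesis, given a minimal cover and one candidate key of the relation."""
    schemas = []
    for lhs, rhs in min_cover:
        q = frozenset(lhs | rhs)
        if any(q <= s for s in schemas):
            continue  # some sub-relation already contains all of XY
        # Drop sub-relations that are contained in XY, then add Q(XY).
        schemas = [s for s in schemas if not s <= q]
        schemas.append(q)
    # A schema contains a key exactly when its closure is the whole relation.
    if not any(closure(s, min_cover) == set(relation) for s in schemas):
        schemas.append(frozenset(key))
    return schemas

pay = set("EGS")                                   # Employee, Grade, Salary
fc = [(set("E"), set("G")), (set("G"), set("S"))]  # assumed minimal cover
result = synthesize_3nf(pay, fc, key=set("E"))
print(sorted("".join(sorted(s)) for s in result))  # ['EG', 'GS']
```

The result is (Employee, Grade) and (Grade, Salary); no extra key relation is needed because (Employee, Grade) already contains the key E.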

The End Result A collection of relations, each in 3NF. Each relation has a primary key (we are assuming that there is only one candidate key…). Every non-prime attribute in a relation is determined by its entire primary key. No non-prime attribute in a relation is determined by any attributes other than its entire primary key. Information can be reconstructed using joins, and stored in views if desired.

3NF-Checking Order What is the candidate key? Is F in minimal cover? Which FDs violate 3NF? Decompose to 3NF. Is the decomposition lossless? Is it dependency preserving?

Boyce-Codd Normal Form (BCNF): For every non-trivial functional dependency X → A, it must be the case that X is a superkey. “Every determinant must contain a candidate key.” X must be a superkey even if A is a prime attribute.

BCNF example
Pizza | Topping    | Topping Type
------|------------|-------------
1     | mozzarella | cheese
1     | pepperoni  | meat
1     | olives     | vegetable
2     | mozzarella | meat
2     | sausage    | cheese
2     | peppers    | vegetable
A pizza can have exactly 3 types of topping: one type of cheese, one type of meat, one type of vegetable.

Decompose
Pizza | Topping
------|-----------
1     | mozzarella
1     | pepperoni
1     | olives
2     | mozzarella
2     | sausage
2     | peppers

Topping    | Topping Type
-----------|-------------
mozzarella | cheese
pepperoni  | meat
olives     | vegetable
mozzarella | meat
sausage    | meat
peppers    | vegetable

Example 3NF vs BCNF R(Client, Office, Account) FDs: Client, Office → Client, Office, Account; Account → Office [Sample rows for clients Joe, Mary and John were shown on the slide but are not recoverable from the transcript]

BCNF Decomposition Input: A universal relation R and a set of functional dependencies F on R Output: A decomposition D of R into BCNF schemas with nonadditive join Algorithm on next page Algorithm does not guarantee dependency preservation

BCNF Decomposition Algorithm
ALGORITHM BCNF (R: Relation, F: FD set)
BEGIN
1. D ← {R}
2. While some X → Y holds in some Ri(A1,…,An) in D, where X → Y is not trivial and X is not a superkey of Ri:
     Ri1 ← X+ ∩ {A1,…,An}
     Ri2 ← X ∪ ({A1,…,An} − X+)
     D ← (D − {Ri}) ∪ {Ri1, Ri2}
3. Return D
END
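The loop above can be sketched in Python as follows. This is a brute-force illustration under stated assumptions rather than a production algorithm: violations inside a sub-relation are found by trying every proper subset X and intersecting X+ with the sub-relation, which is exponential in the number of attributes. The names `find_violation` and `bcnf_decompose` are made up for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(schema, fds):
    """Return (X, X+ ∩ schema) for some BCNF violation inside schema, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            determined = closure(x, fds) & schema
            # Non-trivial (determined != x) and X is not a superkey of schema.
            if determined != x and determined != schema:
                return x, determined
    return None

def bcnf_decompose(relation, fds):
    """Split sub-relations on violating dependencies until none remain."""
    result = [set(relation)]
    done = False
    while not done:
        done = True
        for schema in result:
            violation = find_violation(schema, fds)
            if violation:
                x, determined = violation
                result.remove(schema)
                result.append(determined)                 # Ri1 = X+ ∩ Ri
                result.append(x | (schema - determined))  # Ri2 = X ∪ (Ri − X+)
                done = False
                break
    return result

F = [(set("A"), set("B")), (set("B"), set("C"))]
decomposition = bcnf_decompose(set("ABC"), F)
print(sorted("".join(sorted(s)) for s in decomposition))  # ['AB', 'BC']
```

Because the search order is fixed here, the output is deterministic; a different order of considering dependencies can produce a different, equally valid BCNF decomposition.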

BCNF Example: R = (A, B, C) F = {A → B, B → C} Is R in BCNF? A: Consider the nontrivial dependencies in F: 1. A → B, and A → R (A is a key) 2. B → C, but B does not determine A (B is not a key) Therefore, R is not in BCNF

BCNF Example: R = R1 ∪ R2 R1 = (A, B); R2 = (B, C) F = {A → B, B → C} Are R1, R2 in BCNF? A: 1. Test R1: A → B covered, A → R1 (all other FD’s covered are trivial) 2. Test R2: B → C covered, B → R2 (all other FD’s covered are trivial) Therefore R1, R2 are in BCNF Q: Is the decomposition lossless?
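The closing question can be answered with the standard test for a two-way decomposition: the join of R1 and R2 is lossless if the common attributes functionally determine all of R1 or all of R2. A minimal Python sketch of that test (the function name `is_lossless_binary` is illustrative only):

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) determines every attribute of R1 or of R2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

F = [(set("A"), set("B")), (set("B"), set("C"))]
print(is_lossless_binary(set("AB"), set("BC"), F))  # True: B → BC covers R2
print(is_lossless_binary(set("AC"), set("BC"), F))  # False: C determines only C
```

For R1 = (A, B) and R2 = (B, C) the shared attribute B is a key of R2, so the decomposition is lossless.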

BCNF Decompose R into BCNF: R = (A, B, C, D, E, H) F = {A → BC, E → HA}

BCNF Decomposition R = (A, B, C, D, E, H) F = {A → BC, E → HA} (Note: Fc = F) Decomposition #1: R = R1 ∪ R3 ∪ R4 Decompose R on A → BC: R1 = (A, B, C), R2 = (A, D, E, H) Decompose R2 on E → HA: R3 = (A, E, H), R4 = (D, E) Q: Is this DP? A: Yes. All of Fc is covered by R1, R3, R4. Therefore F+ is covered

BCNF Decomposition (cont.) R = (A, B, C, D, E, H) F = {A → BC, E → HA} (Note: Fc = F) Decomposition #2: R = R1 ∪ R3 ∪ R5 ∪ R6 Decompose R on A → B: R1 = (A, B), R2 = (A, C, D, E, H) Decompose R2 on E → HA: R3 = (A, E, H), R4 = (C, D, E) Decompose R4 on E → C: R5 = (C, E), R6 = (E, D) Q: Not DP. Why? A: A → C is not covered by R1, R3, R5, R6.
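Dependency preservation for the two decompositions of R = (A, B, C, D, E, H) can be checked mechanically. The Python sketch below uses the usual textbook test: for each X → Y in F, grow a set Z from X using only what each sub-relation can derive locally, then check whether Y ends up in Z. The function name `preserves_dependencies` is an assumption made for this example.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves_dependencies(schemas, fds):
    """True if every FD in fds follows from the FDs projected onto the schemas."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in schemas:
                # Attributes of this schema derivable from the part of Z it holds.
                gained = closure(z & schema, fds) & schema
                if not gained <= z:
                    z |= gained
                    changed = True
        if not rhs <= z:
            return False
    return True

F = [(set("A"), set("BC")), (set("E"), set("HA"))]
decomposition_1 = [set("ABC"), set("AEH"), set("DE")]
decomposition_2 = [set("AB"), set("AEH"), set("CE"), set("ED")]
print(preserves_dependencies(decomposition_1, F))  # True
print(preserves_dependencies(decomposition_2, F))  # False: A → C is lost
```

In the second decomposition A and C never appear together, and no chain of local dependencies reaches C from A, which is why A → C is lost.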

More BCNF (cont.) Q: Can we decompose on FD’s in Fc to get a DP BCNF decomposition? A: Sometimes, BCNF + DP is not possible. Example: R = (J, K, L) F = {JK → L, L → K} (Fc = F)

More BCNF (cont.) R = (J, K, L) F = {JK → L, L → K} Decomposition #1: Decompose R on L → K: R1 = (L, K), R2 = (J, L) Decomposition #2: Decompose R on JK → L: R2 = (J, K, L), R2 = (L, K) Not DP: JK → L not covered

BCNF Decomposition R(A, B, C, D, E, F) F = {AB → CD, C → EF, D → A} (AC) (AB) (DB) (CB)

BCNF Decomposition R(S, P, Q, X, Y, N, C) F = {S → NC, P → XY, SP → Q, Q → P} Decompose to BCNF. Is it dependency preserving?

Properties of Decompositions When we work with BCNF, we must look at properties involving multiple relations: Nonadditive (Lossless) Join: No tuples that are not in the original relation (spurious tuples) are generated when decomposed relations are joined Dependency Preservation: Every functional dependency in the original relation is represented somewhere in the decomposition

BCNF vs. 3NF Every relation in BCNF is in 3NF Not every relation in 3NF is in BCNF 3NF relations that are not in BCNF fail because some prime attribute is determined by something that is not a superkey – this is allowed by 3NF but not by BCNF Decomposing tables into BCNF can be tricky – functional dependencies can be lost!
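The gap between the two normal forms can be seen on the earlier example R = (J, K, L) with F = {JK → L, L → K}. The Python sketch below checks both conditions directly on the dependencies in F; the helper names are illustrative, and the brute-force key search is only suitable for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(relation, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys

def is_bcnf(relation, fds):
    """Every non-trivial FD must have a superkey on its left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) >= relation for lhs, rhs in fds)

def is_3nf(relation, fds):
    """Like BCNF, except a non-superkey may determine prime attributes."""
    prime = set().union(*candidate_keys(relation, fds))
    return all(
        closure(lhs, fds) >= relation or (rhs - lhs) <= prime
        for lhs, rhs in fds
    )

R = set("JKL")
F = [(set("JK"), set("L")), (set("L"), set("K"))]
print(is_3nf(R, F))   # True: K is prime (keys are JK and JL)
print(is_bcnf(R, F))  # False: L → K and L is not a superkey
```

L → K is exactly the case described above: a prime attribute determined by something that is not a superkey.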

Remarks on Algorithms Different runs may yield different results, depending on the order in which attributes and functional dependencies are considered We must know all functional dependencies We can’t always guarantee dependency preservation for BCNF, but we can generate a 3NF decomposition and then consider the individual relations in the result

Transactions Motivated by two independent requirements Concurrent database access Resilience to system failures

Transactions Concurrent Database Access [Diagram: several layers of client software issue Select, Update, Create Table, Drop Index, Help and Delete requests to the DBMS, which manages the Data]

Transactions Concurrent Access: Attribute-level Inconsistency C1: Select enrollment from College Where cName = ‘DePaul’; Update College Set enrollment = enrollment + 1000 Where cName = ‘DePaul’ concurrent with … C2: Select enrollment from College Where cName = ‘DePaul’; Update College Set enrollment = enrollment + 1500 Where cName = ‘DePaul’
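To see why the two clients can interfere, the Python sketch below models each update as a separate read step followed by a write step and enumerates every interleaving of the two clients. The starting enrollment of 15000 and the read-then-write model are assumptions made for illustration; a real DBMS does not necessarily execute an Update this way.

```python
from itertools import combinations

INCREMENTS = {"C1": 1000, "C2": 1500}

def run(schedule, start=15000):
    """Execute one interleaving; each client reads into a private copy, then writes."""
    enrollment = start
    local = {}
    for client, op in schedule:
        if op == "read":
            local[client] = enrollment
        else:  # write back the private copy plus this client's increment
            enrollment = local[client] + INCREMENTS[client]
    return enrollment

def interleavings():
    """Yield every ordering of the four steps that keeps each client's own order."""
    for positions in combinations(range(4), 2):
        remaining = {"C1": ["read", "write"], "C2": ["read", "write"]}
        schedule = []
        for i in range(4):
            client = "C1" if i in positions else "C2"
            schedule.append((client, remaining[client].pop(0)))
        yield schedule

outcomes = sorted({run(s) for s in interleavings()})
print(outcomes)  # [16000, 16500, 17500]
```

Only the serial orders add both increments (17500); in the other interleavings one client overwrites the other's update.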

Transactions: Flights Example Flights(fltNo, fltDate, seatNo, seatStatus) C1: select seatNo from Flights Where fltNo = 123 and fltDate = ‘2008-12-25’ and seatStatus = ‘available’; Update Flights set seatStatus = ‘occupied’ Where fltNo = 123 and fltDate = ‘2008-12-25’ and seatNo = ‘22A’; concurrent with … C2: select seatNo from Flights Where fltNo = 123 and fltDate = ‘2008-12-25’ and seatStatus = ‘available’; Update Flights set seatStatus = ‘occupied’ Where fltNo = 123 and fltDate = ‘2008-12-25’ and seatNo = ‘22A’; Both clients are modifying the record for seat 22A on the same flight

Transactions Concurrent Access: Tuple-level Inconsistency select * from Apply Where sID = 123 Update Apply Set major = ‘CS’ Where sID = 123 concurrent with … C2 select * from Apply Where sID = 123 Update Apply Set decision = ‘Y’ Where sID = 123 Both modifying the apply record for the student id = 123

Transactions Concurrent Access: Table-level Inconsistency Update Apply Set decision = ‘Y’ Where sID In (Select sID From Student Where GPA > 3.9) concurrent with … Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500

Transactions Concurrent Access: Multi-statement inconsistency Insert Into Archive Select * From Apply Where decision = ‘N’; Delete From Apply Where decision = ‘N’; concurrent with … Select Count(*) From Apply; Select Count(*) From Archive;

Transactions Concurrency Goal Execute sequence of SQL statements so they appear to be running as a group or “in isolation”. Simple solution: execute them in isolation. But we want to enable concurrency whenever it is safe to do so. Database systems are geared toward performance. They typically operate in concurrent (multi-processor/multi-threaded/asynchronous I/O) environments. Clients may work on different parts of the DBMS.

Transactions System Failure [Diagram: client software issues Select, Update, Create Table, Drop Index, Help and Delete requests to the DBMS, which manages the Data]

Transactions Resilience to System Failures: Bulk Load

Transactions Resilience to System Failures: Insert Into Archive Select * From Apply Where decision = ‘N’; Delete From Apply Where decision = ‘N’;

Transactions Resilience to System Failures: Lots of updates buffered in memory

Transactions System-Failure Goal: Guarantee all-or-nothing execution, regardless of failures

Transactions Solution for both concurrency and failures. A transaction is a sequence of one or more SQL operations treated as a unit. Transactions appear to run in isolation. If the system fails, each transaction’s changes are reflected either entirely or not at all.

Transactions Solution for both concurrency and failures. A transaction is a sequence of one or more SQL operations treated as a unit. SQL standard: Transaction begins with a Begin Transaction statement. On “commit” the transaction ends and a new one begins. The current transaction ends on session termination. “Autocommit” turns each statement into a transaction.

Transactions Every time a DBMS encounters a transaction, the DBMS software guarantees the following ACID Properties
Letter | Property    | Meaning                    | Order
-------|-------------|----------------------------|------
A      | Atomicity   | all-or-nothing             | 3
C      | Consistency | consistent DB state        | 4
I      | Isolation   | appear to act in isolation | 1
D      | Durability  | commits are persistent     | 2

Transactions (ACID Properties) Isolation: Serializability. Operations may be interleaved, but execution must be equivalent to some sequential (serial) order of all transactions

Serializability Basic Assumption– Each transaction preserves database consistency. Thus, serial execution of a set of transactions preserves database consistency. A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule.

Schedules: sequences of instructions that specify the chronological order in which instructions of concurrent transactions are executed. A schedule for a set of transactions must consist of all instructions of those transactions. It must preserve the order in which the instructions appear in each individual transaction. A transaction that successfully completes its execution will have a commit instruction as its last statement. By default, a transaction is assumed to execute a commit instruction as its last step. A transaction that fails to successfully complete its execution will have an abort instruction as its last statement.

Schedule 1 Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. An example of a serial schedule in which T1 is followed by T2 :

Schedule 2 A serial schedule in which T2 is followed by T1 :

Schedule 3 Let T1 and T2 be the transactions defined previously. The following schedule is not a serial schedule, but it is equivalent to Schedule 1. Note -- In schedules 1, 2 and 3, the sum “A + B” is preserved.

Schedule 4 The following concurrent schedule does not preserve the sum of “A + B”
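The schedule diagrams did not survive in this transcript, so the Python sketch below reconstructs the four schedules under explicit assumptions: starting balances A = 1000 and B = 2000, integer arithmetic for the 10% transfer, and textbook-style interleavings for Schedules 3 and 4 that may differ from the exact ones shown in the lecture. Each transaction reads into private variables and writes back explicitly.

```python
# T1: transfer 50 from A to B. Each step takes (db, v), where v holds private variables.
T1 = [
    lambda db, v: v.update(A=db["A"]),        # read(A)
    lambda db, v: v.update(A=v["A"] - 50),    # A := A - 50
    lambda db, v: db.update(A=v["A"]),        # write(A)
    lambda db, v: v.update(B=db["B"]),        # read(B)
    lambda db, v: v.update(B=v["B"] + 50),    # B := B + 50
    lambda db, v: db.update(B=v["B"]),        # write(B)
]

# T2: transfer 10% of A's balance from A to B.
T2 = [
    lambda db, v: v.update(A=db["A"]),             # read(A)
    lambda db, v: v.update(temp=v["A"] // 10),     # temp := 10% of A
    lambda db, v: v.update(A=v["A"] - v["temp"]),  # A := A - temp
    lambda db, v: db.update(A=v["A"]),             # write(A)
    lambda db, v: v.update(B=db["B"]),             # read(B)
    lambda db, v: v.update(B=v["B"] + v["temp"]),  # B := B + temp
    lambda db, v: db.update(B=v["B"]),             # write(B)
]

def run(pattern, a=1000, b=2000):
    """Run a schedule given as (transaction, number of consecutive steps) chunks."""
    db = {"A": a, "B": b}
    steps = {"T1": list(T1), "T2": list(T2)}
    local = {"T1": {}, "T2": {}}
    for txn, count in pattern:
        for _ in range(count):
            steps[txn].pop(0)(db, local[txn])
    return db

schedule_1 = run([("T1", 6), ("T2", 7)])                        # serial: T1 then T2
schedule_2 = run([("T2", 7), ("T1", 6)])                        # serial: T2 then T1
schedule_3 = run([("T1", 3), ("T2", 4), ("T1", 3), ("T2", 3)])  # interleaved, equivalent to schedule 1
schedule_4 = run([("T1", 2), ("T2", 5), ("T1", 4), ("T2", 2)])  # interleaved, not serializable

for name, db in [("1", schedule_1), ("2", schedule_2), ("3", schedule_3), ("4", schedule_4)]:
    print(name, db, "sum =", db["A"] + db["B"])
```

Under these assumptions Schedules 1 to 3 keep A + B at 3000, and Schedule 3 ends in the same state as Schedule 1, while Schedule 4 ends with A + B = 3050 because T1 overwrites T2's write to A.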

Transactions Concurrent Access: Attribute-level Inconsistency C1: S1: Select enrollment from College Where cName = ‘DePaul’; S2: Update College Set enrollment = enrollment + 1000 Where cName = ‘DePaul’ concurrent with … C2: S3: Select enrollment from College Where cName = ‘DePaul’; S4: Update College Set enrollment = enrollment + 1500 Where cName = ‘DePaul’

Transactions Concurrent Access: Tuple-level Inconsistency C1: S1: select * from Apply Where sID = 123; S2: Update Apply Set major = ‘CS’ Where sID = 123 concurrent with … C2: S3: select * from Apply Where sID = 123; S4: Update Apply Set decision = ‘Y’ Where sID = 123 Both are modifying the Apply record for student ID 123

Transactions Concurrent Access: Table-level Inconsistency C1: Update Apply Set decision = ‘Y’ Where sID In (Select sID From Student Where GPA > 3.9) concurrent with … C2: Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500

Transactions Concurrent Access: Multi-statement inconsistency Insert Into Archive Select * From Apply Where decision = ‘N’; Delete From Apply Where decision = ‘N’; concurrent with … Select Count(*) From Apply; Select Count(*) From Archive;

Transactions (ACID Properties) Durability: If the system crashes after a transaction commits, all effects of the transaction remain in the database

Transactions (ACID Properties) Atomicity: Each transaction is “all-or-nothing,” never left half done

Transactions Transaction Rollback (= Abort) Undoes partial effects of a transaction. Can be system- or client-initiated. Each transaction is “all-or-nothing,” never left half done.
Begin Transaction;
<get input from user>
SQL commands based on input
<confirm results with user>
If ans = ‘ok’ Then Commit; Else Rollback;
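The commit-or-rollback pattern above can be tried out with Python's built-in sqlite3 module. The sketch below reuses the Apply/Archive example from the earlier slides; the sample rows, the `fail_midway` flag and the helper names are invented for illustration, and SQLite's transaction syntax (BEGIN, COMMIT, ROLLBACK) is used in place of the slide's wording.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones written out explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Apply (sID INTEGER, decision TEXT)")
conn.execute("CREATE TABLE Archive (sID INTEGER, decision TEXT)")
conn.executemany(
    "INSERT INTO Apply VALUES (?, ?)",
    [(123, "N"), (234, "Y"), (345, "N")],  # made-up sample rows
)

def archive_rejections(conn, fail_midway=False):
    """Move rejected applications to Archive as one all-or-nothing unit."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO Archive SELECT * FROM Apply WHERE decision = 'N'")
        if fail_midway:
            raise RuntimeError("simulated failure between the two statements")
        conn.execute("DELETE FROM Apply WHERE decision = 'N'")
        conn.execute("COMMIT")
    except RuntimeError:
        conn.execute("ROLLBACK")  # undo the partial effects

def counts(conn):
    """Return (rows in Apply, rows in Archive)."""
    apply_rows = conn.execute("SELECT COUNT(*) FROM Apply").fetchone()[0]
    archive_rows = conn.execute("SELECT COUNT(*) FROM Archive").fetchone()[0]
    return apply_rows, archive_rows

archive_rejections(conn, fail_midway=True)
after_failure = counts(conn)
archive_rejections(conn)
after_success = counts(conn)
print(after_failure)  # (3, 0): the insert was rolled back
print(after_success)  # (1, 2): both statements took effect together
```

Without the rollback, the failed run would leave the rejected rows in both tables, which is the half-done state atomicity is meant to rule out.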

Transactions (ACID Properties) Consistency: Each client, each transaction: can assume all constraints hold when the transaction begins; must guarantee all constraints hold when the transaction ends. Serializability ⇒ constraints always hold

Transactions (ACID Properties) Isolation: Serializability. Operations may be interleaved, but execution must be equivalent to some sequential (serial) order of all transactions. Cost: overhead and a reduction in concurrency

Transactions (ACID Properties) Isolation. Weaker “Isolation Levels”: Read Uncommitted, Read Committed, Repeatable Read. Compared with serializable execution they give lower overhead and higher concurrency, but weaker consistency guarantees. Strongest isolation level: Serializable

Transactions Isolation Levels: Per transaction, “in the eye of the beholder”. For example, one client can declare “My transaction is Read Uncommitted” while another declares “My transaction is Repeatable Read”

Transactions Dirty Reads “Dirty” data item: written by an uncommitted transaction Select enrollment from College Where cName = ‘DePaul’ Update College Set enrollment = enrollment + 1000 Where cName = ‘DePaul’ concurrent with … Select Avg(enrollment) From College

Transactions Dirty Reads “Dirty” data item: written by an uncommitted transaction Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500 concurrent with … Select GPA From Student Where sID = 123 concurrent with … Update Student Set sizeHS = 2600 Where sID = 234

Transactions Serializable: strongest isolation level; the SQL default. Read Uncommitted: a data item is dirty if it is written by an uncommitted transaction. Problem of reading dirty data written by another uncommitted transaction: what if that transaction eventually aborts?
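The question about an aborting writer can be made concrete with a toy model. The Python class below is purely hypothetical and is not how a DBMS implements isolation: it keeps one committed value and one pending (uncommitted) value per item, and lets a reader choose whether it may see the pending one.

```python
class Item:
    """Toy data item with a committed value and at most one uncommitted write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # value written by a transaction that has not committed

    def write(self, value):
        self.pending = value

    def commit(self):
        self.committed, self.pending = self.pending, None

    def abort(self):
        self.pending = None  # the uncommitted write is discarded

    def read(self, level):
        # Only a Read Uncommitted reader is allowed to see dirty data.
        if level == "read uncommitted" and self.pending is not None:
            return self.pending
        return self.committed

enrollment = Item(15000)  # made-up starting value
enrollment.write(16000)   # writer has not committed yet
dirty = enrollment.read("read uncommitted")
clean = enrollment.read("read committed")
enrollment.abort()        # the writer aborts
final = enrollment.read("read uncommitted")
print(dirty, clean, final)  # 16000 15000 15000
```

The Read Uncommitted reader saw 16000, a value that never became part of the committed database state.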

Transactions Isolation Level Read Uncommitted A transaction may perform dirty reads Select GPA From Student Where sizeHS > 2500; Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500 concurrent with … Select Avg(GPA) From Student

Transactions Isolation Level Read Uncommitted A transaction may perform dirty reads Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500 concurrent with … Set Transaction Isolation Level Read Uncommitted; Select Avg(GPA) From Student;

Transactions Read Committed: Cannot read dirty data written by other uncommitted transactions. But read-committed is still not necessarily serializable. Repeatable Read: If a tuple is read once, then the same tuple must be retrieved again if the query is repeated. Still not serializable; may see phantom tuples (tuples inserted by other concurrent transactions).

Transactions Isolation Level Read Committed A transaction may not perform dirty reads. Still does not guarantee global serializability. Select GPA From Student Where sizeHS > 2500; Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500 concurrent with … Set Transaction Isolation Level Read Committed; Select Avg(GPA) From Student; Select Max(GPA) From Student;

Transactions Isolation Level Repeatable Read A transaction may not perform dirty reads. An item read multiple times cannot change value. Still does not guarantee global serializability. Update Student Set GPA = (1.1) * GPA; Update Student Set sizeHS = 1500 Where sID = 123; concurrent with … Set Transaction Isolation Level Repeatable Read; Select Avg(GPA) From Student; Select Avg(sizeHS) From Student;

Transactions Isolation Level Repeatable Read A transaction may not perform dirty reads An item read multiple times cannot change value But a relation can change: “phantom” tuples Insert Into Student [ 100 new tuples ] concurrent with … Set Transaction Isolation Level Repeatable Read; Select Avg(GPA) From Student; Select Max(GPA) From Student;

Transactions Isolation Level Repeatable Read A transaction may not perform dirty reads An item read multiple times cannot change value But a relation can change: “phantom” tuples Delete From Student [ 100 tuples ] concurrent with … Set Transaction Isolation Level Repeatable Read; Select Avg(GPA) From Student; Select Max(GPA) From Student;

Transactions Read Only transactions Helps system optimize performance Independent of isolation level Set Transaction Read Only; Set Transaction Isolation Level Repeatable Read; Select Avg(GPA) From Student; Select Max(GPA) From Student;

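Transactions Isolation Levels: Summary (which anomalies may occur at each level)
Isolation level  | dirty reads | nonrepeatable reads | phantoms
-----------------|-------------|---------------------|---------
Read Uncommitted | yes         | yes                 | yes
Read Committed   | no          | yes                 | yes
Repeatable Read  | no          | no                  | yes
Serializable     | no          | no                  | no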

Transactions Isolation Levels: Summary Standard default: Serializable. Weaker isolation levels: increased concurrency + decreased overhead = increased performance, but weaker consistency guarantees. Some systems have default Repeatable Read. Isolation level is per transaction and “in the eye of the beholder”: each transaction’s reads must conform to its isolation level.