CSC 453 Database Systems Lecture


1 CSC 453 Database Systems Lecture
Tanu Malik College of CDM DePaul University

2 Today Normalization-BCNF Transactions

3 Normalization Review
CourseID | SemesterID | Course Name
IT101 | | Programming
IT101 | | Programming
IT102 | | Databases
IT102 | | Databases
IT103 | | Web Design

4 |-----Primary Key----|
CourseID | SemesterID | TeacherID | TeacherName
IT101 | | 332 | Mr Jones
IT101 | | 332 | Mr Jones
IT102 | | 495 | Mr Bentley
IT102 | | 332 | Mr Jones
IT103 | | 242 | Mrs Smith

5 Prime Vs Non-Prime Attributes
Prime Attribute: Any attribute that is part of some candidate key
Non-Prime Attribute: Any attribute that is not part of any candidate key

6 3NF-Normalization Algorithm (3NF Normalization):
Input: Relation R with FDs Fc
Output: 3NF decomposition D of R
D = {}
For every X → Y in Fc, add sub-relation Q = (XY) to D, unless
  some sub-relation in D already contains all of XY: don’t add Q
  some sub-relation(s) S in D is contained in XY: replace S with Q(XY)
If no relation in D contains a key of R, then add a new relation Q(X) on some key X of R
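The synthesis steps on this slide can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: it assumes the FDs passed in already form a minimal cover, it represents attributes as single letters, and the helper names (`fd`, `closure`, `find_key`, `synthesize_3nf`) are made up for illustration.

```python
from itertools import combinations


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes (illustrative helper)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_key(relation, fds):
    """Return one candidate key, trying attribute subsets from smallest to largest."""
    attrs = sorted(relation)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if closure(combo, fds) >= relation:
                return frozenset(combo)
    return frozenset(relation)


def synthesize_3nf(relation, fds):
    """3NF synthesis; fds is assumed to be a minimal cover already."""
    decomposition = []
    for lhs, rhs in fds:
        q = lhs | rhs
        if any(q <= s for s in decomposition):
            continue  # some sub-relation already contains all of XY
        # drop sub-relations that are contained in XY, then add Q(XY)
        decomposition = [s for s in decomposition if not s <= q]
        decomposition.append(q)
    # a sub-relation contains a key exactly when its closure covers R
    if not any(closure(s, fds) >= relation for s in decomposition):
        decomposition.append(find_key(relation, fds))
    return decomposition


if __name__ == "__main__":
    r = frozenset("ABCDEF")
    f = [fd("AB", "CD"), fd("C", "EF"), fd("D", "A")]
    for schema in synthesize_3nf(r, f):
        print("".join(sorted(schema)))
```

On R = {A, B, C, D, E, F} with AB → CD, C → EF, D → A, the sub-relation for D → A is absorbed by (A, B, C, D), which also contains the key AB, so no extra key relation is needed.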

7 3NF Decomposition R = {A, B, C, D, E, F} F = { AB → CD, C → EF, D → A }
D → A: the determinant D is not a superkey, but A is a prime attribute

8 3NF Pay(Employee, Grade, Salary) F = {E → G, E → S, G → S}

9 The End Result
A collection of relations, each in 3NF
Each relation has a primary key (We are assuming that there is only one candidate key…)
Every non-prime attribute in a relation is determined by its entire primary key
No non-prime attribute in a relation is determined by any attributes other than its entire primary key
Information can be reconstructed using joins, and stored in views if desired

10 3NF-Checking Order
What is the candidate key?
Is F a minimal cover?
Which FDs violate 3NF?
Decompose to 3NF.
Is the decomposition lossless?
Is it dependency preserving?

11 Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF): For every non-trivial functional dependency X → A, it must be the case that X is a superkey
“Every determinant must contain a candidate key”
X must be a superkey even if A is a prime attribute

12 BCNF example
Pizza | Topping | Topping Type
1 | mozzarella | cheese
1 | pepperoni | meat
1 | olives | vegetable
2 | mozzarella | cheese
2 | sausage | meat
2 | peppers | vegetable
A pizza can have exactly 3 types of topping:
One type of cheese
One type of meat
One type of vegetable

13 Decompose
Pizza | Topping
1 | mozzarella
1 | pepperoni
1 | olives
2 | mozzarella
2 | sausage
2 | peppers

Topping | Topping Type
mozzarella | cheese
pepperoni | meat
olives | vegetable
sausage | meat
peppers | vegetable

14 Example 3NF vs BCNF
Client, Office → (Client, Office, Account)
Account → Office
Joe 1 B Mary John C 2

15 BCNF Decomposition
Input: A universal relation R and a set of functional dependencies F on R
Output: A decomposition D of R into BCNF schemas with nonadditive join
Algorithm on next page
Algorithm does not guarantee dependency preservation

16 BCNF Decomposition Algorithm
ALGORITHM BCNF (R: Relation, F: FD set)
BEGIN
1. D ← {R}
3. While some X → Y holds in some Ri(A1,…,An) in D, (X → Y) is not trivial, and X is not a superkey of Ri:
   Ri1 ← X+ ∩ {A1,…,An}
   Ri2 ← X ∪ ({A1,…,An} − X+)
   D ← (D − {Ri}) ∪ {Ri1, Ri2}
4. Return D
END
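The loop on this slide can be sketched in Python as below. It is an illustrative sketch under stated assumptions: attributes are single letters, violations are found by brute force over attribute subsets (exponential, so only suitable for classroom-sized schemas), and the names `fd`, `closure`, `find_violation` and `bcnf_decompose` are hypothetical.

```python
from itertools import combinations


def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes (illustrative helper)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_violation(schema, fds):
    """Return some X inside schema that determines more of schema than X itself
    without being a superkey of schema, or None if schema is in BCNF."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            determined = closure(x, fds) & schema
            if determined != x and determined != schema:
                return x
    return None


def bcnf_decompose(relation, fds):
    """Split schemas on violating determinants until every schema is in BCNF."""
    result = [frozenset(relation)]
    while True:
        for schema in result:
            x = find_violation(schema, fds)
            if x is not None:
                x_plus = closure(x, fds)
                r1 = x_plus & schema            # Ri1 = X+ ∩ attributes of Ri
                r2 = x | (schema - x_plus)      # Ri2 = X ∪ (attributes of Ri − X+)
                result.remove(schema)
                result.extend([r1, r2])
                break                           # restart the scan on the new list
        else:
            return result


if __name__ == "__main__":
    parts = bcnf_decompose(frozenset("ABC"), [fd("A", "B"), fd("B", "C")])
    print(["".join(sorted(p)) for p in parts])
```

Because the search order is fixed (smallest subsets first, alphabetical), this sketch always picks the same violation; a different order can produce a different decomposition.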

17 BCNF Example:
R = (A, B, C), F = {A → B, B → C}
Q: Is R in BCNF?
A: Consider the nontrivial dependencies in F:
1. A → B, A → R (A is a key)
2. B → C, B ↛ A (B is not a key)
Therefore, R is not in BCNF
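The check on this slide can be sketched in Python: compute the closure of each left-hand side and report the non-trivial FDs whose left-hand side is not a superkey. The helper names are hypothetical and attributes are assumed to be single letters. Testing only the FDs listed in F is enough for the original relation R; it is not enough for sub-relations produced by a decomposition.

```python
def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes (illustrative helper)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(relation, fds):
    """Return the non-trivial FDs in fds whose left-hand side is not a superkey of relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= relation
    ]


if __name__ == "__main__":
    r = frozenset("ABC")
    f = [fd("A", "B"), fd("B", "C")]
    for lhs, rhs in bcnf_violations(r, f):
        print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```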

18 BCNF Example:
R = R1 ∪ R2, R1 = (A, B); R2 = (B, C)
F = {A → B, B → C}
Q: Are R1, R2 in BCNF?
A:
1. Test R1: A → B covered, A → R1 (all other FDs covered are trivial)
2. Test R2: B → C covered, B → R2 (all other FDs covered are trivial)
Therefore R1, R2 are in BCNF
Q: Is the decomposition lossless?
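For a decomposition into two relations, the lossless-join question can be answered with the standard test: the join is lossless when the common attributes functionally determine all of R1 or all of R2. A minimal Python sketch, with hypothetical helper names and single-letter attributes assumed:

```python
def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes (illustrative helper)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_lossless_binary(r1, r2, fds):
    """True when (R1 ∩ R2) determines all of R1 or all of R2 under fds."""
    common_plus = closure(r1 & r2, fds)
    return r1 <= common_plus or r2 <= common_plus


if __name__ == "__main__":
    f = [fd("A", "B"), fd("B", "C")]
    print(is_lossless_binary(frozenset("AB"), frozenset("BC"), f))
```

For R1 = (A, B) and R2 = (B, C), the common attribute B determines (B, C), so the test succeeds. Splitting the same relation into (A, C) and (B, C) fails it, because C alone determines nothing else.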

19 BCNF
R = (A, B, C, D, E, H)
F = {A → BC, E → HA}
Decompose R into BCNF:

20 BCNF Decomposition
R = (A, B, C, D, E, H)
F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #1: R = R1 ∪ R3 ∪ R4
Decompose R on A → BC: R1 = (A, B, C), R2 = (A, D, E, H)
Decompose R2 on E → HA: R3 = (A, E, H), R4 = (D, E)
Q: Is this DP?
A: Yes. All of Fc is covered by R1, R3, R4. Therefore F+ is covered

21 BCNF Decomposition (cont.)
R = (A, B, C, D, E, H)
F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #2: R = R1 ∪ R3 ∪ R5 ∪ R6
Decompose R on A → B: R1 = (A, B), R2 = (A, C, D, E, H)
Decompose R2 on E → HA: R3 = (A, E, H), R4 = (C, D, E)
Decompose R4 on E → C: R5 = (C, E), R6 = (E, D)
Q: Not DP. Why?
A: A → C not covered by R1, R3, R5, R6.
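Whether a decomposition is dependency preserving (DP) can be checked mechanically. The sketch below uses the usual textbook procedure: for an FD X → Y, grow a set Z starting from X by repeatedly taking, for each sub-relation Ri, the closure of Z ∩ Ri restricted to Ri; the FD is preserved when Y ends up inside Z. Helper names are hypothetical and attributes are assumed to be single letters.

```python
def fd(lhs, rhs):
    """Build an FD from two strings of single-letter attributes (illustrative helper)."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def preserves(dep, decomposition, fds):
    """True when dep follows from the FDs projected onto the sub-relations."""
    lhs, rhs = dep
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            gained = closure(z & schema, fds) & schema
            if not gained <= z:
                z |= gained
                changed = True
    return rhs <= z


def is_dependency_preserving(decomposition, fds):
    """True when every FD in fds is preserved by the decomposition."""
    return all(preserves(dep, decomposition, fds) for dep in fds)


if __name__ == "__main__":
    f = [fd("A", "BC"), fd("E", "HA")]
    d1 = [frozenset("ABC"), frozenset("AEH"), frozenset("DE")]
    d2 = [frozenset("AB"), frozenset("AEH"), frozenset("CE"), frozenset("DE")]
    print(is_dependency_preserving(d1, f), is_dependency_preserving(d2, f))
```

Decomposition #1 passes, while Decomposition #2 fails on A → BC: starting from A, only B can be reached through the sub-relations, so C is never derived.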

22 More BCNF (cont.)
Q: Can we decompose on FDs in Fc to get a DP BCNF decomposition?
A: Sometimes, BCNF + DP not possible
Example: R = (J, K, L), F = {JK → L, L → K} (Fc = F)
Decompose on: JK → L, L → K. Or: L → K, JK → L

23 More BCNF (cont.)
Q: Can we decompose on FDs in Fc to get a DP BCNF decomposition?
A: Sometimes, BCNF + DP not possible
R = (J, K, L), F = {JK → L, L → K}
Decomposition #1: R = (J, K, L), decompose on L → K: R1 = (L, K), R2 = (J, L)
Decomposition #2: R = (J, K, L), decompose on JK → L: R1 = (J, K, L), R2 = (L, K)
Not DP: JK → L not covered

24 BCNF Decomposition R(A, B, C, D, E, F) F = { AB → CD, C → EF, D → A }
(AC) (AB) (DB) (CB)

25 BCNF Decomposition R(S, P, Q, X, Y, N, C)
F = { S → NC, P → XY, SP → Q, Q → P }
Decompose to BCNF
Is it dependency preserving?

26 Properties of Decompositions
When we work with BCNF, we must look at properties involving multiple relations:
Nonadditive (Lossless) Join: No spurious tuples (tuples that are not in the original relation) are generated when the decomposed relations are joined
Dependency Preservation: Every functional dependency in the original relation is represented somewhere in the decomposition

27 BCNF vs. 3NF
Every relation in BCNF is in 3NF
Not every relation in 3NF is in BCNF
3NF relations that are not in BCNF fail because some prime attribute is determined by something that is not a superkey; this is allowed by 3NF but not by BCNF
Decomposing tables into BCNF can be tricky: functional dependencies can be lost!

28 Remarks on Algorithms
Different runs may yield different results, depending on the order in which attributes and functional dependencies are considered
We must know all functional dependencies
We can’t always guarantee dependency preservation for BCNF, but we can generate a 3NF decomposition and then consider the individual relations in the result

29 Transactions
Motivated by two independent requirements:
Concurrent database access
Resilience to system failures

30 Transactions Concurrent Database Access
[Figure: several layers of client software issue Select…, Update…, Create Table…, Drop Index…, Help…, Delete… requests to one DBMS and its data]

31 Transactions Concurrent Access: Attribute-level Inconsistency
C1:
Select enrollment From College Where cName = 'DePaul';
Update College Set enrollment = enrollment Where cName = 'DePaul';
concurrent with …
C2:
Select enrollment From College Where cName = 'DePaul';
Update College Set enrollment = enrollment Where cName = 'DePaul';
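The attribute-level problem appears when both clients read the enrollment before either one writes it back. The sketch below replays that interleaving with Python's built-in sqlite3 module. Several things are assumptions for illustration only: the starting enrollment of 20000, the increments 1000 and 1500 (the transcript does not show the amounts), and the helper names. It uses a single connection to replay the statement order, so it simulates the interleaving rather than running real concurrent clients.

```python
import sqlite3


def new_college_db(start=20000):
    """Create an in-memory College table with one row (assumed starting value)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE College (cName TEXT PRIMARY KEY, enrollment INTEGER)")
    conn.execute("INSERT INTO College VALUES ('DePaul', ?)", (start,))
    conn.commit()
    return conn


def read_enrollment(conn):
    """The Select step of a client."""
    row = conn.execute(
        "SELECT enrollment FROM College WHERE cName = 'DePaul'"
    ).fetchone()
    return row[0]


def write_enrollment(conn, value):
    """The Update step of a client, writing back a value computed from its own read."""
    conn.execute(
        "UPDATE College SET enrollment = ? WHERE cName = 'DePaul'", (value,)
    )


def interleaved(increments, start=20000):
    """Every client runs its Select first; only then does each client run its Update."""
    conn = new_college_db(start)
    seen = [read_enrollment(conn) for _ in increments]
    for value, inc in zip(seen, increments):
        write_enrollment(conn, value + inc)
    return read_enrollment(conn)


def serial(increments, start=20000):
    """Each client runs its Select and Update back to back."""
    conn = new_college_db(start)
    for inc in increments:
        write_enrollment(conn, read_enrollment(conn) + inc)
    return read_enrollment(conn)


if __name__ == "__main__":
    print("interleaved:", interleaved([1000, 1500]))
    print("serial:", serial([1000, 1500]))
```

In the interleaved run the second write overwrites the first, so one increment is lost; the serial run applies both.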

32 Transactions: Flights Example
Flights(fltNo, fltDate, seatNo, seatStatus)
C1:
Select seatNo From Flights Where fltNo = 123 and fltDate = ' ' and seatStatus = 'available';
Update Flights Set seatStatus = 'occupied' Where fltNo = 123 and fltDate = ' ' and seatNo = '22A';
concurrent with …
C2:
Select seatNo From Flights Where fltNo = 123 and fltDate = ' ' and seatStatus = 'available';
Update Flights Set seatStatus = 'occupied' Where fltNo = 123 and fltDate = ' ' and seatNo = '22A';
Both clients are trying to take the same seat (22A) on flight 123

33 Transactions Concurrent Access: Tuple-level Inconsistency
C1:
Select * From Apply Where sID = 123;
Update Apply Set major = 'CS' Where sID = 123;
concurrent with …
C2:
Select * From Apply Where sID = 123;
Update Apply Set decision = 'Y' Where sID = 123;
Both modifying the Apply record for the student with sID = 123

34 Transactions Concurrent Access: Table-level Inconsistency
Update Apply Set decision = 'Y' Where sID In (Select sID From Student Where GPA > 3.9);
concurrent with …
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;

35 Transactions Concurrent Access: Multi-statement inconsistency
Insert Into Archive Select * From Apply Where decision = 'N';
Delete From Apply Where decision = 'N';
concurrent with …
Select Count(*) From Apply;
Select Count(*) From Archive;

36 Transactions Concurrency Goal
Execute a sequence of SQL statements so they appear to be running as a group or “in isolation”
Simple solution: execute them in isolation
But we want to enable concurrency whenever it is safe to do so
Database systems are geared toward performance. They typically operate in concurrent (multi-processor/multi-threaded/asynchronous I/O) environments.
Clients may work on different parts of the DBMS

37 Transactions System Failure
[Figure: several layers of client software issue Select…, Update…, Create Table…, Drop Index…, Help…, Delete… requests to one DBMS and its data]

38 Transactions Resilience to System Failures
[Figure: Bulk Load into the DBMS and its data]

39 Transactions Resilience to System Failures
Insert Into Archive Select * From Apply Where decision = 'N';
Delete From Apply Where decision = 'N';

40 Transactions Resilience to System Failures
Lots of updates buffered in memory

41 Transactions System-Failure Goal
Guarantee all-or-nothing execution, regardless of failures

42 Transactions
Solution for both concurrency and failures
A transaction is a sequence of one or more SQL operations treated as a unit
Transactions appear to run in isolation
If the system fails, each transaction’s changes are reflected either entirely or not at all

43 Transactions
Solution for both concurrency and failures
A transaction is a sequence of one or more SQL operations treated as a unit.
SQL standard:
  Transaction begins with a Begin Transaction statement
  On “commit” the transaction ends and a new one begins
  Current transaction ends on session termination
  “Autocommit” turns each statement into a transaction

44 Transactions
Solution for both concurrency and failures
A transaction is a sequence of one or more SQL operations treated as a unit
Transactions appear to run in isolation
If the system fails, each transaction’s changes are reflected either entirely or not at all

45 Transactions
Every time a DBMS encounters a transaction, the DBMS software guarantees the following ACID properties:
ACID | Property | Meaning | Order
A | Atomicity | all-or-nothing | 3
C | Consistency | consistent DB state | 4
I | Isolation | appear to act in isolation | 1
D | Durability | commits are persistent | 2

46 Transactions (ACID Properties) Isolation
Serializability: Operations may be interleaved, but execution must be equivalent to some sequential (serial) order of all transactions

47 Serializability Basic Assumption– Each transaction preserves database consistency. Thus, serial execution of a set of transactions preserves database consistency. A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule.

48 Schedules
Schedule: a sequence of instructions that specifies the chronological order in which instructions of concurrent transactions are executed
A schedule for a set of transactions must consist of all instructions of those transactions
It must preserve the order in which the instructions appear in each individual transaction
A transaction that successfully completes its execution will have a commit instruction as its last statement
By default, a transaction is assumed to execute a commit instruction as its last step
A transaction that fails to complete its execution will have an abort instruction as its last statement

49 Schedule 1 Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. An example of a serial schedule in which T1 is followed by T2 :

50 Schedule 2 A serial schedule in which T2 is followed by T1 :

51 Schedule 3 Let T1 and T2 be the transactions defined previously. The following schedule is not a serial schedule, but it is equivalent to Schedule 1. Note -- In schedules 1, 2 and 3, the sum “A + B” is preserved.

52 Schedule 4 The following concurrent schedule does not preserve the sum of “A + B”
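The schedule diagrams are not part of this transcript, so the Python sketch below replays one plausible version of each schedule on plain dictionaries. The interleavings for Schedules 3 and 4 follow the usual textbook form of this example and are assumptions, as are the starting balances A = 1000 and B = 2000 and the use of integer arithmetic for the 10% transfer.

```python
def schedule_1(db):
    """Serial schedule: all of T1, then all of T2."""
    db = dict(db)
    a = db["A"]              # T1: read(A)
    db["A"] = a - 50         # T1: write(A)
    b = db["B"]              # T1: read(B)
    db["B"] = b + 50         # T1: write(B)
    a = db["A"]              # T2: read(A)
    temp = a // 10           # T2: 10% of A
    db["A"] = a - temp       # T2: write(A)
    b = db["B"]              # T2: read(B)
    db["B"] = b + temp       # T2: write(B)
    return db


def schedule_2(db):
    """Serial schedule: all of T2, then all of T1."""
    db = dict(db)
    a = db["A"]              # T2: read(A)
    temp = a // 10
    db["A"] = a - temp       # T2: write(A)
    b = db["B"]              # T2: read(B)
    db["B"] = b + temp       # T2: write(B)
    a = db["A"]              # T1: read(A)
    db["A"] = a - 50         # T1: write(A)
    b = db["B"]              # T1: read(B)
    db["B"] = b + 50         # T1: write(B)
    return db


def schedule_3(db):
    """Interleaved, but each transaction sees the other's finished work on A and on B."""
    db = dict(db)
    a = db["A"]              # T1: read(A)
    db["A"] = a - 50         # T1: write(A)
    a = db["A"]              # T2: read(A), sees T1's write
    temp = a // 10
    db["A"] = a - temp       # T2: write(A)
    b = db["B"]              # T1: read(B)
    db["B"] = b + 50         # T1: write(B)
    b = db["B"]              # T2: read(B), sees T1's write
    db["B"] = b + temp       # T2: write(B)
    return db


def schedule_4(db):
    """Interleaved so that both transactions read stale values."""
    db = dict(db)
    a1 = db["A"]             # T1: read(A), not yet written back
    a2 = db["A"]             # T2: read(A), still the old value
    temp = a2 // 10
    db["A"] = a2 - temp      # T2: write(A)
    b2 = db["B"]             # T2: read(B)
    db["A"] = a1 - 50        # T1: write(A), overwrites T2's update
    b1 = db["B"]             # T1: read(B)
    db["B"] = b1 + 50        # T1: write(B)
    db["B"] = b2 + temp      # T2: write(B), overwrites T1's deposit
    return db


if __name__ == "__main__":
    start = {"A": 1000, "B": 2000}
    for run in (schedule_1, schedule_2, schedule_3, schedule_4):
        end = run(start)
        print(run.__name__, end, "sum =", end["A"] + end["B"])
```

Under these assumptions Schedule 3 ends in the same state as Schedule 1, while Schedule 4 changes the total of A + B.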

53 Transactions Concurrent Access: Attribute-level Inconsistency
C1:
S1: Select enrollment From College Where cName = 'DePaul';
S2: Update College Set enrollment = enrollment Where cName = 'DePaul';
concurrent with …
C2:
S3: Select enrollment From College Where cName = 'DePaul';
S4: Update College Set enrollment = enrollment Where cName = 'DePaul';

54 Transactions Concurrent Access: Tuple-level Inconsistency
C1:
S1: Select * From Apply Where sID = 123;
S2: Update Apply Set major = 'CS' Where sID = 123;
concurrent with …
C2:
S3: Select * From Apply Where sID = 123;
S4: Update Apply Set decision = 'Y' Where sID = 123;
Both modifying the Apply record for the student with sID = 123

55 Transactions Concurrent Access: Table-level Inconsistency
C1:
Update Apply Set decision = 'Y' Where sID In (Select sID From Student Where GPA > 3.9);
concurrent with …
C2:
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;

56 Transactions Concurrent Access: Multi-statement inconsistency
Insert Into Archive Select * From Apply Where decision = 'N';
Delete From Apply Where decision = 'N';
concurrent with …
Select Count(*) From Apply;
Select Count(*) From Archive;

57 Transactions (ACID Properties) Durability
If the system crashes after a transaction commits, all effects of the transaction remain in the database

58 Transactions (ACID Properties) Atomicity
Each transaction is “all-or-nothing,” never left half done

59 Transactions Transaction Rollback (= Abort)
Undoes partial effects of transaction
Can be system- or client-initiated
Each transaction is “all-or-nothing,” never left half done
Begin Transaction;
<get input from user>
SQL commands based on input
<confirm results with user>
If ans = 'ok' Then Commit; Else Rollback;
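The commit-or-rollback pattern on this slide can be tried with Python's built-in sqlite3 module, reusing the Archive/Apply statements from the earlier slides. This is a sketch under assumptions: the Apply columns (sID, major, decision), the sample rows, and the function names are invented for illustration, and the user's confirmation is modelled as a callback. It relies on sqlite3's default behaviour of opening a transaction implicitly before an Insert or Delete.

```python
import sqlite3


def make_demo_db():
    """Create in-memory Apply and Archive tables with a few sample rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Apply (sID INTEGER, major TEXT, decision TEXT)")
    conn.execute("CREATE TABLE Archive (sID INTEGER, major TEXT, decision TEXT)")
    conn.executemany(
        "INSERT INTO Apply VALUES (?, ?, ?)",
        [(123, "CS", "N"), (234, "EE", "Y"), (345, "CS", "N")],
    )
    conn.commit()
    return conn


def count(conn, table):
    """Row count of one of the two demo tables."""
    if table not in ("Apply", "Archive"):
        raise ValueError("unknown table")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def archive_rejected(conn, confirm):
    """Move rejected applications to Archive as one unit.

    confirm(moved) stands in for asking the user; the work is committed only
    when it returns True, otherwise everything is rolled back.
    """
    try:
        conn.execute("INSERT INTO Archive SELECT * FROM Apply WHERE decision = 'N'")
        conn.execute("DELETE FROM Apply WHERE decision = 'N'")
        moved = count(conn, "Archive")
        if confirm(moved):
            conn.commit()
            return True
        conn.rollback()
        return False
    except Exception:
        conn.rollback()  # a failure in either statement undoes both
        raise


if __name__ == "__main__":
    conn = make_demo_db()
    archive_rejected(conn, lambda moved: False)
    print("after rollback:", count(conn, "Apply"), count(conn, "Archive"))
    archive_rejected(conn, lambda moved: True)
    print("after commit:", count(conn, "Apply"), count(conn, "Archive"))
```

After the rollback neither table has changed; after the commit both statements have taken effect together.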

60 Transactions (ACID Properties) Consistency
Each client, each transaction:
  Can assume all constraints hold when the transaction begins
  Must guarantee all constraints hold when the transaction ends
Serializability ⇒ constraints always hold

61 Transactions (ACID Properties) Isolation
Serializability: Operations may be interleaved, but execution must be equivalent to some sequential (serial) order of all transactions
Cost: overhead and a reduction in concurrency

62 Transactions (ACID Properties) Isolation
Weaker “Isolation Levels”: Read Uncommitted, Read Committed, Repeatable Read
Strongest “Isolation Level”: Serializable
Weaker levels: ↓ Overhead, ↑ Concurrency, ↓ Consistency Guarantees

63 Transactions Isolation Levels
Per transaction
“In the eye of the beholder”
My transaction is Read Uncommitted
My transaction is Repeatable Read

64 Transactions Dirty Reads
“Dirty” data item: written by an uncommitted transaction
Select enrollment From College Where cName = 'DePaul';
Update College Set enrollment = enrollment Where cName = 'DePaul';
concurrent with …
Select Avg(enrollment) From College;

65 Transactions Dirty Reads
“Dirty” data item: written by an uncommitted transaction
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;
concurrent with …
Select GPA From Student Where sID = 123;
concurrent with …
Update Student Set sizeHS = 2600 Where sID = 234;

66 Transactions
Serializable: Strongest isolation level; SQL default
Read Uncommitted: A data item is dirty if it is written by an uncommitted transaction.
Problem of reading dirty data written by another uncommitted transaction: what if that transaction eventually aborts?

67 Transactions Isolation Level Read Uncommitted
A transaction may perform dirty reads
Select GPA From Student Where sizeHS > 2500;
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;
concurrent with …
Select Avg(GPA) From Student;

68 Transactions Isolation Level Read Uncommitted
A transaction may perform dirty reads
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;
concurrent with …
Set Transaction Isolation Level Read Uncommitted;
Select Avg(GPA) From Student;

69 Transactions
Read Committed: Cannot read dirty data written by other uncommitted transactions. But Read Committed is still not necessarily serializable.
Repeatable Read: If a tuple is read once, then the same tuple must be retrieved again if the query is repeated. Still not serializable; may see phantom tuples (tuples inserted by other concurrent transactions).

70 Transactions Isolation Level Read Committed
A transaction may not perform dirty reads
Still does not guarantee global serializability
Select GPA From Student Where sizeHS > 2500;
Update Student Set GPA = (1.1) * GPA Where sizeHS > 2500;
concurrent with …
Set Transaction Isolation Level Read Committed;
Select Avg(GPA) From Student;
Select Max(GPA) From Student;

71 Transactions Isolation Level Repeatable Read
A transaction may not perform dirty reads
An item read multiple times cannot change value
Still does not guarantee global serializability
Update Student Set GPA = (1.1) * GPA;
Update Student Set sizeHS = 1500 Where sID = 123;
concurrent with …
Set Transaction Isolation Level Repeatable Read;
Select Avg(GPA) From Student;
Select Avg(sizeHS) From Student;

72 Transactions Isolation Level Repeatable Read
A transaction may not perform dirty reads
An item read multiple times cannot change value
But a relation can change: “phantom” tuples
Insert Into Student [ 100 new tuples ]
concurrent with …
Set Transaction Isolation Level Repeatable Read;
Select Avg(GPA) From Student;
Select Max(GPA) From Student;

73 Transactions Isolation Level Repeatable Read
A transaction may not perform dirty reads
An item read multiple times cannot change value
But a relation can change: “phantom” tuples
Delete From Student [ 100 tuples ]
concurrent with …
Set Transaction Isolation Level Repeatable Read;
Select Avg(GPA) From Student;
Select Max(GPA) From Student;

74 Transactions Read Only transactions
Helps system optimize performance
Independent of isolation level
Set Transaction Read Only;
Set Transaction Isolation Level Repeatable Read;
Select Avg(GPA) From Student;
Select Max(GPA) From Student;

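75 Transactions Isolation Levels: Summary
Isolation level | dirty reads | nonrepeatable reads | phantoms
Read Uncommitted | Y | Y | Y
Read Committed | N | Y | Y
Repeatable Read | N | N | Y
Serializable | N | N | N
(Y = may occur, N = cannot occur)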

76 Transactions Isolation Levels: Summary
Standard default: Serializable
Weaker isolation levels
  Increased concurrency + decreased overhead = increased performance
  Weaker consistency guarantees
  Some systems have default Repeatable Read
Isolation level per transaction and “eye of the beholder”
  Each transaction’s reads must conform to its isolation level

