CS411 Database Systems Kazuhiro Minami 16: Final Review Session.

Slides:



Advertisements
Similar presentations
CM20145 Concurrency Control
Advertisements

Concurrency Control III. General Overview Relational model - SQL Formal & commercial query languages Functional Dependencies Normalization Physical Design.
Optimistic Methods for Concurrency Control By : H.T. Kung & John T. Robinson Presenters: Munawer Saeed.
1 Concurrency Control Chapter Conflict Serializable Schedules  Two actions are in conflict if  they operate on the same DB item,  they belong.
Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of concurrency control Performance tuning.
Concurrency Control II. General Overview Relational model - SQL  Formal & commercial query languages Functional Dependencies Normalization Physical Design.
1 ICS 214B: Transaction Processing and Distributed Data Management Lecture 5: Tree-based Concurrency Control and Validation Currency Control Professor.
1 Supplemental Notes: Practical Aspects of Transactions THIS MATERIAL IS OPTIONAL.
Jinze Liu. Have studied C.C. mechanisms used in practice - 2 PL - Multiple granularity - Tree (index) protocols - Validation.
Database Systems, 8 th Edition Concurrency Control with Time Stamping Methods Assigns global unique time stamp to each transaction Produces explicit.
Chapter 6 Process Synchronization: Part 2. Problems with Semaphores Correct use of semaphore operations may not be easy: –Suppose semaphore variable called.
Concurrent Transactions Even when there is no “failure,” several transactions can interact to turn a consistent state into an inconsistent state.
SECTION 18.8 Timestamps. What is Timestamping? Scheduler assign each transaction T a unique number, it’s timestamp TS(T). Timestamps must be issued in.
CONCURRENCY CONTROL SECTION 18.7 THE TREE PROTOCOL By : Saloni Tamotia (215)
CS 582 / CMPE 481 Distributed Systems Concurrency Control.
CS 245Notes 101 CS 245: Database System Principles Notes 10: More TP Hector Garcia-Molina.
Quick Review of May 1 material Concurrent Execution and Serializability –inconsistent concurrent schedules –transaction conflicts serializable == conflict.
Concurrency III (Timestamps). Schedulers A scheduler takes requests from transactions for reads and writes, and decides if it is “OK” to allow them to.
Transaction Management and Concurrency Control
Transaction Management and Concurrency Control
Concurrency III (Timestamps). Serializability Via Timestamps (Continued) Every DB element X has two timestamps: 1. RT (X) = highest timestamp of a transaction.
What is a Transaction? Logical unit of work
Concurrency. Busy, busy, busy... In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it.
Concurrency. Correctness Principle A transaction is atomic -- all or none property. If it executes partly, an invalid state is likely to result. A transaction,
Cs4432concurrency control1 CS4432: Database Systems II Concurrency Control with Recovery.
1 Lecture 19: Concurrency Control Friday, February 18, 2005.
Concurrency Control by Validation (Section 18.9) Priyadarshini.S Cs_257_117_ch 18_18.9.
18.8 Concurrency Control by Timestamps - Dongyi Jia - CS257 ID:116 - Spring 2008.
18.8 Concurrency Control by Timestamps CS257 Student Chak P. Li.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
System Catalogue v Stores data that describes each database v meta-data: – conceptual, logical, physical schema – mapping between schemata – info for query.
Prepared By: Ronak Shah Professor :Dr. T. Y Lin ID: 116.
Data Concurrency Control And Data Recovery
BIS Database Systems School of Management, Business Information Systems, Assumption University A.Thanop Somprasong Chapter # 10 Transaction Management.
CS411 Database Systems Kazuhiro Minami 14: Concurrency Control.
CS 245Notes 101 CS 245: Database System Principles Notes 10: More TP Hector Garcia-Molina.
1 CS 430 Database Theory Winter 2005 Lecture 16: Inside a DBMS.
Concurrency Control in Database Operating Systems.
1 Concurrency Control Chapter Conflict Serializable Schedules  Two schedules are conflict equivalent if:  Involve the same actions of the same.
1 Final Review Tuesday, March 6, The Final Date: Tuesday, March 13, 2007 Time: 6:30 - 8:30 Room: EE 037 You must come to campus Open book exam.
7c.1 Silberschatz, Galvin and Gagne ©2003 Operating System Concepts with Java Module 7c: Atomicity Atomic Transactions Log-based Recovery Checkpoints Concurrent.
Introduction.  Administration  Simple DBMS  CMPT 454 Topics John Edgar2.
Transaction Management Overview. Transactions Concurrent execution of user programs is essential for good DBMS performance. – Because disk accesses are.
Concurrency Control Introduction Lock-Based Protocols
1 Concurrency control lock-base protocols timestamp-based protocols validation-based protocols Ioan Despi.
Transaction Management Transparencies. ©Pearson Education 2009 Chapter 14 - Objectives Function and importance of transactions. Properties of transactions.
6340 DBMS Components. DBMS OS, application, middleware Components: storage, query optimizer, recovery manager, transaction processor, security.
CSE544: Transactions Concurrency Control Wednesday, 4/26/2006.
1 CSE232A: Database System Principles More Concurrency Control and Transaction Processing.
9 1 Chapter 9_B Concurrency Control Database Systems: Design, Implementation, and Management, Rob and Coronel.
Lecture 9- Concurrency Control (continued) Advanced Databases Masood Niazi Torshiz Islamic Azad University- Mashhad Branch
Jinze Liu. Tree-based concurrency control Validation concurrency control.
Chapter 13 Managing Transactions and Concurrency Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition.
1 Concurrency Control. 2 Why Have Concurrent Processes? v Better transaction throughput, response time v Done via better utilization of resources: –While.
Transaction Management Transparencies
Concurrency Control via Timestamps
Chapter 10 Transaction Management and Concurrency Control
Section 18.8 : Concurrency Control – Timestamps -CS 257, Rahul Dalal - SJSUID: Edited by: Sri Alluri (313)
Concurrency Control Chapter 17
Introduction to Database Systems CSE 444 Lecture 23: Final Review
Final Review Datalog (Ch 10) Rules and queries
Concurrency Control by Validation
Concurrency Control by Validation
Concurrency Control Chapter 17
Lecture 19: Concurrency Control
Transaction management
Transaction Management
Introduction to Database Systems CSE 444 Lecture 23: Final Review
Introduction to Database Systems CSE 444 Lectures 17-18: Concurrency Control November 5-7, 2007.
Final Review Friday, December 8, 2006.
Presentation transcript:

CS411 Database Systems Kazuhiro Minami 16: Final Review Session

Concurrency Control

Concurrency Control – Basic concepts What is a transaction? Which actions do we consider in a transaction? How to represent a transaction? What is a schedule? What is the goal of concurrency control? What is a serial schedule? What is a serializable schedule? What is a conflict-serializable schedule? What are conflicting swaps? How to determine whether a schedule is conflict- serializable?

Basic Concepts on Locks What is a lock? What is a lock table? What kind of information is stored there? What is the consistency of transactions? What is the legality of transactions? What is the notations for actions of locking and unlocking? What is the job of the locking scheduler? What is the two-phase locking (2PL) condition? What type of serializable schedules are produced with this approach? 4

Two-phase locked schedule T 1 : r 1 (A);w 1 (B); T 2 : r 2 (A);w 2 (A);w 2 (B); 1.Make T 1 and T 2 consistent, but not 2PL by adding lock and unlock actions 2.Give a legal, but not serializable schedule of T 1 and T 2 in question 1 3.Make T 1 and T 2 consistent and 2PL by adding lock and unlock actions 4.Give a legal schedule of T 1 and T 2 in question 3 5

Concepts on Timestamp-based Concurrency Control What is a timestamp? When is a timestamp assigned to a transaction? What’s the notation for transaction T’s timestamp? What information do we maintain for each database element X? What type of serializable schedules are produced with this approach? How the timestamp-based approach solve the problem of “dirty” read? 6

Assumed Serial Schedule Conflict serializable schedule that is equivalent to a serial schedule in which the timestamp order of transactions is the order to execute them T starts U starts V starts Actual schedule T starts U starts V starts Serial schedule

How to detect T’s reading X too late? T startU start U write X T read X

How can you detect T’s writing to late? 9 T startU start U read XT write X

Prevention of Dirty Read How to prevent transaction T from reading data written by uncommitted transaction U? We want T to wait until U commits 10 T start U start U write X T read X U aborts

Thomas Write Rule Why can we skip transaction T’s write on element X, which is already modified by transaction U? What if there is another transaction V starting after T but before U? What if V starts after U? 11 T start U start U writes XT writes X

Another Problem with Dirty Data T start U start U writes X T writes X U aborts Thomas write rule: T’s write can be skipped if But, we want T to wait until U commits T commits TS(T) < WT(X) We need to restore the previous value for X, but…

Concurrency Control by Timestamps Tell me what happens as each executes –st 1 ; r 1 (A); st 2 ; w 2 (B); r 2 (A); w 1 (B) T1 T2 A B RT=0 WT= r 1 (A) r 2 (A) RT=100 WT=200w 2 (B) RT=200 w 1 (B) OK, but needs to wait until T2 commits

Concurrency Control by Validation

Concurrency Control by Validation Another type of optimistic concurrency control Maintain a record of what active transactions are doing Just before a transaction starts to write, it goes through a “validation phase” If a there is a risk of physically unrealizable behavior, the transaction is rolled back Read actionsWrite actions Validation Transaction T

Validation-based Scheduler Keep track of each transaction T’s –Read set RS(T): the set of elements T read –Write set WS(T): the set of elements T write Execute transactions in three phases: 1.Read. T reads all the elements in RS(T) 2.Validate. Validate T by comparing its RS(T) an WS(T) with those in other transactions. If the validation fails, T is rolled back 3.Write. T writes its values for the elements in WS(T)

Assumed Serial Schedule for Validation We may think of each transaction that successfully validates as executing at the moment that it validates T validates U validates V validates Actual schedule Serial schedule T validates U validates V validates

Potential Violation: Read too Early Transactions T and U such that 1.U has validated 2.START(T) < FIN(U) 3.RS(T)  WS(U) is not empty U startT startU validated T validating T reads X U writes X

Another Potential Violation: Write too Early Two transactions T and U such that –U is in VAL –VAL(T) < FIN(U) –WS(T)  WS(U) is not empty T validating U finish T writes X U writes X U validated

Validation Rules To validate a transaction T, 1.Check that RS(T)  WS(U) is an empty set for any validated U and START(T) < FIN(U) 2. Check that WS(T)  WS(U) is an empty set for any validated U that did not finish before T validated, i.e., if VAL(T) < FIN(U)

Example Problem In the following sequence of events, tell what happens when each sequence is processed by a validation-based scheduler. R1(A,B); R2(B,C); V1; R3(C,D); V3; W1(A); V2; W2(A); W3(B); T1T2 T3 RS={B,C} WS={A} RS={C,D} WS={B} RS={A,B} WS={A}

Storage How to store records in a block? –Fixed-length record –Variable-length record What extra information do we need to store a record in a block? What information is stored in a block header? What information is stored in a record header? In which situations do we create an overflow block?

Indexing Do you know which types of index can we use in different situations? –Dense/Sparse –Secondary (unclustered) –Multi-level index B-tree –Do you understand the structure of a B-tree? Root node; internal node; leaf node –How to find/insert/delete a record? Dynamic hash table –Do you understand the structure of a dynamic hash table? Pointer array, data bucket, nub, etc. –How to find/insert/delete a record? –What are differences from static hash tables? –What are different between extensible and linear hash tables?

Query execution What are cost parameters? Can you describe how different join algorithms work? –Nested loop join; sort-based join; hash-based join Do you know the memory requirement of each algorithm? Do you know in which situation we can use a simple-sort join or sort-merge-join?

Query Optimization Do you know how to apply algebraic laws to get a better logical plan? Do you know how to perform a cost-based optimization on a logical plan? –Dynamic programming –Left-deep or Right-deep join trees Do you know how to estimate the size of an intermediate operation? –Selection –Join

Logging & Recovery What’s the correctness principle? What are primitive four operations of transactions? How does undo logging work? –Log records –Undo-logging rules –Recovery procedure How does redo logging work? How does undo/redo logging work? What is a checkpoint? What is a nonquiescent checkpoint? Do you know when is added in each method? Do you know recovery procedures with a checkpointed log? What are advantages and disadvantages of each method?