

Chapter 11 Grid Concurrency Control
11.1 A Grid Database Environment
11.2 An Example
11.3 Grid Concurrency Control (GCC)
11.4 Correctness of GCC
11.5 Features of GCC Protocol
11.6 Summary
11.7 Bibliographical Notes
11.8 Exercises

Grid Concurrency Control
- A concurrency control protocol helps maintain the consistency of data in a database.
- It addresses the ‘C’ (consistency) and ‘I’ (isolation) of the ACID properties.
- Serializability is the most widely accepted correctness criterion.
- Different database architectures need different concurrency control protocols; for example, the protocol for a centralized DBMS differs from that of a distributed DBMS.
D. Taniar, C.H.C. Leung, W. Rahayu, S. Goel: High-Performance Parallel Database Processing and Grid Databases, John Wiley & Sons, 2008

11.1 A Grid Database Environment
- Data is geographically distributed in a Grid environment.
- The figure shows the typical operation of a database in a Grid architecture: a distributed Grid database with three sites, DB1, DB2, and DB3, connected via Grid middleware.
- Transactions can be submitted at any site and may need to access data from all sites.
- The originator (or coordinator) is the site where a transaction is submitted.
- Transactions T1 and T2 are submitted to DB1, and both need to access data from DB2 and DB3 as well.
- Transaction and site identifiers are suffixed: T1 has sub-transactions ST12 and ST13, and T2 has sub-transactions ST22 and ST23.

11.1 A Grid Database Environment (Cont’d)
- Data access must be synchronized to maintain the correctness of data.
- Global lock tables, global logs, and similar global structures cannot be implemented in a Grid environment.
- Different database sites may implement different concurrency control protocols; for example, one site may use locking while another uses an optimistic concurrency control protocol.
- This situation is unavoidable in a Grid architecture because the database sites are heterogeneous.

11.2 An Example
- The following example shows that using traditional concurrency control protocols in the Grid environment may corrupt the data.
- Four data objects are stored in two databases, DB2 and DB3:
    DB2 = O1 and O2
    DB3 = O3 and O4
- Two transactions are submitted to database DB1:
    T1 = r1(O1) r1(O2) w1(O3) w1(O1) C1
    T2 = r2(O1) r2(O3) w2(O4) w2(O1) C2
- The transactions are submitted to the Grid middleware, and the metadata service forms the required sub-transactions.
  Sub-transactions of T1:
    ST12 = r12(O1) r12(O2) w12(O1) C12    (11.1)
    ST13 = w13(O3) C13                    (11.2)
  Sub-transactions of T2:
    ST22 = r22(O1) w22(O1) C22            (11.3)
    ST23 = r23(O3) w23(O4) C23            (11.4)

11.2 An Example (Cont’d)
- The sub-transactions are submitted to their respective sites: ST12 and ST22 go to DB2, and ST13 and ST23 go to DB3.
- All database sites are autonomous, so schedules (histories) are created independently. Say DB2 creates the following history:
    H2 = r12(O1) r12(O2) w12(O1) C12 r22(O1) w22(O1) C22    (11.5)
  and DB3 creates the following history:
    H3 = r23(O3) w23(O4) C23 w13(O3) C13                    (11.6)
- Equation 11.5 gives the serialization order T1 before T2, whereas equation 11.6 gives T2 before T1.
- Each of H2 and H3 is correct in isolation, but when the two histories are combined the serializability graph contains the cycle T1 → T2 → T1.
- A traditional distributed database handles this situation with a global management layer, which is not possible in Grid databases. The Grid Concurrency Control protocol is discussed next.
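
The cycle in this example can be checked mechanically. The following is a minimal sketch, not taken from the book: it applies the standard conflict rule (two operations of different transactions on the same object, at least one of them a write) to the two histories. The helper names `parse`, `conflict_edges`, and `has_cycle` are hypothetical, and the regular expression assumes single-digit transaction and site identifiers, as in the example.

```python
import re
from itertools import combinations

# An operation looks like "r12(O1)": kind, transaction id, site id, object.
# Commit operations such as "C12" are deliberately not matched.
OP = re.compile(r"([rw])(\d)(\d)\((\w+)\)")

def parse(history):
    """Return (kind, global transaction, object) triples in history order."""
    return [(kind, "T" + txn, obj) for kind, txn, _site, obj in OP.findall(history)]

def conflict_edges(history):
    """Edges Ti -> Tj: an operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for (k1, t1, o1), (k2, t2, o2) in combinations(parse(history), 2):
        if t1 != t2 and o1 == o2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)

H2 = "r12(O1) r12(O2) w12(O1)C12 r22(O1) w22(O1) C22"   # equation 11.5
H3 = "r23(O3) w23(O4) C23 w13(O3) C13"                   # equation 11.6

edges_db2 = conflict_edges(H2)   # what DB2 can see on its own
edges_db3 = conflict_edges(H3)   # what DB3 can see on its own
print(has_cycle(edges_db2), has_cycle(edges_db3), has_cycle(edges_db2 | edges_db3))
```

Neither site sees a cycle in its own graph; the cycle only appears in the union of the two edge sets, which no single autonomous site holds.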

11.3 Grid Concurrency Control (GCC)
- The example above is the motivation for GCC: although individual sites generate serializable schedules, the transactions may be ordered incorrectly in the global view.
- Functions required by GCC:
    DB_Accessed(T): takes a global transaction as its argument and returns the set of databases where sub-transactions of that global transaction are submitted.
    Split_Trans(T): takes a global transaction as its argument and returns a set of sub-transactions.
    Active_Trans(DB): takes a database as its argument and returns the set of global transactions that have any sub-transaction running in that database.
    Cardinality(Any Set): takes any set, e.g. a set of databases or a set of sub-transactions, and returns the number of elements in the set.
    Append_TS(Subtransaction): takes a sub-transaction as its argument and attaches a unique timestamp to it. Sub-transactions of the same global transaction have the same timestamp value.
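
To make the five functions concrete, here is a rough Python sketch under stated assumptions. The `LOCATION` catalogue (standing in for the metadata service), the `SubTransaction` record, and the exact signatures are all hypothetical; in particular, `db_accessed` and `append_ts` take the list of sub-transactions rather than the global transaction itself, purely to keep the sketch short.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

# Hypothetical catalogue standing in for the metadata service: object -> site.
LOCATION = {"O1": "DB2", "O2": "DB2", "O3": "DB3", "O4": "DB3"}

@dataclass
class SubTransaction:
    parent: str                      # name of the global transaction
    site: str                        # database where it runs
    ops: list                        # [(kind, object), ...]
    timestamp: Optional[int] = None

def split_trans(name, ops):
    """Split_Trans(T): group the operations of T by the site holding each object."""
    by_site = {}
    for kind, obj in ops:
        by_site.setdefault(LOCATION[obj], []).append((kind, obj))
    return [SubTransaction(name, site, site_ops) for site, site_ops in by_site.items()]

def db_accessed(subs):
    """DB_Accessed(T): the set of databases the sub-transactions are submitted to."""
    return {st.site for st in subs}

def active_trans(site, running):
    """Active_Trans(DB): global transactions with a sub-transaction running at site."""
    return {st.parent for st in running if st.site == site}

def cardinality(any_set):
    """Cardinality(Any Set): number of elements in the set."""
    return len(any_set)

_clock = count(1)                    # assumed source of unique timestamps

def append_ts(subs):
    """Append_TS: all sub-transactions of one global transaction share one timestamp."""
    ts = next(_clock)
    for st in subs:
        st.timestamp = ts
    return ts

# T1 from Section 11.2, written as (kind, object) pairs.
T1_OPS = [("r", "O1"), ("r", "O2"), ("w", "O3"), ("w", "O1")]
t1_subs = split_trans("T1", T1_OPS)
```

With the object placement of Section 11.2, `split_trans` reproduces the operations of ST12 and ST13 for T1.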

11.3 Grid Concurrency Control (GCC) (Cont’d)
Grid Serializability Theorem
- Traditional conflict serializability is not sufficient to ensure consistency in a Grid database environment.
- A Grid serializability theorem is needed to ensure the correctness of data.
- Global transactions fall into two categories: those with only one sub-transaction, and those with more than one sub-transaction.
- Total order is defined as below:

11.3 Grid Concurrency Control (GCC) (Cont’d)
- In traditional serializability theory, a serial history is considered correct. On the same grounds, a Grid-serial history is considered correct in a Grid architecture.
- Grid-serial history is defined as below:
- Condition (1) of Definition 11.2 is very strict and does not allow interleaving of operations.
- Hence a more practical notion, the Grid-serializable history, is used.

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Grid-serializable history:
- Grid serializability is analysed using the Grid serializability graph.
- If the graph is acyclic, the history is Grid serializable.

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Grid serializability graph is defined as below:

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Condition (1) covers local transactions in the Grid serializability graph.
- Condition (2) considers only those global transactions that have more than one sub-transaction.
- Condition (3) describes the arc between conflicting transactions.
- The Grid serializability graph is stored at the local sites, as there is no global management layer.
- The following types of conflict are possible:
    Conflict between global transactions (global-global conflict)
    Conflict between a global transaction and a local transaction (global-local conflict)
    Conflict between local transactions (local-local conflict)
- An acyclic Grid serializability graph is used to resolve global-local conflicts.
- Total order is used to resolve global-global conflicts.

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Based on the Grid serializability graph and total order, the Grid serializability theorem is as follows:

11.3 Grid Concurrency Control (GCC) (Cont’d)
Example of a Grid serializability graph
- In addition to the global transactions of the earlier example, consider the following local transactions (LT12 is read as local transaction 1 at database site DB2):
    LT12 = lr12(O1) lw12(O2) lC12
    LT13 = lw13(O3) lC13
- Now consider the following modified histories:
    H2 = lr12(O1) r12(O1) r12(O2) w12(O1) C12 r22(O1) w22(O1) lw12(O2) C22 lC12
    H3 = r23(O3) w23(O4) lw13(O3) C23 w13(O3) C13 lC13

11.3 Grid Concurrency Control (GCC) (Cont’d)
- The figure shows the Grid serializability graphs at sites DB2 and DB3.
- The three possible types of conflict are discussed below:
    Global-global conflict: at site DB2, ST12 precedes ST22 (i.e. T1 precedes T2), and at site DB3, ST23 precedes ST13 (i.e. T2 precedes T1). The cycle is therefore spread across different sites, and it may be impossible to identify it without a global management layer. The total order used in Grid serializability prevents cycles from forming across distributed sites.
    Global-local conflict: can be identified and resolved by the local DBMS, e.g. between ST12 and LT12 in DB2.
    Local-local conflict: can be identified and resolved by the local DBMS, as in a traditional DBMS.
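
The claim that a global-local conflict is visible to the local DBMS can be illustrated with the modified histories. This is a sketch only: it uses the standard conflict rule rather than the book's formal Grid serializability graph definition (which is not reproduced in this transcript), and the names `parse`, `conflict_edges`, and `mutual_conflicts` are hypothetical.

```python
import re
from itertools import combinations

# "lr12(O1)" is an operation of local transaction LT12; "r12(O1)" belongs to
# sub-transaction ST12 of global transaction T1. Commits are not matched.
OP = re.compile(r"(l?)([rw])(\d)(\d)\((\w+)\)")

def parse(history):
    """Return (kind, transaction name, object) triples in history order."""
    ops = []
    for local, kind, txn, site, obj in OP.findall(history):
        name = f"LT{txn}{site}" if local else f"T{txn}"
        ops.append((kind, name, obj))
    return ops

def conflict_edges(history):
    """Edges A -> B: an operation of A conflicts with a later operation of B."""
    edges = set()
    for (k1, t1, o1), (k2, t2, o2) in combinations(parse(history), 2):
        if t1 != t2 and o1 == o2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return edges

def mutual_conflicts(edges):
    """Pairs of transactions that each precede the other (two-node cycles)."""
    return {frozenset((a, b)) for a, b in edges if (b, a) in edges}

H2 = "lr12(O1) r12(O1) r12(O2) w12(O1)C12 r22(O1) w22(O1) lw12(O2) C22 lC12"
H3 = "r23(O3) w23(O4) lw13(O3) C23 w13(O3) C13 lC13"

print(mutual_conflicts(conflict_edges(H2)))
print(mutual_conflicts(conflict_edges(H3)))
```

Under this rule, DB2 alone finds that LT12 and T1 each precede the other (lr12(O1) comes before w12(O1), while r12(O2) comes before lw12(O2)), so the conflict between ST12 and LT12 can be detected without any information from DB3.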

11.3 Grid Concurrency Control (GCC) (Cont’d)
Grid Concurrency Control Protocol
- The protocol has two phases: submission and termination.
- The site where a transaction is submitted is called the originator.
- The Split_trans(T) function generates the sub-transactions of a global transaction.
- The sub-transactions are then submitted to the participating sites.
- A unique timestamp is attached to each sub-transaction before it is submitted.
- Sub-transactions at local databases are executed in total order.
- A local scheduler does not distinguish between a local transaction and a sub-transaction of a global transaction.
- Global transactions with only one sub-transaction do not need to be in total order, as they cannot conflict with another global transaction at more than one site.

GCC (Cont’d)
Submission phase of GCC

11.3 Grid Concurrency Control (GCC) (Cont’d)
Step-1) Check whether data from multiple sites needs to be accessed.
    If only data at the originator is required, treat the transaction as a local transaction.
    If more than one database needs to be accessed, the transaction is submitted to the metadata service, and the Split_trans(T) function creates the sub-transactions.
Step-2) Global transactions are added to a set, named Active_Trans, which stores all currently executing global transactions.
Step-3) The middleware appends a timestamp to every sub-transaction before submitting it to its database.
Step-4) If more than one active global transaction that accesses more than one database exists at the same time, the sub-transactions are executed in total order (according to the timestamp).
Step-5) When all sub-transactions of a global transaction have finished executing, the global transaction is removed from the Active_Trans set (details are in the termination phase of GCC).
Note: Active_Trans is the set of currently active global transactions, whereas Active_trans(DB) is a function that takes a database site as its argument and returns the active transactions executing in that database.
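
Steps 1 to 4 can be sketched roughly as follows. This is an illustration under assumptions rather than the book's algorithm: the `Middleware` class, its per-site priority queues, and the integer clock are hypothetical, and the sketch treats every transaction that touches a site other than the originator as global.

```python
import heapq
from itertools import count

class Middleware:
    """Hypothetical stand-in for the Grid middleware during the submission phase."""

    def __init__(self, location):
        self.location = location      # object -> site (stands in for the metadata service)
        self.active_trans = set()     # Active_Trans: currently executing global transactions
        self.queues = {}              # site -> heap of (timestamp, transaction, operations)
        self._clock = count(1)        # assumed source of unique timestamps

    def submit(self, name, ops, originator):
        # Split the operations by the site that holds each object.
        by_site = {}
        for kind, obj in ops:
            by_site.setdefault(self.location[obj], []).append((kind, obj))
        # Step 1: only the originator is involved, so treat it as a local transaction.
        if set(by_site) == {originator}:
            return "local"
        # Step 2: record the global transaction as active.
        self.active_trans.add(name)
        # Step 3: one timestamp shared by all sub-transactions of this transaction.
        ts = next(self._clock)
        for site, site_ops in by_site.items():
            heapq.heappush(self.queues.setdefault(site, []), (ts, name, site_ops))
        return ts

    def next_at(self, site):
        # Step 4: each site takes the sub-transaction with the smallest timestamp first.
        return heapq.heappop(self.queues[site])

LOCATION = {"O1": "DB2", "O2": "DB2", "O3": "DB3", "O4": "DB3"}
T1 = [("r", "O1"), ("r", "O2"), ("w", "O3"), ("w", "O1")]
T2 = [("r", "O1"), ("r", "O3"), ("w", "O4"), ("w", "O1")]

mw = Middleware(LOCATION)
ts1 = mw.submit("T1", T1, originator="DB1")
ts2 = mw.submit("T2", T2, originator="DB1")
first_at_db3 = mw.next_at("DB3")
```

Because both sites order their queues by the same timestamps, DB3 serves T1's sub-transaction before T2's, matching the order that DB2 uses.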

11.3 Grid Concurrency Control (GCC) (Cont’d)
Termination phase of GCC
- A global transaction remains active as long as at least one of its sub-transactions is executing.
- The steps of termination are as follows:
    When a sub-transaction finishes execution, the originator is informed.
    The sets of active transactions, conflicting active transactions, and databases accessed by the global transaction are updated accordingly.
    Check whether the completed sub-transaction is the last sub-transaction of the global transaction:
        if it is not the last, sub-transactions waiting in the queue cannot be scheduled;
        if it is the last, other conflicting sub-transactions can be scheduled. Sub-transactions from the queue then follow the normal submission steps.
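
The termination bookkeeping might look like the sketch below. The `Originator` class and its two structures (`pending` and `waiting`) are assumptions made for illustration; the slides do not specify how the sets are represented.

```python
class Originator:
    """Hypothetical bookkeeping kept at the originator during termination."""

    def __init__(self):
        self.pending = {}   # global transaction -> sites whose sub-transaction is still running
        self.waiting = []   # (blocked_by, transaction, site) for queued sub-transactions

    def start(self, name, sites):
        """Record an active global transaction and the sites it is running at."""
        self.pending[name] = set(sites)

    def wait(self, blocked_by, name, site):
        """Queue a sub-transaction that conflicts with an active global transaction."""
        self.waiting.append((blocked_by, name, site))

    def sub_transaction_finished(self, name, site):
        """Handle a completion notice; return the sub-transactions that may now be scheduled."""
        self.pending[name].discard(site)
        if self.pending[name]:
            # Not the last sub-transaction: nothing in the queue may be scheduled yet.
            return []
        # Last sub-transaction: the global transaction is no longer active.
        del self.pending[name]
        released = [(n, s) for blocked_by, n, s in self.waiting if blocked_by == name]
        self.waiting = [w for w in self.waiting if w[0] != name]
        return released

originator = Originator()
originator.start("T1", ["DB2", "DB3"])
originator.wait("T1", "T2", "DB2")
originator.wait("T1", "T2", "DB3")
after_first = originator.sub_transaction_finished("T1", "DB2")
after_last = originator.sub_transaction_finished("T1", "DB3")
```

In this run, nothing is released when ST12 finishes, and both of T2's sub-transactions are released once ST13, the last sub-transaction of T1, completes.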

11.3 Grid Concurrency Control (GCC) (Cont’d)
Termination phase of GCC

11.3 Grid Concurrency Control (GCC) (Cont’d)
Revisiting the example of Section 11.2
- Say transaction T1’s timestamp is 1 and T2’s timestamp is 2.
- History H2, produced by site DB2, is a serial history (equation 11.5) with T1 preceding T2.
- GCC will not schedule transactions as in H3 (equation 11.6), because of Step-4) of the submission phase. It always follows the total order based on timestamps, so sub-transactions of T1 are always scheduled before sub-transactions of T2.
- GCC generates histories H2 and H3 as follows:
    H2 = r12(O1) r12(O2) w12(O1) C12 r22(O1) w22(O1) C22    (same as (11.5))
    H3 = w13(O3) C13 r23(O3) w23(O4) C23    (execution order corrected by the GCC protocol)
- Thus both schedules order the transactions in total order, with T1 preceding T2.

11.3 Grid Concurrency Control (GCC) (Cont’d)
Comparison with traditional concurrency control protocols
- Operations of a general centralised locking protocol (e.g. centralised two-phase locking) in a homogeneous distributed DBMS

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Operations of a general distributed locking protocol (e.g. decentralised two-phase locking) in a homogeneous distributed DBMS

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Operations of a general multi-DBMS protocol

11.3 Grid Concurrency Control (GCC) (Cont’d)
- Operations of the GCC protocol

11.4 Correctness of GCC Protocol
- A Grid-serializable schedule is considered correct in the Grid environment.
- A concurrency control protocol conforming to Theorem 11.1 is Grid serializable and is therefore correct.
- Proposition 11.1: All local transactions and global sub-transactions submitted to any local scheduler are scheduled in serializable order.
- Proposition 11.2: Any two global transactions that each have more than one sub-transaction actively executing simultaneously must follow total order.
- Based on Propositions 11.1 and 11.2, the following theorem can be proved:
- Theorem 11.2: Every schedule produced by the GCC protocol is Grid serializable.

11.5 Features of GCC Protocol
- Concurrency control in a heterogeneous environment: GCC does not use a global lock table or similar global structures, so it can work in an autonomous, heterogeneous environment.
- Reduced load on the originator site: because GCC does not use a centralized scheduling scheme, originator sites carry less load.
- Fewer messages in the inter-network: communication between the originator and the other participating sites is reduced.
- However, because there is no global management layer, some valid interleavings may not be possible, which may result in a strict schedule.

11.6 Summary
- A global management layer cannot be used in a Grid environment.
- The GCC protocol maintains the correctness of data in a Grid environment.
- The GCC protocol can work in a heterogeneous environment.
- Optimizing the scheduling process may be hard.
- The focus was on maintaining the consistency of data in Grid databases.

Continue to Chapter 12…