Concurrency Control in Distributed Databases. By Rishikesh Mandvikar, rmandvik[at]engr.smu.edu, May 1, 2004
2 Topics: Serializability Theory (Centralized Databases, Distributed Databases); Lock-Based Concurrency Control Algorithms: Centralized (2PL, S2PL), Distributed (C2PL, PC2PL, D2PL); Optimistic Concurrency Control
3 Serializability Theory [13]
5 Serializability Theory Extended to Distributed Databases [14]. Fragmentation: horizontal, vertical, hybrid. Replication: synchronous replication (ROWA protocol, voting); asynchronous replication.
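The two synchronous replication schemes named on this slide can be contrasted with a small sketch. It assumes the usual definitions, which the slide itself does not spell out: ROWA reads any one replica and writes all of them, while voting needs a read quorum r and a write quorum w over n replicas with r + w > n and 2w > n. All function names are hypothetical.

```python
def rowa_can_read(replica_up):
    """ROWA (read one, write all): a read needs any single live replica."""
    return any(replica_up)


def rowa_can_write(replica_up):
    """ROWA: a write must reach every replica, so all of them must be up."""
    return all(replica_up)


def quorums_are_valid(n, read_quorum, write_quorum):
    """Assumed quorum rules: every read quorum overlaps every write quorum
    (r + w > n), and any two write quorums overlap (2w > n)."""
    return read_quorum + write_quorum > n and 2 * write_quorum > n


def voting_can_proceed(replica_up, quorum):
    """Voting: an operation proceeds once enough live replicas vote for it."""
    return sum(replica_up) >= quorum
```

With three replicas and one of them down, ROWA still serves reads but blocks writes, whereas voting with r = w = 2 still allows both.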
6 Classification of CC Algorithms [14]
7 Locking-Based CC Algorithms. Centralized: 2PL (relaxed S2PL), S2PL. Distributed: C2PL, PC2PL, D2PL.
8 Two-Phase Locking (2PL) [13]. Rules: Growing phase: “A txn that has to read/write a data object must first request a read/write lock on it.” Shrinking phase: “A txn cannot request additional locks once it has released a lock.”
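The two rules can be sketched as a per-transaction check. This is a minimal illustration, not a lock manager: it ignores conflicts between transactions and only enforces that no lock is requested after the first release. The class and function names are hypothetical.

```python
class TwoPhaseViolation(Exception):
    """Raised when a transaction asks for a lock after releasing one."""


class TwoPhaseTxn:
    """Tracks one transaction's locks and enforces the two phases."""

    def __init__(self, name):
        self.name = name
        self.held = {}          # data object -> "read" or "write"
        self.shrinking = False  # becomes True at the first release

    def lock(self, obj, mode):
        # Growing phase only: no new locks once any lock has been released.
        if self.shrinking:
            raise TwoPhaseViolation(f"{self.name} is already shrinking")
        if mode not in ("read", "write"):
            raise ValueError(mode)
        # Keep the stronger mode if the same object is locked twice.
        if self.held.get(obj) != "write":
            self.held[obj] = mode

    def unlock(self, obj):
        # The first release moves the transaction into its shrinking phase.
        del self.held[obj]
        self.shrinking = True


def demo_2pl():
    """Lock x and y, release x, then try to lock z."""
    t = TwoPhaseTxn("T1")
    t.lock("x", "read")
    t.lock("y", "write")
    t.unlock("x")
    try:
        t.lock("z", "read")
    except TwoPhaseViolation:
        return "rejected"
    return "accepted"
```

In `demo_2pl` the request for z is rejected because it arrives after x was released.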
9 Lock Graph for 2PL
10 Strict Two-Phase Locking (S2PL) [13]. Rules: Growing phase: “A txn that has to read/write a data object must first request a read/write lock on it.” Non-shrinking phase: “A txn releases all of its locks only when it completes.”
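A minimal single-process sketch of the S2PL rule follows. It assumes shared read locks and exclusive write locks, and it rejects conflicting requests instead of queueing them. The point it illustrates is that the table offers no per-object unlock: locks go away only in `finish`, at commit or abort. The class and method names are hypothetical.

```python
class LockTable:
    """Shared/exclusive lock table in which locks are held until completion."""

    def __init__(self):
        self.readers = {}  # obj -> set of txn names holding read locks
        self.writer = {}   # obj -> txn name holding the write lock

    def acquire(self, txn, obj, mode):
        """Return True if the lock is granted, False if it conflicts."""
        other_writer = self.writer.get(obj) not in (None, txn)
        other_readers = self.readers.get(obj, set()) - {txn}
        if mode == "read":
            if other_writer:
                return False
            self.readers.setdefault(obj, set()).add(txn)
            return True
        # A write lock conflicts with any other reader or writer.
        if other_writer or other_readers:
            return False
        self.writer[obj] = txn
        return True

    def finish(self, txn):
        """Commit or abort: the only point at which a txn's locks are released."""
        for holders in self.readers.values():
            holders.discard(txn)
        for obj in [o for o, w in self.writer.items() if w == txn]:
            del self.writer[obj]
```

Because T2 cannot read an object that T1 has written until T1 finishes, T2 never sees uncommitted data.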
11 Lock Graph for S2PL
12 2PL, S2PL [13]
13 2PL vs. S2PL: Differences. 2PL: cascading aborts possible; produces conflict-serializable schedules (but not every conflict-serializable schedule); high concurrency. S2PL: no cascading aborts; serializable schedules; low concurrency.
14 Centralized 2PL
15 Centralized 2PL [14]. Cons: failure of the primary site; bottleneck at the primary site; dependence on communication links.
16 Primary Copy 2PL [14]. A lock on the primary copy is necessary; lock management happens at the primary-copy sites only. Pros: reduces load at the central site. Cons: deadlock handling is partially centralized.
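The routing idea behind primary copy 2PL can be sketched as follows. This is a simplified illustration that assumes exclusive locks only and a static map from each data object to the site holding its primary copy. `Site` and `PrimaryCopyRouter` are hypothetical names.

```python
class Site:
    """A site that manages locks for the primary copies it holds."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # obj -> txn currently holding the lock

    def request(self, txn, obj):
        # Grant the lock if it is free or already held by the same txn.
        holder = self.locks.setdefault(obj, txn)
        return holder == txn

    def release_all(self, txn):
        # Drop every lock held by txn at this site.
        self.locks = {o: t for o, t in self.locks.items() if t != txn}


class PrimaryCopyRouter:
    """Sends each lock request to the site holding the object's primary copy."""

    def __init__(self, primary_of):
        self.primary_of = primary_of  # obj -> Site with the primary copy

    def lock(self, txn, obj):
        return self.primary_of[obj].request(txn, obj)
```

Lock traffic for x and y lands on different sites, which is how the scheme spreads the load that centralized 2PL puts on one site.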
17 Distributed 2PL [14]
18 Distributed 2PL [14]. Pros: independent lock management at each site. Cons: complex deadlock handling required; communication cost.
19 Optimistic Concurrency Control [13][14]. Txns are assumed to have no conflicts; each txn works in a private workspace area; txns are validated before the write phase.
20 Optimistic Concurrency Control [13][14]. Txn phases: (1) Read and compute: read from the database and write into the private workspace. (2) Validate: timestamps are assigned here; check for conflicts with concurrent txns. (3) Write: copy into the database if validation is successful.
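The three phases can be sketched with a transaction object that buffers its writes. This is a minimal illustration that assumes the database is a plain dictionary and that the validation test is supplied by the caller. `OptimisticTxn` and the `validate` callback are hypothetical.

```python
class OptimisticTxn:
    """Read/compute in a private workspace, then validate, then write."""

    def __init__(self, db):
        self.db = db
        self.workspace = {}     # private copies of written values
        self.read_set = set()   # keys read, kept for validation
        self.write_set = set()  # keys written, kept for validation

    def read(self, key):
        self.read_set.add(key)
        # A txn sees its own buffered writes before the database value.
        if key in self.workspace:
            return self.workspace[key]
        return self.db[key]

    def write(self, key, value):
        # Writes stay in the workspace until validation succeeds.
        self.write_set.add(key)
        self.workspace[key] = value

    def commit(self, validate):
        # Validate phase: the caller-supplied test decides the outcome.
        if not validate(self):
            self.workspace.clear()
            return False
        # Write phase: copy the workspace into the database.
        self.db.update(self.workspace)
        return True
```

The database is untouched until `commit` succeeds, so a failed validation needs no undo, only a restart.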
21 Optimistic Concurrency Control [13][14]. Validation criteria for Ti and Tj where TS(Ti) < TS(Tj); at least one must hold: (1) all phases of Ti execute before Tj begins; (2) Ti ends before the write phase of Tj, and Ti doesn’t modify data read by Tj; (3) Ti finishes its read phase before Tj finishes its read phase, and Ti doesn’t write any data that Tj reads or writes.
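The three criteria can be written as set and timestamp checks. The sketch assumes every phase boundary is recorded as an integer on a logical clock, and it reads the third criterion as "Ti's write set is disjoint from Tj's read and write sets", which is an interpretation of the slide's wording. The `Txn` record and `valid_pair` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    read_set: frozenset
    write_set: frozenset
    start: int        # read phase begins
    read_end: int     # read phase ends
    write_start: int  # write phase begins
    finish: int       # write phase ends


def valid_pair(ti, tj):
    """True if Tj passes validation against Ti, given TS(Ti) < TS(Tj)."""
    # Criterion 1: Ti runs entirely before Tj starts.
    if ti.finish < tj.start:
        return True
    # Criterion 2: Ti ends before Tj's write phase and writes nothing Tj read.
    if ti.finish < tj.write_start and not (ti.write_set & tj.read_set):
        return True
    # Criterion 3: Ti's read phase ends first and Ti's writes touch
    # nothing that Tj reads or writes.
    if ti.read_end < tj.read_end and not (
        ti.write_set & (tj.read_set | tj.write_set)
    ):
        return True
    return False
```

A Tj that overlaps Ti in time still validates as long as it read nothing that Ti wrote; if it did, Tj must restart.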
22 Optimistic Concurrency Control [13][14]. Validation: to validate Tj w.r.t. a committed txn Ti where TS(Ti) < TS(Tj), maintain the lists of objects read and written by Tj. No other txn can commit while Tj is being validated; once Tj is validated, its write phase is allowed to finish. This creates a bottleneck.
23 Optimistic Concurrency Control [13][14]. Advantages: increased concurrency with a good “mix” of txns; better than lock-based systems. Disadvantages: bottleneck situation; maintaining a read/write list for every txn; copying the private workspace to the database; long txns.
24 Optimistic Concurrency Control [13][14]. Disadvantages with long txns: the read/write list would be very long; the chance of a restart is proportional to the square of the txn's size [9].
25 Research. Optimistic CC algorithm: IBM’s IMS FASTPATH (centralized DBMS). OCC in distributed DBMSs.
26 Conclusion: Serializability Theory; Lock-Based Systems; Optimistic CC Algorithms; Timestamp Ordering
27 Questions?