1
Signature Based Concurrency Control Thomas Schwarz, S.J. JoAnne Holliday Santa Clara University Santa Clara, CA 95053 tjschwarz,jholliday@scu.edu
2
Overview Transactional concurrency control in a distributed system: Signatures are a better version of version numbers. Signatures are calculated from the records.
3
Basic Idea A signature is a short string of f bits calculated from a record. We assume here an LH* file scenario. The file is a dictionary data structure associating keys with a non-key field. Record layout: key c | non-key field | signature
4
Basic Idea When a transaction reads a record it records the signature of the record. When the transaction is ready to commit, it checks whether any signatures of records it read have changed. If this is the case, the transaction restarts. Otherwise, it commits.
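The read-then-check cycle described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Transaction` class, the in-memory `store` dictionary, and the use of SHA-1 as the signature function are all assumptions made for the example.

```python
import hashlib


def signature(record: bytes) -> bytes:
    """Stand-in signature: SHA-1 digest of the record bytes (assumption)."""
    return hashlib.sha1(record).digest()


class Transaction:
    """Hypothetical transaction that remembers the signature of each record it reads."""

    def __init__(self, store: dict):
        self.store = store   # hypothetical key -> record mapping
        self.seen = {}       # key -> signature observed at read time

    def read(self, key):
        record = self.store[key]
        self.seen[key] = signature(record)
        return record

    def can_commit(self) -> bool:
        # Commit only if every record read still has the signature recorded earlier.
        return all(signature(self.store[k]) == s for k, s in self.seen.items())
```

If `can_commit()` returns False, the transaction would restart, as the slide describes.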
5
Basic Idea Danger of false negatives: Two different records can have the same signature. Control the probability of false negatives through the length of the signature. MD5 (16 B) and SHA-1 (20 B) are accepted in computer forensics.
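How the signature length controls the false-negative probability can be made concrete with a small calculation. The sketch assumes that signatures behave like uniformly distributed f-bit strings, which is an idealization; the function names are illustrative.

```python
import math


def false_negative_probability(f_bits: int) -> float:
    """Chance that two different records share an f-bit signature,
    assuming uniformly distributed signatures (an idealization)."""
    return 2.0 ** -f_bits


def bits_for_target(p: float) -> int:
    """Smallest signature length in bits whose collision probability is <= p."""
    return math.ceil(-math.log2(p))
```

Under this assumption a 16-byte (128-bit) signature gives a collision probability of 2^-128, roughly 2.9e-39 per pair of records.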
6
Simple Signature Scheme Each transaction i contains atomic operations: R_i(x) – Read record x; W_i(x) – Write record x; V_i(x) – Verify the signature of record x; A_i – Abort; C_i – Commit
7
Simple Signature Scheme Rules for transaction i: All reads precede all verifies. All verifies precede all writes. If another transaction j writes to x between transaction i's read of x and its verify of x, then transaction i aborts. If all verifies are successful, then the transaction does all its writes and commits.
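The ordering rules for a single transaction can be checked mechanically. The sketch below is illustrative only: it assumes a transaction is given as a list of operation codes "R", "V", and "W", and the helper name is hypothetical.

```python
PHASE = {"R": 0, "V": 1, "W": 2}


def follows_phase_order(ops) -> bool:
    """True if all reads precede all verifies and all verifies precede all writes.

    ops is a list of operation codes for one transaction, e.g. ["R", "R", "V", "W"].
    """
    phases = [PHASE[op] for op in ops]
    return phases == sorted(phases)
```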
8
Simple Signature Scheme Dirty Reads: R_i(x) W_j(x) A_j C_i or R_i(x) W_j(x) C_i A_j. Impossible, because a transaction writes only after all its verifies have succeeded, so a transaction that writes also commits.
9
Simple Signature Scheme Fuzzy Reads: R_i(x) W_j(x) C_j R_i(x). Possible only if we were to allow multiple reads of the same item x: R_1(x) W_2(x) C_2 R_1(x) V_1(x) C_1.
10
Simple Signature Scheme If we do all the reads in a single block: Arguably has the ANSI REPEATABLE READ property. Even has ANSI ANOMALY SERIALIZABLE. But it is certainly not serializable: R_1(x) R_2(x) R_1(y) R_2(y) V_1(x) V_2(x) V_1(y) V_2(y) W_1(x) W_2(x) W_2(y) W_1(y) C_1 C_2
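That the schedule on this slide is not conflict serializable can be checked by building its precedence graph and looking for a cycle. The sketch below is one way to do this, under two assumptions: verifies are treated as reads for conflict purposes, and the schedule is written without spaces inside each operation (for example `R1(x)`). The function names are illustrative.

```python
import re
from itertools import combinations


def parse(schedule: str):
    """Turn 'R1(x) W2(x) C1' into [(op, txn, item), ...]; commits and aborts are skipped."""
    return [(op, int(t), item)
            for op, t, item in re.findall(r"([RVW])(\d+)\((\w+)\)", schedule)]


def conflict_edges(ops):
    """Edge (a, b) when an operation of a precedes a conflicting operation of b.

    Two operations conflict if they belong to different transactions, touch the
    same item, and at least one is a write (verifies count as reads here).
    """
    edges = set()
    for (op1, t1, x1), (op2, t2, x2) in combinations(ops, 2):
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))
    return edges


def has_cycle(edges) -> bool:
    """True if the directed graph given by the edges contains a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen and reachable(nxt, target, seen | {nxt}):
                return True
        return False

    return any(reachable(n, n, {n}) for n in graph)


SLIDE_SCHEDULE = ("R1(x) R2(x) R1(y) R2(y) V1(x) V2(x) V1(y) V2(y) "
                  "W1(x) W2(x) W2(y) W1(y) C1 C2")
```

For the slide's schedule the graph contains edges in both directions between transactions 1 and 2, so no equivalent serial order exists.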
11
Extended Signature Scheme Add: The Verify-Write phase is atomic. Then: The scheme is (conflict) serializable. Proof (Idea): Consider all reads to be “pre-reads”. Only the verify operations are reads in the sense of concurrency control. Then the result follows by definition.
12
Implementation Lock-based implementation: Read-Calculate Phase: No locking at all. However, a transaction that reads an exclusively locked record might want to reread that record because that record might change. Verify-Write Phase: Read lock on all the signatures of records read. Write lock on all the signatures of records to be modified. Verify signatures and decide on commit / abort. Release all locks.
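The Verify-Write phase of the lock-based implementation might look roughly like the sketch below. Several simplifications are assumptions of this example and not part of the slides: a single process with one plain mutex per record stands in for distributed read and write locks, locks are taken in sorted key order to avoid deadlock, SHA-1 stands in for the signature function, and only records that already exist are handled.

```python
import hashlib
import threading


def signature(record: bytes) -> bytes:
    """Stand-in signature: SHA-1 digest of the record bytes (assumption)."""
    return hashlib.sha1(record).digest()


class Store:
    """Hypothetical in-memory store with one mutex per record."""

    def __init__(self, records):
        self.records = dict(records)
        self.locks = {k: threading.Lock() for k in records}


def verify_write(store, seen, writes) -> bool:
    """Run the Verify-Write phase; return True on commit, False on abort.

    seen:   key -> signature recorded during the Read-Calculate phase
    writes: key -> new record contents
    """
    keys = sorted(set(seen) | set(writes))   # fixed lock order (assumption)
    for k in keys:
        store.locks[k].acquire()
    try:
        if any(signature(store.records[k]) != s for k, s in seen.items()):
            return False                      # abort: nothing is written
        store.records.update(writes)          # all verifies passed: write and commit
        return True
    finally:
        for k in reversed(keys):
            store.locks[k].release()
```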
13
Implementation Lock-based implementation: Conservative Strict Two-Phase Locking. Locks are short-lived: One round of messages to acquire locks and signatures. One round of messages for commit / abort and release messages.
14
Implementation No-locking scheme: Transactions appear to servers to be very short. The chance of conflict is limited.
15
Signature Implementation We do not use the record signature directly, but a region signature. A region is a contiguous set of keys that all hash to the same bucket. Typically, a region should have between 0.5 and 5 records on average.
16
Signature Implementation Let c_i be the keys in a region. Then set the region signature to be [equation not preserved in this transcript]. Arithmetic is done in a Galois field (GF). g hashes keys into the GF. The record signature of a non-existing record is zero.
17
Signature Implementation The verify operations read region signatures. Addressed by the key-space they cover. Locking is done on regions. Store region signatures. Large regions have little storage overhead, small ones have large storage overhead.
18
Signature Implementation Region signatures prevent phantom records.
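Why a signature over a whole region catches phantoms can be illustrated with a stand-in. The sketch below does not use the algebraic Galois-field signature of the slides. It swaps in a SHA-1 digest over the sorted (key, record) pairs of a region, and it assumes integer keys with a region given as a half-open key range [lo, hi). The point is only that inserting a record into the region changes the region signature even though no previously read record changed.

```python
import hashlib


def region_signature(records: dict, lo: int, hi: int) -> bytes:
    """Stand-in region signature over the keys in [lo, hi).

    SHA-1 of the sorted (key, length, record) triples; this is NOT the
    algebraic signature used in the slides, only an illustration.
    """
    h = hashlib.sha1()
    for key in sorted(k for k in records if lo <= k < hi):
        h.update(key.to_bytes(8, "big"))
        h.update(len(records[key]).to_bytes(4, "big"))
        h.update(records[key])
    return h.digest()
```

A transaction that verified the region before an insert would see a different signature afterwards and abort, which is the phantom protection the slide refers to.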
19
Implementation No-Locking Scheme: Assumes loosely synchronized clocks, that is, clocks that are accurate to within a small multiple of the average message delay. The transaction acquires a time-stamp at the lowest numbered SDDS bucket it visits. The transaction sends verify / write / vote requests to all servers it visited. Each server votes, in the usual way, on whether the transaction should commit. If every server returns a yes vote to the transaction manager, then the transaction commits. The transaction manager sends out the result of the vote.
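The voting step can be sketched as follows. This is a simplification under stated assumptions: time-stamps and messaging are not modeled, each server is reduced to a dictionary of its current region signatures, and the function names are hypothetical.

```python
def server_vote(current_sigs: dict, expected_sigs: dict) -> bool:
    """A server votes yes if every signature the transaction expects still matches.

    current_sigs:  region id -> signature currently stored at this server
    expected_sigs: region id -> signature the transaction recorded when reading
    """
    return all(current_sigs.get(r) == s for r, s in expected_sigs.items())


def decide(votes) -> str:
    """Transaction manager: commit only if every server voted yes."""
    return "commit" if all(votes) else "abort"
```

A usage sketch: collect `server_vote(...)` from each visited server, pass the list to `decide`, and send the outcome back to all servers.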
20
Discussion The signature scheme is interesting if transactions have large calculation times and updates are rare. The signature scheme should be extendible to replicated databases. The size of a region can be fitted to the scale of the file, so that a region always has about the same number of records. E.g. whenever the LH* split pointer returns to zero, split regions in half.
21
Discussion Future Work: Performance evaluation