Failure and Availability DS Lecture # 12
Optimistic Replication Let everyone make changes –Only 3% of transactions ever abort Make changes, send updates –If someone else’s changes come through with T_him < T_you, your changes are overridden Wait for a bit before committing –deadlocks
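A minimal sketch of the override rule on this slide, assuming integer logical timestamps and a single tentative value per key. The names `Update`, `Replica`, `write_local`, and `receive` are hypothetical, and the tie-breaking choice (equal timestamps keep the local change) is an assumption the slide does not specify.

```python
from dataclasses import dataclass


@dataclass
class Update:
    key: str
    value: str
    timestamp: int  # logical timestamp of the transaction (assumed integer)
    site: str       # name of the replica that made the change


class Replica:
    """Holds tentative (not yet committed) changes, one per key."""

    def __init__(self, name):
        self.name = name
        self.tentative = {}  # key -> Update

    def write_local(self, key, value, timestamp):
        # Optimistic: apply the change locally without asking anyone first.
        self.tentative[key] = Update(key, value, timestamp, self.name)

    def receive(self, update):
        """Apply a remote update; return True if it overrides local state."""
        mine = self.tentative.get(update.key)
        # Rule from the slide: T_him < T_you means the remote change wins.
        if mine is None or update.timestamp < mine.timestamp:
            self.tentative[update.key] = update
            return True
        return False  # local change is older (or tied), so it stays


a = Replica("A")
a.write_local("x", "mine", timestamp=10)
overridden = a.receive(Update("x", "theirs", 7, "B"))   # older remote change
ignored = a.receive(Update("x", "later", 12, "C"))      # newer remote change
```

Waiting a little before committing gives older remote updates time to arrive, which is why the tentative value is kept separate from anything committed.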
Two Phase Commit Blocking: Trades availability for correctness How? [Diagram: TM → Cohort: Can commit?; Cohort → TM: yes; TM → Cohort: Do Commit; Cohort → TM: Committed]
2PC Blocking [Diagram: TM → Cohort: Can commit?; Cohort → TM: yes; TM → Cohort: Do Commit; Cohort → TM: Committed]
2PC Blocking After Can_Commit, TM fails –Everyone blocks After Do_Commit, cohort fails –TM blocks Single host failure can compromise the availability of the system
2PC Blocking Blocks because of the fear of the unknown –The system can be in an unknown state, so everyone blocks hoping for stability
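The blocking case can be made concrete with a small simulation. This is only a sketch under simplifying assumptions: messages are direct method calls, a TM crash is modelled by the flag `tm_crashes_after_votes`, and all names (`State`, `Cohort`, `two_phase_commit`) are hypothetical.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    READY = "ready"          # voted yes, waiting for the TM's decision
    COMMITTED = "committed"
    ABORTED = "aborted"


class Cohort:
    def __init__(self):
        self.state = State.INIT

    def on_can_commit(self, vote_yes=True):
        self.state = State.READY if vote_yes else State.ABORTED
        return vote_yes

    def on_decision(self, commit):
        self.state = State.COMMITTED if commit else State.ABORTED

    def on_timeout(self):
        # In READY the cohort cannot decide alone: the TM may have told
        # another cohort to commit, or it may have decided to abort.
        if self.state is State.READY:
            return "blocked"
        if self.state is State.INIT:
            self.state = State.ABORTED  # never voted, so aborting is safe
        return self.state.value


def two_phase_commit(cohorts, tm_crashes_after_votes=False):
    # Phase 1: the TM asks every cohort "Can commit?" and collects votes.
    votes = [c.on_can_commit() for c in cohorts]
    if tm_crashes_after_votes:
        # No decision is ever sent; each cohort eventually times out.
        return [c.on_timeout() for c in cohorts]
    # Phase 2: commit only if every vote was yes.
    decision = all(votes)
    for c in cohorts:
        c.on_decision(decision)
    return [c.state.value for c in cohorts]


normal = two_phase_commit([Cohort(), Cohort()])
stuck = two_phase_commit([Cohort(), Cohort()], tm_crashes_after_votes=True)
```

In the second run every cohort is left in `READY`, which is the unknown state: neither commit nor abort is safe to choose locally.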
3PC Remove the fear of the unknown Structure the state transition to remove ambiguity between commit/abort
3PC: Three Phase Commit Non-blocking consistency Combines agreement with transactions –1. There's no single state from which it's possible to make a transition directly to either a commit or abort state. –2. There is no state in which it is not possible to make a final decision and from which a transition to a commit state can be made.
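Condition 1 can be checked mechanically on a cohort's state graph. The graphs below are a simplified textbook-style rendering, not taken from the slides: the state names (`INIT`, `READY`, `PRECOMMIT`) and the function `ambiguous_states` are assumptions made for illustration.

```python
# Each graph maps a state to the states reachable in one transition.
TWO_PC = {
    "INIT": ["READY", "ABORT"],
    "READY": ["COMMIT", "ABORT"],   # one step away from either outcome
    "COMMIT": [],
    "ABORT": [],
}

THREE_PC = {
    "INIT": ["READY", "ABORT"],
    "READY": ["PRECOMMIT", "ABORT"],
    "PRECOMMIT": ["COMMIT"],        # extra state separates commit from abort
    "COMMIT": [],
    "ABORT": [],
}


def ambiguous_states(transitions):
    """Return states with direct transitions to both COMMIT and ABORT."""
    return sorted(
        state
        for state, successors in transitions.items()
        if {"COMMIT", "ABORT"} <= set(successors)
    )


two_pc_problem = ambiguous_states(TWO_PC)
three_pc_problem = ambiguous_states(THREE_PC)
```

Under these assumed graphs, 2PC has one state that violates condition 1 (`READY`), while the added pre-commit state leaves 3PC with none.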
Three-phase Commit
3PC Protocol Timeout is more meaningful Phase 1: yes/no –Failure, Timeout: abort Phase 2: Prepare to commit –Cohort Failure Before vote –abort After vote –Commit Phase 3: Commit –All acks: commit –After failure: commit
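One way to read the timeout rules is as a per-state default action for a cohort. The sketch below follows a common simplified formulation of 3PC and is an assumption about what the slide intends: a cohort that has not yet received "prepare to commit" aborts on timeout, and one that has received it commits. The class `Cohort3PC` and its method names are hypothetical.

```python
class Cohort3PC:
    """Cohort side of a simplified three-phase commit."""

    def __init__(self):
        self.state = "INIT"

    def on_can_commit(self, vote_yes=True):
        # Phase 1: vote yes/no.
        self.state = "READY" if vote_yes else "ABORT"
        return vote_yes

    def on_prepare_to_commit(self):
        # Phase 2: only sent if every cohort voted yes; the return value
        # stands in for the acknowledgement.
        if self.state == "READY":
            self.state = "PRECOMMIT"
        return self.state == "PRECOMMIT"

    def on_do_commit(self):
        # Phase 3: final commit.
        if self.state == "PRECOMMIT":
            self.state = "COMMIT"

    def on_timeout(self):
        # The timeout is meaningful: each waiting state has a safe default.
        if self.state in ("INIT", "READY"):
            self.state = "ABORT"     # nobody can have committed yet
        elif self.state == "PRECOMMIT":
            self.state = "COMMIT"    # every cohort is known to have voted yes
        return self.state


early = Cohort3PC()
early.on_can_commit()
early_result = early.on_timeout()

late = Cohort3PC()
late.on_can_commit()
late.on_prepare_to_commit()
late_result = late.on_timeout()
```

This sketch ignores network partitions and coordinator recovery, where the simple timeout defaults are no longer sufficient on their own.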
Recovery Checkpoints –Independent –Coordinated Logging –Independent checkpoints –Store activity
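Checkpointing and logging combine naturally: recover from the last checkpoint, then replay the logged activity. The sketch below keeps everything in memory for brevity, whereas a real system would write both to stable storage. The class `LoggedStore` and its methods are hypothetical names.

```python
import copy


class LoggedStore:
    """Counter store with an independent checkpoint plus an activity log."""

    def __init__(self):
        self.state = {}
        self.checkpoint = {}  # last saved snapshot (stable storage in practice)
        self.log = []         # operations applied since the last checkpoint

    def apply(self, key, delta):
        self.log.append((key, delta))  # record the activity before applying it
        self.state[key] = self.state.get(key, 0) + delta

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()      # older log entries are covered by the snapshot

    def recover(self):
        # Restart from the snapshot, then replay the logged operations.
        self.state = copy.deepcopy(self.checkpoint)
        for key, delta in self.log:
            self.state[key] = self.state.get(key, 0) + delta
        return self.state


store = LoggedStore()
store.apply("x", 1)
store.take_checkpoint()
store.apply("x", 2)
store.apply("y", 5)
store.state = {}              # simulate a crash that loses volatile state
recovered = store.recover()
```

A coordinated checkpoint would additionally require the processes to agree on when to snapshot, so that the saved states are mutually consistent. That step is not modelled here.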
Quorum-based Protocols Read and write quorums –K + G > N, so every read quorum overlaps every write quorum If write quorum > N/2 –Any two write quorums also overlap Improve availability
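The overlap condition K + G > N can be verified by brute force for small N. The sketch below enumerates every possible read and write quorum of the given sizes. The function name `quorums_always_overlap` and its parameters are hypothetical.

```python
from itertools import combinations


def quorums_always_overlap(n, read_size, write_size):
    """True if every read quorum shares a replica with every write quorum."""
    replicas = range(n)
    return all(
        set(read_q) & set(write_q)  # a non-empty intersection is truthy
        for read_q in combinations(replicas, read_size)
        for write_q in combinations(replicas, write_size)
    )


# With N = 5: sizes 2 and 4 sum to 6 > 5, so a read always sees a write.
safe = quorums_always_overlap(5, 2, 4)
# Sizes 2 and 3 sum to exactly 5, so two disjoint quorums exist.
unsafe = quorums_always_overlap(5, 2, 3)
```

The overlap is what lets a reader find at least one replica holding the latest write without contacting all N replicas, which is where the availability gain comes from.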
Starbucks 2PC? Interesting story from the web.. Synchronous Vs Asynchronous –Failure Model