Published by Herbert Owens. Modified over 8 years ago.
1
Antidio Viguria, Ann Krueger
A Nonblocking Quorum Consensus Protocol for Replicated Data
Divyakant Agrawal and Arthur J. Bernstein
Paper Presentation: Dependable Distributed Systems
2
Agenda
- Introduction
- Nonblocking Quorum Overview
- Blocking versus Nonblocking
- Propagation Mechanism
- Application in multi-robot systems
- Task allocation
- Fault tolerance for distributed task allocation algorithms
- Conclusions
3
Introduction
- Motivation for data replication
  - Fault tolerance
  - Delays in accessing data
- Gifford’s (blocking) quorum protocol
  - $q_r[x] + q_w[x] > n[x]$
  - Read and write operations are blocked until the current copy is identified
  - The quorum protocol was developed for full file access
  - Adopting it for databases is inefficient
4
Nonblocking Quorum Overview
- Assume a background mechanism that propagates updates to all copies
- Optimistic assumption: “Every copy of an object is current with high probability.”
- Significantly reduces delay
- If a copy turns out to be obsolete, the transaction can be rolled back
5
Nonblocking Quorum: Read an object x
1. Send a read request to $q_r[x]$ nodes
2. Continue the computation using the value in the first reply
3. Accumulate the quorum in the background; the quorum's replies are collected concurrently with the execution of the transaction
4. If the value used was obsolete, roll the transaction back
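The read steps can be illustrated with a minimal sequential sketch. It assumes each reply carries a version number and that the replies are already listed in arrival order; the `Reply` class and the `nonblocking_read` function are hypothetical names, and the real protocol collects the quorum concurrently with the transaction rather than in one call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    node: str      # replying node (illustrative field)
    version: int   # version number of that node's copy
    value: str     # value of that node's copy


def nonblocking_read(replies, quorum_size):
    """Return (value_used, must_roll_back) for replies in arrival order."""
    if len(replies) < quorum_size:
        raise ValueError("not enough replies to assemble a read quorum")
    first = replies[0]                    # step 2: proceed with the first reply
    quorum = replies[:quorum_size]        # step 3: quorum gathered afterwards
    newest = max(r.version for r in quorum)
    must_roll_back = first.version < newest   # step 4: first reply was obsolete
    return first.value, must_roll_back


# The first reply is stale here, so the transaction would be rolled back.
stale_first = [Reply("n1", 2, "old"), Reply("n2", 3, "new")]
value, rollback = nonblocking_read(stale_first, quorum_size=2)
```

When the first reply already holds the newest version in the quorum, the transaction never waited for the quorum at all, which is where the latency gain comes from.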
6
Nonblocking Quorum: Write an object x
1. Send a write request to $q_w[x]$ nodes
2. Continue the computation
3. Gather replies concurrently; they are used to calculate the version number
4. Delay the transaction commit until the required quorum is assembled (1)
7
Performance Comparison
Assumptions:
- Transaction T executes without conflicts
- Delays are due to quorum accumulation and rollback
- $q_r[x]$, $q_w[x]$, and $n[x]$ are the same for each object x
Blocking latency: $L_B = k \cdot D$
Nonblocking latency: $L_{NB} \approx (k-1)\,p\,D + D$ for $k \cdot c \ll D$
$L_{NB}$ improves on $L_B$ if p is small
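A small numeric sketch of the two latency formulas, L_B = k*D and L_NB ≈ (k-1)*p*D + D, may help. The slide does not define k and D; the code reads k as the number of operations in the transaction and D as the quorum-accumulation delay, which is an assumption, and the sample numbers are invented for illustration.

```python
def blocking_latency(k, D):
    """L_B = k * D: every one of the k operations waits for its quorum."""
    return k * D


def nonblocking_latency(k, D, p):
    """L_NB ~ (k - 1) * p * D + D, valid when k * c << D.

    p is the probability that the first reply is obsolete.
    """
    return (k - 1) * p * D + D


# Hypothetical values: 5 operations, delay of 100 time units, p = 0.1.
k, D, p = 5, 100.0, 0.1
l_b = blocking_latency(k, D)          # 500 time units
l_nb = nonblocking_latency(k, D, p)   # roughly 140 time units
```

The formula has two useful limits: with p = 0 the nonblocking latency collapses to a single delay D, and with p = 1 it equals the blocking latency k*D, so the benefit depends entirely on keeping p small.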
8
Optimistic Assumption
- The assumption that every copy is current most of the time cannot be justified in general.
- A mechanism that propagates the updates made to the write quorum $q_w[x]$ to all copies of x would decrease the value of p by increasing the number of current copies of x.
9
Propagation Mechanism
- Broadcast mechanism to propagate updates from the write quorum to all copies
- Need not be reliable
- X messages contain the new value of the object
- Some messages will be received by nodes among X which are already current
10
Propagation Mechanism
General model for spread of information
[Figure: nodes labeled x_new and x_old]
11
Effect of Propagation
- p is the probability that the first reply is obsolete
- p → 0 as the number of propagation cycles between successive logical accesses increases
12
Effect of Propagation
To justify the optimistic assumption that every copy is current most of the time, the propagation mechanism must be integrated so that the average number of propagation cycles between successive logical accesses is sufficiently large.
13
Propagation mechanism
- Log: a local copy organized as an ordered sequence of event records
- Gossip messages: used to keep copies of the log up to date
- Each site maintains a timetable with events and the timestamps at which they occurred
- A site uses the timetable to decide which portion of its log to send to another site and which portion to discard
- A site keeps its local copy of an event record as long as it is not certain that all other sites are aware of the event
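The log and timetable bookkeeping can be sketched as follows. This is a simplified illustration, not the presenters' implementation: it assumes the timetable is a two-dimensional table in which entry `table[k][j]` records the newest timestamp of site j's events that site k is believed to have seen, and the `Site` class and its method names are hypothetical.

```python
class Site:
    """One replica holding a log and a timetable (illustrative names)."""

    def __init__(self, site_id, n_sites):
        self.id = site_id
        self.n = n_sites
        self.clock = 0
        self.log = []  # event records: (origin_site, timestamp, payload)
        # table[k][j]: newest timestamp of site j's events that, as far as
        # this site can tell, site k has already seen
        self.table = [[0] * n_sites for _ in range(n_sites)]

    def record(self, payload):
        """Append a local event, e.g. a transaction commit."""
        self.clock += 1
        self.table[self.id][self.id] = self.clock
        self.log.append((self.id, self.clock, payload))

    def _known_by(self, event, k):
        origin, timestamp, _ = event
        return self.table[k][origin] >= timestamp

    def gossip_to(self, k):
        """Send only the records site k may not know, plus a timetable copy."""
        portion = [e for e in self.log if not self._known_by(e, k)]
        return portion, [row[:] for row in self.table]

    def receive(self, sender, portion, sender_table):
        """Merge a gossip message; return the events that were new here."""
        new_events = [e for e in portion if not self._known_by(e, self.id)]
        self.log.extend(new_events)
        for j in range(self.n):  # this site now knows what the sender knew
            self.table[self.id][j] = max(self.table[self.id][j],
                                         sender_table[sender][j])
        for k in range(self.n):  # adopt the sender's knowledge about others
            for j in range(self.n):
                self.table[k][j] = max(self.table[k][j], sender_table[k][j])
        # discard records that every site is known to have seen
        self.log = [e for e in self.log
                    if not all(self._known_by(e, k) for k in range(self.n))]
        return new_events
```

With three sites, a record created at site 0 and gossiped to site 1 stays in site 1's log because site 1 cannot yet be sure that site 2 has seen it. When site 1 gossips it on to site 2, site 2 learns from the merged timetable that every site knows the record and can discard it right after applying it.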
14
Propagation mechanism
- Runs periodically in the background
- Two properties are guaranteed:
  - Propagation property: assumes that site failures and network partitions are not permanent
  - Causality property
- For this application:
  - Event: describes a transaction commit and contains the new versions of the objects written
  - Implicit communication: gossip messages
  - Explicit communication: unicast (information pertinent to the quorum only)
15
Nonblocking quorum protocol with the propagation mechanism
- Atomic transactions: each finishes with a commit or an abort
- A two-phase commit protocol guarantees the atomicity of transactions that involve multiple sites
- The concurrency control protocol permits only recoverable executions
- The copies of a data object are not modified until a transaction commits
- The main difference from the original protocol appears when a transaction decides to commit
- Coordinator: the site at which a transaction is initiated
- Participants: the other sites that participate in the execution of the transaction
16
Two-phase commit protocol
- Commitment is delayed until T completes its verification for every object
- T explicitly sends a “prepare” message to all participants
- If any participant fails, T aborts by sending explicit messages
- Otherwise, T is committed
- The coordinator sends a commit message to all participants using implicit communication
- The coordinator's copy of the log is updated
- A site updates its database information when it receives a commit gossip message
- The commit record is discarded from the log once all sites have learned about T's commitment
- Commit messages are sent to every site
  - More overhead (can be reduced using optimizations)
  - More robust than the original protocol
  - No special termination protocol is needed when failures occur
[Slide labels: 1st phase, 2nd phase]
17
Application in multi-robot systems
- Cooperation
- Coordination
- Time
- Space
- Task allocation
- The nonblocking quorum protocol can be used for real-time applications such as robotics
- Use this algorithm to increase dependability in multi-robot systems
18
Task Allocation
- Problem: given a number of robots and tasks, which robot should execute which task in order to minimize a parameter (traveled distance, mission time, etc.)?
- Illustrative example:
  - 3 robots
  - 3 tasks (go to a certain point)
  - Minimize the global distance and time of the mission
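For a case as small as three robots and three tasks, the allocation problem can be solved by brute force. The sketch below assumes one task per robot and uses total straight-line distance as the cost; the coordinates and the `best_assignment` function are invented for illustration and do not come from the slides.

```python
from itertools import permutations
from math import dist


def best_assignment(robots, tasks):
    """Try every one-to-one assignment and keep the cheapest one.

    robots and tasks map names to (x, y) positions. The search visits
    n! assignments, so it is only practical for very small teams.
    """
    names = list(robots)
    best_cost, best_plan = None, None
    for order in permutations(tasks):
        cost = sum(dist(robots[r], tasks[t]) for r, t in zip(names, order))
        if best_cost is None or cost < best_cost:
            best_cost, best_plan = cost, dict(zip(names, order))
    return best_cost, best_plan


# Hypothetical positions for the 3-robot, 3-task example.
robots = {"Robot A": (0, 0), "Robot B": (10, 0), "Robot C": (20, 0)}
tasks = {"Task 1": (10, 1), "Task 2": (0, 1), "Task 3": (20, 1)}
cost, plan = best_assignment(robots, tasks)
```

The factorial growth of this search is one concrete reason why exhaustive centralized allocation stops being an option as the number of robots and tasks grows.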
19
Illustrative example
[Figure, shown on slides 19 and 20: Robot A, Robot B, Robot C; Task 1, Task 2, Task 3]
21
Different approaches
- Centralized
  - Exponential computational complexity (NP-hard)
  - Slow response in dynamic environments
  - Single point of failure
- Decentralized
  - Tolerates dynamic environments
  - No single point of failure
  - More complex
  - Efficient but not optimal solutions
  - The most successful approach so far is based on the CNP (Contract Net Protocol)
22
How does it work?
- Based on roles:
  - Director of the auction
  - Bidders
- Simplest protocol:
  1. The director announces a task with an associated minimum bid.
  2. Bidders send their bids.
  3. The director selects the best bid.
  4. The director allocates the task to the best bidder.
- The roles are played dynamically by the different robots.
- Each robot knows only the tasks allocated to it.
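One round of the simplest auction can be sketched in a few lines. The slides do not say how bids are compared, so this sketch assumes that a bid is a cost such as travel distance, that lower is better, and that a bid counts only if it beats the value announced by the director; `run_auction` and its parameters are hypothetical names.

```python
def run_auction(announced_bid, bids):
    """Return the winning bidder, or None if nobody beats the announced bid.

    bids maps a bidder name to its cost for the task (assumed lower is better).
    """
    # Steps 1 and 2: the director has announced the task, bidders have replied.
    valid = {robot: bid for robot, bid in bids.items() if bid < announced_bid}
    if not valid:
        return None  # the task stays with the director
    # Steps 3 and 4: select the best bid and allocate the task to that bidder.
    return min(valid, key=valid.get)


# Hypothetical bids for one task.
winner = run_auction(10.0, {"Robot A": 12.0, "Robot B": 4.0, "Robot C": 7.5})
```

Because each robot can act as director for some tasks and as bidder for others, no single robot needs a global view of the allocation. That same property is why an allocated task is lost when the robot holding it fails.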
23
Illustrative example
[Figure sequence, slides 23 to 29: Robot A, Robot B, and Robot C. The first frame shows Task 1, Task 2, and Task 3; the following frames show Task 2 alone, then Task 2 and Task 1, then Task 2, Task 1, and Task 3.]
30
What happens when a robot fails?
- Types of failures
  - A failure that prevents the robot from executing the task (motors stalled, video cameras broken, etc.)
  - A failure in communications, i.e., the robot cannot communicate with the rest of the robots
  - The robot is dead (for example, a failure in the main power supply)
- Focus on robot deaths and failures in communication
- When a robot fails, the tasks allocated to it are lost: mission failed!
31
Possible solution
- Use well-known algorithms already tested for dependable distributed systems
- Use quorums to replicate the tasks allocated to each robot
- All robots work as both clients and servers
- Data is read only when there is a failure, so the number of writes is much larger than the number of reads
- $r + w > n$ (n = number of robots)
- For this application, choose $r \gg w$ to minimize the overhead: write once, read all
- Overhead is very important
- Use dynamic quorums
- To support just one crash failure: $r + w = n + 2$
- Take into account that a team of robots can split up and join frequently
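The two quorum conditions can be checked by enumeration. The sketch below computes the smallest possible overlap between any read quorum of size r and any write quorum of size w among n robots; `min_overlap` is an illustrative helper, and the team size of five is an arbitrary choice.

```python
from itertools import combinations


def min_overlap(n, r, w):
    """Smallest intersection between any read quorum and any write quorum."""
    robots = range(n)
    return min(
        len(set(read_q) & set(write_q))
        for read_q in combinations(robots, r)
        for write_q in combinations(robots, w)
    )


# With n = 5 robots:
no_guarantee = min_overlap(5, 3, 2)   # r + w = n:     quorums may miss each other
one_shared = min_overlap(5, 4, 2)     # r + w = n + 1: always one shared robot
two_shared = min_overlap(5, 5, 2)     # r + w = n + 2: always two shared robots
```

With r + w = n + 2, every read quorum shares at least two robots with every write quorum, so a single crashed robot still leaves one copy of the allocated task in the overlap.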
32
Illustrative example
[Figure sequence, slides 32 to 37: Robot A, Robot B, and Robot C with Task 2, Task 1, and Task 3. The final frame reads: Mission completed successfully!]
38
Conclusions
- Nonblocking quorums are better than blocking quorums (in terms of latency) when a propagation mechanism is used.
- The propagation mechanism lowers the probability that the first reply is obsolete.
- Algorithms used in distributed systems can be applied in other fields, such as robotics.
- A quorum protocol could be a good way to make multi-robot systems more fault tolerant.
39
Questions?
40
Thank you for your attention!
Antidio Viguria, Ann Krueger