Eddie Bortnikov & Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring
Principles of Reliable Distributed Systems
Recitation 9: Paxos Made Simple
Spring 2007
Alex Shraer
Abstract
“The Paxos algorithm, when presented in plain English, is very simple.”
The Model
Asynchronous
Benign failures (non-Byzantine)
Processes may fail by stopping, and may restart.
–No limitation on the number of failures.
Messages can be duplicated and lost, but not corrupted.
Agents
Three classes:
–proposers
–acceptors
–learners
A single process may act as more than one agent.
Our Goal – Understand Paxos
Safety requirements:
–A single value is chosen,
–Only a value that has been proposed may be chosen,
–A process never learns that a value has been chosen unless it actually has been.
Liveness requirements (conditional):
–Some proposed value is eventually chosen, and
–If a value has been chosen, then a process can eventually learn the value.
Choosing a value
Take I:
–Have a single acceptor agent.
–All proposers send proposals to the acceptor.
–The acceptor chooses the first proposed value it receives.
Where’s the problem?
Choosing a value (cont’d)
Take II:
–Multiple acceptors
–A proposer sends a proposal to a set of acceptors.
–An acceptor may accept the proposed value.
–The value is chosen when a large enough set of acceptors have accepted it.
How large is large enough?
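One natural answer, and the one the later slides adopt, is a majority: any two majorities of the same acceptor set share at least one acceptor, so two different values cannot both be accepted by "large enough" sets unless some acceptor accepts both. The following is a minimal brute-force sketch of that intersection property; the function names `majority` and `all_quorums_intersect` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of acceptors that forms a majority of n acceptors."""
    return n // 2 + 1


def all_quorums_intersect(n: int, size: int) -> bool:
    """Brute-force check: do every two acceptor sets of the given size share a member?"""
    quorums = [set(q) for q in combinations(range(n), size)]
    # An empty intersection is falsy, so all() fails as soon as two quorums are disjoint.
    return all(a & b for a, b in combinations(quorums, 2))


if __name__ == "__main__":
    n = 5
    print(majority(n))                               # size of a majority of 5 acceptors
    print(all_quorums_intersect(n, majority(n)))     # majorities always overlap
    print(all_quorums_intersect(n, majority(n) - 1)) # smaller sets can be disjoint
```

With sets smaller than a majority, two disjoint sets of acceptors could each accept a different value, which is exactly what the safety requirement forbids.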
When to choose?
If there are no failures or message loss, we want a value to be chosen even if there is only one proposer:
P1: An acceptor must accept the first proposal that it receives.
What happens if several values are proposed by different proposers at about the same time?
Two proposals
[Figure: proposals with values v1 and v2 reach the acceptors at about the same time.]
Numbered Proposals
P1 + “majority” requirement
–An acceptor must be allowed to accept >1 proposal.
Keeping track of proposals: unique numbers
–In fact, pairs: (counter, pid)
Choosing a value:
–A value is chosen when a single proposal (same number and value) has been accepted by a majority of acceptors.
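The (counter, pid) pairs can be sketched as tuples compared lexicographically: the counter decides the order, and the process id breaks ties, so two proposers never produce the same number. This is only an illustration; the names `ProposalNumber` and `next_number` are assumptions, not part of the slides.

```python
from typing import NamedTuple


class ProposalNumber(NamedTuple):
    """A unique proposal number; NamedTuple instances compare like plain tuples."""
    counter: int
    pid: int


def next_number(highest_seen: ProposalNumber, pid: int) -> ProposalNumber:
    """Return a number for process `pid` that is larger than `highest_seen`."""
    return ProposalNumber(highest_seen.counter + 1, pid)


if __name__ == "__main__":
    a = ProposalNumber(counter=1, pid=2)
    b = ProposalNumber(counter=2, pid=1)
    print(a < b)              # the counter dominates the comparison
    print(next_number(b, 2))  # process 2 picks a number above everything it has seen
```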
Maintaining Agreement
We can allow multiple proposals to be chosen, but must guarantee that all chosen proposals have the same value, so
P2: If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v.
Maintaining Agreement (cont’d)
To be chosen, a proposal must be accepted by at least one acceptor, so we can satisfy P2 by satisfying:
P2a: If a proposal with value v is chosen, then every higher-numbered proposal accepted by any acceptor has value v.
Worst Case Scenario
[Figure: proposal (1, v1) is accepted and v1 is chosen, while one acceptor does not hear any message (yet). A proposer then issues (2, v2) to that acceptor, which must accept it by P1 (marked “!”).]
Maintaining Agreement (cont’d)
To guarantee P2a, we need to strengthen it:
P2b: If a proposal with value v is chosen, then every higher-numbered proposal issued by any proposer has value v.
–Note: P2b ⇒ P2a ⇒ P2
How can P2b be enforced?
–Hard to predict the future – easy to learn the past!
Satisfying P2b
P2c: If a proposal (n, v) is issued, then there exists a majority S of acceptors, such that:
–Either no acceptor in S has accepted any proposal numbered less than n,
–Or v is the value of the highest-numbered proposal among all proposals numbered less than n accepted by the acceptors in S.
Proof of P2b: induction on n
Maintaining P2c (Prepare)
Choose proposal number n
Request from all acceptors:
–A promise not to accept any proposal numbered < n
–The proposal with the highest number < n that has been accepted (if any)
Await replies from a majority thereof
Issuing Proposals
Issue a proposal (n, v) such that:
–Either v is the value of the highest-numbered proposal reported in the replies,
–Or v is any value if no proposals were reported
Is it enough to get the response that no proposal < n was accepted from (only) a majority of acceptors?
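The value-selection rule above can be sketched as a small function over the replies to a prepare request. Here each reply is assumed to be either `None` (the acceptor has accepted nothing) or a `(number, value)` pair; the name `choose_value` and this reply encoding are illustrative assumptions.

```python
from typing import Optional, Sequence, Tuple

Proposal = Tuple[int, str]  # (proposal number, value); an assumed encoding


def choose_value(replies: Sequence[Optional[Proposal]], own_value: str) -> str:
    """Pick the value a proposer may issue after hearing from a majority."""
    reported = [p for p in replies if p is not None]
    if not reported:
        # No acceptor in the majority has accepted anything: any value is allowed.
        return own_value
    # Otherwise adopt the value of the highest-numbered reported proposal.
    return max(reported, key=lambda p: p[0])[1]


if __name__ == "__main__":
    print(choose_value([None, None, None], "x"))
    print(choose_value([None, (1, "a"), (3, "b")], "x"))
```

The proposer is free to use its own value only in the first case; in the second it must give up its own value, which is what keeps P2c true.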
Acceptor’s Algorithm
Can ignore any request without compromising safety.
Should be allowed to respond (for progress)
–Can always respond to prepare.
–Can respond to accept only if it has not promised not to.
P1a: An acceptor can accept a proposal numbered n iff it has not responded to a prepare request having a number greater than n.
Optimizations
If an acceptor receives a prepare request numbered n, but has already responded to a prepare request numbered > n…
–No need to reply!
A proposer can abandon a request if someone else has issued a higher-numbered one.
Acceptor State
An acceptor must remember:
–The highest-numbered proposal that it has ever accepted (aka AcceptNum)
–The number of the highest-numbered prepare request to which it has responded (aka BallotNum)
Need stable storage
–Why? Homework question
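The acceptor rules and state above can be sketched as a small class. This is a minimal in-memory illustration, not the course implementation: the field and method names are assumptions, proposal numbers are assumed to be positive integers, and nothing is written to stable storage. The sketch also raises `ballot_num` when it accepts a proposal, an added simplification that keeps `accepted` the highest-numbered accepted proposal.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Proposal = Tuple[int, str]  # (proposal number, value); an assumed encoding


@dataclass
class Acceptor:
    ballot_num: int = 0                  # highest prepare number responded to (BallotNum)
    accepted: Optional[Proposal] = None  # highest-numbered proposal accepted so far

    def on_prepare(self, n: int) -> Optional[Tuple[int, Optional[Proposal]]]:
        """Promise not to accept proposals numbered < n, or ignore the request."""
        if n <= self.ballot_num:
            return None  # already responded to an equal or higher prepare: no reply
        self.ballot_num = n
        return (n, self.accepted)

    def on_accept(self, n: int, value: str) -> bool:
        """P1a: accept unless a prepare with a number greater than n was answered."""
        if n < self.ballot_num:
            return False
        self.ballot_num = n  # assumption of this sketch, see the lead-in
        self.accepted = (n, value)
        return True


if __name__ == "__main__":
    a = Acceptor()
    print(a.on_prepare(1))       # promise, nothing accepted yet
    print(a.on_accept(1, "v1"))  # accepted
    print(a.on_prepare(3))       # promise, reporting the accepted proposal
    print(a.on_accept(2, "v2"))  # rejected: a promise for 3 was already made
```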
Putting it all together… Phase 1
Proposer: selects a proposal number n and sends a prepare request with n to a majority of acceptors.
Acceptor: if it receives a prepare request with number n greater than that of any prepare request to which it has already responded, it responds with a promise not to accept any more proposals numbered less than n, together with the highest-numbered proposal (if any) that it has accepted.
Phase 2
Proposer: if it receives a response to prepare(n) from a majority of acceptors, it sends an accept(n, v) request to each of those acceptors.
–v is the value of the highest-numbered proposal among the responses, or any value if the responses reported no proposals.
Acceptor: if it receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater than n.
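The two phases can be put together in a short single-process simulation. This is a sketch under strong assumptions: acceptors are plain in-memory objects called synchronously, no messages are lost, proposal numbers are positive integers, and the proposer contacts all acceptors but needs only a majority of promises. The names `Acceptor`, `prepare`, `accept`, and `propose` are illustrative, not taken from the slides.

```python
from typing import List, Optional, Tuple

Proposal = Tuple[int, str]  # (proposal number, value); an assumed encoding


class Acceptor:
    def __init__(self) -> None:
        self.promised = 0                          # highest prepare number answered
        self.accepted: Optional[Proposal] = None   # highest-numbered accepted proposal

    def prepare(self, n: int) -> Tuple[bool, Optional[Proposal]]:
        if n <= self.promised:
            return False, None
        self.promised = n
        return True, self.accepted

    def accept(self, n: int, value: str) -> bool:
        if n < self.promised:
            return False
        self.promised = n  # simplification: accepting also counts as a promise
        self.accepted = (n, value)
        return True


def propose(acceptors: List[Acceptor], n: int, own_value: str) -> Optional[str]:
    """Run Phase 1 and Phase 2 once; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises and previously accepted proposals.
    promises = []
    for acceptor in acceptors:
        ok, previous = acceptor.prepare(n)
        if ok:
            promises.append((acceptor, previous))
    if len(promises) < majority:
        return None

    # Use the value of the highest-numbered reported proposal, else our own.
    reported = [p for _, p in promises if p is not None]
    value = max(reported, key=lambda p: p[0])[1] if reported else own_value

    # Phase 2: ask the acceptors that promised to accept (n, value).
    acks = sum(acceptor.accept(n, value) for acceptor, _ in promises)
    return value if acks >= majority else None


if __name__ == "__main__":
    acceptors = [Acceptor() for _ in range(3)]
    print(propose(acceptors, 1, "v1"))  # first proposal chooses its own value
    print(propose(acceptors, 2, "v2"))  # later proposal must re-propose v1
    print(propose(acceptors, 1, "v3"))  # stale number: no promises, returns None
```

The second call illustrates P2b: once v1 is chosen, a higher-numbered proposal ends up carrying v1 rather than the proposer's own value.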
Progress
[Figure: message sequence Prepare(1), Ack(1, ), Prepare(3), Ack(3, ), Prepare(4)]
Progress (cont’d)
A distinguished proposer is required to guarantee progress
–A leader
–The only one allowed to issue proposals in stable state
Learning a Chosen Value
Multiple options (efficiency vs fault-tolerance):
–All processes are learners: each accepted value is sent to everyone.
–Distinguished learner(s)
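Either option needs a learner that recognizes when a value has been chosen, that is, when one proposal has been accepted by a majority of acceptors. The sketch below tallies "accepted" notifications per proposal. The class name `Learner`, the method `on_accepted`, and the message shape are assumptions made for illustration; duplicated messages are tolerated because acceptor ids are kept in a set.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


class Learner:
    def __init__(self, n_acceptors: int) -> None:
        self.majority = n_acceptors // 2 + 1
        # Maps each proposal (number, value) to the ids of acceptors that accepted it.
        self.votes: Dict[Tuple[int, str], Set[int]] = defaultdict(set)
        self.chosen: Optional[str] = None

    def on_accepted(self, acceptor_id: int, n: int, value: str) -> Optional[str]:
        """Record one notification; return the chosen value once it is known."""
        self.votes[(n, value)].add(acceptor_id)
        if len(self.votes[(n, value)]) >= self.majority:
            self.chosen = value
        return self.chosen


if __name__ == "__main__":
    learner = Learner(n_acceptors=3)
    print(learner.on_accepted(0, 1, "v1"))  # one acceptor is not a majority
    print(learner.on_accepted(0, 1, "v1"))  # a duplicate does not count twice
    print(learner.on_accepted(2, 1, "v1"))  # second distinct acceptor: chosen
```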
Putting it all Together
[Figure: message diagram among processes 1, 2, ..., n. Message labels: (“prepare”, 1); (“ack”, 1,1, r0, ); (“accept”, 1,1, v1); decide v1.]
1. Failure-free run
2. Our implementation always trusts process 1
Leslie Lamport
“At the PODC 2001 conference, I got tired of everyone saying how difficult it was to understand the Paxos algorithm … The current version is 13 pages long, and contains no formula more complicated than n1 > n2.”