P. Kouznetsov, 2006 Abstracting out Byzantine Behavior Peter Druschel Andreas Haeberlen Petr Kouznetsov Max Planck Institute for Software Systems.


2 Why Distributed ≠ Centralized?
- Failures: a process can deviate from its specification
- There are problems that cannot be solved in a fault-tolerant manner (even if just one process might fail)

3 Crash failures
- Crash fault-tolerant consensus cannot be achieved in an asynchronous system [FLP85]
- A process crashes = it prematurely halts all its activities

4 Abstracting out crash failures
- Failure detectors [Chandra and Toueg, 1996]
  - Engineering side: can be specified and implemented independently of algorithms
  - Theory side: can be used for comparing and classifying problems (the weakest failure detectors)

5 Using failure detectors
- Eventually strong FD <>S [Chandra and Toueg, 1996]: outputs a list of suspected processes. There is a time after which:
  - every crashed process is suspected by every correct process
  - some correct process is never suspected by any correct process
- Consensus is solvable with <>S and a majority of correct processes
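
The slides specify <>S only by its properties. One common way to approximate such a detector is with heartbeats and timeouts that grow after every false suspicion. The sketch below assumes message delays are eventually bounded (an assumption not stated in the slides). The class name `HeartbeatDetector` and its methods are hypothetical.

```python
class HeartbeatDetector:
    """Timeout-based suspicion list (illustrative sketch, not from the slides)."""

    def __init__(self, processes, initial_timeout=1.0):
        # Per-process timeout and time of the last heartbeat seen.
        self.timeout = {p: initial_timeout for p in processes}
        self.last_seen = {p: 0.0 for p in processes}
        self.suspected = set()

    def on_heartbeat(self, p, now):
        """Record a heartbeat from p received at time `now`."""
        self.last_seen[p] = now
        if p in self.suspected:
            # p was suspected by mistake: retract and wait longer next time.
            self.suspected.discard(p)
            self.timeout[p] *= 2

    def query(self, now):
        """Return the set of processes suspected at time `now`."""
        for p, seen in self.last_seen.items():
            if now - seen > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


fd = HeartbeatDetector(["a", "b"], initial_timeout=1.0)
fd.on_heartbeat("a", 0.5)
first = fd.query(1.2)      # "b" has been silent for longer than its timeout
fd.on_heartbeat("b", 1.3)  # late heartbeat: suspicion retracted, timeout doubled
second = fd.query(1.4)
```

A crashed process stops sending heartbeats and is eventually suspected forever. A correct process stops being suspected once its timeout has grown past the (assumed) eventual bound on message delay.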

6 Using failure detectors, contd.
- Abstracting out the majority assumption: quorum failure detector Σ [DFG, 2004]: outputs a list of processes, called a quorum
  - Every two quorums (output at any processes at any times) intersect
  - There is a time after which every output quorum contains only correct processes
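
Majorities are the classic example of the intersection property that Σ abstracts. The following sketch checks that property exhaustively for a small system. The helper names are hypothetical, and the code only illustrates the intersection requirement, not an implementation of Σ.

```python
from itertools import combinations


def quorums_of_size(processes, size):
    """All subsets of `processes` with exactly `size` members."""
    return [set(q) for q in combinations(sorted(processes), size)]


def pairwise_intersect(quorums):
    """True if every two quorums share at least one process."""
    return all(a & b for a, b in combinations(quorums, 2))


five = {"p1", "p2", "p3", "p4", "p5"}
majorities = quorums_of_size(five, len(five) // 2 + 1)  # size 3 out of 5
majorities_ok = pairwise_intersect(majorities)

four = {"p1", "p2", "p3", "p4"}
halves_ok = pairwise_intersect(quorums_of_size(four, 2))  # size 2 out of 4
```

Any two majorities overlap, whereas sets of exactly half the processes can be disjoint, which is why they cannot serve as quorums.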

7 The weakest failure detector
- <>S is necessary to solve consensus [CHT, 1996]
- Σ is the weakest FD to implement a RW register [DFG, 2004]
=> (<>S, Σ) is the weakest FD to solve consensus

8 State machine replication [Lamport, 1984; Schneider, 1993;…]
[Figure: clients and servers; labels: requests, response]

9 State machine replication
Client:
  broadcast request to all servers
  wait until a response is received
Server:
  repeat forever
    if there are unserved requests
      use consensus (<>S, Σ) to agree on the order in which the requests are served
      send the results of served requests to the clients
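
The server loop above can be sketched as follows. This is a minimal illustration under strong assumptions: there are no failures, and `agree_on_order` is a hypothetical stand-in for the consensus step (it simply sorts the commonly known requests by id). The `Replica` class and the toy "add"/"mul" operations are likewise invented for the example.

```python
class Replica:
    """A deterministic state machine holding a single integer (illustrative)."""

    def __init__(self):
        self.pending = set()  # requests received but not yet served
        self.state = 0

    def receive(self, request):
        self.pending.add(request)

    def serve(self, agreed_order):
        """Apply pending requests in the agreed order; return (id, result) pairs."""
        results = []
        for request in agreed_order:
            if request in self.pending:
                self.pending.discard(request)
                req_id, op, arg = request
                if op == "add":
                    self.state += arg
                elif op == "mul":
                    self.state *= arg
                results.append((req_id, self.state))
        return results


def agree_on_order(replicas):
    """Hypothetical stand-in for consensus: order common requests by id."""
    common = set.intersection(*(r.pending for r in replicas))
    return sorted(common)


r1, r2 = Replica(), Replica()
a, b = (1, "add", 2), (2, "mul", 3)
r1.receive(a); r1.receive(b)  # r1 receives a before b
r2.receive(b); r2.receive(a)  # r2 receives them in the opposite order
order = agree_on_order([r1, r2])
out1, out2 = r1.serve(order), r2.serve(order)
```

Because both replicas apply the same requests in the same order, they return the same results even though the requests arrived in different orders.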

10 Useful abstractions
- SMR (Totally ordered broadcast) = reliable broadcast + consensus [Toueg, Hadzilacos, 1993]
- Consensus = (<>S, Σ)

11 Detectable Byzantine failures
[Figure: failure classes; labels: Crash, Mute, Ignorant, Detectable Byzantine, Byzantine failures]

12 Byzantine failure detectors
- BFDs are parameterized with the specification of the correct system behavior
- The output of a BFD depends solely on detectable failures: no information about steps performed by correct processes can be extracted (necessary to distinguish algorithms from BFDs)

13 Byzantine FD abstraction
[Figure: layered architecture; labels: BFD, Automaton Ai, Network, Monitoring algorithms (PeerReview, HotDep 2006), Enforcing algorithms (SMR), Application]

14 State machine replication: classics
Client:
  broadcast requests to all servers
  wait until a response is received
Server:
  repeat forever
    if there are unserved requests
      use consensus to agree on the order in which the requests are served
      send the results of served requests to the clients
(!) a single malicious process can ignore correct requests and inject bogus requests

15 BFT state machine replication [Doudou et al, 2005]
- reliable broadcast + weak interactive consistency
- WIConsistency: every correct process proposes a value and decides on a set of values
  - the decided set contains at least one value proposed by a correct process
  - no two correct processes decide differently
- SMR can be implemented using RB and WIConsistency

16 The question
- SMR = RB + WIConsistency?
- No: (<>S_B, Σ_B) can implement SMR but cannot implement WIConsistency
=> WIConsistency > SMR

17 <>S_B [MR97, DS98, KMM03]
- Outputs a list of processes suspected to be mute. There is a time after which:
  - every mute process is suspected by every correct process
  - some correct process is never suspected by any correct process

18 Byzantine quorum FD Σ_B
- Outputs a list of processes, called a quorum
  - Every two quorums (output at any two correct processes at any times) share at least one correct process
  - There is a time after which every output quorum contains only correct processes
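
To see why "share at least one correct process" is achievable, consider the familiar setting of n = 3f+1 processes with quorums of size 2f+1 (an assumption made here for illustration; the slide does not fix n or the quorum size). Two such quorums overlap in at least 2(2f+1) - (3f+1) = f+1 processes, so with at most f faulty processes at least one common member is correct. The sketch below confirms the overlap by brute force for small f; `min_overlap` is a hypothetical helper.

```python
from itertools import combinations


def min_overlap(n, quorum_size):
    """Smallest intersection size over all pairs of quorums of the given size."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    return min(len(a & b) for a in quorums for b in quorums)


overlap_f1 = min_overlap(n=4, quorum_size=3)  # f = 1
overlap_f2 = min_overlap(n=7, quorum_size=5)  # f = 2
```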

19 SMR using (<>S_B, Σ_B)
- (<>S_B, Σ_B) can be used to implement a BFT replication system
- Adaptation of BFT [Castro, Liskov, 1999]:
  - wait until acks are received from 2f+1 processes => wait until acks are received from Σ_B
  - if the primary replica has timed out then initiate a view change => if the primary replica is in <>S_B then initiate a view change
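
The two substitutions can be pictured as swapping hard-coded conditions for failure detector queries. The predicates below are a hedged sketch: all function names are hypothetical, and `quorum` and `suspected` stand for the current outputs of Σ_B and <>S_B, which the code takes as plain sets rather than obtaining from a real detector.

```python
def enough_acks_by_count(acked, f):
    """Classic rule: acks from at least 2f+1 distinct processes."""
    return len(set(acked)) >= 2 * f + 1


def enough_acks_by_quorum(acked, quorum):
    """FD-based rule: every member of the current quorum has acked."""
    return set(quorum) <= set(acked)


def view_change_by_detector(primary, suspected):
    """FD-based rule: change view when the primary is currently suspected."""
    return primary in suspected


acks = {"r0", "r1", "r3"}
by_count = enough_acks_by_count(acks, f=1)
by_quorum = enough_acks_by_quorum(acks, quorum={"r0", "r1", "r3"})
missing = enough_acks_by_quorum(acks, quorum={"r0", "r2", "r3"})
change = view_change_by_detector("r2", suspected={"r2"})
```

The protocol logic no longer mentions f or timeouts directly. Those assumptions move into the implementation of the detectors.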

20 WIConsistency using (<>S_B, Σ_B)?
Assume an algorithm exists
- Let processes in Q be correct and the rest crash initially
- E: Q decide on V (set of values proposed by Q)
- E’: an extension of E in which some pi not in Q decides V
- E’’: an extension of E in which all processes that proposed the values in V are faulty and pi is correct
=> contradiction: pi decides V, yet V contains no value proposed by a correct process

21 Related work
- State machine replication [Lamport 84, 89; Schneider, 1990; Doudou et al., 2005;…]
- Failure detectors [Chandra, Toueg, 1991; Chandra et al., 1992; Delporte et al., 2003;…]
- Byzantine quorum systems [Malkhi, Reiter, 1997]
- Byzantine failure detection [MR97; DS98; KMM03; AMPR01; BAR, 2005; …]

22 Conclusions
Byzantine FD abstraction does make sense!
- BFT state machine replication using (<>S_B, Σ_B)
- BFT SMR is strictly weaker than WIConsistency
- Is the lower bound tight?
- How to implement Byzantine FDs?

23 Monitoring: PeerReview [HKD06]
BFD produces three types of indications for the application layer: trusted, suspected, and exposed.
Completeness:
- Eventually, every detectably ignorant node is forever suspected by every correct node
- Eventually, every detectably malicious node is exposed by every correct node
Accuracy:
- No correct node is forever suspected by a correct node
- No node is exposed by a correct node, unless it is detectably malicious

24 PeerReview approach
- Nodes locally observe message traffic and classify other nodes as trusted, suspected, or exposed
- Quick overview:
  - Every node keeps a log of all its local inputs and outputs
  - Use crypto techniques to ensure that the log is accurate and linear
  - Nodes can audit each other's logs at any time
  - To check for faulty behavior, auditors replay the contents of the log
  - In case of misbehavior, produce evidence that can be verified independently by other nodes
- Eventually complete and accurate!
[Figure: labels: State machine (e.g. NFS), Application, Network, PeerReview detector, {trusted, suspected, exposed}]
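
One standard way to make a log tamper-evident and linear is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below illustrates only that idea. The slides do not give PeerReview's actual log format, so the field names and helper functions are hypothetical, and signatures, auditing, and replay are omitted.

```python
import hashlib

GENESIS = "0" * 64  # hash value preceding the first entry (arbitrary choice)


def entry_hash(prev_hash, seq, kind, payload):
    """Hash of one entry, chained to the hash of the previous entry."""
    data = f"{prev_hash}|{seq}|{kind}|{payload}".encode()
    return hashlib.sha256(data).hexdigest()


def append(log, kind, payload):
    """Append an input or output event to the log."""
    prev = log[-1]["hash"] if log else GENESIS
    seq = len(log)
    log.append({"seq": seq, "kind": kind, "payload": payload,
                "hash": entry_hash(prev, seq, kind, payload)})


def verify(log):
    """Recompute the chain; any edited, dropped, or reordered entry breaks it."""
    prev = GENESIS
    for seq, e in enumerate(log):
        expected = entry_hash(prev, seq, e["kind"], e["payload"])
        if e["seq"] != seq or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True


log = []
append(log, "recv", "request 17 from client A")
append(log, "send", "response 17 to client A")
intact = verify(log)
log[0]["payload"] = "request 99 from client A"  # tamper with history
tampered = verify(log)
```

An auditor who holds the latest hash can detect any rewrite of earlier entries, which is what makes replaying the log meaningful.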

25 Typical consensus algorithm
repeat
  round++
  c = round mod n
  if p = c then
    try to “lock” the current estimate
  help in locking until a decided value is received from c, or c is suspected by <>S
until a decided value is received
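
The rotation of the coordinator can be illustrated with a toy simulation. It is deliberately simplified and should not be read as a consensus protocol: a fixed set `crashed` stands in for the processes that <>S eventually suspects, and a correct coordinator is assumed to lock and broadcast its estimate without any interference. The function name and parameters are hypothetical.

```python
def rotating_coordinator(estimates, crashed):
    """Return (round, decided value) for processes 0..n-1 with the given estimates."""
    n = len(estimates)
    if len(crashed) >= n:
        raise ValueError("at least one process must be correct")
    rnd = 0
    while True:
        rnd += 1
        c = rnd % n  # coordinator of this round
        if c in crashed:
            continue  # c is suspected, so processes move on to the next round
        # A correct coordinator locks its estimate and everyone decides it.
        return rnd, estimates[c]


no_failures = rotating_coordinator(["a", "b", "c"], crashed=set())
one_crash = rotating_coordinator(["a", "b", "c"], crashed={1})
```

With no failures the round-1 coordinator (process 1) decides. If it has crashed, the loop advances to process 2 in round 2.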
