
1 Failure Detectors CS 717 Ashish Motivala Dec 6th 2001

2 Relevant Papers
– Unreliable Failure Detectors for Reliable Distributed Systems. Tushar Deepak Chandra and Sam Toueg. Journal of the ACM.
– A Gossip-Style Failure Detection Service. R. van Renesse, Y. Minsky, and M. Hayden. Middleware '98.
– Scalable Weakly-consistent Infection-style Process Group Membership Protocol. Ashish Motivala, Abhinandan Das, Indranil Gupta. To be submitted to DSN 2002 tomorrow. http://www.cs.cornell.edu/gupta/swim
– On the Quality of Service of Failure Detectors. Wei Chen, Sam Toueg, and Marcos Aguilera. DSN 2000.
– Fail-Aware Failure Detectors. C. Fetzer and F. Cristian. Proceedings of the 15th Symposium on Reliable Distributed Systems.

3 Asynchronous vs Synchronous Model
Asynchronous model:
– No value to assumptions about process speed
– Network can arbitrarily delay a message
– But we assume that messages are sequenced and retransmitted (an arbitrary number of times), so they eventually get through
Failures in the asynchronous model?
– Usually limited to process “crash” faults
– If detectable, we call this “fail-stop” – but how to detect?

4 Asynchronous vs Synchronous Model
– No value to assumptions about process speed
– Network can arbitrarily delay a message
– But we assume that messages are sequenced and retransmitted (an arbitrary number of times), so they eventually get through
Synchronous model additionally:
– Assume that every process runs within bounded delay
– Assume that every link has bounded delay
– Usually described as “synchronous rounds”

5 Failures in Asynchronous and Synchronous Systems
Asynchronous:
– Usually limited to process “crash” faults
– If detectable, we call this “fail-stop” – but how to detect?
Synchronous:
– Can talk about message “omission” failures: failure to send is the usual approach
– But network assumed reliable (loss “charged” to sender)
– Process crash failures, as in the asynchronous setting
– “Byzantine” failures: arbitrary misbehavior by processes

6 Realistic???
Asynchronous model is too weak: it has no clocks (real systems have clocks, and “most” timing meets expectations… but with heavy tails)
Synchronous model is too strong (real systems lack a way to implement synchronized rounds)
Partially Synchronous Model: asynchronous network with a reliable channel
Timed Asynchronous Model: bounds on clock drift rates and message delays [Fetzer]

7 Impossibility Results
Consensus: all processes need to agree on a value
FLP impossibility of consensus
– A single faulty process can prevent consensus
– Realistic because a slow process is indistinguishable from a crashed one
Chandra/Toueg showed that FLP impossibility applies to many problems, not just consensus
– In particular, they show that FLP applies to group membership and reliable multicast
– So these practical problems are impossible in asynchronous systems
They also look at the weakest condition under which consensus can be solved

8 Byzantine Consensus
Example: 3 processes (A, B, C), 1 is faulty
– Non-faulty processes A and B start with inputs 0 and 1, respectively
– They exchange messages: each now has a set of inputs {0, 1, x}, where x comes from C
– C sends 0 to A and 1 to B
– A has {0, 1, 0} and wants to pick 0; B has {0, 1, 1} and wants to pick 1
By definition, impossibility in this model means “xxx can’t always be done”

9 Chandra/Toueg Idea
Theoretical idea: separate the problem into
– The consensus algorithm itself
– A “failure detector”: a form of oracle that announces suspected failures
– But the detector can change its decision (un-suspect a process)
Question: what is the weakest oracle for which consensus is always solvable?

10 Sample properties Completeness: detection of every crash –Strong completeness: Eventually, every process that crashes is permanently suspected by every correct process –Weak completeness: Eventually, every process that crashes is permanently suspected by some correct process

11 Sample properties
Accuracy: does it make mistakes?
–Strong accuracy: No process is suspected before it crashes.
–Weak accuracy: Some correct process is never suspected
–Eventual {strong/weak} accuracy: there is a time after which {strong/weak} accuracy is satisfied.

12 A sampling of failure detectors

Completeness   | Accuracy: Strong | Weak     | Eventually Strong      | Eventually Weak
Strong         | Perfect P        | Strong S | Eventually Perfect ◇P  | Eventually Strong ◇S
Weak           | D                | Weak W   | ◇D                     | Eventually Weak ◇W

13 Perfect Detector? Named Perfect, written P Strong completeness and strong accuracy Immediately detects all failures Never makes mistakes

14 Example of a failure detector
The detector they call W: “eventually weak”
More commonly written ◇W: “diamond-W”
Defined by two properties:
– There is a time after which every process that crashes is suspected by some correct process {weak completeness}
– There is a time after which some correct process is never suspected by any correct process {weak accuracy}
E.g. we can eventually agree upon a leader; if it crashes, we eventually, accurately detect the crash

15 ◇W: Weakest failure detector
They show that ◇W is the weakest failure detector for which consensus is guaranteed to be achieved
Algorithm is pretty simple
– Rotate a token around a ring of processes
– Decision can occur once the token makes it around once without a change in failure-suspicion status for any process
– Subsequently, as the token is passed, each recipient learns the decision outcome

16 Building systems with ◇W
Unfortunately, this failure detector is not implementable in an asynchronous system
– Yet it is the weakest failure detector that solves consensus
– Using timeouts, we can make mistakes at arbitrary times
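To make the timeout point concrete, here is a minimal, hypothetical sketch of a timeout-based detector (class and parameter names are invented for illustration): it is strongly complete, since a crashed process stops sending heartbeats and stays suspected forever, but a merely slow process can be wrongly suspected and later un-suspected, so accuracy can only hold eventually.

```python
import time

class TimeoutDetector:
    """Sketch of a timeout-based failure detector (not a real API)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_heard = {}  # process id -> time of last heartbeat

    def heartbeat(self, pid, now=None):
        # Record a heartbeat; `now` can be injected for testing.
        self.last_heard[pid] = time.monotonic() if now is None else now

    def suspects(self, now=None):
        # Suspect every process we have not heard from within the timeout.
        now = time.monotonic() if now is None else now
        return {p for p, t in self.last_heard.items()
                if now - t > self.timeout_s}

# A slow (not crashed) process is suspected, then "forgiven":
d = TimeoutDetector(timeout_s=1.0)
d.heartbeat("p1", now=0.0)
assert d.suspects(now=2.0) == {"p1"}   # mistaken suspicion
d.heartbeat("p1", now=2.5)             # a late heartbeat arrives
assert d.suspects(now=3.0) == set()    # suspicion revised
```

No choice of timeout avoids such mistakes in an asynchronous network, which is exactly why ◇W cannot be implemented there.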

17 Group Membership Service
[Diagram: a process group over an asynchronous lossy network; each process pi, pj maintains a membership list, updated on Join, Leave, and Failure events]

18 Data Dissemination using Epidemic Protocols
Want efficiency, robustness, speed and scale
Tree distribution is efficient, but fragile and hard to configure
Gossip is efficient and robust but has higher latency: network load is almost linear, and detection time scales as O(n log n) with the number of processes

19 State Monotonic Property
A gossip message contains the state of the sender of the gossip
The receiver uses a merge function to combine the received state with its own state
Need some kind of monotonicity in the state and in the gossip

20 Simple Epidemic
Assume a fixed population of size n
For simplicity, assume homogeneous spreading
– Simple epidemic: anyone can infect anyone with equal probability
Assume that k members are already infected
And that the infection occurs in rounds

21 Probability of Infection
What is the probability P_infect(k,n) that a particular uninfected member is infected in a round if k are already infected?
P_infect(k,n) = 1 – P(nobody infects the member) = 1 – (1 – 1/n)^k
E(# newly infected members) = (n – k) × P_infect(k,n)
Basically it’s a binomial distribution

22 2 Phases
Intuition: 2 phases
– Phase 1 (first half): 1 → n/2
– Phase 2 (second half): n/2 → n
For large n, P_infect(n/2, n) ≈ 1 – (1/e)^0.5 ≈ 0.4

23 Infection and Uninfection
Infection
– Initial growth factor is very high, about 2
– At the halfway mark it’s about 1.4
– Exponential growth
Uninfection
– The uninfected count shrinks slowly to start
– At the halfway mark the factor is about 0.4
– Exponential decline

24 Rounds
Number of rounds necessary to infect the entire population is O(log n)
Robbert uses a base of 1.585 for experiments
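The O(log n) round count can be seen in a toy simulation of the simple epidemic from slide 20 (the simulation setup is ours; in each round every infected member gossips to one uniformly random member):

```python
import math
import random

def rounds_to_infect(n, rng):
    # Simulate one epidemic until the whole population is infected.
    infected = {0}
    rounds = 0
    while len(infected) < n:
        # Every infected member picks one random gossip target.
        infected |= {rng.randrange(n) for _ in range(len(infected))}
        rounds += 1
    return rounds

rng = random.Random(42)
n = 1024
trials = [rounds_to_infect(n, rng) for _ in range(20)]
avg = sum(trials) / len(trials)
# avg stays a small multiple of log(n): compare with log base 1.585.
print(avg, math.log(n, 1.585))
```

The doubling phase up to n/2 and the slower mop-up phase afterwards (slides 22–23) are both visible if you print the infected count per round.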

25 How the Protocol Works
Each member maintains a list of (address, heartbeat) pairs
Periodically each member gossips:
– Increments its own heartbeat
– Sends (part of) its list to a randomly chosen member
On receipt of gossip, the lists are merged
Each member maintains the last heartbeat seen for each list member
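The merge step above can be sketched in a few lines (a simplified illustration, not the paper's code). Taking the per-address maximum is what gives the state-monotonic property of slide 19: heartbeats only ever grow, so merging in any order converges to the same state:

```python
def merge(mine, received):
    # Keep the highest heartbeat seen for each address. Since each
    # member only increments its own heartbeat, max() is a monotonic
    # merge: applying gossip in any order yields the same result.
    out = dict(mine)
    for addr, hb in received.items():
        if hb > out.get(addr, -1):
            out[addr] = hb
    return out

a = {"p1": 5, "p2": 3}
b = {"p1": 4, "p2": 7, "p3": 1}
assert merge(a, b) == {"p1": 5, "p2": 7, "p3": 1}
```

A member whose heartbeat stops increasing everywhere is then, after a timeout, declared failed.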

30 SWIM Group Membership Service
[Diagram: a process group over an asynchronous lossy network; each process pi, pj maintains a membership list, updated on Join, Leave, and Failure events]

31 System Design
Join, Leave, Failure: broadcast to all processes
Need to detect a process failure at some process quickly (to be able to broadcast it)
Failure Detector Protocol specifications:
– Detection Time, Accuracy: specified by the application designer to SWIM
– Load: optimized by SWIM

32 SWIM Failure Detector Protocol
[Diagram: protocol period = T time units; pi pings a random member pj; if no ack arrives, pi asks K randomly chosen processes to ping pj indirectly]

33 Properties
Expected detection time: e/(e–1) protocol periods
Load: O(K) per process
– Inaccuracy probability exponential in K
Process failures detected
– in O(log N) protocol periods w.h.p.
– in O(N) protocol periods deterministically
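The e/(e–1) figure comes from a simple geometric argument, sketched here under the simplifying assumption that each surviving member pings one uniformly random member per protocol period (function name is ours):

```python
import math

def expected_detection_periods(n):
    # A crashed member is probed in a given period with probability
    # p = 1 - (1 - 1/n)^(n-1), which tends to 1 - 1/e as n grows.
    # The period of first probing is geometric, so its mean is
    # 1/p -> e/(e - 1) ~ 1.58 protocol periods.
    p = 1 - (1 - 1 / n) ** (n - 1)
    return 1 / p

print(expected_detection_periods(1000))
print(math.e / (math.e - 1))  # the limit quoted above
```

Notably, this expected detection time is independent of group size, which is the key scalability property.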

34 Why not Heartbeating?
Centralized: single point of failure
All-to-all: O(N) load per process
Logical ring: unpredictability under multiple failures

35 LAN Scalability
Win2000, 100Base-T Ethernet LAN
Protocol period = 3×RTT, RTT = 10 ms, K = 1

36 Deployment
Broadcast ‘suspicion’ before ‘declaring’ a process failed
Piggyback broadcasts on ping messages
– Epidemic-style broadcast
WAN
– Load on core routers
– No representatives per subnet/domain

