 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring 2007 1 Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.


Material
- Chandra and Toueg, Unreliable Failure Detectors for Reliable Distributed Systems.
- Mostefaoui and Raynal, Solving Consensus using Chandra-Toueg’s Unreliable Failure Detectors: A General Approach.
- Keidar and Rajsbaum, On the Cost of Fault-Tolerant Consensus When There are no Faults: A Tutorial.

Fault-Tolerant Asynchronous Consensus is Impossible
- Every asynchronous fault-tolerant consensus algorithm has a fair execution in which no process decides [FLP85]
- It is possible to design asynchronous consensus algorithms that don’t always terminate

Should We Give Up?
- We can always model a system as synchronous with the right timeout
  - Messages never take more than 2 days
- But modeling systems as synchronous requires conservative timeouts
  - Problem?

Motivation: Choosing a Model
- Example network:
  - 99% of packets arrive within 10 µsec
  - Upper bound of 1000 µsec on message latency
- What round duration would we choose for a round-based synchronous system?
  - Implication?
- We would like to choose a timeout of 10 µsec, but without violating safety…
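
A quick arithmetic sketch of the trade-off, using only the two latency figures from this slide. The function name `slowdown` is illustrative and not part of the lecture.

```python
# Numbers from the example network above.
TYPICAL_LATENCY_US = 10        # 99% of packets arrive within this bound
WORST_CASE_LATENCY_US = 1000   # hard upper bound on message latency


def slowdown(round_duration_us: float, typical_us: float) -> float:
    """How many times longer a round lasts than a typical message needs."""
    return round_duration_us / typical_us


# A safe synchronous round has to wait for the worst case.
print(slowdown(WORST_CASE_LATENCY_US, TYPICAL_LATENCY_US))  # 100.0
```

With the conservative bound, every round is 100 times longer than what 99% of the messages need, which is the cost the next slides try to avoid.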

The Middle Ground
- We can choose timeouts that usually hold
  - During long stable periods, delays and processing times are bounded, as in the synchronous model
  - Some unstable periods behave like the asynchronous model
- We can design algorithms that always ensure safety, but ensure liveness only at stable times

How Do We Model This?
- Assume that in each run there is a Global Stabilization Time (GST) after which the system is stable
- GST is:
  - Unbounded
  - Unknown

Eventual Synchrony Model [Dwork, Lynch, Stockmeyer 88]
- Processes have clocks with bounded drift
- There are upper bounds
  - Δ on message delay, and
  - Φ on processing time
- GST, global stabilization time
  - Until GST, unstable: bounds do not hold
  - After GST, stable: bounds hold
  - GST unbounded, unknown

Eventual Synchrony in Practice
- For Δ, Φ, choose bounds that hold with high probability
- Stability forever?
  - We assume that once stable, the system remains stable
  - In practice, stability has to last “long enough” for the given algorithm to terminate

Time-Free Algorithms
- Describe algorithms using a failure detector abstraction [Chandra, Toueg 96]
- Goal: abstract away time, get simpler algorithms
- What makes a good abstraction?

The Failure Detector Abstraction [Chandra, Toueg 96]
- Each process has a local failure detector oracle
  - Typically outputs a list of processes suspected to have crashed at any given time
[Figure: algorithm instances A1, …, An connected by the network, each with a local failure detector (FD) module; the FD at A1 outputs {p3, p7} and the FD at An outputs {p3}.]

A Natural Failure Detector Implementation in Eventual Synchrony Model
- Implement failure detector using timeouts:
  - When expecting a message from a process i, wait Δ + Φ + clock skew before suspecting i
- In stable periods, Δ and Φ always hold, hence no false suspicions
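
A minimal sketch of such a timeout-based detector. The class and method names are hypothetical, the numeric bounds are made up for illustration, and the sketch assumes a message is expected from every process right after the previous one was heard (it ignores any heartbeat period). Time is passed in explicitly so the example is deterministic.

```python
class TimeoutFailureDetector:
    """Suspect a process when nothing was heard from it for `timeout` time units."""

    def __init__(self, processes, timeout):
        self.timeout = timeout
        # Treat time 0 as the last time we heard from everyone.
        self.last_heard = {p: 0.0 for p in processes}

    def on_message(self, sender, now):
        """Record that a message from `sender` arrived at local time `now`."""
        self.last_heard[sender] = now

    def suspects(self, now):
        """Return the set of processes suspected at local time `now`."""
        return {p for p, heard in self.last_heard.items()
                if now - heard > self.timeout}


# Hypothetical bounds: message delay, processing time, clock skew.
delta, phi, skew = 10.0, 2.0, 1.0
fd = TimeoutFailureDetector(processes=[1, 2, 3], timeout=delta + phi + skew)
fd.on_message(1, now=5.0)
fd.on_message(2, now=5.0)
print(fd.suspects(now=15.0))  # {3}: silent for 15 units, more than the timeout of 13
```

A suspicion is revoked as soon as a later message arrives, which is what allows the detector to be wrong before GST and accurate afterwards.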

The Resulting Failure Detector Is ◊P - Eventually Perfect
- Strong Completeness: From some point on, every faulty process is suspected by every correct process
- Eventual Strong Accuracy: From some point on, no correct process is suspected
- Is it implementable in asynchronous systems?

Build a Fair Run Without Failures Such That There Is No Time After Which q Does Not Suspect p
- Run 0: at t0, q does not suspect p
- Run 0': p crashes at t0; at t1, q suspects p
- Run 1: p’s messages are delayed from t0; at t1, q suspects p; at t2, q does not suspect p
- Are we done? Now, run 1 is fair
- Run 1': at t1, q suspects p; p crashes at t2; at t3, q suspects p
- Run 2: at t1, q suspects p; p’s messages are delayed from t2; at t3, q suspects p
- Continue by induction to build an infinite fair run in which p is correct, yet suspected by q at t1, t3, t5, …

Unreliable Failure Detectors
- Failure detector’s output can be wrong (even arbitrary) for an unbounded (finite) prefix of a run
- Captures eventual synchrony
- An algorithm that tolerates unbounded periods of asynchrony is called indulgent [Guerraoui 98]

Observations on Indulgent Consensus Algorithms
- Every indulgent consensus algorithm also solves uniform consensus [Guerraoui 98]
- It is impossible to solve t-resilient indulgent consensus when t ≥ n/2 [Chandra, Toueg 96; Guerraoui 98]
  - Henceforward, assume t < n/2

Weaker Failure Detector: ◊S - Eventually Strong
- Strong Completeness
- Eventual Weak Accuracy: There exists some correct process that is not suspected by any correct process from some point on
  - Processes do not know who this process is
[Figure: four processes, Joanne, Josh, Joe and John, with speech bubbles “I suspect Josh and Joe”, “I suspect Joe and John”, “I suspect Joe” and “I suspect Josh”; nobody suspects Joanne.]
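
The accuracy condition can be illustrated on a single snapshot of suspicion lists. This is only a sketch: the function name is hypothetical, and the assignment of speech bubbles to speakers below is an assumption, since the transcript does not say who says what. Eventual Weak Accuracy requires that, from some point on, the result is non-empty and keeps containing the same process.

```python
def unsuspected_correct(correct, suspicions):
    """Return the correct processes that no correct process currently suspects."""
    suspected = set()
    for p in correct:
        suspected |= suspicions.get(p, set())
    return set(correct) - suspected


# Assumed mapping of the four speech bubbles to speakers (not given in the source).
suspicions = {
    "Joanne": {"Josh", "Joe"},
    "Josh": {"Joe", "John"},
    "Joe": {"Josh"},
    "John": {"Joe"},
}
print(unsuspected_correct(list(suspicions), suspicions))  # {'Joanne'}
```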

Some Notes on ◊S
- ◊P is a subset of ◊S
  - Every failure detector of class ◊P is also of class ◊S
- ◊S is strictly weaker than ◊P
  - Homework question
- ◊S is equivalent to the weakest failure detector for consensus

Model
- n processes 1,…,n
- t < n/2 of them can crash
- Reliable links between correct processes
- Asynchronous with ◊S

◊S-based Consensus: MR Algorithm [Mostefaoui, Raynal 99]
- Asynchronous rounds:
  - Each process locally progresses through rounds r = 1, 2, 3, …
  - Different processes can progress at different times
- Rotating coordinator
  - Process ((r-1) mod n) + 1 is the coordinator of round r
- Each round consists of two phases

MR Algorithm
    val ← input; est ← ⊥
    ||
    for r = 1, 2, … do
        coord ← ((r-1) mod n) + 1
        Phase 1:
            if I am coord, then send (r, val) to all
            wait for ((r, v) from coord OR suspect coord (by ◊S))
            if receive v from coord then est ← v else est ← ⊥
        Phase 2:
            send (r, est) to all
            wait for (r, e) from n-t processes
            if any non-⊥ value e received then val ← e
            if all received e’s have same non-⊥ value v then
                send (“decide”, v) to all
                return(v)
    ||
    Upon receive (“decide”, v): forward to all; return(v)
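
To get a feel for the round structure, here is a lock-step simulation sketch of the algorithm above. It is not the algorithm as it would run in an asynchronous system; it rests on simplifying assumptions that are not in the lecture: no process crashes, all processes move through rounds together, the failure detector is a caller-supplied function `suspects(p, r)`, and in phase 2 each process hears from itself and then from the lowest-numbered other processes. All names are illustrative.

```python
BOTTOM = None  # stands for the "no value" symbol ⊥


def mr_consensus(inputs, t, suspects, max_rounds=10):
    """Simulate the rounds in lock-step with no crashes.

    inputs:   dict mapping process id (1..n) to its initial value
    suspects: function (process, round) -> True if the process suspects the
              round's coordinator before the coordinator's message arrives
    Returns (decided value, round of first decision), or None if no decision
    is reached within max_rounds.
    """
    n = len(inputs)
    assert t < n / 2, "the algorithm assumes a correct majority"
    val = dict(inputs)
    for r in range(1, max_rounds + 1):
        coord = ((r - 1) % n) + 1
        # Phase 1: adopt the coordinator's value unless the coordinator is suspected.
        est = {p: BOTTOM if suspects(p, r) else val[coord] for p in val}
        # Phase 2: each process collects n - t estimates (assumed sender order).
        decisions = set()
        for p in sorted(val):
            senders = ([p] + [q for q in sorted(val) if q != p])[: n - t]
            received = [est[q] for q in senders]
            non_bottom = [e for e in received if e is not BOTTOM]
            if non_bottom:
                val[p] = non_bottom[0]
                if len(non_bottom) == len(received):
                    decisions.add(val[p])
        if decisions:
            # Every non-⊥ estimate of a round is the coordinator's value,
            # so all decisions of the round are identical.
            (v,) = decisions
            return v, r
    return None


inputs = {1: "a", 2: "b", 3: "c"}
print(mr_consensus(inputs, 1, lambda p, r: False))   # ('a', 1)
print(mr_consensus(inputs, 1, lambda p, r: r == 1))  # ('b', 2)
```

In the first call nobody is suspected, so everyone decides the round-1 coordinator's value. In the second call every process suspects the round-1 coordinator, so round 1 ends without a decision and the round-2 coordinator's value is decided instead.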

Failure-Free Suspicion-Free Run
[Figure: processes 1, 2, …, n over the two phases of round 1. Process 1 sends (1, v1); all have est = v1; all decide v1 and send (decide, v1).]

Coordinator is Suspected
[Figure: processes 1, 2, …, n. Process 1 sends (1, v1), which is delayed; all have est = ⊥ and send (1, ⊥); no decision. The figure also shows (2, v2), marked delayed.]

One Suspicion is Enough for FLP
[Figure: processes 1, 2, …, n. Process 1 sends (1, v1); one delivery is delayed, so a process has est = ⊥ and sends (1, ⊥); no decision. The figure also shows (2, v2).]

Phase 1 Rationale
- Ensure that for every p_i: est_i ∈ {val_coord, ⊥}
  - Do all processes have the same est?
- Progress
  - Why does the 1st phase terminate?

Phase 2 Rationale
- Ensure Agreement
  - If process p_i decides v during round r, and process p_j progresses to round r+1, then p_j does so with est_j = v.
- Progress
  - Why does the 2nd phase terminate?

Phase 2 Rationale (Cont’d)
- The 2nd phase ends upon receiving (r, est) from a majority of processes (n-t is a majority)
- Why is the majority important?
  - Every two majority sets intersect
  - If one process gets n-t messages with v, then every other correct process gets at least one message with v
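
The intersection claim can be checked by brute force for small n. The sketch below enumerates all sets of n-t processes; the function name is illustrative.

```python
from itertools import combinations


def quorums_intersect(n: int, t: int) -> bool:
    """Check that every two sets of n - t processes share at least one member."""
    quorums = [set(q) for q in combinations(range(1, n + 1), n - t)]
    return all(a & b for a in quorums for b in quorums)


print(quorums_intersect(5, 2))  # True: t < n/2, quorums have 3 of 5 processes
print(quorums_intersect(4, 2))  # False: {1, 2} and {3, 4} are disjoint
```

The second call shows why t < n/2 is needed: with t = n/2, two disjoint sets of n-t processes exist, and they could receive different information without ever overlapping.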

Possible Scenarios in Phase 2
- p_i gets only v
  - p_i decides
  - All other processes get v at least once
- p_i gets only ⊥
  - All other processes get ⊥ at least once
  - Nobody decides
- p_i gets both v and ⊥
  - Some other process might decide v
  - p_i sets est_i to v
- Can p_i get two different values v and v’?
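
The three scenarios map directly onto the phase-2 rules of the algorithm. A small sketch, with a hypothetical function name and `None` standing in for ⊥; it assumes, as phase 1 is meant to guarantee, that all non-⊥ estimates of a round carry the same value.

```python
BOTTOM = None  # stands for ⊥


def phase2_outcome(received, val):
    """Apply the phase-2 rules to the n - t estimates a process received.

    Returns (new_val, decision); decision is None when the process does not decide.
    """
    non_bottom = [e for e in received if e is not BOTTOM]
    if not non_bottom:                      # only ⊥ received: keep val, no decision
        return val, None
    v = non_bottom[0]                       # assumed equal to every other non-⊥ estimate
    if len(non_bottom) == len(received):    # only v received: adopt and decide v
        return v, v
    return v, None                          # both v and ⊥: adopt v, no decision


print(phase2_outcome(["x", "x", "x"], "mine"))        # ('x', 'x')
print(phase2_outcome([BOTTOM, BOTTOM], "mine"))       # ('mine', None)
print(phase2_outcome(["x", BOTTOM, "x"], "mine"))     # ('x', None)
```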

Validity Proof
- For every i, val_i and est_i always store the initial value of some process or ⊥
- By induction on the length of the execution
  - Initially, for every process i, val_i stores i’s initial value, est_i is ⊥
  - Subsequently, they can only change to store a val_j or est_j value sent by some process j

Lemmas
- Lemma 1: If in some round r, two messages (r,v) and (r,v’) are sent such that v ≠ ⊥ and v’ ≠ ⊥, then v=v’.
- Lemma 2: If in some round r, n-t processes send (r,v), then for every round r’>r, if a message (r’,v’) with v’ ≠ ⊥ is sent, then v=v’.
  - Hint: n-t > n/2.

Agreement Proof
- Assume by contradiction that two different decisions, v ≠ v’, are made.
- Let r (r’) be the first round in which some process i (i’) decides v (v’) when it receives n-t (r,v) ((r’,v’)) messages.
- By Lemma 1, r ≠ r’, and by Lemma 2, neither r > r’ nor r’ > r. A contradiction.

Termination Proof Steps
- Progress: until some process decides, no process is ever “stuck” in a round forever
- First decision: some correct process eventually decides
- Subsequent decisions: if some correct process decides, then all correct processes eventually decide

What Do We Need “Decide” For?
    val ← input; est ← ⊥
    ||
    for r = 0, 1, 2, … do
        coord ← (r mod n) + 1
        if I am coord, then send (r, val) to all
        wait for ((r, val) from coord OR suspect coord (by ◊S))
        if receive val from coord then est ← val else est ← ⊥
        send (r, est) to all
        wait for (r, est) from n-t processes
        if any non-⊥ est received then val ← est
        if all ests have same non-⊥ value v then
            send (“decide”, v) to all
            return(v)
    od

Why Send “Decide”?
[Figure: processes 1, 2, …, n. Process 1 sends (1, v1); a process whose messages are delayed suspects 1, has est = ⊥ and sends (1, ⊥), so it makes no decision on its own; it learns the outcome from the Decide message.]

Disseminating the Decision
- OK, so we need the 1st “decide”. Why forward to all?
- Hint: reliable broadcast

Why Forward “Decide”?
[Figure: n=4, t=1, processes 1 to 4. Process 1 sends (1, v1); one process suspects 1, has est = ⊥ and sends (1, ⊥), so it makes no decision; a decide message is sent but some deliveries fail (marked X); in round 2, (2, v1) is sent and the remaining processes are stuck because they cannot collect n-t messages.]

How Long Does It Take?
- The algorithm can take unbounded time
  - What if no failures occur?
- Is this inevitable?
- Can we say more than “decision is reached eventually”?

Performance Metric
- Number of communication steps in well-behaved runs
- Well-behaved:
  - No failures
  - Stable (synchronous) from the beginning
  - No false suspicions
- Motivation: common case

The Algorithm’s Running Time in Well-Behaved Runs
- In round 1, the coordinator is correct and not suspected by any process
- All processes decide at the end of phase two of round 1
  - Decision in two communication steps
  - Halting (stopping) takes three steps
  - Same as in the synchronous model for Uniform Consensus

Alternative Weak Failure Detector
- Ω - Leader
  - Outputs one trusted process
  - From some point, all correct processes trust the same correct process
- Can easily implement ◊S
- Is the weakest for consensus [Chandra, Hadzilacos, Toueg 96]

A Natural Ω Implementation
- Use ◊P implementation
- Output lowest id non-suspected process
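
A sketch of this construction, assuming the ◊P module exposes its current suspicion set. The function name and the `None` result for the case where every process is suspected are illustrative choices, not part of the lecture.

```python
def omega_leader(processes, suspected):
    """Return the lowest-id process that is not currently suspected.

    processes: iterable of process ids
    suspected: set of ids currently output by the ◊P detector
    Returns None if every process is suspected.
    """
    trusted = [p for p in sorted(processes) if p not in suspected]
    return trusted[0] if trusted else None


print(omega_leader([1, 2, 3, 4], {1, 3}))  # 2
```

Once ◊P stops making mistakes, all correct processes hold the same suspicion set (exactly the crashed processes), so they all output the same lowest-id correct process.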
