Published by Christian Waters. Modified over 9 years ago.
1
Relative Power of Models in Distributed Computing. Petr Kuznetsov, TU Berlin/DT-Labs
2
What makes distributed computing special?
Failures and lack of synchrony: computing units are unreliable and unsynchronized (otherwise ≅ centralized system).
[Image: Blind Men and the Elephant]
3
What makes distributed systems hard?
Multitude of abstractions and models. Lack of categorization, complexity classes. Unnatural?
[Image: Blind Men and the Elephant]
4
Distributed modeling jumble: CAS, LL/SC? RW shared memory? Message passing? Snapshot memory? Sub-consensus objects? Clouds, data centers…? t-resilience?
5
Today
t-resilience ≅ wait-freedom [BG simulation]: a (t+1)-process wait-free system is equivalent to an n-process t-resilient system (t<n); a colorless task T is solvable t-resiliently iff T is solvable wait-free by t+1 processes.
Non-uniform fault models ≅ wait-freedom: set consensus power of an adversary; colorless tasks; colored conjectures.
6
Model
n processes p_1,…,p_n. Read-write shared memory. Atomic snapshots [AADGMS90]. Crash failures.
[Figure: example run of p_1, p_2, p_3 with writes W(R_1,1), W(R_2,1), W(R_3,1) and Snapshot(R) operations returning (1,0,0), (1,1,1), (1,0,1)]
7
Distributed tasks: functions in distributed computing
A task (I,O,Δ): I is the set of input vectors, O is the set of output vectors, and the task specification is Δ: I → 2^O.
8
k-set consensus
Processes start with inputs in V (|V|>k).
Safety: the set of outputs is a subset of the inputs, of size at most k.
Liveness: every correct process eventually outputs (wait-freedom).
k=1: consensus.
Wait-free (k+1)-process k-set consensus is impossible.
Colorless: a process is free to adopt inputs or outputs.
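The safety condition can be checked mechanically on a finished run. The sketch below is only an illustration: the function name and the list-based encoding of inputs and outputs are assumptions, not part of the slides.

```python
def satisfies_k_set_consensus(inputs, outputs, k):
    """Check the safety property of k-set consensus for one finished run.

    `inputs` and `outputs` are lists of values (hypothetical encoding);
    liveness cannot be checked on a finite trace and is not covered here.
    """
    decided = set(outputs)
    # Every output must be some process's input, and there may be
    # at most k distinct outputs.
    return decided <= set(inputs) and len(decided) <= k
```

With k=1 the same check expresses the safety of consensus.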
9
The wait-free model: 2 processes
[Figure: runs of P and Q, labeled: P reads before Q writes; P reads after Q writes; Q reads after P writes; Q reads before P writes]
Full-information protocol:
    while not done:
        write(view)
        view := snapshot(memory)
Wait-free consensus is impossible!
10
The wait-free model: 3 processes [Figure: processes P, Q, R]
11
The wait-free model: 3 processes [Figure: processes P, Q, R]
12
The wait-free model: 3 processes [Figure: processes P, Q, R]
Sperner’s Lemma: wait-free (k+1)-process k-set consensus is impossible [BG93,SZ93,HS93]
13
What about k-resilience?
Assume that only 1 out of a million processes may fail. Can we solve consensus?
Can we solve k-set consensus in a k-resilient system for some n>k?
No! (Otherwise we could do it wait-free.)
14
Part I: BG agreement
15
BG agreement [Borowsky, Gafni, 1993]
Safety, as in consensus: every output is an input; no two outputs are different.
Liveness: every correct process outputs, if no participating process fails.
16
BG-agreement: protocol
Code for p_i:
    write(A_i, input_i)
    S := snapshot(A)
    write(B_i, S)
    wait until for all p_j in S, B_j ≠ ⊥
    decide on the smallest input in the smallest B_j
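The protocol can be exercised with a small step-by-step simulation. This is a sketch under stated assumptions: the function name, the explicit `schedule` argument (a list of process indices, one shared-memory step each), and the use of `None` for an unwritten register are all illustrative choices, and the snapshot is modeled as one atomic step.

```python
def bg_agreement(inputs, schedule):
    """Simulate BG agreement; return {process index: decided value}."""
    n = len(inputs)
    A = [None] * n        # A[i]: input written by p_i (None = unwritten)
    B = [None] * n        # B[i]: snapshot written by p_i (None = unwritten)
    snap = [None] * n     # local snapshot S of each process
    pc = [0] * n          # program counter of each process
    decided = {}
    for i in schedule:
        if i in decided:
            continue
        if pc[i] == 0:    # write(A_i, input_i)
            A[i] = inputs[i]
            pc[i] = 1
        elif pc[i] == 1:  # S := snapshot(A)
            snap[i] = {j: A[j] for j in range(n) if A[j] is not None}
            pc[i] = 2
        elif pc[i] == 2:  # write(B_i, S)
            B[i] = snap[i]
            pc[i] = 3
        elif all(B[j] is not None for j in snap[i]):
            # Snapshots are ordered by containment, so the smallest
            # one is the one with the fewest entries.
            smallest = min((B[j] for j in snap[i]), key=len)
            decided[i] = min(smallest.values())
        # Otherwise the process keeps waiting; it retries when scheduled again.
    return decided
```

For example, `bg_agreement([5, 3, 7], [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])` lets p_0 run alone first, so every process decides 5. With the schedule `[0, 0, 1, 1, 1, 1, 1]` on inputs `[5, 3]`, p_0 stops between its two writes and p_1 never decides, which is the blocking case.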
17
BG agreement: correctness
Liveness: suppose each participant takes 3 steps; then every wait terminates. (If a participant “dies” between its two writes, the agreement blocks.)
Safety: consider p_t that wrote the smallest snapshot S to B_t. For all B_j ≠ ⊥, p_t is in B_j, so every p_i waits until p_t writes, and every p_i decides on the smallest input in S.
18
BG simulation
k+1 simulators q_1,…,q_{k+1}; n simulated processes p_1,…,p_n (n>k).
[Figure: simulators q_1,…,q_{k+1} and simulated processes p_1,…,p_n]
19
Simulation
Every simulator q_i takes a snapshot to get the “most recent” view of p_j.
Run 3 steps of BG-agreement to agree on the view of p_j.
If the view is decided, register the view.
If not, proceed to the next process in round-robin.
Safe: the simulated views.
Live? No! What if a BG-agreement blocks?
20
Simulation order
Run the BG agreements in round-robin.
When done with the “write-phase” of a BG-agreement for a step of p_j, proceed to p_{j+1 mod n}.
[Figure: simulators q_1, q_2, q_3 and simulated processes p_1, p_2, p_3, p_4]
21
Progress
A faulty simulator may block at most one process.
At most k simulators can fail.
At most k simulated processes fail: a k-resilient run!
22
Application: colorless tasks
Suppose T=(I,O,Δ) is solvable k-resiliently, and let A be the corresponding (full-information) protocol.
Each q_i starts with an input in V_I.
The first view of each p_j is its input (each q_i proposes its own input).
In the resulting k-resilient run of A, some p_j outputs.
q_i adopts the first output value it sees.
k-resilient k-set consensus is impossible!
23
Works both ways
n simulators p_1,…,p_n can k-resiliently simulate a wait-free run on q_1,…,q_{k+1}.
What matters is the number of failures, not the number of simulators.
All k-resilient systems are equivalent! (with respect to colorless tasks)
24
Part II: On Non-Uniform Fault Models
25
Uniform fault models
Process failures are IID: P(p_i fails) = ε
P(no faults) = (1-ε)^n
P(t or fewer faults) = I_{1-ε}(n-t, t+1)
t-resilience: at most t faults
wait-freedom: at most n-1 faults
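The probability of at most t faults is the binomial CDF, which equals the regularized incomplete beta function I_{1-ε}(n-t, t+1). A minimal sketch that evaluates it directly as a binomial sum (the function name is an illustrative choice):

```python
from math import comb

def prob_at_most_t_faults(n, t, eps):
    """P(at most t of n processes fail), each failing independently with probability eps."""
    return sum(comb(n, i) * eps**i * (1 - eps)**(n - i) for i in range(t + 1))
```

Setting t=0 gives P(no faults) = (1-ε)^n, and t=n gives 1.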
26
But…
Processes may fail in a correlated way, or in a non-identical way.
27
Non-identical faults
Processes p, q, r: p and r fail independently; q is unlikely to fail.
Possible runs: pqr (no faults), pq (r fails), qr (p fails), q (p and r fail).
[Figure: processes p, q, r]
28
Correlated faults
Processes p, q, r: p and q share unreliable hardware; q and r share unreliable software.
It is unlikely that both hardware and software fail.
Possible runs: pqr (no faults), p (software fault), r (hardware fault).
[Figure: processes p, q, r]
29
Generic adversaries [Delporte et al., 2009]
A: a set of process subsets.
The model = all runs whose set of correct processes is in A.
Example: A = {p, qr, rs}
[Figure: processes p, q, r, s with the sets p, qr, rs marked]
30
Hitting sets
A = {p, qr, rs}
A can solve 2-set consensus. [Figure: a hitting set of A among processes p, q, r, s]
Can it solve consensus? Yes: for all S in A, h(A_S)=1 (h denotes the size of a smallest hitting set).
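For small adversaries, the hitting set size can be found by brute force. The sketch below is illustrative only: the function names are assumptions, and each set in A is encoded as a string of one-letter process names.

```python
from itertools import combinations

def hitting_set_size(adversary):
    """Size of a smallest set of processes that intersects every set in the adversary."""
    sets = [frozenset(s) for s in adversary]
    if not sets:
        return 0
    universe = sorted(set().union(*sets))
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            if all(s & set(candidate) for s in sets):
                return size

def restrict(adversary, S):
    """A_S: the sets of the adversary that are subsets of S."""
    return [T for T in adversary if set(T) <= set(S)]
```

For A = {p, qr, rs}, `hitting_set_size(["p", "qr", "rs"])` is 2 (for example {p, r}), while every restriction A_S has hitting set size 1.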
31
Commit-adopt [Gafni, 1998]
Liveness: every correct process returns.
Safety:
Return (adopt,v) or (commit,v), where v was proposed.
If only one value is proposed, only (commit,*) is returned.
If one process returns (commit,v), then only (*,v) is returned.
32
Commit-adopt: wait-free protocol
Code for p_i:
    write(A_i, input_i)
    S := collect(A)
    if |vals(S)|=1 then write(B_i, input_i)
    else write(B_i, fail)
    S := collect(B)
    if there are no fails in S then return (commit, input_i)
    if for some j, input_j is in S then return (adopt, input_j)
    return (adopt, input_i)
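A step-by-step simulation makes the commit and adopt cases concrete. This is a sketch under assumptions: the function name, the `schedule` argument (one step per listed process index), `None` for an unwritten register, and modeling each collect as a single atomic step (a simplification; a real collect reads registers one at a time) are all illustrative. When every collected value is a fail, the process keeps its own input and returns adopt.

```python
FAIL = object()  # marker written to B when more than one value was seen

def commit_adopt(inputs, schedule):
    """Simulate commit-adopt; return {process index: ("commit" | "adopt", value)}."""
    n = len(inputs)
    A = [None] * n
    B = [None] * n
    seen = [None] * n
    pc = [0] * n
    result = {}
    for i in schedule:
        if pc[i] == 0:    # write(A_i, input_i)
            A[i] = inputs[i]
        elif pc[i] == 1:  # S := collect(A)
            seen[i] = {v for v in A if v is not None}
        elif pc[i] == 2:  # write own input if it is the only value seen, else fail
            B[i] = inputs[i] if len(seen[i]) == 1 else FAIL
        elif pc[i] == 3:  # S := collect(B), then return
            S = [v for v in B if v is not None]
            values = [v for v in S if v is not FAIL]
            if len(values) == len(S):
                result[i] = ("commit", inputs[i])
            elif values:
                result[i] = ("adopt", values[0])
            else:
                result[i] = ("adopt", inputs[i])
        pc[i] += 1        # steps scheduled after returning are ignored
    return result
```

If p_0 with input "x" runs alone before p_1 with input "y" starts, p_0 commits "x" and p_1 adopts "x".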
33
Commit-adopt: correctness
Liveness: immediate.
Safety:
If all proposals are the same, every process commits.
There is at most one non-fail value in B.
If a process commits, the value is seen in B by every terminated process.
34
Leader-based consensus
    v_i := input_i
    while true:
        r++
        (u, v_i) := CommitAdopt_r(v_i)
        if u = commit then return v_i
        v_i := get the estimate from Leader_i
Safety is provided by Commit-Adopt.
Liveness holds if, eventually, the same correct leader is elected.
35
Electing a leader using A
Assumption: for all S in A, h(A_S)=1
    shared C[1],…,C[n]   \\ shared counters
    shared R[1],…,R[n]   \\ the most recent rounds
    while true:
        r++
        R[i] := r
        wait until for some S in A: for all j in S, R[j] ≥ r
        for all j not in S: C[j]++   \\ increment the counter
        Leader_i := argmin(C[1],…,C[n])
36
Set consensus number of A
setcon(A) = 0, if A is empty
setcon(A) = max_{S in A} min_{a in S} setcon(A_{S,a}) + 1, otherwise
A_S: all S’ in A that are subsets of S
A_{S,a}: all S’ in A_S not containing a
Example: A = {pqr, pq, pr, p, q, r} has setcon(A)=2: for S = pqr and a = p, we have A_{S,a} = {q, r} and setcon(A_{S,a}) = 1.
37
A = {pqr, pq, pr, p, q, r}
For S=pqr, a=p: A_{S,a} = {q,r} and setcon(A_{S,a}) = 1, so setcon(A) = setcon(A_{S,a})+1 = 2.
Partition: A_1 = {pqr,pq,pr,p}, A_2 = {q,r}
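The recursive definition of setcon translates directly into code. A minimal sketch, assuming each set in A is non-empty and encoded as a string of one-letter process names; the function names and the memoization are illustrative choices.

```python
from functools import lru_cache

def setcon(adversary):
    """Set consensus number of an adversary given as an iterable of process sets."""
    return _setcon(frozenset(frozenset(s) for s in adversary))

@lru_cache(maxsize=None)
def _setcon(A):
    if not A:
        return 0
    best = 0
    for S in A:
        A_S = [T for T in A if T <= S]            # sets of A inside S
        # For each a in S, A_{S,a} keeps the sets of A_S that avoid a.
        worst = min(_setcon(frozenset(T for T in A_S if a not in T)) for a in S)
        best = max(best, worst + 1)
    return best
```

On the example, `setcon(["pqr", "pq", "pr", "p", "q", "r"])` returns 2. The wait-free adversary on three processes (all non-empty subsets) gives 3, and the 1-resilient one (`["pqr", "pq", "pr", "qr"]`) gives 2.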
38
Partitioned adversary
A = A_1,…,A_k with setcon(A_i)=1.
There exists S in A such that for all a in S: A_{S,a} is in A_{i+1},…,A_k, and setcon(A_{S,a}) = k-i.
39
Characterizing setcon
setcon(A)=k if and only if A solves k-set consensus but not (k-1)-set consensus.
A with setcon(A)=k solves a colorless task T if and only if T is solvable (k-1)-resiliently.
An n-process system with A ≅ a k-process wait-free system.
40
Sufficiency
Solving k-set consensus with A, setcon(A)=k:
Split A into A_1,…,A_k, each with setcon 1.
Run k parallel leader-based consensus instances; at least one terminates.
Adopt the first returned value.
41
Necessity
Suppose A can solve (k-1)-set consensus. Then k processes solve (k-1)-set consensus wait-free, which is impossible, as follows:
“Leveled” BG-simulation, starting from level k.
At the current level L, simulate steps of S such that setcon(A_S)=L.
If blocked simulating a step of p on level L, go to simulating S’ in A_{S,p} such that setcon(A_{S’})=L-1.
If a higher level “unblocks”, return to it.
[Figure: levels k, k-1, …, 2, 1]
42
Asymmetric progress conditions [Imbs et al., DISC 2010]
Make progress if p_1 participates (wait-free for p_1), or at most one process is eventually up (obstruction-free for the rest).
A = {p_1*, p_2, …, p_n}, setcon(A)=2
A_1 = {p_1*}, A_2 = {p_2, …, p_n}
43
What if we use stronger objects?
Suppose we can use k-process consensus objects. What is the min k such that consensus is solvable?
Suppose k≤n-1. Then 2 processes solve wait-free consensus as follows:
BG-simulation (a slow simulator blocks up to n-1 simulated processes).
Start with simulating all in round-robin.
If blocked, simulate the first unblocked process.
If blocked, go back to simulating all.
Eventually, either all advance, or exactly one runs solo.
44
Summary
t-resilience ≅ wait-freedom [BG93]
Adversaries ≅ wait-freedom [GK10a]
What about complexity? Task solvability is undecidable for >2 processes [HR97,GK99]
What about colored tasks? Extended to generic tasks [GK10b]
45
It’s wait-free!
46
References
E. Borowsky and E. Gafni. Generalized FLP impossibility result for t-resilient asynchronous computations. STOC 1993.
E. Gafni and P. Kuznetsov. Turning Adversaries into Friends: Simplified, Made Constructive, and Extended. OPODIS 2010.
E. Gafni and P. Kuznetsov. Relating L-Resilience and Wait-Freedom via Hitting Sets. ICDCN 2011.
D. Imbs, M. Raynal, G. Taubenfeld. On Asymmetric Progress Conditions. DISC 2010.
47
QUESTIONS?
48
Colorless distributed tasks
V_I: set of input values. V_O: set of output values.
Val(U) denotes the set of values in vector U.
If Out is in Δ(In), Val(In) is a subset of Val(In’), and Val(Out’) is a subset of Val(Out), then Out’ is in Δ(In’).