Published by Matthew Rodgers. Modified over 9 years ago.
1
Fault tolerance and related issues in distributed computing
Shmuel Zaks, zaks@cs.technion.ac.il
GSSI - Feb 2016
2
Haifa
3
CS, Technion
4
Part 0: Distributed computing - an overview: basic notions; seminar focus: from lower bounds, via impossibility, to fault tolerance and self-stabilization.
Part 1: Lower bounds
Part 2: Computing in spite of faults - impossibility of consensus
Part 3: Detecting faults - the snapshot algorithm
Part 4: Self-stabilization - self-recovery from faults
5
Part 0: An overview
Part 1: Lower bounds
Part 2: Computing in spite of faults
Part 3: Detecting faults
Part 4: Self-stabilization
6
A. The model
Communication network: processors, communication, problem
7
Anonymous
8
Unique identities (figure: nodes labeled 12, a, e, 6, c)
9
Communication: message passing; communication lines, channels; topology (figure: nodes a, b, c, d, e)
10
(Message passing) Directed, undirected (figure: nodes a, b, c, d, e)
11
(Message passing) Message delivery mechanism:
FIFO
reliable, no faults
finite, arbitrary delay
queues of messages
12
(Message passing) Distributed algorithm, protocol:
send a message
receive a message
do local computation
Execution
13
Shared memory (figure: nodes a, b, c, d, e and registers R1, R2, R3, R4, R5)
14
(Shared memory) Read/write (figure: A and B, with values 7, 2, 31, 25, 9, 88, 40, 9)
15
Synchronization: synchronous, asynchronous (figure: nodes a, b, c, d, e)
16
(Synchronization) Asynchronous model: a message sent by i at time t reaches j at time t + ???, that is, after an unknown delay (figure: clock, network)
17
(Synchronization) Synchronous model: a message sent by i at time t reaches j at time t + d (figure: clock, network)
18
(Synchronization)
Asynchronous model - many executions
Synchronous model - unique execution, rounds
19
(Synchronization) Asynchronous / synchronous; shared memory / message passing
20
Asynchronous model: for correctness, for upper bound analysis
Synchronous model: for lower bound analysis
21
Topology: ring (figure: nodes a, b, c, d, e)
22
(Topology) Clique (figure: nodes a, b, c, d, e)
23
(Topology) General (figure: nodes labeled 7, 2, 31, 25, 9, 88, 40, 12, 4)
24
(Topology) Why simple networks?
They make many design issues easier to understand.
In an existing general network, assume that a virtual simple network (e.g. a ring) is implemented on top of it.
25
Complexity measures
Synchronous system: time
Asynchronous system: communication
Communication: messages, bits
Time: synchronous time, longest chain, bounded delay
26
Parallel vs. Distributed computing
Parallel computing: given a problem … (e.g. sorting)
Distributed computing: given a network … (e.g. broadcast)
27
(Parallel vs. Distributed computing)
Complexity: parallel computing: time vs. number of processors; distributed computing: number of messages
Goals: parallel computing: efficiency; distributed computing: correctness
28
B. Problems
Problem, task: processors P1, P2, P3, each with an input and an output
Leader election (figure: inputs 3, 7, 5; outputs yes / no)
Consensus (figure: values 1, 0, 0 and 1, 1, 1)
29
Issues:
design and analysis of algorithms
impossibility, lower bounds
fault tolerance
30
Problems:
broadcast
snapshot
consensus
shortest path, maximal flow
leader election, breaking symmetry, maximum finding, spanning tree, center
termination
deadlock
31
Example: broadcast (figure: nodes a, b, c, d, e, f)
32
Broadcast: BFS (breadth-first search) (figure: nodes a, b, c, d, e, f)
33
Broadcast: DFS (depth-first search) (figure: nodes a, b, c, d, e, f)
34
Message complexity: each edge carries exactly one message in each direction, so the message complexity is 2|E|.
35
Time complexity
synchronous time: 2|E|
longest chain: 2|E|
bounded delay: 2|E|
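The 2|E| counts can be illustrated with a small simulation. The sketch below assumes the counts refer to a sequential, token-based depth-first traversal, in which a single token moves over each edge once in each direction; the slides do not spell the algorithm out, so the function name `dfs_broadcast` and the example graph are hypothetical.

```python
def dfs_broadcast(adj, root):
    """Simulate a sequential token-based DFS traversal (an assumed variant).

    adj maps each node to the list of its neighbours (undirected, connected).
    Returns the list of directed edges (sender, receiver) in sending order.
    """
    visited = {root}
    sent = []          # every message, in order
    sent_set = set()   # same messages, for fast lookup

    def send(u, v):
        sent.append((u, v))
        sent_set.add((u, v))

    def visit(v, parent):
        for w in adj[v]:
            if w == parent or (v, w) in sent_set:
                continue              # parent edge is used on return; others already used
            send(v, w)                # pass the token forward
            if w in visited:
                send(w, v)            # w already has the message: token bounces back
            else:
                visited.add(w)
                visit(w, v)
                send(w, v)            # subtree done: token returns to v

    visit(root, None)
    return sent


# Hypothetical 6-node graph with |E| = 8 (not the graph drawn on the slides).
EXAMPLE_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b", "e", "f"],
    "e": ["c", "d", "f"],
    "f": ["d", "e"],
}

if __name__ == "__main__":
    messages = dfs_broadcast(EXAMPLE_GRAPH, "a")
    print(len(messages))  # 2|E| = 16 for this graph
```

Because only one token is ever in transit, the messages form a single chain, which is why the time measures coincide with the message count in this sketch.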
36
pi (propagation of information), shout-echo (figure: nodes a, b, c, d, e, f)
37
Algorithm pi (propagation of information)
Initiator:
  send m to each neighbour
  stop
Every other processor:
  if receive m along edge e:
    send m on all edges except e
    stop
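To make the pi rules concrete, here is a minimal simulation sketch. It assumes an undirected connected graph given as an adjacency dictionary, models asynchrony by delivering a randomly chosen in-flight message at each step, and discards messages that arrive at a processor that has already stopped. The names `run_pi` and `EXAMPLE_GRAPH` are hypothetical and not taken from the slides.

```python
import random


def run_pi(adj, initiator, seed=0):
    """Simulate algorithm pi under a random asynchronous schedule.

    Returns (messages_sent, parent), where parent[v] is the neighbour
    from which v accepted m (None for the initiator).
    """
    rng = random.Random(seed)
    parent = {initiator: None}
    # Initiator: send m to each neighbour, then stop.
    in_flight = [(initiator, w) for w in adj[initiator]]
    messages = len(in_flight)

    while in_flight:
        # Arbitrary finite delay: any pending message may arrive next.
        sender, receiver = in_flight.pop(rng.randrange(len(in_flight)))
        if receiver in parent:
            continue  # receiver has already stopped; the message is ignored
        parent[receiver] = sender
        # Send m on all edges except the one it arrived on, then stop.
        for w in adj[receiver]:
            if w != sender:
                in_flight.append((receiver, w))
                messages += 1
    return messages, parent


# Hypothetical 6-node graph with |E| = 8 (not the graph drawn on the slides).
EXAMPLE_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b", "e", "f"],
    "e": ["c", "d", "f"],
    "f": ["d", "e"],
}

if __name__ == "__main__":
    count, tree = run_pi(EXAMPLE_GRAPH, "a")
    print(count)  # 2|E| - |V| + 1 = 11 for this graph
    print(tree)   # edges (v, parent[v]) depend on the schedule
```

Changing `seed` changes the delivery order and usually the resulting tree, while the message count stays the same.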
38
pi
Theorem: The following holds for every execution of the pi algorithm:
A processor receives the message m at most once.
The execution terminates.
Each processor receives the message m.
The edges on which processors receive m form a spanning tree.
The message complexity is 2|E|-|V|+1.
The time complexity …
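The message count in the theorem follows from adding up what each processor sends. Writing s for the initiator and deg(v) for the number of edges at v (both symbols are introduced here for illustration), the initiator sends deg(s) messages and every other processor sends deg(v) - 1, assuming the network is connected so that every processor is reached:

```latex
\deg(s) + \sum_{v \neq s} \bigl(\deg(v) - 1\bigr)
  = \sum_{v \in V} \deg(v) - \bigl(|V| - 1\bigr)
  = 2|E| - |V| + 1
```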
39
pif (propagation of information with feedback), shout-echo (figure: nodes a, b, c, d, e, f)
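The slide names pif without listing its steps. The sketch below follows the commonly taught echo scheme, which may differ in detail from the version presented in the seminar: a processor forwards m as in pi, and once it has heard from every neighbour it reports back to the neighbour it first heard from, so the initiator learns when the broadcast is complete. The names `run_pif` and `EXAMPLE_GRAPH` are hypothetical.

```python
import random


def run_pif(adj, initiator, seed=0):
    """Simulate an echo-style pif (an assumed variant) on a connected graph.

    Returns (messages_sent, initiator_done).
    """
    rng = random.Random(seed)
    parent = {initiator: None}
    received = {v: 0 for v in adj}       # messages heard so far, per processor
    in_flight = [(initiator, w) for w in adj[initiator]]
    messages = len(in_flight)
    initiator_done = False

    while in_flight:
        sender, receiver = in_flight.pop(rng.randrange(len(in_flight)))
        received[receiver] += 1
        if receiver not in parent:
            # First time m arrives: remember the parent and shout onwards.
            parent[receiver] = sender
            for w in adj[receiver]:
                if w != sender:
                    in_flight.append((receiver, w))
                    messages += 1
        if received[receiver] == len(adj[receiver]):
            # Heard from every neighbour: echo to the parent, or finish.
            if receiver == initiator:
                initiator_done = True
            else:
                in_flight.append((receiver, parent[receiver]))
                messages += 1
    return messages, initiator_done


# Hypothetical 6-node graph with |E| = 8 (not the graph drawn on the slides).
EXAMPLE_GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b", "e", "f"],
    "e": ["c", "d", "f"],
    "f": ["d", "e"],
}

if __name__ == "__main__":
    print(run_pif(EXAMPLE_GRAPH, "a"))  # (16, True): 2|E| messages in this variant
```

In this variant every processor sends exactly one message on each of its edges, which gives the 2|E| total.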
40
C. In this seminar
Distributed algorithms
“Positive” results: design, analysis, upper bounds
“Negative” results: lower bounds, impossibility
41
Part 1: Lower bounds
Leader election (figure: processors P1, P2, P3; inputs 3, 7, 5; outputs yes / no)
42
Leader election: message passing, asynchronous (figure: nodes labeled 9, 4, 5, 8, 6, marked ?, x, x, x, x)
43
We’ll see: a lower bound of Ω(n log n) messages (figure: nodes labeled 9, 4, 5, 8, 6)
44
Lower bound and fault tolerance
Usually all processors need to compute some function.
Lower bound of Ω(|E|) (figure: nodes a, b, c, d, e, f, g)
45
Part 2: Computing in spite of faults
Problem, task: consensus (figure: processors P1, P2, P3; input and output values 1, 0, 0 and 1, 1, 1)
46
Consensus: message passing, asynchronous (figure: values 0, 1, 1, 0, 1)
We’ll see: the impossibility of reaching consensus.
47
Part 3: Detecting faults
Snapshot
48
We’ll see: the snapshot algorithm.
49
Part 4: Self-stabilization
Example: clock synchronization (figure: clock values 6, 6, 6, 6, 6 and 7, 7, 7, 7, 7)
50
(figure: clock values 6, 7, 6, 4, 6)
51
Let’s try … (figure: clock values 4, 6, 6, 6, 7)
52
But … (figure: clock values 4, 6, 6, 6, 7)
53
We’ll see: self-stabilizing algorithms, proofs and performance analysis (figure: clock values 6, 7, 6, 4, 6)