1
Ordering and Consistent Cuts Presented by Chi H. Ho
2
Time, Clocks, and the Ordering of Events in a Distributed System Leslie Lamport
3
Introduction
2000 PODC Influential Paper Award
Outline of the paper (not in presented order):
–Partial and Total Orderings
–Logical and Physical Clocks
–Clock and Strong Clock Conditions
–Synchronizing Physical Clocks
Beyond…
4
“Happened Before”
a → b if:
–a and b are events in the same process and a comes before b, or
–a is the send event of some message and b is the receive event of the same message.
Transitive: (a → b) ∧ (b → c) ⇒ (a → c).
Concurrent: ¬(a → b) ∧ ¬(b → a).
Partial Ordering
5
Examples
q5 → p4
q2 → q3
p1 → r3
q2 // p2
q2 // p3
Partial Ordering
6
Logical Clock
Clock Condition: ∀ a, b: a → b ⇒ C(a) < C(b)
Partial Ordering Implementation
7
Logical Clock
Implementation Rules:
–IR1: Each process P_i increments C_i between any two successive events.
–IR2: If event a is the sending of a message m by process P_i, then the message contains a timestamp T_m = C_i(a). Upon receiving a message m, process P_j sets C_j greater than or equal to its present value and greater than T_m.
Partial Ordering Implementation
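Rules IR1 and IR2 can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the class name `LamportClock` and its method names are assumptions, and the receive step is treated as an event of its own, so the clock moves to max(current, T_m) + 1.

```python
class LamportClock:
    """Logical clock for one process (hypothetical helper, following IR1 and IR2)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # IR1: increment the clock between any two successive events.
        self.time += 1
        return self.time

    def send(self):
        # IR2, sender side: sending is an event; its clock value is the timestamp T_m.
        return self.tick()

    def receive(self, t_m):
        # IR2, receiver side: the new value is >= the present value and > T_m.
        self.time = max(self.time, t_m) + 1
        return self.time


p0, p1 = LamportClock(), LamportClock()
p0.tick()                  # local event on P0
t_m = p0.send()            # P0 sends a message stamped with T_m
p1_recv = p1.receive(t_m)  # P1 receives it and jumps past T_m
print(t_m, p1_recv)
```

Because the receiver always moves past T_m, the send event gets a smaller clock value than the matching receive event, which is what the Clock Condition requires for message pairs.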
8
Examples (animation built up over slides 8 to 16)
[Figure: space-time diagram of two processes, P0 and P1. Both clocks start at 0 and advance to 4; two messages carry the timestamps [3] and [2].]
Partial Ordering Implementation
17
Extended “Happened Before”
For an event a in P_i and an event b in P_j, a ⇒ b iff:
–C_i(a) < C_j(b), or
–C_i(a) = C_j(b) and P_i ≺ P_j
Total Ordering
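The total order can be sketched as a lexicographic comparison of (clock value, process) pairs. In this illustrative snippet the names `Stamp` and `precedes` are hypothetical, and the arbitrary process order ≺ is assumed to be the order of integer process ids.

```python
from typing import NamedTuple


class Stamp(NamedTuple):
    clock: int  # C_i(a), the logical clock value of the event
    pid: int    # index i of the process; assumed to define the order on processes


def precedes(a: Stamp, b: Stamp) -> bool:
    # a => b iff C(a) < C(b), or the clocks tie and a's process comes first.
    return (a.clock, a.pid) < (b.clock, b.pid)


events = [Stamp(2, 1), Stamp(1, 2), Stamp(2, 0)]
ordered = sorted(events)  # tuples compare lexicographically, matching precedes()
print(ordered)
```

Any two distinct events are comparable under this relation, which is what turns the partial order into a total one.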
18
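Example Application
Shared resource granting
–Fixed number of processes
–Single shared resource
–Requirements:
I. Mutually exclusive: a process holding the resource must release it before it is granted to another process
II. Fair: requests are granted in the order in which they are made
III. Exhaustive: if every holder eventually releases the resource, every request is eventually granted
Total Ordering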
19
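Example Application
Solution: distributed algorithm
Model:
–Channels are FIFO
–Each process maintains its own request queue
Algorithm:
–Request: P_i broadcasts “T_m:P_i requests resource” and adds it to its own queue
–Release: P_i broadcasts “T_m:P_i releases resource” and removes its request from its own queue
–Receive request: enqueue it and send a timestamped acknowledgment
–Receive release: dequeue the matching request
–Resource granted to P_i (local decision) when both hold: its request “T_m:P_i requests resource” has the minimum timestamp in its queue (ordered by ⇒), and P_i has received from every other process a message timestamped later than T_m
Note:
–Can be generalized to implement a replicated state machine!
Total Ordering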
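The local grant decision can be sketched as a pure function over a process's queue. This is an illustration under assumptions, not the paper's code: `may_enter`, `queue`, and `latest` are hypothetical names, requests are (timestamp, process id) pairs, and "later than" is taken in the total order with process ids breaking ties.

```python
def may_enter(pid, queue, latest):
    """Decide locally whether process `pid` may take the resource.

    queue:  set of (timestamp, pid) requests this process knows about
    latest: dict mapping every other pid to the timestamp of the most
            recent message received from it
    """
    own = [r for r in queue if r[1] == pid]
    if not own:
        return False  # no outstanding request of our own
    request = min(own)
    # Condition 1: our request precedes every other request in the queue.
    is_first = request == min(queue)
    # Condition 2: every other process has sent something stamped later.
    heard_later = all((ts, other) > request for other, ts in latest.items())
    return is_first and heard_later


blocked = may_enter(0, {(3, 0), (2, 1)}, {1: 5, 2: 4})  # P1 asked earlier
granted = may_enter(0, {(3, 0)}, {1: 5, 2: 4})          # P1 has released
waiting = may_enter(0, {(3, 0)}, {1: 5, 2: 1})          # nothing recent from P2
print(blocked, granted, waiting)
```

The second condition is what makes the decision safe: with FIFO channels, once a later-stamped message has arrived from a process, no earlier-stamped request from that process can still be in flight.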
20
Anomaly (animation built up over slides 20 to 24)
[Figure: two messages sent to Amazon.com with timestamps [19] and [7], linked by an external event outside the system.]
25
Strong Clock Condition
S = {events in the system}
S⁺ = S ⋃ {relevant external events}
⇝ is “happened before” for S⁺
∀ a, b ∈ S: a ⇝ b ⇒ C(a) < C(b)
Avoid Anomaly
26
Physical Clocks
PC1 (drift rate bound): ∃ κ ≪ 1 such that ∀ i: |dC_i(t)/dt - 1| < κ
PC2 (drift bound): ∀ i, j: |C_i(t) - C_j(t)| < ε
27
Avoid Anomaly
μ < shortest message transmission time
Requirement: ∀ i, j, t: C_i(t + μ) - C_j(t) > 0
This holds if ε/(1 - κ) ≤ μ
[Figure: Amazon.com and processes j and i, labeled with C_j(t), C_i(t + μ), and μ.]
Physical Clocks
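The bound on this slide follows in two steps from the physical clock conditions. The derivation below is a sketch that assumes the symbols of Lamport's paper: κ is the drift-rate bound of PC1, ε is the synchronization bound of PC2, and μ is a lower bound on message transmission time.

```latex
\begin{align*}
C_i(t+\mu) - C_i(t) &> (1-\kappa)\,\mu
  && \text{(PC1: every clock runs at a rate above } 1-\kappa\text{)} \\
C_i(t) - C_j(t) &> -\varepsilon
  && \text{(PC2)} \\
\Rightarrow\quad C_i(t+\mu) - C_j(t) &> (1-\kappa)\,\mu - \varepsilon \;\ge\; 0
  && \text{whenever } \frac{\varepsilon}{1-\kappa} \le \mu .
\end{align*}
```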
28
Implementation Rules
IR1’:
–For each i, if P_i does not receive a message at physical time t, then C_i is differentiable at t and dC_i(t)/dt > 0.
IR2’:
–(a) If P_i sends a message m at physical time t, then m contains a timestamp T_m = C_i(t).
–(b) Upon receiving a message m at time t’, process P_j sets C_j(t’) equal to max(C_j(t’ - 0), T_m + μ_m), where μ_m is the minimum delay of m.
Physical Clocks
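Rule IR2’(b) amounts to a single max operation. The sketch below uses integer milliseconds to keep the arithmetic exact; the function name `on_receive` and its parameter names are hypothetical.

```python
def on_receive(c_j_before, t_m, mu_m):
    """IR2'(b): new value of C_j when a message arrives.

    c_j_before: reading of P_j's clock just before the receipt, C_j(t' - 0)
    t_m:        timestamp carried by the message
    mu_m:       minimum delay the message is known to have experienced
    """
    # The clock never moves backwards, and it ends up at least T_m + mu_m.
    return max(c_j_before, t_m + mu_m)


behind = on_receive(1000, 1050, 20)  # receiver was behind: jumps forward
ahead = on_receive(1200, 1050, 20)   # receiver was ahead: left unchanged
print(behind, ahead)
```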
29
Synchronize Physical Clocks
Problem statement:
–IR1’ and IR2’ are followed,
–Message delay is bounded,
–Clocks satisfy PC1,
–Goal: PC2
Algorithm:
–Every τ seconds, a message is sent over every arc.
Guarantees:
–Clocks are synchronized after t_0 + τd, with ε ≈ d(2κτ + ξ), where d is the diameter of the process graph and ξ bounds the unpredictable part of the message delay
Physical Clocks
30
Beyond…
Shortcomings:
–No gap-detection property
–C(a) < C(b) ⇒ ??? (the converse of the Clock Condition does not hold)
–Bounds are not practical (nor is PC!)
31
Gap Detection Property
Problem statement:
–Given: a, b, C(a), C(b), with C(a) < C(b),
–Determine whether an event c exists such that C(a) < C(c) < C(b).
Beyond…
32
Another Strong Clock Condition
a → b ⇔ C(a) < C(b)
Beyond…
33
What clock, then? Causal histories: Beyond… Vector Clocks:
34
More on Vector Clocks
–Strong Clock Condition
–Concurrent
–Pair-wise Inconsistent
–Consistent Cut
–Counting
–Gap Detection
Beyond…
35
More on Vector Clocks
–Strong Clock Condition
–Concurrent
–Pair-wise Inconsistent
–Consistent Cut
–Counting
–Gap Detection, but… only Weak Gap-Detection: given a, b, can detect the existence of c such that ¬(c → a) ∧ (c → b)
Beyond…
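Several of the properties listed on this slide reduce to component-wise comparisons of vectors. The sketch below is an illustration under assumptions: the function names are hypothetical, a vector clock is a plain list with one entry per process, and a cut is described by the vector clock of its last event in each process.

```python
def happened_before(u, v):
    # Strong Clock Condition: a -> b iff VC(a) <= VC(b) component-wise and VC(a) != VC(b).
    return all(x <= y for x, y in zip(u, v)) and u != v


def concurrent(u, v):
    return not happened_before(u, v) and not happened_before(v, u)


def merge(u, v):
    # Component-wise maximum, applied by a receiver before it counts its own event.
    return [max(x, y) for x, y in zip(u, v)]


def is_consistent_cut(frontier):
    # frontier[i] is the vector clock of the last event of process i in the cut.
    # Consistent iff no process has seen more of process i than i itself has done.
    n = len(frontier)
    return all(frontier[i][i] >= frontier[j][i] for i in range(n) for j in range(n))


a = [1, 0]             # local event on P0
s = [2, 0]             # P0 sends a message
b = [0, 1]             # local event on P1
r = merge(b, s)        # P1 receives the message ...
r[1] += 1              # ... and counts the receive event: r == [2, 2]
print(happened_before(s, r), concurrent(a, b))
print(is_consistent_cut([s, r]), is_consistent_cut([a, r]))
```

The cut [a, r] is rejected because it contains the receive event r but not the send event s that caused it.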
36
Reference
O. Babaoglu and K. Marzullo. Consistent global states of distributed systems: Fundamental concepts and mechanisms. In Sape Mullender, editor, Distributed Systems, ch. 4, pages 55–96. Addison Wesley, 2nd ed., 1993. http://citeseer.ist.psu.edu/babaoglu93consistent.html
Note: some material from this paper is used to clarify a few concepts in the next paper.
Beyond…
37
Distributed Snapshots: Determining Global States of Distributed Systems K. Mani Chandy Leslie Lamport
38
Introduction Outline of the paper: –Motivation –Model –Algorithm –Correctness –Other issues Beyond…
39
Motivation
Capture the global state of a system. Really?
True global state: impossible!
[Figure: space-time diagram of process p1 (events e11 to e16) and process p2 (events e21 to e25).]
40
Motivation
Capture the global state of a system. Really?
These are what can be done. Are they useful?
[Figure: the same space-time diagram of p1 and p2.]
41
Motivation
Capture the global state of a system. Useful?
Equivalent!
[Figure: two copies of the space-time diagram of p1 and p2.]
42
Motivation
Capture the global state of a system. Useful?
Consistent, but never actually occurred.
[Figure: two copies of the space-time diagram of p1 and p2.]
43
Motivation
Capture the global state of a system. Useful?
Not even consistent!
[Figure: two copies of the space-time diagram of p1 and p2.]
44
Motivation
Capture the global state of a system. Useful?
Yes:
–To detect stable properties of a system: y(S) ⇒ y(S’) for all S’ reachable from S.
–E.g.: “computation has terminated,” “the system is deadlocked,” “all tokens in a token ring have disappeared.”
45
Model
A distributed system (figure on the right).
A global state = the set of all process states and channel states.
Event:
–atomic
–e = ⟨p, s, s′, M, c⟩ (process p moves from state s to state s′, sending or receiving message M on channel c)
Computation:
–seq = (e_i : 0 ≤ i ≤ n)
–S_{i+1} = next(S_i, e_i)
Channels’ assumptions:
–Singly directed
–FIFO
–Asynchronous
–Error free
–Infinite buffer
46
Algorithm
Invoker: behaves as if it had received a marker from a virtual node.
Receiving rule for process q receiving a marker along channel c:
  if q has not recorded its state then
    begin
      q records its state;
      q records the state of c as the empty sequence
    end
  else
    q records the state of c as the sequence of messages received along c after q’s state was recorded and before q received the marker along c.
Sending rule for a process p:
  for each outgoing channel c:
    p sends one marker along c after p records its state and before p sends further messages along c.
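The two rules can be exercised in a small single-threaded simulation. This is a sketch under stated assumptions rather than the paper's code: the classes `Process` and `Network`, the use of an integer balance as process state, and the hand-written delivery schedule are all hypothetical. Channels are FIFO deques, one per ordered pair of processes.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, balance):
        self.balance = balance   # the local state (assumed to be a single integer)
        self.recorded = None     # recorded local state, None until recorded
        self.chan_state = {}     # incoming channel -> messages recorded for it
        self.recording = set()   # incoming channels still being recorded


class Network:
    def __init__(self, balances):
        self.procs = {n: Process(b) for n, b in balances.items()}
        self.chan = {(a, b): deque() for a in balances for b in balances if a != b}

    def send(self, src, dst, amount):
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def record(self, name):
        # Record local state, start recording every incoming channel, and apply
        # the sending rule: one marker per outgoing channel before further messages.
        p = self.procs[name]
        p.recorded = p.balance
        incoming = [c for c in self.chan if c[1] == name]
        p.recording = set(incoming)
        p.chan_state = {c: [] for c in incoming}
        for c in self.chan:
            if c[0] == name:
                self.chan[c].append(MARKER)

    def deliver(self, src, dst):
        c = (src, dst)
        msg = self.chan[c].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if p.recorded is None:
                self.record(dst)    # first marker: record state, c is recorded as empty
            p.recording.discard(c)  # the marker closes the recording of channel c
        else:
            p.balance += msg
            if c in p.recording:    # arrived after recording, before the marker on c
                p.chan_state[c].append(msg)


net = Network({"p": 10, "q": 10})
net.send("p", "q", 3)   # in flight when the snapshot starts
net.record("p")         # invoker: p acts as if it had received a marker
net.send("q", "p", 4)   # q has not seen the marker yet
for _ in range(2):
    net.deliver("p", "q")   # the 3, then the marker
for _ in range(2):
    net.deliver("q", "p")   # the 4, then q's marker
total = sum(p.recorded for p in net.procs.values()) + sum(
    sum(m) for p in net.procs.values() for m in p.chan_state.values())
print(total)
```

In this run the recorded state is p = 7, q = 9, with the message 4 recorded on the channel from q to p. The recorded total equals the 20 units the system started with, even though no single instant of the run had exactly these balances.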
47
Illustration Next 14 slides, courtesy of Professor Birman.
48
Chandy/Lamport (animation over slides 48 to 61)
[Figure sequence: a network of processes p, q, r, s, t, u, v, w, x, y, z.]
Captions, in order:
–A network
–I want to start a snapshot
–p records local state
–p starts monitoring incoming channels
–“contents of channel p-y”
–p floods message on outgoing channels…
–q is done
–Processes labeled in later frames: q; then q, z, s; then q, v, z, x, u, s; then q, v, w, z, x, u, s, y, r
–A snapshot of a network: q, x, u, s, v, r, t, w, p, y, z. Done!
62
Correctness Consistency Termination
63
Consistency
A message m is recorded iff send(m) is recorded:
–the sender’s state recording and marker sending are done atomically.
m is not recorded more than once:
–if the channel is recorded before the receiver, it is recorded as empty.
–if the channel is recorded after the receiver, none of the in-channel messages is also recorded as part of the receiver’s state.
Correctness:
64
Termination
Assumptions:
–L1: no marker remains forever in a channel.
–L2: processes’ states are recorded in finite time.
Every process either spontaneously records its state, or is reachable along a path from a process that does.
Every channel is flushed by a marker after the sender records its state.
Correctness:
65
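Remaining Issues
Property of the recorded state: S_i --> S* --> S_f (S* is reachable from the state S_i in which the algorithm starts, and the state S_f in which it terminates is reachable from S*).
Stable detection:
–Stable property y: y(S_i) ⇒ definite, and definite ⇒ y(S_f)
–Algorithm: begin record a global state S*; definite := y(S*) end.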
66
Beyond… Channels’ assumptions: –Singly directed –FIFO –Asynchronous –Error free –Infinite buffer
67
Non-FIFO
What is FIFO for?
–To separate messages sent before the snapshot from messages sent after it.
A snapshot counter piggybacked on messages would do just fine!
Beyond:
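One way the piggybacked counter might work is sketched below. This is an assumption about the intended scheme, not something the slides spell out: each message carries the sender's snapshot counter, a receiver that sees a higher counter records its state before applying the message, and a message with a lower counter that arrives after recording is logged as channel state. The class `Node` and its fields are hypothetical, and detecting when the channel log is complete is left out.

```python
class Node:
    def __init__(self, state):
        self.state = state
        self.epoch = 0          # snapshot counter, piggybacked on every message
        self.recorded = None    # recorded local state
        self.channel_log = []   # pre-snapshot messages that arrived after recording

    def take_snapshot(self):
        self.recorded = self.state
        self.epoch += 1

    def send(self, payload):
        return (self.epoch, payload)

    def receive(self, message):
        epoch, payload = message
        if epoch > self.epoch:
            # The sender had already recorded: record before applying the message.
            self.take_snapshot()
        elif epoch < self.epoch:
            # Sent before the sender's snapshot, received after ours: channel state.
            self.channel_log.append(payload)
        self.state += payload


a, b = Node(10), Node(0)
m_old = a.send(1)   # sent before a's snapshot
a.take_snapshot()
m_new = a.send(2)   # sent after a's snapshot
b.receive(m_new)    # the non-FIFO channel delivers the newer message first
b.receive(m_old)
print(b.recorded, b.channel_log, b.state)
```

The counter does the job the marker's position in a FIFO channel would otherwise do: it tells the receiver on which side of the snapshot each message was sent, regardless of arrival order.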
68
Beyond… Channels’ assumptions: –Singly directed –FIFO –Asynchronous –Error free –Infinite buffer Messages can be corrupted/duplicated Messages can be dropped
69
Unreliable channels How to deal with corruption? –Checksum/ECC; reduced to drop. How to deal with duplication? –Message ID How to deal with dropping? –Channel states are not needed anymore. –Markers indicate completion. Beyond:
70
Even More Aggressive… Don’t want to piggyback! Step 1: no piggybacking: –Block all messages sent after recording local state and before receiving marker from all neighbors. Step 2: no blocking, min piggybacking –Blocked messages are sent with piggybacked snapshot info. Beyond:
71
Conclusion
Two influential papers.
Much work has been built upon these results.
Can be improved significantly when adapted to particular systems.
Additional comments/suggestions?