Time And Global Clocks CMPT 431.

Slides:



Advertisements
Similar presentations
Global States.
Advertisements

Virtual Time “Virtual Time and Global States of Distributed Systems” Friedmann Mattern, 1989 The Model: An asynchronous distributed system = a set of processes.
Time and Global States Part 3 ECEN5053 Software Engineering of Distributed Systems University of Colorado, Boulder.
Synchronization Chapter clock synchronization * 5.2 logical clocks * 5.3 global state * 5.4 election algorithm * 5.5 mutual exclusion * 5.6 distributed.
CS542 Topics in Distributed Systems Diganta Goswami.
Dr. Kalpakis CMSC 621, Advanced Operating Systems. Logical Clocks and Global State.
Distributed Systems Spring 2009
CS514: Intermediate Course in Operating Systems Professor Ken Birman Vivek Vishnumurthy: TA.
Ordering and Consistent Cuts Presented By Biswanath Panda.
CMPT 431 Dr. Alexandra Fedorova Lecture VIII: Time And Global Clocks.
Distributed Systems Fall 2009 Logical time, global states, and debugging.
CPSC 668Set 12: Causality1 CPSC 668 Distributed Algorithms and Systems Fall 2009 Prof. Jennifer Welch.
Ordering and Consistent Cuts Presented by Chi H. Ho.
EEC-681/781 Distributed Computing Systems Lecture 11 Wenbing Zhao Cleveland State University.
Computer Science Lecture 10, page 1 CS677: Distributed OS Last Class: Clock Synchronization Physical clocks Clock synchronization algorithms –Cristian’s.
Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms CS 249 Project Fall 2005 Wing Wong.
Global Predicate Detection and Event Ordering. Our Problem To compute predicates over the state of a distributed application.
Time, Clocks, and the Ordering of Events in a Distributed System Leslie Lamport (1978) Presented by: Yoav Kantor.
Dr. Kalpakis CMSC 621, Advanced Operating Systems. Fall 2003 URL: Logical Clocks and Global State.
Chapter 5.
CIS 720 Distributed algorithms. “Paint on the forehead” problem Each of you can see other’s forehead but not your own. I announce “some of you have paint.
1 Distributed Systems CS 425 / CSE 424 / ECE 428 Global Snapshots Reading: Sections 11.5 (4 th ed), 14.5 (5 th ed)  2010, I. Gupta, K. Nahrtstedt, S.
1 Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms Author: Ozalp Babaoglu and Keith Marzullo Distributed Systems: 526.
Issues with Clocks. Context The tree correction protocol was based on the idea of local detection and correction. Protocols of this type are complex to.
Lecture 6-1 Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013 Indranil Gupta (Indy) September 12, 2013 Lecture 6 Global Snapshots Reading:
Chapter 10: Time and Global States
Synchronization. Why we need synchronization? It is important that multiple processes do not access shared resources simultaneously. Synchronization in.
“Virtual Time and Global States of Distributed Systems”
Communication & Synchronization Why do processes communicate in DS? –To exchange messages –To synchronize processes Why do processes synchronize in DS?
Distributed Systems Fall 2010 Logical time, global states, and debugging.
Event Ordering Greg Bilodeau CS 5204 November 3, 2009.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Global States Steve Ko Computer Sciences and Engineering University at Buffalo.
Logical Clocks. Topics r Logical clocks r Totally-Ordered Multicasting.
Event Ordering. CS 5204 – Operating Systems2 Time and Ordering The two critical differences between centralized and distributed systems are: absence of.
Ordering of Events in Distributed Systems UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau.
Distributed systems. distributed systems and protocols distributed systems: use components located at networked computers use message-passing to coordinate.
CSE 486/586 CSE 486/586 Distributed Systems Global States Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed Systems Lecture 6 Global states and snapshots 1.
Ordering of Events in Distributed Systems UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Global state and snapshot
Global state and snapshot
Vector Clocks and Distributed Snapshots
CSE 486/586 Distributed Systems Global States
Overview of Ordering and Logical Time
Distributed Snapshot.
COT 5611 Operating Systems Design Principles Spring 2012
EECS 498 Introduction to Distributed Systems Fall 2017
湖南大学-信息科学与工程学院-计算机与科学系
Slides for Chapter 14: Time and Global States
Logical Clocks and Casual Ordering
Outline Theoretical Foundations - continued Lab 1
Event Ordering.
Outline Theoretical Foundations
CS 425 / ECE 428  2013, I. Gupta, K. Nahrtstedt, S. Mitra, N. Vaidya, M. T. Harandi, J. Hou.
Chapter 5 (through section 5.4)
Slides for Chapter 11: Time and Global State
Slides for Chapter 14: Time and Global States
Lecture 8 Processes and events Local and global states Time
Outline Theoretical Foundations - continued
ITEC452 Distributed Computing Lecture 8 Distributed Snapshot
Distributed Snapshot.
CSE 486/586 Distributed Systems Global States
Jenhui Chen Office number:
Distributed algorithms
CIS825 Lecture 5 1.
Consistent cut If this is not true, then the cut is inconsistent
COT 5611 Operating Systems Design Principles Spring 2014
Outline Theoretical Foundations
Distributed Snapshot.
Presentation transcript:

Time And Global Clocks CMPT 431

A Distributed Computation With Data Dependencies System A System B A waits for B B waits for C System C Ask work from System B Compute Send result to user Ask data from System C Respond to System A System C became unresponsive

Detecting and Fixing the Problem User gets no response from System A How do we detect that System C is causing the problem? Easy to detect if we get a snapshot of a global state Constructing global state in an asynchronous distributed system is a difficult problem

A Naïve Approach Each system records each event it performed and its timestamp Suppose events in the this system happened in this real order: Time Tc0: System C sent data to System B (before C stopped responding) Time Ta0: System A asked for work from System B Time Tb0: System B asked for data from System C Tc0 Ta0 Tb0

A Naïve Approach (cont) Ideally, we will construct real order of events from local timestamps and detect this dependency chain: System A Ta System A asked for work Tb System B System B asked for data System C sent data System C Tc

A Naïve Approach (cont) But in reality, we do not know if Tc occurred before Ta and Tb, because in an asynchronous distributed system clocks are not synchronized! System A Ta System A asked for work Tb System B System B asked for data System C sent data System C sent data System C Tc Tc

This Lecture’s Topic We will learn how to solve this problem in an asynchronous distributed system Our goal is to construct a consistent global state in presence of unsynchronized local clocks We will begin by creating a formal model for Distributed computation Event precedence in a distributed computation We will use event precedence model to define consistent global state Then we will learn to construct consistent global states

Model of a Distributed Computation Collection of processes p0, …, pn Processes communicate by sending messages Event send(m) enqueues message m on the sending end Event receive(m) dequeues message m on the receiving end Events ei0, ei1, … , ein are steps of a distributed computation at process pi hi = ei0, ei1, … , ein is a local history of pi

Events and Causal Precedence In an asynchronous distributed system there is no notion of global clock Events cannot be ordered according to some real order We can only talk about causal event ordering, that is if one event affects the outcome of another event For example events send(m) and receive(m) are in causal order, because receive(m) could not have occurred before send(m)

Formal Rules for Ordering of Events If local events precede one another, then they also precede one another in the global ordering: If eik ,eim Є hi and k < m, then eik→eim Sending a message always precedes receipt of that message: If ei = send(m) and ej= receive(m), then ei→ej Event ordering is associative: If e → e’ and e’ → e”, then e → e”

Space-time Diagram of a Distributed Computation e21→e36 e22 || e36

Cuts of a Distributed Computation Suppose there is an external monitor process External monitor constructs a global state: Asks processes to send it local history Global state constructed from these local histories is a cut of a distributed computation

Example Cuts e11 e12 e13 e14 e15 e16 e21 e22 e23 e31 e32 e33 e34 e35

Consistent vs. Inconsistent Cuts A cut is consistent if for any event e included in the cut, any event e’ that causally precedes e is also included in that cut For cut C: (e Є C) Λ (e’→ e) => e’ Є C

Are These Cuts Consistent? p1 e21 e22 e23 p2 e31 e32 e33 e34 e35 e36 p3 C

Are These Cuts Consistent? causally precedes e36 e11 e12 e13 e14 e15 e16 …but not included in C e21 e22 e23 e31 e32 e33 e34 e35 e36 inconsistent included in C C’

Cuts and Global State Recall that our goal is to construct a consistent global state We’d like to get a snapshot of a distributed system as would have been observed by a perfect observer A consistent cut corresponds to a consistent global state What’s next: we will learn how to construct consistent cuts in the absence of synchronized clocks Consistent global states can be constructed via: Active monitoring Passive monitoring

Passive vs. Active Monitoring Passive monitoring: There is a process p0 external to the distributed computation There are processes pi (1 ≤ i ≤ n), part of the computation Each process pi sends to p0 a timestamp of each event it executes Active monitoring: When p0 desires to find out the state of the system it asks all processes pi to send its state

Passive vs. Active Monitoring You need both passive and active monitoring The kind of monitoring you use depends on what property of a distributed system you are trying to detect Passive monitoring gives you more information about the distributed computation than active monitoring Passive monitoring records the entire event history Passive monitoring uses logical clocks or vector clocks – a way to timestamp an event in the absence of global clock Active monitoring does not depend of special clocks, but it is also less powerful

What Do We Need to Know to Construct a Consistent Cut? causally precedes e36 e11 e12 e13 e14 e15 e16 We must know the causal ordering of events. If we do we can detect an inconsistent cut …but not included in C e21 e22 e23 e31 e32 e33 e34 e35 e36 inconsistent included in C C’

Constructing a Consistent Cut Via Passive Monitoring We must know the causal ordering of event If there was a global clock, we would detect this as follows: RC(e) – timestamp of event e given by a global clock Clock condition: e’→ e => RC(e) < RC(e’) What to do if there is no global clock? We will use logical clocks and vector clocks

Logical Clocks e11 e12 e13 e14 e15 e16 LC=1 LC=2 LC=3 LC=4 LC=5 LC=6 Each process maintains a local value of a logical clock LC Logical clock of process p counts how many events in a distributed computation causally preceded the current event at p (including the current event). LC(ei) – the logical clock value at process pi at event ei Suppose we had a distributed system with only a single process e11 e12 e13 e14 e15 e16 LC=1 LC=2 LC=3 LC=4 LC=5 LC=6

Logical Clocks (cont.) In a system with more than one process logical clocks are updated as follows: Each message m that is sent contains a timestamp TS(m) TS(m) is the logical clock value associated with sending event at the sending process e11 e12 e13 e14 e15 e16 LC=1 send(m) TS(m) = 1

Logical Clocks (cont) When the receiving process receives message m, it updates its logical clock to: max{LC, TS(m)} + 1 e11 e12 e13 e14 e15 e16 LC=1 send(m) TS(m) = 1 What is the LC value of e22? e21 e22 LC=1 LC=2

Illustration of a Logical Clock 1 2 \ 3 4 5 6 7 p1 1 5 6 p1 1 2 3 4 5 7 p1

Causal Delivery with Logical Clocks We have a external monitor process p0 Each process pi sends an event notification message m to p0 after each event ei To let p0 construct a correct causal ordering of events, we define the following delivery rule for event notification messages: Causal delivery rule: Event notification messages are delivered to p0 in the increasing logical clock timestamp order

Causal Delivery Rule 1 2 3 4 p0 1 2 4 5 6 7 p1 1 2 3 4 5 7 p2 m(LC=1)

Gap Detection Rule 1 2 3 4 p0 1 2 4 5 6 7 p1 1 2 3 4 5 7 p2 Oops! m(LC=1) m(LC=2) m(LC=4) m(LC=1) m(LC=3) 1 2 4 5 6 7 p1 Must not deliver a message unless 100% sure that all messages with smaller timestamps have been received 1 2 3 4 5 7 p2

Rules to Achieve Causal Delivery Do not deliver event notification message from a process pi unless all messages with smaller timestamps have been delivered (FIFO delivery) Rule #2: Do not deliver to p event notification message with a timestamp TS(m) unless p received messages with timestamps one smaller, equal or greater than TS(m) from all other processes

Causal Delivery Rules 1 1 2 p0 1 2 4 5 6 7 p1 1 2 3 4 5 7 p2 m(LC=1)

Causal Precedence Relation Detecting causal precedence relationship with logical clocks requires sending more than just timestamps!

Causal Precedence or Concurrency? 1 1 2 2 p0 1 2 p1 e11 e11and e22are concurrent e22 1 2 p2

Causal Precedence or Concurrency? 1 1 2 2 p0 1 2 p1 e11 Now e11and e22are causally related But timestamps have not changed!!! e22 1 2 p2

Causal Precedence Relation Using just timestamps from logical clocks does not help us detect whether two events are causally related or concurrent We could fix this by including a local history in the event notification message Not a good idea: messages will become large A better idea: vector clocks

Introduction to Vector Clocks The clock of each process is represented by a vector The vector dimension is the number of processes in the system Each entry of a vector clock corresponds to a process Suppose there are three processes: p1, p2, and p3 Then the vector clock at each process will look like this: VC[1] – entry for process p1 VC[2] – entry for process p2 VC[3] – entry for process p3

Arrives with the message Using Vector Clocks VC[1] = 0 VC[2] = 0 VC[1] = 1 VC[2] = 0 1 p1 Increment local component to “1” for an internal of “send” event Arrives with the message Increment other processes’ components to be no less than at the incoming VC Initialize to 0 VC[1] = 1 VC[2] = 0 1 2 p2 VC[1] = 0 VC[2] = 0 VC[1] = 0 VC[2] = 1 VC[1] = 1 VC[2] = 2 Increment the local component

The Meaning of Vector Clocks An entry of a vector clock for process pi is the number of events at pi VC[1] = 3 – number of events at p1 VC[2] = 0 – number of events at p2 VC[3] = 1 – number of events at p3 p1 knows for sure how many events it executed But how can it be sure about p2 and p3? It can’t! So p1‘s entries for p2 and p3 are... what p1 thinks about p2 and p3

Meaning of Vector Clocks (cont) p1 thinks incorrectly about other processes if p1 does not communicate with other processes Once p1 exchanges messages with other processes, it updates its information about other processes using vector clock timestamps received from other processes

Another Vector Clock Example VC[1] = 1 VC[2] = 0 VC[3] = 0 VC[1] = 2 VC[1] = 3 VC[1] = 4 VC[2] = 1 VC[3] = 3 VC[1] = 5 VC[2] = 1 VC[3] = 3 VC[1] = 6 VC[2] = 1 VC[3] = 3 VC[2] = 1 VC[2] = 1 VC[3] = 0 VC[3] = 3 p1 VC[1] = 0 VC[2] = 1 VC[3] = 0 p2 VC[1] = 1 VC[2] = 2 VC[3] = 4 VC[1] = 4 VC[2] = 3 VC[3] = 4 p3 VC[1] = 0 VC[2] = 0 VC[3] = 1 VC[1] = 1 VC[1] = 1 VC[2] = 0 VC[3] = 3 VC[1] = 1 VC[2] = 0 VC[3] = 4 VC[1] = 1 VC[2] = 0 VC[3] = 5 VC[1] = 5 VC[2] = 1 VC[3] = 6 VC[2] = 0 VC[3] = 2

Causal Precedence or Concurrency? VC[1] = 1 VC[2] = 0 VC[1] = 0 VC[2] = 1 VC[1] = 0 VC[2] = 2 VC[1] = 2 VC[2] = 0 p0 VC[1] = 1 VC[2] = 0 VC[1] = 2 VC[2] = 0 1 2 p1 e11 VC[1] = 0 VC[2] = 2 VC[1] = 0 VC[2] = 1 e11and e22are concurrent e22 1 2 p2

Causal Precedence or Concurrency? VC[1] = 1 VC[2] = 0 VC[1] = 0 VC[2] = 1 VC[1] = 1 VC[2] = 2 VC[1] = 2 VC[2] = 0 p0 Now e11and e22are causally related VC[1] = 1 VC[2] = 0 VC[1] = 2 VC[2] = 0 1 2 p1 e11 VC[1] = 0 VC[2] = 1 VC[1] = 1 VC[2] = 2 e22 1 2 p2

Vector Clocks Detect Causal Precedence Let us look at timestamps received by observer p0: In case when events were concurrent: From p1: From p2: In case when events where causally related: VC[1] = 2 VC[2] = 0 VC[1] = 0 VC[2] = 2 VC[1] = 2 VC[2] = 0 Now the timestamps help detect causal precedence VC[1] = 1 VC[2] = 2

Use Vector Clocks to Construct Consistent Cuts A consistent cut is a cut that does not include pairwise inconsistent events So the first step is to understand pairwise inconsistency

Informal Definition of Pairwise Inconsistency Look at vector timestamps of two events: ei at pi and ej at pj If at event ei process pi thinks that process pj executed more events than process pj thinks it has executed at event ej, the events are pairwise inconsistent Example: timestamp at p3 timestamp at p1 The events are pairwise inconsistent Process p1 must know better about what it’s doing than process p3 could ever know. VC[1] = 5 VC[2] = 1 VC[3] = 6 VC[1] = 4 VC[2] = 1 VC[3] = 3 p3 thinks that p1 has executed 5 events p1 has executed 4 events!

Formal Definition of Pairwise Inconsistency Two events ei and ej , are pairwise inconsistent if: (VC(ei )[i] < VC(ej)[i]) ٧ (VC(ej )[j] < VC(ei)[j]) the number of events pi thinks to have executed at event ei the number of events pj thinks that pi has executed at event ej

Cut Consistency Among all events in the frontier of the cut, check each pair of events for pairwise inconsistency If at least two events are pairwise inconsistent, then the cut is inconsistent

Is this Cut Consistent? p1 p2 p3 VC[1] = 1 VC[2] = 0 VC[3] = 0

And What About This One? p1 p2 p3 VC[1] = 1 VC[2] = 0 VC[3] = 0

Active Monitoring Recall: we used vector clocks so we can construct consistent global states (or cuts) from processes’ local histories via passive monitoring Remember, there is another type of monitoring – active monitoring: Does not depend of vector clocks, but it is less powerful Active monitoring is also referred to as constructing a distributed snapshot

Distributed Snapshot Protocol (Chandy and Lamport) Monitor p0 sends “take snapshot” message to all processes pi If pi receives such “take snapshot” message for the first time, it: Records its state Stops doing any activity related to the distributed computation Relays the “take snapshot” message on all of its outgoing channels Starts recording a state of its incoming channels – records all messages that have been delivered after the receipt of the first “take snapshot” message

Distributed Snapshot Protocol (Chandy and Lamport)(cont.) When pi receives a “take snapshot” message from process pj, it stops recording the state of the channel between itself and pj When pi receives “take snapshot” message from all processes and from po, it stops recording the snapshot and sends it to po

Properties of Chandy-Lamport Protocol Let’s see why this protocol constructs only consistent snapshots, provided that FIFO channels are used Recall the key property of a consistent snapshot (i.e., consistent cut): If event (e → e”) and e” is included in the snapshot, then e must also be included in that snapshot Let’s see why this property holds if we use the Chandy-Lamport distributed snapshot protocol

Properties of Chandy-Lamport Protocol Can it happen that e21 is included in the snapshot and e11 is not? p0 Suppose p1 finished taking the snapshot before p2 received the first “take snapshot” message e11 p1 This is the only situation when e21 would be included in the snapshot and e11 would not be snapshot finish e21 p2 Let’s see why this won’t happen “take snapshot” message

Chandy-Lamport Protocol: Example 1 Recall that processes relay the first “take snapshot” message on all outgoing channels snapshot finish e11 p1 By protocol, p1 cannot finish the snapshot until it receives the second “take snapshot” message from all other processes snapshot finish e21 p2 So p1 cannot send e11 and finish taking the snapshot before p2 received the first “take snapshot” message. So that situation will not happen

Chandy-Lamport Protocol: Example 2 snapshot finish So that situation will not happen e11 p1 e21 p2 But p2 must stop recording events from p1 after it receives the “take snapshot” message from p1.

Using Global States for Evaluating System Properties We now know how to construct consistent global states using active and passive monitoring Let us look how consistent global states can be used to solve problems in distributed systems Global state is constructed in order to evaluate a predicate: “Is the system in deadlock?” “Is the memory unreachable (can it be garbage collected)?” “Does x=y?” (For distributed debugging)

Stable and Unstable Predicates Predicates can be stable and unstable A stable predicate does not change its value after the system has reached this state Deadlock – once the system deadlocks it stays there Garbage collection – once the memory is no longer referenced, no one could acquire a reference for it An unstable predicate can change its value Does x=y? If this were true at one point, it could no longer be true later Stable predicates are easy to check – they can be checked from distributed snapshots or vector clock based causal event histories

Unstable Predicates Are Harder to Check x = 3 x = 4 Want to evaluate predicate: (y-x) = 2 An external monitor knows causal ordering of events, not the actual Valid actual sequences are: e11 e21 e12 e13 e22 e11 e21 e12 e22 e13 They are both valid because they preserve causal ordering e21 → e12 e11 e12 e13 p1 VC[1] = 1 VC[2] = 0 VC[1] = 2 VC[2] = 1 VC[1] = 0 VC[2] = 1 VC[1] = 0 VC[2] = 2 e21 e22 p2 y = 6 y = 4

Unstable Predicates x = 3 x = 4 Valid actual sequences: e11 e21 e12 e13 e22 y –x = 2 is TRUE after e13 e11 e21 e12 e22 e13 y –x = 2 is NOT TRUE e11 e12 e13 p1 VC[1] = 1 VC[2] = 0 VC[1] = 2 VC[2] = 1 x = 3 y = 6 x = 4 VC[1] = 0 VC[2] = 1 VC[1] = 0 VC[2] = 2 e21 e22 p2 y = 6 y = 4 x = 3 y = 6 y = 4 x = 4

Detecting “Possibly True” With unstable predicates you cannot always detect whether a predicate was definitely true But you can detect if it was possibly true Need to look at all possible valid sequences of events If the predicate is true in at least one of them, then it is possibly true Detecting “possibly true” is useful for distributed debugging If a wrong condition is possibly true, the system has a bug Even if the bug was not triggered it is possible that it will be triggered under a different ordering of events

Detecting “Definitely True” Sometimes you can detect definitely true Construct all valid system states If the condition is true in all of them, then the condition is definitely true Any actual ordering of events would have triggered the condition, given the correct causal ordering

Summary Observing a global state of the system is useful for: Distributed debugging Distributed garbage collection Deadlock detection Constructing a consistent global state is difficult in absence of global clocks Consistent global states can be constructed using Active monitoring (distributed snapshots) Passive monitoring (local event histories with vector clock timestamps)

Summary (cont.) Distributed snapshots have limited use Can be used to evaluate stable predicates Local event histories with vector clock timestamps contain more information – one can reconstruct a causal history of a distributed computation Can be used to evaluate unstable predicates Unstable predicates cannot always be evaluated with certainty One can always evaluate “possibly true” Cannot always evaluate “definitely true”