Causality & Global States



[Figure: space-time diagram of processes P1, P2 and P3 against physical time, with the messages Include(obj1) and obj1.method(); P2 has obj1.]

A causality violation occurs when the order of messages causes an action based on information that another host has not yet received. When designing a distributed system (DS), the potential for causality violations has to be taken into account.

Detecting Causality Violation

Potential causality violations can be detected with vector timestamps. If, on arrival, the vector timestamp of a message is less than the local vector timestamp, there is a potential causality violation.

[Figure: space-time diagram of P1, P2 and P3 against physical time, annotated with local vector timestamps from 0,0,0 to 2,2,2 and with messages stamped (1,0,0), (2,0,0) and (2,0,2).]

Violation: (1,0,0) < (2,1,2)
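The comparison used in the detection rule above can be sketched as follows. This is a minimal illustration, assuming the usual partial order on vector timestamps (every component less than or equal, at least one strictly less); the function names `vt_less` and `potential_violation` are hypothetical and do not come from the slides.

```python
from typing import Sequence


def vt_less(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True if vector timestamp a is strictly less than b.

    Assumed ordering: a <= b in every component and a < b in at least one.
    """
    if len(a) != len(b):
        raise ValueError("vector timestamps must have the same length")
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def potential_violation(msg_ts: Sequence[int], local_ts: Sequence[int]) -> bool:
    """Flag a message whose timestamp is less than the receiver's local one."""
    return vt_less(msg_ts, local_ts)
```

With the values from the slide, `potential_violation((1, 0, 0), (2, 1, 2))` is true. Two timestamps that are incomparable, such as (2,0,2) and (2,1,0), are not flagged.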

Communication Modes in DS
- Unicast (best effort or reliable)
  - Messages are sent from exactly one host to one host
  - Best effort guarantees that if a message is delivered, it is intact
  - Reliable guarantees delivery of messages
- Broadcast
  - Messages are sent from exactly one host to all hosts on the same network
  - Reliable broadcast protocols are not practical
- Multicast
  - Messages are sent from exactly one host to several hosts on the same or different networks
  - Multicast can be implemented above a reliable unicast

Ordering Guarantees (FIFO)
- Process messages from each host in the order they were sent:
  - Each processor keeps a sequence number for each host (and uses its own sequence number to mark its own messages)
  - When a message is received, if Message# is
    - as expected: accept
    - higher than expected: buffer in a queue
    - lower than expected: reject
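The accept / buffer / reject rule above can be sketched as a small receiver. This is an illustration only: the class name `FifoReceiver` is hypothetical, and it assumes sequence numbers start at 1 for each sender and that buffered messages are released as soon as the gap before them is filled.

```python
from collections import defaultdict


class FifoReceiver:
    """Delivers messages from each sender in the order they were sent."""

    def __init__(self):
        # Next sequence number expected from each sender (assumed to start at 1).
        self.expected = defaultdict(lambda: 1)
        # Out-of-order messages: sender -> {sequence number: payload}.
        self.buffered = defaultdict(dict)
        # Messages handed to the application, in delivery order.
        self.delivered = []

    def receive(self, sender, seq, payload):
        exp = self.expected[sender]
        if seq < exp:
            return "reject"                      # lower than expected: duplicate
        if seq > exp:
            self.buffered[sender][seq] = payload
            return "buffer"                      # higher than expected: gap
        self._deliver(sender, payload)           # as expected
        # Release any buffered messages that are now in sequence.
        while self.expected[sender] in self.buffered[sender]:
            nxt = self.expected[sender]
            self._deliver(sender, self.buffered[sender].pop(nxt))
        return "accept"

    def _deliver(self, sender, payload):
        self.delivered.append((sender, payload))
        self.expected[sender] += 1
```

For example, if message 2 from P1 arrives before message 1, it is buffered; when message 1 arrives, both are delivered in order, and a later copy of message 1 is rejected.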

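Example: FIFO Multicast

[Figure: space-time diagram of P1, P2 and P3 against physical time. Each process starts with the sequence vector 0 0 0; each arriving message is labeled Accept (message number equals the expected one), Buffer (higher than expected) or Reject (lower than expected).]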

Ordering Guarantees (Causal)
- Process messages in an order that guarantees no potential causal violation
  - Each processor keeps a vector of sequence numbers, one entry for each host
  - A copy of the vector is sent with each message
  - When a message is received:
    - if all vector entries are as expected: accept
    - else, if any vector entry is
      - higher than expected: buffer in a queue
      - lower than expected: reject
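A causal receiver along these lines can be sketched as below. The slide does not spell out what "as expected" means, so this sketch assumes the common vector-clock delivery condition: the sender's entry must be exactly one more than the number of messages already delivered from that sender, and no other entry may exceed what has been delivered locally. The class name `CausalReceiver` and its methods are hypothetical.

```python
class CausalReceiver:
    """Delivers multicast messages in an order that respects causality."""

    def __init__(self, n):
        self.seen = [0] * n      # messages delivered so far from each host
        self.buffered = []       # (sender, vector, payload) waiting for predecessors
        self.delivered = []      # payloads in delivery order

    def _classify(self, sender, vec):
        if vec[sender] <= self.seen[sender]:
            return "reject"                      # already delivered: duplicate
        gap_from_sender = vec[sender] > self.seen[sender] + 1
        gap_from_others = any(
            vec[k] > self.seen[k] for k in range(len(vec)) if k != sender
        )
        return "buffer" if (gap_from_sender or gap_from_others) else "accept"

    def receive(self, sender, vec, payload):
        verdict = self._classify(sender, vec)
        if verdict == "buffer":
            self.buffered.append((sender, tuple(vec), payload))
        elif verdict == "accept":
            self._deliver(sender, payload)
            self._drain()
        return verdict

    def _deliver(self, sender, payload):
        self.seen[sender] += 1
        self.delivered.append(payload)

    def _drain(self):
        # Re-check buffered messages until none of them becomes deliverable.
        progress = True
        while progress:
            progress = False
            for item in list(self.buffered):
                sender, vec, payload = item
                verdict = self._classify(sender, vec)
                if verdict != "buffer":
                    self.buffered.remove(item)
                    if verdict == "accept":
                        self._deliver(sender, payload)
                    progress = True
```

Using 0-based host indices, a receiver that first gets the message stamped (1,1,0) from host 1 buffers it, because the message (1,0,0) from host 0 is still missing; once that message arrives, both are delivered in causal order.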

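Example: Causal Ordering Multicast

[Figure: space-time diagram of P1, P2 and P3 against physical time. P1 multicasts a message stamped (1,0,0); P2 accepts it and multicasts a message stamped (1,1,0). P3 receives (1,1,0) first and buffers it because P1(1) is missing; after accepting (1,0,0) it accepts the buffered message. Deliveries are labeled Accept, Buffer or Reject.]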

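Process Histories and States
- For a process $P_i$:
  - $\mathrm{history}(P_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \dots \rangle$
  - prefix: $\mathrm{history}(P_i^k) = h_i^k = \langle e_i^0, e_i^1, \dots, e_i^k \rangle$
  - $S_i^k$ is $P_i$'s state immediately before the $k$-th event
- For a set of processes:
  - global history: $H = \bigcup_i h_i$
  - global state: $S = \bigcup_i S_i^{k_i}$
  - a cut $C \subseteq H$: $C = h_1^{c_1} \cup h_2^{c_2} \cup \dots \cup h_n^{c_n}$
  - the frontier of $C$: $\{ e_i^{c_i},\ i = 1, 2, \dots, n \}$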

Consistent States
- A cut $C$ is consistent if $\forall e \in C$: if $f \rightarrow e$ then $f \in C$
- A global state $S$ is consistent if it corresponds to a consistent cut

[Figure: space-time diagram of P1, P2 and P3 with events $e_1^0 \dots e_1^3$, $e_2^0 \dots e_2^2$ and $e_3^0 \dots e_3^2$, showing one inconsistent cut and one consistent cut.]
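The consistency condition can be checked mechanically. The sketch below assumes a cut is given as a list of indices, where entry i is the index of the last event of process i inside the cut (-1 if none), and that the only cross-process dependencies are message send/receive pairs. Because a cut always contains a prefix of each process history, it is enough to check that every received message inside the cut was also sent inside the cut. The names `Event`, `in_cut` and `is_consistent` are hypothetical.

```python
from typing import List, Tuple

# (process index i, event index k) stands for the k-th event of process i.
Event = Tuple[int, int]


def in_cut(event: Event, cut: List[int]) -> bool:
    """True if the event lies inside the cut (at or before the frontier)."""
    process, index = event
    return index <= cut[process]


def is_consistent(cut: List[int], messages: List[Tuple[Event, Event]]) -> bool:
    """True if every receive event in the cut has its send event in the cut.

    messages is a list of (send_event, receive_event) pairs.
    """
    return all(
        in_cut(send, cut) for send, recv in messages if in_cut(recv, cut)
    )
```

For a single message sent at event 1 of process 0 and received at event 1 of process 1, the cut `[0, 1]` is inconsistent (the receive is included but the send is not), while `[1, 1]` and `[1, 0]` are consistent.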

Global States
- A run is a total ordering of the events in $H$ that is consistent with each $h_i$'s ordering
- A linearization is a run that is consistent with the happens-before ($\rightarrow$) relation in $H$
- Linearizations pass through consistent global states
- A global state $S_k$ is reachable from a global state $S_i$ if there is a linearization $L$ that passes through $S_i$ and then through $S_k$
- A DS evolves as a series of transitions between global states $S_0, S_1, \dots$

Global State Predicates
- A global-state predicate is a function from the set of global states to {true, false}, e.g. deadlock, termination
- A stable global-state predicate is one that, once it becomes true, remains true in all subsequent global states, e.g. deadlock
- If $P$ is a global-state predicate, then
  - $\mathrm{safety}(P) \equiv \forall S$ reachable from $S_0$: $P(S) = \text{false}$ (here $P$ describes an undesirable condition, such as deadlock)
  - $\mathrm{liveness}(P) \equiv \forall L \in$ linearizations from $S_0$, $\exists S_L$: $L$ passes through $S_L$ and $P(S_L) = \text{true}$ (here $P$ describes a desirable condition, such as termination)
- We need a way to record global states

The “Snapshot” Algorithm
- Records a set of process and channel states such that the combination is a consistent global state (GS)
- Assumptions:
  - No failures; all messages arrive intact, exactly once
  - Communication channels are unidirectional
  - Message delivery on each channel is FIFO-ordered
  - There is a communication path between any two processes
  - Any process may initiate the snapshot (by sending a Marker)
  - The snapshot does not interfere with normal execution
  - Each process records its own state and the state of its incoming channels (no central collection)

The “Snapshot” Algorithm (2)

Marker sending rule for process $P_i$:
- record $P_i$'s state
- for each outgoing channel C, send a marker on C

Marker receiving rule for a process $P_k$, on receipt of a marker over channel C':
- if $P_k$ has not yet recorded its state
  - record $P_k$'s state
  - record the state of C' as “empty”
  - turn on recording of messages over the other incoming channels
  - for each outgoing channel C, send a marker on C
- else
  - record the state of C' as all the messages received over C' since $P_k$ saved its state
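The two marker rules can be tried out in a small single-threaded simulation. Everything below is an illustrative sketch, not code from the slides: the `Process` class, the integer "balance" used as application state, the fully connected three-process topology and the event order in `run_example` are all assumptions. Channels are modeled as FIFO queues keyed by (source, destination).

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, pid, peers, channels, state):
        self.pid = pid
        self.peers = peers              # ids of all other processes (assumed fully connected)
        self.channels = channels        # shared dict: (src, dst) -> FIFO deque
        self.state = state              # application state (assumed: an integer balance)
        self.recorded_state = None      # local state saved by the snapshot
        self.channel_state = {}         # incoming channel -> messages recorded for it
        self.recording = set()          # incoming channels still being recorded

    def send(self, dst, amount):
        self.state -= amount
        self.channels[(self.pid, dst)].append(amount)

    def record_and_send_markers(self):
        # Marker sending rule: record own state, then send a marker on
        # every outgoing channel. Recording starts on all incoming channels.
        self.recorded_state = self.state
        for q in self.peers:
            self.channel_state[(q, self.pid)] = []
            self.recording.add((q, self.pid))
            self.channels[(self.pid, q)].append(MARKER)

    def receive(self, src):
        chan = (src, self.pid)
        msg = self.channels[chan].popleft()
        if msg == MARKER:
            # Marker receiving rule.
            if self.recorded_state is None:
                self.record_and_send_markers()
            # The state of this channel is now final: empty on the first
            # marker, otherwise the messages recorded since the state was saved.
            self.recording.discard(chan)
        else:
            self.state += msg
            if chan in self.recording:
                self.channel_state[chan].append(msg)


def snapshot_total(procs):
    """Sum of all recorded process states and recorded in-transit messages."""
    in_processes = sum(p.recorded_state for p in procs.values())
    in_channels = sum(sum(m) for p in procs.values() for m in p.channel_state.values())
    return in_processes + in_channels


def run_example():
    pids = [1, 2, 3]
    channels = {(i, j): deque() for i in pids for j in pids if i != j}
    procs = {i: Process(i, [j for j in pids if j != i], channels, 100) for i in pids}
    procs[2].send(1, 10)                  # in transit on C21 when the snapshot starts
    procs[1].record_and_send_markers()    # P1 initiates the snapshot
    procs[2].receive(1)                   # marker over C12: P2 records its state
    procs[3].send(2, 5)                   # sent before P3 has seen any marker
    procs[1].receive(2)                   # the 10 units, recorded as state of C21
    procs[1].receive(2)                   # marker over C21
    procs[3].receive(1)                   # marker over C13: P3 records its state
    procs[3].receive(2)                   # marker over C23
    procs[2].receive(3)                   # the 5 units, recorded as state of C32
    procs[2].receive(3)                   # marker over C32
    procs[1].receive(3)                   # marker over C31
    return procs
```

In this run the recorded process states add up to 285 and the recorded channel states hold the remaining 15, so `snapshot_total(run_example())` equals the 300 units the system started with, even though no process ever held that view at a single instant.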

Snapshot Example

[Figure: space-time diagram of P1, P2 and P3 showing application messages a and b, the Markers (M) sent by each process, and the events on each process line.]

1. P1 initiates the snapshot: records its state (S1); sends Markers to P2 and P3; turns on recording for channels C21 and C31
2. P2 receives a Marker over C12: records its state (S2); sets state(C12) = {}; sends Markers to P1 and P3; turns on recording for channel C32
3. P1 receives a Marker over C21: sets state(C21) = {a}
4. P3 receives a Marker over C13: records its state (S3); sets state(C13) = {}; sends Markers to P1 and P2; turns on recording for channel C23
5. P2 receives a Marker over C32: sets state(C32) = {b}
6. P3 receives a Marker over C23: sets state(C23) = {}
7. P1 receives a Marker over C31: sets state(C31) = {}