Synchronization in distributed systems
A distributed system is a collection of independent computers that appears to its users as a single coherent system.

Physical clocks
Each computer has its own clock, but these clocks are not perfect. The Network Time Protocol (NTP) is used to synchronize clocks: top-level time servers are connected to reference clocks (e.g. atomic clocks), second-level time servers connect to several of them, and an algorithm corrects for network delay, typically achieving about 10 ms accuracy over the Internet. Examples of time servers: ntp1.science.ru.nl, time.windows.com.
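To make the delay correction concrete, here is a minimal sketch of the standard four-timestamp offset/delay estimate used by NTP-style synchronization; the function name and the worked numbers are assumptions for illustration, not taken from the slides.

```python
# Sketch of the four-timestamp offset/delay estimate used by NTP-style
# clock synchronization (illustrative only; not the actual ntpd implementation).
def estimate_offset_and_delay(t1, t2, t3, t4):
    """
    t1: client sends request      (client clock)
    t2: server receives request   (server clock)
    t3: server sends reply        (server clock)
    t4: client receives reply     (client clock)
    """
    # Round-trip network delay, excluding the server's processing time.
    delay = (t4 - t1) - (t3 - t2)
    # Estimated offset of the client clock relative to the server,
    # assuming the delay is split evenly between the two directions.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay

# Hypothetical example: the client is about 5 time units behind the server.
print(estimate_offset_and_delay(t1=10, t2=16, t3=17, t4=13))  # (5.0, 2)
```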

Distributed global states
Suppose a customer has a bank account distributed over two branches, and one wishes to calculate the total amount at exactly 3:00. This can go wrong in various ways: an amount may be in transit, or the clocks may not be perfectly synchronized. The problem can be solved by also examining the messages in transit at the time of the observation: the state of a branch then consists of both the current balance and the messages that have been sent or received.

Some definitions
A channel exists between two processes if they exchange messages; for convenience, each channel is viewed as one-way.
The state of a process includes the sequence of messages it has sent or received and its internal conditions.
A snapshot records the state of a process, including all messages sent and received since the last snapshot.
A distributed snapshot is a collection of snapshots, one for each process.
A global state is the combined state of all processes. A true global state cannot be determined because of the time lapse associated with message transfers and the difficulty of synchronizing clocks; one can attempt to approximate a global state by collecting snapshots from all processes.

Distributed snapshot algorithm
Chandy and Lamport (1985) gave an algorithm to record a consistent global state. They assume that messages are delivered in the order they are sent and that no messages are lost. The method uses a special control message, called a marker. Some process initiates the algorithm by recording its state and sending a marker on all outgoing channels. After the algorithm terminates, the snapshots present at the processes together record a consistent global state. The algorithm can be used to adapt any centralized algorithm to a distributed environment, because the basis of any centralized algorithm is knowledge of the global state.

The algorithm
Marker sending rule for process i:
Process i records its state. For each outgoing channel C on which a marker has not been sent, i sends a marker along C before i sends further messages along C.
Marker receiving rule for process j, on receiving a marker along channel C:
If j has not yet recorded its state, then j records the state of C as the empty set and follows the marker sending rule.
Otherwise, j records the state of C as the set of messages received along C after j's state was recorded and before j received the marker along C.
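As a rough illustration of these rules, the sketch below follows them for a single process, assuming FIFO, lossless channels. The Process wrapper, its send callback, and record_local_state are assumed helpers for the example, not part of the original algorithm description.

```python
# Hedged sketch of the Chandy-Lamport marker rules for one process.
# Regular application handling of messages is omitted.
MARKER = "MARKER"

class Process:
    def __init__(self, pid, out_channels, in_channels, send):
        self.pid = pid
        self.out_channels = out_channels      # ids of outgoing channels
        self.in_channels = in_channels        # ids of incoming channels
        self.send = send                      # send(channel_id, message): assumed transport
        self.recorded_state = None            # recorded local state, None until recorded
        self.channel_state = {}               # channel id -> messages recorded for it
        self.recording = set()                # incoming channels still being recorded

    def record_local_state(self):
        return {}                             # application-specific: balance, variables, etc.

    def start_snapshot(self):
        # Marker sending rule: record the local state, then send a marker on
        # every outgoing channel before any further message on that channel.
        self.recorded_state = self.record_local_state()
        self.channel_state = {c: [] for c in self.in_channels}
        self.recording = set(self.in_channels)
        for c in self.out_channels:
            self.send(c, MARKER)

    def on_message(self, channel, msg):
        if msg == MARKER:
            if self.recorded_state is None:
                # First marker seen: this channel's state is the empty set,
                # then follow the marker sending rule.
                self.start_snapshot()
            # Either way, the marker closes the recording of this channel.
            self.recording.discard(channel)
        elif self.recorded_state is not None and channel in self.recording:
            # A message received after the local state was recorded but before
            # this channel's marker belongs to the recorded channel state.
            self.channel_state[channel].append(msg)
```

A process's part of the snapshot is complete once a marker has been received on every incoming channel, i.e. when the recording set becomes empty.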

Distributed mutual exclusion
The model used for examining approaches to mutual exclusion in a distributed context: a number of systems (nodes) are assumed to be interconnected by some network. In each node, one process is responsible for resource allocation; it controls a number of resources and serves a number of local processes. Algorithms for mutual exclusion may be centralized or distributed.

Centralized
In a centralized algorithm one node is designated as the control node; it controls access to all resources shared over the network. When a process wants access to a critical resource, it issues a request to its local resource-controlling process. This process sends the request to the resource-controlling process on the control node, which returns a permission message when the shared resource becomes available. When a process has finished with a resource, it sends a release message to the control node. Such a centralized algorithm has two key properties: only (a process in) the control node makes resource allocation decisions, and all necessary information is concentrated in the control node, including the identity and location of all resources and the allocation status of each resource. It is easy to see how mutual exclusion is enforced. There are, however, drawbacks: if the control node fails, the mutual exclusion mechanism breaks down, at least temporarily; and every resource allocation and de-allocation requires an exchange of messages with the control node and execution time on it, so the control node may become a bottleneck.
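As an illustration of the message pattern this implies, here is a minimal sketch of the control node's side, assuming grant is a helper that sends the permission message; queueing waiting requests in FIFO order is one possible policy, not something the slide prescribes.

```python
# Minimal sketch of a centralized coordinator for mutual exclusion.
from collections import deque

class Coordinator:
    def __init__(self, grant):
        self.grant = grant            # grant(node_id): assumed helper that sends permission
        self.holder = None            # node currently holding the resource
        self.waiting = deque()        # requests queued while the resource is busy

    def on_request(self, node_id):
        if self.holder is None:
            self.holder = node_id
            self.grant(node_id)       # resource free: grant permission immediately
        else:
            self.waiting.append(node_id)  # otherwise queue the request

    def on_release(self, node_id):
        assert node_id == self.holder
        if self.waiting:
            self.holder = self.waiting.popleft()
            self.grant(self.holder)   # pass the resource to the next waiter
        else:
            self.holder = None
```

Each access then costs three messages through the control node: request, permission, and release.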

Distributed
A fully distributed algorithm is characterized by the following properties:
1. All nodes have an equal amount of information, on average.
2. Each node has only a partial picture of the total system and must make decisions based on it.
3. All nodes bear equal responsibility for the final decision.
4. All nodes expend equal effort, on average, in effecting a final decision.
5. Failure of a node, in general, does not result in a total system collapse.
6. There exists no system-wide common clock with which to regulate the timing of events.
On point 2: some distributed algorithms require that all information known to any node be communicated to all other nodes. Even in this case, some of that information is in transit and will not have arrived at all of the other nodes. Thus a node's information is usually not completely up to date, hence partial.
On point 6: because of the communication delay, it is impossible to maintain a system-wide clock that is instantly available to all systems. It is also technically impractical to maintain one central clock and to keep all local clocks synchronized.

Ordering of events
We would like to be able to say that event a at system i occurred before (or after) event b at system j, and to arrive consistently at this conclusion at all nodes. Lamport (1978) proposed a method, called timestamping, which orders events in a distributed system without using physical clocks. This technique is so efficient and effective that it is used in many algorithms for mutual exclusion and deadlock prevention. Ultimately, we are concerned with actions that occur at a local system, such as a process entering or leaving its critical section. However, in a distributed system processes interact by means of messages, so it makes sense to associate events with messages; note that a local event can be bound to a message very simply. To avoid ambiguity, we associate events with the sending of messages only, not with the receipt of messages.

Timestamping
Each node i in the network maintains a local counter Ci, which functions as a clock. Each time node i transmits a message, it first increments its clock by 1; the message is sent in the form (contents, Ti = Ci, i). When a message is received, the receiving node j sets its clock to Cj := 1 + max(Cj, Ti). At each node the ordering is then: message x from i precedes message y from j if Ti < Tj, or Ti = Tj and i < j. Each message is sent from one process to all other processes; if some messages are not sent this way, it is impossible for all sites to arrive at the same ordering of messages, and only a collection of partial orderings exists.
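A minimal sketch of these rules follows, using the message format (contents, Ti, i) from the slide; the class wrapper and helper names are illustrative assumptions.

```python
# Sketch of Lamport's timestamping rules.
class LamportClock:
    def __init__(self, node_id):
        self.node_id = node_id
        self.counter = 0              # local counter Ci

    def send(self, contents):
        self.counter += 1             # increment before transmitting
        return (contents, self.counter, self.node_id)

    def receive(self, message):
        _, ti, _ = message
        self.counter = 1 + max(self.counter, ti)   # Cj := 1 + max(Cj, Ti)

def precedes(msg_x, msg_y):
    # Total order: compare timestamps, break ties by node identity.
    _, ti, i = msg_x
    _, tj, j = msg_y
    return ti < tj or (ti == tj and i < j)

# Hypothetical usage: P1 sends to P2, then P2 transmits its own message.
p1, p2 = LamportClock(1), LamportClock(2)
a = p1.send("a")          # ("a", 1, 1)
p2.receive(a)             # C2 becomes 1 + max(0, 1) = 2
x = p2.send("x")          # ("x", 3, 2)
print(precedes(a, x))     # True: a is ordered before x at every node
```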

Examples
In the first example, P1 begins with its clock at 0. It increments its clock by 1 and transmits (a, 1, 1); the first 1 is the timestamp and the second is the identity of P1. On receipt of the message, P2 and P3 set their clocks to 1 + max(1, 0) = 2. P2 then increments its clock by 1 and transmits (x, 3, 2), and so on. At the end, each process agrees on the order {a, x, b, j}. Note that b might have been sent after j in physical time. The second example shows that the algorithm works in spite of differences in transmission time between pairs of systems: the message from P1 arrives earlier than that of P4 at site 2 but later at site 3. Nevertheless, after all messages have been received, the ordering is the same at all sites, namely {a, q}.

Distributed queue
One of the earliest proposed approaches to distributed mutual exclusion is based on the concept of a distributed queue (Lamport, 1978). It uses the previously described model and assumes a fully connected network: every process can send a message directly to every other process. For simplicity, we describe the case in which each site controls only a single resource. At each site, a data structure is maintained that records the most recent message received from each site and the most recent message sent at this site. Lamport refers to this structure as a queue, although it is actually an array with one entry for each site. At any instant, entry qi[j] in the local array contains a message from Pj. The array is initialized as qi[j] = (Release, 0, j) for j = 1, ..., N. Three types of messages are used in this algorithm:
(Request, Ti, i): Pi requests access to the resource
(Reply, Tj, j): Pj replies to a request message
(Release, Tk, k): Pk releases a resource previously allocated to it

… the algorithm
1. When Pi wants access to a resource, it sends (Request, Ti, i) to all other processes and puts it in its own qi[i].
2. When Pj receives (Request, Ti, i), it puts it in qj[i] and sends (Reply, Tj, j) to Pi.
3. Pi can access a resource (enter its critical section) when both of the following conditions hold: qi[i] is the earliest Request message in qi, and all other messages in qi are later than qi[i].
4. Pi releases a resource by sending (Release, Ti, i) to all other processes and putting it in its own qi[i].
5. When Pj receives (Release, Ti, i) or (Reply, Ti, i), it puts it in qj[i].
This algorithm enforces mutual exclusion, is fair, and avoids deadlock and starvation. Note that 3(N-1) messages (assuming an error-free network) are required for each mutual exclusion access; if broadcast is used this reduces to N+1 messages. Ricart and Agrawala (1981) optimized the Lamport method by eliminating Release messages (2(N-1) messages, with broadcast N). Token-passing algorithms reduce the number of messages further.
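A hedged sketch of one process's side of this algorithm is given below; the send and broadcast callbacks and the method names are assumptions for illustration, and the caller is expected to check may_enter after each state change to decide when to enter its critical section.

```python
# Sketch of Lamport's distributed-queue mutual exclusion for process Pi.
# send(dest, msg) and broadcast(msg) are assumed transport helpers.
class LamportMutex:
    def __init__(self, pid, all_pids, send, broadcast):
        self.pid = pid
        self.counter = 0                      # local Lamport clock Ci
        self.send = send                      # send(dest, message)
        self.broadcast = broadcast            # broadcast(message) to all other processes
        # q[j] holds the most recent message seen from process Pj.
        self.q = {j: ("Release", 0, j) for j in all_pids}

    def _stamp(self, kind):
        self.counter += 1                     # increment before transmitting
        return (kind, self.counter, self.pid)

    def request_resource(self):               # step 1
        msg = self._stamp("Request")
        self.q[self.pid] = msg
        self.broadcast(msg)

    def release_resource(self):               # step 4
        msg = self._stamp("Release")
        self.q[self.pid] = msg
        self.broadcast(msg)

    def on_message(self, msg):                # steps 2 and 5
        kind, ts, sender = msg
        self.counter = 1 + max(self.counter, ts)
        self.q[sender] = msg
        if kind == "Request":
            self.send(sender, self._stamp("Reply"))

    def may_enter(self):                      # step 3
        # Own entry must be a Request that precedes every other message
        # in q under the (timestamp, process id) total order.
        kind, ts, _ = self.q[self.pid]
        if kind != "Request":
            return False
        return all((ts, self.pid) < (t, j)
                   for p, (_, t, j) in self.q.items() if p != self.pid)
```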