Distributed Computing


Distributed Computing Synchronization Dr. Yingwu Zhu

Topics to Discuss
- Physical vs. Logical Clocks
- Lamport Clocks
- Vector Clocks
- Mutual Exclusion Algorithms
- Election Algorithms

Synchronization: What Is It For?
- Temporal ordering of events produced by concurrent processes
- Synchronization between senders and receivers of messages: is msg m1 from process P to Q sent before or after msg m2 from process Q?
- Coordination of joint activity
- Serialization of concurrent access to shared objects (e.g., access to a shared printer)

An Ideal World
If all machines' clocks were perfectly synchronized, synchronization would be really easy!

Clock Synchronization Example
In a centralized system there is no problem: the classic example is the Unix make program, which compares timestamps of source and object files to decide what to recompile. In a distributed system, when each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time, so a source file edited after its object file was built can appear older and never be recompiled.

Logical vs. Physical Clocks
- A logical clock keeps track of event ordering among related (causal) events; it does not care about the real time at which events occurred.
- A physical clock keeps the time of day and must be consistent across systems.

Physical Clock (Timer) in Computers
- Real-time clock: a CMOS clock (counter) circuit driven by a quartz oscillator, with battery backup to continue measuring time when power is off.
- The OS generally programs a timer circuit, e.g., a Programmable Interval Timer (PIT) such as the Intel 8253/8254, to generate an interrupt periodically: e.g., 60, 100, 250, or 1000 interrupts per second (Linux 2.6+ adjustable up to 1000 Hz).
- The interrupt service procedure adds 1 to a counter in memory.

Physical Clock Problems
Getting two systems to agree on time is hard: two clocks hardly ever agree, because quartz oscillators oscillate at slightly different frequencies.
- Clock drift: clocks tick at different rates, creating an ever-widening gap in perceived time.
- Clock skew: the difference between two clocks at one point in time.

Clock Drift Frequencies of perfect, slow and fast clocks

Dealing with Drift
Assume we set the computer to the true time. It is not a good idea to set a clock backwards: the illusion of time moving backwards can confuse message ordering and software development environments.

Dealing with Drift
Go for gradual clock correction:
- If the clock is fast: make it run slower until it synchronizes.
- If the clock is slow: make it run faster until it synchronizes.
This can be implemented with a linear compensation function, as sketched below.
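
A minimal sketch of such a linear compensation function, assuming periodic hardware timer ticks and an externally measured offset (the class and method names are illustrative, not from the slides):

```python
class CompensatedClock:
    """Gradual drift correction: never step the clock backwards, just
    stretch or shrink each tick until the measured error is worked off."""

    def __init__(self, tick_seconds=0.01):
        self.time = 0.0            # software clock value (seconds)
        self.tick = tick_seconds   # nominal length of one hardware tick
        self.error = 0.0           # remaining correction; > 0 means we are fast

    def set_error(self, offset):
        """offset > 0: our clock is ahead of the reference; offset < 0: behind."""
        self.error += offset

    def on_tick(self):
        # Work off at most 10% of a tick per interrupt, so the clock stays
        # monotonic and applications see time change smoothly.
        step = max(-0.1 * self.tick, min(0.1 * self.tick, self.error))
        self.time += self.tick - step
        self.error -= step
        return self.time
```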

Compensating for a fast clock

Getting Accurate Time
- Attach a GPS receiver to each computer: within ±1 msec of UTC (Coordinated Universal Time).
- Attach a WWV radio receiver: obtains time broadcasts from Boulder or Washington, DC; within ±3 msec of UTC (depending on distance).
- Attach a GOES receiver: within ±0.1 msec of UTC.
Not a practical solution for every machine: cost, size, convenience, environment.

Practical Clock Synchronization NTP (Network Time Protocol) Berkeley algorithm

Clock Synchronization: Network Time Protocol (NTP)
Synchronize from another machine, one with a more accurate clock. A machine/service that provides time information is a time server (e.g., one with a WWV receiver).

Clock Synchronization: NTP
Assumption: the A-to-B and B-to-A latencies are the same, and a good estimate. With T1 = A sends the request, T2 = B receives it, T3 = B sends the reply, T4 = A receives the reply:
- Offset of A relative to B: theta = ((T2 - T1) + (T3 - T4)) / 2
- Delay estimate: delta = ((T2 - T1) + (T4 - T3)) / 2; NTP keeps the sample with the minimum delta.
Adjust gradually: e.g., to slow down, add a smaller increment of time on each interrupt.
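
A small sketch of this offset/delay computation (the helper name and sample values are illustrative only):

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate A's clock offset relative to B and the message delay.

    t1: A sends the request    (A's clock)
    t2: B receives the request (B's clock)
    t3: B sends the reply      (B's clock)
    t4: A receives the reply   (A's clock)

    Assumes the A->B and B->A latencies are roughly equal.
    """
    theta = ((t2 - t1) + (t3 - t4)) / 2.0   # offset of A relative to B
    delta = ((t2 - t1) + (t4 - t3)) / 2.0   # estimated one-way delay
    return theta, delta

# NTP performs several such exchanges and keeps the (theta, delta) pair
# with the smallest delta, since that sample is least distorted by queuing.
samples = [(0.000, 10.005, 10.006, 0.012),
           (1.000, 11.004, 11.005, 1.009)]
best = min((ntp_offset_and_delay(*s) for s in samples), key=lambda p: p[1])
print("offset =", best[0], "delay =", best[1])
```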

Clock Synchronization: The Berkeley Algorithm
The time server is active, polling every machine periodically for its time. Based on the responses, it computes an average time and tells every machine how to adjust its clock.
Usage scenario: no machine has a WWV receiver; all machines agree on the same time, but not necessarily the real time.

The Berkeley Algorithm (1) The time daemon asks all the other machines for their clock values.

The Berkeley Algorithm (2) The machines answer.

The Berkeley Algorithm (3) The time daemon tells everyone how to adjust their clock.
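
A toy sketch of one Berkeley round, following the three steps above; it ignores network delay and uses made-up clock values (roughly the classic 3:00 / 3:25 / 2:50 example):

```python
def berkeley_round(daemon_time, reported_times):
    """One polling round of the Berkeley algorithm (sketch).

    daemon_time:    the time daemon's own clock value
    reported_times: {machine_name: clock_value} answers from the others

    Returns the adjustment each participant should apply to its clock.
    The daemon sends relative adjustments rather than absolute times,
    so message transit time matters less.
    """
    everyone = {"daemon": daemon_time, **reported_times}
    average = sum(everyone.values()) / len(everyone)
    return {name: average - t for name, t in everyone.items()}

# Daemon at 3:00, others at 3:25 and 2:50 (in minutes past 2:00: 60, 85, 50).
print(berkeley_round(60.0, {"m1": 85.0, "m2": 50.0}))
# -> daemon +5, m1 -20, m2 +15
```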

Logical Clocks
In a classic paper (1978), Lamport showed that although clock synchronization is possible, it need not be absolute. If two processes do not interact, it is not necessary that their clocks be synchronized. What matters is that the processes agree on the order in which events occur.

Logical Clocks
Assign sequence numbers to messages so that all cooperating processes can agree on the order of events (vs. physical clocks, which track time of day). Assume NO central time source: each system maintains its own local clock, there is no total ordering of events, and no concept of "happened when".

Lamport’s Logical Clocks (1)
The "happens-before" relation → can be observed directly in two situations:
- If a and b are events in the same process, and a occurs before b, then a → b is true.
- If a is the event of a message being sent by one process, and b is the event of that message being received by another process, then a → b is true.
Happens-before is transitive: if a → b and b → c, then a → c.

Logical Clocks & Concurrency
Assign a "clock" value to each event such that if a → b then clock(a) < clock(b), since time cannot run backwards. If a and b occur in different processes that do not exchange messages, then neither a → b nor b → a is true: these events are concurrent.

Lamport’s Logical Clocks (2)
(a) Three processes, each with its own clock. The clocks run at different rates. Lamport clocks: counters or sequence numbers.

Lamport’s Logical Clocks (3)
(b) Lamport’s algorithm corrects the clocks: when a message arrives with a timestamp ahead of the receiver's clock, the receiver fast-forwards its clock to the message timestamp plus 1.

Lamport’s Logical Clocks (4)
Figure 6-10. The positioning of Lamport’s logical clocks in distributed systems.

Lamport’s Logical Clocks (5)
Updating the local counter Ci for process Pi (each process maintains a local counter):
1. Before executing an event, Pi executes Ci ← Ci + 1.
2. When process Pi sends a message m to Pj, it sets m's timestamp ts(m) equal to Ci after having executed the previous step.
3. Upon receipt of a message m, process Pj adjusts its own local counter as Cj ← max{Cj, ts(m)}, after which it executes the first step and delivers the message to the application.
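
A minimal sketch of these three rules as a per-process counter (class and method names are illustrative):

```python
class LamportClock:
    """Lamport logical clock for one process (sketch of the three rules)."""

    def __init__(self):
        self.c = 0

    def local_event(self):
        # Rule 1: increment the counter before executing any event.
        self.c += 1
        return self.c

    def send(self):
        # Rule 2: increment, then stamp the outgoing message with the new value.
        self.c += 1
        return self.c                 # ts(m)

    def receive(self, ts_m):
        # Rule 3: take the maximum of the local counter and ts(m),
        # then increment before delivering the message to the application.
        self.c = max(self.c, ts_m)
        self.c += 1
        return self.c
```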

Lamport’s Algorithm
Each message carries a timestamp of the sender's clock. When a message arrives: if the receiver's clock < the message timestamp, set the system clock to (message timestamp + 1); else do nothing. The clock must be advanced between any two events in the same process.

Lamport’s Algorithm
The algorithm allows us to maintain a time ordering among related events: a partial ordering.

Summary
The algorithm needs a monotonically increasing software counter, incremented at least when events that need to be timestamped occur. Each event has a Lamport timestamp attached to it. For any two events where a → b: C(a) < C(b).

Example: Totally Ordered Multicasting
Figure: updating a replicated database and leaving it in an inconsistent state.
Totally ordered multicasting: all messages are delivered in the same order to every receiver. It can be implemented with Lamport's logical clocks: multicast every message and every acknowledgement to all processes, keep a message queue ordered by timestamp, and deliver a message to the application only when it is at the head of the queue and acknowledgements from all nodes have been received.
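
A compact sketch of that queue-and-ack scheme, assuming reliable FIFO channels and that every process multicasts an ack for every message (including its own); the class and method names are illustrative and the actual sending of messages is left as comments:

```python
import heapq

class TotalOrderMulticast:
    """Totally ordered multicast via Lamport timestamps (sketch)."""

    def __init__(self, pid, all_pids):
        self.pid = pid
        self.all = set(all_pids)
        self.clock = 0
        self.queue = []          # heap of (timestamp, sender, payload)
        self.acks = {}           # (timestamp, sender) -> set of ackers

    def _tick(self, ts=0):
        self.clock = max(self.clock, ts) + 1
        return self.clock

    def on_multicast_received(self, ts, sender, payload):
        self._tick(ts)
        heapq.heappush(self.queue, (ts, sender, payload))
        self.acks.setdefault((ts, sender), set())
        # A real implementation would now multicast an ack for (ts, sender).

    def on_ack_received(self, ts, sender, acker):
        self._tick()
        self.acks.setdefault((ts, sender), set()).add(acker)
        return self._try_deliver()

    def _try_deliver(self):
        delivered = []
        # Deliver the head of the queue only once everyone has acknowledged it;
        # timestamps (with sender IDs as tie-breakers) give the same order everywhere.
        while self.queue and self.acks.get(self.queue[0][:2], set()) >= self.all:
            ts, sender, payload = heapq.heappop(self.queue)
            delivered.append(payload)
        return delivered
```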

Problems with Lamport Clocks
- Identical timestamps: the two events could be concurrent.
- Cannot detect causal relations: if C(e) < C(e'), we cannot conclude that e → e'. Looking only at Lamport timestamps, we cannot tell which events are causally related.
Solution: use a vector clock.

Vector Clocks (1)
Figure: concurrent message transmission using logical clocks. T_rcv(m1) < T_snd(m2), but m1 and m2 are concurrent: Lamport clocks do not capture causality!

Vector Clocks (2)
Vector clocks are constructed by letting each process Pi maintain a vector VCi with the following two properties:
1. VCi[i] is the number of events that have occurred so far at Pi; in other words, VCi[i] is the local logical clock at process Pi.
2. If VCi[j] = k, then Pi knows that k events have occurred at Pj; it is thus Pi's knowledge of the local time at Pj.
If VC(a) < VC(b), then event a → event b.

Vector Clocks (3)
Steps carried out to accomplish property 2 of the previous slide:
1. Before executing an event, Pi executes VCi[i] ← VCi[i] + 1.
2. When process Pi sends a message m to Pj, it sets m's (vector) timestamp ts(m) equal to VCi after having executed the previous step.
3. Upon receipt of a message m, process Pj adjusts its own vector by setting VCj[k] ← max{VCj[k], ts(m)[k]} for each k, after which it executes the first step and delivers the message to the application.
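
A minimal sketch of these update rules (the happened_before helper is an extra illustration, not part of the slide):

```python
class VectorClock:
    """Vector clock for process i out of n processes (sketch)."""

    def __init__(self, i, n):
        self.i = i
        self.vc = [0] * n

    def local_event(self):
        # Step 1: increment our own component before executing an event.
        self.vc[self.i] += 1

    def send(self):
        # Step 2: increment, then attach a copy of the vector as ts(m).
        self.local_event()
        return list(self.vc)

    def receive(self, ts_m):
        # Step 3: take the component-wise maximum, then increment our
        # own component and deliver the message to the application.
        self.vc = [max(a, b) for a, b in zip(self.vc, ts_m)]
        self.local_event()

    @staticmethod
    def happened_before(ts_a, ts_b):
        # VC(a) < VC(b): every component <= and the vectors are not equal.
        return all(a <= b for a, b in zip(ts_a, ts_b)) and ts_a != ts_b
```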

Enforcing Causal Communication
Figure 6-13. Enforcing causal communication.
Causally ordered multicasting is weaker than totally ordered multicasting: if two messages are not causally related to each other, we do not care in which order they are delivered to applications. Assume clocks are adjusted only when sending and receiving messages: on sending, the sender increments its own entry in its VC by 1; on receiving, the receiver only adjusts each component of its VC to the maximum. A message m from process Pi is delivered to the application at process Pj only if the following two conditions are met:
1. ts(m)[i] = VCj[i] + 1
2. ts(m)[k] <= VCj[k] for all k != i
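
A small sketch of this delivery test (the function name and example vectors are made up for illustration):

```python
def can_deliver(ts_m, sender, vc_receiver):
    """Causally ordered delivery test (sketch).

    A message m from process `sender` may be delivered to the application
    at the receiver (whose vector clock is vc_receiver) only if:
      1) ts_m[sender] == vc_receiver[sender] + 1   (m is the next message
         expected from that sender), and
      2) ts_m[k] <= vc_receiver[k] for all k != sender  (the receiver has
         already seen everything m causally depends on).
    """
    if ts_m[sender] != vc_receiver[sender] + 1:
        return False
    return all(ts_m[k] <= vc_receiver[k]
               for k in range(len(ts_m)) if k != sender)

# Example: receiver VC = [1, 0, 0]; a message from P2 with ts = [1, 0, 1]
# can be delivered, but one with ts = [2, 0, 1] must wait for P0's next message.
print(can_deliver([1, 0, 1], 2, [1, 0, 0]))   # True
print(can_deliver([2, 0, 1], 2, [1, 0, 0]))   # False
```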

Mutual Exclusion: A Centralized Algorithm
(a) Process 1 asks the coordinator for permission to access a shared resource; permission is granted. (b) Process 2 then asks permission to access the same resource; the coordinator does not reply. (c) When process 1 releases the resource, it tells the coordinator, which then replies to process 2.

Mutual Exclusion: A Centralized Algorithm
Simple: 3 messages per use of the resource (request, grant, release).
Downsides: single point of failure; performance bottleneck.
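
A toy sketch of the coordinator's side of this protocol (class and method names are illustrative; messages are modeled as method calls and return values):

```python
from collections import deque

class Coordinator:
    """Centralized mutual-exclusion coordinator (sketch)."""

    def __init__(self):
        self.holder = None       # process currently holding the resource
        self.waiting = deque()   # queued requests, served in FIFO order

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "GRANT"       # reply immediately
        self.waiting.append(pid) # no reply yet: the requester blocks
        return None

    def release(self, pid):
        assert pid == self.holder
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder       # next process to be sent a GRANT, if any

coord = Coordinator()
print(coord.request(1))   # GRANT
print(coord.request(2))   # None: queued, no reply
print(coord.release(1))   # 2: the coordinator now grants the resource to process 2
```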

Mutual Exclusion: A Distributed Algorithm
The requester broadcasts a message containing the requested resource, its process ID, and the current logical time. There are three cases at each receiver:
1. If the receiver is not accessing the resource and does not want to access it, it sends back an OK message to the sender.
2. If the receiver already has access to the resource, it simply does not reply; instead, it queues the request.
3. If the receiver also wants to access the resource but has not yet done so, it compares the timestamp of the incoming message with the one in the message it has sent to everyone; the lowest one wins. (Lamport's logical clocks, with process IDs to break ties, provide the timestamps.)
A sketch of the receiver's side follows.
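
A minimal sketch of the receiver-side logic described above (the class, state names, and message tuples are illustrative; ties are broken by comparing (timestamp, pid) pairs):

```python
class DistributedMutex:
    """Receiver side of the distributed mutual-exclusion algorithm (sketch)."""

    RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"

    def __init__(self, pid):
        self.pid = pid
        self.state = self.RELEASED
        self.my_request = None     # (timestamp, pid) of our own pending request
        self.deferred = []         # senders we will answer when we release

    def on_request(self, ts, sender):
        if self.state == self.HELD:
            self.deferred.append(sender)           # case 2: queue, no reply
            return None
        if self.state == self.WANTED and self.my_request < (ts, sender):
            self.deferred.append(sender)           # case 3: our request wins
            return None
        return ("OK", sender)                      # case 1, or we lost the tie

    def on_release(self):
        self.state = self.RELEASED
        replies, self.deferred = self.deferred, []
        return [("OK", p) for p in replies]        # answer everyone we deferred
```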

Mutual Exclusion: A Distributed Algorithm
Two processes want to access a shared resource at the same moment. Process 0 has the lowest timestamp, so it wins. When process 0 is done, it also sends an OK, so process 2 can now go ahead.

Mutual Exclusion: A Distributed Algorithm
- Message complexity: 2(n-1) messages per entry.
- Magnifies the single-point-of-failure problem of the centralized algorithm: now there are n points of failure.
- Group membership must be known.
- Bottleneck: every machine handles the same load, but machines may be heterogeneous.

Mutual Exclusion: A Token Ring Algorithm
(a) An unordered group of processes on a network. (b) A logical ring constructed in software.
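
A toy simulation of the token circulating around the logical ring: whoever holds the token may enter the critical section, then passes it on (all names and values are illustrative):

```python
def circulate_token(ring, rounds=1):
    """Simulate a token travelling around a logical ring of processes (sketch).

    `ring` is a list of dicts in ring order; a process may enter its critical
    section only while it holds the token, then it passes the token along.
    """
    for _ in range(rounds):
        for proc in ring:                       # the token visits ring order
            if proc["wants_resource"]:
                print(f"process {proc['pid']} enters the critical section")
                proc["wants_resource"] = False
                print(f"process {proc['pid']} passes the token on")
            # otherwise the token is simply forwarded to the next process

ring = [{"pid": 0, "wants_resource": False},
        {"pid": 1, "wants_resource": True},
        {"pid": 2, "wants_resource": False}]
circulate_token(ring)
```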

Mutual Exclusion: A Token Ring Algorithm
Problems: lost tokens (how do we detect them?) and process failures (how do we detect them?).

Election Algorithms
Many distributed systems require one process to act as coordinator or initiator, or to perform some other special role; we elect one process to fill that role. In general, election algorithms attempt to locate the process with the highest process number and make it the coordinator. Traditional algorithms assume that message passing is reliable and that the network topology does not change.

Election Algorithms: The Bully Algorithm
P sends an ELECTION message to all processes with higher numbers. If no one responds, P wins the election and becomes coordinator. If one of the higher-numbered processes answers, it takes over and P's job is done.

The Bully Algorithm (1) (a) Process 4 holds an election. (b) Processes 5 and 6 respond, telling 4 to stop. (c) Now 5 and 6 each hold an election.

The Bully Algorithm (2) (d) Process 6 tells 5 to stop. (e) Process 6 wins and tells everyone.
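
A toy simulation of the election just illustrated, under the simplifying assumption that every live process answers instantly (the function and set names are made up):

```python
def bully_election(initiator, alive):
    """Simulate the bully algorithm and return the new coordinator (sketch).

    `alive` is the set of process numbers that are currently up;
    `initiator` is the process that noticed the old coordinator crashed.
    """
    p = initiator
    while True:
        responders = [q for q in alive if q > p]   # live higher processes answer
        if not responders:
            return p    # nobody higher is alive: p announces COORDINATOR to all
        # Some higher process takes over and runs its own election; the end
        # result (the highest live process) is the same whichever one we follow.
        p = min(responders)

# Processes 0..6 are up and 7 has crashed; process 4 notices and starts.
print(bully_election(4, alive={0, 1, 2, 3, 4, 5, 6}))   # -> 6
```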

Election: A Ring Algorithm
Figure 6-21. Election algorithm using a ring.
After discovering the crash of the old coordinator, some process initiates an ELECTION message that circulates around the ring, accumulating the process numbers of the processes that have seen it. Then a COORDINATOR message circulates once more, containing the full member list; the process with the highest number becomes the new coordinator.
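
A toy simulation of the two circulation phases (the ring layout and initiator are made-up example values):

```python
def ring_election(ring, initiator):
    """Simulate the ring election algorithm (sketch).

    `ring` is the list of live process numbers in ring order and `initiator`
    is the index (in that list) of the process that starts the election.
    Returns (new_coordinator, member_list_collected_by_the_election_message).
    """
    n = len(ring)
    # ELECTION phase: the message travels once around the ring, and each
    # process appends its own number to the list it carries.
    members = [ring[(initiator + step) % n] for step in range(n)]
    # COORDINATOR phase: the full list circulates again; every process sees
    # that the highest number in the list is the new coordinator.
    return max(members), members

# Live processes in ring order (the old coordinator, 7, has crashed);
# the process at index 0 (number 5) notices the crash and starts the election.
print(ring_election([5, 6, 0, 1, 2, 3, 4], initiator=0))
```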