Distributed Systems CS 15-440
Synchronization – Part III
Lecture 10, October 5, 2016
Mohammad Hammoud

Today…
Last Session: Logical Clocks
Today's Session: Distributed Mutual Exclusion; Election Algorithms
Announcements:
- Midterm exam is on Wednesday, Oct 12 (open book, open notes)
- PS3 is out; it is due on Monday, Oct 17
- Project II is out; it is due on Oct 23 by midnight

Where Do We Stand in the Synchronization Chapter?
Previous two lectures:
- Time Synchronization
  - Physical Clock Synchronization (or, simply, Clock Synchronization): the actual times on the computers are synchronized
  - Logical Clock Synchronization: computers are synchronized based on the relative ordering of events
Today's lecture:
- Mutual Exclusion: how to coordinate between processes that access the same resource
- Election Algorithms: a group of entities elect one entity as the coordinator for solving a problem

Summary of Lamport's Clock
- Lamport advocated using logical clocks: processes synchronize based on the time values of their logical clocks rather than the absolute time values of their physical clocks
- Which applications in a DS need logical clocks?
  - Applications that need a provable ordering of events: perfect physical clock synchronization is hard to achieve in practice, so we cannot provably order events with physical clocks
  - Applications with rare events: events are generated rarely, so the overhead of physical clock synchronization is not justified
- However, Lamport's clock cannot guarantee perfect ordering of events by just observing the time values of two arbitrary events

Limitation of Lamport's Clock
- Lamport's clock ensures that if a → b, then C(a) < C(b)
- However, it does not say anything about two arbitrary events a and b by only comparing their time values: for any two arbitrary events a and b, C(a) < C(b) does not mean that a → b
- Example (three processes P1, P2, P3 exchanging messages m1:6, m2:20, m3:32): comparing m1 and m3, P2 can infer that m1 → m3; comparing m1 and m2, P2 cannot infer that m1 → m2 or m2 → m1
[Figure: message timeline with Lamport clock values at P1, P2 and P3]

Logical Clocks
We will study two types of logical clocks:
- Lamport's Clock
- Vector Clocks

Vector Clocks
- Vector clocks were proposed to overcome the limitation of Lamport's clock (i.e., C(a) < C(b) does not mean that a → b)
- The property of inferring that a occurred before b is known as the causality property
- A vector clock for a system of N processes is an array of N integers
- Every process Pi stores its own vector clock VCi; Lamport's time values for events are stored in VCi, and VCi(a) is assigned to an event a
- If VCi(a) < VCi(b), then we can infer that a → b (or, more precisely, that event a causally precedes event b)

Updating Vector Clocks
Vector clocks are constructed by the following two properties:
1. VCi[i] is the number of events that have occurred at process Pi so far; i.e., VCi[i] is the local logical clock at process Pi
2. If VCi[j] = k, then Pi knows that k events have occurred at Pj; i.e., VCi[j] is Pi's knowledge of the local time at Pj
- Increment VCi[i] whenever a new event occurs at Pi
- Pass VCi along with each message that Pi sends

Vector Clock Update Algorithm
1. Whenever there is a new event at Pi, increment VCi[i]
2. When a process Pi sends a message m to Pj:
   - Increment VCi[i]
   - Set m's timestamp ts(m) to the vector VCi
3. When process Pj receives message m:
   - VCj[k] = max(VCj[k], ts(m)[k]) for all k
   - Increment VCj[j]
[Figure: P0 increments VC0 from (0,0,0) to (1,0,0) and then (2,0,0) while sending m:(2,0,0); on receipt, P1 updates VC1 from (0,0,0) to (2,1,0); P2 stays at (0,0,0)]
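
These rules translate almost directly into code. A minimal Python sketch (the class and method names are illustrative, not from the lecture):

```python
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid              # index i of this process
        self.vc = [0] * n           # VCi, one entry per process

    def local_event(self):
        self.vc[self.pid] += 1      # rule 1: a new event at Pi increments VCi[i]

    def send(self):
        self.vc[self.pid] += 1      # rule 2: sending is an event
        return list(self.vc)        # ts(m) = a copy of VCi

    def receive(self, ts):
        # rule 3: merge component-wise maxima, then count the receive event
        self.vc = [max(v, t) for v, t in zip(self.vc, ts)]
        self.vc[self.pid] += 1

# Reproducing the slide's example: P0 sends m with ts (2,0,0); P1 merges it.
p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
p0.local_event()                    # VC0 = (1,0,0)
ts = p0.send()                      # VC0 = (2,0,0), ts(m) = (2,0,0)
p1.receive(ts)
print(p1.vc)                        # [2, 1, 0], matching the figure
```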

Inferring Events with Vector Clocks
Let a process Pi send a message m to Pj with timestamp ts(m). Then:
- Pj knows the number of events at the sender Pi that causally precede m: (ts(m)[i] − 1) is the number of such events at Pi
- Pj also knows the minimum number of events at any other process Pk that causally precede m: (ts(m)[k] − 1) is the minimum number of such events at Pk
[Figure: P0 sends m:(2,0,0) to P1; P1 later sends m':(2,3,0) to P2, whose clock becomes VC2=(2,3,1)]

Enforcing Causal Communication
- Assume that messages are multicast within a group of processes P0, P1 and P2
- To enforce causally-ordered multicasting, the delivery of a message m sent from Pi to Pj can be delayed until the following two conditions are met:
  - Condition I: ts(m)[i] = VCj[i] + 1
  - Condition II: ts(m)[k] <= VCj[k] for all k != i
- This assumes that Pi only increments VCi[i] upon sending m, and adjusts VCi[k] to max{VCi[k], ts(m')[k]} for each k upon receiving a message m'
[Figure: P2 receives m:(1,1,0) from P1 before m:(1,0,0) from P0; Condition II does not hold, so delivery of m:(1,1,0) is delayed]
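
A sketch of the delivery test implied by the two conditions above (the function name and message representation are hypothetical):

```python
def can_deliver(ts, vc_j, i):
    """Decide whether Pj (with clock vc_j) may deliver message m from Pi."""
    cond1 = ts[i] == vc_j[i] + 1                    # m is the next message from Pi
    cond2 = all(ts[k] <= vc_j[k]                    # Pj has already seen everything
                for k in range(len(ts)) if k != i)  # that m causally depends on
    return cond1 and cond2

# Slide example: P2 has VC2 = (0,0,0) and receives m:(1,1,0) from P1 first
print(can_deliver([1, 1, 0], [0, 0, 0], 1))  # False -> delay delivery
print(can_deliver([1, 0, 0], [0, 0, 0], 0))  # True  -> deliver m from P0 first
```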

Summary – Logical Clocks
- Logical clocks are employed when processes have to agree on the relative ordering of events, but not necessarily on the actual time of events
- Two types of logical clocks:
  - Lamport's logical clocks: support relative ordering of events across different processes using the happened-before relationship
  - Vector clocks: support causal ordering of events

Overview
- Time Synchronization
  - Clock Synchronization
  - Logical Clock Synchronization
- Mutual Exclusion
- Election Algorithms

Need for Mutual Exclusion
- Distributed processes need to coordinate to access shared resources
- Example: writing a file in a Distributed File System
[Figure: clients A, B and C (processes P1, P2, P3) concurrently read from and write to a distributed file abc.txt on a server]
- In uniprocessor systems, mutual exclusion on a shared resource is provided through operating system support; however, such support is insufficient to enable mutual exclusion among distributed entities
- In distributed systems, processes coordinate access to a shared resource by passing messages to enforce distributed mutual exclusion

Types of Distributed Mutual Exclusion
Mutual exclusion algorithms are classified into two categories:
1. Permission-based approaches: a process that wants to access a shared resource requests permission from one or more coordinators
2. Token-based approaches: each shared resource has a token; the token is circulated among all the processes, and a process can access the resource only if it holds the token
[Figure: left, client P1 sends a request to coordinator C1, receives a grant, and accesses the resource on the server; right, clients P1, P2 and P3 access the resource in turn as the token passes among them]

Overview
- Time Synchronization
  - Clock Synchronization
  - Logical Clock Synchronization
- Mutual Exclusion
  - Permission-based Approaches
  - Token-based Approaches
- Election Algorithms

Permission-based Approaches
There are two types of permission-based mutual exclusion algorithms:
- Centralized algorithms
- Decentralized algorithms
We will study an example of each type.

a. A Centralized Algorithm
- One process is elected as a coordinator (C) for a shared resource
- The coordinator maintains a queue of access requests
- Whenever a process wants to access the resource, it sends a request message to the coordinator
- When the coordinator receives the request:
  - If no other process is currently accessing the resource, it grants permission by sending a "grant" message
  - If another process is accessing the resource, the coordinator queues the request and may not reply to it
- A process releases its exclusive access when it is done with the resource; the coordinator then sends a "grant" message to the next process in the queue
[Figure: processes P0–P2 exchange Req/Grant/Rel messages with coordinator C, which holds waiting requests (P2, P1) in its queue]
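
A minimal sketch of the coordinator's logic under the queue-based, non-blocking variant described above (names are illustrative, and message transport is abstracted away):

```python
from collections import deque

class Coordinator:
    def __init__(self):
        self.queue = deque()               # waiting requesters, FIFO
        self.holder = None                 # process currently using the resource

    def on_request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "grant"                 # resource free: grant immediately
        self.queue.append(pid)             # busy: queue the request, no reply yet
        return None

    def on_release(self, pid):
        assert pid == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return ("grant", self.holder) if self.holder else None

c = Coordinator()
print(c.on_request("P1"))   # grant
print(c.on_request("P2"))   # None (queued)
print(c.on_release("P1"))   # ('grant', 'P2')
```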

Discussion about the Centralized Algorithm
- Blocking vs. non-blocking requests:
  - The coordinator can block the requesting process until the resource is free
  - Alternatively, the coordinator can send a "permission-denied" message back to the process; the process can then poll the coordinator at a later time, or the coordinator queues the request and sends an explicit "grant" message once the resource is released
- The algorithm guarantees mutual exclusion and is simple to implement
- Fault tolerance: the centralized algorithm is vulnerable to a single point of failure (the coordinator); processes cannot distinguish a dead coordinator from a blocked request
- Performance bottleneck: in a large system, a single coordinator can be overwhelmed with requests

b. A Decentralized Algorithm
To avoid the drawbacks of the centralized algorithm, Lin et al. [1] advocated a decentralized mutual exclusion algorithm.
Assumptions:
- Distributed processes are in a Distributed Hash Table (DHT)-based system
- Each resource is replicated n times; the ith replica of a resource rname is named rname-i
- Every replica has its own coordinator for controlling access; the coordinator for rname-i is determined by using a hash function
Approach:
- Whenever a process wants to access the resource, it has to get a majority vote from m > n/2 coordinators
- If a coordinator does not want to vote for a process (because it has already voted for another process), it sends a "permission-denied" message to the process

A Decentralized Algorithm – An Example
If n = 10 and m = 7, then a process needs at least 7 votes to access the resource.
[Figure: P0 requests votes from coordinators C1–C10 (one per replica rname-1 … rname-10) and collects 7 OKs, so it may access the resource; P1 collects only 3 OKs and receives "deny" messages, so it may not. Legend: Pi = process i; Cj = coordinator j; rname-x = xth replica of resource rname; the number next to each process counts the votes it has gained]
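
A toy sketch of the voting step, assuming in-process coordinator objects instead of DHT lookups (all names are illustrative; backoff and retry on failure are omitted):

```python
class ReplicaCoordinator:
    def __init__(self):
        self.voted_for = None

    def try_vote(self, pid):
        if self.voted_for is None:       # OK: grant this replica's vote
            self.voted_for = pid
            return True
        return False                     # Deny: already voted for someone else

    def release(self, pid):
        if self.voted_for == pid:        # requester gives its vote back
            self.voted_for = None

def acquire(pid, coordinators, m):
    votes = sum(c.try_vote(pid) for c in coordinators)
    return votes >= m                    # access only with a majority m > n/2

n, m = 10, 7
coords = [ReplicaCoordinator() for _ in range(n)]
print(acquire("P0", coords, m))   # True: P0 gains all 10 votes
print(acquire("P1", coords, m))   # False: all coordinators deny P1
```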

Fault Tolerance in the Decentralized Algorithm
- The decentralized algorithm assumes that a coordinator recovers quickly from a failure
- However, the coordinator will have reset its state after recovery: it could have forgotten any vote it had given earlier
- Hence, the coordinator may incorrectly grant permission again, and mutual exclusion cannot be deterministically guaranteed
- However, the algorithm probabilistically guarantees mutual exclusion

Probabilistic Guarantees in the Decentralized Algorithm
- What is the minimum number of coordinators that must fail to violate mutual exclusion? At least 2m − n coordinators must fail and reset (a second process can collect at most n − m fresh votes, so it needs at least m − (n − m) = 2m − n forgotten ones)
- Let the probability of violating mutual exclusion be Pv
- Derivation of Pv:
  - Let T be the lifetime of a coordinator
  - Let p = Δt/T be the probability that a coordinator crashes during a time interval Δt
  - Let P[k] be the probability that k out of m coordinators crash during the same interval: P[k] = C(m, k) p^k (1 − p)^(m−k)
  - The violation probability is then Pv = Σ_{k=2m−n}^{m} P[k]
- In practice, this probability is very small: for T = 3 hours, Δt = 10 s, n = 32, and m = 0.75n, Pv < 10^−40
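
The quoted bound can be checked directly. A short Python computation of Pv under the formula above (assuming the 2m − n failure threshold):

```python
from math import comb

T, dt = 3 * 3600, 10                 # coordinator lifetime and interval, seconds
p = dt / T                           # per-interval crash probability
n, m = 32, 24                        # 32 replicas, majority m = 0.75n

# Pv = sum over k = 2m-n .. m of C(m,k) p^k (1-p)^(m-k)
pv = sum(comb(m, k) * p**k * (1 - p)**(m - k) for k in range(2*m - n, m + 1))
print(f"Pv = {pv:.1e}")              # ~2e-43, well below the 10^-40 bound
```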

Overview
- Time Synchronization
  - Clock Synchronization
  - Logical Clock Synchronization
- Mutual Exclusion
  - Permission-based Approaches
  - Token-based Approaches
- Election Algorithms

A Token Ring Algorithm
- In the token ring algorithm, each resource is associated with a token
- The token is circulated among the processes; the process holding the token can access the resource
- How is the token circulated among processes?
  - All processes form a logical ring where each process knows its next process
  - One process is given the token to access the resource
  - The process with the token has the right to access the resource
  - When the process has finished accessing the resource, or does not want to access it, it passes the token to the next process in the ring
[Figure: processes P0–P7 arranged in a logical ring, passing the token T; the current holder accesses the resource]
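
A toy simulation of one full circulation of the ring (illustrative only; it ignores message loss and process crashes, which the next slide discusses):

```python
def token_ring_round(wants_resource):
    """One circulation: the token visits every process in ring order."""
    n = len(wants_resource)
    holder = 0                          # process initially given the token
    for _ in range(n):
        if wants_resource[holder]:
            print(f"P{holder} accesses the resource")
        holder = (holder + 1) % n       # pass the token to the next process

token_ring_round([False, True, False, True, False, False, True, False])
# P1, P3 and P6 access the resource, one at a time, in ring order
```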

Discussion about the Token Ring
- The token ring approach provides deterministic mutual exclusion: there is one token, and the resource cannot be accessed without it
- The token ring approach avoids starvation: each process will eventually receive the token
- The token ring has a high message overhead: when no process needs the resource, the token circulates at high speed
- If the token is lost, it must be regenerated; detecting the loss of the token is difficult, especially when the amount of time between successive appearances of the token is unbounded
- Dead processes must be purged from the ring; ACK-based token delivery can assist in purging dead processes

Comparison of Mutual Exclusion Algorithms

Algorithm     | Messages per access/release | Delay before access (in message times) | Problems
Centralized   | 3                           | 2                                      | Coordinator crashes
Decentralized | 2mk + m; k = 1, 2, …        | 2mk                                    | Large number of messages
Token Ring    | 1 to ∞                      | 0 to (n − 1)                           | Token may be lost; ring can cease to exist as processes crash

Assumptions: n = number of processes in the distributed system; for the decentralized algorithm, m = minimum number of coordinators that must agree for a process to access a resource, and k = average number of requests a process makes to a coordinator for a vote.

Overview
- Time Synchronization
  - Clock Synchronization
  - Logical Clock Synchronization
- Mutual Exclusion
  - Permission-based Approaches
  - Token-based Approaches
- Election Algorithms

Election in Distributed Systems
- Many distributed algorithms require one process to act as a coordinator; typically, it does not matter which process is elected
- Examples:
  - The time server in the Berkeley clock synchronization algorithm
  - The coordinator in a centralized mutual exclusion algorithm
  - Home node selection in naming

Election Process
- Any process Pi in the DS can initiate the election algorithm that elects a new coordinator
- At the termination of the election algorithm, the elected coordinator should be unique
- Every process may know the process ID of every other process, but it does not know which processes have crashed
- Generally, we require that the coordinator be the process with the largest process ID
- The idea can be extended to elect the best coordinator, e.g., the one with the least computational load: if the load of process Pi is denoted by load_i, the coordinator is the process with the highest 1/load_i, with ties broken by process ID (see the one-liner below)
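
A one-line illustration of this selection rule in Python (the sample data and the higher-ID-wins tie-breaking convention are assumptions):

```python
processes = [(1, 0.8), (2, 0.5), (3, 0.5)]   # hypothetical (pid, load) pairs

# Pick the process maximizing 1/load; on a tie, the higher pid wins.
coordinator = max(processes, key=lambda p: (1 / p[1], p[0]))
print(coordinator)   # (3, 0.5): lowest load, highest ID among the tied pair
```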

Election Algorithms
We will study two election algorithms:
1. Bully Algorithm
2. Ring Algorithm

1. Bully Algorithm
A process initiates the election algorithm when it notices that the existing coordinator is not responding. Process Pi calls for an election as follows:
1. Pi sends an "Election" message to all processes with higher process IDs
2. When a process Pj with j > i receives the message, it responds with a "Take-over" message; Pi no longer contests the election
3. Process Pj re-initiates another call for election; steps 1 and 2 continue
4. If no one responds, Pi wins the election and sends a "Coordinator" message to every process
[Figure: processes 1–7 with the previous coordinator 7 crashed; a lower-ID process starts the election, higher processes take over in turn, and process 6 wins and broadcasts the Coordinator message]
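
A simplified sketch of the election outcome, replacing message passing with direct lookups over the set of alive processes (illustrative; real higher processes re-initiate in parallel, but the winner is the same):

```python
def bully_election(initiator, alive_ids):
    """Return the winner of a bully election started by `initiator`."""
    higher = [p for p in alive_ids if p > initiator]
    if not higher:
        return initiator            # nobody higher answered: initiator wins
    # Some higher process sends "Take-over" and re-initiates the election,
    # so the election effectively continues from a higher process.
    return bully_election(min(higher), alive_ids)

alive = {1, 2, 3, 4, 5, 6}          # the old coordinator, 7, has crashed
print(bully_election(4, alive))     # 6: the highest-ID alive process
```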

2. Ring Algorithm
This algorithm is generally used in a ring topology. When a process Pi detects that the coordinator has crashed, it initiates an election:
1. Pi builds an "Election" message (E), inserts its own ID into it, and sends it to its next node
2. When a process Pj receives the message, it appends its own ID and forwards the message; if the next node has crashed, Pj finds the next alive node
3. When the message gets back to the process that started the election, that process elects the process with the highest ID in the message as coordinator, changes the message type to a "Coordination" message (C), and circulates it around the ring
[Figure: in a ring of processes 0–7 with 7 crashed, process 5 starts the election; the E message accumulates 5,6,0,1,2,3,4 and the C message announcing 6 is then circulated]
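
A sketch of one election round, assuming a static list of ring positions and a known set of crashed processes (illustrative names):

```python
def ring_election(ring, initiator, crashed):
    """Circulate an Election message once and return the elected coordinator."""
    ids, current = [], initiator
    while True:
        if current not in crashed:
            ids.append(current)          # append own ID and forward the message
        current = ring[(ring.index(current) + 1) % len(ring)]
        if current == initiator:         # message returned to the initiator:
            return max(ids)              # the highest ID becomes coordinator

ring = [0, 1, 2, 3, 4, 5, 6, 7]
print(ring_election(ring, 5, crashed={7}))   # 6, as in the slide's example
```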

Comparison of Election Algorithms

Algorithm       | Messages for electing a coordinator | Problems
Bully Algorithm | O(n^2)                              | Large message overhead
Ring Algorithm  | 2n                                  | An overlay ring topology is necessary

Assumption: n = number of processes in the distributed system.

Summary of Election Algorithms
- Election algorithms are used for choosing a unique process that will coordinate certain activities
- At the end of the election algorithm, all nodes should uniquely identify the coordinator
- We studied two algorithms for election:
  - Bully algorithm: processes communicate in a distributed manner to elect a coordinator
  - Ring algorithm: processes in a ring topology circulate election messages to choose a coordinator

Election in Large-Scale Networks
- The bully algorithm and the ring algorithm scale poorly with the size of the network:
  - The bully algorithm needs O(n^2) messages
  - The ring algorithm requires maintaining a ring topology and 2n messages to elect a leader
- We now discuss a scalable election approach for large-scale peer-to-peer networks

Election in Large-Scale Peer-to-Peer Networks
- Many P2P networks have a hierarchical architecture to balance the advantages of centralized and distributed networks; typically, P2P networks are neither completely unstructured nor completely centralized
  - Centralized networks are efficient and make it easy to locate entities and data
  - Flat unstructured P2P networks are robust and autonomous, and balance load across all peers

Super-Peers
- In large unstructured P2P networks, the network is organized into peers and super-peers
- A super-peer not only participates as a regular peer, but also acts as a leader for a set of peers:
  - A super-peer acts as a server for a set of client peers; all communication from and to a regular peer proceeds through its super-peer
- Super-peers are expected to be long-lived nodes with high availability
[Figure: a super-peer network, with each super-peer serving a group of regular peers]

Super-Peers – Election Requirements
- In a hierarchical P2P network, several nodes have to be selected as super-peers; traditionally, only one node is selected as the coordinator
- Requirements for a node to be elected as a super-peer:
  - Super-peers should be evenly distributed across the overlay network
  - There should be a predefined proportion of super-peers relative to the number of regular peers
  - Each super-peer should not need to serve more than a fixed number of regular peers

Election of Super-Peers in a DHT-Based System
- In a DHT-based system, each node has an m-bit identifier
- One approach is to reserve part of the identifier space for super-peers: the k leftmost bits of an identifier are used to identify super-peer positions
- A node with identifier p can look up the node responsible for the key formed by p's k leftmost bits (with the remaining m − k bits set to zero); that node acts as p's super-peer
- Because DHT identifiers are uniformly distributed, this yields roughly 2^k super-peers spread evenly across the overlay
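
Assuming the bit-reservation scheme sketched above, a hypothetical illustration of mapping a node's identifier to its super-peer key (all values are made up):

```python
m, k = 8, 3                       # 8-bit identifier space, 2**3 = 8 super-peers

def superpeer_key(p):
    # keep the k leftmost bits of p, zero out the m - k rightmost bits
    return p & (((1 << k) - 1) << (m - k))

print(f"{superpeer_key(0b10110110):08b}")   # 10100000: the key the node looks
                                            # up to find its super-peer
```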

Next Class
Consistency and Replication – Part I

References
[1] Shi-Ding Lin, Qiao Lian, Ming Chen, and Zheng Zhang, "A Practical Distributed Mutual Exclusion Protocol in Dynamic Peer-to-Peer Systems," Lecture Notes in Computer Science, vol. 3279, pp. 11–21, 2005. DOI: 10.1007/978-3-540-30183-7_2
[2] http://en.wikipedia.org/wiki/Causality