Mutual exclusion Concurrent access by processes to a shared resource or data item must be executed in a mutually exclusive manner. Distributed mutual exclusion algorithms can be classified into two categories: token-based solutions and permission-based approaches.

Token-based approach In token-based solutions, mutual exclusion is achieved by passing a special message between the processes, known as a token.

The processes share a single special message, known as a token; there is only one token available. The token holder has the right to access the shared resource. A process waits for, or asks for (depending on the algorithm), the token; it enters its critical section when the token is obtained and passes the token to another process on exit. If a process receives the token and doesn't need it, it just passes it on.

Overview - Token-based Methods Advantages: starvation can be avoided by efficient organization of the processes, and deadlock is also avoidable. Disadvantage: token loss. A cooperative procedure must be initiated to recreate the token, and it must ensure that exactly one token is created!

Permission-based solutions A process that wishes to access a shared resource must first get permission from one or more other processes. This avoids the problems of token-based solutions, but is more complicated to implement.

Basic Algorithms Centralized Decentralized Distributed Distributed with “voting” – for increased fault tolerance Token ring algorithm

Centralized algorithm One process is elected as the coordinator. Whenever a process wants to access a shared resource, it sends a request message to the coordinator stating which resource it wants to access and asking for permission.

If no other process is currently accessing that resource, the coordinator sends back a reply granting permission.

Mutual Exclusion A Centralized Algorithm Figure 6-14. (a) Process 1 asks the coordinator for permission to access a shared resource. Permission is granted.

Mutual Exclusion A Centralized Algorithm Figure 6-14. (b) Process 2 then asks permission to access the same resource. The coordinator does not reply.

Mutual Exclusion A Centralized Algorithm Figure 6-14. (c) When process 1 releases the resource, it tells the coordinator, which then replies to 2.

Centralized Mutual Exclusion A central coordinator manages requests, using a FIFO queue to guarantee no starvation. [Figure 6-14: Request/OK/Release message exchange between processes 1, 2, 3 and the coordinator; process 2's request gets no reply and waits in the queue.]
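The coordinator's behavior can be sketched in a few lines (a minimal single-machine model; the class and method names are illustrative, and real message passing is replaced by return values):

```python
from collections import deque

class Coordinator:
    """Centralized mutual exclusion: grant the resource to one process
    at a time; queue later requests FIFO so no one starves."""

    def __init__(self):
        self.holder = None       # process currently in its critical section
        self.queue = deque()     # waiting requesters, served in FIFO order

    def request(self, pid):
        if self.holder is None:
            self.holder = pid
            return "OK"          # permission granted immediately
        self.queue.append(pid)   # no reply: the requester blocks
        return None

    def release(self, pid):
        assert pid == self.holder
        if self.queue:
            self.holder = self.queue.popleft()
            return ("OK", self.holder)   # reply to the next waiter
        self.holder = None
        return None
```

This mirrors Figure 6-14: the first request gets an OK, the second gets no reply until the first holder releases.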

Decentralized algorithm Based on the Distributed Hash Table (DHT) system structure. Object names are hashed to find the node where they are stored, and n replicas of each object are placed on n successive nodes; hashing the object name yields their addresses. Every replica now has a coordinator that controls access to it.

Coordinators respond to requests at once: Yes or No. For a process to use the resource it must receive permission from a majority, m > n/2, of the coordinators. If the requester gets fewer than m votes, it waits for a random time and then asks again. If a request is denied, or when the critical section is completed, the requester notifies the coordinators that sent OK messages, so they can respond again to another request.
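The majority-vote rule can be sketched as follows (a simplified model: all n coordinators are polled synchronously, and the function names are illustrative; a real system would message the replicas and handle the random back-off):

```python
def try_acquire(coordinators, requester):
    """Decentralized (voting) mutual exclusion: ask all n replica
    coordinators; succeed only with a majority m > n/2 of YES votes.
    `coordinators` maps replica id -> pid it has granted (None if free)."""
    n = len(coordinators)
    yes_votes = [cid for cid, holder in coordinators.items() if holder is None]
    if len(yes_votes) > n // 2:          # majority m > n/2 obtained
        for cid in yes_votes:
            coordinators[cid] = requester
        return True
    # fewer than m votes: hold nothing (equivalent to notifying the YES
    # voters immediately), back off for a random time, and retry later
    return False

def release(coordinators, requester):
    """Notify the coordinators that granted access so they can vote again."""
    for cid, holder in coordinators.items():
        if holder == requester:
            coordinators[cid] = None
```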

Distributed algorithms

Distributed algorithms are the backbone of distributed computing systems and essential to implementing them: distributed operating systems, distributed databases, distributed communication systems, real-time process-control systems, transportation systems, etc.

A distributed algorithm is an algorithm designed to run on computer hardware constructed from interconnected processors. Distributed algorithms are used in many varied application areas of distributed computing, such as telecommunications, scientific computing, distributed information processing, and real-time process control. Standard problems solved by distributed algorithms include leader election, consensus, distributed search, spanning tree generation, mutual exclusion, and resource allocation.

Distributed algorithms are typically executed concurrently, with separate parts of the algorithm being run simultaneously on independent processors, and having limited information about what the other parts of the algorithm are doing. One of the major challenges in developing and implementing distributed algorithms is successfully coordinating the behavior of the independent parts of the algorithm in the face of processor failures and unreliable communications links.

Distributed Mutual Exclusion Probabilistic algorithms do not guarantee that mutual exclusion is correctly enforced. Many other algorithms do, including the following, originally proposed by Lamport based on his logical clocks and total ordering relation, and later modified by Ricart and Agrawala.

The Algorithm Two message types: Request Critical Section, sent to all processes in the group; and Reply/OK, a message eventually received at the requesting site, Si, from all other sites. Messages are time-stamped based on Lamport's total ordering relation, using the pair (logical clock, process id).

Requesting When a process Pi wants to access a shared resource, it builds a message with the resource name, its pid, and the current timestamp: Request (ra, tsi, i). A request sent from P3 at "time" 4 would be time-stamped (4, 3). It sends the message to all processes, including itself. Assumption: message passing is reliable.
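The (clock, pid) timestamps form a total order: lower clock wins, and ties on the clock are broken by the lower process id. A sketch, noting that Python's lexicographic tuple comparison implements the rule directly:

```python
def beats(req_a, req_b):
    """True if request req_a precedes req_b in Lamport's total order.
    Requests are (logical_clock, pid) pairs; comparing the tuples
    lexicographically compares clocks first, then breaks ties by pid."""
    return req_a < req_b
```

For example, `beats((4, 1), (4, 3))` holds because at equal clock value 4 the lower pid wins.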

Processing a Request Pi sends Request (ra, tsi, i) to all sites. When Pk receives the request, it inserts it in its own queue, and then: sends a Reply (OK) if it is not in the critical section and does not want it; does nothing if it is in its critical section; and if it is not in the CS but would like to be, sends a Reply only if the incoming Request has a lower timestamp than its own, otherwise it does not reply.

Executing the Critical Section Pi can enter its critical section when it has received an OK Reply from every other process. At this time its request message will be at the top of every queue.
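The request/reply logic above can be sketched as a single-machine simulation (class and method names are illustrative assumptions; message passing is modeled as direct method calls rather than a real network, and the per-site queue is reduced to the deferred-reply list):

```python
class RicartAgrawala:
    """One site in a Ricart-Agrawala-style mutual exclusion protocol."""

    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers         # all other sites
        self.clock = 0             # Lamport logical clock
        self.state = "RELEASED"    # RELEASED, WANTED, or HELD
        self.my_request = None
        self.deferred = []         # senders to answer after we leave the CS
        self.ok_count = 0

    def request_cs(self):
        self.clock += 1
        self.state = "WANTED"
        self.my_request = (self.clock, self.pid)  # (clock, pid) timestamp
        self.ok_count = 0
        for p in self.peers:                      # broadcast the Request
            p.on_request(self.my_request, self)

    def on_request(self, req, sender):
        self.clock = max(self.clock, req[0]) + 1  # Lamport clock update
        defer = (self.state == "HELD" or
                 (self.state == "WANTED" and self.my_request < req))
        if defer:
            self.deferred.append(sender)          # answer later
        else:
            sender.on_ok()                        # reply OK now

    def on_ok(self):
        self.ok_count += 1
        if self.ok_count == len(self.peers):
            self.state = "HELD"                   # all OKs: enter the CS

    def release_cs(self):
        self.state = "RELEASED"
        for s in self.deferred:                   # answer deferred requests
            s.on_ok()
        self.deferred = []
```

With three sites, if site 0 holds the CS while site 1 requests it, site 1's OK from site 0 arrives only at release, exactly as in the slide's three cases.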

Distributed algorithms outline Synchronization Distributed mutual exclusion: needed to regulate access to a common resource that can be used by only one process at a time Election Used, for instance, to designate a new coordinator when the current coordinator fails

A Distributed Algorithm (1) Three different cases: If the receiver is not accessing the resource and does not want to access it, it sends back an OK message to the sender. If the receiver already has access to the resource, it simply does not reply. Instead, it queues the request. If the receiver wants to access the resource as well but has not yet done so, it compares the timestamp of the incoming message with the one contained in the message that it has sent everyone. The lowest one wins.

A Distributed Algorithm (2) Figure 6-15. (a) Two processes want to access a shared resource at the same moment.

A Distributed Algorithm (3) Figure 6-15. (b) Process 0 has the lowest timestamp, so it wins.

A Distributed Algorithm (4) Figure 6-15. (c) When process 0 is done, it sends an OK also, so 2 can now go ahead.

Distributed algorithms: outline Distributed agreement Distributed agreement is used: to determine which nodes are alive in the system; to control the behavior of some components; in distributed databases, to determine when to commit a transaction; and for fault tolerance.

Distributed algorithms: outline Check-pointing and recovery Error recovery is essential for fault tolerance. When a processor fails and is then repaired, it needs to recover its state of the computation. To enable recovery, check-pointing (recording the state to stable storage) is needed.

A Token Ring Algorithm The previous algorithms are permission based; this one is token based. Processors on a bus network are arranged in a logical ring, ordered by network address, process number (as in an MPI environment), or some other scheme. The main requirement is that the processes know the ordering arrangement.

Algorithm Description At initialization, process 0 gets the token. The token is passed around the ring. If a process needs to access a shared resource, it waits for the token to arrive, executes its critical section, releases the resource, and passes the token to the next processor. If a process receives the token and doesn't need a critical section, it hands the token to the next processor.
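One lap of the token can be sketched as follows (a simplified model: nodes are visited in ring order by iteration instead of by real message passing, and the function name is illustrative):

```python
def circulate(nodes):
    """Pass the token once around the logical ring. Each node that wants
    its critical section enters it while holding the token, then releases
    and passes the token on; others just forward it. Returns the order in
    which waiting processes entered their critical sections."""
    log = []
    for node in nodes:                  # token travels in ring order
        if node["wants_cs"]:
            log.append(node["pid"])     # enter the CS, then release
            node["wants_cs"] = False
        # token is handed to the successor regardless
    return log
```

Because the token visits every node each lap, no process waits more than one full circulation: starvation is impossible.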

Lost Tokens What does it mean if a processor waits a long time for the token? Another processor may be holding it It’s lost No way to tell the difference; in the first case continue to wait; in the second case, regenerate the token.

A Token Ring Algorithm Figure 6-16. (a) An unordered group of processes on a network. (b) A logical ring constructed in software.

A Comparison of the Algorithms Figure 6-17. A comparison of the mutual exclusion algorithms.

Election Algorithms Bully Algorithm Ring Algorithm

In general, election algorithms attempt to locate the process with the highest process number and designate it as coordinator.

Motivation We often need a coordinator in distributed systems: a leader, or distinguished node/process. If we have a leader, mutual exclusion is trivially solved: the leader determines who enters the CS. If we have a leader, totally ordered broadcast is trivially solved: the leader stamps messages with consecutive integers.

What is Leader Election? In distributed computing, leader election is the process of designating a single process as the organizer, coordinator, initiator, or sequencer of some task distributed among several computers (nodes).

Why is Leader Election Required? The existence of a centralized controller greatly simplifies process synchronization. However, if the central controller breaks down, service availability can be limited. The problem can be avoided if a new controller (leader) can be chosen. Different algorithms can be employed to elect the leader.

Bully Algorithm

When any process notices that the coordinator is no longer responding to requests, it initiates an election. A process P, holds an election as follows.

P sends an ELECTION message to all processes with higher numbers. If no one responds, P wins the election and becomes coordinator. If one of the higher-ups answers, it takes over. P’s job is done.

Bully Algorithm When a process P notices that current coordinator has failed, it sends an ELECTION message to all processes with higher IDs. If no one responds, P becomes the leader. If a higher-up receives P’s message, it will send an OK message to P and execute the algorithm. Process with highest ID takes over as coordinator by sending COORDINATOR message. If a process with higher ID comes back, it takes over leadership by sending COORDINATOR message.

At any moment, a process can get an ELECTION message from one of its lower-numbered colleagues. When such a message arrives, the receiver sends an OK message back to the sender to indicate that it is alive and will take over.

The receiver then holds an election, unless it is already holding one. Eventually, all processes give up but one, and that one is the new coordinator. It announces its victory by sending all processes a message telling them that, starting immediately, it is the new coordinator.
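The outcome of the bully algorithm can be sketched as a recursion over the set of live process ids (a simplified model: timeouts, message loss, and the OK/COORDINATOR messages are abstracted away, and the function name is illustrative):

```python
def bully_election(alive, initiator):
    """Bully algorithm outcome: the initiator messages every process
    with a higher id; any live higher-up takes over and holds its own
    election, so the recursion bottoms out at the highest live id."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator        # no response from above: we win
    return bully_election(alive, max(higher))
```

For example, with processes 1-6 alive and 7 crashed, process 4 initiating an election leads to 6 becoming coordinator, matching the example slides.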

Bully Algorithm - Example Process 4 holds an election Process 5 and 6 respond, telling 4 to stop Now 5 and 6 each hold an election

Bully Algorithm - Example Process 6 tells 5 to stop Process 6 wins and tells everyone

A ring algorithm Assume that all processes are physically or logically ordered, so that each process knows who its successor is. When any process notices that the coordinator is not functioning, it builds an ELECTION message containing its own process number and sends the message to its successor.

If the successor is down, the sender skips over it and goes to the next member along the ring, or the one after that, until a running process is located. At each step along the way, the sender adds its own process number to the list in the message, effectively making itself a candidate to be elected coordinator.

Leader Election on a Ring Each node has a unique identifier, nodes only send messages clockwise, and each node acts on its own. Protocol: a node sends an election message carrying its own id clockwise; an election message is forwarded if the id it carries is larger than the receiver's own id, and is otherwise discarded; a node becomes leader if it sees its own election message again. [Figure: election messages for id 7 circulating a ring of nodes with ids 2, 8, 7, 4, 5.]
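This forward-if-larger variant can be sketched directly (a simplified model: messages are traced one at a time instead of concurrently, ids are assumed unique, and the function name is illustrative):

```python
def ring_election(ring):
    """Each node starts an election message carrying its own id. A node
    forwards a message clockwise only if the id it carries is larger
    than its own, otherwise it discards it. The node whose own id comes
    all the way back becomes the leader: only the maximum id survives."""
    n = len(ring)
    for start, own in enumerate(ring):
        pos = (start + 1) % n
        while ring[pos] < own:        # forwarded: carried id is larger
            pos = (pos + 1) % n
        if ring[pos] == own:          # message survived a full lap
            return own
        # otherwise the message was discarded at a higher-id node
    return None
```

On the ring 2, 8, 7, 4, 5 from the figure, only the message carrying 8 completes a lap, so node 8 becomes leader.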

A Ring Algorithm Figure 6-21. Election algorithm using a ring.

Elections in Wireless Environments Consider a wireless ad hoc network. To elect a leader, any node in the network, called the source, can initiate an election by sending an ELECTION message to its immediate neighbors (i.e., the nodes in its range). When a node receives an ELECTION for the first time, it designates the sender as its parent, and subsequently sends out an ELECTION message to all its immediate neighbors, except for the parent. When a node receives an ELECTION message from a node other than its parent, it merely acknowledges the receipt.

When node R has designated node Q as its parent, it forwards the ELECTION message to its immediate neighbors (excluding Q) and waits for acknowledgments to come in before acknowledging the ELECTION message from Q. This waiting has an important consequence. First, note that neighbors that have already selected a parent will immediately respond to R. More specifically, if all neighbors already have a parent, R is a leaf node and will be able to report back to Q quickly. In doing so, it will also report information such as its battery lifetime and other resource capacities.

This information will later allow Q to compare R's capacities to that of other downstream nodes, and select the best eligible node for leadership. Of course, Q had sent an ELECTION message only because its own parent P had done so as well. In turn, when Q eventually acknowledges the ELECTION message previously sent by P, it will pass the most eligible node to P as well. In this way, the source will eventually get to know which node is best to be selected as leader, after which it will broadcast this information to all other nodes.

Elections in Large-Scale Systems The following requirements have been identified for superpeer selection: 1. Normal nodes should have low-latency access to superpeers. 2. Superpeers should be evenly distributed across the overlay network. 3. There should be a predefined portion of superpeers relative to the total number of nodes in the overlay network. 4. Each superpeer should not need to serve more than a fixed number of normal nodes.

In the case of DHT-based systems, the basic idea is to reserve a fraction of the identifier space for superpeers. Recall that in DHT-based systems each node receives a random and uniformly assigned m-bit identifier. Now suppose we reserve the first (i.e., leftmost) k bits to identify superpeers. For example, if we need N superpeers, then the first ⌈log2 N⌉ bits of any key can be used to identify these nodes.

To explain, assume we have a (small) Chord system with m = 8 and k = 3. When looking up the node responsible for a specific key p, we can first decide to route the lookup request to the node responsible for the pattern p AND 11100000, which is then treated as the superpeer. Note that each node can check whether it is a superpeer by looking up id AND 11100000 and seeing whether the request is routed back to itself.
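The masking step can be sketched as follows (the function names are illustrative, and the DHT routing itself is abstracted into a caller-supplied lookup function rather than a real Chord implementation):

```python
def superpeer_key(key, m=8, k=3):
    """Mask an m-bit key down to its leftmost k bits: the pattern
    'p AND 11100000' from the text for m = 8, k = 3. The node
    responsible for the resulting identifier acts as superpeer for p."""
    mask = ((1 << k) - 1) << (m - k)    # k=3, m=8 -> 0b11100000
    return key & mask

def is_superpeer(node_id, responsible_for, m=8, k=3):
    """A node is a superpeer if looking up its own masked id routes back
    to itself; `responsible_for` stands in for the DHT lookup."""
    return responsible_for(superpeer_key(node_id, m, k)) == node_id
```

For instance, key 10110101 is masked to 10100000, so the node responsible for 10100000 serves as the superpeer for that key.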

A different approach assumes we need to place N superpeers evenly throughout the overlay. The basic idea is simple: a total of N tokens is spread across N randomly chosen nodes. No node can hold more than one token. Each token represents a repelling force that inclines other tokens to move away. The net effect is that if all tokens exert the same repulsion force, they will move away from each other and spread themselves evenly in the geometric space.