CS377 Today: Distributed Coordination

Presentation transcript:

CS377, slide 1: Today: Distributed Coordination
- Previous class: Distributed File Systems issues
  - Naming strategies: absolute names, mount points (logical client-server connection), global names
  - File caching: in memory, on local disk
  - Cache update policies: write-back, write-through
  - Case study: Sun Microsystems NFS
- Today: distributed coordination

CS377, slide 2: What is distributed coordination?
- In previous lectures we discussed various mechanisms to synchronize the actions of processes on one machine:
  - Mutual exclusion: semaphores, locks, monitors
  - Ways of dealing with deadlocks: ignoring them, detection (let deadlocks occur, detect them, and try to recover), prevention (statically make deadlock structurally impossible), avoidance (avoid deadlock by allocating resources carefully)
- These mechanisms have been centralized.
- Distributed coordination can be seen as a generalization of these mechanisms to distributed systems.

CS377, slide 3: Event Ordering
- Being able to order events is important to synchronization, e.g., we need to be able to specify that a resource can only be used after it has been granted.
- In a centralized system, it is possible to determine the order of events, because all processes share a common clock and memory.
- In a distributed system there is no common clock, so it is sometimes impossible to tell which of two events occurred first.

CS377, slide 4: Event order: The Happened-Before Relation
- Happened-before is denoted with an arrow, e.g. A->B.
- If A and B are events in the same process, and A was executed before B, then A->B.
- If A is the event of sending a message and B is the event of receiving that message, then A->B.
- If A->B and B->C, then A->C.
- If two events A and B are not related by the -> relation, then they were executed concurrently: we don't know which of the two happened first.
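The three rules on the slide above can be sketched as a small closure computation. This is only an illustration: the function names, the two processes X and Y, and their events are hypothetical and do not come from the lecture's diagram.

```python
from itertools import product

def happened_before(process_order, messages):
    """Return the set of pairs (a, b) such that a -> b."""
    hb = set()
    # Rule 1: events of one process are ordered by execution order.
    for events in process_order.values():
        for i, a in enumerate(events):
            for b in events[i + 1:]:
                hb.add((a, b))
    # Rule 2: a send event happens before the matching receive event.
    hb.update(messages)
    # Rule 3: transitivity, repeated until no new pair appears.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb

def concurrent(hb, a, b):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb

# Hypothetical run: the message sent at x1 is received at y2.
order = {"X": ["x0", "x1", "x2"], "Y": ["y0", "y1", "y2"]}
hb = happened_before(order, [("x1", "y2")])
print(("x0", "y2") in hb)          # x0 -> x1 -> y2 by transitivity
print(concurrent(hb, "x2", "y2"))  # no chain links x2 and y2
```

The quadratic closure loop is chosen for readability; it is fine for a handful of events but not meant for large traces.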

CS377, slide 5: Example: Space-time diagram for three distributed processes
[Figure: space-time diagram with events p0 to p3, q0 to q3, and r0 to r3]

CS377, slide 6: Example cont.
- Ordered events: p0->q1; r0->q3; q2->r3; q0->p3
  - And also p0->q3 (as p0->q1 AND q1->q3)…
- Concurrent events:
  - q0 and p2
  - r0 and q2
  - p1 and q2
- Since neither event of a concurrent pair affects the other, it is NOT important to know which happened first.

CS377, slide 7: Implementation of Event Ordering
- We would need either a COMMON CLOCK or PERFECTLY SYNCHRONIZED CLOCKS to determine event ordering in distributed systems.
- Unfortunately, neither is available or possible!
- How can we define the happened-before relationship WITHOUT physical clocks in distributed systems?

CS377, slide 8: Implementation of Event Ordering
- We define a logical clock, LCi, for each process Pi.
- We associate a timestamp with each event.
- A process advances its logical clock when it receives a message from a process whose clock is ahead, i.e., if A sends to B and B's clock is less than A's timestamp, B advances LC(B) to LC(A) + 1.
- Now we can meet the global-ordering requirement: if A->B then A's timestamp < B's timestamp.
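A minimal sketch of such a logical clock follows. The class and method names are assumptions made for illustration, and the receive rule uses the common formulation max(local, received) + 1, which matches the slide's rule whenever the receiver's clock is behind.

```python
class LogicalClock:
    """Per-process logical clock (hypothetical helper, not from the lecture)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock and return the event's timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is itself an event; its timestamp travels with the message."""
        return self.tick()

    def receive(self, msg_ts):
        """Move past the message's timestamp if we are behind, then stamp the receive."""
        self.time = max(self.time, msg_ts) + 1
        return self.time

a, b = LogicalClock(), LogicalClock()
a.tick()
a.tick()
ts = a.send()            # A's send event gets timestamp 3
print(ts, b.receive(ts)) # B was at 0, so its receive event gets timestamp 4
```

With this rule the send timestamp is always strictly smaller than the receive timestamp, which is what the global-ordering requirement asks for.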

CS377, slide 9: Mutual Exclusion
- How can we provide mutual exclusion across distributed processes?
- 1. Centralized approach
  - One of the processes acts as coordinator.
  - To enter its critical section, a process sends a Request message to the coordinator and waits for a Reply message.
  - If there is already a process in the critical section, the coordinator queues the request.
  - To leave the critical section, the process must send a Release message.
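The coordinator's side of the Request/Reply/Release exchange can be sketched as follows. Message passing is replaced by direct method calls, and the class name, the FIFO queue discipline, and the process identifiers are illustrative assumptions.

```python
from collections import deque

class Coordinator:
    """Grants the critical section to one process at a time (illustrative sketch)."""

    def __init__(self):
        self.holder = None      # process currently in its critical section
        self.queue = deque()    # pending requests, served in arrival order

    def request(self, pid):
        """Handle Request: True means a Reply is sent now, False means queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.queue.append(pid)
        return False

    def release(self, pid):
        """Handle Release: return the next process that gets a Reply, or None."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder

coord = Coordinator()
print(coord.request("P1"))  # granted immediately
print(coord.request("P2"))  # queued behind P1
print(coord.release("P1"))  # P2 now receives its Reply
```

Serving the queue in arrival order is one way to get the fair scheduling that rules out starvation.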

CS377, slide 10: Centralized approach for mutual exclusion
- Advantages:
  - Relatively small overhead
  - Ensures mutual exclusion
  - If scheduling is fair, no starvation occurs
- Disadvantages:
  - The coordinator can fail
  - A new coordinator must be ELECTED
  - Once the new coordinator is elected, it must poll all the processes to reconstruct the request queue.

CS377, slide 11: Fully Distributed approach for mutual exclusion
- A far more complicated solution
- When a process Pi wants to enter its critical section, it generates a new timestamp TS and sends a message Request(Pi, TS) to all processes.
- A process can enter its critical section once it has received Reply messages from all other processes.
- Process Pj may not reply immediately:
  - because it is already in its critical section, or
  - because it also wants to enter its critical section: it compares the TS of its own request with the incoming one, and if its own is smaller, the Reply is deferred.
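The reply rule that Pj applies to an incoming request can be sketched as a single decision function. The state names and the use of the process id to break timestamp ties are assumptions added for the example; the slide only describes the timestamp comparison.

```python
def should_defer(state, own_request, incoming_request):
    """Decide whether Pj defers its Reply to an incoming Request.

    state is 'idle', 'wanting', or 'in_cs' (hypothetical labels).
    Requests are (timestamp, pid) tuples; own_request is ignored when idle.
    """
    if state == "in_cs":
        return True                              # reply only after leaving
    if state == "wanting":
        # The request with the smaller timestamp wins; tuple comparison
        # falls back to the pid when the timestamps are equal.
        return own_request < incoming_request
    return False                                 # idle: reply right away

print(should_defer("idle", None, (5, "P1")))            # False
print(should_defer("wanting", (3, "P2"), (5, "P1")))    # True: P2 asked first
print(should_defer("wanting", (7, "P2"), (5, "P1")))    # False: P1 asked first
```

Deferred replies are sent when Pj leaves its critical section, so every request is eventually answered.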

CS377, slide 12: Fully Distributed Approach
- Advantages:
  - Mutual exclusion is ensured
  - Starvation free (entry is scheduled by timestamp)
  - Deadlock free
- Disadvantages:
  - All processes must know each other
  - If one process fails, the whole scheme collapses
  - The state of all processes must be monitored continuously to detect when one fails
- Suitable for a small number of processes

CS377, slide 13: Token-Passing approach to mutual exclusion
- A token (a special type of message) circulates among all processes.
- Processes are logically organized in a ring.
- A process may enter its critical section only while it holds the token; if it does not need to enter, it passes the token to its neighbor.
- Advantages: in a highly loaded system only one message per critical-section entry may be enough; starvation free …
- Disadvantages: if a process fails, a new logical ring must be established; in a system with low contention (no process wants to enter its critical section) the number of messages per critical-section entry can be very large.
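One circulation of the token can be simulated in a few lines. This is a sketch under simplifying assumptions: processes are numbered 0 to n-1 around the ring, none of them fails, and the function name and parameters are made up for the example.

```python
def token_ring(n, wants, start=0):
    """Pass the token once around a ring of n processes.

    wants is the set of processes that need their critical section.
    Returns (order in which processes entered, number of token messages).
    """
    entered = []
    messages = 0
    holder = start
    for _ in range(n):
        if holder in wants:
            entered.append(holder)    # holder enters and leaves its critical section
        holder = (holder + 1) % n     # pass the token to the next neighbor
        messages += 1
    return entered, messages

print(token_ring(4, {0, 2}, start=1))   # ([2, 0], 4)
print(token_ring(4, set()))             # ([], 4): messages sent, nobody entered
```

The second call shows the low-contention cost mentioned on the slide: the token keeps travelling even when no process wants to enter.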

CS377, slide 14: Deadlock handling with deadlock prevention
- Deadlock avoidance is not practical: it requires information about resource usage ahead of time, which is rarely available.
- Deadlock prevention
  - We can use the local algorithms with modifications.
  - For example, we can use the resource-ordering technique (ensuring that resources are accessed in order), but first we need to define a global ordering among resources.
  - Newer techniques use timestamp ordering:
    - The wait-die scheme
      - Non-preemptive technique
      - If Pi requests a resource held by Pj and the TS of Pi is smaller than the TS of Pj, then Pi can wait for the resource. Otherwise Pi must be rolled back (restarted).
    - The wound-wait scheme
      - Preemptive technique
      - The opposite of wait-die: Pi waits if its TS is larger than Pj's; otherwise Pj is rolled back and the resource is preempted from Pj.
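The two timestamp rules reduce to one comparison each, sketched below. The function names and return strings are illustrative, and timestamps are assumed to be distinct, with a smaller timestamp meaning an older process.

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: an older requester waits, a younger one is rolled back."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: a younger requester waits, an older one preempts the holder."""
    return "wait" if ts_requester > ts_holder else "rollback holder"

# Pi has timestamp 5 and requests a resource held by Pj with timestamp 10.
print(wait_die(5, 10))     # wait
print(wound_wait(5, 10))   # rollback holder
# Roles reversed: the requester is now the younger process.
print(wait_die(10, 5))     # rollback requester
print(wound_wait(10, 5))   # wait
```

In both schemes waiting is only ever allowed in one direction of the timestamp order, which is why a cycle of waiting processes cannot form.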

CS377, slide 15: Deadlock handling with deadlock detection
- Deadlock prevention may preempt resources even if no deadlock has occurred!
- Deadlock detection is based on so-called wait-for graphs.
- A wait-for graph shows the resource allocation state.
- A cycle in the wait-for graph represents a deadlock.
[Figure: local wait-for graphs at Site A and Site B over processes P1 to P5]

CS377, slide 16: Global wait-for graphs
- To show that there is NO DEADLOCK it is not enough to show that there is no cycle locally.
- We need to construct the global wait-for graph.
- It is the union of all local graphs.
[Figure: global wait-for graph over processes P1 to P5]
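The point of the slide above can be checked with a small sketch that unions local graphs and searches for a cycle. The edges below are hypothetical and are not the ones drawn in the lecture's figure; an edge P1 -> P2 means that P1 waits for P2.

```python
def union(*local_graphs):
    """Global wait-for graph: the union of the edges of all local graphs."""
    merged = {}
    for graph in local_graphs:
        for node, targets in graph.items():
            merged.setdefault(node, set()).update(targets)
    return merged

def has_cycle(graph):
    """Depth-first search; reaching a node that is still on the path means a cycle."""
    color = {}                          # absent = unvisited, then 'grey', then 'black'

    def visit(node):
        color[node] = "grey"
        for nxt in graph.get(node, ()):
            state = color.get(nxt)
            if state == "grey" or (state is None and visit(nxt)):
                return True
        color[node] = "black"
        return False

    return any(node not in color and visit(node) for node in list(graph))

# Hypothetical local graphs: neither site sees a cycle on its own.
site_a = {"P1": {"P2"}, "P2": {"P3"}}
site_b = {"P3": {"P4"}, "P4": {"P1"}}
print(has_cycle(site_a), has_cycle(site_b))   # False False
print(has_cycle(union(site_a, site_b)))       # True: P1 -> P2 -> P3 -> P4 -> P1
```

The recursive search is kept short for readability; a very long waiting chain would call for an iterative version.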

CS377, slide 17: How to construct this global wait-for graph?
- Centralized approach:
  - The graph is maintained in ONE process: the deadlock-detection coordinator.
  - Since there is communication delay in the system, we have two types of graphs:
    - Real wait-for graph: the real but unknown state of the system
    - Constructed wait-for graph: an approximation generated by the coordinator during the execution of its algorithm
- When is the wait-for graph constructed?
  1. Whenever a local edge is inserted or removed (a message is sent to the coordinator)
  2. Periodically
  3. Whenever the coordinator invokes the cycle-detection algorithm
- What happens if a cycle is detected?
  - The coordinator selects a victim and notifies all processes.

CS377, slide 18: Centralized approach deadlock detection
- False cycles may exist in the constructed global wait-for graph, because messages are delayed and may arrive out of order: for example, if the message removing an edge arrives after the message adding another edge, the coordinator briefly sees both edges and may find a cycle that never existed.
- There is a centralized deadlock detection algorithm based on Option 3 that guarantees that all deadlocks are detected and no false deadlocks are reported.

CS377, slide 19: Summary
- Event ordering in distributed systems
- Various approaches for mutual exclusion in distributed systems
  - Centralized approach
  - Token-based approach
  - Fully distributed approach
- Deadlock prevention and detection
  - Global wait-for graph