1 Nov 8, 2003 SMU School of Engineering Group 2: Hammad, Mann, Nidgundi, & Zhang CSE 8343 Advanced Operating Systems Slide 1 Instructor: Dr. Khalil
Distributed Process Management
Team Members (Group 2): Mazen Hammad, Chuck Mann, Vrushali Nidgundi, Hong Zhang
Course: CSE 8343 Advanced Operating Systems
Professor: Dr. Mohamed Khalil

Slide 2: Distributed Process Management
A distributed system is a collection of processors that do not share memory or a clock. Distributed process management provides mechanisms for:
• Process synchronization and communication.
• Dealing with deadlock and with the variety of failures that are not encountered in a centralized system.
Overview:
• Process Migration
• Distributed Global States
• Distributed Algorithms

Slide 3: Process Migration
A process is not always executed at the site where it is initiated; the entire process, or parts of it, may be executed at different sites.
Motivation:
• Load Balancing: Performance can be improved if the load is balanced across nodes.
• Communications Performance: Processes that communicate intensively can be moved to the same node. If a process analyzes a file or files larger than the process itself, it may be better to move the process to the data rather than the other way around.

Slide 4: Motivation (Continued)
• Availability: Long-running processes may need to move if the machine they run on is going down.
• Utilizing Special Capabilities: A process can be moved to a particular node to benefit from specialized hardware or software there.

Slide 5: Initiation of Migration
• Who initiates a migration depends on the goal of the migration.
• If the goal is load balancing, a monitoring module in the operating system initiates the migration. The module preempts the process and signals it to migrate. It must stay in contact with peer modules on other systems to decide where to send the process so that the load stays balanced. In this case the entire migration is transparent to the process.
• If the goal is to reach a particular resource, a process may migrate itself. In this case the process has to be aware of the distributed system.
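The load-balancing decision can be illustrated with a minimal sketch. The function name choose_target, the load figures, and the threshold are all hypothetical; the slides do not say how the monitoring module compares loads, so this shows only one plausible policy.

```python
def choose_target(local_load, peer_loads, threshold=2):
    """Return the peer to migrate a process to, or None if no move is worthwhile.

    local_load and peer_loads are hypothetical load figures (for example,
    run-queue lengths) that the monitoring modules exchange with each other.
    """
    if not peer_loads:
        return None
    # Pick the least-loaded peer; sorting the names first makes ties deterministic.
    target = min(sorted(peer_loads), key=lambda node: peer_loads[node])
    # Migrate only when the imbalance is large enough to justify the cost.
    if local_load - peer_loads[target] >= threshold:
        return target
    return None


print(choose_target(8, {"B": 3, "C": 5}))  # B
print(choose_target(4, {"B": 3, "C": 5}))  # None
```

The threshold stands in for the cost of migration: a small imbalance is not worth the transfer.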

Slide 6: What is Migrated
• The process must be destroyed on the source system and created on the target system.
• The process control block and any links must be moved.

Slide 7: Example of Process Migration (Before/After)

Slide 8: Migration Schemes
• Eager (All): Transfer the entire address space. No trace of the process is left behind. If the address space is large and the process does not need most of it, this approach may be unnecessarily expensive.
• Pre-Copy: The process continues to execute on the source node while the address space is copied. Pages modified on the source during the pre-copy operation have to be copied a second time. This reduces the time that the process is frozen and cannot execute during migration.

Slide 9: Migration Schemes (Continued)
• Eager (Dirty): Transfer only the portion of the address space that is in main memory and has been modified. Any additional blocks of the virtual address space are transferred on demand. The source machine remains involved throughout the life of the process.

Slide 10: Migration Schemes (Continued)
• Copy-on-Reference: Pages are brought over only when they are referenced. A variation of eager (dirty). Has the lowest initial cost of process migration.
• Flushing: The pages of the process are cleared from the source's main memory by flushing dirty pages to disk. This relieves the source of holding any pages of the migrated process in main memory.
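The pre-copy scheme can be illustrated with a small simulation. This is only a sketch: the function pre_copy, the round limit, and the sets of dirtied pages are hypothetical, and a real system would track dirty pages through the memory-management hardware.

```python
def pre_copy(pages, dirtied_per_round, max_rounds=3):
    """Simulate pre-copy migration of an address space.

    pages: iterable of page numbers in the address space.
    dirtied_per_round: list of sets; entry i holds the pages the still-running
        process modifies while round i is being copied (hypothetical workload).
    Returns (pages copied while the process runs, pages copied while frozen).
    """
    all_pages = set(pages)
    to_send = set(all_pages)
    copied_running = 0
    for round_no in range(max_rounds):
        if not to_send:
            break
        copied_running += len(to_send)
        # Pages modified during this round must be copied a second time.
        if round_no < len(dirtied_per_round):
            dirtied = dirtied_per_round[round_no]
        else:
            dirtied = set()
        to_send = dirtied & all_pages
    # The process is frozen only while the remaining dirty pages are sent.
    return copied_running, len(to_send)


running, frozen = pre_copy(range(100), [{1, 2, 3, 4}, {2, 3}, {3}])
print(running, frozen)  # 106 1
```

In this run 106 page transfers happen while the process keeps executing and only 1 page is sent while it is frozen, whereas eager (all) would freeze the process for all 100 pages.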

Slide 11: Negotiation of Migration
1. The starter on the source system (S) decides that a process P should be migrated to a target system (D). It sends a message to D's starter requesting the transfer.
2. If D's starter is ready to accept the process, it sends a positive response.
3. S's starter communicates this decision to S's kernel.
4. The kernel of S then offers to send process P to machine D. The offer includes statistics about P (its age and its processor and communication loads).

Slide 12: Negotiation of Migration (Continued)
5. If D is short of the resources described in the offer, it may reject the offer. Otherwise, the kernel on D relays the offer to its controlling starter. The relay includes the same information received from S.
6. The starter's decision is communicated to D.
7. D reserves the necessary resources to avoid deadlock and flow-control problems, and finally sends an acceptance to S.
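The negotiation can be sketched as a message trace. The function negotiate, the message strings, and the "memory" field in the statistics are hypothetical; the slides list only age and processor and communication loads, and memory is used here simply as an example of a resource D might lack.

```python
def negotiate(d_starter_willing, d_free_memory, p_stats):
    """Return (message trace, accepted) for one migration negotiation."""
    trace = ["S.starter -> D.starter: transfer request"]
    if not d_starter_willing:
        trace.append("D.starter -> S.starter: negative response")
        return trace, False
    trace.append("D.starter -> S.starter: positive response")
    trace.append("S.starter -> S.kernel: migrate P")
    trace.append(f"S.kernel -> D.kernel: offer {p_stats}")
    # D's kernel may reject at once if it lacks the resources in the offer.
    if d_free_memory < p_stats["memory"]:
        trace.append("D.kernel -> S.kernel: reject")
        return trace, False
    trace.append("D.kernel -> D.starter: relay offer")
    trace.append("D.starter -> D.kernel: decision to accept")
    # D reserves resources before it confirms, so the transfer cannot stall.
    trace.append("D.kernel -> S.kernel: acceptance (resources reserved)")
    return trace, True


stats = {"age": 12, "cpu_load": 0.7, "comm_load": 0.1, "memory": 64}
messages, accepted = negotiate(True, 128, stats)
for line in messages:
    print(line)
print(accepted)  # True
```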

Slide 13: Example of Negotiation of Process Migration

Slide 14: Eviction
• A system may evict a process that has been migrated to it.
• Negotiation gives the designated target machine a say in the migration decision. It may also be useful to let a machine evict a process that has already been migrated to it, so that the machine can keep its own response time adequate.
Sprite has this capability. In Sprite, each process appears to run on a single host throughout its lifetime; this host is known as the home node of the process. A process migrated to any other node becomes a foreign process there, and the destination node may evict any foreign process, in which case it is forced back to its home node.

Slide 15: Eviction (Continued)
The elements of the Sprite eviction mechanism are as follows:
• A monitor process at each node tracks the current load to determine when to accept a foreign process. If the monitor detects activity at the node, it initiates eviction of all foreign processes.
• An evicted process is sent back to its home node.
• All processes marked for eviction are suspended immediately, which frees processing power on that node.
• The entire address space of an evicted process is transferred to the home node.
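The eviction elements above can be sketched in a few lines. This is not Sprite code: the Process class, the evict_foreign function, and the owner_active flag are hypothetical names used only to show the order of events.

```python
from dataclasses import dataclass


@dataclass
class Process:
    pid: int
    home: str               # home node of the process
    state: str = "running"


def evict_foreign(node, processes, owner_active):
    """Return {pid: home node} for every foreign process sent home from node."""
    sent_home = {}
    if not owner_active:
        return sent_home                      # no local activity, nothing to evict
    for proc in processes:
        if proc.home != node:                 # foreign process on this node
            proc.state = "suspended"          # suspended as soon as it is marked
            sent_home[proc.pid] = proc.home   # whole address space goes home
    return sent_home


procs = [Process(1, "A"), Process(2, "B"), Process(3, "C")]
print(evict_foreign("A", procs, owner_active=True))  # {2: 'B', 3: 'C'}
```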

Slide 16: Some Terms
• Channel: Exists between two processes if they exchange messages.
• State: The sequence of messages that have been sent and received along the channels incident with the process.
• Snapshot: A record of the state of a process.
• Global State: The combined state of all processes.
• Distributed Snapshot: A collection of snapshots, one for each process.

Slide 17: Distributed Global States
The state of a distributed system, called the global state (or global snapshot), is given by the collective state of its processes and channels.
• The operating system cannot know the current state of all processes in the distributed system.
• A process can know the current state only of the processes on its local system, through the process control blocks in memory.
• Concurrency issues such as mutual exclusion, deadlock, and starvation are also present in distributed systems.
• Remote processes know only the state information they receive in messages. These messages represent the state at some time in the past.

Slide 18: Example
• A bank account is distributed over two branches.
• The total amount in the account is the sum of the amounts at each branch.
• At 3:00 PM the account balance is to be determined.
• Messages are sent to request the information.

Slide 19: Example (Continued)
• Suppose that, at the time of balance determination, the balance from branch A is in transit to branch B. The result is a false reading.
• All messages in transit must be examined at the time of observation.
• The total consists of the balances at both branches plus the amount in the message.

Slide 20: Example (Continued)
• Suppose the clocks at the two branches are not perfectly synchronized.
• An amount is transferred from branch A at 3:01 (by A's clock).
• The amount arrives at branch B at 2:59 (by B's clock).
• At 3:00 the amount is counted twice.
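The two failure cases in the bank example can be worked through with concrete numbers. The amount of 100 is hypothetical; the slides give no figures.

```python
# Hypothetical amounts: the account holds 100 in total, all at branch A at first.
TOTAL = 100

# Case 1: the transfer from A to B is in transit when both branches are observed.
a, b, in_transit = 0, 0, 100
naive = a + b                  # ignores the channel between the branches
correct = a + b + in_transit   # also counts the message in transit
print(naive, correct)          # 0 100

# Case 2: skewed clocks. A records its balance at 3:00 on its own clock,
# before it sends the transfer at 3:01. B's clock runs behind, so the money
# arrives at 2:59 by B's clock and B's 3:00 record already includes it.
a_at_3, b_at_3 = 100, 100
double_counted = a_at_3 + b_at_3
print(double_counted)          # 200
```

Neither reading matches the true total of 100, which is why a consistent global state has to account for channels and cannot rely on wall-clock time.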

Slide 21: Distributed Snapshot Algorithm
• The algorithm assumes that messages are delivered in the order in which they are sent.
• It uses a control message called MARKER.
• A process (Q) starts the algorithm by recording its state and sending a MARKER on all outgoing channels before it sends any further messages.

Slide 22: Distributed Snapshot Algorithm
• Each process (say P), upon receiving a MARKER for the first time (say from Q), performs the following:
1. P records its local state.
2. P records the state of the incoming channel from Q to P as empty.
3. P propagates the MARKER to all of its neighbors along all outgoing channels.
4. The algorithm terminates at a process once the MARKER has been received along all of its incoming channels.
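The marker rules can be tried out on the two-branch bank example. This is a minimal single-threaded sketch, not the authors' implementation: the Node class, the send helper, and the balances are hypothetical, and every node is assumed to have a FIFO channel to and from each peer. The sketch also records messages that arrive on a channel after the local state is recorded but before that channel's MARKER, a rule from the Chandy and Lamport algorithm that the slide does not spell out.

```python
from collections import deque

MARKER = "MARKER"


class Node:
    def __init__(self, name, balance, peers):
        self.name, self.balance = name, balance
        self.inbox = {p: deque() for p in peers}  # one FIFO channel per sender
        self.recorded = None       # recorded local state (None until recorded)
        self.channel_state = {}    # sender -> messages recorded as in transit
        self.open = set()          # incoming channels still being recorded

    def record_and_send_markers(self, nodes):
        self.recorded = self.balance
        self.open = set(self.inbox)
        self.channel_state = {p: [] for p in self.inbox}
        for p in self.inbox:       # send a MARKER on every outgoing channel
            nodes[p].inbox[self.name].append(MARKER)

    def receive(self, sender, nodes):
        msg = self.inbox[sender].popleft()
        if msg == MARKER:
            if self.recorded is None:
                self.record_and_send_markers(nodes)
            self.open.discard(sender)  # this channel's recorded state is final
        else:
            self.balance += msg
            if self.recorded is not None and sender in self.open:
                self.channel_state[sender].append(msg)  # was in transit


def send(nodes, src, dst, amount):
    nodes[src].balance -= amount
    nodes[dst].inbox[src].append(amount)


nodes = {"A": Node("A", 100, ["B"]), "B": Node("B", 50, ["A"])}
send(nodes, "A", "B", 30)
send(nodes, "B", "A", 10)
nodes["A"].record_and_send_markers(nodes)  # A starts the snapshot
nodes["B"].receive("A", nodes)             # the 30 sent before the MARKER
nodes["B"].receive("A", nodes)             # MARKER: B records, channel A->B empty
nodes["A"].receive("B", nodes)             # the 10 was in transit: recorded
nodes["A"].receive("B", nodes)             # MARKER from B: A is finished

total = sum(n.recorded + sum(sum(m) for m in n.channel_state.values())
            for n in nodes.values())
print(total)  # 150
```

The recorded states (70 at A, 70 at B) plus the 10 recorded on the channel from B to A add up to the true total of 150, even though no single instant was observed.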

Slide 23: Distributed Mutual Exclusion
The problem of mutual exclusion arises in distributed systems whenever several sites access shared resources concurrently.
• Mutual exclusion must be enforced: only one process at a time is allowed in its critical section.
• A process that halts in its non-critical section must do so without interfering with other processes.
• It must not be possible for a process requiring access to a critical section to be delayed indefinitely: no deadlock or starvation.

Slide 24: Distributed Mutual Exclusion
• When no process is in a critical section, any process that requests entry to its critical section must be permitted to enter without delay.
• No assumptions are made about relative process speeds or the number of processors.
• A process remains inside its critical section for a finite time only.

Slide 25: Centralized Algorithm for Mutual Exclusion
• One node is designated as the control node.
• This node controls access to all shared objects.
• If the control node fails, mutual exclusion breaks down.
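A control node for a single shared object can be sketched as below. The ControlNode class and its request and release methods are hypothetical names, and in a real system they would be driven by messages rather than direct calls.

```python
from collections import deque


class ControlNode:
    """Grants one shared object to one process at a time, in request order."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        """Grant at once if the object is free; otherwise queue the requester."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the object and return the next process granted, if any."""
        if self.holder != pid:
            raise ValueError("only the current holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


control = ControlNode()
print(control.request("P1"))  # True
print(control.request("P2"))  # False
print(control.release("P1"))  # P2
```

All state lives in this one object, which is exactly why the failure of the control node breaks mutual exclusion.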

Slide 26: Distributed Algorithm
• On average, all nodes have an equal amount of information.
• Each node has only a partial picture of the entire system and must base its decisions on that.
• All nodes bear equal responsibility for the final decision.
• All nodes expend equal effort in effecting a decision.
• Failure of a node does not collapse the whole system.
• Timing of events cannot be regulated against a system-wide common clock.

Slide 27: Time-Stamping
• Each system on the network maintains a counter that functions as a clock.
• Each site has a numeric identifier.
• When a message is received, the receiving system sets its counter to one more than the maximum of its current value and the incoming time-stamp (counter).
• If two messages have the same time-stamp, they are ordered by the numeric identifiers of their sites.
• For this method to work, each message is sent from one process to all other processes. This ensures that all sites have the same ordering of messages. For mutual exclusion and deadlock, all processes must be aware of the situation.
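The time-stamping rules can be sketched as follows. The Site class is a hypothetical name, and incrementing the counter before each send is an assumption taken from the usual form of Lamport's scheme; the slide states only the receive rule.

```python
class Site:
    def __init__(self, site_id):
        self.site_id = site_id
        self.clock = 0

    def send(self, payload):
        # Assumption: the counter is incremented before each send.
        self.clock += 1
        return (self.clock, self.site_id, payload)

    def receive(self, message):
        timestamp, _sender, _payload = message
        # One more than the maximum of the local counter and the time-stamp.
        self.clock = 1 + max(self.clock, timestamp)


s1, s2, s3 = Site(1), Site(2), Site(3)
a = s1.send("a")            # (1, 1, 'a')
x = s2.send("x")            # (1, 2, 'x')
for site in (s2, s3):       # every message goes to all other sites
    site.receive(a)
for site in (s1, s3):
    site.receive(x)

# Equal time-stamps are ordered by site identifier, so every site sorts the
# two messages the same way.
ordered = sorted([x, a])
print(ordered)                       # [(1, 1, 'a'), (1, 2, 'x')]
print(s1.clock, s2.clock, s3.clock)  # 2 2 3
```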

Slide 28: Distributed Deadlock
• Deadlock handling is more complicated in distributed systems.
• No node has accurate knowledge of the current state of the overall system.
• Message transfer between processes involves an unpredictable delay.
Two Types of Deadlock:
• Resource allocation.
• Communication of messages.

Slide 29: Deadlock in Resource Allocation
Deadlock exists only if all of the following conditions are met:
• Mutual exclusion.
• Hold and wait.
• No preemption.
• Circular wait.

Slide 30: Deadlock Prevention
• The circular-wait condition can be prevented by defining a linear ordering of resource types.
• The hold-and-wait condition can be prevented by requiring that a process request all of its required resources at one time, and blocking the process until all requests can be granted simultaneously.
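A linear ordering of resources can be sketched with ordinary locks. The resource names, the ORDER table, and the helper functions are hypothetical; the point is only that every thread acquires in the same global order, so no cycle of waits can form.

```python
import threading

# Hypothetical resources, each with a fixed position in the global order.
ORDER = {"disk": 0, "printer": 1, "tape": 2}
LOCKS = {name: threading.Lock() for name in ORDER}


def acquire_in_order(names):
    """Acquire the named resources in ascending global order."""
    ordered = sorted(set(names), key=ORDER.__getitem__)
    for name in ordered:
        LOCKS[name].acquire()
    return ordered


def release_all(held):
    for name in reversed(held):
        LOCKS[name].release()


def worker(wanted, log):
    held = acquire_in_order(wanted)
    log.append(held)   # critical work would happen here
    release_all(held)


log = []
requests = (["printer", "disk"], ["disk", "printer"], ["tape", "disk"])
threads = [threading.Thread(target=worker, args=(wanted, log))
           for wanted in requests]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(log))
```

Without the sort, the first two workers could each hold one lock and wait for the other, which is the circular wait the ordering rules out.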

Slide 31: Distributed Deadlock Detection
• The difficulty is that each site knows only about its own resources, whereas a deadlock may involve distributed resources. The following techniques can be employed:
Centralized Control: One site is responsible for deadlock detection. Because it receives the complete picture, it can detect any deadlock.
Hierarchical Control: The sites are organized in a tree. At each node other than the leaf nodes, information about the resource allocation of all dependent nodes is collected. A deadlock is detected by the lowest node above all the nodes involved in it, so detection can happen at a lower level than the root node.
Distributed Control: All processes cooperate in the deadlock detection function. Considerable information must be exchanged, with timestamps, so the overhead is significant.
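Centralized control can be sketched as a site that merges the wait-for edges reported by every other site and searches the merged graph for a cycle. The find_cycle function, the site reports, and the process names are hypothetical, and the slides do not say which detection algorithm is used; a depth-first search is one common choice.

```python
def find_cycle(wait_for):
    """Return one cycle in the wait-for graph as a list of processes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    stack = []

    def visit(p):
        colour[p] = GREY
        stack.append(p)
        for q in wait_for.get(p, ()):
            state = colour.get(q, WHITE)
            if state == GREY:                 # q is on the current path: a cycle
                return stack[stack.index(q):]
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        colour[p] = BLACK
        stack.pop()
        return None

    for p in list(wait_for):
        if colour.get(p, WHITE) == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None


# Each site reports only the edges it knows about (hypothetical data):
# "P1 waits for P2" is visible at site 1, and so on.
site_1 = {"P1": ["P2"]}
site_2 = {"P2": ["P3"]}
site_3 = {"P3": ["P1"]}

global_graph = {}
for report in (site_1, site_2, site_3):
    for waiter, holders in report.items():
        global_graph.setdefault(waiter, []).extend(holders)

print(find_cycle(site_1))        # None
print(find_cycle(global_graph))  # ['P1', 'P2', 'P3']
```

No single site's report contains a cycle; the deadlock becomes visible only once the reports are combined.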

Slide 32: Deadlock in Message Communication
Mutual Waiting:
• Deadlock occurs in message communication when each of a group of processes is waiting for a message from another member of the group and there are no messages in transit.
Unavailability of Message Buffers:
• This problem is well known in packet-switching data networks: for each node, the queue to the adjacent node in one direction is full of packets destined for the next node beyond. Example: the buffer space at A is filled with packets destined for B, and the reverse is true at B.
• A structured buffer pool is used to prevent this deadlock.

Slide 33: Unavailability of Message Buffers

Slide 34: Structured Buffer Pool
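The slide itself shows only a figure, so the following sketch relies on the usual textbook description of a structured buffer pool, which is an assumption here: level 0 is open to any packet, and buffers at level k are reserved for packets that have already travelled at least k hops. The class name, method name, and pool sizes are hypothetical.

```python
class StructuredBufferPool:
    def __init__(self, max_hops, common_size, reserved_per_level):
        self.common_free = common_size   # level 0: open to any packet
        # Index k holds the free buffers reserved for packets with >= k hops.
        self.reserved_free = [0] + [reserved_per_level] * max_hops

    def admit(self, hops_travelled):
        """Return the level whose buffer the packet takes, or None if discarded."""
        if self.common_free > 0:
            self.common_free -= 1
            return 0
        # A packet may use reserved levels only up to its own hop count.
        top = min(hops_travelled, len(self.reserved_free) - 1)
        for level in range(1, top + 1):
            if self.reserved_free[level] > 0:
                self.reserved_free[level] -= 1
                return level
        return None


pool = StructuredBufferPool(max_hops=2, common_size=1, reserved_per_level=1)
print(pool.admit(0))  # 0
print(pool.admit(0))  # None
print(pool.admit(1))  # 1
print(pool.admit(2))  # 2
```

Because new packets can fill only level 0, packets that are further along always find room eventually, so two nodes cannot fill each other's buffers completely.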

Slide 35: References
1. Mittal, Neeraj (2003), "Notes on Consistent Global States," CS 6378: Advanced Operating Systems, The University of Texas at Dallas, Fall 2003. [http://www.utdallas.edu/~neerajm/cs6378f03/001/snapshot.pdf]
2. Williams, Stephen and Kafura, D. (1995), "Global State Recording Algorithm: GSRA," Online Lecture Notes, CS 5204 - Operating Systems, Virginia Tech, Fall 2003. [http://courses.cs.vt.edu/~cs5204/fall99/Summaries/GlobalState/global_state.html]
3. Singhal, M. and Shivaratri, N. (1994), Advanced Concepts in Operating Systems, McGraw-Hill, pp. 112-113.
4. Chandy, K. M. and Lamport, L. (1985), "Distributed Snapshots: Determining Global States of Distributed Systems," ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63-75.
5. Stallings, William (2001), Operating Systems: Internals and Design Principles, 4th Ed., Prentice-Hall, Upper Saddle River, NJ, Figs. 14.1, 14.2, 14.3, 14.17, and 14.18.

Slide 36: Questions?

