Distributed Operating Systems, CS551, Colorado State University at Lockheed-Martin, Lecture 6, Spring 2001.


Slide 1: Distributed Operating Systems, CS551, Colorado State University at Lockheed-Martin, Lecture 6, Spring 2001

7 March 2001, CS-551, Lecture 6

Slide 2: CS551: Lecture 6
- Topics
  - Distributed Process Management (Chapter 7)
    - Distributed Scheduling Algorithm Choices
    - Scheduling Algorithm Approaches
    - Coordinator Elections
    - Orphan Processes
  - Distributed File Systems (Chapter 8)
    - Distributed Name Service
    - Distributed File Service
    - Distributed Directory Service

Slide 3: Distributed Deadlock Prevention
- Assign each process a global timestamp when it starts
- No two processes should have the same timestamp
- Basic idea: "When one process is about to block waiting for a resource that another process is using, a check is made to see which has a larger timestamp (i.e. is younger)." Tanenbaum, DOS (1995)

Slide 4: Distributed Deadlock Prevention
- Somehow put timestamps on each process, representing the creation time of the process
- Suppose a process needs a resource already owned by another process
- Determine the relative ages of both processes
- Decide whether the waiting process should preempt, wait, die, or wound the owning process
- Two different algorithms

Slide 5: Distributed Deadlock Prevention
- Allow the wait only if the waiting process is older
  - Since timestamps increase in any chain of waiting processes, cycles are impossible
- Or allow the wait only if the waiting process is younger
  - Here timestamps decrease in any chain of waiting processes, so cycles are again impossible
- Wiser to give older processes priority

Slide 6: Example: wait-die algorithm
[Diagram: an older process that wants a resource held by a younger process waits; a younger process that wants a resource held by an older process dies.]

Slide 7: Example: wound-wait algorithm
[Diagram: an older process that wants a resource held by a younger process preempts (wounds) it; a younger process that wants a resource held by an older process waits.]

Slide 8: Algorithm Comparison
- Wait-die kills the young process
  - When the young process restarts and requests the resource again, it is killed once more
  - Less efficient of the two algorithms
- Wound-wait preempts the young process
  - When the young process re-requests the resource, it has to wait for the older process to finish
  - Better of the two algorithms
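The two rules compared above can be sketched in a few lines of Python. This is an illustration of the standard wait-die and wound-wait decisions, not code from the lecture; it assumes the usual convention that a smaller timestamp means an older process.

```python
# Deadlock-prevention decisions, assuming smaller timestamp = older process.

def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits; a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester wounds (preempts) the holder;
    a younger requester waits."""
    return "wound" if requester_ts < holder_ts else "wait"

# An old process (ts=1) requesting from a young holder (ts=9):
print(wait_die(1, 9))    # "wait": the older process is allowed to wait
print(wound_wait(1, 9))  # "wound": the older process preempts the holder
```

In both rules the comparison always runs in the same direction along any chain of waiting processes, which is why cycles, and hence deadlocks, cannot form.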

Slide 9: Figure 7.7 The Bully Algorithm. (Galli, p. 169)

Slide 10: Process Management in a Distributed Environment
- Processes in a uniprocessor
- Processes in a multiprocessor
- Processes in a distributed system
  - Why we need to schedule
  - Scheduling priorities
  - How to schedule
  - Scheduling algorithms

Slide 11: Distributed Scheduling
- Basically resource management
- Want to distribute the processing load among the processing elements in order to maximize performance
- Consider having several homogeneous processing elements on a LAN with equal average workloads
  - The workload may still not be evenly distributed
  - Some PEs may have idle cycles

Slide 12: Efficiency Metrics
- Communication cost
  - Low if very little or no communication is required
  - Low if all communicating processes are
    - on the same PE
    - not distant (a small number of hops)
- Execution cost
  - Relative speed of the PE
  - Relative location of needed resources
  - Type of
    - operating system
    - machine code
    - architecture

Slide 13: Efficiency Metrics, continued
- Resource utilization
  - May be based upon
    - Current PE loads
    - Load status state
    - Resource queue lengths
    - Memory usage
    - Other resource availability

Slide 14: Level of Scheduling
- When to run a process locally and when to send it to an idle PE?
- Local scheduling
  - Allocate the process to the local PE
  - Review Galli, Chapter 2, for more information
- Global scheduling
  - Choose which PE executes which process
  - Also called process allocation
  - Precedes the local scheduling decision

Slide 15: Figure 7.1 Scheduling Decision Chart. (Galli, p. 152)

Slide 16: Distribution Goals
- Load balancing
  - Tries to maintain an equal load throughout the system
- Load sharing
  - Simpler
  - Tries to prevent any PE from becoming too busy

Slide 17: Load Balancing / Load Sharing
- Load balancing
  - Try to equalize the loads at the PEs
  - Requires more information
  - More overhead
- Load sharing
  - Avoid having an idle PE if there is work to do
- Anticipating transfers
  - Avoid a PE idling while a task is on its way
  - Get a new task just before the PE becomes idle

Slide 18: Figure 7.2 Load Distribution Goals. (Galli, p. 153)

Slide 19: Processor Allocation Algorithms
- Assume virtually identical PEs
- Assume the PEs are fully interconnected
- Assume processes may spawn children
- Two strategies
  - Non-migratory
    - static binding
    - non-preemptive
  - Migratory
    - dynamic binding
    - preemptive

Slide 20: Processor Allocation Strategies
- Non-migratory (static binding, non-preemptive)
  - Transfer before the process starts execution
  - Once assigned to a machine, the process stays there
- Migratory (dynamic binding, preemptive)
  - Processes may move after execution begins
  - Better load balancing
  - Expensive: must collect and move the entire state
  - More complex algorithms

Slide 21: Efficiency Goals
- Optimal
  - Completion time
  - Resource utilization
  - System throughput
  - Any combination thereof
- Suboptimal
  - Suboptimal approximate
  - Suboptimal heuristic

Slide 22: Optimal Scheduling Algorithms
- Require the state of all competing processes
- The scheduler must have access to all related information
- Optimization is a hard problem
  - Usually NP-hard for multiple processors
- Thus, consider
  - Suboptimal approximate solutions
  - Suboptimal heuristic solutions

Slide 23: Suboptimal Approximate Solutions
- Similar to optimal scheduling algorithms
- Try to find good solutions, not perfect solutions
- Searches are limited
- Include intelligent shortcuts

Slide 24: Suboptimal Heuristic Solutions
- Heuristics
  - Employ rules of thumb
  - Employ intuition
  - May not be provable
- Generally considered to work in an acceptable manner
- Examples:
  - If a PE has a heavy load, don't give it more to do
  - Locality of reference for related processes and data

Slide 25: Figure 7.1 Scheduling Decision Chart. (Galli, p. 152)

Slide 26: Types of Load Distribution Algorithms
- Static
  - Decisions are hard-wired in
- Dynamic
  - Use system state information to make decisions
  - Overhead of keeping track of that information
- Adaptive
  - A type of dynamic algorithm
  - May work differently at different loads

Slide 27: Load Distribution Algorithm Issues
- Transfer policy
- Selection policy
- Location policy
- Information policy
- Stability
- Sender-initiated versus receiver-initiated
- Symmetrically initiated
- Adaptive algorithms

Slide 28: Load Distribution Algorithm Issues, continued
- Transfer policy
  - When is it appropriate to move a task?
  - If the load at the sending PE > threshold
  - If the load at the receiving PE < threshold
- Location policy
  - Find a receiver PE
  - Methods:
    - Broadcast messages
    - Polling: random, neighbors, recent candidates
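The threshold-based transfer policy above reduces to two comparisons. A minimal Python sketch, where the threshold value is an assumption for illustration (the slides do not fix one):

```python
THRESHOLD = 4  # assumed queue-length threshold; not specified in the slides

def is_sender(queue_length, threshold=THRESHOLD):
    """A PE whose load exceeds the threshold may send work away."""
    return queue_length > threshold

def is_receiver(queue_length, threshold=THRESHOLD):
    """A PE whose load is below the threshold may accept work."""
    return queue_length < threshold

print(is_sender(6))    # True: overloaded, eligible to transfer a task out
print(is_receiver(2))  # True: underloaded, eligible to accept a task
```

Note that a PE sitting exactly at the threshold is neither a sender nor a receiver, which keeps borderline PEs from thrashing tasks back and forth.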

Slide 29: Load Distribution Algorithm Issues, continued
- Selection policy
  - Which task should migrate?
  - Simple
    - Select new tasks
    - Non-preemptive
  - Criteria
    - Cost of transfer: should be covered by the reduction in response time
    - Size of the task
    - Number of dependent system calls (use the local PE)

Slide 30: Load Distribution Algorithm Issues, continued
- Information policy
  - What information should be collected? When? From whom? By whom?
  - Demand-driven
    - Get info when a PE becomes a sender or receiver
    - Sender-initiated: senders look for receivers
    - Receiver-initiated: receivers look for senders
    - Symmetrically initiated: either of the above
  - Periodic: at fixed time intervals; not adaptive
  - State-change-driven
    - Send info about node state (rather than solicit it)

Slide 31: Load Distribution Algorithm Issues, continued
- Stability
  - Queuing-theoretic view
    - Stable: sum of arrival load and overhead < capacity
    - Effective: using the algorithm gives better performance than not doing load distribution
    - An effective algorithm cannot be unstable
    - A stable algorithm can be ineffective (overhead)
  - Algorithmic stability
    - E.g. performing overhead operations but making no forward progress
    - E.g. moving a task from PE to PE, only to learn that it increases the PE workload enough that it needs to be transferred again
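The queuing-theoretic stability condition is a single inequality: the offered work, including the algorithm's own overhead, must stay below capacity. A sketch in Python, with illustrative rates:

```python
def is_stable(arrival_load, overhead, capacity):
    """Queuing-theoretic stability: total demand (work arriving plus the
    load-distribution algorithm's own overhead) must be below capacity."""
    return arrival_load + overhead < capacity

# The same arrival load can be stable or unstable depending on overhead:
print(is_stable(0.6, 0.2, 1.0))  # True: 0.8 < 1.0
print(is_stable(0.9, 0.2, 1.0))  # False: the overhead pushes demand past capacity
```

The second case illustrates why heavy polling can destabilize a system: the algorithm's overhead, not the user workload, is what exceeds capacity.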


Slide 33: Load Distribution Algorithms: Sender-Initiated
- The sender PE thinks it is overloaded
- Transfer policy
  - Threshold (T) based on PE CPU queue length (QL)
    - Sender: QL > T
    - Receiver: QL < T
- Selection policy
  - Non-preemptive
    - Allows only new tasks
    - Long-lived tasks make this policy worthwhile

Slide 34: Load Distribution Algorithms: Sender-Initiated
- Location policy (3 different policies)
  - Random
    - Select a receiver at random
    - Useless or wasted effort if the destination is loaded
    - Want to avoid transferring the same task from PE to PE to PE: include a limit on the number of transfers
  - Threshold
    - Poll PEs at random
    - If a receiver is found, send the task to it
    - Limit the search to a poll limit; if the limit is hit, keep the task on the current PE

Slide 35: LDAs: Sender-Initiated
- Location policy (3 different policies, continued)
  - Shortest
    - Poll a random set of PEs
    - Choose the PE with the shortest queue length
    - Only a little better than the Threshold location policy; not worth the additional work

Slide 36: LDAs: Sender-Initiated
- Information policy
  - Demand-driven
    - Collected after a PE identifies itself as a sender
- Stability
  - At high load, a PE might not find a receiver
  - The polling will be wasted
  - Polling increases the load on the system
    - Could lead to instability
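The sender-initiated Threshold location policy can be sketched as below. This is an illustration rather than code from the lecture: the `loads` dictionary stands in for polling real PEs, and the values of `T` and `POLL_LIMIT` are assumptions.

```python
import random

T = 3           # receiver threshold on queue length (assumed value)
POLL_LIMIT = 3  # maximum number of PEs a sender will poll (assumed value)

def find_receiver(loads, self_id, rng=random):
    """Poll up to POLL_LIMIT randomly chosen PEs; transfer to the first
    whose queue length is below T, else keep the task locally."""
    candidates = [pe for pe in loads if pe != self_id]
    for _ in range(min(POLL_LIMIT, len(candidates))):
        pe = rng.choice(candidates)
        candidates.remove(pe)       # don't poll the same PE twice
        if loads[pe] < T:           # polled PE is a receiver
            return pe
    return self_id                  # poll limit hit: run the task locally

print(find_receiver({"A": 5, "B": 1, "C": 4}, "A"))  # "B", the only receiver
```

At high system load every poll tends to fail this `loads[pe] < T` test, so all the polling is pure overhead, which is exactly the instability risk the slide describes.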

Slide 37: LDAs: Receiver-Initiated
- The receiver is trying to find work
- Transfer policy
  - If local QL < T, try to find a sender
- Selection policy
  - Non-preemptive
    - But there may not be any new tasks available
  - Worth the effort

Slide 38: LDAs: Receiver-Initiated
- Location policy
  - Select a PE at random
  - If taking a task does not move that PE's load below the threshold, take it
  - If no luck after polling the poll-limit number of times,
    - Wait until another task completes, or
    - Wait another time period
- Information policy
  - Demand-driven

Slide 39: LDAs: Receiver-Initiated
- Stability
  - Tends to be stable
    - At high load, a sender should be found
- Problem
  - Transfers tend to be preemptive
    - Tasks on the sender node have already started
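The receiver-initiated counterpart can be sketched the same way. Again the `loads` dictionary, `T`, and `POLL_LIMIT` are illustrative assumptions; the test mirrors the slide's rule that a task is taken only if giving it up does not drop the polled PE below the threshold.

```python
import random

T = 2           # threshold on queue length (assumed value)
POLL_LIMIT = 3  # maximum number of PEs a receiver will poll (assumed value)

def find_sender(loads, self_id, rng=random):
    """An underloaded PE polls random PEs looking for one that can give
    up a task without itself dropping below the threshold."""
    if loads[self_id] >= T:
        return None                  # not underloaded: do not poll
    others = [pe for pe in loads if pe != self_id]
    for _ in range(min(POLL_LIMIT, len(others))):
        pe = rng.choice(others)
        others.remove(pe)
        if loads[pe] - 1 >= T:       # giving up a task keeps pe at/above T
            return pe
    return None                      # no sender found within the poll limit

print(find_sender({"A": 0, "B": 5, "C": 1}, "A"))  # "B" can spare a task
```

The stability advantage shows up here: at high load the very first polls succeed, so the polling overhead shrinks exactly when the system can least afford it.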

Slide 40: LDAs: Symmetrically-Initiated
- Both senders and receivers can search for tasks to transfer
- Has both the advantages and disadvantages of the two previous methods
- Above-average algorithm
  - Try to keep the load at each PE at an acceptable level
  - Aiming for the exact average can cause thrashing

Slide 41: LDAs: Symmetrically-Initiated
- Transfer policy
  - Each PE
    - Estimates the average load
    - Sets both an upper and a lower threshold, an equal distance from the estimate
  - If load > upper threshold, the PE acts as a sender
  - If load < lower threshold, the PE acts as a receiver
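The above-average transfer policy classifies each PE against a band around its estimated average load. A small sketch; the margin value is an assumption, since the slides only say the two thresholds sit an equal distance from the estimate:

```python
MARGIN = 1.0  # assumed half-width of the acceptable band around the average

def role(load, estimated_avg, margin=MARGIN):
    """Classify a PE relative to its own estimate of the average load."""
    if load > estimated_avg + margin:
        return "sender"
    if load < estimated_avg - margin:
        return "receiver"
    return "ok"                      # inside the band: do nothing

print(role(6, 4))  # "sender"
print(role(2, 4))  # "receiver"
print(role(4, 4))  # "ok"
```

The band is the anti-thrashing device: a PE near the average is left alone rather than being nudged toward an exact target.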

Slide 42: LDAs: Symmetrically-Initiated
- Location policy: sender-initiated
  - The sender broadcasts a TooHigh message and sets a timeout
  - A receiver sends an Accept message, clears its timeout, increases its Load value, and sets a timeout
  - If the sender still wants to send when the Accept message arrives, it sends the task
  - If the sender gets a TooLow message before an Accept, it sends the task
  - If the sender's TooHigh timeout expires with no Accept
    - The average estimate is too low
    - It broadcasts a ChangeAvg message to all PEs

Slide 43: LDAs: Symmetrically-Initiated
- Location policy: receiver-initiated
  - The receiver sends a TooLow message and sets a timeout
  - The rest is the converse of the sender-initiated algorithm
- Selection policy
  - Use a reasonable policy
    - Non-preemptive, if possible
    - Low cost

Slide 44: LDAs: Symmetrically-Initiated
- Information policy
  - Demand-driven
  - Determined at each PE
  - Low overhead

Slide 45: LDAs: Adaptive
- A stable symmetrically-initiated algorithm
  - The previous instability was due to too much polling by the sender
  - Each PE keeps lists of the other PEs sorted into three categories
    - Sender (overloaded)
    - Receiver (underloaded)
    - OK
  - At the start, each PE appears on every other PE's receiver list

Slide 46: LDAs: Adaptive
- Transfer policy
  - Based on PE CPU queue length
  - A low threshold (LT) and a high threshold (HT)
- Selection policy
  - Sender-initiated: only sends new tasks
  - Receiver-initiated: takes any task
    - Trying for low cost
- Information policy
  - Demand-driven; maintains the lists

Slide 47: LDAs: Adaptive
- Location policy: receiver-initiated
  - Order of polling
    - Senders list: head to tail (newest information first)
    - OK list: tail to head (most out-of-date first)
    - Receivers list: tail to head
  - When a PE becomes a receiver (QL < LT)
    - It starts polling
    - If it finds a sender, a transfer happens; otherwise it uses the replies to update its lists
    - It continues until it finds a sender, it is no longer a receiver, or it hits the poll limit
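The polling order in the slide above is just a deterministic traversal of the three per-PE lists. A sketch, with hypothetical PE names; the point is the head-to-tail versus tail-to-head orientation of each list:

```python
def poll_order(senders, ok, receivers):
    """Adaptive receiver-side polling order: senders head-to-tail (freshest
    information first), then the OK list tail-to-head (stalest first),
    then the receivers list tail-to-head."""
    return list(senders) + list(reversed(ok)) + list(reversed(receivers))

# Hypothetical list contents at one PE:
order = poll_order(senders=["S1", "S2"], ok=["O1", "O2"], receivers=["R1"])
print(order)  # ['S1', 'S2', 'O2', 'O1', 'R1']
```

Polling likely senders first is what removes the wasted polling that destabilized the plain symmetrically-initiated scheme.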

Slide 48: LDAs: Adaptive
- Notes
  - At high loads, activity is sender-initiated, but the sender will soon have an empty receiver list, so there is no polling
    - So the system shifts to receiver-initiated activity
  - At low loads, receiver-initiated polling tends to fail
    - But overhead doesn't matter at low load
    - And the lists get updated
    - So sender-initiated transfers should work quickly

Slide 49: Load Scheduling Algorithms (Galli)
- Usage points
  - Charged for using remote PEs and resources
- Graph theory
  - Minimum cutset of the assignment graph
  - Maximum flow of the graph
- Probes
  - Messages to locate available, appropriate PEs
- Scheduling queues
- Stochastic learning

Slide 50: Figure 7.3 Usage Points. (Galli, p. 158)

Slide 51: Figure 7.4 Economic Usage Points. (Galli, p. 159)

Slide 52: Figure 7.5 Two-Processor Min-Cut Example. (Galli, p. 161)

Slide 53: Figure 7.6 A Station with Run Queues and Hints. (Galli, p. 164)

Slide 54: CPU Queue Length as a Metric
- PE queue length correlates well with response time
  - Easy to measure
  - Caution:
    - When accepting a new migrating process, increment the queue length right away
    - A time-out may be needed in case the process never arrives
- PE queue length does not correlate well with PE utilization
  - A daemon to monitor PE utilization adds overhead

Slide 55: Election Algorithms
- The Bully algorithm (Garcia-Molina, 1982)
- A ring election algorithm

Slide 56: Bully Algorithm
- Each processor has a unique number
- One processor notices that the leader/server is missing
  - It sends messages to all other processors
  - It requests to be appointed leader
  - It includes its own processor number
- Processors with higher (or, by the opposite convention, lower) processor numbers can bully the first processor

Slide 57: Figure 7.7 The Bully Algorithm. (Galli, p. 169)

Slide 58: Bully Algorithm, continued
- The initiating processor need only send election messages to higher- (or lower-) numbered processors
- Any processors that respond effectively tell the first processor that they overrule it and that it is out of the running
- Those processors then start sending election messages to the remaining top processors

Slide 59: Bully Example
[Diagram: a process calls an election; processes 3 and 4 respond.]

Slide 60: Bully Example, continued
[Diagram: the responding processes each call their own election; process 4 calls an election.]

Slide 61: Bully Example, concluded
[Diagram: process 4 responds to process 3; process 4 is the new leader.]
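The bully election can be sketched in a few lines. This is a simplified model, not the full message protocol: the `alive` set stands in for reachability, it uses the higher-number-wins convention, and the recursion models each responder overruling the initiator and running its own election upward.

```python
def bully_election(initiator, alive):
    """Return the eventual leader when `initiator` starts a bully election
    among the live processes in `alive` (higher number wins)."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator            # nobody overrules the initiator: it leads
    # A responder bullies the initiator and elects upward in turn.
    return bully_election(min(higher), alive)

# Process 2 notices the leader is down; 3 and 4 are still alive:
print(bully_election(2, alive={2, 3, 4}))  # 4 becomes the new leader
```

This matches the worked example on the slides: 2's election is overruled by 3 and 4, 3's by 4, and 4, hearing no objection, becomes the coordinator.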

Slide 62: A Ring Election Algorithm
- No token
- Each processor knows its successor
- When a processor notices the leader is down, it sends an election message to its successor
- If the successor is down, it sends to the next processor
- Each sender adds its own number to the message

Slide 63: Ring Election Algorithm, continued
- The first processor eventually receives back the election message containing its own number
- The election message is changed to a coordinator message and resent around the ring
- The highest processor number in the message becomes the new leader
- When the first processor receives the coordinator message back, the message is deleted

Slide 64: Ring Election Example
[Diagram: the election message accumulates processor numbers around the ring: 3,4 then 3,4,5 then 3,4,5,6 then 3,4,5,6,0 then 3,4,5,6,0,1 and finally 3,4,5,6,0,1,2.]
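The ring election's first pass can be sketched directly: the message circulates once, each live processor appends its number, and the maximum in the completed message names the coordinator. A simplified model (down processors are simply skipped, standing in for "send to the next processor"):

```python
def ring_election(ring, initiator_idx, down=frozenset()):
    """Circulate an election message once around `ring`, starting at
    `initiator_idx`; each live process appends its number. Returns the
    completed message and the leader (its maximum entry)."""
    n = len(ring)
    message = [ring[initiator_idx]]
    i = (initiator_idx + 1) % n
    while i != initiator_idx:       # message returns to the initiator
        if ring[i] not in down:
            message.append(ring[i])
        i = (i + 1) % n
    return message, max(message)

# The slide's example: processor 3 starts the election on ring 0..6.
msg, leader = ring_election([0, 1, 2, 3, 4, 5, 6], initiator_idx=3)
print(msg)     # [3, 4, 5, 6, 0, 1, 2]
print(leader)  # 6
```

The accumulated message reproduces the sequence in the example figure, and the second pass (the coordinator message) would simply carry `leader` around the ring.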

Slide 65: Orphan Processes
- A child process that is still active after its parent process has terminated prematurely
- Can happen with remote procedure calls
- Wastes resources
- Can corrupt shared data
- Can create more processes
- Three solutions follow

Slide 66: Orphan Cleanup
- A process must clean up after itself after a crash
  - Requires each parent to keep a list of its children
  - The parent thus has access to the family tree
  - The list must be kept in nonvolatile storage
  - On restart, each family-tree member is told of the parent process's death and halts execution
- Disadvantage: parent overhead

Slide 67: Figure 7.8 Orphan Cleanup Family Trees. (Galli, p. 170)

Slide 68: Child Process Allowance
- All child processes receive a finite time allowance
- If no time is left, the child must request more time from its parent
- If the parent has terminated prematurely, the child's request goes unanswered
- With no time allowance, the child process dies
- Requires more communication
- Slows the execution of child processes
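The allowance mechanism can be modeled with a simple budget loop. This is an illustrative sketch: the `parent_alive` callback stands in for real parent communication, and work is counted in abstract units rather than CPU time.

```python
def run_child(work_units, allowance, parent_alive):
    """A child consumes its time allowance one unit per unit of work and
    must ask its parent for a fresh allowance when the budget hits zero.
    An orphan's request goes unanswered, so the orphan terminates."""
    budget = allowance
    for _ in range(work_units):
        if budget == 0:
            if not parent_alive():
                return "terminated"   # orphan: no one grants more time
            budget = allowance        # parent grants a fresh allowance
        budget -= 1
    return "finished"

print(run_child(10, 3, parent_alive=lambda: True))   # finished
print(run_child(10, 3, parent_alive=lambda: False))  # terminated
```

The two costs on the slide are visible here: the periodic `parent_alive` round-trips are the extra communication, and stalling at `budget == 0` is what slows the child down.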

Slide 69: Figure 7.9 Child Process Allowance. (Galli, p. 172)

Slide 70: Process Version Numbers
- Each process must keep track of a version number for its parent
- After a system crash, the entire distributed system is assigned a new version number
- A child is forced to terminate if its version number is out of date
- The child may try to find its parent
  - Terminates if unsuccessful
- Requires a lot of communication
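The version-number check reduces to one comparison plus a reachability test. A minimal sketch, where `parent_reachable` is a hypothetical stand-in for the child actually trying to locate its parent:

```python
def child_should_terminate(parent_version, system_version, parent_reachable):
    """After a crash the system-wide version number is bumped; a child
    holding a stale parent version terminates unless it can still find
    its parent."""
    if parent_version == system_version:
        return False                 # version is current: keep running
    return not parent_reachable     # stale: terminate unless parent found

# After a crash bumps the system version from 1 to 2:
print(child_should_terminate(1, 2, parent_reachable=False))  # True
print(child_should_terminate(1, 2, parent_reachable=True))   # False
```

Every child must re-check its version (and possibly search for its parent) after each crash, which is where the heavy communication cost on the slide comes from.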

Slide 71: Figure 7.10 Process Version Numbers. (Galli, p. 174)