Outline Announcement Distributed scheduling – continued


Outline
- Announcement
- Distributed scheduling – continued
- Quiz at the end of today’s class

COP 5611 - Operating Systems
Announcement
- Schedule for the rest of the semester
  - 4/10: Recovery
  - 4/15: Fault tolerance
  - 4/17: Class evaluation; Protection and security
  - 4/22: Protection and security – continued; Quiz #3
  - 4/24: Existing distributed systems and review
- Final exam
  - 5:30-7:30PM, April 29, 2003
  - Cumulative
February 23, 2019

Motivations

Motivations – cont.

Distributed Scheduling
- A distributed scheduler is a resource management component of a distributed operating system that focuses on judiciously and transparently redistributing the load of the system among the computers to maximize the overall performance.

Components of a Load Distributing Algorithm
- Four components
  - Transfer policy: determines when a node needs to send tasks to other nodes or can receive tasks from other nodes
  - Selection policy: determines which task(s) to transfer
  - Location policy: finds suitable nodes for load sharing
  - Information policy
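The first three policies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the queue-length threshold, the "move the newest task" selection rule, and the "first receiver found" location rule are hypothetical choices, not taken from the slides. The information policy is reduced here to reading each node's queue length on demand.

```python
from dataclasses import dataclass, field
from typing import List, Optional

THRESHOLD = 3  # hypothetical queue-length threshold


@dataclass
class Node:
    name: str
    queue: List[str] = field(default_factory=list)  # ids of waiting tasks


def is_sender(node: Node) -> bool:
    """Transfer policy: a node above the threshold should send tasks."""
    return len(node.queue) > THRESHOLD


def is_receiver(node: Node) -> bool:
    """Transfer policy: a node below the threshold can receive tasks."""
    return len(node.queue) < THRESHOLD


def select_task(node: Node) -> Optional[str]:
    """Selection policy (assumed): pick the most recently queued task."""
    return node.queue[-1] if node.queue else None


def locate_receiver(sender: Node, nodes: List[Node]) -> Optional[Node]:
    """Location policy (assumed): return the first receiver found."""
    for candidate in nodes:
        if candidate is not sender and is_receiver(candidate):
            return candidate
    return None


def distribute(sender: Node, nodes: List[Node]) -> bool:
    """Move one task away from an overloaded sender, if a receiver exists."""
    if not is_sender(sender):
        return False
    target = locate_receiver(sender, nodes)
    if target is None or select_task(sender) is None:
        return False
    target.queue.append(sender.queue.pop())  # pop() removes the newest task
    return True
```

Calling `distribute` repeatedly on a node moves tasks only while that node stays above the threshold, so the loop stops on its own once the load has been evened out.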

Stability
- The queuing-theoretic perspective
  - The CPU queues grow without bound if the arrival rate is greater than the rate at which the system can perform work
  - A load distributing algorithm is effective under a given set of conditions if it improves the performance relative to that of a system not using load distribution
- Algorithmic stability
  - An algorithm is unstable if it can perform fruitless actions indefinitely with finite probability
  - Processor thrashing
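The queuing-theoretic condition can be written as a one-line check. The sketch below is a simplification: the function name and parameters are hypothetical, and the `overhead_rate` term is an added assumption that models work generated by the load distributing algorithm itself (polling, transfers) as extra arrivals.

```python
def grows_without_bound(arrival_rate: float,
                        service_rate_per_node: float,
                        num_nodes: int,
                        overhead_rate: float = 0.0) -> bool:
    """Return True when offered work exceeds what the system can perform.

    arrival_rate and overhead_rate are in tasks per second for the whole
    system; service_rate_per_node is tasks per second for a single node.
    """
    capacity = service_rate_per_node * num_nodes
    return arrival_rate + overhead_rate > capacity
```

For example, two nodes that each serve 4 tasks per second can absorb 7 arrivals per second, but not once 2 tasks per second of algorithm overhead are added on top.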

Sender-Initiated Algorithms

Receiver-Initiated Algorithms

Empirical Comparison of Sender-Initiated and Receiver-Initiated Algorithms

Symmetrically Initiated Algorithms
- Sender-initiated component
  - A sender broadcasts a TooHigh message, sets a TooHigh timeout alarm, and listens for an Accept message
  - A receiver that receives a TooHigh message cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value
  - On receiving an Accept message, if the site is still a sender, it chooses the best task to transfer and transfers it
  - If no Accept has been received before the timeout, the sender broadcasts a ChangeAverage message to increase the average load estimates at the other nodes

Symmetrically Initiated Algorithms – cont.
- Receiver-initiated component
  - A receiver broadcasts a TooLow message, sets a TooLow timeout alarm, and starts listening for a TooHigh message
  - If a TooHigh message is received, it cancels its TooLow timeout, sends an Accept message to the sender, and increases its load value
  - If no TooHigh message is received before the timeout, the receiver broadcasts a ChangeAverage message to decrease the average load estimates at the other nodes
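The message handling of both components can be sketched as a small state machine. This is a minimal illustration, not the algorithm as published: the `Site` class, the `tolerance` band around the average, the unit load increments, and the `step` by which the average changes are all assumptions, and real broadcasts and timers are replaced by plain method calls.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    load: int
    average: float          # this site's estimate of the system-wide average load
    tolerance: float = 1.0  # hypothetical acceptable deviation from the average
    waiting_for: str = ""   # "Accept", "TooHigh", or "" when no timeout is armed

    def is_sender(self) -> bool:
        return self.load > self.average + self.tolerance

    def is_receiver(self) -> bool:
        return self.load < self.average - self.tolerance

    def start(self) -> str:
        """Return the message to broadcast and arm the matching timeout."""
        if self.is_sender():
            self.waiting_for = "Accept"
            return "TooHigh"
        if self.is_receiver():
            self.waiting_for = "TooHigh"
            return "TooLow"
        return ""

    def on_too_high(self) -> str:
        """Receiver side: answer a TooHigh broadcast with Accept."""
        if not self.is_receiver():
            return ""
        self.waiting_for = ""  # cancel the TooLow timeout
        self.load += 1         # count the task that is about to arrive
        return "Accept"

    def on_accept(self) -> bool:
        """Sender side: transfer one task if still overloaded."""
        self.waiting_for = ""
        if self.is_sender():
            self.load -= 1
            return True
        return False

    def on_timeout(self, step: float = 1.0) -> float:
        """Return the ChangeAverage delta to broadcast (0.0 if nothing expired)."""
        expired, self.waiting_for = self.waiting_for, ""
        if expired == "Accept":
            return step    # nobody accepted: the average estimate is too low
        if expired == "TooHigh":
            return -step   # nobody is overloaded: the average estimate is too high
        return 0.0
```

A driver would deliver each returned message to the other sites and add any non-zero `on_timeout` result to their `average` fields.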

Comparison

Adaptive Algorithms
- A stable symmetrically initiated algorithm
  - Each node keeps a senders list, a receivers list, and an OK list
  - Nodes in the system are classified as Sender/overloaded, Receiver/underloaded, or OK using the information gathered through polling

A Stable Symmetrically Initiated Algorithm – cont.
- Sender-initiated component
  - The sender polls the node at the head of its receivers list
  - The polled node moves the sender to the head of its senders list and sends a message indicating whether it is a receiver, sender, or OK node
  - The sender updates its record of the polled node based on the reply
  - If the polled node is a receiver, the sender transfers a task to it
  - The polling process stops if the sender's receivers list becomes empty, or the number of polls reaches PollLimit
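The sender-side polling loop can be sketched as follows. The function name, the dictionary-of-deques representation, and the `poll` callback are hypothetical. The slide does not say where a confirmed receiver ends up in the lists, so this sketch simply leaves it at the head of the receivers list.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional

POLL_LIMIT = 3  # hypothetical value for PollLimit


def sender_poll(lists: Dict[str, Deque[str]],
                poll: Callable[[str], str],
                poll_limit: int = POLL_LIMIT) -> Optional[str]:
    """Poll from the head of the receivers list until a receiver is found.

    `lists` holds the deques "receivers", "senders", and "ok".
    `poll(node)` returns "receiver", "sender", or "ok".
    Returns the node that should get a task, or None if polling gave up.
    """
    polls = 0
    while lists["receivers"] and polls < poll_limit:
        target = lists["receivers"][0]  # head of the receivers list
        polls += 1
        reply = poll(target)
        if reply == "receiver":
            return target               # caller transfers a task here
        # The node is no longer a receiver: reclassify it from the reply.
        lists["receivers"].popleft()
        lists["senders" if reply == "sender" else "ok"].appendleft(target)
    return None
```

Because every failed poll removes an entry from the receivers list, a heavily loaded system quickly runs out of nodes to poll, which is what keeps the sender-initiated component from polling fruitlessly.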

A Stable Symmetrically Initiated Algorithm – cont.
- Receiver-initiated component
  - Nodes are polled in the following order
    - Head to tail in the senders list
    - Tail to head in the OK list
    - Tail to head in the receivers list
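The polling order above translates directly into a generator. The function name is hypothetical, and each list is assumed to be stored head first.

```python
from typing import Iterator, Sequence


def receiver_poll_order(senders: Sequence[str],
                        ok: Sequence[str],
                        receivers: Sequence[str]) -> Iterator[str]:
    """Yield nodes in the order a receiver polls them."""
    yield from senders              # head to tail
    yield from reversed(ok)         # tail to head
    yield from reversed(receivers)  # tail to head
```

One plausible reading of this order, assuming new information is always inserted at the head of a list: the head of the senders list holds the most recently confirmed senders, while the tails of the OK and receivers lists hold the oldest entries, which are the ones most likely to have become senders since they were recorded.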

A Stable Sender-Initiated Algorithm
- This algorithm uses the sender-initiated component of the stable symmetrically initiated algorithm
- Each node is augmented by an array called the statevector
  - It keeps track of how the node is classified at all the other nodes in the system
  - It is updated based on the information exchanged at the polling stage
- The receiver-initiated component is replaced by the following protocol
  - When a node becomes a receiver, it informs all the nodes that are misinformed
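A minimal sketch of the statevector idea follows. The class and method names are hypothetical, and the initial assumption that every peer starts out regarding this node as a receiver is made here only for illustration.

```python
from typing import Dict, List


class StateVectorNode:
    def __init__(self, name: str, peers: List[str]) -> None:
        self.name = name
        # statevector[p] is the list ("receiver", "sender", or "ok") on which
        # peer p currently holds this node. The initial value is an assumption.
        self.statevector: Dict[str, str] = {p: "receiver" for p in peers}

    def record_poll(self, peer: str, status_reported: str) -> None:
        """After a poll exchange, remember what `peer` now believes about us."""
        self.statevector[peer] = status_reported

    def misinformed_peers(self) -> List[str]:
        """Peers that do not currently list this node as a receiver."""
        return [p for p, s in self.statevector.items() if s != "receiver"]

    def become_receiver(self) -> List[str]:
        """Return the peers to notify, and record that they are now informed."""
        targets = self.misinformed_peers()
        for peer in targets:
            self.statevector[peer] = "receiver"
        return targets
```

Only the peers returned by `become_receiver` need a message, so a node that was already known everywhere as a receiver sends nothing at all.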

Comparison

Performance Under Heterogeneous Workloads

Selecting a Suitable Load Sharing Algorithm
- The best algorithm depends on the system under consideration
  - For example, if the system never attains high loads, sender-initiated algorithms will give improved performance
  - Stable scheduling algorithms should be used for systems that can reach high loads
  - For systems with heterogeneous workloads, adaptive stable algorithms are preferable

Other Requirements of Load Distributing
- Scalability
  - The algorithm should work well in large distributed systems
- Location transparency
- Determinism
- Preemption
- Heterogeneity

Case Studies
- The V-System
- The Sprite system
- Condor system
- The Stealth distributed scheduler

Task Placement vs. Task Migration
- Task placement refers to the transfer of a task that is yet to begin execution to a new location, where it then starts executing
- Task migration refers to the transfer of a task that has already begun execution to a new location, where it continues executing

Task Migration
- State transfer
  - The task’s state includes the content of registers, the task stack, the task’s status, virtual memory address space, file descriptors, any temporary files, and buffered messages
  - In addition, it includes the current working directory, signal masks and handlers, resource usage statistics, and references to children and parent processes
- Unfreeze
  - The task is installed at the new machine and is put in the ready queue
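The freeze, state transfer, and unfreeze steps can be sketched with a toy state record. Everything here is illustrative: the `TaskState` fields cover only part of the state listed above, the `Machine` class is hypothetical, and `copy.deepcopy` stands in for sending the state over the network.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskState:
    registers: Dict[str, int]
    stack: List[int]
    status: str                                   # "running", "frozen", or "ready"
    open_files: List[str] = field(default_factory=list)
    cwd: str = "/"


@dataclass
class Machine:
    name: str
    ready_queue: List[TaskState] = field(default_factory=list)


def migrate(task: TaskState, source: Machine, destination: Machine) -> TaskState:
    """Freeze the task, copy its state, and unfreeze it at the destination."""
    task.status = "frozen"                        # freeze: stop execution at the source
    if task in source.ready_queue:
        source.ready_queue.remove(task)
    moved = copy.deepcopy(task)                   # state transfer (stand-in for the network)
    moved.status = "ready"                        # unfreeze: eligible to run again
    destination.ready_queue.append(moved)
    return moved
```

The time between setting `frozen` and appending to the destination's ready queue is the period during which the task makes no progress, so the amount of state copied in the middle step directly determines the migration delay.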

Issues in Task Migration
- State transfer
  - The cost to support remote execution, including delays due to freezing the task, obtaining and transferring the state, and unfreezing the task
  - Residual dependencies
    - Transferring pages in the virtual memory space
    - Redirection of messages
    - Location-dependent system calls
  - Residual dependencies are undesirable

State Transfer Mechanisms

Location Transparency
- Location transparency is essential to support task migration
  - Task migration should hide the locations of tasks
  - Remote execution of tasks should not require any special provisions in programs
- These requirements mean that names must be independent of their locations
  - Addresses are maintained as hints
  - An object can be accessed through pointers
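Treating addresses as hints can be sketched as a cache in front of an authoritative lookup. This is one possible reading of the slide, not a description of any particular system: the `NameResolver` class, the `registry` mapping, and the `hosts` table used to check whether a hint is still valid are all hypothetical.

```python
from typing import Dict, Set


class NameResolver:
    def __init__(self, registry: Dict[str, str]) -> None:
        self.registry = registry        # hypothetical authoritative name -> host map
        self.hints: Dict[str, str] = {}  # cached locations, which may be stale

    def resolve(self, name: str, hosts: Dict[str, Set[str]]) -> str:
        """Return the host currently running `name`.

        `hosts` maps each host to the set of task names it runs and is used
        here to check whether a cached hint is still correct.
        """
        hint = self.hints.get(name)
        if hint is not None and name in hosts.get(hint, set()):
            return hint                 # hint still valid: no lookup needed
        actual = self.registry[name]    # hint missing or stale: fall back
        self.hints[name] = actual       # refresh the hint for next time
        return actual
```

Callers only ever use the location-independent name, so a task can migrate without any change to the programs that refer to it. A stale hint costs one extra lookup rather than a failure.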

Task Migration Performance
- Cost of process migration in Sprite

Task Migration Performance – cont.
- Cost of process migration in Charlotte

Summary
- Load distributing algorithms try to improve the overall system performance by transferring load from heavily loaded nodes to lightly loaded or idle nodes
- Different load distributing algorithms have been developed
- To be effective, these algorithms must be able to collect the necessary information efficiently and minimize the overhead of task transfers and the delays due to task transfers