Multiple Processor Systems
Operating Systems: Internals and Design Principles, 6/E, William Stallings
Bits of Chapters 4, 10, and 16
Parallel Processor Architectures
Multiprocessor Systems
Continuous need for faster computers:
- shared-memory multiprocessor
- message-passing multicomputer
- wide-area distributed system
Multiprocessors Definition: A computer system in which two or more CPUs share full access to a common RAM
Multiprocessor Hardware Bus-based multiprocessors
UMA Multiprocessor Using a Crossbar Switch: a non-blocking network
Omega Switching Network: a blocking network
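The omega network is attractive because routing follows a simple destination-tag rule. A minimal sketch, assuming an 8x8 network (three stages of 2x2 switches; the function name is illustrative): at stage i the switch inspects bit i of the destination address, most significant bit first, and routes 0 to the upper output and 1 to the lower output.

```c
#include <stdio.h>

/* Destination-tag routing in an omega network (illustrative sketch).
 * For N = 2^n inputs there are n stages of 2x2 switches; stage i
 * looks at bit i of the destination (MSB first): 0 = upper output,
 * 1 = lower output. */
void omega_route(unsigned src, unsigned dst, unsigned n_stages)
{
    printf("message %u -> %u:", src, dst);
    for (unsigned stage = 0; stage < n_stages; stage++) {
        unsigned bit = (dst >> (n_stages - 1 - stage)) & 1u;
        printf(" stage %u: %s", stage, bit ? "lower" : "upper");
    }
    printf("\n");
}

int main(void)
{
    omega_route(3, 5, 3);   /* 8x8 network: dst 5 = 101 -> lower, upper, lower */
    return 0;
}
```

The network is blocking precisely because two simultaneous messages can demand the same switch output, in which case one must wait.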
Master-Slave Multiprocessors (bus-based organization)
Symmetric Multiprocessors (bus-based organization)
Symmetric Multiprocessor Organization
Multiprocessor Operating Systems: Design Considerations
- Simultaneous concurrent processes or threads: reentrant kernel routines, IPC
- Scheduling: on which processor a process should run
- Synchronization: locks
- Memory management: e.g., shared pages
- Reliability and fault tolerance: graceful degradation
Multiprocessor Synchronization
TSL fails to provide mutual exclusion if the bus is not locked for the duration of its read-modify-write cycle: another CPU can slip its own access in between the read and the write. A minimal lock built on an atomic test-and-set is sketched below.
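A minimal sketch using C11 atomics (spinlock_t and the function names are illustrative): atomic_flag_test_and_set performs the whole TSL cycle as one atomic operation, which is exactly the bus-lock guarantee the slide refers to.

```c
#include <stdatomic.h>

/* Minimal test-and-set spinlock sketch (C11).
 * Initialize with:  spinlock_t l = { ATOMIC_FLAG_INIT };
 * atomic_flag_test_and_set_explicit is an atomic read-modify-write,
 * i.e., the hardware-locked TSL cycle the slide refers to. */
typedef struct { atomic_flag locked; } spinlock_t;

void spin_lock(spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire))
        ;   /* busy-wait until the flag was previously clear */
}

void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```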
Spinning versus Switching
- In some cases the CPU has no choice but to spin, e.g., while waiting to acquire the ready list itself (there is nowhere to enqueue a blocked waiter)
- In other cases a choice exists: spinning wastes CPU cycles, but switching also uses up CPU cycles (saving and restoring context)
- It is also possible to make a separate decision each time a locked mutex is encountered (e.g., using the lock's recent history); one such compromise is sketched below
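A hedged sketch of a two-phase (spin-then-switch) lock; SPIN_LIMIT and all names are illustrative, and a real implementation would block in the kernel rather than merely yield:

```c
#include <sched.h>
#include <stdatomic.h>

#define SPIN_LIMIT 1000   /* illustrative threshold, not from the slides */

/* Two-phase lock: busy-wait for a bounded number of attempts, then
 * give up the processor so another thread (perhaps the lock holder)
 * can run. Initialize with:  twophase_lock_t l = { ATOMIC_FLAG_INIT }; */
typedef struct { atomic_flag held; } twophase_lock_t;

void tp_lock(twophase_lock_t *l)
{
    int spins = 0;
    while (atomic_flag_test_and_set_explicit(&l->held,
                                             memory_order_acquire)) {
        if (++spins >= SPIN_LIMIT) {
            sched_yield();   /* switch: stop burning cycles */
            spins = 0;
        }
    }
}

void tp_unlock(twophase_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Keeping per-lock history (how long past waits lasted) lets the threshold adapt instead of staying fixed.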
Scheduling Design Issues
- Assignment of processes to processors
- Use of multiprogramming on individual processors
- Actual dispatching of a process
Assignment of Processes to Processors
Treat processors as a pooled resource and assign processes to them on demand.
Static assignment: permanently assign a process to one processor
- A dedicated short-term queue for each processor
- Less scheduling overhead
- But a processor can sit idle while another has a backlog
Assignment of Processes to Processors (2)
Dynamic assignment: a single global queue
- Schedule a process onto any available processor
- But process migration has a cost, cf. the contents of the local cache
A sketch of such a queue follows.
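A minimal global ready-queue sketch (all names are illustrative); note the single lock protecting it, which resurfaces below as the main disadvantage of load sharing:

```c
#include <pthread.h>
#include <stddef.h>

/* Shared global run queue: every processor's dispatcher pulls the
 * next ready process from one queue, so any process may run on any
 * available processor. process_t is a stand-in for the real PCB. */
typedef struct process { struct process *next; /* ... state ... */ } process_t;

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static process_t *rq_head, *rq_tail;

void enqueue_ready(process_t *p)
{
    pthread_mutex_lock(&rq_lock);
    p->next = NULL;
    if (rq_tail) rq_tail->next = p; else rq_head = p;
    rq_tail = p;
    pthread_mutex_unlock(&rq_lock);
}

process_t *dispatch_next(void)   /* called by each processor */
{
    pthread_mutex_lock(&rq_lock);
    process_t *p = rq_head;
    if (p) {
        rq_head = p->next;
        if (!rq_head) rq_tail = NULL;
    }
    pthread_mutex_unlock(&rq_lock);
    return p;   /* NULL if nothing is ready */
}
```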
Master/Slave Architecture
- Key kernel functions always run on one particular processor
- The master is responsible for scheduling
- A slave sends service requests to the master
- Disadvantages: failure of the master brings down the whole system, and the master can become a performance bottleneck
Peer Architecture
- The kernel can execute on any processor
- Each processor does self-scheduling
- Complicates the operating system: it must ensure that two processors do not choose the same process
Traditional Process Scheduling
- A single queue for all processes, or multiple queues for priorities
- All queues feed a common pool of processors
Thread Scheduling
- An application can be a set of threads that cooperate and execute concurrently in the same address space
- On a multiprocessor those threads can achieve true parallelism
Comparison: One and Two Processors
Comparison: One and Two Processors (2)
Multiprocessor Thread Scheduling
- Load sharing: threads are not assigned to a particular processor
- Gang scheduling: a set of related threads is scheduled to run on a set of processors at the same time
- Dedicated processor assignment: threads are assigned to a specific processor
Load Sharing
- Load is distributed evenly across the processors
- No centralized scheduler required
- Uses a global ready queue
Disadvantages of Load Sharing
- The central queue needs mutual exclusion and can become a bottleneck
- Preempted threads are unlikely to resume execution on the same processor, so cache affinity is lost
- If all threads go through the global queue, the threads of one program will rarely gain access to the processors at the same time
Multiprocessor Scheduling (3)
The problem with communication between two threads that both belong to process A but run out of phase: each is descheduled just when the other tries to communicate with it
Gang Scheduling
- Simultaneous scheduling of the threads that make up a single process
- Useful for applications whose performance severely degrades when any part of the application is not running, since threads often need to synchronize with each other
- The resulting CPUs-by-time-slots table is sketched below
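A hedged sketch of the classic gang-scheduling time-slice table (the 4-CPU/4-slot dimensions and gang sizes are illustrative): all threads of a gang occupy one row, so they run at the same instant, at the price of idle holes when a gang is smaller than the machine.

```c
#include <stdio.h>

#define CPUS  4
#define SLOTS 4

int main(void)
{
    /* schedule[t][c] = gang id running on CPU c during time slot t
     * (0 = idle). Gang 1 has four threads; gangs 2 and 3 have two each. */
    int schedule[SLOTS][CPUS] = {
        { 1, 1, 1, 1 },   /* slot 0: gang 1 on all CPUs          */
        { 2, 2, 3, 3 },   /* slot 1: gangs 2 and 3 run together  */
        { 1, 1, 1, 1 },   /* slot 2: gang 1 again                */
        { 2, 2, 0, 0 },   /* slot 3: gang 2; two CPUs sit idle   */
    };

    for (int t = 0; t < SLOTS; t++) {
        printf("slot %d:", t);
        for (int c = 0; c < CPUS; c++)
            printf("  cpu%d=%d", c, schedule[t][c]);
        printf("\n");
    }
    return 0;
}
```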
Dedicated Processor Assignment
- When an application is scheduled, each of its threads is assigned to its own processor for the application's lifetime
- Some processors may be left idle
- No multiprogramming of processors
Application Speedup
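The figure itself is not reproduced here; as a reference point (an addition, not from the slides), if a fraction $f$ of the work is inherently serial, Amdahl's law bounds the speedup on $N$ processors:

\[
S(N) = \frac{1}{f + (1 - f)/N} \le \frac{1}{f}
\]

For example, with $f = 0.1$, eight processors yield $S(8) \approx 4.7$, not 8.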
Client/Server Computing
- Client machines are generally single-user PCs or workstations that provide a highly user-friendly interface to the end user
- Each server provides a set of shared services to the clients
- The server enables many clients to share access to the same database and enables the use of a high-performance computer system to manage the database
Generic Client/Server Environment
Client/Server Applications
- The basic software is an operating system running on the hardware platform
- The platforms and operating systems of client and server may differ
- These lower-level differences are irrelevant as long as a client and server share the same communications protocols and support the same applications
Generic Client/Server Architecture
Middleware
- A set of tools that provide a uniform means and style of access to system resources across all platforms
- Enables programmers to build applications that look and feel the same
- Enables programmers to use the same method to access data
Role of Middleware in Client/Server Architecture
Distributed Message Passing
Basic Message-Passing Primitives
Reliability versus Unreliability
- Reliable message-passing guarantees delivery where possible and can confirm it to the sender
- Unreliable message-passing simply sends the message out into the communication network without reporting success or failure; it is not necessary to let the sending process know that the message was delivered
- The unreliable approach reduces complexity and overhead
Blocking versus Nonblocking
Nonblocking primitives: the process is not suspended as a result of issuing a Send or Receive
- Efficient and flexible
- But difficult to debug
Blocking versus Nonblocking (2)
Blocking primitives:
- Send does not return control to the sending process until the message has been transmitted, or until an acknowledgment is received
- Receive does not return until a message has been placed in the allocated buffer
A runnable contrast between the two styles is sketched below.
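A hedged demonstration using POSIX message queues as one concrete realization of these primitives (the queue name /demo and the sizes are illustrative; on Linux, link with -lrt):

```c
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>

int main(void)
{
    struct mq_attr attr = { .mq_maxmsg = 8, .mq_msgsize = 64 };
    char buf[64];

    /* Blocking descriptor: mq_receive would suspend the caller until
     * a message arrives. */
    mqd_t blocking = mq_open("/demo", O_CREAT | O_RDWR, 0600, &attr);

    /* Nonblocking descriptor on the same queue: mq_receive returns
     * -1 with errno == EAGAIN instead of suspending. */
    mqd_t nonblocking = mq_open("/demo", O_RDWR | O_NONBLOCK);

    mq_send(blocking, "hello", 6, 0);

    if (mq_receive(nonblocking, buf, sizeof buf, NULL) >= 0)
        printf("got: %s\n", buf);                /* a message was waiting */

    if (mq_receive(nonblocking, buf, sizeof buf, NULL) < 0)
        perror("empty queue, returned at once"); /* EAGAIN, no hang */

    mq_close(blocking);
    mq_close(nonblocking);
    mq_unlink("/demo");
    return 0;
}
```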
Remote Procedure Calls
- Allow programs on different machines to interact using simple procedure call/return semantics
- Widely accepted and standardized, so client and server modules can be moved among computers and operating systems easily
Remote Procedure Call
Remote Procedure Call Mechanism
Client/Server Binding
Binding specifies the relationship between a remote procedure and the calling program.
- Nonpersistent binding: a logical connection is established for the duration of the remote procedure call
- Persistent binding: the connection is sustained after the procedure returns
Synchronous versus Asynchronous
- Synchronous RPC behaves much like a subroutine call
- Asynchronous RPC does not block the caller, enabling client execution to proceed locally in parallel with the server invocation
- A minimal client-stub sketch follows
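To make the call/return illusion concrete, a hedged sketch of a synchronous client stub; the "network" is a local loopback so the code runs, and every name is illustrative rather than a real RPC API:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint8_t wire[8];          /* stand-in for the network */

/* Server-side stub: unmarshal the arguments, call the procedure,
 * and produce the reply. */
static int32_t server_add(void)
{
    int32_t a, b;
    memcpy(&a, wire, 4);
    memcpy(&b, wire + 4, 4);
    return a + b;                /* the actual remote procedure */
}

/* Client-side stub: the caller sees an ordinary procedure call;
 * marshaling and the (simulated) round trip are hidden inside.
 * A synchronous RPC blocks right where server_add is called. */
int32_t remote_add(int32_t a, int32_t b)
{
    memcpy(wire,     &a, 4);     /* marshal argument 1 */
    memcpy(wire + 4, &b, 4);     /* marshal argument 2 */
    return server_add();         /* "transmit" and await the reply */
}

int main(void)
{
    printf("remote_add(2, 3) = %d\n", remote_add(2, 3));
    return 0;
}
```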
Clusters
- An alternative to symmetric multiprocessing (SMP)
- A group of interconnected, whole computers (nodes) working together as a unified computing resource
- The illusion presented is that of a single machine
No Shared Disks
Shared Disk
Clustering Methods: Benefits and Limitations
Clustering Methods: Benefits and Limitations (2)
Operating System Design Issues: Failure Management
- A highly available cluster offers a high probability that all resources will be in service, but gives no guarantee about the state of partially executed transactions if a failure occurs
- A fault-tolerant cluster ensures that all resources are always available
Operating System Design Issues (2): Load Balancing
When a new computer is added to the cluster, the load-balancing facility should automatically include it when scheduling applications
Multicomputer Scheduling - Load Balancing (1)
Graph-theoretic deterministic algorithm: model processes as nodes and their communication traffic as weighted edges, then partition the graph across machines so that little traffic crosses the cut
Load Balancing (2)
Sender-initiated distributed heuristic algorithm: an overloaded sender probes other nodes, looking for one to take some of its work
Load Balancing (3)
Receiver-initiated distributed heuristic algorithm: an underloaded receiver probes other nodes, asking for work; a simulation sketch of the sender-initiated variant follows
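A hedged simulation of the sender-initiated heuristic (the threshold, probe limit, and every name are illustrative); the receiver-initiated algorithm is the mirror image, probing when load drops rather than when it spikes:

```c
#include <stdio.h>
#include <stdlib.h>

#define NODES       16
#define THRESHOLD    4   /* queue length considered "overloaded" */
#define PROBE_LIMIT  3   /* give up after this many probes       */

static int load[NODES];  /* simulated run-queue length per node  */

/* A new process arrives at node `self`; if that makes the node
 * overloaded, probe random peers and migrate the process to the
 * first one below the threshold. */
static void on_new_process(int self)
{
    load[self]++;
    if (load[self] <= THRESHOLD)
        return;                          /* not overloaded: keep it */

    for (int i = 0; i < PROBE_LIMIT; i++) {
        int peer = rand() % NODES;
        if (peer != self && load[peer] < THRESHOLD) {
            load[self]--;                /* migrate to the peer */
            load[peer]++;
            return;
        }
    }
    /* All probes found busy nodes: run locally anyway. */
}

int main(void)
{
    for (int i = 0; i < 60; i++)
        on_new_process(rand() % 4);      /* concentrate arrivals on 4 nodes */
    for (int n = 0; n < NODES; n++)
        printf("node %2d: queue length %d\n", n, load[n]);
    return 0;
}
```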
Operating System Design Issues (3): Parallelizing Computation
- Parallelizing compiler: determines at compile time which parts can be executed in parallel
- Parallelized application: parallelized by the programmer, using message passing to move data (a sketch follows)
- Parametric computing: several runs of the same program with different settings, e.g., a simulation model
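As a sketch of the parallelized-application option, a minimal exchange using standard MPI calls (the ping from rank 0 to rank 1 is illustrative; run with, e.g., mpirun -np 2 ./a.out):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                       /* data produced on node 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);   /* data moved by message */
    }

    MPI_Finalize();
    return 0;
}
```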
Clusters Compared to SMP
- SMP is easier to manage and configure
- SMP takes up less space and draws less power
- SMP products are well established and stable
Clusters Compared to SMP (2)
- Clusters are better for incremental and absolute scalability
- Clusters are superior in terms of availability