Multiple Processor Systems

Slides:



Advertisements
Similar presentations
Multiple Processor Systems
Advertisements

Multiple Processor Systems
Multiple Processor Systems I CS 423 Klara Nahrstedt/Sam King 10/11/20141.
© DEEDS – OS Course WS11/12 Lecture 10 - Multiprocessing Support 1 Administrative Issues  Exam date candidates  CW 7 * Feb 14th (Tue): * Feb 16th.
Distributed Systems CS
SE-292 High Performance Computing
Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.
Chapter 8-1 : Multiple Processor Systems Multiple Processor Systems Multiple Processor Systems Multiprocessor Hardware Multiprocessor Hardware UMA Multiprocessors.
Super computers Parallel Processing By: Lecturer \ Aisha Dawood.
UMA Bus-Based SMP Architectures
Cache Coherent Distributed Shared Memory. Motivations Small processor count –SMP machines –Single shared memory with multiple processors interconnected.
CSCI 8150 Advanced Computer Architecture Hwang, Chapter 1 Parallel Computer Models 1.2 Multiprocessors and Multicomputers.
Distributed Processing, Client/Server, and Clusters
Multiple Processor Systems Chapter Multiprocessors 8.2 Multicomputers 8.3 Distributed systems.
1 CS 333 Introduction to Operating Systems Class 16 – Multiprocessor Operating Systems Jonathan Walpole Computer Science Portland State University.
1 Multiprocessors. 2 Idea: create powerful computers by connecting many smaller ones good news: works for timesharing (better than supercomputer) bad.
Multiprocessors Andreas Klappenecker CPSC321 Computer Architecture.
1 Lecture 1: Parallel Architecture Intro Course organization:  ~5 lectures based on Culler-Singh textbook  ~5 lectures based on Larus-Rajwar textbook.
Multiple Processor Systems 8.1 Multiprocessors 8.2 Multicomputers 8.3 Distributed systems.
1 Lecture 23: Multiprocessors Today’s topics:  RAID  Multiprocessor taxonomy  Snooping-based cache coherence protocol.
Multiprocessor and Distributed Systems
1 CSE SUNY New Paltz Chapter Nine Multiprocessors.
CPE 731 Advanced Computer Architecture Snooping Cache Multiprocessors Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University of.
User-Level Interprocess Communication for Shared Memory Multiprocessors Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented.
Parallel Computer Architectures
PRASHANTHI NARAYAN NETTEM.
MULTICOMPUTER 1. MULTICOMPUTER, YANG DIPELAJARI Multiprocessors vs multicomputers Interconnection topologies Switching schemes Communication with messages.
Introduction to Symmetric Multiprocessors Süha TUNA Bilişim Enstitüsü UHeM Yaz Çalıştayı
Synchronization and Scheduling in Multiprocessor Operating Systems
Multiple Processor Systems. Multiprocessor Systems Continuous need for faster and powerful computers –shared memory model ( access nsec) –message passing.
MIMD Shared Memory Multiprocessors. MIMD -- Shared Memory u Each processor has a full CPU u Each processors runs its own code –can be the same program.
August 15, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 12: Multiprocessors: Non-Uniform Memory Access * Jeremy R. Johnson.
Parallel Computer Architecture and Interconnect 1b.1.
Multiple Processor Systems Chapter Multiprocessors 8.2 Multicomputers 8.3 Distributed systems.
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
ECE200 – Computer Organization Chapter 9 – Multiprocessors.
13/03/07 CENG334 Introduction to Operating Systems Erol Sahin Dept of Computer Eng. Middle East Technical University Ankara, TURKEY URL:
OS2- Sem ; R. Jalili Introduction Chapter 1.
Kyung Hee University 1/41 Introduction Chapter 1.
Multiple Processor Systems. Multiprocessor Systems Continuous need for faster computers –shared memory model ( access nsec) –message passing multiprocessor.
Chapter 8-2 : Multicomputers Multiprocessors vs multicomputers Multiprocessors vs multicomputers Interconnection topologies Interconnection topologies.
Computer System Architecture Dept. of Info. Of Computer. Chap. 13 Multiprocessors 13-1 Chap. 13 Multiprocessors n 13-1 Characteristics of Multiprocessors.
MODERN OPERATING SYSTEMS Third Edition ANDREW S. TANENBAUM Chapter 8 Multiple Processor Systems Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall,
OS2- Sem1-83; R. Jalili Introduction Chapter 1. OS2- Sem1-83; R. Jalili Definition of a Distributed System (1) A distributed system is: A collection of.
1 Multiple Processors, A Network, An OS, and Middleware Chapter Multiprocessors 8.2 Multicomputers 8.3 Distributed systems.
Cotter-cs431 Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved Chapter 8 Multiple Processor Systems.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
1 Lecture 19: Scalable Protocols & Synch Topics: coherence protocols for distributed shared-memory multiprocessors and synchronization (Sections )
Distributed shared memory u motivation and the main idea u consistency models F strict and sequential F causal F PRAM and processor F weak and release.
August 13, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 11: Multiprocessors: Uniform Memory Access * Jeremy R. Johnson Monday,
MODERN OPERATING SYSTEMS Third Edition ANDREW S. TANENBAUM Chapter 8 Multiple Processor Systems Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall,
Multiprocessor  Use large number of processor design for workstation or PC market  Has an efficient medium for communication among the processor memory.
Distributed Computing Systems CSCI 6900/4900. Review Definition & characteristics of distributed systems Distributed system organization Design goals.
Background Computer System Architectures Computer System Software.
CMSC 611: Advanced Computer Architecture Shared Memory Most slides adapted from David Patterson. Some from Mohomed Younis.
The University of Adelaide, School of Computer Science
Multiple processor systems
Multiprocessor System Distributed System
Overview Parallel Processing Pipelining
MODERN OPERATING SYSTEMS Third Edition ANDREW S
Definition of Distributed System
12.4 Memory Organization in Multiprocessor Systems
Advanced Operating Systems
CMSC 611: Advanced Computer Architecture
Multiprocessor Introduction and Characteristics of Multiprocessor
Multiple Processor Systems
Multiple Processor Systems
Multiple Processor and Distributed Systems
High Performance Computing
Presentation transcript:

Multiple Processor Systems Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall 1

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Multiple Processor Systems Figure 8-1. (a) A shared-memory multiprocessor. (b) A message- passing multicomputer. (c) A wide area distributed system. (a) is a Uniform Memory Access (UMA) architecture, while (b) and (c) are Non-Uniform Memory Access (NUMA) architectures Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

UMA Multiprocessors with Bus-Based Architectures Figure 8-2. Three bus-based multiprocessors. (a) Without caching. (b) With caching. (c) With caching and private memories. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

UMA Multiprocessors Using Crossbar Switches Figure 8-3. The “Dance Hall” approach: (a) An 8 × 8 crossbar switch. (b) An open crosspoint. (c) A closed crosspoint. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Interconnection Technology (1) Figure 8-16. Various interconnect topologies. (a) A single switch. (b) A ring. (c) A grid. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Interconnection Technology (2) Figure 8-16. Various interconnect topologies. (d) A double torus. (e) A cube. (f) A 4D hypercube. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

UMA Multiprocessors Using Multistage Switching Networks (1) Figure 8-4. (a) A 2 × 2 switch with two input lines, A and B, and two output lines, X and Y. (b) A message format. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

UMA Multiprocessors Using Multistage Switching Networks (2) Figure 8-5. An omega switching network. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

NUMA Multiprocessors (1) Characteristics of NUMA machines: There is a single address space visible to all CPUs. Access to remote memory is via LOAD and STORE instructions. Access to remote memory is slower than access to local memory. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

NUMA Multiprocessors (2) Figure 8-6. (a) A 256-node directory-based multiprocessor. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

NUMA Multiprocessors (3) Figure 8-6. (b) Division of a 32-bit memory address into fields. (c) The directory at node 36. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Each CPU Has Its Own Operating System Figure 8-7. Partitioning multiprocessor memory among four CPUs, but sharing a single copy of the operating system code. The boxes marked Data are the operating system’s private data for each CPU. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Master-Slave Multiprocessors Figure 8-8. A master-slave multiprocessor model. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Symmetric Multiprocessors Figure 8-9. The SMP multiprocessor model. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Multiprocessor Synchronization (1) Figure 8-10. The TSL instruction can fail if the bus cannot be locked. These four steps show a sequence of events where the failure is demonstrated. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

TSL solution for multi-processors TSL involves testing and setting memory, this can require 2 memory accesses Not a problem to implement this in single-processor system Now, bus must be locked to avoid split transaction Bus provides a special line for locking A process that fails to acquire lock checks repeatedly issuing more TSL instructions Requires Exclusive access to memory block Cache coherence protocol would generate lots of traffic Goal: To reduce number of checks Exponential back-off: instead of constant polling, check only after delaying (1, 2, 4, 8 instructions) Maintain a list of processes waiting to acquire lock.

Busy-Waiting vs Process switch In single-processors, if a process is waiting to acquire lock, OS schedules another ready process OS must decide whether to switch (choice between spinning and switching) spinning wastes CPU cycles switching uses up CPU cycles also possible to make separate decision each time locked mutex encountered

Multiprocessor Synchronization (2) Figure 8-11. Use of multiple locks to avoid cache thrashing. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Timesharing Figure 8-12. Using a single data structure for scheduling a multiprocessor. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Space Sharing Figure 8-13. A set of 32 CPUs split into four partitions, with two CPUs available. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Gang Scheduling (1) Figure 8-14. Communication between two threads belonging to thread A that are running out of phase. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Gang Scheduling (2) The three parts of gang scheduling: Groups of related threads are scheduled as a unit, a gang. All members of a gang run simultaneously, on different timeshared CPUs. All gang members start and end their time slices together. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Gang Scheduling (3) Figure 8-15. Gang scheduling. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Interconnection Technology (3) Figure 8-17. Store-and-forward packet switching. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Network Interfaces Figure 8-18. Position of the network interface boards in a multicomputer. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Blocking versus Nonblocking Calls (1) Figure 8-19. (a) A blocking send call. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Blocking versus Nonblocking Calls (2) Figure 8-19. (b) A nonblocking send call. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Blocking versus Nonblocking Calls (3) Choices on the sending side: Blocking send (CPU idle during message transmission). Nonblocking send with copy (CPU time wasted for the extra copy). Nonblocking send with interrupt (makes programming difficult). Copy on write (extra copy probably needed eventually). Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall Remote Procedure Call Figure 8-20. Steps in making a remote procedure call. The stubs are shaded gray. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

RPC Mechanism client stub proc. Communication module Local return call Client computer Server computer server service procedure Receive reply Send request Unmarshal results Marshal arguments Select procedure Execute procedure

Distributed Shared Memory (1) Figure 8-21. Various layers where shared memory can be implemented. (a) The hardware. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Distributed Shared Memory (2) Figure 8-21. Various layers where shared memory can be implemented. (b) The operating system. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Distributed Shared Memory (3) Figure 8-21. Various layers where shared memory can be implemented. (c) User-level software. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Distributed Shared Memory (4) Figure 8-22. (a) Pages of the address space distributed among four machines. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Distributed Shared Memory (5) Figure 8-22. (b) Situation after CPU 1 references page 10 and the page is moved there. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Distributed Shared Memory (6) Figure 8-22. (c) Situation if page 10 is read only and replication is used. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall False Sharing Figure 8-23. False sharing of a page containing two unrelated variables. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

A Graph-Theoretic Deterministic Algorithm Figure 8-24. Two ways of allocating nine processes to three nodes. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall

A Sender-Initiated Distributed Heuristic Algorithm Figure 8-25. (a) An overloaded node looking for a lightly loaded node to hand off processes to. (b) An empty node looking for work to do. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall