Lec 6 Chap. 13 Multiprocessors

Presentation transcript:

Lec 6 Chap. 13 Multiprocessors (slide 13-1)

13-1 Characteristics of Multiprocessors
Multiprocessor system = MIMD
An interconnection of two or more CPUs with memory and I/O equipment
  » A single CPU with one or more IOPs is usually not counted as a multiprocessor system, unless the IOP has computational facilities comparable to a CPU
Computation can proceed in parallel in one of two ways
  1) Multiple independent jobs can be made to operate in parallel
  2) A single job can be partitioned into multiple parallel tasks
Classified by memory organization
  1) Shared memory, or tightly coupled system
     » Local memory + shared memory; higher degree of interaction between tasks
  2) Distributed memory, or loosely coupled system
     » Local memory + message-passing scheme (packet or message); most efficient when the interaction between tasks is minimal

13-2 Interconnection Structure
Multiprocessor system components: CPU, IOP, memory unit
Interconnection components
  1) Time-shared common bus
  2) Multi-port memory
  3) Crossbar switch
  4) Multistage switching network
  5) Hypercube system

Computer System Architecture, Chap. 13 Multiprocessors
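The second way of proceeding in parallel listed in 13-1 (partitioning a single job into multiple parallel tasks) can be sketched in a few lines. This is only an illustration, not part of the lecture: the names `partition` and `parallel_sum` are hypothetical, and a thread pool stands in for the CPUs. In CPython, threads show the structure of the partitioning rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers=4):
    """Sum each chunk as a separate task, then combine the partial sums."""
    chunks = partition(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker (standing in for a CPU) handles one chunk of the job.
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

The combining step at the end is where the tasks interact. The less of it a job needs, the better it suits a loosely coupled system.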

Time-shared Common Bus (slide 13-2)

Time-shared single common bus system
  » Only one processor can communicate with the memory or another processor at any given time; while one processor is communicating with the memory, all other processors are either busy with internal operations or must idle, waiting for the bus
Dual common bus system: tightly coupled system
  » System bus + local bus
  » Shared memory: the memory connected to the common system bus is shared by all processors
  » System bus controller: links each local bus to the common system bus
[Figure: time-shared common bus with memory unit, CPU 1 to CPU 3, IOP 1 and IOP 2]

Multiplexers and arbitration logic (slide 13-3)

Multi-port memory: multiple paths between processors and memory
  » Advantage: a high transfer rate can be achieved
  » Disadvantage: expensive memory control logic; large number of cables and connectors
Crossbar switch: memory module, I/O port
[Figure: multi-port memory organization, CPU 1 to CPU 4 connected to memory modules MM 1 to MM 4]
[Figure: crossbar switch, CPU 1 to CPU 4 connected to memory modules MM 1 to MM 4]
[Figure: block diagram of crossbar switch; data, address, and control from CPU 1 to CPU 4 enter multiplexers and arbitration logic, which drive the memory module's data, address, read/write, and memory enable lines]

Crossbar Switch (slide 13-4)

[Figure: crossbar hierarchies of clusters; labels include cluster, node, crossbar, PU, CU, network interface, I/O, and local memory]

Crossbar Switch (slide 13-5)

Multistage Switching Network (slide 13-6)

Controls the communication between a number of sources and destinations
  » Tightly coupled system: PU to MM
  » Loosely coupled system: PU to PU
Basic component of a multistage switching network: the two-input, two-output interchange switch (Fig. 13-6)
Two processors (P1 and P2) are connected through switches to 8 memory modules (000 to 111) (Fig. 13-7)
Omega network (Fig. 13-8)
  » An N-input x N-output network topology built from 2 x 2 interchange switches
[Figure 13-6: the four settings of a 2 x 2 interchange switch: A connected to 0, A connected to 1, B connected to 0, B connected to 1]
[Figure 13-7: binary tree of switches leading to memory modules 000 to 111]
[Figure 13-8: omega network with inputs 0 to 7 and outputs 000 to 111]
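How a path is set up through an omega network built from 2 x 2 interchange switches can be sketched with destination-tag routing. The sketch assumes the standard omega wiring (a perfect shuffle in front of every stage), which may differ in detail from Fig. 13-8. The function name `omega_route` is hypothetical.

```python
def omega_route(source, dest, n_bits=3):
    """Trace a path through an omega network with 2**n_bits inputs and outputs.

    Returns one (stage, switch index, output port) tuple per stage.
    Assumes the standard wiring: a perfect shuffle before each stage.
    """
    mask = (1 << n_bits) - 1
    position = source
    hops = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line address left by one bit.
        position = ((position << 1) | (position >> (n_bits - 1))) & mask
        # Destination-tag routing: the next destination bit (MSB first)
        # selects output 0 (upper) or output 1 (lower) of the 2 x 2 switch.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        position = (position & ~1) | bit
        hops.append((stage, position >> 1, bit))
    assert position == dest  # after n_bits stages the path ends at dest
    return hops


if __name__ == "__main__":
    for hop in omega_route(0b010, 0b101):
        print(hop)
```

Each stage consumes one bit of the destination address, so an N x N network needs log2(N) stages. Two requests conflict when they need different settings of the same switch in the same stage.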

Hypercube Interconnection (slide 13-7)

Hypercube interconnection (Fig. 13-9: one-cube, two-cube, three-cube)
  » Loosely coupled system
  » Hypercube architecture: Intel iPSC (n = 7, 128 nodes; an n-cube has 2^n nodes)
[Figure 13-9: hypercube structures for n = 1, 2, 3, with nodes labeled by binary addresses 00 to 11 and 000 to 111]

13-3 Interprocessor Arbitration: Bus Control
Single bus system: address bus, data bus, control bus
Multiple bus system: memory bus, I/O bus, system bus
System bus: the bus that connects CPUs, IOPs, and memory in a multiprocessor system (bus controller/arbiter)
Data transfer methods over the system bus
  » Synchronous bus: transfer is achieved by driving both units from a common clock source
  » Asynchronous bus: transfer is accompanied by handshaking control signals

© Korea Univ. of Tech. & Edu., Dept. of Info. & Comm.
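Returning to the hypercube interconnection on this slide: in an n-cube, two nodes are directly linked when their binary addresses differ in exactly one bit, so a message can be routed by correcting the differing bits one at a time. The following is a minimal sketch of that idea. The names `neighbors` and `route` are hypothetical, and the low-to-high bit order is an arbitrary choice.

```python
def neighbors(node, n):
    """Addresses of the n nodes directly linked to `node` in an n-cube."""
    return [node ^ (1 << k) for k in range(n)]


def route(source, dest, n):
    """Path from source to dest, fixing one differing address bit per hop."""
    path = [source]
    current = source
    for k in range(n):
        if (current ^ dest) & (1 << k):  # bit k still differs from dest
            current ^= 1 << k            # cross the link along dimension k
            path.append(current)
    return path


if __name__ == "__main__":
    print(neighbors(0b000, 3))
    print([format(x, "03b") for x in route(0b010, 0b111, 3)])
    print(2 ** 7)  # node count of a 7-cube
```

The number of hops equals the number of differing bits, so no route in an n-cube is longer than n links.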

System Bus: IEEE Standard 796 MultiBus (slide 13-8)

86 signal lines
  » Bus arbitration: BREQ, BUSY, …
Bus arbitration algorithms: static / dynamic
Static: fixed priority
  » Serial (daisy-chain) arbitration: uses a Bus Busy line; if this line is inactive, no other processor is using the bus
  » Parallel arbitration: Fig. 13-11
Dynamic: flexible priority
  » Time slice (fixed-length time)
  » Polling
  » LRU
  » FIFO
  » Rotating daisy-chain
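The difference between a static and a dynamic arbitration algorithm can be illustrated by comparing a fixed-priority daisy chain with a rotating daisy chain. The sketch models only the grant decision, not the MultiBus signals. The function names are hypothetical, and index 0 is assumed to be the unit closest to the start of the chain.

```python
def daisy_chain_grant(requests):
    """Static priority: the requesting unit closest to the chain head wins."""
    for unit, requesting in enumerate(requests):
        if requesting:
            return unit
    return None  # no unit is requesting the bus


def rotating_grant(requests, last_granted):
    """Dynamic priority: the chain is a ring and the search starts just
    after the unit that held the bus last, which now has lowest priority."""
    n = len(requests)
    for offset in range(1, n + 1):
        unit = (last_granted + offset) % n
        if requests[unit]:
            return unit
    return None


if __name__ == "__main__":
    requests = [True, True, False, True]
    print(daisy_chain_grant(requests))  # unit 0 wins every time
    last = 0
    for _ in range(3):
        last = rotating_grant(requests, last)
        print(last)
```

With the fixed chain, a unit far from the head can be starved while units nearer the head keep requesting. The rotating scheme gives every requesting unit a turn.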

Interprocessor Communication (slide 13-9)

13-4 Interprocessor Communication & Synchronization
Interprocessor communication
  Shared memory: tightly coupled system
    » Accessible to all processors: common memory
    » Acts as a message center, similar to a mailbox
  No shared memory: loosely coupled system
    » Message passing through I/O channel communication
Interprocessor synchronization
  Enforces the correct sequence of processes and ensures mutually exclusive access to shared writable data
  Mutual exclusion
    » Protects data from being changed simultaneously by two or more processors
  Mutual exclusion with a semaphore
    » Critical section: once begun, it must complete execution before another processor accesses the shared resource
    » Semaphore: indicates whether or not a processor is executing a critical section
    » Hardware lock: a processor-generated signal that prevents other processors from using the system bus
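Mutual exclusion around a critical section can be shown with threads standing in for processors. This is a sketch under stated assumptions: Python's `threading.Lock` plays the role of the binary semaphore, and the names `SharedCounter` and `run` are hypothetical.

```python
import threading


class SharedCounter:
    """Shared writable data guarded by a binary semaphore (a lock)."""

    def __init__(self):
        self.value = 0
        self.semaphore = threading.Lock()

    def increment(self):
        with self.semaphore:  # enter the critical section
            current = self.value
            self.value = current + 1
        # leaving the `with` block releases the semaphore


def run(threads=4, rounds=10_000):
    counter = SharedCounter()

    def work():
        for _ in range(rounds):
            counter.increment()

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

Because the read and the write of `value` happen inside one critical section, no update is lost and the final count equals threads times rounds.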

Semaphore (slide 13-10)

Semaphore in shared memory
  1) TSL SEM (Test and Set while Locked)
     » Hardware lock held for the duration of the instruction
     » Takes 2 memory cycles:
         R ← M[SEM]    : test semaphore
         M[SEM] ← 1    : set semaphore
  2) R = 0: shared memory is available
     R = 1: the processor cannot access shared memory (the semaphore was already set)

13-5 Cache Coherence
Conditions for incoherence: Fig. 13-12, 13-13
  Multiprocessor system with private caches
    » Write-through: the caches of P2 and P3 become incoherent
    » Write-back: the caches of P2 and P3 and main memory become incoherent
[Figure (a) With write-through cache policy: main memory X = 120; caches X = 120 (P1), X = 52 (P2), X = 52 (P3)]
[Figure (b) With write-back cache policy: main memory X = 52; caches X = 120 (P1), X = 52 (P2), X = 52 (P3)]
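The TSL SEM instruction can be simulated in software to show why the test and the set must be indivisible. In this sketch an internal lock stands in for the hardware lock that holds the bus across both memory cycles. The class `SharedMemory` and the function `run_workers` are hypothetical names, and threads stand in for processors.

```python
import threading
import time


class SharedMemory:
    def __init__(self):
        self.sem = 0                        # M[SEM]: 0 = free, 1 = in use
        self._bus_lock = threading.Lock()   # stands in for the hardware lock

    def test_and_set(self):
        with self._bus_lock:   # both memory cycles happen without interruption
            r = self.sem       # R <- M[SEM]   (test semaphore)
            self.sem = 1       # M[SEM] <- 1   (set semaphore)
        return r

    def release(self):
        self.sem = 0           # a single write clears the semaphore


def run_workers(threads=3, rounds=500):
    mem = SharedMemory()
    total = [0]

    def work():
        for _ in range(rounds):
            while mem.test_and_set() == 1:  # R = 1: busy, try again
                time.sleep(0)               # give other threads a chance to run
            total[0] += 1                   # critical section
            mem.release()

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return total[0]


if __name__ == "__main__":
    print(run_workers())
```

If the test and the set were separate, interruptible steps, two processors could both read R = 0 and enter the critical section together.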

Solution to the Cache Coherence Problem (slide 13-11)

Software
  » 1) Shared writable data are non-cacheable
  » 2) Writable data exists in one cache: centralized global table
Hardware
  » 1) Monitor possible write operations: snoopy cache controller
References
  » “Synchronization, coherence, and event ordering in multiprocessors,” IEEE Computer, Feb. 1988
  » “A survey of cache coherence schemes for multiprocessors,” IEEE Computer, June 1990

Snoopy Cache Controller (slide 13-12)

A snoopy cache is a type of memory cache that performs bus sniffing. Such caches are used in systems where many processors or computers share the same memory and each has its own cache. In such a system, processor A may read a value from memory, and then processor B does the same. If either processor now changes the value by writing to memory, the other processor's cached copy becomes stale.

To prevent this and maintain cache coherence, snoopy caches monitor ("snoop on") the memory bus to detect any writes to values that they are holding, including changes coming from other processors or distributed computers.
  » Watches the bus for write operations to the shared memory
  » Invalidates its cache entry if the address of a write appears on the bus
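The write-through incoherence from the cache coherence slide and the snoopy fix can be reproduced with a small simulation, using the same values (X = 52, then P1 writes X = 120). This is a sketch of a write-through, write-invalidate scheme only. The classes `Bus` and `SnoopyCache` and the function `demo` are hypothetical names, not taken from the lecture.

```python
class Bus:
    """Shared bus in front of main memory; every write is visible on it."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def write(self, writer, address, value):
        self.memory[address] = value        # write-through to main memory
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(address)  # other caches see the address


class SnoopyCache:
    def __init__(self, bus, snooping=True):
        self.bus = bus
        self.lines = {}
        self.snooping = snooping
        bus.caches.append(self)

    def read(self, address):
        if address not in self.lines:       # miss: fetch from main memory
            self.lines[address] = self.bus.memory[address]
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.bus.write(self, address, value)

    def snoop_write(self, address):
        if self.snooping:
            self.lines.pop(address, None)   # invalidate the stale copy


def demo(snooping):
    bus = Bus({"X": 52})
    p1, p2, p3 = (SnoopyCache(bus, snooping) for _ in range(3))
    for p in (p1, p2, p3):
        p.read("X")                         # all three caches load X = 52
    p1.write("X", 120)
    return [p.read("X") for p in (p1, p2, p3)], bus.memory["X"]


if __name__ == "__main__":
    print(demo(snooping=False))
    print(demo(snooping=True))
```

Without snooping, P2 and P3 keep reading their old copy of X even though main memory is up to date. With snooping, the invalidation forces a miss and a fresh read.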

SMP, MPP, Cluster, Constellation (slide 13-13)

SMP
  » Single OS, shared memory
  » CPUs reach memory through a memory interconnect
  » OpenMP API: http://www.openmp.org/
MPP
  » Multiple OS, distributed memory
  » Processor interconnect
  » MPI API: http://www.mpi-forum.org/
Cluster
  » Cluster of IA32 (1 or 2 CPU) nodes
  » Node interconnect
Constellation
  » Cluster of SMP nodes
  » Memory interconnect inside each node, node interconnect between nodes
Clusters in top500.org
  » “Simple” cluster: 1 processor in each node
  » Cluster of small SMPs: small number of processors per node
  » Constellations: large number of processors per node

Parallel Machine Code (slide 13-14)

www.top500.org (slide 13-15)

MPP (Massively Parallel Processors)
  » Loosely coupled system; clusters are “rising”
Clusters
  » “Simple” cluster (1 processor in each node)
  » Cluster of small SMPs (small number of processors per node)
  » Constellations (large number of processors per node)
Older architectures
  » SIMD (Single Instruction, Multiple Data)
  » Vector processors (old Cray machines)

www.top500.org (slide 13-16)

Beowulf Clusters (slide 13-17)

  » http://www.beowulf.org
  » http://www.scyld.com
  » http://linuxhpc.org

Cloud Computing (slide 13-18)

Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand. The term "cloud" is used as a metaphor for the Internet, based on the cloud drawing used in the past to represent the telephone network, and later to depict the Internet in computer network diagrams. (Source: Wikipedia)