
Multiprocessors So far, we have spoken at length about microprocessors. We will now study multiprocessors: how they work, and what specific problems arise.

The types of multiprocessor Processors can be organized in different ways, classically described by Flynn's taxonomy: A single sequence of instructions on a single data stream (single-instruction, single-data, or SISD): the conventional uniprocessor; Multiple, independent sequences of instructions in multiple contexts (multiple-instruction, multiple-data, or MIMD); A single sequence of instructions applied in multiple contexts (single-instruction, multiple-data, or SIMD, often used in vector processing); Multiple sequences of instructions in a single context (multiple-instruction, single-data, or MISD, used for redundancy in fail-safe systems and sometimes invoked to describe pipelined processors or hyper-threading).
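The SIMD/MIMD distinction above can be illustrated with a toy Python sketch (the names and data here are ours, purely for illustration): SIMD applies one operation to every element of a data set, while MIMD runs independent instruction streams, each on its own data.

```python
# Toy illustration of two Flynn categories (illustrative only).

data = [1, 2, 3, 4]

# SIMD-style: one instruction (doubling) applied to every data element.
simd_result = [x * 2 for x in data]

# MIMD-style: independent "instruction streams", each with its own data.
programs = [(lambda x: x + 1, 10), (lambda x: x * x, 5)]
mimd_result = [f(x) for f, x in programs]

print(simd_result)  # [2, 4, 6, 8]
print(mimd_result)  # [11, 25]
```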

Multiprocessor types used The first multiprocessors were of the SIMD kind, and this architecture is still used for certain specialized machines. MIMD seems to be the architecture of choice for today's general-purpose computers: MIMD machines are flexible: they can be used as single-user machines or as multi-programmed machines. MIMD machines can be built from existing processors.

At the center of MIMD machines: memory We can classify MIMD machines into two classes depending on the number of processors in the machine. Ultimately, it is the memory organization that is affected: - Centralized shared memory - Distributed memory

Centralized shared memory

Centralized shared memory is used by machines with at most a few dozen processors (circa 1995). A bus connects the processors and the memory, with the help of local caches. We call this type of memory structure Uniform Memory Access (UMA).
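The shared-memory model can be sketched in software with Python threads (an analogy, not hardware: all threads see one address space, much as UMA processors share one memory, and the lock plays the role of the hardware mechanisms that keep concurrent updates consistent):

```python
import threading

# One shared "memory" visible to all workers, as in a UMA machine.
counter = 0
lock = threading.Lock()

def worker(n):
    """Each 'processor' updates the same shared variable."""
    global counter
    for _ in range(n):
        with lock:  # serialize updates so none are lost
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000
```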

Distributed memory

Distributed memory is used in machines with "a lot" of processors, which would require too much bandwidth from a single memory. Advantages of distributed memory: it is easier to increase memory bandwidth when most memory accesses are local, and latency is also improved when using the local memory.

Distributed memory models There are two distributed-memory models: A single address space, accessible by all processors but distributed among them. Such a system is said to be Non-Uniform Memory Access (NUMA), because the memory access time depends on the location of the area being addressed (local or remote). Private address spaces, where each processor has exclusive access to its local memory. These systems are sometimes called multi-computers.

Distributed memory models The two models imply different communication modes: Message passing: processors communicate via explicit messages; processors have private memories; this focuses attention on costly non-local operations. Shared memory: processors communicate by memory reads and writes; easy on small-scale machines; lower latency; SMP or NUMA. Shared memory is the kind we will focus on today.

Advantages and disadvantages of the communication mechanisms Shared memory: a well-known mechanism; easy to program (and easy to build compilers for); better use of bandwidth (memory protection at the hardware level, not at the operating-system level); possibility of using caching techniques. Message passing: simpler hardware; communication is explicit, requiring the intervention of the programmer.

Types of Shared-Memory Architectures UMA: Uniform Memory Access. Access to all memory occurs at the same speed for all processors. We will focus on UMA today. NUMA: Non-Uniform Memory Access. The interconnection is typically a grid or hypercube. Access to some parts of memory is faster for some processors than to other parts. Harder to program, but scales to more processors.

Shared Memory Multiprocessors Memory is shared either globally or locally, or a combination of the two.

Shared Memory Access Uniform Memory Access (UMA) systems use a shared memory pool, where all memory takes the same amount of time to access. This quickly becomes expensive as more processors are added.

Shared Memory Access Non-Uniform Memory Access (NUMA) systems have memory distributed across all the processors, and it takes less time for a processor to read from its own local memory than from non-local memory. They are prone to cache coherence problems, which occur when a local cache is not in sync with non-local caches holding the same data. Dealing with these problems requires extra mechanisms to ensure coherence.
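The coherence mechanisms mentioned above can be sketched with a toy write-invalidate scheme in Python (a deliberate simplification: each cache holds a single value, and writes go straight through to memory; real protocols such as MESI track several states per line):

```python
class Cache:
    """A one-entry cache that snoops writes made by other caches."""

    def __init__(self, memory):
        self.memory = memory
        self.value = None
        self.valid = False

    def read(self):
        if not self.valid:  # miss: fetch the current value from memory
            self.value = self.memory['x']
            self.valid = True
        return self.value

    def write(self, value, caches):
        self.memory['x'] = value  # write-through to shared memory
        self.value = value
        self.valid = True
        for c in caches:          # snoop: invalidate all other copies
            if c is not self:
                c.valid = False

memory = {'x': 0}
c0, c1 = Cache(memory), Cache(memory)
caches = [c0, c1]

c1.read()                 # c1 caches x = 0
c0.write(42, caches)      # c0 writes; c1's stale copy is invalidated
print(c1.read())          # 42, not the stale 0
```

Without the invalidation loop in `write`, the final read would return the stale 0, which is exactly the coherence problem the slide describes.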
