Copyright © 2011 Curt Hill MIMD Multiple Instructions Multiple Data

Multiprocessors
Tightly coupled CPUs
–Shared memory system
–Share the virtual address space as well
Any threaded program operates in this mode
Communication is usually through memory
Example:
–A word processor accepting typing in one thread
–Checking spelling in another
Copyright © 2011 Curt Hill
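
A minimal sketch (not from the original slides) of the word-processor example: two POSIX threads of one process communicating through shared memory. The buffer and function names are made up for illustration.

/* Two threads share the process address space; the "typing" thread
   writes text into a buffer and the spell-check thread reads it. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static char buffer[128];                        /* memory shared by both threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *spell_checker(void *arg)
{
    pthread_mutex_lock(&lock);
    printf("checking spelling of: %s\n", buffer);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_mutex_lock(&lock);
    strcpy(buffer, "helo wrold");               /* the typing thread's output */
    pthread_mutex_unlock(&lock);

    pthread_create(&t, NULL, spell_checker, NULL);
    pthread_join(t, NULL);
    return 0;
}

Compile with -pthread; communication happens entirely through the shared buffer, as the slide describes.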

Interconnections
If the number of processors is small, a single bus is the interconnection device
–Small typically means at most a few dozen CPUs
When the system is no longer small, contention for the bus reduces performance
For larger sizes there must be separate buses and interconnections between them
Copyright © 2011 Curt Hill

Cache Complications
Even if the memory is truly shared, such as on a bus, caches cause problems
The problem is that two CPUs have two different caches, and their cache lines may disagree for the same memory location
–Easy to have happen with a write-back policy
–Not that hard even with write-through
Copyright © 2011 Curt Hill

Cache Coherence
Coherent caches disallow different values in different cache lines for the same memory location
There are several cache coherence protocols, but the results should be similar
–The caches agree with each other when they hold the same memory location
One such protocol is called MESI
Copyright © 2011 Curt Hill
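
A simplified sketch, not the full protocol: the four MESI states and two representative transitions. Real controllers handle many more bus events (invalidations, write-backs, read-for-ownership); the function names here are illustrative only.

typedef enum { MODIFIED, EXCLUSIVE, SHARED, INVALID } mesi_state;

/* Another cache announces a read of this line on the bus. */
mesi_state on_bus_read(mesi_state s)
{
    /* A Modified or Exclusive copy drops to Shared (a Modified line
       would first be written back to memory); Invalid stays Invalid. */
    return (s == INVALID) ? INVALID : SHARED;
}

/* The local CPU writes the line. */
mesi_state on_local_write(mesi_state s)
{
    /* Any copy ends up Modified; a Shared or Invalid line must first
       broadcast an invalidate so no other cache keeps a stale copy. */
    (void)s;
    return MODIFIED;
}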

Multiple Memories
With multiple bus interconnections each CPU will have its own memory
This memory is to be shared with all the others
The memories are patched together to provide the illusion that they are just one
This can be done with a variety of switches
–Crossbar or multistage
Copyright © 2011 Curt Hill
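
A hypothetical sketch of how the illusion of one memory can be created: low-order interleaving spreads consecutive cache lines across the per-CPU memory modules, and the switch routes each address to the right module. The module count and line size are assumed values.

#include <stdint.h>

#define NUM_MODULES 8                 /* assumed number of memory modules */
#define LINE_BYTES  64                /* assumed cache-line size          */

typedef struct { unsigned module; uint64_t offset; } location;

/* Map a flat global address to (module, offset within module). */
location route(uint64_t addr)
{
    uint64_t line = addr / LINE_BYTES;
    location loc = {
        (unsigned)(line % NUM_MODULES),
        (line / NUM_MODULES) * LINE_BYTES + addr % LINE_BYTES
    };
    return loc;
}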

Memory Consistency Types
Strict – no caching; every read sees the most recent write to that location
Sequential – all processors observe all writes in the same order
Processor – writes from any one processor are seen by all in the order issued
Weak – does not guarantee that individual writes are seen in the same order; ordering is only enforced at synchronization points
Release – weak consistency refined with separate acquire and release synchronization operations
Copyright © 2011 Curt Hill
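
A sketch with C11 atomics of why the model matters. The message-passing idiom below is correct with the default sequentially consistent operations, but if both the flag store and flag load were relaxed, a weakly ordered machine could let the consumer see the flag before the data.

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static int data;
static atomic_int ready;

static void *producer(void *arg)
{
    data = 42;                              /* ordinary write                    */
    atomic_store(&ready, 1);                /* seq_cst store: orders the write   */
    return NULL;
}

static void *consumer(void *arg)
{
    while (atomic_load(&ready) == 0)        /* seq_cst load: spin until flagged  */
        ;
    printf("data = %d\n", data);            /* guaranteed to print 42            */
    return NULL;
}

int main(void)
{
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}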

Memory Hierarchies
There is a limit to how many CPUs may be added using the various switches
Hardware costs increase as well
Non-Uniform Memory Access (NUMA) architectures have quicker access to local memory and slower access to non-local memory
Cache-coherent NUMA machines complicate things even more
Copyright © 2011 Curt Hill
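
A hedged sketch using Linux's libnuma (link with -lnuma), assuming a NUMA system with at least one node: memory is placed on a specific node so the CPUs on that node get the faster local accesses the slide describes.

#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "not a NUMA system\n");
        return 1;
    }
    size_t bytes = 1 << 20;
    void *buf = numa_alloc_onnode(bytes, 0);   /* place the buffer on node 0 */
    if (buf == NULL)
        return 1;
    /* ... work on buf from CPUs attached to node 0 for local-speed access ... */
    numa_free(buf, bytes);
    return 0;
}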

Multicomputers
Often loosely coupled CPUs
–Always has some private memory
–May have shared memory via a portion of their virtual memory space
–Distributed memory system
Must pass messages via high-speed networking
Allows larger scaling than multiprocessors
Copyright © 2011 Curt Hill
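
A sketch of the message passing the slide mentions, using MPI, the library commonly used on multicomputers and clusters: the value lives in node 0's private memory and is copied to node 1 over the interconnect.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                                    /* in node 0's private memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("node 1 received %d\n", value);         /* copied over the network    */
    }

    MPI_Finalize();
    return 0;
}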

MPP
Massively Parallel Processors
Typically commodity CPUs, such as Pentiums
–Hundreds or thousands of them
Very high-speed interconnection network
–Low latency, high bandwidth message passing
Very large I/O capabilities
–Massive computing needs massive data
Copyright © 2011 Curt Hill

Current King
As of 11/2011 Japan's K computer was at the top of the supercomputer list
Contains 705,024 SPARC64 cores
Reaches 10 petaflops
–10 quadrillion floating point operations per second
Copyright © 2011 Curt Hill

Cray XT5
A Cray XT5 at Oak Ridge yielded 1.7 petaflops
–Largest US supercomputer
It was the king in 2009
Uses 37,333 AMD Opterons, each with 6 cores
Copyright © 2011 Curt Hill

Tianhe-1A
Currently second on the Top 500 list
14,336 Xeon X5670 processors and 7,168 Nvidia Tesla M2050 GPUs
Reaches 2.57 petaflops
Copyright © 2011 Curt Hill

Cluster Computing
Most cluster computing schemes use ordinary workstations
They communicate over a conventional LAN/WAN
This is the software alternative to the multicomputer
A Beowulf cluster is an example
Copyright © 2011 Curt Hill

Grid Computing
Similar to a cluster, but the running software does not dominate the machine
Typically there is a central server that assigns tasks and receives results
When a workstation is idle the grid program consumes the idle cycles
Supercomputing for those with no budget
Copyright © 2011 Curt Hill

Examples
BOINC – Berkeley Open Infrastructure for Network Computing
–Client for a variety of different projects
–5.6 petaflops
Folding@home – protein folding
–5 petaflops
SETI@home – Search for Extraterrestrial Intelligence
–730 teraflops
Copyright © 2011 Curt Hill