Princess Sumaya Univ. Computer Engineering Dept. Chapter 7: Parallelism

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 1 / 9
Parallelism
• Uniprocessor vs. Multiprocessors
    ● Process per Processor
• Process-Level Parallelism
    ● Parallel Processing Program (Multithreading)
• Multicore vs. Cluster
    ● Single Chip vs. LAN Interconnect

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 2 / 9
Parallel Processing Program
• Amdahl’s Law
    Execution Time After Improvement = (Execution Time Affected / Amount of Improvement) + Execution Time Unaffected
• Exercise: To achieve a speedup of 90 with 100 processors, what percentage of the original computation can be sequential?
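
A minimal Python sketch of the Amdahl’s Law calculation behind this exercise; the helper name max_sequential_fraction is illustrative, not from the slides:

    # Amdahl's Law: speedup = 1 / ((1 - p) + p / n), where p is the
    # parallelizable fraction and n is the number of processors.
    def max_sequential_fraction(target_speedup, n_processors):
        """Largest sequential fraction (1 - p) that still permits the target speedup."""
        p = (1 - 1 / target_speedup) / (1 - 1 / n_processors)
        return 1 - p

    print(f"{max_sequential_fraction(90, 100):.4%}")   # about 0.11% may be sequential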

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 3 / 9
Scaling
• Strong Scaling
    Speedup achieved on a multiprocessor without increasing the size of the problem.
• Exercise: Consider the sum of 10 scalars (10 sequential additions, T_add) and the sum of two 10 × 10 matrices (100 fully parallel additions). What are the speedups for 10 and 100 processors?
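
A short sketch of this calculation, assuming the 10 scalar additions stay on one processor while the 100 matrix additions divide evenly (times are in units of T_add; the helper name is illustrative):

    def speedup(seq_adds, par_adds, n_processors):
        """Speedup when only the parallel additions are divided across processors."""
        time_1 = seq_adds + par_adds                  # single processor
        time_n = seq_adds + par_adds / n_processors   # n processors
        return time_1 / time_n

    print(speedup(10, 100, 10))    # 110 / 20 = 5.5
    print(speedup(10, 100, 100))   # 110 / 11 = 10.0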

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 4 / 9
Scaling
• Weak Scaling
    Speedup achieved on a multiprocessor while increasing the size of the problem in proportion to the increase in the number of processors.
• Exercise: Consider the sum of 10 scalars (10 sequential additions, T_add) and the sum of two 100 × 100 matrices (10,000 fully parallel additions). What are the speedups for 10 and 100 processors?
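
The same style of calculation on the scaled-up problem (a sketch in units of T_add, assuming the 10,000 additions divide evenly):

    time_1   = 10 + 10_000          # one processor
    time_10  = 10 + 10_000 / 10     # 10 processors
    time_100 = 10 + 10_000 / 100    # 100 processors
    print(round(time_1 / time_10, 1), round(time_1 / time_100, 1))   # 9.9 91.0

Because the problem grows with the machine, the parallel work stays dominant and the 100-processor speedup remains close to the ideal.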

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 5 / 9
Load Balance
• Non-Ideal Balance
    Processors do not get an equal amount of work.
• Exercise: Consider 10 sequential additions and 10,000 parallel additions on 100 processors. What is the speedup when one processor has 2% of the load instead of 1%? What about 5% of the load?
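
A sketch of the effect, assuming the most heavily loaded processor sets the parallel time and the remaining work spreads evenly over the other 99 processors (helper name is illustrative):

    def speedup_unbalanced(seq_adds, par_adds, n_procs, worst_share):
        """Speedup when one processor carries worst_share of the parallel work."""
        time_1 = seq_adds + par_adds
        worst = par_adds * worst_share
        rest = (par_adds - worst) / (n_procs - 1)
        time_n = seq_adds + max(worst, rest)          # slowest processor dominates
        return time_1 / time_n

    print(round(speedup_unbalanced(10, 10_000, 100, 0.01)))   # 91 (perfect balance)
    print(round(speedup_unbalanced(10, 10_000, 100, 0.02)))   # 48
    print(round(speedup_unbalanced(10, 10_000, 100, 0.05)))   # 20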

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 6 / 9
Shared Memory Multiprocessors (SMP)
• Single Physical Address Space
    ● Uniform Memory Access (UMA)
    ● Nonuniform Memory Access (NUMA)
    ● Synchronization (Lock)
[Slide diagram: several processors, each with its own cache, connected through an interconnection network to a shared main memory and an I/O controller.]
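
A minimal software sketch of the shared-address-space model with lock-based synchronization, here using Python threads purely for illustration (a real SMP workload would use a lower-level threading model for CPU-bound work):

    import threading

    counter = 0                      # shared variable in the single address space
    lock = threading.Lock()          # the synchronization primitive from the slide

    def worker(n_increments):
        global counter
        for _ in range(n_increments):
            with lock:               # without the lock, concurrent updates could be lost
                counter += 1

    threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)                   # 40000: the lock keeps the shared update correct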

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 7 / 9
Message-Passing Multiprocessors
• Private Physical Address Space
    ● Send-Message & Receive-Message Routines
[Slide diagram: several nodes, each with a processor, cache, and private main memory, connected to one another and to an I/O controller through an interconnection network.]
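
A sketch of the message-passing style using Python processes, which, like the nodes on the slide, share nothing and cooperate only through explicit send and receive calls; the partial-sum workload is illustrative:

    from multiprocessing import Pipe, Process

    def worker(conn, data):
        # Private address space: the only way to return a result is to send it.
        conn.send(sum(data))                        # send-message routine
        conn.close()

    if __name__ == "__main__":
        data = list(range(100))
        parent_a, child_a = Pipe()
        parent_b, child_b = Pipe()
        p1 = Process(target=worker, args=(child_a, data[:50]))
        p2 = Process(target=worker, args=(child_b, data[50:]))
        p1.start(); p2.start()
        total = parent_a.recv() + parent_b.recv()   # receive-message routines
        p1.join(); p2.join()
        print(total)                                # 4950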

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 8 / 9
Multithreading
• Hardware Multithreading
    ● Sharing the Processor’s Functional Units Among Threads (switch state from one thread to another when one stalls)
• Fine-Grained Multithreading
    ● Switching State After Every Instruction
• Coarse-Grained Multithreading
    ● Switching State After a Cache Miss
• Simultaneous Multithreading (SMT)
    ● Multiple-Issue, Dynamically Scheduled Processor (exploits thread-level & instruction-level parallelism)
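
Purely as a software analogy (not from the slides), a toy issue trace can contrast the two switching granularities; fine_grained and coarse_grained below are illustrative helpers, with a stall modeled as occurring after a fixed number of instructions:

    def fine_grained(per_thread_instrs):
        """Fine-grained: rotate to the next thread every cycle."""
        remaining, trace, cycle = list(per_thread_instrs), [], 0
        while sum(remaining) > 0:
            t = cycle % len(remaining)
            if remaining[t] > 0:
                trace.append(f"T{t}")
                remaining[t] -= 1
            cycle += 1
        return trace

    def coarse_grained(per_thread_instrs, stall_every):
        """Coarse-grained: run one thread until it stalls, then switch."""
        remaining, trace, t = list(per_thread_instrs), [], 0
        while sum(remaining) > 0:
            if remaining[t] > 0:
                run = min(stall_every, remaining[t])
                trace.extend([f"T{t}"] * run)
                remaining[t] -= run
            t = (t + 1) % len(remaining)             # switch at the stall
        return trace

    print(fine_grained([3, 3]))          # ['T0', 'T1', 'T0', 'T1', 'T0', 'T1']
    print(coarse_grained([3, 3], 2))     # ['T0', 'T0', 'T1', 'T1', 'T0', 'T1']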

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. 9 / 9
SISD, MIMD, SIMD, SPMD
• Single-Instruction Single-Data (SISD)
    ● Uniprocessor
• Multiple-Instruction Multiple-Data (MIMD)
    ● Multiprocessor
• Single-Instruction Multiple-Data (SIMD)
    ● Vector/Array Processor (Data-Level Parallelism)
• Single-Program Multiple-Data (SPMD)
    ● One Program Whose Different Code Sections Execute in Parallel on an MIMD Machine
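
A small software illustration of the SISD vs. SIMD distinction, assuming NumPy is available: the explicit loop applies one instruction to one data element at a time, while the vectorized add applies a single operation to many data elements at once:

    import numpy as np

    a = np.arange(8)
    b = np.arange(8)

    # SISD style: one instruction, one data element per step.
    c_scalar = np.empty_like(a)
    for i in range(len(a)):
        c_scalar[i] = a[i] + b[i]

    # SIMD style: a single add applied across the whole array.
    c_vector = a + b

    print(np.array_equal(c_scalar, c_vector))   # True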

Princess Sumaya University – Computer Arch. & Org (2) Computer Engineering Dept. Chapter 7