Lecture 5 Approaches to Concurrency: The Multiprocessor

Prepared 7/28/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.

Outline
- Some basics
  - Granularity
  - Amdahl's Law
  - Metrics (speedup, efficiency, redundancy)
- Multi-core (one thread spans an engine, multiple engines)
- Tightly coupled vs. loosely coupled
- Interconnect
- Cache coherency
- Memory consistency
- Examples
  - Early: Cm*, HEP, Cosmic Cube
  - Recent: Cell, POWER, Niagara, Larrabee
  - Tomorrow: ACMP

Granularity of Concurrency
- Intra-instruction (pipelining)
- Parallel instructions (SIMD, VLIW)
- Tightly-coupled MP
- Loosely-coupled MP

Metrics
- Speedup: S(p) = T(1) / T(p), where T(p) is the execution time on p processors
- Efficiency: E(p) = S(p) / p
- Utilization: U(p) = O(p) / (p · T(p)), where O(p) is the total number of unit operations performed by p processors
- Redundancy: R(p) = O(p) / O(1)
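Amdahl's Law from the outline ties these metrics together: if only a fraction f of a program can be parallelized, the speedup on p processors is bounded by 1 / ((1 - f) + f/p). A minimal sketch in Python (function names are illustrative, not from the slides):

```python
def amdahl_speedup(f_parallel, p):
    """Amdahl's Law: speedup when a fraction f_parallel of the work
    runs on p processors and the rest stays serial."""
    return 1.0 / ((1.0 - f_parallel) + f_parallel / p)

def efficiency(speedup, p):
    """Efficiency: speedup normalized by processor count."""
    return speedup / p

# Even 95%-parallel code tops out well below linear speedup:
s = amdahl_speedup(0.95, 16)   # about 9.14x on 16 processors
e = efficiency(s, 16)          # about 0.57 -- processors idle 43% of the time
```

Note the asymptote: as p grows without bound, the speedup approaches 1 / (1 - f), so the serial 5% caps this program at 20x no matter how many processors are added.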

Outline Some basics Granularity Amdahl’s Law Metrics (speedup, efficiency, redundancy) Multi-core (one thread spans an engine, multiple engines) Tightly coupled vs. Loosely coupled Interconnect Cache Coherency Memory Consistency Examples Early: cm*, HEP, cosmic cube Recent: cell, power, niagara, larrabee Tomorrow: ACMP

Tightly-coupled vs. Loosely-coupled
- Tightly coupled (i.e., multiprocessor)
  - Shared memory
  - Each processor capable of doing work on its own
  - Easier for the software
  - Hardware has to worry about cache coherency and memory contention
- Loosely coupled (i.e., multicomputer network)
  - Message passing
  - Easier for the hardware
  - Programmer's job is tougher
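The two styles can be contrasted in a few lines of Python, whose multiprocessing module supports both: a Queue for message passing (loosely-coupled style) and a shared Value guarded by a lock (tightly-coupled style). This is a sketch of the programming-model difference only, not of the hardware; all names here are illustrative:

```python
from multiprocessing import Process, Queue, Value

def mp_worker(inbox, outbox):
    """Loosely-coupled style: communicate only by sending messages."""
    x = inbox.get()
    outbox.put(x * x)

def sm_worker(shared, n):
    """Tightly-coupled style: update a shared memory location under a lock."""
    with shared.get_lock():
        shared.value += n

if __name__ == "__main__":
    # Message passing: no shared state, data is copied through queues.
    inbox, outbox = Queue(), Queue()
    p = Process(target=mp_worker, args=(inbox, outbox))
    p.start()
    inbox.put(7)
    print(outbox.get())  # 49
    p.join()

    # Shared memory: all workers race on one location; the lock
    # plays the role the coherency/contention hardware plays in a real MP.
    total = Value("i", 0)
    workers = [Process(target=sm_worker, args=(total, i)) for i in range(1, 5)]
    for w in workers: w.start()
    for w in workers: w.join()
    print(total.value)  # 10
```

The slide's trade-off shows up directly: the shared-memory version reads like ordinary sequential code (easier for the programmer) but needs synchronization, while the message-passing version forces the programmer to structure all communication explicitly.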

Interconnection networks: cost, latency, contention
Cache coherency: snoopy, directory
Memory consistency: sequential consistency and mutual exclusion
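Why sequential consistency matters for mutual exclusion is usually shown with the classic store-buffering litmus test: two threads each store to one variable and then load the other. Under sequential consistency the outcome r1 = r2 = 0 is impossible, and Dekker-style software locks depend on exactly that guarantee (weaker hardware models with store buffers can produce it). A small Python sketch, my own illustration, that enumerates every SC interleaving:

```python
from itertools import permutations

# Store-buffering litmus test (x and y both start at 0):
#   P0: x = 1; r1 = y          P1: y = 1; r2 = x
P0 = [("store", "x"), ("load", "y", "r1")]
P1 = [("store", "y"), ("load", "x", "r2")]

def sc_outcomes():
    """Run every interleaving of the four operations that preserves each
    thread's program order, and collect the resulting (r1, r2) pairs."""
    ops = [(0, op) for op in P0] + [(1, op) for op in P1]
    outcomes = set()
    for order in permutations(range(4)):
        seq = [ops[i] for i in order]
        # Sequential consistency: each thread's ops stay in program order.
        if [op for tid, op in seq if tid == 0] != P0:
            continue
        if [op for tid, op in seq if tid == 1] != P1:
            continue
        mem = {"x": 0, "y": 0}
        regs = {}
        for tid, op in seq:
            if op[0] == "store":
                mem[op[1]] = 1
            else:
                regs[op[2]] = mem[op[1]]
        outcomes.add((regs["r1"], regs["r2"]))
    return outcomes

print(sorted(sc_outcomes()))  # (0, 0) never appears under SC
```

Every valid interleaving executes at least one store before the last load, so at least one of r1, r2 is 1; a machine that can return (0, 0) is not sequentially consistent, which is why such machines need fences for correct mutual exclusion.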
