Lecture 5 Approaches to Concurrency: The Multiprocessor

Prepared 7/28/2011 by T. O’Neil for 3460:677, Fall 2011, The University of Akron.

Outline
- Some basics
  - Granularity
  - Amdahl's Law
  - Metrics (speedup, efficiency, redundancy)
- Multi-core (one thread spans an engine, multiple engines)
- Tightly coupled vs. loosely coupled
- Interconnect
- Cache coherency
- Memory consistency
- Examples
  - Early: Cm*, HEP, Cosmic Cube
  - Recent: Cell, POWER, Niagara, Larrabee
  - Tomorrow: ACMP

Granularity of Concurrency
- Intra-instruction (pipelining)
- Parallel instructions (SIMD, VLIW)
- Tightly-coupled MP
- Loosely-coupled MP

Metrics
- Speedup: S(p) = T(1) / T(p), where T(p) is the execution time on p processors
- Efficiency: E(p) = S(p) / p
- Utilization: U(p) = O(p) / (p · T(p)), where O(p) is the total number of unit operations performed by p processors
- Redundancy: R(p) = O(p) / O(1)
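Amdahl's Law from the outline ties these metrics together: if only a fraction f of a program can be parallelized, the speedup on p processors is bounded by 1 / ((1 - f) + f/p). A minimal sketch in Python (function names are illustrative, not from the slides):

```python
def amdahl_speedup(f_parallel, p):
    """Amdahl's Law: speedup when a fraction f_parallel of the work
    runs on p processors and the rest stays serial."""
    return 1.0 / ((1.0 - f_parallel) + f_parallel / p)

def efficiency(speedup, p):
    """Efficiency: speedup normalized by processor count."""
    return speedup / p

# Even 95%-parallel code tops out well below linear speedup:
s = amdahl_speedup(0.95, 16)   # about 9.14x on 16 processors
e = efficiency(s, 16)          # about 0.57 -- processors idle 43% of the time
```

Note the asymptote: as p grows without bound, the speedup approaches 1 / (1 - f), so the serial 5% caps this program at 20x no matter how many processors are added.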

Outline Some basics Granularity Amdahl’s Law Metrics (speedup, efficiency, redundancy) Multi-core (one thread spans an engine, multiple engines) Tightly coupled vs. Loosely coupled Interconnect Cache Coherency Memory Consistency Examples Early: cm*, HEP, cosmic cube Recent: cell, power, niagara, larrabee Tomorrow: ACMP

Tightly-coupled vs. Loosely-coupled
- Tightly coupled (i.e., multiprocessor)
  - Shared memory
  - Each processor capable of doing work on its own
  - Easier for the software
  - Hardware has to worry about cache coherency and memory contention
- Loosely coupled (i.e., multicomputer network)
  - Message passing
  - Easier for the hardware
  - Programmer's job is tougher
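The two styles can be contrasted in a few lines of Python, whose multiprocessing module supports both: a Queue for message passing (loosely-coupled style) and a shared Value guarded by a lock (tightly-coupled style). This is a sketch of the programming-model difference only, not of the hardware; all names here are illustrative:

```python
from multiprocessing import Process, Queue, Value

def mp_worker(inbox, outbox):
    """Loosely-coupled style: communicate only by sending messages."""
    x = inbox.get()
    outbox.put(x * x)

def sm_worker(shared, n):
    """Tightly-coupled style: update a shared memory location under a lock."""
    with shared.get_lock():
        shared.value += n

if __name__ == "__main__":
    # Message passing: no shared state, data is copied through queues.
    inbox, outbox = Queue(), Queue()
    p = Process(target=mp_worker, args=(inbox, outbox))
    p.start()
    inbox.put(7)
    print(outbox.get())  # 49
    p.join()

    # Shared memory: all workers race on one location; the lock
    # plays the role the coherency/contention hardware plays in a real MP.
    total = Value("i", 0)
    workers = [Process(target=sm_worker, args=(total, i)) for i in range(1, 5)]
    for w in workers: w.start()
    for w in workers: w.join()
    print(total.value)  # 10
```

The slide's trade-off shows up directly: the shared-memory version reads like ordinary sequential code (easier for the programmer) but needs synchronization, while the message-passing version forces the programmer to structure all communication explicitly.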

Interconnection networks: cost, latency, contention
Cache coherency: snoopy, directory
Memory consistency: sequential consistency and mutual exclusion
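Why sequential consistency matters for mutual exclusion is usually shown with the classic store-buffering litmus test: two threads each store to one variable and then load the other. Under sequential consistency the outcome r1 = r2 = 0 is impossible, and Dekker-style software locks depend on exactly that guarantee (weaker hardware models with store buffers can produce it). A small Python sketch, my own illustration, that enumerates every SC interleaving:

```python
from itertools import permutations

# Store-buffering litmus test (x and y both start at 0):
#   P0: x = 1; r1 = y          P1: y = 1; r2 = x
P0 = [("store", "x"), ("load", "y", "r1")]
P1 = [("store", "y"), ("load", "x", "r2")]

def sc_outcomes():
    """Run every interleaving of the four operations that preserves each
    thread's program order, and collect the resulting (r1, r2) pairs."""
    ops = [(0, op) for op in P0] + [(1, op) for op in P1]
    outcomes = set()
    for order in permutations(range(4)):
        seq = [ops[i] for i in order]
        # Sequential consistency: each thread's ops stay in program order.
        if [op for tid, op in seq if tid == 0] != P0:
            continue
        if [op for tid, op in seq if tid == 1] != P1:
            continue
        mem = {"x": 0, "y": 0}
        regs = {}
        for tid, op in seq:
            if op[0] == "store":
                mem[op[1]] = 1
            else:
                regs[op[2]] = mem[op[1]]
        outcomes.add((regs["r1"], regs["r2"]))
    return outcomes

print(sorted(sc_outcomes()))  # (0, 0) never appears under SC
```

Every valid interleaving executes at least one store before the last load, so at least one of r1, r2 is 1; a machine that can return (0, 0) is not sequentially consistent, which is why such machines need fences for correct mutual exclusion.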
