Fundamental of Computer Architecture By Panyayot Chaikan 240-208 November 01, 2003.

Fundamental of Computer Architecture
Chapter 10 - Introduction to Parallel Processing

Contents
- Introduction to parallel processing architectures
- Multiprocessors
- Vector computers
- Clusters
- Types of interconnection network
- Introduction to parallel programming

High performance computer
- Large computing capacity
- Required to process a large amount of data in a reasonable amount of time
- Often called a supercomputer

Supercomputer applications
- Weather forecasting
- Finite element analysis in structural design
- Fluid flow analysis
- Simulation of large, complex physical systems
- Computer Aided Design (CAD)

Parallel processing [picture]

3 ways to construct a supercomputer
- Vector processing
- Multiprocessing
- Distributed computer system

Vector supercomputing
- Uses the fastest possible circuits
- Wide paths for accessing a large main memory
- Extensive I/O capability
- Dissipates considerable power and requires an expensive cooling arrangement
- Provides excellent performance, but at a very high price

Vector supercomputing: example machines
- NEC SX5
- CRAY CRAY1, Y-MP
- Fujitsu VP5000
- Hitachi SR8000

Cray supercomputer [picture]

Multiprocessor
- Uses a large number of processors designed for the workstation or PC market
- Has an efficient, high-bandwidth medium for communication among:
  - the processors
  - memory
  - I/O
- Provides high performance at a lower cost than vector processing

Distributed computer system
- Uses many workstations connected by a local area network
- Provides large computing capability at a reasonable cost

Multiprocessing performance
- Many computations can proceed in parallel
- Difficulties:
  - The application must be broken down into small tasks that can be assigned to individual processors
  - Processors must communicate with each other to exchange data

Classification of parallel structures
- Proposed by Flynn [1966]
- 4 types of computation:
  - SISD
  - SIMD
  - MIMD
  - MISD

SISD
- Single Instruction stream, Single Data stream
- Used in single-processor computer systems

SIMD
- Single Instruction stream, Multiple Data stream
- A single stream of instructions is broadcast to a number of processors
- Each processor operates on its own data
- Each processor has its own memory
- All processors execute the same program but operate on different data

MIMD
- Multiple Instruction stream, Multiple Data stream
- Each of many processors executes its own program and accesses its own sequence of data

MISD
- Multiple Instruction stream, Single Data stream
- A common data structure is manipulated by separate processors
- Each processor executes a different program
- This form does not occur often in practice
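The four classes above differ only in how many instruction streams and how many data streams are active. A minimal Python sketch of that rule follows; the names `FlynnClass` and `classify` are hypothetical and chosen only for this illustration.

```python
from enum import Enum


class FlynnClass(Enum):
    """The four classes of Flynn's classification."""
    SISD = "Single Instruction stream, Single Data stream"
    SIMD = "Single Instruction stream, Multiple Data stream"
    MISD = "Multiple Instruction stream, Single Data stream"
    MIMD = "Multiple Instruction stream, Multiple Data stream"


def classify(instruction_streams: int, data_streams: int) -> FlynnClass:
    """Map stream counts to a Flynn class (illustrative helper, not from the slides)."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    # Build the four-letter name: S or M for instructions, then S or M for data.
    name = (
        ("S" if instruction_streams == 1 else "M") + "I"
        + ("S" if data_streams == 1 else "M") + "D"
    )
    return FlynnClass[name]


if __name__ == "__main__":
    for streams in [(1, 1), (1, 64), (4, 1), (4, 4)]:
        print(streams, classify(*streams).name)
```

For example, one instruction stream driving 64 data streams is classified as SIMD, which is the array-processor case.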

Array processing
- The SIMD form of parallel processing
- Instructions are broadcast from a central processor

2 types of array processing
- Use a small number of powerful processors
  - ILLIAC-IV: 64 processors, each processor is 64-bit
- Use a large number of very simple processors
  - CM2: processors, each processor is 1-bit
  - MP-1216: processors, each processor is 4-bit
  - Gamma II plus: 4096 processors, each processor is 8-bit

Array processing
- Well suited to numerical problems that can be expressed in matrix or vector format
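As a rough illustration of the broadcast model, the sketch below simulates an array processor in plain Python: a central controller sends each instruction to every processing element (PE), and each PE applies it to its own local value. The names `ProcessingElement` and `run_array_processor` are hypothetical, and the simulation is sequential; it shows only the control structure, not real parallel hardware.

```python
from typing import Callable, List


class ProcessingElement:
    """One PE holding a single value in its own local memory."""

    def __init__(self, local_value: int) -> None:
        self.local_value = local_value

    def execute(self, instruction: Callable[[int], int]) -> None:
        # Every PE applies the same instruction to its own data.
        self.local_value = instruction(self.local_value)


def run_array_processor(
    data: List[int], program: List[Callable[[int], int]]
) -> List[int]:
    """Central controller: broadcast each instruction to all PEs in turn."""
    pes = [ProcessingElement(value) for value in data]
    for instruction in program:      # a single instruction stream
        for pe in pes:               # conceptually simultaneous on real hardware
            pe.execute(instruction)
    return [pe.local_value for pe in pes]


if __name__ == "__main__":
    # Compute 2*x + 1 for every element of a vector.
    program = [lambda x: x * 2, lambda x: x + 1]
    print(run_array_processor([1, 2, 3, 4], program))  # prints [3, 5, 7, 9]
```

Vector and matrix problems fit this model because the same operation is applied to every element.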

The structure of general-purpose multiprocessors
- UMA multiprocessor
- NUMA multiprocessor
- Distributed memory system

A UMA multiprocessor

A NUMA multiprocessor

A distributed memory system

Taxonomy of parallel processing

Interconnection networks
- Single bus
- Crossbar networks
- Multistage networks
- Hypercube networks
- Mesh networks
- Tree networks
- Ring networks

Crossbar interconnection network

Multistage shuffle network

A 3-dimensional hypercube network

A 2-dimensional mesh network

Four-way tree network

Flat tree network

Ring network
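For the hypercube, mesh and ring topologies named above, the set of directly connected nodes can be computed from a node's address. The Python sketch below uses hypothetical function names and two assumptions not stated in the slides: nodes are numbered from 0, and the mesh has no wraparound links at its edges.

```python
from typing import List, Tuple


def hypercube_neighbors(node: int, dimensions: int) -> List[int]:
    """Neighbors in a hypercube differ from the node in exactly one address bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def ring_neighbors(node: int, size: int) -> List[int]:
    """Each ring node connects to its predecessor and successor, wrapping around."""
    return [(node - 1) % size, (node + 1) % size]


def mesh_neighbors(row: int, col: int, rows: int, cols: int) -> List[Tuple[int, int]]:
    """Neighbors in a 2-dimensional mesh (assumed here to have no wraparound)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))   # node 101 in a 3-dimensional hypercube
    print(ring_neighbors(0, 8))
    print(mesh_neighbors(0, 0, 3, 3))  # a corner node has only two neighbors
```

In a 3-dimensional hypercube every node has exactly three links, while mesh nodes have two to four depending on their position.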

HP Convex architecture [picture]

HP Convex Hypernode [picture]

SGI Power Challenge [picture]

Clustered supercomputer [picture]

Clusters

Benefits of clustering
- Incremental scalability
- High availability
- Superior price/performance

Parallel programming
- At the program level, the task must be broken down into small tasks that can be assigned to individual processors
- Needs operating system support
- Different architectures require different programming methods

A sequential program to compute the dot product

    integer array a[1..N], b[1..N]
    integer dot_product

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        for k := 1 to N
            dot_product := dot_product + x[k] * y[k]
        end for
    end do_dot

First attempt at a 2-processor computation

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer id
        id := mypid()
        for k := (id*N/2)+1 to (id+1)*N/2
            lock (dot_product_lock)
            dot_product := dot_product + x[k] * y[k]
            unlock (dot_product_lock)
        end for
        barrier (done)
    end do_dot
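A runnable approximation of this first attempt can be written with Python's standard `threading` module. This is a sketch under assumptions: the function name `dot_product_lock_per_element` is hypothetical, thread ids are taken to be 0 and 1, indices are 0-based, and `join` stands in for the barrier. The point it illustrates is that the lock is acquired once per element, so the two threads spend most of their time contending for it.

```python
import threading
from typing import List


def dot_product_lock_per_element(a: List[int], b: List[int]) -> int:
    """Two-thread dot product that locks the shared sum on every iteration."""
    n = len(a)
    shared = {"dot_product": 0}          # shared accumulator
    dot_product_lock = threading.Lock()

    def do_dot(thread_id: int) -> None:
        # Each thread handles one half of the index range.
        for k in range(thread_id * n // 2, (thread_id + 1) * n // 2):
            with dot_product_lock:       # one lock/unlock pair per element
                shared["dot_product"] += a[k] * b[k]

    threads = [threading.Thread(target=do_dot, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # plays the role of barrier (done)
    return shared["dot_product"]


if __name__ == "__main__":
    print(dot_product_lock_per_element([1, 2, 3, 4], [5, 6, 7, 8]))  # prints 70
```

The result is correct, but the shared variable is updated N times under the lock, which serializes most of the work.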

An efficient 2-processor computation on a shared-memory machine

    shared integer array a[1..N], b[1..N]
    shared integer dot_product
    shared lock dot_product_lock
    shared barrier done

    read a[1..N] from vector_a
    read b[1..N] from vector_b
    dot_product := 0
    create_thread (do_dot, a, b)
    do_dot (a, b)
    print dot_product

    do_dot (integer array x[1..N], integer array y[1..N])
        private integer local_dot_product
        private integer id
        id := mypid()
        local_dot_product := 0
        for k := (id*N/2)+1 to (id+1)*N/2
            local_dot_product := local_dot_product + x[k] * y[k]
        end for
        lock (dot_product_lock)
        dot_product := dot_product + local_dot_product
        unlock (dot_product_lock)
        barrier (done)
    end do_dot
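The same idea can be sketched in Python with the standard `threading` module: each thread accumulates a private partial sum and takes the lock only once to add it to the shared result. The function name `dot_product_local_sums` is hypothetical, thread ids are assumed to be 0 (the main thread) and 1 (the created thread), and indices are 0-based. In standard CPython the global interpreter lock prevents a real speedup for this loop, so the sketch shows the synchronization structure rather than the performance gain.

```python
import threading
from typing import List


def dot_product_local_sums(a: List[int], b: List[int]) -> int:
    """Two-thread dot product using private partial sums and a single locked update."""
    n = len(a)
    shared = {"dot_product": 0}           # shared accumulator
    dot_product_lock = threading.Lock()
    done = threading.Barrier(2)           # both threads meet here before finishing

    def do_dot(thread_id: int) -> None:
        local_dot_product = 0             # private to this thread, no lock needed
        for k in range(thread_id * n // 2, (thread_id + 1) * n // 2):
            local_dot_product += a[k] * b[k]
        with dot_product_lock:            # one locked update per thread
            shared["dot_product"] += local_dot_product
        done.wait()

    worker = threading.Thread(target=do_dot, args=(1,))
    worker.start()                        # corresponds to create_thread
    do_dot(0)                             # the main thread does its own half
    worker.join()
    return shared["dot_product"]


if __name__ == "__main__":
    print(dot_product_local_sums([1, 2, 3, 4], [5, 6, 7, 8]))  # prints 70
```

Compared with locking on every element, the shared variable is now touched twice in total, once per thread.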

Performance considerations

End of Chapter 10