Parallel computer architecture classification


Hardware parallelism
Computing means executing instructions that operate on data. Flynn’s taxonomy (Michael Flynn, 1966) classifies computer architectures by the number of instruction streams that execute concurrently and the number of data streams they operate on.

Flynn’s taxonomy
- Single Instruction, Single Data (SISD): traditional sequential computing systems
- Single Instruction, Multiple Data (SIMD)
- Multiple Instruction, Multiple Data (MIMD)
- Multiple Instruction, Single Data (MISD)
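The four categories follow mechanically from two counts, so they can be sketched as a tiny lookup. The function name `flynn_class` and its parameters are illustrative assumptions, not part of the slides.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given stream counts.

    Hypothetical helper for illustration: 1 stream maps to "S" (single),
    more than 1 maps to "M" (multiple).
    """
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))  # SISD
print(flynn_class(1, 8))  # SIMD
print(flynn_class(4, 4))  # MIMD
print(flynn_class(4, 1))  # MISD
```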

SISD
- At any one time, one instruction operates on one data item.
- Traditional sequential architecture

SIMD
- At any one time, one instruction operates on many data items.
- Data-parallel architecture
- A vector architecture has similar characteristics, but achieves the parallelism with pipelining.
- Array processors

Array processor (SIMD)
[Figure: array processor. Labels: IP, MAR, MEMORY, OP/ADDR, MDR, one DECODER, and several ALUs, each with its own operands (A1, B1, C1), (A2, B2, C2), ..., (AN, BN, CN).]
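A minimal sketch of the array-processor idea, assuming the operands Ai and Bi produce a result Ci in lane i: one instruction is decoded once and then applied in every lane. The opcode table and the function `array_processor_step` are hypothetical, and the lanes are simulated sequentially here, whereas real hardware would run them at the same time.

```python
import operator

# Hypothetical opcode table for the simulation.
OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def array_processor_step(opcode, a, b):
    """Decode one instruction once, then apply it in every ALU lane."""
    if len(a) != len(b):
        raise ValueError("operand arrays must have the same length")
    op = OPS[opcode]  # single decode, shared by all lanes
    # Lane i computes C_i from A_i and B_i. The lanes run one after
    # another in this simulation; hardware would run them in lockstep.
    return [op(x, y) for x, y in zip(a, b)]


c = array_processor_step("ADD", [1, 2, 3], [10, 20, 30])
print(c)  # [11, 22, 33]
```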

MIMD
- Multiple instruction streams operating on multiple data streams
- Classical distributed-memory or SMP architectures
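The MIMD model can be sketched in software as several independent tasks, each running its own code on its own data. The task functions below are illustrative assumptions. Note that CPython threads do not execute bytecode truly in parallel, so this shows the programming model rather than a hardware speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def total(values):
    return sum(values)


def largest(values):
    return max(values)


def word_count(text):
    return len(text.split())


# Each pair is a different instruction stream with its own data stream.
tasks = [
    (total, [1, 2, 3, 4]),
    (largest, [7, 3, 9]),
    (word_count, "multiple instruction multiple data"),
]

with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    futures = [pool.submit(fn, data) for fn, data in tasks]
    results = [f.result() for f in futures]

print(results)  # [10, 9, 4]
```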

MISD machine
- Not commonly seen.
- A systolic array is one example of an MISD architecture.

Flynn’s taxonomy summary
- SISD: traditional sequential architecture
- SIMD: processor arrays, vector processors
  - Parallel computing on a budget: reduced control unit cost
  - Many early supercomputers
- MIMD: the most common general-purpose parallel computers today
  - Clusters, MPPs, data centers
- MISD: not a general-purpose architecture

Feng’s classification
[Figure: machines plotted by word length (axis ticks 1, 16, 32, 64) against bit-slice length (axis ticks 1, 16, 64, 256, 16K). Machines shown: MPP, STARAN, PEPE, Illiac IV, C.mmP, PDP-11, IBM 370.]

Händler’s classification
< K x K’, D x D’, W x W’ >
K refers to control, D to data, and W to word; each dashed (primed) term gives the degree of pipelining.
Examples:
- TI-ASC: <1, 4, 64 x 8>
- CDC 6600: <1, 1 x 10, 60> x <10, 1, 12> (I/O)
- C.mmP: <16, 1, 16> + <1 x 16, 1, 16> + <1, 16, 16>
- PEPE: <1 x 3, 288, 32>
- Cray-1: <1, 12 x 8, 64 x (1 ~ 14)>
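As a rough sketch, a single Händler triple can be held in a small record that prints in the notation used above. The class name `HandlerTriple` and its field names are assumptions made for illustration; a pipelining degree of 1 is simply omitted from the printed form. Composite descriptors (the `x` and `+` combinations) and ranges such as `1 ~ 14` are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlerTriple:
    """One <K x K', D x D', W x W'> descriptor (hypothetical helper)."""
    k: int           # K  (control)
    d: int           # D  (data)
    w: int           # W  (word)
    k_pipe: int = 1  # K' (degree of pipelining)
    d_pipe: int = 1  # D'
    w_pipe: int = 1  # W'

    def __str__(self) -> str:
        def term(base: int, pipe: int) -> str:
            # Omit the pipelining factor when it is 1.
            return str(base) if pipe == 1 else f"{base} x {pipe}"

        parts = (
            term(self.k, self.k_pipe),
            term(self.d, self.d_pipe),
            term(self.w, self.w_pipe),
        )
        return "<" + ", ".join(parts) + ">"


ti_asc = HandlerTriple(k=1, d=4, w=64, w_pipe=8)
pepe = HandlerTriple(k=1, k_pipe=3, d=288, w=32)
print(ti_asc)  # <1, 4, 64 x 8>
print(pepe)    # <1 x 3, 288, 32>
```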

Modern classification (Sima, Fountain, Kacsuk)
Architectures are classified by how parallelism is achieved:
- by operating on multiple data: data parallelism
- by performing many functions in parallel: function parallelism
  - also called control parallelism or task parallelism, depending on the level of the functional parallelism
Parallel architectures therefore divide into data-parallel architectures and function-parallel architectures.

Data parallel architectures
Data-parallel architectures:
- Vector architectures
- Associative and neural architectures
- SIMDs
- Systolic architectures

Data parallel architectures
- Vector processors, SIMD (array processors), systolic arrays
[Figure: vector processor (pipelining). Labels: IP, MAR, MEMORY, OP/ADDR, MDR, DECODER, a single ALU, and operands A, B, C.]
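To illustrate why pipelining gives a vector processor its throughput, the sketch below compares cycle counts under idealized assumptions that are not stated in the slides: a pipeline with `stages` stages, one cycle per stage, one new element entering per cycle, and no stalls. The function names are hypothetical.

```python
def unpipelined_cycles(n_elements: int, stages: int) -> int:
    """Each element passes through all stages before the next one starts."""
    return n_elements * stages


def pipelined_cycles(n_elements: int, stages: int) -> int:
    """The first result needs `stages` cycles; each later one needs 1 more."""
    if n_elements == 0:
        return 0
    return stages + (n_elements - 1)


n, k = 64, 4  # hypothetical vector length and pipeline depth
print(unpipelined_cycles(n, k))  # 256
print(pipelined_cycles(n, k))    # 67
```

For long vectors the ratio of the two counts approaches the pipeline depth.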

Data parallel architecture: Array processor
[Figure: array processor. Labels: IP, MAR, MEMORY, OP/ADDR, MDR, one DECODER, and several ALUs, each with its own operands (A1, B1, C1), (A2, B2, C2), ..., (AN, BN, CN).]

Control parallel architectures
Function-parallel architectures:
- Instruction-level parallel architectures (ILPs)
  - Pipelined processors
  - VLIWs
  - Superscalar processors
- Thread-level parallel architectures
- Process-level parallel architectures (MIMDs)
  - Distributed-memory MIMD
  - Shared-memory MIMD

Performance of parallel architectures
Common metrics:
- MIPS: million instructions per second
  MIPS = instruction count / (execution time x 10^6)
- MFLOPS: million floating-point operations per second
  MFLOPS = FP ops in program / (execution time x 10^6)
Which is the better metric?
- The FLOP count is more closely related to the running time of a task in numerical code.
- The number of FLOPs per program is determined by the matrix size.
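The two formulas translate directly into code. The instruction count, FP operation count, and execution time below are made-up numbers chosen only to show the arithmetic.

```python
def mips(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mflops(fp_ops: float, execution_time_s: float) -> float:
    """MFLOPS = FP ops in program / (execution time x 10^6)."""
    return fp_ops / (execution_time_s * 1e6)


# Hypothetical program: 4e9 instructions, 1e9 of them FP ops, 2 s run time.
print(mips(4e9, 2.0))    # 2000.0
print(mflops(1e9, 2.0))  # 500.0
```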

Performance of parallel architectures
FLOPS units:
- kiloFLOPS (KFLOPS): 10^3
- megaFLOPS (MFLOPS): 10^6
- gigaFLOPS (GFLOPS): 10^9 (single-CPU performance)
- teraFLOPS (TFLOPS): 10^12
- petaFLOPS (PFLOPS): 10^15 (the current level at the time of these slides: 10-petaFLOPS supercomputers)
- exaFLOPS (EFLOPS): 10^18 (the next milestone)
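A small helper can pick the largest unit from the list above that fits a raw rate. The function `format_flops` and the two-decimal formatting are illustrative choices, not part of the slides.

```python
# Units from the list above, largest first.
UNITS = [
    (1e18, "EFLOPS"),
    (1e15, "PFLOPS"),
    (1e12, "TFLOPS"),
    (1e9, "GFLOPS"),
    (1e6, "MFLOPS"),
    (1e3, "KFLOPS"),
]


def format_flops(rate: float) -> str:
    """Format a rate in FLOPS using the largest unit that fits."""
    for scale, name in UNITS:
        if rate >= scale:
            return f"{rate / scale:.2f} {name}"
    return f"{rate:.2f} FLOPS"


print(format_flops(2.5e9))  # 2.50 GFLOPS
print(format_flops(1e16))   # 10.00 PFLOPS
```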

Peak and sustained performance
Peak performance:
- Measured in MFLOPS
- The highest possible MFLOPS rate, reached when the system does nothing but numerical computation
- A rough hardware measure
- Gives little indication of how the system will perform in practice

Peak and sustained performance
Sustained performance:
- The MFLOPS rate that a program achieves over the entire run
- Measured using benchmarks
- Peak MFLOPS is usually much larger than sustained MFLOPS
- Efficiency rate = sustained MFLOPS / peak MFLOPS
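A minimal sketch of measuring a sustained rate and turning it into an efficiency rate. It times a naive pure-Python matrix multiply and uses the conventional count of 2 x n^3 floating-point operations; the peak value passed to `efficiency` is a made-up number, since a real peak figure comes from the hardware specification.

```python
import time


def matmul(a, b):
    """Naive dense matrix multiply on lists of lists."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def sustained_mflops(n: int) -> float:
    """Time one n x n multiply and return the achieved MFLOPS rate."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    fp_ops = 2 * n ** 3  # conventional count: n^3 multiplies + n^3 adds
    return fp_ops / (elapsed * 1e6)


def efficiency(sustained: float, peak: float) -> float:
    """Efficiency rate = sustained MFLOPS / peak MFLOPS."""
    return sustained / peak


measured = sustained_mflops(60)
hypothetical_peak = 10_000.0  # assumed peak MFLOPS, for illustration only
print(f"sustained: {measured:.1f} MFLOPS")
print(f"efficiency: {efficiency(measured, hypothetical_peak):.4f}")
print(efficiency(250.0, 1000.0))  # 0.25
```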

Measuring the performance of parallel computers
- Benchmarks: programs that are used to measure performance
- LINPACK benchmark: a measure of a system’s floating-point computing power
  - Solves a dense N by N system of linear equations Ax = b
  - Used to rank supercomputers in the TOP500 list
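The kernel of such a benchmark is a dense linear solve. The sketch below is a plain Gaussian elimination with partial pivoting in pure Python, meant only to show the kind of computation involved; it is not the LINPACK code itself. The operation count uses only the leading-order term, (2/3) x N^3, as an approximation.

```python
def solve_dense(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def approx_solve_flops(n: int) -> float:
    """Leading-order operation count for the solve: (2/3) * n^3."""
    return 2.0 / 3.0 * n ** 3


x = solve_dense([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(x)  # approximately [0.8, 1.4]
```

Dividing the operation count by the measured solve time gives the FLOPS rate that the benchmark reports.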

Other common benchmarks
- Micro benchmark suites
  - Numerical computing: LAPACK, ScaLAPACK
  - Memory bandwidth: STREAM
- Kernel benchmarks
  - NPB (NAS Parallel Benchmarks)
  - PARKBENCH
  - SPEC
  - Splash

Summary
- Flynn’s classification: SISD, SIMD, MIMD, MISD
- Modern classification
  - Data parallelism
  - Function parallelism: instruction level, thread level, and process level
- Performance
  - MIPS, MFLOPS
  - Peak performance and sustained performance

References
- K. Hwang, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", McGraw-Hill, 1993.
- D. Sima, T. Fountain, P. Kacsuk, "Advanced Computer Architectures: A Design Space Approach", Addison-Wesley, 1997.