COMPUTER ARCHITECTURES FOR PARALLEL PROCESSING

SISD PERFORMANCE IMPROVEMENTS
• Multiprogramming
• Spooling
• Multifunction processor
• Pipelining
• Exploiting instruction-level parallelism
  - Superscalar
  - Superpipelining
  - VLIW (Very Long Instruction Word)

MISD COMPUTER SYSTEMS
Characteristics
- There is no computer at present that can be classified as MISD.

SIMD COMPUTER SYSTEMS
Characteristics
- Only one copy of the program exists
- A single controller executes one instruction at a time

TYPES OF SIMD COMPUTERS
Array Processors
- The control unit broadcasts instructions to all PEs, and all active PEs execute the same instructions
- Examples: ILLIAC IV, GF-11, Connection Machine, DAP, MPP
Systolic Arrays
- Regular arrangement of a large number of very simple processors constructed on VLSI circuits
- Examples: CMU Warp, Purdue CHiP
Associative Processors
- Content addressing
- Data transformation operations over many sets of arguments with a single instruction
- Examples: STARAN, PEPE

MIMD COMPUTER SYSTEMS
Characteristics
- Multiple processing units
- Execution of multiple instructions on multiple data
Types of MIMD computer systems
- Shared-memory multiprocessors
- Message-passing multicomputers

Pipeline and Vector Processing
Parallel Processing
- Simultaneous data processing tasks for the purpose of increasing computational speed
- Concurrent data processing is performed to achieve faster execution time
Multiple functional units
- Separate the execution unit into eight functional units operating in parallel
Pipelining
- Decompose a sequential process into suboperations
- Each suboperation is executed concurrently in its own dedicated segment

General Considerations
Pipelining example: the multiply-and-add operation Ai * Bi + Ci (for i = 1, 2, …, 7)
Three suboperation segments:
1) Input Ai and Bi
2) Multiply Ai * Bi and input Ci
3) Add Ci to the product
The contents of the registers in this example show how partial results move through the pipeline; a short simulation is sketched below.
Four-segment pipeline:
- S : combinational circuit performing a suboperation
- R : register holding intermediate results between segments
Space-time diagram:
- Shows segment utilization as a function of time
- Tasks T1, T2, T3, …, T6: each task is the total operation performed in going through all the segments
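The following is a minimal Python sketch (not from the slides) of this three-segment pipeline, advancing one clock at a time; the register names R1-R5 and the sample operand values are assumptions made for illustration.

A = [1, 2, 3, 4, 5, 6, 7]
B = [7, 6, 5, 4, 3, 2, 1]
C = [1, 1, 1, 1, 1, 1, 1]

R1 = R2 = R3 = R4 = R5 = None        # pipeline registers between segments
results = []

for clock in range(len(A) + 2):      # k + n - 1 = 3 + 7 - 1 = 9 clock cycles
    # Segment 3 (uses last clock's R3, R4): R5 <- R3 + R4 (add Ci to the product)
    if R3 is not None:
        R5 = R3 + R4
        results.append(R5)
    # Segment 2 (uses last clock's R1, R2): R3 <- R1 * R2, R4 <- Ci
    if R1 is not None:
        R3, R4 = R1 * R2, C[clock - 1]
    else:
        R3 = R4 = None
    # Segment 1: R1 <- Ai, R2 <- Bi
    if clock < len(A):
        R1, R2 = A[clock], B[clock]
    else:
        R1 = R2 = None

print(results)   # [A[i]*B[i] + C[i] for i in range(7)] -> [8, 13, 16, 17, 16, 13, 8]

All seven results emerge after 9 clocks, matching the k + n - 1 count used in the speedup discussion that follows.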

Speedup
- With k = 4 segments and n = 6 tasks, the pipeline completes in k + n - 1 = 4 + 6 - 1 = 9 clock cycles.
- Speedup S = (nonpipelined time) / (pipelined time):
    S = n · tn / ((k + n - 1) · tp)
      = (6 · 6 tp) / ((4 + 6 - 1) · tp) = 36 tp / 9 tp = 4
  where
    n  : number of tasks (6)
    tn : time to complete each task without pipelining (6 clock cycles = 6 tp)
    tp : pipeline clock cycle time (1 clock cycle)
    k  : number of segments (4)
- As n → ∞, k + n - 1 → n, so S → tn / tp.
- If a nonpipelined task takes the same time as one pass through the pipeline, tn = k · tp, then
    S = tn / tp = k · tp / tp = k,
  i.e., the maximum speedup equals the number of segments k.
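A small Python sketch of this speedup formula, using the slide's numbers (k = 4, n = 6, tn = 6 tp); the function name is hypothetical.

def pipeline_speedup(n_tasks: int, k_segments: int, tn: float, tp: float) -> float:
    """Ratio of nonpipelined time (n * tn) to pipelined time ((k + n - 1) * tp)."""
    return (n_tasks * tn) / ((k_segments + n_tasks - 1) * tp)

tp = 1.0                                        # one clock cycle
print(pipeline_speedup(6, 4, 6 * tp, tp))       # 4.0, as in the slide
print(pipeline_speedup(10_000, 4, 4 * tp, tp))  # approaches k = 4 for large n when tn = k * tp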

Static Arithmetic Pipelines
- Most arithmetic pipelines perform a fixed function; because the function is fixed, such a pipeline is also called a unifunctional pipeline.
- ALUs perform fixed-point operations in the integer unit; floating-point operations are performed in a separate unit (coprocessor).
- All arithmetic operations can be built from basic add and shift operations (see the sketch below).
- Arithmetic and logical shifts can be performed with shift registers.
- Addition can be done with a carry-propagate adder (CPA) or a carry-save adder (CSA).
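As a rough illustration of the add-and-shift claim, here is a minimal Python sketch (not from the slides) of unsigned fixed-point multiplication built only from shifts and adds; the function name and operand width are assumptions.

def shift_add_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply two unsigned `width`-bit integers using only shift and add."""
    product = 0
    for bit in range(width):
        if (b >> bit) & 1:            # if this multiplier bit is set...
            product += a << bit       # ...add the correspondingly shifted multiplicand
    return product

print(shift_add_multiply(0b1011, 0b0110))   # 11 * 6 = 66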

Arithmetic Pipeline Design

Arithmetic Pipeline: Floating-Point Adder Pipeline Example
Add/subtract two normalized floating-point binary numbers X = A × 2^a and Y = B × 2^b
(decimal example: X = 0.9504 × 10^3, Y = 0.8200 × 10^2)
Four segments (suboperations); a short code sketch follows:
1) Compare exponents by subtraction: 3 - 2 = 1
   X = 0.9504 × 10^3, Y = 0.8200 × 10^2
2) Align mantissas: Y = 0.08200 × 10^3
3) Add mantissas: Z = 1.0324 × 10^3
4) Normalize result: Z = 0.10324 × 10^4
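A minimal Python sketch (not from the slides) of the four suboperations, operating on decimal (mantissa, exponent) pairs as in the example above; the representation and function name are assumptions for illustration.

def fp_add(x, y):
    """Add two numbers given as (mantissa, exponent), i.e. m * 10**e."""
    (ma, ea), (mb, eb) = x, y
    # 1) Compare exponents by subtraction
    diff = ea - eb
    # 2) Align mantissas (shift the smaller-exponent operand right)
    if diff >= 0:
        mb, eb = mb / (10 ** diff), ea
    else:
        ma, ea = ma / (10 ** -diff), eb
    # 3) Add mantissas
    mz, ez = ma + mb, ea
    # 4) Normalize the result so that 0.1 <= |mantissa| < 1
    while abs(mz) >= 1:
        mz, ez = mz / 10, ez + 1
    while 0 < abs(mz) < 0.1:
        mz, ez = mz * 10, ez - 1
    return mz, ez

print(fp_add((0.9504, 3), (0.8200, 2)))   # ~(0.10324, 4), i.e. 0.10324 x 10^4 (tiny binary rounding may appear)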

Multiply Pipeline Design
- CSAs and a CPA are used at different stages to build a pipeline for fixed-point multiplication.
- Example: multiplication of two 8-bit numbers, producing a 16-bit result (see the sketch below)
  S1: generate the eight partial products
  S2: two levels of CSAs reduce the eight numbers to four
  S3: two more CSAs reduce the four numbers to two
  S4: one CPA adds the final two numbers to produce the product
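A rough Python sketch (not from the slides) of the idea: generate the partial products, reduce them with carry-save adders until two numbers remain, then finish with one carry-propagate addition. The generic reduction loop here happens to use four CSA levels (two for S2 and two for S3), but the exact stage boundaries and the helper names are assumptions.

def csa(a: int, b: int, c: int):
    """Carry-save adder: reduce three numbers to a sum word and a carry word."""
    return a ^ b ^ c, ((a & b) | (b & c) | (a & c)) << 1

def multiply_8x8(x: int, y: int) -> int:
    # S1: generate eight partial products (shifted copies of x)
    pps = [(x << i) if (y >> i) & 1 else 0 for i in range(8)]
    # S2-S3: repeatedly apply CSAs until only two numbers remain
    while len(pps) > 2:
        reduced = []
        for i in range(0, len(pps) - 2, 3):
            s, c = csa(pps[i], pps[i + 1], pps[i + 2])
            reduced.extend([s, c])
        reduced.extend(pps[len(pps) - len(pps) % 3:])   # pass leftovers through unchanged
        pps = reduced
    # S4: final carry-propagate addition of the remaining two numbers
    return (pps[0] + pps[1]) & 0xFFFF

print(multiply_8x8(0xAB, 0xCD))   # 35055 == 0xAB * 0xCD (0x88EF)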

3-4 Instruction Pipeline
Instruction cycle:
1) Fetch the instruction from memory
2) Decode the instruction
3) Calculate the effective address
4) Fetch the operands from memory
5) Execute the instruction
6) Store the result in the proper place
Example: four-segment instruction pipeline
Four-segment CPU pipeline (a space-time sketch follows):
1) FI : fetch instruction
2) DA : decode instruction and calculate the effective address
3) FO : fetch operand
4) EX : execute instruction
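A minimal Python sketch (not from the slides) that prints a space-time diagram for this four-segment pipeline with six instructions; the segment names follow the slide, everything else is illustrative.

SEGMENTS = ["FI", "DA", "FO", "EX"]
N_INSTRUCTIONS = 6

k, n = len(SEGMENTS), N_INSTRUCTIONS
total_cycles = k + n - 1                      # 4 + 6 - 1 = 9 clock cycles

print("cycle: ", " ".join(f"{c:>3}" for c in range(1, total_cycles + 1)))
for i in range(1, n + 1):                     # instruction i enters FI at cycle i
    row = ["   "] * total_cycles
    for s, name in enumerate(SEGMENTS):
        row[i - 1 + s] = f"{name:>3}"
    print(f"I{i}:    ", " ".join(row))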

(Figure: timing diagram in which instruction 3 is a branch, contrasting the no-branch and branch-taken cases.)