Overview Parallel Processing Pipelining

Overview
- Parallel Processing
- Pipelining
- Characteristics of Multiprocessors
- Interconnection Structures
- Interprocessor Arbitration
- Interprocessor Communication and Synchronization

Parallel Processing
Parallel processing is the execution of concurrent events in the computing process to achieve faster computational speed. Its purpose is to speed up the computer's processing capability and increase its throughput, i.e. the amount of processing that can be accomplished during a given interval of time.

Levels of Parallel Processing
- Job or Program level
- Task or Procedure level
- Inter-Instruction level
- Intra-Instruction level

Lowest level: shift registers, registers with parallel load
Higher level: a multiplicity of functional units that perform identical or different tasks

Parallel Computers - Architectural Classification
Flynn's classification is based on the multiplicity of instruction streams and data streams:
- Instruction stream: the sequence of instructions read from memory
- Data stream: the operations performed on the data in the processor

                              Single data stream   Multiple data streams
Single instruction stream          SISD                  SIMD
Multiple instruction streams       MISD                  MIMD

SISD
[Figure: memory feeds an instruction stream to the control unit and a data stream to the processor unit]
Characteristics
- A single computer containing a control unit, a processor unit, and a memory unit
- Instructions and data are stored in memory and executed sequentially
- May or may not have parallel processing capability
- Parallel processing can be achieved by pipelining

SIMD
[Figure: a control unit broadcasts one instruction stream over a data bus to multiple processor units (P), which reach memory modules (M) through an alignment network]
Characteristics
- Only one copy of the program exists
- A single controller executes one instruction at a time, applied to multiple data items in parallel

MISD
[Figure: multiple control units (CU) drive processors (P) that operate on a single data stream from memory (M)]
Characteristics
- There is no computer at present that can be classified as MISD

MIMD
[Figure: multiple processors (P) with local memories (M) joined by an interconnection network to a shared memory]
Characteristics
- Multiple processing units
- Execution of multiple instructions on multiple data
Types of MIMD computer systems
- Shared-memory multiprocessors
- Message-passing multicomputers

Pipelining
Pipelining is a technique of decomposing a sequential process into suboperations, with each subprocess executed in a special dedicated segment that operates concurrently with all other segments.
- It is characteristic of pipelining that several computations can be in progress in distinct segments at the same time.
- Each segment performs partial processing, as dictated by the way the task is partitioned.
- The result obtained in each segment is transferred to the next segment in the pipeline.
- The final result is obtained after the data have passed through all segments.

Pipelining
The simplest way to understand pipelining is to imagine that each segment consists of an input register followed by a combinational circuit. The output of the combinational circuit in one segment is applied to the input register of the next segment.

Example: compute Ai * Bi + Ci for i = 1, 2, 3, ..., 7

[Figure: Ai and Bi load registers R1 and R2 in segment 1; a multiplier feeds R3 while Ci (from memory) loads R4 in segment 2; an adder feeds R5 in segment 3]

Segment 1: R1 <- Ai, R2 <- Bi          Load Ai and Bi
Segment 2: R3 <- R1 * R2, R4 <- Ci     Multiply and load Ci
Segment 3: R5 <- R3 + R4               Add

Operations in Each Pipeline Stage

Clock Pulse   Segment 1      Segment 2            Segment 3
Number        R1     R2      R3          R4       R5
1             A1     B1
2             A2     B2      A1 * B1     C1
3             A3     B3      A2 * B2     C2       A1 * B1 + C1
4             A4     B4      A3 * B3     C3       A2 * B2 + C2
5             A5     B5      A4 * B4     C4       A3 * B3 + C3
6             A6     B6      A5 * B5     C5       A4 * B4 + C4
7             A7     B7      A6 * B6     C6       A5 * B5 + C5
8                            A7 * B7     C7       A6 * B6 + C6
9                                                 A7 * B7 + C7
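The clock-pulse table above can be reproduced with a small cycle-by-cycle simulation. This is an illustrative sketch (not from the slides): registers R1-R5 are modeled as plain variables, with None meaning "empty", and the input values for Ai, Bi, Ci are arbitrary.

```python
A = [1, 2, 3, 4, 5, 6, 7]
B = [10, 20, 30, 40, 50, 60, 70]
C = [5, 5, 5, 5, 5, 5, 5]

n = len(A)                      # number of tasks
k = 3                           # pipeline segments
r1 = r2 = r3 = r4 = r5 = None   # pipeline registers, None = empty
results = []

for clock in range(1, k + n):   # k + n - 1 = 9 clock pulses
    # Segment 3 latches first so it sees last cycle's R3, R4: R5 <- R3 + R4
    r5 = r3 + r4 if r3 is not None else None
    # Segment 2: R3 <- R1 * R2, R4 <- Ci (Ci arrives one pulse after Ai, Bi)
    r3 = r1 * r2 if r1 is not None else None
    r4 = C[clock - 2] if 2 <= clock <= n + 1 else None
    # Segment 1: R1 <- Ai, R2 <- Bi while tasks remain
    r1 = A[clock - 1] if clock <= n else None
    r2 = B[clock - 1] if clock <= n else None
    if r5 is not None:
        results.append(r5)      # one Ai*Bi + Ci completes per pulse from pulse 3 on

print(results)
```

As in the table, the first result emerges at clock pulse 3 and the last at pulse k + n - 1 = 9, after which all seven values of Ai * Bi + Ci have been produced.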

General Pipeline
[Figure: general structure of a 4-segment pipeline, segments 1-4 driven by a common clock, together with a space-time diagram showing tasks T1-T6 each advancing one segment per clock cycle over clock cycles 1-9]

Pipeline Speedup
n: number of tasks to be performed

Conventional machine (non-pipelined)
tn: clock cycle (time to complete each task)
t1: time required to complete the n tasks
t1 = n * tn

Pipelined machine (k stages)
tp: clock cycle (time to complete each suboperation)
tk: time required to complete the n tasks
tk = (k + n - 1) * tp

Speedup
Sk = n * tn / ((k + n - 1) * tp)
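As a quick numeric check of these formulas, the sketch below compares the non-pipelined time n * tn with the pipelined time (k + n - 1) * tp. The stage count, cycle times, and task count are illustrative values, not from the slides.

```python
def speedup(n, k, tn, tp):
    t1 = n * tn                  # non-pipelined time for n tasks
    tk = (k + n - 1) * tp        # pipelined time for n tasks
    return t1 / tk               # Sk = n*tn / ((k + n - 1) * tp)

# 4-stage pipeline, 20 ns per stage; the non-pipelined unit takes tn = k*tp = 80 ns.
k, tp = 4, 20
tn = k * tp
print(speedup(100, k, tn, tp))   # ~3.88, already close to the limit k = 4
```

With only 100 tasks the pipeline already runs about 3.9 times faster than the sequential unit; the k - 1 "fill" cycles are what keep it just short of 4.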

Pipeline Speedup
As n becomes much larger than k - 1, the term k + n - 1 approaches n. Then:
S = tn / tp
If we assume the time taken to complete a task is the same in both circuits, then tn = k * tp, and the speedup reduces to
S = k * tp / tp = k
i.e. the maximum theoretical speedup a pipeline can provide is k, the number of stages.
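This limiting behavior can be seen numerically. In the sketch below (illustrative values, assuming tn = k * tp as in the derivation above), the speedup climbs toward the stage count k = 5 as n grows.

```python
k, tp = 5, 10                    # 5 stages, 10 ns per stage (illustrative)
tn = k * tp                      # equal total task time in both circuits

speedups = []
for n in (10, 100, 10_000):
    s = (n * tn) / ((k + n - 1) * tp)   # Sk = n*tn / ((k + n - 1)*tp)
    speedups.append(round(s, 3))

print(speedups)                  # each value closer to the limit k = 5
```

The k - 1 startup cycles matter less and less as n grows, which is why the speedup never reaches k exactly for finite n but gets arbitrarily close.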