AN INTRODUCTION TO PARALLEL PROCESSING


Topics Covered: An Overview of Parallel Processing; Parallelism in Uniprocessor Systems; Organization of Multiprocessor Systems; Flynn's Classification; MIMD System Architectures

AN OVERVIEW OF PARALLEL PROCESSING What is parallel processing? Parallel processing is a method of improving computer system performance by executing two or more instructions simultaneously. The goals of parallel processing: one goal is to reduce the "wall-clock" time, the amount of real time you must wait for a problem to be solved; another is to solve bigger problems that might not fit in the limited memory of a single CPU.

AN ANALOGY OF PARALLELISM The task of ordering a shuffled deck of cards by suit and then by rank can be done faster if it is carried out by two or more people. By splitting up the deck, performing the sorting simultaneously, and then combining the partial solutions at the end, you have performed parallel processing.
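The split–sort–combine pattern in this analogy can be sketched in Python. This is a minimal illustration, not part of the original slides: two worker processes stand in for the two people, and the card values are placeholders.

```python
from concurrent.futures import ProcessPoolExecutor
from heapq import merge

if __name__ == "__main__":
    deck = [37, 4, 51, 12, 29, 8, 45, 16]        # stand-in for a shuffled deck
    halves = [deck[:4], deck[4:]]                # split the deck between two "people"
    with ProcessPoolExecutor(max_workers=2) as ex:
        parts = list(ex.map(sorted, halves))     # each worker sorts its half at the same time
    ordered = list(merge(*parts))                # combine the partial solutions
    print(ordered)                               # [4, 8, 12, 16, 29, 37, 45, 51]
```

The combine step is cheap (a linear merge), which is why splitting the work pays off as the deck grows.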

ANOTHER ANALOGY OF PARALLELISM Another analogy is having several students grade quizzes simultaneously. Quizzes are distributed among a few students, and different problems are graded by each student at the same time. After they are finished, the graded quizzes are gathered and the scores are recorded.

Parallelism in Uniprocessor Systems It is possible to achieve parallelism within a uniprocessor system. Some examples are the instruction pipeline, the arithmetic pipeline, and the I/O processor. Note that a system that merely performs different operations on the same instruction is not considered parallel; only if the system processes two different instructions simultaneously can it be considered parallel.

PARALLELISM IN A UNIPROCESSOR SYSTEM A reconfigurable arithmetic pipeline is an example of parallelism in a uniprocessor system. Each stage of a reconfigurable arithmetic pipeline has a multiplexer at its input. The multiplexer may pass the external input data, or the data output from other stages, to that stage's inputs. The control unit of the CPU sets the select signals of the multiplexers to control the flow of data, thus configuring the pipeline.

A RECONFIGURABLE PIPELINE WITH DATA FLOW FOR THE COMPUTATION A[i] ← B[i] * C[i] + D[i] [Figure: three pipeline stages (*, |, +), each fed by a four-input multiplexer with select signals S1 S0 and separated by latches; the final output goes to memory and registers. The select-signal settings shown configure the pipeline for this computation.]
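The configured pipeline can be modeled in software. The sketch below is an illustration, not the hardware: it assumes stage 1 multiplies B[i] by C[i], stage 2 simply latches the value forward, and stage 3 adds D[i]. Once the pipeline fills, one result emerges per clock cycle.

```python
# Minimal software model of the three-stage pipeline for A[i] = B[i]*C[i] + D[i].
B = [1, 2, 3, 4]
C = [5, 6, 7, 8]
D = [9, 10, 11, 12]

latch1 = latch2 = result = None    # values held between stages
A = []
for clock in range(len(B) + 3):    # extra cycles drain the pipeline
    if result is not None:
        A.append(result)           # a finished A[i] leaves stage 3
    # Latches update back-to-front, as on a clock edge:
    result = latch2[0] + D[latch2[1]] if latch2 is not None else None  # stage 3: add
    latch2 = latch1                                                    # stage 2: pass through
    latch1 = (B[clock] * C[clock], clock) if clock < len(B) else None  # stage 1: multiply

print(A)   # [14, 22, 32, 44]
```

Note that after the three-cycle fill latency, each additional element costs only one cycle, which is the point of pipelining.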

ORGANIZATION OF MULTIPROCESSOR SYSTEMS Flynn's Classification was proposed by researcher Michael J. Flynn in 1966. It is the most commonly accepted taxonomy of computer organization. In this classification, computers are classified by whether they process a single instruction at a time or multiple instructions simultaneously, and by whether they operate on one or multiple data sets.

Taxonomy of Computer Architectures [Figure: simple diagrammatic representation of the four categories of Flynn's classification of multiprocessor systems, by their instruction and data streams.]

SINGLE INSTRUCTION, SINGLE DATA (SISD) SISD machines execute a single instruction on individual data values using a single processor. Based on the traditional von Neumann uniprocessor architecture, instructions are executed sequentially, one step after the next. Until recently, most computers were of the SISD type.

SISD Simple Diagrammatic Representation

SINGLE INSTRUCTION, MULTIPLE DATA (SIMD) An SIMD machine executes a single instruction on multiple data values simultaneously using many processors. Since there is only one instruction stream, each processor does not have to fetch and decode instructions itself; instead, a single control unit does the fetching and decoding for all processors. SIMD architectures include array processors.
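The SIMD idea — one instruction applied to many data elements at once — can be sketched in plain Python. This is only an analogy: real SIMD hardware applies the operation to all lanes in a single step, while libraries such as NumPy expose the same style in software.

```python
def simd_add(a, b):
    # One "instruction" (add) broadcast across every data lane,
    # as a single control unit would direct all processors to do.
    return [x + y for x, y in zip(a, b)]

lanes_a = [1, 2, 3, 4]
lanes_b = [10, 20, 30, 40]
print(simd_add(lanes_a, lanes_b))   # [11, 22, 33, 44]
```

The key property is that there is one operation and many operands; the per-lane work is identical, which is what lets a single control unit drive all processors.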

SIMD Simple Diagrammatic Representation

MULTIPLE INSTRUCTION, MULTIPLE DATA (MIMD) MIMD machines are usually referred to as multiprocessors or multicomputers. They may execute multiple instructions simultaneously, unlike SIMD machines. Each processor includes its own control unit, and the processors can be assigned parts of a task or entirely separate tasks. MIMD has two subclasses: shared memory and distributed memory.
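The MIMD style — different processors running different instruction streams on different data — can be illustrated with Python's multiprocessing module. The function names here are made up for the sketch; each process runs its own code on its own data and reports back through a queue.

```python
from multiprocessing import Process, Queue

def summer(data, q):
    q.put(("sum", sum(data)))       # one processor runs this instruction stream

def maxer(data, q):
    q.put(("max", max(data)))       # another runs a completely different one

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=summer, args=([1, 2, 3], q))
    p2 = Process(target=maxer, args=([7, 4, 9], q))
    p1.start(); p2.start()
    p1.join(); p2.join()
    results = dict(q.get() for _ in range(2))
    print(results)                  # e.g. {'sum': 6, 'max': 9} (order may vary)
```

Because the two streams are independent, this maps onto the shared-memory subclass when the processes communicate through common memory, and onto distributed memory when they communicate by message passing.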

MIMD [Figure: simple diagrammatic representations of the shared-memory and distributed-memory organizations.]

MULTIPLE INSTRUCTION, SINGLE DATA (MISD) No practical machines of this type exist; the category was included in the taxonomy for the sake of completeness.

MIMD System Architectures Finally, the architecture of a MIMD system, in contrast to its topology, refers to how its processors connect to system memory. Systems may also be classified by their architectures; two of these are: uniform memory access (UMA) and nonuniform memory access (NUMA).

UNIFORM MEMORY ACCESS (UMA) A UMA machine is a type of symmetric multiprocessor, or SMP, that has two or more processors performing symmetric functions. UMA gives all CPUs equal (uniform) access to all locations in shared memory. The processors interact with shared memory through some communications mechanism, such as a simple bus or a complex multistage interconnection network.
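The shared-memory model can be sketched with Python's multiprocessing module. This is an OS-level analogy rather than UMA hardware: every process reads and writes the same block of shared memory through one mechanism.

```python
from multiprocessing import Process, Array

def worker(shared, idx):
    # Every processor accesses the same shared memory;
    # in a UMA system, each access costs the same regardless of location.
    shared[idx] += 1

if __name__ == "__main__":
    shared = Array('i', [0, 0, 0])   # memory visible to all processes
    procs = [Process(target=worker, args=(shared, i)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(list(shared))              # [1, 1, 1]
```

Here each worker touches a distinct slot, so no locking is needed; processors updating the same location would need the synchronization that real shared-memory systems provide.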

UNIFORM MEMORY ACCESS (UMA) ARCHITECTURE [Figure: processors 1 through n connected through a communications mechanism to a single shared memory.]

NONUNIFORM MEMORY ACCESS (NUMA) NUMA architectures, unlike UMA architectures, do not allow uniform access to all shared memory locations. This architecture still allows all processors to access all shared memory locations, but in a nonuniform way: each processor can access its local shared memory module more quickly than the remote memory modules.

Nonuniform Memory Access (NUMA) Architecture [Figure: processors 1 through n, each with a local memory module (Memory 1 through Memory n), all connected by a communications mechanism.]

THANK YOU