Introduction to Parallel Processing (Fall 2008)




Parallel Systems in CYUT
IBM SP2 (2 nodes)
Model: RS/6000 SP 375 MHz Wide Node
CPU: 2 (64-bit POWER3-II) per node
Memory: 3 GB per node
OS: AIX 4.3.3
Software: MATLAB, SPSS
IBM H80
Model: RS/6000 H80
CPU: 4 (64-bit RS64 III, 500 MHz)
Memory: 5 GB
OS: AIX 4.3.3
DB: Sybase

Sun Fire 6800
CPU: 20 (UltraSPARC III: 12 at 1.2 GHz, 8 at 900 MHz)
Memory: 40 GB
OS: Solaris 9.0
Storage: 1.3 TB
Software: Server, Web Server, Directory Server, Application Server, DB Server

The Fastest Computer in the World: IBM Roadrunner (June 2008)
 CPU: 6,562 dual-core AMD Opteron® chips and 12,240 PowerXCell 8i chips
 Memory: 98 TB
 OS: Linux
 Speed: 1.026 petaflop/s

Parallel Computing
 Parallel computing is central and important in many computationally intensive applications, such as image processing, database processing, and robotics.
 Given a problem, parallel computing is the process of splitting the problem into several subproblems, solving these subproblems simultaneously, and combining the solutions of the subproblems to obtain the solution to the original problem.
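The split/solve/combine steps above can be sketched in a few lines of Python (a minimal illustration; the function names and the use of a thread pool are assumptions, not from the slides — real speedups come from separate processors or machines):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, n_workers=4):
    # Split the problem into roughly equal subproblems.
    size = -(-len(data) // n_workers)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # Solve the subproblems simultaneously...
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum, chunks))
    # ...and combine the partial solutions into the final answer.
    return sum(partials)

print(parallel_sum(list(range(1, 101))))  # 5050
```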

Parallel Computer Structures
 Pipelined computers: a pipeline computer performs overlapped computations to exploit temporal parallelism.
 Array processors: an array processor uses multiple synchronized arithmetic logic units to achieve spatial parallelism.
 Multiprocessor systems: a multiprocessor system achieves asynchronous parallelism through a set of interactive processors.

Pipeline Computers
 Normally, there are four major steps to execute an instruction:
 Instruction Fetch (IF)
 Instruction Decoding (ID)
 Operand Fetch (OF)
 Execution (EX)

Nonpipelined Processor

Pipeline Processor
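The benefit of overlapping the four stages can be seen from a simple cycle count (a back-of-the-envelope model assuming one cycle per stage and no stalls; the function names are illustrative):

```python
def nonpipelined_cycles(n_instructions, n_stages=4):
    # Each instruction passes through all stages before the next one starts.
    return n_instructions * n_stages

def pipelined_cycles(n_instructions, n_stages=4):
    # The first instruction fills the pipe (n_stages cycles); after that,
    # one instruction completes every cycle.
    return n_stages + (n_instructions - 1)

print(nonpipelined_cycles(10))  # 40 cycles
print(pipelined_cycles(10))     # 13 cycles
```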

Array Computers
 An array processor is a synchronous parallel computer with multiple arithmetic logic units, called processing elements (PEs), that can operate in parallel.
 The PEs are synchronized to perform the same function at the same time.
 Only a few array computers are designed primarily for numerical computation; the others are for research purposes.

Functional structure of an array computer

Multiprocessor Systems
 A multiprocessor system is a single computer that includes multiple processors (computer modules).
 Processors may communicate and cooperate at different levels in solving a given problem.
 Communication may occur by sending messages from one processor to another or by sharing a common memory.
 A multiprocessor system is controlled by one operating system, which provides interaction between processors and their programs at the process, data set, and data element levels.

Functional structure of a multiprocessor system

Multicomputers
 There is a group of processors, each of which has a sufficient amount of local memory.
 Communication between the processors is through messages.
 There is neither a common memory nor a common clock.
 This is also called distributed processing.

Grid Computing
 Grid computing enables geographically dispersed computers or computing clusters to dynamically and virtually share applications, data, and computational resources.
 It uses standard TCP/IP networks to provide transparent access to technical computing services wherever capacity is available, transforming technical computing into an information utility that is available across a department or organization.

Multiplicity of Instruction-Data Streams
 In general, digital computers may be classified into four categories according to the multiplicity of instruction and data streams.
 An instruction stream is a sequence of instructions as executed by the machine.
 A data stream is a sequence of data, including input, partial, or temporary results, called for by the instruction stream.
 Flynn’s four machine organizations: SISD, SIMD, MISD, MIMD.

SISD
 Single Instruction stream-Single Data stream
 Instructions are executed sequentially but may be overlapped in their execution stages (pipelining).

SIMD
 Single Instruction stream-Multiple Data stream
 There are multiple PEs supervised by the same control unit.

MISD
 Multiple Instruction stream-Single Data stream
 The results (output) of one processor may become the input of the next processor in the macropipe.
 No real embodiment of this class exists.

MIMD
 Multiple Instruction stream-Multiple Data stream
 Most multiprocessor systems and multicomputer systems can be classified in this category.

Shared-Memory Multiprocessors
 Tightly coupled MIMD architectures share memory among their processors.
 Interconnection architectures:
 Bus-connected architecture – the processors, parallel memories, network interfaces, and device controllers are tied to the same connection bus.
 Directly connected architecture – the processors are connected directly, as in high-end mainframes.

Distributed-Memory Multiprocessors
 Loosely coupled MIMD architectures have distributed local memories attached to multiple processor nodes.
 Message passing is the major communication method among the processors.
 Most multiprocessors are designed to be scalable in performance.
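Message passing between nodes with private memories can be mimicked on one machine with operating-system processes (a sketch using Python’s multiprocessing pipes; in a real multicomputer the message would cross a network link, and the names here are illustrative):

```python
from multiprocessing import Process, Pipe

def node(conn):
    # A node with its own local memory: receive a message,
    # compute on the local data, and send the result back.
    data = conn.recv()
    conn.send(sum(data))
    conn.close()

if __name__ == "__main__":
    here, there = Pipe()
    p = Process(target=node, args=(there,))
    p.start()
    here.send([1, 2, 3, 4])   # message out to the other node
    print(here.recv())        # 10 (message back)
    p.join()
```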

Network Topologies
 Let’s assume processors function independently and communicate with each other. For these communications, the processors must be connected using physical links. Such a model is called a network model or direct-connection machine.
 Network topologies:
 Complete Graph (Fully Connected Network)
 Hypercubes
 Mesh Network
 Pyramid Network
 Star Graphs

Complete Graph
 A complete graph is a fully connected network.
 The distance between any two processors (or processing nodes) is always 1.
 In a complete graph network with n nodes, each node has degree n-1.
 An example with n = 5:

Hypercubes (k-cube)
 A k-cube is a k-regular graph with 2^k nodes, which are labeled by the k-bit binary numbers.
 A k-regular graph is a graph in which each node has degree k.
 The distance between two nodes a = (a1 a2 … ak) and b = (b1 b2 … bk) is the number of bits in which a and b differ. If two nodes are adjacent to each other, their distance is 1 (they differ in only 1 bit).
 In a hypercube with n nodes (n = 2^k), the longest distance between any two nodes is log2 n (= k).
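The distance rule above is just the Hamming distance between node labels, which is easy to check (a small sketch; the helper names are illustrative and labels are plain integers):

```python
def hypercube_distance(a, b):
    # Number of bit positions in which the labels a and b differ.
    return bin(a ^ b).count("1")

def neighbors(a, k):
    # Adjacent nodes differ in exactly one bit, so each node has degree k.
    return [a ^ (1 << i) for i in range(k)]

# In a 3-cube, nodes 000 and 111 are at the longest distance, log2(8) = 3.
print(hypercube_distance(0b000, 0b111))  # 3
print(neighbors(0b000, 3))               # [1, 2, 4]
```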

Hypercube Structures (diagrams of k-cubes for several values of k)

Mesh Network
 The arrangement of processors in the form of a grid is called a mesh network.
 A 2-dimensional mesh:
 A k-dimensional mesh is a set of (k-1)-dimensional meshes with corresponding processor communications.

3-Dimensional Mesh
A 3-D mesh with 4 copies of 4 × 4 2-D meshes
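Mesh adjacency can be sketched by stepping one unit along a single dimension (a hypothetical helper, not from the slides; nodes are coordinate tuples and dims gives the side length in each dimension):

```python
def mesh_neighbors(node, dims):
    # Neighbors in a k-dimensional mesh: move +/-1 along exactly one
    # dimension, staying inside the grid (no wrap-around).
    result = []
    for d in range(len(dims)):
        for step in (-1, 1):
            c = node[d] + step
            if 0 <= c < dims[d]:
                result.append(node[:d] + (c,) + node[d + 1:])
    return result

print(len(mesh_neighbors((0, 0), (4, 4))))        # 2 (corner of a 2-D mesh)
print(len(mesh_neighbors((1, 1, 1), (4, 4, 4))))  # 6 (interior of a 3-D mesh)
```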

Pyramid Network
 A pyramid network is constructed similarly to a rooted tree. The root contains one processor.
 At the next level there are four processors in the form of a 2-dimensional mesh, and all four are children of the root.
 All the nodes at the same level are connected in the form of a 2-dimensional mesh.
 Each nonleaf node has four children at the next level.
 The longest distance between any two nodes is 2 × the height of the tree.

Pyramid Network Structure
A pyramid of height 2

Star Graphs
 For a k-star graph, consider the permutations of k symbols.
 There are n nodes, where n = k! is the number of permutations.
 Two nodes are adjacent if and only if their corresponding permutations differ only in the leftmost position and in any one other position.
 A k-star graph can be considered a connection of k copies of (k-1)-star graphs.

A 3-Star Graph
 For k = 3, there are 6 permutations:
P0 = (1, 2, 3)    P5 = (3, 2, 1)
P3 = (2, 3, 1)    P2 = (2, 1, 3)
P1 = (1, 3, 2)    P4 = (3, 1, 2)
What is the degree of each node in a 4-star graph?
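The adjacency rule can be checked directly over permutations (a short sketch with illustrative names; it also answers the closing question, since swapping the leftmost symbol with any of the other k-1 positions gives exactly k-1 neighbors):

```python
from itertools import permutations

def star_adjacent(p, q):
    # Adjacent iff the permutations differ in the leftmost position
    # and exactly one other position (i.e., a swap with position 0).
    diff = [i for i in range(len(p)) if p[i] != q[i]]
    return len(diff) == 2 and 0 in diff

def degree(node, k):
    # Count neighbors of one node among all k! permutations.
    return sum(star_adjacent(node, q) for q in permutations(range(1, k + 1)))

print(degree((1, 2, 3), 3))     # 2
print(degree((1, 2, 3, 4), 4))  # 3, i.e. k - 1
```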