High Performance Computing & Bioinformatics Part 2 Dr. Imad Mahgoub


Overview of Interconnection Networks
SIMD Interconnection Networks
Bidirectional Single Loop Structure
Advantages
• Total number of links is small (N)
• The degree of a node is 2
Disadvantages
• Diameter is large (N/2), so the average message delay and message traffic density are proportional to N (for an N-processor loop)
• A fault in any two non-adjacent nodes will disconnect the loop
January 11, 2019 Dr. Mahgoub

Suitable Applications
• Parallel algorithms that contain assignments of the following type: x[i] := x[i-1] + x[i+1] - 2*x[i]
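The assignment above only needs values from the two loop neighbors of node i, which is why it maps well onto a bidirectional loop. A minimal Python sketch follows; the function name ring_step, the synchronous update (all nodes read old values before any node writes), and the wrap-around indexing at the ends are assumptions made for illustration, not part of the slides.

```python
def ring_step(x):
    """One synchronous update x[i] := x[i-1] + x[i+1] - 2*x[i].

    Assumption: indices wrap around, so node 0 and node n-1 are neighbors,
    as they would be on a bidirectional loop of n processors.
    """
    n = len(x)
    # Each "processor" i reads only its left and right neighbor.
    return [x[(i - 1) % n] + x[(i + 1) % n] - 2 * x[i] for i in range(n)]


if __name__ == "__main__":
    print(ring_step([1, 2, 4, 8]))
```

Each node exchanges one value with each neighbor per step, so the communication cost per step does not grow with N.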

Completely Connected Networks
Advantages
• Minimum diameter (1)
Disadvantages
• Cost is proportional to N^2, so practical only for N < 5 (N = number of nodes)
• The degree of a node (N-1) is high
Figure: Completely Connected Network (N = 4)
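To make the trade-off between the loop and the completely connected network concrete, the sketch below tabulates link count, node degree, and diameter for both. The function names are hypothetical; the link count N(N-1)/2 for the completely connected case is the standard count of node pairs, which is what makes the cost grow as N^2.

```python
def loop_metrics(n):
    """(links, node degree, diameter) of a bidirectional loop of n >= 3 nodes."""
    return n, 2, n // 2


def complete_metrics(n):
    """(links, node degree, diameter) of a completely connected network of n >= 2 nodes."""
    # One link per unordered pair of nodes.
    return n * (n - 1) // 2, n - 1, 1


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, loop_metrics(n), complete_metrics(n))
```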

Tree Structure
Binary Tree
Advantages
• Low degree of a node
• Good inter-node distance
• Line and connection costs are proportional to N (N = number of nodes)
Disadvantages
• High message traffic density through single nodes (especially at the second level)
• Any fault at the root will completely disconnect the system
Figure: Tree Structure with B = 2 and L = 3 (B = number of branches, L = number of levels)
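The traffic concentration near the root can be seen by tracing message paths. The sketch below assumes a binary tree whose nodes are numbered heap-style (root = 1, children of node v are 2v and 2v+1); this numbering and the name tree_path are illustrative assumptions, not taken from the slides.

```python
def tree_path(a, b):
    """Nodes visited by a message from a to b in a heap-numbered binary tree."""
    up_a, up_b = [], []
    # Repeatedly move the deeper (larger-numbered) endpoint to its parent
    # until both endpoints meet at their lowest common ancestor.
    while a != b:
        if a > b:
            up_a.append(a)
            a //= 2
        else:
            up_b.append(b)
            b //= 2
    return up_a + [a] + up_b[::-1]


if __name__ == "__main__":
    # Leaves 4..7 of a tree with B = 2, L = 3.
    print(tree_path(4, 5))  # stays inside the left subtree
    print(tree_path(4, 7))  # must cross the root
```

Any message between the left and right subtrees passes through node 1, which is why a root fault splits the system.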

Hypercube Connection Structures
• In general, a hypercube structure can be obtained if the number of nodes N equals W^D, W and D being integers
• If W = 2 and D = n (n is an integer), the hypercube structure reduces to the Boolean n-cube

Figure: The Boolean n-cube (N = 2^n)
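In the Boolean n-cube, two nodes are usually taken to be adjacent when their n-bit labels differ in exactly one bit. A short sketch under that standard labeling follows; the function names are hypothetical.

```python
def ncube_neighbors(node, n):
    """Labels of the n neighbors of `node` in a Boolean n-cube (flip one bit each)."""
    return [node ^ (1 << i) for i in range(n)]


def ncube_distance(a, b):
    """Minimum hop count between a and b: the number of differing label bits."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    print(ncube_neighbors(0b000, 3))
    print(ncube_distance(0b000, 0b111))
```

With this labeling the degree and the diameter are both n = log2 N.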

Mesh
• Nodes are arranged into a q-dimensional lattice
• Communication is allowed only between neighboring nodes, so interior nodes communicate with 2q other processors
• Some variants of the mesh allow (toroidal) wrap-around connections between processors on the edge

Figure: Mesh. Panels: two-dimensional mesh; with wrap-around; with toroidal wrap-around
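The neighbor rule for a q-dimensional mesh can be sketched as follows. The function name mesh_neighbors, the tuple coordinates, and the wrap flag are assumptions for illustration; with wrap=True the sketch models the toroidal variant and assumes every dimension has at least 3 nodes so that the two wrap-around neighbors are distinct.

```python
def mesh_neighbors(coord, shape, wrap=False):
    """Neighbors of `coord` in a q-dimensional mesh with the given `shape`.

    coord and shape are tuples of length q. With wrap=True, edge nodes get
    toroidal wrap-around links.
    """
    out = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            c = coord[axis] + step
            if wrap:
                c %= size
            elif not 0 <= c < size:
                continue  # no link beyond the edge of a plain mesh
            out.append(coord[:axis] + (c,) + coord[axis + 1:])
    return out


if __name__ == "__main__":
    print(mesh_neighbors((1, 1), (3, 3)))             # interior node: 2q = 4 neighbors
    print(mesh_neighbors((0, 0), (3, 3)))             # corner node without wrap-around
    print(mesh_neighbors((0, 0), (3, 3), wrap=True))  # corner node with wrap-around
```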

Applications
• Efficient sorting and matrix multiplication algorithms have been designed for meshes of processors.
• Suitable for solving systems of second-order partial differential equations.

Shuffle-Exchange
• Consists of n = 2^k nodes numbered 0, 1, ..., n-1, and two kinds of connections, called shuffle and exchange
• Exchange: links pairs of nodes whose numbers differ in their least significant bit
• Perfect shuffle: i --> 2i mod (n-1) (node n-1 is connected to itself)
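For n a power of two, the perfect shuffle i --> 2i mod (n-1) is the same as rotating the k-bit node number left by one position, and the exchange flips the least significant bit. The sketch below uses that bit-level view; the function names are hypothetical.

```python
def shuffle(i, k):
    """Perfect shuffle on n = 2**k nodes: rotate the k-bit number i left by one bit."""
    mask = (1 << k) - 1
    return ((i << 1) | (i >> (k - 1))) & mask


def exchange(i):
    """Exchange link: the partner differs only in the least significant bit."""
    return i ^ 1


if __name__ == "__main__":
    k = 3
    for i in range(1 << k):
        print(i, "shuffle ->", shuffle(i, k), " exchange ->", exchange(i))
```

Nodes 0 and n-1 map to themselves under the shuffle, which matches the special case noted for node n-1.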

Figure: Shuffle-Exchange Network with 8 Nodes
<--------> : Shuffle
------------ : Exchange

Cube-Connected Cycle
• Obtained by replacing each node of the k-dimensional cube by a ring (cycle) of k nodes, so it has k x 2^k nodes
• Rings are numbered from 0 to 2^k - 1
• Nodes on a ring are numbered from 0 to k-1
• If two rings have numbers differing by 2^i, connect node i on these rings

Figure: 3-Cube and Cube-Connected Cycles
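The connection rule for cube-connected cycles can be sketched by labeling each node as a pair (ring, position). The pair representation and the name ccc_neighbors are assumptions for illustration; the sketch assumes k >= 3 so that the two ring neighbors are distinct.

```python
def ccc_neighbors(ring, pos, k):
    """Neighbors of node `pos` on ring `ring` in a cube-connected cycle of dimension k."""
    return [
        (ring, (pos - 1) % k),     # previous node on the same ring
        (ring, (pos + 1) % k),     # next node on the same ring
        (ring ^ (1 << pos), pos),  # node `pos` on the ring whose number differs by 2**pos
    ]


if __name__ == "__main__":
    k = 3
    nodes = [(r, p) for r in range(1 << k) for p in range(k)]
    print(len(nodes))              # k * 2**k nodes in total
    print(ccc_neighbors(0, 0, k))
```

Every node has degree 3 regardless of k, which is the main attraction of this structure compared with the n-cube.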

MIMD Networks
• Bus network
• Multiple-bus
• Crossbar
• Multistage
Figure: Single Bus Network

Advantages and Disadvantages of Single Shared Bus Networks

Multiple-Bus Networks
• Emerged as a solution to the bus contention problem in single-bus schemes
• Allow easy incremental expansion of the number of processors and memory modules in the system
• The buses can be configured in a variety of ways to provide a range of trade-offs between bandwidth, connection cost, and reliability

Figure: An M x N x B Multiple-Bus Multiprocessor

Crossbar Networks
• A crossbar can be viewed as a number of vertical and horizontal links interconnected by a switch at each intersection
Figure: A 4 x 4 Crossbar Switch

Advantages and Disadvantages of Crossbar Networks

Multistage Networks
• Less complex than the crossbar
• Contain a number of switching elements (like crossbar switches), which typically have the same number of inputs and outputs (say k)
• The switching elements are organized in log_k N stages with N/k switching elements in each stage, where N = number of processors and memory modules
• One unique path exists between every processor-memory pair
• A request must pass through all stages of switching elements to reach its final destination, so the latency is O(log_k N)
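The sizing rule above (log_k N stages of N/k switching elements each) can be checked with a small calculation. The function name multistage_size is hypothetical, and the sketch assumes N is an exact power of k.

```python
def multistage_size(n_ports, k):
    """Return (stages, switches per stage, total switches) for N ports and k x k switches."""
    stages, size = 0, 1
    while size < n_ports:
        size *= k
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be an exact power of k")
    per_stage = n_ports // k
    return stages, per_stage, stages * per_stage


if __name__ == "__main__":
    for n in (8, 64, 1024):
        # A full N x N crossbar needs N * N crosspoints, for comparison.
        print(n, multistage_size(n, 2), n * n)
```

The number of stages is also the number of switching elements a request traverses, which is the O(log_k N) latency stated in the slide.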

Omega Networks
Each switch box (2 x 2) has four switch functions:
• Straight through
• Interchange
• Upper broadcast
• Lower broadcast

Figure: 8 x 8 Omega Network (010 ---------> <101>)

Routing of requests
• Typically distributed
• Omega is a self-routing network
• Outputs of a switching element are numbered "0" and "1"
• If the destination address is <d0 d1 ... dn-1>, where N = 2^n, then the switching element in stage i sends the message to output di
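The self-routing rule can be simulated in a few lines. The sketch below assumes the usual Omega construction in which every stage is preceded by a perfect-shuffle wiring, and it assumes d0 is the most significant bit of the destination address; the function name omega_route and the returned tuple layout are illustrative only.

```python
def omega_route(src, dst, n):
    """Trace a message through an Omega network with N = 2**n inputs.

    Returns a list of (stage, output chosen, line number after the stage).
    """
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        # Perfect-shuffle wiring in front of the stage: rotate the line number left.
        pos = ((pos << 1) | (pos >> (n - 1))) & mask
        # Destination-tag routing: stage i looks only at bit d_i of the destination.
        bit = (dst >> (n - 1 - stage)) & 1
        # Output 0 is the upper line of the switch, output 1 the lower line.
        pos = (pos & ~1) | bit
        path.append((stage, bit, pos))
    return path


if __name__ == "__main__":
    for hop in omega_route(0b010, 0b101, 3):
        print(hop)
```

No switch needs global information: each one reads a single destination bit, and after n stages the line number equals the destination address.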

Disadvantages Improvements Advantages