Interconnection Networks (Chapter 6)


References:
[1, Wilkinson and Allen, Ch. 1]
[2, Akl, Chapter 2]
[3, Quinn, Chapters 2-3]
[25, Kumar et al.]
[26, Tom Leighton, Introduction to Parallel Algorithms and Architectures]

Comments on References:
– Reference [3] is a particularly good reference for this chapter.
– Reference [26] is a classic and gives a detailed coverage of different networks.

Review and Additional Network Concepts
A link is the connection between two nodes.
– A switch that enables packets to be routed through the node to other nodes without disturbing the processor is assumed.
– The connection between two nodes can be either one bidirectional link or two directional links.
– A link can use either one wire that carries one bit at a time or parallel wires (one wire for each bit in a word).
– These choices do not have a major impact on the concepts presented in this course.

Interconnection Network Terminology (cont.)
The diameter is the minimal number of links between the two farthest nodes in the network.
– The diameter of a network gives the maximal distance a single message may have to travel.
The bisection width of a network is the number of links that must be cut to divide the network of n PEs into two (almost) equal parts of ⌈n/2⌉ and ⌊n/2⌋ PEs.
The terminology below is given in [1]:
– The bandwidth is the number of bits that can be transmitted in unit time (i.e., bits per second).
– The network latency is the time required to transfer a message through the network.
– The communication latency is the total time required to send a message, including software overhead and interface delay.
– The message latency or startup time is the time required to send a zero-length message. It consists of software and hardware overhead, such as
  » finding a route
  » packing and unpacking the message

Interconnection Network Examples
Completely Connected Network
– Each of the n nodes has a link to every other node.
– Requires n(n-1)/2 links.
– Impractical unless there are very few processors.
Line/Ring Network
– A line consists of a row of n nodes, with connections between adjacent nodes.
– Called a ring when a link is added to connect the two end nodes of a line.
– The line/ring networks have many applications.
– The diameter of a line is n-1 and of a ring is ⌊n/2⌋.
– Minimal distance, deadlock-free parallel routing algorithm: go in the shorter of the two directions, left or right.
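The ring routing rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming nodes are numbered 0 to n-1 around the ring; the function name `ring_route` is hypothetical and not taken from the references.

```python
def ring_route(src, dst, n):
    """Path from src to dst on an n-node ring, going the shorter way round."""
    clockwise = (dst - src) % n              # hops needed if we keep adding 1
    step = 1 if clockwise <= n - clockwise else -1
    path = [src]
    while path[-1] != dst:
        path.append((path[-1] + step) % n)   # wrap around at the ends
    return path


# On an 8-node ring, going from 1 to 6 is shorter "leftwards" through node 0.
print(ring_route(1, 6, 8))   # [1, 0, 7, 6]
```

No route produced this way uses more than ⌊n/2⌋ links, which matches the ring diameter stated above.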

Interconnection Network Examples (cont.)
The Mesh Interconnection Network
– Each interior node in a 2D mesh is connected to its four nearest neighbors.
– The diameter of a √n × √n mesh is 2(√n - 1).
– Has a minimal distance, deadlock-free parallel routing algorithm: first route the message up or down and then right or left to its destination.
– If the horizontal and vertical ends of a mesh are connected to the opposite sides, the network is called a torus.
– Meshes have been used more on actual computers than any other network.
– A 3D mesh is a generalization of a 2D mesh and has been used in several computers.
– The fact that 2D and 3D meshes model physical space makes them useful for many scientific and engineering problems.
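The mesh routing rule (vertical moves first, then horizontal) can be sketched as follows. This is an illustrative sketch, assuming nodes are addressed as (row, column) pairs; `mesh_route` is a hypothetical name.

```python
def mesh_route(src, dst):
    """Route on a 2D mesh: first up/down to the target row, then left/right."""
    (r, c), (dr, dc) = src, dst
    path = [(r, c)]
    while r != dr:                    # vertical phase
        r += 1 if dr > r else -1
        path.append((r, c))
    while c != dc:                    # horizontal phase
        c += 1 if dc > c else -1
        path.append((r, c))
    return path


print(mesh_route((0, 0), (2, 1)))   # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

A corner-to-corner route on a 4 × 4 mesh takes 6 hops, which agrees with the diameter 2(√n - 1) for n = 16.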

Interconnection Network Examples (cont.)
Binary Tree Network
– A binary tree network is normally assumed to be a complete binary tree.
– It has a root node, and each interior node has two links connecting it to nodes in the level below it.
– The height of the tree is ⌊lg n⌋ and its diameter is 2⌊lg n⌋.
– In an m-ary tree, each interior node is connected to m nodes on the level below it.
– The tree is particularly useful for divide-and-conquer algorithms.
– Unfortunately, the bisection width of a tree is 1 and the communication traffic increases near the root, which can be a bottleneck.
– In fat tree networks, the number of links is increased as the links get closer to the root.
– Thinking Machines' CM5 computer used a 4-ary fat tree network.
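The height and diameter figures for the binary tree can be checked on a small instance. The sketch below assumes the nodes of a complete binary tree are numbered 1 to n in heap order (node i has parent i // 2), which is a labelling chosen here for illustration and not one given in the references.

```python
def tree_distance(a, b):
    """Number of links between nodes a and b in a heap-numbered binary tree."""
    hops = 0
    while a != b:
        # The larger index is never closer to the root, so move it up one level.
        if a > b:
            a //= 2
        else:
            b //= 2
        hops += 1
    return hops


n = 15                               # complete binary tree with 4 levels
height = n.bit_length() - 1          # floor(lg n) = 3
diameter = max(tree_distance(a, b)
               for a in range(1, n + 1) for b in range(1, n + 1))
print(height, diameter)              # 3 6
```

The longest path runs from a leaf in the left subtree through the root to a leaf in the right subtree, which is why every such message crosses the root.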

Interconnection Network Examples (cont.)
Hypercube Network
– A 0-dimensional hypercube consists of one node.
– Recursively, a d-dimensional hypercube consists of two (d-1)-dimensional hypercubes, with the corresponding nodes of the two (d-1)-dimensional hypercubes linked.
– Each node in a d-dimensional hypercube has d links.
– Each node in a hypercube has a d-bit binary address.
– Two nodes are connected if and only if their binary addresses differ in exactly one bit.
– A hypercube has n = 2^d PEs.
– Advantages of the hypercube include
  » its low diameter of lg(n), or d
  » its large bisection width of n/2
  » its regular structure.
– An important practical disadvantage of the hypercube is that the number of links per node increases as the number of processors increases.
  » Large hypercubes are difficult to implement.
  » Usually overcome by replacing each node with a ring of nodes, which increases the number of nodes.
– Has a "minimal distance, deadlock-free parallel routing" algorithm called e-cube routing:
  » At each step, the current address and the destination address are compared.
  » Each message is sent to the node whose address is obtained by flipping the leftmost bit of the current address in which the two addresses differ.
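E-cube routing as described above can be sketched with bit operations. This is a minimal sketch, assuming node addresses are stored as Python integers with bit d-1 as the leftmost address bit; `ecube_route` is a hypothetical name.

```python
def ecube_route(src, dst, d):
    """E-cube route in a d-dimensional hypercube, fixing differing bits left to right."""
    path = [src]
    cur = src
    for bit in reversed(range(d)):       # leftmost (most significant) bit first
        mask = 1 << bit
        if (cur ^ dst) & mask:           # addresses differ in this position
            cur ^= mask                  # cross the link in that dimension
            path.append(cur)
    return path


# 110 -> 010 -> 011 in a 3-dimensional hypercube
print([format(v, "03b") for v in ecube_route(0b110, 0b011, 3)])
# ['110', '010', '011']
```

The number of hops equals the number of bit positions in which the two addresses differ, so the route is a shortest one and never exceeds d links.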

Some Additional Networks
Shuffle Exchange
– Let n be a power of 2 and let P_0, P_1, ..., P_(n-1) denote the processors.
– A perfect-shuffle connection is a one-way communication link that exists from
  » P_i to P_(2i) if i < n/2, and
  » P_i to P_(2i+1-n) if i ≥ n/2.
– Alternately, a perfect-shuffle connection exists from P_i to P_k if a left one-digit circular rotation of i, expressed in binary, produces k.
– Its name is due to the fact that if a deck of cards were "shuffled perfectly", the shuffle link of i gives the final shuffled position of card i.
  » Example: See Figure 2.15 of [2, Akl].
– An exchange connection is a two-way link that exists between P_i and P_(i+1) when i is even.
– Figure 2.14 of [2, Akl] illustrates the shuffle and exchange links for 8 processors.
– The reverse of a perfect-shuffle link is called an unshuffle link.
– A network with the shuffle, unshuffle, and exchange connections is called a shuffle-exchange network.
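The two descriptions of the perfect-shuffle link (the arithmetic rule and the circular bit rotation) can be compared directly. The sketch below assumes n is a power of 2 with n ≥ 2; the function names are hypothetical.

```python
def shuffle(i, n):
    """Target of the perfect-shuffle link leaving P_i (arithmetic rule)."""
    return 2 * i if i < n // 2 else 2 * i + 1 - n


def rotate_left(i, n):
    """Same target, computed as a one-bit left circular rotation of i."""
    q = n.bit_length() - 1                    # number of address bits, n = 2**q
    return ((i << 1) | (i >> (q - 1))) & (n - 1)


def exchange(i):
    """Partner of P_i on its exchange link (flips the last address bit)."""
    return i ^ 1


n = 8
print([shuffle(i, n) for i in range(n)])      # [0, 2, 4, 6, 1, 3, 5, 7]
print(all(shuffle(i, n) == rotate_left(i, n) for i in range(n)))   # True
```

Because the link is a rotation of a q-bit address, following q shuffle links in a row returns to the starting processor.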

Cube-Connected Cycles (or CCC)
– A problem with the hypercube network with n = 2^q PEs is the large number of links each processor must support when q is large.
– The CCC solves this problem by replacing each node of the q-dimensional hypercube with a ring of q processors, each connected to 3 PEs:
  » its two neighbors in the ring
  » one processor in the ring of a neighboring hypercube node.
– Example: See Figure 2.18 in [2, Akl].

Network Metrics
Recall the metrics for comparing network topologies:
– Degree
  » The degree of a network is the maximum number of links incident on any processor.
  » Each link uses a port on the processor, so the most economical network has the lowest degree.
– Diameter
  » The distance between two processors P and Q is the number of links on the shortest path from P to Q.
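The CCC construction can be sketched by listing the neighbors of a node. The labelling used here is an assumption made for illustration: a processor is a pair (x, p), where x is the address of the hypercube node it replaces and p is its position in that node's ring, and the processor at position p owns the hypercube link in dimension p.

```python
def ccc_neighbors(x, p, q):
    """Neighbors of processor (x, p) in a cube-connected cycles network."""
    return {
        (x, (p - 1) % q),        # previous processor in the same ring
        (x, (p + 1) % q),        # next processor in the same ring
        (x ^ (1 << p), p),       # same position in the ring across dimension p
    }


q = 3
nodes = [(x, p) for x in range(2 ** q) for p in range(q)]
print(len(nodes))                                        # 24 processors
print({len(ccc_neighbors(x, p, q)) for x, p in nodes})   # {3}
```

For q ≥ 3 every processor has exactly 3 links regardless of q, whereas a hypercube node would need q links.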

Comparison of Network Topologies (cont.)
– The diameter of a network is the maximum distance between pairs of processors.
– The bisection width of a network is the minimum number of edges that must be cut to divide the network into two halves (within one).
Table 2.21 in [2] (reproduced below) compares the topologies of the networks we have discussed.
– See Table 3-1 of Quinn for additional details.

Topology           Degree     Diameter    Bis. W.
=================================================
Linear Array       2          O(n)        1
Mesh               4          O(√n)       √n
Tree               3          O(lg n)     1
Shuffle-Exchange   3          O(lg n)
Hypercube          O(lg n)    O(lg n)     2^(d-1)
Cube-Con. Cycles   3          O(lg n)     2^(d-1)
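The degree and diameter columns can be spot-checked on small instances by building each network as an adjacency list and running a breadth-first search from every node. This is a rough sketch for small n only; all function names are hypothetical.

```python
from collections import deque


def line(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def mesh(k):
    adj = {}
    for r in range(k):
        for c in range(k):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            adj[(r, c)] = [(a, b) for a, b in cand if 0 <= a < k and 0 <= b < k]
    return adj


def hypercube(d):
    return {u: [u ^ (1 << b) for b in range(d)] for u in range(2 ** d)}


def degree(adj):
    """Maximum number of links incident on any node."""
    return max(len(nbrs) for nbrs in adj.values())


def diameter(adj):
    """Maximum shortest-path distance over all pairs (connected network assumed)."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


for name, net in [("line(8)", line(8)), ("mesh(4)", mesh(4)), ("hypercube(4)", hypercube(4))]:
    print(name, degree(net), diameter(net))
# line(8) 2 7
# mesh(4) 4 6
# hypercube(4) 4 4
```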

Embedding
References: [1, Wilkinson], [3, Quinn], [25, Kumar et al.], [26, Leighton].
– The coverage in Quinn is good, but does not cover a few topics covered here.
– Leighton has an encyclopedic coverage of many interconnection network topics, including models, algorithms, and embeddings.
– Coverage currently follows [1]; this will change in the future.
An embedding is a 1-1 function (also called a mapping) that specifies how the nodes of a domain network can be mapped into a range network.
– Each node in the range network is the target of at most one node in the domain network, unless specified otherwise.
– The domain network should "cover" or map onto as many nodes as possible in the range network (i.e., keep the range network "small").
– Reference [1] calls an embedding perfect if each link in the domain network corresponds under the mapping to one link in the range network.
  » Nearest neighbors are preserved by the mapping.
– A perfect embedding of a ring onto a torus is shown in [1, Fig. 1.15].
– A perfect embedding of a mesh/torus in a hypercube is given in [1, Fig. 1.16].
  » Uses a Gray code along each mesh dimension.
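The role of the Gray code can be illustrated in the one-dimensional case, a ring of 2^d nodes embedded in a d-dimensional hypercube. The sketch below uses the binary reflected Gray code, which is an assumption here since the slide does not say which Gray code [1] uses; `gray` is a hypothetical name.

```python
def gray(i):
    """Binary reflected Gray code of i."""
    return i ^ (i >> 1)


d = 3
n = 2 ** d
mapping = [gray(i) for i in range(n)]      # ring node i -> hypercube node gray(i)
print([format(v, "03b") for v in mapping])
# ['000', '001', '011', '010', '110', '111', '101', '100']

# Every ring link, including the wraparound link, maps to one hypercube link:
# the images of adjacent ring nodes differ in exactly one bit.
print(all(bin(mapping[i] ^ mapping[(i + 1) % n]).count("1") == 1 for i in range(n)))
# True
```

Since each ring link lands on a single hypercube link, this embedding is perfect in the sense used by [1].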

The dilation of an embedding is the maximum number of links in the range network corresponding to one link in the domain network (i.e., its "stretch").
– Perfect embeddings have a dilation of 1.
Embeddings of binary trees in other networks are used in Ch. 3-4 of [1] for broadcasts and reductions.
Some results on binary tree embeddings follow.
– Theorem: A complete binary tree of height greater than 4 cannot be embedded in a 2D mesh with a dilation of 1. (Quinn, 1994, pg. 135)
– Hmwk Problem: A dilation-2 embedding of a binary tree of height 4 is given in [1, Fig. 1.17]. Find a dilation-1 embedding of this binary tree.
– Theorem: There exists an embedding of a complete binary tree of height n into a 2D mesh with dilation ⌈n/2⌉.
– Theorem: A complete binary tree of height n has a dilation-2 embedding in a hypercube of dimension n+1 for all n > 1.
Note: Network embeddings allow algorithms for the domain network to be executed using the target nodes and specified links of the range network.
Warning: In [1], the authors often use the words "onto" and "into" incorrectly, as an embedding is technically a mapping (i.e., a 1-1 function). Their treatment also contains some other wording errors.
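Dilation can be computed directly from its definition once the mapping and the distances in the range network are known. The example below is an illustration chosen here, not one taken from the references: it embeds a ring into a line of the same size, first with the identity mapping and then with a "folded" mapping that interleaves the two halves of the ring. All names are hypothetical.

```python
def dilation(domain_edges, mapping, range_distance):
    """Largest range-network distance spanned by the image of a domain link."""
    return max(range_distance(mapping[u], mapping[v]) for u, v in domain_edges)


def ring_edges(n):
    return [(i, (i + 1) % n) for i in range(n)]


def line_distance(a, b):
    """Number of links between positions a and b on a line."""
    return abs(a - b)


def folded(n):
    """Place the first half of the ring on even positions, the rest on odd ones."""
    half = (n + 1) // 2
    return {i: 2 * i if i < half else 2 * (n - 1 - i) + 1 for i in range(n)}


n = 8
identity = {i: i for i in range(n)}
print(dilation(ring_edges(n), identity, line_distance))    # 7
print(dilation(ring_edges(n), folded(n), line_distance))   # 2
```

With the identity mapping the wraparound link of the ring is stretched across the whole line, while the folded mapping keeps every ring link within 2 line links.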