Indirect Networks or Dynamic Networks

Presentation transcript:

Interconnection Networks: Indirect Networks or Dynamic Networks
Guihai Chen …with major presentation contribution from José Flich, UPV (and Cell BE EIB slides by Tom Ainsworth, USC)

Questions in mind
Difference between static/direct and dynamic/indirect networks
Why multistage interconnection networks (MINs)
Large switch and small switch
How to design non-blocking MINs

Outline
Network Topology
Preliminaries and evolution
Centralized switched (indirect) networks
From non-blocking crossbar to blocking MINs
From blocking MINs to non-blocking MINs

Network Topology: Preliminaries and Evolution
One switch suffices to connect a small number of devices.
The number of switch ports is limited by VLSI technology, power consumption, packaging, and other such cost constraints.
A fabric of interconnected switches (i.e., switch fabric or network fabric) is needed when the number of devices is much larger.
The topology must make at least one path available for every pair of devices: the property of connectedness or full access. (What paths?)
Topology defines the connection structure across all components.
Bisection bandwidth: the minimum bandwidth of all links crossing a network split into two roughly equal halves.
Full bisection bandwidth: Network BW_Bisection = Injection (or Reception) BW_Bisection = N/2.
Bisection bandwidth mainly affects performance.
Topology is constrained primarily by local chip/board pin-outs and secondarily (if at all) by global bisection bandwidth.

Network Topology: Preliminaries and Evolution
Several tens of topologies have been proposed, but fewer than a dozen are used.
1970s and 1980s: topologies were proposed to reduce hop count.
1990s: pipelined transmission and switching techniques; packet latency became decoupled from hop count.
2000s: topology is still important (especially for OCNs, SANs, P2P overlays, DCNs) when N is high.
Topology impacts performance and has a major impact on cost.

Network Topology: Centralized Switched (Indirect) Networks
Crossbar network
Crosspoint switch complexity increases quadratically with the number of crossbar input/output ports, N, i.e., it grows as O(N^2).
Has the property of being non-blocking.
[Figure: crossbar network.]
Here, "centralized switched network" means indirect network.

Network Topology: From Crossbar to MINs

Network Topology: Centralized Switched (Indirect) Networks
Multistage interconnection networks (MINs)
Crossbar split into several stages consisting of smaller crossbars.
Complexity grows as O(N × log N), where N is the number of end nodes.
Inter-stage connections are represented by a set of permutation functions.
[Figure: Omega topology, perfect-shuffle exchange.]

Network Topology: Appendix
Shuffle function on addresses i = 0 … N-1:
f(i) = 2i, when i < N/2
f(i) = (2i + 1) mod N, when i ≥ N/2
Often used as a connection pattern; the unshuffle function (its inverse) is also often used.
Shuffle and shift: what is their relation? Shuffle is actually the left-cyclic shift of the binary address.
[Figure: perfect-shuffle and unshuffle connection patterns.]
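The relation between shuffle and shift can be checked directly. The following is a minimal Python sketch, not part of the original slides; the function names `shuffle`, `rotate_left`, and `unshuffle` are illustrative choices, and N is assumed to be a power of two.

```python
def shuffle(i: int, n_ports: int) -> int:
    """Perfect shuffle as defined on the slide: 2i if i < N/2, else (2i + 1) mod N."""
    if i < n_ports // 2:
        return 2 * i
    return (2 * i + 1) % n_ports


def rotate_left(i: int, bits: int) -> int:
    """Left-cyclic shift of a `bits`-wide binary address."""
    msb = (i >> (bits - 1)) & 1
    return ((i << 1) & ((1 << bits) - 1)) | msb


def unshuffle(i: int, bits: int) -> int:
    """Inverse of the shuffle: right-cyclic shift of the binary address."""
    lsb = i & 1
    return (i >> 1) | (lsb << (bits - 1))


N, BITS = 8, 3  # assumed example size: 8 ports, 3 address bits
table = [shuffle(i, N) for i in range(N)]
print(table)  # [0, 2, 4, 6, 1, 3, 5, 7]
print(all(shuffle(i, N) == rotate_left(i, BITS) for i in range(N)))  # True
```

For example, address 4 (binary 100) maps to 1 (binary 001): the most significant bit wraps around to the least significant position.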

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 4-stage Omega network, ports 0000 to 1111.]
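As an illustration of how a message could traverse such an Omega network, here is a hedged Python sketch. It assumes the commonly described arrangement in which every stage is preceded by a perfect shuffle and each 2 × 2 switch forwards to its upper (0) or lower (1) output according to one destination bit, most significant bit first. The function names are hypothetical and not taken from the slides.

```python
def rotate_left(addr: int, bits: int) -> int:
    """Perfect shuffle: left-cyclic shift of a `bits`-wide address."""
    msb = (addr >> (bits - 1)) & 1
    return ((addr << 1) & ((1 << bits) - 1)) | msb


def omega_route(src: int, dst: int, bits: int):
    """Trace one path; return (link position after each stage, switch settings)."""
    pos = src
    positions, settings = [], []
    for stage in range(bits):
        pos = rotate_left(pos, bits)             # inter-stage shuffle wiring
        want = (dst >> (bits - 1 - stage)) & 1   # destination bit, MSB first
        # The switch input is the low bit of pos; crossing is needed if it differs.
        settings.append("cross" if (pos & 1) != want else "straight")
        pos = (pos & ~1) | want                  # leave on upper (0) or lower (1) output
        positions.append(pos)
    return positions, settings


positions, settings = omega_route(0b0101, 0b1100, bits=4)
print([format(p, "04b") for p in positions])  # ['1011', '0111', '1110', '1100']
print(settings)  # ['cross', 'straight', 'straight', 'cross']
```

Under these assumptions the switch setting at each stage equals the corresponding bit of source XOR destination (here 0101 XOR 1100 = 1001), which is one way to read "routing by comparing source and destination".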

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 4-stage Baseline network, ports 0000 to 1111.]
What is the difference between Omega and Baseline? Which is better?
Omega: not recursive; routing by comparing source and destination.
Baseline: recursive; routing depends only on the destination.
"Recursive" means that a large network is composed of two or more smaller networks.

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 4-stage Butterfly network, ports 0000 to 1111.]

Network Topology: Correction to Butterfly
Centralized Switched (Indirect) Networks
[Figure: 16-port, 4-stage Butterfly network, ports 0000 to 1111.]
I add this slide as a correction to the previous one. I think this one is more like a butterfly. --gchen on Nov. 20, 2011

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 4-stage Cube network, ports 0000 to 1111.]
Why is it called a Cube network? (Nov. 20, 2011)

Network Topology: Centralized Switched (Indirect) Networks
Multistage interconnection networks (MINs)
MINs interconnect N input/output ports using k × k switches:
log_k N switch stages, each with N/k switches;
(N/k) × log_k N switches in total.
Example: Compute the switch and link costs of interconnecting 4096 nodes using a crossbar relative to a MIN, assuming that switch cost grows quadratically with the number of input/output ports (k). Consider the following values of k:
MIN with 2 × 2 switches
MIN with 4 × 4 switches
MIN with 16 × 16 switches

Network Topology: Centralized Switched (Indirect) Networks
Example: compute the relative switch and link costs, N = 4096
cost(crossbar)_switches = 4096^2
cost(crossbar)_links = 8192
relative_cost(2 × 2)_switches = 4096^2 / (2^2 × 4096/2 × log_2 4096) ≈ 170.7
relative_cost(2 × 2)_links = 8192 / (4096 × (log_2 4096 + 1)) = 2/13 = 0.1538
relative_cost(4 × 4)_switches = 4096^2 / (4^2 × 4096/4 × log_4 4096) ≈ 170.7
relative_cost(4 × 4)_links = 8192 / (4096 × (log_4 4096 + 1)) = 2/7 = 0.2857
relative_cost(16 × 16)_switches = 4096^2 / (16^2 × 4096/16 × log_16 4096) ≈ 85.3
relative_cost(16 × 16)_links = 8192 / (4096 × (log_16 4096 + 1)) = 2/4 = 0.5
Notes: 1) Relative cost: the larger, the better. 2) For switches: the smaller, the better; for links, …
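The arithmetic of this example can be reproduced with a short Python sketch. It follows the slide's cost model (a k × k switch costs k^2, a MIN has log_k N stages of N/k switches and N × (log_k N + 1) links, a crossbar has 2N links); the function names `int_log` and `relative_costs` are illustrative only.

```python
def int_log(n: int, k: int) -> int:
    """Exact integer log_k(n); raises if n is not a power of k."""
    stages, value = 0, 1
    while value < n:
        value *= k
        stages += 1
    if value != n:
        raise ValueError(f"{n} is not a power of {k}")
    return stages


def relative_costs(n: int, k: int):
    """Return (crossbar/MIN switch cost, crossbar/MIN link cost) for N ports, k x k switches."""
    stages = int_log(n, k)
    crossbar_switch_cost = n ** 2                # one N x N crossbar
    crossbar_links = 2 * n                       # N input links + N output links
    min_switch_cost = k ** 2 * (n // k) * stages # k^2 per switch, N/k per stage
    min_links = n * (stages + 1)                 # N links before, between, and after stages
    return crossbar_switch_cost / min_switch_cost, crossbar_links / min_links


for k in (2, 4, 16):
    switch_ratio, link_ratio = relative_costs(4096, k)
    print(k, round(switch_ratio, 2), round(link_ratio, 4))
```

The switch ratios come out as 170.67, 170.67, and 85.33, which the slide reports truncated to 170, 170, and 85; the link ratios are 2/13, 2/7, and 1/2.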

Network Topology: Centralized Switched (Indirect) Networks
Relative switch and link costs for various values of k and N (crossbar relative to a MIN).
[Figure: two plots, relative switch cost and relative link cost.]

Network Topology: From blocking to non-blocking again

Network Topology: Centralized Switched (Indirect) Networks
The reduction in MIN switch cost comes at the price of performance.
The network has the property of being blocking.
Contention is more likely to occur on network links.
Paths from different sources to different destinations share one or more links.
[Figure: a non-blocking topology next to a blocking topology; the blocked connection is marked X.]
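To make the blocking property concrete, the sketch below traces two paths through an 8-port Omega network and reports the first stage at which they need the same switch output link. It assumes destination-bit routing after a perfect shuffle at every stage; `omega_links` and `first_conflict` are hypothetical helper names, not part of the slides.

```python
def omega_links(src: int, dst: int, bits: int):
    """Output link used after each stage when routing src -> dst in an Omega network."""
    pos, links = src, []
    for stage in range(bits):
        msb = (pos >> (bits - 1)) & 1
        pos = ((pos << 1) & ((1 << bits) - 1)) | msb         # perfect shuffle
        pos = (pos & ~1) | ((dst >> (bits - 1 - stage)) & 1)  # pick output by destination bit
        links.append(pos)
    return links


def first_conflict(pair_a, pair_b, bits: int):
    """Return the first stage where both paths need the same link, or None."""
    links_a = omega_links(*pair_a, bits)
    links_b = omega_links(*pair_b, bits)
    for stage, (a, b) in enumerate(zip(links_a, links_b)):
        if a == b:
            return stage
    return None


# Different sources and different destinations, yet the same first-stage output link.
print(first_conflict((0, 0), (4, 1), bits=3))  # 0
# These two connections do not share any link.
print(first_conflict((0, 0), (1, 1), bits=3))  # None
```

In the first case, sources 0 and 4 enter the same first-stage switch and both require its upper output, so one of the two connections is blocked even though their destinations differ.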

Network Topology: Centralized Switched (Indirect) Networks
How to reduce blocking in MINs? Provide alternative paths!
Use larger switches (can equate to using more switches)
Clos network: minimally three stages (non-blocking).
A larger switch in the middle of two other switch stages provides enough alternative paths to avoid all conflicts.
Use more switches
Add log_k N - 1 stages, mirroring the original topology.
Rearrangeably non-blocking: allows for non-conflicting paths.
Doubles network hop count (distance), d.
Centralized control can rearrange established paths.
Benes topology: 2(log_2 N) - 1 stages (rearrangeably non-blocking).
Recursively applies the three-stage Clos network concept to the middle-stage set of switches to reduce all switches to 2 × 2.
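The stage counts above can be tabulated with a small Python sketch. It assumes 2 × 2 switches and a power-of-two port count; the function names are illustrative, and the path count of N/2 per input/output pair is a standard property of the Benes network rather than something stated on the slide.

```python
def log2_exact(n_ports: int) -> int:
    """log_2 of a power-of-two port count."""
    n = n_ports.bit_length() - 1
    if n < 1 or (1 << n) != n_ports:
        raise ValueError("port count must be a power of two, at least 2")
    return n


def omega_shape(n_ports: int):
    """(stages, total 2x2 switches) of a blocking MIN such as the Omega network."""
    stages = log2_exact(n_ports)
    return stages, stages * n_ports // 2


def benes_shape(n_ports: int):
    """(stages, total 2x2 switches, alternative paths per pair) of a Benes network."""
    n = log2_exact(n_ports)
    stages = n + (n - 1)        # original log_2 N stages plus log_2 N - 1 mirrored ones
    paths = 1 << (n - 1)        # free choice in the first n - 1 stages, then forced
    return stages, stages * n_ports // 2, paths


print(omega_shape(16))  # (4, 32)
print(benes_shape(16))  # (7, 56, 8)
```

For 16 ports this gives 7 stages, matching 2(log_2 16) - 1, at the price of 56 instead of 32 switches and roughly double the hop count.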

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port crossbar network.]

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 3-stage Clos network.]
We apply unshuffle recursively to obtain the (2 log N - 1)-stage Benes network, which is rearrangeably non-blocking. Can we use butterfly instead of unshuffle to obtain another non-blocking network? (Nov. 20, 2011)

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 5-stage Clos network.]

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 7-stage Clos network = Benes topology.]

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 7-stage Clos network = Benes topology. Alternative paths from 0 to 1.]

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 7-stage Clos network = Benes topology. Alternative paths from 4 to 0.]

Network Topology: Centralized Switched (Indirect) Networks
[Figure: 16-port, 7-stage Clos network = Benes topology. Contention free, paths 0 to 1 and 4 to 1.]

Network Topology: Centralized Switched (Indirect) Networks
Bidirectional MINs
Increase modularity.
Reduce hop count, d.
Fat tree network
Nodes at tree leaves.
Switches at tree vertices.
Total link bandwidth is constant across all tree levels, with full bisection bandwidth.
Equivalent to folded Benes topology.
Preferred topology in many SANs.
[Figure: fat tree network with the bisection marked.]
Folded Clos = Folded Benes = Fat tree network

Network Topology: Myrinet-2000 Clos Network for 128 Hosts
[Figure: backplane of the M3-E128 Switch; M3-SW16-8F fiber line card (8 ports).]
http://myri.com

Network Topology: Myrinet-2000 Clos Network for 128 Hosts
"Network in a Box": 16 fiber line cards connected to the M3-E128 Switch backplane.
http://myri.com

Network Topology: Myrinet-2000 Clos Network Extended to 512 Hosts
http://myri.com

Assignment 2-2
Choose one of the following exercises:
Calculate how many permutations an n × n Omega network can support.
Prove that the folded Clos network, the folded Benes network, and the fat tree network are isomorphic to each other.
Parallel Processing, Low-Diameter Architectures SJTU@Fall 2012