ECE669 L21: Routing April 15, 2004 ECE 669 Parallel Computer Architecture Lecture 21 Routing.

Slides:



Advertisements
Similar presentations
CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con’t) March 16 th, 2011 John Kubiatowicz Electrical Engineering and Computer.
Advertisements

Compsci 221 / ECE 259 Advanced Computer Architecture II (Parallel Computer Architecture) Interconnection Networks Copyright 2012 Alvin R. Lebeck Duke.
Presentation of Designing Efficient Irregular Networks for Heterogeneous Systems-on-Chip by Christian Neeb and Norbert Wehn and Workload Driven Synthesis.
Fundamentals of Computer Networks ECE 478/578 Lecture #13: Packet Switching (2) Instructor: Loukas Lazos Dept of Electrical and Computer Engineering University.
Advanced Networking Wickus Nienaber Daniel Beech.
What is Flow Control ? Flow Control determines how a network resources, such as channel bandwidth, buffer capacity and control state are allocated to packet.
©2003 Dror Feitelson Parallel Computing Systems Part II: Networks and Routing Dror Feitelson Hebrew University.
1 Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control.
1 Lecture 16: On-Chip Networks Today: on-chip networks background.
CS 258 Parallel Computer Architecture Lecture 4 Network Topology and Routing February 4, 2008 Prof John D. Kubiatowicz
CS252 Graduate Computer Architecture Lecture 21 Multiprocessor Networks (con’t) John Kubiatowicz Electrical Engineering and Computer Sciences University.
CS 258 Parallel Computer Architecture Lecture 5 Routing February 6, 2008 Prof John D. Kubiatowicz
Communication operations Efficient Parallel Algorithms COMP308.
Predictive Load Balancing Reconfigurable Computing Group.
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control Final exam reminders:  Plan well – attempt every question.
CS252 Graduate Computer Architecture Lecture 15 Multiprocessor Networks (con’t) March 15 th, 2010 John Kubiatowicz Electrical Engineering and Computer.
Issues in System-Level Direct Networks Jason D. Bakos.
EECS 570: Fall rev1 1 Chapter 10: Scalable Interconnection Networks.
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control.
1 Lecture 25: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Review session,
ECE669 L16: Interconnection Topology March 30, 2004 ECE 669 Parallel Computer Architecture Lecture 16 Interconnection Topology.
John Kubiatowicz Electrical Engineering and Computer Sciences
Storage area network and System area network (SAN)
Switching, routing, and flow control in interconnection networks.
CS252 Graduate Computer Architecture Lecture 15 Multiprocessor Networks March 14 th, 2011 John Kubiatowicz Electrical Engineering and Computer Sciences.
CS252 Graduate Computer Architecture Lecture 15 Multiprocessor Networks March 12 th, 2012 John Kubiatowicz Electrical Engineering and Computer Sciences.
High Performance Embedded Computing © 2007 Elsevier Lecture 16: Interconnection Networks Embedded Computing Systems Mikko Lipasti, adapted from M. Schulte.
1 Lecture 23: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm Next semester:
1 The Turn Model for Adaptive Routing. 2 Summary Introduction to Direct Networks. Deadlocks in Wormhole Routing. System Model. Partially Adaptive Routing.
High-Performance Networks for Dataflow Architectures Pravin Bhat Andrew Putnam.
Distributed Routing Algorithms. In a message passing distributed system, message passing is the only means of interprocessor communication. Unicast, Multicast,
Networks-on-Chips (NoCs) Basics
Dynamic Networks CS 213, LECTURE 15 L.N. Bhuyan CS258 S99.
CS 552 Computer Networks IP forwarding Fall 2005 Rich Martin (Slides from D. Culler and N. McKeown)
© Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Blue Gene/L Torus Interconnection Network N. R. Adiga, et.al IBM Journal.
CSE Advanced Computer Architecture Week-11 April 1, 2004 engr.smu.edu/~rewini/8383.
1 Lecture 7: Interconnection Network Part I: Basic Definitions Part II: Message Passing Multicomputers.
1 Scalable Interconnection Networks. 2 Scalable, High Performance Network At Core of Parallel Computer Architecture Requirements and trade-offs at many.
Computer Architecture Distributed Memory MIMD Architectures Ola Flygt Växjö University
Deadlock CEG 4131 Computer Architecture III Miodrag Bolic.
Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 3, 2000 Topics Network design issues Network Topology.
ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Interconnection Networks Copyright 2004 Daniel J. Sorin Duke University.
1 Lecture 15: Interconnection Routing Topics: deadlock, flow control.
InterConnection Network Topologies to Minimize graph diameter: Low Diameter Regular graphs and Physical Wire Length Constrained networks Nilesh Choudhury.
Anshul Kumar, CSE IITD ECE729 : Advanced Computer Architecture Lecture 27, 28: Interconnection Mechanisms In Multiprocessors 29 th, 31 st March, 2010.
Interconnection Networks Alvin R. Lebeck CPS 220.
NC2 (No6) 1 Maximally Adaptive Routing Maximize adaptivity for a double-x routing based on turn model. Virtual network 0 Virtual network 1 Maximally adaptive.
© Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Deadlock.
Interconnect Networks Basics. Generic parallel/distributed system architecture On-chip interconnects (manycore processor) Off-chip interconnects (clusters.
Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220.
Virtual-Channel Flow Control William J. Dally
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix F)
1 Lecture 14: Interconnection Networks Topics: dimension vs. arity, deadlock.
1 Lecture 22: Interconnection Networks Topics: Routing, deadlock, flow control, virtual channels.
Advanced Computer Networks
Lecture 23: Interconnection Networks
Multiprocessor Interconnection Networks Todd C
Static and Dynamic Networks
Interconnection Network Routing, Topology Design Trade-offs
Interconnection Network Design Contd.
Cache Coherence and Interconnection Network Design
Switching, routing, and flow control in interconnection networks
Lecture 14: Interconnection Networks
Introduction to Scalable Interconnection Networks
Interconnection Networks Contd.
Lecture: Interconnection Networks
CS 258 Parallel Computer Architecture Lecture 5 Routing (Con’t)
Networks: Routing and Design
Switching, routing, and flow control in interconnection networks
Presentation transcript:

ECE669 L21: Routing April 15, 2004 ECE 669 Parallel Computer Architecture Lecture 21 Routing

ECE669 L21: Routing April 15, 2004 Outline °Routing °Switch Design °Flow Control °Case Studies

ECE669 L21: Routing April 15, 2004 Routing °Routing algorithm determines which of the possible paths are used as routes how the route is determined °Issues: Routing mechanism -arithmetic -source-based port select -table driven -general computation Properties of the routes Deadlock free

ECE669 L21: Routing April 15, 2004 Routing Mechanism °need to select output port for each input packet in a few cycles °Simple arithmetic in regular topologies ex:  x,  y routing in a grid -west (-x)  x < 0 -east (+x)  x > 0 -south (-y)  x = 0,  y < 0 -north (+y)  x = 0,  y > 0 -processor  x = 0,  y = 0 °Reduce relative address of each dimension in order Dimension-order routing in k-ary n-cubes Routing in hypercubes

ECE669 L21: Routing April 15, 2004 Routing Mechanism °Source-based message header carries series of port selects used and stripped en route CRC? Packet Format? CS-2, Myrinet, MIT Artic °Table-driven message header carried index for next port at next switch -o = R[i] table also gives index for following hop -o, I’ = R[i ] ATM, HPPI P0P0 P1P1 P2P2 P3P3

ECE669 L21: Routing April 15, 2004 Properties of Routing Algorithms °Deterministic route determined by (source, dest), not intermediate state (i.e. traffic) °Adaptive route influenced by traffic along the way °Minimal only selects shortest paths °Deadlock free no traffic pattern can lead to a situation where no packets move forward

ECE669 L21: Routing April 15, 2004 Deadlock Freedom °How can it arise? necessary conditions: -shared resource -incrementally allocated -non-preemptible think of a channel as a shared resource that is acquired incrementally -source buffer then dest. buffer -channels along a route °How do you avoid it? constrain how channel resources are allocated ex: dimension order °How do you prove that a routing algorithm is deadlock free

ECE669 L21: Routing April 15, 2004 Proof Technique °resources are logically associated with channels °messages introduce dependences between resources as they move forward °need to articulate the possible dependences that can arise between channels °show that there are no cycles in Channel Dependence Graph find a numbering of channel resources such that every legal route follows a monotonic sequence °=> no traffic pattern can lead to deadlock °network need not be acyclic, on channel dependence graph

ECE669 L21: Routing April 15, 2004 Example: k-ary 2D array °Thm: x,y routing is deadlock free °Numbering +x channel (i,y) -> (i+1,y) gets i similarly for -x with 0 as most positive edge +y channel (x,j) -> (x,j+1) gets N+j similary for -y channels °any routing sequence: x direction, turn, y direction is increasing

ECE669 L21: Routing April 15, 2004 More examples: °Consider other topologies butterfly? tree? fat tree? °Any assumptions about routing mechanism? amount of buffering? °What about wormhole routing on a ring?

ECE669 L21: Routing April 15, 2004 Deadlock free wormhole networks? °Basic dimension order routing techniques don’t work for k-ary n-cubes only for k-ary n-arrays (bi-directional) °Idea: add channels! provide multiple “virtual channels” to break the dependence cycle good for BW too! Do not need to add links, or xbar, only buffer resources

ECE669 L21: Routing April 15, 2004 Breaking deadlock with virtual channels

ECE669 L21: Routing April 15, 2004 Up*-Down* routing °Given any bidirectional network °Construct a spanning tree °Number of the nodes increasing from leaves to roots °UP increase node numbers °Any Source -> Dest by UP*-DOWN* route up edges, single turn, down edges °Performance? Some numberings and routes much better than others interacts with topology in strange ways

ECE669 L21: Routing April 15, 2004 Turn Restrictions in X,Y °XY routing forbids 4 of 8 turns and leaves no room for adaptive routing °Can you allow more turns and still be deadlock free

ECE669 L21: Routing April 15, 2004 Minimal turn restrictions in 2D north-lastnegative first -x +x +y -y

ECE669 L21: Routing April 15, 2004 Example legal west-first routes °Can route around failures or congestion °Can combine turn restrictions with virtual channels

ECE669 L21: Routing April 15, 2004 Adaptive Routing °Essential for fault tolerance at least multipath °Can improve utilization of the network °Simple deterministic algorithms easily run into bad permutations °fully/partially adaptive, minimal/non-minimal °can introduce complexity or anomolies °little adaptation goes a long way!

ECE669 L21: Routing April 15, 2004 Switch Design

ECE669 L21: Routing April 15, 2004 How do you build a crossbar

ECE669 L21: Routing April 15, 2004 Input buffered swtich °Independent routing logic per input FSM °Scheduler logic arbitrates each output priority, FIFO, random °Head-of-line blocking problem

ECE669 L21: Routing April 15, 2004 Output Buffered Switch °How would you build a shared pool?

ECE669 L21: Routing April 15, 2004 Example: IBM SP vulcan switch °Many gigabit ethernet switches use similar design without the cut-through

ECE669 L21: Routing April 15, 2004 Output scheduling °n independent arbitration problems? static priority, random, round-robin °simplifications due to routing algorithm? °general case is max bipartite matching

ECE669 L21: Routing April 15, 2004 Stacked Dimension Switches °Dimension order on 3D cube? °Cube connected cycles?

ECE669 L21: Routing April 15, 2004 Flow Control °What do you do when push comes to shove? ethernet: collision detection and retry after delay FDDI, token ring: arbitration token TCP/WAN: buffer, drop, adjust rate any solution must adjust to output rate °Link-level flow control

ECE669 L21: Routing April 15, 2004 Examples °Short Links °long links several flits on the wire

ECE669 L21: Routing April 15, 2004 Smoothing the flow °How much slack do you need to maximize bandwidth?

ECE669 L21: Routing April 15, 2004 Example: T3D °3D bidirectional torus, dimension order (NIC selected ), virtual cut-through, packet sw. °16 bit x 150 MHz, short, wide, synch. °rotating priority per output °logically separate request/response °3 independent, stacked switches °8 16-bit flits on each of 4 VC in each directions

ECE669 L21: Routing April 15, 2004 Example: SP °8-port switch, 40 MB/s per link, 16-bit flit, single 40 MHz clock °packet sw, cut-through, no virtual channel, source-based routing °variable packet <= 255 bytes, 31 byte fifo per input, 7 bytes per output °128 8-byte ‘chunks’ in central queue, LRU per output

ECE669 L21: Routing April 15, 2004 Summary °Routing Algorithms restrict the set of routes within the topology simple mechanism selects turn at each hop arithmetic, selection, lookup °Deadlock-free if channel dependence graph is acyclic limit turns to eliminate dependences add separate channel resources to break dependences combination of topology, algorithm, and switch design °Deterministic vs adaptive routing °Switch design issues input/output/pooled buffering, routing logic, selection logic °Flow control °Real networks are a ‘package’ of design choices