Interconnection Network Design Contd.

Slides:



Advertisements
Similar presentations
COS 461 Fall 1997 Routing COS 461 Fall 1997 Typical Structure.
Advertisements

What is Flow Control ? Flow Control determines how a network resources, such as channel bandwidth, buffer capacity and control state are allocated to packet.
1 Lecture 12: Interconnection Networks Topics: dimension/arity, routing, deadlock, flow control.
CS 258 Parallel Computer Architecture Lecture 5 Routing February 6, 2008 Prof John D. Kubiatowicz
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control Final exam reminders:  Plan well – attempt every question.
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Sections 8.1 – 8.5)
EECS 570: Fall rev1 1 Chapter 10: Scalable Interconnection Networks.
Interconnection Network Topology Design Trade-offs
1 Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control.
1 Lecture 25: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Review session,
ECE669 L16: Interconnection Topology March 30, 2004 ECE 669 Parallel Computer Architecture Lecture 16 Interconnection Topology.
John Kubiatowicz Electrical Engineering and Computer Sciences
Switching, routing, and flow control in interconnection networks.
Interconnect Network Topologies
1 The Turn Model for Adaptive Routing. 2 Summary Introduction to Direct Networks. Deadlocks in Wormhole Routing. System Model. Partially Adaptive Routing.
Interconnect Networks
1 Lecture 7: Interconnection Network Part I: Basic Definitions Part II: Message Passing Multicomputers.
Computer Architecture Distributed Memory MIMD Architectures Ola Flygt Växjö University
Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 3, 2000 Topics Network design issues Network Topology.
ECE669 L21: Routing April 15, 2004 ECE 669 Parallel Computer Architecture Lecture 21 Routing.
Anshul Kumar, CSE IITD CSL718 : Multiprocessors Interconnection Mechanisms Performance Models 20 th April, 2006.
1 Lecture 15: Interconnection Routing Topics: deadlock, flow control.
Anshul Kumar, CSE IITD ECE729 : Advanced Computer Architecture Lecture 27, 28: Interconnection Mechanisms In Multiprocessors 29 th, 31 st March, 2010.
Interconnection Networks Alvin R. Lebeck CPS 220.
Interconnect Networks Basics. Generic parallel/distributed system architecture On-chip interconnects (manycore processor) Off-chip interconnects (clusters.
Networks: Routing, Deadlock, Flow Control, Switch Design, Case Studies Alvin R. Lebeck CPS 220.
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix F)
1 Lecture 14: Interconnection Networks Topics: dimension vs. arity, deadlock.
1 Lecture 22: Interconnection Networks Topics: Routing, deadlock, flow control, virtual channels.
COMP8330/7330/7336 Advanced Parallel and Distributed Computing Communication Costs in Parallel Machines Dr. Xiao Qin Auburn University
Chapter 3 Part 1 Switching and Bridging
Interconnect Networks
Topics discussed in this section:
The network-on-chip protocol
EE 122: Lecture 19 (Asynchronous Transfer Mode - ATM)
Advanced Computer Networks
Lecture 23: Interconnection Networks
Multiprocessor Interconnection Networks Todd C
: An Introduction to Computer Networks
Prof John D. Kubiatowicz
Switching Techniques In large networks there might be multiple paths linking sender and receiver. Information may be switched as it travels through various.
Static and Dynamic Networks
Interconnection Network Routing, Topology Design Trade-offs
John Kubiatowicz Electrical Engineering and Computer Sciences
CONGESTION CONTROL.
What’s “Inside” a Router?
Cache Coherence and Interconnection Network Design
Introduction to Scalable Interconnection Network Design
Switching, routing, and flow control in interconnection networks
Lecture 14: Interconnection Networks
Interconnection Network Design Lecture 14
Communication operations
Introduction to Scalable Interconnection Networks
Storage area network and System area network (SAN)
Lecture: Interconnection Networks
Static Interconnection Networks
Copyright 2012 Alvin R. Lebeck
The Network Layer Network Layer Design Issues:
Interconnection Network Design
Switching Techniques.
Interconnection Networks Contd.
Embedded Computer Architecture 5SAI0 Interconnection Networks
Lecture: Networks Topics: TM wrap-up, networks.
Lecture: Interconnection Networks
Chapter 3 Part 3 Switching and Bridging
CS 6290 Many-core & Interconnect
Networks: Routing and Design
Lecture 25: Interconnection Networks
Switching, routing, and flow control in interconnection networks
Presentation transcript:

Interconnection Network Design Contd. Adapted from UC, Berkeley Notes CS258 S99

Switching Techniques Circuit Switching: A control message is sent from source to destination and a path is reserved. Communication starts. The path is released when communication is complete. Store-and-forward policy (Packet Switching): each switch waits for the full packet to arrive in switch before sending to the next switch (good for WAN) Cut-through routing or worm hole routing: switch examines the header, decides where to send the message, and then starts forwarding it immediately In worm hole routing, when head of message is blocked, message stays strung out over the network, potentially blocking other messages (needs only buffer the piece of the packet that is sent between switches). CM-5 uses it, with each switch buffer being 4 bits per port. Cut through routing lets the tail continue when head is blocked, storing the whole message into an intermmediate switch. (Requires a buffer large enough to hold the largest packet). 11/10/2018 CS258 S99

11/10/2018 CS258 S99

Store and Forward vs. Cut-Through Advantage Latency reduces from function of: number of intermediate switches X by the size of the packet to time for 1st part of the packet to negotiate the switches + the packet size ÷ interconnect BW 11/10/2018 CS258 S99

Store&Forward vs Cut-Through Routing h(n/b + D) vs n/b + h D what if message is fragmented? wormhole vs virtual cut-through 11/10/2018 CS258 S99

11/10/2018 CS258 S99

Wormhole: (# nodes x node delay) + transfer time Example Q. Compare the efficiency of store-and-forward (packet switching) vs. wormhole routing for transmission of a 20 bytes packet between a source and destination, which are d-nodes apart. Each node takes 0.25 microsecond and link transfer rate is 20 MB/sec. Answer: Time to transfer 20 bytes over a link = 20/20 MB/sec = 1 microsecond. Packet switching: # nodes x (node delay + transfer time)= d x (.25 + 1) = 1.25 d microseconds Wormhole: (# nodes x node delay) + transfer time = 0.25 d + 1 Book: For d=7, packet switching takes 8.75 microseconds vs. 2.75 microseconds for wormhole routing

Contention Two packets trying to use the same link at same time limited buffering drop? Most parallel mach. networks block in place link-level flow control tree saturation Closed system - offered load depends on delivered 11/10/2018 CS258 S99

Delay with Queuing Suppose there are L links per node. Each link sends a packet to another link at “Lamda” packets/sec. The service rate (link+switch) is “Mue” packets per second. What is the delay over a distance D? Ans: There is a queue at each output link to hold extra packets. Model each output link as an M/M/1 queue with LxLamda input rate and Mue service rate. Delay through each link = Queuing time + Service time = S Delay over a distance D = S x D 11/10/2018 CS258 S99

Congestion Control Packet switched networks do not reserve bandwidth; this leads to contention (connection based limits input) Solution: prevent packets from entering until contention is reduced (e.g., freeway on-ramp metering lights) Options: Packet discarding: If packet arrives at switch and no room in buffer, packet is discarded (e.g., UDP) Flow control: between pairs of receivers and senders; use feedback to tell sender when allowed to send next packet Back-pressure: separate wires to tell to stop Window: give original sender right to send N packets before getting permission to send more; overlaps latency of interconnection with overhead to send & receive packet (e.g., TCP), adjustable window Choke packets: aka “rate-based”; Each packet received by busy switch in warning state sent back to the source via choke packet. Source reduces traffic to that destination by a fixed % (e.g., ATM) Mother’s day: “all circuits are busy” ATM packet: trying to avoid FDDI problem Hope that ATM used for local area packets Rate based for WAN Back pressure for LAN Telelecommunications is standards based: makes decision to ensure that no one has an advantage; all have a disadvantage Europe used 32 B and US 64B => 48B payload 11/10/2018 CS258 S99 CS258 S99

Flow Control What do you do when push comes to shove? Ethernet: collision detection and retry after delay FDDI, token ring: arbitration token TCP/WAN: buffer, drop, adjust rate any solution must adjust to output rate Link-level flow control 11/10/2018 CS258 S99

Examples Short Links long links several flits on the wire 11/10/2018 CS258 S99

Routing Recall: routing algorithm determines Issues: which of the possible paths are used as routes how the route is determined R: N x N -> C, which at each switch maps the destination node nd to the next channel on the route Issues: Routing mechanism arithmetic source-based port select table driven general computation Properties of the routes Deadlock feee 11/10/2018 CS258 S99

Routing Mechanism need to select output port for each input packet in a few cycles Simple arithmetic in regular topologies ex: Dx, Dy routing in a grid west (-x) Dx < 0 east (+x) Dx > 0 south (-y) Dx = 0, Dy < 0 north (+y) Dx = 0, Dy > 0 processor Dx = 0, Dy = 0 Reduce relative address of each dimension in order Dimension-order routing in k-ary d-cubes e-cube routing in n-cube 11/10/2018 CS258 S99

Routing Mechanism (cont) P3 P2 P1 P0 Source-based message header carries series of port selects used and stripped en route CRC? Packet Format? CS-2, Myrinet, MIT Artic Table-driven message header carried index for next port at next switch o = R[i] table also gives index for following hop o, I’ = R[i ] ATM, HPPI 11/10/2018 CS258 S99

Properties of Routing Algorithms Deterministic route determined by (source, dest), not intermediate state (i.e. traffic) Adaptive route influenced by traffic along the way Minimal only selects shortest paths Deadlock free no traffic pattern can lead to a situation where no packets mover forward 11/10/2018 CS258 S99

Deadlock Freedom How can it arise? How do you avoid it? necessary conditions: shared resource incrementally allocated non-preemptible think of a channel as a shared resource that is acquired incrementally source buffer then dest. buffer channels along a route How do you avoid it? constrain how channel resources are allocated ex: dimension order How do you prove that a routing algorithm is deadlock free 11/10/2018 CS258 S99

Proof Technique resources are logically associated with channels messages introduce dependencies between resources as they move forward need to articulate the possible dependences that can arise between channels show that there are no cycles in Channel Dependence Graph find a numbering of channel resources such that every legal route follows a monotonic sequence => no traffic pattern can lead to deadlock network need not be acyclic, on channel dependence graph 11/10/2018 CS258 S99

Example: k-ary 2D array Theorem: x,y routing is deadlock free Numbering +x channel (i,y) -> (i+1,y) gets i similarly for -x with 0 as most positive edge +y channel (x,j) -> (x,j+1) gets N+j similarly for -y channels any routing sequence: x direction, turn, y direction is increasing 11/10/2018 CS258 S99

Deadlock free wormhole networks? Basic dimension order routing techniques don’t work for k-ary d-cubes only for k-ary d-arrays (bi-directional) Idea: add channels! provide multiple “virtual channels” to break the dependence cycle good for BW too! Do not need to add links, or xbar, only buffer resources This adds nodes the the CDG, remove edges? 11/10/2018 CS258 S99

Breaking deadlock with virtual channels 11/10/2018 CS258 S99

Adaptive Routing R: C x N x S -> C Essential for fault tolerance at least multipath Can improve utilization of the network Simple deterministic algorithms easily run into bad permutations fully/partially adaptive, minimal/non-minimal can introduce complexity or anomolies little adaptation goes a long way! 11/10/2018 CS258 S99

Interconnection Topologies Class networks scaling with N Logical Properties: distance, degree Physcial properties length, width Fully connected network diameter = 1 degree = N cost? bus => O(N), but BW is O(1) - actually worse crossbar => O(N2) for BW O(N) VLSI technology determines switch degree 11/10/2018 CS258 S99

11/10/2018 CS258 S99

11/10/2018 CS258 S99

Linear Arrays and Rings Diameter? Average Distance? Bisection bandwidth? Route A -> B given by relative address R = B-A Torus? Examples: FDDI, SCI, FiberChannel Arbitrated Loop, KSR1 11/10/2018 CS258 S99

Multidimensional Meshes and Tori 3D Cube 2D Grid d-dimensional array n = kd-1 X ...X kO nodes described by d-vector of coordinates (id-1, ..., iO) d-dimensional k-ary mesh: N = kd k = dÖN described by d-vector of radix k coordinate d-dimensional k-ary torus (or k-ary d-cube)? 11/10/2018 CS258 S99

Real Machines Wide links, smaller routing delay Tremendous variation 11/10/2018 CS258 S99