1
On-Chip Networks and Testing
On-Chip Networks
2
Basic Network Architectures
Shared medium: buses
Direct networks (more follows)
Indirect interconnection networks: a network of switches connecting all the nodes, e.g. Omega networks (not discussed further here)
Hybrid: multiple and hierarchical buses, FPGAs (e.g. Virtex II)
3
Buses
All communication devices time-share a common transmission medium.
Broadcast is possible because every bus device can listen to all communication.
Bus arbitration is needed to resolve simultaneous requests for bus use. It can be centralized or distributed.
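As an illustration of centralized arbitration, here is a minimal sketch of a round-robin arbiter. The round-robin policy, the class name `RoundRobinArbiter`, and the `grant` method are assumptions made for this example; the slide does not say which arbitration policy is used.

```python
class RoundRobinArbiter:
    """Hypothetical centralized bus arbiter with a rotating priority."""

    def __init__(self, n_masters):
        self.n = n_masters
        # Start so that master 0 has the highest priority on the first grant.
        self.last = n_masters - 1

    def grant(self, requests):
        """Return the id of the master granted the bus, or None if idle.

        `requests` is the set of master ids asking for the bus this cycle.
        The search starts just after the most recently granted master, so a
        master that keeps requesting cannot starve the others.
        """
        for offset in range(1, self.n + 1):
            candidate = (self.last + offset) % self.n
            if candidate in requests:
                self.last = candidate
                return candidate
        return None


arbiter = RoundRobinArbiter(4)
print(arbiter.grant({1, 3}))  # 1
print(arbiter.grant({1, 3}))  # 3
print(arbiter.grant({1, 3}))  # 1
```

With masters 1 and 3 requesting in every cycle, the grant alternates between them, which is the fairness property a rotating priority is meant to give.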
4
Multiple Buses
5
Hierarchical Bus
6
Bus Performance Issues
Arbitration overhead must be minimized.
Response time of slow slaves: the solution is a split-transaction protocol.
  Master releases the bus after issuing its request.
  Slave must regain the bus to respond.
To improve bus utilization and reduce the latency caused by simultaneous requests, the bus can partition large transfers into smaller packets.
7
On-Chip Bus Performance Issues
Scalability
Speed: every new device adds to the load and thus slows the bus. The problem shows up beyond about 10 bus masters. Latency also grows with the number of devices.
Energy efficiency: every data transfer is broadcast to all devices and hence consumes more energy as the number of devices grows.
8
Direct Interconnection Networks
Also called point-to-point networks.
They overcome the scalability problem of buses.
Each node connects directly to a limited number of neighboring nodes.
Each node includes a network interface, called a router, that handles communication and connects directly to the routers of neighboring nodes.
Total bandwidth increases with the number of nodes.
9
A Generic Direct Network
Communication latency is a critical performance parameter. It is the sum of:
  Start-up latency
  Network latency
  Blocking time
Figures: a generic system with a direct interconnection network; generic node architecture.
10
Some Direct Network Topologies
(From: Ni & McKinley, IEEE Computer, Feb 1993)
(1) Hypercube (2) Torus (3) Mesh
Many other topologies and variants appear in the literature. They are evaluated on the following parameters:
  Node degree (number of edges per node)
  Diameter (greatest shortest-path distance between any two nodes)
  Bisection width (number of links cut when the network is sliced into two equal halves)
  Latency (time to reach other nodes)
  Bandwidth (data rate)
  Symmetry (the network looks the same from every node)
  Homogeneity (all nodes are the same)
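The first three parameters can be checked on small instances. The sketch below builds a k x k mesh, a k x k torus, and an n-dimensional hypercube as adjacency maps, then measures the maximum node degree, the diameter (by breadth-first search), and the number of links crossing a cut through the middle. The helper names (`mesh`, `hypercube`, `metrics`) are made up for this example. The middle cut is assumed to be a minimum bisection, which matches the well-known values for these topologies when k is even.

```python
from collections import deque
from itertools import product


def mesh(k, wrap=False):
    """k x k mesh (wrap=False) or torus (wrap=True) as {node: set(neighbors)}."""
    adj = {(x, y): set() for x, y in product(range(k), repeat=2)}
    for x, y in adj:
        for dx, dy in ((1, 0), (0, 1)):
            nx, ny = x + dx, y + dy
            if wrap:
                nx, ny = nx % k, ny % k
            if nx < k and ny < k and (nx, ny) != (x, y):
                adj[(x, y)].add((nx, ny))
                adj[(nx, ny)].add((x, y))
    return adj


def hypercube(n):
    """n-dimensional hypercube: nodes are integers, neighbors differ in one bit."""
    return {v: {v ^ (1 << b) for b in range(n)} for v in range(2 ** n)}


def eccentricity(adj, src):
    """Greatest shortest-path distance from src, found by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def metrics(adj, in_first_half):
    """Return (max node degree, diameter, links crossing the given cut)."""
    degree = max(len(nbrs) for nbrs in adj.values())
    diameter = max(eccentricity(adj, s) for s in adj)
    # Each undirected link is seen from both ends, hence the division by 2.
    cut = sum(1 for u in adj for v in adj[u]
              if in_first_half(u) != in_first_half(v)) // 2
    return degree, diameter, cut


print(metrics(mesh(4), lambda n: n[0] < 2))             # (4, 6, 4)
print(metrics(mesh(4, wrap=True), lambda n: n[0] < 2))  # (4, 4, 8)
print(metrics(hypercube(4), lambda v: v < 8))           # (4, 4, 8)
```

For 16 nodes, all three topologies have a maximum degree of 4, but the torus and the hypercube have a smaller diameter than the mesh and twice as many links across the middle cut.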
11
Routing in Direct Networks
Many ways to classify:
Source routing: the source node selects the path to the destination, and the path stays fixed. Each packet carries the complete path information.
Distributed routing: each router determines whether the destination is local or non-local, and uses a routing algorithm if it is non-local. The routing algorithm must be fast and easy to implement.
Deterministic vs. adaptive:
  Deterministic: the path is determined by S and D.
  Adaptive: the path can change dynamically.
A routing algorithm is minimal if it selects a shortest path in the network.
Non-minimal routing must avoid deadlock.
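A common example that is distributed, deterministic, and minimal at once is dimension-order (XY) routing on a 2D mesh: each router first corrects the X coordinate, then the Y coordinate. The slides do not name this algorithm. It is used here only as an illustrative sketch, and the function names `next_hop` and `route` are hypothetical.

```python
def next_hop(cur, dst):
    """Routing decision made locally at router `cur` for destination `dst`.

    Returns the neighboring node to forward to, or None if the packet
    has arrived and should be delivered locally.
    """
    (cx, cy), (dx, dy) = cur, dst
    if cx != dx:                      # correct the X dimension first
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:                      # then correct the Y dimension
        return (cx, cy + (1 if dy > cy else -1))
    return None                       # local delivery


def route(src, dst):
    """Full path from src to dst obtained by repeating the per-hop decision."""
    path = [src]
    while (nxt := next_hop(path[-1], dst)) is not None:
        path.append(nxt)
    return path


print(route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The path depends only on the source and destination (deterministic), each decision uses only local information (distributed), and the hop count equals the Manhattan distance (minimal).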
12
Flow Control in Direct Networks
The network consists of many channels and buffers.
Flow control determines how channels and buffers are allocated to a packet as it travels through the network.
If a resource (channel or buffer) required by a packet is held by another packet, flow control determines whether the packet is dropped, blocked in place, buffered, or rerouted.
A good flow control algorithm aims to avoid network congestion.
13
Switching in Direct Networks
Switching: the mechanism that removes a packet from an input channel and places it on an output channel. Four general techniques:
Store-and-forward (packet switching): the entire packet is stored in a packet buffer at each intermediate node, then forwarded to a selected neighbor.
Circuit switching: a complete circuit from S to D is built and torn down for each packet transfer.
Virtual cut-through: the packet is buffered at an intermediate node only if the next required channel is busy; otherwise it is forwarded directly without buffering.
Wormhole routing: similar to cut-through, but the packet is broken into flits (flow control digits) that are normally transmitted bit-parallel between routers. The header flit governs the route. As it advances along the route, the remaining flits follow in a pipelined fashion. The header flit may be blocked at an intermediate node; the trailing flits then remain in flit buffers along the route. Once a channel is acquired by a packet, it is reserved for that packet and released when the tail flit has been transmitted.
14
Comparison of Switching Techniques
(From: Ni and McKinley, Computer, Feb 1993)
(1) Store-and-forward (2) Circuit switching (3) Wormhole
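The difference between the techniques can be illustrated with a back-of-envelope, contention-free latency model. The model and its parameter names (`packet_bits`, `flit_bits`, `probe_bits`, `bits_per_cycle`, `hops`) are assumptions made for this sketch, not values or formulas taken from the slides or the cited figure. It ignores start-up latency, router delay, blocking, and the circuit acknowledgement.

```python
def store_and_forward(packet_bits, bits_per_cycle, hops):
    """The whole packet is received before being sent over the next channel."""
    return hops * packet_bits / bits_per_cycle


def wormhole(packet_bits, flit_bits, bits_per_cycle, hops):
    """The header flit crosses every channel; the rest follows in a pipeline."""
    header = hops * flit_bits / bits_per_cycle
    body = (packet_bits - flit_bits) / bits_per_cycle
    return header + body


def circuit_switching(packet_bits, probe_bits, bits_per_cycle, hops):
    """A probe sets up the circuit hop by hop, then the data streams through."""
    return hops * probe_bits / bits_per_cycle + packet_bits / bits_per_cycle


# Hypothetical numbers: 1024-bit packet, 32-bit flits and probe,
# 32 bits per cycle per channel, 8 channels between source and destination.
print(store_and_forward(1024, 32, 8))      # 256.0 cycles
print(wormhole(1024, 32, 32, 8))           # 39.0 cycles
print(circuit_switching(1024, 32, 32, 8))  # 40.0 cycles
```

In this model store-and-forward latency grows with the product of packet length and distance, while wormhole and circuit switching latency grows with their sum, so distance matters much less once packets are long.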
15
An Example Deadlock in Wormhole Routing
16
Adaptive Double Y-channel Routing for 2D Mesh
(From: Ni and McKinley, Computer, Feb 1993)
17
Why Direct Networks (NoC) for On-Chip Networks?
Dally and Towles (DAC 2001) identify the following benefits of replacing global wiring with direct networks:
Structure: global wires are structured so as to optimize and control their electrical properties; cross-talk is minimized and becomes predictable.
Performance: aggressive signaling techniques become possible, reducing power and increasing speed. Sharing wiring among many flows makes wire use more efficient (typical activity is 10%).
Modularity: direct networks provide a standard interface for plug-and-play designs. A standard interface also facilitates reusability and interoperability of modules (nodes).
18
An Example Design (From Dally and Towles)
12 mm x 12 mm chip
0.1 micron technology (0.5 micron wire pitch)
16 tiles of 3 mm x 3 mm (CPUs, DSPs, I/O controllers, memory subsystems, etc.)
All communication goes through the network logic
2D folded torus topology
Network logic occupies a small amount of area between tiles (6.6%) and consumes the top two metal layers
It provides a reliable datagram interface to each tile.
19
NoC vs. Inter-chip Networks
Wires and pins are more abundant in NoCs.
Buffer space is less abundant.
Based on the above, Dally and Towles identify three future research areas:
  What topologies are best matched to abundant wiring resources?
  What flow control methods reduce buffer count and overhead?
  What circuits best exploit structured wiring?
20
Topology
In the example design each tile side can accommodate 6000 wires, hence it is possible to achieve about 24,000 pins (4 x 6000) crossing the four edges.
Compare with 1000 pins for inter-chip routers, which are pin-limited.
The design uses 300-bit flits (compared to 8- or 16-bit flits for inter-chip routers).
The folded torus has twice the bisection width of a mesh.
21
Flow Control
The example design uses 10K bits of buffer space in each input controller, and thus does not particularly economize on buffer space.
Large buffering was dictated by the requirement of not dropping packets on collision, for performance reasons.
22
Circuits Used to Exploit Structured Wiring
Pulsed low-swing drivers and receivers:
  Low power
  Reduced latency
  Increased repeater spacing
Circuits can be used to send multiple bits per clock period on one wire. For the example design one could send 2 to 20 bits per clock, depending on the clock rate (2 GHz down to 200 MHz).
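The 2 to 20 range follows from dividing a fixed per-wire signaling rate by the clock rate. The 4 Gb/s figure below is inferred from the two endpoints on the slide (2 bits at 2 GHz, 20 bits at 200 MHz) and is not stated on the slide itself; the function name is hypothetical.

```python
WIRE_RATE_BPS = 4_000_000_000  # assumed per-wire signaling rate, inferred from the slide


def bits_per_clock(clock_hz, wire_rate_bps=WIRE_RATE_BPS):
    """Number of bits one wire can carry during a single clock period."""
    return wire_rate_bps // clock_hz


print(bits_per_clock(2_000_000_000))  # 2
print(bits_per_clock(200_000_000))    # 20
```

A slower core clock therefore leaves more signaling slots per cycle on the same wire, which is why the low end of the clock range gives the larger bit count.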