George Michelogiannakis, William J. Dally. Stanford University. Router Designs for Elastic-Buffer On-Chip Networks.


1 George Michelogiannakis, William J. Dally. Stanford University. Router Designs for Elastic-Buffer On-Chip Networks

2 Introduction. EB flow control was recently proposed; it uses the channels as distributed FIFOs. EB routers are bufferless packet-switched routers. They have the benefits of circuit-switched routers without the overhead of setting up and tearing down circuits. This work explores the EB router design space by evaluating three representative designs. (SC09: Routers for EB NoCs)

3 The EB Flow-Control Idea. [Figure labels: master-slave FF, elastic buffer; pipelined channel, channel as FIFO]

4 How Elastic Buffer Channels Work. Ready/valid handshake between elastic buffers. Ready: at least one free storage slot. Valid: non-empty (driving valid data). [Figure: timing diagram over cycles 1 to 6]
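The ready/valid handshake on slide 4 can be illustrated with a small cycle-level model. This is only a sketch under stated assumptions: `ElasticBuffer` and `step` are hypothetical names, each buffer is modelled as a two-slot FIFO, and all handshake signals are sampled at the start of the cycle. It is not the authors' circuit or simulator.

```python
from collections import deque


class ElasticBuffer:
    """Toy elastic buffer: a small FIFO exposing ready/valid (hypothetical model)."""

    def __init__(self, slots=2):
        self.slots = slots
        self.q = deque()

    @property
    def ready(self):
        # Ready: at least one free storage slot.
        return len(self.q) < self.slots

    @property
    def valid(self):
        # Valid: non-empty, i.e. driving valid data.
        return len(self.q) > 0


def step(chain, source, sink_ready):
    """Advance the channel by one clock cycle.

    chain: list of ElasticBuffer, upstream first.
    source: deque of flits waiting to enter the channel.
    sink_ready: whether the receiver accepts a flit this cycle.
    Returns the flit delivered to the sink this cycle, or None.
    """
    # Sample every handshake signal before any state changes.
    ready = [eb.ready for eb in chain]
    valid = [eb.valid for eb in chain]

    delivered = None
    if valid[-1] and sink_ready:
        delivered = chain[-1].q.popleft()

    # A flit advances from stage i to stage i+1 when valid[i] and ready[i+1].
    for i in range(len(chain) - 2, -1, -1):
        if valid[i] and ready[i + 1]:
            chain[i + 1].q.append(chain[i].q.popleft())

    if source and ready[0]:
        chain[0].q.append(source.popleft())
    return delivered
```

When the sink stops accepting flits, the stages fill up one after another and the source is eventually stalled, which is the "channel as FIFO" behaviour: in this model a chain of three two-slot buffers holds at most six flits.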

5 Use EB Flow-Control Through the Router. [Figure: VC input-buffered router and EB router] The input buffer is replaced by an input EB. The VC and switch allocators are removed, with per-output arbiters used instead. A three-slot output EB covers for arbitration being done one cycle in advance. Look-ahead (LA) routing is also applicable to EB networks.
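To make "per-output arbiters" on slide 5 concrete, here is a minimal sketch with one arbiter per output port. The slides do not state the arbitration policy, so round-robin is an assumption, and `RoundRobinArbiter` and `arbitrate` are hypothetical names. The sketch grants per request and ignores the fact that a real EB router would hold a grant for the whole packet.

```python
class RoundRobinArbiter:
    """One arbiter per output port (round-robin policy is an assumption)."""

    def __init__(self, num_inputs):
        self.n = num_inputs
        self.last = num_inputs - 1  # so that input 0 has priority first

    def grant(self, requests):
        """requests[i] is True if input i wants this output. Returns winner or None."""
        for offset in range(1, self.n + 1):
            i = (self.last + offset) % self.n
            if requests[i]:
                self.last = i
                return i
        return None


def arbitrate(arbiters, desired_output):
    """desired_output[i] is the output port input i requests, or None.

    Each output arbitrates independently among the inputs requesting it.
    Returns a dict mapping output port -> granted input.
    """
    grants = {}
    for out, arb in enumerate(arbiters):
        winner = arb.grant([d == out for d in desired_output])
        if winner is not None:
            grants[out] = winner
    return grants
```

Because every output decides on its own, no separate VC or switch allocator is needed in this model: an input requests exactly one output and simply waits until that output's arbiter grants it.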

6 Baseline Router - Issues. Issues constraining the clock cycle time: the three-slot EB FSM is too complicated, so the output EB is implemented as a FIFO; routing is performed serially with switch arbitration. [Figure callouts: FIFO, Serially]

7 Enhanced Two-Stage Router. Look-ahead routing shortens the critical path. Two-slot EBs are used at the output and for pipelining. Flits are stored in the intermediate EB and wait for a grant. The decision to traverse the switch is made in the same cycle.
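Look-ahead routing, mentioned on slide 7, computes at the current router the output port the packet will need at the next router, so routing is off the critical path when the flit arrives there. The sketch below assumes dimension-order (XY) routing on a 2D mesh; the slides name the mesh but not the routing function, and `xy_output_port`, `lookahead_route`, and `trace` are hypothetical helpers.

```python
# Coordinate step taken when leaving through each port (assumed convention).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def xy_output_port(cur, dst):
    """Dimension-order routing: resolve X first, then Y (an assumption)."""
    (cx, cy), (dx, dy) = cur, dst
    if dx > cx:
        return "E"
    if dx < cx:
        return "W"
    if dy > cy:
        return "N"
    if dy < cy:
        return "S"
    return "LOCAL"


def lookahead_route(cur, out_port, dst):
    """Given the port already chosen for the current router, compute the
    output port the packet will request at the next router."""
    if out_port == "LOCAL":
        return None
    sx, sy = STEP[out_port]
    return xy_output_port((cur[0] + sx, cur[1] + sy), dst)


def trace(src, dst):
    """List of output ports used hop by hop, each computed one hop ahead."""
    cur, port = src, xy_output_port(src, dst)  # first port computed at injection
    ports = [port]
    while port != "LOCAL":
        nxt = lookahead_route(cur, port, dst)
        sx, sy = STEP[port]
        cur, port = (cur[0] + sx, cur[1] + sy), nxt
        ports.append(port)
    return ports
```

For example, `trace((0, 0), (2, 1))` walks east twice, north once, and then ejects at the destination.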

8 Enhanced Two-Stage Router - Sync Module. The synchronization module maintains alignment between flits and grants. It contains an output port EB, which stores the chosen output port of the current packet and of any other packets in router stage 1 and the intermediate EB.

9 Enhanced Two-Stage Router - Sync Module. When the current packet's tail flit is departing, the sync module propagates the next output to the arbiters, from the appropriate location, and propagates an update to all outputs. An output that receives an update from the input it is granting clocks its arbiter output registers at the next clock edge.
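The bookkeeping role of the synchronization module on slides 8 and 9 can be pictured as a queue of pending output ports, one entry per packet currently inside the router. This is a toy model only: `SyncModule` and its methods are hypothetical names, and the sketch ignores cycle timing, the "appropriate location" selection, and the clocking of the arbiter output registers.

```python
from collections import deque


class SyncModule:
    """Toy model: tracks the output port of each packet inside the router."""

    def __init__(self):
        # Output ports of in-flight packets, oldest (current) packet first.
        self.pending = deque()

    def head_arrived(self, output_port):
        """A new packet's head flit entered the router with its chosen output."""
        self.pending.append(output_port)

    @property
    def current_output(self):
        """Output port currently presented to the arbiters, or None if idle."""
        return self.pending[0] if self.pending else None

    def tail_departed(self):
        """The current packet's tail flit left: expose the next packet's output."""
        self.pending.popleft()
        return self.current_output
```

In this picture, requests seen by the output arbiters always belong to the oldest packet, and they switch to the next packet's output only when the tail flit leaves, which is what keeps flits and grants aligned.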

10 Single-Stage Router. Merges the two router stages to reduce router latency and avoid pipelining overhead.

11 Evaluation Methodology. 45nm worst-case low-power commercial library. Synopsys DC and Cadence Encounter. 64-bit router datapath. 70% initial area utilization ratio. A cycle-accurate network simulator was used. We assume either each router at its maximum post-P&R frequency or all routers at the same frequency. 8x8 2D mesh. 2mm-long wires. 1 cycle latency. Constant packet size of 512 bits. Results are averaged over a set of six traffic patterns. Datapath width was swept from 28 to 171 bits.
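Since slide 11 fixes the packet size at 512 bits while the datapath width varies, the number of flits per packet changes with width. A small worked calculation follows, assuming one flit per datapath width, no extra header bits, and a padded last flit; `flits_per_packet` is a hypothetical helper, not part of the authors' simulator.

```python
import math

PACKET_BITS = 512  # constant packet size stated on the slide


def flits_per_packet(datapath_bits):
    """Flits needed to carry one packet, assuming the last flit is padded."""
    return math.ceil(PACKET_BITS / datapath_bits)


# Widths taken from the slide: the default 64-bit datapath and the sweep endpoints.
for width in (28, 64, 171):
    print(width, flits_per_packet(width))
```

Under these assumptions a packet takes 19 flits at 28 bits, 8 flits at 64 bits, and 3 flits at 171 bits, so wider datapaths cut serialization latency at the cost of area and energy.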

12 Placement and Routing: Cycle Time. The enhanced two-stage router has a 26% shorter cycle time than the single-stage router and a 42% shorter cycle time than the baseline two-stage router.
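The two percentages on slide 12 also imply how the single-stage and baseline routers compare with each other. The arithmetic below assumes each percentage is a reduction relative to the other design's cycle time; the variable names are illustrative only.

```python
# Enhanced two-stage cycle time as a fraction of each other design's cycle time.
enh_over_single = 1 - 0.26    # 26% shorter than single-stage
enh_over_baseline = 1 - 0.42  # 42% shorter than baseline two-stage

# Single-stage cycle time as a fraction of the baseline's.
single_over_baseline = enh_over_baseline / enh_over_single

# Maximum clock frequency of the enhanced router relative to the others.
freq_gain_vs_single = 1 / enh_over_single
freq_gain_vs_baseline = 1 / enh_over_baseline

print(round(single_over_baseline, 3))   # 0.784
print(round(freq_gain_vs_single, 3))    # 1.351
print(round(freq_gain_vs_baseline, 3))  # 1.724
```

Under this reading, the single-stage router's cycle time is roughly 22% shorter than the baseline's, and the enhanced two-stage router can be clocked about 1.35 times faster than the single-stage router and about 1.72 times faster than the baseline.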

13 Placement and Routing: Energy per Bit. The baseline two-stage router requires 9% less energy per bit than the single-stage router and 35% less than the enhanced two-stage router.

14 Placement and Routing: Area. The single-stage router occupies 30% less area than the enhanced two-stage router and 44% less than the baseline two-stage router.

15 Latency-Throughput, Max Frequencies. Latency increase: Enhanced: +1%. Baseline: +46%.

16 Latency-Throughput, Equal Frequencies. Latency increase: Enhanced: +34%. Baseline: +32%.

17 Which Router is the Optimal Choice?
Operating at maximum frequencies:
  Priority   Router choice
  Area       Enhanced two-stage
  Energy     Baseline two-stage (closely followed by single-stage)
  Latency    Single-stage (depends on effect on channels)
Operating at the same frequency:
  Priority   Router choice
  Area       Single-stage
  Energy     Baseline two-stage (closely followed by single-stage)
  Latency    Single-stage

18 Conclusion. Improved EB router designs can widen the gap compared to VC networks, which makes EB look even more attractive. EB routers are simple designs, and simple designs have numerous advantages. A lot of the complexity of VC networks is ignored by some area and power models. Overall, compared to VC: a 43% reduction in power per unit throughput, a 67% reduction in cycle time, and 22% higher throughput per unit area.

19 Questions?

