José Vicente Escamilla, José Flich, Pedro Javier García


1 José Vicente Escamilla, José Flich, Pedro Javier García

2
• Introduction / Motivation
• ICARO overview
• ICARO description
  ◦ Detection
  ◦ Notification
  ◦ Isolation
• Results
• Conclusions
• Questions

3
• CMPs and MPSoCs use a network to interconnect nodes
• Network performance degrades due to:
  ◦ Power-saving mechanisms (DVFS)
  ◦ Bursty traffic patterns
  ◦ Heterogeneous system designs
• Performance degradation may lead to congestion
[Figure labels: CMP, MPSoC; Tile-Gx (72 cores)]

4
• ICARO does not remove congestion; it isolates it.
• Two types of traffic:
  ◦ Congested
  ◦ Non-congested
• Goal: isolate congested traffic from non-congested traffic in order to avoid head-of-line (HoL) blocking.

5
• RCA, P. Gratz et al.
  ◦ Redirects traffic at each router based on congestion metrics.
  ◦ Metrics are piggybacked.
  ◦ Drawback: vicious cycles may be created.
• "Prediction-based Flow Control for Network-on-Chip Traffic", U. Ogras et al.
  ◦ Injection control based on prediction models.
  ◦ The prediction model uses link status sent through a dedicated network.
  ◦ Drawback: injection throttling may produce performance oscillations.
• AVADA/FVADA, Yi Xu et al.
  ◦ Maps different flows to different queues based on the output port requested at the next router (lookahead routing).
  ◦ Drawback: requires lookahead routing and credit-based flow control.
  ◦ Drawback: congested and non-congested flows may share queues, causing some degree of HoL blocking, since the mapping policy only considers one hop of the message path.

6 [Figure; labels: Credits=2, Credits=0]

7
• ICARO uses two types of virtual networks (VNs)
  ◦ Regular VN: non-congested traffic
  ◦ Extra VN: congested traffic
• Three stages:
  ◦ Detection: congestion is detected at routers.
  ◦ Notification: routers notify all network interfaces (NIs).
  ◦ Isolation: NIs isolate congested traffic from non-congested traffic.

8 [Figure: switches SW0 to SW15, each with a network interface (NI 0 to NI 15); legend: Regular VN queue, Extra VN queue]

9
• Performed at routers
• Detects congestion points (CPs), i.e. {router, port} pairs
• When a message arrives or leaves:
  ◦ Buffer saturation check
    ▪ If buffer.level > HIGH_THR, the buffer is marked as saturated.
    ▪ If buffer.level < LOW_THR, the buffer is marked as not saturated (hysteresis).
    ▪ If any buffer of an input port is marked as saturated, the whole input port is marked as saturated as well.
  ◦ Congestion check
    ▪ Requests from saturated input ports to each output port are counted.
    ▪ Each output port requested by more than one saturated input port is marked as congested.
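
The detection rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the threshold values, the `Buffer` class, and the `congested_outputs` helper are assumptions made for the example, not part of the original design.

```python
from dataclasses import dataclass

# Assumed threshold values (in flits); the slides do not give concrete numbers.
HIGH_THR = 6
LOW_THR = 2


@dataclass
class Buffer:
    level: int = 0           # current occupancy
    saturated: bool = False  # sticky flag, updated with hysteresis

    def update(self) -> None:
        """Re-evaluate the saturation flag when a message arrives or leaves."""
        if self.level > HIGH_THR:
            self.saturated = True
        elif self.level < LOW_THR:
            self.saturated = False
        # Between LOW_THR and HIGH_THR the previous state is kept (hysteresis).


def port_saturated(buffers) -> bool:
    """An input port is saturated if any of its buffers is saturated."""
    return any(b.saturated for b in buffers)


def congested_outputs(input_ports, requests):
    """Return the output ports requested by more than one saturated input port.

    input_ports: dict mapping input port name -> list of Buffer
    requests:    dict mapping input port name -> set of requested output ports
    """
    counts = {}
    for in_port, buffers in input_ports.items():
        if port_saturated(buffers):
            for out_port in requests.get(in_port, ()):
                counts[out_port] = counts.get(out_port, 0) + 1
    return {out_port for out_port, n in counts.items() if n > 1}
```

With two saturated input ports both requesting output `E`, `congested_outputs` reports `E`, which together with the router identifier would form one congestion point.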

10
• Segmented ring connecting routers and NIs
• Network width (wires): an expression in log2(N), p and 1, where N = number of nodes and p = router radix
• Process:
  ◦ Notifications are injected into a register when it is free.
  ◦ Notifications move from one register to the next at each cycle.
  ◦ Notifications are discarded when they reach their origin register.
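
The ring behaviour described above can be sketched as a small cycle-level simulation. The sketch assumes one register per node and represents a notification as an `(origin, payload)` tuple; the `step` and `simulate` functions and the `seen` bookkeeping are hypothetical names introduced only for illustration.

```python
def step(ring, pending, seen):
    """Advance the notification ring by one cycle.

    ring:    list with one slot per register; None or (origin, payload)
    pending: dict mapping node -> list of payloads waiting to be injected
    seen:    dict mapping node -> list of payloads observed at that node
    """
    n = len(ring)
    new_ring = [None] * n
    # Every notification moves to the next register.
    for i, slot in enumerate(ring):
        if slot is None:
            continue
        origin, payload = slot
        nxt = (i + 1) % n
        if nxt == origin:
            continue  # completed a full lap: discard at the origin register
        new_ring[nxt] = slot
        seen[nxt].append(payload)
    # A node injects only when its register is free after the shift.
    for node, queue in pending.items():
        if queue and new_ring[node] is None:
            new_ring[node] = (node, queue.pop(0))
    return new_ring


def simulate(n, pending, cycles):
    """Run the ring for a number of cycles and return (ring, seen)."""
    ring = [None] * n
    seen = {i: [] for i in range(n)}
    for _ in range(cycles):
        ring = step(ring, pending, seen)
    return ring, seen
```

In this model a notification injected on a ring of n registers visits the other n - 1 registers and is dropped one cycle later, so it occupies the ring for n cycles after injection.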

11 [Figure: notification ring through SW0 to SW15 with registers carrying notifications; detail of SW 7 and NI 7 showing CNN in, CNN out, notification injection and notification reception (in1, in2, out, Reg)]


13
• Notifications are stored in a cache memory.
• Useless notifications are discarded:
  ◦ Unreachable CPs
  ◦ Redundant notifications (merge)

SW   Port
5    E
10   S

14 [Figure: SW0 to SW15 with XY routing. CP cache at NI 0: {SW10, Port S}, second entry empty. CP cache at NI 4: {SW5, Port E}, {SW10, Port S}]

15 [Figure: SW0 to SW15 with XY routing. CP cache at NI 4: {SW5, Port E}, {SW10, Port S}. Labels: "{SW10, Port S} notification is IGNORED"; "{SW5, Port E} and {SW10, Port S} notifications are MERGED"]
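
The filtering of notifications at an NI can be illustrated with a short Python sketch. It rests on several assumptions that the slides do not state explicitly: a 4x4 mesh numbered row by row (SW0 to SW3 in the top row), port `E` increasing the column and port `S` increasing the row, and "redundant" meaning that every route crossing the new CP already crosses a stored CP. The function names are hypothetical.

```python
WIDTH = 4  # assumed 4x4 mesh, routers numbered row by row


def xy_route(src, dst, width=WIDTH):
    """Return the (router, output port) pairs crossed from src to dst under XY routing."""
    hops = []
    row, col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    while col != dst_col:  # X dimension first
        port = "E" if dst_col > col else "W"
        hops.append((row * width + col, port))
        col += 1 if port == "E" else -1
    while row != dst_row:  # then Y dimension
        port = "S" if dst_row > row else "N"
        hops.append((row * width + col, port))
        row += 1 if port == "S" else -1
    return hops


def routes_crossing(src, cp, width=WIDTH):
    """All routes from src (one per destination) that cross the given CP."""
    routes = (xy_route(src, dst, width) for dst in range(width * width))
    return [r for r in routes if cp in r]


def reachable(src, cp):
    """A CP is reachable from src if at least one route crosses it."""
    return bool(routes_crossing(src, cp))


def redundant(src, cp, stored):
    """A CP is redundant if every route crossing it also crosses a stored CP."""
    crossing = routes_crossing(src, cp)
    return bool(crossing) and all(any(s in r for s in stored) for r in crossing)


def store(src, cache, cp):
    """Insert a CP into the NI cache, dropping useless notifications."""
    if not reachable(src, cp) or cp in cache or redundant(src, cp, cache):
        return
    # Remove entries that the new CP makes redundant, then keep the new one.
    cache[:] = [c for c in cache if not redundant(src, c, [cp])]
    cache.append(cp)
```

Under these assumptions, {SW5, Port E} is never crossed by traffic injected at NI 0, while at NI 4 every route through {SW10, Port S} has already crossed {SW5, Port E}, which is consistent with the two figures above.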

16
• Performed at NIs
• Process:
  ◦ Initially, all traffic is allocated to regular VNs.
  ◦ At each cycle, the post-processor module checks the messages at the head of all regular VNs in parallel.
  ◦ If a message's route crosses any of the CPs stored in the CP cache, the message is reallocated to an extra VN.
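
A minimal sketch of the isolation step follows, assuming XY routing on a 4x4 mesh numbered row by row and modelling each message by its destination only. The `post_process` function and the queue layout are illustrative assumptions, not the actual hardware design.

```python
from collections import deque

WIDTH = 4  # assumed 4x4 mesh, routers numbered row by row


def xy_route(src, dst, width=WIDTH):
    """Return the (router, output port) pairs crossed from src to dst under XY routing."""
    hops = []
    row, col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    while col != dst_col:  # X dimension first
        port = "E" if dst_col > col else "W"
        hops.append((row * width + col, port))
        col += 1 if port == "E" else -1
    while row != dst_row:  # then Y dimension
        port = "S" if dst_row > row else "N"
        hops.append((row * width + col, port))
        row += 1 if port == "S" else -1
    return hops


def post_process(src, regular_vns, extra_vn, cp_cache):
    """One cycle of the post-processor at the NI attached to router src.

    Checks the head of every regular-VN queue (done in parallel in hardware,
    sequentially here) and moves a message to the extra VN if its route
    crosses any cached congestion point.
    """
    for queue in regular_vns:
        if not queue:
            continue
        dst = queue[0]
        if any(hop in cp_cache for hop in xy_route(src, dst)):
            extra_vn.append(queue.popleft())


# Example: NI 4 with CP {SW5, Port E} cached and heads destined to 12, 15 and 6.
regular = [deque([12]), deque([15]), deque([6])]
extra = deque()
post_process(4, regular, extra, {(5, "E")})
```

In this example the messages to nodes 15 and 6 travel east through SW5 and are moved to the extra VN, while the message to node 12 goes straight south and stays in its regular VN.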

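17 [Figure: Network Interface 4 with arbiter, post-processor, CP cache (entry: SW 5, Port E), and regular-VN and extra-VN queues holding messages dst:12, dst:15 and dst:6; connected to Router 4 (in, out1, out2) with its own regular-VN and extra-VN queues]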

18
• Simulation:
  ◦ NoC simulator developed in our research group.
• Compared against FVADA/AVADA with different numbers of virtual queues
  ◦ FVADA: restricted to 4 VCs
  ◦ ICARO: uses VNs instead of VCs
• Overhead analysis:
  ◦ Tools used:
    ▪ Synthesis: Design Vision (Synopsys)
    ▪ Place & route: Encounter (Cadence)
    ▪ Library: 45 nm Nangate Open Cell (typical conditions)

Parameter      Value
Topology       8x8 2D mesh
Routing        XY
Switching      Wormhole (flit-level switching)
Flow control   Credits
Flit size      128 bits
Message size   5 flits
Traffic        0.3 f/c (background) + 1 f/c (hotspot 4-to-1, from cycle 10k to 20k)

19 [Figure; panel labels: 2VC/VN, 4VC/VN, 8VC/VN]

20
• Area overhead: ~6%.
• Power overhead: varies from 6% to 10%.

21
• Area overhead: varies from 3.8% to 6%.
• Power overhead: varies from 4.5% to 5.4%.

22
• Conclusions:
  ◦ A mechanism to avoid HoL blocking in networks-on-chip has been presented.
  ◦ ICARO isolates harmful traffic from non-harmful traffic by using VNs, achieving an overall latency improvement of up to 82%.
• Future work:
  ◦ Analyze a hierarchical CNN to improve scalability.
  ◦ Implement in-order delivery support.

23 Questions?

