Download presentation
Presentation is loading. Please wait.
1
Fly-Over: A Light-Weight Distributed Power-Gating Mechanism for Energy-Efficient Networks-on-Chip
Rahul Boyapati*, Jiayi Huang*, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim
2
Motivation NoC static power portion increases as technology shrinks.
2 NoC static power portion increases as technology shrinks. Static power saving is crucial for power-efficient NoC design. Power-Gating is one solution to save static power.
3
Router Power-Gating Router power-gating categories:
3 Router power-gating categories: Power state of the attached core OS power-gates cores based on workloads Router uses this information to power-gate attached routers Network traffic status Independent of attached core’s power state Inaccurate traffic detection leads to frequent on/off transitions
4
Router Power-Gating Router power-gating categories:
4 Router power-gating categories: Power state of the attached core (Fly-Over) OS power-gates cores based on workloads Router uses this information to power-gate attached routers Network traffic status Independent of attached core’s power state Inaccurate traffic detection leads to frequent on/off transitions
5
Challenges and Prior Work
5 Challenges Packet detour can degrade performance Network disconnection Network (re)configuration overhead Prior Work: Router Parking (Samih et al. HPCA’13) Power on more routers for network connectivity More detour around off routers Centralized control incurs high reconfiguration overhead
6
Fly-Over (FLOV)
7
Key Idea Inspired by Fly-Over transportation network
7 Inspired by Fly-Over transportation network Source: Packets can Fly Over the off routers without detour.
8
Key Idea 8 D S D S Detour Fly-Over (FLOV)
9
Fly-Over Implementation
9 FLOV Router Microarchitecture Handshake Controller Power State Registers Credit Control Logic Handshake Protocol Dynamic FLOV Routing Algorithm
10
FLOV Router Microarchitecture
10 Baseline Router Handshake Controller Credit Control Logic Input E Input N Output E FLOV Latch PSRs Input W Input S Output W Output S Output N 6 Handshake Controller Handshaking with neighbors for power state transitions Power State Registers (PSRs) Keeps power states of physical/logical neighbors Credit Control Logic Augmented to relay credits while router core is gated
11
FLOV Handshake Protocols
11 Need to facilitate distributed router power transitions. Restricted FLOV (rFLOV) No consecutive routers in a row/column can be power-gated. Simpler control but power savings limited. R C
12
FLOV Handshake Protocols
12 Need to facilitate distributed router power transitions. Generalized FLOV (gFLOV) Consecutive routers can be power-gated. Complex protocol but aggressive power savings. R C
13
FLOV Handshake Protocols
13 Active Draining Sleep Wakeup Power-Gating: Active – Draining (finish intermittent transmission) – Sleep. Power On: Sleep – Wakeup (finish intermittent transmission) – Active.
14
FLOV Routing Algorithm
14 FLOV Architecture Right-most column ALWAYS active They maintain network connectivity
15
FLOV Routing Algorithm
15 (a) Destination Partitioning. (b) Routing Example. Dynamic routing algorithm based on YX routing. best effort minimal routing.
16
Evaluation
17
Experimental Setup Architecture Tools Evaluated Schemes NoC
17 Architecture 2 GHz Alpha cores 32 KB L1 I/D$, 8 MB L2$ MESI, 4 MCs at 4 corners Tools Gem5+Booksim2 DSENT Power Model Evaluated Schemes No Power-Gating (Baseline) Router Parking (RP) Restricted FLOV (rFLOV) Generalized FLOV (gFLOV) NoC 8x8 mesh Default Y-X routing 3-stage pipeline router 3 regular virtual channel (VCs) and 1 escape VC 6-flit input buffer depth 4-flit packet for synthetic 1 mm link, 1 cycle, 16 Byte width Power Parameters 32 nm technology node 17.7 pJ power-gating overhead 10-cycle wakeup latency
18
Static Power for Synthetic Workload
18 Uniform Random (0.08 flits/node/cycle) gFLOV power-gates more routers RP keeps more routers on for network connectivity rFLOV power saving limited
19
Dynamic Power for Synthetic Workload
19 Uniform Random (0.08 flits/node/cycle) FLOV even consumes less dynamic power by Fly-Over router pipelines RP consumes highest power due to detour
20
Packet Latency for Synthetic Workload
20 Uniform Random (0.08 flits/node/cycle) FLOV is close to Baseline Best-effort minimum routing Fast FLOV links
21
Energy for PARSEC 2.1 NoC energy consumption normalized to Baseline:
21 NoC energy consumption normalized to Baseline: FLOV achieves 43% and 22% static energy reduction compared to Baseline and RP. FLOV saves 36% and 18% total energy over Baseline and RP.
22
Performance for PARSEC 2.1
22 Application full system runtime normalized to Baseline: FLOV degrades the performance less than 1%.
23
Network Reconfiguration Overhead
23 Reconfiguration starts Reconfiguration starts FLOV power-gating is light-weight in terms of latency RP’s centralized power-gating control has more than 700 cycle reconfiguration overhead, leading to high average packet latency
24
Summary Proposed Fly-Over (FLOV) power-gating mechanism
24 Proposed Fly-Over (FLOV) power-gating mechanism Distributed mechanism. Seamless NoC functionality ensured. Performance-power tradeoff achieved. FLOV comprises of Router Microarchitecture enhancement. Handshake protocols. Dynamic Routing Algorithm. FLOV saves 22% static energy compared to state-of- the-art with less than 1% performance degradation.
25
Fly-Over: Thank you
26
Fly-Over: A Light-Weight Distributed Power-Gating Mechanism for Energy-Efficient Networks-on-Chip
Rahul Boyapati*, Jiayi Huang*, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim
27
Backup
28
Total Power for Synthetic Workload
20 Uniform Random (0.08 flits/node/cycle) FLOV Aggressive power-gating Net gain from both dynamic and static power saving
29
Packet Latency Decomposition
30 Uniform Random (0.08 flits/node/cycle) FLOV latency: fast FLOV link through gated routers. Accumulated router latency: router pipeline latency in active routers
30
Static Power for Synthetic Workload
31 Tornado (0.08 flits/node/cycle) gFLOV power-gates more routers RP keeps more routers on for network connectivity rFLOV power saving limited
31
Dynamic Power for Synthetic Workload
32 Tornado (0.08 flits/node/cycle) FLOV even consumes less dynamic power by Fly-Over router pipelines RP consumes highest power due to detour
32
Packet Latency for Synthetic Workload
33 Tornado (0.08 flits/node/cycle) FLOV is close to Baseline Best-effort minimum routing Fast FLOV links
33
Total Power for Synthetic Workload
34 Uniform Random (0.08 flits/node/cycle) FLOV Aggressive power-gating Net gain from both dynamic and static power saving
34
Packet Latency Decomposition
35 Tornado (0.08 flits/node/cycle) FLOV latency: fast FLOV link through gated routers. Accumulated router latency: router pipeline latency in active routers
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.