10/03/2005: 1 Physical Synthesis of Latency Aware Low Power NoC Through Topology Exploration and Wire Style Optimization CK Cheng CSE Department UC San Diego
10/03/2005: 2 Related Work How should processing cores be mapped to a mesh-based NoC? [J. Hu, ASP-DAC03] [S. Murali, DATE04] –Minimize power consumption –Minimize average communication delay What are the most energy-efficient NoC topologies under different network sizes and technology nodes? [H. Wang, DATE05] Our work differs from the above in that it provides: –Awareness of physical constraints when implementing the NoC –Wire style optimization –Latency-aware synthesis
10/03/2005: 3 Motivation Topologies are important. Different wire styles have their own comparative advantages. Topology/wiring-style co-design improves power. Wire styles (bit energy / routing area / latency): –RC repeated wire with minimum spacing: high / smallest / slowest –RC repeated wire with large spacing: medium –Transmission line: high initial power, low for long wires / largest / fastest
10/03/2005: 4 Our Focus Physical synthesis. NoC power optimization through: –Topology selection –Wire style optimization. Communication-latency-aware low-power synthesis.
10/03/2005: 5 Design Flow
10/03/2005: 6 Topology Library Table: number of topologies + placements for NoC sizes 4x4, 5x5, 6x6, 7x7, 8x8. We study topologies that have identical row and column connections (Degree <= 3). These cover most popular topologies, such as mesh, torus, hypercube, octagon, twisted cube, etc.
10/03/2005: 7 Power and Delay Lib: Routers Orion simulator; 0.18um technology node, Vdd = 1.8V; 1GHz frequency, 4-flit buffer size, 128-bit flit size
10/03/2005: 8 Power and Delay Lib: Wires –Unit wire length is 2mm; min global pitch = 1.44um –Power and delay of RC wires are proportional to wire length –Power and delay of T-lines include a setup cost: P(setup) = 4.4pJ/bit, D(setup) = 50ps
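The wire model on this slide (length-proportional cost for RC wires, a fixed setup cost plus a length term for T-lines) can be sketched as a small linear cost model. This is a minimal illustration, not the authors' library: only the T-line setup values (4.4pJ/bit, 50ps) come from the slide, while the class name `WireStyle`, the helper `break_even_length_mm`, and every per-mm slope below are hypothetical placeholders chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WireStyle:
    name: str
    setup_energy_pj: float    # fixed energy per bit, independent of length
    energy_pj_per_mm: float   # length-proportional energy per bit
    setup_delay_ps: float     # fixed delay, independent of length
    delay_ps_per_mm: float    # length-proportional delay

    def energy_pj(self, length_mm: float) -> float:
        return self.setup_energy_pj + self.energy_pj_per_mm * length_mm

    def delay_ps(self, length_mm: float) -> float:
        return self.setup_delay_ps + self.delay_ps_per_mm * length_mm


def break_even_length_mm(rc: WireStyle, tline: WireStyle) -> float:
    """Length beyond which the T-line uses less energy per bit than the RC wire."""
    slope_gap = rc.energy_pj_per_mm - tline.energy_pj_per_mm
    if slope_gap <= 0:
        return float("inf")  # the T-line never catches up
    return (tline.setup_energy_pj - rc.setup_energy_pj) / slope_gap


# Setup costs are taken from the slide; all per-mm slopes are ASSUMED values.
RC_WIRE = WireStyle("rc_repeated", 0.0, 1.0, 0.0, 60.0)
T_LINE = WireStyle("t_line", 4.4, 0.2, 50.0, 10.0)
```

With these placeholder slopes, the RC wire is cheaper for a single 2mm unit segment and the T-line becomes cheaper on longer links, which matches the "high initial power, low for long wire" behaviour described in the Motivation slide.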
10/03/2005: 9 Experiment Results
10/03/2005: 10 (1) Wire Style Optimization 4x4 torus under various evenly distributed comm. demands
10/03/2005: 11 (1) Wire Style Optimization Table columns: Comm. demand (Gb/s); Power w/o opt. (W); Power w/ opt. (W); Improvement (%); rows run up to the maximum demand (Max). Power improvement diminishes as comm. demand increases. Even at the maximum comm. demand, there is still room for wire style optimization.
10/03/2005: 12 (2) Topology Selection (Power, Bandwidth) Optimal 4x4 topologies under various evenly distributed comm. demands
10/03/2005: 13 (2) Topology Selection (Power, Bandwidth) Table columns: (topology, demand (Gb/s)); Total power (W); Wire power (W); Router power (W). Rows: (mesh,1), (torus,1), (full,1), (mesh,25), (torus,25), (full,25), (mesh,40), (torus,40), (full,40)
10/03/2005: 14 (2) Topology Selection (Latency, Power, BW) Comparison of optimal topology with mesh, torus and hypercube in terms of power and latency
10/03/2005: 15 (2) Topology Selection (Latency, Power, BW) Minimum (Power x Latency)
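Selecting by minimum power x latency can be sketched as a simple argmin over candidate topologies. The names `Candidate`, `select_min_product`, and `improvement_pct` are hypothetical, the percentage formula is an assumption about how a relative improvement in the product would be computed, and the numbers in `CANDIDATES` are made up for illustration rather than taken from the slides.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str
    power_w: float       # total power of the synthesized network
    latency_hops: float  # average latency, here measured in hops


def power_latency_product(c: Candidate) -> float:
    return c.power_w * c.latency_hops


def select_min_product(candidates: Sequence[Candidate]) -> Candidate:
    """Return the candidate with the smallest power x latency product."""
    if not candidates:
        raise ValueError("no candidate topologies given")
    return min(candidates, key=power_latency_product)


def improvement_pct(best: Candidate, reference: Candidate) -> float:
    """Relative reduction of the product versus a reference topology, in percent."""
    return 100.0 * (1.0 - power_latency_product(best) / power_latency_product(reference))


# Illustrative values only; not measurements from the talk.
CANDIDATES = [
    Candidate("mesh", 2.0, 4.0),
    Candidate("torus", 2.5, 3.0),
    Candidate("custom", 2.0, 3.0),
]
```

Here `select_min_product(CANDIDATES)` picks the `custom` entry, and `improvement_pct` reports how far its product falls below that of the mesh entry.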
10/03/2005: 16 (2) Topology Selection (Latency, Power, BW) Optimal 8x8 topologies under various on-chip area resources
10/03/2005: 17 (2) Topology Selection (Latency, Power, BW) Comparison of optimal topology with mesh and torus in terms of flow-hops
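The slide does not define flow-hops, so the sketch below assumes it means the demand of each flow multiplied by the hop count of its shortest path, summed over all source-destination pairs. Under that assumption, and with evenly distributed demand, a k x k mesh and torus can be compared with a breadth-first search. The function names are hypothetical.

```python
from collections import deque
from itertools import product


def grid_topology(k: int, wrap: bool) -> dict:
    """Adjacency sets of a k x k mesh (wrap=False) or torus (wrap=True)."""
    adj = {node: set() for node in product(range(k), repeat=2)}
    for r, c in adj:
        for dr, dc in ((0, 1), (1, 0)):
            nr, nc = r + dr, c + dc
            if wrap:
                nr, nc = nr % k, nc % k
            if nr < k and nc < k and (nr, nc) != (r, c):
                adj[(r, c)].add((nr, nc))
                adj[(nr, nc)].add((r, c))
    return adj


def hop_counts(adj: dict, source) -> dict:
    """Shortest-path hop count from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def total_flow_hops(adj: dict, demand_gbps: float = 1.0) -> float:
    """Sum of demand x hops over all ordered pairs, assuming uniform demand."""
    total = 0.0
    for src in adj:
        dist = hop_counts(adj, src)
        total += demand_gbps * sum(d for dst, d in dist.items() if dst != src)
    return total
```

For a 4x4 network with unit demand per pair, this gives 640 flow-hops for the mesh and 512 for the torus, so the wrap-around links reduce the total by 20% before any wire or power cost is considered.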
10/03/2005: 18 (3) Power and Latency Tradeoffs for Optimal 8x8 Topologies
10/03/2005: 19 (4) Video Applications
10/03/2005: 20 (4) Video Applications
10/03/2005: 21 Conclusions Simultaneous optimization of topologies and wire styles. The power-latency product improves by up to 52.1%, 29.4%, and 35.6% compared with mesh, torus, and hypercube topologies, respectively. The topology library covers most classic direct network topologies and extends far beyond them.