Networks-on-Chip
Seminar contents
- The premises
- Homogeneous and heterogeneous Systems-on-Chip and their interconnection networks
- The Network-on-Chip approach
Slide from S. Tota and M. R. Casu [1]
The premises
- The System-on-Chip (SoC) today
  - Heterogeneous: ~10 IPs
  - Homogeneous (MP-SoC): ~10 uP (with exceptions)
  - On-chip bus (AMBA, CoreConnect, Wishbone, …)
  - IPs and uPs are sold with a proprietary bus interface (IF)
- Near- and long-term forecast
  - 100 IP/uP: buses are not scalable!
  - Physical design issues: signal integrity, power consumption, timing closure
  - Clock issues: is it time for the Globally Asynchronous paradigm? (Still locally synchronous)
  - Need for “more regular” design
Slide from S. Tota and M. R. Casu [1]
Today’s heterogeneous SoC
[Figure: CPU, DSP, MEM, embedded FPGA, dedicated IP, and I/O blocks attached to an interconnection network (bus)]
Slide from S. Tota and M. R. Casu [1]
Maya (Rabaey’00) Slide from S. Tota and M. R. Casu [1]
Homogeneous SoC (MP-SoC)
[Figure: eight CPU/MEM pairs attached to an interconnection network (bus, crossbar)]
Slide from S. Tota and M. R. Casu [1]
MP-SoC: Cisco CRS-1 Router CRS-1 Router uses 188 extensible network processors per “Silicon Packet Processor” chip 16 PPE Clusters of 12 PPEs each Slide from S. Tota and M. R. Casu [1]
Very long wires
[Figure: a wire from A to B across the chip, shown for year 2005 with a 1 ns clock period (1 GHz) and for year 2010 with a 0.1 ns clock period (10 GHz)]
Slide from S. Tota and M. R. Casu [1]
Bus pros and cons
Cons:
- Every unit attached adds parasitic capacitance, so electrical performance degrades as the bus grows.
- Bus timing is difficult in a deep-submicron process.
- Bus arbiter delay grows with the number of masters. The arbiter is also instance-specific.
- Bandwidth is limited and shared by all attached units.
Pros:
- The silicon cost of a bus is small.
- Any bus is almost directly compatible with most available IPs, including software running on CPUs.
- The concepts are simple and well understood.
Slide from S. Tota and M. R. Casu [1]
What are NoCs?
According to Wikipedia: “Network-on-a-chip (NoC) is a new paradigm for System-on-Chip (SoC) design. NoC-based systems accommodate multiple asynchronous clocking that many of today's complex SoC designs use. The NoC solution brings a networking method to on-chip communications and claims roughly a threefold performance increase over conventional bus systems.”
Slide from S. Tota and M. R. Casu [1]
NoC example
[Figure: nine processors (masters), nine routing nodes, one global memory (slave), and two global I/O blocks (slaves)]
Slide from S. Tota and M. R. Casu [1]
Basic ingredients of a NoC
- N computational resources: Processing Elements (PE)
- 1 connection topology
- 1 routing technique
- M N switches
- N network interfaces
- 1 addressing system
- 1 communication protocol
- 1 programming model: message passing or shared memory
Slide from S. Tota and M. R. Casu [1]
Problems
- Internal network contention causes (often unpredictable) latency.
- The network occupies a significant silicon area.
- Bus-oriented IPs need smart wrappers.
- Software needs clean synchronization in multiprocessor systems.
- System designers need reeducation for new concepts.
Slide from S. Tota and M. R. Casu [1]
Network on Chip (NoC)
- Adoption of a network-based packet communication paradigm
- Use abstraction and layering to decouple communication from computation
- Distribute the responsibility for reliable transmission evenly over higher and lower layers of abstraction
[Figure: protocol stack abstraction, top to bottom: Software (application, system); Architecture and control (transport, network, data link); Physical (wiring)]
Benini & De Micheli, Computer 2002
Slide from L. Benini [2]
Physical layer - Synchronization
- Physical design: voltage levels, driver design, sizing, physical routing
- Synchronization: how and when to sample the channel?
  - Avoid a clock: asynchronous communication
  - The clock travels with the data
  - The clock can be reconstructed from the data
- Synchronization recovery has a cost
  - Cannot be abstracted away
  - Can cause errors (e.g., metastability)
Slide from L. Benini [2]
Data-link layer
- Provide reliable data transfer on an unreliable physical channel
- Access to the communication medium: dealing with contention and arbitration
- Issues
  - Fairness and safe communication
  - Achieve high throughput
  - Error resiliency
Slide from L. Benini [2]
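As an illustration of the arbitration and fairness issues listed for the data-link layer, here is a minimal sketch of a round-robin arbiter for one shared link. The slides do not prescribe any particular arbitration scheme; the class and method names are hypothetical.

```python
class RoundRobinArbiter:
    """Grants a shared link to one requesting port per cycle.

    Priority rotates so that the port granted last gets the lowest
    priority next time, which keeps any single port from starving others.
    (Illustrative only; not taken from the slides.)
    """

    def __init__(self, num_ports):
        self.num_ports = num_ports
        # Start as if the last port was just served, so port 0 goes first.
        self.last_grant = num_ports - 1

    def grant(self, requests):
        """requests: list of bools, one per port. Returns a port index or None."""
        for offset in range(1, self.num_ports + 1):
            port = (self.last_grant + offset) % self.num_ports
            if requests[port]:
                self.last_grant = port
                return port
        return None  # nobody is requesting the link this cycle


arbiter = RoundRobinArbiter(3)
# Ports 0 and 1 request continuously; grants alternate between them.
grants = [arbiter.grant([True, True, False]) for _ in range(4)]
print(grants)
```

A fixed-priority arbiter would be simpler, but under continuous requests it would always serve port 0, which is the fairness problem the slide refers to.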
Topologies
- Heritage of networks, with new constraints
- Need to accommodate interconnects in a 2D layout
- Cannot route long wires (clock frequency bound)
[Figure: (a) SPIN, (b) CLICHÉ, (c) torus, (d) folded torus, (e) octagon, (f) BFT]
Slide from S. Tota and M. R. Casu [1]
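To make the topology trade-off concrete, the sketch below compares the minimal hop count between two nodes in a 2D mesh and in a 2D torus of the same size. It assumes nodes are addressed by (x, y) grid coordinates; the function names are hypothetical and not from the slides.

```python
def mesh_hops(src, dst):
    """Minimal hop count in a 2D mesh: the Manhattan distance."""
    (sx, sy), (dx, dy) = src, dst
    return abs(dx - sx) + abs(dy - sy)


def torus_hops(src, dst, width, height):
    """Minimal hop count in a 2D torus, where each row and column wraps around."""
    (sx, sy), (dx, dy) = src, dst
    step_x = abs(dx - sx)
    step_y = abs(dy - sy)
    # In each dimension, go the direct way or around the wrap-around link,
    # whichever is shorter.
    return min(step_x, width - step_x) + min(step_y, height - step_y)


# Corner to corner in a 4x4 network.
print(mesh_hops((0, 0), (3, 3)))         # 6
print(torus_hops((0, 0), (3, 3), 4, 4))  # 2
```

The torus saves hops, but its wrap-around links are long wires in a plain 2D layout, which conflicts with the "cannot route long wires" constraint; the folded torus rearranges the nodes so that all links have similar length.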
Switching
- Again, techniques inherited from computer and communication networks
- New constraints in silicon: area and power
  - Use as few buffers as possible
- Store-and-forward and virtual cut-through
  - Need buffer space for an entire packet: unsuited!
- Wormhole: limited buffer size
- Deflection routing, a.k.a. “hot potato”
- Virtual channels
  - Increase buffer size…
Slide from L. Benini [2]
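The buffer and latency difference between store-and-forward and wormhole switching can be sketched with a simple zero-load model. The assumptions here are mine, not the slides': a packet is split into equal-sized flits, each link moves one flit per cycle, and there is no contention. Function names are hypothetical.

```python
def store_and_forward_latency(hops, packet_flits):
    """Each router receives the whole packet before forwarding it,
    so every hop costs packet_flits cycles."""
    return hops * packet_flits


def wormhole_latency(hops, packet_flits):
    """The head flit advances one hop per cycle and the remaining flits
    follow in a pipeline right behind it."""
    return hops + packet_flits - 1


def min_buffer_flits(switching, packet_flits):
    """Smallest per-port input buffer (in flits) each scheme can work with."""
    if switching == "store_and_forward":
        return packet_flits  # must hold an entire packet
    if switching == "wormhole":
        return 1             # a single flit is enough in principle
    raise ValueError(f"unknown switching scheme: {switching}")


hops, flits = 4, 8
print(store_and_forward_latency(hops, flits))  # 32 cycles
print(wormhole_latency(hops, flits))           # 11 cycles
```

Under this model, virtual cut-through has the same zero-load latency as wormhole but still reserves buffer space for a whole packet per port, which is why the slide groups it with store-and-forward as unsuited to on-chip area budgets.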
Routing
- Deterministic vs. adaptive
  - Simple vs. complicated routing logic
  - Deadlock freedom is easy vs. hard to guarantee
  - Prone vs. robust to congestion
- 2D dimension-order routing (XY) is the most used static routing in NoCs (e.g., with wormhole switching on a mesh)
Slide from L. Benini [2]
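XY dimension-order routing is simple enough to sketch in a few lines. The version below assumes a 2D mesh with (x, y) node coordinates and returns the full list of nodes visited; the function name is hypothetical.

```python
def xy_route(src, dst):
    """Deterministic XY routing on a 2D mesh.

    The packet first travels along X until its column matches the
    destination, then along Y. Returns the visited nodes, including
    src and dst.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:            # phase 1: correct the X coordinate
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:            # phase 2: correct the Y coordinate
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))
```

Because a packet never turns from the Y dimension back into X, the channel dependencies cannot form a cycle on a mesh, which is why deadlock freedom is easy to obtain with this scheme. The price is that every packet between a given pair of nodes uses the same path, so the route cannot adapt to congestion.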
Transport layer
- Decompose and reconstruct information
- Important choices
  - Packet granularity
  - Admission/congestion control
  - Packet retransmission parameters
- All these factors heavily affect energy and performance
- Application-specific schemes vs. standards
Slide from L. Benini [2]
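The "decompose and reconstruct" role of the transport layer can be illustrated with a small packetization sketch. The header fields (src, dst, seq, total) and the function names are assumptions chosen for illustration, not a format defined in the slides.

```python
def packetize(message: bytes, payload_size: int, src: int, dst: int):
    """Split a message into packets carrying at most payload_size bytes each."""
    chunks = [message[i:i + payload_size]
              for i in range(0, len(message), payload_size)] or [b""]
    return [
        {"src": src, "dst": dst, "seq": seq, "total": len(chunks), "payload": chunk}
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Rebuild the message from packets that may arrive out of order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing packets; a retransmission would be needed")
    return b"".join(p["payload"] for p in ordered)


packets = packetize(b"hello world", payload_size=4, src=0, dst=5)
print(len(packets))                          # 3
print(reassemble(list(reversed(packets))))   # b'hello world'
```

The payload_size parameter is the packet granularity knob from the slide: smaller packets reduce buffering per router but spend a larger fraction of the transferred bits on headers.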
System software
- Programming paradigms
  - Shared memory
  - Message passing
- Middleware: layered system software
  - Should provide low communication latency
  - Modular, scalable, robust, …
Slide from L. Benini [2]
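The two programming paradigms can be contrasted with a toy example in which several workers sum a list of numbers. This is only a software analogy using Python threads, with hypothetical function names; it says nothing about how a particular NoC implements either model.

```python
import queue
import threading


def shared_memory_sum(values, workers=2):
    """Workers update one shared accumulator, protected by a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for v in chunk:
            with lock:           # explicit synchronization on shared state
                total += v

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(values, workers=2):
    """Workers keep private state and send one result message each."""
    inbox = queue.Queue()

    def work(chunk):
        inbox.put(sum(chunk))    # communicate by sending a message

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(workers))


data = list(range(100))
print(shared_memory_sum(data), message_passing_sum(data))
```

In the shared-memory version, correctness depends on every access taking the lock; in the message-passing version, the only shared object is the communication channel itself.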
Who first had the idea?
The most cited papers according to Google (number of citations):
- Guerrier’00 (204), A Generic Architecture for On-Chip Packet-Switched Interconnections
- Dally’01 (392), Route Packets, Not Wires: On-Chip Interconnection Networks
- Benini’02 (417), Networks on Chips: A New SoC Paradigm
- Kumar’02 (184), A Network on Chip Architecture and Design Methodology
Slide from S. Tota and M. R. Casu [1]
Some NoC references
- J. Rabaey et al., “A 1-V heterogeneous reconfigurable DSP IC for wireless baseband digital signal processing,” IEEE Journal of Solid-State Circuits, vol. 35, no. 11, Nov. 2000, pp
- P. Guerrier and A. Greiner, “A Generic Architecture for On-Chip Packet-Switched Interconnections,” Proc. Design, Automation and Test in Europe (DATE), pp , Mar
- A. Andriahantenaina et al., “SPIN: a Scalable, Packet Switched, On-chip Micro-network,” Proc. Design, Automation and Test in Europe (DATE), Mar
- L. Benini and G. De Micheli, “Networks on Chips: A New SoC Paradigm,” Computer, vol. 35, no. 1, Jan. 2002, pp
- S. Kumar et al., “A network on chip architecture and design methodology,” in Proc. ISVLSI,
- W. J. Dally and B. Towles, “Route packets, not wires: on-chip interconnection networks,” in Proc. Design Automation Conf.,
- K. Goossens et al., “Trade-offs in the design of a router with both guaranteed and best-effort services for networks on chip,” IEE Proc.-Comput. Digit. Tech., vol. 150, no. 5, Sep. 2003, pp
- P. P. Pande et al., “Performance Evaluation and Design Trade-offs for Network-on-Chip Interconnect Architectures,” IEEE Trans. Computers, vol. 54, no. 8, Aug. 2005, pp
Slide from S. Tota and M. R. Casu [1]
References
1. S. Tota and M. R. Casu, “Networks-on-Chip,” presentation
2. L. Benini, “Networks on chip,” presentation,