NoC: Network OR Chip? Israel Cidon Technion.

Presentation transcript:

NoC: Network OR Chip? Israel Cidon Technion

Technion’s NoC Research
PIs:
- Israel Cidon (networking)
- Ran Ginosar (VLSI)
- Idit Keidar (distributed systems)
- Avinoam Kolodny (VLSI)
Students: Evgeny Bolotin, Reuven Dobkin, Zvika Guz, Arkadiy Morgenshtein, Zigi Walter, Roman Gindin

Origins of the NoC concept
Early publications:
- Guerrier and Greiner (2000) – “A generic architecture for on-chip packet-switched interconnections”
- Hemani, Jantsch, Kumar, Postula, Oberg, Millberg and Lindqvist (2000) – “Network on chip: An architecture for billion transistor era”
- Dally and Towles (2001) – “Route packets, not wires: on-chip interconnection networks”
- Wingard (2001) – “MicroNetwork-based integration of SoCs”
- Rijpkema, Goossens and Wielage (2001) – “A router architecture for networks on silicon”
- De Micheli and Benini (2002) – “Networks on chip: A new paradigm for systems on chip design”
- Bolotin, Cidon, Ginosar and Kolodny (2004) – “QNoC: QoS architecture and design process for network on chip”

Evolution or Paradigm Shift?
[Figure labels: network link, network router, computing module, bus]
- Architectural paradigm shift: replace wire spaghetti by an intelligent network infrastructure
- Design paradigm shift: busses and signals replaced by packets
- Organizational paradigm shift: create a new discipline, a new infrastructure responsibility

Characteristics of a successful paradigm shift
- Addresses a critical and topical need
- Enables a quantum leap in productivity and application
- Meets resistance from legacy experts
- Requires a major change of mindset and skills!
Think: Networking, not Bus evolution!

Critical needs addressed by NoC
1) Efficient interconnect: delay, power, noise, scalability, reliability
2) Increase system integration productivity
3) Enable Chip Multi Processors

NoC offers Area and Power Scalability
For the same performance, compare the wire area and power of: NoC, simple bus, point-to-point, segmented bus.
E. Bolotin et al., “Cost Considerations in Network on Chip”, Integration, special issue on Network on Chip, October 2004

4 Decades of Network 101
- Evolved from busses and p-t-p connections
- Extensive architectures, modeling and analysis research
- Architecture is about optimizing network costs
- Different goals and element costs => different architectures:
  - Local Area Networks (LANs)
  - Metropolitan Area Networks (MANs)
  - System interconnect networks (SAN, InfiniBand …)
  - WAN (TCP/IP, ATM…)
  - Wireless networks
- Cross layered design
- Early architecture standardization is an optimization burden!

Local Area Networks (LANs)
- Critical need: distributing operations and sharing of heterogeneous systems
- Constraints: standardization
- Main cost: incremental cost (NICs, wiring)
- Typical optimized architecture:
  - Low cost hubs/switches
  - Tree like architecture
  - Exploit low cost local BW
  - Shared media
  - Broadcast
  - Host embedded NICs

System interconnect (SAN, InfiniBand)
- Critical need: create a powerful specialized system from low cost units
- Constraints: low latency
- Main cost: total system cost per MIP
- Typical architecture:
  - Wormhole/cut through
  - Connection based
  - Over-provisioned network
  - High degree/regular topology
  - Specific optimizations (e.g. RDMA)

WAN (TCP/IP, ATM…)
- Critical need: global application networking (collaboration, WWW, file sharing, voice)
- Constraints:
  - Scalability
  - Heterogeneous user and application QoS requirements
- Main cost: physical infrastructure (mainly long distance trunks)
- Typical architecture of choice:
  - Packet switching
  - Irregular, small degree networks of high speed trunks
  - Optimization of topology and link capacities

CAN optimization
- The design envelope (constraints):
  - Collection of designs supported by a given chip
  - Convex hull of the traffic requirements of all configurations
  - QoS constraints
  - Other requirements (e.g. design automation…)
- The main cost(s):
  - Total area
  - Power
  - Others: design time, verification and testability
- Optimization variables:
  - Switching mechanism
  - QoS
  - Topology (incl. link capacities)
  - Routing
  - Flow and congestion control
  - Buffering
  - Application support
  - …

One NoC does not fit all!
[Figure: reconfiguration rate (at design time, at boot time, during run time) versus flexibility (from single application to general purpose computer); design points: ASIC, FPGA, ASSP, CMP]
I. Cidon and K. Goossens, in “Networks on Chips”, G. De Micheli and L. Benini, Morgan Kaufmann, 2006

One NoC does not fit all!
[Figure: traffic unpredictability (at design time, at configuration, at run time) versus flexibility (from single application to general purpose computer); design points: ASIC, FPGA, ASSP, CMP]
A large solution range!
I. Cidon and K. Goossens, in “Networks on Chips”, G. De Micheli and L. Benini, Morgan Kaufmann, 2006

Apply paradigm to ASIC based NoC
- Design envelope / constraints:
  - Well-defined inter-module traffic
  - Automatic synthesis
  - Variable QoS requirements
- Main cost: power and area
- Architecture of choice:
  - Wormhole or small frame switching
  - Small # of buffers, VCs, tables
  - Simple QoS mechanisms (which?)
  - Topology and routing optimized for cost

Example: QNoC
Quality-of-service NoC architecture for ASICs
- Traffic requirements are known a priori
- Overall approach:
  - Wormhole switching
  - QoS based on priority classes
  - Small buffer/VC budget
  - In-order SP XY routing
  - Irregular topology
  - Optimized link capacities
[Figure: irregular mesh of routers (R) and modules labeled with grid coordinates]
* E. Bolotin, I. Cidon, R. Ginosar and A. Kolodny, “QNoC: QoS architecture and design process for Network on Chip”, JSA special issue on NoC, 2004.

Quality-of-Service in QNoC
- Multiple priority classes
  - Define latency
  - Preemptive
- Possible ASIC classes:
  - Signaling
  - Real Time Stream
  - Read-Write
  - DMA Block Transfer
- Statistical guarantees
  - E.g. <0.01% arrive later than required
* E. Bolotin, I. Cidon, R. Ginosar and A. Kolodny, “QNoC: QoS architecture and design process for Network on Chip”, JSA special issue on NoC, 2004.
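A minimal sketch of how preemptive priority classes could be arbitrated on a single output link, assuming one FIFO of flits per class and strict priority in the order the classes are listed on the slide (Signaling highest, Block Transfer lowest). The function names, the queue layout and the one-flit-per-cycle link are illustrative assumptions, not details taken from the QNoC papers.

```python
from collections import deque

# Service classes in descending priority, following the order on the slide.
CLASSES = ["signaling", "real_time", "read_write", "block_transfer"]

def next_flit(queues):
    """Return (class, flit) from the highest-priority non-empty queue, or None."""
    for cls in CLASSES:
        if queues[cls]:
            return cls, queues[cls].popleft()
    return None

def drain(queues, arrivals=None):
    """Send one flit per cycle; `arrivals` maps cycle -> list of (class, flit)."""
    arrivals = arrivals or {}
    sent, cycle = [], 0
    # Keep going while flits are queued or future arrivals are still pending.
    while any(queues.values()) or any(c >= cycle for c in arrivals):
        for cls, flit in arrivals.get(cycle, []):
            queues[cls].append(flit)
        choice = next_flit(queues)
        if choice is not None:
            sent.append(choice)
        cycle += 1
    return sent

if __name__ == "__main__":
    q = {cls: deque() for cls in CLASSES}
    q["block_transfer"].extend(["B0", "B1", "B2"])
    # A signaling flit arriving at cycle 1 preempts the block transfer.
    print(drain(q, {1: [("signaling", "S0")]}))
```

Because arbitration is redone for every flit, a long block transfer is interrupted as soon as a higher-class flit shows up, which is the behavior "preemptive" refers to here.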

QNoC Design Flow
- Extract inter-module traffic
- Place modules
- Allocate link capacities
- Verify QoS and cost

QNoC Design Flow
- Extract inter-module traffic
- Place modules
- Allocate link capacities
- Verify QoS and cost
Optimize capacity for performance/power tradeoff
Capacity allocation is a traditional WAN optimization problem, however:

Wormhole Delay Modeling
Approximate delay analysis in wormhole networks:
- Multiple Virtual-Channels
- Different link capacities
- Different communication demands
Queuing delay:
Flit interleaving delay approximation:
* I. Walter, Z. Guz, I. Cidon, R. Ginosar and A. Kolodny, “Efficient Link Capacity and QoS Design for Wormhole Network-on-Chip,” DATE 2006.

The Capacity Allocation Problem
Given:
- System topology and routing
- Each flow’s bandwidth (f_i) and delay bound (T_i^REQ)
Minimize total link capacity
Such that:
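The problem statement can be written compactly as below. This is a sketch under assumed notation: $C_e$ for the capacity of link $e$, $E$ for the set of links, and $T_i(\cdot)$ for the delay of flow $i$ as predicted by the wormhole delay model. Only $f_i$ and $T_i^{REQ}$ appear on the slide, and the constraint shown is the natural reading of "delay bound", not a quotation of the original formula.

```latex
\begin{aligned}
\min_{\{C_e\}} \quad & \sum_{e \in E} C_e \\
\text{s.t.} \quad & T_i\bigl(\{C_e\}, \{f_j\}\bigr) \le T_i^{REQ}
  \qquad \text{for every flow } i
\end{aligned}
```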

Capacity Allocation – Realistic Example
A SoC-like system with realistic traffic demands and delay requirements
- “Classic” design: 41.8 Gbit/sec
- Using the algorithm: 28.7 Gbit/sec
- Total capacity reduced by about 30%
[Figure: 3x4 mesh (nodes 00 to 23) before and after optimization]

Optimizing routing on Irregular Mesh
[Figure: problem cases “Around the Block” and “Dead End”]
Goal: Minimize the total size of routing tables
E. Bolotin, I. Cidon, R. Ginosar and A. Kolodny, “Routing Table Minimization for Irregular Mesh NoCs”, DATE 2007.

Saving Table Hardware
- Traditional solutions use full routing tables:
  - Destination Based Routing – at router
  - Source Routing – at sources
- Solution idea: use reduced tables
  - Store only relevant destinations (PLA)
  - Default function (“Go XY” or “Don’t turn”) + table for deviations
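The reduced-table idea can be sketched as a default XY function plus a small per-router table that is consulted only where XY would fail. This is an illustrative sketch, not the authors' implementation: the coordinate convention, the port names and the `deviations` dictionary layout are assumptions made for the example.

```python
# Port name -> (dx, dy) step on the mesh; "N" increases y (assumed convention).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

def xy_next_hop(cur, dst):
    """Default function ("Go XY"): correct X first, then Y."""
    (cx, cy), (dx, dy) = cur, dst
    if cx != dx:
        return "E" if dx > cx else "W"
    if cy != dy:
        return "N" if dy > cy else "S"
    return "LOCAL"

def next_hop(cur, dst, deviations):
    """Use the deviation entry if one exists, otherwise fall back to XY."""
    return deviations.get(cur, {}).get(dst, xy_next_hop(cur, dst))

def route(src, dst, deviations, max_hops=64):
    """Follow next-hop decisions from src to dst and return the visited nodes."""
    path, cur = [src], src
    while cur != dst:
        if len(path) > max_hops:
            raise RuntimeError("routing loop or unreachable destination")
        dx, dy = STEP[next_hop(cur, dst, deviations)]
        cur = (cur[0] + dx, cur[1] + dy)
        path.append(cur)
    return path

if __name__ == "__main__":
    # Suppose position (1, 0) is occupied by a large module and has no router.
    # One deviation entry at (0, 0) is enough to steer traffic around it.
    deviations = {(0, 0): {(2, 0): "N"}}
    print(route((0, 0), (2, 0), deviations))
```

Only the router at (0, 0) stores an entry, and only for the one destination that plain XY cannot reach; every other decision comes from the default function, which is where the table savings come from.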

Routing Heuristics for Irregular Mesh
- Distributed Routing (full tables)
- X-Y Routing with Deviation Tables
- Source Routing
- Source Routing for Deviation Points
Random problem instances; systems with real applications

Efficient Routing Results
[Figure: scaling of savings; savings versus network size]

NoC for Shared Memory CMP
- Constraints:
  - Multiple access to coherent cache
  - Unpredictable traffic pattern
  - QoS requirements (fetch, pre-fetch)
- Main cost: CMP power / area per performance
- Architecture of choice:
  - Tailored for a given CMP
  - In-order/adaptive routing?
  - Simple QoS mechanisms?
  - Regular topology? Is the CMP symmetric?
  - Built-in support functions (multicast, search…)

NoC can facilitate critical transactions
* E. Bolotin, Z. Guz, I. Cidon, R. Ginosar and A. Kolodny, “The Power of Priority: NoC based Distributed Cache Coherency”, NoCs 2007.

Priority NoC: Results

NoC Based FPGA Architecture
[Figure labels: functional unit, NoC for inter-routing, routers, configurable region (user logic), configurable network interface]
(1) Future FPGAs will be NoC-based and (2) the design will be 2-tiered, or hierarchical.

NoC for FPGA
- Design envelope / constraints:
  - Many ASIC-like applications for a given FPGA
  - Hard NoC infrastructure: efficient but inflexible
  - Soft logic is reusable but has inferior performance
- Average NoC cost of most demanding designs:
  - Hard grid links and router logic
  - Total configured NoC logic used
- Architecture of choice:
  - Regular and uniform grid
  - In-order/load balanced routing
  - Hard logic for links, routers
  - Soft logic for routing algorithms, headers, CNIs
  - Soft NoC tuning (routing, CNI) for a given implementation

Source Toggle XY
- Unlike TXY, traffic to the same destination is not split
- Maximum capacity similar to TXY
- The route is a bitwise XOR of source and destination ID
- Can be extended to weighted source toggle (WOT)
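One way to read the XOR rule is sketched below: each source-destination pair is deterministically assigned either the XY or the YX order, so a given pair never splits its traffic while different pairs spread over both orders. The slide only states that the route is a bitwise XOR of the IDs; the row-major node numbering and the use of the parity of the XOR result as the selector are assumptions made for this example.

```python
def node_id(x, y, width):
    """Row-major node numbering (an assumed ID encoding)."""
    return y * width + x

def use_xy(src, dst, width):
    """True -> route X then Y, False -> route Y then X (assumed parity rule)."""
    x = node_id(*src, width) ^ node_id(*dst, width)
    return bin(x).count("1") % 2 == 0

def _straight(a, b):
    """Nodes on the axis-aligned segment from a to b, inclusive."""
    (ax, ay), (bx, by) = a, b
    if ay == by:
        step = 1 if bx >= ax else -1
        return [(x, ay) for x in range(ax, bx + step, step)]
    step = 1 if by >= ay else -1
    return [(ax, y) for y in range(ay, by + step, step)]

def stxy_route(src, dst, width):
    """Full path for one source-destination pair; fixed per pair."""
    (sx, sy), (dx, dy) = src, dst
    corner = (dx, sy) if use_xy(src, dst, width) else (sx, dy)
    return _straight(src, corner) + _straight(corner, dst)[1:]

if __name__ == "__main__":
    print(stxy_route((0, 0), (2, 1), width=4))  # IDs 0 and 6, XOR 0b110
    print(stxy_route((0, 0), (3, 1), width=4))  # IDs 0 and 7, XOR 0b111
```

Because the selector depends only on the two IDs, it can be evaluated at the source with no per-packet state.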

Two Hotspots
Design envelope for various distances between the hotspots for WOT
[Figure: maximum capacity]

Generic NoC Problems
Many shared problems across the design spectrum, for example:
- Need for a low latency class of service
- Verification and predictability
- Power control of NoCs
- Centralized vs. distributed control
- Is a single NoC per chip enough? Bus examples suggest otherwise
- Hot modules slow incoming NoC traffic:
  - Off chip systems
  - Shared memory subsystems
  - Expensive functional units

NoC clogging by hot modules
[Figure: modules IP1, IP2 and IP3 attached to the NoC through interfaces]
- HM is not a local problem
- Transparent to NoC performance
Walter, Cidon, Ginosar and Kolodny, “Access Regulation to Hot-Modules in Wormhole NoCs”, NOCS 2007.

Source Fairness
[Figure: IP (HM) and its interface]
- No “fairness” is guaranteed, since routers’ arbitration is based on local state
- The farther a source is from the destination, the more arbitrations its worm has to win
- The HM bandwidth isn’t fairly shared
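A toy calculation shows how purely local arbitration skews the shares. It assumes a chain of routers leading to the hot module, one saturated source per router, and round-robin arbitration at each router between its local source and the single upstream link. This model and the function name are illustrative assumptions, not taken from the cited paper.

```python
from fractions import Fraction

def hm_bandwidth_shares(n_sources):
    """Share of hot-module bandwidth per source on a chain, nearest first.

    Each router splits its output evenly between its local source and the
    upstream link, so every extra hop halves what is left for sources
    farther away. The farthest source has no upstream competitor.
    """
    shares, remaining = [], Fraction(1)
    for k in range(n_sources):
        if k == n_sources - 1:
            shares.append(remaining)      # last source takes what is left
        else:
            shares.append(remaining / 2)  # local source wins half the slots
            remaining /= 2                # the rest is shared upstream
    return shares

if __name__ == "__main__":
    for hops, share in enumerate(hm_bandwidth_shares(4), start=1):
        print(f"source {hops} hop(s) away: {share}")
```

Every arbiter in this model is locally fair, yet the nearest source ends up with half of the hot module's bandwidth while the farthest ones share a small remainder.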

Hot Module Distributed Arbitration
- Control is distributed or centralized
  - Centralized control can account for dependencies
- Requests and grants are sent at a high service level
- Requests and grants include additional data as needed:
  - Requested quota, source queue size, priority, deadline, etc.
  - Granted quota, scheduling of transmissions, etc.
- Initial credits hide the request-grant latency under light load
- Emphasis: bypassing (blocked) data packets!
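The request-grant exchange can be illustrated with a small centralized allocator at the hot module. This is a hypothetical sketch: the request tuple layout, the "lower number is more urgent" priority convention and the greedy granting policy are assumptions chosen for the example, and the slide does not specify the actual allocation policy.

```python
def grant_quotas(requests, capacity):
    """Grant flit quotas for one allocation round.

    requests: list of (source, requested_flits, priority), where a lower
              priority number means more urgent (assumed convention).
    capacity: number of flits the hot module can accept in this round.
    Returns {source: granted_flits}; urgent requests are served first and
    ties keep their arrival order.
    """
    grants = {}
    for source, wanted, _prio in sorted(requests, key=lambda r: r[2]):
        granted = min(wanted, capacity)
        grants[source] = granted
        capacity -= granted
    return grants

if __name__ == "__main__":
    reqs = [("IP1", 8, 2), ("IP2", 4, 1), ("IP3", 6, 3)]
    print(grant_quotas(reqs, capacity=10))
```

A source that receives a grant of zero keeps its packets out of the network instead of injecting worms that would block in front of the hot module, which is how unrelated traffic is protected.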

Hot vs. non-Hot Module Traffic
[Figure legend: HM traffic with control, other traffic with control, HM traffic without control, other traffic without control]

NoC: A Network AND A Chip
Conclusions
- NoC is a chip design paradigm shift
- Introduces many diverse and new networking challenges
- No killer NoC for all chips
  - Should not comply with any X-AN concept
  - May include centralized mechanisms
  - May involve more than one NoC/Bus mechanism
  - May combine several communication methodologies
  - Low latency NoC/Bus for metadata and urgent signals
- Beware of early standardization and legacy barriers
- Mutual benefit for VLSI-Networking collaboration