Asynchronous vs. Synchronous Network-on-Chip

Presentation transcript:

Asynchronous vs. Synchronous Network-on-Chip
Prepared by Sergey Rudko
Advanced Topics in VLSI 1 (NoC) 049036

Introduction
- Problem Definition: NoC Implementation Alternatives
  - Fully asynchronous
  - Multi-synchronous (GALS)
  - Synchronous
- Proposed Solution: Systematic Comparison between Different Strategies
  - Silicon Area
  - Network Saturation Threshold
  - Communication Throughput
  - Packet Latency
  - Power Consumption
  - Implementation Flexibility and Tools
- Related Approaches
  - I. Miro-Panades, F. Clermidy, P. Vivet, A. Greiner, "Physical Implementation of the DSPIN Network-on-Chip in the FAUST Architecture", NoCs 2008

Synchronous Router
[Figure: router pipeline with VCA and SA stages, router data path, links, and speculative control signals]
- Router pipeline may include many stages
  - Increases communication latency
- Router pipeline may be optimized to a single-cycle router
  - Possible by use of speculation
  - Clock period same as for the pipelined router
- Presence of a clock simplifies design
  - Standard libraries and tools
A. Kumar, P. Kundu, A. Singh, L. Peh and N. Jha, "A 4.6Tbits/s 3.6GHz Single-cycle NoC Router with a Novel Switch Allocator", International Conference on Computer Design (ICCD), October 2007.

Limitations of Fully-Synchronous Networks
Problem: Clock Distribution and Frequency Selection
- Difficult to distribute clock
  - Network spread over die & may have irregular layout
  - Minimising skew costs complexity and power
  - Solution: Alternatives/extensions to PLL and H-tree
- Single Network Clock Frequency
  - Communicating synchronous IP blocks with different frequencies
  - What is the most appropriate network clock frequency?
Solution: Beyond a Single Global Clock

Synchronous Routers with Asynchronous Links (GALS)
Connect Frequency Independent Routers
- Asynchronous FIFO
  - Synchronization is simple
  - Traditional 2 FF synchronizers
  - Can support asynchronous interconnects
  - No longer exploiting periodic nature of router clocks
  - Correct operation is independent of the delay of the link
- GALS interfaces with pausible clocks
  - If necessary the clock is stretched, data is always transferred reliably
  - Need to construct local delay line
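The slide does not say how the asynchronous FIFO is built. One common approach, assumed here purely for illustration, is to pass the read and write pointers through the two-flop synchronizers in Gray code, so that only one bit changes per increment and a pointer sampled mid-transition is off by at most one position. The function names below are hypothetical.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary counter value to its Gray-code equivalent."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-code value back to binary by folding in all right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    pointer_bits = 4  # assumed pointer width, for illustration only
    size = 1 << pointer_bits
    for i in range(size):
        nxt = (i + 1) % size
        # Each pointer increment (including wrap-around) flips exactly one wire.
        assert bits_changed(bin_to_gray(i), bin_to_gray(nxt)) == 1
        assert gray_to_bin(bin_to_gray(i)) == i
    print("every increment of a", pointer_bits, "bit Gray pointer changes one bit")
```

With a plain binary pointer, an increment such as 7 to 8 flips four bits at once, and a synchronizer in the other clock domain could capture an arbitrary mix of old and new bits.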

Asynchronous NoCs
- Simple/elegant solution when networked IP blocks run at different clock frequencies
- Data driven, no superfluous switching activity
- No synchronization/clock alignment issues at interfaces
  - Solves synchronization, clock domain crossings, timing, long connects
- No clock distribution issues
- Security and EMI advantages
  - Clock focuses EM emissions
  - The presence of a clock can also aid fault-induction and side-channel analysis attacks
- Reduced design time
  - Easy to use interfaces, modularity
  - Robust and simple implementation
- Reduced power
- But network latency significantly increased

Asynchronous NoC Approaches
- QNoC Group
  - "An Asynchronous Router for Multiple Service Levels Networks on Chip", R. Dobkin et al., ASYNC'05
- MANGO Clockless Network-on-Chip
  - "A Scheduling Discipline for Latency and Bandwidth Guarantees in Asynchronous Network-on-Chip", T. Bjerregaard and J. Sparsø, ASYNC'05
  - "A Router Architecture for Connection-Oriented Service Guarantees in the MANGO Clockless Network-on-Chip", T. Bjerregaard and J. Sparsø, DATE'05
- R. Dobkin et al. provide a synchronous versus asynchronous router study

Synchronous or Asynchronous NoCs?
"Physical Implementation of the DSPIN Network-on-Chip in the FAUST Architecture", I. Miro-Panades, F. Clermidy, P. Vivet and A. Greiner, NoCs 2008

Motivation
- Physically implement the DSPIN NoC into the FAUST application platform
- Compare the performance of ANOC and DSPIN on a real application and traffic
  - Silicon Area
  - Throughput
  - Packet Latency
  - Power Consumption

FAUST Architecture with ANOC
- Asynchronous NoC (ANOC)
  - QDI 4-phase/4-rail asynchronous logic
  - 20 routers
  - 5-port router
  - Source routing
  - Wormhole packet switch
  - 32-bit payload
- GALS Conception
  - 24 independent clocks
  - FIFO-based interface
- Hard-macro approach for ANOC reuse
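To make "4-rail" concrete, the sketch below assumes it means the usual 1-of-4 delay-insensitive encoding, where each pair of data bits is carried on a group of four wires and exactly one wire in the group is raised. The slides do not spell this out, and the helper names are hypothetical. Under that assumption a 32-bit payload needs 16 groups, i.e. 64 data wires.

```python
from typing import List


def encode_1of4(word: int, width: int = 32) -> List[int]:
    """Split `word` into 2-bit symbols and return one 4-wire one-hot group
    per symbol, least significant symbol first."""
    groups = []
    for shift in range(0, width, 2):
        symbol = (word >> shift) & 0b11
        groups.append(1 << symbol)  # exactly one of the four wires is high
    return groups


def decode_1of4(groups: List[int]) -> int:
    """Rebuild the word from its 1-of-4 groups, rejecting invalid codes."""
    word = 0
    for index, group in enumerate(groups):
        if group not in (0b0001, 0b0010, 0b0100, 0b1000):
            raise ValueError(f"group {index} is not a valid 1-of-4 code: {group:04b}")
        symbol = group.bit_length() - 1
        word |= symbol << (2 * index)
    return word


if __name__ == "__main__":
    payload = 0xDEADBEEF
    wires = encode_1of4(payload)
    print("groups:", len(wires), "data wires:", 4 * len(wires))
    assert decode_1of4(wires) == payload
```

In a 4-phase (return-to-zero) protocol, every group would go back to all wires low between two data tokens; that spacer state is why the decoder treats an all-zero group as invalid data.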

DSPIN Architecture
- Packet based
- Distributed router architecture
- Suited for GALS approach
  - Mesochronous links between routers
  - Metastability resolved by "bi-synchronous" FIFO
- Synthesizable with standard cells

DSPIN Clock Tree
- Mesochronous link between neighbor routers

NoC Architecture Comparison
- Both implementations use GALS principles

Network Comparison

Parameter                          | ANOC         | DSPIN
Implementation                     | Hard-Macro   | Soft-Macro
Area                               | 0.281 mm²    | 0.187 mm²
Throughput (worst case conditions) | ~160 Mflit/s | ≤289 Mflit/s
Throughput (nominal conditions)    | ~220 Mflit/s | ≤408 Mflit/s
Power Consumption (F=150MHz)       | 3.69 mW      | 5.89 mW
Power Consumption (F=250MHz)       |              | 10.39 mW

- DSPIN throughput is deterministic with respect to the clock frequency

DSPIN Power Issues
- Power consumption mainly dominated by FIFO data registers
- DSPIN clock gating reduced the power consumption by 67%
- The DSPIN clock tree consumes as much power as the router itself

Network Comparison - Latency

Flit Path (columns: ANOC, DSPIN; F=150MHz, F=250MHz)
- Intermediate Router Latency: 6.80 ns, 16.66 ns, 10.00 ns
- First + Last Router Latency: 60.00 ns, 56.66 ns, 47.00 ns, 34.00 ns
- Latency for 5 hops path: 80.00 ns, 106.66 ns, 68.00 ns, 64.00 ns
- Latency for 9 hops path: 173.30 ns, 96.00 ns, 104.00 ns

- DSPIN routers resynchronize the data packets
- DSPIN should be clocked to 367MHz
- DSPIN Router is IP Data Locality Aware
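The path latencies quoted above appear to follow a simple additive model: the first and last routers contribute a fixed amount, and each of the remaining (hops - 2) routers adds one intermediate-router latency. The sketch below assumes that model, and assumes that 34.00 ns and 10.00 ns are the DSPIN figures at 250MHz. It also shows one plausible reading of the 367MHz remark: if a DSPIN router takes 2.5 cycles per hop (10.00 ns at 250MHz), it reaches the 6.80 ns intermediate latency at roughly that frequency. The function names are hypothetical.

```python
def path_latency_ns(first_plus_last_ns: float, intermediate_ns: float, hops: int) -> float:
    """Latency of a path crossing `hops` routers: the two end routers
    plus (hops - 2) intermediate routers."""
    if hops < 2:
        raise ValueError("a path needs at least a first and a last router")
    return first_plus_last_ns + (hops - 2) * intermediate_ns


def matching_frequency_mhz(cycles_per_router: float, target_ns: float) -> float:
    """Clock frequency (MHz) at which a router that needs `cycles_per_router`
    cycles takes `target_ns` per intermediate hop."""
    return cycles_per_router / target_ns * 1000.0


if __name__ == "__main__":
    # Assumed reading of the slide: DSPIN at 250MHz.
    for hops in (5, 9):
        print(hops, "hops:", path_latency_ns(34.00, 10.00, hops), "ns")

    # 10.00 ns at 250MHz corresponds to 2.5 clock cycles per intermediate router.
    cycles = 10.00 * 250 / 1000.0
    print("frequency for 6.80 ns per hop:",
          round(matching_frequency_mhz(cycles, 6.80), 1), "MHz")
```

Under these assumptions the model reproduces the 64.00 ns and 104.00 ns entries exactly, and the 150MHz entries (106.66 ns and 173.30 ns) to within rounding when fed 56.66 ns and 16.66 ns.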

Conclusion
- Little published work on asynchronous routers and networks
- Comparing synchronous and asynchronous designs is difficult
  - System timing style
  - Technology
  - Circuit style and architecture
- Difficult to reproduce and simulate asynchronous designs from published work
  - No notion of cycle-accurate model
  - Hide detailed control and datapath delays
- Asynchronous Performance Guarantees
  - Performance guarantees are required
  - Less predictable, non-deterministic
  - Predicting performance is more complex
- Asynchronous EDA Tool Requirements
- Synchronous Routers
  - Predictability and determinism can be exploited
  - Fast single-cycle routers possible
- ANoC for Low Power & SNoC for Small Area

Thank You!!!