Presenter : Ching-Hua Huang 2012/4/16 A Low-latency GALS Interface Implementation Yuan-Teng Chang; Wei-Che Chen; Hung-Yue Tsai; Wei-Min Cheng; Chang-Jiu.

Presentation transcript:

Presenter: Ching-Hua Huang, 2012/4/16. A Low-latency GALS Interface Implementation. Yuan-Teng Chang; Wei-Che Chen; Hung-Yue Tsai; Wei-Min Cheng; Chang-Jiu Chen; Fu-Chiung Cheng. Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan. Circuits and Systems (APCCAS), 2010 IEEE Asia Pacific Conference on. National Sun Yat-sen University, Embedded System Laboratory.

2 With the rapid improvement of VLSI technology, SoC has become the most important VLSI application. However, clock distribution and low power have become the two most important issues in SoC design. Integrating IPs so that they operate correctly with different clocks is also an important issue. Asynchronous circuits may resolve these problems by removing the clock signal, but implementing a whole design as an asynchronous circuit is too hard. The GALS (Globally-Asynchronous Locally-Synchronous) design methodology strikes a balance by separating the synchronous modules with asynchronous interfaces. Each part of the circuit can thus operate with its own clock, and the parts communicate through asynchronous channels. GALS provides reliable communication between modules, but the latency of the GALS interface may seriously degrade performance, so reducing that latency is significant. In this paper, we implemented a small and simple stretchable-clock-based GALS wrapper with low latency in Verilog HDL and synthesized the design with the TSMC 0.13 μm cell library. We also showed that the wrapper operates correctly with modules whose clock frequencies differ greatly. In addition, we recommend adding a FIFO storage element on the transmission path.

3 What's the problem
• IPs must operate correctly with different clocks.
  ◦ Synchronous circuits are driven by a clock signal
    ▪ Some drawbacks
  ◦ Asynchronous circuits work by handshake protocols
    ▪ High implementation costs and difficulties
  ◦ GALS (Globally-Asynchronous Locally-Synchronous) design methodology
    ▪ Integrates the advantages of both synchronous and asynchronous circuits
• The latency of GALS seriously degrades performance.
  ◦ A stretchable-clock-based GALS wrapper with low latency.

4 Related work
[This paper]
[1,2,3,4] Some drawbacks of synchronous circuits
[7] GALS systems
[8] GALS has large latency
How to deal with these drawbacks
GALS first appeared in 1984 [6]
To integrate the advantages of both synchronous and asynchronous circuits
1. clock skew 2. difficulty in clock distribution 3. worst-case performance 4. not modular 5. sensitive to variations in physical parameters 6. synchronization failure 7. noise (EMI)
Reducing the latency of the asynchronous interface [9]
1. Pausible clock generator 2. Stretchable clock generator. The major difference between them is the way to stop the clock
[5] Asynchronous circuits: handshake protocols; high implementation costs and difficulties
1. Input controller 2. Output controller
GALS methodology was proposed

5 Proposed method. The new STG (Signal Transition Graph): composed of REQ, ACK, stretch, and WR (or RD). The proposed new wrapper: input controller and output controller.
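A minimal sketch, in Python, of the generic four-phase (return-to-zero) handshake that the conclusion slide names as the basis of the wrapper. Only REQ and ACK are modelled; the stretch and WR/RD signals, and the exact transition ordering of the paper's STG, are not reproduced here. The function names are hypothetical and chosen for illustration.

```python
# Generic four-phase handshake: every transfer is one REQ+/ACK+/REQ-/ACK- cycle.
FOUR_PHASE_ORDER = ["REQ+", "ACK+", "REQ-", "ACK-"]


def transfer(words):
    """Move each word from sender to receiver with one four-phase cycle.

    Returns the words the receiver captured and the signal events in order.
    """
    received = []
    events = []
    for word in words:
        events.append("REQ+")   # sender presents the data and raises REQ
        received.append(word)   # receiver captures the data
        events.append("ACK+")   # receiver raises ACK
        events.append("REQ-")   # sender releases REQ
        events.append("ACK-")   # receiver releases ACK, channel is idle again
    return received, events


def is_valid_four_phase(events):
    """True if events consists only of complete, correctly ordered cycles."""
    if len(events) % 4 != 0:
        return False
    return all(e == FOUR_PHASE_ORDER[i % 4] for i, e in enumerate(events))


if __name__ == "__main__":
    data, trace = transfer([0xA5, 0x3C])
    print(data, trace, is_valid_four_phase(trace))
```

The checker is deliberately strict: a trace in which REQ falls before ACK rises is rejected, which is the property a four-phase channel relies on.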

6 1. Stoppable clock generator 2. The most commonly used approach so far 3. Uses an odd number of inverters to generate the local clock signal of the locally synchronous module [Figure: signal labels A, B, Y, Ri, Ai, lclk, rclk]
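A rough behavioural sketch, in Python, of a ring-oscillator clock that can be held off. It assumes identical inverter delays, so the ideal period is 2 x N x delay, and it assumes the clock is only ever held while low, so a pending rising edge is postponed until the stop request is released. The function names, the picosecond units, and the stop-window representation are assumptions for illustration, not taken from the slides.

```python
def ring_oscillator_period(num_inverters, inverter_delay_ps):
    """Ideal period of a ring of identical inverters.

    An edge has to travel round the ring twice (once as a rising edge,
    once as a falling edge) to complete one clock cycle.
    """
    if num_inverters % 2 == 0:
        raise ValueError("an even ring settles to a stable state, it does not oscillate")
    return 2 * num_inverters * inverter_delay_ps


def stoppable_clock_edges(half_period_ps, stop_windows, end_ps):
    """List (time_ps, level) edges of a clock that is held low on request.

    stop_windows: sorted, non-overlapping (start_ps, end_ps) intervals during
    which a rising edge must not occur (hypothetical stop requests).
    """
    edges = []
    t, level = 0, 0
    while True:
        t += half_period_ps
        if level == 0:
            # A rising edge is due: postpone it past any active stop window.
            for start, stop in stop_windows:
                if start <= t < stop:
                    t = stop
        if t > end_ps:
            break
        level ^= 1
        edges.append((t, level))
    return edges


if __name__ == "__main__":
    period = ring_oscillator_period(5, 100)
    print(period)
    print(stoppable_clock_edges(period // 2, [(900, 1700)], 3000))
```

In the example run the rising edge that would have occurred at 1500 ps is postponed to 1700 ps, after which the clock resumes with its normal half period.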

7 1. The basic idea is similar to the previous approach: stop the clock when a data transfer occurs 2. The major difference from the previous approach is the way the clock is stopped. The symbol "C" represents a C-element, a self-timed latch. C-element truth table (A B → Y): 0 0 → 0; 1 1 → 1; otherwise Y holds its previous value.
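The C-element behaviour on this slide can be sketched as a small Python model: the output follows the inputs when they agree and holds otherwise. The class name and the initial output value are assumptions for illustration.

```python
class CElement:
    """Behavioural model of a two-input Muller C-element."""

    def __init__(self, initial=0):
        self.y = initial  # stored output value (assumed reset state)

    def update(self, a, b):
        """Apply new input values and return the output."""
        if a == b:
            self.y = a  # both inputs agree: output follows them
        # inputs differ: output holds its previous value
        return self.y


if __name__ == "__main__":
    c = CElement()
    for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
        print(a, b, c.update(a, b))
```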

8 [Figure: diagram annotated with signal values] If the receiver needs to receive data

9 [Figure: diagram annotated with signal values]

10 If a First-In-First-Out (FIFO) buffer is placed on the transmission path, the sender can put its data into the FIFO and get the acknowledge earlier. The sender can thus continue computing instead of waiting for the receiver. The latch is controlled by ACK; data has to be stored correctly in the latch during the interval from ACK+ to ACK-.
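To illustrate why a FIFO lets the sender get its acknowledge earlier, here is a toy timing model in Python. It is an idealised sketch, not the paper's circuit: the sender offers one item every `sender_period` time units after its last acknowledge, the receiver needs `receiver_period` per item, and handshake and synchronisation overheads are ignored. All names and numbers are assumptions for illustration.

```python
def sender_stall(num_items, sender_period, receiver_period, fifo_depth):
    """Total time the sender spends waiting for an acknowledge.

    fifo_depth == 0 models a direct handshake: the sender is acknowledged
    only when the receiver itself takes the item.
    """
    taken = []   # taken[i]: time at which the receiver removes item i
    offer = 0    # time at which the sender offers the current item
    stall = 0
    for i in range(num_items):
        receiver_free = taken[-1] + receiver_period if taken else 0
        if fifo_depth == 0:
            ack = max(offer, receiver_free)
            taken.append(ack)
        else:
            # A slot is free once item i - fifo_depth has left the FIFO.
            slot_free = taken[i - fifo_depth] if i >= fifo_depth else 0
            ack = max(offer, slot_free)
            taken.append(max(ack, receiver_free))
        stall += ack - offer
        offer = ack + sender_period
    return stall


if __name__ == "__main__":
    # Fast sender, slow receiver, four items.
    for depth in (0, 2, 4):
        print(depth, sender_stall(4, 2, 8, depth))
```

With these illustrative numbers the stall drops as the FIFO gets deeper; for a long enough burst the slow receiver still limits throughput, so the FIFO helps with bursts rather than sustained rate mismatch.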

11 Implementation of the proposed design
• Gate-level description in Verilog HDL
• Synopsys Design Compiler
  ◦ Used to synthesize our gate-level design
  ◦ With the TSMC 0.13 μm cell library
• Area and latency are compared with two different GALS models proposed in [11]

12 Experimental Results
clk sender = 555 MHz, clk receiver = 133 MHz
clk sender = 133 MHz, clk receiver = 555 MHz

13
• This paper proposes a new GALS wrapper
  ◦ Based on the four-phase handshake protocol
  ◦ Consists of an input controller and an output controller
• Area and latency are improved
  ◦ Compared to the C-element-based design
    ▪ The area of the new wrapper is only 30.8% of that design's
    ▪ The latency of the new wrapper is only 39.7% of that design's
  ◦ Compared to the standard-cell-based design
    ▪ The area of the new wrapper is only 63.5% of that design's
    ▪ The latency of the new wrapper is only 55% of that design's
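The percentages above can be restated as reductions and as latency speedup factors. The short Python sketch below does that arithmetic, assuming each figure is the new wrapper's value expressed as a percentage of the reference design's value; the helper names are hypothetical.

```python
def reduction_percent(ratio_percent):
    """Reduction relative to the reference, given new/reference in percent."""
    return round(100.0 - ratio_percent, 1)


def speedup_factor(latency_ratio_percent):
    """How many times lower the latency is than the reference's."""
    return round(100.0 / latency_ratio_percent, 2)


# Figures from the slide: (area %, latency %) of the new wrapper
# relative to each reference design.
RESULTS = {
    "C-element based": (30.8, 39.7),
    "standard cell based": (63.5, 55.0),
}

if __name__ == "__main__":
    for name, (area, latency) in RESULTS.items():
        print(name,
              "area reduction %:", reduction_percent(area),
              "latency reduction %:", reduction_percent(latency),
              "latency speedup x:", speedup_factor(latency))
```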

14
• This paper lists the history and design principles of GALS
  ◦ For example, the GALS concept
    ▪ Synchronous
    ▪ Asynchronous
    ▪ GALS
  ◦ To ensure correct operation, the synchronous modules must be stopped while a data transfer occurs
• It improved my understanding of GALS
  ◦ The control of the asynchronous wrapper
  ◦ STG