Presenter : Ching-Hua Huang 2012/6/25 A High-Throughput, Metastability-Free GALS Channel Based on Pausible Clock Method Mohammad Ali Rahimian, Siamak Mohammadi,

Presentation transcript:

Presenter: Ching-Hua Huang, 2012/6/25. A High-Throughput, Metastability-Free GALS Channel Based on Pausible Clock Method. Mohammad Ali Rahimian, Siamak Mohammadi, Mohammad Fattah. Dependable Systems Design Lab, School of ECE, University of Tehran, Tehran, Iran. nd Asia Symposium on Quality Electronic Design (ASQED). National Sun Yat-sen University, Embedded System Laboratory.

2 Synchronization issues such as metastability in multi-clock domain systems have become a big problem, reducing data transmission throughput between domains. In this paper, a high-throughput, metastability-free data transmission channel based on the pausible clock method in Globally-Asynchronous Locally-Synchronous (GALS) systems is proposed. This channel can be used to interconnect mixed-clock synchronous IP cores without concerns about their synchronization. We show that the probability of metastability in our design is practically zero, without loss of throughput or latency, allowing the transmitter and receiver to operate at their own maximum clock frequencies. The proposed channel is simulated in a 90nm CMOS process using the Predictive Technology Model (PTM) library. Gate delays and power parameters are extracted from Spice simulations and back-annotated into our channel HDL code. The throughput, latency and power are analyzed and compared with existing designs. Note: PTM is developed by the Nanoscale Integration and Modeling (NIMO) Group at ASU.

3 Related work
• [This paper] A High-Throughput, Metastability-Free GALS Channel Based on Pausible Clock Method
• FIFO: the main component of the proposed channel
• [3] GALS design: asynchronous, pausible clock, loosely synchronous
• [2] GALS systems were introduced in the 1980s
• [4]~[8] Some early approaches to GALS design
• [22]~[26] Comparison
• [1] Module reusability and communication between modules in GALS
• [9]~[21] Recent research: high-throughput, low-latency, ANoC

4 What's the problem
• Metastability is a serious problem in multi-clock domain systems
  ◦ It reduces data transmission throughput
• If a storage element enters metastability, its output can oscillate between 0 and 1
• Cause of metastability
  ◦ Explained on the next slide
• Common approaches to handling metastability
  ◦ Two-flip-flop synchronizer
  ◦ FIFO

5 tsu is the setup time; th is the hold time; tmet is the time for which the metastable state may persist. When data from one clock domain is captured by a storage element clocked from another domain, and the data changes close to the rising edge of the capturing clock (inside the setup/hold window), the storage element may enter metastability.
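The setup/hold condition on this slide can be illustrated with a small check. This is only a sketch: the function names, the 2 ns clock period, and the timing values are hypothetical and are not taken from the paper or the slides.

```python
def violates_setup_hold(data_edge_ns, clk_edge_ns, t_su_ns, t_h_ns):
    """Return True if a data transition falls inside the setup/hold window
    around a rising clock edge: clk - t_su < data < clk + t_h."""
    return clk_edge_ns - t_su_ns < data_edge_ns < clk_edge_ns + t_h_ns


def risky_edges(data_edges_ns, clk_period_ns, t_su_ns, t_h_ns, n_cycles):
    """List (data_edge, clk_edge) pairs where an asynchronous data
    transition lands inside the window of a rising clock edge."""
    hits = []
    for k in range(n_cycles):
        clk_edge = k * clk_period_ns  # rising edges at 0, T, 2T, ...
        for d in data_edges_ns:
            if violates_setup_hold(d, clk_edge, t_su_ns, t_h_ns):
                hits.append((d, clk_edge))
    return hits


# Hypothetical numbers: 2 ns capturing clock, t_su = 0.2 ns, t_h = 0.1 ns.
# The transitions at 3.9 ns and 6.05 ns fall inside a window; 1.0 ns does not.
print(risky_edges([1.0, 3.9, 6.05], 2.0, 0.2, 0.1, 5))
```

Any pair reported here is a transition that could push the capturing storage element into metastability; transitions outside every window are captured safely.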

6 Two-flip-flop synchronizer. A two-flip-flop synchronizer cannot eliminate metastability completely, but it reduces the probability of occurrence. 1. The probability of metastability is proportional to the clock rate. 2. If this circuit is implemented at 500 MHz, the average time between two metastability events is 1.9 × 10^22 years. (Figure quoted from senior Chi-Guang.)
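The slide does not say how the 1.9 × 10^22 years figure was obtained. As a rough illustration of why a synchronizer lowers the failure rate without removing it, the sketch below uses the common textbook model MTBF = exp(t_r / τ) / (T_w · f_clk · f_data). The model choice, the function name, and every parameter value are assumptions for illustration, so the result is not expected to reproduce the number on the slide.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def synchronizer_mtbf_seconds(t_resolve_s, tau_s, t_window_s, f_clk_hz, f_data_hz):
    """Textbook mean time between synchronization failures.

    t_resolve_s: time allowed for a metastable state to resolve
    tau_s:       resolution time constant of the flip-flop
    t_window_s:  width of the vulnerable window around the clock edge
    f_clk_hz:    sampling clock frequency
    f_data_hz:   rate of asynchronous data transitions
    """
    return math.exp(t_resolve_s / tau_s) / (t_window_s * f_clk_hz * f_data_hz)


# Hypothetical device parameters, not taken from the paper or the slides.
f_clk = 500e6                   # 500 MHz, as on the slide
t_resolve = 1.0 / f_clk         # roughly one clock period between the two flip-flops
mtbf = synchronizer_mtbf_seconds(t_resolve, tau_s=20e-12, t_window_s=50e-12,
                                 f_clk_hz=f_clk, f_data_hz=100e6)
print(f"MTBF = {mtbf / SECONDS_PER_YEAR:.3e} years")
```

For a fixed resolution time the failure rate (1 / MTBF) scales linearly with the clock frequency, which matches point 1 on the slide, while the exponential term is what makes the second flip-flop so effective.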

7 [Figure: FIFO memory with addresses 0x00, 0x04, …, 0xFF, a read pointer, and a write pointer. The FIFO must detect full/empty conditions and handle different read and write clocks.]

8 Proposed method: A High-Throughput, Metastability-Free GALS Channel (assume FIFO depth = 3)

9 Transmitter: 1. TxReady (ready-to-transmit signal) 2. TxData (data bus) 3. If the FIFO is full, TxRun is used to pause TxClk. Receiver: the same as the transmitter.

10 Assumptions: (1) FIFO depth = 3. (2) The receiver is faster than the transmitter. Signals shown in the figure: TxData, TxReady, TxRun, TxClk, RxData, RxReady, RxRun, RxClk, put, talk, headAddr, tailAddr, full, empty.
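The behaviour in this scenario can be approximated with a coarse, tick-based model. This is a sketch only: the function `simulate_channel`, the integer clock periods, and the way a pause is modelled (a clock edge that is skipped rather than stretched) are simplifying assumptions, not the circuit from the paper.

```python
from collections import deque


def simulate_channel(n_words, tx_period, rx_period, depth=3):
    """Tick-based model of a FIFO channel between two clock domains.

    A transmitter edge occurs every tx_period ticks and a receiver edge
    every rx_period ticks. An edge that finds the FIFO full (transmitter)
    or empty (receiver) is counted as a pause and moves no data.
    """
    fifo = deque()
    received = []
    stats = {"tx_pauses": 0, "rx_pauses": 0, "max_occupancy": 0}
    next_word = 0
    t = 0
    while len(received) < n_words:
        # Receiver side is handled first, so a word is never read on the
        # same tick on which it was written.
        if t % rx_period == 0:
            if fifo:
                received.append(fifo.popleft())
            else:
                stats["rx_pauses"] += 1      # FIFO empty: receiver pauses
        if t % tx_period == 0 and next_word < n_words:
            if len(fifo) < depth:
                fifo.append(next_word)
                next_word += 1
            else:
                stats["tx_pauses"] += 1      # FIFO full: transmitter pauses
        stats["max_occupancy"] = max(stats["max_occupancy"], len(fifo))
        t += 1
    return received, stats


# Receiver faster than transmitter, FIFO depth 3, as assumed on the slide.
words, stats = simulate_channel(n_words=5, tx_period=3, rx_period=2)
print(words, stats)
```

With a faster receiver the FIFO never fills, so only the receiver pauses; swapping the periods makes the transmitter pause instead, and in both cases the words arrive in order.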

11 (After three consecutive write operations) (After two write operations and one read operation) In these cases metastability cannot occur. The only situation in which metastability could arise is when the FIFO is empty, the transmitter sends new data, the receiver's clock is resumed, and the newly written data is read.

12 The internal architecture of the FIFO. Two adders are needed to increment the head and tail registers.
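A software analogue of the head and tail registers is a ring buffer in which each pointer has its own increment-and-wrap step, playing the role of the two adders. The class below is a hypothetical sketch: the name `RingFifo`, the method names `put` and `take`, and the choice of head as the read pointer and tail as the write pointer are assumptions, not details confirmed by the slides.

```python
class RingFifo:
    """Fixed-depth FIFO with separate head (read) and tail (write) registers."""

    def __init__(self, depth=3):
        self.depth = depth
        self.mem = [None] * depth
        self.head = 0        # address of the next word to read
        self.tail = 0        # address of the next free slot to write
        self.full = False    # distinguishes full from empty when head == tail

    @property
    def empty(self):
        return self.head == self.tail and not self.full

    def put(self, word):
        if self.full:
            raise OverflowError("FIFO full: the writer should be paused")
        self.mem[self.tail] = word
        self.tail = (self.tail + 1) % self.depth   # first adder: tail + 1
        self.full = self.tail == self.head

    def take(self):
        if self.empty:
            raise IndexError("FIFO empty: the reader should be paused")
        word = self.mem[self.head]
        self.head = (self.head + 1) % self.depth   # second adder: head + 1
        self.full = False
        return word


fifo = RingFifo(depth=3)
for w in ("a", "b", "c"):
    fifo.put(w)
print(fifo.full, fifo.take(), fifo.full)
```

Keeping an explicit full flag is one common way to tell a full FIFO from an empty one when both pointers hold the same address; other schemes, such as an extra pointer bit, work equally well.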

13 Result: comparison with other designs. The figure compares the throughput of this paper's design with that of [22]. Comparison of latency, throughput, and power consumption with word length = 64.

14 A pausible-clock-based GALS interconnect and its FIFO are proposed and described in detail. The proposed channel is implemented in Verilog with back-annotated standard cells. The design is better than previous works in throughput, latency, and power consumption, but its area overhead is larger.

15 This paper helped me learn more about GALS and its details. The design by senior Chi-Guang uses a two-flip-flop synchronizer and a FIFO to implement the IP-OCP interface. No design is perfect: we have to sacrifice a little on one factor to improve another.