Avshalom Elyada, Ran Ginosar: Pipeline Synchronization Continued

Presentation transcript:

1. Pipeline Synchronization Continued
Avshalom Elyada, Ran Ginosar
This second part is based on the recent article "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap" (PATMOS, Sep. 2003) by Joep Kessels and Ad Peeters, Philips Research Laboratories, The Netherlands, together with Suk-Jin Kim at KJIST, South Korea.

2. Recall Seizovic's Synchronization Pipeline
Seizovic, "Pipeline Synchronization," Async 1994
Kessels, Peeters, Kim, "Bridging Clock Domains by Synchronizing the Mice in the Mousetrap," PATMOS, 2003
Ripple Buffer between two clock domains
– High throughput
– Embedded synchronization
– Spanning a long distance, hence 2-phase
[Figure: ME elements on the REQ and ACK paths between A clk and B clk; label "half cycle distance"]

3. Which buffer to use? Ripple Buffer
– Stream data (isochronous)
   Throughput important, latency not
   Steady rate maintained on both sides
– Short distance (2-3 stages)
   Pipe to improve throughput
– or long distance (many stages)
   Improve throughput and bridge distance

4. Which buffer to use? Pointer Buffer
– Block data
   Chunk available at once
   Rate not important
   No sense in rippling every word through all pipe stages
   Write a few long bursts to SRAM and read them on the other side, with pointers
– But if the distance is long, a Ripple Buffer is needed

5. An ME as a Synchronizer
The outputs of an ME are mutually exclusive:
– Connect ~clk and signal 'R' to the inputs
– 'A' is the synced output; the other output is unused
Today we refer to an ME with ~clk as a WAIT4 component
[Figure: ME element with inputs R1, R0 and outputs A1, A0, wired to clk, R and A]

6. WAIT4
– The rising edge of A is synced to clk
– Used in 4-phase; does not sync the falling edge of A
– Used as a building block for 2-phase sync
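The WAIT4 wiring described on the previous slide (an ME with ~clk on one input and R on the other) can be sketched behaviorally. This is only an illustration under assumptions: the class name `Wait4`, the `step` method and the tie-break rule are hypothetical, and the model ignores metastability resolution time, which is the very thing a real ME handles.

```python
class Wait4:
    """Behavioral sketch of an ME element fed by ~clk and R (hypothetical model)."""

    def __init__(self):
        self.grant_clk = False  # ME currently granted to the ~clk side
        self.a = False          # ME currently granted to the R side (output A)

    def step(self, clk: bool, r: bool) -> bool:
        nclk = not clk
        # A grant is released as soon as its request is withdrawn.
        if self.grant_clk and not nclk:
            self.grant_clk = False
        if self.a and not r:
            self.a = False
        # A pending request is granted only while the other side holds no grant.
        if not self.grant_clk and not self.a:
            if nclk:      # assumption: ~clk wins a simultaneous request
                self.grant_clk = True
            elif r:
                self.a = True
        return self.a


w = Wait4()
inputs = [(False, False), (False, True), (True, True),
          (False, True), (False, False), (True, False)]
trace = [w.step(clk, r) for clk, r in inputs]
print(trace)
```

In this sketch A rises only once clk is high, while A falls as soon as R falls, regardless of clk.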

7. One Stage

8. "Mousetrap Cell" as FIFO Element
– 2-phase, single-rail
– Any signal toggle (high or low) indicates a change
– req ≠ ack: sender cell is full
– req = ack: data accepted by receiver, sender empty
– "Equal" gate implements "empty" when req = ack
– Cell empty means all 4 ctrl signals are equal
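The 2-phase full/empty convention above can be illustrated with a few lines of Python. This is a minimal sketch, not the circuit: the function names `is_empty`, `send` and `accept` are hypothetical, and signals are modeled as plain booleans.

```python
def is_empty(req: bool, ack: bool) -> bool:
    """The 'Equal' gate: the sender cell is empty when req equals ack."""
    return req == ack


def send(req: bool) -> bool:
    """2-phase signaling: the sender announces new data by toggling req."""
    return not req


def accept(req: bool) -> bool:
    """The receiver accepts by making ack equal to req again."""
    return req


req, ack = False, False
states = [is_empty(req, ack)]   # idle
req = send(req)
states.append(is_empty(req, ack))   # first item pending (req high)
ack = accept(req)
states.append(is_empty(req, ack))   # accepted
req = send(req)
states.append(is_empty(req, ack))   # second item pending (req low)
ack = accept(req)
states.append(is_empty(req, ack))   # accepted
print(states)
```

The second transfer uses a high-to-low toggle, which shows that the level itself carries no meaning; only the difference between req and ack does.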

9. MT Behavior
Ignoring the 'empty' signal, MT is similar to a Muller pipeline:
([Rreq = Rack * Wreq ≠ Rreq]; Rreq := Wreq)*
   Guard: (receiving cell empty) * (sending cell full); action: capture data, send
([Wreq ≠ Rack * Wreq ≠ Rreq]; Rreq := Wreq)*
([Wreq ≠ Rack]; Rreq := Wreq)*
   The term Wreq ≠ Rreq merely prevents idle operations
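The claim that the three guarded commands describe the same behavior can be checked by brute force over all binary values of Wreq, Rreq and Rack. The sketch below assumes each signal is a single bit; the function names are hypothetical and each function returns the next value of Rreq.

```python
from itertools import product


def next_rreq_mt(wreq, rreq, rack):
    # ([Rreq = Rack * Wreq != Rreq]; Rreq := Wreq)
    return wreq if (rreq == rack and wreq != rreq) else rreq


def next_rreq_rewritten(wreq, rreq, rack):
    # ([Wreq != Rack * Wreq != Rreq]; Rreq := Wreq)
    return wreq if (wreq != rack and wreq != rreq) else rreq


def next_rreq_muller(wreq, rreq, rack):
    # ([Wreq != Rack]; Rreq := Wreq)
    return wreq if wreq != rack else rreq


all_equal = all(
    next_rreq_mt(*bits) == next_rreq_rewritten(*bits) == next_rreq_muller(*bits)
    for bits in product([False, True], repeat=3)
)
print(all_equal)
```

Dropping the term Wreq ≠ Rreq only lets the assignment fire when Rreq already equals Wreq, which leaves Rreq unchanged; that is the "idle operation" the slide refers to.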

10. Mousetrap vs. Muller
Muller
– Need to match delay of req to comb. logic
– For 2-phase, need special Capture-Pass Latch
– When full, every other cell contains data
Mousetrap
– 'empty' signal, so no need for CP Latch
– 'empty' does automatic delay-matching
– When full, all cells contain data
– No async elements (good for business)

11.
– Receiver's Ack to the sender does NOT indicate that the latch is locked
– The latch locks T(EQ + Hold Latch) after Ack
– Timing constraint needed to ensure data is not overrun
1) Sender full
2) Receiver sends Ack back and Rreq forward
3) Receiver stores data
4) Receiver gets Rack from outside
5) Receiver empties
[Figure labels: EQ, EQ + Hold Latch, Latch]

12. Delay Asymmetries
Delay of full/empty token
– Full: T(Latch), Empty: T(Latch+EQ)
– Phase-shift in handshake signals
– FIFO at full speed is less than ½ full
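One way to see why the FIFO is less than half full at full speed is a small occupancy estimate. The sketch below rests on an assumption not stated on the slide: at peak throughput a cell's cycle is one full-token delay plus one empty-token delay, and the cell holds data for the full-token share of that cycle. The function name `full_fraction` and the delay values are hypothetical.

```python
def full_fraction(t_latch: float, t_eq: float) -> float:
    """Estimated fraction of cells holding data at peak throughput.

    Assumes cycle time = full-token delay + empty-token delay.
    """
    t_full = t_latch            # Full:  T(Latch)
    t_empty = t_latch + t_eq    # Empty: T(Latch+EQ)
    return t_full / (t_full + t_empty)


# Illustrative delays in arbitrary units (not taken from the slides).
fraction = full_fraction(t_latch=1.0, t_eq=0.5)
print(fraction)
```

Under this assumption the fraction equals T(Latch) / (2·T(Latch) + T(EQ)), which stays below ½ for any positive EQ delay.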

13. Delay Asymmetries II
Different inputs of a cell have different delay-to-out
– Connect slow EQ input to Ack to help timing, or
– … to Req to improve performance

14. Delay Asymmetries III
Signals' rising/falling edges have different transition delays
–  Req precedes  empty,  empty precedes  Req
– To avoid malfunction, ctrl-latch always slower than data-latch

15. UE4
– Parallel composition of two WAIT4 components gives an Up-Edge 4-phase detector
– Inverter delay ensures the 2nd WAIT4 is closed before the 1st is opened
– Use a FF here instead?
   It doesn't filter out the metastability

16. UE2
Detect up & down edges for 2-phase: build an edge 2-phase detector, UE2
– 'd'ifferent, 'e'mpty
– Named 'U' even though it detects both up and down edges
– Note resemblance to MT ctrl logic

17. Pipeline Interfaces
FIFO indicates ready:
– To receive new Wdat: Wrdy
– To send new valid Rdat: Rrdy
Environment enables:
– Send of new valid Wdat: Wenb
– Receive of new Rdat: Renb
Data transfer if both rdy and enb
– Transfer item every clock
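The rdy/enb rule above can be sketched as a small software model. It captures only the interface rule (a transfer happens on a clock when both rdy and enb are set), not the mousetrap cells or the clock-domain crossing; the class name `FifoInterfaceModel`, its methods and the `depth` parameter are hypothetical.

```python
from collections import deque


class FifoInterfaceModel:
    """Interface-level sketch: transfer only when both rdy and enb are true."""

    def __init__(self, depth: int):
        self.depth = depth
        self.items = deque()

    @property
    def wrdy(self) -> bool:
        return len(self.items) < self.depth   # ready to receive new Wdat

    @property
    def rrdy(self) -> bool:
        return len(self.items) > 0            # ready to send valid Rdat

    def write_clock(self, wenb: bool, wdat) -> bool:
        """One Wclk tick; returns True if Wdat was transferred."""
        if self.wrdy and wenb:
            self.items.append(wdat)
            return True
        return False

    def read_clock(self, renb: bool):
        """One Rclk tick; returns Rdat if transferred, else None."""
        if self.rrdy and renb:
            return self.items.popleft()
        return None


fifo = FifoInterfaceModel(depth=2)
writes = [fifo.write_clock(True, w) for w in ("a", "b", "c")]
reads = [fifo.read_clock(False), fifo.read_clock(True), fifo.read_clock(True)]
print(writes, reads)
```

The third write is refused because Wrdy is low, and the first read does nothing because Renb is low, so an item moves on a clock only when both signals agree.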

18. Read-Interface
Renb enables Rclk at FF
– Z empty: Rrdy low, handshake signals equal
– Z becomes full: Rrdy high, handshake signals differ
– Upon the next enabled Rclk edge (Rclk*Renb), the FF makes the handshake signals equal again
Following the enabled Rclk edge (Rclk*Renb), Z passes new Rdat
After T(Latch+EQ), X empties into Y
Handshaking continues … at the next Rclk edge, the state repeats itself

19. Write-Interface
Wenb enables Wclk at data+ctrl FF
– 'A' full: handshake signals differ
– 'A' empty: Wack toggles
– Upon the next enabled Wclk edge (Wclk*Wenb), 'A' receives new Wdat
1) C is filled from B; the ack from C waits at UE2 for the Wclk edge
2) After the Wclk edge, B gets the ack and 'A' is filled from outside
3) Handshaking continues … at the next Wclk edge, the state repeats itself

20. Integrated Synchronizing Circuit in MT Write Cell

21. Summary
Pipeline Synchronization
– High throughput, embedded sync, long interconnect, 2-phase
The Mousetrap Cell
Synchronization components
– WAIT4, UE4, UE2
Buffer Interfaces
– Write and Read sections
MT with integrated sync circuit