
Avshalom Elyada, Ran Ginosar: Pipeline Synchronization
Slide 1: A Unique and Successfully Implemented Approach to the Synchronization Problem
Based on the article “Pipeline Synchronization” by Jakov N. Seizovic, 1994

Slide 2: Search for New Solutions
More and more sync operations per unit time:
– Complex chips → more inter-domain transitions
– Higher frequency → less settling time → smaller MTBF
Existing solutions no longer deliver the required PoF (probability of failure)
– And where they still can, that will only last another generation or two…
Inter-domain interfaces are often long interconnects; need to solve both problems with one mechanism
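The link between clock frequency, settling time and MTBF can be illustrated numerically. The sketch below uses the commonly quoted synchronizer model MTBF = exp(t/τ) / (T_w · f_clk · f_data), which is not stated on the slide; the function names and all parameter values (τ = 50 ps, T_w = 20 ps, data rate at 10% of the clock) are assumptions chosen only for illustration.

```python
import math


def synchronizer_mtbf(settle_time, tau, t_window, f_clk, f_data):
    """Commonly quoted MTBF model: exp(t/tau) / (T_w * f_clk * f_data).

    settle_time: time allowed for metastability to resolve (s)
    tau:         resolution time constant of the flip-flop (s)
    t_window:    metastability window T_w (s)
    f_clk:       sampling clock frequency (Hz)
    f_data:      rate of asynchronous data transitions (Hz)
    """
    return math.exp(settle_time / tau) / (t_window * f_clk * f_data)


def one_cycle_mtbf(f_clk, tau=50e-12, t_window=20e-12, data_fraction=0.1):
    """MTBF when the synchronizer gets exactly one clock period to settle.

    All default values are hypothetical, not taken from the presentation.
    """
    settle_time = 1.0 / f_clk
    return synchronizer_mtbf(settle_time, tau, t_window,
                             f_clk, data_fraction * f_clk)


if __name__ == "__main__":
    for f in (0.5e9, 1e9, 2e9):
        print(f"{f / 1e9:.1f} GHz -> MTBF {one_cycle_mtbf(f):.3e} s")
```

Doubling the clock halves the settling time in the exponent and also raises the event rate in the denominator, so the MTBF drops much faster than linearly.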

Slide 3: Current Solutions
Reviewed in previous lectures:
– Two-flop
– Clock shifting/stretching, predictive
– …
All treat sync as a “one-shot” process, at the end of which the signal is either synced or not

Slide 4
Pipeline approach: control signals are synced in stages, along with the data flow
Each step of the pipeline “partially synchronizes” the signal, reducing its degree of asynchronicity
More stages → lower PoF: safety vs. latency tradeoff
Data is latched at each stage
– Divide long interconnect into short segments
– Deal with inter-bit skew

Slide 5: Degree of Asynchronicity
Until now, a signal was either synchronous or asynchronous
For sync in stages, let’s look at more information:
– A signal’s arrival time in the interval [0, T] as a random variable with a distribution function
– and its degree of asynchronicity

Slide 6: Asynchronicity of Signals
For a time window 0 < T_w < T, the asynchronicity A_s of a signal is defined as: [equation image not preserved]
Intuitive meaning: when sampling within a window of T_hold + T_setup = T_w, A_s is the lowest probability of MS (metastable) behavior that can be achieved.

Slide 7: Insightful Examples
Synchronous: can make A_s = 0 if T_hold + T_setup < a certain T_max
Asynchronous: A_s = T_w / T
A_s = 1 corresponds to a “malicious” signal: no matter where we sample, the signal always arrives within our time window
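The first two examples can be checked numerically. The defining formula did not survive in this transcript, so the sketch below assumes one reading that is consistent with the stated intuition: A_s is the minimum, over all placements of a window of width T_w within the clock period, of the probability that the signal arrives inside the window. The discretization into bins and the name `asynchronicity` are illustrative assumptions, not the article's notation.

```python
def asynchronicity(arrival_prob, window_bins):
    """Assumed reading of A_s: the lowest probability of an arrival falling
    inside a sampling window, over all (circular) window placements.

    arrival_prob: probability of arrival in each bin of one clock period
                  (the bins together cover [0, T) and sum to 1)
    window_bins:  window width T_w expressed in bins, 0 < window_bins < len
    """
    n = len(arrival_prob)
    if not 0 < window_bins < n:
        raise ValueError("window must be shorter than the clock period")
    best = None
    for start in range(n):
        # Probability mass inside the window starting at this bin (wraps around).
        mass = sum(arrival_prob[(start + i) % n] for i in range(window_bins))
        best = mass if best is None else min(best, mass)
    return best


# Fully asynchronous signal: arrival time uniform over the period.
uniform = [1 / 100] * 100

# Synchronous signal: arrivals confined to 5 of the 100 bins.
synchronous = [0.0] * 100
synchronous[10:15] = [0.2] * 5

if __name__ == "__main__":
    print(asynchronicity(uniform, 20))       # close to T_w / T
    print(asynchronicity(synchronous, 20))   # window fits in the quiet region
    print(asynchronicity(synchronous, 96))   # window too wide to avoid arrivals
```

Under this reading the synchronous case gives zero only while the window is narrow enough to fit in the quiet part of the period, which matches the T_max condition on the slide.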

Slide 8: Building blocks
– Stage synchronizer based on one or two Mutual-Exclusion (ME) elements
– FIFO element
Start with async elements (latch & latch-like)
– Explore possible use of DFFs for both the data & sync (ctrl) paths: more appealing to synchronous designers

Slide 9: An ME as a Synchronizer (I)
Outputs are mutually exclusive: only one asserts at a given time
Connect ‘clk’ and signal ‘R’ to the inputs
‘A’ is the synced output; the other output is unused
[Figure: ME element with ports R1, R0, A1, A0, wired to clk, R and A.]

Slide 10: An ME as a Synchronizer (II)
A↑ is synced to clk↓
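The way an ME element holds back a request while the clock owns it can be sketched with a small behavioral model. This is an idealized illustration, not the circuit: it assumes clk is high during the first half of each period and low during the second, ignores metastability and gate delay, and the name `me_grant_time` is made up for this example.

```python
def me_grant_time(request_time, period):
    """Time at which the ME output A asserts for a request R.

    Idealized model (assumptions): clk is high on [0, T/2) and low on
    [T/2, T) of every period; while clk is high it holds the ME, so a
    request must wait for the falling edge; while clk is low the request
    is granted immediately. Metastability and delays are ignored.
    """
    phase = request_time % period
    half = period / 2
    if phase < half:
        # clk currently owns the ME: A can only rise once clk falls.
        return request_time - phase + half
    # clk is low: R wins the ME right away.
    return request_time


if __name__ == "__main__":
    for t in (2, 7, 32):
        print(t, "->", me_grant_time(t, 10))
```

In this model A never rises during the clk-high half of the period, which is the sense in which the output is synchronized to the clock.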

Slide 11: An ME as a Synchronizer (III)
Invert the clk: A↑ syncs to clk↑ → sync to posedge
[Figure: ME element with ports R1, R0, A1, A0, wired to R, A and clk.]

Slide 12: ME Implementation
A latch with an MS (metastability) filter
As is inherent to any synchronization decision hardware, the ME has an MS state
– If in the MS state, Ao does not assert until MS is resolved
– The next clk edge forces the ME out of the MS state

Slide 13: Dual-edge Synchronizer
Want to use a 2-phase protocol, better for long interconnects
Need to sync both the rise and the fall of Ri → Ro
Use 2 MEs and another latch

Slide 14: FIFO Element
Holds data in the stage while ctrl is synced
2-phase single-rail handshake
– 2-phase is better suited to long interconnects
Latch as the memory element
– Can also use a DFF, to appeal to synchronous designers
Simple async ctrl (petrify)
– More on implementation next time…

Slide 15: Pipeline w/ Embedded Synchronizing
[Figure: chain of three pipeline stages, each with ports Ri, Ai, Di, Ro, Ao, Do and an embedded synchronizer S, connecting an asynchronous side and a synchronous side.]
Taken from “Synchronization Ideas”, Charles E. Dike, Intel Corporation

Slide 16: Likewise for Multi-Synchronous Domains
[Figure: chain of three pipeline stages, each with ports Ri, Ai, Di, Ro, Ao, Do and synchronizers S, connecting multi-synchronous domain A and multi-synchronous domain B.]
Taken from “Synchronization Ideas”, Charles E. Dike, Intel Corporation

Slide 17: Step-by-Step

Slide 18: Long Interconnect: Pipeline Synchronizer
Seizovic, “Pipeline Synchronization,” Async 1994
Kessels, Peeters, Kim, “Bridging Clock Domains by synchronizing the mice in the mousetrap”, PATMOS, 2003
Last 3 stages in each direction contain synchronizers
[Figure: REQ and ACK pipelines between clock domains A clk and B clk, each ending in three ME stages; stages are spaced a half-cycle distance apart.]

Slide 19: Probability of Failure
Pipeline PoF as formally proven in the article:
$P_k = P_0 \, e^{-k(T/2 - T_{oh})/\tau}$
– P_0: PoF without any sync
– k: number of stages
t = T/2 - T_oh is the time each synchronizer has
T_oh = synchronizer + FIFO delay
Recall that the probability of still being in MS after time t is P(t) = exp(-t/τ)
Intuitively, each stage works alone during its allocated time (while clk is high, minus overhead), and the contributions multiply: each of the k stages contributes a factor exp(-t/τ).
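The PoF expression on this slide is easy to evaluate directly. The following is a minimal sketch of P_k = P_0 · exp(-k(T/2 - T_oh)/τ); the function name `pipeline_pof` and the example numbers (1 ns clock, 100 ps overhead, τ = 50 ps) are assumptions for illustration, not values from the article.

```python
import math


def pipeline_pof(p0, stages, period, t_oh, tau):
    """Probability of failure after `stages` synchronizing pipeline stages.

    p0:     PoF without any synchronization (P_0)
    stages: number of synchronizing stages (k)
    period: clock period T
    t_oh:   per-stage overhead T_oh (synchronizer + FIFO delay)
    tau:    metastability resolution time constant
    """
    t_per_stage = period / 2 - t_oh
    if t_per_stage <= 0:
        raise ValueError("overhead leaves no time for synchronization")
    return p0 * math.exp(-stages * t_per_stage / tau)


if __name__ == "__main__":
    # Hypothetical numbers: T = 1 ns, T_oh = 100 ps, tau = 50 ps.
    for k in range(0, 5):
        print(k, pipeline_pof(1.0, k, 1e-9, 100e-12, 50e-12))
```

Each added stage multiplies the PoF by the same factor exp(-(T/2 - T_oh)/τ), which is the safety vs. latency tradeoff in numeric form.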

Slide 20: Future
At each stage, the time for sync is T/2 - T_oh
Insert logic in the pipeline
– On data, no problem
– On ctrl it is possible, but T_oh effectively grows → need more stages for the same PoF
– But the pipeline would have added functionality
Can also contemplate insertion of ME elements along existing pipelines in synchronous designs…
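The cost of putting logic on the control path can be estimated by inverting the pipeline PoF relation P_k / P_0 = exp(-k(T/2 - T_oh)/τ) for k. The sketch below does that; the name `stages_needed` and the numbers (times in picoseconds, T = 1000, τ = 50, target ratio 1e-12) are hypothetical.

```python
import math


def stages_needed(pof_ratio, period, t_oh, tau):
    """Smallest stage count k with exp(-k * (T/2 - T_oh) / tau) <= pof_ratio.

    pof_ratio: target P_k / P_0, in (0, 1]
    period:    clock period T
    t_oh:      per-stage overhead T_oh, including any inserted ctrl logic
    tau:       metastability resolution time constant (same unit as period)
    """
    if not 0 < pof_ratio <= 1:
        raise ValueError("pof_ratio must be in (0, 1]")
    t_per_stage = period / 2 - t_oh
    if t_per_stage <= 0:
        raise ValueError("overhead leaves no time for synchronization")
    return math.ceil(-math.log(pof_ratio) * tau / t_per_stage)


if __name__ == "__main__":
    # Times in picoseconds; all values are illustrative assumptions.
    print(stages_needed(1e-12, 1000, 100, 50))  # small overhead
    print(stages_needed(1e-12, 1000, 300, 50))  # extra ctrl logic adds 200 ps
```

With these assumed numbers, tripling the overhead roughly doubles the number of stages required for the same PoF.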

Slide 21: Glancing Back
Need for a better solution that also addresses the long-interconnect issue
Asynchronicity (degree of)
Syncing in stages: pipeline
ME as a synchronizer
FIFO element
Pipeline synchronizers

Slide 22: Next Presentation
More on Pipeline Sync, …
“Bridging Clock Domains by Synchronizing the Mice in the Mousetrap”, Kessels, Peeters, Kim, Philips Research Laboratory, 2003