De-synchronization: from synchronous to asynchronous

Presentation transcript:

De-synchronization: from synchronous to asynchronous. Based on the paper: Blunno, Cortadella, Kondratyev, Lavagno, Lwin, Sotiriou, "Handshake protocols for de-synchronization", ASYNC 2004.

Outline: What is de-synchronization? Behavioral equivalence. 4-phase protocols for de-synchronization. Concurrency. Correctness. An example.

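[Figure: a synchronous circuit driven by CLK is de-synchronized into an asynchronous circuit]

Synchronous circuit: [figure: master-slave (MS) flip-flops drawn as latches (L), clocked by CLK]

De-synchronization: [figure: the same latches (L), with a controller (C) in place of CLK]

De-synchronization: Distributed controllers (C) substitute the clock network. The data path remains intact!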

Design flow: (1) Think synchronous. (2) Design synchronous: one clock and edge-triggered flip-flops. (3) De-synchronize (automatically). (4) Run it asynchronously.

Prior work: Micropipelines (Sutherland, 1989). Local generation of clocks: Varshavsky et al., 1995; Kol and Ginosar, 1996. Theseus Logic (Ligthart et al., 2000): commercial HDL synthesis tools; direct translation and special registers. Phased logic (Linder and Harden, 1996; Reese, Thornton, Traver, 2003): conceptually similar; different handshake protocol (2 phase vs. 4 phase).

Automatic de-synchronization: Devise an automatic method for de-synchronization. Identify a subclass of synchronous circuits suitable for de-synchronization. Formally prove correctness.

Synchronous flow

De-synchronized flow

Flow equivalence [Le Guernic, Talpin, Le Lann, 2003]

Flow equivalence. Synchronous behavior (clocked by CLK): A = 1 3 0 2 1 5 3 1 6 0; B = 5 1 2 3 1 4 2 4 3 1. De-synchronized behavior: A = 1 3 0 2 1 5 3 1 6 0; B = 5 1 2 3 1 4 2 4 3 1.
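
The two behaviors on this slide carry the same sequence of values on each latch, only at different times. A minimal Python sketch of that check over finite traces follows. The event format `(time, latch, value)`, the function names and all timestamps are assumptions made for illustration; only the value sequences for A and B come from the slide.

```python
from collections import defaultdict

def latch_sequences(trace):
    """Project a timed trace onto each latch, keeping only the order of stored values."""
    seqs = defaultdict(list)
    for _time, latch, value in sorted(trace, key=lambda event: event[0]):
        seqs[latch].append(value)
    return dict(seqs)

def flow_equivalent(trace_a, trace_b):
    """Two finite traces are treated as flow-equivalent if every latch stores
    the same sequence of values in both, regardless of when it stores them."""
    return latch_sequences(trace_a) == latch_sequences(trace_b)

# Value sequences taken from the slide.
A_VALUES = [1, 3, 0, 2, 1, 5, 3, 1, 6, 0]
B_VALUES = [5, 1, 2, 3, 1, 4, 2, 4, 3, 1]

# Synchronous behavior: both latches store on every clock tick (hypothetical period of 10).
sync_trace = ([(10 * i, "A", v) for i, v in enumerate(A_VALUES)]
              + [(10 * i, "B", v) for i, v in enumerate(B_VALUES)])

# De-synchronized behavior: each latch stores at its own pace (hypothetical times).
desync_trace = ([(7 * i + 1, "A", v) for i, v in enumerate(A_VALUES)]
                + [(9 * i + 4, "B", v) for i, v in enumerate(B_VALUES)])

print(flow_equivalent(sync_trace, desync_trace))
```

Dropping or reordering a single stored value in either trace makes the comparison fail, which is the kind of difference that data overrun or data loss would produce.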

[Figure: the de-synchronized circuit of latches (L) and controllers (C), narrowed down to a single latch with its controller]

[Animated figure: latches A, B, C, D with events A+ B- C+ D- and A- B+ C- D+; a marker ("1") moves between the latches from frame to frame]

A latch cannot read another data item until the successor has captured the current one.

A latch cannot become opaque before having captured the data item from its predecessor.

[Figure: latches A, B, C, D with events A+ B+ C+ D+ and A- B- C- D-, followed by the pair of latches A and B]

Can we increase concurrency? [Figure: three marked graphs over events A+ B+ A- B-] Not flow-equivalent.

[Figure: latches A and B in three scenarios, two of them labeled "data overrun" and "data lost"]

Can we reduce concurrency? How much? [Figure: marked graph over events A+ B+ A- B-]

[Figure: marked graphs over events A+ B+ A- B- with decreasing concurrency, labeled 8 states, 6 states, 5 states and 4 states]

[Figure: latch pairs A, B for each protocol: de-synchronization model; fully decoupled (Furber & Day); GasP, IPCMOS; semi-decoupled (Furber & Day); simple 4-phase; non-overlapping]

[Figure: the same protocols as marked graphs over events A+ B+ A- B-: de-synchronization model; fully decoupled (Furber & Day); GasP, IPCMOS; simple 4-phase; non-overlapping; semi-decoupled (Furber & Day)]

4-phase latch controllers (Furber and Day, IEEE Trans. VLSI, June 1996). [Figure: two controllers in sequence, each with handshake signals Rin, Ain, Rout, Aout and latch control Lt] Implementation note: Lt=0 (transparent), Lt=1 (opaque).
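
As background for the controllers on the following slides, here is a minimal Python sketch of a generic 4-phase (return-to-zero) request/acknowledge handshake on a single channel. It is not one of the Furber and Day controllers: the signal names `req` and `ack`, the function name and the data items are assumptions made for illustration, and the latch control Lt is not modeled.

```python
def four_phase_transfer(data_items):
    """Yield the events of a 4-phase handshake, one full cycle per data item.

    Each transfer takes four signal transitions:
    req+ (data valid), ack+ (data captured), req- and ack- (return to zero).
    """
    req = ack = 0
    for item in data_items:
        # Both wires must be back at zero before a new transfer starts.
        assert req == 0 and ack == 0
        req = 1
        yield ("req+", item)
        ack = 1
        yield ("ack+", item)
        req = 0
        yield ("req-", item)
        ack = 0
        yield ("ack-", item)

events = list(four_phase_transfer(["d0", "d1"]))
for name, item in events:
    print(name, item)
```

The protocols compared in this talk differ in how much the handshakes on the input and output sides of a controller may overlap, which a single-channel sketch like this cannot show.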

4-phase latch controllers: [figure: controller with signals Rin, Rout, Ain, Aout and Lt; events Rin+ Rout+ Lt+ Ain+ Aout+ Rin- Rout- Lt- Ain- Aout-, with their ordering left open ("?")]

Simple 4-phase controller: [figure: controller with signals Rin, Rout, Ain, Aout and Lt; events Rin+ Rout+ Ain+ Lt+ Aout+ Rin- Rout- Ain- Lt- Aout-]

Semi-decoupled controller: [figure: the same signals and events, plus A+ and A-]

Fully decoupled controller: [figure: the same signals and events, plus A+, A-, B+ and B-]

4-phase latch controllers (state graphs): [figures: semi-decoupled controller; fully decoupled controller]

Semi-decoupled 4-phase protocol: [animated figure: two controllers (cntrl) with handshake signals Ri, Ai, Rx, Ax, Ro, Ao; events Ri+ A- Rx+ B- Ro+ Ai+ Ax+ Ao+ Ri- A+ Rx- B+ Ro- Ai- Ax- Ao-; the later frames show only A- B- A+ B+]

[Figure: six marked graphs over events A+ B+ A- B-]

Which protocols are valid for de-synchronization?

Theorem: the de-synchronization protocol preserves flow-equivalence. [Figure: marked graph over events A+ B+ A- B-] Proof: by induction on the length of the traces. Base case: same latch values at reset. Induction step: same values at cycle i ⇒ same values at cycle i+1.

[Figure: six marked graphs over events A+ B+ A- B-]

Theorem: any reduction in concurrency preserves flow-equivalence. [Figure: five marked graphs over events A+ B+ A- B-]

Any hybrid approach preserves flow-equivalence! [Figure: controllers labeled semi-decoupled, fully decoupled and non-overlapping]

[Figure: latches A, B, C, D with events A+ B+ C+ D+ and A- B- C- D-]

Flow-equivalence is preserved, … but … [Figure: events A+ B+ C+ D+ A- B- C- D-, with protocols labeled semi-decoupled, non-overlapping and fully decoupled]

Liveness. Preservation of flow-equivalence: all the generated traces are equivalent. Are all traces generated? (Is the marked graph live?) Not always!

Semi-decoupled 4-phase handshake protocol: [figure: marked graph over events A+ B+ C+ D+ A- B- C- D-] Liveness: all cycles have at least one token [Commoner 1971].
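
The liveness condition quoted on this slide can be checked mechanically: every cycle has at least one token exactly when the arcs that carry no token form no cycle. The Python sketch below does this with a depth-first search. The arc dictionary format and the two small example graphs are hypothetical and are not the marked graph drawn on the slide.

```python
def is_live(arcs):
    """Liveness check for a marked graph.

    arcs maps (source_event, target_event) -> number of tokens on that arc.
    The graph is live iff every directed cycle holds at least one token,
    i.e. iff the subgraph made of token-free arcs is acyclic.
    """
    token_free = {}
    for (src, dst), tokens in arcs.items():
        token_free.setdefault(src, [])
        token_free.setdefault(dst, [])
        if tokens == 0:
            token_free[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {node: WHITE for node in token_free}

    def has_cycle(node):
        color[node] = GREY
        for nxt in token_free[node]:
            if color[nxt] == GREY:
                return True  # back edge: a cycle without any token
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    return not any(color[n] == WHITE and has_cycle(n) for n in token_free)

# Hypothetical example: four events in a ring, with and without a token.
ring_with_token = {("A+", "B+"): 0, ("B+", "A-"): 0, ("A-", "B-"): 0, ("B-", "A+"): 1}
ring_without_token = {arc: 0 for arc in ring_with_token}

print(is_live(ring_with_token), is_live(ring_without_token))
```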

Simple 4-phase handshake protocol: [figure: marked graph over events A+ B+ C+ D+ A- B- C- D-]

Results about liveness: At least three latches in a ring are required with only one data token circulating [Muller 1962]. Theorem (this paper): any hybrid combination of protocols is live if the simple 4-phase protocol is not used. Proof: any cycle has at least one token.

Valid for de-synchronization: [figure: marked graphs over events A+ B+ A- B- for the de-synchronization model; fully decoupled (Furber & Day); GasP, IPCMOS; simple 4-phase; non-overlapping; semi-decoupled (Furber & Day)]

Async DLX block diagram

Results, starting from synchronous RTL (all numbers are after placement and routing):
Synchronous: cycle 4.4 ns, power 70.9 mW, area 372,656 µm²
De-synchronized: cycle 4.45 ns, power 71.2 mW, area 378,058 µm²
Total of 1500 flip-flops, 3000 latches. The de-synchronized design includes 5 controllers, each driving 2 clock trees. Power numbers include the clock tree. Technology: UMC/Virtual Silicon 0.18 µm.
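
To put the two sets of numbers side by side, the short Python sketch below computes the relative overhead of the de-synchronized design for each metric. It assumes that the first set of figures on the slide belongs to the synchronous design and the second to the de-synchronized one; the dictionary keys and the function name are illustrative.

```python
# Figures copied from the results slide. Assumption: the first set is the
# synchronous design, the second the de-synchronized one.
sync = {"cycle_ns": 4.4, "power_mW": 70.9, "area": 372_656}
desync = {"cycle_ns": 4.45, "power_mW": 71.2, "area": 378_058}

def overhead_percent(base, new):
    """Relative increase of new over base, in percent."""
    return 100.0 * (new - base) / base

overheads = {key: round(overhead_percent(sync[key], desync[key]), 2) for key in sync}
print(overheads)
```

Under that assumption, the de-synchronized design is within about 1.5% of the synchronous one on cycle time, power and area.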

Discussion: The de-synchronization model provides an abstraction of the timing behavior.

Exploration of the design space: [figure: timing analysis over events A to G, annotated with delay intervals such as [0,0], [1,2], [2,3], [2,4], [3,5], [5,7] and [8,9]]

Conclusions: EDA tools require formal support (they must work for all circuits). A complete characterization of 4-phase protocols has been presented (partial order based on concurrency). Design flow developed at Cadence Berkeley Labs: automated from gate netlist; static timing analysis to derive matched delays; constrained P&R to meet timing constraints.
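
The step "static timing analysis to derive matched delays" can be illustrated with a small sketch: take the worst-case (longest) path delay through the combinational logic between two latches and add a safety margin. The Python code below assumes the logic is given as a directed acyclic graph with one maximum delay per edge. The graph, the delay values, the 10% margin and all names are hypothetical; this is not the flow used by the authors.

```python
from functools import lru_cache

def worst_case_delay(edges, source, sink):
    """Longest source-to-sink path in a DAG; edges maps (u, v) -> maximum delay."""
    successors = {}
    for (u, v), delay in edges.items():
        successors.setdefault(u, []).append((v, delay))

    @lru_cache(maxsize=None)
    def longest(node):
        if node == sink:
            return 0.0
        # Nodes that cannot reach the sink contribute -inf and never win the max.
        return max((delay + longest(nxt) for nxt, delay in successors.get(node, [])),
                   default=float("-inf"))

    return longest(source)

# Hypothetical combinational block between two latches (delays in ns).
edges = {("in", "g1"): 2.0, ("in", "g2"): 1.0, ("g1", "out"): 3.0, ("g2", "out"): 5.0}

MARGIN = 1.10  # assumed 10% safety margin on top of the worst-case path
critical = worst_case_delay(edges, "in", "out")
matched_delay = critical * MARGIN
print(critical, matched_delay)
```

The matched delay has to be at least as long as the slowest data path it tracks, which is why the sketch uses the maximum delay of each edge rather than the minimum.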