Synthesis of Asynchronous Control Circuits with Automatically Generated Relative Timing Assumptions Jordi Cortadella, University Politècnica de Catalunya.

Presentation transcript:

Synthesis of Asynchronous Control Circuits with Automatically Generated Relative Timing Assumptions
Jordi Cortadella, University Politècnica de Catalunya
Mike Kishinevsky, Intel Corporation
Steven M. Burns, Intel Corporation
Ken Stevens, Intel Corporation
Earlier contributions: Luciano Lavagno, Alex Kondratyev, Alex Yakovlev, Alexander Taubin

Outline
– Why asynchronous
– Relative timing
– Reminder: design flow for asynchronous circuits
– Lazy transition systems
– Timing assumptions and constraints
– Automatic generation of timing assumptions
– Results

Why asynchronous?
– All high-performance “synchronous” design styles are “asynchronous in the small” (within one or a few clock cycles). Example: [ISSCC2001 Intel paper on 4GHz IEU for 0.18um CMOS in Pentium 4(tm)]. Such designs require asynchronous-style timing analysis.
– The relative sequential distance within a die for global wires is growing.
– Can we deliver a global clock N years from now?

Timing assumptions in design flow
– Synchronous circuits (e.g., static CMOS):
  max delay: stabilize within a clock (- setup - clock2q - clock_skew)
  min delay: stabilize after hold time (+ clock_skew - clock2q)
– Speed-independent = quasi-delay insensitive: wire delays after a fork are smaller than fan-out gate delays [Muller59, Varshavsky et al. 80, Martin89,…]. Problem: fat circuits.
– Burst-mode FSM: the circuit stabilizes between two changes at the inputs [Nowick91, Yun94]. Problem: fundamental mode is similar to synchronous (external alignment by the worst case).
– Timed circuits: absolute bounds on gate / environment delays are known a priori (before physical design) [Myers95]. Problem: how do you know absolute delays before sizing/physical design?
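The synchronous max/min delay bounds on this slide can be written out as a small check. This is a minimal sketch: the class name `ClockParams`, the function names, and the picosecond values are illustrative assumptions, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockParams:
    """Clocking parameters in picoseconds (illustrative values only)."""
    period: float
    setup: float
    hold: float
    clock2q: float
    clock_skew: float


def max_delay_bound(p: ClockParams) -> float:
    # max delay: stabilize within a clock (- setup - clock2q - clock_skew)
    return p.period - p.setup - p.clock2q - p.clock_skew


def min_delay_bound(p: ClockParams) -> float:
    # min delay: stabilize after hold time (+ clock_skew - clock2q)
    return p.hold + p.clock_skew - p.clock2q


def meets_timing(p: ClockParams, shortest_path: float, longest_path: float) -> bool:
    """True if the combinational delays fit between the two bounds."""
    return shortest_path >= min_delay_bound(p) and longest_path <= max_delay_bound(p)


params = ClockParams(period=1000.0, setup=50.0, hold=60.0, clock2q=80.0, clock_skew=40.0)
print(max_delay_bound(params), min_delay_bound(params))
print(meets_timing(params, shortest_path=25.0, longest_path=800.0))
```

Both bounds are absolute numbers tied to the clock, which is the contrast the later slides draw with relative orderings between events.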

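Relative Timing Asynchronous Circuits
– Speed-independent C-element (inputs a, b; output c)
– Timing assumption (on environment): a- before b-
– RT C-element: faster, smaller; correct only under the timing constraint a- before b-
[Figure: speed-independent and RT C-element circuits with inputs a, b and output c]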
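To see why the assumption a- before b- permits a smaller gate, the sketch below enumerates the states a C-element can reach when the environment always lowers a before b. The environment model and the simplified function `c_rt` are assumptions made for illustration; the actual RT C-element circuit on the slide is not recoverable from the transcript.

```python
from collections import deque


def c_si(a, b, c):
    """Next-state function of a standard C-element: c' = ab + c(a + b)."""
    return (a & b) | (c & (a | b))


def c_rt(a, b, c):
    """Candidate simplification (an assumption): c' = b(a + c)."""
    return b & (a | c)


def moves(state):
    """Successor states of (a, b, c) under the assumed environment."""
    a, b, c = state
    successors = []
    if c == 0:                      # inputs may rise in any order
        if a == 0:
            successors.append((1, b, c))
        if b == 0:
            successors.append((a, 1, c))
    else:                           # inputs fall, with a- before b-
        if a == 1:
            successors.append((0, b, c))
        elif b == 1:
            successors.append((a, 0, c))
    if c_si(a, b, c) != c:          # the gate itself is excited
        successors.append((a, b, c_si(a, b, c)))
    return successors


def reachable(start=(0, 0, 0)):
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in moves(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


states = reachable()
all_states = {(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)}
print("unreachable:", sorted(all_states - states))
print("functions agree on reachable states:",
      all(c_si(*s) == c_rt(*s) for s in states))
```

The single unreachable state acts as a don't care, so the two functions may differ there without changing the observable behavior, provided the timing constraint really holds.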

Relative Timing Circuits
– Assumptions: “a before b”
  for concurrent events: reduces reachable state space
  for ordered events: permits early enabling
  both increase the don’t care space for logic synthesis => simpler logic (better area and timing)
– “Assume - if useful - guarantee” approach: assumptions are used by the tool to derive a circuit and the required timing constraints that must be met in the physical design flow
– Applied to the design of the Rotating Asynchronous Pentium Processor(TM) Instruction Decoder (K.Stevens, S.Rotem et al., Intel Corporation)

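VME Bus Controller: STG for the READ cycle
Signals: LDS, LDTACK, D, DSr, DTACK
Events: LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+
[Figure: VME bus controller block diagram and the signal transition graph of the read cycle]

State Graph (Read cycle)
[Figure: state graph with transitions labeled DSr+, DTACK-, LDS-, LDTACK-, D-, DSr-, DTACK+, D+, LDTACK+, LDS+]

Binary encoding of signals
State vector order: (DSr, DTACK, LDTACK, LDS, D)
[Figure: the read-cycle state graph with each state labeled by its binary code]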
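The reachability analysis and binary encoding steps can be sketched as a token game on the STG. The arc list and initial marking below are assumptions: they follow the commonly published VME read-cycle STG, since the figure itself did not survive in this transcript. Only the signal order (DSr, DTACK, LDTACK, LDS, D) comes from the slide.

```python
from collections import deque

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")  # state vector order from the slide

# ASSUMPTION: causality arcs of the read-cycle STG (each arc acts as a place).
ARCS = [
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
    ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
    ("D-", "DTACK-"), ("D-", "LDS-"), ("DTACK-", "DSr+"),
    ("LDS-", "LDTACK-"), ("LDTACK-", "LDS+"),
]
# ASSUMPTION: initial marking, with all signals initially 0.
INITIAL_MARKING = frozenset({("DTACK-", "DSr+"), ("LDTACK-", "LDS+")})
INITIAL_CODE = (0, 0, 0, 0, 0)
EVENTS = sorted({event for arc in ARCS for event in arc})


def enabled(marking):
    """Events whose input arcs all carry a token."""
    return [e for e in EVENTS
            if all(arc in marking for arc in ARCS if arc[1] == e)]


def fire(marking, code, event):
    """Move tokens across `event` and update the value of its signal."""
    consumed = {arc for arc in ARCS if arc[1] == event}
    produced = {arc for arc in ARCS if arc[0] == event}
    new_code = list(code)
    new_code[SIGNALS.index(event[:-1])] = 1 if event.endswith("+") else 0
    return frozenset((marking - consumed) | produced), tuple(new_code)


def state_graph():
    """Breadth-first reachability analysis; returns (states, edges)."""
    start = (INITIAL_MARKING, INITIAL_CODE)
    states, edges, queue = {start}, [], deque([start])
    while queue:
        state = queue.popleft()
        for event in enabled(state[0]):
            nxt = fire(*state, event)
            edges.append((state, event, nxt))
            if nxt not in states:
                states.add(nxt)
                queue.append(nxt)
    return states, edges


states, edges = state_graph()
codes = sorted("".join(map(str, code)) for _, code in states)
print(len(states), "states,", len(edges), "transitions")
print(codes)
```

Under these assumptions two distinct states receive the same binary code, which is the situation the Karnaugh map slide flags for LDS.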

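Next-state function (signal LDS)
0 → 1: states where LDS+ is enabled
1 → 0: states where LDS- is enabled
0 → 0 and 1 → 1: states where LDS is stable
[Figure: state graph with each state annotated by the current and next value of LDS]

Karnaugh map for LDS
Variables: DTACK, DSr, D, LDTACK (one map for LDS = 0, one for LDS = 1)
One cell requires both 0 and 1 (marked “0/1?”): a state coding conflict
[Figure: Karnaugh maps]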
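The conflict cell in the Karnaugh map can be found mechanically by tabulating the next value of a signal per binary code. The state table below is hand-derived from the commonly published VME read-cycle STG and is an assumption, since the state graph figure is missing; codes follow the order (DSr, DTACK, LDTACK, LDS, D).

```python
from collections import defaultdict

SIGNALS = ("DSr", "DTACK", "LDTACK", "LDS", "D")

# ASSUMPTION: (binary code, enabled events) for each reachable state.
STATES = [
    ("00000", {"DSr+"}),
    ("10000", {"LDS+"}),
    ("10010", {"LDTACK+"}),
    ("10110", {"D+"}),
    ("10111", {"DTACK+"}),
    ("11111", {"DSr-"}),
    ("01111", {"D-"}),
    ("01110", {"DTACK-", "LDS-"}),
    ("00110", {"DSr+", "LDS-"}),
    ("10110", {"LDS-"}),
    ("01100", {"DTACK-", "LDTACK-"}),
    ("00100", {"DSr+", "LDTACK-"}),
    ("10100", {"LDTACK-"}),
    ("01000", {"DTACK-"}),
]


def next_value(code, events, signal):
    """Next-state value: 1 if rising is enabled, 0 if falling, else unchanged."""
    if signal + "+" in events:
        return 1
    if signal + "-" in events:
        return 0
    return int(code[SIGNALS.index(signal)])


def next_state_table(signal):
    """Map each binary code to the set of next values required for `signal`."""
    table = defaultdict(set)
    for code, events in STATES:
        table[code].add(next_value(code, events, signal))
    return table


def conflicts(signal):
    """Codes for which the next-state function would need both 0 and 1."""
    return sorted(code for code, values in next_state_table(signal).items()
                  if len(values) > 1)


for output in ("LDS", "D", "DTACK"):
    print(output, conflicts(output))
```

A code listed for an output signal corresponds to a “0/1?” cell: no Boolean function of the five signals can implement that output until the two states are told apart.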

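Signal Insertion
A state signal CSC is inserted to separate the conflicting states.
[Figure: state graph with the inserted signal; visible labels: LDS-, LDTACK-, D-, DSr-, LDTACK+, LDS+, CSC-]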

Speed-independent netlist

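Transition systems
– ER (LDS+), ER (LDS-): excitation regions of LDS+ and LDS-
– Excitation region: enabling = firing, since the delay can be zero
[Figure: state graph with the excitation regions marked and LDS annotated 1 → 0 and 0 → 1]

Lazy Transition Systems
– ER (LDS+), ER (LDS-): enabling regions; FR (LDS-): firing region of LDS-
– Event LDS- is lazy: firing = subset of enabling
[Figure: state graph with ER (LDS+), ER (LDS-) and FR (LDS-) marked; DTACK- labeled]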
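As a data-structure view of the distinction, an event in a lazy transition system can carry two state sets instead of one. This is a minimal sketch; the class name `LazyEvent` and the state names `s0`..`s7` are hypothetical and do not correspond to states in the slide's figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyEvent:
    """An event with separate enabling and firing regions (sets of states)."""
    name: str
    enabling_region: frozenset
    firing_region: frozenset

    def is_well_formed(self) -> bool:
        # Firing is only allowed in states where the event is enabled.
        return self.firing_region <= self.enabling_region

    def is_lazy(self) -> bool:
        # Lazy: enabled in at least one state where it cannot yet fire.
        return self.firing_region < self.enabling_region


# Ordinary transition system: enabling = firing.
lds_plus = LazyEvent("LDS+", frozenset({"s0"}), frozenset({"s0"}))
# Lazy event: enabled early in s5, but it fires only in s6 or s7.
lds_minus = LazyEvent("LDS-", frozenset({"s5", "s6", "s7"}), frozenset({"s6", "s7"}))

for event in (lds_plus, lds_minus):
    print(event.name, "well formed:", event.is_well_formed(), "lazy:", event.is_lazy())
```

The gap between the two regions is what logic synthesis can exploit: in an enabled-but-not-firing state the next value of the signal is free to be either 0 or 1.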

Timing assumptions
– (a before b) for concurrent events: concurrency reduction for firing and enabling
– (a before b) for ordered events: early enabling
– (a simultaneous to b wrt c) for triples of events: combination of the above

Speed-independent Netlist
[Figure: read-cycle STG (LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+) next to the netlist over DTACK, D, DSr, LDS, LDTACK; labels: csc, map]

Adding timing assumptions (I)
Assumption: LDTACK- before DSr+ (figure labels: FAST, SLOW)
[Figure: read-cycle STG next to the netlist over DTACK, D, DSr, LDS, LDTACK; labels: csc, map]

State space domain
Assumption: LDTACK- before DSr+
Two more unreachable states
[Figure: state graph with the LDTACK- and DSr+ transitions highlighted]
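The effect of the assumption on the state space can be reproduced with a small reachability sketch. Two things here are assumptions: the STG arcs and initial marking (the commonly published VME read-cycle STG, as the figure is missing), and the pruning rule, which blocks b in any state from which a can still fire first. The rule is one simple way to model “a before b”, not necessarily how Petrify implements it.

```python
from collections import deque

# ASSUMPTION: causality arcs and initial marking of the read-cycle STG.
ARCS = [
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
    ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
    ("D-", "DTACK-"), ("D-", "LDS-"), ("DTACK-", "DSr+"),
    ("LDS-", "LDTACK-"), ("LDTACK-", "LDS+"),
]
M0 = frozenset({("DTACK-", "DSr+"), ("LDTACK-", "LDS+")})
EVENTS = sorted({event for arc in ARCS for event in arc})


def enabled(marking):
    return [e for e in EVENTS
            if all(arc in marking for arc in ARCS if arc[1] == e)]


def fire(marking, event):
    consumed = {arc for arc in ARCS if arc[1] == event}
    produced = {arc for arc in ARCS if arc[0] == event}
    return frozenset((marking - consumed) | produced)


def can_fire_first(marking, a, b):
    """True if some firing sequence from `marking` fires a before any b."""
    seen, queue = {marking}, deque([marking])
    while queue:
        current = queue.popleft()
        for event in enabled(current):
            if event == a:
                return True
            if event == b:
                continue
            nxt = fire(current, event)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def reachable(assumption=None):
    """Reachable markings; `assumption` is an optional (a, b) pair: a before b."""
    seen, queue = {M0}, deque([M0])
    while queue:
        current = queue.popleft()
        for event in enabled(current):
            if assumption and event == assumption[1] \
                    and can_fire_first(current, *assumption):
                continue  # b must wait until a has fired
            nxt = fire(current, event)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


untimed = reachable()
timed = reachable(("LDTACK-", "DSr+"))
print(len(untimed), "states untimed,", len(timed), "states with the assumption")
```

With these assumptions the timed state space loses exactly two states, which matches the count on the slide.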

Boolean domain
– One more DC vector for all signals
– One state conflict is removed
[Figure: Karnaugh maps for LDS (LDS = 0 and LDS = 1) over DTACK, DSr, D, LDTACK, before and after applying the assumption; the “0/1?” cell is the conflict]

Netlist with one constraint
TIMING CONSTRAINT: LDTACK- before DSr+
[Figure: read-cycle STG next to the netlist over DTACK, D, DSr, LDS, LDTACK]

Timing assumptions
– (a before b) for concurrent events: concurrency reduction for firing and enabling
– (a before b) for ordered events: early enabling
– (a simultaneous to b wrt c) for triples of events: combination of the above

Ordered events: early enabling
Logic for gate c may change
[Figure: events a, b, c before and after early enabling; gates F and G]

Adding timing assumptions (II)
Assumption: D- before LDS-
[Figure: read-cycle STG next to the netlist over DTACK, D, DSr, LDS, LDTACK]

State space domain
Assumption: D- before LDS-
– Reachable space is unchanged
– For LDS-, enabling can be changed in one state
– Potential enabling for LDS-: DSr-
[Figure: state graph with LDS- and D- highlighted]

Boolean domain
– One more DC vector for one signal: LDS
– If used: LDS = DSr, otherwise: LDS = DSr + D
[Figure: Karnaugh maps for LDS (LDS = 0 and LDS = 1) over DTACK, DSr, D, LDTACK]
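The two candidate equations for LDS can be checked against a next-state table. The table below is an assumption: it is hand-derived from the commonly published VME read-cycle STG after removing the two states excluded by LDTACK- before DSr+, with codes in the order (DSr, DTACK, LDTACK, LDS, D).

```python
# ASSUMPTION: required next value of LDS for each reachable code.
NEXT_LDS = {
    "10000": 1, "10010": 1, "10110": 1, "10111": 1, "11111": 1, "01111": 1,
    "01110": 0, "00110": 0, "01100": 0, "00100": 0, "01000": 0, "00000": 0,
}


def lds_without_early_enabling(code):
    dsr, d = int(code[0]), int(code[4])
    return dsr | d          # LDS = DSr + D


def lds_with_early_enabling(code):
    return int(code[0])     # LDS = DSr


def mismatches(function):
    """Codes where `function` differs from the required next value of LDS."""
    return sorted(code for code, value in NEXT_LDS.items()
                  if function(code) != value)


print("LDS = DSr + D differs at:", mismatches(lds_without_early_enabling))
print("LDS = DSr differs at:", mismatches(lds_with_early_enabling))
```

Under these assumptions the simpler equation differs from the table in a single code, the state entered after DSr- and before D-. That is the one state where early enabling of LDS- turns the entry into a don't care, so the simpler gate becomes acceptable once D- before LDS- is guaranteed.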

Before early enabling
[Figure: read-cycle STG next to the netlist over DTACK, D, DSr, LDS, LDTACK]

Netlist with two constraints
TIMING CONSTRAINTS: LDTACK- before DSr+ and D- before LDS-
Both timing assumptions are used for optimization and become constraints
[Figure: read-cycle STG next to the netlist over DTACK, D, DSr, LDS, LDTACK]

Deriving automatic timing assumptions
Rule I (out of 6): a, b are non-input events
– Untimed ordering: (a||b) and (a enabled before b), but not vice versa
– Derived assumption: a fires before b
– Justification: the delay of a gate can be made shorter than the delay of two (or more) gates: del(a) < del(c) + del(b)
– Effect I: a state becomes DC for all signals
– Effect II: another state becomes local DC for the signal of event b
[Figure: state graph fragments over events a, b, c]
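One possible reading of Rule I can be sketched on a toy transition system in which c triggers b while a runs concurrently with both. The transition system, the state names, and the exact definitions of `concurrent` and `enabled_before` are assumptions chosen for illustration; the rule as implemented in the tool may be defined differently.

```python
# Hypothetical transition system: state -> {event: next state}.
TS = {
    "s0": {"a": "s1", "c": "s2"},
    "s1": {"c": "s3"},
    "s2": {"a": "s3", "b": "s4"},
    "s3": {"b": "s5"},
    "s4": {"a": "s5"},
    "s5": {},
}
NON_INPUT_EVENTS = ["a", "b", "c"]  # assumption: all three are non-input


def concurrent(ts, a, b):
    """a||b: some state enables both, and firing one keeps the other enabled."""
    return any(a in out and b in out and b in ts[out[a]] and a in ts[out[b]]
               for out in ts.values())


def enabled_before(ts, a, b):
    """a is enabled in a state where b is not, and stays enabled until b is."""
    for out in ts.values():
        if a in out and b not in out:
            for event, nxt in out.items():
                if event != a and a in ts[nxt] and b in ts[nxt]:
                    return True
    return False


def rule_one(ts, events):
    """Ordered pairs (a, b) for which 'a fires before b' is derived."""
    return [(a, b) for a in events for b in events
            if a != b and concurrent(ts, a, b)
            and enabled_before(ts, a, b) and not enabled_before(ts, b, a)]


def reachable(ts, assumptions, start="s0"):
    """States reachable when b is blocked wherever a is also enabled."""
    seen, stack = {start}, [start]
    while stack:
        out = ts[stack.pop()]
        for event, nxt in out.items():
            if any(event == b and a in out for a, b in assumptions):
                continue
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


derived = rule_one(TS, NON_INPUT_EVENTS)
print("derived assumptions:", derived)
print("states removed:", sorted(set(TS) - reachable(TS, derived)))
```

In this toy example the removed state plays the role of Effect I (a don't care for all signals), while the state where b is enabled but blocked corresponds to Effect II (a local don't care for the signal of b).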

Backannotation of Timing Constraints
– Timed circuits require post-verification
– Can synthesis tools help?
  Report the least stringent set of timing constraints required for the correctness of the circuit
  Not all initial timing assumptions may be required
– Petrify reports a set of constraints on the order of firing that guarantees circuit correctness

Timing constraints generation
Assumptions: d before b and c before e and a before d
[Figure: specification over events a, b, c, d, e and its state graph; one frame marks the correct behavior, another marks the incorrect behavior (states 1 and 2)]

Covering incorrect behavior
Assumptions: d before b and c before e and a before d
Candidate constraints and the states they remove:
– {1, 3}: d before b
– {1}: d before c
– {2, 4}: c before e
Other possible constraints remove states from the assumption domain => invalid
Constraints for the minimal cost solution: d before c and c before e
[Figure: state graph over events a, b, c, d, e with numbered states]
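Choosing the constraints can be viewed as a small covering problem. The sketch below makes two assumptions that the transcript does not state: the incorrect states are 1 and 2 (taken from the labels on the preceding slide), and the cost of a constraint is the number of states it removes. The exhaustive search is for illustration only.

```python
from itertools import combinations

# From the slide: states removed by each candidate constraint.
CANDIDATES = {
    "d before b": {1, 3},
    "d before c": {1},
    "c before e": {2, 4},
}
INCORRECT = {1, 2}  # ASSUMPTION: states showing incorrect behavior


def cost(names):
    # ASSUMPTION: a constraint costs as much as the number of states it removes.
    return sum(len(CANDIDATES[name]) for name in names)


def min_cost_cover(incorrect, candidates):
    """Cheapest subset of constraints whose removed states cover `incorrect`."""
    best = None
    names = sorted(candidates)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            removed = set().union(*(candidates[name] for name in subset))
            if incorrect <= removed and (best is None or cost(subset) < cost(best)):
                best = subset
    return best


solution = min_cost_cover(INCORRECT, CANDIDATES)
print(solution, "cost", cost(solution))
```

With this cost model the search selects d before c and c before e, the same pair the slide reports as the minimal cost solution.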

Timing aware state encoding
– Solve only state conflicts reachable in the RT assumptions domain
– Generate automatic timing assumptions for inserted state signals => state signals can be implemented as RT logic
– State variables inserted concurrently with I/O events => latency and cycle time reduction

Value of Relative Timing
– RT circuits provide up to 2-3x (1.3-2x) delay and area reduction with respect to SI circuits synthesized without (with) concurrency reduction
– Automatic generation of timing assumptions => foundation for automatic synthesis of RT circuits with area/performance comparable to or better than manual design
– Back-annotation of timing constraints => minimal required timing information for the back-end tools
– Timing-aware state encoding allows significant area/performance optimization

Design flow without timing
Specification (STG)
  -> Reachability analysis -> State Graph
  -> State encoding -> SG with CSC
  -> Boolean minimization -> Next-state functions
  -> Logic decomposition -> Decomposed functions
  -> Technology mapping -> Gate netlist

Design Flow with Timing
Specification (STG + user assumptions)
  -> Reachability analysis -> Lazy State Graph
  -> Timing-aware state encoding -> Lazy SG with CSC
  -> Boolean minimization -> Next-state functions
  -> Logic decomposition -> Decomposed functions
  -> Technology mapping -> Gate netlist
Additional labels in the flow: Automatic Timing Assumptions, Required Timing Constraints

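FIFO example
Signals: li, lo, ro, ri
Events: li-, li+, lo+, lo-, ro+, ro-, ri+, ri-
[Figure: FIFO block and its STG]

Speed-Independent Implementation without concurrency reduction
3 state signals are required

SI implementation with concurrency reduction
State signal: x (events x+, x-)
[Figure: STG over li-, li+, lo+, lo-, ro+, ro-, ri+, ri-, x+, x- and the netlist over li, lo, ro, ri, x with a gC gate]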

RT implementation
[Figure: netlist over li, lo, ro, ri, x and two STG views over li-, li+, lo+, lo-, ro+, ro-, ri+, ri-, x+, x-, separated by the label OR]
To satisfy the constraint:
– Delay(x-) < Delay(ri+), and
– Delay(lo+) + Delay(x-) < Delay(ro+) + Delay(ri+)
All constraints are either satisfied by default or easy to satisfy by sizing
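The two delay inequalities can be checked directly once event delays are known, for example after sizing. This is a minimal sketch: the function name and all delay values are hypothetical, in arbitrary time units.

```python
def rt_constraints(delay):
    """Evaluate the two relative timing constraints of the RT FIFO."""
    return {
        "Delay(x-) < Delay(ri+)":
            delay["x-"] < delay["ri+"],
        "Delay(lo+) + Delay(x-) < Delay(ro+) + Delay(ri+)":
            delay["lo+"] + delay["x-"] < delay["ro+"] + delay["ri+"],
    }


# Hypothetical delays: x- is an internal gate, ri+ is an environment response.
sized = {"x-": 40, "ri+": 120, "lo+": 60, "ro+": 55}
undersized = {"x-": 130, "ri+": 120, "lo+": 60, "ro+": 55}

for name, delays in (("sized", sized), ("undersized", undersized)):
    results = rt_constraints(delays)
    print(name, "meets all constraints:", all(results.values()))
    for constraint, holds in results.items():
        print("  ", constraint, "->", holds)
```

Both inequalities compare one side against another rather than against an absolute bound, so they can be rechecked with the same function whenever the delay estimates change.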