Advanced Digital Design: Asynchronous EDA, by A. Steininger, J. Lechner and R. Najvirt, Vienna University of Technology.

Lecture "Advanced Digital Design" © A. Steininger & J. Lechner & R. Najvirt / TU Vienna

Overview
- Synchronous-Asynchronous Direct Translation (SADT)
- Null Convention Logic
- Syntax-Directed Compilation (Balsa)
- Martin Synthesis (Caltech Asynchronous Synthesis Tools)

Synchronous-Asynchronous Direct Translation (SADT)
- Starting point: synchronous circuit description in a standard HDL
- Synthesis with conventional tools into a synchronous gate-level netlist
- Transformation of the synchronous netlist into an asynchronous netlist
- Technology mapping
- Place and route
- Timing verification

De-synchronization
- SADT approach
- Design style: bundled data
- Substitution of flip-flops by latches
- Substitution of the clock by local asynchronous controllers
- De-synchronized circuits ...
  - never halt (liveness)
  - perform the same computations as the synchronous circuit (flow equivalence)

De-synchronization: Conversion Steps
1. Conversion of flip-flops to latches
   - Each D flip-flop is separated into master/slave latches
2. Generation of delay elements for the request signals
   - Matched to the length of the critical path of the combinational logic
3. Implementation and wiring of asynchronous latch controllers
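Step 1 can be illustrated with a small behavioural model. The following Python sketch is not taken from the lecture: the class names `Latch` and `MasterSlaveFF` are hypothetical, and it assumes a positive-edge-triggered D flip-flop whose master latch is transparent while the clock is low and whose slave latch is transparent while the clock is high.

```python
class Latch:
    """Level-sensitive latch: transparent while enable is true, else holds."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:
            self.q = d
        return self.q


class MasterSlaveFF:
    """D flip-flop modelled as a master latch followed by a slave latch."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def update(self, d, clk):
        # Master is transparent while clk is low, slave while clk is high,
        # so the output changes only on the rising clock edge.
        m = self.master.update(d, enable=not clk)
        return self.slave.update(m, enable=bool(clk))
```

In a de-synchronized circuit the two latches would be enabled by local asynchronous controllers instead of by the two phases of a shared clock.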

De-synchronization: Circuit Architecture
[Figure: synchronous circuit and the corresponding de-synchronized circuit; Cortadella et al., 06]

De-synchronization: Asynchronous Controllers
- Controller for master/slave latches
- 4-phase protocol
- Different controller implementations with more or less concurrency are possible
  - Non-overlapping
  - Semi-decoupled 4-phase
  - Fully-decoupled 4-phase
  - De-synchronization control
- More concurrency => faster pipeline
- More concurrency => larger controllers

De-synchronization: Flow Equivalence
Definition: Two circuits are flow-equivalent if
- they have the same set of latches, and
- for each latch, the sequence of stored values is the same in both circuits
[Cortadella et al., 06]
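The definition can be turned into a small check. The sketch below is only an illustration under stated assumptions: a circuit run is represented as a list of hypothetical `(time, latch, value)` events, and the helper names `stored_sequences` and `flow_equivalent` are invented for this example. It shows that only the per-latch order of stored values matters, not the absolute times or the interleaving between latches.

```python
from collections import defaultdict


def stored_sequences(events):
    """Map each latch name to the ordered list of values it stored."""
    seqs = defaultdict(list)
    for _time, latch, value in sorted(events, key=lambda e: e[0]):
        seqs[latch].append(value)
    return dict(seqs)


def flow_equivalent(events_a, events_b):
    """True if both runs use the same latches and store the same sequences."""
    seq_a = stored_sequences(events_a)
    seq_b = stored_sequences(events_b)
    return seq_a.keys() == seq_b.keys() and all(
        seq_a[latch] == seq_b[latch] for latch in seq_a
    )


# Synchronous run: both latches store on the same clock edges.
sync_run = [(0, "A", 1), (0, "B", 0), (10, "A", 2), (10, "B", 1)]
# De-synchronized run: different timing, same per-latch value sequences.
desync_run = [(0, "A", 1), (3, "A", 2), (4, "B", 0), (9, "B", 1)]
# A run in which latch A stores a different second value.
other_run = [(0, "A", 1), (3, "A", 3), (4, "B", 0), (9, "B", 1)]
```

Here `flow_equivalent(sync_run, desync_run)` holds, while `flow_equivalent(sync_run, other_run)` does not.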

De-synchronization: Pros/Cons
Advantages
- Use of standard HDLs
- Use of industrial-strength synthesis tools
- Almost no re-education necessary for hardware designers
- Simple porting of legacy designs
- Negligible area overhead compared to the synchronous implementation
Disadvantages
- 1-to-1 mapping of synchronous circuits can lead to sub-optimal designs

Click Elements
- Published as an implementation style for data-driven compilation (Haste)
- Also useful for implementing asynchronous equivalents of synchronous circuits
- Use flip-flops for storage
- Most elements are implementable with cells from a standard (synchronous) library
- Arbiter still required (not for SADT)

[Figure: Click elements]

Lecture "Advanced Digital Design" © A. Steininger & J. Lechner / TU Vienna

Null Convention Logic: Synthesis
- RTL synthesis
  - Transform VHDL/Verilog to a 3NCL netlist
  - Netlist contains just AND and INV gates
  - Off-the-shelf synthesis tools
  - NULL values are treated as "don't care"
  - Logic optimizations
- Dual-rail expansion
  - 3NCL netlist to 2NCL netlist
  - DIMS implementation of AND and INV gates
  - Produces a delay-insensitive circuit
  - Logic optimizations

Dual-Rail NAND
[Figure: DIMS implementation; Ligthart et al., 2000]
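Since the figure is not reproduced here, a behavioural sketch may help. It is not the circuit from the slide, only the standard DIMS construction under stated assumptions: a dual-rail value is a hypothetical `(true_rail, false_rail)` pair with NULL = (0, 0), one C-element per input minterm detects a complete input codeword, and the minterm outputs are OR-ed onto the matching output rail. The class names are invented for this example.

```python
NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # (true_rail, false_rail)


class CElement:
    """Two-input Muller C-element: follows equal inputs, else holds."""

    def __init__(self):
        self.out = 0

    def update(self, a, b):
        if a == b:
            self.out = a
        return self.out


class DimsNand:
    """Dual-rail NAND with one C-element per input minterm (DIMS)."""

    def __init__(self):
        self.minterm = {(x, y): CElement() for x in (0, 1) for y in (0, 1)}

    def update(self, a, b):
        def rail(value, bit):
            # Select the rail that is high when `value` encodes `bit`.
            return value[0] if bit else value[1]

        m = {
            key: c.update(rail(a, key[0]), rail(b, key[1]))
            for key, c in self.minterm.items()
        }
        # NAND is 1 for inputs 00, 01, 10 and 0 for input 11.
        out_true = m[(0, 0)] | m[(0, 1)] | m[(1, 0)]
        out_false = m[(1, 1)]
        return (out_true, out_false)
```

The output becomes valid only after both inputs are valid, and returns to NULL only after both inputs have returned to NULL, which is the property that makes the construction delay-insensitive.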

Null Convention Logic: Technology Mapping
- DIMS implementation is inefficient
- Technology mapping onto threshold gates
- Circuit functionality is fully described by the set function of the DIMS implementation
- DIMS smoothing: derive a Boolean network representing the set function
- Threshold gates have a specific set function
- Perform logic optimization and map the Boolean network to the available threshold gates

Dual-Rail NAND
[Figure: DIMS implementation and its set function; Ligthart et al., 2000]

Null Convention Logic: Threshold Gates
- Library of threshold gates by Theseus
- All unate functions with up to 4 inputs
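As a rough illustration of what such a gate does, the sketch below models an unweighted m-of-n threshold gate with hysteresis. It assumes the usual NCL behaviour: the output is set when at least m inputs are 1, reset only when all inputs are 0, and held otherwise. The class name `ThresholdGate` is hypothetical, and weighted inputs are not modelled.

```python
class ThresholdGate:
    """Unweighted m-of-n threshold gate with hysteresis (THmn)."""

    def __init__(self, threshold, n_inputs):
        if not 1 <= threshold <= n_inputs:
            raise ValueError("threshold must be between 1 and n_inputs")
        self.threshold = threshold
        self.n_inputs = n_inputs
        self.out = 0

    def update(self, *inputs):
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        ones = sum(inputs)
        if ones >= self.threshold:
            self.out = 1          # set function satisfied
        elif ones == 0:
            self.out = 0          # all inputs back at NULL
        return self.out           # otherwise hold the previous value
```

With `threshold == n_inputs == 2` the model behaves like a two-input C-element.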

Syntax-Directed Compilation
- 1-to-1 mapping of language constructs to handshake circuit components
- Uses a library of highly optimized standard-cell components for simpler physical synthesis and verification
- Allows an experienced designer to easily envision the resulting circuit, but limits the optimization potential

Balsa Handshake Circuits
- Approx. 40 handshake components
- Connected over channels
  - Data path associated
  - Pure control channels (no data transferred)
- Active ports initiate communication
- Passive ports respond to requests
- Push channel
  - Data flows from the active to the passive port
- Pull channel
  - Data flows from the passive to the active port

Example: Handshake Components
- Fetch (→)
  - Transfers data upon request
- Case
  - Conditional control-flow element
Source: [Balsa Manual]

Example: Modulo-10 Counter

    import [balsa.types.basic]

    type C_size is nibble
    constant max_count = 9

    procedure count10(sync aclk; output count: C_size) is
      variable count_reg : C_size
      variable tmp : C_size
    begin
      loop
        sync aclk ;
        if count_reg /= max_count then
          tmp := (count_reg + 1 as C_size)
        else
          tmp := 0
        end || count <- count_reg ;
        count_reg := tmp
      end -- loop
    end -- begin
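For readers unfamiliar with Balsa, the behaviour of the procedure can be mimicked in Python. This is only a behavioural sketch, not a translation of the handshake circuit: each `next()` call stands in for one `sync aclk` handshake, the yielded value stands in for the output on channel `count`, and the initial register value of 0 is an assumption, since the Balsa code does not initialise `count_reg`.

```python
from itertools import islice

MAX_COUNT = 9


def count10(initial=0):
    """Yield one counter value per activation, wrapping after MAX_COUNT."""
    count_reg = initial
    while True:
        # Next value is computed alongside the output, as in the Balsa code,
        # where the if statement and the output run in parallel (||).
        tmp = count_reg + 1 if count_reg != MAX_COUNT else 0
        yield count_reg
        count_reg = tmp


first_twelve = list(islice(count10(), 12))
```

The list `first_twelve` contains the values 0 to 9 followed by 0 and 1.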

Example: Modulo-10 Counter
[Figure; source: Balsa Manual]

Martin Synthesis
- The so-called Martin synthesis process is the seminal work of the asynchronous design group around A. J. Martin at Caltech
- Design entry is CHP (Communicating Hardware Processes); the result is a production rule set (PRS)
- Performs several transformations with designer-modifiable intermediate steps

Communicating Hardware Processes (CHP)
Main constructs:
- Simple assignment: v := true or v := false
- Selection: [G1 -> S1 [] G2 -> S2]; [G] is [G -> skip]
- Repetition: *[G1 -> S1 [] G2 -> S2]; *[S] is *[true -> S]
- Sequencing and concurrent execution: S1; S2 and S1, S2
- Communication: C (synchronization), C!x (transmission), C?x (reception), #C (probe)

Process Decomposition
- First transformation
- Reduces processes with complex control structures to simple concurrent subprocesses
- Either syntax-directed (SDD) or data-driven (DDD)

Syntax-Directed Decomposition
Rule: A process P containing a construct S can be replaced by two processes P1 and P2 and a new channel C. P1 is P with S replaced by the communication C, and P2 has the form *[[#C -> S; C]].
E.g.
    P:  *[A; *[B1 -> S1 [] B2 -> S2]; B]
    P1: *[A; C; B]
    P2: *[[#C & B1 -> S1
         [] #C & B2 -> S2
         [] #C & ~B1 & ~B2 -> C]]

Data-Driven Decomposition
- More fine-grained than SDD
- At the end, clustering can be performed to merge subprocesses again for better performance
- First transformation: conversion to dynamic single assignment (DSA) form
  - Each variable can be written only once in each main-loop iteration, e.g.:
        *[A?a; X!a; B?a; Y!a]
    becomes
        *[A?a1; X!a1; B?a2; Y!a2]
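The renaming in the DSA example can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the algorithm from the cited paper: the loop body is a flat list of hypothetical `(channel, op, variable)` triples, a receive (`?`) is the only kind of write, and every write gets a fresh numbered name even if the variable is written only once.

```python
def to_dsa(body):
    """Rename variables so that each name is written once per iteration."""
    version = {}
    result = []
    for channel, op, var in body:
        if op == "?":
            # A receive writes the variable: start a new version of it.
            version[var] = version.get(var, 0) + 1
        # Reads use the most recent version; unwritten names stay unchanged.
        name = f"{var}{version[var]}" if var in version else var
        result.append((channel, op, name))
    return result


def render(body):
    """Format a list of communication actions as a CHP loop."""
    return "*[" + "; ".join(f"{c}{op}{v}" for c, op, v in body) + "]"


original = [("A", "?", "a"), ("X", "!", "a"), ("B", "?", "a"), ("Y", "!", "a")]
dsa_form = render(to_dsa(original))
```

For the loop from the slide, `dsa_form` is the string `*[A?a1; X!a1; B?a2; Y!a2]`.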

Data-Driven Decomposition (2)
- Second transformation is projection
- First, transformations to allow projection, e.g. variable duplication and channel addition:
      *[A?a; x := a, y := ~a; X!x, Y!y]
      *[A?a; a1 := a, a2 := a; x := a1, y := ~a2; X!x, Y!y]
      *[A?a; {Ax!a, Ax?a1}, {Ay!a, Ay?a2}; x := a1, y := ~a2; X!x, Y!y]
- Then projection onto some sets of assignments
      Sets: {A?, a, Ax!, Ay!} {Ax?, a1, x, X!} {Ay?, a2, y, Y!}
      Projection: *[A?a; Ax!a, Ay!a], *[Ax?a1; x := a1; X!x], *[Ay?a2; y := ~a2; Y!y]
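The projection step of this example can be reproduced with a short Python sketch. The representation is an assumption made for illustration: the program is a list of sequential steps, each step is a list of concurrent actions, and each action carries a hypothetical key (a channel with its direction, or the assigned variable) that decides which set it belongs to. Projecting onto a set keeps only the actions whose key is in that set and drops steps that become empty.

```python
# Program after variable duplication and channel addition, as on the slide.
PROGRAM = [
    [("A?", "A?a")],
    [("Ax!", "Ax!a"), ("Ax?", "Ax?a1"), ("Ay!", "Ay!a"), ("Ay?", "Ay?a2")],
    [("x", "x := a1"), ("y", "y := ~a2")],
    [("X!", "X!x"), ("Y!", "Y!y")],
]

SETS = [
    {"A?", "a", "Ax!", "Ay!"},
    {"Ax?", "a1", "x", "X!"},
    {"Ay?", "a2", "y", "Y!"},
]


def project(program, names):
    """Keep only the actions whose key is in `names`; drop empty steps."""
    steps = []
    for step in program:
        kept = [text for key, text in step if key in names]
        if kept:
            steps.append(", ".join(kept))
    return "*[" + "; ".join(steps) + "]"


processes = [project(PROGRAM, names) for names in SETS]
```

The three strings in `processes` match the projection listed above.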

Handshake Expansion (HSE)
- Each communication channel is replaced by handshake signals, e.g.:
      *[…; C; …], *[#C -> …; C]
  is transformed to (4-phase handshake)
      *[…; r := 1; [a]; r := 0; [~a]; …], *[r -> …; a := 1; [~r]; a := 0]
- Reshuffling can then be used to increase concurrency/performance (different handshake controllers)
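The resulting 4-phase handshake can be played through with a small cooperative simulation. The sketch below is an assumption-laden illustration, not part of the synthesis flow: each process is a Python generator that performs its assignments and yields the condition it waits for, the hypothetical scheduler `run` resumes any process whose wait condition holds, and only a single handshake cycle is executed instead of the infinite loop.

```python
def active(sig, trace):
    """Active side: r := 1; [a]; r := 0; [~a]."""
    sig["r"] = 1
    trace.append("r+")
    yield lambda: sig["a"] == 1
    sig["r"] = 0
    trace.append("r-")
    yield lambda: sig["a"] == 0


def passive(sig, trace):
    """Passive side: [r]; a := 1; [~r]; a := 0."""
    yield lambda: sig["r"] == 1
    sig["a"] = 1
    trace.append("a+")
    yield lambda: sig["r"] == 0
    sig["a"] = 0
    trace.append("a-")


def run(processes):
    """Resume every process whose wait condition holds until all finish."""
    waiting = {}
    for proc in processes:
        try:
            waiting[proc] = next(proc)
        except StopIteration:
            pass
    while waiting:
        progressed = False
        for proc, condition in list(waiting.items()):
            if condition():
                progressed = True
                try:
                    waiting[proc] = next(proc)
                except StopIteration:
                    del waiting[proc]
        if not progressed:
            raise RuntimeError("deadlock: no wait condition holds")


signals = {"r": 0, "a": 0}
trace = []
run([active(signals, trace), passive(signals, trace)])
```

After the run, `trace` lists the four phases in order (`r+`, `a+`, `r-`, `a-`) and both signals are back at 0.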

Production Rule Expansion (PRE)
- Transforms HSE to PR in three steps:
  - State variable insertion
  - PR generation
  - Symmetrisation
- Sequencing must be implemented explicitly

HSE:
    *[[Lr]; Rr := 1; [Ra]; Rr := 0; [~Ra]; La := 1; [~Lr]; La := 0]
Production rules:
    Lr  -> Rr+
    Ra  -> Rr-
    ~Ra -> La+
    ~Lr -> La-

Production Rule Expansion (PRE), continued

HSE with state variable x inserted:
    *[[Lr]; Rr := 1; [Ra]; x := 1; [x]; Rr := 0; [~Ra]; La := 1; [~Lr]; x := 0; [~x]; La := 0]
Production rules:
    ~x & Lr -> Rr+
    Ra      -> x+
    x       -> Rr-
    x & ~Ra -> La+
    ~Lr     -> x-
    ~x      -> La-

Production Rule Expansion (PRE), continued

Production rules after symmetrisation:
    ~x & Lr  -> Rr+
    Ra       -> x+
    ~Lr | x  -> Rr-
    x & ~Ra  -> La+
    ~Lr      -> x-
    Ra | ~x  -> La-
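To check that the final rule set still implements the intended sequencing, the rules can be executed against a simple environment. The Python sketch below is an illustration under stated assumptions: all signals start at 0, the left environment is modelled as a 4-phase active party (`~La -> Lr+`, `La -> Lr-`), the right environment as a passive party (`Rr -> Ra+`, `~Rr -> Ra-`), and the hypothetical `simulate` function fires the first enabled rule in each step. For this example exactly one rule is enabled in every step, so that choice does not affect the result.

```python
# (guard, signal, new value) for the symmetrised production rules.
RULES = [
    (lambda s: not s["x"] and s["Lr"], "Rr", 1),
    (lambda s: s["Ra"], "x", 1),
    (lambda s: not s["Lr"] or s["x"], "Rr", 0),
    (lambda s: s["x"] and not s["Ra"], "La", 1),
    (lambda s: not s["Lr"], "x", 0),
    (lambda s: s["Ra"] or not s["x"], "La", 0),
]

# Assumed environment: active on the left channel, passive on the right.
ENVIRONMENT = [
    (lambda s: not s["La"], "Lr", 1),
    (lambda s: s["La"], "Lr", 0),
    (lambda s: s["Rr"], "Ra", 1),
    (lambda s: not s["Rr"], "Ra", 0),
]


def simulate(rules, steps):
    """Fire one enabled rule per step and record the transitions."""
    state = dict.fromkeys(["Lr", "Ra", "Rr", "x", "La"], 0)
    trace = []
    for _ in range(steps):
        enabled = [
            (signal, value)
            for guard, signal, value in rules
            if guard(state) and state[signal] != value
        ]
        if not enabled:
            break
        signal, value = enabled[0]
        state[signal] = value
        trace.append(signal + ("+" if value else "-"))
    return trace


one_cycle = simulate(RULES + ENVIRONMENT, 10)
```

The ten transitions in `one_cycle` follow the order prescribed by the HSE with the state variable: `Lr+`, `Rr+`, `Ra+`, `x+`, `Rr-`, `Ra-`, `La+`, `Lr-`, `x-`, `La-`.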

Summary
- Synchronous-Asynchronous Direct Translation
  - Synthesis with standard tools
  - Synchronous-asynchronous transformation
- Martin Synthesis
  - Process decomposition
  - Handshake expansion
  - Production rule expansion

References
- Jordi Cortadella, Alex Kondratyev, Luciano Lavagno, Christos P. Sotiriou. Desynchronization: Synthesis of Asynchronous Circuits From Synchronous Specifications
- Alain J. Martin. Programming in VLSI: From Communicating Processes to Self-timed VLSI Circuits
- Catherine G. Wong and Alain J. Martin. High-Level Synthesis of Asynchronous Systems by Data-Driven Decomposition
- Ad Peeters, Frank te Beest, Mark de Wit, Willem Mallon. Click Elements – An Implementation Style for Data-Driven Compilation. 2010