Logic design of asynchronous circuits
Part III: Advanced topics on synthesis
ASPDAC / VLSI Tutorial on Logic Design of Asynchronous Circuits

Outline
Logic decomposition
–Hazard-free decomposition
–Signal insertion
–Technology mapping
Optimization based on timing information
–Relative timing
–Timing assumptions and constraints
Other synthesis paradigms
–HDLs, CSP, burst-mode, ...
Design flow
Specification (STG) -> Reachability analysis -> State Graph
State Graph -> State encoding -> SG with CSC
SG with CSC -> Boolean minimization -> Next-state functions
Next-state functions -> Logic decomposition -> Decomposed functions
Decomposed functions -> Technology mapping -> Gate netlist
No Hazards
[Figure: circuit with signals a, b, c, x]
Decomposition May Lead to Hazards
[Figure: decomposed circuit with signals a, b, c, x and an additional signal z]
Decomposition
Acknowledgement
Global acknowledgement
Generating candidates
Hazard-free signal insertion
–Event insertion
–Signal insertion
Global acknowledgement
[Figure: circuit with signals a, b, c, d, y, z and its STG; event labels: d- b+ d+ y+ a- y- c+ d- and c- d+ z- b- z+ c+ a+ c-]
How about 2-input gates?
[Figures: five animation steps of the same circuit and STG]
Strategy for logic decomposition
Each decomposition defines a new internal signal
Method: Insert new internal signals such that
–After resynthesis, some large gates are decomposed
–The new specification is hazard-free
Generate candidates for decomposition using standard logic factorization techniques:
–Algebraic factorization
–Boolean factorization (Boolean relations)
Decomposition example
[Figures: STG and state graph over signals x, y, z, w with C-element implementations; the state graph is partitioned by the value of yz, and a new signal s (transitions s+, s-) is inserted, partitioning the states by the value of s]
z- is delayed by the new transition s-!
[Figure: decomposition loop. Candidates produced by decomposition (algebraic, Boolean relations) are checked for hazard-freedom by event insertion (YES/NO); the loop repeats until no more progress.]
Signal insertion for function F
[Figure: state graph partitioned into regions F=0 and F=1; transitions F+ and F- are inserted by input borders]
Event insertion
[Figure: state graph with events a, b, c and region ER(x); after insertion of event x, region SR(x)]
Properties to preserve
[Figure: state-graph fragments with events a and b, before and after inserting x]
a is persistent
a is disabled by b = hazards
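The persistency property can be checked mechanically on a state graph. The following is a minimal sketch, assuming the graph is given as a dictionary that maps each state to its outgoing events; the function name, the state names and the two example graphs are hypothetical and not taken from the slides.

```python
def non_persistent(sg):
    """Return (state, disabled_event, disabling_event) triples.

    sg maps a state to a dict {event: next_state}. Event `a` is
    disabled by `b` in state s if both are enabled in s but `a`
    is no longer enabled after firing `b`.
    """
    violations = []
    for s, out in sg.items():
        for b, t in out.items():
            for a in out:
                if a != b and a not in sg.get(t, {}):
                    violations.append((s, a, b))
    return violations

# Hypothetical diamond: a and b are concurrent, so both stay enabled.
diamond = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"b": "s3"},
    "s2": {"a": "s3"},
    "s3": {},
}

# Hypothetical conflict: firing b removes a (a potential hazard on a).
conflict = {
    "s0": {"a": "s1", "b": "s2"},
    "s1": {"b": "s3"},
    "s2": {},
    "s3": {},
}

print(non_persistent(diamond))
print(non_persistent(conflict))
```

An insertion candidate would be rejected if a check of this kind reports a disabled event that was persistent before the insertion.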
Boolean decomposition
[Figure: block F with inputs x_1, …, x_n and output f; block H with inputs x_1, …, x_n and outputs h_1, …, h_m feeding block G with output f]
f = F(x_1, …, x_n)
f = G(H(x_1, …, x_n))
Our problem: Given F and G, find H
[Figure: C-element with inputs h_1, h_2 and output f]

state   f   next(f)   (h_1,h_2)
s       0   0         (0,-) (-,0)
s       0   1         (1,1)
s       1   0         (0,0)
s       1   1         (-,1) (1,-)
dc      -   -         (-,-)

This is a Boolean Relation
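The rows of this relation follow from the behaviour of a C-element: the output rises when both inputs are 1, falls when both are 0, and holds otherwise. A small sketch that enumerates the admissible (h1, h2) pairs for each combination of current and next value of f; the function names are illustrative only.

```python
from itertools import product

def c_element(h1, h2, f):
    """Next output of a C-element with inputs h1, h2 and current output f."""
    return h1 if h1 == h2 else f

def allowed_inputs(f, next_f):
    """All (h1, h2) pairs that take the output from f to next_f."""
    return sorted((h1, h2) for h1, h2 in product((0, 1), repeat=2)
                  if c_element(h1, h2, f) == next_f)

relation = {(f, n): allowed_inputs(f, n) for f, n in product((0, 1), repeat=2)}
for key, pairs in relation.items():
    print(key, pairs)
```

Each state of the specification admits several input pairs rather than a single one, which is why H is described by a relation and not by a function.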
[Figures: STG with events y-, a+, c-, d-, a-, c+, a+, y+, a-, c-, d+, c+ and four candidate implementations of y from a, c, d, using elements labelled F, Rs (inputs R, S) and D]
Technology mapping
Merging small gates into larger gates introduces no new hazards
Standard synchronous techniques can be applied, e.g. BDD-based Boolean matching
Handles sequential gates and combinational feedbacks
Due to hazards there is no guarantee of finding a correct mapping (some gates cannot be decomposed)
Timing-aware decomposition can be applied in these rare cases
Design flow
Specification (STG) -> Reachability analysis -> State Graph
State Graph -> State encoding -> SG with CSC
SG with CSC -> Boolean minimization -> Next-state functions
Next-state functions -> Logic decomposition -> Decomposed functions
Decomposed functions -> Technology mapping -> Gate netlist
Timing assumptions in design flow
Speed-independent: wire delays after a fork smaller than fan-out gate delays
Burst-mode: circuit stabilizes between two changes at the inputs
Timed circuits: Absolute bounds on gate / environment delays are known a priori (before physical design)
Relative Timing Circuits
Assumptions: “a before b”
–for concurrent events: reduces reachable state space
–for ordered events: permits early enabling
–both increase don’t care space for logic synthesis => simplify logic (better area and timing)
“Assume - if useful - guarantee” approach: assumptions are used by the tool to derive a circuit and required timing constraints that must be met in physical design flow
Applied to design of the Rotating Asynchronous Pentium Processor(TM) Instruction Decoder (K. Stevens, S. Rotem et al., Intel Corporation)
Relative Timing Asynchronous Circuits
[Figure: speed-independent C-element with inputs a, b and output c]
Timing assumption (on environment): a- before b-
[Figure: RT C-element with inputs a, b and output c]
RT C-element: faster, smaller; correct only under timing constraint: a- before b-
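The effect of the assumption "a- before b-" on the state space can be seen by enumerating the reachable states of a C-element together with its environment. The sketch below assumes a simple environment in which inputs rise only while c = 0 and fall only while c = 1; that environment model and all names are assumptions made for illustration, and the logic of the RT gate itself is not reproduced here.

```python
def c_next(a, b, c):
    """Speed-independent C-element: c' = ab + c(a + b)."""
    return (a & b) | (c & (a | b))

def reachable(assume_a_before_b):
    """Explore states (a, b, c) of the C-element plus assumed environment."""
    start = (0, 0, 0)
    seen, todo = {start}, [start]
    while todo:
        a, b, c = todo.pop()
        succ = []
        if c == 0:
            if a == 0:
                succ.append((1, b, c))          # a+
            if b == 0:
                succ.append((a, 1, c))          # b+
        else:
            if a == 1:
                succ.append((0, b, c))          # a-
            # Under the assumption, b- may fire only after a-.
            if b == 1 and (a == 0 or not assume_a_before_b):
                succ.append((a, 0, c))          # b-
        if c_next(a, b, c) != c:
            succ.append((a, b, 1 - c))          # c+ or c-
        for s in succ:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return seen

si_states = reachable(False)
rt_states = reachable(True)
print(sorted(si_states - rt_states))
```

Under these assumptions the state a=1, b=0, c=1 becomes unreachable, so it can be treated as a don't care when the gate is resynthesized.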
State Graph (Read cycle)
[Figure: state graph of the read cycle with events DSr+, DSr-, DTACK+, DTACK-, LDS+, LDS-, LDTACK+, LDTACK-, D+, D-]
Lazy Transition Systems
[Figure: state graph with regions ER(LDS+), ER(LDS-) and FR(LDS-), and events LDS+, LDS-, DTACK-]
Event LDS- is lazy: its firing region FR(LDS-) is a subset of its enabling region ER(LDS-)
Timing assumptions
(a before b) for concurrent events: concurrency reduction for firing and enabling
(a before b) for ordered events: early enabling
(a simultaneous to b wrt c) for triples of events: combination of the above
Speed-independent Netlist
[Figure: STG of the read cycle (events LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+) and netlist with signals DTACK, D, DSr, LDS, LDTACK, csc, map]
Adding timing assumptions (I)
[Figures: the same STG and netlist with the assumption LDTACK- before DSr+; paths are marked FAST and SLOW]
State space domain
[Figures: state graph with concurrent events LDTACK- and DSr+ under the assumption LDTACK- before DSr+]
Two more unreachable states
Boolean domain
[Figures: two Karnaugh maps over DTACK, DSr, D, LDTACK, one per value of LDS]
One more DC vector for all signals
One state conflict is removed
Netlist with one constraint
[Figures: STG of the read cycle and simplified netlist with signals DTACK, D, DSr, LDS, LDTACK]
TIMING CONSTRAINT: LDTACK- before DSr+
Timing assumptions
(a before b) for concurrent events: concurrency reduction for firing and enabling
(a before b) for ordered events: early enabling
(a simultaneous to b wrt c) for triples of events: combination of the above
Ordered events: early enabling
[Figure: STG fragments with events a, b, c and gates labelled F and G]
Logic for gate c may change
Adding timing assumptions (II)
[Figure: STG of the read cycle and netlist with signals DTACK, D, DSr, LDS, LDTACK]
D- before LDS-
State space domain
[Figure: state graph with events LDS-, D-, DSr- under the assumption D- before LDS-; the potential enabling for LDS- is marked]
Reachable space is unchanged
For LDS- enabling can be changed in one state
Boolean domain
[Figures: two Karnaugh maps over DTACK, DSr, D, LDTACK, one per value of LDS]
One more DC vector for one signal: LDS
If used: LDS = DSr, otherwise: LDS = DSr + D
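The two covers for LDS differ only where DSr = 0 and D = 1, so the simpler one is usable only if every such vector is a don't care for LDS. A small sketch that makes this explicit; it projects onto the two variables DSr and D and ignores the remaining signals of the state vector, which is a simplification, and the function names are illustrative.

```python
from itertools import product

def lds_without_dc(dsr, d):
    """Cover when the extra don't care is not used: LDS = DSr + D."""
    return dsr | d

def lds_with_dc(dsr, d):
    """Cover when the extra don't care is used: LDS = DSr."""
    return dsr

# Input vectors (DSr, D) on which the two covers disagree.
differing = [(dsr, d) for dsr, d in product((0, 1), repeat=2)
             if lds_without_dc(dsr, d) != lds_with_dc(dsr, d)]
print(differing)  # [(0, 1)]
```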
Before early enabling
[Figure: STG of the read cycle and netlist with signals DTACK, D, DSr, LDS, LDTACK]
Netlist with two constraints
[Figure: STG of the read cycle and simplified netlist with signals DTACK, D, DSr, LDS, LDTACK]
TIMING CONSTRAINTS: LDTACK- before DSr+ and D- before LDS-
Both timing assumptions are used for optimization and become constraints
Value of Relative Timing
RT circuits provide up to 2-3x delay and area reduction with respect to SI circuits synthesized without concurrency reduction, and 1.3-2x with respect to SI circuits synthesized with it
Automatic generation of timing assumptions => foundation for automatic synthesis of RT circuits with area/performance comparable to or better than manual designs
Back-annotation of timing constraints => minimal required timing information for the back-end tools
Timing-aware state encoding allows significant area/performance optimization
Design Flow with Timing
Specification (STG + user assumptions) -> Reachability analysis -> Lazy State Graph
Lazy State Graph -> Timing-aware state encoding -> Lazy SG with CSC
Lazy SG with CSC -> Boolean minimization -> Next-state functions
Next-state functions -> Logic decomposition -> Decomposed functions
Decomposed functions -> Technology mapping -> Gate netlist
Side annotations: Automatic Timing Assumptions, Required Timing Constraints
FIFO example
[Figure: FIFO block with ports li, lo, ro, ri and its STG with events li-, li+, lo+, lo-, ro+, ro-, ri+, ri-]
Speed-Independent Implementation without concurrency reduction
3 state signals are required
SI implementation with concurrency reduction
[Figure: circuit with signals li, lo, ro, ri and state signal x, using a gC element; STG with events li-, li+, lo+, lo-, ro+, ro-, ri+, ri-, x+, x-]
RT implementation
[Figures: circuit with signals li, lo, ro, ri and state signal x, using an OR gate; two STGs with events li-, li+, lo+, lo-, ro+, ro-, ri+, ri-, x+, x-]
To satisfy the constraint:
Delay(x-) < Delay(ri+)
and
Delay(lo+) + Delay(x-) < Delay(ro+) + Delay(ri+)
All constraints are either satisfied by default or easy to satisfy by sizing
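Once delays are known, the two inequalities can be checked directly. A minimal sketch, assuming the delays are available as a dictionary keyed by event name; the numbers are hypothetical and chosen only to illustrate a passing and a failing case.

```python
def rt_constraints_met(delay):
    """Check both timing constraints of the RT FIFO implementation."""
    c1 = delay["x-"] < delay["ri+"]
    c2 = delay["lo+"] + delay["x-"] < delay["ro+"] + delay["ri+"]
    return c1 and c2

# Hypothetical delays in picoseconds (not taken from the tutorial).
nominal = {"x-": 40, "ri+": 120, "lo+": 60, "ro+": 55}
slow_x = dict(nominal, **{"x-": 150})   # x- slower than the environment

print(rt_constraints_met(nominal))
print(rt_constraints_met(slow_x))
```

A failing check points at the gates that need resizing, which matches the remark that the constraints are easy to satisfy by sizing.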
Other synthesis paradigms: outline
Synthesis from HDL (Verilog) [Lavagno et al, Async00]
–Subset for asynchronous specification
–Data-path/control partitioning
–Circuit architecture. Control generation
Synthesis from asynchronous HDL (CSP, Tangram)
–CSP for control generation [A. Martin et al, Caltech]
–Tangram for silicon compilation [K. van Berkel et al, Philips]
Control synthesis using FSMs [K. Yun, S. Nowick]
–Burst-mode machines
–Comparison with STGs
Motivation
Language-based design is a key enabler of the success of synchronous logic
Use HDL as single language for
–specification
–logic simulation and debugging
–synthesis
–post-layout simulation
HDL must support multiple levels of abstraction
Control-data partitioning
Splitting of asynchronous control and synchronous data path
Automated insertion of bundling delays
[Figure: CONTROL UNIT and DATA PATH blocks with request and acknowledge signals and a delay element]
Design flow
[Figure: flow with boxes HDL specification, Control/data splitting, STG (control), Synthesizable HDL (data), Synthesis (petrify), Synthesis (Synopsys), Timing analysis (Synopsys), HDL implementation, Logic implementation, Logic delays, Delay insertion]
Asynchronous Verilog subset by example

always
begin
  wait(start);          // wait until start is high
  R = SMP * 3;
  RES = SMP * 4 + R;
  if (RES[7] == 1)
    RES = 0;
  else
  begin
    if (RES[6] == 1)
      RES = 1;
  end
  done = 1;
  wait(!start);         // wait until start is low
  done = 0;
end

[Figure: block diagram with SMP, R, RES, start, done and the control unit (C.U.)]
begin-end for sequencing, fork-join for concurrency, if-else for input choice
Only structured mix of sequencing, concurrency and choice can be specified
Synthesis from asynchronous HDL
CSP based languages
CSP = communicating sequential processes [Hoare]
Two synthesis techniques
–based on program transformations [Caltech]
–based on direct compilation [Philips]
Tools are more mature than for asynchronous synthesis from standard HDL
Complete shift in design methodology is required
Using CSP for control generation
After li goes high do full handshake at the right, then complete handshake at the left and iterate.
[Figure: Q element with signals li, lo, ro, ri]
STG: li+ ro+ ri+ ro- ri- lo+ li- lo-
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
“;” = sequencing operator
ro+ = ro goes high; ro- = ro goes low
[li] = wait until li is high; [not li] = wait until li is low
Using CSP for control generation
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
Production rules:
li -> ro+; ri -> ro-
not ri -> lo+; not li -> lo-
Conflict: ro+ and ro- are not mutually exclusive (since ri+ and li+ are not)
Eliminate conflict by state signal insertion (= CSC)
[Figure: circuit with signals ri, li, ro; one element is labelled weak]
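The conflict can be located by replaying the intended event order and evaluating both guards for ro after each event. A short sketch under the assumption that all signals start at 0; the helper names are illustrative.

```python
def ro_guards(state):
    """Guards of the rules li -> ro+ and ri -> ro-."""
    return {"ro+": state["li"] == 1, "ro-": state["ri"] == 1}

state = dict(li=0, ro=0, ri=0, lo=0)   # assumed initial state
conflicts = []
for event in ["li+", "ro+", "ri+"]:    # prefix of the CSP sequence
    signal, sign = event[:-1], event[-1]
    state[signal] = 1 if sign == "+" else 0
    guards = ro_guards(state)
    if guards["ro+"] and guards["ro-"]:
        conflicts.append((event, dict(state)))

print(conflicts)
```

Right after ri+ both li and ri are high, so the pull-up and pull-down rules for ro are enabled together; this is the interference that the inserted state signal removes.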
Conflict elimination
CSP: *[[li];ro+;[ri];x+;[x];ro-;[not ri];lo+;[not li];x-;[not x];lo-]
Production rules:
not x and li -> ro+; x or not li -> ro-
x and not ri -> lo+; not x or ri -> lo-
ri -> x+; not li -> x-
[Figure: circuit with signals li, lo, ri, ro and a flip-flop (FF) producing x and not x]
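The production rules can be exercised with a small event-driven simulation. The sketch below adds an assumed four-phase environment on both sides (li rises when lo is low and falls when lo is high; ri follows ro) and an all-zero initial state; neither is stated on the slide, and the function names are illustrative.

```python
def enabled(s):
    """Events whose guard holds and whose firing would change a signal."""
    li, lo, ri, ro, x = (s[k] for k in ("li", "lo", "ri", "ro", "x"))
    rules = [
        # Production rules from the slide.
        ("ro+", (not x) and li), ("ro-", x or not li),
        ("lo+", x and not ri),   ("lo-", (not x) or ri),
        ("x+", ri),              ("x-", not li),
        # Assumed four-phase environment.
        ("li+", not lo), ("li-", lo),
        ("ri+", ro),     ("ri-", not ro),
    ]
    return [ev for ev, guard in rules
            if guard and s[ev[:-1]] != (ev[-1] == "+")]

def simulate(steps):
    """Fire the first enabled event repeatedly and record the trace."""
    s = dict(li=0, lo=0, ri=0, ro=0, x=0)
    trace = []
    for _ in range(steps):
        ev = enabled(s)[0]
        s[ev[:-1]] = 1 if ev[-1] == "+" else 0
        trace.append(ev)
    return trace

trace = simulate(10)
print(" ".join(trace))
```

With this environment exactly one event is enabled at every step, and one cycle reproduces the order given by the CSP program: li+ ro+ ri+ x+ ro- ri- lo+ li- x- lo-.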
Buffer example in Tangram

(a?byte & b!byte)
begin
  x0: var byte
| forever do
    a?x0 ; b!x0
  od
end

[Figure: handshake circuit of the buffer with components *, ;, T, T and variable x between ports a and b; passive and active ports, the data path and a Q element are marked]
Each circle mapped to a netlist
Summary
Tangram program is partitioned into data path and control
Data path is implemented as dual or single rail
Control is mapped to composition of standard elements (“;” “||” etc)
Each standard element is mapped to a circuit
Post-optimization is done
Composing islands of control elements and re-synthesis with STG can give more aggressive optimization
Philips made a few chips using Tangram, including a product: 8051 micro-controller in low-power pager Muna (25 weeks battery life from one AAA battery)
Similar approach used in Balsa (Manchester Univ.)
Burst mode FSM
[Figure: FSM with states s1, s2, s3, s4 and arcs labelled b-/x-, a+b+/y+, a-/x+y-, c+/y-, c-/y+]
Close to synchronous FSMs with binary encoded I/O
Work in bursts:
–Input transitions fire
–Output transitions fire
–State signals change
Mostly limited to fundamental mode: next input burst cannot arrive before stabilization at the outputs
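The burst discipline can be sketched as a small interpreter: outputs change only once every transition of the expected input burst has arrived. The table below reuses three arc labels from the slide, but which states the arcs connect is an assumption, since the figure's topology is not recoverable from the text; the sketch also allows only one outgoing arc per state, which is a simplification.

```python
class BurstModeMachine:
    """Minimal burst-mode interpreter (one outgoing arc per state)."""

    def __init__(self, table, start):
        self.table = table      # state -> (input burst, output burst, next state)
        self.state = start
        self.pending = set()    # input events seen since the last output burst

    def input_event(self, event):
        """Record one input transition; return the output burst if complete."""
        in_burst, out_burst, nxt = self.table[self.state]
        if event not in in_burst:
            raise ValueError(f"unexpected input {event} in state {self.state}")
        self.pending.add(event)
        if self.pending != in_burst:
            return []           # burst incomplete: outputs stay stable
        self.pending = set()
        self.state = nxt
        return sorted(out_burst)

# Hypothetical arc endpoints; labels a+b+/y+, a-/x+y-, b-/x- are from the slide.
table = {
    "s1": ({"a+", "b+"}, {"y+"}, "s2"),
    "s2": ({"a-"}, {"x+", "y-"}, "s3"),
    "s3": ({"b-"}, {"x-"}, "s1"),
}
machine = BurstModeMachine(table, "s1")
outputs = [machine.input_event(e) for e in ["b+", "a+", "a-", "b-"]]
print(outputs)
```

The first input (b+) produces no output because the burst a+b+ is still incomplete; the inputs of a burst may arrive in any order.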
Extended Burst mode
[Figure: FSM with states s1, s2, s3, s4 and arcs labelled b-/x-, a+b*/y+, a-/x+y-, c+/y-, c-/y+]
Directed don’t cares (b*): some concurrency is allowed for input transitions that do not influence an output burst
Conditional guards = “if b=1 then …”
Synthesis of XBM
Next state and output functions free of functional and logic hazards
Sequential feedbacks should not introduce new hazards
State assignment
–one state of the BM spec to one layer of Karnaugh map
–compatible layers are merged
–layers are compatible if merging does not introduce CSC violations or hazards
–layers are encoded using race-free encoding
XBM and STG
[Figure: XBM machine (states s1, s2, s3, s4; arcs b-/x-, a+b*/y+, a-/x+y-, c+/y-, c-/y+) next to the corresponding STG with events x-, a+, y+, b+, eps, c-, a-, c+, y-, y+, x+, y-, b-]