Presentation is loading. Please wait.

Presentation is loading. Please wait.

STG-based synthesis and Petrify J. Cortadella (Univ. Politècnica Catalunya) Mike Kishinevsky (Intel Corporation) Alex Kondratyev (University of Aizu) Luciano.

Similar presentations


Presentation on theme: "STG-based synthesis and Petrify J. Cortadella (Univ. Politècnica Catalunya) Mike Kishinevsky (Intel Corporation) Alex Kondratyev (University of Aizu) Luciano."— Presentation transcript:

1 STG-based synthesis and Petrify J. Cortadella (Univ. Politècnica Catalunya) Mike Kishinevsky (Intel Corporation) Alex Kondratyev (University of Aizu) Luciano Lavagno (Universitá di Udine) Enric Pastor (Univ. Politècnica Catalunya) Alexander Taubin (University of Aizu) Alex Yakovlev (Univ. Newcastle upon Tyne)

2 What is it about? u This tutorial is about the synthesis of asynchronous circuits from behavioral specifications. u STGs can specify I/O concurrency (based on Petri nets). u STGs specify behavior at a level in which logic synthesis techniques can be applied. u Speed-independent and timed circuits can be derived.

3 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

4 Outline u Overview u Synthesis steps –Specification (STGs) –State encoding –Logic synthesis, decomposition and mapping u Synthesis with relative timing u Conclusions

5 x y z x+ x- y+ y- z+ z- Signal Transition Graph (STG) x y z

6 x y z x+ x- y+ y- z+ z-

7 x+ x- y+ y- z+ z- xyz 000 x+ 100 y+ z+ y+ 101 110 111 x- 001 011 y+ z- 010 y-

8 xyz 000 x+ 100 y+ z+ y+ 101 110 111 x- 001 011 y+ z- 010 y- Next-state functions

9 x z y

10 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

11 VME bus Device LDS LDTACK D DSr DSw DTACK VME Bus Controller Data Transceiver Bus DSr LDS LDTACK D DTACK Read Cycle

12 STG for the READ cycle LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDS LDTACK D DSr DTACK VME Bus Controller

13 Choice: Read and Write cycles DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- LDS- LDTACK-DTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- LDS- LDTACK-DTACK-

14 Choice: Read and Write cycles DTACK- DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- LDS- LDTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- LDS- LDTACK-DTACK-

15 Choice: Read and Write cycles DTACK- DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- LDS- LDTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- LDS- LDTACK-DTACK-

16 Choice: Read and Write cycles DTACK- DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- LDS- LDTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- LDS- LDTACK-DTACK-

17 Circuit synthesis u Goal: –Derive a hazard-free circuit under a given delay model and mode of operation

18 Modes of operation Current state Next state u Fundamental mode –Single-input changes –Multiple-input changes u Input / Output mode –Concurrency circuit / environment

19 Speed independence u Delay model –Unbounded gate / environment delays –Certain wire delays shorter than certain paths in the circuit u Conditions for implementability: –Consistency –Complete State Coding –Output persistency

20 Other synthesis approaches u Burst-mode machines –Mealy-like FSMs –Fundamental mode (slow environment) u VLSI programming –Syntax-directed translation from CSP (“Communicating Sequential Processes”) –No logic synthesis –Circuit size ~ Size of the specification

21 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

22 STG for the READ cycle LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDS LDTACK D DSr DTACK VME Bus Controller

23 State Graph (Read cycle) DSr+ DTACK- LDS- LDTACK- D- DSr-DTACK+ D+ LDTACK+ LDS+

24 Binary encoding of signals DSr+ DTACK- LDS- LDTACK- D- DSr-DTACK+ D+ LDTACK+ LDS+

25 Binary encoding of signals DSr+ DTACK- LDS- LDTACK- D- DSr-DTACK+ D+ LDTACK+ LDS+ 10000 10010 10110 0111001110 01100 00110 10110 (DSr, DTACK, LDTACK, LDS, D)

26 QR (LDS+) QR (LDS-) Excitation / Quiescent Regions ER (LDS+) ER (LDS-) LDS- LDS+ LDS-

27 Next-state function 0  1 LDS- LDS+ LDS- 1  0 0  0 1  1 10110

28 Karnaugh map for LDS DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 000000/1? 1 111 - - - --- ---- - ---- ---

29 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

30 Concurrency reduction LDS- LDS+ LDS- 10110 DSr+

31 Concurrency reduction LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+

32 State encoding conflicts LDS- LDTACK- LDTACK+ LDS+ 10110

33 Signal Insertion LDS- LDTACK- D- DSr- LDTACK+ LDS+ CSC- CSC+ 101101 101100

34 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

35 Complex-gate implementation

36 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

37 Hazards a b c x 0 abcx 1000 1 0 0 1100 b+ 1 1 0 0100 a- 0 1 0 0110 c+ 0 1 1

38 Hazards abcx 1000 1100 b+ 0100 a- 0110 c+ a b z c x 1 0 0 0 0 1000 1 1 0 0 0 1100 1 1 1 0 0 0 1 1 0 0 0100 0 1 1 1 0 0110 0 1 1 1 1 0 1 0 1 1 0 1 0 1 0

39 Decomposition u Global acknowledgement u Generating candidates u Hazard-free signal insertion –Event insertion –Signal insertion

40 Global acknowledgement a b c z a b d y d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

41 a b c z a b d y How about 2-input gates ? d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

42 a b c z a b d y How about 2-input gates ? d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

43 a b c z a b d y How about 2-input gates ? 0 0 d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

44 a b c z a b d y How about 2-input gates ? d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

45 c z d y How about 2-input gates ? a b d-b+d+y+a-y-c+d- c-d+z-b-z+c+a+c-

46 Strategy for logic decomposition u Each decomposition defines a new internal signal u Method: Insert new internal signals such that –After resynthesis, some large gates are decomposed –The new specification is hazard-free u Generation of candidates for decomposition: –Algebraic factorization –Boolean factorization (boolean relations)

47 y- z-w- y+x+ z+ x- w+ 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ Decomposition example

48 yz=1 yz=0 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ C C x y x y w z x y z y z w z w z y

49 s- s+ s- s=1 s=0 10011011 1000 1010 0111 0011 y+ x- w+ z+ z- 0001 00000101 00100100 0110 x+ w- z- y+ x+ 1001 1000 1010 y+ z- 0111 C C x y x y w z x y z w z w z y s y-

50 s- s+ s- s=1 s=0 10011011 1000 1010 0111 0011 y+ x- w+ z+ z- 0001 00000101 00100100 0110 x+ w- z- y+ x+ 1001 1000 1010 y+ z- 0111 y- z-w- y+x+ z+ x- w+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+

51 C C x y x y w z x y z y z w z w z y yz=1 yz=0 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+

52 s- s+ s=1 s=0 1001 1011 0111 0011 x- w+ z+ 0001 00000101 00100100 0110 x+ w- z- y+ x+ 1001 1000 1010 y+ z- 0111 y- z-w- y+x+ z+ x- w+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+ s- s+ z- is delayed by the new transition s- !

53 yz=1 yz=0 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ C C x y x y w z x y z w z w z yyyyyyy 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+

54 Signal insertion for function F State Graph F=0F=1 Insertion by input borders F- F+

55 Event insertion a b ER(x) c x x x x b SR(x) a

56 Properties to preserve a a b b a a b b a a b b x a a b b a a b b b a a b b x x a is persistent a is disabled by b = hazards

57 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Designflow

58 Timing assumptions in design flow u Speed-independent: wire delays after a fork shorter than fan-out gate delays u Burst-mode: circuit stabilizes between two changes at the inputs u Timed circuits: Absolute bounds on gate / environment delays are known a priori (before physical design)

59 Relative Timing Circuits u Assumptions: “a before b” for concurrent and ordered events u Used by the tool to derive a circuit and timing constraints that must be met in physical design flow u Applied to design of the Rotating Asynchronous Pentium Processor(TM) Instruction Decoder (K.Stevens, S.Rotem et al. Intel Corporation)

60 Lazy Transition Systems ER (LDS+) EnR (LDS-) LDS- LDS+ LDS- DTACK- FR (LDS-) Event LDS- is lazy: firing = subset of enabling

61 Timing assumptions u (a before b) for concurrent events: concurrency reduction for firing and enabling u (a before b) f or ordered events: early enabling u (a simultaneous to b wrt c) for triples of events: combination of the above

62 Netlist with SI timing constraints LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map

63 Adding timing assumptions (I) LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map LDTACK- before DSr+ FAST SLOW

64 Adding timing assumptions (I) DTACK D DSr LDS LDTACK csc map LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDTACK- before DSr+

65 State space domain LDTACK- before DSr+ LDTACK- DSr+

66 State space domain LDTACK- before DSr+ LDTACK- DSr+

67 State space domain LDTACK- before DSr+ LDTACK- DSr+ Two more unreachable states

68 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 000000/1? 1 111 - - - --- ---- - ---- ---

69 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 00-001 1 111 - - - --- ---- - ---- --- One more DC vector for all signalsOne state conflict is removed

70 Netlist with one constraint LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map

71 Netlist with one constraint LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK LDTACK- before DSr+ TIMING CONSTRAINT

72 Timing assumptions u (a before b) for concurrent events: concurrency reduction for firing and enabling u (a before b) f or ordered events: early enabling u (a simultaneous to b wrt c) for triples of events: combination of the above

73 Ordered events: early enabling a c b a a c b a b b c c F G Logic for gate c may change

74 Adding timing assumptions (II) LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK D- before LDS-

75 State space domain LDS- D- Reachable space is unchanged For LDS- enabling can be changed in one state D- before LDS- Potential enabling for LDS- DSr-

76 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 00-001 1 111 - - - --- ---- - ---- ---

77 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 00-001 1 11 - - - - --- ---- - ---- --- One more DC vector for one signal: LDS If used: LDS = DSr, otherwise: LDS = DSr + D

78 Before early enabling LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK

79 Netlist with two constraints LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDTACK- before DSr+ and D- before LDS- TIMING CONSTRAINTS DTACK D DSr LDS LDTACK Both timing assumptions are used for optimization and become constraints

80 Backannotation u Timed circuits require post-verification u Can synthesis tools help ? –Report the least stringent set of timing assumptions required for the correctness of the circuit –Not all initial timing assumptions may be required u Petrify reports a set of firing order constraints that guarantee the circuit correctness

81 Experiments u Assumption: delays are controllable in physical design u 2-3x improvement in area/delay wrt to SI (K.Stevens, S.Rotem et al. Intel Corporation) –Rotating Asynchronous Pentium Processor(TM) –Instruction Decoder (Async’99)

82 Summary u Synthesis of asynchronous circuits can be automated at gate level (logic synthesis) u Timing assumptions/constraints are essential to compete with synchronous circuits u Relative timing seems to be a promising approach for specification and synthesis u High-level and logic synthesis can be combined (e.g. CSP  Petri net  circuit)

83 Petrify u The synthesis methodology presented in this tutorial is handled by petrify u but also... –Concurrency reduction –Automatic handshake expansion (2-4 phase) –Noise isolation –Synthesis with gC elements and gate libraries –Synthesis of Petri nets –Synthesis of Petri nets (crucial for backannotation) –...

84 Petrify: implementation details u 50,000 lines of code + SIS (data structures & logic synthesis) + BDD package (symbolic manipulation) + dot (graph visualization package from AT&T) u BDD-based implementation: –Reachability analysis –Manipulation of sets of states –Boolean minimization

85 Petrify http://www.lsi.upc.es/~jordic/petrify References Tutorial for the designer Binaries (for several platforms)


Download ppt "STG-based synthesis and Petrify J. Cortadella (Univ. Politècnica Catalunya) Mike Kishinevsky (Intel Corporation) Alex Kondratyev (University of Aizu) Luciano."

Similar presentations


Ads by Google