Presentation is loading. Please wait.

Presentation is loading. Please wait.

Synthesis of Speed Independent Circuits Based on Decomposition

Similar presentations


Presentation on theme: "Synthesis of Speed Independent Circuits Based on Decomposition"— Presentation transcript:

1 Synthesis of Speed Independent Circuits Based on Decomposition
Tomohiro Yoneda National Institute of Informatics Tokyo Institute of Technology Hiroomi Onda Tokyo Institute of Technology Chris Myers University of Utah

2 Background High-level synthesis Major approach to high-level synthesis
plays an important role to push Async. design to wide use Major approach to high-level synthesis Prepare basic cells that correspond to specification language constructs Translate specifications to basic cell networks syntax-directedly with local optimizations Very efficient Global optimization may be difficult 2004/4/21 Async2004

3 Challenge Our approach to high-level synthesis
Translate high-level spec to low-level spec (time Petri nets) Use timed logic synthesis technique Global optimization can be possible by logic optimization timing information Cost for synthesis is very high 2004/4/21 Async2004

4 How to reduce the cost Translation technique to low-level spec
guarantees that low-level spec has CSC by adding state variables sufficiently Idea: [Yoneda,Myers 2003] Developing Balsa Compiler Efficient logic synthesis technique decomposes low-level spec w.r.t. each output synthesizes each sub-circuit from each sub-spec Goal of this work In this paper, speed independent circuit synthesis is discussed 2004/4/21 Async2004

5 Decomposition based synthesis
Input STG 1 safe output semi-modular with CSC (Complete State Coding) several more restrictions Output Reduced STG for each output g-C or atomic-gate implementation is synthesizable Feature Only state graphs for reduced STGs are necessary It is not necessary to explore the reachable states of the original STGs 2004/4/21 Async2004

6 Key issue - input set determination
ack1 req1 csc gC req2 csc req1 gC ack1 req1 ack1 csc ack2 gC csc req1 ack1 2004/4/21 Async2004

7 Key issue - input set determination
ack1 req1 csc gC req2 csc Reduction req1 gC ack1 req1 ack1 csc ack2 gC csc req1 ack1 2004/4/21 Async2004

8 Related works Synthesizing each output separately
T.A. Chu, Synthesis of Self-Timed VLSI Circuits from Graph-theoretic Specification, PhD thesis, MIT,1987 No idea for input set determination R. Puri, J. Gu, A Modular Partitioning Approach for Asynchronous Circuit Synthesis, IEEE TCAD, 1995 Input set determination is performed based on the state graph of the original STG Input signals are kept, if hiding them does not increase the number of CSC conflicts W. Vogler, R. Wollowski, Decomposition in Asynchronous Circuit Design, Tech Report, Univ. Augsburg, 2002 STG reduction technique - net contraction - is formalized No general idea for input set determination 2004/4/21 Async2004

9 Our approach Step 1: Select possible trigger signals as the initial input set Step 2: Contract the original STG by deleting signals except for the output and those in the current input set Step 3: If the reduced STG has CSC, done Step 4: Otherwise, choose appropriate signals and add them to the input set Step 5: Goto Step 2 2004/4/21 Async2004

10 Contraction Original STG Reduced STG Possible trigger signals
bisimilar translation (i.e., by W. Vogler, R. Wollowski) 2004/4/21 Async2004

11 Issues to be discussed If the reduced STG has CSC, is a correct speed independent circuit synthesized from it? How can appropriate signals be chosen without the state graph of the original STG? How is the overhead (performance degradation of the synthesized circuit)? 2004/4/21 Async2004

12 Issues to be discussed If the reduced STG has CSC, is a correct speed independent circuit synthesized from it? How can appropriate signals be chosen without the state graph of the original STG? How is the overhead (performance degradation of the synthesized circuit)? 2004/4/21 Async2004

13 An example (a b c x) b+ ES(x+) 0100 0000 c+ a+/1 0110 110R a+/1 c+ x+
1101 x+ c+ 1111 a-/1 a+/2 1000 c- 011F 0000 x- 1010 c- a+/2 0110 0010 b- ES(x-) 2004/4/21 Async2004

14 CD(S): Extended set of S by deleting signals
If a is deleted (a b c x) CD(ES(x+)) b+ 0100 0000 c+ a+/1 0110 110R a+/1 c+ x+ 111R a-/2 1101 x+ c+ 1111 1000 a-/1 a+/2 c- 011F 0000 x- 1010 c- a+/2 0110 0010 b- CD(ES(x-)) CD(S): Extended set of S by deleting signals 2004/4/21 Async2004

15 Irrelevant input set A set D of signals is an irrelevant input set for an output x, if D  In  Out – {x} CD(ES(x+)) – UR = ES(x+) CD(ES(x–)) – UR = ES(x–) In: Input signal set of the original STG Out: Output signal set of the original STG UR: Unreachable state set of the original STG 2004/4/21 Async2004

16 {a} is not an irrelevant
If a is deleted (a b c x) CD(ES(x+)) b+ 0100 0000 c+ a+/1 0110 110R a+/1 c+ x+ 111R a-/2 1101 x+ CD(ES(x+)) – UR  ES(x+) CD(ES(x–)) – UR  ES(x–) c+ 1111 a-/1 a+/2 1000 c- 011F 0000 {a} is not an irrelevant input set x- 1010 c- a+/2 0110 0010 b- CD(ES(x-)) If a non-irrelevant input set is deleted, the reduced STG has no CSC 2004/4/21 Async2004

17 If c is deleted (a b c x) CD(ES(x+)) b+ 0100 0000 c+ a+/1 0110 110R
1101 x+ CD(ES(x+)) – UR = ES(x+) CD(ES(x–)) – UR = ES(x–) c+ 1111 1000 a-/1 a+/2 c- 0101 011F 0000 x- 1010 {c} is an irrelevant input set c- a+/2 0110 0010 b- CD(ES(x-)) If an irrelevant input set (including no possible trigger signals) is deleted, a correct circuit is obtained from the reduced STG 2004/4/21 Async2004

18 Theorem 1 For an STG G that has CSC and is output semi-modular, if a reduced STG G' obtained from G by deleting some signal set V (including no possible trigger signals) has CSC, then a correct circuit is obtained from G' If V is not an irrelevant input set, G' must not have CSC V must be an irrelevant input set A correct circuit is obtained from G' 2004/4/21 Async2004

19 Issues to be discussed If the reduced STG has CSC, is a correct speed independent circuit synthesized from it? How can appropriate signals be chosen without the state graph of the original STG? How is the overhead (performance degradation of the synthesized circuit)? 2004/4/21 Async2004

20 Contraction with initial input set
Original STG Reduced STG Possible trigger signals contraction 2004/4/21 Async2004

21 Checking CSC Constructing state graph of the reduced STG Reduced STG
00 a+/1 1R 10 CSC conflict 1R x+ 11 a-/2 a-/1 0F x- 00 a+/2 10 2004/4/21 Async2004

22 Guided Simulation abstracted trace original trace
State graph of the original STG 00 b+ 0100 0000 a+/1 c+ a+/1 1R 0110 110R noninterface transition a+/1 x+ c+ x+ 111R 11 1101 x+ a-/1 c+ interface transition 1111 0F a+/2 1000 a-/1 x- c- 011F 0000 00 x- 1010 c- a+/2 This can be obtained by simulating the original STG not requiring the state graph of the original STG a+/2 0110 0010 b- 10 2004/4/21 Async2004

23 Generating original trace
Original STG noninterface transitions t1 t2 abstracted trace: a+ b+ interface transitions b+ t3 original trace: t2 t3 a+ b+ a+ 2004/4/21 Async2004

24 Analysis of original trace
b+ 0100 0000 c+ noninterface signal 0110 a+/1 111R interface signal x+ 1111 a+/2 1000 a-/1 011F 0000 x- c- 0110 0010 b- Find a noninterface signal that certainly changes odd times here 2004/4/21 Async2004

25 Analysis of original trace
b+ 0100 0000 c+ noninterface signal 0110 a+/1 111R interface signal x+ 1111 Add b to the input set 1000 a-/1 a+/2 011F 0000 x- c- Resolve this CSC conflict 0110 0010 b- 2004/4/21 Async2004

26 Analysis of original trace
concurrent b+ 0100 0000 c+ noninterface signal 0110 a+/1 111R interface signal x+ 1111 But, c does not actually 1000 a-/1 a+/2 011F 0000 c also seems to satisfy this condition x- c- 0110 0010 b- Select a noninterface signal that certainly changes odd times here 2004/4/21 Async2004

27 Formalization original trace (init) w is odd-confined by f1 : we1
w changes odd times in f1 if w changes in f0, then we1 e1 e1  ws2 we2  e2 last w f0 e1 CSC conf. ws2 first w "" represents causality relation obtained from structure of STG f1 we2 last w e2 interface signal CSC conf. 2004/4/21 Async2004

28 Analysis of original trace
b+ 0100 0000 c+ noninterface signal 0110 a+/1 111R interface signal x+ 1111 a+/2 1000 a-/1 011F 0000 x- c- f1 0110 0010 b- b is odd-confined by f1 c is not odd-confined by f1 2004/4/21 Async2004

29 Theorem 2 f0 If w(and ui)satisfies the following
condition, adding w(and ui) resolves the CSC conflict in f1           (sufficient condition) w is odd-confined by f1 w does not changes in l If w changes before the first interface signal, for each odd i ui is odd-confined by hi with causality relation shown in the figure 110R w+ 1 110R 1 110R w- f1 w+ 1 110R interface transition h1 1101 1 For one CSC conflict, there exist many candidate sets of signals          ↓ Analyze every CSC conflict Set up the covering problem and solve it h3 u+ 1100 1 interface transition l 1001 1 2004/4/21 Async2004

30 Drawback For an STG with conflicting transitions, backtracking may be needed Finding actually fired noninterface transitions is no longer deterministic due to deleting conflicting transitions If many conflicting transitions exist, backtracking sometimes costs a lot Approaches that seem practical are to Keep all conflicting transitions even if they are not related to backtracking, or Manually specify some of necessary conflicting transitions Our compiler from a high-level language can automatically specify those conflicting transitions 2004/4/21 Async2004

31 Issues to be discussed If the reduced STG has CSC, is a correct speed independent circuit synthesized from it? How can appropriate signals be chosen without the state graph of the original STG? How is the overhead (performance degradation of the synthesized circuit)? 2004/4/21 Async2004

32 Experimental results Experiments Benchmarks
Implementation of the proposed method in C Pentium 2.8GHz, 4GB memory Final logic synthesis tool: petrify -gc -eqn Benchmarks Instruction cache controller of TITAC2 generated from high-level spec by our compiler large, but simple → input signal sets are small compiler decisions are used for specifying conflicting transitions Controllers of various filters manually designed medium, but complicated → input signal sets are large all conflicting transitions are kept Async Benchmarks small and simple 2004/4/21 Async2004

33 Experimental results CPU times, Memory usage
Benchmark1: significantly reduced Benchmark2: reduced Quality of synthesized circuits Area (num. of transistors): almost no overhead 2004/4/21 Async2004

34 Experimental results For Async Benchmarks
Quality : no overhead (exactly the same area) Cost : advantageous only for largest specs CPU times (sec) Proposed Petrify 2004/4/21 Async2004

35 Conclusion New algorithm to find input signal sets for decomposition based synthesis method state graph of the original STG is not necessary can handle larger circuits Logic synthesis tool : NUTAS Linux binary is downloadable from Future works extend the algorithms to support timed circuit synthesis finish the compiler development for high-level synthesis and integrate both 2004/4/21 Async2004


Download ppt "Synthesis of Speed Independent Circuits Based on Decomposition"

Similar presentations


Ads by Google