1 Generalized Buffering of PTL Logic Stages using Boolean Division and Don’t Cares
  Rajesh Garg, Sunil P. Khatri
  Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
2 Outline
  Introduction
  Objective
  Previous Work
  CODCs and ACODCs
  Generalized Buffering With CODCs
  Results
  Conclusions
3 Introduction
  Pass Transistor Logic (PTL) typically used for specific circuit implementations, like barrel shifters
  No widely accepted PTL design methodology
  There exists a direct mapping between an ROBDD node and a PTL mux
  [Figure: an ROBDD node on variable v with its two cofactors, the equivalent MUX, and the PTL based MUX]
  Hence ROBDDs can be used to perform PTL based synthesis for general circuits
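A minimal Python sketch of the ROBDD-node-to-mux mapping stated above. The example function f = ab + c and the variable order a, b, c are assumptions chosen for illustration; they do not come from the slides. Each ROBDD node on a variable v becomes one 2:1 mux that selects between the two cofactors of the function with respect to v.

```python
from itertools import product

def mux(v, f_v, f_vbar):
    """2:1 multiplexer: passes f_v when v is 1, otherwise f_vbar."""
    return f_v if v else f_vbar

def f(a, b, c):
    """Reference function (hypothetical example): f = a*b + c."""
    return (a & b) | c

def f_as_mux_tree(a, b, c):
    """Same function built only from muxes, one mux per ROBDD node."""
    node_c = mux(c, 1, 0)          # node on c: terminals 1 and 0
    node_b = mux(b, 1, node_c)     # node on b: cofactor b=1 is 1, b=0 is node_c
    return mux(a, node_b, node_c)  # root on a: cofactors f(a=1) and f(a=0)

# The mux tree and the reference function agree on every input.
equivalent = all(f(*x) == f_as_mux_tree(*x) for x in product((0, 1), repeat=3))
```

The tree uses three muxes in total and at most three in series, which is the quantity the series-device limit on the next slide constrains.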
4 Introduction (contd)
  Problems with direct mapping
    Body Effect
      Cannot connect more than 4-5 devices in series
    Monolithic ROBDDs
      Worst-case exponential size in the number of inputs (large)
      Memory explosion can occur during ROBDD construction
  Partitioned ROBDDs
    Avoids memory explosion of monolithic ROBDDs
    Output of each PTL structure needs to be buffered
    Regenerate electrical drive capability after 4 or 5 levels using a pair of inverters (avoid body effect problems)
5 Objective
  New PTL Synthesis Approach
    Use partitioned ROBDDs
      Avoid memory explosion
      Guarantee no more than 4-5 series devices
    Use generalized buffering
      Buffers can be complex logic gates in general (not simple inverters/buffers)
    Use ACODCs or CODCs to improve the extraction of generalized buffers
      Simplifies the logic function of the PTL block
  Boolean Division based formulation
    Elegant, powerful formulation to extract generalized buffers
    Augmented with CODC / ACODCs
  Reduces total circuit delay and area
6 Previous Work
  Buch et al., “Logic synthesis for large pass transistor circuits”, in Proceedings, IEEE/ACM ICCAD, Nov 1997, pp. 663-670.
  Lai et al., “BDD decomposition for mixed CMOS/PTL logic circuit synthesis”, in Proceedings, IEEE ISCAS, May 2005, pp. 5649-5652.
  Yamashita et al., “Pass-transistor/CMOS collaborated logic: The best of both worlds”, in Digest of Technical Papers, Symposium on VLSI Circuits, June 1997, pp. 31-32.
  Garg et al., “Generalized buffering of PTL logic stages using Boolean division”, in Proceedings, IEEE ISCAS, May 2006.
7 CODCs
  Observability Don’t Cares (ODCs)
    ODCs used to minimize the logic function of a node
    Need to re-compute the ODCs of the other nodes after optimization
  Compatible Observability Don’t Cares (CODCs)
    Can simultaneously change the function of all nodes
    Subset of ODCs
    full_simplify is used to compute CODCs in SIS
    ROBDD based computation to compute CODCs
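A small Python sketch of why plain ODCs must be re-computed after a node changes, and why a compatible subset is useful. The two-node network below (output = y1 AND y2, with y1 = x1 and y2 = x2) is a hypothetical example, not taken from the slides. The ODC of a node is computed by brute force as the set of primary-input points where flipping that node leaves the output unchanged.

```python
from itertools import product

def output(y1, y2):
    """Primary output of the toy network: z = y1 AND y2."""
    return y1 & y2

def node_y1(x1, x2):
    return x1

def node_y2(x1, x2):
    return x2

POINTS = list(product((0, 1), repeat=2))  # assignments to (x1, x2)

# ODC of a node: input points where the output does not observe that node.
odc_y1 = {x for x in POINTS if output(0, node_y2(*x)) == output(1, node_y2(*x))}
odc_y2 = {x for x in POINTS if output(node_y1(*x), 0) == output(node_y1(*x), 1)}

# At (0, 0) each node lies in its own ODC, so either may be flipped alone...
x = (0, 0)
original = output(node_y1(*x), node_y2(*x))
flip_one = output(1 - node_y1(*x), node_y2(*x))
# ...but flipping both at once changes the output: the two ODCs are not compatible.
flip_both = output(1 - node_y1(*x), 1 - node_y2(*x))
```

Here `flip_one` equals `original` while `flip_both` does not, which is the situation CODCs rule out by giving each node a subset of its ODC that stays valid when all nodes are changed together.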
8 CODCs (contd.)
  Memory intensive
    Computation is possible only for small and medium sized circuits
  Approximate CODCs (ACODCs) by Saluja et al.
    30X faster than full CODCs
    Requires 30X less memory than full CODCs
    Literal count reduction is about 80% of that obtained by full CODCs
    Can compute ACODCs for arbitrarily large circuits
9 PTL with Generalized Buffering
  [Figure: PTL network with generalized buffering; labels: Primary Inputs, Primary Output]
10 Boolean Division
  Definition 1: g is a Boolean divisor of f if h and r exist such that f = gh + r, where gh ≠ Ø
  Definition 2: g is a Boolean factor of f if g is a Boolean divisor of f and, in addition, r = Ø, i.e. f = gh
  Theorem: If fg ≠ Ø, then g is a Boolean divisor of f.
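The definitions and the theorem above can be checked on small functions by representing each function as its set of satisfying assignments. The sketch below uses the witness h = f, r = f·g', which always satisfies f = gh + r and has gh ≠ Ø whenever fg ≠ Ø. The example functions and helper names are assumptions made for illustration.

```python
from itertools import product

POINTS = list(product((0, 1), repeat=3))  # assignments to (a, b, c)

def onset(fn):
    """Set of input points where fn evaluates to 1."""
    return frozenset(p for p in POINTS if fn(*p))

def divide(f, g):
    """Return one (h, r) with f = g*h + r and g*h non-empty, or None if f*g is empty."""
    if not (f & g):
        return None
    h = f          # quotient witness
    r = f - g      # remainder: the part of f outside g, i.e. f * g'
    return h, r

def is_factor(f, g):
    """g is a Boolean factor of f exactly when the remainder can be empty (f inside g)."""
    return bool(f & g) and f <= g

f = onset(lambda a, b, c: (a & b) | (a & c))   # f = ab + ac
g1 = onset(lambda a, b, c: b | c)              # g1 = b + c
g2 = onset(lambda a, b, c: b)                  # g2 = b

h1, r1 = divide(f, g1)   # r1 is empty, so g1 is a factor: f = a(b + c)
h2, r2 = divide(f, g2)   # r2 = a b' c, so g2 is a divisor but not a factor
```

The witness h = f is only one valid quotient; for g1 the simpler quotient h = a also works, and finding such simpler quotients is what the ROBDD-based division on the following slides is for.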
11 ROBDD Division
  Consider a (partitioned) ROBDD f of a node n in the network
  Let d represent the CODCs of node n
  Consider a library gate g ≡ G
  Division of f by g can be represented by the following equations:
    [equation not recovered from the slide]   // Upper bound of f
    [equation not recovered from the slide]   // Lower bound of f
  Therefore, [equation not recovered from the slide] (quotient, remainder)
  Finally, [equation not recovered from the slide]
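The bound equations on this slide did not survive as text. As a hedged illustration only, the sketch below assumes the standard don't-care interval: with don't cares d, any function between the lower bound f·d' and the upper bound f + d can replace f. The example f and d are hypothetical.

```python
from itertools import product

POINTS = list(product((0, 1), repeat=2))  # assignments to (a, b)

def onset(fn):
    """Set of input points where fn evaluates to 1."""
    return frozenset(p for p in POINTS if fn(*p))

f = onset(lambda a, b: a & b)          # f = a*b
d = onset(lambda a, b: (1 - a) & b)    # don't care when a = 0, b = 1

lower = f - d   # assumed lower bound: f * d'
upper = f | d   # assumed upper bound: f + d

def is_valid_replacement(candidate):
    """True if candidate agrees with f everywhere outside the don't-care set."""
    return lower <= candidate <= upper

just_b = onset(lambda a, b: b)   # one-variable candidate
just_a = onset(lambda a, b: a)   # differs from f at a care point (1, 0)
```

Under these assumptions `just_b` is a valid replacement for f while `just_a` is not, which illustrates how don't cares can simplify the logic function of a PTL block.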
12 Generalized Buffering with CODCs
  Synthesize partitioned PTL blocks with
    Maximum depth of 5
    No more than 5 transistors in series
  Optimize and decompose network
    Using only 2-input gates and inverters
    PTL structure will grow in a predictable manner
    Initially any ROBDD can have a maximum of 8 variables
  If division fails, we can make one of the fanins a ROBDD variable -- back-track
13 Algorithm: Generalized Buffering with CODCs
  A = dfs_and_levelize_nodes(η*)
  i = 1
  while i ≤ size(A) do
    n = array_fetch(A, i)
    f = ntbdd_node_to_bdd(n)   // creates ROBDD of node n
    if bdd_depth(f) ≥ 5 then
      for g ≡ G in Gate Library do
        d = compute_dc(n)
        f = test_division(f, d, g, G)
      end for
      if bdd_depth(f) > 5 then
        back-track
      else
        bdd_create_variable(n)
        continue
      end if
    else
      continue
    end if
  end while
14 test_division with CODCs
  test_division(f, d, g, G) {
    if fg ≠ 0 then
      Z = bdd_between(L, U)
      Z* = bdd_smooth(Z, gvars)
      R = bdd_compose(Z*, G, g)
      if f ⊆ R ⊆ f + d && bdd_depth(Z*) < bdd_depth(f) then
        return (success, Z*)
      end if
    end if
    return fail
  }
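A brute-force Python sketch of the between / smooth / compose sequence in test_division, using truth-table enumeration instead of a BDD package. Several things here are assumptions, not the authors' definitions: the bounds L = f·d'·(G ≡ g) and U = f + d + (G ⊕ g), the choice Z = L as the member of the interval, the acceptance interval f·d' ⊆ R ⊆ f + d, the use of support size as a stand-in for bdd_depth, and the example f = (a + b)·c with library gate g = a + b.

```python
from itertools import product

XVARS = ("a", "b", "c")     # inputs of the PTL block
GVARS = ("a", "b")          # support of the library gate g
ALL = XVARS + ("G",)        # G is the new variable standing for g's output

def assignments(names):
    for bits in product((0, 1), repeat=len(names)):
        yield dict(zip(names, bits))

def f(e): return (e["a"] | e["b"]) & e["c"]   # hypothetical block function
def g(e): return e["a"] | e["b"]              # hypothetical OR2 gate
def d(e): return 0                            # no don't cares in this example

def lower(e):   # assumed L = f * d' * (G xnor g)
    return f(e) & (1 - d(e)) & (1 - (e["G"] ^ g(e)))

def upper(e):   # assumed U = f + d + (G xor g)
    return f(e) | d(e) | (e["G"] ^ g(e))

def smooth(fn, names):
    """Existentially quantify the variables in names out of fn."""
    return lambda e: int(any(fn({**e, **s}) for s in assignments(names)))

def compose_G(fn):
    """Substitute G := g(x) into fn."""
    return lambda e: fn({**e, "G": g(e)})

def support(fn, names):
    """Variables that fn actually depends on."""
    return {v for v in names
            if any(fn({**e, v: 0}) != fn({**e, v: 1}) for e in assignments(names))}

def test_division_sketch():
    if not any(f(e) & g(e) for e in assignments(XVARS)):
        return False, None                      # fg is empty: g is not a divisor
    assert all(lower(e) <= upper(e) for e in assignments(ALL))
    z_star = smooth(lower, GVARS)               # Z taken as L, then gvars smoothed
    r = compose_G(z_star)                       # R = Z*(G := g)
    in_bounds = all((f(e) & (1 - d(e))) <= r(e) <= (f(e) | d(e))
                    for e in assignments(XVARS))
    smaller = len(support(z_star, ALL)) < len(support(f, XVARS))
    return (in_bounds and smaller), z_star

success, z_star = test_division_sketch()
```

For this example the division succeeds and Z* depends only on c and G (it equals G·c), so the PTL block shrinks from three variables to two and the OR2 gate acts as the generalized buffer.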
15 back-track
  [Figure: network fragment with signals a, b, c, d and nodes n-2, n-1, n. Node n needs back-track: make ‘c’ a variable and re-process.]
16 Experimental Setup
  Implemented in SIS
  Process Technology: 100nm BPTM
  Gate Library
    AND2, AND3, AND4
    OR2, OR3, OR4
  A set of benchmark circuits was synthesized
  Compared with traditional method
    Inverters for buffering
    Similar to method reported by Buch et al.
  Also compared with generalized buffering without don’t cares
17 Gate Delay and Area
  Gate     Delay (ps)   Area (μm²)
  MUX      18           0.08
  INV      10.26        0.08
  Buffer   20.5         0.16
  AND2     30.20        0.28
  AND3     37.76        0.44
  AND4     47.39        0.64
  OR2      38.70        0.36
  OR3      46.08        0.68
  OR4      68.28        1.12
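18 Delay
  Generalized buffering columns are normalized to the traditional buffering delay.
  A "?" marks a value that could not be recovered from the slide text; in that row the surviving values are listed left to right.
           Traditional      Generalized Buffering
  Ckt      Buffering (ps)   Without CODCs   With ACODCs   With CODCs
  alu2     1927.26          0.53            0.55          0.54
  alu4     3458.88          0.45            0.43          -
  apex6    819.72           0.73            0.75          ?
  C432     2302.38          0.86            0.79          0.76
  C880     1248.8           0.70            0.53          0.54
  C1908    1336.14          0.74            0.67          0.68
  C3540    2104.56          0.71            0.53          -
  i8       940.68           0.69            0.62          0.60
  C6288    5971.5           0.94            0.86          -
  …
  AVG                       0.756           0.710         -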
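19 Area
  Generalized buffering columns are normalized to the traditional buffering area.
  A "?" marks a value that could not be recovered from the slide text; in those rows the surviving values are listed left to right.
           Traditional       Generalized Buffering
  Ckt      Buffering (μm²)   Without CODCs   With ACODCs   With CODCs
  alu2     164.32            0.87            0.86          0.83
  alu4     963.68            1.12            ?             -
  apex6    305.52            0.95            0.93          ?
  C432     87.12             1.02            0.93          1.10
  C880     136.16            0.80            0.79          ?
  C1908    152.24            0.97            0.93          ?
  C3540    532.88            1.01            0.98          -
  i8       520.48            0.75            0.72          0.70
  C6288    1220.48           1.10            1.06          -
  …
  AVG                        0.970           0.950         -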
20 Multiplexers
  Generalized buffering columns are normalized to the traditional buffering multiplexer count.
           Traditional   Generalized Buffering
  Ckt      Buffering     Without CODCs   With ACODCs   With CODCs
  alu2     718           0.60            0.577         0.532
  alu4     4188          0.704           0.689         -
  apex6    1337          0.737           0.719         0.718
  C432     404           0.995           0.795         0.834
  C880     594           0.731           0.702         0.712
  C1908    660           0.858           0.806         0.792
  C3540    2362          0.732           0.672         -
  i8       2418          0.483           0.449         0.439
  C6288    5393          0.917           0.834         -
  …
  AVG                    0.770           0.729         -
21 Run-time
           Generalized Buffering (s)
  Ckt      Without CODCs   With ACODCs   With CODCs
  alu2     0.38            11.48         435.32
  alu4     8.99            54.52         -
  apex6    1.49            4.61          17.23
  C432     0.46            13.82         236.82
  C880     0.24            2.41          70.0
  C1908    0.790           8.08          362.12
  C3540    5.01            41.56         -
  i8       1.63            28.73         2651.48
  C6288    10.79           151.49        -
  …
  AVG      4.837           33.55         -
22 Conclusions
  Generalized buffering with ACODCs results in delay reduction by
    29% over traditional buffering
    5% over generalized buffering without don’t cares
  Area reduction obtained by generalized buffering with ACODCs is
    5% compared to traditional buffering
    2% compared to generalized buffering without don’t cares
  Multiplexers also reduced by
    27% compared to traditional buffering
    4% compared to generalized buffering without don’t cares
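The percentages above can be reproduced from the AVG rows of the Delay, Area and Multiplexers slides. The sketch below assumes those rows are averages normalized to traditional buffering, and that the comparison against "generalized buffering without don’t cares" is the difference between the two normalized averages (percentage points of the traditional baseline). That reading is an assumption; it is the one that matches the reported 5%, 2% and 4%.

```python
# (without don't cares, with ACODCs), both normalized to traditional buffering = 1.0
AVG = {
    "delay": (0.756, 0.710),
    "area": (0.970, 0.950),
    "multiplexers": (0.770, 0.729),
}

def reductions(without_dc, with_acodc):
    """Return (reduction vs traditional, reduction vs no don't cares), in percent."""
    vs_traditional = round((1.0 - with_acodc) * 100, 1)
    vs_without_dc = round((without_dc - with_acodc) * 100, 1)
    return vs_traditional, vs_without_dc

summary = {name: reductions(*vals) for name, vals in AVG.items()}

for name, (vs_trad, vs_no_dc) in summary.items():
    print(f"{name}: {vs_trad}% vs traditional, {vs_no_dc} points vs no don't cares")
```

Rounded to whole numbers, these values agree with the 29% / 5%, 5% / 2% and 27% / 4% figures on the slide.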
23 Conclusions (contd)
  A large number of divisions were obtained for each circuit
  Little advantage of using CODCs over ACODCs
    Delay reduction is less than 1%
    Area increases by 1%
    Run-time is 76X slower
  Can synthesize arbitrary sized circuits using partitioned ROBDDs and ACODC based division
24 Thank You!! Questions?