1 Generalized Buffering of PTL Logic Stages using Boolean Division and Don’t Cares Rajesh Garg Sunil P. Khatri Department of Electrical and Computer Engineering,

Presentation transcript:

1 Generalized Buffering of PTL Logic Stages using Boolean Division and Don’t Cares Rajesh Garg Sunil P. Khatri Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843

2 Outline
- Introduction
- Objective
- Previous Work
- CODCs and ACODCs
- Generalized Buffering With CODCs
- Results
- Conclusions

3 Introduction
- Pass Transistor Logic (PTL) is typically used for specific circuit implementations, such as barrel shifters
- No widely accepted PTL design methodology
- There exists a direct mapping between an ROBDD node and a PTL mux
[Figure: an ROBDD node on variable v with cofactors f_v and f_v’, the equivalent MUX, and the PTL-based MUX]
- Hence ROBDDs can be used to perform PTL-based synthesis for general circuits
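
The node-to-mux mapping can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, the helper names, and the example function f = ab + c are hypothetical and do not come from the slides. Each node selects its `high` cofactor when its variable is 1 and its `low` cofactor otherwise, which is the behaviour of a 2:1 mux; the longest root-to-terminal path gives the worst-case number of pass devices in series.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Union


@dataclass(frozen=True)
class Node:
    """One ROBDD node, read as a 2:1 mux controlled by `var` (hypothetical type)."""
    var: str
    high: "BDD"  # cofactor f_v, selected when var = 1
    low: "BDD"   # cofactor f_v', selected when var = 0


BDD = Union[bool, Node]


def mux_eval(f: BDD, inputs: Dict[str, bool]) -> bool:
    """Evaluate f = v*f_v + v'*f_v' by walking one mux per level."""
    while isinstance(f, Node):
        f = f.high if inputs[f.var] else f.low
    return f


def series_depth(f: BDD) -> int:
    """Longest root-to-terminal path, i.e. worst-case devices in series."""
    if not isinstance(f, Node):
        return 0
    return 1 + max(series_depth(f.high), series_depth(f.low))


# Example (assumed): f = a*b + c with variable order a, b, c.
c_node = Node("c", True, False)
b_node = Node("b", True, c_node)
f = Node("a", b_node, c_node)

if __name__ == "__main__":
    for a, b, c in product((False, True), repeat=3):
        print(a, b, c, mux_eval(f, {"a": a, "b": b, "c": c}))
    print("series depth:", series_depth(f))
```

A depth check of this kind is what a partitioning step would use to decide where a buffer is needed, although the slides do not show how the authors implement it.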

4 Introduction (contd)
- Problems with direct mapping
  - Body effect
    - Cannot connect more than 4-5 devices in series
  - Monolithic ROBDDs
    - Worst-case size is exponential in the number of inputs (large)
    - Memory explosion can occur during ROBDD construction
- Partitioned ROBDDs
  - Avoid the memory explosion of monolithic ROBDDs
  - Output of each PTL structure needs to be buffered
  - Regenerate electrical drive capability after 4 or 5 levels using a pair of inverters (avoids body effect problems)

5 Objective
- New PTL synthesis approach
  - Use partitioned ROBDDs
    - Avoid memory explosion
    - Guarantee no more than 4-5 series devices
  - Use generalized buffering
    - Buffers can be complex logic gates in general (not just simple inverters/buffers)
  - Use ACODCs or CODCs to improve the extraction of generalized buffers
    - Simplifies the logic function of the PTL block
- Boolean division based formulation
  - Elegant, powerful formulation to extract generalized buffers
  - Augmented with CODCs / ACODCs
- Reduces total circuit delay and area

6 Previous Work
- Buch et al., “Logic synthesis for large pass transistor circuits”, in Proceedings, IEEE/ACM ICCAD, Nov 1997.
- Lai et al., “BDD decomposition for mixed CMOS/PTL logic circuit synthesis”, in Proceedings, IEEE ISCAS, May 2005.
- Yamashita et al., “Pass-transistor/CMOS collaborated logic: The best of both worlds”, in Digest of Technical Papers, Symposium on VLSI Circuits, June 1997.
- Garg et al., “Generalized buffering of PTL logic stages using Boolean division”, in Proceedings, IEEE ISCAS, May 2006.

7 CODCs
- Observability Don’t Cares (ODCs)
  - ODCs are used to minimize the logic function of a node
  - The ODCs of the other nodes must be re-computed after optimization
- Compatible Observability Don’t Cares (CODCs)
  - Can simultaneously change the function of all nodes
  - Subset of ODCs
  - full_simplify is used to compute CODCs in SIS
  - ROBDD-based computation
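
To make the idea of an observability don't care concrete, here is a brute-force Python sketch on a hypothetical three-input network that is not taken from the slides: an internal node n = a XOR b feeds the output z = n AND c. An input vector belongs to the ODC set of n when forcing n to 0 or to 1 leaves z unchanged. The sketch computes plain ODCs only; it does not model the compatibility restriction of CODCs, the ACODC approximation, or what full_simplify does in SIS.

```python
from itertools import product


def output(a: int, b: int, c: int, n_override=None) -> int:
    """Toy network (assumed): n = a XOR b, z = n AND c.

    If n_override is given, the internal node n is forced to that value.
    """
    n = (a ^ b) if n_override is None else n_override
    return n & c


def odc_of_n() -> set:
    """Input vectors for which the value of n cannot be observed at z."""
    dont_care = set()
    for a, b, c in product((0, 1), repeat=3):
        if output(a, b, c, 0) == output(a, b, c, 1):
            dont_care.add((a, b, c))
    return dont_care


ODC_N = odc_of_n()

if __name__ == "__main__":
    # Every vector with c = 0 is a don't care for n, because z is 0 regardless.
    print(sorted(ODC_N))
```

On these vectors the function of n may be changed freely without affecting z, which is the freedom a minimizer exploits.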

8 CODCs (contd.)
- Memory intensive
- Computation is possible only for small and medium sized circuits
- Approximate CODCs (ACODCs) by Saluja et al.
  - 30X faster than full CODCs
  - Requires 30X less memory than full CODCs
  - Literal count reduction is about 80% of that obtained by full CODCs
  - Can compute ACODCs for arbitrarily large circuits

9 PTL with Generalized Buffering
[Figure: network diagram labeled with Primary Inputs and Primary Output]

10 Boolean Division
- Definition 1: g is a Boolean divisor of f if h and r exist such that f = gh + r, where gh ≠ Ø
- Definition 2: g is a Boolean factor of f if g is a Boolean divisor of f and, in addition, r = Ø, i.e. f = gh
- Theorem: If fg ≠ Ø, then g is a Boolean divisor of f
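
The definitions and the theorem can be checked on small functions with a Python sketch that represents each function by its on-set (the set of minterms where it is 1). The helper names and the example functions are assumptions made for illustration. The witness used for the theorem is the standard one: taking h = f and r = f·g' gives gh + r = fg + fg' = f, and gh = fg is non-empty by hypothesis.

```python
from itertools import product

VARS = ("a", "b", "c")
UNIVERSE = frozenset(product((0, 1), repeat=len(VARS)))


def onset(fn) -> frozenset:
    """On-set of a Boolean function given as a Python callable over (a, b, c)."""
    return frozenset(m for m in UNIVERSE if fn(*m))


def is_divisor(f, g, h, r) -> bool:
    """Definition 1: f = g*h + r with g*h non-empty."""
    return ((g & h) | r) == f and len(g & h) > 0


def is_factor(f, g, h) -> bool:
    """Definition 2: a divisor with empty remainder, i.e. f = g*h."""
    return is_divisor(f, g, h, frozenset())


def trivial_division(f, g):
    """Witness for the theorem: if f*g is non-empty, use h = f and r = f*g'."""
    if not (f & g):
        return None
    return f, f - g


# Example functions (assumed): f = ab + c, g1 = a + c, g2 = ab'.
f = onset(lambda a, b, c: (a and b) or c)
g1 = onset(lambda a, b, c: a or c)
g2 = onset(lambda a, b, c: a and not b)
h1 = onset(lambda a, b, c: b or c)  # (a + c)(b + c) = ab + c

if __name__ == "__main__":
    print("g1 is a factor of f:", is_factor(f, g1, h1))
    h2, r2 = trivial_division(f, g2)
    print("g2 is a divisor of f:", is_divisor(f, g2, h2, r2))
```

The theorem guarantees only that some quotient exists; finding a quotient that is small enough to be useful is the harder problem.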

11 ROBDD Division
- Consider a (partitioned) ROBDD f of a node n in the network
- Let d represent the CODCs of node n
- Consider a library gate g ≡ G
- Division of f by g can be represented by the following equations:
  [Equations not preserved in this transcript: an upper bound of f, a lower bound of f, the resulting quotient and remainder, and the final expression]

12 Generalized Buffering with CODCs
- Synthesize partitioned PTL blocks with
  - Maximum depth of 5
  - No more than 5 transistors in series
- Optimize and decompose the network
  - Using only 2-input gates and inverters
  - PTL structure will grow in a predictable manner
  - Initially any ROBDD can have a maximum of 8 variables
- If division fails, we can make one of the fanins a ROBDD variable (back-track)

13 Algorithm: Generalized Buffering with CODCs
A = dfs_and_levelize_nodes(η*)
i = 1
while i ≤ size(A) do
  n = array_fetch(A, i)
  f = ntbdd_node_to_bdd(n)   // creates ROBDD of node n
  if bdd_depth(f) ≥ 5 then
    for g ≡ G ∈ Gate Library do
      d = compute_dc(n)
      f = test_division(f, d, g, G)
    end for
    if bdd_depth(f) > 5 then
      back-track
    else
      bdd_create_variable(n)
      continue
    end if
  else
    continue
  end if
end while

14 test_division with CODCs
test_division(f, d, g, G) {
  if fg ≠ 0 then
    Z = bdd_between(L, U)
    Z* = bdd_smooth(Z, gvars)
    R = bdd_compose(Z*, G, g)
    if f ⊆ R ⊆ f + d && bdd_depth(Z*) < bdd_depth(f) then
      return (success, Z*)
    end if
  else
    return fail
  end if
}
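
The acceptance condition f ⊆ R ⊆ f + d can be illustrated without a BDD package by storing each function as an integer truth table, one bit per minterm. This is a sketch under assumptions: the helper names and the two-variable example are hypothetical, and it does not use or reproduce the bdd_* routines named on the slide.

```python
def implies(x: int, y: int) -> bool:
    """True when the on-set of x is contained in the on-set of y."""
    return (x & ~y) == 0


def within_dc_interval(f: int, r: int, d: int) -> bool:
    """Containment test from the slide: f <= R <= f + d, as truth-table masks."""
    return implies(f, r) and implies(r, f | d)


# Two variables a, b; bit index = 2*a + b (an assumed encoding).
F = 0b0110      # f = a XOR b, on-set {01, 10}
D = 0b1000      # don't care on minterm 11
R_OR = 0b1110   # a + b: covers f and adds only the don't-care minterm
R_A = 0b1100    # a alone: misses minterm 01, so it does not cover f
R_ONE = 0b1111  # constant 1: adds minterm 00, which is a care minterm

if __name__ == "__main__":
    for name, r in (("a + b", R_OR), ("a", R_A), ("1", R_ONE)):
        print(name, within_dc_interval(F, r, D))
```

In this example the don't care lets a XOR b be replaced by the simpler a + b, which is the kind of simplification the don't-care-augmented division aims for.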

15 back-track
[Figure: signals a, b, c, d feeding nodes n-2, n-1 and n; node n needs back-track]
- Make ‘c’ a variable
- Re-process

16 Experimental Setup
- Implemented in SIS
- Process technology: 100nm BPTM
- Gate library
  - AND2, AND3, AND4
  - OR2, OR3, OR4
- A set of benchmark circuits was synthesized
- Compared with the traditional method
  - Inverters for buffering
  - Similar to the method reported by Buch et al.
- Also compared with generalized buffering without don’t cares

17 Gate Delay and Area
[Table: columns Gate, Delay (ps), Area (μ²); rows MUX, INV, Buffer, three AND gates, three OR gates; numeric values not preserved in this transcript]

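18 Delay
[Table: columns Ckt, Traditional Buffering (ps), and Generalized Buffering (ps) Without CODCs / With ACODCs / With CODCs; rows for benchmark circuits (names truncated to alu, alu, apex, C, C, C, C, i, C, …) and AVG; numeric values not preserved in this transcript]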

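19 Area
[Table: columns Ckt, Traditional Buffering (μ²), and Generalized Buffering (μ²) Without CODCs / With ACODCs / With CODCs; rows for benchmark circuits (names truncated to alu, alu, apex, C, C, C, C, i, C, …) and AVG; numeric values not preserved in this transcript]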

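20 Multiplexers
[Table: columns Ckt, Traditional Buffering, and Generalized Buffering Without CODCs / With ACODCs / With CODCs; rows for benchmark circuits (names truncated to alu, alu, apex, C, C, C, C, i, C, …) and AVG; numeric values not preserved in this transcript]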

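21 Run-time
[Table: columns Ckt and Generalized Buffering (s) Without CODCs / With ACODCs / With CODCs; rows for benchmark circuits (names truncated to alu, alu, apex, C, C, C, C, i, C, …) and AVG; numeric values not preserved in this transcript]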

22 Conclusions
- Generalized buffering with ACODCs reduces delay by
  - 29% over traditional buffering
  - 5% over generalized buffering without don’t cares
- Area reduction obtained by generalized buffering with ACODCs is
  - 5% compared to traditional buffering
  - 2% compared to generalized buffering without don’t cares
- Multiplexers are also reduced by
  - 27% compared to traditional buffering
  - 4% compared to generalized buffering without don’t cares

23 Conclusions (contd)
- A large number of divisions were obtained for each circuit
- Little advantage of using CODCs over ACODCs
  - Delay reduction is less than 1%
  - Area increases by 1%
  - Run-time is 76X slower
- Can synthesize arbitrary sized circuits using partitioned ROBDDs and ACODC-based division

24 Thank You!! Questions?