A Robust Algorithm for Approximate Compatible Observability Don’t Care (CODC) Computation
Nikhil S. Saluja, University of Colorado, Boulder, CO
Sunil P. Khatri, Texas A&M University, College Station, TX

Outline
- Motivation
- Computation of Don’t Cares
- ACODC Algorithm
- Proof of correctness
- Experimental Results
- Possible extensions
- Conclusions

Motivation
[Figure: Boolean network with primary inputs x1 … xn, internal nodes y1 … yw (each y_j = F_j), and primary outputs z1 … zp]
- Technology-independent logic optimization
- Don’t Cares are typically computed after a higher-level description of a design is encoded and translated into a gate-level description
- Don’t Cares (DCs)
  - eXternal Don’t Cares (XDCs)
  - Satisfiability Don’t Cares (SDCs)
  - Observability Don’t Cares (ODCs)

Motivation - 2
- The DCs computed are a function of the PIs and internal variables of the Boolean network
- Image computation is used to express the DCs in terms of node fanins
  - ROBDD-based operation
- Finally, the node function is minimized (using ESPRESSO) with respect to the computed (local) DCs
  - Literal count reduction is the figure of merit

Don’t Cares
- ODC based
  - Very powerful, represent maximum flexibility
  - Minimizing a node j with respect to its ODC requires recomputation of other nodes’ ODCs
- Compatible ODC (CODC) based
  - Subset of ODC, requires ordering of fanins
  - Recomputation not required, useful in many cases
- In either case, image computation is required
  - To obtain DCs in the fanin support of the node
  - Involves ROBDD computation
  - Not robust
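
The point that using one node's ODC forces other ODCs to be recomputed can be seen on a two-input AND. The sketch below is not from the slides: the helper `odc`, the function `and2`, and the brute-force enumeration are illustrative assumptions, standing in for the ROBDD-based computation the talk describes.

```python
from itertools import product

def odc(f, n, i):
    """Input vectors at which input i of the n-input function f is unobservable."""
    result = set()
    for v in product((0, 1), repeat=n):
        flipped = v[:i] + (1 - v[i],) + v[i + 1:]
        if f(v) == f(flipped):      # toggling input i does not change f
            result.add(v)
    return result

def and2(v):
    """Hypothetical node function f(a, b) = a AND b, with v = (a, b)."""
    return v[0] & v[1]

odc_a = odc(and2, 2, 0)   # a is unobservable whenever b = 0
odc_b = odc(and2, 2, 1)   # b is unobservable whenever a = 0

# At (a, b) = (0, 0) each input is individually a don't care, yet changing
# both at once moves to (1, 1) and changes the output. The two ODCs are
# therefore not compatible: after one is used, the other must be recomputed.
both = odc_a & odc_b
changes_output = and2((0, 0)) != and2((1, 1))
```

A compatible assignment has to give up some of this flexibility so that no input vector lets two fanins change together in a way that alters the output.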

CODC Computation
- Traverse circuit in reverse topological order
- CODC of primary output z initialized to its XDC
- Computation performed in 2 phases for each node
- Phase 1
  [Figure: node y_k = f_k with ordered fanins y_1 < y_2 < … < y_{i-1} < y_i]
  [formula not preserved]
  - Note that [symbol not preserved] is the consensus operator
  - The first fanin has [formula not preserved], which is the maximum flexibility
  - A new edge e_ik should have its CODC as the conjunction of [formula not preserved] with the condition that other inputs j < i are not insensitive to input y_j ([formula not preserved]) or are independent of y_j ([formula not preserved])
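
A minimal sketch of a phase-1 computation on explicit truth tables follows. It assumes the commonly cited ordered-fanin rule, in which the ODC of fanin y_i is combined, for each earlier fanin y_j, with "f_k is sensitive to y_j, or the expression holds for both values of y_j", and the CODC of the node itself is added at the end. That rule is an assumption here, since the slide's own expressions are not available, and every function name below is hypothetical.

```python
def cofactor(tt, i, val):
    """Truth table of tt with input i fixed to val (tables are lists indexed by minterm)."""
    bit = 1 << i
    return [tt[(m | bit) if val else (m & ~bit)] for m in range(len(tt))]

def boolean_difference(tt, i):
    """True at the minterms where the function is sensitive to input i."""
    return [a != b for a, b in zip(cofactor(tt, i, 0), cofactor(tt, i, 1))]

def consensus(tt, i):
    """True at the minterms where tt holds for both values of input i."""
    return [a and b for a, b in zip(cofactor(tt, i, 0), cofactor(tt, i, 1))]

def codc_of_fanin(f, i, node_codc=None):
    """Assumed phase-1 rule for fanin i, with fanins ordered 0 < 1 < ... < i."""
    node_codc = node_codc or [False] * len(f)
    g = [not d for d in boolean_difference(f, i)]    # ODC of fanin i on its own
    for j in range(i - 1, -1, -1):                   # earlier fanins, innermost first
        sens = boolean_difference(f, j)
        cons = consensus(g, j)
        g = [(s and x) or c for s, x, c in zip(sens, g, cons)]
    return [x or d for x, d in zip(g, node_codc)]    # add the node's own CODC

# Two-input AND with minterm index m = a + 2*b and fanin order a < b.
f_and = [False, False, False, True]
codc_a = codc_of_fanin(f_and, 0)   # first fanin keeps its full ODC (b = 0)
codc_b = codc_of_fanin(f_and, 1)   # second fanin keeps only a = 0, b = 1
```

In this example no minterm lies in both sets, so the two fanins can be simplified independently without changing the node's output.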

CODC Computation
- Phase 2: image computation using ROBDDs
  - Build global BDDs of each node in the network, including POs
    - For large circuits this step fails
    - This is the main weakness of the CODC computation
  - Next compute CODCs of node k in terms of PIs
    - Substitute each internal node literal by its global BDD
  - Compute image of this function in the space of local fanins of node k
    - Yields CODC in terms of local fanins of node k
- Finally, call ESPRESSO on the cover of node k, with the newly computed CODC as don’t care
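
The image step can be illustrated without BDDs on a tiny example. The sketch below swaps the ROBDD image computation for explicit enumeration of primary-input vectors, which is only feasible for a handful of inputs; `local_dont_cares` and the two example fanin functions are hypothetical. It relies on the standard fact that a local fanin vector is a don't care exactly when no care primary-input vector maps to it.

```python
from itertools import product

def local_dont_cares(fanin_funcs, global_dc, num_pis):
    """Local fanin vectors that are not the image of any care PI vector."""
    care_image = set()
    for x in product((0, 1), repeat=num_pis):
        if not global_dc(x):                       # x is a care point
            care_image.add(tuple(g(x) for g in fanin_funcs))
    all_local = set(product((0, 1), repeat=len(fanin_funcs)))
    return all_local - care_image

# Node with fanins y1 = x0 AND x1 and y2 = x0 OR x1 (functions of the PIs).
fanins = [lambda x: x[0] & x[1], lambda x: x[0] | x[1]]

# With no don't cares in terms of the PIs, only the unreachable local vector remains.
no_dc = local_dont_cares(fanins, lambda x: 0, 2)

# If the PI vector (1, 1) is a don't care, its image (1, 1) becomes one too.
with_dc = local_dont_cares(fanins, lambda x: x[0] & x[1], 2)
```

The resulting set of local vectors is what would be handed to the two-level minimizer as the don't-care cover.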

Contributions of this Work
- Perform CODC-based Don’t Care computation approximately
  - Yields 25X speedup
  - Yields 33X reduction in memory utilization
  - Obtains 80% of the literal reduction of the full CODC computation
- Handles large circuits extremely fast (circuits on which the full CODC-based computation times out)
- Formal proof of correctness of the approximate CODC technique

Approximate CODCs
- Consider a sub-network rooted at the node j of interest
  - Sub-network can have user-defined topological depth k
- Compute the CODC of j in the sub-network (called ACODC)
- This ACODC is a subset of the CODC of j

Algorithm
Traverse η in reverse topological order
for (each node j in network η) do
    η_j = extract_subnetwork(j, k)
    ACODC(j) = compute_acodc(η_j, j)
    optimize(j, ACODC(j))
end for
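
The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the SIS implementation: the network is a plain dictionary from node name to fanin list, `extract_subnetwork` takes "depth k" to mean k levels toward the inputs plus k levels toward the outputs, and `compute_acodc` and `optimize` are caller-supplied placeholders.

```python
from graphlib import TopologicalSorter

def reverse_topological_order(fanins):
    """Nodes ordered from primary outputs back toward primary inputs."""
    return list(TopologicalSorter(fanins).static_order())[::-1]

def extract_subnetwork(fanins, j, k):
    """Nodes within k levels of j toward inputs and toward outputs (assumed reading)."""
    fanouts = {}
    for node, ins in fanins.items():
        for src in ins:
            fanouts.setdefault(src, []).append(node)

    def reach(edges):
        seen, frontier = {j}, [j]
        for _ in range(k):
            frontier = [n for m in frontier for n in edges.get(m, []) if n not in seen]
            seen.update(frontier)
        return seen

    return reach(fanins) | reach(fanouts)

def acodc_pass(fanins, k, compute_acodc, optimize):
    """Visit each internal node once, outputs first, as in the slide's loop."""
    visited = []
    for j in reverse_topological_order(fanins):
        if not fanins.get(j):          # primary input: nothing to minimize
            continue
        sub = extract_subnetwork(fanins, j, k)
        optimize(j, compute_acodc(sub, j))
        visited.append(j)
    return visited

# Toy network: z = g(n1, n2), n1 = g(a, b), n2 = g(b, c).
network = {"z": ["n1", "n2"], "n1": ["a", "b"], "n2": ["b", "c"]}
order = acodc_pass(network, 1, lambda sub, j: set(), lambda j, dc: None)
sub_n1 = extract_subnetwork(network, "n1", 1)
```

Because each node is handled once and only its bounded neighbourhood is examined, the cost per node depends on k rather than on the size of the whole network.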

Proof of Correctness
- Terminology
  - Boolean network η_xz
    - X: primary inputs
    - Z: primary outputs
  - W and V are two cuts
  - η_xw, η_vz and η_vw define sub-networks
  - [symbol not preserved] is the CODC of y_k, where P is either X or V and Q is either W or Z
  - [symbol not preserved] is the CODC of y_k mapped back to its fanin support after image computation
[Figure: network from inputs x to outputs z with cuts v and w around node y_k = f_k, which has fanins y_1, y_2, …, y_{i-1}, y_i]

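Cutset as Primary Output
- To show: [formula not preserved] ≥ [formula not preserved]
- For any PO z, [formula not preserved] = ∅
- For [formula not preserved], [formula not preserved] ≠ ∅
- For W nodes as POs, [formula not preserved] = ∅
- CODC computation of y_k is identical for both cases except for the last term in the equation
- In general, the last term for a node in the first case contains the last term for the same node in the latter case, since [formula not preserved] ≥ [formula not preserved]
- Hence [formula not preserved] ≥ [formula not preserved]
[Figure: network from inputs x to outputs z with cut w, showing node y_k = f_k and its fanins y_1, y_2, …, y_{i-1}, y_i]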

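Cutset as Primary Input
- Define [formula not preserved]
- To compute the ACODC at y_k, compute [formula not preserved], then compute the image I_1 of this on the V space, and then project the result back to the local fanins of y_k
- The full CODC is [formula not preserved]. We then compute the image I_2 of this on the X space, and next project the result back to the local fanins of y_k
- I_3 is the projection of I_2 on V
- Hence [formula not preserved]
- Therefore I_3 ≥ I_1
- Finally, [formula not preserved] ≥ [formula not preserved]
[Figure: network from inputs x to outputs z with cut v, showing node y_k = f_k, its fanins y_1, y_2, …, y_{i-1}, y_i, and the images I_1, I_2, I_3]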

Cutsets as Primary Input and Primary Output
- This result follows directly from the previous two proofs as they are orthogonal
- Hence [formula not preserved] ≤ [formula not preserved]
- Therefore, an ACODC computation which utilizes a sub-network of depth k rooted at any node yields a subset of the full CODC of the node.
- This proves the correctness of our method.
[Figure: network from inputs x to outputs z with cuts v and w, showing node y_k = f_k and its fanins y_1, y_2, …, y_{i-1}, y_i]

Experimental Results
- Implemented in SIS
- Used mcnc91 and itc99 benchmark circuits
- Run on an IBM IntelliStation (1.7 GHz Pentium-4 with 1 GB RAM) running Linux
- Our algorithm is built as a replacement for full_simplify
  - Read design and run ACODC algorithm followed by sweep
  - Compare our method by running full_simplify followed by sweep

Metrics for Comparison
- 3 measures of effectiveness for comparison with full_simplify
  - Effectiveness #1 is the ratio of the number of minterms computed by our technique to the number computed by full_simplify
  - Effectiveness #2 compares the number of nodes for which ACODCs and CODCs are identical
  - We also compare the literal count reduction obtained by both techniques
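
The first two measures can be written down as a short sketch. The slides do not say how the values are aggregated, so the choices here are assumptions: don't cares are stored per node as sets of local minterms, Effectiveness #1 is taken as the ratio of total minterm counts over all nodes, and Effectiveness #2 is reported as the fraction of nodes with identical sets. All names and the sample data are hypothetical.

```python
def effectiveness_1(acodc, codc):
    """Ratio of don't-care minterms found approximately to those found exactly."""
    approx_total = sum(len(s) for s in acodc.values())
    exact_total = sum(len(s) for s in codc.values())
    return approx_total / exact_total

def effectiveness_2(acodc, codc):
    """Fraction of nodes whose approximate and exact don't-care sets coincide."""
    same = sum(1 for node in codc if acodc.get(node, set()) == codc[node])
    return same / len(codc)

# Made-up don't-care sets for two nodes, for illustration only.
acodc_sets = {"n1": {0, 1}, "n2": {3}}
codc_sets = {"n1": {0, 1}, "n2": {2, 3}}

eff1 = effectiveness_1(acodc_sets, codc_sets)   # 3 of 4 minterms recovered
eff2 = effectiveness_2(acodc_sets, codc_sets)   # 1 of 2 nodes identical
```

Since the ACODC of a node is a subset of its CODC, both values lie between 0 and 1 under these definitions.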

Effectiveness Results
[Table: columns Circuit, Eff1 (k=4), Eff1 (k=6), Eff2 (k=4), Eff2 (k=6), Lits-original, Lits % (fs), Lits % (k=4), Lits % (k=6); rows for seven circuits whose names begin with C (names truncated), dalu, a circuit whose name begins with i (truncated), b01_C, b03_C through b13_C, and AVG; numeric values not preserved]
- Literal reduction about 80% of full_simplify
- Very little improvement from k=4 to k=6

Runtime and Memory Results
[Table: columns Circuit, Time (fs), Time % (k=4), Time % (k=6), Mem (fs), Mem (k=4), Mem (k=6); same circuits as the effectiveness table plus AVG; only two values survive, 0.04 for b06_C and 0.30 for b08_C, and their column cannot be identified]
- Runtime is about 25X better than full_simplify
- Memory utilization is about 33X better than full_simplify

Results for Large Circuits
- full_simplify did not complete for any of the examples below
- k = 4 for these experiments
- Maximum runtime < 2 minutes
- Peak memory utilization < 106K BDD nodes
[Table: columns Circuit, #Nodes, #Literals, Node%, Lit%, Time(s), Mem; circuit names truncated (two beginning with C, and b14, b20 and b21 variants), plus AVG; numeric values not preserved]

Possible Extensions
- Can compute AODCs in a similar fashion
  - Yields more flexibility at a node
  - However, each node must be minimized after its AODC computation
  - Compatibility not maintained
  - Useful if only node minimization is desired
  - Compatibility is useful if the nodes are to be optimized simultaneously at a later stage
- Proof of correctness is similar

Conclusions
- Presented a robust technique for ACODC computation
  - Dynamic extraction of sub-networks to compute CODCs
  - ACODCs computed exactly once for a node
- 19% reduction in node count and 9.5% reduction in literal count (large circuits)
- 23% reduction in literal count as compared to 28.5% for full_simplify (medium circuits)
- 25X better run-time than full_simplify
- 33X better memory utilization than full_simplify