ABC: A System for Sequential Synthesis and Verification
Alan Mishchenko
Berkeley Verification and Synthesis Research Center (BVSRC)
University of California, Berkeley
Overview
Introduction
–What and why ABC?
ABC fundamentals
–Areas addressed by ABC: synthesis, technology mapping, verification
–Contrast with classical methods: how is ABC different from SIS?
Recent work
–Speedup
–Factoring
–Don’t-care based optimization
–Scalable sequential synthesis
–WireMap
–White boxes
A Plethora of ABCs
ABC (American Broadcasting Company)
–A television network…
ABC (Active Body Control)
–ABC is designed to minimize body roll in cornering, accelerating, and braking. The system uses 13 sensors that monitor body movement and supply the computer with information every 10 ms…
ABC (Abstract Base Class)
–In C++, these are generic classes at the base of the inheritance tree; objects of such abstract classes cannot be created…
ABC (supposed to mean “as simple as ABC”)
–A system for sequential synthesis and verification at Berkeley
Why We Decided to Build ABC
SIS
–Outdated, but many research papers still report how a new algorithm beats SIS results
–Not supported
MVSIS
–Gave us a reason to work on logic synthesis
–Learned a lot about new methods and better data structures
–Could see how specializing to binary could provide substantial improvements
ABC
–Initial intention was to re-implement all algorithms using new data structures (daunting task)
–Discovered rewriting AIGs: P. Bjesse and A. Boralv, "DAG-aware circuit compression for formal verification", Proc. ICCAD ’04, pp
–Decided to try to keep all transformations fast and scalable: no BDDs, no SOPs, no Espresso
What Is Berkeley ABC?
A system for logic synthesis and verification
–Fast
–Scalable
–Good quality of results
–Exploits synergy between synthesis and verification
A programming environment
–Open-source
–Evolving and improving over time
Design Flow
[Figure: design flow from system specification through RTL, logic synthesis, technology mapping, and physical synthesis to manufacturing, with the labels “ABC” and “Verification”]
Screenshot
Terminology
Logic function (e.g. F = ab+cd)
–Variables (e.g. b)
–Minterms (e.g. abcd)
–Cube (e.g. ab)
Logic network
–Primary inputs/outputs
–Logic nodes
–Fanins/fanouts
–Transitive fanin/fanout cone
–Cut and window (defined later)
[Figure: logic network showing primary inputs, primary outputs, and the fanins, fanouts, TFI, and TFO of a node]
Areas Addressed by ABC
Combinational synthesis
–AIG rewriting
–technology mapping
–resynthesis after mapping
Sequential synthesis
–retiming
–structural register sweep
–merging seq. equiv. nodes
Formal verification
–combinational equivalence checking
–bounded sequential verification
–unbounded sequential verification
–equivalence checking using synthesis history
Combinational Synthesis
Pre-computing AIG subgraphs
–Consider function f = abc
[Figure: Subgraph 1, Subgraph 2, and Subgraph 3, three AIG structures over inputs a, b, c]
Rewriting AIG subgraphs
[Figure: rewriting node A and rewriting node B, each replacing one subgraph with another; in both cases 1 node is saved]
AIG rewriting minimizes the number of AIG nodes without increasing the number of AIG levels
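Why the choice of subgraph matters can be seen in a small sketch. The `AIG` class below is a hypothetical toy written for illustration, not ABC's data structure: it only hashes AND nodes structurally, so the cost of building f = abc depends on which nodes already exist in the graph.

```python
from itertools import product

class AIG:
    """Toy And-Inverter Graph with structural hashing (illustrative only)."""
    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.ands = []    # fanin literal pairs of AND nodes
        self.table = {}   # structural hash: (lit0, lit1) -> literal

    def input(self, i):
        return 2 * (i + 1)            # literal = 2 * variable, LSB = complement

    def and_(self, x, y):
        key = (min(x, y), max(x, y))
        if key not in self.table:     # create a node only if it is new
            self.ands.append(key)
            self.table[key] = 2 * (self.n_inputs + len(self.ands))
        return self.table[key]

    def evaluate(self, lit, values):
        var, neg = lit >> 1, lit & 1
        if var <= self.n_inputs:
            v = values[var - 1]
        else:
            x, y = self.ands[var - self.n_inputs - 1]
            v = self.evaluate(x, values) & self.evaluate(y, values)
        return v ^ neg

g = AIG(3)
a, b, c = g.input(0), g.input(1), g.input(2)
shared = g.and_(a, b)                 # assume ab is already used elsewhere

before = len(g.ands)
f1 = g.and_(g.and_(a, b), c)          # (ab)c reuses the shared node
cost_f1 = len(g.ands) - before

before = len(g.ands)
f2 = g.and_(a, g.and_(b, c))          # a(bc) needs two new nodes
cost_f2 = len(g.ands) - before

same = all(g.evaluate(f1, v) == g.evaluate(f2, v)
           for v in product((0, 1), repeat=3))
```

Both structures compute the same function, but in this context one of them adds a single node and the other adds two, which is the kind of saving DAG-aware rewriting looks for.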
Technology Mapping
Input: A Boolean network (And-Inverter Graph)
Output: A netlist of K-LUTs implementing the AIG and optimizing some cost function
[Figure: the subject graph and the mapped netlist obtained from it by technology mapping]
Sequential Synthesis
Structural register sweep (scl)
–Merge registers with identical drivers
–Replace stuck-at registers by constants
Retiming (dretime)
–Minimize the number of registers under delay constraints
–Preserves equivalent initial state
Sequential SAT sweeping (scorr)
–Detecting and merging sequentially equivalent nodes
Formal Verification
Equivalence checking
–Takes two designs and makes a miter (AIG)
Model checking safety properties
–Takes design and property and makes a miter (AIG)
The goals are the same: to transform AIG until the output is proved constant 0
[Figure: equivalence checking miter over designs D1 and D2, and property checking miter over design D1 and property p, each with output 0]
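The miter idea can be illustrated on tiny functions. The sketch below is only an illustration: it proves the miter output constant 0 by exhaustive enumeration, which stands in for the AIG transformations and SAT calls used on real designs. The names `miter`, `proved_constant_zero`, `d1`, `d2`, and `d3` are made up for this example.

```python
from itertools import product

def miter(f, g):
    """Miter of two single-output designs: 1 exactly where they differ."""
    return lambda *x: f(*x) ^ g(*x)

def proved_constant_zero(m, n_inputs):
    """Exhaustive stand-in for a proof that the miter output is constant 0."""
    return all(m(*x) == 0 for x in product((0, 1), repeat=n_inputs))

d1 = lambda a, b, c: (a & b) | (a & c)   # design 1
d2 = lambda a, b, c: a & (b | c)         # design 2, a factored form of design 1
d3 = lambda a, b, c: a | (b & c)         # a different function

equivalent = proved_constant_zero(miter(d1, d2), 3)
different = not proved_constant_zero(miter(d1, d3), 3)
```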
ABC vs. Other Tools
Industrial
+ well documented, fewer bugs
- black-box, push-button, no source code, often expensive
SIS
+ traditionally very popular
- data structures / algorithms outdated, weak sequential synthesis
VIS
+ very good implementation of BDD-based verification algorithms
- not meant for logic synthesis, does not feature the latest SAT-based implementations
MVSIS
+ allows for multi-valued and finite-automata manipulation
- not meant for binary synthesis, lacking recent implementations
How Is ABC Different From SIS?
[Figure: a Boolean network in SIS and the equivalent AIG in ABC]
An AIG is a Boolean network of 2-input AND nodes and inverters (dotted lines)
One AIG Node – Many Cuts
[Figure: combinational AIG showing different cuts for the same node]
Manipulating AIGs in ABC
–Each node in an AIG has many cuts
–Each cut is a different SIS node
–No a priori fixed boundaries
Implies that AIG manipulation with cuts is equivalent to working on many Boolean networks at the same time
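A minimal sketch of bottom-up cut enumeration shows how one node ends up with several cuts. This is a textbook-style illustration under simplifying assumptions (no dominated-cut filtering, no cut limit per node), not ABC's implementation; `enumerate_cuts` and the example network are hypothetical.

```python
from itertools import product

def enumerate_cuts(fanins, order, k):
    """Enumerate k-feasible cuts of every node.

    fanins: AND node -> (fanin0, fanin1); primary inputs have no entry.
    order:  all nodes in topological order (inputs first).
    """
    cuts = {}
    for n in order:
        result = {frozenset([n])}                 # trivial cut: the node itself
        if n in fanins:
            f0, f1 = fanins[n]
            for c0, c1 in product(cuts[f0], cuts[f1]):
                merged = c0 | c1                  # union of one cut per fanin
                if len(merged) <= k:
                    result.add(merged)
        cuts[n] = result
    return cuts

# x = a AND b, y = b AND c, z = x AND y
fanins = {"x": ("a", "b"), "y": ("b", "c"), "z": ("x", "y")}
order = ["a", "b", "c", "x", "y", "z"]
cuts3 = enumerate_cuts(fanins, order, k=3)
cuts2 = enumerate_cuts(fanins, order, k=2)
```

With k = 3, node z has five cuts: {z}, {x, y}, {x, b, c}, {a, b, y}, and {a, b, c}. Each one delimits a different logic cone rooted at z.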
Comparison of Two Syntheses
“Classical” synthesis
Boolean network
Network manipulation (algebraic)
–Elimination
–Factoring/Decomposition
–Speedup
Node minimization
–Espresso
–Don’t cares computed using BDDs
–Resubstitution
Technology mapping
–Tree based
ABC “contemporary” synthesis
AIG network
DAG-aware AIG rewriting (Boolean)
–Several related algorithms: rewriting, refactoring, balancing, speedup
Node minimization
–Boolean decomposition
–Don’t cares computed using simulation and SAT
–Resubstitution with don’t cares
Technology mapping
–Cut based with choice nodes
Existing Capabilities of ABC
Combinational logic synthesis
–Fast, scalable, good quality
Technology mapping with structural choices
–Cut-based, heuristic, good area/delay, flexible
Sequential synthesis
–Innovative, scalable, verifiable
Sequential verification
–Integrated, interacts with synthesis
Command “speedup”: Timing Criticality
Critical nodes
–Used by many traditional algorithms
Critical edges
–Used by our algorithm
We pre-compute critical edges of critical nodes
–Reduces computation
An edge between critical nodes may not be critical
–See illustration: edge 1
[Figure: network from primary inputs to primary outputs with critical nodes and edges marked]
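The difference between critical nodes and critical edges can be checked with a small unit-delay timing analysis. The sketch below is an assumption-laden illustration (unit delay per node, a made-up network, and a hypothetical function `critical_edges`), not the code behind the “speedup” command.

```python
def critical_edges(fanins, order, outputs, w=0):
    """Unit-delay timing analysis returning critical nodes and critical edges.

    fanins: node -> list of fanin nodes (primary inputs have no entry).
    order:  all nodes in topological order.
    w:      timing window; slack <= w counts as critical.
    """
    arrival = {}
    for n in order:
        ins = fanins.get(n, [])
        arrival[n] = 1 + max(arrival[f] for f in ins) if ins else 0
    t = max(arrival[o] for o in outputs)
    required = {n: t for n in order}
    for n in reversed(order):                     # fanouts are visited first
        for f in fanins.get(n, []):
            required[f] = min(required[f], required[n] - 1)
    nodes = {n for n in order if required[n] - arrival[n] <= w}
    edges = {(f, n) for n in order for f in fanins.get(n, [])
             if required[n] - 1 - arrival[f] <= w}   # slack of edge f -> n
    return nodes, edges

# n1 feeds n3 both through n2 and directly
fanins = {"n1": ["a", "b"], "n2": ["n1", "c"], "n3": ["n2", "n1"]}
order = ["a", "b", "c", "n1", "n2", "n3"]
nodes, edges = critical_edges(fanins, order, outputs=["n3"])
```

Here n1 and n3 are both critical, yet the direct edge from n1 to n3 has slack 1 and is not critical; only the path through n2 is.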
Delay-Oriented Restructuring
Using traditional MUX-restructuring
–AKA generalized select transform
[Figure: MUX-based restructuring in which x and y are the critical edge inputs]
Overall Algorithm
mapped netlist performSpeedup (
    subject graph S,   // S is an And-Inverter Graph
    mapped netlist M,  // M was previously derived by tech-mapping of S
    timing window w,   // w is used to detect the critical paths
    logic depth l,     // l is used to detect a logic cone rooted at a node
    edge count p )     // p limits the number of critical edges of the cone
{
    perform timing analysis of M with unit-delay or LUT-library model;
    pre-compute critical section of M as nodes n such that 0 ≤ slack(n) ≤ w;
    pre-compute timing-critical edges connecting these nodes;
    for each timing critical node n {
        find cone C of M that extends l levels down from n;
        pick the set of timing-critical edges V feeding into C;
        if the number of edges in V exceeds p, continue;
        find logic cone C’ in S corresponding to C in M;
        find variables V’ in S corresponding to V in M;
        derive cofactors of the function of C’ w.r.t. variables in V’;
        build multiplexer tree C’’ of the cofactors using variables in V’;
        add structural choice C’ = C’’ to the subject graph S;
    }
    return mapped netlist M’ derived by mapping subject graph S with added choices;
}
(Slide annotation: done only once)
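The cofactoring and multiplexer-tree steps of the loop body amount to a Shannon expansion over the critical variables. The sketch below shows this on truth tables stored as Python integers; the helper names (`cofactor`, `var_mask`, `mux_tree`) and the example function are assumptions made for illustration and do not come from ABC.

```python
def cofactor(tt, n, var, value):
    """Truth table of f with variable `var` fixed to `value`.

    tt is an int whose bit m is f at minterm m; bit j of m is variable j.
    """
    out = 0
    for m in range(1 << n):
        src = (m | (1 << var)) if value else (m & ~(1 << var))
        if (tt >> src) & 1:
            out |= 1 << m
    return out

def var_mask(n, var):
    """Truth table of the projection function x_var."""
    return sum(1 << m for m in range(1 << n) if (m >> var) & 1)

def mux_tree(tt, n, crit_vars):
    """Rebuild f as nested MUXes selected by the critical variables."""
    if not crit_vars:
        return tt
    v, rest = crit_vars[0], crit_vars[1:]
    f0 = mux_tree(cofactor(tt, n, v, 0), n, rest)
    f1 = mux_tree(cofactor(tt, n, v, 1), n, rest)
    x = var_mask(n, v)
    full = (1 << (1 << n)) - 1
    return (x & f1) | (~x & full & f0)            # MUX(x, f1, f0)

# f = (a AND b) OR c with a, b, c as variables 0, 1, 2
n = 3
f = sum(1 << m for m in range(1 << n)
        if ((m & 1) and (m >> 1 & 1)) or (m >> 2 & 1))
rebuilt = mux_tree(f, n, [2, 0])                  # treat c and a as critical
```

The rebuilt function is identical to f, but the critical variables now only drive MUX selects, so in a circuit they would arrive close to the output of the cone.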
Experimental Results for “speedup”
Legend for the results table:
–LUT: number of LUTs
–Lev: number of LUT levels
–Delay: delay using LUT library
–Total: total runtime of Baseline
–Time1: the runtime of AIG restructuring only
–Time2: the total runtime of Speedup
–Geomean: geometric averages of columns
–Ratios: ratios of geometric averages
Basic Inner Core Algorithm (DSD)
We use a fast disjoint support decomposition (DSD) algorithm as our underlying subroutine
–follows Bertacco and Damiani, "The disjunctive decomposition of logic functions", ICCAD '97
–but uses heuristics to speed it up: no BDDs, uses truth tables
–limits the number of inputs to at most 16
Disjoint Support Decomposition (DSD) (Simple Disjunctive Decomposition)
Theorem 1 [Ashenhurst 1959]. For a completely specified Boolean function, there is a unique maximal DSD (up to the complementation of inputs and outputs and factoring of ANDs/ORs and XORs).
[Figure: DSD tree over inputs x1 to x5, and a simple decomposition F = H(D(a), c)]
Non-Disjoint Decomposition
Definition: A function F has an (a, b, c)-decomposition if it can be written as F(x) = H(D(a, b), b, c), where (a, b, c) is a partition of the variables x and D is a single-output function.
The variables in the set b are called the shared variables. The variables a are called the bound set and c the free set.
[Figure: block D with inputs a and b feeding block H, which also takes b and c]
Non-Disjoint Decomposition
Theorem 2: A function F has an (a, b, c)-decomposition if and only if each of the cofactors of F with respect to b has a DSD structure in which the variables a are in a separate sub-tree.
[Figure: DSD trees of two cofactors over x1 to x5, each with the bound-set variables in a separate sub-tree]
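Theorem 2 can be tried out on small functions. The sketch below uses a swapped-in test rather than a DSD package: for every assignment to the shared variables b, it applies the classical Ashenhurst column-multiplicity check (at most two distinct columns over the bound set a) to the cofactor. The function names and the example F are hypothetical.

```python
from itertools import product

def column_count(f, n, fixed, bound, free):
    """Number of distinct columns of f's decomposition chart for `bound`,
    with the variables in `fixed` (index -> value) held constant."""
    columns = set()
    for bv in product((0, 1), repeat=len(bound)):
        col = []
        for cv in product((0, 1), repeat=len(free)):
            x = [0] * n
            for idx, val in fixed.items():
                x[idx] = val
            for idx, val in zip(bound, bv):
                x[idx] = val
            for idx, val in zip(free, cv):
                x[idx] = val
            col.append(f(*x))
        columns.add(tuple(col))
    return len(columns)

def has_abc_decomposition(f, n, a, b):
    """True if F = H(D(a, b), b, c) exists, with c the remaining variables."""
    c = [i for i in range(n) if i not in a and i not in b]
    return all(column_count(f, n, dict(zip(b, bv)), a, c) <= 2
               for bv in product((0, 1), repeat=len(b)))

# x2 selects between two different functions of the bound set {x0, x1}
F = lambda x0, x1, x2, x3: (x0 & x1 & x3) if x2 else ((x0 | x1) ^ x3)

with_shared = has_abc_decomposition(F, 4, a=[0, 1], b=[2])
without_shared = has_abc_decomposition(F, 4, a=[0, 1], b=[])
```

For this F no disjoint decomposition with bound set {x0, x1} exists, but sharing x2 makes one possible, because each cofactor with respect to x2 depends on the bound set through a single function.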
Application of Factoring (uses Theorem 2)
Rewriting a k-LUT mapped circuit. For each LUT, and each cut of no more than 16 inputs,
–express the output of the LUT as a truth table in terms of the cut variables: F(x)
–find variables b such that the cofactors of F with respect to b are support reducing (we exhaustively look for up to two variables in the b set)
–take the best (a,b) set and decompose F = H(D(a,b),b,c)
–recursively decompose H and D if they do not fit into a k-LUT
–if there is an improvement, replace the LUTs in the cut with the new decomposition
Experimental results later
Windowing a Node in the Network for Don’t-Care Computation
Definition
–A window for a node in the network is the context in which the don’t-cares are computed
A window includes
–n levels of the TFI
–m levels of the TFO
–all re-convergent paths captured in this scope
Window with its PIs and POs can be considered as a separate network
[Figure: Boolean network (k-LUT mapped circuit) with a window for n = 3 and m = 3, showing the window PIs and window POs]
Care Set Representation
“Miter” constructed for the window POs: the window is compared with the same window with an inverter inserted at the output of node f, both driven by the same inputs x
If output is 1 then we care
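The care-set miter can be mimicked by enumeration on a toy window. In the sketch below, which is an illustration and not ABC code, `outputs(x, v)` is a hypothetical function returning the window POs when the node takes value v, so comparing `v` with `1 - v` plays the role of the inserted inverter.

```python
from itertools import product

def care_set(outputs, node, n_inputs):
    """Window input patterns for which flipping the node changes some PO."""
    return [x for x in product((0, 1), repeat=n_inputs)
            if outputs(x, node(x)) != outputs(x, 1 - node(x))]

# Window with inputs x = (a, b, c), node f = a AND b, single PO = f OR c
node = lambda x: x[0] & x[1]
outputs = lambda x, v: (v | x[2],)

care = care_set(outputs, node, 3)
```

In this window the node is observable only when c = 0; every pattern with c = 1 is a don’t-care for f.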
Resubstitution
Resubstitution considers a node in a Boolean network and expresses it using a different set of fanins
Computation can be enhanced by use of don’t cares
Resubstitution with Don’t-Cares
Consider all or some nodes in Boolean network. For each node
Create window
Select possible fanin nodes (divisors)
For each candidate subset of divisors
–Rule out some subsets using simulation
–Check resubstitution feasibility using SAT
–Compute resubstitution function using interpolation (a low-cost by-product of completed SAT proofs)
Update the network if there is an improvement
Resubstitution with Don’t Cares
Given:
–node function F(x) to be replaced
–care set C(x) for the node
–candidate set of divisors {g_i(x)} for re-expressing F(x)
Find:
–A resubstitution function h(y) such that F(x) = h(g(x)) on the care set
SPFD Theorem: Function h exists if and only if every pair of care minterms, x1 and x2, distinguished by F(x), is also distinguished by g_i(x) for some i
[Figure: node F(x) with care set C(x) and divisors g1, g2, g3, re-expressed as F’(x) = h(g)]
Checking Resubstitution using SAT
[Figure: miter for resubstitution check (SPFD theorem in practice)]
1. Note use of care set, C.
2. Resubstitution function exists if and only if SAT problem is unsatisfiable.
3. An h(g) is obtained by interpolation.
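The SPFD condition can be tested directly on small functions. The sketch below replaces the SAT check and interpolation with exhaustive enumeration over care minterms, so it only illustrates the condition itself; `resubstitute` and the example functions are hypothetical.

```python
from itertools import product

def resubstitute(F, C, divisors, n):
    """Return h as a table from divisor values to F's value on the care set,
    or None if some pair of care minterms distinguished by F is not
    distinguished by any divisor."""
    h = {}
    for x in product((0, 1), repeat=n):
        if not C(*x):
            continue                              # don't-care minterm
        key = tuple(g(*x) for g in divisors)
        if h.setdefault(key, F(*x)) != F(*x):     # same divisor values, different F
            return None
    return h

F = lambda a, b: a ^ b
g1 = lambda a, b: a | b
care = lambda a, b: 1 - (a & b)                   # a = b = 1 never occurs
everything = lambda a, b: 1                       # no don't-cares

h_with_dc = resubstitute(F, care, [g1], 2)
h_without_dc = resubstitute(F, everything, [g1], 2)
```

With the don’t-care, F can be re-expressed as h(g1) = g1; without it, the minterms (0,1) and (1,1) are distinguished by F but not by g1, so no h exists.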
Experimental Results
The Main Idea
Consider registers and nodes of a design
–Detect candidate equivalences in this set using random/guided simulation
–Prove candidates by K-step induction
–Merge the resulting equivalences
This is a subset of sequential synthesis with
–Practical advantages (does not move registers, etc)
–Scales to large designs
–Offers substantial improvements
–Comes with a verification guarantee
Base Case and Inductive Case
Candidate equivalences: {A,B}, {C,D}
Base case: proving internal equivalences in initialized frames 0 through K-1
Inductive case: assuming internal equivalences in uninitialized frames 0 through K, and proving internal equivalences in a topological order in frame K
[Figure: time-frame expansions with primary inputs PI 0, PI 1, … and SAT checks on the candidate pairs; the base case starts from the initial state, the inductive case from a symbolic state]
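The prove-and-refine loop can be sketched for K = 1 on a design small enough to enumerate. This is an illustration under strong assumptions: explicit enumeration of states and inputs stands in for the SAT calls, and `prove_by_induction` with its example design is made up for this purpose.

```python
from itertools import product

def prove_by_induction(n_regs, n_pis, init, next_state, signals, candidates):
    """Keep the candidate pairs that pass 1-step induction, dropping failures
    and repeating until the remaining set is inductive."""
    states = list(product((0, 1), repeat=n_regs))
    inputs = list(product((0, 1), repeat=n_pis))

    def equal(pair, s, x):
        v = signals(s, x)
        return v[pair[0]] == v[pair[1]]

    cands = set(candidates)
    while True:
        failed = set()
        for pair in cands:
            # Base case: the pair agrees in the initialized frame.
            base_ok = all(equal(pair, init, x) for x in inputs)
            # Inductive case: from any state where all candidates hold,
            # the pair still agrees in the next frame.
            step_ok = all(
                equal(pair, next_state(s, x0), x1)
                for s, x0, x1 in product(states, inputs, inputs)
                if all(equal(p, s, x0) for p in cands)
            )
            if not (base_ok and step_ok):
                failed.add(pair)
        if not failed:
            return cands
        cands -= failed                           # weaker assumption, so retry

# Two registers toggled by the same input; A, B are register outputs
next_state = lambda s, x: (s[0] ^ x[0], s[1] ^ x[0])
signals = lambda s, x: {"A": s[0], "B": s[1],
                        "C": s[0] & x[0], "D": s[1] | x[0]}

proved = prove_by_induction(2, 1, (0, 0), next_state, signals,
                            [("A", "B"), ("C", "D")])
```

A and B differ in unreachable states such as (0, 1), so they are not combinationally equivalent, yet induction proves them sequentially equivalent. The pair {C, D} fails the base case and is dropped.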
Dynamic Partitioning (register correspondence)
Illustration for two candidate equiv. classes: {A,B}, {C,D}
[Figure: one time-frame of the design split into two partitions. Partition 1 assumes A = B and C = D and checks A’ = B’; Partition 2 assumes A = B and C = D and checks C’ = D’]
Academic Benchmarks Columns “Baseline”, “Reg Corr” and “Sig Corr” show geometric means.
Industrial Benchmarks In case of multiple clock domains, optimization was applied only to the domain with the largest number of registers.
Reasons for Large Improvements
Redundancy introduced by HDL compilers
Early logic duplication by the designer
Accidental sequential redundancies
Sequential redundancies present due to reuse of design components that had more functionality than needed
Motivation
Fewer pin-to-pin connections should make the design easier to place and route
Newer FPGAs allow two outputs per LUT
–Thus fewer pin-to-pin connections should produce a mapping that “packs” better into dual-output LUTs
Area Recovery Overview
1. Perform delay-optimal mapping
2. Recover area off critical paths
–Area-flow (global view): chooses cuts with better logic sharing
–Exact local area (local view)
(Both are important)
3. New idea: Cut-based area recovery algorithms can be extended to minimize edges (pin-to-pin connections)
WireMap Algorithm
1. Perform delay-optimal mapping
2. Recover area off critical paths
–Area-flow (global view): break ties with minimum edge flow
–Exact local area (local view): break ties with exact local edge count
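The tie-breaking rule in the area-flow step can be sketched as follows. The formulas are an assumption based on the commonly given definitions (area flow = LUT area plus the area flows of the cut leaves, divided by the estimated fanout; edge flow defined the same way with the number of cut leaves in place of the LUT area), and `best_cut` is a hypothetical helper, not the ABC mapper.

```python
def best_cut(cuts, area_flow, edge_flow, fanout=1, lut_area=1):
    """Pick the cut with minimum area flow, breaking ties by edge flow.

    cuts:       candidate cuts of one node, each a tuple of leaf names.
    area_flow:  leaf -> area flow already computed for that leaf.
    edge_flow:  leaf -> edge flow already computed for that leaf.
    """
    def flows(cut):
        af = (lut_area + sum(area_flow[l] for l in cut)) / fanout
        ef = (len(cut) + sum(edge_flow[l] for l in cut)) / fanout
        return af, ef                             # compared lexicographically
    return min(cuts, key=flows)

pis = "abcde"
area_flow = {**{p: 0.0 for p in pis}, "x": 0.5, "y": 0.5}
edge_flow = {**{p: 0.0 for p in pis}, "x": 1.0, "y": 1.0}

cuts = [("c", "d", "e", "y"), ("a", "b", "x")]
chosen = best_cut(cuts, area_flow, edge_flow)
```

Both cuts have the same area flow of 1.5, so the three-leaf cut wins on edge flow (4 against 5), which is the preference for fewer pin-to-pin connections.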
Experimental Setup
WireMap implemented in ABC
Compared WireMap against two algorithms in ABC
–Baseline: basic mapping with area recovery
–Mapping with Structural Choices: mapping with area recovery for several netlists produced by synthesis
WireMap was implemented on top of mapping with choices
Used VPR to place/route design for wirelength and critical path delays
Used maximum cardinality matching to pack single-output LUTs into dual-output LUTs using …
Results Summary
Comparing WireMap against the best mapping with structural choices in ABC
WireMap results:
–Reduction in edges by 9.3%
–Reduction in dual-output LUT count by 9.4%, compared to mapping with choices (single-output LUT count only reduced by 1.3%)
–Reduction in wire length by 8.5%
–Reduction in power by 20%
Comb and Seq Boxes
Treating Boxes as Black For simplicity, boxes can be treated as “black”. Thus box outputs become inputs to the rest of the logic and box inputs become outputs. Delay and logic information is lost.
Treating Boxes as White Example: Nodes o1 and o3 may be equivalent in the design, but this equivalence cannot be detected if the boxes are treated as black. Solution: Consider logic inside white boxes for synthesis, but keep it unchanged during synthesis and mapping.
Future Work
Integrating synthesis/mapping/retiming
Co-developing synthesis and verification
Integrating synthesis with place and route
Improving AIG-based synthesis and mapping
Creating special configurable design flows
Supporting emerging technologies
To Learn More
Visit ABC webpage
Read recent papers
Send