Alan Mishchenko Robert Brayton UC Berkeley

Sequential Optimization by Detecting and Merging Sequentially Equivalent Nodes

Overview
  Introduction
  Computations
    SAT sweeping
    Induction
    Partitioning
  Verification
  Experiments
  Future work

Introduction
  Combinational synthesis
    Cuts at the register boundary
    Preserves state encoding, scan chains & test vectors
    No sequential optimization – easy to verify
  The traditional sequential synthesis
    Runs retiming, re-encoding, uses sequential don’t-cares, etc.
    Changes state encoding, invalidates scan chains & test vectors
    Some degree of sequential optimization – hard to verify
  The proposed sequential synthesis
    Merges sequentially equivalent registers and internal nodes
    Minor change to state encoding, scan chains & test vectors
    Some degree of sequential optimization – easy to verify!

Comb and Seq Equivalences
  Comb equivalence: nodes A and B are equal for all input combinations, in every state of the complete state space
  Seq equivalence: nodes A and B are equal for all input combinations in every reachable state (they may differ in unreachable states)
  [Figure: structural view of nodes A and B; functional view of the reachable state space inside the complete state space]
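The distinction between the two equivalences can be illustrated on a toy circuit. The sketch below is only an illustration: the two-register circuit, the node functions, and all names are made up for this example, and the reachable states are enumerated explicitly, which the approach in these slides avoids.

```python
from itertools import product

# Hypothetical toy circuit: two registers that both load the same input x,
# so every reachable state has equal register values.
def next_state(state, x):
    return (x, x)

def node_a(state, x):
    return state[0] & x

def node_b(state, x):
    return state[1] & x

def reachable(init):
    """Explicit-state traversal; only feasible for tiny examples."""
    seen, frontier = {init}, [init]
    while frontier:
        s = frontier.pop()
        for x in (0, 1):
            t = next_state(s, x)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

def equal_on(states):
    """True if nodes A and B agree for every input in every given state."""
    return all(node_a(s, x) == node_b(s, x) for s in states for x in (0, 1))

ALL_STATES = set(product((0, 1), repeat=2))
REACHABLE = reachable((0, 0))
comb_equiv = equal_on(ALL_STATES)   # compared over the complete state space
seq_equiv = equal_on(REACHABLE)     # compared over reachable states only
```

Here the nodes differ in the unreachable state (1, 0), so they are sequentially but not combinationally equivalent.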

How Does It Work?
  Can handle 1M gates and 100K flops flat, with runtime in minutes on a single core
  Uses a SAT solver to compute subsets of reachable states without computing the full set of reachable states
  No BDDs, no partitioning, no hints based on names

Why Improvements?
  Why do we get 10%+ average reduction in the number of flops/area/power for some design families and benchmark suites?
    Redundant registers generated by RTL compilers
    Design reuse, when redundant blocks (or redundant functions) are kept for the sake of design integrity
    Logic duplication for modularity

Combinational SAT Sweeping
  Naïve CEC approach – SAT solving
    Build output miter and call SAT
    Works well for many easy problems
  Better CEC approach – SAT sweeping based on incremental SAT solving
    Detects possibly equivalent nodes using simulation
      Candidate constant nodes
      Candidate equivalent nodes
    Runs SAT on the intermediate miters in a topological order
    Refines the candidates using counterexamples
  [Figure: applying SAT to the output miter]
  [Figure: proving internal equivalences in a topological order, with SAT calls SAT-1, SAT-2, SAT-3 on nodes A, B, C, D]
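The simulate, check, and refine loop described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: the five node functions are invented, an exhaustive search over three inputs stands in for the SAT call on each intermediate miter, and the topological ordering of checks is not modeled.

```python
from itertools import product

# Hypothetical nodes over inputs (a, b, c); n5 looks like n1 on weak patterns.
NODES = {
    "n1": lambda a, b, c: a & b,
    "n2": lambda a, b, c: b & a,
    "n3": lambda a, b, c: a | b,
    "n4": lambda a, b, c: 1 - ((1 - a) & (1 - b)),
    "n5": lambda a, b, c: a & b & c,
}

def classes_from_simulation(patterns):
    """Group nodes with identical simulation signatures into candidate classes."""
    groups = {}
    for name, fn in NODES.items():
        signature = tuple(fn(*p) for p in patterns)
        groups.setdefault(signature, []).append(name)
    return [g for g in groups.values() if len(g) > 1]

def find_counterexample(x, y):
    """Stand-in for a SAT call on the miter XOR(x, y): exhaustive search."""
    for p in product((0, 1), repeat=3):
        if NODES[x](*p) != NODES[y](*p):
            return p
    return None

def sweep(patterns):
    """Check candidates against their representative; refine on counterexamples."""
    patterns = list(patterns)
    while True:
        classes = classes_from_simulation(patterns)
        cex = None
        for cls in classes:
            rep = cls[0]
            for other in cls[1:]:
                cex = find_counterexample(rep, other)
                if cex is not None:
                    break
            if cex is not None:
                break
        if cex is None:
            return classes, patterns   # every remaining class is proved
        patterns.append(cex)           # the counterexample splits a class

PROVED, FINAL_PATTERNS = sweep([(0, 0, 0), (1, 1, 1), (0, 1, 0)])
```

With these starting patterns, n5 is initially grouped with n1 and n2; the counterexample (1, 1, 0) is added to the simulation patterns and separates it.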

Sequential SAT Sweeping
  Sequential SAT sweeping is similar to the combinational one in that it detects node equivalences
    The difference is that the equivalences are sequential
      They hold only in the reachable state space
    Every comb. equivalence is a seq. one, but not vice versa
      It makes sense to run comb. SAT sweeping beforehand
  Sequential equivalence is proved by K-step induction
    Base case
    Inductive case
  Efficient implementation of induction is key!

Base Case and Inductive Case
  Candidate equivalences: {A,B}, {C,D}
  Base case: proving internal equivalences in initialized frames 0 through K-1 (frame 0 starts from the initial state)
  Inductive case: assuming internal equivalences in uninitialized frames 0 through K-1 (frame 0 starts from a symbolic state), then proving internal equivalences in a topological order in frame K
  [Figure: unrolled timeframes with inputs PI0, PI1, ..., PIk and SAT calls SAT-1 through SAT-4 on nodes A, B, C, D]
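The base and inductive cases can be made concrete with a brute-force check on a tiny circuit. The sketch below is an assumption-laden illustration, not the SAT-based prover from the slides: it enumerates states explicitly, handles only properties of the register state (such as a register equivalence), and uses a made-up three-register circuit whose candidate equivalence needs K = 2.

```python
from itertools import product

def k_induction(init, step, prop, n_regs, n_inputs, k):
    """Brute-force K-step induction for a state property prop(state) -> bool."""
    inputs = list(product((0, 1), repeat=n_inputs))
    # Base case: prop holds in initialized frames 0 through K-1.
    frontier = {init}
    for _ in range(k):
        if not all(prop(s) for s in frontier):
            return False
        frontier = {step(s, x) for s in frontier for x in inputs}
    # Inductive case: start from any state (the "symbolic" state), assume prop
    # in frames 0 through K-1, and require it in frame K.
    frontier = set(product((0, 1), repeat=n_regs))
    for _ in range(k):
        frontier = {step(s, x) for s in frontier if prop(s) for x in inputs}
    return all(prop(s) for s in frontier)

# Hypothetical circuit with registers (a, b, c): a' = x, b' = x XOR c, c' = 0.
def step(state, x):
    a, b, c = state
    (i,) = x
    return (i, i ^ c, 0)

def regs_equal(state):
    """Candidate equivalence: register a equals register b."""
    return state[0] == state[1]

INIT = (0, 0, 0)
proved_k1 = k_induction(INIT, step, regs_equal, n_regs=3, n_inputs=1, k=1)
proved_k2 = k_induction(INIT, step, regs_equal, n_regs=3, n_inputs=1, k=2)
```

With K = 1 the unreachable state (0, 0, 1) satisfies the assumption but leads to a violation, so the proof fails; with K = 2 no two-frame path through that state satisfies the assumptions, and the equivalence is proved.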

Efficient Implementation
  Both base and inductive cases of K-step induction are runs of combinational SAT sweeping
    Tricks and know-how of combinational sweeping are applicable
    The same integrated package can be used
      Starts with simulation
      Performs node checking in a topological order
      Benefits from the counter-example simulation
  Speculative reduction
    Has to do with how the assumptions are made (see next slide)

Speculative Reduction
  Inputs to the inductive case
    Sequential circuit
    The number of frames to unroll (K)
    Candidate equivalence classes
      One node in each class is designated as the representative node
      Currently the representatives are the first nodes in a topological order
  Speculative reduction moves fanouts to the representative nodes
    Makes 80% of the constraints redundant
    Dramatically simplifies the resulting timeframes (observed 3x reductions)
    Leads to saving 100-1000x in runtime during incremental SAT solving
  [Figure: adding assumptions for nodes A and B without speculative reduction vs. with speculative reduction]
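The idea of moving fanouts to representatives can be sketched on a small netlist. Everything below is an illustrative assumption: the netlist is a plain dictionary of two-input gates with invented names, whereas a real implementation would work on an AIG with complemented edges and structural hashing.

```python
def speculative_reduce(netlist, order, classes):
    """Redirect every fanin to its class representative.

    netlist: gate name -> tuple of fanin names (inputs are absent)
    order:   all names in topological order
    classes: candidate equivalence classes (lists of names)
    Returns the reduced netlist and the (representative, member) constraints
    that still have to be assumed or proved.
    """
    position = {name: i for i, name in enumerate(order)}
    repr_of, constraints = {}, []
    for cls in classes:
        rep = min(cls, key=position.__getitem__)  # first in topological order
        for node in cls:
            if node != rep:
                repr_of[node] = rep
                constraints.append((rep, node))
    reduced = {}
    for node in order:
        if node in netlist:
            reduced[node] = tuple(repr_of.get(f, f) for f in netlist[node])
    return reduced, constraints

# Hypothetical netlist: A and B are candidates, and so are C and D.
ORDER = ["x", "y", "z", "A", "B", "C", "D"]
NETLIST = {"A": ("x", "y"), "B": ("y", "z"), "C": ("A", "x"), "D": ("B", "x")}
REDUCED, CONSTRAINTS = speculative_reduce(NETLIST, ORDER, [["A", "B"], ["C", "D"]])
```

After the reduction, D reads from A instead of B, so C and D have identical fanins. The constraint C = D then holds structurally and needs no SAT call, which is one way a constraint can become redundant.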

Other Observations
  Surprisingly, the following are found to be of little or no importance for speeding up the inductive prover
    The quality of initial equivalence classes
      How much simulation (semi-formal filtering) was applied
    AIG rewriting on speculated timeframes
      Although the AIG can be reduced by 20%, incremental SAT runs the same
    The quality of AIG-to-CNF conversion
      Naïve conversion (1 AIG node = 3 clauses) works just fine
  Open question: Given these observations, how to speed up this type of incremental SAT?
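The naïve conversion mentioned above, three clauses per AND node, is the standard Tseitin encoding of an AND gate. A minimal sketch follows, assuming DIMACS-style integer literals (negative means complemented); the function names and the example AIG are hypothetical.

```python
from itertools import product

def and_to_cnf(out, in0, in1):
    """Three clauses encoding out <-> (in0 AND in1)."""
    return [[-out, in0], [-out, in1], [out, -in0, -in1]]

def aig_to_cnf(and_nodes):
    """and_nodes: iterable of (out, in0, in1) literal triples."""
    clauses = []
    for out, in0, in1 in and_nodes:
        clauses.extend(and_to_cnf(out, in0, in1))
    return clauses

def satisfied(clauses, assignment):
    """assignment: variable number -> bool."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )

# Hypothetical AIG: 3 = AND(1, 2), 4 = AND(3, NOT 1).
CNF = aig_to_cnf([(3, 1, 2), (4, 3, -1)])
```

The clause count is exactly three times the number of AND nodes, and the clauses for one node are satisfied precisely when the output equals the AND of its two fanin literals.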

Verification after Sequential Synthesis
  Poison and antidote are the same!
  Two conceptually similar inductive provers can be used
    during synthesis – to prove seq equivalence of registers and nodes
    during verification – to prove seq equivalence of registers, nodes, and POs of two circuits
  Verification mentioned here is formal, that is, “unbounded” and “general-case”
    No limit on the input sequence is imposed (unlike BMC)
    No information about synthesis is passed to the verification tool
  The runtimes of synthesis and verification are comparable
    Scales to 100K-register designs – due to partitioning for induction
  [Figure: synthesis problem (circuit N1 with input X) and equivalence checking problem (circuits N1 and N2 with input X and miter M)]

Integrated SEC Flow
  The following is the sequence of transformations currently applied by the integrated SEC in ABC (command “dsec”)
    creating sequential miter (“miter -c”)
      PIs/POs are paired by name; if some registers have don’t-care init values, they are converted by adding new PIs and muxes; all logic is represented in the form of an AIG
    sequential sweep (“scl”)
      removes logic that does not fan out into POs
    structural register sweep (“scl -l”)
      removes stuck-at-constant and combinationally equivalent registers
    most forward retiming (“retime -M 1”) (disabled by switch “-r”, e.g. “dsec -r”)
      moves all registers forward and computes new initial state
    partitioned register correspondence (“lcorr”)
      merges sequentially equivalent registers (completely solves SEC after retiming)
    combinational SAT sweeping (“fraig”)
      merges combinationally equivalent nodes before running signal correspondence
    for ( K = 1; K ≤ 16; K = K * 2 )
      signal correspondence (“ssw”) // merges seq equivalent signals by K-step induction
      AIG rewriting (“drw”) // minimizes and restructures combinational logic
      most forward retiming // moves registers forward after logic restructuring
      sequential AIG simulation // targets satisfiable SAT instances
    post-processing (“write_aiger”)
      if sequential miter is still unsolved, dumps it into a file for future use

Example of Seq. Synthesis in ABC
abc 01> r iscas/blif/s38417.blif // reads in an ISCAS’89 benchmark
abc 02> st; ps // shows the AIG statistics after structural hashing
s38417 : i/o = 28/ 106 lat = 1636 and = 9238 (exor = 178) lev = 31
abc 03> ssw -K 1 -v // performs one round of signal correspondence using simple induction
Initial fraiging time = 0.27 sec
Simulating 9096 AIG nodes for 32 cycles ... Time = 0.06 sec
Original AIG = 9096. Init 2 frames = 84. Fraig = 82. Time = 0.01 sec
Before BMC: Const = 5031. Class = 430. Lit = 9173.
After BMC: Const = 5031. Class = 430. Lit = 9173.
0 : Const = 5031. Class = 430. L = 9173. LR = 1928. NR = 3140.
1 : Const = 4883. Class = 479. L = 8964. LR = 1554. NR = 2978.
…
28 : Const = 145. Class = 177. L = 756. LR = 198. NR = 9099.
29 : Const = 145. Class = 176. L = 753. LR = 195. NR = 9090.
SimWord = 1. Round = 2025. Mem = 0.38 Mb. LitBeg = 9173. LitEnd = 753. ( 8.21 %).
Proof = 5022. Cex = 2025. Fail = 0. FailReal = 0. C-lim = 10000000. ImpRatio = 0.00 %
NBeg = 9096. NEnd = 8213. (Gain = 9.71 %). RBeg = 1636. REnd = 1345. (Gain = 17.79 %).
AIG simulation = 2.25 sec
AIG traversal = 0.01 sec
SAT solving = 3.71 sec
Unsat = 0.16 sec
Sat = 3.55 sec
Fail = 0.00 sec
Class refining = 0.38 sec
TOTAL RUNTIME = 8.51 sec
abc 04> ps // shows the AIG statistics after merging equivalent registers and nodes
s38417 : i/o = 28/ 106 lat = 1345 and = 8213 (exor = 116) lev = 31
abc 04> dsec -r // runs the unbounded SEC on the resulting network against the original one
Networks are equivalent. Time = 15.59 sec
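The percentages printed in this log can be checked directly from the before and after counts. The short sketch below only redoes that arithmetic; the helper name is made up, and the numbers are copied from the log above.

```python
def reduction_percent(before, after):
    """Relative reduction, in percent, from `before` to `after`."""
    return 100.0 * (before - after) / before

# Counts taken from the "ssw" log above.
node_gain = reduction_percent(9096, 8213)   # NBeg -> NEnd (AIG nodes)
reg_gain = reduction_percent(1636, 1345)    # RBeg -> REnd (registers)
lits_left = 100.0 * 753 / 9173              # LitEnd as a share of LitBeg

print(f"node gain = {node_gain:.2f} %")
print(f"reg gain  = {reg_gain:.2f} %")
print(f"lits left = {lits_left:.2f} %")
```

The register count after merging (1345) also matches the "lat = 1345" value reported by the following "ps" command.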

Experimental Results
  Public benchmarks
    25 test cases
      ITC ’99 (b14, b15, b17, b20, b21, b22)
      ISCAS ’89 (s13207, s35932, s38417, s38584)
      IWLS ’05 (systemcaes, systemcdes, tv80, usb_funct, vga_lcd, wb_conmax, wb_dma, ac97_ctrl, aes_core, des_area, des_perf, ethernet, i2c, mem_ctrl, pci_spoci_ctrl)
  Industrial benchmarks
    50 test cases
      Nothing else is known
  Workstation
    Intel Xeon 2-CPU 4-core, 8 GB RAM

ABC Scripts
  Baseline
    choice; if; choice; if; choice; if // comb synthesis and mapping
  Register correspondence (Reg Corr)
    scl -l // structural register sweep
    lcorr // register correspondence using partitioned induction
    dsec -r // SEC
  Signal correspondence (Sig Corr)
    ssw // signal correspondence using non-partitioned induction

Public Benchmarks Columns “Baseline”, “Reg Corr” and “Sig Corr” show geometric means.
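For reference, a geometric mean is the n-th root of the product of n values, which makes it a natural summary for ratios across benchmarks of very different sizes. The sketch below uses made-up ratios purely to show the computation; they are not results from these slides.

```python
import math

def geometric_mean(values):
    """n-th root of the product, computed via logarithms for stability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

RATIOS = [1.0, 4.0]                  # hypothetical values, for illustration only
GEO = geometric_mean(RATIOS)         # sqrt(1.0 * 4.0)
ARITH = sum(RATIOS) / len(RATIOS)    # arithmetic mean, for comparison
```

The geometric mean is never larger than the arithmetic mean, so a single large outlier influences it less.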

ITC / ISCAS Benchmarks (details)

IWLS’05 Benchmarks (details)

ITC / ISCAS Benchmarks (runtime)

IWLS’05 Benchmarks (runtime)

Industrial Benchmarks In case of multiple clock domains, optimization was applied only to the domain with the largest number of registers.

Future
  Continue tuning for scalability
    Speculative reduction
    Partitioning
  Experiment with new ideas
    Unique-state constraints
    Interpolate when induction fails
    Synthesizing equivalence
  Go beyond merging sequential equivalences
    Add logic restructuring using subsets of unreachable states
    Add retiming (improves delay on top of reg/area reductions)
    Add iteration (led to improvements in other synthesis projects)
    etc.