Delay Optimization using SOP Balancing

Presentation transcript:

Delay Optimization using SOP Balancing
Alan Mishchenko and Robert Brayton (UC Berkeley), Stephen Jang (Agate Logic Inc.), Victor Kravets (IBM)

Outline
- Delay optimization is an important problem
- AIG is used for synthesis and mapping
- Minimizing AIG level leads to delay reduction
- Contribution: algorithm for AIG level minimization
  - Very simple idea, remarkable consequences!
- Experimental results
- Conclusions

Delay Models
- AIG level count: useful to measure delay before mapping
- Library delay model: useful to measure delay after mapping
- Both metrics are approximate
- The real delay should be measured after P&R

AIG Definition and Examples
- AIG is a Boolean network composed of two-input ANDs and inverters
- Example 1: F(a,b,c,d) = ab + d(ac' + bc), 6 nodes, 4 levels
- Example 2: F(a,b,c,d) = ac'(b'd')' + bc(a'd')' = ac'(b+d) + bc(a+d), 7 nodes, 3 levels
[Figure: Karnaugh map over ab and cd, and the AIG drawing, for each of the two structures]
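The two structures above implement the same function with different node and level counts. A minimal sketch that checks this by exhaustive simulation is given here; the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

def f_factored(a, b, c, d):
    # F = ab + d(ac' + bc): the 6-node, 4-level structure
    return (a and b) or (d and ((a and not c) or (b and c)))

def f_balanced(a, b, c, d):
    # F = ac'(b + d) + bc(a + d): the 7-node, 3-level structure
    return (a and (not c) and (b or d)) or (b and c and (a or d))

def equivalent(f, g, num_inputs=4):
    """Compare two Boolean functions on every input assignment."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product([False, True], repeat=num_inputs)
    )

print(equivalent(f_factored, f_balanced))
```

Exhaustive comparison is only practical for small functions such as this 4-input example.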

Three Simple Tricks
- Structural hashing
  - Makes sure AIG is stored in a compact form
  - Is applied during AIG construction
  - Propagates constants
  - Makes each node structurally unique
- Complemented edges
  - Represents inverters as attributes on the edges
  - Leads to fast, uniform manipulation
  - Does not use memory for inverters
  - Increases logic sharing using DeMorgan's rule
- Memory allocation
  - Uses fixed amount of memory for each node
  - Can be done by a simple custom memory manager
  - Even dynamic fanout manipulation is supported!
  - Allocates memory for nodes in a topological order
  - Optimized for traversal in the same topological order
  - Small static memory footprint for many applications
  - Computes fanout information on demand
[Figure: the same AIG over inputs a, b, c, d, without hashing and with hashing]
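The first two tricks can be illustrated with a small sketch. This is not ABC's data structure; the class name Aig, its methods, and the literal encoding (2 * node index, plus 1 for a complemented edge) are assumptions made for the example.

```python
class Aig:
    """Toy AIG with structural hashing and complemented edges (illustrative only)."""

    FALSE, TRUE = 0, 1  # the two literals of constant node 0

    def __init__(self):
        self.fanins = [None]  # node 0 is the constant; inputs also store None
        self.table = {}       # (literal0, literal1) -> node index

    def new_input(self):
        self.fanins.append(None)
        return 2 * (len(self.fanins) - 1)  # positive literal of the new node

    @staticmethod
    def neg(lit):
        # An inverter is just the low bit of the literal: no node is created.
        return lit ^ 1

    def and_(self, x, y):
        if x > y:
            x, y = y, x                    # canonical fanin order
        if x == self.FALSE:
            return self.FALSE              # 0 & y = 0
        if x == self.TRUE:
            return y                       # 1 & y = y
        if x == y:
            return x                       # y & y = y
        if x == (y ^ 1):
            return self.FALSE              # y & !y = 0
        node = self.table.get((x, y))
        if node is None:                   # create the node only if it is new
            self.fanins.append((x, y))
            node = len(self.fanins) - 1
            self.table[(x, y)] = node
        return 2 * node

    def or_(self, x, y):
        # DeMorgan: x + y = !(!x & !y), so ORs share nodes with ANDs.
        return self.neg(self.and_(self.neg(x), self.neg(y)))

    def num_ands(self):
        return len(self.table)


g = Aig()
a, b = g.new_input(), g.new_input()
same = g.and_(a, b) == g.and_(b, a)        # both calls return one shared node
```

Because every AND is looked up before it is created, constant propagation and node uniqueness come for free during construction.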

AIG: A Unifying Representation
- An underlying data structure for various computations
  - Represents both local and global functions
  - Used in rewriting, resubstitution, simulation, SAT sweeping, induction, etc.
- A unifying representation for the whole flow
  - Synthesis, mapping, and verification pass around AIGs
  - Stores multiple structures for mapping ('AIG with choices')
- The main functional representation in ABC
  - Foundation of 'contemporary' logic synthesis
  - Source of 'signature features' (speed, scalability, etc.)

AND-balancing
- Step 1: Covering
  - Cover AIG with non-overlapping multi-input ANDs of largest size without duplication
- Step 2: Tree decomposition
  - In a topological order, perform tree decomposition of multi-input ANDs to reduce delay

Step 1: Covering
- Cover AIG with non-overlapping multi-input ANDs of largest size without logic duplication (unique result)
[Figure: example AIG with nodes g and f over inputs a, b, c, d, e, shown before and after covering]
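One way to read the covering step is sketched below. It assumes that a multi-input AND grows from its root through non-complemented edges into AND nodes with a single fanout, and stops at complemented edges, inputs, and shared nodes so that no logic is duplicated. The dictionary format and the function names are hypothetical, not ABC's API.

```python
from collections import Counter

def fanout_counts(ands, outputs):
    """Count references to every node from AND fanins and from outputs."""
    refs = Counter(outputs)
    for fanin0, fanin1 in ands.values():
        refs[fanin0[0]] += 1
        refs[fanin1[0]] += 1
    return refs

def collect_and_leaves(ands, refs, root):
    """Return the (node, complemented) leaves of the largest AND rooted at root."""
    leaves = []

    def visit(node, complemented, is_root):
        boundary = (
            complemented                         # an inverter ends the AND
            or node not in ands                  # primary input
            or (not is_root and refs[node] > 1)  # shared node: do not duplicate
        )
        if boundary:
            if (node, complemented) not in leaves:
                leaves.append((node, complemented))
            return
        for child, child_complemented in ands[node]:
            visit(child, child_complemented, False)

    visit(root, False, True)
    return leaves

# g = a & (b & (c & !f)), f = d & e; each entry lists two (fanin, complemented) pairs
ands = {
    "f": (("d", False), ("e", False)),
    "n1": (("c", False), ("f", True)),
    "n2": (("b", False), ("n1", False)),
    "g": (("a", False), ("n2", False)),
}
refs = fanout_counts(ands, ["g"])
leaves = collect_and_leaves(ands, refs, "g")  # 4-input AND over a, b, c, !f
```

In this example the chain of three two-input ANDs under g collapses into one four-input AND, which the tree decomposition step can then rebuild with fewer levels.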

Step 2: Tree Decomposition
- In a topological order, decompose each multi-input AND using the arrival times of its fanins, to create a tree of two-input ANDs that minimizes the AIG level (non-unique result)
[Figure: example AIG with nodes g and f over inputs a, b, c, d, e, shown before and after tree decomposition]
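A minimal sketch of the tree decomposition, assuming unit delay per two-input AND: repeatedly combine the two earliest-arriving fanins. The function name is hypothetical, and only the resulting output level is computed, not the tree itself.

```python
import heapq

def balanced_and_level(arrivals):
    """Output level of a delay-optimal tree of 2-input ANDs over the fanins."""
    heap = list(arrivals)
    if not heap:
        raise ValueError("a multi-input AND needs at least one fanin")
    heapq.heapify(heap)
    while len(heap) > 1:
        early = heapq.heappop(heap)   # the two earliest signals are paired first
        late = heapq.heappop(heap)
        heapq.heappush(heap, max(early, late) + 1)
    return heap[0]

print(balanced_and_level([0, 0, 0, 0]))  # four inputs arriving together
print(balanced_and_level([0, 0, 0, 2]))  # one late-arriving fanin
```

Pairing the earliest signals first keeps late-arriving fanins close to the root, which is why the result depends on arrival times and not only on the number of fanins.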

AND-balancing (Example)
- Step 1: Covering
- Step 2: Tree decomposition
[Figure: example AIG before covering, after covering, and after tree decomposition; the delay is reduced from 5 levels to 3 levels]

SOP-balancing
- For each node, in a topological order:
  - Compute several k-input cuts
  - For each cut:
    - Compute truth table
    - Compute SOP
    - Perform delay-optimal tree balancing of the SOP
  - Select the cut with the smallest output arrival time
  - If there is an improvement in the output arrival time, update AIG structure
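The cut evaluation in the loop above can be sketched as follows. The sketch assumes the SOP of each cut is already available as a list of cubes (cut enumeration and truth-table-to-SOP conversion are omitted), that every cube has at least one literal, and that inverters are free, as on AIG edges. All names are hypothetical.

```python
import heapq

def balance(times):
    """Level of a delay-optimal tree of 2-input gates over the arrival times."""
    heap = list(times)
    heapq.heapify(heap)
    while len(heap) > 1:
        early = heapq.heappop(heap)
        late = heapq.heappop(heap)
        heapq.heappush(heap, max(early, late) + 1)
    return heap[0]

def sop_arrival(cubes, arrival):
    """Balance each cube as an AND tree, then balance the sum as an OR tree."""
    cube_times = [balance(arrival[var] for var in cube) for cube in cubes]
    return balance(cube_times)

def select_cut(cut_sops, arrival, current):
    """Pick the cut with the smallest arrival time; accept it only if it improves."""
    best = min(cut_sops, key=lambda name: sop_arrival(cut_sops[name], arrival))
    time = sop_arrival(cut_sops[best], arrival)
    return (best, time) if time < current else None

# F = abc + def + g with all leaves arriving at time 0
arrival = {v: 0 for v in "abcdefg"}
cut_sops = {"Cut1": [["a", "b", "c"], ["d", "e", "f"], ["g"]]}
print(sop_arrival(cut_sops["Cut1"], arrival))
print(select_cut(cut_sops, arrival, current=5))
```

An OR costs one level here because it is a two-input AND with complemented edges, so the same balancing routine serves both the cubes and the sum.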

SOP-balancing (Illustration)
- Example: F = abc + def + g
- SOP-balancing = AND-balancing for each cube and for the sum
- Given a set of cuts at a node (Cut1, Cut2, etc.), choose the cut whose output arrival time Di is the smallest among the arrival times of all cuts (D1, D2, etc.)
- If Di < D, where D is the arrival time of the original AIG structure, replace the original AIG structure with the AIG structure of the balanced SOP
[Figure: original AIG structure with output arrival time D, and Cut1, Cut2, Cut3 with arrival times D1, D2, D3]

Why Does SOP-balancing Reduce Delay More Than AND-balancing?
- AND-balancing is limited to multi-input ANDs
- SOP-balancing looks at larger functions
- In many cases, AND-balancing cannot reduce delay while SOP-balancing can
- Example:
  - F = ab + c(d + ef): 4 levels
  - F = ab + cd + cef: 3 levels
[Figure: AIGs of the SOP form (3 levels) and the factored form (4 levels)]
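The level counts in this example can be reproduced with a small sketch that models each form as a tree of two-input gates. The tuple encoding and the function names are assumptions for illustration; inverters are treated as free.

```python
from itertools import product

def level(expr):
    """Depth of an expression tree built from 2-input gates; inputs are level 0."""
    if isinstance(expr, str):
        return 0
    _, left, right = expr
    return 1 + max(level(left), level(right))

def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    return (lhs and rhs) if op == "and" else (lhs or rhs)

# F = ab + c(d + ef): every AND has two inputs, so AND-balancing finds nothing to do
factored = ("or", ("and", "a", "b"),
                  ("and", "c", ("or", "d", ("and", "e", "f"))))

# F = ab + cd + cef, with the cubes and the sum balanced
sop = ("or", ("or", ("and", "a", "b"), ("and", "c", "d")),
             ("and", ("and", "c", "e"), "f"))

same_function = all(
    evaluate(factored, dict(zip("abcdef", bits)))
    == evaluate(sop, dict(zip("abcdef", bits)))
    for bits in product([False, True], repeat=6)
)
print(level(factored), level(sop), same_function)
```

The SOP form duplicates the literal c, trading a small amount of area for one fewer level.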

Implementation
- The proposed algorithm is implemented by customizing the priority-cut-based technology mapper "if" in ABC
- Command: if -g [-K <num>] [-C <num>]
  - -g uses SOP-balancing for cut evaluation
  - -K <num> specifies the cut size
  - -C <num> specifies the number of cuts considered at a node
- Cost functions used to prioritize the cuts:
  - Delay: the arrival time of the output, measured using the number of levels of 2-input ANDs
  - Area: the size of the tree decomposition of the SOP, measured using the number of 2-input ANDs
- Reference: A. Mishchenko, S. Cho, S. Chatterjee, and R. Brayton, "Combinational and sequential mapping with priority cuts", Proc. ICCAD '07, pp. 354-361.

Experimental Setup (Standard Cells)
- Used a suite of industrial designs
  - Removed two outliers with very big delay improvement
- Used MCNC standard-cell library from SIS distribution
- Performed 3 runs (an exponent gives the number of times the script in parentheses is repeated)
  - Reference: (st; dch; map)^4
  - Run 1: (st; if -K 6 -g -C 8)(st; dch; map)^4
  - Run 2: (st; if -K 6 -g -C 8)^2(st; dch; map)^6
- Runtime impact
  - The runtime of if -g is close to the runtime of one round of mapping (about 60 sec for a design with 500K AIG nodes)

Experiments: Industrial Designs

Experiments: Example

Results after P&R

Experimental Setup (4-LUTs)
- Used a set of academic benchmarks from previous work
- Performed 4-LUT mapping (unit-delay, unit-area)
- Used 8-input cuts (with 8 cuts per node) during SOP balancing
- Performed 3 experimental runs (an exponent gives the number of times the script in parentheses is repeated)
  - Baseline: (st; if -K 4)
  - Choices: Baseline + (st; dch; if -K 4)^5
  - SOP balancing: Baseline + (st; dch; if -K 4)^2; (st; if -g -C 8 -K 8); (st; dch; if -K 4)^3

Experiments: Mapping into 4-LUTs

Conclusion
- Introduced AIGs
- Illustrated AND-balancing
  - A known way to reduce AIG level (command "balance" in ABC)
- Introduced SOP-balancing
  - A new way to reduce AIG level (command "if -g; st" in ABC)
- Performed experiments on industrial benchmarks
  - Delay reduction after mapping correlates with AIG level reduction
  - For standard cells (before placement)
    - Achieved 30% delay reduction with 2.4% area increase
    - Achieved 41% delay reduction with 3.9% area increase
  - For standard cells (after placement)
    - Achieved 20% improvement in FOM and 5% area reduction
  - For 4-LUT FPGA mapping (before placement)
    - Achieved 16% delay reduction with 9% area increase
- Future work
  - Try with a more realistic gate library
  - Try on highly optimized ASIC designs