Reinventing The Wheel: Developing a New Standard-Cell Synthesis Flow

Reinventing The Wheel: Developing a New Standard-Cell Synthesis Flow
Alan Mishchenko, Niklas Een, Hamid Savoj, Robert Brayton
University of California, Berkeley

Outline
- Motivation
- The flow
  - Technology-independent synthesis
  - Technology mapping
  - Buffering
  - Sizing
- Experimental results
- Conclusion

Motivation
Synthesis tools are out there, but they are
- slow
- suboptimal
- complicated
- expensive

ABC
- It is a public-domain tool developed by our research group since 2005
- It addresses both synthesis and verification of synchronous hardware
- It is based on years of experience in developing efficient data structures and algorithms
- It is used in industry and academia
- For more information, visit https://bitbucket.org/alanmi/abc

The Flow
- Technology-independent synthesis
- Technology mapping
- Buffering
- Sizing
These steps are not disconnected; they overlap:
- Synthesis talks to mapping through structural choices
- Mapping talks to buffering through fanout estimations
- Buffering and sizing can be interleaved

Synthesis: Old and New

Old: “AIG rewriting”
- Delay/area costs: AND2 levels/nodes
- Restructuring: for all 4-input cuts, try all AIG subgraphs and choose the one with the fewest nodes under the delay constraint
- Results: acceptable quality, acceptable runtime
- Problems: “over-restructuring”; slow for large, deep logic

New: “AIG reshaping”
- Delay/area costs: user-specified cost for n-input AND/XOR/MUX/MAJ
- Restructuring: iterate “mapping” and “unmapping” several times
- Results: comparable quality, 3-10x faster
- Problems: none so far

Mapping: Old and New

Old: “Traditional” cut-based mapping
- Iterate over the subject graph
  - re-compute priority cuts
  - use structural or functional matching (ICCAD’97)
- For standard-cell mapping
  - use a gain-based library
  - map both phases (positive and negative) of each node into gates
  - select best cuts (gates)
- Results: acceptable quality, tolerable runtime

New: “Improved” cut-based mapping
- Pre-compute priority cuts
- Iterate over the subject graph
  - evaluate cuts using different costs
  - use structural or functional matching
- For standard-cell mapping
  - use a gain-based library
  - map into NPN classes of functions from the library
  - select best cuts (NPN classes)
  - perform phase assignment and determine gates during buffering
- Results: quality not known yet; runtime is expected to be 3-10x faster
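Both mappers above start from cuts of the subject graph. The following is a minimal sketch of plain k-feasible cut enumeration, assuming a toy graph of two-input AND nodes stored in a Python dict. The names `enumerate_cuts`, `fanins`, and `graph` are illustrative and are not ABC's API. The sketch keeps every cut, whereas priority cuts keep only a few per node.

```python
from itertools import product

def enumerate_cuts(fanins, k=4):
    """Enumerate all k-feasible cuts of each node in a toy AND graph.

    fanins maps a node name to a tuple with its two fanin names; primary
    inputs map to an empty tuple. Nodes must appear in topological order
    (Python dicts preserve insertion order).
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}  # the trivial cut {node}
        if ins:
            a, b = ins
            # Merge every cut of one fanin with every cut of the other.
            for cut_a, cut_b in product(cuts[a], cuts[b]):
                merged = cut_a | cut_b
                if len(merged) <= k:  # keep only cuts with at most k leaves
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts

# Hypothetical graph: n1 = a & b, n2 = c & d, n3 = n1 & n2, n4 = n3 & e
graph = {
    "a": (), "b": (), "c": (), "d": (), "e": (),
    "n1": ("a", "b"),
    "n2": ("c", "d"),
    "n3": ("n1", "n2"),
    "n4": ("n3", "e"),
}
cuts = enumerate_cuts(graph, k=4)
```

In this example the cut {a, b, c, d} is found for n3, while the five-leaf cut {a, b, c, d, e} of n4 is rejected because it exceeds k = 4. A mapper would then match each surviving cut against library functions and pick the best one per node.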

Buffering: Old and New

Old: several ideas tried, none is a clear winner
- Enumerating buffer tree topologies
- Buffering for near-continuous libraries
- Other incremental local fanout optimization methods

New: several ideas tried, none is a clear winner
- “Technology-independent” buffering after the gain-based library
- Buffer-tree construction given required times and loads of the fanouts
- Incremental buffering interleaved with incremental sizing
- Results are mixed

Incremental Buffering Illustrated
[Figure with two panels: Growing, Bypassing]

Sizing: Old and New

Old
- Non-linear programming
- Linear programming
- Lagrangian multipliers

New: incremental sizing
- Find the critical region
- Find the best gates to resize
- Perform the resizing
- Incrementally update timing
- Iterate until no improvement
- Can be combined with incremental buffering

Results
- Reasonable
- Surprisingly fast
- If an optimum solution is known, the method seems to converge to it
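The incremental sizing loop can be sketched as follows. This is only an illustration under strong assumptions: the circuit is a single chain of gates (so the whole chain is the critical region), the delay model is a toy linear one, each move doubles one gate's size, and area is ignored. The functions `path_delay` and `incremental_sizing` and all parameter values are hypothetical and do not reflect ABC's implementation.

```python
def path_delay(sizes, out_load=8.0, intrinsic=1.0, cin_per_size=1.0):
    """Delay of a gate chain under a toy linear delay model.

    Gate i has drive strength sizes[i]. It drives the input capacitance
    of gate i+1 (cin_per_size * size); the last gate drives out_load.
    """
    total = 0.0
    for i, size in enumerate(sizes):
        if i + 1 < len(sizes):
            load = cin_per_size * sizes[i + 1]
        else:
            load = out_load
        total += intrinsic + load / size
    return total

def incremental_sizing(sizes, max_size=8, **model):
    """Greedy loop: apply the single best upsizing move until none helps."""
    sizes = list(sizes)
    best = path_delay(sizes, **model)
    while True:
        best_move = None
        for i in range(len(sizes)):
            if sizes[i] >= max_size:
                continue  # this gate is already at the largest size
            trial = sizes[:i] + [sizes[i] * 2] + sizes[i + 1:]
            delay = path_delay(trial, **model)  # re-time the trial solution
            if delay < best:
                best, best_move = delay, trial
        if best_move is None:
            return sizes, best  # no single resizing improves the delay
        sizes = best_move

initial = [1, 1, 1]
final_sizes, final_delay = incremental_sizing(initial)
```

The sketch recomputes the full path delay for every trial move. A practical implementation would update timing incrementally and restrict the search to gates in the critical region, as the slide describes.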

Commands of The Flow
read_lib, write_lib, print_lib, read_scl, write_scl, dump_genlib, print_gs, stime, buffer, unbuffer, minsize, maxsize, upsize, dnsize, print_buf, read_constr, print_constr, reset_constr
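As a rough illustration of how these commands might be chained, the sketch below assembles a command string for a buffering experiment (un-buffer, min-size, re-buffer, re-size, then report timing). The helper `buffering_experiment_script` and the file names are hypothetical. The command order and the use of default options are assumptions, and how a mapped netlist is loaded depends on its file format, so each command's built-in help (`-h`) should be consulted before relying on this.

```python
def buffering_experiment_script(lib_path, design_path):
    """Build a semicolon-separated ABC command string (assumed ordering).

    lib_path and design_path are placeholders for a Liberty library and a
    mapped design; only command names from the slide are used, plus the
    generic `read` command.
    """
    steps = [
        f"read_lib {lib_path}",   # load the standard-cell library
        f"read {design_path}",    # load the mapped design
        "unbuffer",               # remove existing buffers
        "minsize",                # set all gates to minimum size
        "buffer",                 # re-buffer
        "upsize",                 # resize gates up where it helps delay
        "dnsize",                 # resize gates down to recover area
        "stime",                  # report timing
    ]
    return "; ".join(steps)

script = buffering_experiment_script("vsclib013.lib", "design.blif")
```

The resulting string could then be passed to the ABC executable, for example through its `-c` option, or typed at the ABC prompt.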

Experimental Setting
- 19 OpenCore designs were synthesized and mapped by an industrial tool using the public library vsclib013.lib from http://www.vlsitechnology.org/
- Delay, area, and runtime were collected and used as a reference
- Sizing was tested by applying min-sizing, followed by re-sizing
- Buffering was tested by un-buffering and min-sizing, followed by re-buffering and re-sizing
- The flow was tested by restructuring the design, followed by mapping, buffering, and sizing

Experimental Results

Comments on The Table
- The first column “Gate” shows the number of gates produced by the industrial tool
- The other “Gate” columns show the percentage change in the number of gates after each transform, compared to the result produced by the industrial tool
  - Positive is improvement
  - Negative is degradation
- Similarly, the columns “Area” and “Delay” show the percentage change in area and delay, respectively
- The flows are tuned differently
  - This is why the area increase after buffering/sizing is larger than after synthesis/buffering/sizing
- Runtimes are in seconds on an old desktop computer
  - On a new computer, the runtimes are expected to be 2x smaller
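The sign convention can be made concrete with a small sketch. It assumes the change is normalized by the industrial tool's value, which the slide implies but does not state explicitly; the function name `percent_change` and the sample numbers are hypothetical.

```python
def percent_change(reference, value):
    """Percentage change relative to the reference result.

    Positive means value is smaller than the reference (improvement);
    negative means it is larger (degradation).
    """
    return 100.0 * (reference - value) / reference

# Hypothetical gate counts: the reference tool produces 1000 gates.
fewer_gates = percent_change(1000, 950)   # improvement
more_gates = percent_change(1000, 1100)   # degradation
```

With these made-up numbers, 950 gates gives +5.0 and 1100 gates gives -10.0, matching the "positive is improvement" reading of the table.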

Potential Issues
- Not specifying input driving cells and output loads
  - This was addressed and experiments show it is fine
- Over-tuning for one particular library
  - Not sure heuristics will hold for submicron libraries
- Not looking at power
- Not taking high and low Vt cells into account
- Not mapping into multi-output cells
- Not mapping sequential elements
- Not considering multiple clock domains

Conclusion
- A new synthesis flow is being developed and implemented in ABC
- It is an opportunity to
  - rethink some of the classical problems
  - improve on some of the known solutions
  - come up with a new public implementation
- Results are encouraging (compared with the industrial-tool reference)
  - delay (in delay-oriented synthesis) is within 5-15%
  - area (in area-oriented synthesis) is within 1-3%
  - runtime is about 20-50x better

Abstract
This presentation focuses on adding new capabilities to synthesize standard-cell designs in the public-domain synthesis/verification tool ABC. An optimization flow has been developed, which includes gain-based technology mapping, fanout optimization by buffering and gate duplication, and gate sizing. Novel heuristic algorithms have been proposed for several well-known optimization steps. For example, buffer tree construction can be performed not as a separate step, but concurrently with gate sizing by reshaping initial well-balanced buffer trees. Each tree reshaping and each gate resizing transform is evaluated for delay/area improvement using a common cost function, and the most promising one is selected. The delay is measured by a lookup-table-based delay model, which computes the delay of a gate from its input slew and output capacitance. Experiments show that the flow produces results that are within 10% of those of industrial tools while running 20x faster.
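A lookup-table delay model of this kind can be sketched as bilinear interpolation over a two-dimensional table indexed by input slew and output capacitance. The sketch below is an assumption about the general technique, not ABC's code; the names `table_delay` and `_bracket` and the table values are hypothetical.

```python
from bisect import bisect_right

def _bracket(axis, x):
    """Return indices (i, i + 1) of the axis segment used for x (clamped)."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1

def table_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a 2-D delay table.

    table[i][j] is the delay at input slew slews[i] and output load
    loads[j]. Both axes must be sorted and have at least two entries.
    Points outside the table are extrapolated from the nearest segment.
    """
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    # Interpolate along the load axis on both slew rows, then across slew.
    low = table[i0][j0] + tl * (table[i0][j1] - table[i0][j0])
    high = table[i1][j0] + tl * (table[i1][j1] - table[i1][j0])
    return low + ts * (high - low)

# Hypothetical 2x2 table: rows are input slews, columns are output loads.
slews = [0.0, 1.0]
loads = [0.0, 2.0]
table = [[1.0, 3.0],
         [2.0, 4.0]]
mid_delay = table_delay(slews, loads, table, slew=0.5, load=1.0)
corner_delay = table_delay(slews, loads, table, slew=1.0, load=2.0)
```

In this made-up table, the midpoint query returns 2.5 and the corner query returns the stored value 4.0. Because sizing and buffering both change a gate's output load and the slews seen downstream, a shared table lookup like this lets both kinds of transform be scored with one cost function.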