Magic: An Industrial-Strength Logic Optimization, Technology Mapping, and Formal Verification System
Robert Brayton, Alan Mishchenko, Niklas Een
UC Berkeley
Overview
- Motivation
- Big picture
- Problem representation
- Algorithms
  - Sequential synthesis
  - Combinational synthesis with choices
  - Technology mapping
  - Minimum-perturbation retiming
- Experimental results
- Future work
Motivation
- ABC is a public-domain system for logic synthesis and formal verification, under development at Berkeley since 2005
  - A successor of Espresso, MIS, SIS, and VIS
- The baseline version of ABC is not applicable to industrial designs because it does not support
  - Complex flops
  - Multiple clock domains
  - Special objects (adders, RAMs, DSPs, etc.)
  - Standard-cell libraries
- A fresh start, called Magic, was made in Fall 2009
  - Includes a new design database designed to support these features
  - Integrates application packages for better memory use and runtime
Big Picture
[Figure: system overview. Labels: Verilog, EDIF, BLIF; Programmable APIs]
A. Mishchenko, N. Een, R. K. Brayton, S. Jang, M. Ciesielski, and T. Daniel, "Magic: An industrial-strength logic optimization, technology mapping, and formal verification tool", Proc. IWLS'10.
Application Packages
- Framework
  - Design database
  - File input / output
  - Programmable APIs
- Combinational optimization
  - AIG rewriting
  - Choice computation
- Sequential optimization
  - Retiming
  - Merging equivalent nodes
- Technology mapping
  - Mapping with choices
  - Speedup
- Verification
  - Simulation
  - Comb equivalence checking
  - Seq equivalence checking
“Industrial Stuff”
- Clock domains
  - Represent the clock signal in the database
  - Annotate flops with their clock-domain number in the AIG
  - Separate clock domains in sequential transforms
- Complex controls of the flops
  - Use a parameterized flop model
  - Perform elaboration of control signals if needed
  - Handle asynchronous reset carefully!
- Industrial primitives (adders, RAMs, DSPs, etc.)
  - Use boxes (black/white, comb/seq, merge/no_merge, etc.)
  - Currently propagates timing information, which improves the quality of synthesis
  - Elaborate boxes for seq synthesis, but do not map them
  - Need better support for user-specified attributes (don’t-touch, etc.)
Representations
- Netlist
  - Original / current / resulting design with “industrial stuff”
- AIG: the main data structure of ABC / Magic
  - Represents local / global functions
  - Gets synthesized / mapped / verified
- Logic network
  - Represents the result of technology mapping
  - In many cases, it is convenient to represent the logic network as an AIG annotated with technology primitives (LUTs, gates, etc.)
AIG Definition and Examples
- An AIG is a Boolean network composed of two-input ANDs and inverters
- Example 1: F(a,b,c,d) = ab + d(ac’+bc); 6 nodes, 4 levels
- Example 2: F(a,b,c,d) = ac’(b’d’)’ + bc(a’d’)’ = ac’(b+d) + bc(a+d); 7 nodes, 3 levels
[Figure: Karnaugh map of F over cd and ab, with the two AIG structures for F]
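The two structures for F can be checked against each other with a short script. The following is a minimal sketch, not code from ABC or Magic: the helper names `AND`, `NOT`, `f_first`, and `f_second` are made up for illustration, and each OR is expressed as an inverted AND of inverted inputs so that only two-input ANDs and inverters are used.

```python
from itertools import product

def AND(x, y):
    return x & y

def NOT(x):
    return 1 - x

def f_first(a, b, c, d):
    # F = ab + d(ac' + bc): 6 AND nodes, 4 levels
    n1 = AND(a, b)
    n2 = AND(a, NOT(c))
    n3 = AND(b, c)
    n4 = AND(NOT(n2), NOT(n3))   # NOT(n4) = ac' + bc
    n5 = AND(d, NOT(n4))
    n6 = AND(NOT(n1), NOT(n5))   # NOT(n6) = F
    return NOT(n6)

def f_second(a, b, c, d):
    # F = ac'(b + d) + bc(a + d): 7 AND nodes, 3 levels
    n1 = AND(a, NOT(c))
    n2 = AND(NOT(b), NOT(d))     # NOT(n2) = b + d
    n3 = AND(n1, NOT(n2))
    n4 = AND(b, c)
    n5 = AND(NOT(a), NOT(d))     # NOT(n5) = a + d
    n6 = AND(n4, NOT(n5))
    n7 = AND(NOT(n3), NOT(n6))   # NOT(n7) = F
    return NOT(n7)

# Exhaustive check over all 16 input assignments.
equivalent = all(
    f_first(*v) == f_second(*v) for v in product((0, 1), repeat=4)
)
```

The first structure uses fewer nodes and the second fewer levels, which is the area/delay trade-off the slide illustrates.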
Three Simple Tricks
- Structural hashing
  - Makes sure the AIG is in a compact form
  - Is applied during AIG construction
  - Makes each node structurally unique
  - Propagates constants
- Complemented edges
  - Represents inverters as attributes on the edges
  - Saves memory, leads to fast uniform manipulation
  - Increases logic sharing using DeMorgan’s rule
- Memory allocation
  - Uses a fixed amount of memory for each node
  - Based on a simple custom memory manager
  - Allocates memory for nodes in a topological order
  - Optimized for traversal in the same topological order
  - Computes fanout information on demand
[Figure: the same logic over inputs a, b, c, d, shown without hashing and with hashing]
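The first two tricks can be sketched in a few lines. This is an illustrative toy, not the ABC or Magic data structure: the class `Aig`, its methods, and the literal encoding (2 × node id, plus 1 for a complemented edge) are assumptions chosen for the example.

```python
class Aig:
    """Toy AIG with structural hashing and complemented edges.

    A literal is 2 * node_id + c, where c = 1 marks a complemented edge.
    Node 0 is the constant, so literal 0 is false and literal 1 is true.
    """

    def __init__(self):
        self.fanins = [None]   # fanins[i] is None for the constant and inputs
        self.table = {}        # (lit0, lit1) -> literal of the AND node

    def new_input(self):
        self.fanins.append(None)
        return 2 * (len(self.fanins) - 1)

    def and_(self, x, y):
        if x > y:              # canonical fanin order
            x, y = y, x
        if x == 0:             # constant propagation: 0 & y = 0
            return 0
        if x == 1:             # 1 & y = y
            return y
        if x == y:             # y & y = y
            return x
        if x ^ 1 == y:         # y & not(y) = 0
            return 0
        key = (x, y)
        lit = self.table.get(key)
        if lit is None:        # create the node only if it is new
            self.fanins.append(key)
            lit = 2 * (len(self.fanins) - 1)
            self.table[key] = lit
        return lit

    def or_(self, x, y):
        # DeMorgan: x + y = not(not(x) & not(y)); inverters are free
        return self.and_(x ^ 1, y ^ 1) ^ 1

    def num_ands(self):
        return len(self.table)


g = Aig()
a, b = g.new_input(), g.new_input()
first = g.and_(a, b)
second = g.and_(b, a)          # hashes to the same node as `first`
```

Because inverters live on the edges, `or_(a, b)` and `and_(a ^ 1, b ^ 1)` share one node, which is the extra sharing via DeMorgan's rule mentioned above.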
AIG: A Unifying Representation
- An underlying data structure for various computations
  - Represents both local and global functions
  - Used in rewriting, resubstitution, simulation, SAT sweeping, induction, etc.
- A unifying representation for the whole flow
  - Synthesis, mapping, and verification pass around AIGs
  - Stores multiple structures for mapping (‘AIG with choices’)
- The main functional representation in ABC
  - Foundation of ‘contemporary’ logic synthesis
  - Source of ‘signature features’ (speed, scalability, etc.)
Sequential Synthesis
- Detect, prove, and merge sequentially equivalent nodes
  - Seq equiv nodes are equivalent on reachable states
  - Special case: comb equiv nodes are equivalent in any state
- Observations
  - Can be done using simulation and SAT (without BDDs)
  - Leads to substantial reduction for large designs (> 10% in area)
  - Works for large designs (10-15 minutes for 1M gates)
A. Mishchenko, M. L. Case, R. K. Brayton, and S. Jang, "Scalable and scalably-verifiable sequential synthesis", Proc. ICCAD'08.
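The "detect" step is commonly done by random simulation: nodes whose simulation signatures match become candidate equivalences, which must then be proved (by SAT or induction) before merging. The sketch below covers only candidate detection on a combinational AIG; the function names, the literal encoding (2 × node id, plus 1 for complement), and the 64-bit signature width are assumptions for illustration, not the Magic implementation.

```python
import random
from collections import defaultdict

MASK = (1 << 64) - 1

def simulate(num_inputs, ands, seed=0):
    """Bit-parallel random simulation; returns one 64-bit word per node."""
    rng = random.Random(seed)
    sims = [0]                                   # node 0 is constant false
    sims += [rng.getrandbits(64) for _ in range(num_inputs)]

    def value(lit):
        word = sims[lit >> 1]
        return word ^ MASK if lit & 1 else word

    for x, y in ands:                            # nodes in topological order
        sims.append(value(x) & value(y))
    return sims

def candidate_classes(sims):
    """Group nodes with equal signatures, up to complementation."""
    classes = defaultdict(list)
    for node, word in enumerate(sims):
        key = word ^ MASK if word & 1 else word  # normalize the phase
        classes[key].append(node)
    return [nodes for nodes in classes.values() if len(nodes) > 1]

# Inputs: node 1 (literal 2) and node 2 (literal 4).
ands = [
    (2, 4),   # node 3 = a & b
    (4, 2),   # node 4 = b & a, a structural duplicate of node 3
    (2, 3),   # node 5 = a & not(a), constant false
]
classes = candidate_classes(simulate(2, ands))
```

Here nodes 3 and 4 fall into one class and node 5 joins the constant node 0. Matching signatures only suggest equivalence; a proof step is still needed before any merge.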
Comb Synthesis With Choices
- Restructure the AIG and keep track of changes
  - Iterate fast local AIG rewriting with a global view (via hash table)
  - Collect AIG snapshots and prove equivalences across them
  - Use equivalences (choices) during technology mapping
- Observations
  - Leads to improved QoR after technology mapping
  - Successfully applied to 1M gate designs
[Figure: pre-computing AIG subgraphs for F = abc (Subgraphs 1, 2, 3); rewriting node A by replacing Subgraph 1 with Subgraph 2]
A. Mishchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewriting: A fresh look at combinational logic synthesis", Proc. DAC'06.
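The F = abc example can be made concrete: among the three ways to build abc from two-input ANDs, a DAG-aware rewriter prefers the one that reuses nodes already present in the hash table. The sketch below is a simplified illustration under that assumption; `canon`, `count_new`, and the nested-tuple encoding are hypothetical helpers, not ABC functions.

```python
def canon(expr):
    """Order-independent key: a variable name or a frozenset of two keys."""
    if isinstance(expr, str):
        return expr
    left, right = expr
    return frozenset((canon(left), canon(right)))

def count_new(expr, existing):
    """Number of AND nodes that must be created to build expr."""
    if isinstance(expr, str) or canon(expr) in existing:
        return 0
    left, right = expr
    return 1 + count_new(left, existing) + count_new(right, existing)

# Three pre-computed subgraphs for F = abc.
candidates = [
    (("a", "b"), "c"),   # (ab)c
    ("a", ("b", "c")),   # a(bc)
    (("a", "c"), "b"),   # (ac)b
]

# Assume the network already contains the node a & c.
existing = {canon(("a", "c"))}

costs = [count_new(e, existing) for e in candidates]
best = min(candidates, key=lambda e: count_new(e, existing))
```

With `a & c` already in the network, the third subgraph needs one new node while the other two need two each, so it is selected.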
Technology Mapping
- Customizable structural mapping with priority cuts
  - Computes a small subset of cuts without impacting the QoR
  - Uses structural choices
- Observations
  - Controls QoR tradeoffs
  - Minimizes delay/area, wire count, switching activity, etc.
  - Successfully applied to 1M gate designs
[Figure: an AIG with a choice node (primary inputs a, b, c, d, e; primary output f) and the mapped network of LUTs]
A. Mishchenko, S. Cho, S. Chatterjee, R. Brayton, "Combinational and sequential mapping with priority cuts", Proc. ICCAD'07.
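The idea of keeping only a small subset of cuts per node can be sketched as follows. This is a simplified illustration, not the algorithm from the cited paper: the function name and parameters are hypothetical, and cuts are ranked by size as a stand-in for the delay and area costs a real mapper would use.

```python
from itertools import product

def enumerate_cuts(num_inputs, ands, k=4, max_cuts=8):
    """Bottom-up cut enumeration that stores at most max_cuts cuts per node.

    Nodes 1..num_inputs are inputs; AND nodes follow in topological order,
    each given as a pair of fanin node ids. Every node also keeps its
    trivial cut {node}.
    """
    cuts = {i: [frozenset([i])] for i in range(1, num_inputs + 1)}
    node = num_inputs
    for f0, f1 in ands:
        node += 1
        merged = set()
        for c0, c1 in product(cuts[f0], cuts[f1]):
            union = c0 | c1
            if len(union) <= k:                  # keep k-feasible cuts only
                merged.add(union)
        # Drop dominated cuts (proper supersets of another cut).
        kept = [c for c in merged if not any(o < c for o in merged)]
        # Assumed priority: smaller cuts first, ties broken by leaf ids.
        kept.sort(key=lambda c: (len(c), sorted(c)))
        cuts[node] = kept[:max_cuts] + [frozenset([node])]
    return cuts

# Inputs 1..4; node 5 = 1 & 2, node 6 = 3 & 4, node 7 = 5 & 6.
ands = [(1, 2), (3, 4), (5, 6)]
cuts_k4 = enumerate_cuts(4, ands, k=4)
cuts_k3 = enumerate_cuts(4, ands, k=3)
cuts_one = enumerate_cuts(4, ands, k=4, max_cuts=1)
```

With `k=4`, node 7 has the cut {1, 2, 3, 4}, so the whole cone fits in one 4-input LUT; with `k=3` that cut is infeasible. Limiting `max_cuts` bounds memory and runtime per node regardless of design size.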
Minimum-Perturbation Retiming
- Reduces delay after retiming, while minimizing the number of flops moved
  - Produces a trade-off: delay gain vs. the number of flops moved
  - Handles “industrial stuff” and retimes over white boxes!
  - Computes new initial state after backward retiming
- Allows the user to control the resources
  - Desired delay gain
  - Maximum allowed number of flops moved
  - Maximum area increase after retiming
- Observations
  - Can be useful before and after placement
  - Can be implemented efficiently
  - Runs in less than a minute for 1M gates
[Figure: plot of delay vs. flops moved]
S. Ray, A. Mishchenko, R. K. Brayton, S. Jang, and T. Daniel, "Minimum-perturbation retiming for delay optimization", Proc. IWLS'10.
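The need for a new initial state after backward retiming can be seen on a single AND gate. The sketch below is a toy illustration with hypothetical function names, not the Magic algorithm: a flop on the output of an AND gate is moved backward to its two inputs, and initial values for the new flops are found by enumeration so that the first output is unchanged.

```python
from itertools import product

def simulate_original(inputs, init):
    """One flop after the gate: out[t] = state, next state = a & b."""
    state, outs = init, []
    for a, b in inputs:
        outs.append(state)
        state = a & b
    return outs

def simulate_retimed(inputs, init_a, init_b):
    """Flops moved backward to the gate inputs: out[t] = sa & sb."""
    sa, sb, outs = init_a, init_b, []
    for a, b in inputs:
        outs.append(sa & sb)
        sa, sb = a, b
    return outs

def backward_initial_state(init):
    """Find input-flop values whose AND reproduces the old initial value."""
    for sa, sb in product((0, 1), repeat=2):
        if sa & sb == init:
            return sa, sb
    return None

stimulus = [(1, 1), (0, 1), (1, 1), (1, 0), (1, 1)]
results = {}
for init in (0, 1):
    new_init = backward_initial_state(init)
    results[init] = (
        simulate_original(stimulus, init),
        simulate_retimed(stimulus, *new_init),
    )
```

For an AND gate a solution always exists. In general the initial-state problem spans many gates and is typically handed to a solver; the single-gate enumeration here only shows what is being computed.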
Experimental Setup
- Integrated Magic into an industrial FPGA synthesis flow
- Experimented with the full flow, including P&R
  - Did not use retiming
  - Did not use post-placement re-synthesis
- Verified by running Magic and in-house simulation tools
- Experimented with 20 designs, from 175K to 648K LUT4
- Two experimental runs:
  - “Reference” stands for the typical industrial flow without Magic
  - “Magic” stands for the new flow with Magic
- Flow stages:
  - Frontend: design entry, high-level synthesis, quick mapping
  - Magic: seq and comb synthesis, mapping, legalization
  - Backend: placement, routing, design rule checking, etc.
Experimental Results
Cumulative Improvements
- fMAX = 11.8%
- LUT count = 12.7%
- FF-to-FF level = 22.3%
- Register count = 9.4%
- Total flow runtime = 3.1%
- P&R runtime = 50%
Future Work
- Continue to improve application packages
  - AIG rewriting, tech-mapping, sequential synthesis, etc.
- Improve integration of logic and physical synthesis
  - Synthesis/mapping/retiming before placement
  - Retiming/restructuring after placement
- Extend the flow to work for other technologies
  - Macro cells
  - Standard cells