05/04/06 1 Integrating Logic Synthesis, Tech Mapping and Retiming. Presented by Atchuthan Perinkulam. Based on the paper of the same title by A. Mishchenko et al., UC Berkeley.

Presentation transcript:

05/04/06 1 Integrating Logic Synthesis, Tech Mapping and Retiming. Presented by Atchuthan Perinkulam. Based on the paper of the same title by A. Mishchenko et al., UC Berkeley.

05/04/06 2 Overview
- Review of individual concepts.
- Contributions of the paper.
- Review of terms used in the paper.
- A faster technique for reading/drawing AIGs (let's call it my contribution ☻).
- Combining tech mapping, retiming and synthesis, leading to the final design.
- High-level overview (summary).
- Experimental results.
- Questions.
- More details (if we have time and you aren't bored already).

05/04/06 3 The story so far…
- Logic synthesis? Turns algorithmic descriptions into a design for electronic hardware.
- Tech mapping? Selects gates from standard libraries to implement the circuit.
- Retiming? Moves registers around so that the clock period or the number of registers decreases, while the I/O relation is preserved.

05/04/06 4 Contributions of the paper
- Global optimization, as opposed to improving each step locally and individually.
- Triple integration: synthesis, mapping and retiming.
- Applicability: standard cells and FPGAs.
- Efficiency: highly scalable; 100k+ gate circuits are handled in about a minute.
- Limitation: restricted to a single clock domain and D flip-flops, with scope for extension.

05/04/06 5 A quick review of terms
- Boolean network: a DAG whose nodes are gates and whose edges are wires.
- AIG (AND-inverter graph): a Boolean network containing only two-input AND gates and inverters.
- Other terms: node, fanin, fanout, primary inputs/outputs (PIs/POs), transitive fanin and fanout, level of a node.
- If the circuit is sequential, the memory elements are D flip-flops with initial states.
- A load-independent delay model is used for standard cells.
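To make the terms concrete, here is a minimal sketch of an AIG in Python. The dictionary layout, the names `node_levels` and `simulate`, and the example network are assumptions made for illustration; they are not taken from the paper or from any tool. Each AND node stores two fanins, and each fanin carries a flag saying whether the edge is inverted.

```python
def node_levels(inputs, ands):
    """Level of a node = length of the longest path from any PI (PIs are level 0)."""
    level = {pi: 0 for pi in inputs}
    for name, ((f0, _), (f1, _)) in ands.items():   # topological order assumed
        level[name] = 1 + max(level[f0], level[f1])
    return level


def simulate(ands, pi_values):
    """Evaluate every AND node for one assignment of the primary inputs."""
    values = dict(pi_values)
    for name, ((f0, inv0), (f1, inv1)) in ands.items():
        # XOR with the flag applies the inverter on that edge.
        values[name] = (values[f0] ^ inv0) and (values[f1] ^ inv1)
    return values


# Hypothetical example: g = (a OR b) AND c, written with ANDs and inverters only.
# n1 = (NOT a) AND (NOT b), so NOT n1 = a OR b (De Morgan).
EXAMPLE_INPUTS = ["a", "b", "c"]
EXAMPLE_ANDS = {
    "n1": (("a", True), ("b", True)),
    "g": (("n1", True), ("c", False)),
}

levels = node_levels(EXAMPLE_INPUTS, EXAMPLE_ANDS)
result = simulate(EXAMPLE_ANDS, {"a": True, "b": False, "c": True})
```

The example also shows how an OR is read off an AIG: an AND node whose two inputs and whose output are all inverted.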

05/04/06 6 A quick review of terms
- A cut C of node n is a set of nodes of the network, called leaves, such that each path from a PI to n passes through at least one leaf.
- A trivial cut of a node is the cut composed of the node itself.
- A cut is K-feasible if it has at most K leaves.
- The area and delay of an FPGA mapping are measured by the number of LUTs and the number of LUT levels, respectively.
- The delay of a standard-cell mapping is computed using pin-to-pin delays of the gates assigned to implement a cut.
- The load-independent timing model is assumed throughout the paper.
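The cut definitions can be illustrated with a short sketch of bottom-up cut enumeration: the cuts of a node are its trivial cut plus every union of one cut from each fanin that stays within K leaves. This is a simplified illustration, not the paper's implementation; the function name `enumerate_cuts` and the three-node example network are assumptions.

```python
from itertools import product


def enumerate_cuts(fanins, k):
    """Return all K-feasible cuts of every node.

    fanins maps each node to a tuple of its fanins (empty for a PI) and
    is assumed to be listed in topological order.
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}            # the trivial cut
        if ins:
            # Combine one cut from each fanin and keep the K-feasible unions.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Hypothetical network: n1 = a AND b, n2 = b AND c, n3 = n1 AND n2.
EXAMPLE = {
    "a": (), "b": (), "c": (),
    "n1": ("a", "b"),
    "n2": ("b", "c"),
    "n3": ("n1", "n2"),
}

cuts_k3 = enumerate_cuts(EXAMPLE, k=3)
cuts_k2 = enumerate_cuts(EXAMPLE, k=2)
```

With K = 3, node `n3` has the cut {a, b, c}, so the whole network fits in one 3-input LUT. With K = 2 that cut is no longer feasible, and only {n3} and {n1, n2} remain.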

05/04/06 7 Faster way of reading AIG’s OR

05/04/06 8 Tech mapping is the core procedure in this triple integration (steps)
1. Prepare the circuit for mapping by deriving a balanced AIG (using transforms such as a(bc) = (ab)c).
2. Compute K-feasible cuts.
3. Compute the Boolean functions of the cuts.
4. Match the cuts with LUTs (FPGAs) or gates (standard cells).
5. Assign delay-optimal matches at each node.
6. Look for the best area match and choose the final mapping in reverse topological order.
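The cut computation, delay-optimal assignment and reverse-order selection steps can be sketched for the FPGA case, where every K-feasible cut matches a LUT and delay is the number of LUT levels. This is a simplified sketch under those assumptions: it skips balancing, Boolean matching and area recovery, it assumes K is at least 2, and the name `map_for_depth` and the example network are hypothetical.

```python
from itertools import product


def map_for_depth(fanins, outputs, k):
    """Depth-oriented K-LUT mapping of a network given in topological order."""
    cuts, depth, best = {}, {}, {}
    for node, ins in fanins.items():
        if not ins:                                 # primary input
            cuts[node], depth[node] = {frozenset([node])}, 0
            continue
        merged = {frozenset().union(*c) for c in product(*(cuts[i] for i in ins))}
        feasible = {c for c in merged if len(c) <= k}
        # Delay-optimal match: the cut whose deepest leaf is shallowest
        # (ties broken by fewer leaves).
        best[node] = min(feasible, key=lambda c: (max(depth[l] for l in c), len(c)))
        depth[node] = 1 + max(depth[l] for l in best[node])
        cuts[node] = feasible | {frozenset([node])}
    # Choose the final cover from the outputs back towards the inputs.
    luts, stack = {}, list(outputs)
    while stack:
        n = stack.pop()
        if n in luts or not fanins[n]:
            continue
        luts[n] = best[n]
        stack.extend(best[n])
    return depth, luts


# Hypothetical network: n1 = a AND b, n2 = b AND c, n3 = n1 AND n2.
NETWORK = {
    "a": (), "b": (), "c": (),
    "n1": ("a", "b"),
    "n2": ("b", "c"),
    "n3": ("n1", "n2"),
}

depth3, luts3 = map_for_depth(NETWORK, ["n3"], k=3)
depth2, luts2 = map_for_depth(NETWORK, ["n3"], k=2)
```

For K = 3 the mapping uses a single LUT of depth 1; for K = 2 it needs three LUTs in two levels.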

05/04/06 9 Combining mapping with retiming
- For sequential circuits, use the same concepts as for combinational circuits, except that registers are treated as labels (weights) on the edges.
- The graph is no longer a DAG: it is now a cyclic circuit (sequential mapping).
- So arrival-time measures have to account for the labels.
- Arrival times have to be computed by iterating over the circuit.
- The resulting mapping has a retiming associated with it.

05/04/06 10 Combining mapping with synthesis
- Derive and store multiple logic structures for the circuit, and finally choose the best one among them. Why?
- Tech-independent synthesis is heuristic and may produce a network that is sub-optimal for the given library. A better match may have been discarded earlier.
- Synthesis operations apply to the network as a whole. You might want to combine a delay-optimized network with an area-optimized network to get the best of both worlds.
- However, note that more choices mean more decisions, because there are more matches at each node.

05/04/06 11 Constructing the choice network from functionally equivalent, structurally different networks.

05/04/06 12 Generating choices
- Use associativity of the AND operation to locally rewrite the graph: x1(x2x3) = (x1x2)x3 = (x1x3)x2.
- Repeat this process until no new AND nodes are created, accumulating choices along the way.
- Choose the best combination of choices using mapping/retiming. This is the final result of the triple integration method.
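A small sketch of the associativity rewrite for a three-input AND: it builds the three structures x1(x2x3), (x1x2)x3 and (x1x3)x2 as nested pairs and checks by truth table that they are functionally equivalent, which is what allows them to be stored as choices for the same node. The representation and the function names are assumptions for illustration only.

```python
from itertools import product


def evaluate(expr, env):
    """Evaluate a nested-pair AND tree; a string is an input variable."""
    if isinstance(expr, str):
        return env[expr]
    left, right = expr
    return evaluate(left, env) and evaluate(right, env)


def truth_table(expr, names):
    """Output of expr for every assignment of the named inputs."""
    return tuple(
        evaluate(expr, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


def associative_choices(a, b, c):
    """The three structurally different two-input AND trees over a, b, c."""
    return [(a, (b, c)), ((a, b), c), ((a, c), b)]


NAMES = ["x1", "x2", "x3"]
choices = associative_choices(*NAMES)
tables = {truth_table(choice, NAMES) for choice in choices}
```

All three structures produce one truth table, so `tables` contains a single entry. The mapper is then free to pick whichever structure gives the best cuts.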

05/04/06 13 High level view of integration flow
- The FRAIG manager generates the choice network from equivalent networks.
- Φ = clock period.

05/04/06 14 Experimental results
- Experiments were run on the IWLS 2005 benchmarks (IWLS: International Workshop on Logic and Synthesis).
- Average reduction of the clock period:
  - 25% compared to traditional mapping without retiming.
  - 20% compared to traditional mapping with retiming as a post-processing step.

05/04/06 15 Questions ???

05/04/06 16 Sequential arrival times
- Sequential delay of a (possibly cyclic) path p: l(p) = ∑d(n) - Φ ∑t(e), where the sums run over the nodes n and edges e on path p.
  - d(n): delay of node n.
  - t(e): number of registers on edge e.
- Sequential arrival time of node n: l(n) = max l(p) over all paths p from a PI to n.
- Φ is infeasible if the arrival time at a PO exceeds Φ at any time during the iterative computation.
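The iterative computation can be sketched as a longest-path relaxation in which crossing an edge with t(e) registers subtracts Φ·t(e) from the arrival time. This is a minimal sketch, not the paper's algorithm: the function name, the edge-list format, the example circuit and the iteration cap (a Bellman-Ford style bound used to stop when a cycle keeps increasing the arrival times) are all assumptions.

```python
import math


def sequential_arrival_times(delay, edges, inputs, outputs, phi):
    """Return l(n) for every node if clock period phi is feasible, else None.

    delay:  node -> d(n)
    edges:  list of (u, v, registers_on_edge)
    """
    arrival = {n: (0.0 if n in inputs else -math.inf) for n in delay}
    for _ in range(len(delay) + 1):                # assumed iteration cap
        changed = False
        for u, v, regs in edges:
            candidate = arrival[u] - phi * regs + delay[v]
            if candidate > arrival[v]:
                arrival[v] = candidate
                changed = True
        if any(arrival[o] > phi for o in outputs):
            return None                            # PO arrival exceeds phi
        if not changed:
            return arrival                         # fixed point reached
    return None                                    # still growing: infeasible


# Hypothetical circuit: a -> n1 -> n2 -> n3 -(1 register)-> n4 (the PO),
# plus a feedback edge n3 -> n1 carrying 2 registers. Unit gate delays.
DELAY = {"a": 0, "n1": 1, "n2": 1, "n3": 1, "n4": 1}
EDGES = [("a", "n1", 0), ("n1", "n2", 0), ("n2", "n3", 0),
         ("n3", "n4", 1), ("n3", "n1", 2)]

at_phi_2 = sequential_arrival_times(DELAY, EDGES, {"a"}, {"n4"}, phi=2)
at_phi_1 = sequential_arrival_times(DELAY, EDGES, {"a"}, {"n4"}, phi=1)
min_period = next(
    phi for phi in range(1, 5)
    if sequential_arrival_times(DELAY, EDGES, {"a"}, {"n4"}, phi) is not None
)
```

In this example the longest register-free path (a to n3) has delay 3, yet Φ = 2 is feasible, because the arrival times already account for the registers being moved by retiming.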

05/04/06 17 Iterative computation of seq. arrival times

05/04/06 18 Retiming associated with final mapping
- Once the optimum clock period Φ_opt is known (from the previous steps), each node n included in the final mapping is retimed using a formula in terms of l_opt(n), the sequential arrival time of node n for Φ_opt.
- After this retiming, the resulting Φ is slower than Φ_opt by at most the delay of one gate.
