Logic Synthesis: Timing Optimization. Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence).

Presentation transcript:

1 Logic Synthesis Timing Optimization
Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence)

2 Restructuring for Timing Optimization
Outline:
Definitions and problem statement
Overview of techniques (motivated by adders)
–Tree height reduction (THR)
–Generalized bypass transform (GBX)
–Generalized select transform (GST)
–Partial collapsing

3 Timing Optimization
Factors determining delay of circuit:
Underlying circuit technology
–Circuit type (e.g. domino, static CMOS, etc.)
–Gate type
–Gate size
Logical structure of circuit
–Length of computation paths
–False paths
–Buffering
Parasitics
–Wire loads
–Layout

4 Problem Statement
Given:
Initial circuit function description
Library of primitive functions
Performance constraints (arrival/required times)
Generate: an implementation of the circuit using the primitive functions, such that:
–performance constraints are met
–circuit area is minimized
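The arrival/required-time constraints above can be illustrated with a minimal sketch. It assumes a unit-delay model (every gate costs one time unit) and a small hypothetical netlist; the names `timing`, `gates`, and the node labels are illustrative and do not come from the slides.

```python
def timing(gates, input_arrival, output_required, delay=1):
    """Compute arrival times, required times, and slacks.

    gates: dict mapping gate name -> list of fanin names, listed in
           topological order (fanins appear before the gates they feed).
    Assumption: every gate has the same delay (unit-delay model).
    """
    # Forward pass: a gate's output arrives one delay after its latest fanin.
    arrival = dict(input_arrival)
    for g, fanins in gates.items():
        arrival[g] = max(arrival[f] for f in fanins) + delay

    # Backward pass: a signal is required early enough for all its fanouts.
    required = {n: float("inf") for n in arrival}
    for out, t in output_required.items():
        required[out] = min(required[out], t)
    for g in reversed(list(gates)):
        for f in gates[g]:
            required[f] = min(required[f], required[g] - delay)

    # Negative slack marks nodes that violate the performance constraint.
    slack = {n: required[n] - arrival[n] for n in arrival}
    return arrival, required, slack


# Hypothetical three-gate chain: l = g(a, b), m = g(l, c), n = g(m, d)
gates = {"l": ["a", "b"], "m": ["l", "c"], "n": ["m", "d"]}
arrival, required, slack = timing(
    gates, {"a": 0, "b": 0, "c": 0, "d": 0}, {"n": 2}
)
critical = sorted(n for n, s in slack.items() if s < 0)
```

In this example the output `n` arrives at time 3 but is required at time 2, so the nodes with negative slack (`a`, `b`, `l`, `m`, `n`) form the critical region that restructuring would target.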

5 Current Design Process
[Figure: design flow diagram]
Steps: Behavior optimization (scheduling); Partitioning (retiming); Logic synthesis (technology independent, technology mapping); Timing driven place and route
Representations: Behavioral description; Logic and latches; Logic equations; Gate netlist; Layout
Other labels: Gate library; Perf. constraints; Delay models

6 Technology Mapping for Delay
[Figure: function tree and buffer tree]

7 Overview of Solutions for Delay
Circuit re-structuring
–Rescheduling operations to reduce time of computation
Implementation of function trees (technology mapping)
–Selection of gates from library
   Minimum delay (load independent model - Kukimoto)
   Minimize delay and area (Jongeneel, DAC’00) (combines Lehman-Watanabe and Kukimoto)
Implementation of buffer trees
–Touati (LT-trees)
–Singh
Resizing
Constant delay synthesis

8 Circuit Restructuring
Approaches:
Local: Mimic optimization techniques in adders
–Carry lookahead (THR tree height reduction)
–Conditional sum (GST transformation)
–Carry bypass (GBX transformation)
Global: Reduce depth of entire circuit
–Partial collapsing
–Boolean simplification

9 Restructuring Methods
Performance measured by levels, sensitizable paths, technology dependent delays
Level based optimizations:
–Tree height reduction (Singh ’88)
–Partial collapsing and simplification (Touati ’91)
–Generalized select transform (Berman ’90)
Sensitizable paths
–Generalized bypass transform (McGeer ’91)

10 Tree-Height Reduction (THR)
Singh’88:
[Figure: circuit with internal nodes h through n over inputs a through g, before and after collapsing; output becomes n’; labels: Critical region, Collapsed critical region, Duplicated logic]

11 Tree-Height Reduction
[Figure: two versions of the restructured circuit with output n’; labels: Duplicated logic, Collapsed critical region]
New delay = 5
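The effect of tree height reduction can be sketched on a single associative operation, for example a wide AND. The sketch below is not Singh's algorithm. It assumes two-input gates with unit delay and uses a standard greedy rule (always combine the two earliest-arriving signals) to rebuild the tree. The function names are hypothetical.

```python
import heapq


def chain_delay(arrivals, delay=1):
    """Output arrival time of a linear chain of 2-input gates,
    with inputs consumed in the order given."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + delay
    return t


def balanced_delay(arrivals, delay=1):
    """Output arrival time after greedy re-decomposition: repeatedly
    combine the two earliest signals, so late inputs enter near the root."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        x = heapq.heappop(heap)
        y = heapq.heappop(heap)
        heapq.heappush(heap, max(x, y) + delay)
    return heap[0]


# Eight inputs arriving together: chain depth 7 versus tree depth 3.
same_time = [0] * 8
# One late input (arrival 4): the tree places it close to the output.
one_late = [0] * 7 + [4]
```

With all inputs arriving at time 0 the chain gives 7 and the rebuilt tree gives 3. With one input arriving at time 4 the tree gives 5, since the late signal passes through only one gate.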

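12 Generalized bypass transform (GBX)
McGeer’91:
Make critical path false
–Speed up the circuit
Bypass logic of critical path(s)
[Figure: critical path f_m = f, f_{m+1}, …, f_n = g, before and after adding a multiplexer with data inputs 0 and 1 and output g’; the multiplexer select is the Boolean difference dg/df and carries a redundant s-a-0 fault]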
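A minimal sketch of the bypass idea, using the carry-bypass adder that the slides name as the GBX instance. It assumes a 4-bit ripple-carry block and exhaustively checks two things: the bypassed carry-out equals the original, and the Boolean difference of the carry-out with respect to the carry-in equals the "all bits propagate" signal. Function names are hypothetical. The sketch relies on the carry-out being equal to the carry-in whenever it is sensitive to it, which holds for this adder but is an assumption in general.

```python
from itertools import product


def ripple_carry_out(a, b, cin):
    """Carry out of a ripple-carry block; a and b are bit tuples, LSB first."""
    c = cin
    for ai, bi in zip(a, b):
        c = (ai & bi) | ((ai ^ bi) & c)
    return c


def boolean_difference(a, b):
    """d(cout)/d(cin): 1 exactly when cout is sensitive to cin."""
    return ripple_carry_out(a, b, 0) ^ ripple_carry_out(a, b, 1)


def bypass_carry_out(a, b, cin):
    """Multiplexer on the output: when cout depends on cin, forward cin
    directly; otherwise take the value computed by the original logic."""
    sel = boolean_difference(a, b)
    slow = ripple_carry_out(a, b, cin)
    return cin if sel else slow


def check_bypass(n=4):
    """Exhaustive check over all input combinations of an n-bit block."""
    for bits in product((0, 1), repeat=2 * n + 1):
        a, b, cin = bits[:n], bits[n:2 * n], bits[2 * n]
        assert bypass_carry_out(a, b, cin) == ripple_carry_out(a, b, cin)
        propagate_all = int(all(ai ^ bi for ai, bi in zip(a, b)))
        assert boolean_difference(a, b) == propagate_all
    return True
```

Whenever the select is 0, the original logic's output does not depend on `cin`, so the long path from `cin` through the block is never the one that determines the output. That is the sense in which the critical path becomes false.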

13 GBX and KMS transform
GBX gives little area increase, BUT creates an untestable fault (on control input to multiplexer)
KMS transform: (remove false paths without increasing delay)
f_k is the last node on the false path that fans out.
Duplicate false path {f_1, …, f_k} -> {f’_1, …, f’_k}
f’_j fans out to every fanout of f_j except f_{j+1}, and f_j just fans out to f_{j+1}
Set f_0 input to f_1 to controlling value and propagate constant (can do because path is false and does not fan out)
KMS results
–Function of every node, except f_1, …, f_k, is unchanged
–Added k nodes
–Area added is linear in the length of the false paths; in practice small area increase.

14 KMS
Keutzer, Malik, Saldanha’90:
[Figure: path f_{m+1}, f_{m+2}, …, f_{m+k}, f_{m+k+1}, …, f_n before and after adding duplicated nodes f’_{m+1}, f’_{m+2}, …, f’_{m+k} and a constant 0 input]
Delay is not increased

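15 Generalized select transform (GST)
Berman’90:
Late signal feeds multiplexor
[Figure: logic with inputs a through g and output out, before and after GST; the logic is duplicated for a=0 and a=1, and a multiplexor selected by a produces out]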
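The select transform amounts to a Shannon expansion on the late signal. The sketch below assumes a small hypothetical function `original` in which the late input `a` enters at the deepest level, and a unit-delay model in which the multiplexor counts as one gate. None of these names or numbers come from the slides.

```python
from itertools import product


def original(a, b, c, d):
    """Hypothetical 3-level chain; the late signal a enters at the bottom."""
    return ((a & b) | c) & d


def select_transform(f):
    """Build the GST version of f: evaluate both cofactors with respect to
    the first argument, then let that argument drive a multiplexor."""
    def transformed(a, *rest):
        f0 = f(0, *rest)  # copy of the logic with a = 0
        f1 = f(1, *rest)  # copy of the logic with a = 1
        return f1 if a else f0
    return transformed


def equivalent(f, g, n):
    """Exhaustively compare two n-input Boolean functions."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))


def original_delay(t_a, depth=3):
    """Unit-delay arrival at the output when a (arriving at t_a >= 0)
    passes through all `depth` levels and other inputs arrive at 0."""
    return t_a + depth


def gst_delay(t_a, depth=3):
    """Cofactors are ready by `depth` at the latest; the multiplexor adds
    one unit after the later of the cofactors and a."""
    return max(depth, t_a) + 1
```

For a late arrival of `t_a = 5`, the original chain finishes at 8 and the transformed circuit at 6, at the cost of duplicating the logic once per value of `a`.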

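16 GST vs GBX
[Figure: side-by-side comparison of the same circuit (inputs a through g) under GST, with copies of the logic for a=0 and a=1 feeding a multiplexor selected by a, and under GBX, with a multiplexor selected by the Boolean difference dh/da]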

17 GST vs GBX
Select transform appears to be more area efficient
But Boolean difference generally more efficiently formed in practice
No delay/speedup advantage for either transform
Can reuse parts of the critical paths for multiple fanouts on GST
[Figure: GST with the a=0 and a=1 copies shared by two multiplexors, both selected by a, producing out1 and out2]

18 Technology Independent Delay Reductions
Generally THR, GBX, GST (critical path based methods) work OK,
–but very greedy and computationally expensive
Why are technology independent delay reductions hard?
Lack of fast and accurate delay models
–# levels, fast but crude
–# levels + correction term (fanout, wires, …): a little better, but still crude (what coefficients to use?)
–Technology mapped: reasonable, but very slow
–Place and route: better but extremely slow
–Silicon: best, but infeasibly slow (except for FPGAs)
[Arrow alongside the list, top to bottom: better, slower]

19 Conclusions
Variety of methods for delay optimization
–No single technique dominates
–When applied to ripple-carry adder get
   Carry-lookahead adder (THR)
   Carry-bypass adder (GBX)
   Carry-select adder (GST)
–Clustering/Partial collapse
All techniques ignore false paths when assessing the delay and critical regions
–Can use KMS transform to eliminate false paths without increasing delay (Caveat: potentially large increase in area)