Timing Optimization.


Optimization of Timing
Three phases:
1. Global restructuring: reduce the maximum level (longest path). Ex: a ripple-carry adder ==> a carry look-ahead adder.
2. Physical design phase: transistor sizing, timing-driven placement, buffering.
3. Actual design: fine-tune the circuit parameters.

Delay Model at Logic Level
- Unit delay model: assign a delay of 1 to each gate.
- Unit fanout delay model: add an additional delay for each fanout.
- Library delay model: use the delay data in the library to obtain a more accurate delay value.
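The three delay models can be compared side by side in a small Python sketch. The per-fanout increment of 0.25, the gate record layout, and the library fields `intrinsic` and `per_load` are assumptions made for illustration; the slides do not specify them.

```python
def unit_delay(gate):
    """Unit delay model: every gate costs exactly 1."""
    return 1.0


def unit_fanout_delay(gate, per_fanout=0.25):
    """Unit fanout model: 1 plus an extra term per fanout (0.25 is an assumed value)."""
    return 1.0 + per_fanout * gate["fanout"]


def library_delay(gate, library):
    """Library model: look up per-cell data (hypothetical fields) for this gate."""
    cell = library[gate["cell"]]
    return cell["intrinsic"] + cell["per_load"] * gate["fanout"]


# Hypothetical gate and library entry, used only to exercise the functions.
example_gate = {"cell": "NAND2", "fanout": 2}
example_library = {"NAND2": {"intrinsic": 0.5, "per_load": 0.25}}

d_unit = unit_delay(example_gate)
d_fanout = unit_fanout_delay(example_gate)
d_library = library_delay(example_gate, example_library)
```

The same gate gets a different delay under each model, which is why the slack values computed later depend on the model chosen.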

Arrival Time & Required Time
[Figure: example network with nodes c, d, e, f, g, h annotated with timing values]
- Arrival time: propagated from input to output.
- Required time: propagated from output to input.
- slack = required time - arrival time
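A minimal sketch of the two propagation passes follows. It assumes the circuit is given as a dictionary mapping each node to its fanins, that times are measured at node outputs, and that primary inputs arrive at time 0. The small network and the required time of 1.5 are hypothetical, not the network from the slide.

```python
from graphlib import TopologicalSorter


def timing(fanins, delay, required_at_outputs):
    """Return (arrival, required, slack) dictionaries for a combinational DAG."""
    order = list(TopologicalSorter(fanins).static_order())  # fanins come first

    # Forward pass: arrival time flows from inputs to outputs.
    arrival = {}
    for n in order:
        arrival[n] = delay[n] + max((arrival[p] for p in fanins[n]), default=0.0)

    # Build the fanout lists needed for the backward pass.
    fanouts = {n: [] for n in fanins}
    for n, preds in fanins.items():
        for p in preds:
            fanouts[p].append(n)

    # Backward pass: required time flows from outputs to inputs.
    required = {}
    for n in reversed(order):
        if fanouts[n]:
            required[n] = min(required[s] - delay[s] for s in fanouts[n])
        else:
            required[n] = required_at_outputs[n]

    slack = {n: required[n] - arrival[n] for n in fanins}
    return arrival, required, slack


# Hypothetical network: x = f(a, b), y = f(x, c); inputs have zero delay.
fanins = {"a": [], "b": [], "c": [], "x": ["a", "b"], "y": ["x", "c"]}
delay = {"a": 0.0, "b": 0.0, "c": 0.0, "x": 1.0, "y": 1.0}
arrival, required, slack = timing(fanins, delay, {"y": 1.5})
critical = {n for n, s in slack.items() if s < 0}
```

Here the path a-x-y misses the required time by 0.5, so a, b, x and y have negative slack, while c has positive slack and is not critical.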

Restructure for Timing [SIS]
Two steps:
1. Minimize area.
2. Speed up.
[Figure: network from input to output, annotated with arrival time and required time]
Critical node = a node with negative slack time.

Basic Idea
Collapse critical nodes and re-decompose.
[Figure: network with inputs a, b, c, internal node x and output y, before and after re-decomposition; the critical path is a-x-y]

Speed Up
speed_up(d):
1. Compute the slack time of each node.
2. Find all critical nodes and compute the cost of each critical node.
3. Select the re-synthesis points (find a minimum cut set of all critical nodes).
4. Collapse and re-decompose the re-synthesis points.
5. If the timing requirement is satisfied, done; otherwise go to step 1.

Step 2 of Speed-up Algorithm
Compute the cost function. Selecting re-synthesis points has to consider:
(1) ease of speed-up (re-synthesis)
(2) area overhead

Ease for Speed-Up
[Figure: two candidate nodes x and y]
Let d = 1 (collapsing depth, given).
y => 1 critical input, 2 non-critical inputs
x => 4 critical inputs
If y is chosen, re-decomposition is easier to perform.

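Area Penalty
[Figure: nodes f, g and x with inputs b, c, d; the path b-x-g is critical]
Collapse x into g.
[Figure: the network after collapsing, with the duplicated logic marked]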

Cost Function
Define the weight of a critical node x:
Wx(d) = Wxt(d) + a*Wxa(d)
- Wxt(d) reflects the ease of speed-up.
- Wxa(d) reflects the area increase.
- N(d) = signals that are inputs to the re-synthesis region
- M(d) = nodes in the re-synthesis region

Example of Computing Cost Function
[Figure: nodes y, z, u, w, v with inputs a, b, c, d, e, f]
d = 3
Wxt(d) = 2/6
Wxa(d) = 3/5
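The combined weight for this example can be evaluated directly from Wx(d) = Wxt(d) + a*Wxa(d). The sketch below uses exact fractions; the values chosen for the trade-off factor a (1 and 1/2) are assumptions, since the slides do not give one.

```python
from fractions import Fraction


def node_weight(w_time, w_area, a):
    """Wx(d) = Wxt(d) + a * Wxa(d): time term plus scaled area term."""
    return w_time + a * w_area


w_time = Fraction(2, 6)   # Wxt(3) from the example
w_area = Fraction(3, 5)   # Wxa(3) from the example

w_equal = node_weight(w_time, w_area, a=1)               # assumed a = 1
w_half = node_weight(w_time, w_area, a=Fraction(1, 2))   # assumed a = 1/2
```

A smaller a makes the area term count less, so a node with a large area penalty becomes a cheaper re-synthesis candidate.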

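Step 3 of Speed-up Algorithm
Background: A network N = (s, t, V, E, b) is a digraph (V, E) together with a source s ∈ V and a sink t ∈ V, with a bound (capacity) b(u,v) ∈ Z+ for every edge.
A flow f in N is a vector in R^|E| (one component f(u,v) per edge) such that
1. 0 ≤ f(u,v) ≤ b(u,v) for all (u,v) ∈ E
2. for every node v other than s and t, the total flow into v equals the total flow out of v (flow conservation)
Ex: [Figure: example network from s to t with edge labels]
The value of the flow f = 6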
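The two flow conditions can be checked mechanically. The sketch below assumes capacities and flows are stored as dictionaries keyed by edge (u, v). The example network is hypothetical (it is not the one in the slide figure), chosen so that the flow value is also 6.

```python
def is_flow(cap, flow, s, t):
    """Check the capacity bounds and flow conservation at every internal node."""
    for edge, f in flow.items():
        if edge not in cap or not (0 <= f <= cap[edge]):
            return False
    nodes = {u for u, _ in cap} | {v for _, v in cap}
    for n in nodes - {s, t}:
        inflow = sum(f for (u, v), f in flow.items() if v == n)
        outflow = sum(f for (u, v), f in flow.items() if u == n)
        if inflow != outflow:
            return False
    return True


def flow_value(flow, s):
    """Net flow leaving the source."""
    out_s = sum(f for (u, v), f in flow.items() if u == s)
    in_s = sum(f for (u, v), f in flow.items() if v == s)
    return out_s - in_s


# Hypothetical network and flow.
cap = {("s", "a"): 4, ("s", "b"): 3, ("a", "t"): 3, ("a", "b"): 2, ("b", "t"): 5}
flow = {("s", "a"): 3, ("s", "b"): 3, ("a", "t"): 3, ("a", "b"): 0, ("b", "t"): 3}
valid = is_flow(cap, flow, "s", "t")
value = flow_value(flow, "s")
```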

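Min-cut
An s-t cut is a partition (W, W') of the nodes of V into sets W and W' such that s ∈ W and t ∈ W'.
The capacity of an s-t cut is the sum of b(u,v) over the forward edges, i.e. the edges (u,v) with u ∈ W and v ∈ W'. Backward edges (from W' to W) do not count.
[Figure: sets W (containing s) and W' (containing t) with forward and backward edges]
Max-flow = min-cut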
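The max-flow = min-cut relation can be observed on a small instance. The sketch below uses the Edmonds-Karp algorithm (shortest augmenting paths found by BFS), which is one standard choice; the slides do not name a specific max-flow algorithm, and the example network is hypothetical.

```python
from collections import deque


def max_flow_min_cut(cap, s, t):
    """Return (flow value, source side W, forward cut edges) via Edmonds-Karp."""
    # Residual capacities, with reverse edges initialised to 0.
    res = {}
    for (u, v), c in cap.items():
        res.setdefault(u, {})
        res.setdefault(v, {})
        res[u][v] = res[u].get(v, 0) + c
        res[v].setdefault(u, 0)

    value = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no augmenting path left: the flow is maximum

        # Walk back from t, find the bottleneck, and augment.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push

    # Nodes still reachable from s in the residual graph form W.
    W = set(parent)
    cut = [(u, v) for (u, v) in cap if u in W and v not in W]
    return value, W, cut


# Hypothetical network.
cap = {("s", "a"): 4, ("s", "b"): 3, ("a", "t"): 3, ("a", "b"): 2, ("b", "t"): 5}
value, W, cut = max_flow_min_cut(cap, "s", "t")
cut_capacity = sum(cap[e] for e in cut)
```

The capacity of the cut recovered from the final residual graph equals the flow value, which is the statement on the slide.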

Example
Ex: [Figure: network of critical nodes u, v, w, x, y, z] => network flow

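Transform Node-cut to Edge-cut
Step 3: Duplicate each node.
[Figure: each node u, v, w, x, y, z is split into a pair (u, u'), (v, v'), ..., joined by an edge whose capacity is the node weight w(u), w(v), w(w), w(x), w(y), w(z)]
Use a max-flow (minimum-cost cut) algorithm to find the re-synthesis points.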
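The node-splitting transformation can be sketched as follows. It assumes that each node n becomes an in-copy n and an out-copy n', that the internal edge (n, n') carries the node weight, and that the original edges get infinite capacity so that only internal edges can appear in a minimum cut. The orientation of the copies and the small example are assumptions; the slide figure only shows the duplicated nodes and their weights.

```python
INF = float("inf")


def split_nodes(edges, weight):
    """Turn a node-weighted graph into an edge-capacitated one."""
    cap = {}
    for n, w in weight.items():
        cap[(n, n + "'")] = w        # cutting this edge means selecting node n
    for u, v in edges:
        cap[(u + "'", v)] = INF      # original edges must never be cut
    return cap


# Hypothetical chain of critical nodes x -> y -> z with node weights.
edges = [("x", "y"), ("y", "z")]
weight = {"x": 2, "y": 1, "z": 3}
cap = split_nodes(edges, weight)
```

Any max-flow routine run on `cap` (with a source attached to the in-copies of the first nodes and a sink attached to the out-copies of the last nodes) then yields a minimum-weight set of nodes as its min-cut.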

Step 4 of Speed-up Algorithm
Re-decompose:
1. Kernel-based decomposition: extract a divisor. The weight of a divisor is a linear sum of an area component (literals saved) and a time component (prefer the smallest arrival time).
2. AND-OR decomposition.
[Figure: two decompositions of a gate whose inputs have arrival times 0, 0, 1.0, 2.0]
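The effect of arrival times on decomposition can be illustrated with the input arrival times 0, 0, 1.0, 2.0 from the figure. The sketch below compares a plain balanced tree of 2-input gates with a greedy tree that always combines the two earliest signals first. The unit gate delay and the greedy pairing rule are assumptions; the greedy rule is a common heuristic and is not necessarily the exact procedure used in the slides.

```python
import heapq


def arrival_driven_tree(arrivals, gate_delay=1.0):
    """Combine the two earliest-arriving signals first; return output arrival."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        heapq.heappush(heap, max(a, b) + gate_delay)
    return heap[0]


def balanced_tree(arrivals, gate_delay=1.0):
    """Pair signals in the given order, level by level; return output arrival."""
    level = list(arrivals)
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) + gate_delay
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd signal passes to the next level
        level = nxt
    return level[0]


arrivals = [0.0, 0.0, 1.0, 2.0]
t_greedy = arrival_driven_tree(arrivals)
t_balanced = balanced_tree(arrivals)
```

With these inputs the arrival-driven tree finishes one gate delay earlier than the balanced tree, because the late input passes through only one gate.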

An Improved Cut Set (Separator Set)
Un-balanced path delay
[Figure: graph with nodes A (d=1), B (d=1.5), C (d=0.5), D (d=1), E (d=1), F (d=1.5), G (d=2). Node annotations, in figure order: (-0.6/1/0.25) (-0.6/2/0.25) (-0.6/2/0.5) (-0.6/4/0.5) (-0.6/4/0.5) (-0.1/2/0.25) (-0.1/4/0.25)]
(x/y/z) means (slack/cost/delay reduction).
Minimum-cost cut set = 4 ({C})
Delay reduction = 0.5

Construct a Path-balanced Graph
ds(e) = slack(HeadNode(e)) - slack(TailNode(e))
If ds(e) > 0, insert a "padding node". P1 and P2 are two padding nodes.
[Figure: graph with nodes A (d=1), B (d=1.5), C (d=0.5), D (d=1), E (d=1), F (d=1.5), G (d=2) and padding nodes P1 (d=0.5), P2 (d=0.5). Node annotations, in figure order: (-0.6/1/0.25) (-0.6/2/0.25) (-0.6/2/0.5) (-0.6/4/0.5) (-0.6/4/0.5) (-0.1/2/0.25) (-0.6/0/0.5) (-0.6/0/0.5) (-0.1/4/0.5) (-0.6/2/0.25) (-0.6/4/0.5)]
(x/y/z) means (slack/cost/delay reduction).
Minimum-cost cut set = 1 ({E, P2})
Delay reduction = 0.5
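The padding rule can be sketched as a single pass over the edges. The sketch assumes each edge e is stored as a pair (tail, head) directed from tail to head, and it records ds(e) on each padding node. The slack values and node names are hypothetical, and the slides do not say which attributes a padding node receives beyond what the figure shows.

```python
def insert_padding(edges, slack):
    """Split every edge with ds(e) > 0 by a new padding node P1, P2, ..."""
    new_edges = []
    padding = {}   # padding node name -> ds(e) of the edge it splits
    for tail, head in edges:
        ds = slack[head] - slack[tail]
        if ds > 0:
            p = f"P{len(padding) + 1}"
            padding[p] = ds
            new_edges.append((tail, p))
            new_edges.append((p, head))
        else:
            new_edges.append((tail, head))
    return new_edges, padding


# Hypothetical slack values, chosen to be exact in floating point.
slack = {"A": -0.75, "B": -0.75, "C": -0.25}
edges = [("A", "B"), ("A", "C")]
new_edges, padding = insert_padding(edges, slack)
```

Only the edge A -> C has ds(e) > 0, so it is the only edge that receives a padding node.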

Technique Used in Other Optimization Steps
- Gate sizing
- Low power design (threshold voltage assignment)
  high threshold voltage: leakage power ↓, delay ↑
  low threshold voltage: leakage power ↑, delay ↓

How to Reduce Leakage Power Without Performance Loss
1. Use low threshold voltage gates for timing optimization.
2. Compute the slack time of each node.
3. Find all non-critical nodes and compute the cost of each non-critical node.
4. Replace the candidate nodes with high threshold voltage gates to save leakage power.
5. Re-compute the slack time of each node.
6. If the timing requirement is not violated, go to step 3; otherwise, roll back and stop.
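The procedure can be sketched as a greedy loop. This is a simplified variant, not the exact flow above: it swaps one gate at a time, rolls back only the swap that violates timing, and then continues with the remaining gates. The gate delays, the timing requirement, and the ordering by delay penalty are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def worst_slack(fanins, delay, t_req):
    """Timing requirement minus the latest arrival time in the circuit."""
    arrival = {}
    for n in TopologicalSorter(fanins).static_order():
        arrival[n] = delay[n] + max((arrival[p] for p in fanins[n]), default=0)
    return t_req - max(arrival.values())


def assign_high_vth(fanins, low_delay, high_delay, t_req):
    """Return the set of gates that can use high-Vth cells without violating t_req."""
    delay = dict(low_delay)          # step 1: start from an all-low-Vth design
    high = set()
    # Try the gates with the smallest delay penalty first (assumed ordering).
    for n in sorted(fanins, key=lambda g: high_delay[g] - low_delay[g]):
        old = delay[n]
        delay[n] = high_delay[n]     # step 4: tentative high-Vth replacement
        if worst_slack(fanins, delay, t_req) < 0:   # steps 5 and 6
            delay[n] = old           # roll back this swap only
        else:
            high.add(n)
    return high


# Hypothetical circuit: a -> b -> c, plus a short side path d -> c.
fanins = {"a": [], "b": ["a"], "d": [], "c": ["b", "d"]}
low_delay = {"a": 1, "b": 1, "d": 1, "c": 1}
high_delay = {"a": 2, "b": 2, "d": 2, "c": 2}
high = assign_high_vth(fanins, low_delay, high_delay, t_req=4)
```

In this example the long path a-b-c has one unit of slack, which the first swap (gate a) uses up, while gate d sits on the short side path and can be slowed without affecting the output.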