Gregory Shklover, Ben Emanuel Intel Corporation MATAM, Haifa 31015, Israel Simultaneous Clock and Data Gate Sizing Algorithm with Common Global Objective.



Outline
  Introduction
  Gate sizing by Lagrangian relaxation
  Combined clock and data sizing
  Clock sizing by dynamic programming
  Algorithm analysis
  Experimental results

Introduction
  Given a gate-level circuit and a standard cell library, the goal of gate sizing is to find gate sizes that yield the best combination of total circuit power, performance, and area.
  Data gates implement the logical function of the block.
  Clock gates distribute a common synchronization signal to the state elements in the circuit.

Separation reasons
  Design methodology
    Data: best performance vs power or area
    Clock: meet minimum skew and skew variability
  Structure
    Data: DAG
    Clock: tree (Dynamic programming)
  Problem classification
    Data: convex
    Clock: non-convex

Contribution
  Combine clock and data sizing decisions to solve a common global objective.
  Use a Dynamic Programming algorithm to optimally solve the clock-related part of the relaxed objective.

Gate sizing by Lagrangian relaxation
  Minimizes the following objective:

Gate sizing by Lagrangian relaxation
  Simplified formulation:

LR with skew optimization
  Speeds up the clock arrival a^clk at FF1 by up-sizing gate A.
  Delays the clock arrival a^clk at FF2 by down-sizing gate B.
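A small worked example may help show why shifting the two clock arrivals matters. The sketch below assumes that FF1 launches a data path and FF2 captures it (the slide does not state this), uses the standard setup-slack relation, and all numbers are hypothetical.

```python
def setup_slack(period, clk_launch, clk_capture, data_delay, t_setup=0.0):
    """Setup slack of a path from a launching to a capturing flip-flop."""
    return period + clk_capture - clk_launch - data_delay - t_setup


# Hypothetical values in ns; none of them come from the slides.
period, data_delay = 1.0, 1.05

# Balanced clock tree: both flip-flops see the clock at the same time.
before = setup_slack(period, clk_launch=0.30, clk_capture=0.30,
                     data_delay=data_delay)

# Up-sizing gate A makes the clock reach FF1 (assumed launch) earlier;
# down-sizing gate B makes it reach FF2 (assumed capture) later.
after = setup_slack(period, clk_launch=0.27, clk_capture=0.33,
                    data_delay=data_delay)

print(f"slack before: {before:.3f} ns, after: {after:.3f} ns")
```

With these numbers the path fails timing with a balanced clock and passes once the two clock arrivals are skewed in its favor.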

Combined clock and data sizing
  Consider the set of clock gates.
  Integrate these into the objective.
  Expand each a_s^clk as a sum of delays from the root of the clock tree to the corresponding leaf s.
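The expansion above can be checked numerically. In this sketch each leaf carries a signed weight on its clock arrival, and the same total is recovered by giving every clock gate the sum of the weights of the leaves below it. The tree, the delays, the weights, and the function names are all hypothetical; the slides' actual multipliers are not reproduced in the transcript.

```python
def aggregate_multipliers(children, leaf_mult, root):
    """Return {node: sum of the multipliers of the leaves in its subtree}."""
    agg = {}

    def visit(node):
        kids = children.get(node, [])
        total = leaf_mult.get(node, 0.0) if not kids else 0.0
        for kid in kids:
            total += visit(kid)
        agg[node] = total
        return total

    visit(root)
    return agg


def leaf_arrivals(children, delay, root):
    """Clock arrival at each leaf: sum of the delays on the root-to-leaf path."""
    arrivals = {}

    def visit(node, acc):
        acc += delay[node]
        kids = children.get(node, [])
        if not kids:
            arrivals[node] = acc
        for kid in kids:
            visit(kid, acc)

    visit(root, 0.0)
    return arrivals


# Hypothetical tree: root drives A and B; A drives s1 and s2; B drives s3.
children = {"root": ["A", "B"], "A": ["s1", "s2"], "B": ["s3"]}
delay = {"root": 1.0, "A": 2.0, "B": 3.0, "s1": 0.5, "s2": 0.25, "s3": 0.75}
leaf_mult = {"s1": 0.5, "s2": -1.0, "s3": 0.25}  # assumed signed weights

agg = aggregate_multipliers(children, leaf_mult, "root")
arr = leaf_arrivals(children, delay, "root")

per_leaf = sum(leaf_mult[s] * arr[s] for s in leaf_mult)  # sum over leaves
per_gate = sum(agg[g] * delay[g] for g in delay)          # sum over gates
print(per_leaf, per_gate, agg["A"])
```

Both sums agree, and the aggregated weight of gate A comes out negative because the negative leaf weight dominates its subtree.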

Combined clock and data sizing
  Under a convex gate delay model, formulation (6) is not convex, because some of the multiplier aggregations are negative.
  Applying sub-gradient descent techniques to this part of the objective would therefore be inappropriate.
  Instead, we propose using a Dynamic Programming (DP) algorithm, which performs a systematic search over the solution space and is thus immune to the non-convexity of the problem.
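The effect of a negative coefficient can be seen on a toy case. The sketch below assumes a simple convex delay model (load divided by size plus a self-loading term), which is not the model used in the slides, and tests midpoint convexity for a positive and a negative weight.

```python
def gate_delay(size, load=4.0, self_load=0.5):
    """Toy convex delay model (an assumption): load/size + self_load*size."""
    return load / size + self_load * size


def violates_midpoint_convexity(f, x, y):
    """True if f at the midpoint lies above the chord between x and y."""
    return f((x + y) / 2) > (f(x) + f(y)) / 2


def positive_term(size):
    """Delay weighted by a positive aggregated multiplier."""
    return 0.75 * gate_delay(size)


def negative_term(size):
    """Delay weighted by a negative aggregated multiplier."""
    return -0.75 * gate_delay(size)


pos_violation = violates_midpoint_convexity(positive_term, 1.0, 4.0)
neg_violation = violates_midpoint_convexity(negative_term, 1.0, 4.0)
print(pos_violation, neg_violation)
```

A convex function never rises above its chord, so only the negatively weighted term fails the check; a descent method that relies on convexity has no guarantee on such a term.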

Combined clock and data sizing

Clock sizing by dynamic programming
  The DP algorithm is required to find clock gate sizes that minimize the following objective:

Clock sizing by dynamic programming

Dynamic programming
  Set of solutions per tree node n
    c is the associated downstream capacitance
    obj is the corresponding objective value
  Pruning criterion

Dynamic programming
  Leaf nodes:
  Solution merge:
  Gate sizing:
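The three steps named above can be sketched as a bottom-up pass over the clock tree. The formulas from the slides are not in the transcript, so everything concrete here is an assumption: the three-size cell library, the linear delay model (drive resistance times load), the per-gate weights, and the pruning rule, which only keeps the lowest objective for each distinct capacitance and is not necessarily the criterion used in the slides.

```python
from itertools import product

# Hypothetical cell library: (input capacitance, drive resistance, power cost).
SIZES = [(1.0, 4.0, 1.0), (2.0, 2.0, 2.0), (4.0, 1.0, 4.0)]


def prune(solutions):
    """Keep the lowest objective for each distinct capacitance value."""
    best = {}
    for cap, obj in solutions:
        if cap not in best or obj < best[cap]:
            best[cap] = obj
    return sorted(best.items())


def solve(node, children, weight, leaf_cap):
    """Return the pruned (c, obj) pairs seen at the input of `node`."""
    kids = children.get(node, [])
    if not kids:
        # Leaf nodes: a single solution, the sink's input capacitance.
        return [(leaf_cap[node], 0.0)]
    # Solution merge: choose one solution per child and add them up.
    child_sets = [solve(kid, children, weight, leaf_cap) for kid in kids]
    merged = prune(
        (sum(c for c, _ in combo), sum(o for _, o in combo))
        for combo in product(*child_sets)
    )
    # Gate sizing: try every size of this gate on every merged solution.
    sized = []
    for (cin, res, power), (load, obj) in product(SIZES, merged):
        delay = res * load  # toy linear delay model
        sized.append((cin, obj + weight[node] * delay + power))
    return prune(sized)


children = {"root": ["A", "B"], "A": ["s1", "s2"], "B": ["s3"]}
leaf_cap = {"s1": 1.0, "s2": 1.0, "s3": 2.0}
weight = {"root": 0.5, "A": -0.25, "B": 1.0}  # assumed aggregated multipliers

root_solutions = solve("root", children, weight, leaf_cap)
best_cap, best_obj = min(root_solutions, key=lambda s: s[1])
print(root_solutions, best_cap, best_obj)
```

The sketch returns objective values only; recovering the chosen sizes would need a back-pointer stored with each solution. Because every size is enumerated at every gate, the negative weight on gate A needs no special handling.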

Additional considerations
  Side load effect
  Approximation + convergence
  Input slews

Algorithm analysis
  Complexity
    k-Sampling
    The complexity of the DP algorithm:
  Convergence
    Cooling concept from simulated annealing
  Optimality
    Global optimality is not theoretically guaranteed.
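The transcript does not define k-sampling. One plausible reading, assumed here, is that each tree node keeps at most k of its (c, obj) solutions so that the per-node work stays bounded. The sketch below thins a solution list by taking evenly spaced entries from the capacitance-sorted list; the function name and the spacing rule are hypothetical.

```python
def k_sample(solutions, k):
    """Keep at most k (c, obj) pairs, evenly spaced over the sorted list."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ordered = sorted(solutions)
    if len(ordered) <= k:
        return ordered
    step = (len(ordered) - 1) / (k - 1)
    picks = {round(i * step) for i in range(k)}  # always includes both ends
    return [ordered[i] for i in sorted(picks)]


# Hypothetical solution set: nine capacitance values with falling objective.
solutions = [(float(c), 10.0 - c) for c in range(9)]
sampled = k_sample(solutions, 3)
print(sampled)
```

Thinning of this kind trades solution quality for run time, which is consistent with the slide's remark that global optimality is not guaranteed.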

Experimental results

Summary
  Simultaneous clock and data gate sizing optimization.
  Applicable to wire sizing and buffer insertion.
  The method could probably be extended to handle simultaneous gate sizing and clock tree synthesis.